Distributed Obfuscation Model for Software Protection

- Introduction

Software security is one of the most critical issues in software development today. It involves protecting information and information systems from unauthorized access, use, disclosure, disruption, modification, or destruction in order to provide confidentiality, integrity, and availability.

People need to protect the data they share over a network because they want to maintain their privacy [1].

However, several problems affect software protection. Much software is distributed over the Internet, and once software is distributed to a client machine, the owner effectively loses all control over the client copy. Cracking the software gives illegal access to it, and the illegal copying and distribution of cracked software is a form of copyright infringement.

An attacker can violate the copyright of software by reverse engineering it, which gives the attacker a chance to understand the behavior of the software and to extract its proprietary algorithms and key data structures. Many reverse engineering tools give access to a program's code and data and make it easier for adversaries to analyze software and steal intellectual property [1][2][3]. Examples are disassemblers, which translate binary code into readable assembly language, and decompilers, which translate byte code back into source code. It is also difficult to enforce copyright laws and penalties against users. “**Copyright is a form of protection grounded, it’s the certain right for an author to original work they have written**”. Copyright infringement is therefore the illegal distribution and unauthorized copying of copyrighted software, and protecting the copyright of software is one of the most critical issues today [3][4].

Software protection therefore has to address three problems. First, software contains secret, confidential, or sensitive information; obfuscation is used to protect this data. Second, the software implementation must be protected from reverse engineering. Third, there is the program's execution itself, during which critical code is executed and confidential data is accessed; during execution the code and data must be protected from malicious activity such as dynamic analysis and tampering. All of these elements need to be sufficiently strengthened to guarantee data confidentiality and secure program execution to the user [23][4].

Obfuscation is an important technique for protecting software from the risk of reverse engineering. It transforms a program into another, functionally equivalent program while hiding its internal logic, which makes the program harder to interpret and thus obstructs software analysis [1][2].

This research presents a novel solution for protecting the copyright and the distribution of software. We developed the **Distributed obfuscation model technique**, which protects software from the problems above. It aims at a high level of security against reverse engineering by integrating multiple levels of protection.

- Research Objectives
- Help protect the intellectual property contained within software by making reverse engineering of a program as difficult as possible.
- Protect licensing mechanisms against unauthorized access, and shrink the size of the executable.
- Protect the copyright and the distribution of software over client/server.
- Improve the level of software protection by integrating several levels of obfuscation and combining different techniques in order to complicate reverse engineering analysis.
- Explore the disadvantages of existing models.
- Improve the protection of software against reverse engineering analysis.
- Develop a robust protection technique against reverse engineering analysis.
- Ensure data confidentiality against reverse engineering while the software is distributed to client devices and while data is transmitted through the network, using the DH and AES algorithms.
- Read and analyze existing obfuscation techniques and cryptosystem models.
- Define obfuscation techniques.
- Develop a new distributed obfuscation technique that protects software against reverse engineering analysis.
- Simulate and test the proposed distributed obfuscation technique, including obfuscation/de-obfuscation, key generation, and key exchange.
- Compare our model with existing models in terms of time and key length.
- Design the model to prevent mass wiretapping and malicious corruption of Transmission Control Protocol (TCP) traffic on the Internet.

- Contributions

The **Distributed obfuscation model technique**, with its multiple levels of obfuscation, is intended to be more robust and more secure than traditional techniques.

The main contribution of this thesis is a solution that protects software, its copyright, and its distribution over client/server against reverse engineering analysis. The contributions are summarized as follows:

- Develop a robust technique that integrates multiple levels of obfuscation. Traditional techniques depend on only one level of obfuscation and are vulnerable to reverse engineering analysis; depending on multiple levels of obfuscation is intended to make such analysis impractical.
- Protect software against attackers and against reverse engineering analysis.
- Develop a new technique that uses AES encryption and DH key exchange to improve the protection of software and the confidentiality of its use.
- Improve the security of the AES and DH algorithms against reverse engineering analysis by using random key generation, based on a key generation and exchange process that follows the Diffie-Hellman principle.
- Protect data, software, and intellectual property from unauthorized access by other users during distribution of software over client/server devices.
- Motivation
- The major problem of software protection is the distribution of client/server software to client devices, where the owners lose control over their software.
- Over recent years client devices have become more powerful, so an attacker with malicious intent can violate copyright and tamper with the software by applying many analysis and reverse engineering tools.
- Cracking the software gives illegal access to it, and the illegal copying and distribution of cracked software is a form of copyright infringement.
- Threat Model

The main aim of threat modeling is to identify the important functionality of the software or model and to protect it.

We begin threat modeling by focusing on four key questions, taken from Adam Shostack, Threat Modeling: Designing for Security (John Wiley & Sons, Indianapolis, Indiana, 2014):

- What are you building?
- What can go wrong?
- What should you do about those things that can go wrong?
- Did you do a decent job of analysis?

Data transmitted through a network that connects a server to several clients, together with the shared secret key they maintain, can be accessed by different attackers, and the software itself can be attacked through reverse engineering analysis. Software protection techniques suffer from a set of threats that our model aims to address:

- An unauthorized person, such as a contractor or visitor, might gain access to a company’s computer system.
- Confidential information might be intercepted as it is being sent to an authorized user.
- Users who share documents between geographically separated offices over the Internet or an extranet, or telecommuters who access the corporate intranet from their home computers, can expose sensitive data as it is sent over the wire.
- Electronic mail can be intercepted in transit.

Threat to the client:

- Virus: attached to an executable file; requires human action to spread; some viruses damage the user's software and hardware.
- Denial of service (DoS): any type of attack on the network infrastructure during data transmission that prevents a server from servicing its clients. The attacker sends a large number of requests to the server in an attempt to bring it down, flooding the server with large packets of invalid data (Khaled M. Elleithy (University of Bridgeport), Drazen Blagovic, Wang Cheng, and Paul Sideleau (Sacred Heart University), “Denial of Service Attack Techniques: Analysis, Implementation and Comparison”).

- Problem Statement

A robust software protection technique against tampering and reverse engineering analysis is highly needed nowadays, and such a technique should address the lack of software confidentiality in an untrusted environment.

Many studies have investigated one-to-one protection, whereas there is a clear lack of studies on many-to-one protection. Most of these approaches protect intellectual property and are treated as trade secrets.

- Research Aims

- Develop a new technique, based on obfuscation, that prevents reverse engineering analysis and modification while client/server software is distributed to client devices.

- Improve the level of software protection by integrating obfuscation with AES encryption in order to make reverse engineering analysis hard and complicated and to make decompiling the programs infeasible.

- Research Questions

This study is meant to address the following questions and seek answers for them:

- Can we use the Advanced Encryption Standard (AES) algorithm combined with the Diffie-Hellman (DH) key exchange algorithm in an obfuscation structure to improve the security and quality of obfuscation and to protect software against reverse engineering analysis?
- Research Hypotheses
- Research Structure

This section summarizes briefly what each chapter explains.

**Chapter One: Background.**

This chapter describes the thesis idea, the justification for choosing this research, the problem statement, and the research aims, objectives, questions, hypotheses, and structure.

This chapter describes the obfuscation model used in this thesis.

This chapter presents the testing and results.

This chapter presents the conclusion and future work.

Chapter Two

Background

- Introduction

This chapter reviews the encryption background used in this thesis. It covers asymmetric (public-key) encryption, in which the encryption and decryption keys differ, and symmetric encryption, the simplest type of encryption, in which the key used to decrypt is the same as the key used to encrypt.

- Encryption
- Definition

The word cryptography is derived from the Greek ‘crypto’, meaning secret, and ‘graphy’, meaning writing. It is used to conceal the content of a message from everyone except the sender and the receiver, and to authenticate the correctness of the message to the recipient. Cryptography can provide integrity, availability, identification, confidentiality, and user authentication, as well as security and privacy of data [1].

The word cryptography is sometimes used interchangeably with cryptology or cryptanalysis, but each term has its own, slightly different meaning. Cryptography combines the Greek “krypto”, meaning hidden, with “graphy”, meaning writing, so the whole word means hidden writing, which is the output of the encryption process in a secrecy system [19]. Cryptanalysis is what the layperson calls breaking the code. The areas of cryptography and cryptanalysis together are called cryptology [20].

Cryptographic systems are used to provide privacy and authentication in computer and communication systems. As shown in figure [3.1], encryption algorithms encipher the plaintext, or clear message, into unintelligible ciphertext, or a cryptogram, using a key. A deciphering algorithm is used for decryption, or decipherment, in order to restore the original information. Ciphers are cryptographic algorithms [18].

Figure [3.1] Cryptography

Cryptography is used for many goals, which can either all be achieved at the same time in one application or only individually [17]. These goals are:

- Confidentiality: it ensures that nobody can understand the received message except the one who has the decipher key.
- Authentication: it is the process of proving identity. This means that the user or the system can prove their own identity to other parties who don’t have personal knowledge of that identity.
- Data Integrity: it ensures that the received message has not been changed in any way from its original form.
- Non-Repudiation: it is a mechanism used to prove that the sender really sent the message and that the message was received by the specified party, so that the sender cannot later deny having sent it and the recipient cannot deny having received it.
- Access Control: it is the process of preventing unauthorized use of resources. This goal controls who can have access to the resources, under which restrictions and conditions access can occur, and what the permission level of a given access is [17].

Cryptosystems are of two types, based on the number of keys used. They are either symmetric (private key encryption), in which case both the enciphering and deciphering keys must be kept secret, or asymmetric (public-key encryption), in which case one of the keys can be made public without compromising the other [18] [12] [16].

**3.2 Asymmetric encryption (public-key encryption)**

Each person has a pair of keys: a public key and a private key. A person’s public key is published, but the private key is kept secret. Messages are encrypted using the recipient’s public key and can only be decrypted using the recipient’s private key. With this method the sender and the receiver do not need to share secret keys. All communications use only public keys, and no private key is transmitted [10], see figure [3.2].

Figure [3.2]: A simplified model for asymmetric encryption.

**3.2.1 Rivest–Shamir–Adleman Algorithm (RSA)**

RSA implements a public-key cryptosystem as well as digital signatures.

- Public-key encryption: In RSA, encryption keys are public while decryption keys are private, so only the person with the correct decryption key can decipher an encrypted message. Everyone has their own encryption and decryption keys.
- Digital signatures: The receiver may need to verify that a transmitted message actually originated from the sender (signature) and didn’t just come from there (authentication). The sender signs with the private decryption key, and anyone can verify the signature using the corresponding public encryption key [23].

The security of RSA is widely accepted: for sufficiently large keys, no known attempt to break it has yet been successful, mostly because of the difficulty of factoring the product of two large prime numbers [23].

RSA operation involves four steps: key generation, key distribution, encryption, and decryption. The main idea is to find large positive integers $e$, $d$, and $n$ such that, with modular exponentiation, equation 3.1 holds for all $m$ with $0 \le m < n$:

$$(m^{e})^{d} \bmod n = m \tag{3.1}$$

where $e$ is the encryption key, $d$ is the decryption key, and $n$ is a composite number that is the product of two large prime numbers. Even when $e$ and $n$, or even $m$, are known, it is difficult to find $d$ [24].

**Key Generation:** key generation can be done using the following steps [24]:

First, we select two large prime numbers of about the same size, $p$ and $q$, then compute $n$ and $\varphi(n)$ as in equations 3.2 and 3.3:

$$n = p \cdot q \tag{3.2}$$

$$\varphi(n) = (p - 1)(q - 1) \tag{3.3}$$

After that, select the encryption key $e$, $1 < e < \varphi(n)$, based on the relationship specified in equation 3.4:

$$\gcd(e, \varphi(n)) = 1 \tag{3.4}$$

Then calculate the private key using equation 3.5:

$$d = e^{-1} \bmod \varphi(n) \tag{3.5}$$

Finally, the public key is the pair $(n, e)$ and the private key is $d$. RSA is used for encryption and decryption as follows:

**Encryption**: the message $M$ at the sender's side is converted into an integer $m$ where $m < n$ and $\gcd(m, n) = 1$. The ciphertext $c$ is calculated using the receiver's public key $e$, as in equation 3.6, and is sent to the receiver:

$$c = m^{e} \bmod n \tag{3.6}$$

**Decryption**: at the receiver's end, $m$ is deciphered from $c$ using the receiver's private key $d$ by computing equation 3.7 [24]:

$$c^{d} \equiv (m^{e})^{d} \equiv m \pmod{n} \tag{3.7}$$
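The key generation, encryption, and decryption steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `rsa_keygen`, `rsa_encrypt`, and `rsa_decrypt` and the toy primes 61 and 53 are chosen for this example and do not come from the thesis, and the sketch is "textbook" RSA without padding, so it is not suitable for real use.

```python
from math import gcd


def rsa_keygen(p: int, q: int, e: int):
    """Return ((n, e), d) following equations 3.2 to 3.5 (p and q assumed prime)."""
    n = p * q                      # equation 3.2
    phi = (p - 1) * (q - 1)        # equation 3.3
    if not (1 < e < phi) or gcd(e, phi) != 1:   # equation 3.4
        raise ValueError("e must satisfy 1 < e < phi(n) and gcd(e, phi(n)) = 1")
    d = pow(e, -1, phi)            # equation 3.5 (modular inverse, Python 3.8+)
    return (n, e), d


def rsa_encrypt(m: int, public_key) -> int:
    """Compute c = m^e mod n (equation 3.6)."""
    n, e = public_key
    if not 0 <= m < n:
        raise ValueError("message integer must satisfy 0 <= m < n")
    return pow(m, e, n)


def rsa_decrypt(c: int, d: int, n: int) -> int:
    """Compute m = c^d mod n (equation 3.7)."""
    return pow(c, d, n)


if __name__ == "__main__":
    # Toy parameters for illustration only; real keys use primes of 1024+ bits.
    public_key, d = rsa_keygen(p=61, q=53, e=17)
    c = rsa_encrypt(65, public_key)
    print(public_key, d, c, rsa_decrypt(c, d, public_key[0]))
```

With these toy values, $n = 3233$, $\varphi(n) = 3120$, and $d = 2753$, since $17 \cdot 2753 = 46801 = 15 \cdot 3120 + 1$.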

**3.2.2 Diffie–Hellman Algorithm (DH)**

Diffie–Hellman is a specific method of exchanging cryptographic keys. It is a key exchange method that allows two parties that have no prior knowledge of each other to jointly establish a shared secret key over an insecure communications channel. This key can then be used to encrypt subsequent communications using a symmetric key cipher [25]. The following steps show how the Diffie–Hellman key exchange works [26]:

- Compute global public elements: choose a prime number $q$ and a primitive root $a$ of $q$, such that $a < q$.
- Key generation for user A: select a private key $X_A$ such that $X_A < q$, then calculate the public key $Y_A$ as in equation 3.8:

$$Y_A = a^{X_A} \bmod q \tag{3.8}$$

- Key generation for user B: select a private key $X_B$ such that $X_B < q$, then calculate the public key $Y_B$ as in equation 3.9:

$$Y_B = a^{X_B} \bmod q \tag{3.9}$$

- Calculation of secret key by user A: the secret key of user A can be computed as in equation 3.10:

$$K = (Y_B)^{X_A} \bmod q \tag{3.10}$$

- Calculation of secret key by user B: the secret key of user B can be computed as in equation 3.11:

$$K = (Y_A)^{X_B} \bmod q \tag{3.11}$$

Notice that both computations yield the same value, $a^{X_A X_B} \bmod q$, so the two sides have established a shared secret key without ever transmitting it.
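The Diffie–Hellman exchange described above can be sketched as follows. This is a minimal illustration, not the thesis implementation: the helper names `dh_public_key`, `dh_shared_secret`, and `random_private_key` are hypothetical, and the parameters $q = 23$, $a = 5$ are toy values chosen so the arithmetic can be checked by hand. Real deployments use standardized groups with primes of 2048 bits or more.

```python
import secrets


def random_private_key(q: int) -> int:
    """Pick a private key X uniformly from the range [2, q - 2]."""
    return 2 + secrets.randbelow(q - 3)


def dh_public_key(a: int, q: int, private_key: int) -> int:
    """Compute Y = a^X mod q (equations 3.8 and 3.9)."""
    return pow(a, private_key, q)


def dh_shared_secret(other_public: int, private_key: int, q: int) -> int:
    """Compute K = Y_other^X mod q (equations 3.10 and 3.11)."""
    return pow(other_public, private_key, q)


if __name__ == "__main__":
    q, a = 23, 5                      # toy global public elements (assumption)
    x_a = random_private_key(q)       # user A's private key
    x_b = random_private_key(q)       # user B's private key
    y_a = dh_public_key(a, q, x_a)    # sent from A to B in the clear
    y_b = dh_public_key(a, q, x_b)    # sent from B to A in the clear
    k_a = dh_shared_secret(y_b, x_a, q)
    k_b = dh_shared_secret(y_a, x_b, q)
    print(k_a == k_b)                 # both sides derive the same K
```

For example, with $X_A = 6$ and $X_B = 15$ the public keys are $Y_A = 8$ and $Y_B = 19$, and both sides compute $K = 2$.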

**3.2.3 Elliptic Curve Cryptography algorithm (ECC)**

ECC is an alternative mechanism for implementing public-key cryptography. ECC is based on discrete logarithms that are much more difficult to challenge at equivalent key lengths. The security of a public-key system using elliptic curves is based on the difficulty of computing discrete logarithms in the group of points on an elliptic curve defined over a finite field. An elliptic curve over a finite field $F_p$ can be described by equation 3.12 [21]:

$$y^{2} = x^{3} + ax + b \tag{3.12}$$

Here $x$, $y$, $a$, and $b$ are all elements of $F_p$, the field of integers modulo a prime $p$, and $a$ and $b$ are the coefficients that determine which points are on the curve. The curve coefficients have to fulfill one condition:

$$4a^{3} + 27b^{2} \neq 0 \pmod{p}$$

This condition guarantees that the curve will not contain any singularities [21].

Each value of $a$ and $b$ gives a different elliptic curve. The public key is a point on the curve and the private key is a random number. The public key is obtained by multiplying the generator point $G$ on the curve by the private key [22].
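To make the curve equation, the non-singularity condition, and the relation between private and public keys concrete, here is a small Python sketch. It is an illustration only: the tiny curve $y^2 = x^3 + 2x + 2$ over $F_{17}$ with generator $G = (5, 1)$ is a common textbook example and is an assumption of this sketch, as are the function names. Production ECC uses standardized curves over fields of roughly 256 bits.

```python
from typing import Optional, Tuple

Point = Optional[Tuple[int, int]]  # None stands for the point at infinity

# Toy curve parameters (assumption, for illustration only).
P_MOD, A, B = 17, 2, 2
G: Point = (5, 1)


def is_nonsingular(a: int, b: int, p: int) -> bool:
    """Check the condition 4a^3 + 27b^2 != 0 (mod p)."""
    return (4 * a**3 + 27 * b**2) % p != 0


def on_curve(pt: Point) -> bool:
    """Check that pt satisfies y^2 = x^3 + ax + b over F_p."""
    if pt is None:
        return True
    x, y = pt
    return (y * y - (x**3 + A * x + B)) % P_MOD == 0


def point_add(p1: Point, p2: Point) -> Point:
    """Add two curve points using the chord-and-tangent rule."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None  # p2 is the inverse of p1
    if p1 == p2:
        lam = (3 * x1 * x1 + A) * pow((2 * y1) % P_MOD, -1, P_MOD) % P_MOD
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD) % P_MOD
    x3 = (lam * lam - x1 - x2) % P_MOD
    y3 = (lam * (x1 - x3) - y1) % P_MOD
    return (x3, y3)


def scalar_mult(k: int, pt: Point) -> Point:
    """Compute k * pt with the double-and-add method."""
    result: Point = None
    addend = pt
    while k > 0:
        if k & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        k >>= 1
    return result


if __name__ == "__main__":
    private_key = 7                      # a "random" number, fixed here for clarity
    public_key = scalar_mult(private_key, G)
    print(is_nonsingular(A, B, P_MOD), public_key, on_curve(public_key))
```

Recovering `private_key` from `public_key` and `G` is the elliptic curve discrete logarithm problem, which is trivial on this toy curve but is believed to be infeasible on curves of cryptographic size.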

**3.3 Symmetric encryption (private key encryption)**

Symmetric encryption uses the same secret key to encrypt and decrypt messages. Its main problem is transmitting the secret key securely to the legitimate party that needs it [10].

Figure [3.3]: A simplified model for symmetric encryption.

**Advanced Encryption Standard (AES)**

AES is based on a substitution-permutation network. It is fast in both hardware and software. AES has a fixed block size of 128 bits and a key size of 128, 192, or 256 bits, whereas Rijndael, the cipher on which AES is based, can be specified with block and key sizes in any multiple of 32 bits, from 128 bits up to a maximum of 256 bits. AES was designed to have the following characteristics: resistance against all known attacks, speed and code compactness on a wide range of platforms, and design simplicity [26].

AES operates on a 4 × 4 matrix of bytes, termed the state. The AES cipher is specified as a number of repetitions of transformation rounds that convert the plaintext into ciphertext. Each round consists of several processing steps, including one that depends on the encryption key. A set of reverse rounds is applied to transform the ciphertext back into the original plaintext using the same encryption key [27].

The structure can be illustrated with a simplified AES (S-AES) variant, to which the description in [26] corresponds. Its encryption algorithm is organized into three rounds. Round 0 is simply an add key round; round 1 is a full round of four functions; and round 2 contains only three functions. Each round includes the add key function, which makes use of 16 bits of key. The initial 16-bit key is expanded to 48 bits, so that each round uses a distinct 16-bit round key [26]. Figure 3.4 represents the AES encryption process and the relationship between the number of rounds and the cipher key size: 10 rounds are used with a 128-bit key, 12 rounds with a 192-bit key, and 14 rounds with a 256-bit key.

Figure 3.4: AES general encryption process [27]

The simplified encryption algorithm involves the use of four different functions, or transformations: add key, nibble substitution, shift row, and mix column, which are shown in Figure 3.5. In full AES the corresponding byte-oriented transformations are AddRoundKey, SubBytes, ShiftRows, and MixColumns.

Figure 3.5: AES structure encryption process [26].
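Two of the structural facts about full AES can be sketched in Python: the number of rounds for each key size, and the 4 × 4 byte state with its row-shifting step. This is a partial illustration, not a working cipher (it omits byte substitution, column mixing, and the key schedule), and the function names are chosen for this example only.

```python
from typing import List

State = List[List[int]]  # state[row][column], 4 x 4 bytes


def rounds_for_key(key_bits: int) -> int:
    """Return the number of AES rounds: 10, 12, or 14 for 128-, 192-, or 256-bit keys."""
    if key_bits not in (128, 192, 256):
        raise ValueError("AES key size must be 128, 192, or 256 bits")
    return key_bits // 32 + 6


def bytes_to_state(block: bytes) -> State:
    """Arrange a 16-byte block into the state, filling column by column."""
    if len(block) != 16:
        raise ValueError("AES operates on 128-bit (16-byte) blocks")
    return [[block[row + 4 * col] for col in range(4)] for row in range(4)]


def shift_rows(state: State) -> State:
    """Rotate row r of the state left by r byte positions (row 0 is unchanged)."""
    return [row[r:] + row[:r] for r, row in enumerate(state)]


if __name__ == "__main__":
    state = bytes_to_state(bytes(range(16)))
    for row in shift_rows(state):
        print(row)
    print([rounds_for_key(bits) for bits in (128, 192, 256)])
```

In practice the cipher itself should come from a vetted cryptographic library rather than a hand-written implementation.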

- Standard Ciphertext algorithms
- Key management algorithms

Rivest–Shamir–Adleman (RSA) / ECC / DH

- Attack Analysis
- Summary

**Chapter Three: Models of obfuscation**

**Introduction**

**Obfuscation techniques**

**Reverse engineering**

**Summary**

**Chapter 4: Proposed Model: DOSP**

**Chapter 5: Results and Analysis**

**Chapter 6: Conclusion and future work**

Reference
