Reverse Engineering and Information Security
What are reverse engineering and information security? In this article, we discuss how reverse engineering can be put to good use in ethical hacking.
At first glance, you might think there is a big difference between the two, and maybe you are right: one means protecting information from theft, compromise, and so on, while the other means hacking, or looking under the hood of software, so to speak.
In this article, we will try to look at this from another perspective: these two things can go together as equals. So, let’s get started.
As you must know, information security covers all aspects of defining, achieving, and maintaining confidentiality, integrity, availability, non-repudiation, accountability, and authenticity.
On the other hand, reverse engineering is the process of recovering the principles, ideas, and algorithms of a program in order to study it and/or create similar software. We can split it into three big pieces:
- Software reverse engineering
- Hardware reverse engineering
- Social reverse engineering
Of course, there is no silver bullet. Your application cannot be completely protected from every vulnerability and every method that hackers use to obtain the credentials they are after.
But we can, or I would say must, get closer to the ideal picture of a secure application (web service, mobile app, and so on). That is what this article is all about. I hope to write more about this topic in the future.
So, from my point of view, if you want to be good at information security and write secure software, you need to be good at reverse engineering.
To begin, we'll start with reverse engineering and a practical example.
Some companies may invite you to hack a simple test application or work out its algorithm; I have heard that Kaspersky Lab did this.
As far as I’m concerned, this is a good approach, because antivirus work requires you to analyze a virus, extract its signature, and add it to the virus database. That way, your antivirus will recognize those little evil e-creatures and remove them.
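To make the idea of a signature database concrete, here is a minimal sketch of signature matching. It assumes a signature is simply a fixed byte sequence; the names and byte patterns below are made up for illustration, and real antivirus engines use far richer signatures and heuristics.

```python
# Hypothetical signature database: both the names and the byte patterns
# are invented for this example and do not describe real malware.
SIGNATURES = {
    "Example.Dropper.A": bytes.fromhex("deadbeef4142"),
    "Example.Worm.B": bytes.fromhex("0badf00d1234"),
}


def scan(data, signatures=SIGNATURES):
    """Return the sorted names of all signatures whose bytes occur in data."""
    return sorted(name for name, pattern in signatures.items() if pattern in data)


# A sample buffer that embeds the first made-up pattern.
sample = b"\x00\x01" + bytes.fromhex("deadbeef4142") + b"\x90\x90"
print(scan(sample))
```

Adding a newly analyzed virus then amounts to adding one more entry to the dictionary.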
There are other areas where knowledge of reverse engineering is useful, including software analysis, file formats, drivers, and so on.
Of course, there is a lot of software that you can use in your work (the choice depends on the operating system, target, file format, and programming language), such as PEiD and PE Explorer (to find out which language was used to write the program), dotPeek or .NET Reflector (.NET decompilers), DJ Java Decompiler, IDA Pro (disassembler), OllyDBG (ring-3 debugger), and many others.
Let’s see how we can use some of these tools.
We will study a small test program and see how hackers are able to neutralize its protection. You will see two approaches: a patch and a key generator.
Below is how the test program looks:
The First Approach – PATCH
First, let’s find out which programming language was used to create this program, find the entry point, and see whether the program uses any protection. We will use PEiD (or Exeinfo PE). See the screenshot below:
What can we see on that screenshot? Well, the first thing is that the program is written in Delphi. Now we can use the black-box method: first look at the program without seeing the source code, using a disassembler, and so on. In this scenario, we act as a QA engineer: we input some data and observe how the program deals with it. So, let’s do it.
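As a side note, the entry point that a tool like PEiD reports can also be read straight from the file headers. The sketch below follows the documented PE layout (the `e_lfanew` field at offset 0x3C, then the `PE\0\0` signature, the 20-byte COFF header, and `AddressOfEntryPoint` at offset 16 of the optional header). The function name and the synthetic demo image are my own; language and packer detection, which PEiD also performs, is not covered here.

```python
import struct


def read_entry_point(image):
    """Return AddressOfEntryPoint (a relative virtual address) of a PE image."""
    if image[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # e_lfanew: file offset of the PE header, stored at 0x3C in the DOS header.
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)
    if image[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("PE signature not found")
    # Skip the 4-byte signature and the 20-byte COFF header; the entry point
    # sits at offset 16 inside the optional header that follows.
    (entry_rva,) = struct.unpack_from("<I", image, pe_offset + 4 + 20 + 16)
    return entry_rva


# Synthetic 256-byte image for demonstration only (not a runnable program).
demo = bytearray(0x100)
demo[0:2] = b"MZ"
struct.pack_into("<I", demo, 0x3C, 0x80)          # PE header at offset 0x80
demo[0x80:0x84] = b"PE\x00\x00"
struct.pack_into("<I", demo, 0x80 + 40, 0x1000)   # AddressOfEntryPoint
print(hex(read_entry_point(bytes(demo))))  # 0x1000
```

The value is relative to the image base, so the address shown in a debugger is the image base plus this number.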
Open our test program, and the main window appears in front of us. Take a closer look. We see the “Try it” button and two text fields, “Name” and “Serial #”.
Just a little note: there are two types of serial numbers, static and dynamic.
A static serial number, or “hard serial number,” is created by the developer and exists as a constant in the source code, or it could be an entry in a database or even in a file.
A dynamic serial number is the opposite. Serial numbers of this type are derived from input data: user name, email, secret word, or something else. By the way, there might not be any text fields at all, in which case the serial number is generated from system information (the computer name, perhaps) or by some other specific algorithm.
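The difference between the two types can be sketched in a few lines. Everything here is hypothetical: the constant `HARD_SERIAL` and the arithmetic in `derive_serial` are placeholders chosen for illustration and are not taken from the test program.

```python
HARD_SERIAL = "1234567890"  # hypothetical constant baked into the program


def check_static(serial):
    """Static scheme: every user has to enter the same hard-coded value."""
    return serial == HARD_SERIAL


def derive_serial(name):
    """Placeholder derivation, NOT the test program's real algorithm."""
    total = 0
    for ch in name:
        total = (total * 31 + ord(ch)) & 0xFFFFFFFF  # keep it in 32 bits
    return total


def check_dynamic(name, serial):
    """Dynamic scheme: the expected serial is recomputed from the name."""
    if not (serial.isascii() and serial.isdigit()):
        return False
    return int(serial) == derive_serial(name)


print(check_static("1234567890"))
print(check_dynamic("Alice", str(derive_serial("Alice"))))
```

With a static serial, finding the constant once is enough; with a dynamic one, the attacker has to either bypass the check or reproduce the derivation, which are the two approaches shown in this article.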
Back to our program: we see two text fields, for the name and the serial. Given this, I suppose that our serial number is based on the name. Let’s input some data. When we try to type a serial number, we notice that we cannot enter any characters except digits. From this, we can conclude that the serial number consists of digits only.
After that, hit the “Try it” button, and you will see a small window with the message “That isn’t it, keep on trying…”.
There is one small thing we can notice: if we input a long name, leave the serial number empty, and hit the “Try it” button, we see the message “Hey, you have done it”.
Let’s dive in. Open OllyDBG and load the test program.
Start by looking for the text strings that we saw in the program earlier. For that, use the context menu (right mouse click) and choose Search for -> All referenced text strings. You will see this window:
In the screenshot, we can see some familiar text strings, such as “Hey, you have done it” and “That isn’t it, keep on trying…”. I should note that a program can be protected against string-reference analysis, and if such protection is used, you will not see some or all of the text strings.
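For comparison, a rough equivalent of a string search can be done outside the debugger by scanning the raw file for runs of printable ASCII. This is only a sketch: unlike OllyDBG, it does not check whether the code actually references a string, and the function name and minimum length are my own choices.

```python
import re


def ascii_strings(data, min_len=4):
    """Return (offset, text) pairs for printable ASCII runs of min_len or more bytes."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [(m.start(), m.group().decode("ascii")) for m in pattern.finditer(data)]


# Made-up buffer that imitates a data section holding the two messages.
demo_data = (
    b"\x00\x01Hey, you have done it\x00\xff\xfe"
    b"ab\x00That isn't it, keep on trying...\x00"
)
for offset, text in ascii_strings(demo_data):
    print(hex(offset), text)
```

If the strings are stored encrypted or built at run time, a scan like this finds nothing, which is the kind of protection mentioned above.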
Let’s choose the “Hey, you have done it” text string and double-click it; we land at the place where it is used:
The selected line shows that the address 42515C, where our text string is located, is moved into the EDX register. Then a value is written into the EAX register and a function is called: CALL 00421ACC. Let’s see what that is.
Select CALL 00421ACC and press Enter; you should see the body of that function. It is a typical MessageBoxA function (WinAPI). To return to CALL 00421ACC, just press the minus key.
So, we can conclude that the part of the disassembled code from address 004250AB to address 004250BC (the function call) shows us a message box with the congratulations.
Let’s look a little higher than CALL 00421ACC. There we see the text string “That isn’t it, keep on trying...” and the part of the disassembled code from address 00425093 to address 004250A4, which shows a message box with that message.
Just above that, we notice the command JE SHORT 004250AB. This is a conditional jump, the equivalent of an if statement: if the compared values are equal, jump to the given address. In our case, that is address 004250AB, where the code responsible for the congratulations message is located. Let’s move on.
Above the JE SHORT 004250AB command, we can see CMP EDI, ESI, which is a comparison command. But let’s leave it for now. At this point, we need to patch our test program and save it. For that, double-click on JE SHORT 004250AB, and you will see the next window.
Input JMP 004250AB, which is an unconditional jump command, and hit the “Assemble” button. Now, whatever we input in the “Name” text field, we will see the congratulations window.
But, if we close OllyDBG right now, we will lose our changes. So, we need to save them.
Of course, we could use a hex editor for that, but why? We have OllyDBG, so let’s use it.
Use the context menu, choose Copy to executable -> All modifications, and hit the “Copy” button. See the next window:
Again, open the context menu, click “Save File,” and we are done.
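At the byte level, the patch made in OllyDBG comes down to a one-byte change: the short form of JE is encoded with opcode 0x74 and the short form of JMP with opcode 0xEB, and both are followed by the same one-byte relative displacement. The sketch below applies that change to a byte buffer. The offset and the displacement value in the demo are assumptions for illustration; in a real file, the offset has to be worked out from the virtual address and the section headers.

```python
JE_SHORT = 0x74   # opcode of JE/JZ with an 8-bit relative displacement
JMP_SHORT = 0xEB  # opcode of JMP with an 8-bit relative displacement


def patch_je_to_jmp(image, offset):
    """Return a copy of image with the JE SHORT at offset turned into JMP SHORT."""
    if image[offset] != JE_SHORT:
        raise ValueError("no JE SHORT opcode at offset %#x" % offset)
    patched = bytearray(image)
    patched[offset] = JMP_SHORT  # the displacement byte that follows is unchanged
    return bytes(patched)


# Illustrative fragment: CMP EDI, ESI (3B FE), then JE SHORT with a
# made-up displacement of 0x16.
fragment = bytes([0x3B, 0xFE, 0x74, 0x16])
print(patch_je_to_jmp(fragment, 2).hex())
```

Checking the original opcode before writing guards against patching the wrong offset, which is an easy mistake when working outside the debugger.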
The Second Approach – KEYGEN
Before writing a keygen program, we need to do some research and understand how our test program generates a serial number from “Name”. Let’s go back to the comparison command, CMP EDI, ESI, which is located just above the jump that we changed to the unconditional JMP 004250AB.
This command compares two serial numbers: the one generated by the program and ours. If we take a closer look, we will see a function call, CALL 004251A0, which calculates the serial number. How can we be certain about it? Well, we could set some breakpoints, go inside the function, observe the registers, and so on, but this is unnecessary. After CALL 004251A0, we see the MOV EDI, EAX command (almost all functions return their result in the EAX register) and, after that, CMP EDI, ESI. To look inside the function, select CALL 004251A0 and hit Enter.
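In higher-level terms, the instructions discussed so far correspond roughly to the sketch below. The function names are mine, and the arithmetic inside `compute_serial` is a placeholder standing in for CALL 004251A0, not the program's real algorithm. The `patched` flag models the effect of replacing JE with JMP.

```python
def compute_serial(name):
    """Stand-in for CALL 004251A0. Placeholder arithmetic, NOT the real algorithm."""
    return sum(ord(ch) for ch in name) & 0xFFFFFFFF


def try_it(name, entered_serial, patched=False):
    edi = compute_serial(name)  # CALL 004251A0 ; MOV EDI, EAX
    esi = entered_serial        # the serial number typed by the user
    if patched or edi == esi:   # CMP EDI, ESI ; JE (JMP after the patch)
        return "Hey, you have done it"
    return "That isn't it, keep on trying..."


print(try_it("Bob", 0))
print(try_it("Bob", 0, patched=True))
print(try_it("Bob", compute_serial("Bob")))
```

Seen this way, the two approaches differ only in where they attack the check: the patch removes the condition, while a keygen reimplements `compute_serial` so that the condition is satisfied honestly.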
Let’s analyze it. To set a breakpoint on CALL 004251A0, select it and press F2. Then run the test program (press F9). Input the name and serial number, click the “Try it” button, and execution stops at our function call.
If we look at the registers window, we will notice that the EAX register contains our name and the ECX register contains our serial number. Step into the code (press F7). Once we enter the function, we will see the PUSH EBX command (you can see that part on the screenshot, in the selected area).
We will try to analyze that part of the code and write a small key-generating program. Below you can see a description of every command in the function:
By the end of it, the EAX register will contain the real serial number for the name that we input.
I think that is enough to write a keygen. You can find the code of the keygen and the test program itself in my GitHub repository (Java and Delphi versions are available).
In the next article, we will take a look at how we can protect our applications.
Thank you, and take care!