
RE not for run-of-the-mill developers

CIOL Bureau

09 Jun 2004 00:00 IST

Updated On 09 Jun 2004 21:15 IST


Anoop Srivastava, FCS Software Solutions

What is Reverse Engineering?

A programming language is a notation that lets us write programs a computer can understand. An application is any compiled program written with the aid of a programming language. Reverse engineering is the de-compilation of an application, regardless of the programming language used to create it, in order to recover its source code or any part of it. The reverse engineer can re-use this code in his own programs or modify an existing (already compiled) program to behave differently. He can use the knowledge gained from reverse engineering to correct errors in application programs, also known as bugs. Most important, one can get extremely useful ideas by observing how other programmers work and think, and thus improve one's own skills and knowledge.

When we hear the words reverse engineering, what comes to mind is cracking. Cracking is as old as programs themselves. To crack a program means to trace and use a serial number, or any other sort of registration information, required for the proper operation of the program. So if a shareware program (freely distributed, but with some inconveniences such as crippled functions, nag screens or limited capabilities) requires valid registration information, a reverse engineer can obtain that information by de-compiling a particular part of the program. Software corporations have many times accused others of reverse engineering their products and stealing technology and knowledge. Reverse engineering is not limited to computer applications; the same happens with cars, weapons, hi-fi components and so on.

Forms of Reverse Engineering

Although there are many forms of reverse engineering, the common goal is to extract information from existing software systems in order to understand them better. The subject software system is represented in a form in which many of its structural and functional characteristics can be analyzed. This knowledge can then be used to improve subsequent development, ease maintenance and re-engineering, and aid project management. It can also help defend against brittle software systems that resist graceful change. Problems can be exposed and corrected if reverse engineering is applied preventatively as the system evolves. As maintenance and re-engineering costs for large legacy software systems increase, the importance of reverse engineering will grow accordingly.

The reverse engineering process involves two distinct phases. The first identifies the system's current components and captures their dependencies; the second discovers design information and generates system abstractions. The second, discovery phase of reverse engineering is a highly interactive and cognitive activity. The user may build up hierarchical subsystem components that embody software-engineering principles such as low coupling and high cohesion. Discovery can also include the reconstruction of design and requirements specifications (often referred to as the "domain model") and the correlation of this model to the code.
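As a rough illustration of the two phases, the sketch below works on Python source text rather than a compiled binary. The module names and their contents are hypothetical, and counting how many modules depend on each one is only one crude stand-in for the "system abstractions" described above.

```python
import ast

# Hypothetical legacy system: module name -> source text (made up for this sketch).
SOURCES = {
    "billing": "import db\nimport tax\n",
    "tax": "import db\n",
    "db": "import sqlite3\n",
    "report": "from billing import total\n",
}

def dependencies(sources):
    """Phase one: map each module to the internal modules it imports."""
    graph = {}
    for name, text in sources.items():
        imported = set()
        for node in ast.walk(ast.parse(text)):
            if isinstance(node, ast.Import):
                imported.update(alias.name.split(".")[0] for alias in node.names)
            elif isinstance(node, ast.ImportFrom) and node.module:
                imported.add(node.module.split(".")[0])
        # Keep only imports that refer to components of the system itself.
        graph[name] = sorted(m for m in imported if m in sources)
    return graph

def fan_in(graph):
    """Phase two (crude abstraction): count how many modules depend on each one."""
    counts = {name: 0 for name in graph}
    for deps in graph.values():
        for dep in deps:
            counts[dep] += 1
    return counts

graph = dependencies(SOURCES)
print(graph)
print(fan_in(graph))
```

A module with a high count is one that many others rely on, which makes it a natural candidate for the core of a subsystem when grouping components by coupling.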

Applications of Reverse Engineering


Recovery of source from a compiled binary when the original source is lost

Debugging "release" code

Examining generated code for compiler output verification

Locating patch points for Hot-Fixes

Stealing Intellectual Property

Discovering undocumented/hidden APIs

Security analysis and Exploit hunting

Cracking / circumventing Copy Protection

Learning Undocumented File Formats

Digital Forensics

Techniques and Approaches

Although it may sound difficult at first, reverse engineering is simple in principle: when de-compiling a program, the engineer is reading the programmer’s thoughts and trying to make sense of them. What it does require is a very good knowledge of computer organization (and I mean right down to processor architecture, memory organization, system buses and registers), a deep understanding of data structures, very good programming acumen, a good knowledge of assembly language and a deep understanding of GUI programming. There are no manuals that can tell you how to reverse engineer a program, because nothing generic is possible in reverse engineering; there is no single recipe. One could claim that the number of techniques required to reverse all existing programs equals the number of programs that exist! Some very basic techniques are nevertheless listed below.

Dead Listing
- Disassembly of binary yielding some low level code (assembly language) whose structure is then examined to decipher what operation is being performed.
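A minimal sketch of the idea, using Python bytecode as a stand-in for native machine code: the standard `dis` module produces a low-level listing of a function without ever running it. The `check_serial` routine is a hypothetical target invented for this example; a real dead listing would come from a disassembler working on the compiled binary.

```python
import dis

def check_serial(serial):
    # Hypothetical target: stands in for a routine whose source we do not have.
    return sum(ord(c) for c in serial) % 7 == 3

def dead_listing(func):
    """Return (opcode name, argument text) pairs without executing func."""
    return [(ins.opname, ins.argrepr) for ins in dis.get_instructions(func)]

listing = dead_listing(check_serial)
for opname, argrepr in listing:
    print(f"{opname:<24}{argrepr}")
```

Reading the listing top to bottom, the analyst looks for the arithmetic and comparison instructions and works back to what the routine tests. The exact opcode names depend on the Python version.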


Live Tracing
- Tracing execution as the target is running and watching data structures change.
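The same idea can be sketched in Python with `sys.settrace`, which calls a hook as each line of the target executes so that its local variables can be recorded. The `checksum` routine is a hypothetical target; against a native binary this role is played by a debugger.

```python
import sys

def checksum(data):
    # Hypothetical target routine whose internal state we want to watch.
    total = 0
    for byte in data:
        total = (total * 31 + byte) % 251
    return total

def trace_locals(func, *args):
    """Run func under a tracer and record its locals at each executed line."""
    history = []

    def tracer(frame, event, arg):
        if frame.f_code is not func.__code__:
            return None  # ignore frames that do not belong to the target
        if event == "line":
            # Copy the locals, since the frame keeps changing as it runs.
            history.append((frame.f_lineno, dict(frame.f_locals)))
        return tracer

    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(None)  # always remove the hook again
    return result, history

result, history = trace_locals(checksum, b"abc")
for lineno, local_vars in history:
    print(lineno, local_vars)
```

Each recorded entry shows the state just before the given line runs, so watching `total` change from entry to entry reveals how the running value is built up.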


Behavioral
- Examining how the target interacts with its environment (Operating System, registry, file system, other system components).
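As one narrow example, limited to the file system, the sketch below runs a target inside a scratch directory and reports which files it created, removed or resized. The `target` function and the file name `licence.dat` are hypothetical; real behavioral analysis would also watch the registry, network and system calls with dedicated monitoring tools.

```python
import os
import tempfile

def snapshot(directory):
    """Map each file name in directory to its size in bytes."""
    return {name: os.path.getsize(os.path.join(directory, name))
            for name in os.listdir(directory)}

def observe(target, directory):
    """Run target and report which files it created, removed, or resized."""
    before = snapshot(directory)
    target(directory)
    after = snapshot(directory)
    common = before.keys() & after.keys()
    return {
        "created": sorted(after.keys() - before.keys()),
        "removed": sorted(before.keys() - after.keys()),
        "resized": sorted(n for n in common if before[n] != after[n]),
    }

def target(directory):
    # Hypothetical black-box program: quietly writes a licence file.
    with open(os.path.join(directory, "licence.dat"), "w") as fh:
        fh.write("trial=30")

with tempfile.TemporaryDirectory() as workdir:
    report = observe(target, workdir)
print(report)
```

Nothing here looks inside the target at all; the analyst learns only from the traces it leaves in its environment.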


Differential
- Comparing consecutive snapshots to discover changes and hypothesize an algorithm, which is derived and verified in a recursive process.
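A minimal sketch of the approach, assuming three hypothetical snapshots of a saved-game file taken when the on-screen score read 100, 250 and 300. The byte values are invented for illustration; the point is the loop of diffing, forming a hypothesis and checking it against the next snapshot.

```python
def diff_offsets(before, after):
    """Return the offsets at which two equal-length snapshots differ."""
    if len(before) != len(after):
        raise ValueError("snapshots must have the same length")
    return [i for i, (a, b) in enumerate(zip(before, after)) if a != b]

# Hypothetical snapshots, taken when the displayed score was 100, 250 and 300.
snap_100 = bytes([0x47, 0x41, 0x4D, 0x45, 0x64, 0x00, 0x03, 0x00])
snap_250 = bytes([0x47, 0x41, 0x4D, 0x45, 0xFA, 0x00, 0x03, 0x00])
snap_300 = bytes([0x47, 0x41, 0x4D, 0x45, 0x2C, 0x01, 0x03, 0x00])

# Step 1: the first diff suggests the score lives at offset 4.
first = diff_offsets(snap_100, snap_250)

# Step 2: the next diff also touches offset 5, so refine the hypothesis
# to a 16-bit little-endian field starting at offset 4.
second = diff_offsets(snap_250, snap_300)

def read_score(snapshot):
    """Decode the score under the refined hypothesis."""
    return int.from_bytes(snapshot[4:6], "little")

# Step 3: verify the hypothesis against every snapshot.
scores = [read_score(s) for s in (snap_100, snap_250, snap_300)]
print(first, second, scores)
```

If a later snapshot contradicted the decoded values, the hypothesis would be revised and the comparison repeated.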


Each of these methods could justify a separate article, but keep in mind that in most cases just one of the above techniques is not sufficient. In fact it may be grossly inadequate, so a combination of these techniques is applied, and even that might not be enough. Many tools are available for reverse engineering, but in the end success depends on the ability, intelligence, creativity and determination of the engineer.


*The views expressed in this column are those of the author and do not reflect those of the organization.
