
Reverse engineering

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by SimonMorgan (talk | contribs) at 01:25, 29 December 2005 (spam domain). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Reverse engineering (RE) is the process of taking something (a mechanical device, an electrical component, a software program, etc.) apart and analyzing its workings in detail, usually with the intention to construct a new device or program that does the same thing without actually copying anything from the original. The verb form is to reverse-engineer, spelled with a hyphen.

A telling analogy is that research into physical laws can be seen as reverse-engineering the world itself.

Under United States law, reverse engineering a patented item can be infringement; however, if the artifact or process is protected by trade secrets instead of by a patent, then reverse engineering it is lawful as long as it was obtained legitimately. In fact, one common motivation for reverse engineering is to determine whether a competitor's product infringes on one's own patents.

Types and applications of RE

Militaries often use reverse engineering to copy other nations' technology, devices or information, obtained in whole or in part by regular troops in the field or through intelligence operations. It was used frequently during the Second World War and the Cold War. Well-known examples from WWII include:

  • Jerry can – British and American forces noticed that the Germans had gasoline cans with an excellent design. They reverse engineered copies of those cans. The cans were popularly known as Jerry cans.
  • Tupolev Tu-4 – A number of American B-29 bombers on missions over Japan were forced to land in the USSR. The Soviets, who did not have a similar strategic bomber, decided to copy the B-29. Within a few years they had developed the Tu-4, a near-perfect copy.

For mechanical components, modern reverse engineering techniques include laser scanning: laser beams are swept across the surface of a component of any shape to create a very precise image of that surface. The process produces a series of slices that, when combined, represent the surface of the object in a computer model or, using a 3D printer, can be used to build a physical model of the object.
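
How slices combine into a surface model can be illustrated with a minimal Python sketch. It assumes each slice is a list of (x, y) contour points and that slices are parallel and evenly spaced; the function names and the sample block are hypothetical, and real scanners produce far denser data that also needs alignment and noise filtering.

```python
from typing import List, Tuple

Point2D = Tuple[float, float]
Point3D = Tuple[float, float, float]


def stack_slices(slices: List[List[Point2D]], spacing: float) -> List[Point3D]:
    """Combine parallel 2D contour slices into one 3D point cloud.

    Slice i is assumed to lie in the plane z = i * spacing.
    """
    cloud: List[Point3D] = []
    for index, contour in enumerate(slices):
        z = index * spacing
        cloud.extend((x, y, z) for x, y in contour)
    return cloud


def bounding_box(cloud: List[Point3D]) -> Tuple[Point3D, Point3D]:
    """Return the (min corner, max corner) of the point cloud."""
    xs, ys, zs = zip(*cloud)
    return (min(xs), min(ys), min(zs)), (max(xs), max(ys), max(zs))


# Hypothetical scan: three square cross-sections of a block, 5 units apart.
square = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
cloud = stack_slices([square, square, square], spacing=5.0)
low, high = bounding_box(cloud)
```

The resulting point cloud is only the raw material; building a continuous surface from it requires a further meshing step that this sketch does not attempt.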

Reverse engineering of software or hardware systems for the purposes of interoperability (for example, to support undocumented file formats or undocumented hardware peripherals) is widely believed to be legal, though patent owners often contest this and attempt to stifle any reverse engineering of their products for any reason.

On a related note, black box testing in software engineering has much in common with reverse engineering. The tester usually has the API, but the goal is to find bugs and undocumented features by exercising the product from the outside.
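
A minimal Python sketch of this kind of outside-in probing follows. The function under test, its documented behaviour and the hidden special case are all invented for illustration; the point is only that the tester compares observed outputs against the documentation without reading the implementation.

```python
def legacy_shipping_cost(weight_kg: int) -> int:
    """Hypothetical system under test; the tester treats it as opaque.

    Documented behaviour (assumed for this example): cost is 5 per kilogram.
    """
    if weight_kg == 13:  # undocumented special case hidden from the tester
        return 0
    return 5 * weight_kg


def find_deviations(func, documented, inputs):
    """Call func on every input and report where it departs from the docs."""
    deviations = []
    for value in inputs:
        try:
            actual = func(value)
        except Exception as exc:  # a crash is also a finding
            deviations.append((value, repr(exc)))
            continue
        if actual != documented(value):
            deviations.append((value, actual))
    return deviations


findings = find_deviations(
    legacy_shipping_cost,
    documented=lambda w: 5 * w,
    inputs=range(0, 50),
)
print(findings)
```

Exhaustive probing only works for small input spaces; in practice testers sample inputs or concentrate on boundary values.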

Other purposes of reverse engineering include security auditing, removal of copy protection ("cracking"), circumvention of access restrictions often present in consumer electronics, customization of embedded systems (such as engine management systems), and pure curiosity.

Reverse engineering is also used by businesses to assess competitors' products. It is used, for instance, to analyze how a competitor's product works, what it does, who manufactures it and what components it consists of, as well as to estimate costs and identify potential patent infringement.

Value engineering is a related activity also used by business. It involves deconstructing and analysing products, but the objective is to find opportunities for cost cutting.

Finally, reverse engineering is often done because the documentation of a particular device has been lost (or was never written), and the person who built it no longer works at the company. Integrated circuits, in particular, have often been designed on obsolete, proprietary systems, which means that the only way to incorporate their functionality into new technology is to reverse-engineer the existing chip and then re-design it.

Reverse engineering of software

The term "reverse engineering" as applied to software means different things to different people, prompting Chikofsky and Cross to write a paper researching the various uses and defining a taxonomy. From their paper: "Reverse engineering is the process of analyzing a subject system to create representations of the system at a higher level of abstraction."

It can also be seen as "going backwards through the development cycle". In this model, the output of the implementation phase (in source code form) is reverse engineered back to the analysis phase, in an inversion of the traditional waterfall model.

Reverse engineering is a process of examination only: the software system under consideration is not modified (which would make it reengineering).

In practice, two main types of reverse engineering emerge. In the first case, source code is already available for the software, and the aim is to discover higher-level aspects of the program that are poorly documented or whose documentation is no longer valid. In the second case, no source code is available, and any effort towards recovering one possible source code for the software is regarded as reverse engineering. This second usage of the term is the one most people are familiar with.
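
The first case can be illustrated with a small Python sketch that recovers a structural outline (classes and functions) from source code using the standard `ast` module. The `outline` helper and the sample program are hypothetical; real design-recovery tools extract much richer information, such as call graphs and data dependencies.

```python
import ast


def outline(source: str) -> list:
    """Return an indented outline of the classes and functions in source."""
    lines = []

    def visit(node, depth):
        for child in ast.iter_child_nodes(node):
            if isinstance(child, ast.ClassDef):
                lines.append("  " * depth + "class " + child.name)
                visit(child, depth + 1)
            elif isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef)):
                args = ", ".join(a.arg for a in child.args.args)
                lines.append("  " * depth + "def " + child.name + "(" + args + ")")
                visit(child, depth + 1)
            else:
                # Other nodes (if-blocks, loops, ...) may contain definitions.
                visit(child, depth)

    visit(ast.parse(source), 0)
    return lines


# Hypothetical undocumented program whose structure we want to recover.
SAMPLE = '''
class Account:
    def __init__(self, owner):
        self.owner = owner
        self.balance = 0

    def deposit(self, amount):
        self.balance += amount

def transfer(source, target, amount):
    source.balance -= amount
    target.deposit(amount)
'''

structure = outline(SAMPLE)
print("\n".join(structure))
```

The outline is a representation at a higher level of abstraction than the code itself, and producing it leaves the analyzed program unmodified.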

Binary software

Reverse engineering of software for which no source code is available is sometimes termed Reverse Code Engineering (RCE). As an example, binaries for the Java platform can be reverse engineered using ArgoUML. One famous case of reverse engineering was the first non-IBM implementation of the PC BIOS, which launched the historic PC clone industry.

In the United States, the Digital Millennium Copyright Act exempts from its circumvention ban some acts of reverse engineering aimed at interoperability of file formats and protocols (17 U.S.C. § 1201(f)), but judges in key cases have disregarded this exemption, reasoning that it is acceptable to circumvent restrictions on use, but not restrictions on access.

The Samba software, which allows systems that are not running Microsoft Windows to share files with systems that are, is a classic example of software reverse engineering: the Samba project had to reverse-engineer unpublished information about how Windows file sharing worked, so that non-Windows computers could emulate it. The WINE project does the same thing for the Windows API, and OpenOffice.org is one party doing this for the Microsoft Office file formats.
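
Recovering an unpublished format from observation typically means forming hypotheses about the layout and checking them against captured messages. The Python sketch below illustrates the idea on made-up frames (not real Windows file-sharing traffic) by testing which candidate length-prefix encoding is consistent with every sample; the frame contents, the candidate list and the helper name are all assumptions for illustration.

```python
import struct

# Made-up captured frames, for illustration only.
CAPTURED = [
    bytes.fromhex("0005") + b"hello",
    bytes.fromhex("0002") + b"ok",
    bytes.fromhex("000b") + b"list shares",
]


def fits_length_prefix(frame: bytes, fmt: str) -> bool:
    """True if the frame starts with a length field matching the remainder."""
    size = struct.calcsize(fmt)
    if len(frame) < size:
        return False
    (declared,) = struct.unpack(fmt, frame[:size])
    return declared == len(frame) - size


# Candidate layouts, expressed as struct format strings.
HYPOTHESES = {
    "16-bit big-endian length": ">H",
    "16-bit little-endian length": "<H",
    "32-bit big-endian length": ">I",
}

# Keep only the hypotheses that hold for every captured frame.
consistent = [
    name
    for name, fmt in HYPOTHESES.items()
    if all(fits_length_prefix(frame, fmt) for frame in CAPTURED)
]
print(consistent)
```

A hypothesis that survives every sample is still only a hypothesis; more captures, especially of unusual messages, are needed before relying on it.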

Binary software: techniques

Reverse engineering of software can be accomplished by various methods. The three main groups of techniques are:

  1. Analysis through observation of information exchange, most prevalent in protocol reverse engineering, which involves using tools such as bus analyzers and packet sniffers to listen in on a computer bus or network connection and reveal the underlying traffic. Behaviour on the bus or network can then be analyzed to produce a stand-alone implementation that mimics the same behaviour. This is especially useful for reverse engineering device drivers. Reverse engineering of embedded systems is sometimes greatly helped by tools deliberately introduced by the manufacturer, such as JTAG ports or other debugging means.
  2. Disassembly using a disassembler, meaning the raw machine language of the program is read and understood in its own terms, only with the aid of machine language mnemonics. This works on any computer program but can take quite some time, especially for someone not used to machine code.
  3. Decompilation using a decompiler, a process that tries, with varying results, to recreate the source code in some high-level language for a program only available in machine code or bytecode.
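
As a small illustration of disassembly applied to bytecode rather than native machine code, Python's standard `dis` module can list the instructions of a compiled function with their mnemonics. The function `add_scaled` is a hypothetical example, and the exact opcode names depend on the Python version, so the listing should be read as a sketch rather than a fixed reference.

```python
import dis


def add_scaled(a, b):
    return a + 2 * b


# Decode the function's compiled bytecode into individual instructions.
instructions = list(dis.get_instructions(add_scaled))

# Print each instruction as: byte offset, mnemonic, human-readable argument.
for ins in instructions:
    print(ins.offset, ins.opname, ins.argrepr)

opnames = [ins.opname for ins in instructions]
```

Reading such a listing back into the expression `a + 2 * b` by hand is disassembly-based analysis; a decompiler attempts to automate that last step.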
