Reversing: Secrets of Reverse Engineering

Author: Eldad Eilam, Elliot J. Chikofsky
All Stack Overflow 15
This Year Stack Overflow 3
This Month Stack Overflow 1


by anonymous   2017-08-20

(I don't know about you but I was excited with assembly)

A simple tool for experimenting with assembly is already installed in your pc.

Go to Start menu->Run, and type debug

debug (command)

debug is a command in DOS, MS-DOS, OS/2 and Microsoft Windows (only x86 versions, not x64) which runs the program debug.exe (or DEBUG.COM in older versions of DOS). Debug can act as an assembler, disassembler, or hex dump program allowing users to interactively examine memory contents (in assembly language, hexadecimal or ASCII), make changes, and selectively execute COM, EXE and other file types. It also has several subcommands which are used to access specific disk sectors, I/O ports and memory addresses. MS-DOS Debug runs at a 16-bit process level and therefore it is limited to 16-bit computer programs. FreeDOS Debug has a "DEBUGX" version supporting 32-bit DPMI programs as well.


If you want to understand the code you see in IDA Pro (or OllyDbg), you'll need to learn how compiled code is structured. I recommend the book Reversing: Secrets of Reverse Engineering

I experimented a couple of weeks with debug when I started learning assembly (15 years ago).
Note that debug works at the base machine level, there are no high level assembly commands.

And now a simple example:

Give a to start writing assembly code - type the below program - and finally give g to run it.

alt text

(INT 21 display on screen the ASCII char stored in the DL register if the AH register is set to 2 -- INT 20 terminates the program)

by anonymous   2017-08-20

"In WinDbg I can set breakpoints, but it is difficult for me to imagine at what point to set a breakpoint and at what memory location. Similarly, when I view the static code in IDA Pro, I'm not sure where to even begin to find the function or datastructure that represents the mine field."


Well, you can look for routines like random() that will be called during the construction of the mines table. This book helped me a lot when I was experimenting with reverse engineering. :)

In general, good places for setting break points are calls to message boxes, calls to play a sound, timers and other win32 API routines.

BTW, I am scanning minesweeper right now with OllyDbg.

Update: nemo reminded me a great tool, Cheat Engine by Eric "Dark Byte" Heijnen.

Cheat Engine (CE) is a great tool for watching and modifying other processes memory space. Beyond that basic facility, CE has more special features like viewing the disassembled memory of a process and injecting code into other processes.

(the real value of that project is that you can download the source code -Delphi- and see how those mechanisms were implemented - I did that many years ago :o)

by Simucal   2017-08-20

I've discussed why I don't think Obfuscation is an effective means of protection against cracking here:
Protect .NET Code from reverse engineering

However, your question is specifically about source theft, which is an interesting topic. In Eldad Eiliams book, "Reversing: Secrets of Reverse Engineering", the author discusses source theft as one reason behind reverse engineering in the first two chapters.

Basically, what it comes down to is the only chance you have of being targeted for source theft is if you have some very specific, hard to engineer, algorithm related to your domain that gives you a leg up on your competition. This is just about the only time it would be cost-effective to attempt to reverse engineer a small portion of your application.

So, unless you have some top-secret algorithm you don't want your competition to have, you don't need to worry about source theft. The cost involved with reversing any significant amount of source-code out of your application quickly exceeds the cost of re-writing it from scratch.

Even if you do have some algorithm you don't want them to have, there isn't much you can do to stop determined and skilled individuals from getting it anyway (if the application is executing on their machine).

Some common anti-reversing measures are:

  • Obfuscating - Doesn't do much in terms of protecting your source or preventing it from being cracked. But we might as well not make it totally easy, right?
  • 3rd Party Packers - Themida is one of the better ones. Packs an executable into an encrypted win32 application. Prevents reflection if the application is a .NET app as well.
  • Custom Packers - Sometimes writing your own packer if you have the skill to do so is effective because there is very little information in the cracking scene about how to unpack your application. This can stop inexperienced RE's. This tutorial gives some good information on writing your own packer.
  • Keep industry secret algorithms off the users machine. Execute them as a remove service so the instructions are never executed locally. The only "fool-proof" method of protection.

However, packers can be unpacked, and obfuscation doesn't really hinder those who want to see what you application is doing. If the program is run on the users machine then it is vulnerable.

Eventually its code must be executed as machine code and it is normally a matter of firing up debugger, setting a few breakpoints and monitoring the instructions being executed during the relevant action and some time spent poring over this data.

You mentioned that it took you several months to write ~20kLOC for your application. It would take almost an order of magnitude longer to reverse those equivalent 20kLOC from your application into workable source if you took the bare minimum precautions.

This is why it is only cost-effective to reverse small, industry specific algorithms from your application. Anything else and it isn't worth it.

Take the following fictionalized example: Lets say I just developed a brand new competing application for iTunes that had a ton of bells and whistles. Let say it took several 100k LOC and 2 years to develop. One key feature I have is a new way of serving up music to you based off your music-listening taste.

Apple (being the pirates they are) gets wind of this and decides they really like your music suggest feature so they decide to reverse it. They will then hone-in on only that algorithm and the reverse engineers will eventually come up with a workable algorithm that serves up the equivalent suggestions given the same data. Then they implement said algorithm in their own application, call it "Genius" and make their next 10 trillion dollars.

That is how source theft goes down.

No one would sit there and reverse all 100k LOC to steal significant chunks of your compiled application. It would simply be too costly and too time consuming. About 90% of the time they would be reversing boring, non-industry-secretive code that simply handled button presses or handled user input. Instead, they could hire developers of their own to re-write most of it from scratch for less money and simply reverse the important algorithms that are difficult to engineer and that give you an edge (ie, music suggest feature).

by anonymous   2017-08-20

This is not an easy task and might require tools other than gdb. I read a couple interesting RE tutorials and even though they are not specific to OSX they still provide interesting insight and examples on deciphering function parameters:

Reversing (Undocumented) Windows API Functions

Matt's Cracking Guide

Secrets of Reverse Engineering: Appendix C - Deciphering Program Data

Reverse Engineering and Function Calling by Address

by anonymous   2017-08-20

I like OllyDbg. (with a good companion :)

by anonymous   2017-08-20

That instruction simply pushes 32-bit constant (0x804a254) in the stack.

That instruction alone is not enough for us to tell how it is later used. Could you provide more dissasembly of the code? Especially I would like to see where this value is popped out, and how this value is later being used.

Before starting any reverse engineering I would recommend reading this book (Reverse Engineering secrets) and then X86 instruction set manual (Intel or AMD). I am assuming that you are Reverse Engineering for x86 CPU.