In the previous post I promised I would give some background on reverse engineering, so here it comes:
“Reverse engineering, also called back engineering, is the process of extracting knowledge or design information from anything man-made and re-producing it or re-producing anything based on the extracted information. The process often involves disassembling something (a mechanical device, electronic component, computer program, or biological, chemical, or organic matter) and analyzing its components and workings in detail.” (Source: Wikipedia)
Software reverse engineering is the art (yes, ART!) of disassembling a program of any sort (game, utility, OS, malware, etc.) in order to gain a deep understanding of what it does, its architecture and its design, so that the reverser (the reverse engineer, from now on “the reverser”) can recreate the source code, bend the software to their own needs and hunt for vulnerabilities in the target program.
Today, software engineers of any background avoid developing software in technologies that are easy for the machine to understand, and use “high-level” programming (and scripting) languages instead. I’ll explain:
The closer a language is to human language, the more stages the code has to pass through before it becomes machine code, the raw instructions the CPU executes (assembly is the human-readable notation for those instructions). It’s a bit more complicated than that, and there are entire university courses on the matter (for those of you who want a deeper understanding, I recommend reading about compilation and interpretation in high-level and low-level programming languages), but for us this will do for now.
Programmers today mainly use high-level languages (Java, C#, etc.) and scripting languages (Python, JS, PHP, etc.). That has its benefits, which are beyond the scope of this post, but most programmers are just dumb and lazy: they lack the knowledge of how easy-to-read code becomes machine instructions, how the computer executes those instructions and how everything basically works beneath the surface (that’s the dumb and lazy part).
This is where hackers join in.
With a deep understanding of code compilation, CPU functionality and memory management, a hacker can gain full access to a machine by exploiting vulnerabilities at any stage of compilation and interpretation and by interfering with the software’s execution.
Reverse engineering skills are very important for hackers and security researchers (the “good” hackers), since vulnerabilities aren’t always transparent. One has to look behind the scenes in order to find them! And not all vulnerabilities are exploitable or worth exploiting. More on that later.
So after this long introduction to reverse engineering, I assume you understand why I am interested in becoming a reverser. But how does one become a reverser, you ask?
A reverse engineer has to know the ins and outs of software, and therefore has to practice “low-level” programming languages. I’ll focus on Intel x86 (32-bit) assembly, since it is easier to understand and much more “academic” than its successor, the x64 architecture. But before that I will practice some C/C++, which count as high-level languages (since they are more human-readable than machine-readable, haha), because so much software is built on them. Operating system kernels, for example, are developed mainly in C, with some assembly and C++.
I do not have to be a brilliant coder (that is not ma’ job!) but I do need some programming ability to understand the inner workings of the software I wish to reverse. Does it make sense? No? Deal with it.
I consulted my mentor, Cru5d3r, and he suggested the following:
- Learn to code
- Understand you have no clue what you’re doing
- Go back to square one
The binary-auditor suggested practicing C++ and x86 assembly, as I wrote in the previous post.
I am not going to teach any of these languages here. If you find that disappointing: sorry, I’m not sorry. I will, however, share with you the exercises I’ll do (from the binary-auditing kit, duh!) and walk you through them.
But first things first.
Set up a Windows 7 (32-bit) workstation. It is recommended to use a VM (virtual machine), since we will eventually run malware there, and quite honestly, who the hell wants to use Windows (any version) on the host anyway!?
I like Oracle VM VirtualBox, but you may use whatever you like.
Once the OS is installed, we’re gonna have to install some utilities such as Google Chrome (not gonna use IE whatsoever) and IDA Pro. The binary-auditor was kind enough to include a free version of IDA in the kit, but you can get a better one if you know where to look or are willing to spend some money.
Then we’re gonna need an IDE (Integrated Development Environment). This doesn’t have to be installed on the VM, not for now at least, since we’re gonna need it mainly for practicing C/C++ and not for the “real thing”. I have it installed on the host computer. Binary-auditor recommends using Code::Blocks, and so do I. It’s easy to use, very practical and cross-platform (to some extent). Make sure you set Code::Blocks to use a compiler that targets your host OS, or you won’t be able to execute the programs you write on the host (ELF on Linux and PE, the .exe files, on Windows are two different executable file formats).
For assembly we will need some other software but we’ll get to it.
I have downloaded an app to my phone called Learn C++ by SoloLearn, and I also work with several books and tutorials. If this is going to be your first programming language, I suggest you get someone to teach you, in a course or something. I do not recommend starting to learn coding from books, tutorials, YouTube vids or even Lynda. It will just mess you up. Luckily enough this isn’t my first language, so for me it’s OK to use the free resources lol!
That’s it for now!
In the next post we will get our hands dirty: I will demonstrate and walk you through the first real C++ exercise! How exciting is that??? What? No? Fuck off.