Binary Code Analysis Tools

Linux tools

GDB

The GNU Debugger (GDB) is good for more than debugging buggy applications. It can also be used to learn about a program’s control flow, change that control flow, and modify the code, registers, and data structures. These tasks are common for a hacker who is working to exploit a software vulnerability or unraveling the inner workings of a sophisticated virus. GDB works on ELF binaries and Linux processes. It is an essential tool for Linux hackers.

Objdump from GNU binutils

Object dump (objdump) is a simple and clean solution for a quick disassembly of code. It is great for disassembling simple and untampered binaries, but it shows its limitations quickly when used for any truly challenging reverse engineering task, especially against hostile software. Its primary weakness is that it relies on the ELF section headers and doesn’t perform control flow analysis, and both limitations greatly reduce its robustness. As a result, it may fail to correctly disassemble the code within a binary, or fail to open the binary at all if there are no section headers. For many conventional tasks, however, it should suffice, such as when disassembling common binaries that are not fortified, stripped, or obfuscated in any way. It can read all common ELF types. Here are some common examples of how to use objdump:

View all data/code in every section of an ELF file:

objdump -D OBJECT

View only program code in an ELF file:

objdump -d OBJECT

View all symbols:

objdump -tT OBJECT
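
Because objdump depends on the section headers, it can be useful to check whether a binary still has a section header table before trying to disassemble it. The following is a minimal Python sketch of such a check, not a description of how objdump works internally. The function name section_header_info is a hypothetical helper, and the code assumes the input begins with a standard 32-bit or 64-bit ELF file header:

```python
import struct

ELF_MAGIC = b"\x7fELF"


def section_header_info(data: bytes) -> dict:
    """Report where (and whether) the section header table is located.

    Hypothetical helper for illustration; it reads only the ELF file header.
    """
    if len(data) < 16 or data[:4] != ELF_MAGIC:
        raise ValueError("not an ELF file")
    is_64 = data[4] == 2                    # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    endian = "<" if data[5] == 1 else ">"   # EI_DATA: 1 = little, 2 = big endian
    # Fields that follow the 16-byte e_ident array, in file order.
    fmt = endian + ("HHIQQQIHHHHHH" if is_64 else "HHIIIIIHHHHHH")
    if len(data) < 16 + struct.calcsize(fmt):
        raise ValueError("truncated ELF header")
    fields = struct.unpack_from(fmt, data, 16)
    e_shoff, e_shnum = fields[5], fields[11]
    return {
        "class_bits": 64 if is_64 else 32,
        "e_shoff": e_shoff,
        "e_shnum": e_shnum,
        # e_shnum can legitimately be 0 when extended section numbering is
        # used, so a zero e_shoff is treated as the primary signal here.
        "has_section_headers": e_shoff != 0,
    }
```

For a file on disk, the helper could be called as section_header_info(open(path, "rb").read()). A result with has_section_headers set to False suggests that a section-header-driven tool is unlikely to be of much help with that binary.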

Objcopy from GNU binutils

Object copy (objcopy) is an incredibly powerful little tool that cannot be summarized with a simple synopsis. I recommend that you read the manual pages for a complete description. Objcopy can be used to analyze and modify ELF objects of any kind, although some of its features are specific to certain types of ELF objects. Objcopy is often used to modify an ELF section, or to copy one to or from an ELF binary.

To copy the .data section from an ELF object to a file, use this line:

objcopy --only-section=.data INFILE OUTFILE
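
To make the idea of copying a section out of an ELF object more concrete, here is a minimal Python sketch that walks the section header table and returns the raw file bytes of each section by name. It is an illustration under stated assumptions, not objcopy's implementation: read_sections is a hypothetical helper, the file is assumed to have an intact section header table, and extended section numbering is not handled. Its output is raw section contents, which is roughly what objcopy produces when -O binary is added, rather than a new ELF file:

```python
import struct

SHT_NOBITS = 8  # section type that occupies no space in the file (e.g. .bss)


def read_sections(data: bytes) -> dict:
    """Map each section name to its raw bytes in the file (hypothetical helper)."""
    if len(data) < 16 or data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is_64 = data[4] == 2                    # EI_CLASS
    endian = "<" if data[5] == 1 else ">"   # EI_DATA
    ehdr_fmt = endian + ("HHIQQQIHHHHHH" if is_64 else "HHIIIIIHHHHHH")
    if len(data) < 16 + struct.calcsize(ehdr_fmt):
        raise ValueError("truncated ELF header")
    ehdr = struct.unpack_from(ehdr_fmt, data, 16)
    e_shoff, e_shentsize, e_shnum, e_shstrndx = ehdr[5], ehdr[10], ehdr[11], ehdr[12]
    if e_shoff == 0 or e_shnum == 0:
        raise ValueError("no section header table")
    if e_shstrndx >= e_shnum:
        raise ValueError("invalid section name string table index")

    # Both ELF classes use the same field order; only the field widths differ.
    shdr_fmt = endian + ("IIQQQQIIQQ" if is_64 else "IIIIIIIIII")
    headers = [
        struct.unpack_from(shdr_fmt, data, e_shoff + i * e_shentsize)
        for i in range(e_shnum)
    ]

    # Section names are NUL-terminated strings stored in the e_shstrndx section.
    str_off, str_size = headers[e_shstrndx][4], headers[e_shstrndx][5]
    strtab = data[str_off:str_off + str_size]

    sections = {}
    for sh_name, sh_type, _flags, _addr, sh_offset, sh_size, *_rest in headers:
        end = strtab.find(b"\0", sh_name)
        if end == -1:
            end = len(strtab)
        name = strtab[sh_name:end].decode("ascii", "replace")
        if sh_type == SHT_NOBITS:
            sections[name] = b""
        else:
            sections[name] = data[sh_offset:sh_offset + sh_size]
    return sections
```

With this helper, the contents of .data could be written out with something like open("data.bin", "wb").write(read_sections(open(path, "rb").read())[".data"]).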

strace

System call trace (strace) is a tool based on the ptrace(2) system call. It uses the PTRACE_SYSCALL request in a loop to show the system call (syscall) activity of a running program, as well as the signals that are caught during execution. It can be highly useful for debugging, or simply for collecting information about which syscalls are made at runtime.

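The following strace command traces a basic program and writes the output to the file “ls.out”. The -o option must come before the program name; otherwise it is passed to the traced program as an argument:

strace -o ls.out /bin/ls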

The strace command used to attach to an existing process is as follows:

strace -p PID -o daemon.out

The initial output will show you the file descriptor number of each system call that takes a file descriptor as an argument, such as this:

SYS_read(3, buf, sizeof(buf));

If you want to see all of the data that was read from file descriptor 3, you can run the following command:

strace -e read=3 /bin/ls

You may also use -e write=fd to see written data. The strace tool is a great little tool, and you will undoubtedly find many reasons to use it.
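
Once strace output has been saved with -o, a short script can summarize which syscalls a program made. The following is a minimal Python sketch; count_syscalls is a hypothetical helper, and it assumes plain output lines of the form name(args) = result, optionally prefixed by a PID, with no timestamp options in use:

```python
import re
from collections import Counter

# An optional PID prefix ("1234 " or "[pid 1234] "), then the syscall name
# followed by "(". Signal lines ("--- SIGCHLD ... ---"), exit lines
# ("+++ exited with 0 +++") and "<... resumed>" lines do not match.
SYSCALL_RE = re.compile(r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?([a-z_][a-z0-9_]*)\(")


def count_syscalls(lines) -> Counter:
    """Tally syscall names from an iterable of strace output lines."""
    counts = Counter()
    for line in lines:
        match = SYSCALL_RE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

For example, with open("ls.out") as f: print(count_syscalls(f).most_common(5)) would list the five most frequent syscalls in the trace.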

ltrace

Library trace (ltrace) is another neat little tool, and it is very similar to strace. It works in much the same way, but it parses the shared library linking information of a program and prints the library functions being used.

Basic ltrace command, with the options placed before the program name:

ltrace -o program.out PROGRAM

You may see system calls in addition to library function calls with the -S flag. The ltrace command is designed to give more granular information, since it parses the dynamic segment of the executable and prints actual symbols/functions from shared and static libraries.

ftrace

Function trace (ftrace) is similar to ltrace, but it also shows calls to functions within the binary itself. This tool can be found on GitHub.

readelf

The readelf command is one of the most useful tools around for dissecting ELF binaries. It provides every bit of the ELF-specific data necessary for gathering information about an object before reverse engineering it. This tool will be used often throughout the book to gather information about symbols, segments, sections, relocation entries, dynamic linking of data, and more. The readelf command is the Swiss Army knife of ELF. Here are a few of its most commonly used flags:

To retrieve a section header table:

readelf -S OBJECT

To retrieve a symbol table:

readelf -s OBJECT

To retrieve the ELF file header data together with the program headers and section headers (use -h for the ELF file header alone):

readelf -e OBJECT

To retrieve relocation entries:

readelf -r OBJECT

To retrieve the dynamic segment:

readelf -d OBJECT
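
To give a sense of the kind of data readelf reports from the ELF file header, here is a minimal Python sketch that decodes a few header fields. It is an illustration only: parse_elf_header is a hypothetical helper, the type and machine tables cover just a handful of common values, and the output is not meant to match readelf's formatting:

```python
import struct

E_TYPES = {0: "NONE", 1: "REL", 2: "EXEC", 3: "DYN", 4: "CORE"}
# A small subset of e_machine values, enough for illustration.
E_MACHINES = {3: "Intel 80386", 40: "ARM", 62: "x86-64", 183: "AArch64"}


def parse_elf_header(data: bytes) -> dict:
    """Decode selected fields of the ELF file header (hypothetical helper)."""
    if len(data) < 16 or data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is_64 = data[4] == 2                    # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    endian = "<" if data[5] == 1 else ">"   # EI_DATA: 1 = little, 2 = big endian
    fmt = endian + ("HHIQQQIHHHHHH" if is_64 else "HHIIIIIHHHHHH")
    if len(data) < 16 + struct.calcsize(fmt):
        raise ValueError("truncated ELF header")
    (e_type, e_machine, _version, e_entry, _phoff, _shoff, _flags,
     _ehsize, _phentsize, e_phnum, _shentsize, e_shnum,
     _shstrndx) = struct.unpack_from(fmt, data, 16)
    return {
        "class": "ELF64" if is_64 else "ELF32",
        "data": "little endian" if endian == "<" else "big endian",
        "type": E_TYPES.get(e_type, hex(e_type)),
        "machine": E_MACHINES.get(e_machine, hex(e_machine)),
        "entry": e_entry,
        "program_headers": e_phnum,
        "section_headers": e_shnum,
    }
```

Reading only the first 64 bytes of a file is enough for this helper, for example parse_elf_header(open(path, "rb").read(64)).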

ERESI – The ELF Reverse Engineering System Interface

The ERESI project (http://www.eresi-project.org) contains a suite of tools that are a Linux binary hacker’s dream. Unfortunately, many of them are not kept up to date and aren’t fully compatible with 64-bit Linux. They do exist for a variety of architectures, however, and are undoubtedly the most innovative single collection of tools for hacking ELF binaries that exists today. There are two Phrack articles that demonstrate the innovation and powerful features of the ERESI tools:

Cerberus ELF interface (http://www.phrack.org/archives/issues/61/8.txt)

Embedded ELF debugging (http://www.phrack.org/archives/issues/63/9.txt)

The ERESI project has moved from eresi-project.org to GitHub

https://www.packtpub.com/big-data-and-business-intelligence/learning-linux-binary-analysis
