Reverse-Engineering
Books
- The IDA Pro Book, 2nd Edition by Chris Eagle
- Reverse Engineering Code with IDA Pro by Dan Kaminsky et al
- Practical Malware Analysis by Michael Sikorski
- Reversing: Secrets of Reverse Engineering by Eldad Eilam
- Crackproof Your Software by Pavol Cerven
- Surreptitious Software: Obfuscation, Watermarking, and Tamperproofing for Software Protection
- Wikibooks Subject:Software_reverse_engineering
- Reverse Engineering for Beginners, free, by @yurichev
Resources
- Reverse-Engineering on StackExchange
- OpenRCE
- Hex Blog
- http://www.reverse-engineering.info
- Automating RE with Python (slides) by Carlos Prado
- Intel® 64 and IA-32 Architectures Software Developer Manuals
- Technology survey
- Going Maths
- Python Arsenal for Reverse Engineering
- corkami, x86 oddities etc
- Intel Intrinsics Guide
Static Analysis Tools
IDA Pro
REC Studio
- x86, x64
- Windows, Linux, Mac OS X
- HLA disassembler
Useful commands:
help
strings
calltree
showprocs
decompile /tmp/myprog.c
Click on a function in the "Project" function list to disassemble it to HLA.
Hopper
- Intel (32 and 64bits), and ARM (ARMv6, ARMv7 and ARM64) processors
- Mach-O binaries (Mac and iOS), PE32/32+/64 Windows binaries and ELF binaries
- decompiler
- debugger
- patcher
Capstone
- ARM, ARM64 (ARMv8), Mips, PowerPC, Sparc, SystemZ & Intel
Radare
The reverse engineering framework
Bokken
GUI
git repo synced with mercurial repo
Amoco
Amoco is a python package dedicated to the (static) analysis of binaries
Very young but promising, seems easy to add an arch
With BBL symbolic execution
Miasm
Miasm is a free and open source (GPLv2) reverse engineering framework. Miasm aims at analyzing/modifying/generating binary programs.
- Opening/modifying/generating PE/ELF 32/64 le/be using Elfesteem
- Assembling/Disassembling ia32/ppc/arm
- Representing assembly semantic using intermediate language
- Emulating using jit (dynamic code analysis, unpacking, ...)
- Expression simplification for automatic de-obfuscation
- ...
Medusa
Medusa is a disassembler designed to be both modular and interactive. It runs on Windows and Linux
SmartDec
Native code to C/C++ decompiler
x86 and x86-64 architectures, ELF and PE file formats
IDA Pro & standalone versions, for Windows
Standalone i86 Windows version runs fine under Wine
Beta
Misc
Distorm
diStorm3 is really a decomposer: it takes an instruction and returns a binary structure describing it rather than static text, which is great for advanced binary code analysis.
PyPEELF
PyPEELF is a multi-platform binary editor written in Python, wxPython and BOA Constructor. It allows you to manage binary data in PE32, PE32+ (x64) and ELF binary files.
PyPEELF uses pefile to manage PE32 and PE32+ files and pyelf to manage ELF files. It also uses winappdbg and pydasm for some other features, such as the Task Running Viewer and file disassembly.
PyPEELF was designed for Reverse Engineers who want to edit or visualize binary file data in multi-platforms. That is why PyPEELF runs under Windows and Unix/BSD operating systems
Retargetable decompiler
Support ELF & PE for Intel x86, ARM, ARM+Thumb, MIPS, PIC32, and PowerPC architectures
Online decompilation service available!
binwalk
Binwalk is a fast, easy to use tool for analyzing and extracting firmware images.
PREF
Portable Reverse Engineering Framework
On github
apt-get install qtbase5-dev ...
qmake
make
Poor man's tools
File, -z to uncompress, -s to inspect non-files, e.g. /dev/sda1
file -k [-z] [-s] mybin
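The core idea behind file is matching leading bytes against known magic numbers. A minimal Python sketch of that idea, limited to formats covered on this page; the function names and the short signature table are illustrative assumptions, not what file itself ships:

```python
# Illustrative signature table: a tiny subset of what file(1) knows.
MAGIC = [
    (b"\x7fELF", "ELF"),
    (b"MZ", "PE/DOS executable"),
    (b"dex\n", "Dalvik dex"),
    (b"PK\x03\x04", "ZIP container (apk, jar)"),
    (b"\xca\xfe\xba\xbe", "Java class (or Mach-O fat binary)"),
]


def guess_type(data: bytes) -> str:
    """Return a coarse file type guessed from the leading bytes."""
    for magic, name in MAGIC:
        if data.startswith(magic):
            return name
    return "unknown"


def guess_file(path: str) -> str:
    # Only the first few bytes are needed for these signatures.
    with open(path, "rb") as f:
        return guess_type(f.read(8))
```

An apk or jar shows up as a ZIP container here; telling them apart would require looking inside the archive.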
Strings
strings [-n min_length] -a -e [s|S|b|l|B|L] mybin
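When strings is not available on the analysis host, the basic single-byte mode is easy to approximate. A rough Python sketch, assuming printable ASCII only (0x20 to 0x7e, so no tab handling and none of the -e wide encodings); the function name is made up for illustration:

```python
import re


def ascii_strings(data: bytes, min_length: int = 4):
    """Yield runs of printable ASCII at least min_length bytes long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    for match in pattern.finditer(data):
        yield match.group().decode("ascii")
```

Typical use would be `for s in ascii_strings(open("mybin", "rb").read()): print(s)`.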
Android
Documentation
- Dalvik: bytecode, dex & VM instructions
Dex2jar
A tool for converting Android’s .dex format to Java’s .class format
See also DeObfuscate jar with dex tool
./d2j-dex2jar.sh myapp.apk
This returns a file myapp-dex2jar.jar
Then use Java decompilers: jad, jd-gui, cf below
Smali
smali/baksmali is an assembler/disassembler for the dex format used by dalvik, Android’s Java VM implementation
Apktool
It is a tool for reverse engineering 3rd party, closed, binary Android apps. It can decode resources to nearly original form and rebuild them after making some modifications; it makes it possible to debug smali code step by step. It also makes working with an app easier thanks to a project-like file structure and automation of repetitive tasks such as building the apk.
apktool d myapp.apk
Apk Multi-Tool
Swiss Army knife (formerly Apk Manager)
Contains apktool, smali/baksmali etc
on Github for Linux release
- 9 decompile apk / 1 select apk / 9 decompile apk
GetStrings
Small script to prepare a sed script to inject resource strings into jad, to ease reversing.
Update: inject resource names for other resources than strings, will still be more explanatory than 0x7f123456
To be used e.g. after apktool / Apk Multi-Tool decompilation
#!/bin/bash
DECOMPILED_DIR=working/*apk/
cat $DECOMPILED_DIR/res/values/public.xml|grep "type=.string"|\
sed 's/.*name="\?//;s/" id="\?/ /;s/"\? \/>//'|\
awk --non-decimal-data '{print $2, int($2), $1}'\
> getstring-pub
cat $DECOMPILED_DIR/res/values/strings.xml|grep '<string'|\
sed 's/.*name="\?//; s/"\?>/ /;s/<\/string>//;s/#/\\#/g'\
> getstring-str
join -1 3 -2 1 --nocheck-order getstring-pub getstring-str|\
sed 's/[^ ]\+ \([[:alnum:]]\+\) [[:alnum:]]\+ \(.*\)/s#\1#"\2"#/'\
> getstring-sed
rm getstring-pub getstring-str
cat $DECOMPILED_DIR/res/values/public.xml|grep "type="|\
grep -v "type=.string"|\
sed 's/.*type="\(.*\)" name="\(.*\)" id="\(.*\)" \/>/s#\3#\1:\2#/'\
>> getstring-sed
SetStrings
find $1 -name "*.jad" -exec sed -i -f getstring-sed {} \;
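The same ID-to-name substitution can be sketched in Python. This is a simplified variant, not a port of the scripts above: it assumes IDs appear in the decompiled source exactly as written in public.xml (hex form), and it injects type:name labels for every resource instead of the string values. Function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def resource_names(public_xml: str) -> dict:
    """Map resource IDs (as written in public.xml) to 'type:name' labels."""
    root = ET.fromstring(public_xml)
    return {
        item.get("id"): "%s:%s" % (item.get("type"), item.get("name"))
        for item in root.iter("public")
    }


def annotate(source: str, names: dict) -> str:
    """Replace each known resource ID found in decompiled source."""
    for res_id, label in names.items():
        source = source.replace(res_id, label)
    return source
```

Feeding it the content of res/values/public.xml and then each .jad file gives output where 0x7f... constants read as resource names.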
Soot
Soot is a Java bytecode analysis and transformation framework, now supporting Dalvik too.
Get soot.jar
Help:
java -jar soot.jar --help|less
SootDisassembleApkToJimple.sh
#In case you don't have the right platform android.jar, you can force using another one, e.g.:
#FORCEJAR="-force-android-jar /path/to/android-sdk-linux_x86/platforms/android-17/android.jar"
java -jar soot.jar -allow-phantom-refs -android-jars /path/to/android-sdk-linux_x86/platforms -src-prec apk -process-dir $1 -output-format jimple $FORCEJAR
SootAssembleJimpleToDex.sh
java -jar soot.jar -allow-phantom-refs -android-jars /path/to/android-sdk-linux_x86/platforms -src-prec jimple -process-dir sootOutput -output-format dex
mv sootOutput/classes.dex .
Example
Example of reverse-engineering and modding APK with smali:
- in APK-Multi-Tool-Linux working dir:
- Drop myapp.apk in place-apk-here-for-modding/
- ./script.sh (and leave it always open in a separate window)
- 9 decompile / 1 select myapp.apk / 9 decompile
- ./getstrings
- Copy apk to dex2jar working dir
- Copy getstring-sed to jad working dir
- in dex2jar working dir:
- ./d2j-dex2jar.sh myapp.apk
- Copy myapp-dex2jar.jar to jad working dir (and/or jd-gui)
- in jad working dir:
- ./unjar myapp-dex2jar.jar
- ./setstrings.sh myapp-dex2jar
- Analyse .jad file and understand what to modify
- in jd-gui working dir:
- As alternative analysis can also be done with jd-gui directly on .jar file
- in APK-Multi-Tool-Linux working dir:
- In working/ find corresponding .smali file and modify it
- (in script.sh window) 13 compile/sign/install
Example 2
Example of reverse-engineering and modding APK with Soot / jimple
- in APK-Multi-Tool-Linux working dir:
- Drop myapp.apk in place-apk-here-for-modding/
- ./script.sh (and leave it always open in a separate window)
- 1 extract apk
- Copy apk to soot working dir
- in soot working dir:
- ./SootDisassembleApkToJimple.sh myapp.apk
- Analyse and modify sootOutput/*.jimple files
- ./SootAssembleJimpleToDex.sh
- Copy classes.dex to overwrite APK-Multi-Tool-Linux/out/classes.dex
- in APK-Multi-Tool-Linux working dir (in script.sh window)
- 3 zip apk / 2 regular app
- 4 sign app
- adb install place-apk-here-for-modding/repackaged-signed.apk
Dare
Dalvik Retargeting, a tool for converting Android’s .dex format to Java’s .class format
Retargeted .class:
./dare -d output_dir -e myapp.apk
Optimized retargeted .class: (using Soot, slow!)
./dare -o -d output_dir -e myapp.apk
Decompiled optimized retargeted .class: (using Soot, very slow!)
./dare -c -d output_dir -e myapp.apk
APKInspector
The goal of this project is to help analysts and reverse engineers to visualize compiled Android packages and their corresponding DEX code. APKInspector provides both analysis functions and graphic features for the users to gain deep insight into the malicious apps
Still beta and inactive for a year.
GUI around other tools
Androguard
Reverse engineering, Malware analysis of Android applications … and more !
Seems to be able to tackle also dynamically loaded code, native code, reflection code
Dexdump
Java .dex file format decompiler
Inactive since 2009
FlowDroid
FlowDroid is a context-, flow-, field-, object-sensitive and lifecycle-aware static taint analysis tool for Android applications
Mobile Sandbox
Provides online static analysis of malware images.
JEB Decompiler
Commercial ($1000)
Decompile Android apps and obfuscated Dalvik bytecode
Misc
More tools at http://wiki.secmobi.com/tools:android_reversing_analysis
Online decompilation at http://www.decompileandroid.com/ (using dex2jar, jad, apktool, zip/unzip)
Java
JAD
Java Decompiler
To use on a jar (from dex2jar):
#!/bin/bash
JAD=$(pwd)/jad
ODIR=${1%.jar}
if [ "$ODIR" == "$1" ]; then
    echo "Error: expecting a file ending with .jar"
    exit 1
fi
7z x -o${ODIR} $1
for d in $(find ${ODIR}/com -type d); do
    echo Entering $d
    cd $d
    # Clean Android stuffs
    rm *\$*.class
    for c in *.class; do
        $JAD $c
        # Want to keep the .class or not?
        rm $c
    done
    cd -
done
./unjar myapp-dex2jar.jar
jadretro
Helps converting Java 1.4, Java 1.5 or later classes so JAD gives better results
JadAlign
Aligns java-files, which are decompiled by jad
java -jar JadHelper-0.0.1.jar myfile.java
Not much effect on jad output obtained from dex
Jd-gui
JD-GUI is a standalone graphical utility that displays Java source codes of “.class” files
binary-refactor
Helper to manually de-obfuscate obfuscated jars
- rename class/packages in a jar
- match a jarjar-ed & obfuscated jar with a known jar, to find the 'same' classes
- bytecode dump(asm)
- class dependency graph
dirtyJOE
Java Overall Editor is a complex editor and viewer for compiled java binaries (.class files)
PJB
ELF
man elf
readelf
readelf -a -g -t --dyn-syms -W mybin
elfedit
objdump
objdump -C -g -F -x -T --special-syms mybin
objdump -d -l -r -R -S mybin
objdump -D -l -r -R -S mybin
nm
nm -a -C -S -s --special-syms mybin
ldd
Shared library dependencies:
ldd -v mybin
PE
Pefile
A Python module to read and work with PE (Portable Executable) files, see usage examples
#!/usr/bin/env python
import sys
import pefile

pe = pefile.PE(sys.argv[1])
# dump_info() returns the full report as a string
info = pe.dump_info()
print(info)
with open('out.txt', 'w') as f:
    f.write(info)
Can run under Linux
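For a quick look without installing anything, the first PE header fields can be read with the standard library alone. A minimal sketch, assuming a well-formed image; the function name and the three-entry machine table are illustrative, and pefile remains the better choice for real work:

```python
import struct

# Small illustrative subset of IMAGE_FILE_MACHINE_* values.
MACHINES = {0x014C: "x86", 0x8664: "x86-64", 0x01C0: "ARM"}


def pe_summary(data: bytes) -> dict:
    """Parse the DOS header pointer and the COFF file header of a PE image."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew, at offset 0x3C, holds the file offset of the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    # The COFF header follows: Machine, NumberOfSections, TimeDateStamp.
    machine, n_sections, timestamp = struct.unpack_from("<HHI", data, pe_offset + 4)
    return {
        "machine": MACHINES.get(machine, hex(machine)),
        "sections": n_sections,
        "timestamp": timestamp,
    }
```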
PEiD
Can run with Wine
PETools
Can run with Wine
Resource Hacker
Can run with Wine
Dependency Walker
Can run with Wine
PEview
Can run with Wine
DLL Export Viewer
Can run with Wine
Under Wine, require absolute path to DLL so: click on gears, "load functions from the following DLL file", Browse
PEBrowse Pro
Can run with Wine
Explorer Suite
- CFF Explorer: Allows also to modify a PE
- Signature Explorer
- PE Detective
- Task Explorer (32 & 64)
PE Insider
Static protections
Packers
- http://www.openrce.org/reference_library/packer_database
- http://www.reverse-engineering.info/documents/33.html
- https://corkami.googlecode.com/files/packers.pdf
- UPX
upx -d myfile
- http://www.woodmann.com/crackz/Packers.htm
- Crinkler: some insane PE packing tool coming from the demoscene world.
Dynamic Analysis Tools
IDA Pro
IDA Pro has some debugging capabilities too.
Local debugging: win32, windbg
Remote debugging:
gdbserver --multi <client_ip>:<port> # default IDA port: 23946
Then on IDA: select Remote GDB debugger, paths should be paths on the gdbserver host.
Tuning:
- Debugger / options / Stop on process entry point
- Compatible with lib preloading, cf below
- from 6.4, can make use of Intel PIN tools for diff debugging, see tutorial (pdf)
Intel PIN tools
- Official page
- Windows, Linux, Mac OS X, Android
- x86-32, x86-64 (only Intel platforms obviously)
- binary instrumentation
The best way to think about Pin is as a "just in time" (JIT) compiler. The input to this compiler is not bytecode, however, but a regular executable. Pin intercepts the execution of the first instruction of the executable and generates ("compiles") new code for the straight line code sequence starting at this instruction. It then transfers control to the generated sequence. The generated code sequence is almost identical to the original one, but Pin ensures that it regains control when a branch exits the sequence. After regaining control, Pin generates more code for the branch target and continues execution. Pin makes this efficient by keeping all of the generated code in memory so it can be reused and directly branching from one sequence to another. In JIT mode, the only code ever executed is the generated code. The original code is only used for reference. When generating code, Pin gives the user an opportunity to inject their own code (instrumentation).
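The loop described above can be modelled in a few lines of Python. This is only a toy model for intuition, not Pin's API (Pin tools are written in C/C++): basic blocks are plain Python callables, "generating code" means building a list of steps with the instrumentation callback interleaved, and all names are invented for the sketch.

```python
def run_instrumented(blocks, entry, state, on_instruction):
    """Toy model of a JIT-style instrumentation loop.

    blocks maps a label to (instructions, branch): instructions is a list of
    (name, function) pairs that mutate state, and branch(state) returns the
    next label, or None to stop.
    """
    code_cache = {}  # label -> generated sequence, reused on later visits

    def generate(label):
        instructions, branch = blocks[label]
        generated = []
        for name, func in instructions:
            # Instrumentation is injected once, at generation time.
            generated.append(lambda s, n=name: on_instruction(label, n))
            generated.append(func)
        return generated, branch

    label = entry
    while label is not None:
        if label not in code_cache:
            code_cache[label] = generate(label)
        generated, branch = code_cache[label]
        for step in generated:   # only generated code is executed
            step(state)
        label = branch(state)    # control comes back here at each branch
    return state
```

Unlike this model, the real thing links generated sequences directly to each other instead of returning to a dispatcher after every block.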
Vdb/Vtrace / Vivisect
- debugger, static analysis
- Windows, Linux, Android
- Intel, ARM
vtrace is a cross-platform process debugging API implemented in python, and vdb is a debugger which uses it
vivisect is a Python based static analysis and emulation framework
Android
ADBI: Binary Instrumentation Framework for Android
Slides here
Dynamic Dalvik Instrumentation Framework for Android
Slides here
DroidScope
DECAF (short for Dynamic Executable Code Analysis Framework) is a binary analysis platform based on QEMU.
This is also the home of the DroidScope dynamic Android malware analysis platform. DroidScope is now an extension to DECAF
Slides here and article here
DroidBox
Android Application Sandbox
TaintDroid
Realtime Privacy Monitoring on Smartphones
Soot
Java, Dalvik (see here and here)
GameCIH
GameGuardian
Drozer
Comprehensive security and attack framework for Android
Interacts with the Dalvik VM and explores an application's attack surface (activities, content providers, services, etc).
Can also be used remotely à la Metasploit with exploits & payloads
AndBug
A Scriptable Debugger for Android's Dalvik Virtual Machine
Hooker
Hooker is an open source project for dynamic analysis of Android applications. This project provides various tools and applications that can be used to automatically intercept and modify any API calls made by a targeted application. It leverages the Android Substrate framework to intercept these calls and aggregate all their contextual information (parameters, returned values, ...) in an elasticsearch database. A set of python scripts can be used to automate the execution of an analysis in order to collect any API calls made by a set of applications.
Xposed
Changes the app_process binary and hooks into the system and all applications
Many modules
See also XDA forum
Cydia Substrate
Similar to Xposed but not via replacement of system components.
Hooks into Dalvik and native code
Misc
- How To Decode ProGuard’s Obfuscated Code From Stack Trace
- Live Memory Forensics on Android with Volatility (pdf)
- setpropex, as setprop but changes read-only properties by attaching to init via ptrace
- iSec Intent Sniffer and iSec Intent Fuzzer
More tools at http://wiki.secmobi.com/tools:android_dynamic_analysis
Java
Javasnoop
A tool that lets you intercept methods, alter data and otherwise hack Java applications running on your computer.
ELF
ltrace/strace
Tracing library calls and system calls.
Getting a summary:
ltrace -f -S mybin 2>&1|grep '(.*)'|sed 's/(.*//'|sort|uniq -c
Getting more:
ltrace -f -i -S -n 4 -s 1024 mybin
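The summary pipeline can also be done in Python when the trace has been saved to a file. A rough equivalent, with one deliberate difference: it keeps only the identifier right before the first opening parenthesis, so any prefix on the line (such as the pid shown with -f) is dropped. The function name and the regular expression are assumptions of this sketch.

```python
import re
from collections import Counter

# An identifier immediately followed by "(" (assumed shape of a traced call).
CALL = re.compile(r"([A-Za-z_][\w@.]*)\(")


def summarize_calls(lines):
    """Count traced call names, like the grep|sed|sort|uniq -c pipeline."""
    counts = Counter()
    for line in lines:
        match = CALL.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

`summarize_calls(open("trace.txt")).most_common(20)` then lists the most frequent calls first.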
ftrace
Tracing inner execution flow as well
Lib preloading
#define _GNU_SOURCE
#include <dlfcn.h>
#include <sys/types.h>
#include <unistd.h>
#include <errno.h>
#include <stdio.h>
#include <time.h>
// Kill nanosleep()
int nanosleep(const struct timespec *req, struct timespec *rem){
    printf("\n==== In our own nanosleep(), I dunnah want sleep\n");
    return 0;
}
// Kill usleep()
int usleep(useconds_t usec){
    printf("\n==== In our own usleep(), I dunnah want sleep\n");
    return 0;
}
// Fix time()
time_t time(time_t *t){
    printf("\n==== In our own time(), will return 1380120175\n");
    return 1380120175;
}
// Fix srand()
void srand(unsigned int seed){
    printf("\n==== In our own srand(), will do srand(0)\n");
    void (*original_srand)(unsigned int seed);
    original_srand = dlsym(RTLD_NEXT, "srand");
    unsigned int myseed = 0;
    // srand() returns void, so call the original without "return"
    (*original_srand)(myseed);
}
#if 0
// Kill rand()
int rand(void){
    printf("\n==== In our own rand(), will return 0\n");
    return 0;
}
#else
// Intercept rand()
int rand(void){
    int (*original_rand)(void);
    original_rand = dlsym(RTLD_NEXT, "rand");
    int r = (*original_rand)();
    printf("\n==== In our own rand(), will return %04X\n", r);
    return r;
}
#endif
gcc -fPIC -shared -Wl,-soname,patch -o patch.so patch.c -ldl
export LD_PRELOAD=patch.so
export LD_LIBRARY_PATH=.:$LD_LIBRARY_PATH
ldpreloadhook
a quick open/close/ioctl/read/write/free symbol hooker
injectso
- x86-32, x86-64, ARM (since v0.52)
scanmem
scanmem is a simple interactive debugging utility for linux, used to locate the address of a variable in an executing process. This can be used for the analysis or modification of a hostile process on a compromised machine, reverse engineering, or as a "pokefinder" to cheat at video games.
- Linux/Android
- with a GUI since v0.13: GameConqueror
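The narrowing principle behind this kind of scanner is simple: scan for the current value, change the value in the target, then re-check only the surviving candidates. A minimal Python sketch of that principle, working on byte snapshots rather than a live process and assuming a little-endian 32-bit variable; names are invented for the example.

```python
import struct


def scan(snapshot: bytes, value: int, candidates=None):
    """Return offsets where a little-endian 32-bit value is stored.

    With candidates given, only those offsets are re-checked, which is how
    successive scans narrow the search as the variable changes.
    """
    needle = struct.pack("<I", value)
    if candidates is None:
        candidates = range(len(snapshot) - len(needle) + 1)
    return [off for off in candidates if snapshot[off:off + 4] == needle]
```

A first call with `candidates=None` scans everything; each following call passes the previous result back in with the new value.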
GDB
Enable binary writing, here changing a conditional jump to unconditional jump:
gdb -write -silent --args mycode 1 2 3 ...
(gdb) set {unsigned char}0x400123 = 0xeb
(gdb) disassemble 0x400123,0x400124
0x400123 jmp 0x...
or injecting NOPs:
(gdb) set {unsigned char}0x400123 = 0x90
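The same one-byte patches can be applied offline to a copy of the file. A small Python sketch, assuming x86 and assuming you have already translated the virtual address (0x400123 above) into a file offset, which this code does not do; function names are illustrative.

```python
def patch_byte(image: bytes, offset: int, expected: int, new: int) -> bytes:
    """Return a copy of image with one byte replaced, after a sanity check."""
    if image[offset] != expected:
        raise ValueError(
            "unexpected byte 0x%02x at offset 0x%x" % (image[offset], offset)
        )
    patched = bytearray(image)
    patched[offset] = new
    return bytes(patched)


def force_short_jump(image: bytes, offset: int) -> bytes:
    """Turn a short conditional jump (opcodes 0x70-0x7f) into jmp short (0xeb)."""
    opcode = image[offset]
    if not 0x70 <= opcode <= 0x7F:
        raise ValueError("not a short conditional jump")
    return patch_byte(image, offset, opcode, 0xEB)
```

Checking the expected byte first avoids silently corrupting the binary when the offset is wrong.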
Extensions
Stephen Bradshaw has written some extensions to have more useful gdb info when debugging stripped binaries, closer to what you get with OllyDbg. See:
- http://www.thegreycorner.com/2013/10/my-python-gdb-extensions.html
- http://www.thegreycorner.com/2014/03/gdb-extensions-110.html
GUI
- Voltron is an unobtrusive debugger UI for hackers
- SchemDBG is a backend agnostic debugger frontend that focuses on debugging binaries without access to the source code
ERESI
The ERESI Reverse Engineering Software Interface is a multi-architecture binary analysis framework with a domain-specific language tailored to reverse engineering and program manipulation.
PE
Process Monitor
Process Explorer
RegShot
Computes diff between two registry snapshots
HeapMemView
OllyDbg
PE32-only dynamic disassembler and debugger: http://ollydbg.de/.
Version 1.1 is historically widespread; version 2.0 was rewritten from scratch and is still considered beta by some.
Supports software and hardware breakpoints, binary patching and repacking, symbol analysis, advanced instruction pattern search, trace with conditional breaking, etc.
ImmDbg
There is also a patched version of OllyDbg with advanced python scripting ability called Immunity Debugger: http://www.immunityinc.com/products-immdbg.shtml
Expect some OllyDbg plugins to not work properly with ImmDbg.
Plugins:
- Mona, a debugger plugin / Exploit Development Swiss Army Knife
WinAppDbg
The WinAppDbg python module allows developers to quickly code instrumentation scripts in Python under a Windows environment.
Tracer.py
Based on WinAppDbg, finds interesting bits in trace by dichotomy signal/noise
- run first time and try everything but not the interesting stuff -> use noise option
- then run again and try interesting stuff -> use signal option
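The signal/noise idea boils down to a set difference between traces. A sketch of the principle in Python, not Tracer.py's actual code; traces are assumed to be plain lists of addresses and the function name is made up.

```python
def interesting(signal_trace, noise_traces):
    """Return addresses seen in the signal run but in none of the noise runs."""
    noise = set()
    for trace in noise_traces:
        noise.update(trace)
    seen = set()
    result = []
    for address in signal_trace:
        if address not in noise and address not in seen:
            seen.add(address)
            result.append(address)  # keep first-seen order
    return result
```

The more noise runs are recorded, the smaller the remaining set of candidate addresses.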
WTFDLL.py
Find libraries loaded at runtime and the functions called
Cuckoo Sandboxing
Currently only supporting Windows binaries.
Cuckoo Sandbox is a malware analysis system. You can throw any suspicious file at it and in a matter of seconds Cuckoo will provide you back some detailed results outlining what such file did when executed inside an isolated environment.
Cuckoo generates a handful of different raw data which include:
- Native functions and Windows API calls traces
- Copies of files created and deleted from the filesystem
- Dump of the memory of the selected process
- Full memory dump of the analysis machine
- Screenshots of the desktop during the execution of the malware analysis
- Network dump generated by the machine used for the analysis
Dynamic protections
- http://www.openrce.org/reference_library/anti_reversing
- https://corkami.googlecode.com/files/cm.pdf
- ptrace e.g. on iOS
- sysctl, e.g. on iOS
Patching
Exploitation
Tools
- ROPgadget, supports ELF/PE/Mach-O format on x86, x64, ARM, PowerPC, SPARC and MIPS architectures
- ROPshell, online, supports ELF/PE/Mach-O format on x86, x64, ARM
- pwntools
- PEDA: Python Exploit Development Assistance for GDB (x86/x64)
- GEF: GDB enhanced features - multi-arch (x86/x64/mips/ppc/arm)
- Hexcellents notes
- ROP on ARM (pdf) by Xipiter / dontstuffbeansupyournose
- Framing Signals: a return to portable shellcode (article, slides)
Mitigation techniques
Some are taken from the excellent Android Hacker's Handbook
Hardening the Heap
Hardened version of dlmalloc? Alternatives?
This can be done with LD_PRELOAD, e.g. with tcmalloc
LD_PRELOAD="/usr/lib/libtcmalloc.so"
Protecting against Integer Overflows
- Protected calloc?
- Hardened library for safe integer operations: safe_iop
Preventing Data Execution
Set stack (and heap) as non-executable.
Kernel marks stack as executable unless it finds a GNU_STACK program header without executable flag set.
To insert non-exec statement:
flag: -znoexecstack
To test:
/usr/sbin/execstack -q myprog
- "?": myprog has no GNU_STACK -> stack is executable
- "-": stack non-executable
- "X": stack executable
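The check behind those three answers is a walk over the program headers looking for PT_GNU_STACK. A minimal Python sketch, restricted to 64-bit little-endian ELF to keep it short (32-bit files use different offsets); the function name is an assumption and the return values simply mirror the execstack -q symbols listed above.

```python
import struct

PT_GNU_STACK = 0x6474E551
PF_X = 0x1


def stack_state(elf: bytes) -> str:
    """Return '?', '-' or 'X' like execstack -q (ELF64 little-endian only)."""
    if elf[:4] != b"\x7fELF" or elf[4] != 2 or elf[5] != 1:
        raise ValueError("expected a 64-bit little-endian ELF image")
    (e_phoff,) = struct.unpack_from("<Q", elf, 0x20)
    e_phentsize, e_phnum = struct.unpack_from("<HH", elf, 0x36)
    for index in range(e_phnum):
        # Elf64_Phdr starts with p_type then p_flags, 4 bytes each.
        p_type, p_flags = struct.unpack_from(
            "<II", elf, e_phoff + index * e_phentsize
        )
        if p_type == PT_GNU_STACK:
            return "X" if p_flags & PF_X else "-"
    return "?"
```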
Same:
readelf -a myprog|grep -A1 GNU_STACK
- present? with RW or RWE?
Same:
cat /proc/123/maps|grep -E '(stack|heap)'
- rw or rwx?
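The maps check is easy to script when many processes need auditing. A small Python sketch that takes the text of a maps file and reports stack or heap mappings that are both writable and executable; the function name is hypothetical and only the plain [stack] and [heap] labels are handled.

```python
def writable_and_executable(maps_text: str):
    """Return (name, perms) for [stack]/[heap] mappings that are both w and x."""
    flagged = []
    for line in maps_text.splitlines():
        fields = line.split()
        # Format: address perms offset dev inode [pathname]
        if len(fields) < 6 or fields[5] not in ("[stack]", "[heap]"):
            continue
        perms = fields[1]
        if "w" in perms and "x" in perms:
            flagged.append((fields[5], perms))
    return flagged
```

For a live process: `writable_and_executable(open("/proc/123/maps").read())`; an empty list means neither region is rwx.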
To modify existing bin:
/usr/sbin/execstack -s myprog # set executable stack
/usr/sbin/execstack -c myprog # clear
Max nr of process IDs
/sbin/sysctl kernel.pid_max
Traditionally 32768
/sbin/sysctl -w kernel.pid_max=4194303
ptrace
/sbin/sysctl kernel.yama.ptrace_scope
To allow ptrace:
/sbin/sysctl -w kernel.yama.ptrace_scope=0
Address Space Layout Randomization
Bin needs to be compiled position-independent:
CFLAGS: -fPIE
LDFLAGS: -pie
To test:
readelf -h myprog | grep Type:
- DYN? position-independent
- EXEC? Not position-independent
or
readelf -d myprog | grep TEXTREL
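The Type: test reads a single 16-bit field of the ELF header, so it is simple to reproduce where readelf is missing. A minimal Python sketch (the function name is an assumption); note that shared libraries are ET_DYN too, so this only distinguishes PIE from non-PIE when the file is known to be an executable.

```python
import struct

ET_EXEC, ET_DYN = 2, 3


def is_position_independent(elf: bytes) -> bool:
    """True when e_type is ET_DYN, i.e. readelf -h reports Type: DYN."""
    if elf[:4] != b"\x7fELF":
        raise ValueError("not an ELF image")
    # EI_DATA (byte 5): 1 = little-endian, 2 = big-endian.
    endian = "<" if elf[5] == 1 else ">"
    # e_type sits right after the 16-byte e_ident array.
    (e_type,) = struct.unpack_from(endian + "H", elf, 16)
    return e_type == ET_DYN
```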
Global settings
/sbin/sysctl kernel.randomize_va_space
/sbin/sysctl -w kernel.randomize_va_space=2
- 0 – No randomization. Everything is static.
- 1 – Conservative randomization. Shared libraries, stack, mmap() and VDSO are randomized.
- 2 – Full randomization. In addition to the elements listed in the previous point, the heap (memory managed through brk()) is also randomized.
To disable it locally (in a bash and its children)
setarch `uname -m` -R /bin/bash
On 32 bit systems “ulimit -s unlimited” disables the randomization of the mmap()-ing
Protecting the Stack
ProPolice stack protection is enabled by using
flags: -fstack-protector
Format String Protections
Enabled by using
flags: -Wformat-security -Werror=format-security
Beware compiler cannot detect all corner cases
See also _FORTIFY_SOURCE=2 for runtime protection against %n
Read-Only Relocations
Partial relro enabled by using
flags: -Wl,-z,relro
To test:
readelf -l myprog|grep RELRO
- GNU_RELRO? Partial relro protection present
Full relro enabled by using
flags: -Wl,-z,relro -Wl,-z,now
To test:
readelf -d myprog|grep NOW
- flags NOW? Full relro protection present
Fortifying Source Code
Enabled by using
flags: -D_FORTIFY_SOURCE=1
or
flags: -D_FORTIFY_SOURCE=2
depending on the compiler support
Access Control Mechanisms
SELinux