Difference between revisions of "Reverse-Engineering"
Revision as of 16:25, 4 October 2013
You'll find a lot of (moderate) reverse-engineering in this wiki but this page aims at providing a list of useful resources.
You won't find much info about the Windows platform because the topic is already quite well covered elsewhere.
Books
- The IDA Pro Book, 2nd Edition by Chris Eagle
- Reverse Engineering Code with IDA Pro by Dan Kaminsky et al
- Practical Malware Analysis by Michael Sikorski
Resources
Tools
IDA Pro
IDA Pro combines an interactive, programmable, multi-processor disassembler coupled to a local and remote debugger and augmented by a complete plugin programming environment.
- Official page
- Windows, Linux, Mac OS X
- x86-32, x86-64, ARM and many others
- ELF, Java bytecode, Dalvik, ARM,...
- disassembler, some debugger
- extensible through plugins & python (anti-debugger, findcrypt,...)
- IDA toolbag
- IDAscope
- patchdiff2
- Zynamics bindiff
- DarunGrim, another binary diff tool, open source but discontinued?
- x86emu, x86 Emulator plugin. Windows, Linux, OS X
- Plugin contests 2012, 2011, 2010, 2009
Hex-Rays
The most powerful (and most expensive) IDA Pro plugin is the Hex-Rays decompiler
- x86 and ARM
- decompiler
Limitations specific to ARM:
- floating point instructions are not supported
- VFP/SIMD/Neon/... instructions are not supported
- functions having an argument that is passed partially on registers and partially on the stack are not supported (e.g. int64 passed in R3 and on the stack)
Intel PIN tools
- Official page
- Windows, Linux, Mac OS X, Android
- x86-32, x86-64 (only Intel platforms obviously)
- binary instrumentation
The best way to think about Pin is as a "just in time" (JIT) compiler. The input to this compiler is not bytecode, however, but a regular executable. Pin intercepts the execution of the first instruction of the executable and generates ("compiles") new code for the straight line code sequence starting at this instruction. It then transfers control to the generated sequence. The generated code sequence is almost identical to the original one, but Pin ensures that it regains control when a branch exits the sequence. After regaining control, Pin generates more code for the branch target and continues execution. Pin makes this efficient by keeping all of the generated code in memory so it can be reused and directly branching from one sequence to another.
In JIT mode, the only code ever executed is the generated code. The original code is only used for reference. When generating code, Pin gives the user an opportunity to inject their own code (instrumentation).
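The translate-once, reuse-afterwards behaviour described above can be sketched with a toy model in C. This is not Pin's API or implementation: the three-block "program", the function-pointer "code cache" and every name below are assumptions made for illustration. Direct branching between generated sequences and user instrumentation are left out.

```c
#include <stddef.h>

#define NBLOCKS    3
#define EXIT_BLOCK (-1)

/* A "block" runs some straight-line code and returns the id of its successor. */
typedef int (*block_fn)(int *counter);

/* Toy original program: block 0 sets up a loop, block 1 loops 3 times, block 2 exits. */
static int block0(int *c) { *c = 3; return 1; }
static int block1(int *c) { return --*c > 0 ? 1 : 2; }
static int block2(int *c) { (void)c; return EXIT_BLOCK; }

static const block_fn original[NBLOCKS] = { block0, block1, block2 };

struct stats {
    int translations;   /* how many blocks had to be "compiled" */
    int executions;     /* how many blocks were run in total     */
};

/* Generated code is kept here so that each block is translated only once. */
static block_fn code_cache[NBLOCKS];

/* Stand-in for code generation: a real JIT would emit new machine code here. */
static block_fn translate(int id, struct stats *st)
{
    st->translations++;
    return original[id];
}

struct stats run_toy_jit(void)
{
    struct stats st = { 0, 0 };
    int counter = 0;
    int pc = 0;

    for (int i = 0; i < NBLOCKS; i++)
        code_cache[i] = NULL;

    while (pc != EXIT_BLOCK) {
        if (code_cache[pc] == NULL)          /* first visit: generate code */
            code_cache[pc] = translate(pc, &st);
        st.executions++;
        pc = code_cache[pc](&counter);       /* run it, regain control at the branch */
    }
    return st;
}
```

For this toy program the loop block runs three times but is translated only once, so run_toy_jit() reports 3 translations for 5 block executions.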
Lib preloading
#define _GNU_SOURCE   // needed for RTLD_NEXT
#include <dlfcn.h>
#include <sys/types.h>
#include <unistd.h>
#include <errno.h>
#include <stdio.h>
#include <time.h>

// Kill nanosleep(): return at once and report success
int nanosleep(const struct timespec *req, struct timespec *rem){
    (void)req; (void)rem;
    printf("\n==== In our own nanosleep(), I dunnah want sleep\n");
    return 0;
}

// Kill usleep(): return at once and report success
int usleep(useconds_t usec){
    (void)usec;
    printf("\n==== In our own usleep(), I dunnah want sleep\n");
    return 0;
}

// Fix time(): always report the same instant
time_t time(time_t *t){
    printf("\n==== In our own time(), will return 1380120175\n");
    if (t)
        *t = 1380120175;   // callers may read the result through the pointer
    return 1380120175;
}

// Fix srand(): always seed the real srand() with 0
void srand(unsigned int seed){
    void (*original_srand)(unsigned int);
    (void)seed;
    printf("\n==== In our own srand(), will do srand(0)\n");
    original_srand = dlsym(RTLD_NEXT, "srand");   // next srand in search order, i.e. libc's
    if (original_srand)
        original_srand(0);   // srand() returns void, so there is nothing to return
}

#if 0
// Kill rand()
int rand(void){
    printf("\n==== In our own rand(), will return 0\n");
    return 0;
}
#else
// Intercept rand(): call the real one and log its result
int rand(void){
    int (*original_rand)(void);
    original_rand = dlsym(RTLD_NEXT, "rand");
    int r = (*original_rand)();
    printf("\n==== In our own rand(), will return %04X\n", r);
    return r;
}
#endif
gcc -fPIC -shared -Wl,-soname,patch -o patch.so patch.c -ldl
export LD_PRELOAD=patch.so
export LD_LIBRARY_PATH=.:$LD_LIBRARY_PATH
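A variation on the time() replacement, as a minimal sketch: instead of hard-coding the timestamp, read it from an environment variable so the same library can be reused across runs. The variable name FAKE_TIME is an assumption made up for this example. When the variable is unset or not a plain number, the sketch falls back to the real clock through clock_gettime(), which is a different symbol and therefore needs no dlsym() lookup.

```c
#define _POSIX_C_SOURCE 200809L   /* for clock_gettime() */
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical variable name, chosen for this sketch only. */
#define FAKE_TIME_ENV "FAKE_TIME"

/* Replacement for time(): returns $FAKE_TIME when it is set to a decimal
 * number, otherwise the current time taken from clock_gettime(). */
time_t time(time_t *t)
{
    time_t now;
    const char *s = getenv(FAKE_TIME_ENV);
    char *end = NULL;
    long long fake = s ? strtoll(s, &end, 10) : 0;

    if (s && end != s && *end == '\0') {
        now = (time_t)fake;              /* whole string parsed as a number */
    } else {
        struct timespec ts;
        int rc = clock_gettime(CLOCK_REALTIME, &ts);
        assert(rc == 0);
        (void)rc;                        /* silence unused warning under NDEBUG */
        now = ts.tv_sec;
    }

    if (t)
        *t = now;                        /* honour the optional out-parameter */
    return now;
}
```

Built as a shared object the same way as patch.c, it could be used as FAKE_TIME=1380120175 LD_PRELOAD=./faketime.so ./mybin, where faketime.so is a hypothetical file name.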
Poor man's tools
File: -k to keep going after the first match, -z to look inside compressed files, -s to read special files (block or character devices), e.g. /dev/sda1
file -k [-z] [-s] mybin
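file identifies most formats from magic numbers at fixed offsets. As a rough illustration of the idea (not how file itself is implemented), the sketch below checks a buffer for the ELF magic and reads the class byte that distinguishes 32-bit from 64-bit objects. The function name elf_class is made up for this example.

```c
#include <stddef.h>
#include <string.h>

/* Returns 32 or 64 for an ELF image of that class, 0 for anything else.
 * buf holds the first bytes of the file, len is how many are available. */
int elf_class(const unsigned char *buf, size_t len)
{
    static const unsigned char magic[4] = { 0x7f, 'E', 'L', 'F' };

    if (buf == NULL || len < 5 || memcmp(buf, magic, sizeof magic) != 0)
        return 0;

    switch (buf[4]) {            /* e_ident[EI_CLASS] */
    case 1:  return 32;          /* ELFCLASS32 */
    case 2:  return 64;          /* ELFCLASS64 */
    default: return 0;           /* invalid or unknown class */
    }
}
```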
Strings
strings [-n min_length] -a -e [s|S|b|l|B|L] mybin
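What strings does with -n can be sketched in a few lines of C: scan the data and print every run of printable bytes that is at least min_len long. This is a simplified model covering only 7-bit ASCII plus tab; the other encodings selectable with -e are not handled, and the function name print_strings is made up for this example.

```c
#include <stddef.h>
#include <stdio.h>

/* Prints every run of at least min_len printable ASCII bytes found in buf,
 * one per line, and returns the number of runs printed. */
size_t print_strings(const unsigned char *buf, size_t len,
                     size_t min_len, FILE *out)
{
    size_t count = 0;
    size_t start = 0;                      /* start of the current run */

    if (min_len == 0)
        min_len = 1;                       /* an empty run is not a string */

    for (size_t i = 0; i <= len; i++) {
        int printable = i < len &&
            ((buf[i] >= 0x20 && buf[i] <= 0x7e) || buf[i] == '\t');
        if (printable)
            continue;

        /* Run [start, i) just ended, at a non-printable byte or at the end. */
        if (i - start >= min_len) {
            fwrite(buf + start, 1, i - start, out);
            fputc('\n', out);
            count++;
        }
        start = i + 1;
    }
    return count;
}
```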
Shared library dependencies:
ldd -v mybin
Tracing library calls and system calls with ltrace (-S adds system calls, -f follows child processes).
Getting a summary:
ltrace -f -S mybin 2>&1|grep '(.*)'|sed 's/(.*//'|sort|uniq -c
Getting more:
ltrace -f -i -S -n 4 -s 1024 mybin