# Debugging Tips
WIP, intended as a list of resources and strategies that are useful when developing non-Metal fixes. Basically the non-Metal class you guys always bother me about. Aiming to make this beginner-level without being condescending, so please provide feedback...
TODO: bug Slav, Flag, and everyone else who actually knows stuff!
# Assembly Resources
TODO: there are a lot more
# Tools
- Hopper (easiest, best for objc, 30min demo, no cracks for recent version)
- IDA (best pseudocode, limited free version, cracks available)
- Ghidra (scary)
- `objdump` (built-in, great for scripting (e.g. find all uses of `PE_parse_boot_argn` in the system), pass `--x86-asm-syntax=intel` for familiar assembly syntax)
- `otool` (similar to `objdump`, better at finding string refs and such, but only supports AT&T syntax)
- class-dump
- TODO: look into Jonathan Levin's stuff
- Insert_dylib
TODO: more
# Hopper Tips
- press `X` on a function/string for references
- `D` cycles through 1/2/4/8-byte chunks, various other keys (see the Modify menu) set the data type
- `C` sets as code, `P` creates a procedure (needed for pseudocode)
- use the Edit menu to copy hex bytes (for some reason cmd+C doesn't work)
TODO: more tips? TODO: hopper scripts/extensions? TODO: IDA and stuff
# Calling Convention (just a summary, see cheatsheets)
Function calls use a standard register order, normally: `rdi`, `rsi`, `rdx`, `rcx`, `r8`, `r9`
Or the shorter (32-bit) versions: `edi`, `esi`, `edx`, `ecx`, `r8d`, `r9d`
Float/double values are passed in `xmm<number>` registers. Note that this means the order of float arguments relative to int/pointer ones often doesn't actually matter: `f(int a,float b)` and `f(float a,int b)` will both use `edi` and `xmm0`.
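Structs bigger than 16 bytes (a common example being `CGRect`) get put onto the stack, as do ints/pointers once the six registers above are used up. Small structs (16 bytes or less, e.g. `CGPoint`) get split across registers instead. This can be pretty confusing.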
Pointers and pointer types like `id` or `CF...Ref` are always going to be in 64-bit (`r...`) registers. Long integers too. Shorter integers (and booleans) will use the 32-bit ones (`e...`, `...d`). This can be helpful for guessing types. When writing a shim function, it's usually safe to use `long` or `void*` for a 64-bit register no matter the value.
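Return values will be in `rax` unless they're floats (`xmm0`), structs (small ones come back in registers, bigger ones get written through a hidden pointer that the caller passes in `rdi`, which shifts the visible arguments along by one register), or some weird thing that returns two values (usually `rax` plus `rdx`, TODO).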
Keep in mind that C programming also uses a lot of "output parameters", i.e. pointers to a caller-created output variable, passed in as an argument, which the function will dereference and write to.
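A minimal sketch of the output-parameter pattern, in case it's unfamiliar. `get_display_size` and `display_area` are made-up names for illustration, not real SkyLight or CoreGraphics calls. At a call site this usually shows up as a `lea` of some stack slot into an argument register just before the call.

```c
#include <stddef.h>

/* Hypothetical API in the usual C style: the return value is only a status
 * code, the real results come back through pointers supplied by the caller. */
int get_display_size(int display_id, int *width_out, int *height_out)
{
    if (width_out == NULL || height_out == NULL) {
        return -1;          /* error: nowhere to write the results */
    }
    if (display_id != 0) {
        return -2;          /* error: this sketch only knows display 0 */
    }
    *width_out = 1920;      /* dereference the caller's pointer and write */
    *height_out = 1080;
    return 0;               /* success */
}

/* Caller side: create the output variables, pass their addresses in. */
int display_area(int display_id)
{
    int width = 0;
    int height = 0;
    if (get_display_size(display_id, &width, &height) != 0) {
        return -1;
    }
    return width * height;
}
```

If a shim ignores an output parameter like this, the caller ends up reading whatever garbage was in its variable, so it's worth checking which pointer arguments get written to.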
It's usually easier to figure out a function signature by finding somewhere it's called, rather than its actual implementation. Keep in mind that disassembler pseudocode will often tack on extra arguments (e.g. if something set `rdx` a couple lines above, it might think it's the third argument to a 2-argument function), and it will often completely miss floats and stack arguments. If something messes with the stack or sets `xmm0` just before a function call, be careful...
In C++ functions, `this` is passed in `rdi` so the arguments you see in the source code start at `rsi`.
In objc functions, `self` and the selector are passed in `rdi` and `rsi`. However objc signatures can usually be found with type encodings or class-dump anyways (TODO).
TODO: would it be helpful to have a sample walkthrough of finding the signature of some random SL function here?
# NOP Something
Find the offending assembly and check the hex view or addresses to figure out the file offset and how many bytes to kill (instructions are variable size, so one line of assembly can take multiple `nop`s). Then use a hex editor or `Binpatcher` or whatever to overwrite that range with `0x90`. Alternately, assemble a jump instruction (alt+A and `jmp <offset>`) to skip a big chunk of code.
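If you'd rather script this than poke at a hex editor, the core of it is just a bounds-checked fill. A rough sketch, assuming the binary has already been read into memory and you know the file offset and byte count from the disassembler. `nop_range` is a made-up helper name, and reading/writing the file is left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define X86_NOP 0x90    /* single-byte x86 no-op */

/* Overwrite `count` bytes starting at `offset` with NOPs.
 * Returns 0 on success, -1 if the range does not fit in the buffer. */
int nop_range(uint8_t *image, size_t image_size, size_t offset, size_t count)
{
    if (offset > image_size || count > image_size - offset) {
        return -1;
    }
    memset(image + offset, X86_NOP, count);
    return 0;
}
```

For example, killing a 5-byte `call` at file offset `0x1234` would be `nop_range(image, size, 0x1234, 5)`. The count has to cover whole instructions, otherwise the leftover bytes decode as garbage.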
TODO: screenshots if this is confusing
# Patch Something
Similar to `nop`ping, but use alt+A to assemble new lines of assembly, so the corresponding hex bytes get overwritten with those instead of `0x90`. You have to be clever about instruction sizes: a different size will trash (multiple) subsequent lines, so make sure to clean up with `nop`s or use the same size. (Operating on smaller parts of the register (e.g. `al`/`eax` for `rax`) is usually shorter.)
To force a function to return a certain value, use `mov rax,<stuff>` followed by `ret`. If not at the top of the function, you have to worry about stack stuff (TODO).
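For the return-value case at the very top of a function, the bytes are simple enough to write by hand. A sketch, assuming a 32-bit value is enough: it uses the shorter `mov eax, imm32` form (`B8` plus four little-endian bytes, which zero-extends into `rax`) and then `ret` (`C3`), six bytes total. `patch_return_value` is a made-up helper name, and it pads the rest of the range you hand it with `nop`s so the disassembly stays tidy.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Write `mov eax, value; ret` at the start of `range` and fill the rest of
 * the range with NOPs. Returns 0 on success, -1 if the range is too small. */
int patch_return_value(uint8_t *range, size_t range_size, uint32_t value)
{
    const size_t stub_size = 6;             /* B8 imm32 (5 bytes) + C3 (1 byte) */
    if (range_size < stub_size) {
        return -1;
    }
    range[0] = 0xB8;                        /* mov eax, imm32 */
    range[1] = (uint8_t)(value & 0xFF);     /* immediate, little-endian */
    range[2] = (uint8_t)((value >> 8) & 0xFF);
    range[3] = (uint8_t)((value >> 16) & 0xFF);
    range[4] = (uint8_t)((value >> 24) & 0xFF);
    range[5] = 0xC3;                        /* ret */
    memset(range + stub_size, 0x90, range_size - stub_size);
    return 0;
}
```

The range should end on an instruction boundary of the original code, same as with plain `nop`ping.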
TODO: more screenshots
# DYLD Cache
- check the ramdisk for uncached binaries! (low-ish level, not AppKit or anything sadly)
- Apple's extractor (this will give very broken binaries, but it's great for dumping them all to a folder so you can `grep` for mystery error strings)
- Hopper and IDA can both load semi-broken binaries
- Catalina (the last release that still had the system libraries as standalone files on disk) can be helpful as a last resort
- TODO: correlating symbols from lldb/backtrace with sub_... crap in disassemblers
- no known (to me) way to get out working binaries, please prove me wrong...
TODO: research more tools, Flag sent some?
# OBJC iVars
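Sometimes (in Hopper, IDA doesn't seem to handle this) you will see named ivar structures with a small numeric value being added to an `id` reference. That's a direct ivar access: the number is the ivar's byte offset inside the object, which is what a synthesized `@property` accessor boils down to internally.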
TODO: definitely needs some screenshots. should also just cover the "struct pointer + offset" concept in general.
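Until there are screenshots, here's the "struct pointer + offset" idea as plain C. `fake_window` is a made-up layout standing in for an object's ivars, not a real AppKit class. The first function is what the source code would say, the second is roughly what the disassembly shows: take the base pointer, add a byte offset, read from there.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical object layout, standing in for an ObjC object's ivars. */
struct fake_window {
    void   *isa;      /* class pointer, first field of every object */
    int32_t level;
    double  alpha;
};

/* What the source code says. */
double get_alpha(const struct fake_window *w)
{
    return w->alpha;
}

/* What the disassembly effectively does: base pointer + byte offset. */
double get_alpha_raw(const void *object)
{
    const unsigned char *base = (const unsigned char *)object;
    double value;
    /* offsetof gives the same number the disassembler shows being added */
    memcpy(&value, base + offsetof(struct fake_window, alpha), sizeof value);
    return value;
}
```

On a typical 64-bit target `offsetof(struct fake_window, alpha)` comes out as 16 (8 for the pointer, 4 for the int, 4 of padding), so in a disassembler this would look like a read from `[rdi+0x10]`.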
# Misc. macOS Resources
- OSStatus (look up macOS error codes)
- fuckingblocksyntax.com
- Jonathan Levin's books (extremely detailed, PDFs on libgen)
- COMP2401 Memory Management PDF
- TODO: that kernel programming book Flag sent
TODO: there are so many more
# LLDB Tips
- run `attach -w -n <process name>` before launching the victim process and it will attach and pause when it has barely started at all
- `b <function/selector name/address>` to set a breakpoint
- `b -r '<regex>'` for setting multiple breakpoints at once, can be very powerful e.g. `b -r '.*NSWindow.*' -C bt -G true` to log all NSWindow calls without stopping
- `d` to print assembly of current function, `n` to step without going into calls, `s` to step going into calls, `c` to resume
- `settings set target.x86-disassembly-flavor intel` for familiar assembly syntax (can put in `~/.lldbinit`)
TODO: example?
# ObjC Swizzling
- see `Utils/Swizzle.m` in the Monterey repo for how to swizzle using an IMP (C-style function) and here for an example implementation.
- alternate (and worse, IMO) approach
# DYLD Interposing
- pretty straightforward but you have to paste in that weird macro, TODO
Note that swizzling/interposing is not only good for "fixing" stuff, but also for dynamic analysis -- logging and experimenting with changing values. I do this a lot.
TODO: `DYLD_INSERT_LIBRARIES` (including launchctl and __XPC use, AMFI problems etc)
# Console/`log`/`dmesg`
Not to be underestimated. Make sure to enable debug/info messages. `log show -last boot -debug -info --predicate 'message contains "some shit"'` is very useful.
# sample
Run it either on the command-line or through Activity Monitor. It prints a call graph of the functions a process is spending its time in. Really great for finding places to start looking in a disassembler and/or seeing if a process is actually doing anything.
TODO: lsof/fs_usage
TODO: so much more to put here...
TODO: would a complete sample workflow of creating a shim function (something simple like the dock ones or accessibility zoom) be useful?
TODO: learn about mach_msg/MIG and document it here (currently all i really know is SLS... corresponding to _X... which is SL specific). similarly, IOConnect... calls. and other XPC stuff