Mach-O Binary Format

A plain-English dictionary for the Mach-O binary format — what every header field, segment, section, and load command actually means, why the layout exists the way it does, and how to inspect it yourself with the tools Apple ships.

Every executable, framework, dylib, and object file on Apple platforms is a Mach-O file. The name stands for Mach Object — a leftover from the Mach microkernel that NeXT adopted in the late 1980s and Apple inherited with macOS. Understanding its layout turns linker errors, crash reports, and code-signing failures from opaque noise into something you can read and reason about.


Mach-O File

What it is: a binary container format used for executables (the main binary inside a .app), dynamic libraries (.dylib), object files (.o), and the binaries inside kernel extensions (.kext) on macOS, iOS, watchOS, tvOS, and visionOS. Static libraries (.a) are ar archives whose members are Mach-O object files.

Why it exists: every operating system needs a contract between the compiler, the linker, and the OS loader. Mach-O is Apple’s version of that contract. It tells dyld (the dynamic linker) exactly where to load the binary in memory, what libraries to pull in, where the symbols live, and how to verify the binary hasn’t been tampered with.

How to inspect it:

# Print all load commands for the main binary of an app
otool -l /Applications/Safari.app/Contents/MacOS/Safari

# Print the Mach-O header only
otool -h /usr/bin/swift

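# Show all symbols (exported + local)
# Note: on macOS 11 and later most system dylibs live only in the dyld shared
# cache, so this path may not exist on disk; any binary you built works too.
nm -a /usr/lib/libSystem.B.dylib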

# Show segment sizes
size -m /usr/bin/swift

The overall structure of any Mach-O file, from byte 0 downward, is:

┌─────────────────────┐  ← byte 0
│   Mach-O Header     │  magic, CPU type, file type, load command count
├─────────────────────┤
│   Load Commands     │  LC_SEGMENT_64, LC_LOAD_DYLIB, LC_CODE_SIGNATURE, …
├─────────────────────┤
│   Raw Data          │  actual code, data, symbol tables, strings
└─────────────────────┘

Mach-O Header (mach_header_64)

What it is: the first 32 bytes of every 64-bit Mach-O file. It identifies the file as a Mach-O binary and describes its basic properties.

Why it matters: the kernel and dyld read these bytes before doing anything else. The magic number tells them the word size and byte order; the CPU type tells them whether this binary runs on this hardware; the file type tells them whether this is an executable, a library, or an object file.

The fields:

Field        Type            Meaning
magic        uint32_t        0xFEEDFACF (MH_MAGIC_64: 64-bit, same byte order as the reader) or 0xCFFAEDFE (MH_CIGAM_64: byte-swapped)
cputype      cpu_type_t      CPU_TYPE_ARM64, CPU_TYPE_X86_64, etc.
cpusubtype   cpu_subtype_t   CPU variant, e.g. CPU_SUBTYPE_ARM64E for pointer authentication
filetype     uint32_t        MH_EXECUTE, MH_DYLIB, MH_OBJECT, MH_BUNDLE, MH_CORE
ncmds        uint32_t        Number of load commands that follow
sizeofcmds   uint32_t        Total byte size of all load commands
flags        uint32_t        Feature flags: MH_PIE, MH_TWOLEVEL, MH_NO_REEXPORTED_DYLIBS, …
reserved     uint32_t        Unused; pads the 64-bit header to 32 bytes

How to read it:

otool -h /usr/bin/swift
# Mach header
#       magic cputype cpusubtype  caps    filetype ncmds sizeofcmds      flags
#  0xfeedfacf 16777228          0  0x00          2    42       5824 0x00200085

filetype 2 = MH_EXECUTE (runnable program). filetype 6 = MH_DYLIB. filetype 1 = MH_OBJECT (.o from the compiler).
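
The same fields can be decoded by hand. The following is a minimal Python sketch, not a full parser: it assumes a thin (non-fat) 64-bit file, reads every field as an unsigned 32-bit integer, and the helper name parse_mach_header_64 is made up for illustration. The sample bytes are synthetic and simply mirror the otool output above.

```python
import struct

MH_MAGIC_64 = 0xFEEDFACF   # read as little-endian: the file is little-endian
MH_CIGAM_64 = 0xCFFAEDFE   # read as little-endian: the file is big-endian
MH_PIE = 0x00200000
FILETYPES = {1: "MH_OBJECT", 2: "MH_EXECUTE", 4: "MH_CORE",
             6: "MH_DYLIB", 8: "MH_BUNDLE"}


def parse_mach_header_64(data: bytes) -> dict:
    """Decode the 32-byte mach_header_64 at the start of a thin 64-bit Mach-O."""
    (magic,) = struct.unpack_from("<I", data, 0)
    if magic == MH_MAGIC_64:
        order = "<"
    elif magic == MH_CIGAM_64:
        order = ">"
    else:
        raise ValueError(f"not a 64-bit Mach-O header (magic {magic:#010x})")
    # Field order: magic, cputype, cpusubtype, filetype, ncmds, sizeofcmds,
    # flags, reserved. Eight 32-bit values, 32 bytes in total.
    (_, cputype, cpusubtype, filetype,
     ncmds, sizeofcmds, flags, _) = struct.unpack_from(order + "8I", data, 0)
    return {
        "cputype": cputype,
        "cpusubtype": cpusubtype,
        "filetype": FILETYPES.get(filetype, str(filetype)),
        "ncmds": ncmds,
        "sizeofcmds": sizeofcmds,
        "pie": bool(flags & MH_PIE),
    }


# Synthetic header with the values from the otool -h example (arm64, MH_EXECUTE).
sample = struct.pack("<8I", MH_MAGIC_64, 16777228, 0, 2, 42, 5824, 0x00200085, 0)
header = parse_mach_header_64(sample)
print(header)
```

To try it on a real file, pass the first 32 bytes of a thin binary (for example a slice extracted with lipo -thin) instead of the synthetic sample.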


Fat Binary / Universal Binary

What it is: a file that wraps multiple Mach-O binaries for different CPU architectures inside a single container. It begins with a fat_header followed by an array of fat_arch structs, each pointing to a slice at a different offset.

Why it exists: Apple silicon (arm64) and Intel (x86_64) Macs coexist. Rather than shipping separate downloads, a Universal Binary lets the OS pick the right slice at launch. Apple relied on this for the Intel to Apple silicon transition (alongside Rosetta 2) and relies on it again for the arm64 / arm64e split introduced with pointer authentication: one file carries a slice for every architecture it supports.

How to inspect and create them:

# Show what architectures are in a fat binary
lipo -info /usr/bin/python3
# Architectures in the fat file: /usr/bin/python3 are: x86_64 arm64

# Extract a single architecture slice
lipo /usr/bin/python3 -thin arm64 -output python3_arm64

# Create a fat binary from two slices
lipo -create python3_x86_64 python3_arm64 -output python3_universal

The fat_header magic is 0xCAFEBABE (big-endian, always — even on little-endian systems). dyld reads the fat header, finds the best matching fat_arch for the current CPU, seeks to that offset, and then reads the embedded Mach-O header as normal.
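
As a rough illustration of that lookup, the Python sketch below walks the fat_arch array of a 32-bit fat header. The function name list_slices and the sample offsets and sizes are invented for the example; only the layout (big-endian magic and count, then five 32-bit fields per entry) comes from the format.

```python
import struct

FAT_MAGIC = 0xCAFEBABE
CPU_NAMES = {0x01000007: "x86_64", 0x0100000C: "arm64"}


def list_slices(data: bytes) -> list:
    """Return (arch, offset, size) for each fat_arch entry."""
    # fat_header: magic, nfat_arch. Always stored big-endian.
    magic, nfat_arch = struct.unpack_from(">II", data, 0)
    if magic != FAT_MAGIC:
        raise ValueError("not a fat binary")
    slices = []
    for i in range(nfat_arch):
        # fat_arch: cputype, cpusubtype, offset, size, align (20 bytes).
        cputype, _sub, offset, size, _align = struct.unpack_from(
            ">5I", data, 8 + 20 * i)
        slices.append((CPU_NAMES.get(cputype, hex(cputype)), offset, size))
    return slices


# Synthetic two-slice header; offsets and sizes are made up.
sample = struct.pack(">II", FAT_MAGIC, 2)
sample += struct.pack(">5I", 0x01000007, 3, 16384, 70000, 14)
sample += struct.pack(">5I", 0x0100000C, 0, 98304, 68000, 14)
slices = list_slices(sample)
print(slices)
```

Each returned offset is where a complete thin Mach-O file begins, so the header at that offset can be parsed exactly like a standalone binary.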


Load Commands

What they are: a variable-length list of instructions that immediately follows the Mach-O header. Each load command begins with a cmd field (what kind of command) and a cmdsize field (how many bytes it occupies), followed by command-specific fields.

Why they exist: a static binary format can’t anticipate everything that might vary between binaries. Some have code signatures, some have encryption info, some link against 3 libraries, some against 300. Load commands are an extensible list that encodes only what is needed. dyld iterates them in order and acts on each one it understands. It can skip types it doesn’t recognize, unless the command has the LC_REQ_DYLD bit (0x80000000) set, which marks it as one the loader must understand to run the binary.

The most important types:

Command                 Purpose
LC_SEGMENT_64           Maps a region of the file into the process’s virtual address space
LC_DYLD_INFO_ONLY       Tells dyld where to find binding, rebasing, and export info (binaries built for recent OS versions use LC_DYLD_CHAINED_FIXUPS and LC_DYLD_EXPORTS_TRIE instead)
LC_SYMTAB               Points to the symbol table and string table
LC_DYSYMTAB             Index of which symbols are local, external, or undefined
LC_LOAD_DYLIB           A dynamic library this binary depends on
LC_RPATH                A directory to search for @rpath-relative libraries
LC_CODE_SIGNATURE       Offset and size of the code signing blob
LC_ENCRYPTION_INFO_64   Marks an encrypted region (App Store FairPlay DRM)
LC_UUID                 A unique 128-bit ID for this binary; matches the dSYM for symbolication
LC_SOURCE_VERSION       The source code version number embedded at link time
LC_BUILD_VERSION        Target platform, minimum OS version, and SDK version
LC_MAIN                 Entry point offset for MH_EXECUTE binaries (replaced LC_UNIXTHREAD)

# List all load commands with their details
otool -l MyApp.app/Contents/MacOS/MyApp | head -80
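
The cmd / cmdsize pair is what makes the list walkable without understanding every command. A small Python sketch of that walk follows; it assumes a little-endian 64-bit file, knows only a handful of command numbers, and iter_load_commands is a name chosen for this example.

```python
import struct

LC_REQ_DYLD = 0x80000000
LC_NAMES = {
    0x02: "LC_SYMTAB",
    0x0C: "LC_LOAD_DYLIB",
    0x19: "LC_SEGMENT_64",
    0x1B: "LC_UUID",
    0x1D: "LC_CODE_SIGNATURE",
    0x1C | LC_REQ_DYLD: "LC_RPATH",
    0x28 | LC_REQ_DYLD: "LC_MAIN",
}
HEADER_SIZE = 32  # sizeof(mach_header_64)


def iter_load_commands(data: bytes):
    """Yield (name, offset, cmdsize) for each load command."""
    # ncmds and sizeofcmds sit at byte offsets 16 and 20 of the header.
    ncmds, sizeofcmds = struct.unpack_from("<II", data, 16)
    offset, end = HEADER_SIZE, HEADER_SIZE + sizeofcmds
    for _ in range(ncmds):
        cmd, cmdsize = struct.unpack_from("<II", data, offset)
        if cmdsize < 8 or offset + cmdsize > end:
            raise ValueError("malformed load command")
        yield LC_NAMES.get(cmd, hex(cmd)), offset, cmdsize
        offset += cmdsize  # skip to the next command, known type or not


# Synthetic file: a header followed by an LC_UUID and an empty LC_SYMTAB.
uuid_cmd = struct.pack("<II16s", 0x1B, 24, bytes(16))
symtab_cmd = struct.pack("<6I", 0x02, 24, 0, 0, 0, 0)
mach_header = struct.pack("<8I", 0xFEEDFACF, 0x0100000C, 0, 2, 2, 48, 0, 0)
commands = list(iter_load_commands(mach_header + uuid_cmd + symtab_cmd))
print(commands)
```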

Segments

What they are: the coarse-grained memory regions of a Mach-O binary, defined by LC_SEGMENT_64 load commands. Each segment has a name, a virtual memory address, a size, and a set of permissions (read / write / execute).

Why the distinction exists: different regions of a binary need different memory protections. Executable code must be readable and executable but not writable (to prevent code injection). Mutable globals must be writable but should not be executable (W^X — Write XOR Execute). Segments encode these permissions so the kernel can enforce them when mapping the file.

The standard segments:

Segment        Permissions                        Contains
__TEXT         r-x (read, execute)                Machine code, string literals, Objective-C metadata, Swift metadata
__DATA         rw- (read, write)                  Mutable globals, __objc_data, __got (global offset table), __la_symbol_ptr
__DATA_CONST   r-- (read only after dyld binds)   Immutable pointers, __objc_classlist, __objc_protolist; made read-only after binding
__LINKEDIT     r-- (read only)                    Symbol table, string table, dyld info, code signature; metadata only, no code or program data

otool -l MyApp | grep -A 9 "^Load command" | grep -A 6 "segname __TEXT"
# segname __TEXT
#  vmaddr 0x0000000100000000
#  vmsize 0x0000000000004000
#  fileoff 0
#  filesize 16384
#  maxprot 0x00000005   (r-x)
#  initprot 0x00000005  (r-x)
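
For readers who want to see where those numbers come from, here is a hedged Python sketch that unpacks one segment_command_64 and turns the protection bits into the familiar r/w/x string. It assumes a little-endian file; parse_segment and prot_string are illustrative names, and the sample command is synthetic.

```python
import struct

# segment_command_64: cmd, cmdsize, segname[16], vmaddr, vmsize, fileoff,
# filesize, maxprot, initprot, nsects, flags (72 bytes).
SEGMENT_64 = struct.Struct("<II16sQQQQiiII")


def prot_string(prot: int) -> str:
    """Render VM_PROT_READ (1), VM_PROT_WRITE (2), VM_PROT_EXECUTE (4)."""
    return (("r" if prot & 1 else "-") +
            ("w" if prot & 2 else "-") +
            ("x" if prot & 4 else "-"))


def parse_segment(data: bytes, offset: int = 0) -> dict:
    (_cmd, _cmdsize, segname, vmaddr, vmsize, fileoff, filesize,
     maxprot, initprot, nsects, _flags) = SEGMENT_64.unpack_from(data, offset)
    return {
        "segname": segname.rstrip(b"\0").decode("ascii"),
        "vmaddr": vmaddr,
        "vmsize": vmsize,
        "fileoff": fileoff,
        "filesize": filesize,
        "maxprot": prot_string(maxprot),
        "initprot": prot_string(initprot),
        "nsects": nsects,
    }


# Synthetic __TEXT command matching the otool output above (no sections).
sample = SEGMENT_64.pack(0x19, 72, b"__TEXT", 0x100000000, 0x4000,
                         0, 16384, 5, 5, 0, 0)
segment = parse_segment(sample)
print(segment)
```

In a real file the nsects section headers follow the segment command directly, which is why cmdsize is usually larger than 72.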

Sections

What they are: subdivisions within a segment. A segment sets the memory permissions and overall address range; sections organize the content within it. Section names are written as __SEGMENT,__section: the segment name in uppercase, the section name in lowercase.

Why they exist: the linker, runtime, and debugger all need to find specific kinds of content quickly, such as Swift metadata, Objective-C selectors, or unwind tables for stack unwinding. Sections give each kind of content a well-known name, so a tool can look up the section and go straight to its address range instead of scanning the file.

Common sections in __TEXT:

Section                   Contains
__TEXT,__text             Compiled machine code
__TEXT,__stubs            Short trampolines that jump to lazily-bound external functions
__TEXT,__stub_helper      Code that triggers lazy symbol binding on first call
__TEXT,__objc_methnames   Objective-C selector name strings
__TEXT,__cstring          Null-terminated C string literals
__TEXT,__swift5_types     Swift type descriptors
__TEXT,__unwind_info      Compact unwind tables for exception handling

Common sections in __DATA (newer toolchains place some of these, such as __got and __objc_classlist, in __DATA_CONST instead):

Section                   Contains
__DATA,__got              Non-lazily bound external symbol pointers (filled at load time)
__DATA,__la_symbol_ptr    Lazily bound external symbol pointers (filled on first call)
__DATA,__objc_classlist   Pointers to all Objective-C class descriptors
__DATA,__objc_selrefs     Pointers to selector strings (deduplicated at link time)
__DATA,__bss              Zero-initialized global variables (not stored on disk, just a size)
__DATA,__common           Uninitialized globals from C

# List all sections in a binary
otool -l MyApp | grep -E "sectname|segname"

# Or use size for a summary
size -m MyApp.app/Contents/MacOS/MyApp

__LINKEDIT Segment

What it is: a special read-only segment that acts as the binary’s metadata store. It is mapped into the process like any other segment, but it holds no code or program data, only the tables that dyld and other tools consult. dyld reads it during loading, and afterwards its pages may be discarded.

Why it’s separate: keeping all linker metadata in one contiguous region means dyld only has to map one additional range of pages to do all its work (binding, symbol lookup, signature verification). It also means the kernel can discard __LINKEDIT pages under memory pressure once binding is done, because they’re file-backed and recreatable.

What lives in __LINKEDIT:

  • The symbol table (nlist_64 entries)
  • The string table (null-terminated symbol names)
  • The indirect symbol table (indexes into the above for stubs and GOT slots)
  • Export trie (a prefix-compressed trie of all exported symbol names and addresses)
  • Rebase info (how to adjust pointers for ASLR)
  • Binding info (external symbols to resolve at load time)
  • Lazy binding info (symbols resolved on first call)
  • Code signature blob

Symbol Table (LC_SYMTAB)

What it is: a table of nlist_64 structs, each representing one symbol — a function, a global variable, a debug entry, or an undefined external reference. Parallel to it is a string table: a blob of null-terminated names that entries index into.

Why it matters: the linker uses the symbol table to resolve references between .o files and libraries. The debugger uses it to map addresses back to function names. The dynamic linker uses the export trie (a modern, compressed version) to look up symbols across dylibs.

nlist_64 fields:

Field         Meaning
n_un.n_strx   Byte offset into the string table for this symbol’s name
n_type        Bitmask encoding type, external flag, and debug flag
n_sect        Which section this symbol is defined in (0 = undefined)
n_desc        Additional info: weak reference flag, library ordinal, etc.
n_value       Address (for defined symbols) or 0 (for undefined)

# Show all exported symbols in a dylib
nm -gU /usr/lib/libSystem.B.dylib | head -20

# Show undefined symbols (external dependencies) in your binary
nm -u MyApp.app/Contents/MacOS/MyApp

# Demangle Swift symbol names
nm MyApp.app/Contents/MacOS/MyApp | xcrun swift-demangle | grep "MyClass"
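
To make the relationship between the two tables concrete, the Python sketch below reads nlist_64 entries and resolves each name through the string table. It is a simplified illustration assuming little-endian data; read_symbols is a made-up helper, and the two symbols and their values are synthetic.

```python
import struct

# nlist_64: n_strx (u32), n_type (u8), n_sect (u8), n_desc (u16), n_value (u64)
NLIST_64 = struct.Struct("<IBBHQ")
N_STAB, N_TYPE, N_EXT = 0xE0, 0x0E, 0x01   # masks applied to n_type
N_UNDF, N_SECT = 0x00, 0x0E                # values of (n_type & N_TYPE)


def read_symbols(symtab: bytes, strtab: bytes) -> list:
    symbols = []
    for offset in range(0, len(symtab), NLIST_64.size):
        n_strx, n_type, n_sect, n_desc, n_value = NLIST_64.unpack_from(
            symtab, offset)
        # The name runs from n_strx up to the next NUL byte.
        name = strtab[n_strx:strtab.index(b"\0", n_strx)].decode("utf-8")
        if n_type & N_STAB:
            kind = "debug"
        elif (n_type & N_TYPE) == N_UNDF:
            kind = "undefined"
        elif (n_type & N_TYPE) == N_SECT:
            kind = "defined"
        else:
            kind = "other"
        symbols.append({"name": name, "kind": kind,
                        "external": bool(n_type & N_EXT), "value": n_value})
    return symbols


# Synthetic tables: one defined symbol and one undefined import.
strtab = b"\0_main\0_printf\0"
symtab = NLIST_64.pack(1, N_SECT | N_EXT, 1, 0, 0x100003F00)
symtab += NLIST_64.pack(7, N_UNDF | N_EXT, 0, 1 << 8, 0)
symbols = read_symbols(symtab, strtab)
print(symbols)
```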

LC_LOAD_DYLIB

What it is: a load command that records a dynamic library this binary depends on. It contains the library’s path (its “install name”), its compatibility version, and the current version the binary was linked against.

Why it matters: at launch, dyld reads every LC_LOAD_DYLIB, resolves each path to a file on disk, maps that file, and links the binary’s undefined symbols against the library’s exports. If any required library is missing, the process terminates immediately with dyld: Library not loaded.

How to inspect dependencies:

# Show all dylib dependencies of a binary
otool -L MyApp.app/Contents/MacOS/MyApp
# /usr/lib/libobjc.A.dylib (compatibility version 1.0.0, current version 228.0.0)
# /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1319.0.0)
# /System/Library/Frameworks/Foundation.framework/Versions/C/Foundation
#   (compatibility version 300.0.0, current version 1952.0.0)

# Show the same for a specific framework
otool -L /System/Library/Frameworks/UIKit.framework/UIKit

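The static linker copies the library’s compatibility version into the binary at build time, and dyld compares it against the installed library at load time. If the installed library’s compatibility version is lower than what was recorded, the load fails. This keeps a binary from running against a library too old to provide the interface it was built for.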
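
Both versions are stored as a single 32-bit value packed as X.Y.Z with 16, 8, and 8 bits per component. The short Python sketch below shows the packing and the comparison described above; the function names are illustrative, and the check is a simplified model rather than dyld’s actual code.

```python
def encode_version(major: int, minor: int = 0, patch: int = 0) -> int:
    """Pack X.Y.Z into the 16.8.8-bit layout used in dylib load commands."""
    return (major << 16) | (minor << 8) | patch


def decode_version(packed: int) -> str:
    return f"{packed >> 16}.{(packed >> 8) & 0xFF}.{packed & 0xFF}"


def is_compatible(recorded_compat: int, installed_compat: int) -> bool:
    """The installed library must not be older than what the binary recorded."""
    return installed_compat >= recorded_compat


foundation_compat = encode_version(300)
print(hex(foundation_compat), decode_version(foundation_compat))
print(is_compatible(foundation_compat, encode_version(1)))
```

Because the most significant component sits in the high bits, comparing the packed integers directly orders versions correctly.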


Install Name

What it is: the canonical path embedded inside a dylib itself (in its LC_ID_DYLIB load command) that other binaries record when they link against it. When you run otool -L on a binary and see a path like /usr/lib/libSystem.B.dylib, that string came from libSystem.B.dylib’s own install name.

Why it matters: dyld resolves libraries by their recorded install name, not by where the file currently sits on disk. This is why moving a dylib to a different path breaks linking even if the file is still accessible — the path recorded in dependent binaries no longer matches the file’s location.

Path prefixes and their meanings:

Prefix                    Meaning
/usr/lib/ or /System/     Absolute path; the OS guarantees the file is there
@executable_path/         Relative to the directory containing the main executable
@loader_path/             Relative to the directory containing the binary that has this load command
@rpath                    Searched through the LC_RPATH list at runtime

# See a dylib's own install name
otool -D MyFramework.framework/MyFramework
# MyFramework.framework/MyFramework: @rpath/MyFramework.framework/MyFramework

# Change it after the fact
install_name_tool -id "@rpath/MyFramework.framework/MyFramework" MyFramework

# Change a dependency path recorded in a binary
install_name_tool -change \
    "/old/path/libFoo.dylib" \
    "@rpath/libFoo.dylib" \
    MyApp

@rpath and LC_RPATH

What they are: @rpath is a placeholder prefix in a dylib’s install name that means “search the rpath list.” LC_RPATH load commands in the main executable (or any other binary in the chain) add directories to that list.

Why it exists: before @rpath, embedding a framework inside an app bundle required either a fixed absolute path (fragile) or @loader_path gymnastics. @rpath lets the dylib’s install name stay generic (@rpath/MyFramework.framework/MyFramework) while the executable declares where to look at runtime.

How it resolves: when dyld encounters a dependency whose path starts with @rpath, it substitutes each directory from the accumulated rpath list in order until it finds the file.

# See what rpath entries are embedded in a binary
otool -l MyApp | grep -A 2 LC_RPATH
# cmd LC_RPATH
# cmdsize 48
# path /usr/lib/swift (offset 12)

# Add an rpath entry
install_name_tool -add_rpath @executable_path/../Frameworks MyApp

# Delete one
install_name_tool -delete_rpath /usr/lib/swift MyApp
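
The substitution can be modeled in a few lines. The Python sketch below is a deliberately simplified model of the search order, not dyld’s implementation: it ignores environment variables, the shared cache, and fallback paths, and the names expand and rpath_candidates plus the example app paths are assumptions made for illustration.

```python
import posixpath


def expand(path: str, executable_dir: str, loader_dir: str) -> str:
    """Replace a leading @executable_path/ or @loader_path/ with a directory."""
    for prefix, base in (("@executable_path/", executable_dir),
                         ("@loader_path/", loader_dir)):
        if path.startswith(prefix):
            return posixpath.join(base, path[len(prefix):])
    return path


def rpath_candidates(install_name, rpaths, executable_dir, loader_dir):
    """Return the paths that would be tried, in order, for one dependency."""
    if not install_name.startswith("@rpath/"):
        return [posixpath.normpath(
            expand(install_name, executable_dir, loader_dir))]
    tail = install_name[len("@rpath/"):]
    return [posixpath.normpath(
                posixpath.join(expand(r, executable_dir, loader_dir), tail))
            for r in rpaths]


exe_dir = "/Applications/MyApp.app/Contents/MacOS"
candidates = rpath_candidates(
    "@rpath/MyFramework.framework/MyFramework",
    ["@executable_path/../Frameworks", "/usr/lib/swift"],
    executable_dir=exe_dir,
    loader_dir=exe_dir,
)
for path in candidates:
    print(path)
```

The first candidate that exists on disk wins, which is why the order of LC_RPATH entries matters when two directories contain a library with the same name.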

ASLR and Rebasing

What ASLR is: Address Space Layout Randomization. The kernel slides the entire binary to a random base address at each launch instead of loading it at its preferred virtual address. This makes exploits that rely on known absolute addresses much harder.

Why Mach-O needs to cooperate: if a binary contains absolute pointer values (e.g., a pointer to a global variable stored in __DATA), those pointers are wrong after the slide. The Mach-O format encodes a “rebase” list in __LINKEDIT that records every such pointer. dyld adds the slide to each one at load time.

The MH_PIE flag: a Mach-O header flag that opts the binary into ASLR. All modern executables set this flag; the linker adds it by default. Without it, the binary always loads at its preferred address (typically 0x100000000 on 64-bit macOS), so every address inside it is predictable to an attacker.

# Check if a binary is PIE
# (-v prints the flags symbolically; plain -h shows only the numeric value)
otool -hv MyApp | grep PIE
# PIE appears in the flags column if MH_PIE is set

# The slide applied to a running process is visible in lldb
# (lldb) image list -o MyApp
# shows the load address offset from preferred
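
The rebase step itself is simple arithmetic. The Python sketch below applies a slide to a list of pointer locations inside a stand-in data segment. Real binaries encode those locations as a compact opcode stream or as chained fixups rather than a plain list, so the list of offsets here, the apply_rebases name, and the addresses are all assumptions for illustration.

```python
import struct

MASK_64 = 0xFFFFFFFFFFFFFFFF


def apply_rebases(segment: bytearray, pointer_offsets, slide: int) -> None:
    """Add the slide to every 64-bit pointer stored at the given offsets."""
    for offset in pointer_offsets:
        (value,) = struct.unpack_from("<Q", segment, offset)
        struct.pack_into("<Q", segment, offset, (value + slide) & MASK_64)


preferred_base = 0x100000000
actual_base = 0x104A3C000            # made-up load address for this launch
slide = actual_base - preferred_base

# Stand-in __DATA contents: a pointer, then an ordinary integer, then a pointer.
data = bytearray(struct.pack("<QQQ", 0x100008010, 42, 0x100004000))
apply_rebases(data, [0, 16], slide)  # only the pointer slots are rebased

rebased = struct.unpack("<QQQ", data)
print([hex(v) for v in rebased])
```

Values that are not listed as pointers, like the integer 42 in the middle slot, are left alone, which is why the format has to record exactly which slots hold addresses.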

LC_CODE_SIGNATURE

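What it is: a load command that records the file offset and size of the code signature blob appended at the end of the binary. The blob contains a code directory with a hash of every page of the file that precedes the signature (not only __TEXT), plus the entitlements and the cryptographic signature that identifies the signer. A notarization ticket is not part of this blob; it is stapled to the app bundle, disk image, or installer package instead.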

Why it matters: on iOS, watchOS, and tvOS, every binary must be signed or the kernel refuses to execute it. On macOS, Gatekeeper checks the signature on first launch, and hardened runtime processes require a valid signature to load third-party code. The code signature is also how the OS enforces entitlements — capabilities like push notifications, iCloud, or keychain access groups.

How to inspect and verify:

# Show the signature details
codesign -dvvv MyApp.app

# Verify the signature
codesign --verify --deep --strict MyApp.app

# Show entitlements embedded in the signature
codesign -d --entitlements :- MyApp.app

# Show where the code signing blob sits in the file (dataoff and datasize)
otool -l MyApp | grep -A 4 LC_CODE_SIGNATURE
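
The per-page hashing idea can be sketched as follows. This Python example is only an illustration of why a single changed byte invalidates a signature: it assumes 4096-byte pages and SHA-256, uses a block of zero bytes as a stand-in binary, and omits everything else a real code directory contains. The function names are invented for the example.

```python
import hashlib

PAGE_SIZE = 4096  # assumed hashing page size


def page_hashes(data: bytes, code_limit: int) -> list:
    """Hash each page of data[:code_limit]; the final page may be short."""
    return [hashlib.sha256(data[start:min(start + PAGE_SIZE, code_limit)]).hexdigest()
            for start in range(0, code_limit, PAGE_SIZE)]


def changed_pages(signed: list, current: list) -> list:
    """Indexes of pages whose hash no longer matches the signed value."""
    return [i for i, (a, b) in enumerate(zip(signed, current)) if a != b]


binary = bytes(10000)                       # stand-in for the signed file
signed = page_hashes(binary, len(binary))   # recorded at signing time

tampered = bytearray(binary)
tampered[5000] = 0x90                       # patch one byte in the second page
changed = changed_pages(signed, page_hashes(bytes(tampered), len(tampered)))
print(len(signed), changed)
```

Hashing page by page, rather than the file as a whole, lets each page be checked when it is first brought into memory instead of requiring the entire file to be read up front.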

LC_UUID

What it is: a 128-bit universally unique identifier that the linker embeds in every binary it produces. By default the linker derives it from the contents of the output, so any change to the code or data yields a different UUID.

Why it matters: crash reports contain the UUID of the binary that crashed. The symbolication tool (atos, Instruments, Xcode Organizer) uses this UUID to find the matching .dSYM debug symbol bundle and map raw addresses back to file names and line numbers. If the UUID in the crash report doesn’t match the UUID of any .dSYM you have, you cannot symbolicate that crash.

# Show the UUID of a binary
dwarfdump --uuid MyApp.app/Contents/MacOS/MyApp

# Show the UUID stored inside a dSYM
dwarfdump --uuid MyApp.app.dSYM/Contents/Resources/DWARF/MyApp

# Both must match for symbolication to work
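
The load command behind those values is tiny: cmd, cmdsize, and 16 raw bytes. A minimal Python sketch of decoding it and doing the comparison follows; it assumes little-endian data, the helper name uuid_from_command is invented, and the UUID bytes are synthetic.

```python
import struct
import uuid

LC_UUID = 0x1B


def uuid_from_command(command: bytes) -> str:
    """Decode a uuid_command (cmd, cmdsize, uuid[16]) into the usual text form."""
    cmd, cmdsize, raw = struct.unpack_from("<II16s", command, 0)
    if cmd != LC_UUID or cmdsize != 24:
        raise ValueError("not an LC_UUID command")
    return str(uuid.UUID(bytes=raw)).upper()


# Synthetic commands standing in for the app binary and its dSYM.
binary_cmd = struct.pack("<II16s", LC_UUID, 24, bytes(range(16)))
dsym_cmd = struct.pack("<II16s", LC_UUID, 24, bytes(range(16)))

binary_uuid = uuid_from_command(binary_cmd)
print(binary_uuid)
print("dSYM matches:", binary_uuid == uuid_from_command(dsym_cmd))
```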

dSYM

What it is: a directory bundle (MyApp.app.dSYM) that contains the DWARF debug information stripped out of the release binary during the Xcode build. DWARF (Debugging With Attributed Record Formats) encodes line number tables, type information, and variable locations.

Why it’s separate: shipping DWARF inside the binary makes it much larger and exposes source file paths to anyone who opens the binary. Stripping debug info produces a smaller, faster-to-sign binary. The dSYM lives next to the archive and is uploaded to App Store Connect alongside the binary so Apple can symbolicate crash reports on your behalf.

How to symbolicate manually:

# Symbolicate a single address from a crash report
# -l is the address the image was loaded at (from the report's Binary Images
# list); the last argument is the runtime address to resolve
atos -arch arm64 -o MyApp.app.dSYM/Contents/Resources/DWARF/MyApp \
     -l 0x100000000 \
     0x10000432c

# Symbolicate a whole crash report
symbolicatecrash MyCrash.crash MyApp.app.dSYM > symbolicated.crash
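
The address arithmetic behind the -l option can be checked by hand. The Python sketch below converts a runtime address from a crash report back to the address the symbol has in the file on disk. It assumes the image’s preferred base is the __TEXT vmaddr of 0x100000000 seen earlier; the function name and the slid addresses are made up for the example.

```python
def static_address(runtime_address: int, load_address: int,
                   preferred_vmaddr: int = 0x100000000) -> int:
    """Undo the ASLR slide: map a runtime address to its on-disk address."""
    offset = runtime_address - load_address   # distance into the image
    if offset < 0:
        raise ValueError("address is below the image's load address")
    return preferred_vmaddr + offset


# No slide: the image was loaded at its preferred address.
unslid = static_address(0x10000432C, load_address=0x100000000)

# Slid image: same function, different load address in the crash report.
slid = static_address(0x104A4032C, load_address=0x104A3C000)

print(hex(unslid), hex(slid))
```

Both calls land on the same on-disk address, which is the value the dSYM’s line tables are keyed by.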

Two-Level Namespace

What it is: a Mach-O feature (enabled by the MH_TWOLEVEL header flag, on by default since macOS 10.1) where every undefined symbol reference in a binary records not just the symbol name but also which library it is expected to come from.

Why it matters: in a flat namespace, dyld finds the first library that exports a matching symbol name and binds to it — which means two libraries exporting the same name silently conflict. Two-level namespace eliminates that ambiguity: a reference to _NSLog in Foundation will only ever bind to Foundation’s _NSLog, even if another library exports the same name.

The practical implication: you can’t intentionally override a system symbol by exporting the same name from your own dylib — the binary’s symbol references already encode the expected library ordinal. This is why LD_PRELOAD-style interposition is much more limited on macOS than on Linux.

# Inspect library ordinals for undefined symbols
# (n_desc field encodes the library ordinal for two-level lookups)
nm -m MyApp | grep "from Foundation" | head -10
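
The ordinal lives in the high byte of n_desc and is a 1-based index into the binary’s list of dylib load commands. A short Python sketch of the decoding follows; the dylib list reuses the otool -L example from earlier, and library_for is an illustrative name rather than a real API.

```python
SPECIAL_ORDINALS = {
    0x00: "(this image)",
    0xFE: "(dynamic lookup)",
    0xFF: "(main executable)",
}


def library_for(n_desc: int, dylibs: list) -> str:
    """Map an undefined symbol's n_desc to the library it must come from."""
    ordinal = (n_desc >> 8) & 0xFF      # high byte holds the library ordinal
    if ordinal in SPECIAL_ORDINALS:
        return SPECIAL_ORDINALS[ordinal]
    return dylibs[ordinal - 1]          # ordinals count from 1


dylibs = [
    "/usr/lib/libobjc.A.dylib",
    "/usr/lib/libSystem.B.dylib",
    "/System/Library/Frameworks/Foundation.framework/Versions/C/Foundation",
]

nslog_source = library_for(3 << 8, dylibs)              # ordinal 3
weak_source = library_for((2 << 8) | 0x0040, dylibs)    # ordinal 2, weak-ref bit set
print(nslog_source)
print(weak_source)
```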

Tools Quick Reference

Tool                   What it does
otool -h               Print Mach-O header
otool -l               Print all load commands
otool -L               Print dylib dependencies (like ldd on Linux)
otool -t -v            Disassemble __TEXT,__text
nm                     List symbols (add -g for globals, -u for undefined)
size -m                Print segment sizes
lipo -info             Show architectures in a fat binary
lipo -thin             Extract one architecture slice
codesign -dvvv         Show code signature details
codesign --verify      Verify signature validity
dwarfdump --uuid       Show UUID of binary or dSYM
install_name_tool      Edit install names and rpaths in place
xcrun swift-demangle   Demangle Swift mangled symbol names
atos                   Convert addresses to symbols using a dSYM
symbolicatecrash       Fully symbolicate a .crash report
pagestuff              Show what is on each page of a Mach-O file