pagestuff - Mach-O file page analysis tool

pagestuff file [-arch arch_flag] [-a] [-p] [pagenumber…]

pagestuff displays information about the specified logical pages of a file in the Mach-O executable format.
For each specified page of code, symbols (function and static data structure names) are displayed.
If no pages are specified, symbols for all pages in the (__TEXT,__text) section are displayed.

In the following examples, a small C program was compiled to a.out as a single-architecture Mach-O 64-bit x86_64 executable.
-arch arch_type the architecture to operate on when file is a universal file. (See arch(3) for the currently known arch_types.) When this option is used, the page numbers are logical page numbers starting at zero from the offset at which that architecture starts in the universal file.
-a display information for all pages.
> pagestuff a.out -a
File Page 0 contains Mach-O headers
File Page 0 contains contents of section (__TEXT,__text)
File Page 0 contains contents of section (__TEXT,__stubs)
File Page 0 contains contents of section (__TEXT,__stub_helper)
File Page 0 contains contents of section (__TEXT,__cstring)
File Page 0 contains contents of section (__TEXT,__unwind_info)
Symbols on file page 0 virtual address 0x100000a10 to 0x100001000
  0x0000000100000a10 _main
File Page 1 contains contents of section (__TEXT,__unwind_info)
File Page 1 contains contents of section (__DATA,__got)
File Page 1 contains contents of section (__DATA,__nl_symbol_ptr)
File Page 1 contains contents of section (__DATA,__la_symbol_ptr)
Symbols on file page 1 virtual address 0x100001000 to 0x100001058
File Page 2 contains dyld info for sliding an image
File Page 2 contains dyld info for binding symbols
File Page 2 contains dyld info for lazy bound symbols
File Page 2 contains dyld info for symbols exported by a dylib
File Page 2 contains data of function starts
File Page 2 contains symbol table for defined global symbols
File Page 2 contains symbol table for undefined symbols
File Page 2 contains indirect symbols table
File Page 2 contains string table for external symbols
-p list sections of the specified Mach-O file, with offsets and lengths.
> pagestuff a.out -p
FP_MACH_O
    offset = 0
    size = 8796
    MP_MACH_HEADERS
        offset = 0
        size = 1312                   (entries below condensed to one line each by the editor)
    MP_EMPTY_SPACE offset = 1312 size = 1264
    MP_SECTION_64 (__TEXT,__text) offset = 2576 size = 1019
    MP_EMPTY_SPACE offset = 3595 size = 1
    MP_SECTION_64 (__TEXT,__stubs) offset = 3596 size = 48
    MP_SECTION_64 (__TEXT,__stub_helper) offset = 3644 size = 96
    MP_SECTION_64 (__TEXT,__cstring) offset = 3740 size = 281
    MP_EMPTY_SPACE offset = 4021 size = 3
    MP_SECTION_64 (__TEXT,__unwind_info) offset = 4024 size = 72
    MP_SECTION_64 (__DATA,__got) offset = 4096 size = 8
    MP_SECTION_64 (__DATA,__nl_symbol_ptr) offset = 4104 size = 16
    MP_SECTION_64 (__DATA,__la_symbol_ptr) offset = 4120 size = 64
    MP_EMPTY_SPACE offset = 4184 size = 4008
    MP_DYLD_INFO_REBASE offset = 8192 size = 8
    MP_DYLD_INFO_BIND offset = 8200 size = 40
    MP_DYLD_INFO_LAZY_BIND offset = 8240 size = 112
    MP_DYLD_INFO_EXPORT offset = 8352 size = 48
    MP_FUNCTION_STARTS offset = 8400 size = 8
    MP_EXTDEF_SYMBOLS offset = 8408 size = 32
    MP_UNDEF_SYMBOLS offset = 8440 size = 160
    MP_INDIRECT_SYMBOL_TABLE offset = 8600 size = 76
    MP_EXT_STRING_TABLE offset = 8676 size = 120
The output of size -m -l -x is more concise:
> size -m -l -x
Segment __PAGEZERO: 0x100000000 (vmaddr 0x0 fileoff 0)
Segment __TEXT: 0x1000 (vmaddr 0x100000000 fileoff 0)
    Section __text: 0x3fb (addr 0x100000a10 offset 2576)
    Section __stubs: 0x30 (addr 0x100000e0c offset 3596)
    Section __stub_helper: 0x60 (addr 0x100000e3c offset 3644)
    Section __cstring: 0x119 (addr 0x100000e9c offset 3740)
    Section __unwind_info: 0x48 (addr 0x100000fb8 offset 4024)
    total 0x5ec
Segment __DATA: 0x1000 (vmaddr 0x100001000 fileoff 4096)
    Section __got: 0x8 (addr 0x100001000 offset 4096)
    Section __nl_symbol_ptr: 0x10 (addr 0x100001008 offset 4104)
    Section __la_symbol_ptr: 0x40 (addr 0x100001018 offset 4120)
    total 0x58
Segment __LINKEDIT: 0x1000 (vmaddr 0x100002000 fileoff 8192)
total 0x100003000
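
The two listings can be cross-checked by hand: the file page of a section is its file offset divided by the page size, and its file offset is its address minus the segment's vmaddr plus the segment's fileoff. The following is a minimal sketch of that arithmetic, not part of pagestuff or size. It assumes a 4096-byte page (inferred from the address range shown for file page 0 above), and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed page size: the listing above places file page 0 at offsets 0..4095. */
#define PAGE_SIZE_ASSUMED 4096u

/* Logical file page that holds the byte at file_offset. */
uint64_t file_page_of(uint64_t file_offset) {
    return file_offset / PAGE_SIZE_ASSUMED;
}

/* File offset of a virtual address inside a segment, using the
 * vmaddr and fileoff values printed by size -m -l -x. */
uint64_t file_offset_of(uint64_t addr, uint64_t seg_vmaddr, uint64_t seg_fileoff) {
    assert(addr >= seg_vmaddr); /* the address must not precede the segment start */
    return addr - seg_vmaddr + seg_fileoff;
}
```

With the values above, file_offset_of(0x100000a10, 0x100000000, 0) gives 2576, the offset listed for (__TEXT,__text), and file_page_of(4096) gives 1, the page on which (__DATA,__got) is reported.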

See

Mach-O(5), size(1)

vmmap
Display the virtual memory regions allocated in a process

vmmap [-w[ide]] [-v[erbose]] [-resident] [-dirty] [-swapped] [-purge] [-submap] [-allSplitLibs] [-noCoalesce]
         [-interleaved] [-pages] [-summary] pid | partial-executable-name

Displays the virtual memory regions allocated in a specified process, helping a programmer understand how memory is being used, and what the purposes of memory at a given address may be.
The process can be specified by process ID or by full or partial executable name.
-w, -wide print wide output.
-v, -verbose equivalent to -wide -resident -dirty -swapped -purge -submap -allSplitLibs -noCoalesce.
-resident  
-dirty  
-swapped (paged out or compressed).
-purge  
-submap  
-allSplitLibs even those not loaded by this process.
-noCoalesce Do not coalesce adjacent identical regions.
-pages page counts rather than kilobytes.
-interleaved output all regions in ascending order of starting address,
rather than outputting all non-writable regions followed by all writable regions.
-summary  
example with -verbose

Explanation of output

Each region's description includes its starting and ending addresses, size, permissions, sharing mode, and purpose.

The first column names the purpose of the memory: malloc regions, stack, text, data segment, etc.

size

number of virtual memory pages reserved, not necessarily allocated.
For example, vm_allocate reserves pages, but they are not allocated until touched.
A memory-mapped file may have a page reserved, but the page is not instantiated until a read or write occurs.
As a result, size may not describe the application's true memory usage.

protection mode

describes whether the memory is readable, writable, or executable. Each region is displayed with its current permissions followed by its maximum permissions. Pages of an executable always have the read and execute bits set ("r-x").
The current permissions usually do not permit writing to the region.
The maximum permissions allow writing so that the debugger can request write access to a page to insert breakpoints.
Permissions for executables therefore appear as "r-x/rwx".
The first page of an application (starting at address 0x00000000) permits no reading, writing, or execution ("---"), ensuring that accesses to address 0, or dereferences of a NULL pointer, cause a bus error.
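
A script that post-processes vmmap output may want these permission fields as bit masks. The sketch below parses a "current/maximum" field such as "r-x/rwx" into two masks. The constant and function names are hypothetical, and the seven-character field layout is assumed from the examples in the text above rather than taken from a vmmap specification.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical bit values for the three permission letters. */
enum { PERM_READ = 1, PERM_WRITE = 2, PERM_EXEC = 4 };

/* Convert three characters such as "r-x" into a bit mask.
 * Returns -1 if a position holds neither its letter nor '-'. */
static int perm_bits(const char *s) {
    static const char letters[3] = { 'r', 'w', 'x' };
    static const int bits[3] = { PERM_READ, PERM_WRITE, PERM_EXEC };
    int mask = 0;
    for (int i = 0; i < 3; i++) {
        if (s[i] == letters[i]) mask |= bits[i];
        else if (s[i] != '-') return -1;
    }
    return mask;
}

/* Split "cur/max" (for example "r-x/rwx") into two masks.
 * Returns 0 on success, -1 on malformed input. */
int parse_protection(const char *field, int *current, int *maximum) {
    assert(field != NULL && current != NULL && maximum != NULL);
    if (strlen(field) != 7 || field[3] != '/') return -1;
    int cur = perm_bits(field);
    int max = perm_bits(field + 4);
    if (cur < 0 || max < 0) return -1;
    *current = cur;
    *maximum = max;
    return 0;
}
```

For "r-x/rwx" the current mask has the read and execute bits set and the maximum mask additionally has the write bit, which is the combination described for executable pages.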

share

Describes whether pages are shared between processes and what happens when pages are modified.
Private pages (PRV) are pages only visible to this process, are allocated as they are written to, and can be paged out to disk.
Copy-on-write (COW) pages are shared by multiple processes (or shared by a single process in multiple locations). When the page is modified, the writing process then receives its own private copy of the page.
Empty (NUL) sharing applies to pages that do not exist in physical memory.
Aliased (ALI) and shared (SHM) memory is shared between processes.

The share mode typically describes the general mode controlling the region. For example, as copy-on-write pages are modified, they become private to the application. Even with the private pages, the region is still COW until all pages become private. Once all pages are private, then the share mode would change to private.

For regions loaded from binaries, the far-right column shows the library loaded into that memory.

submap

A shared set of virtual memory page descriptions that the OS can reuse among multiple processes. Submaps minimize the operating system's memory usage by representing the virtual memory regions only once. Submaps can either be shared by all processes (machine-wide) or local to the process (process-only). (Submaps may be of interest mainly to users working with the low-level virtual memory system.)

For example, one submap contains the read-only portions of the most common dynamic libraries. These libraries are needed by most programs on the system, and because they are read-only, they will never be changed. As a result, the operating system shares these pages between all the processes, and only needs to create a single data structure to describe how this memory is laid out in every process.

That section of memory is referred to as the "split library region", and it is shared system-wide. So, technically, all of the dynamic libraries that have been loaded into that region are in the VM map of every process, even though some processes may not be using some of those libraries. By default, vmmap shows only those shared system split libraries that have been loaded into the specified target process. If the -allSplitLibs flag is given, information about all shared system split libraries will be printed, regardless of whether they've been loaded into the specified target process or not.

If the contents of a machine-wide submap are changed -- for example, the debugger makes a section of memory for a dylib writable so it can insert debugging traps -- then the submap becomes local, and the kernel will allocate memory to store the extra copy.

See

heap, leaks, and malloc_history commands display aspects of a process's memory usage.

> heap
Usage: heap [-guessNonObjects] [-sumObjectFields] [-showSizes] [-addresses all | classes-pattern] [-noContent] pid
    -guessNonObjects                   try to identify non-object malloc nodes which are referenced by fields of other objects
    -sumObjectFields                   accumulate those fields into the entry for that object
    -showSizes                         show distribution of each malloc size for each object, instead of summing and averaging
    -addresses=all|matching-pattern  outputs the addresses of matching objects found on the heap in ascending address order
    -noContent                         do not show object content in -addresses mode

> leaks
leaks: Search through a process for leaked memory.
Usage: leaks [-hq] [--nocontext] [--nostacks] pid/partial-process-name [--trace address]
    -e/--exclude sym  exclude leaked blocks whose backtraces include the specified symbol
    -q/--quiet          suppress the process description header and footer
    --nocontext         do not output the binary contexts of discovered leaks
    --nostacks          do not output backtraces, even when available
    --trace=address   output chains of references from process 'roots' (e.g., global data) to the given block

> malloc_history 
malloc_history: Displays/aggregates allocation histories in a process
Usage: malloc_history  pid/partial-process-name [options] mode [address ...]
'mode' should be one of {-callTree, -allBySize, -allByCount, -allEvents, or one or more addresses}
    -allBySize                        [mode]
    -allByCount                       [mode]
    -allEvents                        [mode]
    -callTree                         [mode]
    -highWaterMark                    
    -showContent                      (-calltree only)
    -invert                           (-calltree only)
    -ignoreThreads                    (-calltree only)
    -collapseRecursion                (-calltree only)
    -chargeSystemLibraries            (-calltree only)
    -consolidateAllBySymbol           (-calltree only)
    -consolidateSystemFramesBySymbol  (-calltree only)

lsof can be used to get a list of open and mapped files in one or more processes, which can help determine why a volume can't be unmounted or ejected.

The Xcode developer tools include Instruments, a graphical application that gives similar information.


NXGetAllArchInfos, NXGetLocalArchInfo, NXGetArchInfoFromName, NXGetArchInfoFromCpuType, NXFindBestFatArch, NXCombineCpuSubtypes -- get architecture information
     #include <mach-o/arch.h>

     extern const NXArchInfo *
     NXGetAllArchInfos(void);

     extern const NXArchInfo *
     NXGetLocalArchInfo(void);

     extern const NXArchInfo *
     NXGetArchInfoFromName(const char *name);

     extern const NXArchInfo *
     NXGetArchInfoFromCpuType(cpu_type_t cputype, cpu_subtype_t cpusubtype);

     extern struct fat_arch *
     NXFindBestFatArch(cpu_type_t cputype, cpu_subtype_t cpusubtype, struct fat_arch *fat_archs, uint32_t nfat_archs);

     extern cpu_subtype_t
     NXCombineCpuSubtypes(cpu_type_t cputype, cpu_subtype_t cpusubtype1, cpu_subtype_t cpusubtype2);
These functions are for use in programs that have to deal with universal files or that can target multiple architectures.
Typically, a program accepts a command-line argument of the form -arch name, where name specifies an architecture. These functions and data structures provide some help for processing architecture flags and then processing the contents of a universal file.

The structure NXArchInfo is defined in <mach-o/arch.h>:

     typedef struct {
             const char *name;
             cpu_type_t cputype;
             cpu_subtype_t cpusubtype;
             enum NXByteOrder byteorder;
             const char *description;
     } NXArchInfo;
It is used to hold the name of the architecture and the corresponding CPU type and CPU subtype, together with the architecture's byte order and a brief description string.

The currently known architectures are:

     Name          CPU Type            CPU Subtype                 Description
     x86_64        CPU_TYPE_X86_64     CPU_SUBTYPE_X86_64_ALL      Intel x86-64
     i386          CPU_TYPE_I386       CPU_SUBTYPE_I386_ALL        Intel 80x86
     arm           CPU_TYPE_ARM        CPU_SUBTYPE_ARM_ALL         ARM
     arm64         CPU_TYPE_ARM64      CPU_SUBTYPE_ARM64_ALL       ARM64
     ppc           CPU_TYPE_POWERPC    CPU_SUBTYPE_POWERPC_ALL     PowerPC
     ppc64         CPU_TYPE_POWERPC64  CPU_SUBTYPE_POWERPC64_ALL   PowerPC 64-bit
     m68k          CPU_TYPE_MC680x0    CPU_SUBTYPE_MC680x0_ALL     Motorola 68K
     hppa          CPU_TYPE_HPPA       CPU_SUBTYPE_HPPA_ALL        HP-PA
     i860          CPU_TYPE_I860       CPU_SUBTYPE_I860_ALL        Intel 860
     m88k          CPU_TYPE_MC88000    CPU_SUBTYPE_MC88000_ALL     Motorola 88K
     sparc         CPU_TYPE_SPARC      CPU_SUBTYPE_SPARC_ALL       SPARC
     i486          CPU_TYPE_I386       CPU_SUBTYPE_486             Intel 486
     i486SX        CPU_TYPE_I386       CPU_SUBTYPE_486SX           Intel 486SX
     pentium       CPU_TYPE_I386       CPU_SUBTYPE_PENT            Intel Pentium
     i586          CPU_TYPE_I386       CPU_SUBTYPE_586             Intel 586
     pentpro       CPU_TYPE_I386       CPU_SUBTYPE_PENTPRO         Intel Pentium Pro
     i686          CPU_TYPE_I386       CPU_SUBTYPE_PENTPRO         Intel Pentium Pro
     pentIIm3      CPU_TYPE_I386       CPU_SUBTYPE_PENTII_M3       Intel Pentium II Model 3
     pentIIm5      CPU_TYPE_I386       CPU_SUBTYPE_PENTII_M5       Intel Pentium II Model 5
     pentium4      CPU_TYPE_I386       CPU_SUBTYPE_PENTIUM_4       Intel Pentium 4
     armv4t        CPU_TYPE_ARM        CPU_SUBTYPE_ARM_V4T         arm v4t
     armv5         CPU_TYPE_ARM        CPU_SUBTYPE_ARM_V5TEJ       arm v5
     xscale        CPU_TYPE_ARM        CPU_SUBTYPE_ARM_XSCALE      arm xscale
     armv6         CPU_TYPE_ARM        CPU_SUBTYPE_ARM_V6          arm v6
     armv6m        CPU_TYPE_ARM        CPU_SUBTYPE_ARM_V6M         arm v6m
     armv7         CPU_TYPE_ARM        CPU_SUBTYPE_ARM_V7          arm v7
     armv7f        CPU_TYPE_ARM        CPU_SUBTYPE_ARM_V7F         arm v7f
     armv7s        CPU_TYPE_ARM        CPU_SUBTYPE_ARM_V7S         arm v7s
     armv7k        CPU_TYPE_ARM        CPU_SUBTYPE_ARM_V7K         arm v7k
     armv7m        CPU_TYPE_ARM        CPU_SUBTYPE_ARM_V7M         arm v7m
     armv7em       CPU_TYPE_ARM        CPU_SUBTYPE_ARM_V7EM        arm v7em
     armv8         CPU_TYPE_ARM        CPU_SUBTYPE_ARM_V8          arm v8
     arm64         CPU_TYPE_ARM64      CPU_SUBTYPE_ARM64_V8        arm64 v8
     ppc601        CPU_TYPE_POWERPC    CPU_SUBTYPE_POWERPC_601     PowerPC 601
     ppc603        CPU_TYPE_POWERPC    CPU_SUBTYPE_POWERPC_603     PowerPC 603
     ppc604        CPU_TYPE_POWERPC    CPU_SUBTYPE_POWERPC_604     PowerPC 604
     ppc604e       CPU_TYPE_POWERPC    CPU_SUBTYPE_POWERPC_604e    PowerPC 604e
     ppc750        CPU_TYPE_POWERPC    CPU_SUBTYPE_POWERPC_750     PowerPC 750
     ppc7400       CPU_TYPE_POWERPC    CPU_SUBTYPE_POWERPC_7400    PowerPC 7400
     ppc7450       CPU_TYPE_POWERPC    CPU_SUBTYPE_POWERPC_7450    PowerPC 7450
     ppc970        CPU_TYPE_POWERPC    CPU_SUBTYPE_POWERPC_970     PowerPC 970
     m68030        CPU_TYPE_MC680x0    CPU_SUBTYPE_MC68030_ONLY    Motorola 68030
     m68040        CPU_TYPE_MC680x0    CPU_SUBTYPE_MC68040         Motorola 68040
     hppa7100LC    CPU_TYPE_HPPA       CPU_SUBTYPE_HPPA_7100LC     HP-PA 7100LC
The first set of entries (x86_64 through sparc, each with an _ALL subtype) is used for the architecture family. The second set of entries (i486 onward) is used for a specific architecture, when more than one specific architecture is supported in a family of architectures.

NXGetAllArchInfos() returns a pointer to an array of all known NXArchInfo structures. The last NXArchInfo is marked by a NULL name.

NXGetLocalArchInfo() returns the NXArchInfo for the local host, or NULL if none is known.

NXGetArchInfoFromName() and NXGetArchInfoFromCpuType() return the NXArchInfo from the architecture's name or CPU type/CPU subtype combination. A CPU subtype of CPU_SUBTYPE_MULTIPLE can be used to request the most general NXArchInfo known for the given CPU type. NULL is returned if no matching NXArchInfo can be found.
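
The NULL-name sentinel and the lookup by name can be illustrated without the macOS headers. The sketch below is not the library implementation. ArchEntry is a hypothetical stand-in for NXArchInfo reduced to two fields, sample_archs holds three rows copied from the table above, and count_archs and find_arch are illustrative helpers that mimic walking the array returned by NXGetAllArchInfos() and searching it the way NXGetArchInfoFromName() is described.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for NXArchInfo, reduced to two fields. */
typedef struct {
    const char *name;
    const char *description;
} ArchEntry;

/* Sample rows copied from the table above; the NULL name marks the end. */
static const ArchEntry sample_archs[] = {
    { "x86_64", "Intel x86-64" },
    { "i386",   "Intel 80x86" },
    { "arm64",  "ARM64" },
    { NULL,     NULL }
};

/* Count entries by walking until the NULL-name sentinel. */
size_t count_archs(const ArchEntry *table) {
    assert(table != NULL);
    size_t n = 0;
    while (table[n].name != NULL) n++;
    return n;
}

/* Linear search by name; returns NULL when nothing matches. */
const ArchEntry *find_arch(const ArchEntry *table, const char *name) {
    assert(table != NULL && name != NULL);
    for (const ArchEntry *p = table; p->name != NULL; p++) {
        if (strcmp(p->name, name) == 0) return p;
    }
    return NULL;
}
```

On macOS the same loop shape would presumably apply to the real array: start from the pointer returned by NXGetAllArchInfos() and stop at the first entry whose name is NULL.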

NXFindBestFatArch() is passed a CPU type and CPU subtype and a set of fat_arch structs. It selects the best one that matches (if any), and returns a pointer to that fat_arch struct (or NULL). The fat_arch structs must be in the host byte order and correct such that fat_archs really points to enough memory for nfat_archs structs. It is possible that this routine could fail if new CPU types or CPU subtypes are added and an old version of this routine is used. But if there is an exact match between the CPU type and CPU subtype and one of the fat_arch structs, this routine will always succeed.
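
The host byte order requirement matters because the fat header and its fat_arch records are stored big-endian in the file, so on a little-endian host they must be converted before being handed to NXFindBestFatArch(). The sketch below decodes one 20-byte record from raw bytes. FatArchHost is a hypothetical mirror of struct fat_arch (five 32-bit fields) so that the example compiles without <mach-o/fat.h>, and the helper names are likewise illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical host-order mirror of struct fat_arch (five 32-bit fields). */
typedef struct {
    int32_t  cputype;
    int32_t  cpusubtype;
    uint32_t offset;
    uint32_t size;
    uint32_t align;
} FatArchHost;

/* Read a 32-bit big-endian value from raw bytes, independent of host order. */
static uint32_t read_be32(const unsigned char *p) {
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Decode one 20-byte on-disk fat_arch record into host byte order. */
void decode_fat_arch(const unsigned char *raw, FatArchHost *out) {
    assert(raw != NULL && out != NULL);
    out->cputype    = (int32_t)read_be32(raw);
    out->cpusubtype = (int32_t)read_be32(raw + 4);
    out->offset     = read_be32(raw + 8);
    out->size       = read_be32(raw + 12);
    out->align      = read_be32(raw + 16);
}
```

Reading byte by byte avoids any dependence on the host's own endianness, so the same code gives the same result on big-endian and little-endian machines.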

NXCombineCpuSubtypes() returns the resulting CPU subtype when combining two different CPU subtypes for the specified CPU type. If the two CPU subtypes can't be combined (the specific subtypes are mutually exclusive), -1 is returned, indicating it is an error to combine them. This can also fail and return -1 if new CPU types or CPU subtypes are added and an old version of this routine is used. But if the CPU subtypes are the same, they can always be combined and this routine will return the CPU subtype passed in.