Posted by Rico's Nerd Cluster on April 25, 2025

Motivating Example

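#include <cstddef>   // size_t
#include <iostream>  // std::cout
#include <numeric>   // std::iota
#include <vector>    // std::vector

void foo() {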
    const size_t N = 100'000'000;
    std::vector<int> v(N);
    // fill v with 1,2,3,...
    std::iota(v.begin(), v.end(), 1);

    // sum in a tight loop
    long long sum = 0;
    for (size_t i = 0; i < N; ++i) {
        sum += v[i];
    }

    std::cout << "sum = " << sum << "\n";
}

int add(int a, int b) {
    return a + b;
}

int main() {
    foo();
    return 0;
}
  1. Compile to an object file
g++ -c example.cpp -o example.o
  2. Inspect the raw (mangled) symbols with nm example.o. You might see:
0000000000000000 T _Z3foov
000000000000015c T _Z3addii
  • _Z3foov → mangled for foo()
  • _Z3addii → mangled for add(int,int)

If we compile this file to an object file example.o:

g++ -c example.cpp
  1. We can see the symbols with nm example.o. By default nm sorts by symbol name, so add is listed before foo:
000000000000015c T _Z3addii
0000000000000000 T _Z3foov
  2. Demangle a single symbol with c++filt: echo '_Z3addii' | c++filt prints add(int, int).

  3. Demangle in place with nm: nm -C example.o. Example output:

0000000000000000 T foo()                 ← foo() at offset 0x0 in .text
000000000000015c T add(int, int)         ← add() at offset 0x15c in .text
                 U __stack_chk_fail      ← undefined symbol
0000000000000000 W __gnu_cxx::new_allocator<int>::allocate(...)
                                        ← weak symbol (address 0 until link)
0000000000000000 r __pstl::execution::v1::seq
                                        ← read-only data at offset 0 in .rodata

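  • nm is a GNU Binutils tool that lists the symbol table of an object file (.o), shared library (.so), static library (.a), or executable.
  • Symbol types are:
    • T = global (external) symbol in the text (code) section, typically a function
    • r = local read-only data
    • W = weak symbol
    • U = undefined symbol, referenced here but defined elsewhere
  • Symbol Interpretation
    • foo is the first symbol in .text, so its offset is 0000000000000000. In an object file these values are offsets within a section, not final memory addresses.
    • __gnu_cxx::new_allocator<int>... is a weak symbol: if several object files define it, the linker keeps a single definition. Its value of 0 here is typically an offset within the symbol's own section, and the final address is decided at link time.
    • Once you’ve linked into an executable or .so, these offsets become real addresses.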