Computer Architecture and Operating Systems
Lecture 4: Linking and Loading
Andrei Tatarnikov
atatarnikov@hse.ru
@andrewt0301
Example C Program
int sum(int *a, int n); int sum(int *a, int n)
{
int array[2] = {1, 2}; int i, s = 0;
int main() for (i = 0; i < n; i++) {
{ s += a[i];
int val = sum(array, 2); }
return val; return s;
} }
main.c sum.c
2
Static Linking
Programs are translated and linked using a compiler driver:
linux> gcc -Og -o prog main.c sum.c
linux> ./prog
main.c sum.c
Source files
Translators Translators
(cpp, cc1, as) (cpp, cc1, as)
main.o sum.o Separately compiled
relocatable object files
Linker (ld)
Fully linked executable object file
prog (contains code and data for all functions
3
defined in main.c and sum.c)
Why Linkers?
Reason 1: Modularity
Program can be written as a collection of smaller source
files, rather than one monolithic mass.
Can build libraries of common functions (more on this
later)
e.g., Math library, standard C library
4
Why Linkers? (cont)
Reason 2: Efficiency
Time: Separate compilation
Change one source file, compile, and then relink.
No need to recompile other source files.
Space: Libraries
Common functions can be aggregated into a single
file...
Yet executable files and running memory images
contain only code for the functions they actually use.
5
What Do Linkers Do?
Step 1: Symbol resolution
Programs define and reference symbols (global variables
and functions):
void swap() {…} /* define symbol swap */
swap(); /* reference symbol swap */
int *xp = &x; /* define symbol xp, reference x */
Symbol definitions are stored in object file (by assembler)
in symbol table.
Symbol table is an array of structs
Each entry includes name, size, and location of symbol.
During symbol resolution step, the linker associates each
symbol reference with exactly one symbol definition. 6
What Do Linkers Do? (cont)
Step 2: Relocation
Merges separate code and data sections into single
sections
Relocates symbols from their relative locations in the .o
files to their final absolute memory locations in the
executable.
Updates all references to these symbols to reflect their
new positions.
Let’s look at these two steps in more detail….
7
Three Kinds of Object Files (Modules)
Relocatable object file (.o file)
Contains code and data in a form that can be combined
with other relocatable object files to form executable
object file.
Each .o file is produced from exactly one source (.c) file
Executable object file (a.out file)
Contains code and data in a form that can be copied
directly into memory and then executed.
Shared object file (.so file)
Special type of relocatable object file that can be loaded
into memory and linked dynamically, at either load time
or run-time.
Called Dynamic Link Libraries (DLLs) by Windows
8
Executable and Linkable Format (ELF)
Standard binary format for object files
One unified format for
Relocatable object files (.o),
Executable object files (a.out)
Shared object files (.so)
Generic name: ELF binaries
9
ELF Object File Format
Elf header 0
ELF header
Word size, byte ordering, file type (.o, exec, .so), machine
type, etc. Segment header table
(required for executables)
Segment header table
.text section
Page size, virtual addresses memory segments (sections),
segment sizes. .rodata section
.text section .data section
Code .bss section
.rodata section .symtab section
Read only data: jump tables, ... .rel.txt section
.data section .rel.data section
Initialized global variables
.debug section
.bss section
Uninitialized global variables Section header table
“Block Started by Symbol”
“Better Save Space”
Has section header but occupies no space 10
ELF Object File Format (cont.)
.symtab section 0
Symbol table ELF header
Procedure and static variable names Segment header table
Section names and locations (required for executables)
.rel.text section .text section
Relocation info for .text section .rodata section
Addresses of instructions that will need to be .data section
modified in the executable
.bss section
Instructions for modifying.
.symtab section
.rel.data section
Relocation info for .data section .rel.txt section
Addresses of pointer data that will need to be .rel.data section
modified in the merged executable .debug section
.debug section
Info for symbolic debugging (gcc -g) Section header table
Section header table
Offsets and sizes of each section 11
Linker Symbols
Global symbols
Symbols defined by module m that can be referenced by other
modules.
E.g.: non-static C functions and non-static global
variables.
External symbols
Global symbols that are referenced by module m but defined
by some other module.
Local symbols
Symbols that are defined and referenced exclusively by
module m.
E.g.: C functions and global variables defined with the
static attribute.
Local linker symbols are not local program variables
12
Step 1: Symbol Resolution
Referencing
a global…
…that’s defined here
int sum(int *a, int n); main.c int sum(int *a, int n) sum.c
{
int array[2] = {1, 2}; int i, s = 0;
int main() for (i = 0; i < n; i++) {
{ s += a[i];
int val = sum(array, 2); }
return val; return s;
} }
Defining
a global Referencing Linker knows
Linker knows a global… nothing of i or s
nothing of val …that’s defined here
13
Local Symbols
Local non-static C variables vs. local static C variables
local non-static C variables: stored on the stack
local static C variables: stored in either .bss, or .data
int f()
{
static int x = 0;
return x;
Compiler allocates space in .data for each
} definition of x
int g() Creates local symbols in the symbol table with
{ unique names, e.g., x.1 and x.2.
static int x = 1;
return x;
} 14
How Linker Resolves Duplicate Symbol Definitions
Program symbols are either strong or weak
Strong: procedures and initialized globals
Weak: uninitialized globals
p1.c p2.c
strong int foo=5; int foo; weak
strong p1() { p2() { strong
} }
15
Linker’s Symbol Rules
Rule 1: Multiple strong symbols are not allowed
Each item can be defined only once
Otherwise: Linker error
Rule 2: Given a strong symbol and multiple weak
symbols, choose the strong symbol
References to the weak symbol resolve to the strong
symbol
Rule 3: If there are multiple weak symbols, pick an
arbitrary one
Can override this with gcc –fno-common
16
Linker Puzzles
int x;
p1() {} p1() {} Link time error: two strong symbols (p1)
int x; int x; References to x will refer to the same
p1() {} p2() {}
uninitialized int. Is this what you really want?
int x; double x;
int y; p2() {}
Writes to x in p2 might overwrite y!
p1() {} Evil!
int x=7; double x; Writes to x in p2 will overwrite y!
int y=5; p2() {}
p1() {} Nasty!
int x=7; int x; References to x will refer to the same initialized
p1() {} p2() {}
variable.
Nightmare scenario: two identical weak structs, compiled by different
17
compilers with different alignment rules.
Global Variables
Avoid if you can
Otherwise
Use static if you can
Initialize if you define a global variable
Use extern if you reference an external global variable
18
Step 2: Relocation
Relocatable Object Files Executable Object File
System code .text 0
Headers
.data
System data System code
main()
.text
main.o
swap()
main() .text
int array[2]={1,2} .data More system code
System data
sum.o .data
int array[2]={1,2}
sum() .text
.symtab
.debug
19
Relocation Entries
int array[2] = {1, 2}; main.c
int main()
{
int val = sum(array, 2);
return val;
}
0000000000000000 <main>: main.o
0: 48 83 ec 08 sub $0x8,%rsp
4: be 02 00 00 00 mov $0x2,%esi
9: bf 00 00 00 00 mov $0x0,%edi # %edi = &array
a: R_X86_64_32 array # Relocation entry
e: e8 00 00 00 00 callq 13 <main+0x13> # sum()
f: R_X86_64_PC32 sum-0x4 # Relocation entry
13: 48 83 c4 08 add $0x8,%rsp
17: c3 retq
Source: objdump –r –d main.o
20
Relocated .text section
00000000004004d0 <main>:
4004d0: 48 83 ec 08 sub $0x8,%rsp
4004d4: be 02 00 00 00 mov $0x2,%esi
4004d9: bf 18 10 60 00 mov $0x601018,%edi # %edi = &array
4004de: e8 05 00 00 00 callq 4004e8 <sum> # sum()
4004e3: 48 83 c4 08 add $0x8,%rsp
4004e7: c3 retq
00000000004004e8 <sum>:
4004e8: b8 00 00 00 00 mov $0x0,%eax
4004ed: ba 00 00 00 00 mov $0x0,%edx
4004f2: eb 09 jmp 4004fd <sum+0x15>
4004f4: 48 63 ca movslq %edx,%rcx
4004f7: 03 04 8f add (%rdi,%rcx,4),%eax
4004fa: 83 c2 01 add $0x1,%edx
4004fd: 39 f2 cmp %esi,%edx
4004ff: 7c f3 jl 4004f4 <sum+0xc>
400501: f3 c3 repz retq
Using PC-relative addressing for sum(): 0x4004e8 = 0x4004e3 + 0x5
Source: objdump -dx prog 21
Loading Executable Object Files
Memory
Executable Object File Kernel virtual memory
invisible to user
0 code
ELF header
User stack
Program header table (created at runtime)
%rsp
(required for executables)
(stack
.init section pointer)
.text section Memory-mapped region for
shared libraries
.rodata section
.data section
.bss section brk
.symtab Run-time heap
(created by malloc)
.debug
Read/write data segment Loaded
.line (.data, .bss) from
the
.strtab Read-only code segment executable
Section header table (.init, .text, .rodata) file
0x400000
(required for relocatables) Unused 22
0
Packaging Commonly Used Functions
How to package functions commonly used by
programmers?
Math, I/O, memory management, string manipulation,
etc.
Awkward, given the linker framework so far:
Option 1: Put all functions into a single source file
Programmers link big object file into their programs
Space and time inefficient
Option 2: Put each function in a separate source file
Programmers explicitly link appropriate binaries into their
programs
More efficient, but burdensome on the programmer
23
Old-fashioned Solution: Static Libraries
Static libraries (.a archive files)
Concatenate related relocatable object files into a single
file with an index (called an archive).
Enhance linker so that it tries to resolve unresolved
external references by looking for the symbols in one or
more archives.
If an archive member file resolves reference, link it into
the executable.
24
Creating Static Libraries
atoi.c printf.c random.c
Translator Translator ... Translator
atoi.o printf.o random.o
unix> ar rs libc.a \
Archiver (ar) atoi.o printf.o … random.o
libc.a C standard library
Archiver allows incremental updates
Recompile function that changes and replace .o file in archive.
25
Commonly Used Libraries
libc.a (the C standard library)
4.6 MB archive of 1496 object files.
I/O, memory allocation, signal handling, string handling, data and time,
random numbers, integer math
libm.a (the C math library)
2 MB archive of 444 object files.
floating point math (sin, cos, tan, log, exp, sqrt, …)
% ar –t libc.a | sort % ar –t libm.a | sort
… …
fork.o e_acos.o
… e_acosf.o
fprintf.o e_acosh.o
fpu_control.o e_acoshf.o
fputc.o e_acoshl.o
freopen.o e_acosl.o
fscanf.o e_asin.o
fseek.o e_asinf.o
fstab.o e_asinl.o
… … 26
Linking with Static Libraries
libvector.a
#include <stdio.h> main2.c void addvec(int *x, int *y, addvec.c
#include "vector.h" int *z, int n) {
int i;
int x[2] = {1, 2};
int y[2] = {3, 4}; for (i = 0; i < n; i++)
int z[2]; z[i] = x[i] + y[i];
}
int main()
{
void multvec(int *x, int *y, multvec.c
int *z, int n)
addvec(x, y, z, 2); {
printf("z = [%d %d]\n”, int i;
z[0], z[1]);
return 0; for (i = 0; i < n; i++)
} z[i] = x[i] * y[i];
} 27
Linking with Static Libraries
addvec.o multvec.o
main2.c vector.h Archiver
(ar)
Translators
(cpp, cc1, as) libvector.a libc.a Static libraries
Relocatable main2.o addvec.o printf.o and any other
object files modules called by printf.o
Linker (ld)
prog2c Fully linked
executable object file
“c” for “compile-time” 28
Using Static Libraries
Linker’s algorithm for resolving external references:
Scan .o files and .a files in the command line order.
During the scan, keep a list of the current unresolved references.
As each new .o or .a file, obj, is encountered, try to resolve each
unresolved reference in the list against the symbols defined in obj.
If any entries in the unresolved list at end of scan, then error.
Problem:
Command line order matters!
Moral: put libraries at the end of the command line.
unix> gcc -L. libtest.o -lmine
unix> gcc -L. -lmine libtest.o
libtest.o: In function `main':
libtest.o(.text+0x4): undefined reference to `libfun'
29
Modern Solution: Shared Libraries
Static libraries have the following disadvantages:
Duplication in the stored executables (every function
needs libc)
Duplication in the running executables
Minor bug fixes of system libraries require each
application to explicitly relink
Modern solution: Shared Libraries
Object files that contain code and data that are loaded
and linked into an application dynamically, at either load-
time or run-time
Also called: dynamic link libraries, DLLs, .so files
30
Shared Libraries (cont.)
Dynamic linking can occur when executable is first loaded
and run (load-time linking)
Common case for Linux, handled automatically by the
dynamic linker (ld-linux.so)
Standard C library (libc.so) usually dynamically linked
Dynamic linking can also occur after program has begun
(run-time linking).
In Linux, this is done by calls to the dlopen() interface
Distributing software
High-performance web servers
Runtime library interpositioning
Shared library routines can be shared by multiple processes
More on this when we learn about virtual memory
31
Dynamic Linking at Load-time
main2.c vector.h unix> gcc -shared -o libvector.so \
addvec.c multvec.c
Translators
(cpp, cc1, as) libc.so
libvector.so
Relocatable main2.o Relocation and symbol table info
object file
Linker (ld)
Partially linked prog2l
executable object file
Loader (execve) libc.so
libvector.so
Code and data
Fully linked
executable Dynamic linker (ld-linux.so)
in memory 32
Dynamic Linking at Run-time
#include <stdio.h>
#include <stdlib.h>
#include <dlfcn.h>
int x[2] = {1, 2};
int y[2] = {3, 4};
int z[2];
int main()
{
void *handle;
void (*addvec)(int *, int *, int *, int);
char *error;
/* Dynamically load the shared library that contains addvec() */
handle = dlopen("./libvector.so", RTLD_LAZY);
if (!handle) {
fprintf(stderr, "%s\n", dlerror());
dll.c exit(1);
} 33
Dynamic Linking at Run-time
...
/* Get a pointer to the addvec() function we just loaded */
addvec = dlsym(handle, "addvec");
if ((error = dlerror()) != NULL) {
fprintf(stderr, "%s\n", error);
exit(1);
}
/* Now we can call addvec() just like any other function */
addvec(x, y, z, 2);
printf("z = [%d %d]\n", z[0], z[1]);
/* Unload the shared library */
if (dlclose(handle) < 0) {
fprintf(stderr, "%s\n", dlerror());
exit(1);
}
dll.c return 0;
} 34
Linking Summary
Linking is a technique that allows programs to be
constructed from multiple object files.
Linking can happen at different times in a program’s
lifetime:
Compile time (when a program is compiled)
Load time (when a program is loaded into memory)
Run time (while a program is executing)
Understanding linking can help you avoid nasty errors
and make you a better programmer.
35
Case Study: Library Interpositioning
Library interpositioning : powerful linking technique
that allows programmers to intercept calls to arbitrary
functions
Interpositioning can occur at:
Compile time: When the source code is compiled
Link time: When the relocatable object files are statically
linked to form an executable object file
Load/run time: When an executable object file is loaded
into memory, dynamically linked, and then executed.
36
Some Interpositioning Applications
Security
Confinement (sandboxing)
Behind the scenes encryption
Debugging
In 2014, two Facebook engineers debugged a treacherous 1-
year old bug in their iPhone app using interpositioning
Code in the SPDY networking stack was writing to the wrong
location
Solved by intercepting calls to Posix write functions (write,
writev, pwrite)
Source: Facebook engineering blog post at
https://code.facebook.com/posts/313033472212144/debugging-file-corruption-on-ios/
37
Some Interpositioning Applications
Monitoring and Profiling
Count number of calls to functions
Characterize call sites and arguments to functions
Malloc tracing
Detecting memory leaks
Generating address traces
38
Example program
#include <stdio.h> Goal: trace the addresses and sizes
#include <malloc.h> of the allocated and freed blocks,
without breaking the program, and
int main() without modifying the source
{ code.
int *p = malloc(32);
free(p); Three solutions: interpose on the
return(0); lib malloc and free functions
} at compile time, link time, and
load/run time.
int.c
Compile-time Interpositioning
#ifdef COMPILETIME
#include <stdio.h>
mymalloc.c
#include <malloc.h>
/* malloc wrapper function */
void *mymalloc(size_t size)
{
void *ptr = malloc(size);
printf("malloc(%d)=%p\n",
(int)size, ptr);
return ptr;
}
/* free wrapper function */
void myfree(void *ptr)
{
free(ptr);
printf("free(%p)\n", ptr);
}
#endif 40
Compile-time Interpositioning
#define malloc(size) mymalloc(size) malloc.h
#define free(ptr) myfree(ptr)
void *mymalloc(size_t size);
void myfree(void *ptr);
linux> make intc
gcc -Wall -DCOMPILETIME -c mymalloc.c
gcc -Wall -I. -o intc int.c mymalloc.o
linux> make runc
./intc
malloc(32)=0x1edc010
free(0x1edc010)
linux>
41
Link-time Interpositioning
#ifdef LINKTIME mymalloc.c
#include <stdio.h>
void *__real_malloc(size_t size);
void __real_free(void *ptr);
/* malloc wrapper function */
void *__wrap_malloc(size_t size)
{
void *ptr = __real_malloc(size); /* Call libc malloc */
printf("malloc(%d) = %p\n", (int)size, ptr);
return ptr;
}
/* free wrapper function */
void __wrap_free(void *ptr)
{
__real_free(ptr); /* Call libc free */
printf("free(%p)\n", ptr);
}
#endif 42
Link-time Interpositioning
linux> make intl
gcc -Wall -DLINKTIME -c mymalloc.c
gcc -Wall -c int.c
gcc -Wall -Wl,--wrap,malloc -Wl,--wrap,free -o intl int.o mymalloc.o
linux> make runl
./intl
malloc(32) = 0x1aa0010
free(0x1aa0010)
linux>
The “-Wl” flag passes argument to linker, replacing each
comma with a space.
The “--wrap,malloc ” arg instructs linker to resolve
references in a special way:
Refs to malloc should be resolved as __wrap_malloc
Refs to __real_malloc should be resolved as malloc 43
Load/Run-time Interpositioning
#ifdef RUNTIME mymalloc.c
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <dlfcn.h>
/* malloc wrapper function */
void *malloc(size_t size)
{
void *(*mallocp)(size_t size);
char *error;
mallocp = dlsym(RTLD_NEXT, "malloc"); /* Get addr of libc malloc */
if ((error = dlerror()) != NULL) {
fputs(error, stderr);
exit(1);
}
char *ptr = mallocp(size); /* Call libc malloc */
printf("malloc(%d) = %p\n", (int)size, ptr);
return ptr;
}
44
Load/Run-time Interpositioning
/* free wrapper function */ mymalloc.c
void free(void *ptr)
{
void (*freep)(void *) = NULL;
char *error;
if (!ptr)
return;
freep = dlsym(RTLD_NEXT, "free"); /* Get address of libc free */
if ((error = dlerror()) != NULL) {
fputs(error, stderr);
exit(1);
}
freep(ptr); /* Call libc free */
printf("free(%p)\n", ptr);
}
#endif
45
Load/Run-time Interpositioning
linux> make intr
gcc -Wall -DRUNTIME -shared -fpic -o mymalloc.so mymalloc.c -ldl
gcc -Wall -o intr int.c
linux> make runr
(LD_PRELOAD="./mymalloc.so" ./intr)
malloc(32) = 0xe60010
free(0xe60010)
linux>
The LD_PRELOAD environment variable tells the
dynamic linker to resolve unresolved refs (e.g., to
malloc)by looking in mymalloc.so first.
46
Interpositioning Recap
Compile Time
Apparent calls to malloc/free get macro-expanded into
calls to mymalloc/myfree
Link Time
Use linker trick to have special name resolutions
malloc __wrap_malloc
__real_malloc malloc
Load/Run Time
Implement custom version of malloc/free that use
dynamic linking to load library malloc/free under different
names
47
Any Questions?
48