C as Implemented in Assembly Language
Overview
We program in C for convenience
No MCU executes C directly; MCUs execute only machine code
So we compile the C to assembly code, a human-readable representation of machine code
We need to know what the assembly code implementing the C looks like
  To use the processor efficiently
  To analyze the code with precision
  To find performance and other problems
An overview of what C gets compiled to
  C start-up module, subroutine calls, stacks, data classes and layout, pointers, control flow, etc.
Compiler Stages
Parser
reads in C code,
checks for syntax errors,
forms intermediate code (tree representation)
High-Level Optimizer
Modifies intermediate code (processor-independent)
Code Generator
Creates assembly code step-by-step from each node of the intermediate code
Allocates variable uses to registers
Low-Level Optimizer
Modifies assembly code (parts are processor-specific)
Assembler
Creates object code (machine code)
Linker/Loader
Creates executable image from object file
Examining Assembly Code before Debugger
Compiler can generate assembly code listing for reference
Select in project options
Examining Disassembled Program in Debugger
A Word on Code Optimizations
Compiler and rest of tool-chain try to optimize code:
Simplifying operations
Removing “dead” code
Using registers
These optimizations often get in the way of understanding what the code does
Fundamental trade-off: Fast or comprehensible code?
Compiler optimization levels: Level 0 to Level 3
Code examples here may use the “volatile” type qualifier to reduce compiler optimizations and improve readability
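As a minimal sketch of why the examples lean on volatile, the two functions below compute the same sum. The function names are hypothetical and not part of the original examples. With optimization enabled, a compiler may keep the plain accumulator in a register or fold the whole loop into a constant, whereas the volatile accumulator must be read and written in memory on every iteration, so each C statement remains visible in the listing.

```c
#include <stdint.h>

/* Hypothetical example: the compiler is free to keep `total` in a register
   or even replace the whole loop with a precomputed constant. */
uint32_t sum_plain(void) {
    uint32_t total = 0;
    for (uint32_t i = 0; i < 10; i++) {
        total += i;
    }
    return total;
}

/* Same computation, but every read and write of `total` must really
   access memory, so the loop survives in the generated code. */
uint32_t sum_volatile(void) {
    volatile uint32_t total = 0;
    for (uint32_t i = 0; i < 10; i++) {
        total += i;
    }
    return total;
}
```

Both functions return the same value; only the generated assembly differs.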
USING REGISTERS
Register Use Conventions
Make it easier to create modular, isolated and integrated code
Scratch registers are not expected to be preserved upon returning from a called subroutine
  r0-r3
Preserved (“variable”) registers are expected to have their original values upon returning from a called subroutine
  r4-r8, r10-r11
Core Register Use
Register  Synonym   Special role in the procedure call standard
r15       PC        The Program Counter.
r14       LR        The Link Register.
r13       SP        The Stack Pointer.
r12       IP        The Intra-Procedure-call scratch register.
r11       v8        Variable-register 8.
r10       v7        Variable-register 7.
r9        v6,SB,TR  Platform register. The meaning of this register is defined by the platform standard.
r8        v5        Variable-register 5.
r7        v4        Variable register 4.
r6        v3        Variable register 3.
r5        v2        Variable register 2.
r4        v1        Variable register 1.
r3        a4        Argument / scratch register 4.
r2        a3        Argument / scratch register 3.
r1        a2        Argument / result / scratch register 2.
r0        a1        Argument / result / scratch register 1.
r10-r11: Must be saved and restored by the callee procedure if it may modify them.
r4-r8: Must be saved and restored by the callee procedure if it may modify them. Calling subroutine expects these to retain their value.
r0-r3: Don’t need to be saved. May be used for arguments, results, or temporary values.
MEMORY REQUIREMENTS
What Memory Does a Program Need?
    int a, b;
    const char c=123;
    int d=31;
    void main(void) {
        int e;
        char f[32];
        e = d + 7;
        a = e + 29999;
        strcpy(f,"Hello!");
    }
Five possible types
  Code
  Read-only static data
  Writable static data
    ◦ Initialized
    ◦ Zero-initialized
    ◦ Uninitialized
  Heap
  Stack
What goes where?
  Code is obvious
  And the others?
What Memory Does a Program Need?
Can the information change?
No? Put it in read-only, nonvolatile memory
◦ Instructions
◦ Constant strings
◦ Constant operands
◦ Initialization values
Yes? Put it in read/write memory
◦ Variables
◦ Intermediate computations
◦ Return address
◦ Other housekeeping data
What Memory Does a Program Need?
How long does the data need to exist? Reuse memory if possible.
Statically allocated
◦ Exists from program start to end
◦ Each variable has its own fixed location
◦ Space is not reused
Automatically allocated
◦ Exists from function start to end
◦ Space can be reused
Dynamically allocated
◦ Exists from explicit allocation to explicit deallocation
◦ Space can be reused
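The three lifetimes can be sketched in a few lines of C. This is an illustrative example only; the names call_count, count_calls and heap_sum are hypothetical and do not appear in the slides.

```c
#include <stdlib.h>

static int call_count;            /* static: exists from program start to end */

/* Returns how many times it has been called so far. */
int count_calls(void) {
    int local = 1;                /* automatic: exists only during this call */
    call_count += local;
    return call_count;
}

/* Allocates n ints on the heap, fills them with 0..n-1 and returns their sum.
   Returns 0 for n <= 0 and -1 if the allocation fails. */
int heap_sum(int n) {
    if (n <= 0) return 0;
    int *buf = malloc((size_t)n * sizeof *buf);   /* dynamic: lives until free() */
    if (buf == NULL) return -1;
    int sum = 0;
    for (int i = 0; i < n; i++) {
        buf[i] = i;
        sum += buf[i];
    }
    free(buf);                    /* explicit deallocation: space can be reused */
    return sum;
}
```

The static counter keeps its single fixed location across calls, while the stack space for local and the heap block for buf are both released and can be reused.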
Program Memory Use
RAM
  Zero-Initialized Data
  Initialized Data
  Stack
  Heap Data
Flash ROM
  Constant Data
  Initialization Data
  Startup and Runtime Library Code
  Program .text
Activation Record
Activation records are located on the stack
Calling a function creates an activation record
Returning from a function deletes the activation record
Automatic variables and housekeeping information are stored in a function’s activation record
Stack layout:
  Lower address
    (Free stack space)
    Local storage    <- Stack ptr    Activation record for current function
    Return address
    Arguments
    Local storage                    Activation record for caller function
    Return address
    Arguments
    Local storage                    Activation record for caller’s caller function
    Return address
    Arguments
    Local storage                    Activation record for caller’s caller’s caller function
    Return address
    Arguments
  Higher address
Not all fields (LS, RA, Arg) may be present for each activation record
Type and Class Qualifiers
const
  Never written by program, can be put in ROM to save RAM
volatile
  Can be changed outside of normal program flow: ISR, hardware register
  Compiler must be careful with optimizations
static
  Declared within function, retains value between function invocations
  Scope is limited to function
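A short sketch of the three qualifiers in use follows. All names are hypothetical, and no interrupt handler is actually defined here; data_ready only stands in for a flag that an ISR might set.

```c
#include <stdint.h>

/* const: never written by the program, so the toolchain may place it in ROM. */
static const uint8_t lookup[4] = {10, 20, 30, 40};

/* volatile: imagine an ISR sets this flag (hypothetical; no ISR exists here).
   The compiler must reload it from memory on every access. */
volatile int data_ready = 0;

/* static local: keeps its value between calls, visible only inside next_id(). */
uint32_t next_id(void) {
    static uint32_t id = 0;
    id++;
    return id;
}

/* Reads the constant table; the mask keeps the index within bounds. */
uint8_t lookup_value(unsigned index) {
    return lookup[index & 3u];
}
```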
Linker Map File
Contains extensive information on functions and variables
Value, type, size, object
Cross references between sections
Memory map of image
Sizes of image components
Summary of memory requirements
C Run-Time Start-Up Module
After reset, MCU must…
  Initialize hardware
    Peripherals, etc.
    Set up stack pointer
  Initialize C or C++ run-time environment
    Set up heap memory
    Initialize variables
Memory contents for the example program:
  RAM
    Zero-Initialized Data: a, b (fill with zeros)
    Initialized Data: d (fill with Initialization Data)
    Stack: e, f
    Heap Data
  Flash ROM
    Initialization Data: 31
    Constant Data: c: 123, Hello!
    Startup and Runtime Library Code
    Code
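The "initialize variables" step can be sketched on a host machine as a copy plus a fill. This is a simplified model, not the actual start-up module: the arrays below are hypothetical stand-ins for the regions that a real linker script would define, and the 0xDEADBEEF values merely simulate unknown RAM contents after reset.

```c
#include <stdint.h>
#include <string.h>

/* Stand-ins for linker-defined regions (hypothetical names and sizes). */
static const uint32_t init_data_rom[1] = {31};            /* initialization data in ROM: value for d */
static uint32_t data_ram[1];                               /* initialized data in RAM: d */
static uint32_t bss_ram[2] = {0xDEADBEEF, 0xDEADBEEF};     /* a, b before start-up (simulated garbage) */

/* Copies initial values from ROM to RAM, then zero-fills the zero-initialized data. */
void startup_init_variables(void) {
    memcpy(data_ram, init_data_rom, sizeof data_ram);
    memset(bss_ram, 0, sizeof bss_ram);
}

uint32_t read_d(void) { return data_ram[0]; }
uint32_t read_a(void) { return bss_ram[0]; }
uint32_t read_b(void) { return bss_ram[1]; }
```

After startup_init_variables() runs, d holds 31 and a and b hold zero, which is the state main() expects to find.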
ACCESSING DATA IN MEMORY
Accessing Data
What does it take to get at a variable in memory?
Depends on location, which depends on storage type (static, automatic, dynamic)
    int siA;
    void static_auto_local() {
        int aiB;
        static int siC=3;
        int * apD;
        int aiE=4, aiF=5, aiG=6;
        siA = 2;
        aiB = siC + siA;
        apD = & aiB;
        (*apD)++;
        apD = &siC;
        (*apD) += 9;
        apD = &siA;
        apD = &aiE;
        apD = &aiF;
        apD = &aiG;
        (*apD)++;
        aiE+=7;
        *apD = aiE + aiF;
    }
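To follow what the example does to each variable, here is a traced variant that returns the final values. The struct snapshot type and the function name static_auto_local_traced are hypothetical additions for inspection; the statements in between are the ones from the example.

```c
int siA;

/* Hypothetical result record so the final values can be inspected. */
struct snapshot { int siA, siC, aiB, aiE, aiF, aiG; };

struct snapshot static_auto_local_traced(void) {
    int aiB;
    static int siC = 3;          /* static: initialized once, survives calls */
    int *apD;
    int aiE = 4, aiF = 5, aiG = 6;
    siA = 2;
    aiB = siC + siA;
    apD = &aiB;
    (*apD)++;                    /* increments aiB through the pointer */
    apD = &siC;
    (*apD) += 9;                 /* modifies the static through the pointer */
    apD = &siA;
    apD = &aiE;
    apD = &aiF;
    apD = &aiG;
    (*apD)++;                    /* aiG becomes 7 */
    aiE += 7;
    *apD = aiE + aiF;            /* aiG is overwritten with aiE + aiF */
    struct snapshot s = { siA, siC, aiB, aiE, aiF, aiG };
    return s;
}
```

Calling the function twice shows the difference between storage classes: the automatic variables start from their initializers each time, while siC carries its value over from the previous call.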
Static Variables
Static var can be located anywhere in 32-bit memory space, so need a 32-bit pointer
Can’t fit a 32-bit pointer into a 16-bit instruction (or a 32-bit instruction), so save the pointer separate from instruction, but nearby so we can access it with a short PC-relative offset
Load the pointer into a register (r0)
Can now load variable’s value into a register (r1) from memory using that pointer in r0
Similarly can store a new value to the variable in memory
Code and data layout:
  Load r0 with pointer to variable
  Load r1 from [r0]
  Use value of variable
  Label: 32-bit pointer to Variable
  Variable
Static Variables
Addresses of siA and siC are stored as literals to be loaded into pointers
Variables siC and siA are located in .data section with initial values
Key: variable’s value, variable’s address, address of copy of variable’s address
        AREA ||.text||, CODE, READONLY, ALIGN=2
;;;20   siA = 2;
00000e 2102    MOVS r1,#2          ; r1 = 2
000010 4a37    LDR  r2,|L1.240|    ; r2 = &siA
000012 6011    STR  r1,[r2,#0]     ; *r2 = r1
;;;21   aiB = siC + siA;
000014 4937    LDR  r1,|L1.244|    ; r1 = &siC
000016 6809    LDR  r1,[r1,#0]     ; r1 = *r1
000018 6812    LDR  r2,[r2,#0]     ; r2 = *r2
00001a 1889    ADDS r1,r1,r2       ; r1 = r1 + r2
...
|L1.240|
        DCD ||siA||
|L1.244|
        DCD ||siC||
        AREA ||.data||, DATA, ALIGN=2
||siC||
        DCD 0x00000003
||siA||
        DCD 0x00000000
Automatic Variables Stored on Stack
Automatic variables are stored in a function’s activation record (unless optimized and promoted to register)
Activation records are located on the stack
Calling a function creates an activation record, allocating space on stack
Returning from a function deletes the activation record, freeing up space on stack
Local variables in C are implicitly automatic; there is no need to specify the auto keyword
    int main(void) {
        auto vars;
        a();
    }
    void a(void) {
        auto vars;
        b();
    }
    void b(void) {
        auto vars;
        c();
    }
    void c(void) {
        auto vars;
        …
    }
Addressing Automatic Variables
Program must allocate space on stack for variables
Stack addressing uses an offset from the stack pointer: [sp, #offset]
Items on the stack are word aligned
In instructions, one byte used for offset, which is multiplied by four
Possible offsets: 0, 4, 8, …, 1020
Maximum range addressable this way is 1024 bytes
Stack slot addresses: SP, SP+0x4, SP+0x8, SP+0xC, SP+0x10, SP+0x14, SP+0x18, SP+0x1C, SP+0x20
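The offset rule above (one byte, scaled by four) can be written out as a small encode/decode pair. The function names are hypothetical; only the arithmetic comes from the rule itself.

```c
#include <stdint.h>
#include <stdbool.h>

/* Returns true and writes the 8-bit field if `offset` can be encoded
   as [sp, #offset]: it must be a multiple of 4 and at most 1020. */
bool encode_sp_offset(uint32_t offset, uint8_t *imm8) {
    if (offset % 4u != 0u || offset > 1020u) {
        return false;
    }
    *imm8 = (uint8_t)(offset / 4u);
    return true;
}

/* Recovers the byte offset from the 8-bit instruction field. */
uint32_t decode_sp_offset(uint8_t imm8) {
    return (uint32_t)imm8 * 4u;
}
```

For example, an offset of 0xc encodes as the field value 3, and the largest field value 255 decodes to 1020.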
Automatic Variables
Address    Contents
SP         aiG
SP+4       aiF
SP+8       aiE
SP+0xC     aiB
SP+0x10    r0
SP+0x14    r1
SP+0x18    r2
SP+0x1C    r3
SP+0x20    lr

;;;14   void static_auto_local( void ) {
000000 b50f    PUSH {r0-r3,lr}
;;;15   int aiB;
;;;16   static int siC=3;
;;;17   int * apD;
;;;18   int aiE=4, aiF=5, aiG=6;
000002 2104    MOVS r1,#4          ; Initialize aiE
000004 9102    STR  r1,[sp,#8]
000006 2105    MOVS r1,#5          ; Initialize aiF
000008 9101    STR  r1,[sp,#4]
00000a 2106    MOVS r1,#6          ; Initialize aiG
00000c 9100    STR  r1,[sp,#0]
…
;;;21   aiB = siC + siA;
…
00001c 9103    STR  r1,[sp,#0xc]   ; Store value for aiB
USING POINTERS
Using Pointers to Automatics
C Pointer: a variable which holds the data’s address
aiB is on stack at SP+0xc
Compute r0 with variable’s address from stack pointer and offset (0xc)
Load r1 with variable’s value from memory
Operate on r1, save back to variable’s address
;;;22   apD = & aiB;
00001e a803    ADD  r0,sp,#0xc
;;;23   (*apD)++;
000020 6801    LDR  r1,[r0,#0]
000022 1c49    ADDS r1,r1,#1
000024 6001    STR  r1,[r0,#0]
Using Pointers to Statics
Load r0 with variable’s address from address of copy of variable’s address
Load r1 with variable’s value from memory
Operate on r1, save back to variable’s address
;;;24   apD = &siC;
000026 4833    LDR  r0,|L1.244|
;;;25   (*apD) += 9;
000028 6801    LDR  r1,[r0,#0]
00002a 3109    ADDS r1,r1,#9
00002c 6001    STR  r1,[r0,#0]
|L1.244|
        DCD ||siC||
        AREA ||.data||, DATA, ALIGN=2
||siC||
        DCD 0x00000003
ARRAY ACCESS
Array Access
What does it take to get at an array element in memory?
Depends on how many dimensions
Depends on element size and row width
Depends on location, which depends on storage type (static, automatic, dynamic)
    uint8 buff2[3];
    uint16 buff3[5][7];
    uint32 arrays(uint8 n, uint8 j) {
        volatile uint32 i;
        i = buff2[0] + buff2[n];
        i += buff3[n][j];
        return i;
    }
Accessing 1-D Array Elements
Need to calculate element address: sum of…
  array start address
  offset: index * element size
buff2 is array of unsigned characters
Address      Contents
buff2        buff2[0]
buff2 + 1    buff2[1]
buff2 + 2    buff2[2]
00009e 4602    MOV  r2,r0          ; Move n (argument) from r0 into r2
;;;76   i = buff2[0] + buff2[n];
0000a0 4b1b    LDR  r3,|L1.272|    ; Load r3 with pointer to buff2
0000a2 781b    LDRB r3,[r3,#0]     ; Load (byte) r3 with first element of buff2
0000a4 4c1a    LDR  r4,|L1.272|    ; Load r4 with pointer to buff2
0000a6 5ca4    LDRB r4,[r4,r2]     ; Load (byte) r4 with element at address buff2+r2 (r2 holds argument n)
0000a8 1918    ADDS r0,r3,r4       ; Add r3 and r4 to form sum
|L1.272|
        DCD buff2
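The address calculation (start address plus index times element size) can be checked directly in C. This is an illustrative sketch: it uses the standard uint8_t type in place of the slides' uint8, the array contents are sample values, and the helper names are hypothetical.

```c
#include <stdint.h>
#include <stddef.h>

uint8_t buff2[3] = {11, 22, 33};   /* sample contents, chosen for illustration */

/* Address of element n computed by hand: start address + index * element size. */
uint8_t *element_address(uint8_t n) {
    return (uint8_t *)buff2 + (size_t)n * sizeof buff2[0];
}

/* Mirrors the expression buff2[0] + buff2[n] from the arrays() function. */
uint32_t first_plus_nth(uint8_t n) {
    return (uint32_t)buff2[0] + *element_address(n);
}
```

Because the element size is one byte, the offset equals the index, which is why the compiler can use n directly as the register offset in the LDRB instruction.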
Accessing 2-D Array Elements
uint16 buff3[5][7]
var[rows][columns]
Sizes
  Element: 2 bytes
  Row: 7*2 bytes = 14 bytes (0xe)
Offset based on row index and column index
  column offset = column index * element size
  row offset = row index * row size
Address            Contents
buff3, buff3+1     buff3[0][0]
buff3+2, buff3+3   buff3[0][1]
(etc.)
buff3+10, buff3+11 buff3[0][5]
buff3+12, buff3+13 buff3[0][6]
buff3+14, buff3+15 buff3[1][0]
buff3+16, buff3+17 buff3[1][1]
(etc.)
buff3+68, buff3+69 buff3[4][6]
Code to Access 2-D Array
Instruction | r0 | r1 | r2 | r3 | r4 | Description
;;; i += buff3[n][j]; | i | j | n | - | - |
MOVS r3,#0xe | - | - | - | 0xe | - | Load row size
MULS r3,r2,r3 | - | - | n | n*0xe | - | Multiply by row number
LDR r4,|L1.276| | - | - | - | - | &buff3 | Load address of buff3
ADDS r3,r3,r4 | - | - | - | &buff3+n*0xe | - | Add buff3 address to row offset
LSLS r4,r1,#1 | - | j | - | - | j<<1 | Multiply column number by 2 (buff3 is uint16 array)
LDRH r3,[r3,r4] | - | - | - | *(uint16 *)(&buff3+n*0xe+(j<<1)) = buff3[n][j] | j<<1 | Load halfword r3 with element at r3+r4 (buff3 + row offset + col offset)
ADDS r0,r3,r0 | i+buff3[n][j] | - | - | buff3[n][j] | - | Add r3 to r0 (i)
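The same row and column arithmetic can be reproduced in C to confirm that the hand-computed byte offset reaches the intended element. This sketch uses the standard uint16_t type in place of the slides' uint16; offset_2d and load_2d are hypothetical helper names.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

uint16_t buff3[5][7];

/* Byte offset of buff3[n][j]:
   row index * row size (0xe) + column index * element size (2). */
size_t offset_2d(uint8_t n, uint8_t j) {
    return (size_t)n * 0xe + ((size_t)j << 1);
}

/* Reads buff3[n][j] through the hand-computed byte offset,
   the way the MULS / LSLS / LDRH sequence does. */
uint16_t load_2d(uint8_t n, uint8_t j) {
    uint16_t value;
    memcpy(&value, (const uint8_t *)buff3 + offset_2d(n, j), sizeof value);
    return value;
}
```

The last element, buff3[4][6], sits at byte offset 4*14 + 6*2 = 68 from the start of the array.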
CALLING FUNCTIONS
AAPCS Core Register Use
Register Synonym Special Role in the procedure call standard
r15 PC The Program Counter.
r14 LR The Link Register.
r13 SP The Stack Pointer.
r12 IP The Intra-Procedure-call scratch register.
r11 v8 Variable-register 8.
r10 v7 Variable-register 7.
r9 v6,SB,TR Platform register. The meaning of this register is defined by the platform standard.
r8 v5 Variable-register 5.
r7 v4 Variable register 4.
r6 v3 Variable register 3.
r5 v2 Variable register 2.
r4 v1 Variable register 1.
r3 a4 Argument / scratch register 4.
r2 a3 Argument / scratch register 3.
r1 a2 Argument / result / scratch register 2.
r0 a1 Argument / result / scratch register 1.
Function Arguments and Return Values
First, pass the arguments
How to pass them?
◦ Much faster to use registers than stack
◦ But quantity of registers is limited
Basic rules
◦ Process arguments in order they appear in source code
◦ Round size up to be a multiple of 4 bytes
◦ Copy arguments into core registers (r0-r3), aligning doubles to even registers
◦ Copy remaining arguments onto stack, aligning doubles to even addresses
Second, call the function
Usually as subroutine with branch and link (BL) or branch and link with exchange (BLX) instruction
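The basic rules can be modelled with a small allocator that decides where each argument goes. This is a simplified sketch, not the full procedure call standard: it assumes every argument is already rounded to 4 or 8 bytes, it ignores composite types (which the real rules may split between registers and stack), and the names assign_arguments and ON_STACK are hypothetical.

```c
#include <stddef.h>

#define ON_STACK (-1)

/* sizes[i] is 4 (word) or 8 (double / long long) after rounding.
   Writes to where[i] the first core register used (0..3) or ON_STACK. */
void assign_arguments(const int *sizes, size_t count, int *where) {
    int next_reg = 0;                        /* next free register among r0-r3 */
    for (size_t i = 0; i < count; i++) {
        int regs_needed = sizes[i] / 4;
        if (sizes[i] == 8 && (next_reg & 1)) {
            next_reg++;                      /* align doubles to an even register */
        }
        if (next_reg + regs_needed <= 4) {
            where[i] = next_reg;
            next_reg += regs_needed;
        } else {
            where[i] = ON_STACK;
            next_reg = 4;                    /* once the stack is used, no more registers */
        }
    }
}
```

For a call such as f(int, double, int), this model places the int in r0, skips r1 so the double starts at r2 (occupying r2-r3), and sends the last int to the stack.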
Return Values
Callee passes Return Value in register(s) or stack
Return value size    Fundamental Data Type    Composite Data Type
1-4 bytes            r0                       r0
8 bytes              r0-r1                    stack
16 bytes             r0-r3                    stack
Indeterminate size   n/a                      stack
Registers
  Callee returns the value directly in the registers listed above
Stack
  Caller function allocates space for return value, then passes pointer to space as an argument to callee
  Callee stores result at location indicated by pointer
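The stack case can be made visible in C by writing the hidden pointer out explicitly. This is only a sketch of the idea; struct big and both function names are hypothetical, and a real compiler performs the transformation behind the scenes.

```c
#include <stdint.h>

struct big { uint32_t w[4]; };   /* 16-byte composite (hypothetical type) */

/* What the programmer writes: return a composite by value. */
struct big make_big(uint32_t seed) {
    struct big r = { { seed, seed + 1, seed + 2, seed + 3 } };
    return r;
}

/* Roughly what happens underneath: the caller allocates the result space
   and passes a pointer to it as an extra argument; the callee stores there. */
void make_big_explicit(struct big *result, uint32_t seed) {
    for (uint32_t i = 0; i < 4; i++) {
        result->w[i] = seed + i;
    }
}
```

Both forms leave the same four words in the caller's storage; the second simply shows where the extra pointer argument comes from.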
Call Example
    int fun2(int arg2_1, int arg2_2) {
        int i;
        arg2_2 += fun3(arg2_1, 4, 5, 6);
        …
    }
fun2 PROC
;;;85   int fun2(int arg2_1, int arg2_2) {
...
0000e0 2306        MOVS r3,#6      ; Argument 4 into r3
0000e2 2205        MOVS r2,#5      ; Argument 3 into r2
0000e4 2104        MOVS r1,#4      ; Argument 2 into r1
0000e6 4630        MOV  r0,r6      ; Argument 1 into r0
0000e8 f7fffffe    BL   fun3       ; Call fun3 with BL instruction
0000ec 1904        ADDS r4,r0,r4   ; Result was returned in r0, so add to r4 (arg2_2 += result)
Call and Return Example
    int fun3(int arg3_1, int arg3_2,
             int arg3_3, int arg3_4) {
        return arg3_1*arg3_2*
               arg3_3*arg3_4;
    }
fun3 PROC
;;;81   int fun3(int arg3_1, int arg3_2, int arg3_3, int arg3_4) {
0000ba b510    PUSH {r4,lr}        ; Save r4 and Link Register on stack
0000c0 4348    MULS r0,r1,r0       ; r0 = arg3_1*arg3_2
0000c2 4350    MULS r0,r2,r0       ; r0 *= arg3_3
0000c4 4358    MULS r0,r3,r0       ; r0 *= arg3_4
0000c6 bd10    POP  {r4,pc}        ; Restore r4 and return from subroutine
Return value is in r0
CONTROL FLOW
Control Flow: Conditionals and Loops
How does the compiler implement conditionals and loops?
    if (x){
        y++;
    } else {
        y--;
    }

    switch (x) {
    case 1:
        y += 3;
        break;
    case 31:
        y -= 5;
        break;
    default:
        y--;
        break;
    }

    for (i = 0; i < 10; i++){
        x += i;
    }

    while (x<10) {
        x = x + 1;
    }

    do {
        x += 2;
    } while (x < 20);
Control Flow: If/Else
Flowchart: Condition; T: action_if, F: action_else
;;;39   if (x){
000056 2900    CMP  r1,#0
000058 d001    BEQ  |L1.94|
;;;40   y++;
00005a 1c52    ADDS r2,r2,#1
00005c e000    B    |L1.96|
|L1.94|
;;;41   } else {
;;;42   y--;
00005e 1e52    SUBS r2,r2,#1
|L1.96|
;;;43   }
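The branch structure of the listing can be mirrored in C with goto, which makes the mapping from source to assembly explicit. This is an illustrative rewrite, not compiler output; the function and label names are hypothetical, and x and y are passed as parameters so the function stands alone.

```c
/* Goto form of: if (x) { y++; } else { y--; }  Returns the new y. */
int if_else_lowered(int x, int y) {
    if (x == 0) goto else_part;   /* CMP r1,#0 ; BEQ |L1.94| */
    y++;                          /* ADDS r2,r2,#1 */
    goto end_if;                  /* B |L1.96| */
else_part:
    y--;                          /* SUBS r2,r2,#1 */
end_if:
    return y;
}
```

Note that the test is inverted relative to the source: the branch is taken when the condition is false, so the "if" action falls through.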
Control Flow: Switch
Flowchart: Evaluate; = const1? T: action1; F: = const2? T: action2; F: action3
;;;45   switch (x) {
000060 2901    CMP  r1,#1
000062 d002    BEQ  |L1.106|
000064 291f    CMP  r1,#0x1f
000066 d104    BNE  |L1.114|
000068 e001    B    |L1.110|
|L1.106|
;;;46   case 1:
;;;47   y += 3;
00006a 1cd2    ADDS r2,r2,#3
;;;48   break;
00006c e003    B    |L1.118|
|L1.110|
;;;49   case 31:
;;;50   y -= 5;
00006e 1f52    SUBS r2,r2,#5
;;;51   break;
000070 e001    B    |L1.118|
|L1.114|
;;;52   default:
;;;53   y--;
000072 1e52    SUBS r2,r2,#1
;;;54   break;
000074 bf00    NOP
|L1.118|
000076 bf00    NOP
;;;55   }
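For this switch the compiler emits a chain of compares followed by the case bodies, each ending in a branch to the common exit. A goto-based C sketch of that shape is shown here; the function and label names are hypothetical, and x and y are parameters so the function stands alone.

```c
/* Goto form of the switch on x with cases 1, 31 and default. Returns the new y. */
int switch_lowered(int x, int y) {
    if (x == 1) goto case_1;          /* CMP r1,#1    ; BEQ |L1.106| */
    if (x != 31) goto case_default;   /* CMP r1,#0x1f ; BNE |L1.114| */
    goto case_31;                     /* B |L1.110| */
case_1:
    y += 3;
    goto end_switch;                  /* break */
case_31:
    y -= 5;
    goto end_switch;                  /* break */
case_default:
    y--;
end_switch:
    return y;
}
```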
Iteration: While
Flowchart: loop_body, then Condition; T: repeat loop_body, F: exit
;;;57   while (x<10) {
000078 e000    B    |L1.124|
|L1.122|
;;;58   x = x + 1;
00007a 1c49    ADDS r1,r1,#1
|L1.124|
00007c 290a    CMP  r1,#0xa        ;57
00007e d3fc    BCC  |L1.122|
;;;59   }
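The while loop is compiled with the test placed after the body and an initial jump to the test. A goto-based sketch of that layout is shown here. It assumes x is unsigned, which the BCC (unsigned lower) branch suggests but the listing does not state; the function and label names are hypothetical.

```c
/* Goto form of: while (x < 10) { x = x + 1; }  Returns the final x. */
unsigned while_lowered(unsigned x) {
    goto test;                    /* B |L1.124| : check the condition first */
body:
    x = x + 1;                    /* ADDS r1,r1,#1 */
test:
    if (x < 10u) goto body;       /* CMP r1,#0xa ; BCC |L1.122| */
    return x;
}
```

Placing the test at the bottom means each iteration executes only one branch instruction instead of two.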
Iteration: For
Flowchart: init_expression, loop_body, cond_expression, then Condition; T: repeat loop_body, F: exit
;;;61   for (i = 0; i < 10; i++){
000080 2300    MOVS r3,#0
000082 e001    B    |L1.136|
|L1.132|
;;;62   x += i;
000084 18c9    ADDS r1,r1,r3
000086 1c5b    ADDS r3,r3,#1       ;61
|L1.136|
000088 2b0a    CMP  r3,#0xa        ;61
00008a d3fb    BCC  |L1.132|
;;;63   }
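The for loop follows the same bottom-tested layout as the while loop, with the initialization before the first jump and the increment at the end of the body. In this sketch i and x are assumed to be unsigned (consistent with the BCC branch, though the listing does not show their declarations), and the function and label names are hypothetical.

```c
/* Goto form of: for (i = 0; i < 10; i++) { x += i; }  Returns the final x. */
unsigned for_lowered(unsigned x) {
    unsigned i = 0;               /* MOVS r3,#0 */
    goto test;                    /* B |L1.136| */
body:
    x += i;                       /* ADDS r1,r1,r3 */
    i++;                          /* ADDS r3,r3,#1 */
test:
    if (i < 10u) goto body;       /* CMP r3,#0xa ; BCC |L1.132| */
    return x;
}
```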
Iteration: Do/While
Flowchart: loop_body, then Condition; T: repeat loop_body, F: exit
;;;65   do {
00008c bf00    NOP
|L1.142|
;;;66   x += 2;
00008e 1c89    ADDS r1,r1,#2
;;;67   } while (x < 20);
000090 2914    CMP  r1,#0x14
000092 d3fc    BCC  |L1.142|
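A do/while loop needs no initial jump, because the body always runs once before the test. The goto form below sketches this, again assuming x is unsigned to match the BCC branch; the function and label names are hypothetical.

```c
/* Goto form of: do { x += 2; } while (x < 20);  Returns the final x. */
unsigned do_while_lowered(unsigned x) {
body:
    x += 2;                       /* ADDS r1,r1,#2 */
    if (x < 20u) goto body;       /* CMP r1,#0x14 ; BCC |L1.142| */
    return x;
}
```

Even when the condition is false on entry, the body executes once, so an initial x of 25 comes back as 27.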