µC&ES (21CS43)-Module 2-Chapter 2 4th Sem, AI&ML
Chapter-2
C Compilers and Optimization
Basic C Data Types
C Looping Structures
Register Allocation
Function Calls
Pointer Aliasing (Not covered in notes)
Q.No 1 Explain the Efficient Use of C data types
■ For local variables held in registers, don't use a char or short type unless 8-bit or 16-bit
modular arithmetic is necessary. Use the signed or unsigned int types instead. Unsigned
types are faster when you use divisions.
■ For array entries and global variables held in main memory, use the type with the
smallest size possible to hold the required data. This saves memory footprint. Avoid
indexing short type arrays by an offset from the array base; traverse them by incrementing
the array pointer instead, as LDRH does not support a shifted register offset.
■ Use explicit casts when reading array entries or global variables into local variables, or
writing local variables out to array entries.
■ Avoid implicit or explicit narrowing casts in expressions because they usually cost extra
cycles.
■ Avoid char and short types for function arguments or return values. Instead use the int
type even if the range of the parameter is smaller. This prevents the compiler from
performing unnecessary casts.
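The guidelines above can be illustrated with a short sketch. The names checksum_short and add_offset and the 64-entry array length are assumptions made for illustration; they do not come from the notes. The array is stored as short to save memory, every local variable, argument, and return value is int-sized, the array is walked by incrementing the pointer, and the only casts are explicit ones at the points where data moves between memory and locals.

```c
#define NUM_ENTRIES 64u   /* assumed array length for this sketch */

/* Sum an array of short entries. The counter and accumulator are int-sized,
   so the compiler does not need to wrap them to 8 or 16 bits in the loop. */
int checksum_short(const short *data)
{
    unsigned int i;
    int sum = 0;

    for (i = 0; i < NUM_ENTRIES; i++) {
        sum += *(data++);           /* pointer increment, not base + offset */
    }
    return sum;                     /* int return value: no narrowing cast */
}

/* Add an offset to every entry. The argument is an int even though the
   values involved would fit in a short. */
void add_offset(short *data, int offset)
{
    unsigned int i;

    for (i = 0; i < NUM_ENTRIES; i++) {
        int x = (int)*data;         /* explicit cast when reading into a local */
        x += offset;
        *(data++) = (short)x;       /* explicit cast when writing back */
    }
}
```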
Gouri S Katageri, Asst Professor, ECE Dept JSSATE Bengaluru
Explain C Looping Structures
Loops with a Fixed Number of Iterations
Loops Using a Variable Number of Iterations
Loop Unrolling
1. Loops with a Fixed Number of Iterations
Example checksum routine
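A minimal sketch of a checksum routine with a fixed iteration count, assuming an array of 64 words and the hypothetical name checksum_inc. With an incrementing counter the compiler typically needs three instructions to control the loop: an add to increment the counter, a compare against 64, and a conditional branch.

```c
/* Sum 64 words using an incrementing loop counter. */
int checksum_inc(const int *data)
{
    unsigned int i;
    int sum = 0;

    for (i = 0; i < 64; i++) {
        sum += *(data++);
    }
    return sum;
}
```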
The following example shows improvement if we switch to a decrementing loop
rather than an incrementing loop
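A sketch of the decrementing form, assuming the same 64-word array; checksum_dec is a hypothetical name. Counting down to zero lets the loop-end test reuse the flags set by the decrement, so no separate compare instruction is needed.

```c
/* Sum 64 words using a counter that counts down to zero. */
int checksum_dec(const int *data)
{
    unsigned int i;
    int sum = 0;

    for (i = 64; i != 0; i--) {     /* decrement and test against zero */
        sum += *(data++);
    }
    return sum;
}
```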
The SUBS and BNE instructions implement the loop. Our checksum example now has the
minimum number of four instructions per loop. This is much better than six for checksum_v1
and eight for checksum_v3.
2. Loops Using a Variable Number of Iterations
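When the iteration count N is passed in as an argument, the same count-down idea applies. The sketch below uses the hypothetical names checksum_for and checksum_do_while. The do-while form assumes the caller guarantees that N is nonzero, which spares the compiler the initial check of N that the for form requires.

```c
/* for form: safe for any N, including zero. */
int checksum_for(const int *data, unsigned int N)
{
    int sum = 0;

    for (; N != 0; N--) {
        sum += *(data++);
    }
    return sum;
}

/* do-while form: assumes N is nonzero (the body always runs at least once). */
int checksum_do_while(const int *data, unsigned int N)
{
    int sum = 0;

    do {
        sum += *(data++);
    } while (--N != 0);
    return sum;
}
```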
3. Loop Unrolling
Each loop iteration costs two instructions in addition to the body of the loop: a subtract to
decrement the loop count and a conditional branch.
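Unrolling spreads that two-instruction overhead across several copies of the loop body. The sketch below unrolls the checksum loop four times; the function names are hypothetical. checksum_unrolled assumes N is a nonzero multiple of four, while checksum_unrolled_any drops that assumption by handling the leftover entries in a second loop.

```c
/* Unrolled four times. Assumes N is a nonzero multiple of 4. */
int checksum_unrolled(const int *data, unsigned int N)
{
    int sum = 0;

    do {
        sum += *(data++);
        sum += *(data++);
        sum += *(data++);
        sum += *(data++);
        N -= 4;                     /* one subtract and branch per four entries */
    } while (N != 0);
    return sum;
}

/* Unrolled four times, for any N (including zero). */
int checksum_unrolled_any(const int *data, unsigned int N)
{
    unsigned int i;
    int sum = 0;

    for (i = N / 4; i != 0; i--) {  /* whole groups of four entries */
        sum += *(data++);
        sum += *(data++);
        sum += *(data++);
        sum += *(data++);
    }
    for (i = N & 3; i != 0; i--) {  /* remaining 0 to 3 entries */
        sum += *(data++);
    }
    return sum;
}
```

The trade-off is code size: each extra copy of the body makes the function larger, so unrolling pays off mainly in loops that run often.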
Explain how Register Allocation can be done in ARM
■ The compiler attempts to allocate a processor register to each local variable you use in a
C function.
■ It will try to use the same register for different local variables if the use of the variables
does not overlap.
■ When there are more local variables than available registers, the compiler stores the
excess variables on the processor stack.
■ These variables are called spilled or swapped-out variables since they are written out to
memory.
■ Spilled variables are slow to access compared to variables allocated to registers.
To implement a function efficiently, you need to:
■ Minimize the number of spilled variables.
■ Ensure that the most important and frequently accessed variables are stored in registers.
Efficient Register Allocation
■ Try to limit the number of local variables in the internal loop of functions to 12. The compiler should
be able to allocate these to ARM registers.
■ You can guide the compiler as to which variables are important by ensuring these variables are used
within the innermost loop.
Calling Functions Efficiently
■ Try to restrict functions to four arguments. This will make them more efficient to call. Use
structures to group related arguments and pass structure pointers instead of multiple arguments.
■ Define small functions in the same source file and before the functions that call them. The
compiler can then optimize the function call or inline the small function.
■ Critical functions can be inlined using the __inline keyword.
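The points above can be sketched as follows. All names here (Queue, queue_bytes, nybble_to_hex, byte_to_hex) are hypothetical and chosen only for illustration. The first part groups three related queue values into a structure so that the function takes three arguments instead of five, which keeps it within the four arguments that the ARM procedure call standard passes in registers r0 to r3. The second part defines a small helper before its caller in the same file. It uses the standard C99 `static inline` form as an assumption for portability; the __inline keyword named above is the compiler-specific spelling.

```c
/* Hypothetical circular byte queue: related values grouped in one structure. */
typedef struct {
    char *q_start;   /* first byte of the queue buffer */
    char *q_end;     /* one past the last byte of the buffer */
    char *q_ptr;     /* current insertion point */
} Queue;

/* Three arguments instead of five. Assumes N is nonzero. */
void queue_bytes(Queue *queue, const char *data, unsigned int N)
{
    char *q_ptr = queue->q_ptr;     /* copy hot fields into locals */
    char *q_end = queue->q_end;

    do {
        *(q_ptr++) = *(data++);
        if (q_ptr == q_end) {
            q_ptr = queue->q_start; /* wrap around to the buffer start */
        }
    } while (--N != 0);

    queue->q_ptr = q_ptr;           /* write the insertion point back once */
}

/* Small helper defined before its caller, in the same source file. */
static inline unsigned int nybble_to_hex(unsigned int d)
{
    return (d < 10) ? d + '0' : d - 10 + 'A';
}

/* Pack the two ASCII hex digits of a byte into one value (high digit first). */
unsigned int byte_to_hex(unsigned int in)
{
    return (nybble_to_hex((in >> 4) & 15u) << 8) | nybble_to_hex(in & 15u);
}
```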