KEMBAR78
Chapter 8 - Code Generation Part 1 | PDF | Assembly Language | Computer Program
0% found this document useful (0 votes)
92 views5 pages

Chapter 8 - Code Generation Part 1

Uploaded by

Aschalew Ayele
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
92 views5 pages

Chapter 8 - Code Generation Part 1

Uploaded by

Aschalew Ayele
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 5

Compiler Design

Chapter Eight: Code Generation


Objective,
❖ Explain code generation phase and how it works.
❖ Describe the DAG representation of basic blocks.
❖ Describe several Issues in the design of a code generator: Memory management, Instruction
selection, Register allocation and target machine code.
❖ Describe several addressing modes together with their forms and associated costs.
❖ Explain the various strategies for efficient utilization of registers: Global Register Allocation,
Usage count, register assignment for outer loops and register allocation by graph coloring.

Code Generator
The final phase in compiler model is the code generator. It take’s intermediate representations of the
source program (three-address representations such as quadruples, triples, indirect triples or virtual
machine representations or linear representations such as postfix notations and graphical representations such
as syntax trees and DAG's) as an input and produces an equivalent target program as output.
The output of the code generator is the target program. This output may take on a verity of forms such as
Absolute machine code, Relocation machine code or Assembly language code.

The DAG Representation


• DAG (directed a cyclic graph) of a basic block is a representation for single expression in which
the nodes of the DAG represent the statements within the block and each child of a node
corresponds to the statement that is the last definition of an operand used in the statement.
Example: A DAG for the block
a=b+c
b=a-d
c=b+c
d=a-d

The design issues of a CODE GENERATOR are: -

➢ Input to the code generator


➢ Target programs
➢ Memory management
➢ Instruction selection
➢ Register allocation
➢ Approaches to code generation
Compiled by: Dawit K. 1
Compiler Design

1. Memory management:
The process of maintaining and organizing the entire memory is called the memory management
Mapping names in the source program to addresses of data objects in run-time memory is done cooperatively
by the front end and the code generator. A name in the three- address statement refers to a symbol-table entry
for name. then from the symbol-table entry, a relative address can be determined for the name.
2. Instruction selection

✓ Instruction speeds and machine idioms are other important factors.


✓ The nature of the instruction set of the target machine has a strong effect on the difficulty of
instruction selection.
✓ For example, the uniformity and completeness of the instruction set are important factors. If the target
machine does not support each data type in a uniform manner, then each exception to the general rule
requires special handling.
✓ On some machines, for example, floating-point operations are done using separate registers
✓ The factors to be considered during instruction selection are:
▪ The uniformity and completeness of the instruction set.
▪ Instruction speed and machine idioms.
▪ Size of the instruction set.

Eg., for the following address code is:


a := b + c
d := a + e inefficient assembly code is:
MOV b, R0 R0 ← b
ADD c, R0 R0 ← c + R0
MOV R0, a a ← R0
MOV a, R0 R0 ← a
ADD e, R0 R0 ← e + R0
MOV R0 , d d ← R0
Here the fourth statement is redundant, and so is the third statement if 'a' is not subsequently used.
The quality of the generated code is usually determined by its speed and size therefor we need to
know instruction costs in order to design good code sequences.
3. Register allocation
➢ A key problem in code generation is deciding what values to hold in what registers.
➢ Registers are the fastest computational unit on the target machine, but we usually do not have enough
of them to hold all values.
➢ Perform a number of operation but limited no of registers. This situation solved by register allocation
➢ The use of registers is often subdivided into two subproblems:
1. Register allocation, during which we select the set of variables that will reside in registers at each
point in the program.
Compiled by: Dawit K. 2
Compiler Design

2. Register assignment, during which we pick the specific register that a variable will reside in.

4. Target machine code

The target computer is a byte-addressable machine with four bytes to a word and n general-purpose
registers, R0, R1, …, Rn-1. It has two-address instructions of the form.
Syntax: op source, destination
Where op is an op-code, such as MOV, Add and source, destination are data fields. It has the following op-
codes: MOV (move source to destination), ADD (add source to destination) SUB (subtract source to
destination)
The source and destination fields are not long enough to hold memory addresses, so certain bit patterns in these
fields specify that words following an instruction certain operands and/or address. The source and destination
of an instruction are specified by combining registers and memory locations with address modes.
The address modes together with their assembly-language forms and associated costs are as follows:
MODE FORM ADDRESS ADDED COST
Absolute M M 1
Register R R 0
Indexed C(R) C+contents(R) 1
Indirect register *R contents(R) 0
Indirect indexed *C(R) contents(C+contents(R)) 1

1. Absolute: In Absolute mode, the data is available in RAM memory location M.


Eg: MOV M, R0 stores the contents of memory location into register R0.
2. Register: In Register mode, the instruction specifies the register which holds the data to be manipulated.
Eg: MOV R0, M stores the contents of register R0 into memory location M.
3. Indexed: In this mode, the instruction specifies the index where the data is available. C(R) represents
an address offset 'C' from the value in register R.
Eg: MOV 4(R0),M stores the value contents(4+contents(R0)) into memory location M.
4. Indirect Register & Indirect Indexed: In both modes, the address of the data is given directly. It is
indicated by prefix '*'.
Eg: MOV *4(R0),M stores the value contents(contents(4+contents(R0))) into memory location M.

Instruction costs:
The cost of a machine instruction is one more than the cost of address modes for source & destination. The
cost of an instruction gives the length of the instruction in words. Address modes involving registers have cost
zero and involving a memory location is one, because such operands have to be stored with the instruction.
The instruction length should be minimum if space is an important factor. By minimizing the instruction
length, we can reduce the time taken to execute the instruction.

Compiled by: Dawit K. 3


Compiler Design

Some examples follow:


1. The instruction MOV R0, R1 copies the constants of register R0 into register R1. This instruction has
cost one, since it occupies only one word of memory.
2. The instruction MOV R5, M copies the contents of register R5 into memory location M. This
instruction has cost two, since the address of memory location M is in the word following the instruction.
3. The instruction ADD #1, R3 Adds the constant 1 to the contents of register 3, and has cost two, since
the constant 1 must appear in the next word following the instruction.
4. The instruction SUB 4(R0), *12(R1) stores the value contents(contents(12+contents(R1)))-
contents(contents(4+(R0))) into the destination *12(R1). The cost of this instruction is three, since the
constants 4 & 12 are stored in the next two words following the instruction.
EX: - Consider the following code sequence
1. MOV b, R0
ADD c, R0
MOV R0, a
2. MOV b, a
ADD c, a
3. MOV *R1, *R0
ADD *R2, *R0
4. ADD R2, R1
MOV R1, a
Calculate the cost of the above instruction in terms of access time & memory usage.
SOLN: 6, 6, 2, 3

Register Allocation and Assignment


Efficient utilization of registers is vitally important in generating good code. This section presents
various strategies for deciding at each point in a program what values should reside in registers (register
allocation) and in which register each value should reside (register assignment).
The following are various strategies for making these decisions. They are:
➢ Global Register Allocation
➢ Usage count
➢ Register assignment for outer loops
➢ Register allocation by graph coloring
Global Register Allocation:

This approach is to keep frequently used variables of each inner loop in some fixed registers throughout
and make these registers accessible globally. Register that are not allocated can be used to hold local
values of a block. This approach is simple to implement with a drawback that fixed number of registers is
not always the right number to make available for global register allocation.

Compiled by: Dawit K. 4


Compiler Design

Usage Count:
By allocation a register to a variable ‘x’ for the duration of loop L. We sauce one unit of cost for
reference to ‘x’.we know that when the variable ‘x’ is assigned a value in the block and it is live, then it
remains in the registors.with this fact .we save two units of cost for each block of loop L in which ‘x’ is
assigned a value and line at the end of the block.
Thus the number of units of cost that can be saved by allocating a register to ‘x’ within loop L is given by

(Use(x,B)+z*live(x,B))
Where use(x,B) is the number of times ‘x’ is used within block B before it is reassigned a value. If ‘x’ is
assigned a value in B and it is live at the end of B then

Live(x, B) =1 otherwise live(x, B) =0


Register assignment for outer loops:
The same idea used to assign register for inner loops can be applied to assign resistor which registers for
outer loops. Suppose L1 is an outer loop containing an inner loop L2. The variables for which registers
are already allocated in L1 need not be allocated registers in L1-L2
However, if a register is allocated to a variable ‘x’ in L1 but not L2 then at the time of entering in to loop
L2.we must store X. At the time of leaving L2 and entering in to the block of L1-L2 we must load ‘x’.

Register allocation by graph colouring:

When there is a need for register but the register is not available then in order to free up a register, its
contents are spilled(stored) in to a memory location.

Graph colouring technique consists of two parts. The first part select target machine-instructions as if
there are infinite numbers of symbolic register.

The effort of this step is that the temporary names used in the intermediate representation become register
names and three-address statements because machine-language statements.

The second part constructs a registor-interframe graph for each procedure. The nodes of graph represent
symbolic registrar and there is an edge between two nodes if at a point one node is being defined while
the other is live.

After completion of graph, it is collared using k colors, where k is the number of assignable registers. The
graph is collared if each node is assigned a colour such that two adjacent nodes have a different
colour and the colouring ensure that two symbolic registers that can interact with each other are
assigned a different physical register.

Compiled by: Dawit K. 5

You might also like