Module 4
Intermediate Code Generation
Outline
Variants of Syntax Trees
Three-address code
Types and declarations
Translation of expressions
Type checking
Control flow
Backpatching
Introduction
Intermediate code is the interface between the front end and
the back end of a compiler
Ideally the details of the source language are confined to the
front end and the details of the target machine to the back
end, so that m front ends and n back ends give m*n compilers
In this chapter we study intermediate representations,
static type checking and intermediate code generation
[Figure: Parser -> Static Checker -> Intermediate Code Generator (front end) -> Code Generator (back end)]
Variants of syntax trees
It is sometimes beneficial to create a DAG instead of a
tree for expressions.
This way we can easily identify the common subexpressions
and then use that knowledge during code generation
Example: a+a*(b-c)+(b-c)*d
[Figure: DAG for the expression; the leaf a and the node for b-c each appear once and are shared by two parents]
SDD for creating DAGs
   Production       Semantic Rules
1) E -> E1 + T      E.node = new Node(‘+’, E1.node, T.node)
2) E -> E1 - T      E.node = new Node(‘-’, E1.node, T.node)
3) E -> T           E.node = T.node
4) T -> (E)         T.node = E.node
5) T -> id          T.node = new Leaf(id, id.entry)
6) T -> num         T.node = new Leaf(num, num.val)
Example:
1) p1 = Leaf(id, entry-a)
2) p2 = Leaf(id, entry-a) = p1
3) p3 = Leaf(id, entry-b)
4) p4 = Leaf(id, entry-c)
5) p5 = Node(‘-’, p3, p4)
6) p6 = Node(‘*’, p1, p5)
7) p7 = Node(‘+’, p1, p6)
8) p8 = Leaf(id, entry-b) = p3
9) p9 = Leaf(id, entry-c) = p4
10) p10 = Node(‘-’, p3, p4) = p5
11) p11 = Leaf(id, entry-d)
12) p12 = Node(‘*’, p5, p11)
13) p13 = Node(‘+’, p7, p12)
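Value-number method for
constructing DAGs
[Figure: nodes of the DAG for i = i + 10 stored in an array; the array index is the value number]
1   id    (pointer to entry for i)
2   num   10
3   +     1   2
4   =     1   3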
Algorithm
Search the array for a node M with label op, left child l
and right child r
If there is such a node, return the value number of M
If not, create in the array a new node N with label op, left
child l, and right child r and return its value number
A hash table keyed on (op, l, r) avoids searching the whole array
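The algorithm above can be sketched in a few lines. This is a minimal illustration, not a definitive implementation: the class name ValueNumberDAG and its methods are hypothetical, value numbers start at 0 rather than 1, and a Python dict stands in for the hash table.

```python
class ValueNumberDAG:
    """Stores DAG nodes in an array; a node's index is its value number."""

    def __init__(self):
        self.nodes = []   # nodes[i] holds the (op, left, right) signature
        self.index = {}   # hash table: signature -> value number

    def node(self, op, left=None, right=None):
        key = (op, left, right)
        if key in self.index:          # an identical node already exists
            return self.index[key]
        self.nodes.append(key)         # otherwise create a new node
        self.index[key] = len(self.nodes) - 1
        return self.index[key]

    def leaf(self, kind, value):
        return self.node(kind, value)


# Build the DAG for a + a * (b - c) + (b - c) * d
dag = ValueNumberDAG()
a = dag.leaf("id", "a")
b = dag.leaf("id", "b")
c = dag.leaf("id", "c")
d = dag.leaf("id", "d")
first_diff = dag.node("-", b, c)
left = dag.node("+", a, dag.node("*", a, first_diff))
second_diff = dag.node("-", b, c)      # found in the table, no new node
root = dag.node("+", left, dag.node("*", second_diff, d))
```

Both requests for b - c return the same value number, so the common subexpression is stored once.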
Three address code
In three-address code there is at most one operator
on the right side of an instruction
Example: three-address code for the DAG of a+a*(b-c)+(b-c)*d
t1 = b - c
t2 = a * t1
t3 = a + t2
t4 = t1 * d
t5 = t3 + t4
Forms of three address
instructions
x = y op z
x = op y
x = y
goto L
if x goto L and ifFalse x goto L
if x relop y goto L
Procedure calls using:
param x
call p,n
y = call p,n
x = y[i] and x[i] = y
x = &y and x = *y and *x = y
Example
do i = i+1; while (a[i] < v);
Symbolic labels:
L: t1 = i + 1
   i = t1
   t2 = i * 8
   t3 = a[t2]
   if t3 < v goto L
Position numbers:
100: t1 = i + 1
101: i = t1
102: t2 = i * 8
103: t3 = a[t2]
104: if t3 < v goto 100
Data structures for three
address codes
Quadruples
Has four fields: op, arg1, arg2 and result
Triples
Temporaries are not used and instead references to
instructions are made
Indirect triples
In addition to triples we use a list of pointers to triples
Three address code
Example: a = b * minus c + b * minus c
t1 = minus c
t2 = b * t1
t3 = minus c
t4 = b * t3
t5 = t2 + t4
a = t5
Quadruples
op     arg1  arg2  result
minus  c           t1
*      b     t1    t2
minus  c           t3
*      b     t3    t4
+      t2    t4    t5
=      t5          a
Triples
     op     arg1  arg2
0    minus  c
1    *      b     (0)
2    minus  c
3    *      b     (2)
4    +      (1)   (3)
5    =      a     (4)
Indirect Triples
instruction list        op     arg1  arg2
35  (0)            0    minus  c
36  (1)            1    *      b     (0)
37  (2)            2    minus  c
38  (3)            3    *      b     (2)
39  (4)            4    +      (1)   (3)
40  (5)            5    =      a     (4)
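The relation between the first two tables can be sketched as a small conversion. This is an illustration under stated assumptions: the function name quads_to_triples is hypothetical, every result other than the target of a copy is assumed to be a compiler temporary defined exactly once, and references are written as strings such as "(0)" to match the table.

```python
# Quadruples for a = b * minus c + b * minus c: (op, arg1, arg2, result)
quads = [
    ("minus", "c", None, "t1"),
    ("*", "b", "t1", "t2"),
    ("minus", "c", None, "t3"),
    ("*", "b", "t3", "t4"),
    ("+", "t2", "t4", "t5"),
    ("=", "t5", None, "a"),
]


def quads_to_triples(quads):
    """Replace each temporary by a reference to the triple that computes it."""
    defined_at = {}   # temporary name -> position of its defining triple
    triples = []

    def ref(arg):
        return f"({defined_at[arg]})" if arg in defined_at else arg

    for op, arg1, arg2, result in quads:
        if op == "=":
            # A copy names its target explicitly: (=, target, source)
            triples.append((op, result, ref(arg1)))
        else:
            triples.append((op, ref(arg1), ref(arg2)))
            defined_at[result] = len(triples) - 1
    return triples


triples = quads_to_triples(quads)
# For indirect triples, the execution order lives in a separate list,
# so reordering instructions only permutes this list.
instruction_list = list(range(len(triples)))
```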
Type Expressions
Example: int[2][3]
array(2,array(3,integer))
A basic type is a type expression
A type name is a type expression
A type expression can be formed by applying the array type constructor
to a number and a type expression.
A record is a data structure with named fields
A type expression can be formed by using the type constructor for
function types
If s and t are type expressions, then their Cartesian product s*t is a type
expression
Type expressions may contain variables whose values are type
expressions
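As a small illustration of the array constructor, the type expression for int[2][3] can be built from a basic type and a list of dimensions. The class names Basic and Array and the helper array_type are hypothetical names chosen for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Basic:
    name: str

    def __str__(self):
        return self.name


@dataclass(frozen=True)
class Array:
    size: int
    elem: object   # another type expression

    def __str__(self):
        return f"array({self.size},{self.elem})"


def array_type(base, dims):
    """Apply the array constructor from the innermost dimension outwards."""
    t = base
    for n in reversed(dims):
        t = Array(n, t)
    return t


t = array_type(Basic("integer"), [2, 3])
print(t)   # array(2,array(3,integer))
```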
Where We Are
Source Code
Lexical Analysis
Syntax Analysis
Semantic Analysis
IR Generation
IR Optimization
Code Generation
Optimization
Machine Code
Three-Address Code
● Or “TAC”
● High-level assembly where each operation has
at most three operands.
● Uses explicit runtime stack for function calls.
● Uses vtables for dynamic dispatch.
Sample TAC Code
Source:
int x;
int y;
int x2 = x * x;
int y2 = y * y;
int r2 = x2 + y2;
TAC:
x2 = x * x;
y2 = y * y;
r2 = x2 + y2;
Sample TAC Code
Source:
int a;
int b;
int c;
int d;
a = b + c + d;
b = a * a + b * b;
TAC:
_t0 = b + c;
a = _t0 + d;
_t1 = a * a;
_t2 = b * b;
b = _t1 + _t2;
Temporary Variables
● The “three” in “three-address code” refers to
the number of operands in any instruction.
● Evaluating an expression with more than three
subexpressions requires the introduction of
temporary variables.
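A minimal sketch of how temporaries arise when an expression is flattened. It assumes the source statement happens to be valid Python syntax so that Python's ast module can parse it, and it handles only names, constants and the binary operators +, -, *, /, %. The function name to_tac is hypothetical.

```python
import ast

OPS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/", ast.Mod: "%"}


def to_tac(statement):
    """Translate one assignment such as 'a = b + c + d' into TAC lines."""
    assign = ast.parse(statement).body[0]
    if not isinstance(assign, ast.Assign) or not isinstance(assign.targets[0], ast.Name):
        raise ValueError("expected a simple assignment")
    target = assign.targets[0].id
    code = []

    def rhs(node):
        # Render 'left op right' after reducing both sides to operands.
        if not isinstance(node, ast.BinOp) or type(node.op) not in OPS:
            raise ValueError("unsupported expression")
        left, right = operand(node.left), operand(node.right)
        return f"{left} {OPS[type(node.op)]} {right}"

    def operand(node):
        # Names and constants are already valid TAC operands.
        if isinstance(node, ast.Name):
            return node.id
        if isinstance(node, ast.Constant):
            return str(node.value)
        # Any other subexpression gets its own temporary. Every line emitted
        # so far defines one temporary, so len(code) is the next free number.
        text = rhs(node)
        temp = f"_t{len(code)}"
        code.append(f"{temp} = {text};")
        return temp

    value = assign.value
    if isinstance(value, (ast.Name, ast.Constant)):
        code.append(f"{target} = {operand(value)};")
    else:
        code.append(f"{target} = {rhs(value)};")
    return code


print("\n".join(to_tac("a = b + c + d")))
```

Temporaries are numbered from _t0 for each statement, and a constant is used directly as an operand instead of being copied into a temporary first.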
Sample TAC Code
Source:
int a;
int b;
a = 5 + 2 * b;
TAC:
_t0 = 5;
_t1 = 2 * b;
a = _t0 + _t1;
TAC allows for instructions with two operands.
Simple TAC Instructions
● Variable assignment allows assignments of the form
● var = constant;
● var1 = var2;
● var1 = var2 op var3;
● var1 = constant op var2;
● var1 = var2 op constant;
● var = constant1 op constant2;
● Permitted operators are +, -, *, /, %.
● How would you compile y = -x; ?
y = 0 - x;
y = -1 * x;
One More with bools
Source:
int x;
int y;
bool b1;
bool b2;
bool b3;
b1 = x + x < y;
b2 = x + x == y;
b3 = x + x > y;
TAC:
_t0 = x + x;
_t1 = y;
b1 = _t0 < _t1;
_t2 = x + x;
_t3 = y;
b2 = _t2 == _t3;
_t4 = x + x;
_t5 = y;
b3 = _t5 < _t4;
TAC with bools
● Boolean variables are represented as integers
that have zero or nonzero values.
● In addition to the arithmetic operators, TAC
supports <, ==, ||, and &&.
● How might you compile b = (x <= y) ?
_t0 = x < y;
_t1 = x == y;
b = _t0 || _t1;
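The same idea extends to the other comparisons. The sketch below is one possible lowering using only < and ==, not a rule stated on the slides. The function name lower_comparison is hypothetical, and the != case assumes that == yields exactly zero when its operands differ.

```python
def lower_comparison(target, left, op, right, first_temp=0):
    """Return TAC lines computing target = left op right using <, ==, ||."""
    t0, t1 = f"_t{first_temp}", f"_t{first_temp + 1}"
    if op in ("<", "=="):
        return [f"{target} = {left} {op} {right};"]
    if op == ">":
        # x > y is y < x
        return [f"{target} = {right} < {left};"]
    if op == "<=":
        return [f"{t0} = {left} < {right};",
                f"{t1} = {left} == {right};",
                f"{target} = {t0} || {t1};"]
    if op == ">=":
        # x >= y is (y < x) || (x == y)
        return [f"{t0} = {right} < {left};",
                f"{t1} = {left} == {right};",
                f"{target} = {t0} || {t1};"]
    if op == "!=":
        # x != y holds exactly when (x == y) is zero
        return [f"{t0} = {left} == {right};",
                f"{target} = {t0} == 0;"]
    raise ValueError(f"unknown comparison {op}")


print("\n".join(lower_comparison("b", "x", "<=", "y")))
```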
Control Flow Statements
Source:
int x;
int y;
int z;
if (x < y)
    z = x;
else
    z = y;
z = z * z;
TAC:
_t0 = x < y;
IfZ _t0 Goto _L0;
z = x;
Goto _L1;
_L0:
z = y;
_L1:
z = z * z;
Labels
● TAC allows for named labels indicating
particular points in the code that can be
jumped to.
● There are two control flow instructions:
● Goto label;
● IfZ value Goto label;
● Note that IfZ is always paired with Goto.
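A sketch of how a code generator might lay out labels for an if/else statement using only these two instructions. The helper names new_label and gen_if_else are hypothetical, and the condition is assumed to have been evaluated into a variable already.

```python
import itertools

_label_ids = itertools.count()


def new_label():
    """Return a fresh label name: _L0, _L1, ..."""
    return f"_L{next(_label_ids)}"


def gen_if_else(cond, then_code, else_code):
    """Emit TAC for: if (cond) then_code else else_code."""
    else_label, end_label = new_label(), new_label()
    return (
        [f"IfZ {cond} Goto {else_label};"]   # skip the then-branch when cond is zero
        + then_code
        + [f"Goto {end_label};",             # skip the else-branch after the then-branch
           f"{else_label}:"]
        + else_code
        + [f"{end_label}:"]
    )


tac = ["_t0 = x < y;"] + gen_if_else("_t0", ["z = x;"], ["z = y;"]) + ["z = z * z;"]
print("\n".join(tac))
```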
Control Flow Statements
Source:
int x;
int y;
while (x < y) {
    x = x * 2;
}
y = x;
TAC:
_L0:
_t0 = x < y;
IfZ _t0 Goto _L1;
x = x * 2;
Goto _L0;
_L1:
y = x;
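To check that the translated loop behaves like the source loop, the TAC can be executed by a tiny interpreter. This is a sketch for a small subset only (labels, Goto, IfZ, copies, and binary operations on integers), it assumes tokens are separated by spaces, and the function name run_tac is hypothetical.

```python
def run_tac(program, env):
    """Execute a small TAC subset and return the final variable values."""
    lines = [l.strip().rstrip(";") for l in program.strip().splitlines() if l.strip()]
    labels = {l[:-1]: i for i, l in enumerate(lines) if l.endswith(":")}
    ops = {
        "+": lambda a, b: a + b,
        "-": lambda a, b: a - b,
        "*": lambda a, b: a * b,
        "<": lambda a, b: int(a < b),
        "==": lambda a, b: int(a == b),
    }

    def value(token):
        # A token is either a known variable or an integer constant.
        return env[token] if token in env else int(token)

    pc = 0
    while pc < len(lines):
        parts = lines[pc].split()
        pc += 1
        if parts[0].endswith(":"):        # a label marks a position, nothing to do
            continue
        if parts[0] == "Goto":
            pc = labels[parts[1]]
        elif parts[0] == "IfZ":           # IfZ value Goto label
            if value(parts[1]) == 0:
                pc = labels[parts[3]]
        elif len(parts) == 3:             # x = y
            env[parts[0]] = value(parts[2])
        else:                             # x = y op z
            env[parts[0]] = ops[parts[3]](value(parts[2]), value(parts[4]))
    return env


loop = """
_L0:
    _t0 = x < y;
    IfZ _t0 Goto _L1;
    x = x * 2;
    Goto _L0;
_L1:
    y = x;
"""
result = run_tac(loop, {"x": 3, "y": 20})
```

Starting from x = 3 and y = 20, x doubles to 6, 12 and 24, the test x < y then fails, and y receives 24.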
A Complete Decaf
Program
Source:
void main() {
    int x, y;
    int m2 = x * x + y * y;
    while (m2 > 5) {
        m2 = m2 - x;
    }
}
TAC:
main:
BeginFunc 24;
_t0 = x * x;
_t1 = y * y;
m2 = _t0 + _t1;
_L0:
_t2 = 5 < m2;
IfZ _t2 Goto _L1;
m2 = m2 - x;
Goto _L0;
_L1:
EndFunc;
Control Flow
Boolean expressions are often used to:
Alter the flow of control.
Compute logical values.
Short-Circuit Code
Flow-of-Control Statements
Syntax-directed definition
Generating three-address code for booleans
Translation of a simple if-statement
Backpatching
The previous translation schemes for Boolean expressions insert symbolic
labels for jumps
They therefore need a separate pass to bind the labels to addresses
Backpatching avoids this second pass: a jump is emitted with its target
left blank, and the target is filled in once it is known
We assume instructions are saved into an array and labels are
indices into that array
For nonterminal B we use two attributes, B.truelist and B.falselist,
together with the following functions:
makelist(i): creates a new list containing only i, an index into the array
of instructions
merge(p1,p2): concatenates the lists pointed to by p1 and p2 and returns a
pointer to the concatenated list
backpatch(p,i): inserts i as the target label for each of the instructions
on the list pointed to by p
Backpatching for Boolean Expressions
Annotated parse tree for x < 100 || x > 200 && x != y
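The three list functions can be sketched directly, then applied to x < 100 || x > 200 && x != y in the order a bottom-up parser would reduce it. This is an illustration under stated assumptions: instruction indices start at 0, each instruction is stored as a [text, target] pair with target None until patched, and the helpers emit, relop, and_ and or_ are hypothetical names for the semantic actions.

```python
code = []   # instruction array; an index into it serves as a label


def emit(text):
    code.append([text, None])   # jump target is left blank for now
    return len(code) - 1


def makelist(i):
    return [i]


def merge(p1, p2):
    return p1 + p2


def backpatch(p, target):
    for i in p:
        code[i][1] = target


def relop(x, op, y):
    """B -> E1 rel E2: returns (truelist, falselist)."""
    true_jump = emit(f"if {x} {op} {y} goto")
    false_jump = emit("goto")
    return makelist(true_jump), makelist(false_jump)


def and_(b1, m, b2):
    """B -> B1 && M B2: when B1 is true, continue at m, the start of B2."""
    backpatch(b1[0], m)
    return b2[0], merge(b1[1], b2[1])


def or_(b1, m, b2):
    """B -> B1 || M B2: when B1 is false, continue at m, the start of B2."""
    backpatch(b1[1], m)
    return merge(b1[0], b2[0]), b2[1]


b1 = relop("x", "<", "100")
m1 = len(code)                  # index of the next instruction
b2 = relop("x", ">", "200")
m2 = len(code)
b3 = relop("x", "!=", "y")
truelist, falselist = or_(b1, m1, and_(b2, m2, b3))
```

The jumps still on truelist and falselist stay unpatched until the enclosing statement decides where true and false should lead.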
Flow-of-Control Statements
Translation of a switch-statement