KEMBAR78
Code Generation Part-1 in Compiler Construction | PPTX
Prof Monika Shah (NU) 1
Code Generation
Course : 2CS701/IT794 – Compiler
Construction
Prof Monika Shah
Nirma University
Ref : Ch.9 Compilers Principles, Techniques, and Tools by Alfred Aho, Ravi Sethi, and Jeffrey Ullman
Prof Monika Shah (NU) 2
Glimpse
Part-1
• Introduction
• Code Generation Issues
Prof Monika Shah (Nirma University)
Prof Monika Shah (NU) 3
Machine independent asm to machine dependent
Lexical Analysis
Syntax Analysis
Semantic Analysis
Intermediate Code
Generation
Optimization
Code Generation
Source
Program
Assembly
Code
IR to low-level IR
Symbol
Table
Front
end
Back
end
Optimization
Token stream
AST or Parse tree
AST or Annotted Parse tree
Code Generation
• Final Phase of Compilation
• Produces a semantically equivalent target
program
• Main Functionalities :
• Instruction selection
• Register allocation and assignment
• Instruction ordering
• Functionalities for QoS
• Produce Correct code (semantic
preserving)
• Efficiency
• Effective use of resources
• User Heuristics to generate good, but
suboptimal code
Prof Monika Shah (NU) 4
Major functionalities of Code Generation
• Instruction selection
Choose semantically equivalent target machine instructions for the IR
statement
Choose cost effective instructions to implement the IR statements
• Register allocation and assignment
• Register allocation: selecting the set of variables that will reside in
registers at each point in the program
• Resister assignment: selecting specific register that a variable reside in
• Instruction ordering
Decide schedule of Instruction (Challenging for Parallel )
Prof Monika Shah (NU) 5
Issues in design of Code Generation phase
• Input to Code Generator
• Target Program
• Memory Management
• Instruction Selection
• Register Allocation
• Schedule order of Instructions
• Approaches
Prof Monika Shah (NU) 6
Issues in Code Generation design
Input to Code Generation
• Input : IR + Symbol
• IR format choices
• AST
• Postfix Notation
• Three Address Code
• Assumption :
• Front end phase generate low-level IR
• Syntactic and semantic errors detected by earlier phases
Prof Monika Shah (NU) 7
Issues in Code Generation design
Target Program (Output)
• The back-end code generator of a compiler may generate different forms of
code, depending on the requirements:
• Absolute machine code (executable code)
+ve : Can be placed in fixed location in memory.  immediate executed
• Relocatable machine code (object files for linker)
+ve : allows subprograms to be compiled separately
• Assembly language
+ve : facilitates debugging
+ve: Easy Code Generation using Assembler
• Byte code forms for interpreters (e.g. JVM)
+ve : Platform independence
Prof Monika Shah (NU) 8
The Target Machine
• Implementing code generation requires thorough understanding
of the target machine architecture and its instruction set
• Our (hypothetical) machine:
• Byte-addressable (word = 4 bytes)
• Has n general purpose registers R0, R1, …, Rn-1
• Two-address instructions of the form
op source, destination
Issues in Code Generation design
Target Program (Output)
Prof Monika Shah (NU) 9
The Target Machine: Op-codes and Address Modes
• Op-codes (op), for example
MOV (move content of source to destination)
ADD (add content of source to destination)
SUB (subtract content of source from dest.)
• Address modes
Mode Form Address Added Cost
Absolute M M 1
Register R R 0
Indexed c(R) c+contents(R) 1
Indirect register *R contents(R) 0
Indirect indexed *c(R) contents(c+contents(R)) 1
Literal #c N/A 1
Prof Monika Shah (NU) 10
Instruction Costs
• Machine is a simple, non-super-scalar processor with fixed
instruction costs
• Realistic machines have deep pipelines, I-cache, D-cache,
etc.
• Define the cost of instruction
= 1 + cost(source-mode) + cost(destination-mode)
Prof Monika Shah (NU) 11
Instruction Operation Cost
MOV R0,R1 Store content(R0) into register R1 1
MOV R0,M Store content(R0) into memory location M 2
MOV M,R0 Store content(M) into register R0 2
MOV 4(R0),M Store contents(4+contents(R0)) into M 3
MOV *4(R0),M Store contents(contents(4+contents(R0))) into M 3
MOV #1,R0 Store 1 into R0 2
ADD 4(R0),*12(R1) Add contents(4+contents(R0))
to value at location contents(12+contents(R1)) 3
Examples
Prof Monika Shah (NU)
12
Instruction Selection
• Instruction selection is important to obtain efficient code
• Suppose we translate three-address code
x:=y+z
to: MOV y,R0
ADD z,R0
MOV R0,x
a:=a+1 MOV a,R0
ADD #1,R0
MOV R0,a
ADD #1,a INC a
Cost = 6
Cost = 3 Cost = 2
Better Best
Issues in Code Generation design
Target Program (Output)
Prof Monika Shah (NU) 13
Instruction Selection: Utilizing Addressing Modes
• Suppose we translate a:=b+c into
MOV b,R0
ADD c,R0
MOV R0,a
• Assuming addresses of a, b, and c are stored in R0, R1, and R2
MOV *R1,*R0
ADD *R2,*R0
• Assuming R1 and R2 contain values of b and c
ADD R2,R1
MOV R1,a
Issues in Code Generation design
Prof Monika Shah (NU) 14
Need for Global Machine-Specific Code Optimizations
• Suppose we translate three-address code
x:=y+z
to: MOV y,R0
ADD z,R0
MOV R0,x
• Then, we translate
a:=b+c
d:=a+e
to: MOV a,R0
ADD b,R0
MOV R0,a
MOV a,R0
ADD e,R0
MOV R0,d
Redundant
Issues in Code Generation design
Prof Monika Shah (NU) 15
Register Allocation and Assignment
• Efficient utilization of the limited set of registers is important to
generate good code
• Registers are assigned by
• Register allocation to select the set of variables that will reside in registers at
a point in the code
• Register assignment to pick the specific register that a variable will reside in
• Finding an optimal register assignment in general is NP-complete
Issues in Code Generation design
Prof Monika Shah (NU)
16
Example
t:=a*b
t:=t+a
t:=t/d
MOV a,R1
MUL b,R1
ADD a,R1
DIV d,R1
MOV R1,t
t:=a*b
t:=t+a
t:=t/d
MOV a,R0
MOV R0,R1
MUL b,R1
ADD R0,R1
DIV d,R1
MOV R1,t
{ R1=t } { R0=a, R1=t }
Prof Monika Shah (NU)
17
Choice of Evaluation Order
• When instructions are independent, their evaluation order
can be changed
t1:=a+b
t2:=c+d
t3:=e*t2
t4:=t1-t3
a+b-(c+d)*e
MOV a,R0
ADD b,R0
MOV R0,t1
MOV c,R1
ADD d,R1
MOV e,R0
MUL R1,R0
MOV t1,R1
SUB R0,R1
MOV R1,t4
t2:=c+d
t3:=e*t2
t1:=a+b
t4:=t1-t3
MOV c,R0
ADD d,R0
MOV e,R1
MUL R0,R1
MOV a,R0
ADD b,R0
SUB R1,R0
MOV R0,t4
reorder
Issues in Code Generation design
Prof Monika Shah (NU) 18
• Mapping names in source program to memory address
is done by Code Generator phase
• Labels in three address code need to be converted to
addresses of Instructions
• Back patching technique
Issues in Code Generation design
Memory Mapping

Code Generation Part-1 in Compiler Construction

  • 1.
    Prof Monika Shah(NU) 1 Code Generation Course : 2CS701/IT794 – Compiler Construction Prof Monika Shah Nirma University Ref : Ch.9 Compilers Principles, Techniques, and Tools by Alfred Aho, Ravi Sethi, and Jeffrey Ullman
  • 2.
    Prof Monika Shah(NU) 2 Glimpse Part-1 • Introduction • Code Generation Issues Prof Monika Shah (Nirma University)
  • 3.
    Prof Monika Shah(NU) 3 Machine independent asm to machine dependent Lexical Analysis Syntax Analysis Semantic Analysis Intermediate Code Generation Optimization Code Generation Source Program Assembly Code IR to low-level IR Symbol Table Front end Back end Optimization Token stream AST or Parse tree AST or Annotted Parse tree Code Generation • Final Phase of Compilation • Produces a semantically equivalent target program • Main Functionalities : • Instruction selection • Register allocation and assignment • Instruction ordering • Functionalities for QoS • Produce Correct code (semantic preserving) • Efficiency • Effective use of resources • User Heuristics to generate good, but suboptimal code
  • 4.
    Prof Monika Shah(NU) 4 Major functionalities of Code Generation • Instruction selection Choose semantically equivalent target machine instructions for the IR statement Choose cost effective instructions to implement the IR statements • Register allocation and assignment • Register allocation: selecting the set of variables that will reside in registers at each point in the program • Resister assignment: selecting specific register that a variable reside in • Instruction ordering Decide schedule of Instruction (Challenging for Parallel )
  • 5.
    Prof Monika Shah(NU) 5 Issues in design of Code Generation phase • Input to Code Generator • Target Program • Memory Management • Instruction Selection • Register Allocation • Schedule order of Instructions • Approaches
  • 6.
    Prof Monika Shah(NU) 6 Issues in Code Generation design Input to Code Generation • Input : IR + Symbol • IR format choices • AST • Postfix Notation • Three Address Code • Assumption : • Front end phase generate low-level IR • Syntactic and semantic errors detected by earlier phases
  • 7.
    Prof Monika Shah(NU) 7 Issues in Code Generation design Target Program (Output) • The back-end code generator of a compiler may generate different forms of code, depending on the requirements: • Absolute machine code (executable code) +ve : Can be placed in fixed location in memory.  immediate executed • Relocatable machine code (object files for linker) +ve : allows subprograms to be compiled separately • Assembly language +ve : facilitates debugging +ve: Easy Code Generation using Assembler • Byte code forms for interpreters (e.g. JVM) +ve : Platform independence
  • 8.
    Prof Monika Shah(NU) 8 The Target Machine • Implementing code generation requires thorough understanding of the target machine architecture and its instruction set • Our (hypothetical) machine: • Byte-addressable (word = 4 bytes) • Has n general purpose registers R0, R1, …, Rn-1 • Two-address instructions of the form op source, destination Issues in Code Generation design Target Program (Output)
  • 9.
    Prof Monika Shah(NU) 9 The Target Machine: Op-codes and Address Modes • Op-codes (op), for example MOV (move content of source to destination) ADD (add content of source to destination) SUB (subtract content of source from dest.) • Address modes Mode Form Address Added Cost Absolute M M 1 Register R R 0 Indexed c(R) c+contents(R) 1 Indirect register *R contents(R) 0 Indirect indexed *c(R) contents(c+contents(R)) 1 Literal #c N/A 1
  • 10.
    Prof Monika Shah(NU) 10 Instruction Costs • Machine is a simple, non-super-scalar processor with fixed instruction costs • Realistic machines have deep pipelines, I-cache, D-cache, etc. • Define the cost of instruction = 1 + cost(source-mode) + cost(destination-mode)
  • 11.
    Prof Monika Shah(NU) 11 Instruction Operation Cost MOV R0,R1 Store content(R0) into register R1 1 MOV R0,M Store content(R0) into memory location M 2 MOV M,R0 Store content(M) into register R0 2 MOV 4(R0),M Store contents(4+contents(R0)) into M 3 MOV *4(R0),M Store contents(contents(4+contents(R0))) into M 3 MOV #1,R0 Store 1 into R0 2 ADD 4(R0),*12(R1) Add contents(4+contents(R0)) to value at location contents(12+contents(R1)) 3 Examples
  • 12.
    Prof Monika Shah(NU) 12 Instruction Selection • Instruction selection is important to obtain efficient code • Suppose we translate three-address code x:=y+z to: MOV y,R0 ADD z,R0 MOV R0,x a:=a+1 MOV a,R0 ADD #1,R0 MOV R0,a ADD #1,a INC a Cost = 6 Cost = 3 Cost = 2 Better Best Issues in Code Generation design Target Program (Output)
  • 13.
    Prof Monika Shah(NU) 13 Instruction Selection: Utilizing Addressing Modes • Suppose we translate a:=b+c into MOV b,R0 ADD c,R0 MOV R0,a • Assuming addresses of a, b, and c are stored in R0, R1, and R2 MOV *R1,*R0 ADD *R2,*R0 • Assuming R1 and R2 contain values of b and c ADD R2,R1 MOV R1,a Issues in Code Generation design
  • 14.
    Prof Monika Shah(NU) 14 Need for Global Machine-Specific Code Optimizations • Suppose we translate three-address code x:=y+z to: MOV y,R0 ADD z,R0 MOV R0,x • Then, we translate a:=b+c d:=a+e to: MOV a,R0 ADD b,R0 MOV R0,a MOV a,R0 ADD e,R0 MOV R0,d Redundant Issues in Code Generation design
  • 15.
    Prof Monika Shah(NU) 15 Register Allocation and Assignment • Efficient utilization of the limited set of registers is important to generate good code • Registers are assigned by • Register allocation to select the set of variables that will reside in registers at a point in the code • Register assignment to pick the specific register that a variable will reside in • Finding an optimal register assignment in general is NP-complete Issues in Code Generation design
  • 16.
    Prof Monika Shah(NU) 16 Example t:=a*b t:=t+a t:=t/d MOV a,R1 MUL b,R1 ADD a,R1 DIV d,R1 MOV R1,t t:=a*b t:=t+a t:=t/d MOV a,R0 MOV R0,R1 MUL b,R1 ADD R0,R1 DIV d,R1 MOV R1,t { R1=t } { R0=a, R1=t }
  • 17.
    Prof Monika Shah(NU) 17 Choice of Evaluation Order • When instructions are independent, their evaluation order can be changed t1:=a+b t2:=c+d t3:=e*t2 t4:=t1-t3 a+b-(c+d)*e MOV a,R0 ADD b,R0 MOV R0,t1 MOV c,R1 ADD d,R1 MOV e,R0 MUL R1,R0 MOV t1,R1 SUB R0,R1 MOV R1,t4 t2:=c+d t3:=e*t2 t1:=a+b t4:=t1-t3 MOV c,R0 ADD d,R0 MOV e,R1 MUL R0,R1 MOV a,R0 ADD b,R0 SUB R1,R0 MOV R0,t4 reorder Issues in Code Generation design
  • 18.
    Prof Monika Shah(NU) 18 • Mapping names in source program to memory address is done by Code Generator phase • Labels in three address code need to be converted to addresses of Instructions • Back patching technique Issues in Code Generation design Memory Mapping