System Software (5KS03)
Unit 1 : Introduction to Compiling
Lecture : 2 Phases of a compiler,
A S Kapse,
Assistant Professor,
Department Of Computer Sci. & Engineering
Anuradha Engineering College, Chikhli
Contents…
 Introduction to Compilers
 Phases of Compiler
Objectives…
 Upon completion of this lecture, you will be able
 To understand the basics of a compiler
 To understand the applications of a compiler
 To understand the phases of a compiler
Review…./ Concepts
 What do you mean by compiler?
 What do you mean by Operating System?
 What do you mean by system?
The Many Phases of a Compiler
Source Program
1. Lexical Analyzer
2. Syntax Analyzer
3. Semantic Analyzer
4. Intermediate Code Generator
5. Code Optimizer
6. Code Generator
Target Program
Symbol-table Manager and Error Handler: interact with all six phases
1, 2, 3 : Analysis - Our Focus
4, 5, 6 : Synthesis
Language-Processing System
Source Program
1. Pre-Processor
2. Compiler
3. Assembler
4. Relocatable Machine Code
5. Loader / Link-Editor (also takes library and relocatable object files as input)
Executable
The Analysis Task For Compilation
 Three Phases:
 Linear / Lexical Analysis:
 Left-to-Right Scan to Identify Tokens
token: sequence of characters having a collective meaning
 Hierarchical Analysis:
 Grouping of Tokens Into Meaningful Collections
 Semantic Analysis:
 Checking to Ensure Correctness of Components
Phase 1. Lexical Analysis
Easiest Analysis - Identify tokens, which are the basic building blocks
For Example:
position := initial + rate * 60 ;
All are tokens: position | := | initial | + | rate | * | 60 | ;
Blanks, line breaks, etc. are scanned out
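The left-to-right scan can be sketched in a few lines of Python. This is only an illustration: the token class names (IDENT, NUMBER, ASSIGN, OP, SEMI) and the function name tokenize are assumptions made for this example and do not come from the lecture.

```python
import re

# Hypothetical token classes for the slide's example; the names are illustrative.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT",  r"[A-Za-z_]\w*"),
    ("ASSIGN", r":="),
    ("OP",     r"[+*]"),
    ("SEMI",   r";"),
    ("SKIP",   r"\s+"),   # blanks and line breaks are scanned out
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def tokenize(text):
    """Scan left to right and return a list of (kind, lexeme) pairs."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at {pos}")
        if m.lastgroup != "SKIP":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens

tokens = tokenize("position := initial + rate * 60 ;")
```

For the example statement the scan yields eight tokens. The white space between them is matched but never emitted.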
Phase 2. Hierarchical Analysis
aka Parsing or Syntax Analysis
For previous example, we would have
Parse Tree:
assignment statement
  identifier: position
  :=
  expression
    expression
      identifier: initial
    +
    expression
      expression
        identifier: rate
      *
      expression
        number: 60
Nodes of tree are constructed using a grammar for the language
What is a Grammar?
 Grammar is a Set of Rules Which Govern the Interdependencies & Structure Among the Tokens
statement is an assignment statement, or while statement, or if statement, or ...
assignment statement is an identifier := expression ;
expression is an (expression), or expression + expression, or expression * expression, or number, or identifier, or ...
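A minimal recursive-descent sketch shows how such rules drive tree construction. The grammar as written is ambiguous (it does not say whether + or * groups first), so this sketch assumes the usual convention that * binds tighter than + and splits expression into term and factor. The class name Parser, the tuple-shaped tree and the tiny tokenize helper are all assumptions made for this example.

```python
import re

def tokenize(text):
    # Minimal scanner for this sketch: identifiers, numbers, ':=' and a few symbols.
    # Characters that match none of these are silently skipped.
    return re.findall(r"[A-Za-z_]\w*|\d+|:=|[+*();]", text)

class Parser:
    """Recursive-descent parser for:
         assignment -> identifier := expression ;
         expression -> term { + term }
         term       -> factor { * factor }
         factor     -> identifier | number | ( expression )
    """
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expect(self, predicate, what):
        tok = self.peek()
        if tok is None or not predicate(tok):
            raise SyntaxError(f"expected {what}, found {tok!r}")
        self.pos += 1
        return tok

    def assignment(self):
        target = self.expect(str.isidentifier, "identifier")
        self.expect(lambda t: t == ":=", "':='")
        value = self.expression()
        self.expect(lambda t: t == ";", "';'")
        return (":=", target, value)

    def expression(self):
        node = self.term()
        while self.peek() == "+":
            self.pos += 1
            node = ("+", node, self.term())
        return node

    def term(self):
        node = self.factor()
        while self.peek() == "*":
            self.pos += 1
            node = ("*", node, self.factor())
        return node

    def factor(self):
        tok = self.peek()
        if tok == "(":
            self.pos += 1
            node = self.expression()   # recursion handles nested expressions
            self.expect(lambda t: t == ")", "')'")
            return node
        if tok is not None and tok.isdigit():
            self.pos += 1
            return int(tok)
        return self.expect(str.isidentifier, "identifier")

tree = Parser(tokenize("position := initial + rate * 60 ;")).assignment()
```

Here tree is the nested tuple (":=", "position", ("+", "initial", ("*", "rate", 60))). This is a compressed form of the parse tree, with the operators as interior nodes.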
Why Have We Divided Analysis in This Manner?
 Lexical Analysis - Scans Input, Its Linear Actions Are Not Recursive
 Identify Only Individual “words” that are the Tokens of the Language
 Recursion Is Required to Identify Structure of an Expression, As Indicated in Parse Tree
 Verify that the “words” are Correctly Assembled into “sentences”
 What is the Third Phase?
 Determine Whether the Sentences have One and Only One Unambiguous Interpretation
 … and do something about it!
 e.g. “John Took Picture of Mary Out on the Patio”
Phase 3. Semantic Analysis
 Find More Complicated Semantic Errors and Support Code Generation
 Parse Tree Is Augmented With Semantic Actions
Compressed Tree:
:=
  position
  +
    initial
    *
      rate
      60
With Conversion Action:
:=
  position
  +
    initial
    *
      rate
      inttoreal
        60
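The conversion action can be sketched as a walk over the compressed tree. The sketch assumes that position, initial and rate are all declared real, which the slide implies but does not state. The table SYMBOL_TYPES and the function annotate are names invented for this illustration.

```python
# Assumed declarations (not stated on the slide): all three identifiers are real.
SYMBOL_TYPES = {"position": "real", "initial": "real", "rate": "real"}

def annotate(node):
    """Return (typed_node, type), inserting inttoreal where an int meets a real."""
    if isinstance(node, int):
        return node, "int"
    if isinstance(node, str):
        return node, SYMBOL_TYPES[node]
    op, left, right = node
    left, ltype = annotate(left)
    right, rtype = annotate(right)
    if op == ":=":
        if ltype == "real" and rtype == "int":
            right = ("inttoreal", right)
        elif ltype != rtype:
            raise TypeError(f"cannot assign {rtype} to {ltype}")
        return (op, left, right), ltype
    if ltype != rtype:
        # Mixed int/real arithmetic: widen the int operand.
        if ltype == "int":
            left = ("inttoreal", left)
        else:
            right = ("inttoreal", right)
        return (op, left, right), "real"
    return (op, left, right), ltype

compressed = (":=", "position", ("+", "initial", ("*", "rate", 60)))
converted, _ = annotate(compressed)
```

After the walk, the literal 60 is wrapped as ("inttoreal", 60) and the rest of the tree is unchanged.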
Phase 3. Semantic Analysis
 Most Important Activity in This Phase:
 Type Checking - Legality of Operands
 Many Different Situations:
Real := int + char ;
A[int] := A[real] + int ;
while char <> int do
…. Etc.
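The three situations above can be sketched as checks on operand types. The rules used here are hypothetical (numeric operators accept only int and real, array subscripts must be int, comparisons need compatible operands). A real language defines its own rules, and the names check_binary, check_index and errors_in are invented for this example.

```python
# Hypothetical typing rules for illustration; a real language defines its own.
NUMERIC = {"int", "real"}

def check_binary(op, left, right):
    """Return the result type of `left op right`, or raise TypeError."""
    if op in {"+", "*"}:
        if left in NUMERIC and right in NUMERIC:
            return "real" if "real" in (left, right) else "int"
        raise TypeError(f"operator {op} not defined for {left} and {right}")
    if op == "<>":
        if left == right or {left, right} <= NUMERIC:
            return "bool"
        raise TypeError(f"cannot compare {left} with {right}")
    raise ValueError(f"unknown operator {op}")

def check_index(index_type):
    """Array subscripts must be integers in this sketch."""
    if index_type != "int":
        raise TypeError(f"array index must be int, not {index_type}")

def errors_in(checks):
    """Run each zero-argument check and collect the error messages."""
    messages = []
    for check in checks:
        try:
            check()
        except TypeError as exc:
            messages.append(str(exc))
    return messages

problems = errors_in([
    lambda: check_binary("+", "int", "char"),   # Real := int + char
    lambda: check_index("real"),                # A[real]
    lambda: check_binary("<>", "char", "int"),  # while char <> int do
    lambda: check_binary("*", "real", "int"),   # legal: no message
])
```

Under these assumed rules the first three checks each report an error and the last one passes.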
Supporting Phases/
Activities for Analysis
 Symbol Table Creation / Maintenance
 Contains Info (storage, type, scope, args) on Each
“Meaningful” Token, Typically Identifiers
 Data Structure Created / Initialized During Lexical
Analysis
 Utilized / Updated During Later Analysis & Synthesis
 Error Handling
 Detection of Different Errors Which Correspond to All
Phases
 What Kinds of Errors Are Found During the Analysis
Phase?
 What Happens When an Error Is Found?
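A symbol table holding the information listed above (storage, type, scope, args) can be sketched as a stack of dictionaries, one per scope. This is only one possible layout. The class names Symbol and SymbolTable and their fields are assumptions made for this example.

```python
from dataclasses import dataclass, field

@dataclass
class Symbol:
    name: str
    type: str
    storage: str = "local"                    # e.g. "global", "local"
    args: list = field(default_factory=list)  # parameter types, for procedures

class SymbolTable:
    """A stack of scopes; the innermost scope is searched first."""
    def __init__(self):
        self.scopes = [{}]

    def enter_scope(self):
        self.scopes.append({})

    def exit_scope(self):
        self.scopes.pop()

    def declare(self, symbol):
        scope = self.scopes[-1]
        if symbol.name in scope:
            raise NameError(f"{symbol.name} already declared in this scope")
        scope[symbol.name] = symbol

    def lookup(self, name):
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        raise NameError(f"{name} is not declared")

table = SymbolTable()
table.declare(Symbol("rate", "real", storage="global"))
table.enter_scope()
table.declare(Symbol("rate", "int"))   # shadows the outer declaration
inner = table.lookup("rate").type
table.exit_scope()
outer = table.lookup("rate").type
```

The two NameError cases (redeclaration and use of an undeclared name) are examples of errors that the analysis phases would pass to the error handler.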
The Synthesis Task For Compilation
 Intermediate Code Generation
 Abstract Machine Version of Code - Independent of Architecture
 Easy to Produce, and Easy to Translate During Final, Machine-Dependent Code Generation
 Code Optimization
 Find More Efficient Ways to Execute Code
 Replace Code With More Efficient Statements
 Two Approaches: High-Level Language Optimization & “Peephole” Optimization
 Final Code Generation
 Generate Relocatable, Machine-Dependent Code
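One common architecture-independent form is three-address code, where each instruction has at most one operator. The sketch below flattens the running example into that form. The tuple-shaped tree, the temporary names t1, t2, ... and the function to_three_address are assumptions made for this illustration, not the lecture's notation.

```python
import itertools

def to_three_address(tree):
    """Flatten a nested (op, left, right) tree into three-address instructions."""
    code = []
    temps = (f"t{i}" for i in itertools.count(1))

    def emit(node):
        if not isinstance(node, tuple):
            return str(node)                 # identifier or literal
        if node[0] == "inttoreal":
            operand = emit(node[1])
            result = next(temps)
            code.append(f"{result} := inttoreal({operand})")
            return result
        op, left, right = node
        if op == ":=":
            code.append(f"{left} := {emit(right)}")
            return left
        lhs, rhs = emit(left), emit(right)   # operands first, then the operator
        result = next(temps)
        code.append(f"{result} := {lhs} {op} {rhs}")
        return result

    emit(tree)
    return code

tree = (":=", "position", ("+", "initial", ("*", "rate", ("inttoreal", 60))))
code = to_three_address(tree)
```

For the example the result is four instructions: t1 := inttoreal(60), t2 := rate * t1, t3 := initial + t2, position := t3.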
The Structure of a Compiler (2)
Source Program (Character Stream)
Scanner: produces Tokens
Parser: produces Syntactic Structure
Semantic Routines: produce Intermediate Representation
Optimizer: improves the Intermediate Representation
Code Generator: produces Target machine code
Symbol and Attribute Tables (Used by all Phases of The Compiler)
The Structure of a Compiler (3)
Scanner
 The scanner begins the analysis of the source program by reading the input, character by character, and grouping characters into individual words and symbols (tokens)
 RE ( Regular Expression )
 NFA ( Non-deterministic Finite Automaton )
 DFA ( Deterministic Finite Automaton )
 LEX
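Character-by-character grouping can be sketched with a small hand-written DFA that recognizes identifiers and numbers. Scanner generators such as LEX build a similar table from regular expressions. The state names, the transition table and the single-character SYMBOL fallback are assumptions made for this illustration.

```python
def char_class(ch):
    if ch.isalpha() or ch == "_":
        return "letter"
    if ch.isdigit():
        return "digit"
    return "other"

# States: "start", "ident", "number". Missing entries mean "no transition".
TRANSITIONS = {
    ("start", "letter"): "ident",
    ("start", "digit"): "number",
    ("ident", "letter"): "ident",
    ("ident", "digit"): "ident",
    ("number", "digit"): "number",
}
ACCEPTING = {"ident": "IDENT", "number": "NUMBER"}

def longest_match(text, start):
    """Run the DFA from `start`; return (kind, end) for the longest accepted prefix."""
    state, pos = "start", start
    last_accept = None
    while pos < len(text):
        nxt = TRANSITIONS.get((state, char_class(text[pos])))
        if nxt is None:
            break
        state, pos = nxt, pos + 1
        if state in ACCEPTING:
            last_accept = (ACCEPTING[state], pos)
    return last_accept

def scan(text):
    tokens, pos = [], 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        match = longest_match(text, pos)
        if match is None:
            tokens.append(("SYMBOL", text[pos]))  # single-character fallback
            pos += 1
        else:
            kind, end = match
            tokens.append((kind, text[pos:end]))
            pos = end
    return tokens

tokens = scan("rate2 * 60")
```

Because the DFA keeps consuming while a transition exists, rate2 is read as one identifier and not as rate followed by 2.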
The Structure of a Compiler (4)
Parser
 Given a formal syntax specification (typically as a context-free grammar [CFG]), the parser reads tokens and groups them into units as specified by the productions of the CFG being used.
 As syntactic structure is recognized, the parser either calls corresponding semantic routines directly or builds a syntax tree.
 CFG ( Context-Free Grammar )
 BNF ( Backus-Naur Form )
 GAA ( Grammar Analysis Algorithms )
 LL, LR, SLR, LALR Parsers
 YACC
The Structure of a Compiler (5)
Semantic Routines
 Perform two functions
 Check the static semantics of each construct
 Do the actual translation
 The heart of a compiler
 Syntax Directed Translation
 Semantic Processing Techniques
 IR (Intermediate Representation)
The Structure of a Compiler (6)
Optimizer
 The IR code generated by the semantic routines is analyzed and transformed into functionally equivalent but improved IR code
 This phase can be very complex and slow
 Peephole optimization
 Loop optimization, register allocation, code scheduling
 Register and Temporary Management
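Two small IR improvements can be sketched on the running example: doing the inttoreal conversion at compile time, and a peephole pass that removes a copy from a temporary. Instructions are modelled as (dest, op, arg1, arg2) tuples and temporaries are assumed to be named t1, t2, ... and used only once. These conventions and the function names are invented for this illustration.

```python
def is_temp(name):
    # Assumed naming convention: temporaries look like t1, t2, ...
    return isinstance(name, str) and name[:1] == "t" and name[1:].isdigit()

def fold_conversions(code):
    """Drop `t := inttoreal(<int literal>)` and use the real literal directly."""
    constants, out = {}, []
    for dest, op, a, b in code:
        a, b = constants.get(a, a), constants.get(b, b)
        if op == "inttoreal" and isinstance(a, int):
            constants[dest] = float(a)      # conversion done at compile time
        else:
            out.append((dest, op, a, b))
    return out

def remove_copies(code):
    """Peephole over adjacent pairs: `t := x op y; v := t` becomes `v := x op y`.

    Safe only under the assumption that temporary t has no later use.
    """
    out = []
    for instr in code:
        dest, op, a, b = instr
        if op == "copy" and out and out[-1][0] == a and is_temp(a):
            _, prev_op, prev_a, prev_b = out.pop()
            out.append((dest, prev_op, prev_a, prev_b))
        else:
            out.append(instr)
    return out

code = [
    ("t1", "inttoreal", 60, None),
    ("t2", "*", "rate", "t1"),
    ("t3", "+", "initial", "t2"),
    ("position", "copy", "t3", None),
]
optimized = remove_copies(fold_conversions(code))
```

The four instructions shrink to two: t2 := rate * 60.0 and position := initial + t2. Both versions compute the same value for position.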
The Structure of a Compiler (7)
Code Generator
 Interpretive Code Generation
 Generating Code from Tree/DAG
 Grammar-Based Code Generator
The Structure of a Compiler (8)
Scanner [Lexical Analyzer]: produces Tokens
Parser [Syntax Analyzer]: produces Parse tree
Semantic Process [Semantic Analyzer]: produces Abstract Syntax Tree w/ Attributes
Code Generator [Intermediate Code Generator]: produces Non-optimized Intermediate Code
Code Optimizer: produces Optimized Intermediate Code
Code Generator: produces Target machine code
Video on Compilers
1. Introduction to Compiler
2. Application of Phases of Compiler
Questions..
1. Define a compiler.
2. List a few applications of compilers.
3. Explain the phases of a compiler.
4. What is meant by a token?
Homework..
1. What is a parser?
2. What is meant by analysis and synthesis?
3. Describe the following example.
area = pi * r * r + 45
