Compiler Construction
Lecture#1
Introduction
Course Description
The course is intended to teach the students the basic techniques that
underlie the practice of Compiler Construction. It will introduce the theory
and tools that can be standarly employed in order to perform syntax-directed
translation of a high-level programming language into an executable code by
understanding:
Lexical analysis, including regular languages and finite-state acceptors.
Syntactic analysis, including parsing techniques and grammars.
Code generation and optimization.
2
Textbook/References
Compilers: Principles, Techniques, and Tools
(2nd Edition)
3
Course Goals:
Understanding of the organization of a compiler
Understanding of the concepts of scanning, parsing, and translation
Understanding of Compiler writing tools.
4
Topics To be Covered (Not in order)
Introduction to compilers structure & goals
Simple compiler structure
Grammar, parse tree, and ambiguous grammar
Translation schemes
Context-free grammar & parsing
Introduction to left recursion and right recursion
Lexical analyzer (language, errors, pattern specifications)
Operations on languages and regular expressions
Finite automata
Parsers and errors and sentential error
Left recursion and left factoring
5
Grades! Grades! Grades!
Percentage of
Week
# Assessment task* Total Assessment
Due Score
1 Quiz 1 3 10
2 Quiz 2 8 10
3 Project 1 9 30
5 Mid-term Exam 6 20
6 Final Exam 11 30
6
Why we need to know compilers?
• Seeing the development of a compiler gives you a
feeling for how programs work.
• A great example of interplay between theory and
practice.
• Many algorithms and models you will use in compilers
are fundamental, and will be useful to you elsewhere:
• automata, regular expressions (lexing)
• context-free grammars, trees (parsing)
• hash tables (symbol table)
• dynamic programming, graph coloring (code gen.)
7
Computer Organization
Applications
Compiler
Operating System
Hardware
8
History, Programming Languages
Machine coding (binary programming – punch holes)
(first generation)
The computer’s ‘native language’, binary digits (0s, 1s)
0100 0001 0110 1110 0100 0001
0001 0010 1100 0100 0000 1101
Assembly Language (second generation)
One-to-one correspondence to machine language
MOV AX, 5h
MOV DX, 3h
ADD AX
Assembler: translates assembly language programs into machine language
9
History, Programming Languages (High-Level
Languages)
Procedural Languages (third generation)
* Instructions translate into machine language instructions
* Uses common words rather than abbreviated mnemonics
C, C++, Java, Fortran, QuickBasic
A= 3
B= A * 2 - 1
D= A / B + A^5
Compiler - translates the entire program at once
Interpreter - translates and executes one source program statement at a time
10
What Do Compilers Do
A compiler acts as a translator,
transforming human-oriented programming languages
into computer-oriented machine languages.
Ignore machine-dependent details for programmer
Programming Machine
Language Compiler Language
(Source) (Target)
11
What Do Compilers Do
Compilers may generate three types of code:
Pure Machine Code
Machine instruction set without assuming the existence of any
operating system or library.
Mostly being OS or embedded applications.
Augmented Machine Code
Code with OS routines and runtime support routines.
More often
Virtual Machine Code
Virtual instructions, can be run on any architecture with a virtual
machine interpreter or a just-in-time compiler
Ex. Java
12
What is a Compiler?
A compiler is program that reads a program written in one
language (source language) and translates it into an equivalent
program in another language (target language).
13
Why do we need Compilers?
Machines understand only 1’s and 0’s. High-level languages,
make it easier for the user to program in, but not for the
machine to understand.
Once the programmer has written and edited the
program (in an Editor), it needs to be translated into
machine language (1’s and 0’s) before it can be executed.
compilers are used to do this conversion
14
Where are compilers used for?
Implementation of programming languages
C, C++, Java, Lisp, Prolog, SML, Haskell, Ada, Fortran
Document processing
DVI → PostScript,
Word documents → PDF
Natural language processing
NL → database query language → database commands
Hardware design
silicon compilers, CAD data → machine operations, equipment lists
Report generation
CAD data → list of parts,
All kinds of input/output translations
various UNIX text filters, . . .
15
Compilers & Interpreters
* Interpreters are another class of translators
* Compiler: translates a program once and for all into target language.
C++
* Interpreter: effectively translates a source program every time it is run.
Basic
* Compilers and interpreters (highbred) are used together
Java
→ Java compiled into Java byte code,
→ byte code interpreted by a Java Virtual Machine (JVM).
16
Compiler / Translator and Interpreter
A translator is used to produce an “equivalent” program in
another language (e.g. from C to Pascal)
Compiler is a translator that generally takes in a higher level
language (e.g. C) and transforms it into a low level language
(usually object or machine code).
Compiler/Translator produce the entire output code before
executing
Interpreter compiles and executes a statement at a time before
moving on to the next statement
17
The Structure of a Compiler (1)
Any compiler must perform two major tasks
Compiler
Analysis Synthesis
Analysis of the source program
Synthesis of a machine-language program
18
The Structure of a Compiler (2)
Source
Program Tokens Syntactic Semantic
Scanner Parser
(Character Stream) Structure Routines
Intermediate
Representation
Symbol and Optimizer
Attribute
Tables
(Used by all Phases of The Compiler)
Code
Generator
19
Target machine code
Compiler writing tools
Compilergenerators or compiler-
compilers
E.g. scanner and parser generators
Examples : Yacc, Lex
20
Compiler Design Considerations
Debugging Compilers
Designed to aid in the development and debugging of
programs.
Optimizing Compilers
Designed to produce efficient target code
Retargetable Compilers
A compiler that has been designed to be relatively easy to
modify to generate code for different CPU instruction set
architectures.
21
Homework
Blackboard
Have a terrific day
22