Unit-1
Qno. 1:- Explain the various phases of a compiler.
The compilation process is a sequence of phases.
Each phase takes the source program in one representation and produces output in another representation.
The various phases of a compiler are:
Lexical Analysis
Lexical analysis is the first phase of the compilation process. It takes the source code as input, reads
it one character at a time, and groups the characters into meaningful lexemes. The lexical analyzer
represents these lexemes in the form of tokens.
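As a rough illustration of this phase, the sketch below scans a string and emits (token type, lexeme) pairs. The token categories and the function name tokenize are assumptions made for this example, not a fixed standard.

```python
import re

# Hypothetical token categories for a tiny expression language (an assumption).
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("ID",     r"[A-Za-z_]\w*"),
    ("OP",     r"[+\-*/=()]"),
    ("SKIP",   r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Scan source left to right and return a list of (token_type, lexeme) pairs."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":      # whitespace separates lexemes but is not a token
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


print(tokenize("x = a + b * 42"))
```

Here each lexeme (for example 42) is paired with the token type it belongs to (NUMBER), which is the form the next phase consumes.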
Syntax Analysis
Syntax analysis is the second phase of the compilation process. It takes tokens as input and generates a
parse tree as output. In this phase, the parser checks whether the expression formed by the tokens
is syntactically correct.
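A minimal sketch of such a check is a recursive-descent parser. The grammar used here (expr -> term ('+' term)*, term -> factor ('*' factor)*, factor -> identifier | '(' expr ')') and the nested-tuple tree shape are assumptions chosen for illustration. The tree it returns is simplified, not a full parse tree.

```python
def parse(tokens):
    """Parse a list of token strings; return a nested-tuple tree or raise SyntaxError."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1

    def factor():
        tok = peek()
        if tok == "(":
            advance()
            node = expr()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return node
        if tok is not None and tok.isidentifier():
            advance()
            return tok
        raise SyntaxError(f"unexpected token {tok!r}")

    def term():
        node = factor()
        while peek() == "*":
            advance()
            node = ("*", node, factor())
        return node

    def expr():
        node = term()
        while peek() == "+":
            advance()
            node = ("+", node, term())
        return node

    tree = expr()
    if peek() is not None:                 # leftover tokens mean the input is malformed
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree


print(parse(["a", "+", "b", "*", "c"]))
```

A syntactically wrong input such as ["a", "+"] raises SyntaxError instead of returning a tree.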
Semantic Analysis
Semantic analysis is the third phase of the compilation process. It checks whether the parse tree follows the
rules of the language. The semantic analyzer keeps track of identifiers, their types and expressions. The output
of the semantic analysis phase is the annotated syntax tree.
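One way to picture the annotated tree is the sketch below, which attaches a type to every node of a small expression tree. The tree shape, the function name annotate, and the rule that int mixed with float gives float are all assumptions for this example. Real languages define their own typing rules.

```python
def annotate(node, types):
    """Return (annotated_tree, type) for a nested-tuple expression tree.

    Leaves are identifier names; interior nodes are (op, left, right).
    `types` maps each declared identifier to "int" or "float".
    """
    if isinstance(node, str):
        if node not in types:
            raise NameError(f"undeclared identifier {node!r}")
        return (node, types[node]), types[node]
    op, left, right = node
    left_tree, left_type = annotate(left, types)
    right_tree, right_type = annotate(right, types)
    # Assumed promotion rule: the result is float if either operand is float.
    result_type = "float" if "float" in (left_type, right_type) else "int"
    return (op, result_type, left_tree, right_tree), result_type


declared = {"a": "int", "b": "int", "c": "float"}
tree, tree_type = annotate(("+", "a", ("*", "b", "c")), declared)
print(tree_type)
print(tree)
```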
Intermediate Code Generation
In intermediate code generation, the compiler translates the source code into intermediate code.
Intermediate code lies between the high-level language and the machine language. It
should be generated in such a way that it can easily be translated into the target
machine code.
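Three-address code is one common form of intermediate code. The sketch below walks a nested-tuple expression tree and emits one instruction per operator. The temporary names t1, t2 and the function name gen_tac are assumptions for this example.

```python
def gen_tac(tree):
    """Translate a nested-tuple expression tree into three-address instructions."""
    code = []
    counter = 0

    def visit(node):
        nonlocal counter
        if isinstance(node, str):          # a leaf is already a usable name
            return node
        op, left, right = node
        left_name = visit(left)
        right_name = visit(right)
        counter += 1
        temp = f"t{counter}"               # fresh temporary for this operator's result
        code.append(f"{temp} = {left_name} {op} {right_name}")
        return temp

    result = visit(tree)
    return code, result


code, result = gen_tac(("+", "a", ("*", "b", "c")))
for line in code:
    print(line)
```

For a+b*c this emits t1 = b * c followed by t2 = a + t1, and each line maps easily onto one or two machine instructions.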
Code Optimization
Code optimization is an optional phase. It improves the intermediate code so that the generated
program runs faster and takes less space. It removes unnecessary lines of code and
rearranges the sequence of statements in order to speed up program execution.
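Constant folding is one simple optimization that removes unnecessary instructions. The sketch below assumes straight-line code in which every name is assigned once and folded names are only temporaries. The (dest, op, arg1, arg2) instruction layout is also an assumption for this example.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def fold_constants(code):
    """Constant folding with propagation over straight-line (dest, op, a, b) code."""
    known = {}       # names whose value is a compile-time constant
    optimized = []
    for dest, op, a, b in code:
        a = known.get(a, a)                # replace a name by its constant if known
        b = known.get(b, b)
        if isinstance(a, int) and isinstance(b, int):
            known[dest] = OPS[op](a, b)    # computed at compile time, no instruction emitted
        else:
            known.pop(dest, None)
            optimized.append((dest, op, a, b))
    return optimized, known


before = [("t1", "*", 2, 3), ("t2", "+", "a", "t1"), ("t3", "*", "t2", "c")]
after, constants = fold_constants(before)
print(after)
print(constants)
```

The instruction that computed t1 disappears, and its value 6 is used directly in the instruction for t2.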
Qno. 3:- Describe the various data structures used in a compiler.
A compiler is a program that converts an HLL (High-Level Language) into an LLL (Low-Level Language) such as
machine-level language.
The main data structures used in compilers are:-
Tokens
Syntax Tree
Symbol Table
Literal Table
Parse Tree
Intermediate Code
1. Tokens
When a scanner reads the input and groups a stream of characters into a token, it represents
the token symbolically, typically as a value of an enumerated data type covering the set of tokens in
the source language. It is also important to keep the character string itself and the information derived from it.
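A token record along these lines could look like the sketch below. The names TokenType and Token, and the particular token kinds listed, are assumptions for illustration.

```python
from dataclasses import dataclass
from enum import Enum, auto


class TokenType(Enum):
    """Enumerated type for the set of tokens (a hypothetical, very small set)."""
    ID = auto()
    NUMBER = auto()
    PLUS = auto()
    TIMES = auto()


@dataclass(frozen=True)
class Token:
    kind: TokenType        # symbolic token class
    lexeme: str            # the original character string
    value: object = None   # information derived from the lexeme, e.g. a numeric value


tok = Token(TokenType.NUMBER, "42", 42)
print(tok)
```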
2. Syntax Tree
A syntax tree is a tree data structure in which each leaf node represents an operand and each interior node
represents an operator. It is a dynamically allocated, pointer-based tree data structure that is built as
parsing proceeds. When the parser generates the syntax tree, its output is already in tree form.
For example:- Syntax tree for a+b*c
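The tree for a+b*c can be sketched in code as follows. The Node class and the postfix helper are assumptions made for this example. Because * binds tighter than +, the * node sits below the + node.

```python
#       +
#      / \
#     a   *
#        / \
#       b   c

class Node:
    """One node of a syntax tree; leaves have no children."""

    def __init__(self, label, left=None, right=None):
        self.label = label
        self.left = left
        self.right = right


def postfix(node):
    """Post-order traversal: operands first, then the operator."""
    if node.left is None and node.right is None:
        return node.label
    return f"{postfix(node.left)} {postfix(node.right)} {node.label}"


tree = Node("+", Node("a"), Node("*", Node("b"), Node("c")))
print(postfix(tree))
```

Leaves hold the operands a, b and c, and the two interior nodes hold the operators.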
3. Symbol Table
The symbol table is a data structure that is used to keep information about identifiers, functions,
variables, constants, and data types. The compiler creates and maintains it to record
information about each occurrence of these entities.
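A symbol table is often built as a stack of dictionaries, one per scope. The sketch below follows that idea. The class name, method names and the stored fields (kind, type) are assumptions for this example.

```python
class SymbolTable:
    """Scoped symbol table: the innermost scope is the last dictionary in the list."""

    def __init__(self):
        self.scopes = [{}]                 # start with the global scope

    def enter_scope(self):
        self.scopes.append({})

    def exit_scope(self):
        self.scopes.pop()

    def declare(self, name, kind, type_name):
        scope = self.scopes[-1]
        if name in scope:
            raise NameError(f"{name!r} is already declared in this scope")
        scope[name] = {"kind": kind, "type": type_name}

    def lookup(self, name):
        """Search from the innermost scope outwards; return None if not found."""
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        return None


table = SymbolTable()
table.declare("count", "variable", "int")
table.enter_scope()
table.declare("count", "variable", "float")   # shadows the outer declaration
print(table.lookup("count"))
table.exit_scope()
print(table.lookup("count"))
```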
4. Literal Table
A literal table is a data structure that is used to keep track of the literals in the program. It holds
the constants and strings used in the program. Each literal appears only once in the literal table, and its
contents apply to the whole program.
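The "appears only once" property can be sketched as below. The class name LiteralTable and the method name intern are assumptions for this example. The type name is part of the key so that, for instance, 1 and 1.0 stay separate entries.

```python
class LiteralTable:
    """Stores each distinct literal once and hands out its index."""

    def __init__(self):
        self._index = {}     # (type name, value) -> position in entries
        self.entries = []

    def intern(self, value):
        key = (type(value).__name__, value)
        if key not in self._index:
            self._index[key] = len(self.entries)
            self.entries.append(value)
        return self._index[key]


literals = LiteralTable()
first = literals.intern("hello")
second = literals.intern(3.14)
again = literals.intern("hello")     # same index as the first "hello"
print(first, second, again)
print(literals.entries)
```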
5. Parse Tree
A parse tree is a hierarchical representation of the symbols used in a derivation. The symbols are terminals or
non-terminals. In the parse tree the string is derived from the start symbol, which is the root
of the tree. All the leaf nodes are terminals and the interior nodes are non-terminals.
Reading the leaves from left to right (an in-order traversal) gives back the input string.
For example:- Parse tree for a+b*c
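The parse tree for a+b*c can be written out as in the sketch below. The grammar E -> E + T | T, T -> T * F | F, F -> identifier is an assumption, since the notes do not fix one. The helper names leaf and frontier are also made up for this example.

```python
# Each node is (symbol, [children]); a leaf (terminal) has an empty child list.
def leaf(symbol):
    return (symbol, [])


parse_tree = ("E", [
    ("E", [("T", [("F", [leaf("a")])])]),
    leaf("+"),
    ("T", [
        ("T", [("F", [leaf("b")])]),
        leaf("*"),
        ("F", [leaf("c")]),
    ]),
])


def frontier(node):
    """Collect the leaves from left to right."""
    symbol, children = node
    if not children:
        return [symbol]
    result = []
    for child in children:
        result.extend(frontier(child))
    return result


print("".join(frontier(parse_tree)))
```

The root is the start symbol E, every interior node is a non-terminal, and the leaves read in order spell the input.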
6. Intermediate Code
Once the intermediate code is generated, it can be stored
as a linked list of structures, a text file, or an array of strings, depending
on the type of intermediate code that is generated.
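Two of these storage forms are sketched below for the expression a+b*c: a list of structures (quadruples) and the same code as an array of strings. The Quad record and its field names are assumptions for this example.

```python
from dataclasses import dataclass


@dataclass
class Quad:
    """One intermediate instruction stored as a structure."""
    op: str
    arg1: str
    arg2: str
    result: str


# Stored as a list of structures.
quads = [Quad("*", "b", "c", "t1"), Quad("+", "a", "t1", "t2")]

# The same code stored as an array of strings.
as_text = [f"{q.result} = {q.arg1} {q.op} {q.arg2}" for q in quads]
print(as_text)
```

The structured form is easier for later phases to inspect and rewrite, while the string form is convenient for writing to a text file.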
Qno. 4:- What are compiler construction tools?
Compiler construction tools
The compiler writer can use some specialized tools that help in implementing various phases of a
compiler.
Some commonly used compiler construction tools include:
Parser Generator – It produces syntax analyzers (parsers) from input that is a grammatical
description of a programming language, usually a context-free grammar. It is useful because
the syntax analysis phase is highly complex and writing a parser by hand takes a lot of
time and effort. Example: Yacc, Bison
Scanner Generator – It generates lexical analyzers from input that consists of regular expression
descriptions of the tokens of a language. It generates a finite automaton to recognize each regular
expression. Example: Lex
Syntax directed translation engines – They generate intermediate code in three-address format from
input that consists of a parse tree. These engines have routines that traverse the parse tree and
produce the intermediate code.
Automatic code generators – They generate the machine language for a target machine. The code generator
takes the intermediate language as input and translates each of its operations using a collection
of rules.
Data-flow analysis engines – They are used in code optimization. Data-flow analysis is a key part of code
optimization that gathers information about how values flow from one part of a program to
another.
Compiler construction toolkits – They provide an integrated set of routines that aid in building compiler
components or in the construction of the various phases of a compiler.
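To give an idea of the finite automaton a scanner generator builds, the sketch below hand-codes a DFA for the regular expression letter (letter | digit)*, the usual pattern for identifiers. The state names and the table layout are assumptions. A real tool such as Lex generates its own tables.

```python
def char_class(ch):
    """Map a character to the input class used by the transition table."""
    if ch.isalpha():
        return "letter"
    if ch.isdigit():
        return "digit"
    return "other"


# Transition table of the DFA: (state, input class) -> next state.
TRANSITIONS = {
    ("start", "letter"): "in_id",
    ("in_id", "letter"): "in_id",
    ("in_id", "digit"): "in_id",
}
ACCEPTING = {"in_id"}


def accepts(text):
    """Run the DFA over text and report whether it ends in an accepting state."""
    state = "start"
    for ch in text:
        state = TRANSITIONS.get((state, char_class(ch)))
        if state is None:                  # no transition defined: reject
            return False
    return state in ACCEPTING


print(accepts("x1"), accepts("1x"))
```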
Qno. 5:- Explain input buffering.
Input buffering is an important technique in compiler design that helps to improve performance
and reduce overhead.
Input buffering is a process used to optimize the reading of input from a source file, such as a program's
source code.
The purpose of input buffering is to reduce the number of I/O operations performed by the compiler,
which can improve the performance and efficiency of the overall compilation process.
By buffering input data, the compiler can access and process larger chunks of input data at once.
Input buffering works by temporarily storing input data in memory before processing
it.
Types of Input Buffers Used in Compiler Design
Simple Input Buffer: This is the most basic type of input buffer, which reads input characters one by one
and returns a token when a delimiter is encountered.
Line Input Buffer: This input buffer reads input characters one line at a time and returns a token when a
complete line has been read.
Block Input Buffer: This input buffer reads input characters in blocks or chunks of fixed size, typically
specified by the programmer.
Lookahead Input Buffer: This input buffer reads input characters in advance, typically one or more
characters ahead of the current position.
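The block and lookahead ideas can be combined as in the sketch below. The class name BlockBuffer, its methods and the reads counter are assumptions for this example, and the block size is kept tiny so that the effect is visible.

```python
import io


class BlockBuffer:
    """Reads a stream in fixed-size blocks and serves characters from memory."""

    def __init__(self, stream, block_size=4096):
        self.stream = stream
        self.block_size = block_size
        self.buffer = ""
        self.pos = 0
        self.reads = 0       # number of read() calls issued on the stream

    def _fill(self, needed):
        # Refill until `needed` characters are available or the stream ends.
        while len(self.buffer) - self.pos < needed:
            chunk = self.stream.read(self.block_size)
            self.reads += 1
            if not chunk:
                break
            self.buffer = self.buffer[self.pos:] + chunk   # drop consumed text, append block
            self.pos = 0

    def peek(self, k=1):
        """Look ahead k characters without consuming them ('' at end of input)."""
        self._fill(k)
        return self.buffer[self.pos:self.pos + k]

    def next_char(self):
        """Consume and return the next character ('' at end of input)."""
        ch = self.peek(1)
        self.pos += len(ch)
        return ch


source = BlockBuffer(io.StringIO("int x;"), block_size=4)
print(source.peek(2))                        # lookahead does not consume input
text = "".join(iter(source.next_char, ""))   # consume until end of input
print(text, source.reads)
```

Six characters are delivered with only three read calls on the stream (two blocks plus the call that detects the end of input), which is the saving in I/O operations that buffering aims for.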
Qno. 6:- Explain the front end and back end of the compiler.
UNIT-2ND