Compiler Design Lab Guide

This document is a laboratory manual for the Compiler Design course (3170701) for B.E. Computer Engineering students at Government Engineering College, Modasa, for the academic year 2023-24. It outlines the objectives, outcomes, and practical exercises aimed at teaching students the theory and implementation of compilers, including tools like Lex and Yacc. The manual also includes guidelines for faculty and students, safety instructions, and a structured index for practical assessments.


Degree Engineering

A Laboratory Manual for

Compiler Design
(3170701)

[ B.E. (Computer Engineering) : Semester - 7 ]

Enrolment No 200160107008
Name Limbani Nirav P.
Branch Computer Engineering
Academic Term 2023-24
Institute Name Government Engineering College, Modasa

Directorate of Technical Education, Gandhinagar,


Gujarat

Government Engineering College, Modasa
Computer Engineering

CERTIFICATE

This is to certify that Mr. Limbani Nirav P. Enrollment

No. 200160107008 of B.E. Semester 7th from Computer Engineering Department of

this Institute (GTU Code: 016) has satisfactorily completed the Practical / Tutorial work for the

subject Compiler Design (3170701) for the academic year 2023-24.

Place:

Date:

Signature of Course Faculty Head of the Department

Prof. A.K. Dodiya Prof. H.R. Patel

Compiler Design (3170701) 200160107008

Preface

Compiler Design is an essential subject for computer science and engineering students. It deals with
the theory and practice of developing a program that can translate source code written in one
programming language into another language. The main objective of this subject is to teach students
how to design and implement a compiler, which is a complex software system that converts high-
level language code into machine code that can be executed on a computer. The design of compilers
is an essential aspect of computer science, as it helps in bridging the gap between human-readable
code and machine-executable code.

This lab manual is designed to help students understand the concepts of compiler design and
develop hands-on skills in building a compiler. The manual provides step-by-step instructions for
implementing a simple compiler using C and other applicable programming languages, covering all
the essential components such as lexical analyzer, parser, symbol table, intermediate code
generator, and code optimizer.

The manual is divided into several sections, each focusing on a specific aspect of compiler design.
The first section provides an introduction to finite automata and the phases of a compiler, covering the
basic concepts of lexical analysis. The subsequent sections cover parsing, code generation, and
study of Learning Basic Block Scheduling. Each section includes detailed instructions for
completing the lab exercises and programming assignments, along with examples and code
snippets.

The lab manual also includes a set of challenging programming assignments and quizzes that will
help students test their understanding of the subject matter. Additionally, the manual provides a list
of recommended books and online resources for further study.

This manual is intended for students studying Compiler Design and related courses. It is also useful
for software developers and engineers who want to gain a deeper understanding of compiler design
and implementation. We hope that this manual will be a valuable resource for students and
instructors alike and will contribute to the learning and understanding of compiler design.


OBJECTIVE:
This laboratory course is intended to make the students experiment on the basic techniques of compiler
construction and tools that can be used to perform syntax-directed translation of a high-level programming
language into an executable code. Students will design and implement language processors in C by using
tools to automate parts of the implementation process. This will provide deeper insights into the more
advanced semantics aspects of programming languages, code generation, machine independent
optimizations, dynamic memory allocation, and object orientation.

OUTCOMES:
Upon the completion of Compiler Design practical course, the student will be able to:
1. Understand the working of lex and yacc compiler for debugging of programs.
2. Understand and define the role of lexical analyzer, use of regular expression and transition diagrams.
3. Understand and use Context free grammar, and parse tree construction.
4. Learn & use the new tools and technologies used for designing a compiler.
5. Develop program for solving parser problems.
6. Learn how to write programs that execute faster.


DTE’s Vision

▪ To provide globally competitive technical education


▪ Remove geographical imbalances and inconsistencies
▪ Develop student friendly resources with a special focus on girls’ education and support to
weaker sections
▪ Develop programs relevant to industry and create a vibrant pool of technical professionals

Institute’s Vision

▪ To create an ecosystem for proliferation of socially responsible and technically sound


engineers, innovators and entrepreneurs.

Institute’s Mission

▪ To develop state-of-the-art laboratories and well-equipped academic infrastructure.


▪ To motivate faculty and staff for qualification up-gradation, and enhancement of subject
knowledge.
▪ To promote research, innovation and real-life problem-solving skills.
▪ To strengthen linkages with industries, academic and research organizations.
▪ To reinforce concern for sustainability, natural resource conservation and social
responsibility.

Department’s Vision

▪ To create an environment for providing value-based education in Computer Engineering


through innovation, team work and ethical practices.

Department’s Mission

▪ To produce computer engineering graduates according to the needs of industry, government,


society and scientific community.
▪ To develop state of the art computing facilities and academic infrastructure.
▪ To develop partnership with industries, government agencies and R & D organizations for
knowledge sharing and overall development of faculties and students.
▪ To solve industrial, governance and societal issues by applying computing techniques.
▪ To create environment for research and entrepreneurship.


Programme Outcomes (POs)

1. Engineering knowledge: Apply the knowledge of mathematics, science, engineering


fundamentals, and an engineering specialization to the solution of complex engineering
problems.
2. Problem analysis: Identify, formulate, review research literature, and analyze complex
engineering problems reaching substantiated conclusions using first principles of mathematics,
natural sciences, and engineering sciences.
3. Design/development of solutions: Design solutions for complex engineering problems and
design system components or processes that meet the specified needs with appropriate
consideration for the public health and safety, and the cultural, societal, and environmental
considerations.
4. Conduct investigations of complex problems: Use research-based knowledge and research
methods including design of experiments, analysis and interpretation of data, and synthesis of
the information to provide valid conclusions.
5. Modern tool usage: Create, select, and apply appropriate techniques, resources, and modern
engineering and IT tools including prediction and modeling to complex engineering activities
with an understanding of the limitations.
6. The engineer and society: Apply reasoning informed by the contextual knowledge to assess
societal, health, safety, legal and cultural issues and the consequent responsibilities relevant to
the professional engineering practice.
7. Environment and sustainability: Understand the impact of the professional engineering
solutions in societal and environmental contexts, and demonstrate the knowledge of, and need
for sustainable development.
8. Ethics: Apply ethical principles and commit to professional ethics and responsibilities and
norms of the engineering practice.
9. Individual and team work: Function effectively as an individual, and as a member or leader
in diverse teams, and in multidisciplinary settings.
10. Communication: Communicate effectively on complex engineering activities with the
engineering community and with society at large, such as, being able to comprehend and write
effective reports and design documentation, make effective presentations, and give and receive
clear instructions.
11. Project management and finance: Demonstrate knowledge and understanding of the
engineering and management principles and apply these to one’s own work, as a member and
leader in a team, to manage projects and in multidisciplinary environments.
12. Life-long learning: Recognize the need for, and have the preparation and ability to engage in
independent and life-long learning in the broadest context of technological change.


Program Specific Outcomes (PSOs)

▪ Sound knowledge of fundamentals of computer science and engineering including software


and hardware.
▪ Develop the software using sound software engineering principles having web based/mobile
based interface.
▪ Use various tools and technology supporting modern software frameworks for solving
problems having large volume of data in the domain of data science and machine learning.

Program Educational Objectives (PEOs)

▪ Possess technical competence in solving real life problems related to Computing.


▪ Acquire good analysis, design, development, implementation and testing skills to formulate
simple computing solutions to the business and societal needs.
▪ Provide requisite skills to pursue entrepreneurship, higher studies, research, and
development and imbibe high degree of professionalism in the fields of computing.
▪ Embrace life-long learning and remain continuously employable.
▪ Work and excel in a highly competitive, supportive, multicultural and professional
environment while abiding by legal and ethical responsibilities.


Practical – Course Outcome matrix

Course Outcomes (COs)


CO_3170701.1: Understand the basic concepts; ability to apply automata theory and knowledge
on formal languages.
CO_3170701.2: Ability to identify and select suitable parsing strategies for a compiler for
various cases. Knowledge of alternative methods (top-down or bottom-up, etc.).
CO_3170701.3: Understand the backend of a compiler: intermediate code, code optimization
techniques and error recovery mechanisms.
CO_3170701.4: Understand issues of run time environments and scheduling for instruction level
parallelism.

Sr. No. | Title of experiment | CO1 | CO2 | CO3 | CO4
Implementation of Finite Automata and String Validation.
1. √

Introduction to Lex Tool. Implement following Programs


Using Lex:
2. a. Generate Histogram of words √
b. Caesar Cipher
c. Extract single and multiline comments from C
Program
Implement following Programs Using Lex:
a. Convert Roman to Decimal
3. b. Check whether given statement is compound or √
simple
c. Extract html tags from .html file

4. Introduction to YACC and generate Calculator Program.

Implement a program for constructing √


5. a. LL(1) Parser
b. Predictive Parser
Implement a program for constructing √
6. a. Recursive Descent Parser (RDP)
b. LALR Parser
Implement a program for constructing Operator Precedence √
7.
Parsing.
Generate 3-tuple intermediate code for given infix √
8.
expression.
Extract Predecessor and Successor from given Control Flow √
9.
Graph.
Study of Learning Basic Block Scheduling Heuristics from
10. √
Optimal Data.


Industry Relevant Skills

Compiler Design is a vital subject in computer science and engineering that focuses on the
design and implementation of compilers. Here are some industry-relevant skills that students
can develop while studying Compiler Design:
• Proficiency in programming languages: A good understanding of programming languages
is essential for building compilers. Students should be proficient in programming
languages such as C/C++, Java, and Python.
• Knowledge of data structures and algorithms: Compiler Design involves the
implementation of various data structures and algorithms. Students should have a good
understanding of data structures such as stacks, queues, trees, and graphs, and algorithms
such as lexical analysis, parsing, and code generation.
• Familiarity with compiler tools: Students should be familiar with compiler tools such as
Lex and Yacc. These tools can help automate the process of creating a compiler, making
it more efficient and error-free.
• Debugging skills: Debugging is an essential skill for any programmer, and it is particularly
important in Compiler Design. Students should be able to use debugging tools to find and
fix errors in their code.
• Optimization techniques: Code optimization is a critical component of Compiler Design.
Students should be familiar with optimization techniques such as constant folding, dead
code elimination, and loop unrolling, which can significantly improve the performance of
the compiled code.
• Collaboration and communication skills: Compiler Design is a complex subject that
requires collaboration and communication between team members. Students should
develop good communication and collaboration skills to work effectively with their peers
and instructors.
By developing these industry-relevant skills, students can become proficient in Compiler
Design and be better equipped to meet the demands of the industry.

Guidelines for Faculty members

1. Teacher should provide the guideline with demonstration of practical to the students with
all features.
2. Teacher shall explain basic concepts/theory related to the experiment to the students before
starting of each practical
3. Involve all the students in performance of each experiment.
4. Teacher is expected to share the skills and competencies to be developed in the students
and ensure that the respective skills and competencies are developed in the students after
the completion of the experimentation.
5. Teachers should give opportunity to students for hands-on experience after the
demonstration.
6. Teacher may provide additional knowledge and skills to the students even though not
covered in the manual but are expected from the students by concerned industry.
7. Give practical assignments and assess the performance of students based on the task assigned,
to check whether it is as per the instructions or not.


8. Teacher is expected to refer complete curriculum of the course and follow the guidelines
for implementation.

Instructions for Students

1. Students are expected to carefully listen to all the theory classes delivered by the faculty
members and understand the COs, content of the course, teaching and examination scheme, skill
set to be developed etc.
2. Students will have to perform experiments considering C or other applicable programming
language using Lex tool or Yacc.
3. Students are instructed to submit practical list as per given sample list shown on next page.
Students have to show output of each program in their practical file.
4. Student should develop a habit of submitting the experimentation work as per the schedule and
she/he should be well prepared for the same.

Common Safety Instructions

Students are expected to

1. Handle equipment with care: When working in the lab, students should handle equipment
and peripherals with care. This includes using the mouse and keyboard gently, avoiding
pulling or twisting network cables, and handling any hardware devices carefully.
2. Avoid water and liquids: Students should avoid using wet hands or having any liquids near
the computer equipment. This will help prevent damage to the devices and avoid any safety
hazards.
3. Shut down the PC properly: At the end of the lab session, students should shut down the
computer properly. This includes closing all programs and applications, saving any work,
and following the correct shutdown procedure for the operating system.
4. Obtain permission for laptops: If a student wishes to use their personal laptop in the lab, they
should first obtain permission from the Lab Faculty or Lab Assistant. They should follow
all lab rules and guidelines and ensure that their laptop is properly configured for the lab
environment.


Index
(Progressive Assessment Sheet)

Sr. No. | Title of experiment | Page No. | Date of performance | Date of submission | Assessment Marks | Sign. of Teacher with date | Remarks
1 Implementation of Finite Automata and String
Validation.
2 Introduction to Lex Tool. Implement following
Programs Using Lex:
a. Generate Histogram of words
b. Caesar Cipher
c. Extract single and multiline comments from
C Program
3 Implement following Programs Using Lex:
a. Convert Roman to Decimal
b. Check whether given statement is compound
or simple
c. Extract html tags from .html file
4 Introduction to YACC and generate Calculator
Program.
5 Implement a program for constructing
a. LL(1) Parser
b. Predictive Parser
6 Implement a program for constructing
a. Recursive Descent Parser (RDP)
b. LALR Parser
7 Implement a program for constructing Operator
Precedence Parsing.
8 Generate 3-tuple intermediate code for given infix
expression.
9 Extract Predecessor and Successor from given Control
Flow Graph.
10 Study of Learning Basic Block Scheduling Heuristics
from Optimal Data.
Total


Experiment No - 1
Aim: To study and implement Finite Automata and validate strings using it.

Date:

Competency and Practical Skills: Understanding and implementing Finite Automata,


string validation

Relevant CO: CO1

Objectives:
1. To understand the concept of Finite Automata.
2. To implement Finite Automata using programming language.
3. To validate strings using Finite Automata.

Software/Equipment: Computer system, Text editor, Programming language.

Theory:

A Finite Automaton is a mathematical model that consists of a finite set of states and a set of transitions
between these states. It is used to recognize patterns or validate strings. A Finite Automaton has five
components:
1. A set of states
2. An input alphabet
3. A transition function
4. A start state
5. A set of final (or accepting) states

In the implementation of Finite Automata and string validation, we need to create a Finite Automata
that recognizes a specific pattern or set of patterns. The Finite Automata consists of states,
transitions between the states, and a set of accepting states. The input string is then validated by
passing it through the Finite Automata, starting at the initial state, and following the transitions until
the string is either accepted or rejected.

String validation using Finite Automata is useful in a variety of applications, including pattern
matching, text processing, and lexical analysis in programming languages. It is an efficient method
for validating strings and can handle large inputs with minimal memory and time complexity.

Example: Design a finite automaton that accepts strings containing an even number of a's, where ∑ = {a, b, c}


Solution:

Two states suffice: q0, reached when an even number of a's has been read so far, and q1, reached when
the count is odd. Reading an 'a' toggles between q0 and q1, while 'b' and 'c' leave the state unchanged.
q0 is the initial state and the only accepting state.
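The automaton described in the solution above can be sketched directly as a transition dictionary in Python; this is a minimal illustration (the state names q0/q1 follow the solution, and the function name is our own):

```python
# DFA over {a, b, c} accepting strings with an even number of 'a's.
# q0 = even count seen so far (start and accepting state), q1 = odd count.
transitions = {
    ('q0', 'a'): 'q1', ('q1', 'a'): 'q0',
    ('q0', 'b'): 'q0', ('q1', 'b'): 'q1',
    ('q0', 'c'): 'q0', ('q1', 'c'): 'q1',
}

def accepts_even_as(string):
    state = 'q0'
    for symbol in string:
        state = transitions[(state, symbol)]
    return state == 'q0'

print(accepts_even_as('abca'))  # True: two a's
print(accepts_even_as('abc'))   # False: one a
```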

Program:
Program-1: Create a program in Python that implements a Finite Automata to validate strings
that start with 'a' and end with 'b'.

Code:
def transition(state, input_symbol):
    # initial state: A
    # final state: C
    transition_table = {
        'A,a': 'B',
        'A,b': 'N',
        'B,a': 'B',
        'B,b': 'C',
        'C,a': 'B',
        'C,b': 'C',
        'N,a': 'N',
        'N,b': 'N'
    }
    return transition_table[state + ',' + input_symbol]

def validate(string):
    state = 'A'
    for input_symbol in string:
        state = transition(state, input_symbol)
    if state == 'C':
        return True
    return False

def display(inp):
    if validate(inp):
        print('{} is a valid string'.format(inp))
    else:
        print('{} is not a valid string'.format(inp))

input1 = 'abbabababa'
input2 = 'abbabababb'

display(input1)
display(input2)

Program-2: Implement a program to search a pattern in a string using finite automata

Code:
# Define the Finite Automaton
def compute_transition_function(pattern, alphabet):
    m = len(pattern)
    transitions = {}
    for q in range(m + 1):
        for a in alphabet:
            # Find the longest k such that pattern[:k] is a suffix of pattern[:q] + a
            k = min(m, q + 1)
            while k > 0 and not (pattern[:q] + a).endswith(pattern[:k]):
                k -= 1
            transitions[(q, a)] = k
    return transitions

# Search for pattern in text using the Finite Automaton
def search(pattern, text):
    alphabet = set(text)
    transitions = compute_transition_function(pattern, alphabet)
    q = 0
    for i, char in enumerate(text):
        q = transitions[(q, char)]
        if q == len(pattern):
            return i - len(pattern) + 1
    return -1

# Test the program with user input
pattern = input("Enter the pattern to search for: ")
text = input("Enter the text to search in: ")
index = search(pattern, text)
if index != -1:
    print("Pattern found at index", index)
else:
    print("Pattern not found")

Observations and Conclusion:

Sample Program-1:

In this implementation of Finite Automata and string validation, I have learned how to create a
Finite Automata that recognizes a specific pattern or set of patterns, and how to validate strings
using the Finite Automata. I have also learned how to implement a Finite Automata using
programming language, and how to test it with different inputs. By using Finite Automata, I can
efficiently validate strings and recognize patterns, making it a powerful tool in computer science
and related fields.

Sample Program-2:

In this example, the program searches for the pattern “abab” in the text abbaaababaa. It computes
the transition function using compute_transition_function, and then uses it to scan the text with the
search function. The first occurrence of the pattern starts at index 5, so the program reports that the
pattern is found at index 5.

Quiz:
1. What is a Finite Automata?
A Finite Automaton (FA) is an abstract computational model with a finite set of states, an input
alphabet, transition rules, and optional accepting states. It comes in two main types: DFA
(deterministic) and NFA (nondeterministic). FAs are fundamental in theoretical computer science
and used in various applications like formal language theory, regular expressions, and compiler
design.
2. What are the five components of a Finite Automata?
A Finite Automaton (FA) has five components: States, Alphabet (input symbols), Transition
Function, Start State, and Accepting States (or Final States). States represent possible configurations,
alphabet is the set of input symbols, transition function defines state changes, start state is where it
begins, and accepting states signify valid input.
3. How is a Finite Automata implemented using a programming language?
A Finite Automaton is implemented in a programming language by defining states, an alphabet,
transition rules, and processing logic for input. States are represented using variables or data
structures, each with a unique identifier. The alphabet consists of input symbols. Transition rules are
implemented using conditional statements or data structures. The automaton starts in an initial state,
and accepting states denote successful input acceptance. The chosen programming language, like
Python or Java, is used to write code for processing input strings and simulating state transitions.
4. What is the process of building a finite automaton to search a pattern in a text string?
To build a Finite Automaton for pattern searching, define states, alphabet, create transition rules,
designate accepting states, initialize and iterate through the text, handle mismatches or completion,
check for successful matches, optionally use a failure function, implement in a programming
language, and thoroughly test the automaton with different inputs.
5. What are the advantages of using a finite automaton to search for patterns in text strings?
Using a finite automaton for pattern searching in text strings offers several advantages. It allows for
efficient pattern matching with a time complexity proportional to the length of the text. Additionally,
once constructed, the automaton can be used to search for multiple patterns without needing to
rebuild it. This makes it suitable for scenarios where pattern matching is a recurring task. Moreover,
finite automata can be implemented in a way that consumes less memory compared to other pattern
matching algorithms, making them resource-efficient.

Suggested Reference:

1. Introduction to Automata Theory, Languages and Computation by John E. Hopcroft, Rajeev


Motwani, and Jeffrey D. Ullman.
2. https://www.youtube.com/watch?v=58N2N7zJGrQ&list=PLBlnK6fEyqRgp46KUv4ZY69yX
mpwKOIev
3. GeeksforGeeks: Finite Automata Introduction - https://www.geeksforgeeks.org/introduction-of-
finite-automata/


References used by the students:

Rubric wise marks obtained:

Rubrics (2 marks each): Knowledge | Problem Recognition | Implementation | Testing & Debugging | Creativity in logic/code | Total
Scoring for each rubric: Good (2) / Avg. (1)

Marks

Experiment No - 2
Aim: Introduction to Lex Tool. Implement following Programs Using Lex
a. Generate Histogram of words
b. Caesar Cipher
c. Extract single and multiline comments from C Program

Date

Competency and Practical Skills: Understanding of Lex tool and its usage in compiler
design, understanding of regular expressions and data structures, improving programming
skill to develop programs using lex tool

Relevant CO: CO1

Objectives:
1. To introduce students to Lex tool and its usage in compiler design
2. To provide practical knowledge of regular expressions and their use in pattern matching
3. To enhance students' understanding of data structures such as arrays, lists, and trees
4. To develop students' problem-solving skills in developing and implementing programs using
Lex tool
5. To develop students' debugging skills to identify and resolve program errors and issues

Software/Equipment: Computer system, Text editor, Lex tool, C compiler, Terminal or Command
prompt.

Theory:

❖ COMPILER:
• A compiler is a translator that converts the high-level language into the machine language.
• High-level language is written by a developer and machine language can be understood by the
processor. Compiler is used to show errors to the programmer.
• The main purpose of a compiler is to change the code written in one language without changing
the meaning of the program.
• When you execute a program written in a high-level language, execution happens in two parts.
• In the first part, the source program is compiled and translated into the object program (low-level
language).
• In the second part, the object program is translated into the target program through the assembler.


❖ LEX:
• Lex is a program that generates lexical analyzers. It is used with a YACC parser generator.
• The lexical analyzer is a program that transforms an input stream into a sequence of tokens.
• It reads a specification of token patterns and produces, as output, C source code that implements
the lexical analyzer.
• During the first phase the compiler reads the input and converts strings in the source to tokens.
• With regular expressions we can specify patterns to lex so it can generate code that will allow
it to scan and match strings in the input. Each pattern specified in the input to lex has an
associated action.
• Typically an action returns a token that represents the matched string for subsequent use by the
parser. Initially we will simply print the matched string rather than return a token value.

➢ Function of LEX:
• First, the programmer writes a specification lex.l in the Lex language. The Lex compiler then
processes the lex.l program and produces a C program lex.yy.c.
• Finally C compiler runs the lex.yy.c program and produces an object program a.out.
• a.out is a lexical analyzer that transforms an input stream into a sequence of tokens.

➢ LEX File Format:


• A Lex program is separated into three sections by %% delimiters. The format of a Lex source file is
as follows:
%{ definitions %}
%%
{ rules }
%%
{ user subroutines }
• Definitions include declarations of constant, variable and regular definitions.
• Rules are statements of the form p1 {action1} p2 {action2} ... pn {actionn},
where each pi is a regular expression and each actioni describes the action the lexical analyzer
should take when pattern pi matches a lexeme.
• User subroutines are auxiliary procedures needed by the actions. The subroutine can be loaded
with the lexical analyzer and compiled separately.


❖ FLEX: Fast Lexical Analyzer Generator


• FLEX is a tool/computer program for generating lexical analyzers (scanners or lexers) written
by Vern Paxson in C around 1987. It is used together with Berkeley Yacc parser generator or
GNU Bison parser generator. Flex and Bison both are more flexible than Lex and Yacc and
produce faster code.
• Bison produces parsers from the input file provided by the user. The function yylex() is
automatically generated by Flex when it is provided with a .l file, and the parser calls this
yylex() function to retrieve tokens from the current token stream.

STEPS:
• Step 1 : An input file describing the lexical analyzer to be generated, named lex.l, is written in the
Lex language. The Lex compiler transforms lex.l into a C program, in a file that is always named lex.yy.c.
• Step 2 : The C compiler compiles the lex.yy.c file into an executable file called a.out.
• Step 3 : The output file a.out takes a stream of input characters and produces a stream of tokens.
• Program Structure:

In the input file, there are 3 sections:


Definition Section: The definition section contains the declaration of variables, regular definitions,
and manifest constants. In the definition section, text is enclosed in “%{ %}” brackets. Anything
written in these brackets is copied directly to the file lex.yy.c
Syntax:
%{
// Definitions
%}

Rules Section: The rules section contains a series of rules of the form: pattern action. The pattern
must be unindented and the action must begin on the same line in {} brackets. The rule section is enclosed
in “%% %%”.
Syntax:
%%
pattern action
%%

User Code Section: This section contains C statements and additional functions. We can also
compile these functions separately and load with the lexical analyzer.

How to run the program:


To run the program, first save it with the extension .l or .lex. Then run the commands below in a
terminal.
• Step 1: lex filename.l or lex filename.lex, depending on the extension the file was saved with.
• Step 2: gcc lex.yy.c
• Step 3: ./a.out
• Step 4: Provide the input to program in case it is required

Program:

Program-1: Generate Histogram of words

Code:
%{
#include<stdio.h>
#include<string.h>
#define MAX 1000
int count = 0;            /* number of distinct words found */
char words[MAX][MAX];     /* distinct words seen so far */
%}

/* Rule Section */
%%

[a-zA-Z]+ {
int i, flag = 0;
for(i=0; i<count; i++) {
if(strcmp(words[i], yytext) == 0) {
flag = 1;
break;
}
}
if(flag == 0) {
strcpy(words[count++], yytext);
}
}

.|\n ; /* ignore everything else */

%%

/* Code Section */
int main(int argc, char **argv) {
if(argc != 2) {
printf("Usage: ./a.out <filename>\n");
return 1;
}
FILE *fp = fopen(argv[1], "r");
if(fp == NULL) {
printf("Cannot open file!\n");
return 1;
}
yyin = fp;
yylex();

int i, j;
printf("\nWord\t\tFrequency\n");
for(i=0; i<count; i++) {
int freq = 0;
rewind(fp);
while(fscanf(fp, "%s", words[MAX-1]) == 1) {
if(strcmp(words[MAX-1], words[i]) == 0) {
freq++;
}
}
printf("%-15s %d\n", words[i], freq);
}

fclose(fp);
return 0;
}

Program-2: Caesar Cypher

Code:
/* simple lex program that takes in a string of letters */
/* and outputs the Caesar Cipher version of it - note */
/* that initially the programs shifts letters by 3, but */
/* it can easily be modified to implement different shifts */

/* rule section */
%%
[a-z] { char ch = yytext[0];
ch += 3;
if (ch > 'z') ch -= ('z'+1-'a');
printf ("%c", ch);
}

[A-Z] { char ch = yytext[0];


ch += 3;
if (ch > 'Z') ch -= ('Z'+1-'A');
printf ("%c", ch);
}
%%

int main (void) {


return yylex ();
}

int yywrap(void) {
return 1;
}

Program-3: Extract single and multiline comments from C Program

Code:
%{
#include <stdio.h>
#include <string.h>   /* for strcmp() in main() */
%}

%option noyywrap

%%
"//".* { printf("Single Line Comment: %s\n", yytext); }
"/*"([^*]|\*+[^*/])*\*+"/" { printf("Multi-line Comment: %s\n", yytext); }
.|\n ;

%%

int main() {
char input[1000];

printf("Enter C code (type 'exit' on a new line to quit):\n");

while(1) {
fgets(input, sizeof(input), stdin);

if (strcmp(input, "exit\n") == 0) {
break;
}

yy_scan_string(input);
yylex();
}

return 0;
}

Observations and Conclusion:

Program-1:

$ lex histogram.l
$ cc lex.yy.c -o histogram -ll
$ ./histogram sample.txt

In this program, I have implemented a histogram of words using lex tool. The program counts the
frequency of each word in a given input file. It uses an array words to store all the distinct words
and counts the frequency of each word by iterating through the words array and comparing it with
the input file. The program also checks for errors such as invalid input file. This program can be
used to analyze the most frequent words in a text file or a document. This program can be extended
to handle large files by implementing a dynamic array to store the distinct words instead of a fixed
size array.

Program-2:


Program-3:

Quiz:
1. What is Lex tool used for?
Lex is a tool used for generating lexical analyzers or tokenizers, typically used in compiler
construction to break down source code into tokens.
2. What is the purpose of the "Generate Histogram of words" program in Lex?
A "Generate Histogram of Words" program in Lex counts the occurrences of unique words in a text
to analyze word frequency distribution.
3. Which program in Lex is used for encrypting text?
Lex is not used for text encryption; encryption is typically handled by cryptographic libraries in
programming languages.
4. What is the purpose of the "Extract single and multiline comments from C Program"
programin Lex?
It extracts comments (both single-line and multiline) from C code for documentation,
analysis, or other purposes.
5. How does the Caesar Cypher program work?
The Caesar cipher program shifts letters in the alphabet by a fixed amount to encrypt and decrypt
messages.

Suggested Reference:

1. "Flex & Bison: Text Processing Tools" by John Levine


2. "Lex and Yacc" by Tom Niemann
3. "Introduction to Compiler Construction with Unix" by Axel T. Schreiner
4. https://www.youtube.com/watch?v=ArtBEUvS3PI
5. Lex - A Lexical Analyzer Generator. Retrieved from
https://www.gnu.org/software/flex/manual/
6. Lexical Analysis with Flex. Retrieved from https://www.geeksforgeeks.org/flex-fast-lexical-
analyzer-generator/
7. The Flex Manual. Retrieved from https://westes.github.io/flex/manual/

References used by the students:

Rubric wise marks obtained:

Rubrics: Understanding of Lex tool (2) | Problem Recognition (2) | Logic Building (2) | Completeness and accuracy (2) | Ethics (2) | Total
Each criterion is graded Good (2) or Avg. (1).

Marks

Experiment No - 3
Aim: Implement following Programs Using Lex
a. Convert Roman to Decimal
b. Check whether a given statement is compound or simple
c. Extract html tags from .html file

Date:

Competency and Practical Skills: Understanding of Lex tool and its usage in compiler
design, understanding of regular expressions and data structures, improving programming
skill to develop programs using lex tool

Relevant CO: CO1

Objectives:
1. To introduce students to Lex tool and its usage in compiler design
2. To provide practical knowledge of regular expressions and their use in pattern matching
3. To enhance students' understanding of data structures such as arrays, lists, and trees
4. To develop students' problem-solving skills in developing and implementing programs using
Lex tool
5. To develop students' debugging skills to identify and resolve program errors and issues

Software/Equipment: Computer system, Text editor, Lex tool, C compiler, Terminal or Command
prompt.

Theory:

❖ LEX:
• Lex is a program that generates lexical analyzers. It is used with a YACC parser generator.
• The lexical analyzer is a program that transforms an input stream into a sequence of tokens.
• It reads the input stream and, as output, produces C source code that implements the lexical
analyzer.
• During the first phase the compiler reads the input and converts strings in the source to tokens.
• With regular expressions we can specify patterns to lex so it can generate code that will allow
it to scan and match strings in the input. Each pattern specified in the input to lex has an
associated action.
• Typically an action returns a token that represents the matched string for subsequent use by the
parser. Initially we will simply print the matched string rather than return a token value.

➢ Function of LEX:
• First, a lexical-analyzer specification lex.l is written in the Lex language. The Lex compiler then
processes lex.l and produces a C program lex.yy.c.
• Finally C compiler runs the lex.yy.c program and produces an object program a.out.
• a.out is a lexical analyzer that transforms an input stream into a sequence of tokens.

➢ LEX File Format:


• A Lex program is separated into three sections by %% delimiters. The format of a Lex source
file is as follows:
%{ definitions %}
%%
{ rules }
%%
{ user subroutines }
• Definitions include declarations of constant, variable and regular definitions.

• Rules define statements of the form p1 {action1} p2 {action2} … pn {actionn}.
• Here pi describes a regular expression and actioni describes the action the lexical analyzer
should take when pattern pi matches a lexeme.
• User subroutines are auxiliary procedures needed by the actions. The subroutine can be loaded
with the lexical analyzer and compiled separately.

❖ FLEX: Fast Lexical Analyzer Generator


• FLEX is a tool/computer program for generating lexical analyzers (scanners or lexers) written
by Vern Paxson in C around 1987. It is used together with Berkeley Yacc parser generator or
GNU Bison parser generator. Flex and Bison both are more flexible than Lex and Yacc and
produce faster code.
• Bison produces parsers from the input file provided by the user. When flex is given a .l file, it
automatically generates the function yylex(), which the parser calls to retrieve tokens from the
input stream.

STEPS:
• Step 1 : An input file describes the lexical analyzer to be generated named lex.l is written in lex
language. The lex compiler transforms lex.l to C program, in a file that is always named lex.yy.c.
• Step 2 : The C compiler compiles the lex.yy.c file into an executable file called a.out.
• Step 3 : The executable a.out takes a stream of input characters and produces a stream of tokens.
• Program Structure:

In the input file, there are 3 sections:


Definition Section: The definition section contains the declaration of variables, regular definitions,
and manifest constants. In the definition section, text is enclosed in “%{ %}” brackets. Anything
written in these brackets is copied directly to the file lex.yy.c.
Syntax:
%{
// Definitions
%}

Rules Section: The rules section contains a series of rules of the form: pattern action. The pattern
must be un-indented (it starts in the first column) and the action must begin on the same line,
enclosed in {} braces. The rules section is enclosed in “%% %%”.
Syntax:
%%
pattern action
%%

User Code Section: This section contains C statements and additional functions. We can also
compile these functions separately and load with the lexical analyzer.

How to run the program:


To run the program, it should be first saved with the extension .l or .lex. Run the below commands
on terminal in order to run the program file.
• Step 1: lex filename.l or lex filename.lex, depending on the extension the file was saved with.
• Step 2: gcc lex.yy.c
• Step 3: ./a.out
• Step 4: Provide the input to program in case it is required

Program:

Program-1: Convert Roman to Decimal

Code:
%{
#include <stdio.h>
#include <stdlib.h>   /* for exit() */

int decimal = 0; /* accumulated decimal value */
%}

/* Rules section.
   Lex always takes the longest match, so the two-symbol subtractive
   pairs (IV, IX, XL, XC, CD, CM) are matched before the single symbols. */
%%
IV { decimal += 4; }
IX { decimal += 9; }
XL { decimal += 40; }
XC { decimal += 90; }
CD { decimal += 400; }
CM { decimal += 900; }
I { decimal += 1; }
V { decimal += 5; }
X { decimal += 10; }
L { decimal += 50; }
C { decimal += 100; }
D { decimal += 500; }
M { decimal += 1000; }
\n { return 0; }
. { printf("Invalid Roman numeral\n"); exit(1); } /* any other symbol is an error */
%%

/* Code section */
int main()
{
yylex();
printf("Decimal value: %d\n", decimal);
return 0;
}

int yywrap() { return 1; }

Program-2: Check whether a given statement is compound or simple

Code:
/*Program to recognize whether a given sentence is simple or compound.*/
%{
#include<stdio.h>
int flag=0;
%}

%%
and |
or |
but |
because |

if |
then |
nevertheless { flag=1; }
. ;
\n { return 0; }
%%

int main()
{
printf("Enter the sentence:\n");
yylex();
if(flag==0)
printf("Simple sentence\n");
else
printf("compound sentence\n");
}

int yywrap( )
{
return 1;
}

Program-3: Extract html tags from .html file

Code:
%{
#include<stdio.h>
%}

%%
\<[^>]*\> fprintf(yyout,"%s\n",yytext);
.|\n ;
%%

int yywrap()
{
return 1;
}

int main()
{
yyin=fopen("input7.html","r");
yyout=fopen("output7.txt","w");
yylex();
return 0;
}


Observations and Conclusion:

Program-1:

The Lex program scans the input Roman numeral and converts it into decimal by matching each
symbol with the corresponding decimal value. If an invalid symbol is encountered, the program
exits with an error message. Lex provides an efficient and easy way to define the rules for the
conversion.

Program-2:

Program-3:

Quiz:
1. What is Lex tool?
Lex is a tool for tokenizing input text in compiler construction and text processing.
2. What is the purpose of Lex tool?
Lex is used to create tokenizers for breaking down text into meaningful units known as tokens,
which is vital in tasks like compiler construction and text processing.
3. What is the aim of the Roman to Decimal conversion program?
The aim of a Roman to Decimal conversion program is to convert Roman numerals into their
decimal equivalents, allowing for modern numeric operations and comparisons.
4. How does the program check whether a given statement is compound or simple?
The program distinguishes between compound and simple statements by examining the syntax,
particularly the presence of block delimiters or specific language constructs.
5. What is the purpose of the program to extract HTML tags from an HTML file?
The program extracts HTML tags from an HTML file, enabling tasks like web scraping, content
analysis, and data extraction.
6. Explain the Rule Section of a Lex program.
The Rule Section in a Lex program defines patterns and actions for recognizing and processing
specific text patterns in the input.
7. What is the purpose of the Definition Section in a Lex program?
The Definition Section in a Lex program defines named symbols and regular expressions for
improved code readability, reusability, and maintenance.
8. How can you declare and initialize a variable in a Lex program?
In a Lex program, declare and initialize a variable using C code within the action part of a rule by
following the standard C syntax.

9. How can you compile and execute a Lex program?
To compile and execute a Lex program
1. Write the Lex program (usually with a `.l` extension).
2. Generate C code with the `lex` command.
3. Compile the C code using a C compiler (e.g., `gcc`).
4. Execute the resulting program, providing input as required.

Suggested Reference:

1. Aho, A.V., Sethi, R., & Ullman, J.D. (1986). Compilers: Principles, Techniques, and Tools.
Addison-Wesley.
2. Levine, J.R., Mason, T., & Brown, D. (2009). lex & yacc. O'Reilly Media, Inc.
3. Lex - A Lexical Analyzer Generator. Retrieved from
https://www.gnu.org/software/flex/manual/
4. Lexical Analysis with Flex. Retrieved from https://www.geeksforgeeks.org/flex-fast-lexical-
analyzer-generator/
5. The Flex Manual. Retrieved from https://westes.github.io/flex/manual/

References used by the students:

Rubric wise marks obtained:

Rubrics: Understanding of Lex tool (2) | Problem Recognition (2) | Logic Building (2) | Completeness and accuracy (2) | Ethics (2) | Total
Each criterion is graded Good (2) or Avg. (1).

Marks

Experiment No - 4
Aim: Introduction to YACC and generate Calculator Program

Date:

Competency and Practical Skills:


• Understanding of YACC and its role in compiler construction
• Ability to write grammar rules and YACC programs for a given language
• Ability to develop program using YACC

Relevant CO: CO2

Objectives:
By the end of this experiment, the students should be able to:
➢ Understand the concept of YACC and its significance in compiler construction
➢ Write grammar rules for a given language
➢ Implement a calculator program using YACC

Software/Equipment: Windows/Linux Operating System, YACC Compiler, Text editor,


Command prompt or terminal

Theory:
YACC (Yet Another Compiler Compiler) is a tool that is used for generating parsers. It is used in
combination with Lex to generate compilers and interpreters. YACC takes a set of rules and
generates a parser that can recognize and process the input according to those rules.

The grammar rules that are defined using YACC are written in BNF (Backus-Naur Form) notation.
These rules describe the syntax of a programming language.

INPUT FILE:
→ The YACC input file is divided into three parts.
/* definitions */
....
%%
/* rules */
....
%%
/* auxiliary routines */
....

Definition Part:
→ The definition part includes information about the tokens used in the syntax definition.

Rule Part:
→ The rules part contains grammar definition in a modified BNF form. Actions is C code in { }
and can be embedded inside (Translation schemes).

Auxiliary Routines Part:


→ The auxiliary routines part is only C code.
→ It includes function definitions for every function needed in the rules part.
→ It can also contain the main() function definition if the parser is going to be run as a program.
→ The main() function must call the function yyparse().


For Compiling YACC Program:


1. Write lex program in a file file.l and yacc in a file file.y
2. Open Terminal and Navigate to the Directory where you have saved the files.
3. type lex file.l
4. type yacc file.y
5. type cc lex.yy.c y.tab.c -ll
6. type ./a.out

The program for generating a calculator using YACC involves the following steps:
➢ Defining the grammar rules for the calculator program
➢ Writing the Lex code for tokenizing the input
➢ Writing the YACC code for parsing the input and generating the output

Program:

1. Create a file named "calculator.l":

%{
#include <stdio.h>
#include <ctype.h>
#include "y.tab.h"   /* token definitions (INTEGER, EOL) generated by yacc */
%}

%%

[0-9]+ { yylval = atoi(yytext); return INTEGER; }


[ \t] ; /* skip whitespace */
\n { return EOL; }
. { return yytext[0]; }

%%

int yywrap(void) {
return 1;
}

2. Create a file named "calculator.y":

%{
#include <stdio.h>
int yylex(void);
void yyerror(char *s);
%}


%token INTEGER EOL

/* precedence declarations: without them the grammar is ambiguous */
%left '+' '-'
%left '*' '/'

%%

line: /* empty */
| line exp EOL { printf("= %d\n", $2); }
;

exp: INTEGER { $$ = $1; }


| exp '+' exp { $$ = $1 + $3; }
| exp '-' exp { $$ = $1 - $3; }
| exp '*' exp { $$ = $1 * $3; }
| exp '/' exp { $$ = $1 / $3; }
| '(' exp ')' { $$ = $2; }
;

%%

int main(void) {
yyparse();
return 0;
}

void yyerror(char* s) {
fprintf(stderr, "error: %s\n", s);
}

Observations and Conclusion:

After executing the program, we observed that the calculator program was successfully generated
using YACC. It was able to perform simple arithmetic operations such as addition, subtraction,
multiplication, and division, and it handled parenthesized expressions and negative results correctly.

Quiz:
1. What is YACC?
YACC, or "Yet Another Compiler Compiler," generates parsers for processing programming
language syntax based on context-free grammars.
2. What is the purpose of YACC?
YACC generates parsers for processing programming language syntax, aiding in compiler
construction, syntax analysis, and abstract syntax tree generation.

3. What is the output of YACC?


The primary output of YACC is a parser implemented in C, which parses input according to defined
grammar rules.
4. What is a syntax analyzer?
A syntax analyzer (parser) checks code structure against language grammar, creating a structured
representation for further processing and reporting syntax errors.


Suggested Reference:
1. "Lex & Yacc" by John R. Levine, Tony Mason, and Doug Brown
2. "The Unix Programming Environment" by Brian W. Kernighan and Rob Pike

References used by the students:

Rubric wise marks obtained:

Rubrics: Understanding of YACC (2) | Grammar Generation (2) | Implementation (2) | Testing & Debugging (2) | Ethics (2) | Total
Each criterion is graded Good (2) or Avg. (1).

Marks

Experiment No - 5
Aim: Implement a program for constructing
a. LL(1) Parser
b. Predictive Parser

Date:

Competency and Practical Skills:


• Understanding Parsers and its role in compiler construction
• Ability to write first and follow for given grammar
• Ability to develop LL(1) and predictive parser using top down parsing approach

Relevant CO: CO2

Objectives:
By the end of this experiment, the students should be able to:
➢ Understand the concept parsers and its significance in compiler construction
➢ Write first and follow set for given grammar
➢ Implement a LL(1) and predictive grammar using top down parser

Software/Equipment: C compiler

Theory:

❖ LL(1) Parsing: Here the first L indicates that the input is scanned from Left to Right, the
second L indicates that the parser uses a Leftmost derivation, and the 1 is the number of
look-ahead symbols the parser examines before making a decision.

Essential conditions to check first are as follows:


1. The grammar is free from left recursion.
2. The grammar should not be ambiguous.
3. The grammar has to be left factored in so that the grammar is deterministic grammar.
These conditions are necessary but not sufficient for proving a LL(1) parser.

Algorithm to construct LL(1) Parsing Table:


Step 1: First check all the essential conditions mentioned above and go to step 2.
Step 2: Calculate First() and Follow() for all non-terminals.
1. First(): For a variable (non-terminal), the set of terminal symbols that can appear at the
beginning of the strings derived from it is called its First set.
2. Follow(): The set of terminal symbols that can appear immediately after a variable in some
derivation is called its Follow set.
Step 3: For each production A –> α (A derives α):
1. Find First(α) and for each terminal in First(α), make entry A –> α in the table.
2. If First(α) contains ε (epsilon) as terminal, then find the Follow(A) and for each terminal in
Follow(A), make entry A –> ε in the table.
3. If the First(α) contains ε and Follow(A) contains $ as terminal, then make entry A –> ε in
the table for the $.
To construct the parsing table, we use the two functions defined above, First() and Follow().
In the table, rows will contain the Non-Terminals and the column will contain the Terminal
Symbols. All the Null Productions of the Grammars will go under the Follow elements and the

remaining productions will lie under the elements of the First set.


❖ Predictive Parser
Predictive parser is a recursive descent parser, which has the capability to predict which production
is to be used to replace the input string. The predictive parser does not suffer from backtracking.
To accomplish its tasks, the predictive parser uses a look-ahead pointer, which points to the next
input symbols. To make the parser back-tracking free, the predictive parser puts some constraints
on the grammar and accepts only a class of grammar known as LL(k) grammar.

Predictive parsing uses a stack and a parsing table to parse the input and generate a parse tree. Both
the stack and the input contains an end symbol $ to denote that the stack is empty and the input is
consumed. The parser refers to the parsing table to take any decision on the input and stack element
combination.

In recursive descent parsing, the parser may have more than one production to choose from for a
single instance of input, whereas in predictive parser, each step has at most one production to
choose. There might be instances where there is no production matching the input string, making
the parsing procedure to fail.

Program-1:
#include<stdio.h>
#include<string.h>
#define TSIZE 128
// table[i][j] stores
// the index of production that must be applied on
// ith varible if the input is
// jth nonterminal
int table[100][TSIZE];
// stores all list of terminals
// the ASCII value if use to index terminals
// terminal[i] = 1 means the character with
// ASCII value is a terminal
char terminal[TSIZE];
// stores all list of terminals
// only Upper case letters from 'A' to 'Z'
// can be nonterminals
// nonterminal[i] means ith alphabet is present as
// nonterminal is the grammar
char nonterminal[26];
// structure to hold each production
// str[] stores the production
// len is the length of production
struct product {
char str[100];
int len;
}pro[20];
// no of productions in form A->ß
int no_pro;
char first[26][TSIZE];
char follow[26][TSIZE];
// stores first of each production in form A->ß
char first_rhs[100][TSIZE];
// check if the symbol is nonterminal
int isNT(char c) {
return c >= 'A' && c <= 'Z';
}
// reading data from the file
void readFromFile() {
FILE* fptr;
fptr = fopen("text.txt", "r");
char buffer[255];
int i;
int j;
while (fgets(buffer, sizeof(buffer), fptr)) {
printf("%s", buffer);
j = 0;
nonterminal[buffer[0] - 'A'] = 1;
for (i = 0; i < strlen(buffer) - 1; ++i) {
if (buffer[i] == '|') {
++no_pro;
pro[no_pro - 1].str[j] = '\0';
pro[no_pro - 1].len = j;
pro[no_pro].str[0] = pro[no_pro - 1].str[0];
pro[no_pro].str[1] = pro[no_pro - 1].str[1];
pro[no_pro].str[2] = pro[no_pro - 1].str[2];
j = 3;
}
else {
pro[no_pro].str[j] = buffer[i];
++j;
if (!isNT(buffer[i]) && buffer[i] != '-' && buffer[i] != '>') {
terminal[buffer[i]] = 1;

}
}
}
pro[no_pro].len = j;
++no_pro;
}
}
void add_FIRST_A_to_FOLLOW_B(char A, char B)
{
int i;
for (i = 0; i < TSIZE; ++i)
{
if (i != '^'){
follow[B - 'A'][i] = follow[B - 'A'][i] || first[A - 'A'][i];
}
}
}
void add_FOLLOW_A_to_FOLLOW_B(char A, char B)
{
int i;
for (i = 0; i < TSIZE; ++i)
{
if (i != '^')
follow[B - 'A'][i] = follow[B - 'A'][i] || follow[A - 'A'][i];
}
}
void FOLLOW()
{
int t = 0;
int i, j, k, x;
while (t++ < no_pro)
{
for (k = 0; k < 26; ++k) {
if (!nonterminal[k]) continue;
char nt = k + 'A';
for (i = 0; i < no_pro; ++i) {
for (j = 3; j < pro[i].len; ++j) {
if (nt == pro[i].str[j]) {
for (x = j + 1; x < pro[i].len; ++x) {
char sc = pro[i].str[x];
if (isNT(sc)) {
add_FIRST_A_to_FOLLOW_B(sc, nt);
if (first[sc - 'A']['^'])
continue;
}
else {
follow[nt - 'A'][sc] = 1;
}
break;
}
if (x == pro[i].len)
add_FOLLOW_A_to_FOLLOW_B(pro[i].str[0], nt);
}
}

}
}
}
}
void add_FIRST_A_to_FIRST_B(char A, char B) {
int i;
for (i = 0; i < TSIZE; ++i) {
if (i != '^') {
first[B - 'A'][i] = first[A - 'A'][i] || first[B - 'A'][i];
}
}
}
void FIRST() {
int i, j;
int t = 0;
while (t < no_pro) {
for (i = 0; i < no_pro; ++i) {
for (j = 3; j < pro[i].len; ++j) {
char sc = pro[i].str[j];
if (isNT(sc)) {
add_FIRST_A_to_FIRST_B(sc, pro[i].str[0]);
if (first[sc - 'A']['^'])
continue;
}
else {
first[pro[i].str[0] - 'A'][sc] = 1;
}
break;
}
if (j == pro[i].len)
first[pro[i].str[0] - 'A']['^'] = 1;
}
++t;
}
}
void add_FIRST_A_to_FIRST_RHS_B(char A, int B) {
int i;
for (i = 0; i < TSIZE; ++i) {
if (i != '^')
first_rhs[B][i] = first[A - 'A'][i] || first_rhs[B][i];
}
}
// Calculates FIRST(ß) for each A->ß
void FIRST_RHS() {
int i, j;
int t = 0;
while (t < no_pro) {
for (i = 0; i < no_pro; ++i) {
for (j = 3; j < pro[i].len; ++j) {
char sc = pro[i].str[j];
if (isNT(sc)) {
add_FIRST_A_to_FIRST_RHS_B(sc, i);
if (first[sc - 'A']['^'])

45
200160107008
Compiler Design (3170701)
continue;
}
else {
first_rhs[i][sc] = 1;
}
break;
}
if (j == pro[i].len)
first_rhs[i]['^'] = 1;
}
++t;
}
}
int main() {
readFromFile();
follow[pro[0].str[0] - 'A']['$'] = 1;
FIRST();
FOLLOW();
FIRST_RHS();
int i, j, k;
// display first of each variable
printf("\n");
for (i = 0; i < no_pro; ++i) {
if (i == 0 || (pro[i - 1].str[0] != pro[i].str[0])) {
char c = pro[i].str[0];
printf("FIRST OF %c: ", c);
for (j = 0; j < TSIZE; ++j) {
if (first[c - 'A'][j]) {
printf("%c ", j);
}
}
printf("\n");
}
}
// display follow of each variable
printf("\n");
for (i = 0; i < no_pro; ++i) {
if (i == 0 || (pro[i - 1].str[0] != pro[i].str[0])) {
char c = pro[i].str[0];
printf("FOLLOW OF %c: ", c);
for (j = 0; j < TSIZE; ++j) {
if (follow[c - 'A'][j]) {
printf("%c ", j);
}
}
printf("\n");
}
}
// display first of each variable ß
// in form A->ß
printf("\n");
for (i = 0; i < no_pro; ++i) {
printf("FIRST OF %s: ", pro[i].str);

46
200160107008
Compiler Design (3170701)
for (j = 0; j < TSIZE; ++j) {
if (first_rhs[i][j]) {
printf("%c ", j);
}
}
printf("\n");
}
// the parse table contains '$'
// set terminal['$'] = 1
// to include '$' in the parse table
terminal['$'] = 1;
// the parse table does not read '^'
// as input
// so we set terminal['^'] = 0
// to remove '^' from terminals
terminal['^'] = 0;
// printing parse table
printf("\n");
printf("\n\t**************** LL(1) PARSING TABLE *******************\n");
printf("\t \n");
printf("%-10s", "");
for (i = 0; i < TSIZE; ++i) {
if (terminal[i]) printf("%-10c", i);
}
printf("\n");
int p = 0;
for (i = 0; i < no_pro; ++i) {
if (i != 0 && (pro[i].str[0] != pro[i - 1].str[0]))
p = p + 1;
for (j = 0; j < TSIZE; ++j) {
if (first_rhs[i][j] && j != '^') {
table[p][j] = i + 1;
}
else if (first_rhs[i]['^']) {
for (k = 0; k < TSIZE; ++k) {
if (follow[pro[i].str[0] - 'A'][k]) {
table[p][k] = i + 1;
}
}
}
}
}
k = 0;
for (i = 0; i < no_pro; ++i) {
if (i == 0 || (pro[i - 1].str[0] != pro[i].str[0])) {
printf("%-10c", pro[i].str[0]);
for (j = 0; j < TSIZE; ++j) {
if (table[k][j]) {
printf("%-10s", pro[table[k][j] - 1].str);
}
else if (terminal[j]) {
printf("%-10s", "");
}

47
200160107008
Compiler Design (3170701)
}
++k;
printf("\n");
}
}
}

Observations and Conclusion:


Program -1:

In the above example, the grammar is given as input, and the first and follow sets of the
non-terminals are identified. Further, the LL(1) parsing table is constructed.

Program-2:


#include <stdio.h>
#include <string.h>

char prol[7][10] = { "S", "A", "A", "B", "B", "C", "C" };


char pror[7][10] = { "A", "Bb", "Cd", "aB", "@", "Cc", "@" };
char prod[7][10] = { "S->A", "A->Bb", "A->Cd", "B->aB", "B->@", "C->Cc", "C->@" };
char first[7][10] = { "abcd", "ab", "cd", "a@", "@", "c@", "@" };
char follow[7][10] = { "$", "$", "$", "a$", "b$", "c$", "d$" };
char table[5][6][10];

int numr(char c)
{
switch (c)
{
case 'S':
return 0;

case 'A':
return 1;

case 'B':
return 2;

case 'C':
return 3;

case 'a':
return 0;

case 'b':
return 1;

case 'c':
return 2;

case 'd':
return 3;

case '$':
return 4;
}

return (2);
}

int main()
{
int i, j, k;

for (i = 0; i < 5; i++)


for (j = 0; j < 6; j++)
strcpy(table[i][j], " ");


printf("The following grammar is used for Parsing Table:\n");

for (i = 0; i < 7; i++)


printf("%s\n", prod[i]);

printf("\nPredictive parsing table:\n");


for (i = 0; i < 7; i++)


{
k = strlen(first[i]);
for (j = 0; j < 10; j++)
if (first[i][j] != '@')
strcpy(table[numr(prol[i][0]) + 1][numr(first[i][j]) + 1], prod[i]);
}

for (i = 0; i < 7; i++)


{
if (strlen(pror[i]) == 1)
{
if (pror[i][0] == '@')
{
k = strlen(follow[i]);
for (j = 0; j < k; j++)
strcpy(table[numr(prol[i][0]) + 1][numr(follow[i][j]) + 1], prod[i]);
}
}
}

strcpy(table[0][0], " ");

strcpy(table[0][1], "a");

strcpy(table[0][2], "b");

strcpy(table[0][3], "c");

strcpy(table[0][4], "d");

strcpy(table[0][5], "$");

strcpy(table[1][0], "S");

strcpy(table[2][0], "A");

strcpy(table[3][0], "B");

strcpy(table[4][0], "C");

printf("\n \n");

for (i = 0; i < 5; i++)
for (j = 0; j < 6; j++)
{
printf("%-10s", table[i][j]);
if (j == 5)
printf("\n \n");
}
}


Quiz:
1. What is a parser and state the Role of it?
A parser analyzes the syntax of programming language source code, checking for correctness, reporting
errors, and creating structured representations (e.g., parse trees or ASTs). It also handles operator precedence
and may perform semantic actions. Its role is crucial in the compilation process, serving as a bridge between
source code and further compilation phases.
2. Types of parsers? Examples to each.
Here are common types of parsers:

1. Recursive Descent Parser (RDP): hand-written parsers for LL(k) grammars.
2. LL Parser: ANTLR (e.g., ANTLR4).
3. LR Parser: Bison, Yacc.
4. LALR Parser: the LALR parser generator in GNU Bison.
5. GLR Parser: Elkhound.
6. Earley Parser: Marpa.
7. Chart Parser: CYK, Earley.
8. Top-Down vs. Bottom-Up: recursive descent parsers (top-down) and LR parsers (bottom-up).
3. What are the Tools available for implementation?
Tools for parser and compiler implementation include Yacc/Bison, ANTLR, Lex/Flex, JavaCC, PLY, Jison,
PEG.js, Coco/R, Elkhound, Marpa, Ragel, Parsec, and ANTLR4:JS. These tools cover a range of
programming languages and parsing needs.
4. How do you calculate FIRST(),FOLLOW() sets used in Parsing Table construction?
Calculate FIRST() sets by considering terminal symbols, production rules, and ε (empty string), and update
them iteratively. Calculate FOLLOW() sets by considering production rules, non-terminal positions, and ε,
updating iteratively while considering the start symbol's FOLLOW() set. These sets aid in parsing table
construction.
5. Name the most powerful parser.
The most powerful parser is the Generalized LR (GLR) parser, known for its ability to handle a wide range of
grammars, including ambiguous ones. It can produce multiple parse trees when necessary.

Suggested Reference:
1. Introduction to Automata Theory, Languages and Computation by John E. Hopcroft, Rajeev
Motwani, and Jeffrey D. Ullman.
2. Geeks for geeks: https://www.geeksforgeeks.org/construction-of-ll1-parsing-table/
3. http://www.cs.ecu.edu/karl/5220/spr16/Notes/Top-down/LL1.html

References used by the students:


Rubric wise marks obtained:

Rubrics   Knowledge of     Problem           Code          Completeness    Presentation   Total
          Parsing          implementation    Quality (2)   and accuracy    (2)
          techniques (2)   (2)                             (2)
          Good   Avg.      Good   Avg.       Good   Avg.   Good   Avg.     Good   Avg.
          (2)    (1)       (2)    (1)        (2)    (1)    (2)    (1)      (2)    (1)

Marks


Experiment No - 06

Aim: Implement a program for constructing


a. Recursive Descent Parser (RDP)
b. LALR Parser

Date:

Competency and Practical Skills:


• Understanding of RDP and bottom up parsers and its role in compiler construction
• Ability to write acceptance of string through RDP and parsing of string using LALR
parsers for a given grammar
• Ability to develop RDP and LALR parser using bottom up approach

Relevant CO: CO2

Objectives:
By the end of this experiment, the students should be able to:
➢ Understand RDP, the broad classification of bottom-up parsers, and their significance in
compiler construction
➢ Verify whether a string is accepted by an RDP, and parse a string for a given grammar using LR
parsers
➢ Implement an RDP and a LALR parser

Software/Equipment: C compiler

Theory:

❖ Recursive Descent Parser:

Recursive Descent Parser uses the technique of top-down parsing without backtracking. It can be
defined as a parser that uses a set of recursive procedures to process the input string with no
backtracking. It can be implemented simply in any language that supports recursion. The first
symbol of the R.H.S. of a production uniquely determines the correct alternative to choose.
The major approach of recursive-descent parsing is to relate each non-terminal with a procedure.
The objective of each procedure is to read a sequence of input characters that can be produced by
the corresponding non-terminal, and return a pointer to the root of the parse tree for the non-
terminal. The structure of the procedure is prescribed by the productions for the equivalent non-
terminal.
The recursive procedures are simple to write and adequately efficient if written in a language that
implements procedure calls efficiently. There is a procedure for each non-terminal in the
grammar. The parser maintains a global variable lookahead, holding the current input token; a
procedure match (Expected Token) recognizes the next token in the parsing process and advances
the input stream pointer, so that lookahead points to the next token to be parsed.
match() is effectively a call to the lexical analyzer to get the next token.
For example, input stream is a + b$.
lookahead == a

match()
lookahead == +
match ()
lookahead == b
……………………….
……………………….
In this manner, parsing can be done.

❖ LALR (1) Parsing:

LALR refers to the lookahead LR. To construct the LALR (1) parsing table, we use the
canonical collection of LR (1) items.
In LALR (1) parsing, the LR (1) items which have the same productions but different
lookaheads are combined to form a single set of items.
LALR (1) parsing is the same as CLR (1) parsing; the only difference lies in the parsing table.
Example
S → AA
A → aA
A→b
Add Augment Production, insert '•' symbol at the first position for every production in G
and also add the look ahead.
S` → •S, $
S → •AA, $
A → •aA, a/b
A → •b, a/b
I0 State:
Add Augment production to the I0 State and Compute the Closure
I0 = Closure (S` → •S)
Add all productions starting with S in to I0 State because "•" is followed by the non-
terminal. So, the I0 State becomes
I0 = S` → •S, $
S → •AA, $
Add all productions starting with A in modified I0 State because "•" is followed by the
non-terminal. So, the I0 State becomes.
I0= S` → •S, $
S → •AA, $
A → •aA, a/b
A → •b, a/b
I1= Go to (I0, S) = closure (S` → S•, $) = S` → S•, $
I2= Go to (I0, A) = closure ( S → A•A, $ )
Add all productions starting with A in I2 State because "•" is followed by the non-
terminal. So, the I2 State becomes
I2= S → A•A, $
A → •aA, $
A → •b, $
I3= Go to (I0, a) = Closure ( A → a•A, a/b )
Add all productions starting with A in I3 State because "•" is followed by the non-
terminal. So, the I3 State becomes

I3= A → a•A, a/b
A → •aA, a/b
A → •b, a/b
Go to (I3, a) = Closure (A → a•A, a/b) = (same as I3)
Go to (I3, b) = Closure (A → b•, a/b) = (same as I4)
I4= Go to (I0, b) = closure ( A → b•, a/b) = A → b•, a/b
I5= Go to (I2, A) = Closure (S → AA•, $) =S → AA•, $
I6= Go to (I2, a) = Closure (A → a•A, $)
Add all productions starting with A in I6 State because "•" is followed by the non-
terminal. So, the I6 State becomes
I6 = A → a•A, $
A → •aA, $
A → •b, $
Go to (I6, a) = Closure (A → a•A, $) = (same as I6)
Go to (I6, b) = Closure (A → b•, $) = (same as I7)
I7= Go to (I2, b) = Closure (A → b•, $) = A → b•, $
I8= Go to (I3, A) = Closure (A → aA•, a/b) = A → aA•, a/b
I9= Go to (I6, A) = Closure (A → aA•, $) = A → aA•, $
If we analyze then LR (0) items of I3 and I6 are same but they differ only in their
lookahead.
I3 = { A → a•A, a/b
A → •aA, a/b
A → •b, a/b
}
I6= { A → a•A, $
A → •aA, $
A → •b, $
}
Clearly I3 and I6 are same in their LR (0) items but differ in their lookahead, so we can
combine them and called as I36.
I36 = { A → a•A, a/b/$
A → •aA, a/b/$
A → •b, a/b/$
}
The I4 and I7 are same but they differ only in their look ahead, so we can combine them
and called as I47.
I47 = {A → b•, a/b/$}
The I8 and I9 are same but they differ only in their look ahead, so we can combine them
and called as I89.
I89 = {A → aA•, a/b/$}
Drawing DFA:

LALR (1) Parsing table:
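Written out from the merged item sets above (Sn = shift and go to state In; rn = reduce by rule n, numbering 1: S → AA, 2: A → aA, 3: A → b), the table is:

State        Action                         Goto
             a        b        $            S       A
I0           S36      S47                   1       2
I1                             accept
I2           S36      S47                           5
I36          S36      S47                           89
I47          r3       r3       r3
I5                             r1
I89          r2       r2       r2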

Program-1:

#include<stdio.h>
#include<string.h>
#include<ctype.h>

char input[10];
int i,error;
void E();
void T();
void Eprime();
void Tprime();
void F();
int main()
{
	i = 0;
	error = 0;
	printf("Enter an arithmetic expression : "); // Eg: a+a*a
	scanf("%9s", input); /* gets() is unsafe and was removed in C11 */
	E();
	if (strlen(input) == i && error == 0)
		printf("\nAccepted..!!!\n");
	else
		printf("\nRejected..!!!\n");
	return 0;
}

void E()
{
T();
Eprime();
}
void Eprime()
{
if(input[i]=='+')
{
i++;
T();
Eprime();
}
}
void T()
{
F();
Tprime();
}
void Tprime()
{
if(input[i]=='*')
{
i++;
F();
Tprime();
}
}
void F()
{
if(isalnum(input[i]))i++;
else if(input[i]=='(')
{
i++;
E();
if(input[i]==')')
i++;

else error=1;
}

else error=1;
}

Program-2:

#include <stdio.h>

// Grammar: S -> a S | b   (augmented with S' -> S)
// Terminals, the end marker, and the non-terminal
enum { TERMINAL_A, TERMINAL_B, TERMINAL_END, NON_TERMINAL_S };

// Production rules: LHS followed by the RHS symbols, -1 terminated
int productions[][4] = {
    {NON_TERMINAL_S, TERMINAL_A, NON_TERMINAL_S, -1}, // 0: S -> a S
    {NON_TERMINAL_S, TERMINAL_B, -1, -1}              // 1: S -> b
};

// Action table: entry > 0 means shift to that state, entry < 0 means
// reduce by production (-entry - 1), 0 means error, ACCEPT means accept.
// Columns: a, b, $
#define ACCEPT 100
int action_table[5][3] = {
    { 2, 3, 0 },      // state 0
    { 0, 0, ACCEPT }, // state 1: S' -> S .
    { 2, 3, 0 },      // state 2: S -> a . S
    { 0, 0, -2 },     // state 3: S -> b .   => reduce by production 1
    { 0, 0, -1 }      // state 4: S -> a S . => reduce by production 0
};

// Goto table: one column for the single non-terminal S
int goto_table[5][1] = { {1}, {0}, {4}, {0}, {0} };

// Stack of parser states
int stack[100];
int top = -1;

// Input tokens: the string "aab" followed by the end marker
int input[] = { TERMINAL_A, TERMINAL_A, TERMINAL_B, TERMINAL_END };

int main() {
    stack[++top] = 0; // start state
    int ip = 0;

    while (1) {
        int state = stack[top];
        int symbol = input[ip];
        int action = action_table[state][symbol];

        if (action == ACCEPT) {
            printf("Parsing successful\n");
            return 0;
        } else if (action > 0) {          // shift
            stack[++top] = action;
            ip++;
        } else if (action < 0) {          // reduce
            int production = -action - 1;
            int rhs_length = 0;
            while (productions[production][rhs_length + 1] != -1)
                rhs_length++;
            top -= rhs_length;            // pop the handle's states
            int nt = productions[production][0] - NON_TERMINAL_S;
            stack[top + 1] = goto_table[stack[top]][nt];
            top++;
        } else {
            printf("Error: Invalid action\n");
            return 1;
        }
    }
}

Observations and Conclusion:

Program -1:

a+(a*a), a+a*a, (a), a, a+a+a*a+a, etc. are accepted.

++a, a***a, +a, a*, ((a, etc. are rejected.

In the above output, as per the grammar provided, the input string is parsed through the
corresponding calling procedures; strings that are successfully parsed are accepted and the
others are rejected.
Program-2:

Quiz:
1. What do you mean by shift reduce parsing?
Shift-reduce parsing is a bottom-up parsing method used in language processing. It involves shifting input
symbols onto a stack and then reducing them based on grammar rules, gradually building the parse tree or
abstract syntax tree for the input.
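For example, with the grammar E → E + E | E * E | id, the string id + id * id can be shift-reduce parsed by one possible sequence of moves:

Stack        Input          Action
$            id+id*id$      shift
$id          +id*id$        reduce E → id
$E           +id*id$        shift
$E+          id*id$         shift
$E+id        *id$           reduce E → id
$E+E         *id$           shift
$E+E*        id$            shift
$E+E*id      $              reduce E → id
$E+E*E       $              reduce E → E * E
$E+E         $              reduce E → E + E
$E           $              accept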
2. Provide broad classification of LR parsers.
LR parsers are broadly classified into LR(0), SLR(1) (Simple LR), LALR(1) (Look-Ahead LR), and
CLR(1) (Canonical LR) parsers. SLR parsers are the simplest but have limitations; LALR parsers
merge CLR states that share the same core, giving more power than SLR with the same table size;
CLR parsers are the most powerful of the family and can handle the widest range of grammars.
3. Differentiate RDP and LALR parser.
RDP (Recursive Descent Parser) is a top-down parsing method, often hand-written and suitable for LL(k)
grammars. LALR (Look-Ahead LR) parser is a bottom-up parsing approach generated by tools like Yacc or
Bison, suitable for LR(1) grammars. LALR parsers have more extensive lookahead, making them more
powerful but complex to generate compared to RDP parsers.
4. How to do merging of itemsets?
To merge item sets when constructing parsing tables:

1. Identify sets with similar core items (production rules with position markers).
2. Combine item sets with identical core items.
3. Compute new lookahead symbols by uniting the lookahead sets.
4. Update transitions between sets as needed.
5. Continue merging until no further merges are possible.
6. Generate the parsing tables for the parser from the merged item sets. This simplifies the parser while
maintaining parsing power.

Suggested Reference:

1. Introduction to Automata Theory, Languages and Computation by John E. Hopcroft, Rajeev


Motwani, and Jeffrey D. Ullman.
2. Geeks for geeks: https://www.geeksforgeeks.org/recursive-descent-parser/
3. https://www.youtube.com/watch?v=odoHgcoombw
4. https://www.geeksforgeeks.org/lalr-parser-with-examples/

References used by the students:

Rubric wise marks obtained:


Rubrics   Knowledge of     Problem           Code          Completeness    Presentation   Total
          Parsing          implementation    Quality (2)   and accuracy    (2)
          techniques (2)   (2)                             (2)
          Good   Avg.      Good   Avg.       Good   Avg.   Good   Avg.     Good   Avg.
          (2)    (1)       (2)    (1)        (2)    (1)    (2)    (1)      (2)    (1)

Marks


Experiment No - 07

Aim: Implement a program for constructing Operator Precedence Parsing.


Date:

Competency and Practical Skills:


• Understanding of OPG and its role in compiler construction
• Ability to write precedence table for a given language
• Ability to develop OPG program using C compiler.

Relevant CO: CO2

Objectives:
By the end of this experiment, the students should be able to:
➢ Understand the concept of OPG its significance in compiler construction
➢ Write precedence relations for grammar
➢ Implement a OPG using C compiler

Software/Equipment: C compiler

Theory:

Operator Precedence Parsing is also a type of bottom-up parsing that can be applied to a class of
grammars known as Operator Grammars.

A Grammar G is Operator Grammar if it has the following properties −


• Production should not contain ϵ on its right side.
• There should not be two adjacent non-terminals at the right side of production.
Example 1 − Verify whether the following Grammar is operator Grammar or not.
E → E A E |(E)|id
A → +| − | ∗
Solution
No, it is not an operator Grammar as it does not satisfy property 2 of operator Grammar.
As it contains two adjacent Non-terminals on R.H.S of production E → E A E.
We can convert it into the operator Grammar by substituting the value of A in E → E A E.
E → E + E |E − E |E * E |(E) | id.
Operator Precedence Relations
Three precedence relations exist between the pair of terminals.

Relation Meaning

p <. q p has less precedence than q.

p >. q p has more precedence than q.

p =. q p has equal precedence to q.


Depending upon these precedence Relations, we can decide which operations will be executed or
parsed first.
Association and Precedence Rules
• If operators have different precedence
Since * has higher precedence than +

Example −
In the statement a + b * c
∴ + <. *
In the statement a * b + c
∴ * .> +
• If operators have Equal precedence, then use Association rules.
(a) Example − In the statement a + b + c, the + operators have equal precedence.
As '+' is left Associative in a + b + c
∴ (a + b) will be computed first, and then it will be added to c.
i.e., (a + b) + c
+ .> +
Similarly, '*' is left Associative in a * b * c
(b) Example − In a statement a ↑ b ↑ c here, ↑ is the Right Associative operator
∴ It will become a ↑ (b ↑ c)
∴ (b ↑ c) will be computed first.
∴ ↑<. ↑
• Identifier has higher precedence than all operators and symbols.
∴ θ <. id        $ <. id
  id .> θ        id .> $
  id .> )
  ( <. id
• $ has less precedence than all other operators and symbols.
  $ <. (         id .> $
  $ <. +         ) .> $
  $ <. *

Example 2 – Construct the Precedence Relation table for the Grammar.


E → E + E | E ∗ E | id
Solution
Operator-Precedence Relations

        id      +       *       $
id              .>      .>      .>
+       <.      .>      <.      .>
*       <.      .>      .>      .>
$       <.      <.      <.
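For instance, the string id + id * id is parsed with this table as follows (a sketch; the relation driving each move is shown on the left):

$ <. id          shift id
id .> +          reduce E → id
$ <. +           shift +
+ <. id          shift id
id .> *          reduce E → id
+ <. *           shift *
* <. id          shift id
id .> $          reduce E → id
* .> $           reduce E → E * E
+ .> $           reduce E → E + E
$                accept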


Advantages of Operator Precedence Parsing
• It is easy to implement.
Disadvantages of Operator Precedence Parsing
• An operator like minus can be unary or binary, so it can have different
precedences in different statements.
• Operator Precedence Parsing applies to only a small class of grammars.

Program:

#include<stdlib.h>
#include<stdio.h>
#include<string.h>

// function f to exit from the loop


// if given condition is not true
void f()
{
printf("Not operator grammar");
exit(0);
}

int main()
{
	char grm[20][20], c;

	// Here using flag variable,
	// considering grammar is not operator grammar
	int i, n, j, flag = 0;

// taking number of productions from user


scanf("%d", &n);
for (i = 0; i < n; i++)
scanf("%s", grm[i]);

for (i = 0; i < n; i++) {

	j = 2;           // scan the right-hand side afresh for each production
	c = grm[i][j];

	while (c != '\0') {

		if (c == '+' || c == '-' || c == '*' || c == '/')
			flag = 1;

		// two adjacent non-operator symbols, or an epsilon ($),
		// violate the operator grammar properties
		else if (c == '$' || (grm[i][j + 1] != '\0'
				&& grm[i][j + 1] != '+' && grm[i][j + 1] != '-'
				&& grm[i][j + 1] != '*' && grm[i][j + 1] != '/')) {
			flag = 0;
			f();
		}

		c = grm[i][++j];
	}
}

if (flag == 1)
	printf("Operator grammar");
return 0;
}

Observations and Conclusion:
Input :3
A=A*A
B=AA
A=$

Output : Not operator grammar

In the above example, the grammar is analysed as per the operator grammar rules; the productions
violate those rules (adjacent non-terminals in B=AA and $ on the right side), so it is not an
operator grammar.
Input :2
A=A/A
B=A+A

Output : Operator grammar

In the above example, the grammar is analysed as per the operator grammar rules and it satisfies
them (an operator is present between every two operands), so it is an operator grammar.

Quiz:
1. Define operator grammar.
An operator grammar defines how operators and operands combine in expressions, specifying rules for
parsing and evaluating expressions in programming languages. It focuses on the structure and interactions of
operators within expressions.
2. Define operator precedence grammar.
An operator precedence grammar defines how operators are grouped and ordered in expressions to resolve
syntax ambiguities, considering the precedence and associativity of operators in a programming language.
3. What are the different precedence relations in operator precedence parser?
In an operator precedence parser, the precedence relations are <. (less precedence / yields
precedence), .> (more precedence / takes precedence), and =. (equal precedence). These relations
determine the order of operator evaluation in expressions.
4. What are the different methods are available to determine relations?
Different methods to determine precedence relations in operator precedence parsing include using a
precedence table, considering operator associativity, examining parsing rules, and allowing for user-defined
precedence. These methods establish the order of operators in expressions.
5. What do you mean by precedence function
A precedence function, in operator precedence parsing, assigns and compares precedence levels for operators
to determine their evaluation order in expressions. It helps resolve operator precedence and associativity
during parsing.

Suggested Reference:
1. https://www.gatevidyalay.com/operator-precedence-parsing/
2. https://www.geeksforgeeks.org/role-of-operator-precedence-parser/

Rubric wise marks obtained:

Rubrics   Knowledge of     Knowledge of      Implementation   Completeness    Presentation   Total
          parsing          precedence        (2)              and accuracy    (2)
          techniques (2)   table (2)                          (2)
          Good   Avg.      Good   Avg.       Good   Avg.      Good   Avg.     Good   Avg.
          (2)    (1)       (2)    (1)        (2)    (1)       (2)    (1)      (2)    (1)

Marks

Experiment No - 08

Aim: Generate 3-tuple intermediate code for given infix expression.

Date:

Competency and Practical Skills:


• Understanding of intermediate code generation and its role in compiler construction
• Ability to write intermediate code for given infix expression.

Relevant CO: CO2

Objectives:
By the end of this experiment, the students should be able to:
➢ Understand the different intermediate code representations and its significance in compiler
construction
➢ Write intermediate code for given infix expression

Software/Equipment: C compiler

Theory:

Three address code is a type of intermediate code which is easy to generate and can be easily
converted to machine code. It makes use of at most three addresses and one operator to represent
an expression and the value computed at each instruction is stored in temporary variable generated
by compiler. The compiler decides the order of operation given by three address code.

Three address code is used in compiler applications:

Optimization: Three address code is often used as an intermediate representation of code during
optimization phases of the compilation process. The three address code allows the compiler to
analyze the code and perform optimizations that can improve the performance of the generated
code.
Code generation: Three address code can also be used as an intermediate representation of code
during the code generation phase of the compilation process. The three address code allows the
compiler to generate code that is specific to the target platform, while also ensuring that the
generated code is correct and efficient.
Debugging: Three address code can be helpful in debugging the code generated by the compiler.
Since three address code is a low-level language, it is often easier to read and understand than the
final generated code. Developers can use the three address code to trace the execution of the
program and identify errors or issues that may be present.
Language translation: Three address code can also be used to translate code from one
programming language to another. By translating code to a common intermediate representation,
it becomes easier to translate the code to multiple target languages.
General representation –
a = b op c
Where a, b or c represents operands like names, constants or compiler generated temporaries and
op represents the operator

Example-1: Convert the expression a * – (b + c) into three address code.
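A possible translation (with uminus denoting unary minus; the temporary names are illustrative) is:

t1 = b + c
t2 = uminus t1
t3 = a * t2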

Example-2: Write three address code for following code


for(i = 1; i<=10; i++)
{
a[i] = x * 5;
}
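One possible three-address translation of this loop (the labels and the 4-byte element size are assumptions) is:

     i = 1
L1:  if i > 10 goto L2
     t1 = x * 5
     t2 = i * 4
     a[t2] = t1
     i = i + 1
     goto L1
L2: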

Implementation of Three Address Code –


There are 3 representations of three address code namely
1. Quadruple
2. Triples
3. Indirect Triples
1. Quadruple – It is a structure which consists of 4 fields namely op, arg1, arg2 and result. op
denotes the operator and arg1 and arg2 denotes the two operands and result is used to store the
result of the expression.
Advantage –
• Easy to rearrange code for global optimization.
• One can quickly access value of temporary variables using symbol table.
Disadvantage –
• Contain lot of temporaries.
• Temporary variable creation increases time and space complexity.
Example – Consider expression a = b * – c + b * – c. The three address code is:
t1 = uminus c
t2 = b * t1
t3 = uminus c

t4 = b * t3
t5 = t2 + t4
a = t5
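In quadruple form this three address code becomes:

     op       arg1   arg2   result
(0)  uminus   c             t1
(1)  *        b      t1     t2
(2)  uminus   c             t3
(3)  *        b      t3     t4
(4)  +        t2     t4     t5
(5)  =        t5            a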

2. Triples – This representation doesn’t make use of extra temporary variable to represent a
single operation instead when a reference to another triple’s value is needed, a pointer to that
triple is used. So, it consist of only three fields namely op, arg1 and arg2.
Disadvantage –
• Temporaries are implicit and difficult to rearrange code.
• It is difficult to optimize because optimization involves moving intermediate code.
When a triple is moved, any other triple referring to it must be updated also. With
help of pointer one can directly access symbol table entry.
Example – Consider expression a = b * – c + b * – c
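The triples for this expression are as follows; an operand written as (n) refers to the value computed by triple n:

     op       arg1   arg2
(0)  uminus   c
(1)  *        b      (0)
(2)  uminus   c
(3)  *        b      (2)
(4)  +        (1)    (3)
(5)  =        a      (4)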

3. Indirect Triples – This representation makes use of pointers to a listing of all references to
computations, which is made separately and stored. It is similar in utility to the quadruple
representation but requires less space. Temporaries are implicit, and it is easier to
rearrange code.
Example – Consider expression a = b * – c + b * – c

Question – Write quadruple, triples and indirect triples for following expression : (x + y) * (y +
z) + (x + y + z)
Explanation – The three address code is:
t1 = x + y
t2 = y + z
t3 = t1 * t2
t4 = t1 + z
t5 = t3 + t4

Program:

#include <stdio.h>
#include <string.h>
#include <ctype.h>
#include <stdlib.h>

/* Operator table with precedences; dove() emits one three-address
   statement for the operator at j[i]. The operand-handling branches of
   dove() are reconstructed (single-letter operands, temporaries t1, t2, ...). */
int p[5] = { 0, 1, 1, 2, 2 }, c = 1, i, k, l, m, pi;
char sw[5] = { '=', '-', '+', '/', '*' }, j[20], a[5], b[5], ch[5];
void small();
void dove(int);

int main()
{
	printf("Enter the expression:");
	scanf("%19s", j);
	printf("\tThe Intermediate code is:\n");
	small();
	return 0;
}

void dove(int i)
{
	a[0] = a[1] = a[2] = b[0] = b[1] = b[2] = '\0';
	if (!isdigit(j[i + 2]) && (i < 2 || !isdigit(j[i - 2])))
	{	/* both operands are plain names */
		a[0] = j[i - 1];
		b[0] = j[i + 1];
	}
	if (isdigit(j[i + 2]))
	{	/* right operand is a temporary */
		a[0] = j[i - 1];
		b[0] = 't';
		b[1] = j[i + 2];
	}
	if (i >= 2 && isdigit(j[i - 2]))
	{	/* left operand is a temporary */
		b[0] = j[i + 1];
		a[0] = 't';
		a[1] = j[i - 2];
	}
	if (i >= 2 && isdigit(j[i + 2]) && isdigit(j[i - 2]))
	{	/* both operands are temporaries */
		a[0] = 't'; a[1] = j[i - 2];
		b[0] = 't'; b[1] = j[i + 2];
	}
	if (j[i] == '*')
		printf("\tt%d=%s*%s\n", c, a, b);
	if (j[i] == '/')
		printf("\tt%d=%s/%s\n", c, a, b);
	if (j[i] == '+')
		printf("\tt%d=%s+%s\n", c, a, b);
	if (j[i] == '-')
		printf("\tt%d=%s-%s\n", c, a, b);
	if (j[i] == '=')
		printf("\t%c=t%d\n", j[i - 1], --c);
	sprintf(ch, "%d", c);
	j[i] = ch[0];
	c++;
	small();
}

void small()
{
	pi = 0; l = 0;
	for (i = 0; i < (int)strlen(j); i++)
	{
		for (m = 0; m < 5; m++)
			if (j[i] == sw[m])
				if (pi <= p[m])
				{	/* remember the highest-precedence operator */
					pi = p[m];
					l = 1;
					k = i;
				}
	}
	if (l == 1)
		dove(k);
	else
		exit(0);
}
Observations and Conclusion:

In the above example, the user is asked to enter an infix expression, and the generated
three-address intermediate code is printed, one temporary per operator in precedence order.
Quiz:
1. What are the different implementation methods for three-address code?
Different implementation methods for three-address code (TAC) include quadruples, triples, indirect triples,
and direct execution. These methods vary in how they represent and manipulate intermediate code in
compilers.
2. What do you mean by quadruple?
A quadruple is a data structure in compilers that represents intermediate code using four fields for the
operator, two operands, and a result. It's commonly used for three-address code (TAC) to represent high-level
program constructs.
3. Define triple and indirect triple.
A "triple" is a three-field representation in three-address code (TAC) with an operator and two operands. An
"indirect triple" is also TAC but explicitly represents the result as a temporary variable, making result
handling more structured.
4. What is abstract syntax tree?
An Abstract Syntax Tree (AST) is a tree-like data structure in compilers that represents the syntactic structure
of a program, making it easier to analyze and manipulate program code.
Suggested Reference:
1. Introduction to Automata Theory, Languages and Computation by John E. Hopcroft, Rajeev
Motwani, and Jeffrey D. Ullman.
2. https://www.geeksforgeeks.org/introduction-to-intermediate-representationir/
3. https://cs.lmu.edu/~ray/notes/ir/

Rubric wise marks obtained:

Rubrics   Knowledge   Problem            Logic           Completeness    Ethics (2)   Total
          (2)         Recognition (2)    Building (2)    and accuracy
                                                         (2)
          Good  Avg.  Good   Avg.        Good   Avg.     Good   Avg.     Good  Avg.
          (2)   (1)   (2)    (1)         (2)    (1)      (2)    (1)      (2)   (1)

Marks


Experiment No - 09

Aim: Extract Predecessor and Successor from given Control Flow Graph.

Date:

Competency and Practical Skills:


• Understanding of Control flow structure in compilers
• Ability to write predecessor and successor for given graph

Relevant CO: CO3

Objectives:
By the end of this experiment, the students should be able to:
➢ Understand the concept control structure (in blocks) in compiler
➢ Write the predecessor and successor for given graph.

Software/Equipment: C /C++ compiler

Theory:

A basic block is a simple combination of statements. Except for entry and exit, the basic blocks
do not have any branches like in and out. It means that the flow of control enters at the beginning
and it always leaves at the end without any halt. The execution of a set of instructions of a basic
block always takes place in the form of a sequence.
The first step is to divide the sequence of three-address code into basic blocks. A new basic
block always begins with its first instruction, and instructions are added until a jump or a
label is reached. If no jumps or labels are identified, control flows from one instruction to
the next in sequential order.
The algorithm for the construction of the basic block is described below step by step:
Algorithm: The algorithm used here is partitioning the three-address code into basic blocks.
Input: A sequence of three-address codes will be the input for the basic blocks.
Output: A list of basic blocks with each three address statements, in exactly one block, is
considered as the output.
Method: We’ll start by identifying the intermediate code’s leaders. The following are some
guidelines for identifying leaders:
1. The first instruction in the intermediate code is generally considered as a leader.
2. The instructions that target a conditional or unconditional jump statement can be
considered as a leader.
3. Any instructions that are just after a conditional or unconditional jump statement can
be considered as a leader.
Each leader’s basic block will contain all of the instructions from the leader until the instruction
right before the following leader’s start.
Example of basic block:
Three Address Code for the expression a = b + c – d is:
T1 = b + c
T2 = T1 - d
a = T2
This represents a basic block in which all the statements execute in a sequence one after the other.
Basic Block Construction:
Let us understand the construction of basic blocks with an example:
Example:
1. PROD = 0
2. I = 1
3. T2 = addr(A) – 4
4. T4 = addr(B) – 4
5. T1 = 4 x I
6. T3 = T2[T1]
7. T5 = T4[T1]
8. T6 = T3 x T5
9. PROD = PROD + T6
10. I = I + 1
11. IF I <=20 GOTO (5)
Using the algorithm given above, we can identify the number of basic blocks in the above three-
address code easily-
There are two Basic Blocks in the above three-address code:
• B1 – Statement 1 to 4
• B2 – Statement 5 to 11
Transformations on Basic blocks:
Transformations on basic blocks can be applied to a basic block. While transformation, we don’t
need to change the set of expressions computed by the block.
There are two types of basic block transformations. These are as follows:
1. Structure-Preserving Transformations
Structure preserving transformations can be achieved by the following methods:
1. Common sub-expression elimination
2. Dead code elimination
3. Renaming of temporary variables
4. Interchange of two independent adjacent statements
2. Algebraic Transformations
In the case of algebraic transformation, we basically change the set of expressions into an
algebraically equivalent set.
For example, and expression
x:= x + 0
or x:= x *1
This can be eliminated from a basic block without changing the set of expressions.
Flow Graph:
A flow graph is simply a directed graph. For the set of basic blocks, a flow graph shows the flow
of control information. A control flow graph is used to depict how the program control is being
parsed among the blocks. A flow graph is used to illustrate the flow of control between basic
blocks once an intermediate code has been partitioned into basic blocks. When the beginning
instruction of the Y block follows the last instruction of the X block, an edge might flow from
one block X to another block Y.
Let’s make the flow graph of the example that we used for basic block formation:


Flow Graph for above Example

Firstly, we compute the basic blocks (which is already done above). Secondly, we assign the flow
control information.

Program:
// C++ program to find predecessor and successor in a BST
#include <iostream>
using namespace std;

// BST Node
struct Node
{
	int key;
	struct Node *left, *right;
};

// This function finds predecessor and successor of key in BST.
// It sets pre and suc as predecessor and successor respectively.
void findPreSuc(Node* root, Node*& pre, Node*& suc, int key)
{
	// Base case
	if (root == NULL) return;

	// If key is present at root
	if (root->key == key)
	{
		// the maximum value in left subtree is predecessor
		if (root->left != NULL)
		{
			Node* tmp = root->left;
			while (tmp->right)
				tmp = tmp->right;
			pre = tmp;
		}

		// the minimum value in right subtree is successor
		if (root->right != NULL)
		{
			Node* tmp = root->right;
			while (tmp->left)
				tmp = tmp->left;
			suc = tmp;
		}
		return;
	}

	// If key is smaller than root's key, go to left subtree
	if (root->key > key)
	{
		suc = root;
		findPreSuc(root->left, pre, suc, key);
	}
	else // go to right subtree
	{
		pre = root;
		findPreSuc(root->right, pre, suc, key);
	}
}

// A utility function to create a new BST node
Node *newNode(int item)
{
Node *temp = new Node;
temp->key = item;
temp->left = temp->right = NULL;
return temp;
}

/* A utility function to insert a new node with given key in BST */
Node* insert(Node* node, int key)
{
	if (node == NULL)
		return newNode(key);
	if (key < node->key)
		node->left = insert(node->left, key);
	else
		node->right = insert(node->right, key);
	return node;
}

// Driver program to test above function int


main()
{
int key = 65; //Key to be searched in BST

/* Let us create following BST


50
/ \
30 70
/ \ / \
20 40 60 80 */

Node *root = NULL;


root = insert(root, 50);
insert(root, 30);
insert(root, 20);
insert(root, 40);
insert(root, 75);
insert(root, 60);
insert(root, 80);

Node* pre = NULL, *suc = NULL;

findPreSuc(root, pre, suc, key);


if (pre != NULL)
cout << "Predecessor is " << pre->key << endl;
else
cout << "No Predecessor";

if (suc != NULL)
cout << "Successor is " << suc->key;
else
cout << "No Successor";
return 0;
}

Observations and Conclusions:

In the above example, the user obtains the predecessor and successor of the given key. For
key = 65, the program prints 60 as the predecessor and 75 as the successor.

Quiz:
1. What is flowgraph?
A flowgraph is a graphical representation of a program's control flow or execution flow. It typically
depicts the sequence of statements, functions, or program components, along with the decision
points and control transfers (e.g., loops, branches, and function calls) between them. Flowgraphs are
used for analyzing, visualizing, and understanding the structure and behavior of software programs,
making them a valuable tool in software engineering and program analysis.
2. Define DAG.
A Directed Acyclic Graph (DAG) is a finite directed graph that contains no directed cycles. In a
DAG, edges have a direction, meaning they go from one vertex to another, but you cannot follow a
sequence of edges and return to the same vertex. DAGs are often used in various computer science
and mathematical applications, including scheduling algorithms, dependency analysis, and
topological sorting.
3. Define Backpatching .
Backpatching is a compiler optimization technique used in code generation. It involves inserting
code or updating addresses in a generated intermediate code or assembly code to resolve and connect
control flow statements, such as conditional branches or jumps, to their corresponding target
locations. Backpatching is particularly useful for handling jumps to labels or addresses that are not
known at the time the code is initially generated. It ensures that control flow is correctly directed to
the intended destinations during program execution.
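The three classic backpatching primitives (makelist, merge, backpatch) can be sketched as follows. This is a minimal illustration, assuming emitted quads are stored in a vector and a jump's target starts out unfilled; `Quad` and `mergeLists` are our own illustrative names.

```cpp
#include <list>
#include <string>
#include <vector>
using namespace std;

// Hypothetical quad store: each emitted jump starts with an unfilled
// target (-1) that backpatching fills in later.
struct Quad {
    string op = "goto";
    int target = -1;
};

// makelist(i): a fresh patch list holding just quad index i.
list<int> makelist(int i) { return list<int>{i}; }

// merge(a, b): concatenate two patch lists into one.
list<int> mergeLists(list<int> a, const list<int>& b) {
    a.insert(a.end(), b.begin(), b.end());
    return a;
}

// backpatch(l, t): fill target t into every quad whose index is on l.
void backpatch(vector<Quad>& quads, const list<int>& l, int t) {
    for (int i : l) quads[i].target = t;
}
```

Merging the patch lists for quads 0 and 2 and backpatching them with target 7 fills in both jumps while leaving quad 1 untouched.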
4. What is the concept of a local transformation on a block of code?
Local transformations on a code block involve making optimizations or changes within a limited
scope, typically a basic block, to improve code efficiency or correctness without affecting the
broader program structure.

Suggested Reference:
1. Introduction to Automata Theory, Languages and Computation by John E. Hopcroft, Rajeev
Motwani, and Jeffrey D. Ullman
2. https://www.geeksforgeeks.org/data-flow-analysis-compiler/

References used by the students:


Rubric wise marks obtained:

Rubrics: Knowledge (2) | Problem Recognition (2) | Implementation (2) | Correctness (2) |
Documentation and Presentation (2) | Total
Scoring per rubric: Good (2) / Avg. (1)

Marks


Experiment No - 10
Aim: Study of Learning Basic Block Scheduling Heuristics from Optimal Data.

Date:

Competency and Practical Skills:


• Knowledge of basic computer architecture concepts
• Understanding of compiler design and optimization techniques
• Ability to analyze and interpret performance data
• Proficiency in programming languages such as C/C++ and assembly language

Relevant CO: CO4

Objectives:
By the end of this experiment, the students should be able to:
➢ Understanding the concept of basic block scheduling and its importance in compiler
optimization.
➢ Understanding the various heuristics used for basic block scheduling.
➢ Analyzing optimal data to learn the basic block scheduling heuristics.
➢ Comparing the performance of the implemented basic block scheduler with other commonly
used basic block schedulers.

Software/Equipment: Computer system

Theory:
Instruction scheduling is an important step for improving the performance of object code produced
by a compiler. Basic block scheduling is important in its own right and also as a building block for
scheduling larger groups of instructions such as superblocks. The basic block instruction scheduling
problem is to find a minimum-length schedule for a basic block (a straight-line sequence of code
with a single entry point and a single exit point) subject to precedence, latency, and resource
constraints. Solving the problem exactly is known to be difficult, and most compilers use a greedy
list scheduling algorithm coupled with a heuristic. The heuristic is usually hand-crafted, a
potentially time-consuming process. Modern architectures are pipelined and can issue multiple
instructions per time cycle. On such processors, the order that the instructions are scheduled can
significantly impact performance.

For example, consider multiple-issue pipelined processors. On such processors, there are multiple
functional units and multiple instructions can be issued (begin execution) each clock cycle.
Associated with each instruction is a delay or latency between when the instruction is issued and
when the result is available for other instructions which use the result. Here, we assume that
all functional units are fully pipelined and that instructions are typed. Examples of types of
instructions are load/store, integer, floating point, and branch instructions. We use the standard
labelled directed acyclic graph (DAG) representation of a basic block (see Figure 1(a)). Each node
corresponds to an instruction and there is an edge from i to j labelled with a positive integer l (i, j)
if j must not be issued until i has executed for l (i, j) cycles. Given a labelled dependency DAG for
a basic block, a schedule for a multiple-issue processor specifies an issue or start time for each
instruction or node such that the latency constraints are satisfied and the resource constraints are
satisfied. The latter are satisfied if, at every time cycle, the number of instructions of each type
issued at that cycle does not exceed the number of functional units that can execute instructions of
that type. The length of a schedule is the number of cycles needed for the schedule to complete; i.e.,
each instruction has been issued at its start time and, for each instruction with no successors, enough
cycles have elapsed that the result for the instruction is available. The basic block instruction
scheduling problem is to construct a schedule with minimum length.

Instruction scheduling for basic blocks is known to be NP-complete for realistic architectures. The
most popular method for scheduling basic blocks continues to be list scheduling. A list scheduler
takes a set of instructions as represented by a dependency DAG and builds a schedule using a best-
first greedy heuristic. A list scheduler generates the schedule by determining all instructions that
can be scheduled at that time step, called the ready list, and uses the heuristic to determine the best
instruction on the list. The selected instruction is then added to the partial schedule and the scheduler
determines if any new instructions can be added to the ready list.
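The loop just described can be sketched as follows. This is an illustrative single-issue version (one instruction per cycle, smallest ready index as a stand-in for a real heuristic), not the scheduler of any particular compiler.

```cpp
#include <algorithm>
#include <vector>
using namespace std;

// Dependency DAG edge: u -> {to, lat} means "to" may issue only once u
// has executed for lat cycles.
struct Edge { int to, lat; };

// Greedy list scheduling on a single-issue machine: each cycle, gather
// the ready instructions (all predecessors issued, latencies elapsed)
// and issue the one with the smallest index.
vector<int> listSchedule(int n, const vector<vector<Edge>>& succ) {
    vector<int> preds(n, 0), readyAt(n, 0), issue(n, -1);
    for (int u = 0; u < n; ++u)
        for (const Edge& e : succ[u]) ++preds[e.to];
    int done = 0;
    for (int cycle = 0; done < n; ++cycle) {
        int pick = -1;
        for (int v = 0; v < n; ++v)
            if (issue[v] < 0 && preds[v] == 0 && readyAt[v] <= cycle) {
                pick = v;
                break;
            }
        if (pick < 0) continue;  // stall: nothing ready this cycle
        issue[pick] = cycle;
        ++done;
        for (const Edge& e : succ[pick]) {
            --preds[e.to];
            readyAt[e.to] = max(readyAt[e.to], cycle + e.lat);
        }
    }
    return issue;  // issue[i] = cycle at which instruction i starts
}
```

For a DAG with edges 0 -> 1 (latency 2) and 0 -> 2 (latency 1), instruction 0 issues at cycle 0, instruction 2 fills the latency slot at cycle 1, and instruction 1 issues at cycle 2.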

The heuristic in a list scheduler generally consists of a set of features and an order for testing the
features. Some standard features are as follows. The path length from a node i to a node j in a DAG
is the maximum number of edges along any path from i to j. The critical-path distance from a node
i to a node j in a DAG is the maximum sum of the latencies along any path from i to j. Note that
both the path length and the critical-path distance from a node i to itself are zero. A node j is a
descendant of a node i if there is a directed path from i to j; if the path consists of a single edge, j is
also called an immediate successor of i. The earliest start time of a node i is a lower bound on the
earliest cycle in which the instruction i can be scheduled.
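The critical-path distance feature can be computed with one backward pass over the DAG. The sketch below assumes nodes are numbered in topological order (every edge goes from a lower index to a higher one); `Dep` and `criticalPath` are illustrative names.

```cpp
#include <algorithm>
#include <vector>
using namespace std;

// Dependency edge with latency.
struct Dep { int to, lat; };

// Critical-path distance from each node to the end of the DAG: the
// maximum sum of latencies along any outgoing path (zero for a node
// with no successors). Because edges go from lower to higher indices,
// a single backward pass suffices.
vector<int> criticalPath(int n, const vector<vector<Dep>>& succ) {
    vector<int> cp(n, 0);
    for (int u = n - 1; u >= 0; --u)
        for (const Dep& e : succ[u])
            cp[u] = max(cp[u], e.lat + cp[e.to]);
    return cp;
}
```

With edges 0 -> 1 (latency 3), 0 -> 2 (latency 1), and 1 -> 2 (latency 2), the distances are 5, 2, and 0 respectively; the longest path 0 -> 1 -> 2 dominates.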

In supervised learning of a classifier from examples, one is given a training set of instances, where
each instance is a vector of feature values and the correct classification for that instance, and is to
induce a classifier from the instances. The classifier is then used to predict the class of instances
that it has not seen before. Many algorithms have been proposed for supervised learning. One of
the most widely used is decision tree learning. In a decision tree the internal nodes of the tree are
labelled with features, the edges to the children of a node are labelled with the possible values of
the feature, and the leaves of the tree are labelled with a classification. To classify a new example,
one starts at the root and repeatedly tests the feature at a node and follows the appropriate branch
until a leaf is reached. The label of the leaf is the predicted classification of the new example.
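The classify-by-walking procedure can be sketched directly. The node layout below (a feature index at internal nodes, a label at leaves) is only one possible representation, chosen for brevity.

```cpp
#include <map>
#include <string>
#include <vector>
using namespace std;

// A decision-tree node: internal nodes hold a feature index to test and
// branch on the feature's value; leaves hold a class label.
struct TreeNode {
    int feature;                // feature index to test; -1 marks a leaf
    string label;               // class label, meaningful only at a leaf
    map<int, TreeNode*> child;  // feature value -> subtree
};

// Classify an instance (a vector of feature values) by walking from the
// root, testing the feature at each node and following the branch for
// the instance's value, until a leaf is reached.
string classify(const TreeNode* node, const vector<int>& instance) {
    while (node->feature >= 0)
        node = node->child.at(instance[node->feature]);
    return node->label;
}
```

A two-level tree that tests feature 0 at the root and feature 1 on one branch is enough to exercise the walk.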

Algorithm:

This experiment studies automatically learning a good heuristic for basic block scheduling using
supervised machine learning techniques. The novelty of the approach lies in the quality of the
training data (training instances are obtained from very large basic blocks, and an extensive,
systematic analysis is performed to identify the best features and to synthesize new ones) and in
the emphasis on learning a simple yet accurate heuristic.

Observations and Conclusion:

• Instruction scheduling is an important step for improving the performance of object code
produced by a compiler.
• Basic block scheduling is important as a building block for scheduling larger groups of
instructions such as superblocks.
• The basic block instruction scheduling problem is to find a minimum length schedule for a basic
block (a straight-line sequence of code with a single entry point and a single exit point) subject
to precedence, latency, and resource constraints.
• Solving the problem exactly is known to be difficult, and most compilers use a greedy list
scheduling algorithm coupled with a heuristic.
• Modern architectures are pipelined and can issue multiple instructions per time cycle. The order
that the instructions are scheduled can significantly impact performance.
• Instruction scheduling for basic blocks is known to be NP-complete for realistic architectures.
• The most popular method for scheduling basic blocks continues to be list scheduling, which
takes a set of instructions as represented by a dependency DAG and builds a schedule using a
best-first greedy heuristic.
• The heuristic in a list scheduler generally consists of a set of features and an order for testing
the features.
• Decision tree learning is one of the most widely used algorithms for supervised learning.

Quiz:
1. What is the basic block instruction scheduling problem?
The basic block instruction scheduling problem involves reordering and optimizing instructions
within a basic block of code to maximize hardware resource usage, reduce execution latency,
optimize control flow, manage registers efficiently, and handle instruction dependencies. It's a crucial
compiler optimization technique for improving program performance on modern processors.
2. Why is instruction scheduling important for improving the performance of object code
produced by a compiler?
Instruction scheduling is important for better object code performance because it optimizes
hardware resource usage, reduces execution latency, minimizes control flow overhead,
manages registers efficiently, handles dependencies, and enhances pipeline efficiency,
collectively leading to overall performance improvement.
3. What are the constraints that need to be considered in solving the basic block
instruction scheduling problem?
Constraints in basic block instruction scheduling include data dependencies, resource
limitations, control flow, instruction timing, register pressure, code size, readability, and target
architecture compatibility.
4. What is the most popular method for scheduling basic blocks?
The most popular method for scheduling basic blocks is List Scheduling. In List Scheduling,
instructions are placed in a ready list and scheduled based on resource availability, dependencies, and
priority. It is widely used due to its simplicity, effectiveness, and suitability for various architectures.
List Scheduling optimizes resource utilization and helps improve overall code performance. Other
methods, like ILP-based approaches and heuristics, are also used depending on specific requirements.
5. What is the heuristic used in a list scheduler?
In list scheduling, a common heuristic used to determine the order of instruction scheduling is the
"Critical Path Length." This heuristic selects instructions based on their impact on the critical path of
execution, aiming to minimize the schedule's overall execution time. It prioritizes instructions that
have the most influence on the program's performance.

Suggested Reference:

1. https://dl.acm.org/doi/10.5555/1105634.1105652
2. https://www.worldcat.org/title/1032888564

References used by the students:

Rubric wise marks obtained:

Rubrics: Knowledge (2) | Problem Recognition (2) | Documentation (2) | Presentation (2) |
Ethics (2) | Total
Scoring per rubric: Good (2) / Avg. (1)

Marks
