Compiler Design Lab Manual
1. Software Requirement:
http://www.megaleecher.net/Download_Turbo_For_Windows
LEX tool--flex-2.5.4a-1.exe
YACC tool--bison-2.4.1-setup.exe
Token
A token is a structure representing a lexeme that explicitly indicates its categorization for the
purpose of parsing. A token category is what in linguistics might be called a part of speech.
Examples of token categories include "identifier" and "integer literal", although the set of
token categories differs between programming languages. The process of forming tokens from an input
stream of characters is called tokenization. Consider this expression in the C programming
language: Sum = 3 + 2;
It is tokenized and represented by the following table:

Lexeme    Token category
Sum       identifier
=         assignment operator
3         integer literal
+         addition operator
2         integer literal
;         end of statement
A program that reads source code in C/C++ from an unformatted file and extracts various types of
tokens from it (e.g. keywords, variable names, operators, constant values).
Program:
#include<stdio.h>
#include<conio.h>
#include<ctype.h>
#include<string.h>
void main()
{
FILE *fi,*fo,*fop,*fk;
int flag=0,i=1;
char c,t,a[15],ch[15],file[20];
clrscr();
printf("\n Enter the File Name:");
scanf("%s",file);
fi=fopen(file,"r");
fo=fopen("inter.c","w");
fop=fopen("oper.c","r");
fk=fopen("key.c","r");
c=getc(fi);
while(!feof(fi))
{
if(isalpha(c)||isdigit(c)||c=='['||c==']'||c=='.')
fputc(c,fo);
else
{
if(c=='\n')
fprintf(fo,"\t$\t");
else
fprintf(fo,"\t%c\t",c);
}
c=getc(fi);
}
fclose(fi);
fclose(fo);
fi=fopen("inter.c","r");
printf("\n Lexical Analysis");
fscanf(fi,"%s",a);
printf("\n Line: %d\n",i++);
while(!feof(fi))
{
/* minimal completion of the truncated listing: list the tokens of each source line */
if(strcmp(a,"$")==0)
printf("\n Line: %d\n",i++);
else
printf("\t%s",a);
fscanf(fi,"%s",a);
}
fclose(fi);
getch();
}
Input Files:
Oper.c
Input.c
#include "stdio.h"
#include "conio.h"
void main()
{
int a=10,b,c;
a=b*c;
getch();
}
Description:
Lex:-
Lex is used in the manner depicted below. A specification of the lexical analyzer is prepared by
creating a program lex.l in the Lex language.
Then lex.l is run through the Lex compiler to produce a C program lex.yy.c.
The program lex.yy.c consists of a tabular representation of a transition diagram constructed
from the regular expressions of lex.l, together with a standard routine that uses the table to
recognize lexemes.
Lex.yy.c is run through the C compiler to produce an object program a.out, which is the lexical
analyzer that transforms an input stream into a sequence of tokens.
Algorithm:
1. First, a specification of a lexical analyzer is prepared by creating a program lexp.l in
the LEX language.
2. The Lexp.l program is run through the LEX compiler to produce an equivalent code
in C language named Lex.yy.c
3. The program lex.yy.c consists of a table constructed from the Regular Expressions of
Lexp.l, together with standard routines that uses the table to recognize lexemes.
4. Finally, lex.yy.c program is run through the C Compiler to produce an object
program a.out, which is the lexical analyzer that transforms an input stream into a
sequence of tokens.
$ lex lexp.l
$ cc lex.yy.c
$ ./a.out test.c
Description:
Lex is a computer program that generates lexical analyzers ("scanners" or "lexers"). Lex is commonly
used with the yacc parser generator.
Lex reads an input stream specifying the lexical analyzer and outputs source code implementing the
lexer in the C programming language.
1. A lexer or scanner is used to perform lexical analysis, or the breaking up of an input stream
into meaningful units, or tokens.
2. For example, consider breaking a text file up into individual words.
3. Lex: a tool for automatically generating a lexer or scanner given a lex specification (.l file).
The structure of a Lex file is intentionally similar to that of a yacc file; the file is divided into three
sections, separated by lines that contain only two percent signs, as follows:
<definition section>
%%
<rules section>
%%
<C code section>
The definition section is the place to define macros and to import header files written
in C. It is also possible to write any C code here, which will be copied verbatim into the
generated source file.
The rules section is the most important section; it associates patterns with C
statements. Patterns are simply regular expressions. When the lexer sees some text in
the input matching a given pattern, it executes the associated C code. This is the basis
of how Lex operates.
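A minimal specification illustrating the three sections might look like this. The example is a hypothetical word counter, not one of the lab programs; the counter name and patterns are illustrative:

```lex
%{
/* Definition section: copied verbatim into lex.yy.c */
#include <stdio.h>
int words = 0;   /* illustrative counter */
%}
%%
[a-zA-Z]+   { words++; }               /* rule: pattern, then C action */
.|\n        { /* ignore everything else */ }
%%
/* C code section: helper routines */
int yywrap(void) { return 1; }
int main(void)
{
    yylex();
    printf("%d words\n", words);
    return 0;
}
```

Built with `lex file.l` and `cc lex.yy.c`, it counts the alphabetic words on standard input.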
Description:-
The lex command reads File or standard input, generates a C language program, and writes it to a file
named lex.yy.c. This file, lex.yy.c, is a compilable C language program. A C++ compiler also can compile
the output of the lex command. The -C flag renames the output file to lex.yy.C for the C++ compiler. The
C++ program generated by the lex command can use either STDIO or
IOSTREAMS. If the cpp define _CPP_IOSTREAMS is true during a C++ compilation, the program uses
IOSTREAMS for all I/O. Otherwise, STDIO is used.
The lex command uses rules and actions contained in File to generate a program, lex.yy.c, which can be
compiled with the cc command. The compiled lex.yy.c can then receive input, break the input into the
logical pieces defined by the rules in File, and run the program fragments contained in the actions in File.
The generated program is a C language function called yylex. The lex command stores the yylex function
in a file named lex.yy.c. You can use the yylex function alone to recognize simple one-word input, or you
can use it with other C language programs to perform more difficult input analysis functions. For
example, you can use the lex command to generate a program that simplifies an input stream before
sending it to a parser program generated by the yacc command.
The yylex function analyzes the input stream using a program structure called a finite state machine.
This structure allows the program to exist in only one state (or condition) at a time. There is a finite
number of states allowed. The rules in File determine how the program moves from one state to
another. If you do not specify a File, the lex command reads standard input. It treats multiple files as a
single file.
Note: Since the lex command uses fixed names for intermediate and output files, you can have only one
program generated by lex in a given directory.
• yytext
– where text matched most recently is stored
• yyleng
– number of characters in text most recently matched
• yylval
– associated value of current token
• yymore()
– append next string matched to current contents of yytext
• yyless(n)
– remove from yytext all but the first n characters
• unput(c)
– return character c to input stream
• yywrap()
– may be replaced by user
– The yywrap method is called by the lexical analyzer whenever it inputs an EOF as the first character
when trying to match a regular expression
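For example, a rule can use yytext and yyleng together. The fragment below is an illustrative sketch, not part of any lab listing:

```lex
%%
[0-9]+   { printf("number '%s' has %d digits\n", yytext, yyleng); }
.|\n     { /* skip everything else */ }
%%
int yywrap(void) { return 1; }   /* non-zero: no further input files */
```

Here yytext points at the matched lexeme and yyleng holds its length, exactly as described in the list above.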
Files
y.output--Contains a readable description of the parsing tables and a report on conflicts generated by
grammar ambiguities.
y.tab.c----Contains the generated parser source file.
y.tab.h-----Contains definitions for token names.
yacc.tmp-----Temporary file.
yacc.debug----Temporary file.
yacc.acts-----Temporary file.
/usr/ccs/lib/yaccpar---Contains parser prototype for C programs.
/usr/ccs/lib/liby.a----Contains a run-time library.
Basic specification
Names refer to either tokens or nonterminal symbols. Yacc requires token names to be declared as
such. In addition, for reasons discussed in section 3, it is often desirable to include the lexical analyzer as
part of the specification file; it may be useful to include other programs as well. Thus, the sections are
separated by double percent "%%" marks (the percent sign '%' is generally used in yacc specifications as an
escape character). In other words, a full specification file looks like:
declarations
%%
rules
%%
programs
The Lex program recognizes only extended regular expressions and formats them into character
packages called tokens, as specified by the input file. When using the Lex program to make a lexical
analyzer for a parser, the lexical analyzer (created from the Lex command) partitions the input stream.
The parser (from the yacc command) assigns structure to the resulting pieces. You can also use other
programs along with programs generated by Lex or yacc commands.
A token is the smallest independent unit of meaning as defined by either the parser or the lexical
analyzer. A token can contain data, a language keyword, an identifier or the parts of language syntax.
The yacc program looks for a lexical analyzer subroutine named yylex, which is generated by the lex
command. Normally, the default main program in the Lex library calls the yylex subroutine. However, if
the yacc command is loaded and its main program is used, yacc calls the yylex subroutine. In this case
each Lex rule should end with:
return (token);
The yacc command assigns an integer value to each token defined in the yacc grammar file through a
#define preprocessor statement.
The lexical analyzer must have access to these macros to return the tokens to the parser. Use the yacc
-d option to create a y.tab.h file, and include the y.tab.h file in the Lex specification file by adding the
following lines to the definition section of the Lex specification file:
%{
#include "y.tab.h"
%}
Alternatively, you can include the lex.yy.c file in the yacc output file by adding the following lines after the
second %% (percent sign, percent sign) delimiter in the yacc grammar file:
#include "lex.yy.c"
The yacc library should be loaded before the Lex library to get a main program that invokes the yacc
parser. You can generate Lex and yacc programs in either order.
Aim:
Study the LEX and YACC tools and evaluate an arithmetic expression with parentheses, unary and
binary operators.
Algorithm:
1) Get the input from the user and parse it token by token.
2) First identify the valid inputs that can be given to the program.
3) The inputs include numbers, functions like LOG, COS, SIN, TAN, etc., and operators.
4) Define the precedence and the associativity of the various operators like +, -, /, *, etc.
5) Write code for saving the answer into memory and displaying the result on the screen.
6) Display the possible error messages that can be associated with the calculation.
7) Display the output on the screen, else display the error message on the screen.
Program: CALC.L
%{
#include<stdio.h>
#include<stdlib.h>
void yyerror(char *);
#include "y.tab.h"
%}
%%
[a-z] { yylval=*yytext-'a'; return VARIABLE; }
[0-9]+ { yylval=atoi(yytext); return INTEGER; }
[-+()=/*\n] { return *yytext; /* operators, '=' and newline as literals */ }
[ \t] ;
%%
int yywrap(void)
{
return 1;
}
CALC.Y
%token INTEGER VARIABLE
%left '+' '-'
%left '*' '/'
%{
#include<stdio.h>
int yylex(void);
void yyerror(char *);
int sym[26];
%}
%%
PROG : /* empty */
| PROG STMT '\n'
;
/* STMT and EXPR rules completed following the standard calculator grammar */
STMT : EXPR { printf("%d\n",$1); }
| VARIABLE '=' EXPR { sym[$1]=$3; }
;
EXPR : INTEGER
| VARIABLE { $$=sym[$1]; }
| EXPR '+' EXPR { $$=$1+$3; }
| EXPR '-' EXPR { $$=$1-$3; }
| EXPR '*' EXPR { $$=$1*$3; }
| EXPR '/' EXPR { $$=$1/$3; }
| '(' EXPR ')' { $$=$2; }
;
%%
void yyerror(char *s)
{
fprintf(stderr,"%s\n",s);
}
int main(void)
{
yyparse();
return 0;
}
Aim: Using JFLAP, create a DFA from a given regular expression. All types of errors must be checked
during the conversion.
What is JFLAP: -
The JFLAP program makes it possible to create and simulate automata. Learning about automata with pen
and paper can be difficult, time-consuming and error-prone. With JFLAP we can create automata of
different types, and it is easy to change them as we want. JFLAP supports the creation of DFAs and NFAs,
Regular Expressions, PDAs, Turing Machines, Grammars and more.
Setup: -
JFLAP is available from its homepage (www.JFLAP.org). From there press "Get JFLAP" and follow the
instructions. You will notice that JFLAP has a .jar extension, which means that you need Java to run
JFLAP. With Java correctly installed you can simply select the program to run it. You can also run it from
a command console in the file's current directory with: java -jar JFLAP.jar
Using JFLAP: -
When you first start JFLAP you will see a small menu with a selection of eleven different automata and
rule sets. Choosing one of them opens the editor where you create the chosen type of automaton. Usually
you create automata containing states and transitions, but Grammars and Regular Expressions are
instead created with a text editor.
The toolbar contains six tools, which are used to edit automata.
Attribute Editor Tool, changes properties and position of existing states and transitions.
State Creator Tool, creates new states.
Transition Creator Tool, creates transitions.
Deletion Tool, deletes states and transitions.
Undo/Redo, steps the selected object backward or forward through its edit history.
Choose Regular Expression in the main menu, then just type the expression in the textbox. Definitions
for Regular Expressions in JFLAP:
* Kleene Star
+ Union
! Empty String
Correctly written expressions can then be converted to an NFA. To convert your expression select
Convert → Convert to NFA. The conversion will begin with two states and a transition with your Regular
Expression. With the (D)e-expressionify Transition tool you can break down the Regular Expression into
smaller parts. Each transition will contain a sub expression. The next step is to link every rule with
lambda transitions. Add new transition between states that should be connected with the Transition
Tool. If you are unsure what to do you can select Do Step to automatically make the next step. If you
want the NFA immediately Do All creates the whole NFA for you.
You can notice how the conversion differs depending on how the Regular Expression looks. For
example the expression a+b results in a fork, where either 'a' or 'b' can be chosen.
Create LL(1) parse table for a given CFG and hence Simulate LL(1) Parsing
Aim: Using JFLAP create LL(1) parse table for a given CFG and hence Simulate LL(1) parsing
Implementation: -
Step 1: - Choose Grammar in JFLAP and insert the grammar for which you want to create the LL(1) parsing table.
Step 2: - Build the LL(1) parse table from the grammar.
Step 3: - Now select Parse to use that table to create a parse tree.
Result:- We created an LL(1) parse table for the given CFG and simulated LL(1) parsing.
Aim: Using JFLAP, create an SLR(1) parse table for a given grammar. Simulate parsing and output the
parse tree in proper format.
Implementation: -
Step 1: - Choose Grammar in JFLAP and insert the grammar for which you want to create the SLR(1) parsing table.
Step 2: - Build the SLR(1) parse table from the grammar.
Step 3: - Now select Parse to use that table to create a parse tree.
Result:- We created an SLR(1) parse table for the given grammar, simulated parsing, and output the
parse tree in proper format.
Algorithm:
First(X)-
1. If X is a terminal, then First(X) = {X}.
2. If X -> e is a production (e denotes the empty string), then add e to First(X).
3. If X -> Y1 Y2 ... Yk is a production, add First(Y1) - {e} to First(X); if Y1 can derive e, also add First(Y2) - {e}, and so on. If all of Y1 ... Yk can derive e, add e to First(X).
Follow(A)-
1. Place $ in Follow(S), where S is the start symbol.
2. If there is a production A -> aBb, add First(b) - {e} to Follow(B).
3. If there is a production A -> aB, or a production A -> aBb where b can derive e, add Follow(A) to Follow(B).
Program:
#include<stdio.h>
#include<math.h>
#include<string.h>
#include<ctype.h>
#include<stdlib.h>
#include<conio.h>
int n,m=0,p,i=0,j=0;
char a[10][10],f[10];
void follow(char c);
void first(char c);
int main(){
int i,z;
char c,ch;
clrscr();
printf("Enter the no of productions:\n");
AIM :
To write a C program to generate a three address code for a given expression.
PROGRAM:
struct three
{
char data[10],temp[7];
}s[30];
void main()
{
char d1[7],d2[7]="t";
int i=0,j=1,len=0;
FILE *f1,*f2;
clrscr();
f1=fopen("sum.txt","r");
f2=fopen("out.txt","w");
while(fscanf(f1,"%s",s[len].data)!=EOF)
len++;
itoa(j,d1,10);
strcat(d2,d1);
strcpy(s[j].temp,d2);
strcpy(d1,"");
Input (sum.txt):
out = in1 + in2 + in3 - in4
Output (out.txt):
t1=in1+in2
t2=t1+in3
t3=t2-in4
out=t3
RESULT:
Thus a C program to generate a three address code for a given expression is written, executed and the
output is verified.
AIM:
To check the validity of a given arithmetic expression using LEX and YACC.
ALGORITHM:
Step1: Start the program.
Step2: Read the input expression and return its tokens using lex.
Step3: Check the validity of the given expression according to the rules using yacc.
Step4: Using the expression rules, print the result for the given values.
Step5: Stop the program.
LEX PART:
%{
#include "y.tab.h"
%}
%%
[a-zA-Z_][a-zA-Z_0-9]* return id;
[0-9]+(\.[0-9]*)? return num;
[+/*] return op;
. return yytext[0];
\n return 0;
%%
int yywrap()
{
return 1;
}
YACC PART:
%{
#include<stdio.h>
int valid=1;
%}
%token num id op
%%
start : id '=' s ';'
      ;
s : id x
  | num x
  | '-' num x
  | '(' s ')' x
  ;
x : op s
  | '-' s
  |
  ;
%%
int yyerror()
{
valid=0;
printf("\nInvalid expression!\n");
return 0;
}
/* driver completing the listing */
int main()
{
printf("Enter the expression:\n");
yyparse();
if(valid)
printf("\nValid expression!\n");
return 0;
}
AIM:
To write a program for implementing Symbol Table using C.
ALGORITHM:
Step1: Start the program for performing insert, display, delete, search and modify option in symbol table
Step2: Define the structure of the Symbol Table
Step3: Enter the choice for performing the operations in the symbol Table
Step4: If the entered choice is 1, search the symbol table for the symbol to be inserted. If the symbol is
already present, it displays “Duplicate Symbol”. Else, insert the symbol and the corresponding address in
the symbol table.
Step5: If the entered choice is 2, the symbols present in the symbol table are displayed.
Step6: If the entered choice is 3, the symbol to be deleted is searched in the symbol table.
Step7: If it is not found in the symbol table it displays “Label Not found”. Else, the symbol is deleted.
Step8: If the entered choice is 5, the symbol to be modified is searched in the symbol table.
c=b[j];
if(isalpha(toascii(c)))
{
p=malloc(c);
add[x]=p;
d[x]=c;
RESULT:
Thus the program for symbol table has been executed successfully.
AIM:
ALGORITHM:
Step1: Write the factorial program using both a for loop and a do-while loop to study the optimization
technique.
Step2: In the for loop, variable initialization is performed first and the condition is checked next. If the
condition is true, the corresponding statements are executed and the specified increment/decrement
operation is performed.
Step3: The for loop repeats until the condition fails.
Step4: In the do-while loop, the variable is initialized and the statements are executed first; only then are
the condition check and the increment/decrement operation performed.
Step5: When comparing the for and do-while loops for optimization, do-while is preferred here because the
statements are executed before the condition is checked, so any problem in the result shows up during
execution itself, without waiting for the condition to be evaluated.
Step6: Finally, when considering code optimization in loops, do-while is best with respect to performance
in this exercise.
pr[z].l=op[n-1].l;
strcpy(pr[z].r,op[n-1].r);
z++;