System Software Study Guide
System Software Study Guide
Sc CT Semester V Part A Questions and Answers System Software (BCS354) UNIT I Define language processor. A language processor is software which bridges a specification gap between the specification or execution gap. 2. What is an interpreter? An interpreter is a language processor which bridges an execution gap without generating a machine language program. 3. What the activities in language processing? Program generation activities. Program execution activities. 4. What is program translation? The program translation model bridges the execution gap by translating a program written in a PL, called the source program into an equivalent program in the machine or assembly language of the computer system, called the target program. 5. What are the characteristic of program translation? The program must be translated before it can be executed. The translated pgm may be saved in agile The pigmy must be retranslated 6. Define Language Processing? Language Processing =Analysis of SP +Synthesis of TP 7. What are the components of Language specification? Lexical Rules Syntax Rules Semantic Rules 8. Define Forward References? Forward Reference of a Program Entity is a e reference to the Entity which proceeds its definition in the program 1. 9. Define Language Processor Pass? Language processor pass is the processing of every statement in source program or its equivalent representation, to perform a language processing function 10. What is meant by Semantic Action? All actions performed by the front end, except lexical &syntax Analysis are called Semantic actions 11. What are the actions included in semantic actions? Checking semantic validity Determining the meaning of SP Constructing an IR 12. What is syntactic analysis? Syntax analysis processes the string of tokens built by lexical analysis to determine the statement class. 13. What is lexical analysis? Lexical analysis identifies the lexical units in a source statement. It then classifies the units into different lexical classes. 14. What are the important issues in code generation? To determine the places where the intermediate results should be kept Determine which instructions should be used Determine which addressing mode should be used
15. What is formal language grammar? Formal language grammar is a set of rules which precisely specify the sentence of L. 16. What is the formal language? A language L can be considered as a connection of valid sentences. Each sentence can be looked upon as a sequence of words. An each word as a sequence of letters or graphics symbols. A language specified in this manner is known as formal language. 17. Define non terminal symbols. A non terminal symbol is the name of the syntax category of a language. It can be written in between<.>. 18. What is a grammar? A grammar G of language Lg is a quraple(,SNT,S,P) where is the alphabet SNT is the set of NTs S is the distinguished symbol P is the set of productions 19. What are the classifications of grammars? Type 0 grammar Type 1 grammar Type 2 grammar Type 3 grammar Operator grammar 20. Define operator grammar. An operator grammar is a grammar non of its productions conatin two or more consecutive NTs in RHS alternative. 21. Define binding. Binding is the association of an attribute of a program entity with a value. 22. Define Static & Dynamic binding. Static binding is a binding performed before the execution of the program begins. Dynamic binding is a binding performed after the execution of the program begins. 23. What are the inputs for language processor developments tools? Specification of a grammar of language L Specification of semantic actions to be performed 24. What is LEX? LEX accepts an input specification, which consists of two components. Specification of string Specification of semantic aimed at building an IR 25. Write about YACC? Each string specification in the input to YACC resembles a grammar production. The parser generated by YACC performs reductions according to this grammar.
Unit II 1. Define: search data structure. The search data structure is a set of entries, each entry accommodating the information concerning one entity. Each entry is assumed to contain key fields, which form the basis for a search. 2. What are the different formats in language processing? * Entry format. * Fixed and variable length entries. *Hybrid entry format. 3. Define: hybrid entry format. It is used between the fixed and variable entry formats to combine the access efficiency of the fixed entry format or the memory efficiency of the variable entry format. 4. What are the operations on search data structure? 1. Add: Add the entry of the symbol. 2. Search: Search and locate the entry of the symbol. 3. Delete: Deleting the entry of the symbol. 5. Write about the table organization. A table is a linear data structure .Two points can be made concerning tables as search structures. 6. What are the important points to be noted in table organization? 1. Given the location of the entry of the table. It is meaningful to talk of the next entry of the table for the previous entry of the table. 2. The address of entering a table can be determined by its entry number. 7. What is sequential search organization? * Make a prediction concerning the entry of the search data structure, with symbols S may be occupying .Let it be e. * Let Se be the symbol occupying e th entry. Compare S with Se.Exit with success if two match. * Repeat the above steps till it can be concluded. 8. Define: Hash table organization. * E: =h(x), where h=hash function and x=data. * Exit with success if S=Se & with failure if entry e is unoccupied. * Repeat steps 1 & 2 with different functions h,h,. . 9. Define: Hashing function. While hashing the representation S , the symbol to be searched is treated as a binary number. The hashing function performs a numerical transformation on this number to obtain e. 10. What are the steps involved in hashing function? * The hashing function should not be sensitive to the symbols in Sp.It should perform equally well for different source program. * The hashing function h should execute reasonably fast. 11. What are the two different classes of the hashing function? 1. Multiplication Function. 2. Division Function.
12. Define: Multiplications functions. These functions are analogous to function used in random number generation. 13. Define: Division functions. If n is a prime number then it is called as Prime division hashing. 14. What are the types of collision handling methods? 1. Rehashing. 2. Overflow chaining. 15. Define: Rehashing. This uses a sequence of hashing functions h1, h2, h3, to resolve collisions. A popular technique called sequential rehashing used in securing relation. 16. Define: Overflow chaining. It accommodates the colliding entries in a separating table called the overflow table. To facilitate this, a pointer field is added to each entry. 17. Define : Linked list and tree structure organization. Linked list and tree structure organization are non linear in structure. That is the elements of the searched data structure are not located in adjoining memory areas. 18. Define: Nested search structures. These nested structures are also known as Multi list structure. Each symbol table entry contains two additional fields. One is field list and another is next field list. 19. Define: Heaps. A heap is a non linear data structure which permits allocation and deal location of entity in a random order. 20. Define: Scanning. Scanning is the process of recognizing the lexical components in a source string. 21. Define: Entry format. Each entry in a search structure is a set of fields. The structure consists of 2 parts, a fixed part and a variant part. 22. Define: Binary tree. Each node in the tree is a symbol entry with two pointer fields.[left and right pointer].The left pointer points to a search tree containing all symbols <S. While, a right pointers points to a search tree containing all symbols >S. 23. Define: Extended stack. Two new pointers are there: Record Based Pointer: * Pointing to the first word of the last record in the stack. * The first word of each record is a reserved pointer. 24. What are the concepts used in Memory Management? 1.Reference count. 2.Garbage collection. 25. Define: Reference count. In these techniques the system associates a reference count to each memory area to indicate the number of its active users.
UNIT III 1. What is assembly language? An assembly language is a machine independent, low level programming language, which is specific to a certain computer system. 2. What are the kinds of statements in assembly program? 1. Imperative statements 2. Declarative statements 3. Assembler Directives. 3. What is a literal? A literal is an operand with the syntax =< value>. It differs from the constants because its location cannot be specified in the assembly program. 4. What is assembler directive? Assembler directives instruct the assembler to perform certain actions during the assembly of the program. It is described as START <constant> 5. What are the steps to develop a design specification? 1. Identify the information 2. Design a suitable data structures. 3. Determine the processing necessary to obtain and maintain the information 4. Determine the processing necessary to perform the task. 6. What are the data structures used in Synthesis phase? 1. Symbol table 2. Mnemonics table 7. What are the mnemonic codes used in assembler? 1. ADD 2. SUB 3. MULT 4. MOV C 5. MOV M 6. CMP
8. Write about Imperative statement. Imperative statements indicate an action to perform during the execution the assembler program. Each imperative statement typically translates the language into machine instructions. 9. Write the format of declarative statements. [Label] DS <constant value> [Label] DC<value> 10. What are assembly directives? The assembly directives instruct the assembler to perform certain action during the assembly of a program. 11. What about the Synthesize phase? 1. Obtain the machine opcode corresponding mnemonic form 2. Obtain the address of numeric. 3. Synthesis the machine instructions.
12. Write about the tasks in analysis phase? 1. Isolate the label, mnemonic code and the operand fields 2. If a label is present enter the pair. 3. Check the validity of the mnemonic opcode 4. Perform LC parsing 13. What is a macro? Macro is a unit of specification for program generation through expansion. It consists of macro name, set of formal parameters in the body of code. 14. What are the strings in lexical substitution? 1. An ordinary string 2. The name of the formal parameter which is proceeded by the character& 3. The name of the preprocessor variable which is proceeded by the character & 15. Define macro definition. A macro definition is enclosed between a macro header statements and a macro end statement. Macro definitions are typically located at the start of the program. 16. What are the statements in macro definition? 1. Macro prototype statements 2. One or more model statements 3. Macro preprocessor statements 17. Define macro call. A macro is called by writing the macro name in the mnemonic field of assembly statements. Syntax <macroname>[<actualparameter specification>] 18. Define macro expansion. A macro call leads to macro expansion. During a macro expansion a macro call statement is replaced by a sequence of assembly statements. 19. What are the key notions in macro expansion? 1. Expansion time control flow 2. Lexical substitution 20. Define nested macro calls. A model statement in a macro may constitute a call on another macro, such calls are known as nested macro call. 21. What are the advanced macro facilities? a. Facilities for the alteration of flow of control b. Expansion time variable c. Attribute parameters 22.What are the features provided to facilitate alteration of flow of control? 1. Expansion time Sequencing symbols 2. Expansion time statements 23. What is expansion time variable? Expansion time variables are variables, which can only be used during the expansion of macro calls. What is the syntax for defining attribute? <attribute name><formal parameter specification> What is a macro preprocessor? The macro preprocessor accepts an assembly program containing definitions and calls and translates it into an assembly program which doesnt contain any macro definition or calls
24. 25.
Unit IV 1.What is a compiler? A compiler bridges a semantic gap between a programming language and an execution domain. 2.What are two aspects of compilation? Generate code to implement the meaning of a source program in the execution domain. Provide diagnostics for violations of programming language semantics in a source program.
3.What are the features of Programming Language? Data Types Data Structures Scope Rules Control Structures
4.What is a Data Type? Data Type is the Specification of one thing ie the Legal value for a variable of specified type and also the legal operation on legal values. 5.What are the Data Structure used in a programming language? A programming language permits data structures like Arrays, Stacks, Records, and Lists etc 6.What do you meant by Scope Rules, Control Structures, and Code Optimization? The scope of a program entity is the path of the program where the entity is accessible. Collection of language features for altering the flow of control during the execution of the program. This includes conditional transfer of controls, conditional execution, Iteration control, and Procedure calls. Code Optimization aims at improving the execution efficiency of a program, which is achieved in two ways. Redundancy in the senior program are eliminated Computation in program are rearranged 7.What are the 2 points regarding Optimization? It seeks to improve a program rather than an algorithm is used Efficient code generation for a specific target machine and belongs to the backend of a program 8.Write down some of the Transformations? Compile Time Evaluation Elimination of Sub expression Dead code Elimination Frequency Reduction Strength Reduction Local & global Optimization 9.What is dead code Elimination? Code, which can be waited from a program altering its result, is called dead code. It is detected by checking whether the value is used anywhere in the program.
10.What is Frequency Reduction? Moving code from a block of a program is executed very frequently to another part of the program, which is executed. 11.What is Strength Reduction? Replace the occurrence of the time consuming operation by an occurrence of a faster operation. 12.What is Local Global Optimization? Optimization transformations are applied over a small segment of a program in case of Local Optimization. The scope of LP is s basic block, which is an essentially sequential segment in the source program. Transformations are applied over the program unit in case of Global Optimization. It requires more analyses effort to establish the feasibility of an Optimization 13.What is Control flow analysis? 1.Control Flow Analysis contains two concepts. They are Predecessors Successors 2.Paths Path is a sequence of edges 3.Ancestors & Descendants 4.Dominators &Post Dominators. 14. What are Live Variables? A variable var is said to be Live at a program pointer pi, in the basic block bi if the value contained in it at pi is likely to be used during subsequent execution of the program. 15.What are Interpreters? Interpreter usage is motivated by two reasons. Efficiency in certain environments Simplicity 16.What are the 3 main components of Interpreters? Symbolic Table Data Store Data Manipulation Routines 17.What are Pure and Impure Interpreters? In Pure interpreters the source program is retained in the source for all through the Interpretation An Impure interpreter performs some preliminary processing of the source program to reduce analysis overheads during interpretations. 18.What is Program flow Graph? A PFG for a program (p) is a directed Graph GP={N1,E,N0} where N is a set of basic blocks &E is a set of edges &N0 is the Starting node. 19.What is Data Flow Analysis? Dataflow analysis Technique analysis the use of data in a program to collect information for the purpose of Optimization .the informations called dataflow information which is computed at the entry and exit of each basic block in GP. 20.What is a Basic Block? Sequence of Program Statements (s1,s2,sn)such that only sn that mean a transfer of control statement and only s1 can be a destination of a transfer of control statement.
UNIT V 1. What are the steps to be followed in the execution of the program? 1. Translation of the program. 2. Linking of the program. 3. Relocation of the program. 4. Loading of the program. 2. Define program relocation? Program relocation is the process of modifying the address used in the address sensitive instruction of a program such that the program can execute correctly from the designated area of the memory. 3. What is linking? Linking is the process of binding an external reference to the correct link time address. 4. Define Binary Programs? Binary program is the machine language program comprising a set of program unit SP such that for all Pi belong to SP. 5. What is the difference between relocatable and non-relocatable program? The difference between the relocatable and non-relocatable program is the availability of information concerning the address sensitive instructions in it. 6. What is the self-relocating program? A self-relocating program is the program, which can perform the relocation of its own sensitive instructions. 7. Define Overlay? An overlay is the part of the program, which has the same load origin as some other parts of the program. 8. Define Software tool? A software tool is the software program which 1. Interfaces a program with the entity generating its input data 2. Interfaces the result of the program with the entity consuming them. 9. What are the steps in program development? 1. Program design, coding, documentation. 2. Preparation of programs. 3. Program translation. 4. Program testing, Debugging. 5. Performance tuning. 6. Reformatting the data. 10. What is the profile monitor? A profile monitor is the software tool that collects information regarding the execution behavior of the program (eg) the amount of execution time consumed by its module 11. What is program preprocessing technique? Program preprocessing technique are used to support static analysis of programs, test data generated and documentation use this technique. 12. What is program instrumentation? Program instrumentation implies insertion of statements in a program. The instrumented program is translated using a standard translator.
13. What are the forms of text editor? 1) Line editor 2) Steam editor 3) Screen editor 4) Word processors 5) Structure Editors.
14. What is programming environment? The programming environment is a software system that provides integrated facilities for program creation, editing, execution ,testing and debugging 15. What is a template? Template is a unit of information concerning PL/CS. It consists of some fixed syntax information and few place holders. 16. What is a user interface? User interface is a interaction of a user with an application. UI functionalities are issuing of commands & exchanging of data 17. What are the ways to implement command dialogs? 1) Command language 2) Command menus 3) Direct Manipulation 18. What is user interface management system? The user interface management system accepts specification of the presentation dialog semantics to produce the presentation and dialog managers of the UI respectively. 19. What are the two UIMS? 1) Menu lay 2) Hyper card 20. What are the facilities provided by debug monitors? 1) Setting brake points 2) Display values of variables 3) Assigning new values to variables 4) Testing user defined assertions. 21. What are the components in the object module? Header Program Relocation table. Linking table. 22. What is the classification of the programs? Non relocatable programs. Relocatable programs. Self-relacatable programs. 23. What are the categories of information? 1. Binary image. 2. External References. 3. Public Definitions. 4. Debugging Information. 5. Miscellaneous Information. 24. What is the format of LINKER invocation? LINKER <object module names>,<executable file>,<load origin>,<list of library files> 25. What are the components of overlay structured programming? A permanently resident portion, called the root. A set of overlays. END
&
10
Language processors A language processor is a software which bridges a specification or execution gap. Specification gap is the semantic gap between two specifications of the same task. Execution gap is the gap between the semantics of programs written in the different programming languages. The program form of input to a language processor as the source program and to its output as the target program. The languages in which these programs are written are called source language and target language. A language processors is defined to meet practical requirements 1. A language translator bridges an execution gap to the machine language of a computer system. An assembler is a language translator whose source language is assembly language. A compiler is another language translator. 2. A detranslator bridges the same execution gap as the language translator, but in the reverse direction. 3. A preprocessor is a language processor which bridges an execution gap. 4. A language migrator bridges the specification gap between two programming languages. Figure 1.3 Language processor Interpreters An interpreter is a language processor which bridges an execution gap without generating a machine language program. We say that an interpreter executes a program written in a programming language. Figure 1.4 Interpreter Page 4 Page 3
Problem oriented and procedure oriented languages A classical solution is to develop a programming language such that the programming language domain is very close or identical to the application domain. Such PL can only be used for specific application; hence they are called problem oriented languages. They have large execution gap and is bridged by the translator or interpreter and does not concern the software designer. A procedure oriented language provides general purpose facilities required in most application domains. 2. Language processing activities It is divided into two activities Program generation activities Program execution activities
1. 2. Program generation
Figure 1.6
Program Generation
Page 5
11
The program generator is a software system which accepts the specification of a program to be generated, and generates a program in the target PL. We call this the program generator domain. The specification gap is now the gap between the application domain and the program generator domain. Example A screen handling program. Program execution Two popular models for program execution are translation and interpretation. 1. Program translation
This model bridges the execution gap by translating a program written in a PL, called the source program, into an equivalent program in the machine or assembly language of the computer system, called the target program. Characteristics of the program translation model are o o o 2. A program must be translated before it can be executed. The translated program may be saved in a file. A program must be retranslated following modifications.
Program interpretation
The interpreter reads the source program and stores it in its memory. During interpretation it takes a source program statement, determines its meaning and performs actions which implement it. In interpretation the CPU uses a program counter to note the address of the next instruction to be executed. This instruction is subject to the instruction execution. 1. 2. 3. Fetch the instruction. Decode the instruction. Execute the instruction.
The characteristic of interpretation are o o 3. The source program is retained in the source form itself. A statement is analyzed during its interpretation.
Fundamentals of language processing Language processing = Analysis of source program + Synthesis of target program
The collection of language processor components engaged in analyzing a source program as the analysis phase. Components engaged in synthesizing a target program constitute the synthesis phase.
A specification of the source language forms the basis of source program analysis. The specification consist of 1. 2. 3. Lexical rules formation of valid lexical units. Syntax rules formation of valid statements Semantic rules associate meaning with valid statements.
The synthesis phase consist of two main activities o Creation of data structures in the target program
12
Phases and passes of a language processor Forward reference A forward reference of a program entity is a reference to the entity which precedes its definition in the program. Language processor pass A language processor pass is the processing of every statement in a source program, or its equivalent representation, to perform a language processing function. Intermediate representation of programs An intermediate representation is a representation of a source program which reflects the effect of some, but not all, analysis and synthesis tasks performed during language processing. In a two pass language processor the first pass concerned exclusively with source language issues. Hence it is called the front end of the language processor. The second pass is concerned with program synthesis for a specific target language. Hence it is called the back end of the language processor. Properties of intermediate representations are o o o Ease of use Processing efficiency Memory efficiency
Semantic actions Semantic actions includes o o o A toy compiler 1. Lexical analysis (Scanning) Lexical analysis identifies the lexical units in a source statement. It then classifies the units into different lexical classes. Ex ids, constants, reserved ids etc. 2. Syntax analysis (Parsing) Syntax analysis processes the string of tokens built by lexical analysis to determine the statement classes. Ex. assignment statement, if statement etc. 3. Semantic analysis Semantic analysis identifies the sequence of actions necessary to implement the meaning of a source statement. 4. Memory allocation Checking semantic validity of constructs in source program Determining the meaning of semantic program Constructing an intermediate representation
13
The memory requirement of an identifier is computed from its type, length and dimensionality, and memory is allocated to it. The address of memory area is entered in the symbol table. 5. Code generation It is to select the appropriate instructions. The important issues are 1. 2. 3. 4. Determine the places where the intermediate results should be kept. Determine which instruction should be used for type conversion operation. Determine which addressing modes should be used for accessing variables. Fundamentals of language specification
1. Programming language grammars A language L can be considered to be a collection of valid sentences. Each sentence can be looked upon as a sequence of words, and each word as a sequence of letters or graphic symbols acceptable in L. a language specified in this manner is known as formal language. Terminal symbols, alphabet and strings The alphabet of L, denoted by the Greek symbol , is the collection of symbols in its character set. A symbol in the alphabet is known as a terminal symbol (T) of L. the alphabet can be represented as = {a,b,,z,0,1,,9} Here the symbols {,,and} are part of the notation. We call them metasymbols to differentiate them from terminate symbols. A string is a finite sequence of symbols. We will represent string by Greek symbols , , etc. Thus = axy is a string over . The length of the string is the number of symbols in it. Nonterminal symbols A nonterminal symbol is the name of a syntax category of a language. An nonterminal written as a single capital letter or as a name enclosed between <.>.
Productions A production also called a rewriting rule is a rune of the grammar, a production is of the form A nonterminal symbol ::= String of Terminals and Nonterminals The symbol | (standing for or) is used to separate the strings on the RHS <Article> Grammar A grammar G of a language LG is a quadruple (, SNT, S, P) where ::= a | an | the
14
SNT S P Derivation
is the alphabet of LG, that is set of Ts is the set of NTs is the distinguished symbol, and is the set of productions.
Let production P1 of grammar G be of the form P1 : A ::= And let be a string such that = A , then replacement of A by in string constitutes a derivation according to production P1. Reduction Let production P1 of grammar G be of the form P1 : A ::= And let be a string such that = , then replacement of by A in string constitutes a reduction according to production P1. Parse trees A sequence of derivation or reductions reveals the syntactic structure of a string with respect to G. We depict the syntactic structure in the form of a parse tree.
Classification of grammars 1. Type-0 grammars These grammars, known as phrase structure grammars, contain productions of the form ::= where both and can be strings Ts and NTs. 2. Type-1 grammars These grammars are known as context sensitive grammars because their productions specify that derivation or reductions of strings can take place only in specific contexts. A Type-1 production has the form A ::= Thus, a string in a sentential form can be replaced by A. 3. Type-2 grammars
15
These grammars impose no context requirements on derivations or reductions. A typical Type-2 production is of the form A :: = Which can be applied independent of its context. These grammars are therefore known as context free grammars (CFG). CFGs are ideally suited for programming language specifications. 4. Type-3 grammars Type-3 grammars are characterized by productions of the form A :: = t B | t A :: = Bt | t or
5. Operator grammars (OG) An operator grammar is a grammar none whose productions contain two or more consecutive NTs in any RHS alternative. Ambiguity in grammatic specification Ambiguity implies the possibility of different interpretations of a source string. The string can have more than one meaning associated with it. Eliminating ambiguity The normal methods of achieving this is to use a hierarchy of NTs in the grammar, and to associate the reduction or derivation of an operator with an appropriate NT. Binding and binding times 1. Binding A binding is the association of an attribute of a program entity with a value. Binding time is the time at which a binding is performed. The various binding times are 1. 2. 3. 4. 5. Language definition time of L. Language implementation time of L. Compilation time of P. Execution init time of proc. Execution time of proc.
2. Static and dynamic bindings Static binding A static binding is a binding performed before the execution of a program begins. Dynamic binding
16
A dynamic binging is a binging performed after the execution of a program has begun.
5.
Page 31
Two LPDTs widely used in practice. These are, the lexical analyzer gernerator LEX, and the parser generator YACC.
LEX LEX accepts an input specification which consists of two components. The first component is a specification of strings representing the lexical units in L. this is in the form of regular expressions. The second component is a specification of semantic actions. YACC Each string specification in the input to the YACC resembles a grammar production. The parser generated by YACC performs reductions according to this grammar
UNIT II Elements of assembly language programming An assembly language is a machine dependent, low level programming language, which is specific to a certain computer system. The basic features of programming are 1. Mnemonic operation codes. 2. Symbolic operands. 3. Data declarations. Assembly language statements An assembly program contains three kinds of statements 1. Imperative statements. 2. Declarative statements. 3. Assembler directives. A simple assembly scheme Design specification of an assembler Identify the information necessary to perform a task. Design a simple data structures to record the information. Determine the processing necessary to obtain and maintain the information. Determine the processing necessary to perform the task.
1. 2. 3. 4.
3. Pass structure of assemblers Two pass translation This can handle forward references easily. LC processing is performed in the first pass and symbols defined in the program are entered into the symbol table. The second pass synthesizes the target form using the address information found in the symbol table.
17
The first pass performs analysis of the source program. Second pass performs synthesis of the target program. The first pass constructs an intermediate representation of the source program. It consists of data structures, symbol table and intermediate code. Fig : 4.7 Overview of two pass assembly - Page 95
Single pass translation The problem of forward references is tackled using a process called backpatching. The operand field of an instruction containing a forward reference is left blank initially. The address of the forward reference symbol is put into this field when its definition is encountered. 4. Design of a two pass assembler Tasks performed by the passes of a two pass assembler are Pass I 1. Separate the symbol, mnemonic opcode and operand fields. 2. Build the symbol table. 3. Perform LC processing. 4. Construct intermediate representation. Pass II Synthesize the target program. Pass I performs analysis of the source program and synthesis of the intermediate representation while Pass II processes the intermediate representation to synthesis the target program. Advanced assembler directives ORGIN The syntax of the directive is ORGIN <address spec > Where <address spec>is an <operand spec> or < constant>. The ORGIN statement is useful when the target program does not consist of consecutive memory words. EQU The EQU statement has the syntax <symbol> EQU <address spec> where <address spec> is an <operand spec> or <constant>. LTORG At every LTORG statement, the assembler allocates memory to the literals of a literal pool. The pool contains all literals used in the program since the start of the program or since the last LTORG statement. Pass I of the assembler. Pass I uses the following data structures: OPTAB A table of mnemonic opcodes and related information. SYMTAB Symbol table LITTAB A table of literals used in the program. OPTAB contains the fields mnemonic opcode, class and mnemonic information. The class field indicates whether the opcode corresponds to an imperative statement (IS), a declarative statement (DL) or an assembler directive (AD). A SYMTAB entry contains the fields address and length. A LITTAB entry contains the fields literal and address. Processing of an assembly statement begins with the processing of its label field. If it contains a symbol, the symbol and the value in LC is copied into a new entry of SYMTAB. For a declaration or assembler directive statement, the routine mentioned in the mnemonic info field is called to perform appropriate processing of the statement.
18
Algorithm 4.1 Assembler first pass Intermediate code form The intermediate code unit consisting of three fields 1. Address 2. Representation of the mnemonic opcode 3. representation of operands
Page 100
Address Opcode operands Mnemonic fields The mnemonic field contains a pair of the form (statement class, code) where statement class can be one of IS, DL and AD standing for imperative statement, declarative statement and assembler directive. For an imperative statement, code is the instruction opcode. For declerations and assembler directives , code is an ordinal number within the class, DC 01 DS 02 For assembler directives the codes are START 01 END 02 ORIGIN 03 EQU 04 LTORG 05 Pass II of the assembler Algorithm 4.2 Assembler Second Pass - Page 107
UNIT III Macros and Macro Processors A macro is a unit of specification for program generation through expansion. A macro consists of a name, a set of formal parameters and a body of code. Two kinds of macro expansion can be readily identified. 1. 2. Lexical expansion : It implies replacement of a character string by another character string during program generation. Semantic expansion : It implies generation of instructions to the requirements of a specific usage.
1. Macro definition and call Macro definition A macro definition is enclosed between a macro header statement and macro end statement. Macro definitions are typically located at the start of a program. A macro definition consists of 1. 2. 3. A macro prototype statement. One or more model statements. Macro preprocessor statements.
The macro prototype statement declares the name of a macro and the names and kinds of its parameters. The macro prototype statement has the following syntax: <macro name> [<formal parameter spec> [,]]
19
where <macro name> appears in the mnemonic field of an assembly statement dn <formal parameter spec> is of the form & <parameter name> [<parameter kind>] Macro call A macro is called by writing the macro name in the mnemonic field of an assembly statement. The macro call has the syntax <macro name> [<actual parameter spec> [,..]]
where actual parameter typically resembles an operand specification in an assembly language statement. Example MACRO INCR MOVER ADD MOVEM MEND 2. Macro expansion A macro call leads to macro expansion. During macro expansion, the macro call statement is replaced by a sequence of assembly statements. Two key notions concerning expansion are: 1. 2. Expansion time control flow: This determines the order in which model statements are visited during macro expansion. Lexical substitution: It is used to generate an assembly statement from a model statement.
Flow of control during expansion The default flow of control during expansion is sequential. A preprocessor statement can alter the flow of control during expansion. Some statements are either never visited during expansion called conditional expansion. Certain statements are repeatedly visited during expansion called expansion time loop. The flow of control is implemented using a macro expansion counter (MEC). Algorithm Outline of macro expansion - Page 134 Lexical substitution A model statement consist of 3 types of strings 1. 2. 3. An ordinary string The name of a formal parameter which proceeded by the character &. The name of a preprocessor variable, which is also preceded by the character &.
During lexical expansion, string of type 1 retained without substitution. The other two types are replaced by the values of the formal parameters. The rules for determining the value of a formal parameter depend on the kind of parameters.
20
a. Positional parameters A positional parameter is written as &<parameter name>. The <actual parameter spec> in a call on a macro using positional parameters is an <ordinary string>. The value of positional formal parameter XYZ is determined by the rule of positional association as follows: 1. 2. Find the ordinal position XYZ in the list of format parameters in the macro prototype statement. Find the actual parameter specification occupying the same ordinal position in the list of actual parameters in the macro call statement.
b. Keyword parameters For keyword parameters <parameter name> is an ordinary string and <parameter kind> is the string =. The <actual parameter> is written as <formal parameter name> = <ordinary string>. The value of a formal parameter XYZ is determined by the rule of keyword association as follows: 1. 2. Find the actual parmeter specifificaiotn which has the forme XYZ = <ordinary string>. Let <ordinary string> in the specification be the string ABC. Then the value of formal parameter XYZ is ABC.
Ex: MACRO INCR_M MOVER ADD MOVEM MEND The macro calls INCR_M .. INCR_M are now equivalent. MEM_VAL=A, INCR_VAL-B, REG=AREG INCR_VAL=B, REG=AREG, MEM_VAL = A &MEM_VAL=, &INCR_VAL=, ®= ®,&MEM_VAL ®, &INCR_VAL ®, &MEM_VAL
c. Default specification of parameters Default specification of parameters is useful in situations where a parameter has the same value in most calls. Default specification has the syntax &<parameter name> [<parameter kind> [ <default value>]] Ex:MACRO INCR_D MOVER ADD MOVEM MEND
&MEM_VAL=, &INCR_VAL=, ® = AREG ®, &MEM-VAL ®, &INCR_VAL ®, &MEM_VAL
Consider the following call INCR_D INCR_D INCR_D MEM_VAL=A, INCR_VAL=B INCR_VAL=B, MEM_VAL=A INCR_VAL=B, MEM_VAL=A, REG = BREG
21
The first two calls use the default values. The third call overrides the default value for the REG with the value BREG. d. Macros with mixed parameter lists A macro may be defined to use both positional and keyword parameters. In this case, all positional parameters must precede all keyword parameters. For example in the macro call SUMUP A,B,G=20,H=X
A,B are positional parameters while G,H are keyword parameters. e. Other uses of parameters Formal parameters can also appear in the label and opcode fields of model statements. Ex: MACRO CALC &X, &Y, &OP= MULT, &LAB = &LAB MOVER AREG, &X &OP AREG, &Y MOVEM AREG, &X MEND Expansion of the call CALC A,B,LAB=LOOP leads the code LOOP MOVER MULT MOVEM AREG, A AREG, B AREG, A
3. Design of a macro preprocessor The macro preprocessor accepts an assembly program containing definitions and calls and translates it into an assembly program which does not contain any macro definitions or calls. Fig 5.6 A schematic of a macro preprocessor Page 145 Design overview Tasks involved in macro expansion 1. 2. 3. 4. 5. 6. Identify macro call in the program. Determine the values of formal parameters. Maintain the values of expansion time variables declared in a macro. Organize expansion time control flow. Determine the values of sequencing symbols. Perform expansion of model statement. For each task a four step procedure is followed. 1. 2. 3. 4. Identify the information necessary to perform a task. Design a suitable data structure to record the information. Determine the processing necessary to obtain the information. Determine the processing necessary to perform the task.
22
A table called the macro name table (MNT) is designed to hold the names of all macros defined in a program. A macro name is entered in this table. While processing a match indicates that the current statement is a macro call. Determine values of formal parameters. A table called actual parameter table (APT) is designed to hold the values of formal parameters during the expansion of a macro call. Two items of information are needed to construct this table, names of formal parameters and default values of keyword parameters. For this purpose a table called parameter default table (PDT) is used for each macro. If a macro call does not specify value for some parameter, its default value would be copied from PDT to APT.
Maintain expansion time variables. An expansion time variables table (EVT) is maintained for this purpose. The table contain pair of the form ( <EV name>, <value>) Organize expansion time control flow. The body of a macro, i.e the set of preprocessor statements and model statements in it, is stored in a table called the macro definition table (MDT) for use during macro expansion. The flow of control during macro expansion determines when a model statement is to be visited for expansion. Determine the values of sequencing symbols. The sequencing symbols table (SST) is maintained to hold this information. The table contains pairs of the form (<sequencing symbol name>, <MDT entry #>) where <MDT> entry # is the number of the MDT entry which contains the model statement defining the sequencing symbol. Perform expansion of model statement. This is a trivial task given the following 1. 2. 3. MEC (Macro expansion counter) points to the MDT entry containing the model statement. Values of formal parameters and EVs are available in APT and EVT, respectively. The model statement defining a sequencing symbol can be identified from SST.
Data structures The tables APT (Actual parameter table), PDT (Parameter default table) and EVT (Expansion time variables) contain pairs which are searched. In order for simplification many data structures are used. They are
23
1.
2. 3. 4. 5. 6. 7. 8. 9.
MNT Macro name table. It contains macro name, number of positional parameters, number of keyword parameters, number of expansion time variables, MDT pointer (MDTP), KPDTAP pointer (KPDTP), and SSTAB pointer (SSTP). PNTAB Parameter name table. It contains parameter name. EVNTAB EV name table. It contains EV name. SSNTAB SS name table. It contains SS name. KPDTAB Keyword parameter default table. It contains parameter name, default value. MDT Macro definition table. It contains label, opcode, operands. APTAB Actual parameter table. It contains Value. EVTAB EV table. It contains value. SSTAB SS table. It contains MDT entry number.
PNTAB and KPDTAB are constructed by processing the prototype statement. Entries are added to EVNTAB and SSNTAB as EV declarations and SS definitions/references are encountered. MDT entries are constructed while processing the model statements and preprocessor statement in the macro body. An entry is added to SSTAB when the definition of a sequencing symbol is encountered. APTAB is constructed while processing a macro call. EVTAB is constructed at the start of expansion of a macro. Processing of macro definitions Algorithm 5.2 Processing of macro definitions. Page 150.
The algorithm is invoked for every macro definition in the program. Macro expansion The following data structures are used to perform macro expansion. APTAB Actual parameter table. EVTAB EV table. MEC Macro expansions counter. APTAB_ptr APTAB pointer. EVTAB_ptr EVTAB pointer. Algorithm 5.3. Macro expansion Nested macro calls Two provisions are required to implement the expansion of nested macro calls: 1. 2. Each macro under expansion must have its own set of data structures. An expansion nesting counter (Nest_cntr) is maintained to count the number of nested macro calls. Nest_cntr is incremented when a macro call is recognized and decremented when a MEND statement is encountered. Page 153.
The macro calls are expanded in LIFO manner, a practical solutions is to use a stack to accommodate the expansion time data structures. The stack consist of expansion records, each expansion record accommodating one set of expansion time data structures. When a nested macro call is recognized, a new expansion record is pushed on the stack. At MEND the record is popped off the stack. Design of a macro assembler One organization is two pass macro expansions. The first pass collects information about the symbols defined in a program and the second pass performs macro expansion. Pass structure of a macro assembler
24
Pass I 1. 2. Pass II 1. 2. 3. 4. Pass III 1. Target code generation. Integrating Pass I of the assembler with the preprocessor would give two pass structures. Pass I 1. 2. 3. 4. 5. Pass II 1. Target code generation. Macro definition processing. Macro expansion. Memory allocation, LC processing and SYMTAB construction. Processing of literals. Intermediate code generation. Macro expansion. Memory allocation and LC processing. Processing of literals. Intermediate code generation. Macro definition processing SYMTAB construction
UNIT V 7. Text Editors It is now recognized that interactive text editor should be considered as the primary interface to the computer for all types of knowledge workers as they compose, organize, study and manipulated computer-based information. Overview of the editing process An interactive editor is a computer program that allows a user to create and revise a target document. The document-editing process is an interactive user-computer dialogue designed to accomplish four tasks: 1. Select the part of the target document to be viewed and manipulated. 2. Determine how to format this view on-line and how to display it. 3. Specify and execute operations that modify the target document. 4. Update the view appropriately. Selection involves first traveling through the document to locate the area of interest. This is accomplished by next screenful, bottom, and find pattern. Filtering extracts the point of interest such as the next screenful of text or next statement. Formatting then determines how the result of the filtering will be seen as visible representation on a display screen or other device.
25
In editing phase the target document is altered with operations such as insert, delete, replace and copy. User interface The user of an interactive editor is presented with a conceptual model of the editing system. Some of the early line editors simulated the world of the keypunch. These editors allowed operations on numbered sequences of 80-character card-image lines, either with a single line or on an integral number of lines. The user interface is concerned with the input device, the output device and the interaction language of the system. Input devices are used to enter elements of the text being edited, to enter commands, and to designate editable elements. These devices are categorized as 1. Text devices. 2. Button devices. 3. Locator devices. Text or string devices are typically typewriter-like keyboards on which a user presses and releases keys, sending a unique code for each key. Button or choice devices generate an interrupt or set a system flag, usually causing invocation of an associated application-program action. They are set of special function keys on an alphanumeric keyboard. Buttons can be simulated in software by displaying text strings or symbols on the screen. Locator devices are two-dimensional analog-to-digital converters that position cursor symbol on the screen by observing the users movement of the device. The most common such devices are mouse and the data tablet. The data tablet is a flat, rectangular, electromagnetically sensitive panel. Either a ballpointpen-like stylus is moved over the surface. The tablet returns to a system program the coordinates of the position of the data tablet at which the stylus is located. Voice-input devices, which translate spoken words to their textual equivalents, may prove to be the text input of the future. The more modern professional workstation support multiple proportionally spaced character fonts to produce realistic facsimiles of hard-copy documents. The typed specification often requires the user to remember the exact form of all commands, or at least their abbreviations. The function-key interface addresses these deficiencies. Function-key command specification is typically coupled with cursor-key movement for specifying operands, which eliminates much typing. As an alternative to shifting function keys, the standard alphanumeric keyboard is often overloaded to simulate function keys. The user may press a control key simultaneously with a normal alphanumeric key to generate a new character that is interpreted like a function key. The menu oriented user interface is an attempt to address these problems. A menu is a multiple choice set of text strings or icons, which are graphic symbols that represent objects or operations. One problem with a menu-oriented system can arise when there are many possible actions and several choices are required to complete an action. Editor structure Figure 7.8 Typical editor structure - Page 436 Command language processor: - It accepts input from the users input devices and analyzes the tokens and syntactic structure of the commands. So it functions much like a lexical and syntactic phase of a compiler. The command language processor may produce an intermediate representation of the desired editing operations. This intermediate representation is then decoded by an interpreter that invokes the appropriate semantic routines. The semantic routine involves traveling, editing, viewing and display functions. Editing operations are always specified explicitly by the user. Display operations are specified implicitly by the other three categories of operations. The traveling and viewing operations may be invoked either explicitly by the user or implicitly by the editing operations. In editing a document, the start of the area to be edited is determined by the current editing pointer maintained by the editing component which is the collection of modules dealing with editing tasks.
26
3. Loaders and Linkers Loading Which brings the object program into memory for execution. Relocation Which modifies the object program so that it can be loaded at an address different from the location originally specified. Linking Which combines two or more separate object programs and supplies the information needed to allow references between them. A loader is a system program that performs the loading function. Many loaders also support relocation and linking. Some systems have a linker to perform the linking operation and a separate loader to handle relocation and loading. 3.1 Basic loader functions Basic loader function is to bringing an object program into memory and starting its execution. Design of an absolute loader All functions of absolute loader are performed in a single pass. The header record is checked to verify that the correct program has been presented for loading. As each Text record is read, the object code it contains is moved to the indicated address in memory. When the End record is encountered, the loader jumps to the specified address to begin execution of the loader program. Algorithm for an absolute loader Page 126 Each pair of bytes from the object program record must be packed together into one byte during loading. For example, the machine operation code for an STL instruction would be represented by the pair of characters 1 and 4. When the loader reads these, they will occupy two bytes of memory. However this operation code must be stored as one byte during loading. Therefore most machines store object programs in a binary form. This is not well to printing or reading by human beings. Simple Bootstrap Loader When a computer is first turned on or restated, a special type of absolute loader, called a bootstrap loader, is executed. This loads the first program to be run by the computer- usually an operating system. Source code for bootstrap loader for SIC/XE. Page 128. The bootstrap itself begins at address 0 in the memory of the machine. It loads the operating system at address 80. Each byte of object code to be loaded is represented on device F1. However there is no Header record, End record, or control information. This code is loaded into memory starting at address 80. After all the object code from device F1 has been loaded, the bootstrap jumps to address 80, which begins the execution of the program. The subroutine GETC reads one character form device F1 and converts it from the ASCII character code. The subroutine GETC jumps to address when an end-of-file is read from device F1. The main loop of the bootstrap keeps the address of the next memory location to be loaded in register X. the TIXR instruction is then used to add 1 to the value in register X. 3.2 Machine-dependent Loader Features
27
One disadvantage of absolute loader is the need for the programmer to specify the actual address at which it will be loaded into memory. Very small computer with a small memory it is not difficult. On large and more advanced machines, the situation is not easy. Efficient sharing of the machine requires that we write relocatable program instead of absolute ones. The need for program relocation is an indirect consequence of change to large and more powerful computers. It is dependent on machine characteristics. The process of linking usually involves relocation of some of the routines being linked together. Relocation Loaders that allow for program relocation are called relocating loader or relative loaders. There are two methods for relocating. Method 1 A modification record is used to describe each part of the object code that must be changed when the program is relocated. Each modification specifies the starting address and length of the field whose value is to be altered. It then describes the modification (add / subtract address) to be performed. The modification record scheme is a convenient means for specifying program relocation; however it is not well suited for use with all machine architectures.
Method 2 A machine that uses direct addressing and has a fixed instruction format, it is more efficient to specify relocation. Along with the Text record a relocation bit is associated with each word of the object code. If the relocating bit corresponding to a word of object code is set to 1, the programs starting address is to be added to this word when the program is relocated. A bit value of 0 indicates no relocation is needed. If a Text record contains fewer than 12 words of object code, the bits corresponding to unused words are set to 0. Some computers provide a hardware relocation capability that eliminates some of the need for the loader to perform program relocation. The conversion of this relative address to actual addresses is performed as the program is executed. Program Linking Program linking involves linking external references between programs. Programs are made up of control sections. The control sections are assembled together or they could be assembled independently of one another. The loader has no way of knowing which control sections were assembled at the same time. Each program contains some list of items. The labels in the beginning and ends of the lists are external symbols. They are available for use in linking. In some programs, the expression must be assembled as an external reference even though the final result will be an absolute value independent of the location at which the programs are loaded. The Modification record instructs the loader to add / subtract the address of the program.
28
For the references that are instruction operands, the calculated values after loading do not always appear to be equal. This is because there is an additional address calculation step involved for programcounter relative instructions. Algorithm and Data structures for a Linking Loader The modification records for relocation are used so that the linking and relocation function are performed using the same mechanism. The input for the loader consists of a set of object programs that are to be linked together. The linking loader usually makes two passes over its input, just as an assembler does. Pass1 assigns addresses to all external symbols, and pass 2 performs the actual loading, relocation and linking. The main data structure is ESTAB. It is used to store the name and address of each external symbols in the set of control sections being loaded. Two other important variables are PROVADDR Beginning address in memory where the linked programs is to be loaded. Its value is supplied to the loader by the operating system. CSADDR It contains the starting address assigned to the control section currently being scanned by the loader. This value is added to all relative addresses within the control section to convert them to actual addresses. Pass 1 of loader Algorithm for pass 1 of a linking loader Page 143. During the first pass, the loader is concerned only with Header and Define record types in the control sections. The control section name from the Header record is entered into ESTAB, with value given by CSADDR. All external symbols appearing in the Define record for the control section are also entered into the ESTAB. When the End record is read, the control section length CSLTH is added to CSADDR. At the end of pass 1 , ESTAB contains all external symbols defined in the set of control sections together with the address assigned to each. Pass 2 of loader Algorithm for pass 2 of a linking loader Page 144 Pass 2 performs the actual loading, relocation and linking of the program. CSADDR contains the actual starting address of the control section currently being loaded. As each Text record is read, the object code is moved to the specified address. When Modification record is encountered, the symbol whose value is to be used to modification is looked up in ESTAB. This value is added or subtracted from the indicated location in memory. The last step is to transfer the control the loaded program to begin execution. The End record for each control section may contain the address of the first instruction in that control section. If no control section contains a transfer record, the loader used the beginning of the linked program. The pass 2 algorithm is modified by assigning a reference number to each external symbol referred to in a control section. This reference number is used in the Modification records. This mechanism avoids multiple searches of ESTAB for the same symbol during loading of a control section.
29
3.3 Machine-independent loader features Some loader features are not directly related to machine architecture and design. Automatic library search Many linking loaders automatically incorporate the routines from a subprogram library into the program being loaded. This feature allows the programmer to use subroutines from one or more libraries as if they were part or the programming language. On some systems, this feature is referred to as automatic library call or library search. Linking loaders that support automatic library search must keep track of external symbols. One easy way is to use Refer record. The loader searches the library or libraries specified for routines that contain the definitions of these symbols. Subroutines fetched from library may themselves contain external references. It is therefore necessary to repeat the library search process until all references are solved. Unresolved one is treated as errors. The libraries to be searched by the loader ordinarily contain assembled compiled versions of the subroutines. It is possible to scan Define records. In most cases a special file structure is used. This structure contains a directory that gives the name of each routine and a pointer to its address within the file. Loader options Many loaders have a special command language that is used to specify options. On some systems options are specified as part of the job control language that is processed by the operating system. Option 1 One typical loader option allows the selection of alternate sources of input. Ex:INCLUDE program-name(library-name) Might direct the loader to read the designated object program from a library and treat it as if it were part of the primary loader input Other commands allow the user to delete external symbols or entire control sections. Ex DELETE Csect-name Might instruct the loader to delete the named control section from the set of programs being loaded. The command CHANGE name1,name2 might cause the external symbol name1 to be changed to name 2. Option 2 Another common loader option involves the automatic inclusion of library routines to satisfy external references. Most loaders allow the users to specify alternative libraries to be searched using a statement Library mylib
30
Option 3 It involves output from the loader. It may also include external symbol addresses. 3.4 Loader design options Linkage editors
Processing of an object program using linking loader and linkage editor Page 153
The source program is first assembled or compiled, producing an object program. A linking loader performs all linking and relocation operations, including automatic library search, and loads the linked program directly into the memory for execution. A linking loader searches libraries and resolves external references every tine the program is executed. A linkage editor produces a linked version of the program, which is written to a file or library for later execution. The linkage editor performs relocation of all control sections relative to the start of the linked program. This means that the loading can be accomplished in one pass with no external symbol table required. The linked program produced by a linkage editor is generally in a form that is suitable for processing by a relocating loader. If the actual address at which the program will be loaded is known in advance, the linkage editor can perform all the needed relocation. The content and processing of such an image are the same as for an absolute object program. Linkage editors can be used to build packages or subroutines or other control sections that are generally used together. If a program using formatted I/O were linked in the usual way, all of the crossreferences between these libraries subroutines would have to be processed individually. The linkage editor could be used combine the appropriate subroutines into a package. Linkage editors often allow the user to specify that external references are not to be resolved by automatic library search. Dynamic linking This scheme postpones the linking function until execution time: a subroutine is loaded and linked to the rest of the program when it is first called. This type of function is usually called dynamic linking, dynamic loading or load on call.
Dynamic linking is often used to allow several execution programs to share one copy of a subroutine or library.
31