Introduction to Software      59
CHAPTER 4
Introduction to Software

LEARNING OBJECTIVES
After studying this chapter, the readers will be able to
• identify system programs and application programs
• discuss basic concepts of high- and low-level languages
• briefly discuss compiler, interpreter, linker, and loader functions
• explain the software development steps

4.1 INTRODUCTION

The basic concepts of software have already been introduced in Chapter 1. As discussed earlier, there are different categories of software. Among them, system software controls the activities of computer resources (such as input/output devices, memory, and the processor) and schedules the execution of multiple tasks, whereas application software is designed and developed for a specific or generic type of use, either for sale in the market or for a particular user or organization. The term ‘application’ refers to a specific usage, such as creating documents, drawing images, or playing video games, that is accomplished by a computer system.

Nowadays, software is typically composed of several files, among which at least one must be an executable file intended to be executed by users or automatically launched by the operating system. Apart from this main executable file, there are program files to be used in conjunction with the main executable file, as well as additional data files and configuration files. Consequently, installing software is not just a matter of copying files to the hard disk. It typically depends on the operating system on which the software will execute and on whether the software is a local, Web, or portable application. For local application software, the files are placed in the appropriate locations on the computer’s hard disk and may require additional configuration with the underlying operating system so that the software can be run as and when required. Portable software is designed to run from removable storage, such as a CD or USB flash drive, without installing its program files or configuration data on the hard disk. Interestingly, no trace is left on the computer when the removable storage media containing the portable software is removed. On the other hand, Web application software is accessed through a Web browser and most of its program
  60     Computer Fundamentals and Programming in C
code runs on a remote computer connected to the Internet or other computer network.

4.2 PROGRAMMING LANGUAGES

A programming language can be defined formally as an artificial formalism in which algorithms can be expressed. It is composed of a set of instructions in a language understandable to the programmer and recognizable by a computer. Computer languages have continued to grow and evolve since the 1940s. Assembly language was once the normal choice for writing system software such as operating systems, but C has been used to develop system software since its emergence. The UNIX operating system and its descendants are mostly written in C. Application programs are designed for specific computer applications. Most programming languages are designed to be good for one category of applications but not necessarily for others. For instance, COBOL is more suitable for business applications, whereas FORTRAN is more suitable for scientific applications.

The development of programming languages has been governed by a number of factors, such as the type and performance of available hardware, the applications of computers in different fields, and the development and implementation of new programming methodologies.

4.2.1 Generation of Programming Languages

Just as hardware is classified into generations based on technology, computer languages also have a generation classification based on the level of interaction with the machine.

First generation language (1GL): machine language

The instructions in machine language are written in the form of binary codes that can immediately be executed by the processor. A machine language instruction generally has three parts, as shown in Fig. 4.1. The first part is the operation code, which conveys to the computer what function has to be performed by the instruction. All computers have operation codes for functions such as adding, subtracting, and moving. The second part, ‘Mode’, specifies the type of addressing used by the instruction to obtain the operand referred to by the instruction. The third part of the instruction either specifies that the operand contains data on which the operation has to be performed, or it specifies that the operand contains a location, the contents of which have to be subjected to the operation.

|<------------------- n bits ------------------->|
| Operation code |      Mode      |   Operand    |
|     p bits     |     q bits     |    r bits    |
                   n = p + q + r
Fig. 4.1 General format of machine language instruction

Machine language is considered to be the first generation language (1GL). As it is the native language of the computer, the CPU can directly start executing machine language instructions. However, the limitations of using machine language in writing programs include the following.

Difficult to use and error prone  It is difficult to understand and develop a program using machine language, because it is hard to understand and remember the various combinations of 1’s and 0’s representing data and instructions. The programmer has to remember machine characteristics while preparing a program. Checking machine instructions to locate errors is about as tedious as writing the instructions. For anybody checking such a program, it would be difficult to forecast the output when it is executed. Nevertheless, computer hardware recognizes only this type of instruction code. Further, modifying such a program is highly problematic.

Machine dependent  The internal design of the computer differs across types, and is in turn determined by the actual design or construction of the ALU, the CU, and the word size of the memory unit. As a result, the machine language also varies from one type of computer to another. Hence, it is important to note that after becoming proficient in the machine code of a particular computer, the programmer may be required to learn a new machine code and would have to write all the existing programs again in case the computer system is changed.

Second generation language (2GL): assembly language

Assembly language is considered to be a second generation language (2GL). In this language, an instruction is expressed using mnemonic codes instead of binary codes. Normally an assembly language statement consists of a label, an operation code, and one or more operands. Labels are used to identify and reference instructions in the program. The operation code is a symbolic notation that specifies the particular operation to be performed, such as MOV, ADD, SUB, or CMP. The operand represents the register or the location in main memory where the data to be processed is located. For example, a typical statement in assembler to command the processor to move the hexadecimal number 0x80 into processor register R2 might be:

        MOV R2, 080H

The following is an example of an assembly language program for adding two numbers A and B and storing the result in some memory location.

        LDA 2000h       ; Load register A with the content of memory address 2000h
        MOV B, 10h      ; Load register B with 10h
        ADD A, B        ; Add contents of A with contents of B and store result in register A
        MOV (100), A    ; Save the result in the main memory location 100 from register A
        HALT            ; Halt process

An assembly language program cannot be executed by a machine directly, as it is not in binary machine language form. An assembler is a translator that produces machine language code from assembly language code. It produces a single machine language instruction from a single assembly language statement. Therefore, the coding to solve a problem in assembly language has to be done at the individual instruction level. That is why, along with machine language, assembly language is also referred to as a low-level language.

Writing a program in assembly language is more convenient than writing in machine language. Instead of binary sequences, as in machine language, a program in assembly language is written in the form of symbolic instructions. This gives the assembly language program improved readability. However, assembly language also has several disadvantages.

• The most significant disadvantage of assembly language is that it is machine dependent. Assembly language is specific to the internal architecture of a particular model of a processor, and the programmer should know all about the internal architecture of the processor. A program written in assembly language for one processor will not work on a different processor if it is architecturally different.
• Though mnemonic codes are easier to remember than binary codes, programming with assembly language is still difficult and time-consuming.

Third generation language (3GL): high-level language

High-level languages are called third generation languages (3GLs). High-level programming languages were developed to make programming easier and less error-prone. Languages like C, C++, COBOL, FORTRAN, BASIC, and PASCAL have instructions that are similar to the English language, which makes it easy for a programmer to write programs and to identify and correct errors in them. The program shown below is written in BASIC to obtain the sum of two numbers.

        10 LET X = 7
        20 LET Y = 10
        30 SUM = X + Y
        40 PRINT SUM
        50 END

Most third generation languages are procedural in nature. That is, the programmer must specify the sequential, logically related steps to be followed by the computer in a program. As a computer only understands machine language, a program written in a high-level language must be translated into the basic machine language instruction set before it can be executed. This can be performed either by a compiler or by an interpreter. One statement in a high-level programming language will be translated into several machine language instructions.

High-level programming languages offer many advantages, including the following.

Readability  Programs written in these languages are more readable than those written in assembly and machine languages.

Portability  Programs written in high-level programming languages can be run on different machines with little or no change. It is, therefore, possible to exchange software, leading to the creation of program libraries.

Easy debugging  Errors can be easily detected and removed.

Ease in the development of software  Since the instructions or statements of these programming languages are closer to the English language, software can be developed with ease. The time and cost of creating machine and assembly language programs were quite high. This motivated the development of high-level languages.

Fourth generation languages (4GL)

The Fourth Generation Language (4GL) is a non-procedural language that allows the user to simply specify what is wanted without describing the steps that the computer has to follow to produce the result. This class of languages requires significantly fewer instructions to accomplish a particular task than does a third generation language. Thus, a programmer should be able to write a program faster in a 4GL than in a third generation language.

The main application areas of 4GLs are database queries, report generators, data manipulation, analysis and reporting, screen painters, etc. An example of a 4GL is the query language that allows a user to request information from a database with precisely worded English-like sentences. A query language is used as a database user interface and hides the specific details of the database from the user. The following example shows a query in a common query language, SQL.

        SELECT address FROM EMP WHERE empname = 'PRADIP DEY'

With a report generator, the programmer specifies the headings, detailed data, and other details to produce the required report using data from a file. 4GLs offer several advantages, which include the following.

• Like third generation languages, fourth generation languages are mostly machine independent. They are used mainly for developing business applications.
• Most of the fourth generation languages can be easily learnt and employed by end-users.
• All 4GLs are designed to reduce programming effort, the time it takes to develop software, and the cost of software development. Programming productivity is increased when a 4GL is used in coding.

Fifth generation language (5GL)

Natural languages represent the next step in the development of programming languages and belong to the Fifth Generation Language (5GL) class. A natural language is similar to a query language, with one difference: it eliminates the need for the user or programmer to learn a specific vocabulary, grammar, or syntax.

More precisely, a 5GL is a programming language based around solving problems using constraints given to the program, rather than using an algorithm written by a programmer. Fifth generation languages are used mainly in artificial intelligence research. OPS5 and Mercury are examples of fifth generation languages.

Note
• A low-level computer programming language is one that is closer to the native language of the computer. Machine and assembly languages are referred to as low-level languages since the coding for a problem is at the individual instruction level.
• A program written in any language other than machine language must be translated into machine code.

4.2.2 Classification of Programming Languages

Programming languages can be classified in various ways. According to the extent of translation that is required to generate the machine instructions from a program, programming languages can be classified into low-level or high-level languages. Both assembly language and machine language are considered low-level languages. Low-level languages are closer to the native language of the computer: a program written in machine language does not require translation for a processor to execute it. Assembly language is also considered a low-level language, since each assembly language instruction accomplishes only a single operation and the coding for a problem is at the individual instruction level. On the other hand, high-level programming languages provide a high level of abstraction from the actual machine hardware.

High-level languages can further be characterized by programming paradigm (Fig. 4.2). A programming paradigm refers to a way of problem solving that includes a set of methodologies, theories, practices, and standards. The high-level programming languages may also be categorized into three groups: procedural, non-procedural, and problem oriented.

Programming language
    High-level language
        Procedural
            Algorithmic (COBOL, FORTRAN, C)
            Object oriented (C++, JAVA, SMALLTALK)
            Scripting (VB, PERL)
        Non-procedural
            Functional (LISP, ML)
            Logic based (PROLOG)
        Problem-oriented
            Numerical (MATLAB)
            Symbolic (MATHEMATICA)
            Publishing (LATEX)
    Low-level language
        Machine language
        Assembly language
Fig. 4.2 Programming language classification

Procedural programming languages

In procedural programming, a program is conceived as a set of logically related instructions to be executed in order. Each program can be divided into small self-contained program segments, each of which performs a particular task and can be re-used in the program as and when required, without repeating the code of the segment explicitly. These sections of code are known as procedures, subroutines, or functions. This division also makes it easier for programmers to understand and maintain the program structure. There are mainly three classes of procedural programming languages.

Algorithmic  Using this type of programming language, the programmer must specify the steps the computer has to follow while executing a program. In these languages, a complex problem is solved using top-down approach
of problem solving in which the problem is divided into a collection of small problems and each small problem is realized in terms of a subprogram. Each subprogram is implemented using a procedure or function. Languages like C, COBOL, PASCAL, and FORTRAN fall into this category.

Object-oriented language  The basic philosophy of object-oriented programming is to deal with objects rather than functions or subroutines as in strictly algorithmic languages. Instead of procedures, object-oriented programming relies on software objects as the units of modularity. Data and the associated operations are unified, grouping objects with common properties, operations, and semantics. The use of an object-oriented programming language encourages the reuse not only of code but also of entire designs, leading to the creation of application frameworks. A program thus becomes a collection of cooperating objects, rather than a list of instructions. Objects are self-contained modules that contain data as well as the functions needed to manipulate the data within the same module. The most important object-oriented programming features are the following.

Abstraction  Abstraction is a technique of focussing on the essential and relevant details of a complex problem which are of interest to the application. It helps to simplify the understanding and use of any system. With data abstraction, data structures can be used without having to be concerned about the exact details of implementation. Object-oriented programming languages use classes and objects for representing abstractions. A class defines the specific structure of a given abstraction. It has a unique name that conveys the meaning of the abstraction. A class definition provides a software design which describes the general properties of something that the software is modeling. An object is an instance of a class. An object’s properties are exactly those described by its class.

Encapsulation and data hiding  The process, or mechanism, by which data and the functions or methods for manipulating that data are combined into a single unit is commonly referred to as encapsulation.

Inheritance  Inheritance allows the extension and reuse of existing code, without having to repeat or rewrite the code from scratch. Inheritance involves the creation of new classes, also called derived classes, from existing classes (base classes). Object-oriented languages are usually accompanied by a large and comprehensive library of classes. Members of these classes can either be used directly or reused by employing inheritance in designing new classes.

Polymorphism  The purpose of polymorphism is to let one name be used to specify a general class of action. An operation may exhibit different behaviours in different instances. The behaviour depends upon the types of data used in the operation. Polymorphism is a term that describes a situation where one name may refer to different methods. This means that a general kind of operation may be accessed in the same manner even though the specific actions associated with each operation may differ.

Reusable code  Object-oriented programming languages enable the programmer to make parts of a program reusable and extensible by breaking down a program into reusable objects. These objects can then be grouped together in different ways to form new programs. By reusing code, it is much easier to write new programs by assembling existing pieces.

Using the above features, object-oriented programming languages help to produce reliable and reusable software at reduced cost and time. C++, JAVA, SMALLTALK, etc. are examples of object-oriented languages.

Scripting languages  A few years ago, scripting languages were not considered languages in their own right, but were rather thought of as auxiliary tools. A scripting language may be thought of as a glue language, which sticks together a variety of components written in other languages. These languages are usually interpreted. One of the earliest scripting languages is the UNIX shell. Now there are several scripting languages, such as VBScript, Python, Tcl, and Perl. JavaScript also belongs to this category and is the de facto standard for the implementation of client-side Web applications.

Non-procedural languages

Functional languages solve a problem by applying a set of functions to the initial variables in specific ways to get the result. A program written in a functional language consists of a series of built-in function evaluations together with arguments to those functions. LISP, ML, Scheme, etc. are examples of functional languages.

Another non-procedural class of languages is called rule-based languages or logic programming languages. A logic program is expressed as a set of atomic sentences, known as facts, and Horn clauses, such as if-then rules. A query is then posed. The execution of the program then begins, and the system tries to find out whether the answer to the query is true or false for the given facts and rules. Such languages include PROLOG.

Problem-oriented languages

These languages provide readymade procedures or functions which are pre-programmed. The user has to write the statements in terms of those pre-written functions. MATLAB is a very popular language among scientists and engineers for solving a wide class of problems in digital signal processing, control systems, modelling of systems described by differential equations, matrix computations, etc.
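
The paradigms described in this section can be contrasted with a small sketch. The following Python listing is an illustration only and is not taken from the text: the function names total_area_procedural and total_area_functional and the classes Shape, Rectangle, and Circle are hypothetical, chosen to show the same kind of computation in procedural, functional, and object-oriented style. Python is used here simply because it supports all three styles in one short listing.

```python
import math
from functools import reduce


# Procedural (algorithmic) style: explicit steps inside a reusable procedure.
def total_area_procedural(rectangles):
    """Add up width * height for each (width, height) pair, step by step."""
    total = 0.0
    for width, height in rectangles:
        total += width * height
    return total


# Functional style: the result comes from applying functions to the input.
def total_area_functional(rectangles):
    """Same computation expressed as function application, with no loop variable."""
    areas = map(lambda pair: pair[0] * pair[1], rectangles)
    return reduce(lambda a, b: a + b, areas, 0.0)


# Object-oriented style: data and the operations on it live together in objects.
class Shape:
    """Base class: an abstraction that only promises a name and an area."""

    def __init__(self, name):
        self._name = name  # leading underscore: hidden data, by convention only

    def area(self):
        raise NotImplementedError("each derived class supplies its own area()")

    def describe(self):
        # One name, area(), runs different code per object (polymorphism).
        return f"{self._name}: {self.area():.2f}"


class Rectangle(Shape):  # derived class that reuses Shape through inheritance
    def __init__(self, width, height):
        super().__init__("rectangle")
        self._width = width
        self._height = height

    def area(self):
        return self._width * self._height


class Circle(Shape):  # another derived class with its own area()
    def __init__(self, radius):
        super().__init__("circle")
        self._radius = radius

    def area(self):
        return math.pi * self._radius ** 2


if __name__ == "__main__":
    pairs = [(2, 3), (4, 5)]
    print(total_area_procedural(pairs))
    print(total_area_functional(pairs))
    for shape in (Rectangle(2, 3), Circle(1)):
        print(shape.describe())
```

In this sketch, describe() is written once in the base class and reused by both derived classes, while each derived class keeps its own data and its own version of area(). The two total_area functions return the same value for the same input; they differ only in how the computation is expressed.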
Another class of problem-oriented languages is for symbolic manipulation, for example, simplifying a complex algebraic expression or finding the indefinite integral of a complex expression. MATHEMATICA is a popular language of this type.

In the Internet era, a new category of languages has emerged, the markup languages. Markup languages are not programming languages. For instance, HTML, the most widely used markup language, is used to specify the layout of information in Web documents. However, some programming capability has crept into some extensions to HTML and XML. Among these are the Java Server Pages Standard Tag Library (JSTL) and eXtensible Stylesheet Language Transformations (XSLT).

4.3 COMPILING, LINKING, AND LOADING A PROGRAM

A program, written in a source language, is translated by the compiler to produce a program in a target language. The source language is usually a high-level language. The target language may or may not be machine language. In most cases, the target language is assembly language, in which case the target program must be translated by an assembler into an object program. Then the object program is linked with other object programs to build an executable program, which is normally saved in a specified location of the secondary memory. When it needs to be executed, the executable file is loaded into main memory before its execution. The whole process is managed, coordinated, and controlled by the underlying operating system. Sometimes the target language may be a language other than machine or assembly language, in which case a translator for that language must be used to obtain an executable object program.

Conceptually, the compilation process can be divided into

This process is known as parsing. Syntax is similar to the grammar of a language. Syntax rules specify the way in which valid syntactic elements are combined to form the statements of the language. Syntax rules are often described using a notation known as BNF (Backus Naur Form) grammar.

[Fig. 4.3 The process of compilation: Source program -> Lexical analysis -> Syntactic analysis -> Semantic analysis -> Intermediate code generation -> Code generation -> Object program -> Linker (together with library code and object code from other compilations) -> Executable program]

As a result of parsing, a data structure, known as a parse tree, is produced.

Semantic analysis  The semantics of a statement in a programming language define what will happen when that statement is executed. Semantic rules assign meanings to valid statements of the language. In the semantic analysis phase, the parsed statements are analysed further to make sure that the operators and operands do not violate the source language specification.

Intermediate code generation and optimization  To make the target program a bit smaller or faster or both, many compilers produce an intermediate form of code for optimization. In most cases, the intermediate code is generated in assembly language or in a different language at
                                                                  a level between assembly language and machine language.
a number of phases, each of which is handled by different
modules of a compiler, as shown in Fig. 4.3.                      Code generation  This is the final phase of a standard
                                                                  compilation which converts every statement of the optimized
Lexical analysis  In this phase, the source program is
                                                                  intermediate code into target code using predefined target
scanned for lexical units (known as tokens) namely, identifier,
                                                                  language template. The target language template depends on
operator delimiter, etc. and classify them according to their
                                                                  the machine instructions of the processor, addressing modes
types. A table, called symbol table, is constructed to record
                                                                  and number of registers, etc.
the type and attributes information of each user-defined name
used in the program. This table is accessed in the other phases   	 If a system library containing pre-written subroutines
of compilation.                                                   or functions and/or separately compiled user-defined
                                                                  subroutines are used in a program a final linking and loading
Syntax analysis  In this phase, tokens are conflated into
                                                                  step is needed to produce the complete machine language
syntactic units such as expressions, statements, etc. that must
                                                                  program in an executable form.
conform to the syntax rules of the programming language.
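As an illustration of the lexical analysis phase, the following is a minimal sketch in C of a scanner that breaks a statement into tokens and classifies each one by type. The names TokenType, Token, and next_token, as well as the small sets of operators and delimiters, are assumptions made for this example only; the scanner of a real compiler recognizes far more and also fills in the symbol table.

```c
#include <ctype.h>
#include <string.h>

/* Token categories used by this sketch (hypothetical names). */
typedef enum {
    TOK_IDENTIFIER, TOK_NUMBER, TOK_OPERATOR, TOK_DELIMITER, TOK_END, TOK_ERROR
} TokenType;

typedef struct {
    TokenType type;
    char text[32];      /* lexeme; longer runs are split in this sketch */
} Token;

/* Scan one token starting at *src and advance *src past it. */
Token next_token(const char **src)
{
    Token tok = { TOK_END, "" };
    const char *p = *src;
    size_t n = 0;

    while (isspace((unsigned char)*p))
        p++;                                    /* skip white space */

    if (*p == '\0') {
        tok.type = TOK_END;                     /* nothing left to scan */
    } else if (isalpha((unsigned char)*p) || *p == '_') {
        tok.type = TOK_IDENTIFIER;
        while ((isalnum((unsigned char)*p) || *p == '_') && n < sizeof tok.text - 1)
            tok.text[n++] = *p++;
    } else if (isdigit((unsigned char)*p)) {
        tok.type = TOK_NUMBER;
        while (isdigit((unsigned char)*p) && n < sizeof tok.text - 1)
            tok.text[n++] = *p++;
    } else if (strchr("+-*/=", *p)) {
        tok.type = TOK_OPERATOR;                /* single-character operators only */
        tok.text[n++] = *p++;
    } else if (strchr(";(),", *p)) {
        tok.type = TOK_DELIMITER;
        tok.text[n++] = *p++;
    } else {
        tok.type = TOK_ERROR;                   /* character not in the language */
        tok.text[n++] = *p++;
    }
    tok.text[n] = '\0';
    *src = p;
    return tok;
}
```

Calling next_token repeatedly on the statement sum = a + 42; yields the identifier sum, the operator =, the identifier a, the operator +, the number 42, and the delimiter ; before the end-of-input token is returned.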
  note
Conceptually, the compilation process can be divided into a number of phases.
	•	In the first phase of compilation, termed lexical analysis, each statement of a program is analysed and broken into individual lexical units termed tokens, and a symbol table entry is constructed for each identifier.
	•	The second stage of translation is called syntax analysis; tokens are combined into syntactic units according to the syntax or grammar of the source language.
	•	In the third stage of compilation, the parsed statements are analysed further to make sure that the operators and operands do not violate source language specifications.
	•	Next, an intermediate representation of the final machine language code is produced. Optionally, the intermediate code is optimized.
	•	The last phase of translation is code generation, whereby the optimized intermediate code is converted into target code.
  4.4  TRANSLATOR, LOADER, AND LINKER REVISITED
4.4.1  Translators
There are three types of translators, namely assembler, compiler, and interpreter. An assembler converts each assembly language statement into a single machine language instruction. Depending on its implementation, a high-level language employs a compiler or an interpreter or both for translation. One statement in a high-level programming language is typically translated into several machine language instructions. Both compiler and interpreter translate a program written in a high-level language into machine language, but in different fashions. A compiler translates the entire source program into an object program at once, and the object files are then linked to produce a single executable file. Unlike a compiler, an interpreter translates one line of source code at a time and executes it before translating the next one, and it does this every time the program executes. BASIC is a language that is usually implemented with an interpreter. Because the interpreter translates each line of source code each time the program is executed, running a program under an interpreter is slower than running compiled code. With respect to debugging, an interpreted language is better than a compiled language. In an interpreter, a syntax error is brought to the attention of the programmer immediately so that the programmer can make the necessary corrections during program development. The Java language uses both a compiler and an interpreter.
4.4.2  Linker
Most high-level languages provide libraries of subroutines or functions so that certain common operations can be performed by system-supplied routines without explicit coding. Hence, the machine language program produced by the translator must normally be combined with other machine language programs residing within the library to form a useful execution unit. This process of program combination is called linking, and the software that performs this operation is known as a linker. The features of a programming language influence the linking requirements of a program. In languages like FORTRAN, COBOL, and C, all program units are translated separately. Hence, all subprogram calls and common variable references require linking. Linking makes the addresses of programs known to each other so that transfer of control from one subprogram to another, or to a main program, can take place during execution.
4.4.3  Loader
Loading is the process of bringing a program from secondary memory into main memory so it can run. The system software responsible for it is known as a loader. The simplest type of loader is the absolute loader, which places the program into memory at the location prescribed by the assembler. The bootstrap loader is an absolute loader that is executed when the computer is switched on or restarted, to load the operating system.
	In most cases, when a compiler translates a source program into object code, it has no idea where the code will be placed in main memory at the time of its execution. In fact, each time the program is executed, it is likely to be assigned a different area of main memory, depending on the availability of primary storage at the time of loading. That is why compilers create a special type of object code that can be loaded into any location of main memory. When the program is loaded into memory to run, all the addresses and references are adjusted to reflect the actual location of the program in memory. This address adjustment is known as relocation. Relocation is performed before or during the loading of the program into main memory.
	In many modern systems, a prewritten subroutine is not loaded until it is called. All subroutines are kept on disk in a relocatable load format. The main program is loaded into memory and is executed. When a routine needs to call another routine, the calling routine first checks whether the other routine has been loaded. If not, the linking loader is called to load the desired routine into memory and to update the program’s address tables to reflect this change. Then, control is passed to the newly loaded routine.
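The address adjustment called relocation can be sketched in C as follows. This is only an illustration under stated assumptions: the ObjectProgram layout, the relocate function, and the idea of a table listing which words hold addresses are hypothetical simplifications, not the object file format of any real system.

```c
#include <stddef.h>

/* Hypothetical object-program layout for this sketch:
 * code[]  - words of the program, translated as if loaded at address 0
 * reloc[] - indices of the words in code[] that hold addresses
 */
typedef struct {
    unsigned *code;
    size_t code_len;
    const size_t *reloc;
    size_t reloc_len;
} ObjectProgram;

/* Adjust every address word by the actual load address.
 * Returns 0 on success, or -1 (changing nothing) if a relocation
 * entry points outside the program. */
int relocate(ObjectProgram *prog, unsigned load_address)
{
    /* First pass: validate, so a bad table never leaves the code half-adjusted. */
    for (size_t i = 0; i < prog->reloc_len; i++) {
        if (prog->reloc[i] >= prog->code_len)
            return -1;
    }
    /* Second pass: add the load address to each word that holds an address. */
    for (size_t i = 0; i < prog->reloc_len; i++)
        prog->code[prog->reloc[i]] += load_address;
    return 0;
}
```

For example, if the words at indices 1 and 3 hold the addresses 4 and 0 and the program is loaded at address 1000, relocation changes them to 1004 and 1000, while words that are not listed in the table (ordinary data or opcodes) are left untouched.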
  66      Computer Fundamentals and Programming in C
  note
	•	A high-level source program must first be translated into a form the machine can execute. This is done by the system software called the translator.
	•	The machine language program produced by the translator must normally be combined with other machine language programs residing within the library to form a useful execution unit. Linking resolves the symbolic references between object programs. It makes object programs known to each other. The system software responsible for this function is known as a linker.
	•	Relocation is the process of assigning addresses to the various parts of the program, adjusting the code and data in the program to reflect the assigned addresses.
	•	A loader is system software that places an executable program’s instructions and data from secondary memory into primary memory, prepares them for execution, and initiates the execution.
  4.5  DEVELOPING A PROGRAM
We first discuss the step-by-step procedure involved in creating a computer program. Here we explain the seven important steps towards creating effective programs: definition, design, coding, testing, documentation, implementation, and maintenance.
	1.	The first step in developing a program is to define the problem. This definition must include the needed output, the available input, and a brief definition of how one can transform the available input into the needed output.
	2.	The second step is to design the problem solution. This detailed definition is an algorithm, a step-by-step procedure for solving a problem.
	3.	The third step in developing a program is to code the program; that is, state the program’s steps in the language being used. The instructions must follow the language’s syntax, or rules, just as good English must follow the rules of grammar in English.
	4.	The fourth step is to test the program to make sure that it will run correctly, no matter what happens. If the algorithm is wrong or the program does not match the algorithm, the errors are considered logic errors. Errors in a program are called bugs; the process of finding the bugs and correcting them is called debugging the program. To test or debug a program, one must create sample input data that represents every possible way to enter input.
	5.	The fifth step in developing a program is to complete the documentation of the program. Documentation should include user instructions, an explanation of the logic of the program, and information about the input and output. Documentation is developed throughout the program development process. Documentation is extremely important, yet it is the area in program development that is most often overlooked or downplayed.
	6.	The sixth step in developing a program is implementation. Once the program is complete, it needs to be installed on a computer and made to work properly. If the program is developed for a specific company, the programming team may be involved in implementation. If the program is designed to be sold commercially, the documentation will have to include directions for the user to install the program and begin working with it.
	7.	Even after completion, a program requires attention. It needs to be maintained and evaluated for possible changes.
  4.6  SOFTWARE DEVELOPMENT
Programming is an individual’s effort and requires no formal systematic approach. Software development is more than programming. A large number of people are involved in software development, and it emphasizes the planned aspect of the development process. Programming is one of the activities in software development. Other activities include requirement analysis, design, testing, deployment, maintenance, etc. Software is built according to the client’s requirements. It is driven by cost, schedule, and quality. That is, software of acceptable quality should be developed at reasonable cost and handed over in reasonable time. The most basic steps in software development are explored below.
4.6.1  Steps in Software Development
The entire process of software development and implementation involves a series of steps. Each successive step is dependent on the outcome of the previous step. Thus, the team of software designers, developers, and users is required to interact at each stage of software development so as to ensure that the end product is as per the client’s requirements.
	Software development steps are described below.
Feasibility study
The feasibility of developing the software in terms of resources and cost is ascertained. In order to determine the feasibility of software development, the existing system of the user is analysed properly. The analysis done in this step is documented in a standard document called the feasibility report, which contains the observations and recommendations related to the task of software development. Activities involved in this step include the following.
Determining development alternatives  This activity involves searching for the different alternatives that are available for the development of software.
Analysing economic feasibility  This activity involves determining whether the development of new software will be financially beneficial or not. This type of feasibility analysis is performed to determine the overall profit that can be earned from the development and implementation of the software. It involves evaluating all the alternatives available for development and selecting the one which is most economical.
Assessing technical feasibility  It involves analysing various factors such as the performance of the technologies, ease of installation, ease of expansion or reduction in size, interoperability with other technologies, etc. Technical feasibility also involves studying the nature of the technology: how easily it can be learnt and the level of training required to understand it. This type of feasibility assessment greatly helps in selecting the appropriate technologies to be used for developing the software. The selection should be made after evaluating the requirement specification of the software.
Analysing operational feasibility  It involves studying the software on operational and maintenance fronts. The operational feasibility of any software is assessed on the basis of several factors such as the following.
	(a)	Type of tools needed for operating the software
	(b)	Skill set required for operating the software
	(c)	Documentation and other support required for operating the software
Requirement analysis
In this step, the requirements related to the software to be developed are understood. Requirements analysis is an important step in the process of developing software. If the requirements of the user are not properly understood, the software is bound to fall short of the end user’s expectations. Thus, requirements analysis is always the first step towards development of software.
	The users may not be able to provide the complete set of requirements pertaining to the desired software during the requirement analysis stage. There should be continuous interaction between the software development team and the end users. The software development team also needs to take into account the fact that the requirements of the users may keep changing during the development process. Thus, proper analysis of user requirements is essential for developing the software within a given time frame.
	The customer requirements identified during the requirements gathering and analysis activity are organized into a System Requirements Specification Document. The important components of this document are the functional requirements, the nonfunctional requirements, and the goals of implementation.
Design
After the feasibility analysis stage, the next step is creating the architecture and design of the new software. It involves developing a logical model or basic structure of the new software. Design of the software is divided into two stages: system design and detailed software design.
	System design partitions the requirements to hardware or software systems. It establishes the overall system architecture. The architecture of a software system refers to an abstract representation of that system. Architecture is concerned with making sure the software system meets the requirements of the product, as well as ensuring that future requirements can be addressed. The architecture step also addresses interfaces between the software system and other software products, as well as the underlying hardware or the host operating system. Detailed design represents the software system functions in a form that can be transformed into one or more executable programs. Specification is the task of precisely describing the software to be written, possibly in a rigorous way.
Implementation
In this step, the code for the different modules of the new software is developed according to the design specifications of each module. The programmers in the software development team use development tools for this purpose. An important, and often overlooked, task is documenting the internal design of the software for the purpose of future maintenance and enhancement.
Testing
Testing is performed to detect any errors in the new software and rectify them. One of the reasons for the occurrence of errors or defects in the new software is that the requirements of the client were not properly understood. Another reason is the common mistakes committed by a programmer while developing the code. The two important activities that are performed during testing are verification and validation. Verification is the process of checking the software against some predefined specifications, while validation involves testing the product to ascertain whether it meets the user requirements. During validation, the tester inputs different values to ascertain whether the software is generating the right output as per the original requirements.
Deployment
The newly developed and fully tested software is installed in its target environment. Software documentation is handed over to the users and some initial data are entered in the software to make it operational. The users are also given training on the software interface and its other functions.
Maintenance
In this phase, the developed software is made operational. Users will have many questions and software problems, which lead to the next phase of software development. Once the software has been deployed successfully, continuous support is provided to it to ensure its full-time availability. The software may need to be modified if the environment undergoes a change. Maintaining and enhancing software to cope with newly discovered problems or new requirements can take far more time than the initial development of the software.
                                                                          Summary
A programming language is an artificial formalism for expressing the instructions to be executed in a specified sequence. Programming languages can be classified into low-level and high-level languages. Low-level programming languages include machine language and assembly language. In fact, assembly languages were so revolutionary that they became known as second-generation languages, the first generation being the machine languages themselves. Assembly languages are symbolic programming languages that use symbolic notation to represent machine-language instructions.
Most third-generation languages are procedural languages. Compilers convert the program instructions from human-understandable form to machine-understandable form. Interpreters also convert the source program to machine language instructions but execute each line as it is entered. The translation of the source program takes place for every run, so interpreted execution is slower than running compiled code. The system software controls the activities of a computer, application programs, and the flow of data in and out of memory and disk storage. Compilation of source code into target code follows successive stages. In the lexical analysis phase, lexical units or tokens are produced from the statements. A symbol table is also constructed to record the type and attribute information of each user-defined name in the program. Next, syntax analysis takes place. In this phase, tokens are grouped into syntactic units, such as expressions and statements, that must conform to the grammatical rules of the source language, forming a data structure called a parse tree. In semantic analysis, the parse trees are analysed further to make sure that the operators and operands do not violate the source language type specification. Then, to produce a more efficient target program, intermediate code is generated and then optimized. In the last phase, object code in the target language is produced. Linking resolves symbolic references between object programs. A loader is a system program that accepts object programs, prepares them for execution, and initiates the execution. Programming is an individual’s effort and requires no formal systematic approach. Software development is more than programming. It involves a series of steps: feasibility study, requirement analysis, design, coding, testing, deployment, and maintenance.
                                                                          Key Terms
Loader  It is a system program that accepts object programs, prepares these programs for execution by the computer, and initiates the execution.
Linker  It takes one or more object files or libraries as input and combines them to produce a single (usually executable) file.
Compiler  It is system software that translates the entire source program into machine language.
Interpreter  It is system software that translates the source program into machine language line by line.
Syntax  It refers to the rules governing the form of the expressions, statements, and program units of a language.
Semantics  It is the meaning of those expressions, statements, and program units.
Assembler  It is a program that translates an assembly language program into machine code.
Bug  It is a programming error.
Debugging  It is the process of eliminating errors from a program.
                                                      Frequently asked questions
1.  Distinguish between 3GL and 4GL.                                            2.  What are the functions of a loader?
                    3GL                                   4GL                   The functions of a loader are as follows:
 Meant for use by professional           May be used by non-professional
 programmers.                            programmers as well as by
                                         professional programmers.
 Requires specifications of how to       Requires specifications of what task
 perform a task.                         to perform. System determines how to
                                         perform the task.
 Requires a large number of procedural   Requires fewer instructions.
 instructions.
 Code may be difficult to read,          Code is easy to understand and
 understand, and maintain by the user.   maintain.
 Typically, file oriented.               Typically, database oriented.

•	 Assignment of load-time storage area to the program
•	 Loading of the program into the assigned area
•	 Relocation of the program so that it executes properly from its load-time storage area
•	 Linking of programs with one another

3.  What is a debugger?
A debugger is a program that lets the programmer trace the flow of
execution or examine the values of variables at various execution points in
the program. For example, GDB, the GNU debugger, is used with the GNU
C Compiler. A debugger is integrated into most integrated development
environments (IDEs).

4.  What do the syntax and semantics of a programming language mean?
The syntax of a programming language is the form of its expressions,
statements, and program units. Its semantics is the meaning of those
expressions, statements, and program units.

5.  What is a symbol table? What is its function?
The symbol table serves as a database for the compilation process. It
records the type and attribute information of each user-defined name in
the program. This table is used in the syntax analysis, semantic analysis,
and code generation phases of compilation.

6.  Distinguish between a compiler and an interpreter.

 Compiler                                     Interpreter

 Scans the entire program before              Translates and executes the program
 translating it into machine code.            line by line.
 Converts the entire program to               Each time the program is executed,
 machine code and only when all               every line is checked for syntax errors
 the syntax errors are removed does           and then converted to the equivalent
 execution take place.                        machine code.
 Not very helpful in debugging.               Very helpful in debugging.
 Compilation process is faster.               Interpretation process is slower.
 Gives a list of all errors in the program.   Stops at the first error.
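
The contrast drawn in question 6 between translating the whole program first and translating it line by line can be illustrated with a small sketch. It is only an imitation of the two behaviours in Python, not a real compiler or interpreter: the three-line program in SOURCE and the helper names run_compiled and run_interpreted are hypothetical, and the line-by-line version assumes that every statement fits on one line. The sketch reports only the first syntax error, so it does not reproduce the "list of all errors" a real compiler may give.

```python
# Hypothetical three-line program; the third line has an unmatched ')'.
SOURCE = 'print("line 1")\nprint("line 2")\nprint("line 3"))\n'


def run_compiled(source):
    """Compiler-style: translate the whole program, run it only if error-free."""
    output = []
    try:
        code = compile(source, "<program>", "exec")  # whole program at once
    except SyntaxError as err:
        # Nothing has been executed yet, so output is still empty.
        return output, f"syntax error on line {err.lineno}"
    exec(code, {"print": output.append})  # capture printed text in a list
    return output, None


def run_interpreted(source):
    """Interpreter-style: translate and execute one line at a time."""
    output = []
    env = {"print": output.append}  # capture printed text in a list
    for number, line in enumerate(source.splitlines(), start=1):
        try:
            exec(compile(line, "<program>", "exec"), env)
        except SyntaxError:
            # Earlier lines have already run; stop at the first error.
            return output, f"syntax error on line {number}"
    return output, None


if __name__ == "__main__":
    print(run_compiled(SOURCE))
    print(run_interpreted(SOURCE))
```

With this input, the compiler-style function produces no program output at all because the error is found before execution starts, whereas the interpreter-style function has already executed the first two lines when it stops at the faulty third line.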
                                                                      exercises
	 1.	 What do you mean by a program?
	 2.	 Distinguish between system software and application software.
	 3.	 State the advantages and disadvantages of machine language and
      assembly language.
	 4.	 Compare and contrast assembly language and high-level language.
	 5.	 Differentiate between 3GL and 4GL.
	 6.	 What is a translator?
	 7.	 What are the differences between a compiler and an interpreter?
	 8.	 Briefly explain the compilation and execution of a program written in a
      high-level language.
	 9.	 Briefly explain linker and loader. Is there any difference between
      them?
	10.	 Explain linking loader and linkage editor.
	11.	 Classify the programming languages.
	12.	 What is a functional language?
	13.	 What is an object-oriented language? Name five object-oriented
      programming languages.
	14.	 What is the difference between linking loader and linkage editor?
	15.	 What is relocation?