
ALAGAPPA UNIVERSITY
[Accredited with ‘A+’ Grade by NAAC (CGPA:3.64) in the Third Cycle
and Graded as Category–I University by MHRD-UGC]
(A State University Established by the Government of Tamil Nadu)
KARAIKUDI – 630 003

Directorate of Distance Education

B.C.A.
I - Semester
101 13

C AND DATA STRUCTURE


Authors
Dr Subburaj Ramasamy, Former Senior Director, Department of Information Technology, Government of India
Units (1-8, 10)
Rohit Khurana, CEO, ITL Education Solutions Ltd.
Units (9, 11-14)

"The copyright shall be vested with Alagappa University"

All rights reserved. No part of this publication which is material protected by this copyright notice
may be reproduced or transmitted or utilized or stored in any form or by any means now known or
hereinafter invented, electronic, digital or mechanical, including photocopying, scanning, recording
or by any information storage or retrieval system, without prior written permission from the Alagappa
University, Karaikudi, Tamil Nadu.

Information contained in this book has been published by VIKAS® Publishing House Pvt. Ltd. and has
been obtained by its Authors from sources believed to be reliable and are correct to the best of their
knowledge. However, the Alagappa University, Publisher and its Authors shall in no event be liable for
any errors, omissions or damages arising out of use of this information and specifically disclaim any
implied warranties or merchantability or fitness for any particular use.

Vikas® is the registered trademark of Vikas® Publishing House Pvt. Ltd.


VIKAS® PUBLISHING HOUSE PVT. LTD.
E-28, Sector-8, Noida - 201301 (UP)
Phone: 0120-4078900  Fax: 0120-4078999
Regd. Office: 7361, Ravindra Mansion, Ram Nagar, New Delhi 110 055
 Website: www.vikaspublishing.com  Email: helpline@vikaspublishing.com

Work Order No. AU/DDE/DE1-238/Preparation and Printing of Course Materials/2018 Dated 30.08.2018 Copies - 500
SYLLABI-BOOK MAPPING TABLE
C and Data Structure

BLOCK 1: INTRODUCTION TO C
UNIT - 1: Program Development Styles and Basics of C. Introduction to C
- Character Set - Identifiers and Keywords - Data Types - Constants - Variables
- Declarations - Declaring Variables - Rules for Defining Variables. Initializing
Variables - Type Conversion. Operator and Expressions
    Unit 1: Program Development Styles and Basics of C (Pages 1-33)
UNIT - 2: Data Input, Output and Preliminaries - Single Character Input
and Output - Entering Input Data - Writing Output Data - Gets and Puts
Functions
    Unit 2: Data Input, Output and Preliminaries (Pages 34-41)
UNIT - 3: Control Statements: Branching and Looping - Nested Control
Structures - Switch - Break - Continue and Goto.
    Unit 3: Control Statements (Pages 42-71)

BLOCK 2: FUNCTIONS, ARRAYS AND POINTERS
UNIT - 4: Function: Defining a Function - Accessing a Function - Passing
Arguments to a Function - Recursion - Library Function - Macros - C
Preprocessor - Program Structure: Storage Classes - Automatic Variables -
Global Variables - Static Variables - Multiple Programming - Bitwise Operation.
    Unit 4: Function (Pages 72-106)
UNIT - 5: Arrays - Array Initialization, Definition of Array, Characteristic of
Array, One Dimensional Array, Two Dimensional Array, Multidimensional
Arrays, Character Array and Strings - String Handling Functions.
    Unit 5: Arrays (Pages 107-120)
UNIT - 6: Pointers - Features of Pointers, Pointer Declaration, Arithmetic
Operation with Pointers, Pointers and Arrays, Pointers and Two Dimensional
Arrays, Array of Pointers, Pointers to Pointers, Pointers and strings.
    Unit 6: Pointers (Pages 121-142)

BLOCK 3: STRUCTURE UNION AND FILES
UNIT - 7: Structures and Unions: Defining a Structure - Processing a
Structure - Structures and Pointers - Passing Structures to Functions - Self
Referential Structures - Bit Fields - Unions - Enumerations.
    Unit 7: Structures and Union (Pages 143-166)
UNIT - 8: Data File: Opening and Closing a Data File - Creating a Data File
- Processing a Data File - Unformatted Data File - Command Line Parameters.
    Unit 8: Data File (Pages 167-187)

BLOCK 4: LINEAR DATA STRUCTURE
UNIT - 9: Introduction to Data Structure, Stack, Stack Related Terms,
Operation on a Stack, Representation of Stack, Implementation of a Stack -
Polish Notation.
    Unit 9: Introduction to Data Structure (Pages 188-209)
UNIT - 10: Queues, Various Positions of Queue, Circular Queues. Operations
on Queue, Representation of Queues. Applications of Queue.
    Unit 10: Queues (Pages 210-225)
UNIT - 11: List, Merging Lists, Linked List, Single Linked List, Double
Linked List, Header Linked List, Insertion and Deletion of Linked List,
Traversing a Linked List. Representation of Linked List
    Unit 11: List (Pages 226-262)

BLOCK 5: NON-LINEAR DATA STRUCTURE
UNIT - 12: Introduction - Trees, Binary Trees, Types of Binary Trees,
    Unit 12: Introduction to Trees (Pages 263-267)
UNIT - 13: Binary Tree Representation, Traversing Binary Trees,
    Unit 13: Binary Tree Representation (Pages 268-275)
UNIT - 14: Binary Search Tree, Insertion and Deletion Operations, Trees
and their Applications Hashing Techniques.
    Unit 14: Binary Search Tree (Pages 276-304)
CONTENTS
INTRODUCTION

BLOCK I: INTRODUCTION TO C
UNIT 1 PROGRAM DEVELOPMENT STYLES AND BASICS OF C 1-33
1.0 Introduction
1.1 Objectives
1.2 Introduction to Programming
1.2.1 Tokens, Identifiers and Keywords; 1.2.2 The C Character Set
1.2.3 Elementary Data Types; 1.2.4 Constants; 1.2.5 Variable
1.3 Operators and Expressions
1.4 Answers to Check Your Progress Questions
1.5 Summary
1.6 Key Words
1.7 Self Assessment Questions and Exercises
1.8 Further Readings

UNIT 2 DATA INPUT, OUTPUT AND PRELIMINARIES 34-41


2.0 Introduction
2.1 Objectives
2.2 I/O Functions: Entering Input Data and Writing Output Data
2.2.1 Single Character Input/Output; 2.2.2 Strings—gets() and puts()
2.3 Answers to Check Your Progress Questions
2.4 Summary
2.5 Key Words
2.6 Self Assessment Questions and Exercises
2.7 Further Readings

UNIT 3 CONTROL STATEMENTS 42-71


3.0 Introduction
3.1 Objectives
3.2 Decision and Control Structures
3.2.1 Logical Operators and Branching; 3.2.2 Loops and Control Constructs
3.2.3 switch Statement; 3.2.4 break, continue, return and goto
3.3 Answers to Check Your Progress Questions
3.4 Summary
3.5 Key Words
3.6 Self Assessment Questions and Exercises
3.7 Further Readings

BLOCK II: FUNCTIONS, ARRAYS AND POINTERS


UNIT 4 FUNCTION 72-106
4.0 Introduction
4.1 Objectives
4.2 Function
4.2.1 Defining and Accessing a Function; 4.2.2 Function Arguments
4.2.3 Arrays and Functions; 4.2.4 Recursive Function
4.3 Storage Classes
4.3.1 Automatic Variables; 4.3.2 Register Variables
4.3.3 External (Global) Variables; 4.3.4 Static Variables
4.3.5 External (Global) Static Variable; 4.3.6 Multi-file Program
4.4 Macros
4.5 Preprocessor Directives
4.6 Answers to Check Your Progress Questions
4.7 Summary
4.8 Key Words
4.9 Self Assessment Questions and Exercises
4.10 Further Readings

UNIT 5 ARRAYS 107-120


5.0 Introduction
5.1 Objectives
5.2 One Dimensional Array
5.2.1 Two-Dimensional Arrays
5.3 Strings and Characters Array
5.3.1 String Manipulation Using Library Functions
5.4 Answers to Check Your Progress Questions
5.5 Summary
5.6 Key Words
5.7 Self Assessment Questions and Exercises
5.8 Further Readings

UNIT 6 POINTERS 121-142


6.0 Introduction
6.1 Objectives
6.2 Concept of Pointers
6.2.1 Pointer Arithmetic; 6.2.2 Passing Pointer to a Function; 6.2.3 Pointers and Strings
6.3 Answers to Check Your Progress Questions
6.4 Summary
6.5 Key Words
6.6 Self Assessment Questions and Exercises
6.7 Further Readings

BLOCK III: STRUCTURE UNION AND FILES


UNIT 7 STRUCTURES AND UNION 143-166
7.0 Introduction
7.1 Objectives
7.2 Structures
7.2.1 Processing a Structure; 7.2.2 Array of Structures
7.2.3 Structure Elements Passing to Functions; 7.2.4 Structure passing to Functions
7.2.5 Structure Within Structure; 7.2.6 Structure Containing Arrays; 7.2.7 Pointers to Structures
7.3 Union Definition
7.4 Enumerated Data Types
7.5 Answers to Check Your Progress Questions
7.6 Summary
7.7 Key Words
7.8 Self Assessment Questions and Exercises
7.9 Further Readings

UNIT 8 DATA FILE 167-187


8.0 Introduction
8.1 Objectives
8.2 Data File
8.2.1 Opening and Closing a Data File; 8.2.2 Concept of Binary Files
8.2.3 Formatted I/O Operations with Files; 8.2.4 Writing and Reading a Data File
8.2.5 Unformatted Data Files; 8.2.6 Processing a Data File; 8.2.7 Use of the Command Line Argument
8.3 Answers to Check Your Progress Questions
8.4 Summary
8.5 Key Words
8.6 Self Assessment Questions and Exercises
8.7 Further Readings

BLOCK IV: LINEAR DATA STRUCTURE


UNIT 9 INTRODUCTION TO DATA STRUCTURE 188-209
9.0 Introduction
9.1 Objectives
9.2 Stack Related Terms and Operations on Stack
9.3 Application and Implementation of Stack
9.3.1 Converting Infix Notation to Postfix and Prefix or Polish Notations
9.4 Answers to Check Your Progress Questions
9.5 Summary
9.6 Key Words
9.7 Self Assessment Questions and Exercises
9.8 Further Readings

UNIT 10 QUEUES 210-225


10.0 Introduction
10.1 Objectives
10.2 Queues
10.3 Representation of Queues
10.4 Circular Queue
10.5 Applications of Queues
10.6 Answers to Check Your Progress Questions
10.7 Summary
10.8 Key Words
10.9 Self Assessment Questions and Exercises
10.10 Further Readings

UNIT 11 LIST 226-262


11.0 Introduction
11.1 Objectives
11.2 Merging List and Linked List
11.3 Singly-Linked Lists
11.3.1 Traversing; 11.3.2 Insertion; 11.3.3 Deletion
11.4 Doubly-Linked Lists
11.4.1 Insertion; 11.4.2 Deletion
11.5 Header List
11.6 Representation of Linked List
11.7 Answers to Check Your Progress Questions
11.8 Summary
11.9 Key Words
11.10 Self Assessment Questions and Exercises
11.11 Further Readings

BLOCK V: NON-LINEAR DATA STRUCTURE


UNIT 12 INTRODUCTION TO TREES 263-267
12.0 Introduction
12.1 Objectives
12.2 Trees
12.2.1 Forms of Binary Tree
12.3 Answers to Check Your Progress Questions
12.4 Summary
12.5 Key Words
12.6 Self Assessment Questions and Exercises
12.7 Further Readings

UNIT 13 BINARY TREE REPRESENTATION 268-275


13.0 Introduction
13.1 Objectives
13.2 Binary Tree
13.2.1 Array Representation; 13.2.2 Linked Representation
13.3 Binary Tree Traversals
13.4 Answers to Check Your Progress Questions
13.5 Summary
13.6 Key Words
13.7 Self Assessment Questions and Exercises
13.8 Further Readings

UNIT 14 BINARY SEARCH TREE 276-304


14.0 Introduction
14.1 Objectives
14.2 Binary Search Tree
14.2.1 Inserting a Node; 14.2.2 Deleting a Node
14.3 Applications of Trees
14.4 Hashing Techniques
14.5 Answers to Check Your Progress Questions
14.6 Summary
14.7 Key Words
14.8 Self Assessment Questions and Exercises
14.9 Further Readings
INTRODUCTION

C is a programming language and is substantially different from C++ and C#.
Many operating systems are written in C, UNIX being the first. Later, Microsoft
Windows, Mac OS X and GNU/Linux were largely written in C. Not only is C the
language of operating systems, it is the precursor and inspiration for many of the
popular high-level languages available today. The reference implementations of
Perl, PHP, Python and Ruby are also written in C. In fact, one of the strengths of
C is its universality and portability across various computer architectures.
Therefore, C can be used for the development of different types of applications
that include real-time systems and expert systems. C also gives users the
flexibility to introduce new features in their programs, as the requirements
demand, through user-defined functions. The topics covered in this book
(algorithms, flow charts, decision-making statements, functions, arrays, linked
lists, stacks and trees, structures and pointers) are useful for program
developers. The goal of this book is to introduce you to C, a programming
language that is ideally suited to modern computers and modern programming.
The book, C and Data Structure, follows the self-instructional mode wherein
each unit begins with an Introduction to the topic followed by an outline of the
Objectives. The detailed content is then presented in a simple and structured manner
interspersed with Check Your Progress questions. A list of Key Words, a Summary
and a set of Self Assessment Questions and Exercises are also provided at the end
of each unit for effective recapitulation.

Self-Instructional Material

BLOCK - I
INTRODUCTION TO C

UNIT 1 PROGRAM DEVELOPMENT STYLES AND BASICS OF C
Structure
1.0 Introduction
1.1 Objectives
1.2 Introduction to Programming
1.2.1 Tokens, Identifiers and Keywords
1.2.2 The C Character Set
1.2.3 Elementary Data Types
1.2.4 Constants
1.2.5 Variable
1.3 Operators and Expressions
1.4 Answers to Check Your Progress Questions
1.5 Summary
1.6 Key Words
1.7 Self Assessment Questions and Exercises
1.8 Further Readings

1.0 INTRODUCTION

In this unit, you will learn about the program development life cycle, the building
blocks of a program, operators and expressions. There are various stages through
which a program passes in order to produce its output. A compiler is a program that
processes statements written in a particular programming language and turns them
into machine language or “code” that a computer’s processor uses. Tokens are
the atomic elements or building blocks of a program. There are six classes of tokens,
which are explained in this unit. Further, you will learn about operators and the
evaluation of expressions.

1.1 OBJECTIVES
After going through this unit, you will be able to:
• Discuss the concept of program development
• Explain the building blocks of a program
• Explain the various data types
• Define and initialize variables
• Explain the various types of operators

1.2 INTRODUCTION TO PROGRAMMING

A computer program or software is a sequence of instructions coded in a
programming language, such as FORTRAN, Pascal, BASIC, C, C++, Java, C#
(pronounced C sharp), etc. Software development was once considered to be an art.
Now, it is an engineering discipline having a set of well-developed methodologies
for software development and maintenance.
Software Development Life Cycle (SDLC)
When programming languages became available in the 1960s, software
development emerged as an independent profession. Till then, only computer
specialists who were knowledgeable about computer hardware could make
computers perform intended operations, using machine languages.
The emergence of the various high-level languages and the progressive reduction in
the cost of computer systems gave rise to a number of software developers or
programmers. The software development projects undertaken in the early days
essentially consisted of writing the program statements or the code for the
applications straight away. This resulted in a software crisis, in which projects were
always getting delayed and exceeding the original cost by two to three times or even
more. The software did not perform the intended functions on delivery and contained
many defects. Therefore, a software engineering conference was held in the early
1970s to overcome the crisis. Software engineering is essentially the application of
engineering principles to the development of software. One of the recommendations
was to divide the software development activity into a number of phases. This led
to the formulation of a software development life cycle model. The stages of the
SDLC model are shown in Figure 1.1.
1. Planning Stage (resource allocation and team assignments)
2. Requirement Stage (simple description of the given problem, use-case diagram)
3. Development Stage (test scripts are developed and integrated)
4. Testing Stage (data validation and integration are tested)
5. Release Stage (checklist and user's guide implementation)

Fig. 1.1 Stages of SDLC

C is a procedural systems implementation language. Despite its low-level capabilities,
the language was designed to encourage cross-platform programming. C has facilities
for structured programming and supports lexical variable scope and recursion. In a
C program, all executable code is contained within functions and function
parameters are always passed by value. Pass-by-reference is simulated in C by
explicitly passing pointer values. C uses the semicolon as a statement
terminator. Comments are written between the delimiters /* and */. C source files
contain declarations and function definitions. Function definitions contain
declarations and statements. Declarations either define new types using keywords
such as struct, union and enum, or assign types to new variables by writing the type
followed by the variable name. Keywords such as char and int specify built-in
types. Sections of code are enclosed in curly braces ({ }) to limit the
scope of declarations and to act as a single statement for control structures.
C statements denote specific actions. The most common statement is an expression
statement, which contains an expression to be evaluated followed by a semicolon
(;). To modify the normal sequential execution of statements, C supports various
control flow statements using reserved keywords. if and if…else
are used for conditional execution, whereas do…while, while and
for are used for iterative execution or looping. The break keyword leaves the
innermost enclosing loop or switch statement, while continue skips the rest of the
current iteration and proceeds to the next iteration of the innermost enclosing loop.
The goto statement jumps directly to the specified label within the function. switch
selects a case to be executed based on the value of an integer expression. Expressions
can use a variety of built-in operators and may contain function calls.
In Turbo C, a program is run by selecting the Run option from the main menu to get
the result. The following steps are involved in running an executable file:
• Execute or run the .exe file. The result is obtained if no error is present.
• If an error occurs, the program is debugged.
• After the error is corrected, the compilation step is repeated before the
  program is executed again.
After the program is written, it must be saved in a file with the .c extension. If the
.c file compiles successfully, the program can be run to produce the desired result.
The following sequence is required to run a C program:
Creating a C program → Compiling the program → Linking the program
with the C library → Executing the C program → Getting the output
Figure 1.2 shows the flow chart of the development stage of a simple C program.
The Calculate.c file includes predefined header files, for example <stdio.h>
and <stdlib.h>, and is then sent to the compiler.
After successful compilation, the Calculate.obj file is produced. The linker is
then used to link the program. Linking is the process of combining various pieces of
code and data together to form a single executable program that can be loaded in
memory. Linking can be done at compile time, at load time by loaders and also at run
time by application programs. After linking with library files, such as math.lib,
the Calculate.exe file is created.

[Figure 1.2: Calculate.c, together with predefined header files (e.g. stdio.h) and
other user-defined header files, goes to the Compiler, which produces Calculate.obj.
The Linker combines Calculate.obj with other user-generated object files and
libraries (e.g. math.lib) to produce Calculate.exe.]

Fig. 1.2 Development Stage of a C Program Using Turbo C


Calculate.exe is an executable file that, when run, displays the result
on the output screen. Figure 1.3 is a flow chart showing, as pseudo code, the
standard steps followed to create and run a C program.

Fig. 1.3 Flow Chart Containing Pseudo Code of Running a C Program

Users or programmers write the program in the Turbo C (TC) editor. In this editor,
the Run menu has a drop-down list of commands for running and debugging the
program, as shown in Table 1.1.
Table 1.1 Keystrokes for Operations Performed with a C Programming File

Operation        Key Combination
Run              CTRL+F9
Program reset    CTRL+F2
Go to cursor     F4
Trace into       F7
Step over        F8
User screen      ALT+F5

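Write the following C program in the Turbo C screen:

/* Example 1.1 */
#include <stdio.h>
#include <conio.h>   /* Turbo C header that declares clrscr() and getch() */
void main()
{
    clrscr();        /* clear the output screen */
    printf("\nWelcome to the world of C programming.");
    getch();         /* wait for a key press before returning to the editor */
}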
After the file is saved, it is compiled. The key combination [ALT+F9] is used to
compile the program. Once the program has compiled successfully, it can be executed
and its output is displayed on the user screen. You can view the user screen by
pressing the [ALT+F5] key combination.

Press any key to return to the editor screen where the C program is written. You
can exit from Turbo C by pressing [ALT+X].
1.2.1 Tokens, Identifiers and Keywords
There are six classes of tokens in the C programming language:
(i) Keyword
(ii) Identifier
(iii) Constant
(iv) String literal
(v) Operator
(vi) Punctuator
Tokens are the atomic elements or building blocks of a program. A C program
is constructed using tokens. There are certain other elements of a program
that do not form a part of any of the above classes. They are as follows:
• Blanks
• Horizontal tabs
• Vertical tabs
• New line characters
• Form feed
• Comments
1.2.2 The C Character Set
Most C implementations use the American Standard Code for Information
Interchange (ASCII) for representing characters. ASCII uses 7 bits for
representing each character or digit. The characters are coded from
0000000 (decimal 0) to 1111111 (decimal 127). Therefore, ASCII consists of
codes for 128 characters in all. The ASCII values (decimal equivalent of the 7
bits) of some alphabets and digits are given in Table 1.2.
Table 1.2 ASCII Values of Selected Alphabets

ASCII Value    Character or Digit
48             0
49             1
57             9
65             A
66             B
67             C
89             Y
90             Z
97             a
98             b
121            y
122            z

The digits and alphabets are organized sequentially and hence, it is easy to get the
ASCII value; for instance, the ASCII value of D is 68, E is 69, 8 is 56, x is 120,
and so on.
Identifiers
Any name is an identifier. Just as the name of a person, street or city helps in the
identification of a person or a street or a city, the identifier in the C language
assigns names to functions, constants, variables, etc. An identifier in the C
language is defined as a sequence of alphanumeric characters, i.e., alphabets or
digits, and underscores. The first character of an identifier has to be an alphabet
or an underscore; it cannot be a digit. In the C language, lowercase alphabets and
uppercase alphabets are considered to be different. For instance, VAR and var
represent different names in the C language.
VALID IDENTIFIERS
C1
PROC1
P34
VAR_1
EX1
a
bc
Ual1
Aa
INVALID IDENTIFIERS
1PROGA
4.3
A-B

Any function name is also an identifier. For instance, ‘printf’ is the name of a
function available with the C language system. The function helps in printing.
Therefore, identifiers can be constructed with alphabets (A...Z), (a...z) and
digits (0...9). In addition, underscore can also be used in identifiers. Unless
otherwise specified, small letters are usually used for identifiers.
Keywords
Keywords are also known as reserved words in C. For example, ‘int’ is a reserved
word or keyword. Keywords have a specific meaning to the compiler. They should be
used for giving specific instructions to the computer. These words cannot be used
for any other purpose, such as naming a variable. C is a very concise language
containing only thirty-two reserved words (in the original ANSI C standard), and
this is one of its strengths. Common operations, such as printing and input, are
implemented through library functions in C, giving relief to programmers and
reducing the size of code as compared to other programming languages. This makes
the task of programming simple.
Now take a look at the keywords given in Table 1.3. You will come across
most of them in the book. Their meaning will become clear as you read the book.
Table 1.3 C Keywords
auto        break       case        char
const       continue    default     do
double      else        enum        extern
float       for         goto        if
int         long        register    return
short       signed      sizeof      static
struct      switch      typedef     union
unsigned    void        volatile    while

1.2.3 Elementary Data Types


Data is used in a program to get information. In a program used to find out the
greater of two numbers, the numbers are data, and the output which says which
number is greater, is information. C is a versatile language and handles many different
types of data in an elegant manner.
Bit stands for binary digit, i.e., 0 or 1. Each byte is a collection of 8 bits, i.e.,
8 consecutive bits of ‘0’ or ‘1’. Data is handled in a computer generally in terms of
bytes and therefore, will be in the form of multiples of 8 bits. Each ASCII character
is represented by one byte.
Fundamental Data Types
An item that holds data is also called an object. An object has a name or an
identifier associated with it. Each object can hold a specific type of data. There
are five basic data types in C, as follows:
(i) Character
(ii) Integer
(iii) Real numbers
(iv) Void
(v) Enum
You have to first understand how a computer works. Assume that two
numbers a and b are to be multiplied. First, the two numbers have to be stored in
the memory. Then the required calculation has to be performed. The result also
has to be stored in memory. Each number is of a specific data type; for instance,
all three of them can be declared to be integers. Each data type occupies a specific
size in the memory. What does one mean by size? It is the amount of storage
space required; each bit needs one storage space. One byte needs eight storage
spaces. If a number is of type integer declared as int, it is stored in 2 bytes on a
16-bit compiler such as Turbo C. A number is stored in a different form depending
on its type. If a number is of float type, it takes 4 bytes to store it. All characters
can be represented according to the ASCII table and hence, 1 byte, i.e., 8 bits, is
enough to store a character, which is represented as char.
These sizes may vary from computer to computer. The header files
<limits.h> and <float.h> contain information about the ranges and precision
of the data types on a given system.
Real numbers can be expressed with single precision or double precision.
Double precision means that real numbers can be expressed more precisely.
Double precision also means more digits in the mantissa. The type ‘float’ means
single precision and ‘double’ means a double precision real number. Table 1.4
indicates the size of various data types.
Table 1.4 Size of Data Types
Data Type    Size
char         1 byte
int          2 bytes
float        4 bytes
double       8 bytes

1.2.4 Constants
The following are the types of constants:
 Integer constant
 Character constant
 Float constant
 Enumeration constant
 String constant
 Symbolic constant
All these types are explained below.
Integer Constants
The following are the types of integers:
int
unsigned int
long
unsigned long
Variations in integer types
We can use the sign bit also for holding the value. In such cases, the variable will
be called unsigned int. The maximum value of a 2-byte unsigned int will be
equal to 65535 because we are using the Most Significant Bit (MSB) also for
storing the value. The minimum value will obviously be 0.
A long integer is represented as long int or simply long. The maximum
and minimum values of long are given below:
LONG_MAX    +2147483647
LONG_MIN    -2147483647
Unless otherwise specified, integers or long integers will be signed, i.e., the
first bit will be reserved for the sign. The long int here uses 4 bytes or 32
bits.
The maximum magnitude of long can also be doubled by using an unsigned long
integer denoted as unsigned long.
However, integers are not suitable for very low values and very large values.
This can be overcome by floating point or real numbers.
An integer constant may be suffixed by the letter u or U to specify that it is
an unsigned integer. Similarly, if the integer is suffixed with l or L, it
signifies a long integer. To specify an unsigned long integer, we
suffix the constant with ul or UL.
The following are examples of valid and invalid integers:
Valid integers
+345         /* integer */
345          /* integer */
-345         /* integer */
729u         /* unsigned integer */
729U         /* unsigned integer */
-112345L     /* long integer */
112345UL     /* unsigned long integer */
+112345l     /* long integer */
112345l      /* long integer - if no sign precedes, it is a
                positive number */
Invalid integers
345.0        /* decimal point not allowed */
112, 345L    /* no comma allowed */
112 345UL    /* blank not allowed */
112890345L   /* exceeds the maximum */
+112 345UL   /* unsigned cannot have + */
(345l        /* ( not allowed */
-345s        /* illegal characters */
We have so far considered only decimal numbers. The C language, however,
supports other types of numbers as well. Octal numbers are preceded by 0
(zero).
The following are examples of valid and invalid octal numbers:
Valid octal numbers
0346
0547
0120
Invalid octal numbers
0394       /* 8 or 9 are not allowed in an octal number */
0 x 345    /* prefix has to be zero only */
The C language also supports hexadecimal numbers. Here, since the base
is 16, we use alphabets also in the numbers as given in Table 1.5.
Table 1.5
a or A for 10
b or B for 11
c or C for 12
d or D for 13
e or E for 14
f or F for 15

Additionally, hexadecimal numbers will be preceded by 0X or 0x, i.e., zero
followed by x.
The following are examples of valid and invalid hexadecimal numbers:
Valid hexadecimal numbers
0x345
0xA132
0x100
0x20B
Invalid hexadecimal numbers
0x, 123    /* no comma */
0x         /* cannot be empty */
0A00       /* x is missing */
Character Constants
A character constant is a single character enclosed in single quotes, as in 'x'.
Characters can be letters, digits or special symbols.
The following are examples of valid and invalid character constants:
Valid character constants
'A'
'Z'
'C'
'c'
'\n' /* escape sequence: new line */
'\t' /* escape sequence: horizontal tab */
'\b' /* escape sequence: backspace */
Invalid character constants
'\u' /* incomplete escape sequence */
AA /* not enclosed in quotes */
'AA' /* more than one character */
"AA" /* double quotes denote a string */
'1a' /* more than one character */
A character constant represents its integer value as defined in the character set of
the machine. Therefore, you can add 2 characters. For example, the ASCII values
of digit 1 = 49 and C = 67. When we add these values we get code 116 whose
equivalent character is t.
Let us verify this with the following example:
/* Example 1.2
demonstrates that chars can be treated like
integers*/
#include <stdio.h>
int main()
{
const char ALPHA1='1';
char alpha2='C';
char alpha3;
alpha3=ALPHA1+alpha2; /* 49 + 67 = 116, the code of t */
putchar(alpha3);
return 0;
}
Result of the program
t
Therefore, characters can be treated like integers as well, although they are
declared as character variables and constants. Since character constants are of type int
and char values are converted to int in arithmetic, we could add them. The sum of
characters can also be stored in an integer variable as given in Example
1.3.
/* Example 1.3
Demonstrates that a char can also be declared as
int*/
#include <stdio.h>
int main()
{
int x;
x='1'+'C';
printf("x as integer=%d\n", x);/*x printed as integer*/
printf("x as character=%c\n", x);/*x printed as character*/
return 0;
}
Result of the program
x as integer=116
x as character=t
Floating Point or Real Numbers
Let us enumerate the difference between floating point and integer numbers.
• Integers are whole numbers without decimal points, but a floating-point number
always has a decimal point. Even if the number is a whole number, it is written with a
decimal point. For instance, 42 is an integer, while 42.0 is a floating-point
number.
• Floating-point numbers usually occupy more space for storage as we have already
seen.
A real number in the simple form consists of a whole number followed by
the decimal point and also one or more decimal numbers following the decimal
point, which makes the fractional part. This form of representation is known as
fractional form. It must have a decimal point. It could be either positive or negative.
As usual the default sign is positive. No commas or blanks or special characters
are allowed in between.
The following are the examples of valid and invalid float types:
Valid floats
144.00
226.012
Invalid floats
+144 /* no decimal point */
1,44.0 /* comma not allowed */
Scientific notation: Floating-point numbers can also be expressed in scientific
notation. For example, 3.0E2 is a floating-point number. The value of the number
will be equal to 3.0 × 10^2 = 300.0.
Instead of the upper case E, the lower case e can be used as in
0.453e+05, which will be equal to 0.453 × 10^5 = 45300.
No blanks are allowed within the number.
There are two parts in the scientific notation of a real number, which are as follows:
(i) Mantissa (before E)
(ii) Exponent (after E)
In the scientific form the following rules are to be observed:
• The mantissa part can be positive or negative.
• The exponent must have at least one digit, which can be a positive or negative
integer. Remember the exponent cannot be a floating-point number.
Type float is a single precision number, typically occupying a storage space of 4
bytes.
Type double represents floating-point numbers of double precision and
hence typically occupies 8 bytes.
If you look at the file <float.h> you will find the maximum and minimum
floating-point numbers as given below.
FLT_MAX 1E+37 (or greater) maximum floating point number
FLT_MIN 1E-37 (or smaller) minimum positive floating point number
Floating-point constants The constants are suffixed as given below:
F or f – float
no suffix – double
L or l – long double
If an integer is suffixed with L or l, then it is a long integer.
If a float is suffixed with L or l, then it is a long double floating-point
number.
Examples
Valid floating-point constants
1.0e5
123.0f /* float */
11123.05 /* double */
23467.34e5l /* long double */
Invalid real constants
245.0 /* not a float constant: without a suffix it is a double */
456 /* It is an integer */
1.0e5.0 /* exponent cannot be a real number */
When they are declared as variables, they can be declared as follows:
float a = 3.12;
float a, b, c;
float val1;
float val2;
long double val3;
The values of constants cannot be altered in programs. They can be defined as
follows:
const int PRINC = 1000;
const float INT_RATE = 0.12f;
The values of PRINC and INT_RATE cannot be changed in the program even by
introducing another assignment statement when they are declared as constants
using const. Example 1.4 verifies this statement:
/*Example 1.4
Demonstrates that constants cannot be changed
even with assignment statements. To verify, include
statements 7, 8 & 9 in the program by removing
the comment specifiers at the beginning of
statement 7 and the end of statement 9*/
#include<stdio.h>
int main()
{
const int PRINC =1000;
const float INTST=0.12f;
printf("PRINCIPAL=%d INTEREST=%f\n", PRINC, INTST);
/*PRINC =2000;
INTST=0.24f;
printf("PRINCIPAL=%d INTEREST=%f\n", PRINC, INTST);*/
return 0;
}
Key in the example and execute the program. After successful compilation, you
will get the result as follows:
Result of the program
PRINCIPAL=1000 INTEREST=0.120000
Now include the second part of the program by removing /* and */ at
statements 7 and 9, respectively. Earlier this was treated as a comment. Now this
part will get included in the program. Now compile it. You will get a compilation
error. This is due to your attempt to assign new values to the constants PRINC and INTST,
which is not allowed. Incidentally, the technique of including or excluding a program
segment at will using /* and */ is a convenient method for program development.
String Constants
A character constant is a single character enclosed within single quotes. A string
constant is a number of characters, arranged consecutively and enclosed within
double quotes.
Examples of valid strings:
"God"
"is within me"
""
You may be surprised about the third string constant, which has no character.
This is called a null or empty string and is allowed in C.
The string constant can contain blanks, special characters, quotation marks
(written as \"), etc. within the string. In order to mark the end of a string constant, the compiler
places a null character \0 (back slash followed by zero) after the last character of each string.
The null character is not visible when the string appears
on the screen. The programmer does not include the null character either. It is the
compiler which automatically inserts the null character at the end of every string.
Invalid string:
'Yoga' /* should be enclosed in double quotes */
Symbolic Constants
The format for symbolic constant is as follows:
# define name constant
For example, we can define:
# define INITIAL 1
which defines INITIAL as 1.
Names defined in this way are called symbolic constants. They are not
variables and hence, they are not defined as part of the declarations of variables.
They are usually specified on top of the program before the main function. By convention, the symbolic
constants are written in capital or upper case letters. Wherever the symbolic
constant names appear in the program, the preprocessor will replace them with the
corresponding replacement text defined in the # define directive before compilation. In this
case, 1 will be substituted wherever INITIAL appears in the program. Note that
there is no semicolon at the end of the # define directive.
1.2.5 Variable
The names of variables and constants are identifiers. The names are made up of
letters, digits and underscore, but the first character of an identifier must be a
letter or an underscore. C treats at least the first 31 characters of an identifier (name) as significant and therefore,
the naming of the variables should be carried out in an easily understandable manner.
For example, in the program for the calculation of,
Simple interest I = pnr/100,
you can declare them with actual names,
p = principal, r = rate_of_interest, n = number_of_
years
Naturally, programmers may not like typing long names for fear of making mistakes
elsewhere in the program apart from being reluctant to increase their typing
workload. Meaningful names can, however, be assigned to variables even with
few letters. For example,
p = princ; r = intrate; n = years
Some compilers may allow names with up to 31 (thirty-one) characters, but may
consider only the first eight characters for all purposes. Hence, a programmer could
coin shorter yet meaningful names, instead of using single letters for variable
names. One should, however, be careful not to use the reserved words, such as
the 32 keywords, for variable names as they have a specific meaning to the compiler.
If they are used as variable names, the compiler will report an error.
A program to find out the square of integer 5 is given as follows:
/*Example 1.5
program to find square of 5*/
#include <stdio.h>
int main()
{
printf("square of %d= %d", 5, 5*5);
return 0;
}

Result of the program

square of 5= 25
Variables: An Overview
You have now achieved the objective of finding the square of 5. Later on, you may
want to find out the square of another number, say 7, for example. We would have
to write the same program again replacing 5 by 7 and then compile and run it. This
would waste a lot of time. To save time, we can, therefore, write a general-purpose
program as shown in Example 1.6.
/*Example 1.6
program to find square of any given number*/
#include <stdio.h>
int main()
{
int num;
printf("Enter the integer whose square is to be found\n");
scanf("%d", &num);
printf("square of %d= %d", num, num*num);
return 0;
}

Here, we define num as an integer variable. When '&' precedes num, it indicates
the memory address of num.
At the first printf, the message appears as it is and the cursor goes to
the next line because of the new line character \n at the end of the message, but
before the closing quotation mark. The next statement is new to you. It is used to
receive an integer typed on the console. You can type in any integer number, and
the number typed will be stored in the memory at the memory location named
'num'. The purpose of the statement is, therefore, to get the integer (because of
the appearance of %d within quotes) and store it at the memory address of 'num'.
The next statement prints the number typed and its square.
When you execute the program, the following message appears on the
monitor:
Enter the integer whose square is to be found
Since we want to find out the square of 25, type:
25

Promptly, the reply will be shown as follows:

square of 25= 625

The next time you may want to find out the square of another number, say 121.
Then simply run the program and when prompted, type 121 to get the answer.
Here the number whose square has to be found out has been declared as a
variable. The variable has a name and is stored at the location pointed to by the
variable name. Therefore, the execution of the program for finding out the square
of any number is easy with the above modification.
Variables and constants are the basic data objects used in a program. A variable can
hold only one value at a time, but can change value during program execution.
A constant, as the name indicates, cannot be assigned a different value during
program execution. For example, PI, if declared as a constant, cannot have its
value varied in a given program. If PI has been declared as a constant = 3.14, it
cannot be reassigned any other value in the program. Programs may declare a
number of constants. Variables are similarly useful for any programming language.
If PI has been declared as a variable, then it can be changed in the program to
any value. This is one difference between a variable and a constant. Whether an
identifier is constant or variable depends on how it is declared. Both variables and
constants belong to one of the data types like int, float, etc. The convention
in ‘C’ is to indicate the names of constants by the upper case letters.
PI
SIGMA

Variable names are, on the other hand, indicated by the lower case letters.
int a
float xy

Size of Variables
The C programmer should understand how much memory storage each variable
type occupies on the compiler and machine in use. The following example will help us to find
the size of each variable type. The result will be in terms of bytes occupied by the
variable type. The operator sizeof, which is also a keyword, is used to find out the size. The syntax for
using it is as follows:
sizeof (<data type>)
or
sizeof (<expression>)

Consider the following example:
/*Example 1.7
program to find out the sizes of various types of
integers and floating-point numbers*/
#include<stdio.h>
int main()
{
/* sizeof yields an unsigned size; it is cast to int to match %d */
printf("size of char =%d\n", (int)sizeof(char));
printf("size of short=%d\n", (int)sizeof(short));
printf("size of int =%d\n", (int)sizeof(int));
printf("size of unsigned int=%d\n", (int)sizeof(unsigned));
printf("size of long int=%d\n", (int)sizeof(long));
printf("size of unsigned long int=%d\n", (int)sizeof(unsigned
long));
printf("size of float =%d\n", (int)sizeof(float));
printf("size of double=%d\n", (int)sizeof(double));
printf("size of long double=%d\n", (int)sizeof(long double));
return 0;
}
Result of the program (obtained with a 16-bit compiler; the sizes differ on other
compilers and machines)
size of char =1
size of short=2
size of int =2
size of unsigned int=2
size of long int=4
size of unsigned long int=4
size of float =4
size of double=8
size of long double=10

Therefore, on this compiler a long double occupies 10 bytes and stores
floating-point numbers with more than double precision.
Note that the size of short int will be either equal to or less than the size of an
integer variable.
Variables, which require more than 1 byte for storage, will be stored
in consecutive bytes of the memory.

Check Your Progress


1. Write the sequence required to run a C program.
2. Define keywords.
3. What are character constants?
1.3 OPERATORS AND EXPRESSIONS

An expression is a combination of variables, constants and operators written
according to the syntax of C language. In C, every expression evaluates to a
value, i.e., every expression results in some value of a certain type that can be
assigned to a variable. Some examples of C expressions are shown in Table 1.6.
Table 1.6 Representation of Arithmetic Expressions in C

Algebraic Expression      C Expression
ab – c                    a * b - c
(m + n)(x + y)            (m + n) * (x + y)
(ab / c)                  a * b / c
3x² + 2x + 1              3*x*x+2*x+1
(x / y) + c               x / y + c

Evaluation of Expressions
Expressions in C are evaluated using an assignment statement of the following
form: variable = expression;
variable is any valid C variable name. When the statement is encountered
then the expression is evaluated to replace the previous value of the variable
on the left hand side. All variables used in the expression must be assigned values
so that no error occurs at the time of evaluation. The following are some examples
of evaluation statements:
x = a * b - c;
y = b / c * a;
z = a - b / c + d;
The following program illustrates the effect of the presence of parentheses in expressions.
/*Example 1.8
Illustrates the effect of parentheses in expressions*/
#include <stdio.h>
int main ()
{
float a, b, c, x, y, z;
a = 9;
b = 12;
c = 3;
x = a - b / 3 + c * 2 - 1;
y = a - b / (3 + c) * (2 - 1);
z = a - ( b / (3 + c) * 2) - 1;
printf ("x = %f\n",x);
printf ("y = %f\n",y);
printf ("z = %f\n",z);
return 0;
}
Result of the program
x = 10.000000
y = 7.000000
z = 4.000000
An arithmetic expression without parentheses will be evaluated from left to right
using the rules of precedence of operators. The two distinct priority levels of
arithmetic operators in C are as follows:
High priority * / %
Low priority + -
Rules for Evaluation of Expression
The following are the rules for evaluation of expression in C language:
• First, parenthesized sub expressions are evaluated from left to right.
• If parentheses are nested, the evaluation begins with the innermost sub
expression.
• The precedence rule is applied in determining the order of application of
operators in evaluating sub expressions.
• The associativity rule is applied when two or more operators of the same
precedence level appear in the sub expression.
• Arithmetic expressions are evaluated from left to right using the rules of
precedence.
• When parentheses are used, the expressions within the parentheses assume
the highest priority.
Types of Expressions
The three types of expressions in C language are as follows:
1. Arithmetic: It evaluates to a number, for example a = 12.
2. String: It evaluates to a character or text string, for example "text" or
"12345".
3. Logical: It evaluates to a 'true' or 'false' value.
Conditional Expression in C
The conditional expression yields one of two values depending on a condition.
The following syntax is used to write a conditional expression:
(condition) ? val1 : val2;
In C, such an expression can be written as follows:
Status_of_person = (age >= 18) ? "adult" : "minor";
In this conditional expression either of the two values can be returned,
based on the value of age. If age is greater than or equal to 18 the value assigned to
Status_of_person will be "adult", and if age is less than 18 then the assigned value
will be "minor".
Types of Operators

Arithmetic Operators
The basic arithmetic operators are:
+ addition, e.g., c = a + b
– subtraction, e.g., c = a – b
* multiplication, e.g., c = a * b
/ division, e.g., c = a/b
% modulus, e.g., c = a % b
When we divide two numbers, we get a quotient and a remainder. To get the
quotient we use c = a/b;
/* c contains quotient */
To get the remainder we use c = a % b;
/* c contains the remainder */.
% is also popularly called modulus operator. Modulus cannot be used with floating-
point numbers.
Therefore, c = 100/6; will produce c = 16.
c = 100 % 6, will produce c = 4.
In expressions, the operators can be combined.
For example, a = 100 + 2/4;
What is the right answer?
Is it 100 + 0.5 = 100.5
or 102/4 = 25.5
To avoid ambiguity, there are defined precedence rules for operators in ‘C’.
Precedence refers to the evaluation order of operators. However, in an expression
there may be the same type of operators at a number of places, which may lead to
ambiguity as to which one to evaluate first. In addition, more than one operator
may have the same precedence. For example, * and / have the same precedence.
To avoid ambiguity in such cases, there is a rule called associativity. The precedence
and associativity of operators in ‘C’ are given in Annexure 1.

Annexure 1
Operator Precedence
Operator                                      Associativity
( ) [ ] -> . (dot)                            Left to right
! ~ ++ -- (unary) + - * &(address) sizeof     Right to left
* / % (modulus)                               Left to right
(binary) + - (subtract)                       Left to right
<< >>                                         Left to right
< <= > >=                                     Left to right
== !=                                         Left to right
& (bitwise and)                               Left to right
^                                             Left to right
|                                             Left to right
&&                                            Left to right
||                                            Left to right
?:                                            Right to left
= += -= *= /= %= &= ^= |= <<= >>=             Right to left
, (comma)                                     Left to right

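Associativity specifies the direction of evaluation, either left to right or right to left. This means that when
operators of the same precedence are encountered, they
have to be evaluated from left to right, if the associativity is left to
right.
Now refer to the previous example. Since / has precedence over +, 2/4 is
evaluated first and not 100 + 2. Note, however, that both operands of / are integers here, so 2/4
yields 0 and a becomes 100. The value 100.5 is obtained only when at least one operand is a
floating-point number, as in 100 + 2.0/4.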
In the precedence table, operators in the same row have the same
precedence. The lower the row, the lower the precedence.
For example, (), which represents a function, has a higher precedence
than !, which is in the lower row. Similarly * and / have higher precedence over
+ and –.
Whenever you are in doubt about the outcome of an expression, it would
be better to use parentheses to avoid the ambiguity.
Consider the following examples:
1) 12 – 3 * 3 = 12 – 9 = 3 and not 27.
2) 24 / 6 * 4 = 4 * 4 = 16 and not 1.
3) 4 + 3 * 2 = 4 + 6 = 10 and not 14.

4) 8 + 8 / 2 – 4 * 6 / 2
= 8 + 4 – 4 * 6 / 2
= 8 + 4 – 24 / 2
= 8 + 4 – 12 = 0
Note the steps involved in the previous example.
Relational Operators
Two variables of the same type may have a relationship between them. They can
be equal or one can be greater than the other or less than the other. You can check
this by using relational operators. While checking, the outcome may be true or
false.
For example, if a = 5 and b = 6;
a equals b is false.
a greater than b is false.
a greater than or equal to b is false.
a less than b is true.
a less than or equal to b is true.
Any two variables or constants or expressions can be compared using
relational operators. Table 1.7 below gives the relational operators available in
‘C’.
Table 1.7 Relational Operations in C

Operator                        Example    Read as
<   less than                   a < b      Is a < b
>   greater than                a > b      Is a > b
<=  less than or equal to       a <= b     Is a < or = b
>=  greater than or equal to    a >= b     Is a > or = b
==  equal to                    a == b     Is a equal to b
!=  not equal to                a != b     Is a not equal to b

Note that for checking equality the double equal sign is used, which is different
from some other programming languages. The statement a = b assigns the value of b
to a. For example, if b = 5, then a is also assigned the value of 5. The statement
a == b checks whether a is equal to b or not. If they are equal, the output will be
true; otherwise, it will be false.
Now look at their precedence from Annexure 1.
> >= < <= have precedence over == !=
Note that arithmetic operators + - * / have precedence over the relational
and logical operators.
Therefore, in the following statement:
(x - 3 > 5)
x - 3 will be evaluated first and only then the relation will be checked.
Therefore, there is no need to enclose x - 3 within parentheses as in ((x - 3)
> 5).
Logical Operators
You can use logical operators in programs. These logical operators are:
&& denoting logical And
|| denoting logical Or
! denoting logical Negation
The relational and logical operators are evaluated to check whether they are true
or false. ‘True’ is represented as 1 and ‘False’ is represented as 0.
It is also by convention that any non-zero value is considered as 1 (true)
and zero value is considered as 0 (false).
For example,
if (a – 3)
{s1}
else
{s2}
If a is 5, then s1 will be executed.
If a = 3, then s2 will be executed.
If a = –5, s1 will still be executed.
To summarize, the relational and logical operators are used to take decisions
based on the value of an expression.
Assignment Operators
Assignment operators are written as follows:
identifier = expression;
For example,
i = 3;
Note: 3 is an expression
const int A = 3;
‘C’ allows multiple assignments in the following form:
identifier 1 = identifier 2 = .....= expression.
For example,
a = b = z = 25;

However, you should know the difference between an assignment operator
and an equality operator. In some other languages, both are represented by =.
In 'C' the equality operator is expressed as == and assignment as =.

Shorthand Assignment Operators

You have been looking at simple and easily understandable assignment statements.
These can be written in a different manner when the RHS includes the LHS; or in other
words, when the result of computation is stored in one of the variables in the RHS.
The general form is exp1 = exp1 + exp2.
This can be also written as exp1 += exp2.
Examples:
simple form          special form
a = a + b;           a += b;
a = a + 1;           a += 1;
a = a - b;           a -= b;
a = a - 2;           a -= 2;
a = a*b;             a *= b;
a = a*(b + c);       a *= b + c;
a = a/b;             a /= b;
a = a/2;             a /= 2;
d = d - (a + b);     d -= a + b;
The assignment operators =, +=, -=, *=, /=, %= have the
same precedence or priority; however, they all have a much lower priority or
precedence than the arithmetic operators. Therefore, the arithmetic operations
will be carried out first before they are used to assign the values.
Conditional Operator
The conditional operator is also termed the ternary operator and is denoted by ?:
The syntax for the conditional operator is as follows:
(condition) ? expression1 : expression2;
What does it mean? If the condition is true, expression1 is evaluated; else,
expression2 is evaluated. The conditional operator is quite handy in simple situations as follows:
(a > b) ? print a greater
: print b greater
Thus, the operator has to be used in simple situations. Both expressions must be
written; standard C does not allow the position corresponding to else to be left
empty.
An example is as follows:
/*Example 1.9
Demonstrates use of the ? operator*/
#include <stdio.h>
int main()
{
unsigned int a,b;
printf ("enter two integers\n");
scanf("%u%u", &a, &b);
(a==b)?printf("you typed equal numbers\n"):
printf("numbers not equal\n");
return 0;
}
Result of the program
enter two integers
123 456
numbers not equal
Increment and Decrement Operators
C contains two increment and decrement operators which are present in postfix
and prefix forms. Both forms are used to increment or decrement the appropriate
variable. The statement ++i (prefix form) increments i before using its value,
while i++ (postfix form) increments it after its value has been used. Both the
forms will produce different outputs when evaluated.
Increment operator ++
The ++ (increment) operator adds 1 to the value of a scalar operand or if the
operand is a pointer then increments the operand by the size of the object to
which it points. The operand receives the result of the increment operation. The
operand must be a modifiable lvalue of arithmetic or pointer type. You can
put ++ before or after the operand. If it appears before the operand, the operand
is incremented first and then used in the expression. If you put ++ after the operand,
the value of the operand is used in the expression before the operand is
incremented. The following statement shows the increment operator concept:
play = ++play1 + play2++;
This statement is similar to the following expressions:
int temp, temp1, temp2;
temp1 = play1 + 1;
temp2 = play2;
play1 = temp1;
temp = temp1 + temp2;
play2 = play2 + 1;
play = temp;

The result has the same type as the operand after integral promotion. The usual
arithmetic conversions on the operand are performed. In prefix operation, the
value is incremented or decremented first and then applied, while in postfix the
value is applied first and then incremented or decremented.
/*Example 1.10
Demonstrates postfix and prefix increment*/
#include<stdio.h>
int main( )
{
int i = 3,j = 4,k;
k = i++ + ++j; /* 3 + 5 */
printf("i = %d, j = %d, k = %d",i,j,k);
return 0;
}
Result of the Program
i = 4, j = 5, k = 8
Decrement operator
The -- (decrement) operator subtracts 1 from the value of a scalar operand or if
the operand is a pointer decreases the operand by the size of the object to which
it points. The operand receives the result of the decrement operation. The operand
must be a modifiable lvalue. You can put -- before or after the operand. If it
appears before the operand the operand is decremented and the decremented
value is used in the expression. But if -- appears after the operand then the current
value of the operand is first used in the expression and then the operand is
decremented. The following statement shows the decrement operator concept:
play = --play1 + play2--;
This statement is similar to the following expressions:
int temp, temp1, temp2;
temp1 = play1 - 1;
temp2 = play2;
play1 = temp1;
temp = temp1 + temp2;
play2 = play2 - 1;
play = temp;
Bitwise Operators
Bitwise operators access the internal representation of the numbers, which are
bits 0 or 1. These operators apply only to the integer family operands including
char. There are six operators for bit-wise operation or manipulation. The operators
and their symbols are as follows:
& bit-wise AND
| bit-wise OR
^ bit-wise exclusive OR
<< left shift
>> right shift
~ one's complement
Now using examples, see the operation of bit-wise operators in detail:
1's complement operator: It changes 1 to 0 and 0 to 1. If x = 1100,
~x = 0011.
The usage is as follows:
Let us declare i, j as integers;
i = 6666;
j = ~ i;
A table for 1's complement of octal numbers is as follows:
Octal Number    Binary    Complement in Binary    Complement in Octal
0               000       111                     7
1               001       110                     6
2               010       101                     5
3               011       100                     4
4               100       011                     3
5               101       010                     2
6               110       001                     1
7               111       000                     0

Therefore, you can write the 1's complements from this table.
j = ~ i
i = 015012
j = 162765
In a 16-bit word the leading octal digit stands for a single bit, so its complement
is 1 and not 7.
Complementing can be used to scramble information in a simple way, although this
alone does not provide real security.
Right shift operator
unsigned int i , j ;
j = i >> 2 ;
Here, the bits in i will be shifted 2 places to the right and stored in j.
If you write i >> 6, the bits will be shifted right by 6 places. What
happens to the leftmost bits? For an unsigned operand, they will all be filled with zeros.
Now, i = 6666 in hex will be 1A0A
Let, j = i >> 4 ;
The bits will be shifted by 4 places.
Therefore, j = 01A0
i = 015012 in octal
if j = i >> 3 then
j = 001501;
Left shift operator
j = i << p;
This expression will shift i left by p bits and will store the result in j.
What happens to the rightmost bits shifted? Zeros will be inserted whenever
a bit is shifted.
i = 015012 /* octal number */
j = i << 3;
The result will be 150120.
i = 1A0A; /* hexadecimal number */
j = i << 4 ;
The shifted number will be,
A0A0
Shifting again by 4 bits will give 0A00.
Check by shifting the octal number 150120 to the left by 3 bits. In a 16-bit word the
result will be 101200 and not 1501200, because the bits shifted out beyond the
leftmost position are lost.
AND operator: It compares two bits, and if both are 1, the output is 1, otherwise
zero. This can be used to compare the sets of bits. Assume that you want to check
only the 16th bit in the 16-bit word of a number, you can carry out the task using
AND operator on the number and word with the 16th bit 1 and all other 15 bits 0.
When you compare using AND, the 15 bits will be 0 and 16th bit will be a 0 or 1,
depending on the number.
Remember to use a single &. You know && stands for logical AND.
This will operate on 2 operands.
Let us have, a = 015012 octal
b = 177777
c = a & b will provide an output of c = 015012 octal. Verify
by converting into bits.
a = 1A0A hexadecimal
b = 0000
c = a & b will produce c = 0000 hexadecimal because b is all
zeros.
OR operator
This is also a binary operator. The output of OR operator will be 0, if both the
inputs are 0, and 1 otherwise.
Let, a = 015012 octal
b = 000000
c = a | b will produce c = 015012 octal
Let, a = 015012 octal
b = 177777 octal
c = a | b will produce c = 177777 octal.
This is because b is all ones and hence a | b will automatically produce all 1s,
even without looking at a.
When b is all zeros, the output will be 1 wherever a is 1. Therefore, the
output will be same as a.

Exclusive OR operator: When exactly one of the two bits being compared is 1, we get the output
of exclusive OR as 1; otherwise, the output will be 0.
a = 1A0A hex
= 0001101000001010
Let, b = 1111111111111111 = FFFF hex
a ^ b= 1110010111110101 = E5F5 hex
Let, c = 0000000000000000 = 0000 hex
a ^ c= 0001101000001010 = 1A0A hex
b ^ c= FFFF hex

Check Your Progress


4. What is an expression?
5. What are the three types of expressions?

1.4 ANSWERS TO CHECK YOUR PROGRESS QUESTIONS

1. The following sequence is required to run a C program: Creating a C program
→ compiling the program → linking the program with the C library → executing
the C program → getting the output.
2. Keywords are also known as reserved words in C. ‘int’ is a reserved
word or keyword. Keywords have a specific meaning to the compiler. They
should be used for giving specific instructions to the computer. These words
cannot be used for any other purpose such as naming a variable.
3. A character constant is a single character enclosed in single quotes as in ‘x’.
Characters can be alphabets, digits or special symbols.
4. An expression is a combination of variables, constants and operators written
according to the syntax of C language.
5. The three types of expressions in C language are as follows:
(i) Arithmetic
(ii) String
(iii) Logical

1.5 SUMMARY

• A computer program or software is a sequence of instructions coded in a
programming language, such as FORTRAN, Pascal, BASIC, C, C++, Java,
C# (pronounced C sharp), etc.
• The following sequence is required to run a C program: Creating a C program
→ compiling the program → linking the program with the C library → executing
the C program → getting the output.
• Tokens are similar to atomic elements or building blocks of a program. A C
program is constructed using tokens.
• Any name is an identifier. Just as the name of a person, street or city helps
in the identification of a person or a street or a city, the identifier in the C
language assigns names to files, functions, constants, variables, etc.
• Keywords are also known as reserved words in C. ‘int’ is a reserved
word or keyword. Keywords have a specific meaning to the compiler.
They should be used for giving specific instructions to the computer.
• A character constant is a single character enclosed in single quotes as in ‘x’.
Characters can be alphabets, digits or special symbols.
• Variables and constants are the basic data objects of a program. A variable
holds only one value at a time, but its value can change during program
execution. A constant, as the name indicates, cannot be assigned a different
value during program execution.
• An expression is a combination of variables, constants and operators written
according to the syntax of C language. In C, every expression evaluates to
a value, i.e., every expression results in some value of a certain type that
can be assigned to a variable.
• Precedence refers to the order in which the different operators in an
expression are applied.
• Associativity is either left to right or right to left. It decides the order of
evaluation when operators of the same precedence are encountered: if the
associativity is left to right, they are evaluated from left to right.

1.6 KEY WORDS

• Token: It is the smallest element of a program that is meaningful to the
compiler.
• Operator: It is a symbol that tells the compiler to perform specific
mathematical or logical functions.
• Compiler: It is a special program that processes statements written in a
particular programming language and turns them into machine language or
“code” that a computer’s processor uses.

1.7 SELF ASSESSMENT QUESTIONS AND EXERCISES

Short-Answer Questions
1. What are the different stages of software development?
2. What are the different types of constants?
3. What are the different types of operators in C language?
Long-Answer Questions
1. Discuss the process to run a C program.
2. What are the six classes of tokens in C programming language? Explain.
3. What are variables? Explain the declaration and initialization of variables.
4. Discuss the evaluation of expression with the help of an example.

1.8 FURTHER READINGS

Gottfried, Byron S. 1996. Programming With C, Schaum’s Outline Series. New
York: McGraw-Hill.
Jeyapoovan T. 2006. Computer Programming - Theory & Practice. New Delhi:
Vikas Publishing.
Khurana, Rohit. 2005. Object Oriented Programming. New Delhi: Vikas
Publishing.
Kanetkar, Yashwant. 2003. Let Us C. New Delhi: BPB Publication.
Saxena, Sanjay. 2003. A First Course In Computers. New Delhi: Vikas
Publishing.
Subburaj, R. 2000. Programming In C. New Delhi: Vikas Publishing.

UNIT 2 DATA INPUT, OUTPUT AND PRELIMINARIES

Structure
2.0 Introduction
2.1 Objectives
2.2 I/O Functions: Entering Input Data and Writing Output Data
2.2.1 Single Character Input/Output
2.2.2 Strings—gets() and puts()
2.3 Answers to Check Your Progress Questions
2.4 Summary
2.5 Key Words
2.6 Self Assessment Questions and Exercises
2.7 Further Readings

2.0 INTRODUCTION

In this unit, you will learn about the various types of functions for reading and
writing data on the console. The input and output functions of a computer facilitate
interactions between the computer and the user. The user has to input data in
order to process it in the computer and get the result as the output. The printf()
statement can be programmed to give the output in the desired manner. The
functions gets() and puts() are appropriate when whole strings are to be
received from the keyboard or sent to the screen.

2.1 OBJECTIVES
After going through this unit, you will be able to:
• Discuss the different input/output functions
• Define string
• Understand the use of gets() and puts()

2.2 I/O FUNCTIONS: ENTERING INPUT DATA AND WRITING OUTPUT DATA

The input and output functions of a computer facilitate communication with the
external world. They also facilitate interaction between the computer and the user.
The user has to input data in order to process it in the computer and get the result
as output. The peripheral device for input is the keyboard. Keying in of the input
data into the computer at run-time is achieved easily by library functions, which
are supplied along with the ‘C’ compiler. The functions, which enable keying in of
the input data from the keyboard, are given as follows:
• scanf()
• getchar()
• getch()
• getche()
• gets()
Of these, scanf(), getchar() and gets() are declared in the standard header
<stdio.h>. The functions getch() and getche() are not part of standard C; they are
supplied by some PC compilers in the header <conio.h>.
The computer communicates its output, i.e., the result of computation, either
through the console video monitor, printer, disk drive or input/output ports. Since
we are learning ‘C’ through the PC, we will get the output through the video
monitor. Just as there are library functions for input, there are library
functions for output as well. They are given as follows:
• printf()
• putchar()
• putch()
• puts()
Here printf(), putchar() and puts() are standard functions declared in <stdio.h>,
whereas putch() is again a <conio.h> function.
Giving input and getting output are achieved by using these library
functions. Note that all the function names are followed by parentheses. The
parentheses are meant for passing arguments. The arguments
may be absent in certain functions as in the case of main(). However, function
names must be followed by parentheses to indicate to the compiler that they are
functions. Arguments can be either variables or constants.
Use of printf()
Initially, formatted input/output statements will be discussed. Here, one must
explicitly specify the data type of variables to be read or written and their
formats. The scanf() and printf() functions are the formatted input and output
functions respectively. The printf() statement can be programmed to give the
output in the desired manner. Example 2.1 is given below to illustrate the use of
the printf() function in printing a message.
/* Example 2.1
   To demonstrate the print function */
#include <stdio.h>
int main(void)
{
    printf("welcome to more serious programming\n");
    return 0;
}
Here the first statement after the comment directs the compiler to include
the standard input/output header file. Note that in the printf() statement whatever
is given within quotes, except the characters immediately following the ‘\’ and ‘%’
symbols, will be printed as it is. Those with the ‘\’ symbol like ‘\n’, ‘\t’ are escape
sequences. Those with the ‘%’ symbol such as ‘%d’, ‘%f’ are known as conversion
characters. They have a specific meaning to the printf() function.
Result of Program
welcome to more serious programming
Then the cursor will go to the next line. The cursor is made to go to the next
line because of the new line character \n. The escape sequences carry out the
functions assigned when their turn for execution is reached in the printf()
statement.
Now change the statement as given below:
printf("\tWelcome"); and execute the program.
Welcome will appear from the first tab position.
Execute the program again. Since this statement prints no new line character,
you will find that the message is printed from the next available tab position
on the same line.
2.2.1 Single Character Input/Output
Characters can be scanned through the scanf() function and printed through
the printf() function as given in Example 2.2.
/* Example 2.2
   Input and output of a character through scanf()
   and printf() */
#include <stdio.h>
int main(void)
{
    char alpha;
    printf("Enter a character\n");
    scanf("%c", &alpha);
    printf("\n The character typed by you is:- %c\n", alpha);
    return 0;
}
The program is simple to understand.
A variable alpha is declared of type char. The message directs the user
to enter a character. The scanf() function uses the conversion character %c
since we have to scan a character type variable. The ampersand in &alpha gives
the address where the scanned character is to be stored.
There are two points to be noted here. We have to specify the format as %c
and after entering the character, we have to hit the Return key. The interaction
with the computer while executing the program was captured and is given below:
Enter a character
S
The character typed by you is:- S
2.2.2 Strings—gets() and puts()
A string is an array of characters terminated by the null character '\0'. The
functions gets() and puts() are appropriate when whole strings are to be received
from the keyboard or sent to the screen.
Standard Library for Strings
There are a number of library functions for string manipulation as given below:
size_t strlen (cs): returns the length of string cs, not counting the
terminating null character.
char * strcpy (s, ct): copies string ct to string s, including the terminating
null character '\0', and returns s.
char * strcat (s, ct): concatenates string ct to the end of string s;
returns s.
int strcmp (cs, ct): compares string cs to string ct; returns < 0 if
cs < ct,
0 if cs == ct or > 0 if cs > ct.
char * strchr (cs, c): returns a pointer to the first occurrence
of c in cs or NULL if not present.
There are some more string functions.
If these are to be used, <string.h> should be included before the main
function.
Use of gets() and puts()
One can use scanf() to receive strings from the keyboard. The program fragment
using scanf() for reading a name is as follows:
char name[25];
scanf("%24s", name);
The width 24 in %24s stops scanf() from storing more characters than the array
can hold (24 characters and the terminating null character).
Strings can be declared as an array of characters as shown above. In the
scanf() function, when we get the array of characters as a string, it is enough to
indicate the name of the array without a subscript. When we get a string, there is
no need for writing ‘&’ in the scanf() function. We can indicate the name of
the string variable itself.
Strings may contain blanks in between. If you use a scanf() function to
get the string with a space in between such as ‘Rama Krishnan’, Krishnan will not
be taken note of since space is an indicator of the end of entry of the input. But
gets() will take all that is entered till the Enter key is pressed. Therefore, after
entering the full name, the Enter key can be pressed. Thus, gets() is a simpler
way of reading a string that contains spaces.
The syntax for gets is, gets(name);
Note, however, that gets() cannot be told the size of the array, so a long input
line overruns it. For this reason gets() was removed from the C standard in C11;
the standard function fgets() accepts the array size and is preferred in new
programs.
Similarly puts() can be used for printing a variable or a constant as given
below:
puts(name);
puts("Enter the word");
However, there is a limitation. printf() can be used to print more than
one variable and scanf() to get more than one variable at a time in a single
statement. However, puts() can output only one string and gets() can
input only one string in one statement. In order to input or output more than one
string, separate statements have to be written for each string. Since
gets() and puts() are unformatted I/O functions, there are no format
specifications associated with them.
We will take another interesting example. A word is a palindrome if we
get the same word when we read the characters from the right to the left as well.
Examples are: nun
malayalam
These words read the same from either side. We will write
a program to check whether a word is a palindrome or not.
This program uses a library function called strlen(). The function
strlen(str) returns the length of the given string. Now let us look at
the program.
/* Example 2.3
   To check whether a string is a palindrome */
#include <stdio.h>
#include <string.h>
#define FALSE 0
int main(void)
{
    int flag = 1;
    int right, left, n;
    char w[50]; /* maximum width of string 50 */
    puts("Enter string to be checked for palindrome");
    gets(w); /* removed from the C standard in C11; with newer compilers
                use fgets(w, sizeof w, stdin) and strip the '\n' */
    n = strlen(w) - 1;
    for (left = 0, right = n; left <= n/2; ++left, --right)
    {
        if (w[left] != w[right])
        {
            flag = FALSE;
            break;
        }
    }
    if (flag)
    {
        puts(w);
        puts("is a palindrome");
    }
    else
        printf("%s is NOT a palindrome", w);
    return 0;
}
Result of the program
Enter string to be checked for palindrome
palap
palap
is a palindrome
If strlen() is used in a program, we have to include <string.h> before
main(). The functions gets() and puts() are declared in <stdio.h>.
Now let us analyse the functioning of the above example.
Step 1
We are defining a symbolic constant FALSE as 0.
We initialize flag as 1. We define a string w as an array of 50 characters.
gets(w); reads the word typed and stores it from location &w[0].
Let us assume that we typed ‘nun’ and analyse what happens in the program.
strlen(w) will return the length of the word typed. In this case
strlen(w) = 3.
We subtract 1 from this to get the subscript of the rightmost character. The
subscript of the leftmost character is obviously 0.
We are initializing the for loop with the following:
left = 0 right = n = 2
flag = 1
We check whether left <= n/2, i.e., 0 <= 1.
Since it is so, we check whether w[0] != w[2].
The condition is false since w[0] = w[2] = ‘n’.
Therefore, the group of statements following if is skipped; flag remains 1.

Step 2
Now left is incremented to 1 and right is decremented to 1.
Since left <= n/2 still holds, w[1] != w[1] is checked. It is false, so flag
remains 1.
Therefore, control returns to the for statement.
Now, left = 2 right = 0
Since left is greater than n/2, control comes out of the for loop. Now the
statement if (flag) will be executed.
It will check whether flag is true. In this case, flag is still true.
Therefore, the computer prints that the word is a palindrome. If the word is
not a palindrome, then what happens?
Let us now assume that we typed ‘book’ and see what happens in the
program.
To start with, left = 0 right = 3
w[left] != w[right],
Therefore, the statements within the { } will be executed, flag will be set
to false and then the break statement will be executed.
The statement break causes immediate exit from the loop. Now flag is
false. Therefore, the else statement is executed to say that the word is NOT a
palindrome.

Check Your Progress


1. What is the significance of printf() statement?
2. What are the functions that are appropriate when strings are to be received
from the screen or sent to the screen?

2.3 ANSWERS TO CHECK YOUR PROGRESS QUESTIONS

1. The printf() statement can be programmed to give the output in the
desired manner.
2. The functions gets() and puts() are appropriate when strings are to
be received from the keyboard or sent to the screen.

2.4 SUMMARY

• The input and output functions of a computer facilitate communication with
the external world.
• The scanf() and printf() functions are the formatted input and output
functions respectively. The printf() statement can be programmed to give
the output in the desired manner.
• Characters can be scanned through the scanf() function and printed
through the printf() function.
• A string is an array of characters. The functions gets() and puts()
are appropriate when strings are to be received from the keyboard or sent to
the screen.
2.5 KEY WORDS

• printf(): It is a function used to print character, string, float,
integer, octal and hexadecimal values onto the output screen.
• scanf(): It is a function used to take input from the user or console.

2.6 SELF ASSESSMENT QUESTIONS AND EXERCISES

Short-Answer Questions
1. What are the different input and output functions?
2. What is the significance of the printf() and scanf() functions?
Long-Answer Questions
1. Write a program to demonstrate the use of the printf() and scanf()
functions.
2. Write a program to explain the use of the gets() and puts() functions.

2.7 FURTHER READINGS

Gottfried, Byron S. 1996. Programming With C, Schaum’s Outline Series. New
York: McGraw-Hill.
Jeyapoovan T. 2006. Computer Programming - Theory & Practice. New Delhi:
Vikas Publishing.
Khurana, Rohit. 2005. Object Oriented Programming. New Delhi: Vikas
Publishing.
Kanetkar, Yashwant. 2003. Let Us C. New Delhi: BPB Publication.
Saxena, Sanjay. 2003. A First Course In Computers. New Delhi: Vikas
Publishing.
Subburaj, R. 2000. Programming In C. New Delhi: Vikas Publishing.

UNIT 3 CONTROL STATEMENTS

Structure
3.0 Introduction
3.1 Objectives
3.2 Decision and Control Structures
3.2.1 Logical Operators and Branching
3.2.2 Loops and Control Constructs
3.2.3 switch Statement
3.2.4 break, continue, return and goto
3.3 Answers to Check Your Progress Questions
3.4 Summary
3.5 Key Words
3.6 Self Assessment Questions and Exercises
3.7 Further Readings

3.0 INTRODUCTION

In this unit, you will learn about the control statements used in C programs. These
are used for solving complex problems. Depending on the occurrence of a particular
situation, if and else keywords are used for branching to different segments
of the program. C is ideal for handling branching because the syntax is clear and
unambiguous. If the condition is true, then a single statement or group of statements
following if will be executed. If more than one statement is to be executed, then
the statements are grouped within braces. To perform the same operation a number
of times, loop or iteration is used. In this unit, you will also learn about while,
do...while, switch, break, continue and return statements.

3.1 OBJECTIVES
After going through this unit, you will be able to:
• Explain the basic concepts of branching
• Use if and if...else branching statements in your C programs
• Define switch, break, continue and goto statements
• Understand the importance of loop and control constructs in a C program
• Use if, for, while and do...while in a program

3.2 DECISION AND CONTROL STRUCTURES

Real-life application programs do not merely consist of simple multiplication or
addition. They call for solving complex problems. Depending on the occurrence
of a particular situation, we may follow different paths; the if and else keywords
are quite handy in branching to different segments of the program. ‘C’ is ideal for
handling branching because the syntax is clear and unambiguous. You will now
read about the branching constructs. Relational operators are used in conjunction
with branching constructs. The relational operators are < (less than), <= (less
than or equal to), > (greater than), >= (greater than or equal to), == (equal to)
and != (not equal to).
if Statement
The syntax of the if statement is as follows:
if (condition)
{statements}
If the condition is True, then a single statement or group of statements
following the if will be executed. If more than one statement is to be executed,
then the statements are grouped within braces. If it is a single statement, then curly
braces are not required.
If the condition turns out to be False, then the next statement after those
belonging to the if will be executed. Example 3.1 will make the concept clear.
Input two integers from the keyboard. If they are equal, then the program
will print ‘you typed equal numbers’; otherwise, it will print
nothing.
/* Example 3.1
   This program demonstrates the use of if */
#include <stdio.h>
int main(void)
{
    unsigned int a, b;
    printf("enter two integers\n");
    scanf("%u%u", &a, &b);
    if (a == b)
    {
        printf("you typed equal numbers\n");
    }
    return 0;
}
Each opening curly brace has to have a matching closing curly brace. In
Example 3.1 the first closing curly brace corresponds to the if statement and the
second one to the main() function. Execute the program by first keying in two
equal valued unsigned integers.
Result of the program
enter two integers
56 56
you typed equal numbers
After you are satisfied, you can try the program with unequal numbers.
You will not get any message.
if...else Statement
You did not get any message when the numbers were unequal. A message can be
printed in this case too by using the else statement.
The usage of if...else is shown below:
if (condition)
{
    statements s1
}
else
{
    statements s2
}
statements s3;
The statement else is always associated with an if.
If the condition is True, then statements s1 will be executed. After executing
them, the program will skip the else block and control goes to statement s3
that follows the else block.
If the condition is False, then the statements in the else block, i.e., s2
will be executed followed by statement s3.
Statements s1 will not be executed at all when the condition is
False. The usage of braces clearly brings out which statements belong to the if
block and which to the else block.
Example 3.2 brings out the usage of if...else.
/* Example 3.2
   This program demonstrates use of if...else */
#include <stdio.h>
int main(void)
{
    unsigned int a, b;
    printf("enter two integers\n");
    scanf("%u%u", &a, &b);
    if (a == b)
    {
        printf("you typed equal numbers\n");
    }
    else
    {
        printf("numbers not equal\n");
    }
    return 0;
}
The output of the program when unequal numbers were keyed in is as
follows.
Result of the program
enter two integers
17 13
numbers not equal
Nesting of the if...else Statements
You witnessed the usage of a single if statement in Example 3.1. You saw if
followed by else in Example 3.2. There is no restriction on the number of if
statements which can be used in a program. This applies to else as well, but
else can only follow an if statement.
You can have the following in a program:
{
    if (condition1)
    {
        if (condition2)
            {statements–s1}
        else
            if (condition3)
                {statements–s2}
    }
    else
        {statements–s4}
    statements–s5
}
This is called a nested if and else statement. As the level of nesting
increases, it will be difficult to analyse and logical mistakes will be made more
easily.
In the above example, when condition1 is False, statements–s4
will be executed. If condition1 is True and condition2 is also True,
then statements–s1 will be executed.
If condition1 is True and condition2 and condition3 are
False, statements–s5 will be executed directly.
To execute statements–s2
• condition1 has to be True,
• condition2 has to be False,
• condition3 has to be True.

Write a program to find the greatest of three unequal integers which are
keyed in and are called x, y and z.
Before writing a program, we must write the algorithm. We should not straight
away get down to programming.
Algorithm 1
Algorithm for finding the greatest of 3 integers.
Step 1: Print a message to enter 3 integers.
Step 2: Get three numbers and store them in x, y and z.
Step 3: Check if x > y
Step 4: If False, go to Step 9
Step 5: If True
Step 6: Check if x > z
Step 7: If True, write x is the greatest; End
Step 8: If False, write z is the greatest; End
Step 9: Check if y>z
Step 10: If True, write y is the greatest; End
Step 11: If not, write z is the greatest.
Step 12: End
Now code these steps into a ‘C’ program, which is shown in Example 3.3.
/* Example 3.3
   This program demonstrates the use of the nested if...else */
#include <stdio.h>
int main(void)
{
    int x, y, z;
    printf("enter three unequal integers\n");
    scanf("%d%d%d", &x, &y, &z);
    if (x > y)
    {
        if (x > z)
        {
            printf("x is greatest\n");
        }
        else
        {
            printf("z is greatest\n");
        }
    }
    else
    {
        if (y > z)
        {
            printf("y is greatest\n");
        }
        else
        {
            printf("z is greatest\n");
        }
    }
    return 0;
}
Test the correctness of the program by giving a different set of values for x,
y and z.
Result of the program
enter three unequal integers
908 231 907
x is greatest
Look at the example. It uses multiple nesting of if...else.
Take care to see that every opening brace has a corresponding closing
brace. It is better to indent the braces as shown in the example so that no mistake
is committed. Take care to see that each else matches with the corresponding if
and each opening brace ‘{’ matches with a corresponding closing brace ‘}’; if
either an opening ‘{’ or a closing ‘}’ is extra, then an error will result.
3.2.1 Logical Operators and Branching
In the above examples you have been checking one condition at a time. It would
be nice if you could check more than one condition at a time. ‘C’ provides three
logical operators for combining more than one condition. These are as follows:
Logical AND represented as &&
Logical OR represented as ||
Negation or NOT represented as ! (exclamation).
For example, if x > y and x > z, then x is the greatest.
You will represent the same as,
if ((x > y) && (x > z))
    printf("x is the greatest");
You will see that the program has become much more elegant.
The syntax for && is,
if ((condition1) && (condition2))
{
    statements–s1
}
statements–s1 will be executed only if both the conditions are True.
The syntax for ‘or’ is as follows:
if ((condition 1 ) ||(condition 2))
{
statements–s2
}
In this case, even if one of the conditions is True, the statements–s2
will be executed. At least one condition has to be True for the execution of s2.
However, if both are False, s2 will not be executed.
The NOT operator with symbol ! can be applied to any relational or logical
expression or to a single operand. It simply negates the truth value of the
expression following it. The syntax of ‘!’ is as follows:
if (!(condition)) statement s3;
s3 will be executed only when the condition is not True, i.e., the condition is
False.
Now rewrite Algorithm 1 by using the logical operators. The revised Algorithm
2 is shown here:
Algorithm 2
Step 1: If (x > y) and (x > z), x is the greatest.
Step 2: Else if (x<y) and (y>z), y is the greatest.
Step 3: Else print z is the greatest.
The complete program is given in Example 3.4.
/* Example 3.4
   This example demonstrates the use of logical operators */
#include <stdio.h>
int main(void)
{
    int x, y, z;
    printf("enter three unequal integers\n");
    scanf("%d%d%d", &x, &y, &z);
    if ((x > y) && (x > z))
        printf("x is greatest\n");
    else
    {
        if ((x < y) && (y > z))
            printf("y is greatest\n");
        else
            printf("z is greatest\n");
    }
    return 0;
}
Result of the program
enter three unequal integers
12 23 78
z is greatest
Now write a program to convert a lower case letter typed into an upper
case letter.
In the ASCII code, the value of each lower case letter is 32 more than that of
the corresponding upper case letter (for example, ‘a’ is 97 and ‘A’ is 65).
Therefore, if you subtract 32 from the ASCII value of a lower case letter, you
will get the ASCII value of the corresponding upper case letter.
Now write an algorithm for the conversion of lower case to an upper case letter.
It is given in Algorithm 3.
Algorithm 3
Step 1: Display a message asking for a character.
Step 2: Get a character.
Step 3: Check whether the character typed is >= ‘a’ and <= ‘z’.
(This is essential since you can only convert a lower case letter into
upper case.)
Step 4: If so, subtract 32 from the value of the character; if not, go to Step 6.
Step 5: Output the character with the revised ASCII value; END.
Step 6: Print ‘invalid entry’; END.
The algorithm is implemented in Example 3.5.
/* Example 3.5
   Conversion of lower case letter to upper case */
#include <stdio.h>
int main(void)
{
    char alpha;
    printf("enter lower case alphabet\n");
    alpha = getchar();
    if ((alpha >= 'a') && (alpha <= 'z'))
    {
        alpha = (alpha - 32);
        putchar(alpha);
    }
    else
        printf("invalid entry; retry");
    return 0;
}
Now you can test the program by giving both the valid and invalid inputs;
valid inputs are the lower case letters and invalid inputs are all other characters.
Result of the program
The result for the invalid input is as follows:
enter lower case alphabet
8
invalid entry; retry
The result when tried with a valid input is given below:
enter lower case alphabet
n
N
The programs should be executed, i.e., tested with both the valid and invalid
inputs.
Conditional Operator and if...else
The syntax for the conditional operator is as shown here:
(condition) ? expression1 : expression2;
What does it mean? If the condition is True, expression1 is evaluated;
else, expression2 is evaluated. Conditional operators can be nested, but
the nested form is hard to read, and the if...else statement is more readable
than the conditional (?) operator. However, the conditional operator is quite
handy in simple situations as follows:
(a > b) ? printf("a is greater\n")
        : printf("b is greater\n");
Thus, the operator has to be used in simple situations. Both expressions must
always be written; unlike the else block of an if statement, the part after
the colon cannot be left out.
/* Example 3.6
   This example demonstrates use of the ? operator */
#include <stdio.h>
int main(void)
{
    unsigned int a, b;
    printf("enter two integers\n");
    scanf("%u%u", &a, &b);
    (a == b) ? printf("you typed equal numbers\n") :
               printf("numbers not equal\n");
    return 0;
}
Result of the program
enter two integers
123 456
numbers not equal
3.2.2 Loops and Control Constructs
Quite often, you have to perform the same operation a number of times. You may
also have to repeat the same operation with one or more of the values changed,
which is known as loop or iteration. It is definitely possible to write a program for
such situations with the concepts you have learned so far. However, there are
special constructs in any programming language for carrying out repeated
calculations.
Iteration using if
Before you look at loop constructs, let us consider an example to see the need for
repetitive calculations. Assume that you want to find the sum of the first 10 natural
numbers 1 to 10. This can be achieved through successive addition, i.e., first you
initialize the sum to 0 and then add 1 to the sum. Next you add 2 to the sum, then
3, and so on till you add 10 to the sum. Thus, by repeated addition 10 times, you
have found the sum of first 10 natural numbers.
The algorithm below summarizes what you have done:
Step 1: sum = 0
Step 2: I = 1
Step 3: If I <= 10, perform the following operations and repeat Step 3:
sum = sum + I; I = I + 1;
Step 4: Print the sum
Now analyse the algorithm:
First iteration:
After Steps 1 and 2, Step 3 is approached with sum = 0 and I = 1
Since, I <= 10
sum = sum + I,
i.e., sum = 0 + 1 = 1
I = I + 1, i.e., I = 2
Second iteration:
Now the program goes back to Step 3
with I = 2 and sum = 1
Since, I <= 10
sum = sum + I
sum was 1 and I is 2
sum = 1 + 2 = 3
Next I will be incremented to 3
Third iteration:
Step 3 is approached with I = 3 and sum = 3
Since, I <= 10
sum = sum + I = (1 + 2) + 3
I = 4
Ninth iteration:
Step 3 is approached with I = 9
Since, I <= 10
sum = (1 + 2 + 3 +............+ 8) + 9
I is incremented to 10
Tenth iteration:
Step 3 is approached with I = 10
Since, I <= 10
sum = sum + I
= (1 + 2 + 3 +........+ 8 + 9) + 10
Now I is incremented to 11
Since I <= 10 is not True, the program does not execute the statements
following the if and jumps to Step 4.
In Step 4 the sum is printed.
This algorithm is implemented in Example 3.7.
/*Example 3.7
Demonstrates use of if for iteration*/
#include <stdio.h>
int main(void)
{
    int sum=0, i=1; /*declaration and initialization combined*/
step3: /*label- loop starts here*/
    if (i <= 10)
    {
        sum = sum + i;
        i = i + 1;
        goto step3;
    }
    printf("sum of first 10 natural numbers = %d\n", sum);
    return 0;
}
Result of the program
sum of first 10 natural numbers = 55
The program implements the algorithm using the if and goto keywords.
According to the algorithm, the program has to go back to Step 3. In this
program, step3 is a label, which is followed by a colon. The rules for coining
a label name are the same as for an identifier. The label can be placed
anywhere in the same function in which the goto statement is found. Usage of
goto is considered to be bad programming practice since it leads to errors
when changes are made in the program and also affects readability. It is always
possible to write a program without using goto. This program can be rewritten
without goto by using a for statement.
for Statement
The for statement is meant for the easy implementation of iterations unlike if.
The syntax of for is given below:
for (exp1; exp2; exp3)
{statements;}
Note the keyword, the parentheses and the semicolons. There is no semicolon
after exp3. exp1, exp2 and exp3 are expressions. Their usage in the for
loop is given below:
exp1: Contains the initial value of an index or a variable. It is evaluated
once, before the loop starts.
exp2: Condition that must be satisfied if the body of statements is to be
executed. It is checked before every iteration.
exp3: Contains the alteration to the index after each iteration of the body
of the for statement.
The body of the statement is either a single statement or a group of
statements enclosed within braces. If a single statement has to be executed,
then braces are not required.
An example of a for loop is given below:
for (i = 0; i < 5; i++)
printf("%d", i);
The loop will start with an initial value of i = 0. Since, i < 5, the body
of the for loop will be executed and it will print 0. Now the exp3 will be
executed and i will be incremented to 1. Since, i is less than 5, body of the loop
will again be executed to print 1. This will continue till 4 is printed. i will now be
incremented to 5 and since, i is not less than 5, the for loop will be terminated.
This is how for loop is used to carry out repetitive operations.
Now write a program for finding the sum of the first 10 natural numbers
using the for statement.
The program is given in Example 3.8.
/*Example 3.8
demonstrates the use of the for statement to find the sum
of the first 10 natural numbers*/
#include <stdio.h>
int main(void)
{
    int sum=0, i; /*sum is initialized here; i is initialized in the for statement*/
    for (i=1; i<=10; i++) /*loop starts here*/
    {
        sum = sum + i;
    }
    printf("sum of first 10 natural numbers = %d\n", sum);
    return 0;
}
Result of the program
sum of first 10 natural numbers = 55
Note the difference between Example 3.7 and Example 3.8.
You have eliminated the label step3 and the goto statement.
The initialization of i = 1 is carried out as part of the for statement.
The incrementing of i is also carried out as part of the for statement. The
program has, therefore, been simplified.
How does the program work?
Step 1: i = 1
i is checked with i <= 10
Since i is less than or equal to 10, the body of the for loop is executed.
sum = sum + i = 0 + 1 = 1
Step 2: i is incremented to 2
2 is <= 10
Therefore, the body of the for loop is executed.
sum = sum + i = (1) + 2
Steps 3 to 8 proceed in the same way, with i incremented by 1 each time.
Step 9:
i is incremented to 9
9 is <= 10
Therefore, the body of the for loop is executed.
sum = sum + i = (1 + 2 +........+ 8) + 9
Step 10:
i is incremented to 10
10 is <= 10
sum = sum + i = (1 + 2 +..........+ 9) + 10
Step 11:
i is incremented to 11.
11 is not <= 10.
Therefore, the for loop is now terminated.
The printf() function is executed next.
Now summarize the operation of the for loop.
When a program encounters a for loop, it first executes exp1 once and then
checks the condition through the expression in the middle. If the condition is
satisfied, it executes the group of statements. After executing the statements
in the body of the loop, the program transfers the execution back to the for
statement and the third expression is executed, which is usually an increment
or a decrement. Then the condition is checked again. If the condition is not
satisfied, the group of statements will not be executed and the program will
skip to the statement following the body of the for statement.
If the initial value were typed as 11 instead of 1 in the program, the
condition would be False at the very first check and the group of statements
would not be executed at all.
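The order of evaluation summarized above can be checked with a small sketch. The function name count_body_runs and its parameters are hypothetical, chosen only for illustration (functions are covered in Unit 4). It counts how many times the body of a for loop runs for a given initial value and limit.

```c
/* Hypothetical helper: counts how many times the loop body is executed
   when the index starts at 'initial' and the condition is i <= limit. */
int count_body_runs(int initial, int limit)
{
    int i, runs = 0;
    for (i = initial; i <= limit; i++)   /* exp1; exp2; exp3 */
    {
        runs++;                          /* body of the loop */
    }
    return runs;
}
```

Under these assumptions, count_body_runs(1, 10) gives 10, whereas count_body_runs(11, 10) gives 0 because the condition already fails at the first check.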
Three Components of for Statement
The three components of a for statement are as follows:
exp1 and exp3 are assignments or function calls. Function calls will be
discussed at a later stage; exp2 is a relational expression. The three expressions
may not always be present. However, even if an expression is not present, the
associated semicolon should be present.
For example,
for (; exp2 ;)
{s1}
Here, the initial value is not specified and the incrementing does not take
place after every iteration. Presumably, the initial value is assigned elsewhere and
incrementing or a similar operation takes place as part of the group of statements
following the for. However, since exp2 is present, the loop will terminate.
However, if all three expressions are omitted as follows:
for ( ; ; )
the loop will never terminate because the condition is absent. If exp2 is not
present, it is assumed that the condition is always True. Such a statement
should not be used.
Instead of incrementing by 1, you can use i += 2 as exp3, in which case i
will be incremented by 2 every time.
Now try to print the list of even numbers below 10. The program is as shown
below:
/*Example 3.9
variation in for statement - to print even numbers*/
#include <stdio.h>
int main(void)
{
    int i=2;
    for (; i<10; i+=2) /*loop starts here*/
    {
        printf("%i is an even number\n",i);
    }
    return 0;
}
Here you initialize i = 2 before the for loop itself. However, the
corresponding semicolon is present at the right place.
Result of the program
2 is an even number
4 is an even number
6 is an even number
8 is an even number
Symbolic Constants and Looping
So far you have been giving the initial value, the increment and the final
value as part of the programs. Assuming you want to change one or more of them
later on, how are you to go about it? You would then have to rewrite the
program. C provides a method by which this can be done with the least changes
by using the #define statement. The format of this statement is as follows:
#define name constant
For example, you can define,
#define INITIAL 1
which defines INITIAL as 1. Definitions of this type are called symbolic
constants. They are not variables and hence, they are not defined as part of
the declarations of variables. They are specified at the top of the program
before the main() function. By convention, symbolic constants are written in
capital or upper case letters. Wherever the symbolic constant names appear in
a program, the preprocessor will replace them with the corresponding
replacement constants defined in the #define statement before the program is
compiled. In this case, 1 will be substituted wherever INITIAL appears in the
program. Note that there is no semicolon at the end of the #define statement.
You can now write a program to print out the numbers in a given range, say
100 to 125, which are divisible by 3, i.e., when divided by 3 the
remainder is 0. Such numbers are ‘evenly divisible by 3’.
/*Example 3.10
Demonstrates the use of symbolic constants-
program to find numbers between 100 and 125
evenly divisible by 3*/
#include <stdio.h>
#define LOW 100
#define UPPER 125
#define STEP 1
int main(void)
{
    int num;
    for (num=LOW; num<UPPER; num+=STEP) /*loop starts here*/
    {
        if (num%3 == 0)
            printf("%i is evenly divisible by 3\n", num);
    }
    return 0;
}
Result of the program
102 is evenly divisible by 3
105 is evenly divisible by 3
108 is evenly divisible by 3
111 is evenly divisible by 3
114 is evenly divisible by 3
117 is evenly divisible by 3
120 is evenly divisible by 3
123 is evenly divisible by 3
You can use this technique in future programs. Assume that you want to find
out all the numbers evenly divisible by 3 between 1 and 1000.
You have to define LOW as 1 and UPPER as 1000. Assuming that later
on you want to find out the numbers evenly divisible by 7, you would have to
again rewrite the program. This can be avoided by defining the DIVISOR as 7
and substituting it within the condition (num % DIVISOR == 0). In this way,
you can write a program to find out the numbers evenly divisible by any number
in any range.
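A minimal sketch of this idea is given here. The symbolic constant DIVISOR and the helper name count_divisible are assumptions made for illustration; the range is taken as inclusive of both LOW and UPPER.

```c
#define LOW 1
#define UPPER 1000
#define DIVISOR 7

/* Hypothetical helper: counts the numbers from LOW to UPPER (inclusive)
   that are evenly divisible by DIVISOR. */
int count_divisible(void)
{
    int num, count = 0;
    for (num = LOW; num <= UPPER; num++)
    {
        if (num % DIVISOR == 0)
            count++;
    }
    return count;
}
```

With these values the function returns 142. Changing only the three #define lines adapts the same code to any other range or divisor.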
Other Forms of the for Loop
The for loops can be nested as follows:
for (i = 1; i <= 10; i++)
{
for (j = 1; j <= 5; j++)
{
for (k = 1; k <= 2; k++)
{
s1
}
}
}
The statement s1 will be executed as follows:
First time i = 1, j = 1, k = 1
Second time i = 1, j = 1, k = 2
Third time i = 1, j = 2, k = 1
i = 1, j = 2, k = 2
i = 1, j = 3, k = 1
i = 1, j = 3, k = 2
Lastly i = 10, j = 5, k = 2
s1 will be executed 2*5*10 = 100 times.
Any level of nesting is acceptable. However, the higher the level of nesting,
the easier it is to commit mistakes and the more difficult the program is to
understand.
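The count of 100 executions can be verified with a short sketch in which a counter takes the place of the statement s1. The function name count_nested_runs is hypothetical.

```c
/* Hypothetical helper: counts how often the innermost statement runs. */
int count_nested_runs(void)
{
    int i, j, k, runs = 0;
    for (i = 1; i <= 10; i++)
    {
        for (j = 1; j <= 5; j++)
        {
            for (k = 1; k <= 2; k++)
            {
                runs++;          /* takes the place of s1 */
            }
        }
    }
    return runs;
}
```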
Now, look at some more for loops.
for (x = -5; --x >= -10; )
{
}
Here, the decrement and the condition are combined in exp2. Since the
decrement is a prefix, the decrement of x is carried out first. The
condition is then checked in order to decide whether to continue or not. Then
the body of the loop is executed. Therefore, the first iteration will be
carried out with x = -6 and the last with x = -10.
Another variation of the statement is as follows:
for (y = 100; y++ <= 200; )
{
s2
}
Here too exp2 and exp3 are combined. This is a postfix notation. The
following sequence is carried out: condition check, increment and execution of
the loop body. Therefore, the statement s2 in the body of the for loop will be
executed the first time with y = 101 and finally with y = 201 as well.
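The first and last values seen inside the body of these two loops can be checked with the following sketch. The helper names prefix_loop and postfix_loop are hypothetical, and the pointer parameters (pointers are covered later in this block) are used only to hand back the two observed values.

```c
/* Hypothetical helper: runs for (x = -5; --x >= -10; ) and stores the
   first and last values of x seen inside the body.
   Returns the number of iterations. */
int prefix_loop(int *first, int *last)
{
    int x, runs = 0;
    for (x = -5; --x >= -10; )
    {
        if (runs == 0)
            *first = x;
        *last = x;
        runs++;
    }
    return runs;
}

/* Hypothetical helper: the same for for (y = 100; y++ <= 200; ). */
int postfix_loop(int *first, int *last)
{
    int y, runs = 0;
    for (y = 100; y++ <= 200; )
    {
        if (runs == 0)
            *first = y;
        *last = y;
        runs++;
    }
    return runs;
}
```

The prefix loop runs 5 times (x = -6 to -10), while the postfix loop runs 101 times (y = 101 to 201), since the comparison uses the value of y before it is incremented.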
The for loop is a popular iteration construct not only in ‘C’ but also in
other languages. Here, the initial value, the step and the final value are clear and
unambiguous and simple to write. There are other loop statements also. In the
next section, you will study the while loop.
The while Loop
The while loop is a simpler form of the for loop. The syntax for the while loop is
given below:
while (expression)
{statements;}
This means that the statement(s), which is a single statement or multiple
statements, will be executed while the expression is True. When it becomes False,
the execution will stop.
The while is similar to the for loop without exp1 and exp3. The
for loop can be simulated or replaced with the while loop as given below.
exp1;
while (exp2)
{
statements
exp3;
}
The programmer can use while or for at their discretion. The for loop
is preferred when the initialization and incrementation are simple.
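The equivalence between the two constructs can be sketched as follows. Both hypothetical functions add the first n natural numbers; the names sum_with_for and sum_with_while are assumptions made for illustration.

```c
/* Sum of 1..n using a for loop. */
int sum_with_for(int n)
{
    int i, sum = 0;
    for (i = 1; i <= n; i++)     /* exp1; exp2; exp3 */
    {
        sum = sum + i;
    }
    return sum;
}

/* The same loop rewritten with while. */
int sum_with_while(int n)
{
    int i, sum = 0;
    i = 1;                       /* exp1 */
    while (i <= n)               /* exp2 */
    {
        sum = sum + i;
        i++;                     /* exp3 */
    }
    return sum;
}
```

For n = 10 both functions give 55, the result of Examples 3.7 and 3.8.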
Let us look at an example.
Let us write a program for the generation of any multiplication table.
The program is given below:
/*Example 3.11
Use of while - You can generate multiplication
tables of your choice using this program.
caution: Don't exceed maximum limits of integer */
#include <stdio.h>
int main(void)
{
    int a,b,product;
    a=1;
    b=0;
    product=0;
    printf("Enter which table you want ");
    scanf("%d",&b);
    while (a <=10)
    {
        product = a*b;
        printf("%2d X %d= %3d\n",a,b,product);
        a++;
    }
    return 0;
}
When the program asks you to enter a table and you type 12, you will get
the 12th table as given below.
Result of the program
Enter which table you want 12
 1 X 12=  12
 2 X 12=  24
 3 X 12=  36
 4 X 12=  48
 5 X 12=  60
 6 X 12=  72
 7 X 12=  84
 8 X 12=  96
 9 X 12= 108
10 X 12= 120
Note here that the condition is (a < =10); incrementing is done within the
loop in a++. Variable a is initialized as 1 before entering the while loop.
If you want to print the table up to 16 × 12 = 192 then simply change
the condition to while (a < =16).

do...while Loop
This is a modification of the while statement. In the while statement, before
the group of statements following the while is executed, the condition
associated with the while is checked. If the condition is True or fulfilled,
then the associated statements are executed. If not, the program skips the
statements associated with the while loop. This is depicted in Figure 3.1.

Fig. 3.1 while Loop

After execution of the statements, the program will check again whether the
condition is True and then continue to execute or skip the statements depending
on the condition. The statements may not be executed even once if the condition
was False at the entry point.
However, the do...while works differently as shown in Figure 3.2.

Fig. 3.2 do...while Loop

Here the statements following do will be executed once before the
condition is checked. If it is True, then the statements will be executed
again. If not, the program will skip the statements and proceed further. Thus,
whatever be the condition, the statements following do will be executed once
before checking the condition. This is the essential difference between
do...while and while. The while loop tests the condition at the top of the
loop, but do...while tests it at the bottom after executing the statements.
The while loop executes the statements after checking the condition, but
do...while executes the statements before testing the condition. The syntax of
do...while is as follows:
do
{
statements
}
while (expression);
The statement do...while is not used as frequently as the while
loop.
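The difference between testing at the top and testing at the bottom can be sketched with two hypothetical functions, while_runs and do_while_runs (names assumed for illustration). Each counts how many times its loop body is executed for a starting value n.

```c
/* Body runs only while n > 0; the test is made before each pass. */
int while_runs(int n)
{
    int runs = 0;
    while (n > 0)
    {
        runs++;
        n--;
    }
    return runs;
}

/* Body runs once before the first test is made. */
int do_while_runs(int n)
{
    int runs = 0;
    do
    {
        runs++;
        n--;
    }
    while (n > 0);
    return runs;
}
```

When the condition is False at entry (n = 0), while_runs gives 0 but do_while_runs gives 1. For n = 3 both give 3.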
/*Example 3.12
Conversion of upper case to a lower case alphabet.
Note: conio.h, getche() and putch() are compiler-specific
(for example, Turbo C) and are not part of standard C*/
#include <stdio.h>
#include <conio.h>
int main(void)
{
    int alpha=0;
    do
    {
        printf("\nenter upper case alphabet- enter 1 to quit\n");
        alpha=getche();
        if (alpha>='A' && alpha<='Z')
        {
            alpha=(alpha+32);
            putch(alpha);
        }
        else
        {
            if (alpha=='1')
                printf("End of Session");
            else
                printf("\ninvalid entry; retry");
        }
    } while (alpha!='1');
    return 0;
}

Result of the program
enter upper case alphabet- enter 1 to quit
Gg
enter upper case alphabet- enter 1 to quit
o
invalid entry; retry
enter upper case alphabet- enter 1 to quit
Dd
enter upper case alphabet- enter 1 to quit
1End of Session
How does it differ? Here, the program will always attempt to process one
character before it can be terminated. Even if the first character typed is 1,
the program will still examine it and print the message “End of
Session” before it quits.
Suppose the first character is a valid one and a number of characters are
converted in succession; when you want to terminate the program, 1 has to be
pressed and even then the program will not stop immediately. It will stop only
after the statements in the body are executed. Since the problem is the same,
a detailed look at both the examples will bring out the similarity in operation
between the two constructs. However, there are occasions when do...while is
quite suitable as given in the next section.

Check Your Progress


1. Give the syntax of if statement.
2. Explain the logical operators provided by ‘C’.
3. Why is goto not used in a program? What can be used instead of goto?
4. Write the syntax for for statement.
5. Define while loop.

3.2.3 switch Statement


switch statements allow clear and easy implementation of multiway decision-
making. Assuming that a number is received from the keyboard and depending
on the value, we want to carry out some operations, the switch statement can
be used effectively in this situation. In simpler situations if...else could be
used, and in complex situations, switch can be used. For example, if you get
numbers starting from 1 to 4 and print their values in words, you can use the
if...else statement.
Assuming that you want to get a one digit number up to 9 and print its value
in words, it would become a more complex task. The switch statement comes
in handy in such situations. The syntax of the switch statement is as follows:
switch (expression)
{
case constant expression : statements
case constant expression : statements
..
default : statements
}
When the switch keyword is encountered, the associated expression is
evaluated. The program now looks for the case which matches the value
of the expression. Execution then starts from the statement corresponding to
the case which matches. Each case has to be accompanied by an integer constant
expression, which must be unique, as otherwise the program would not know
where to start. For example, if the first two cases are as follows:
case 10 : s1;
case 10 : s2;
the program would not know whether to execute s1 or s2 when the expression
of switch evaluates to 10; the compiler therefore reports an error for
duplicate case values. The constant expressions following the case keyword
should all be unique. There may be occasions when none of the constant
expressions matches the switch expression, in which case the default
statements will be executed. Thus, switch allows branching of the program
execution to an appropriate place.
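The one digit example mentioned above could be sketched as follows. The function name digit_in_words is hypothetical, and returning a string is a choice made here for illustration. Each case ends with break (discussed in Section 3.2.4) so that execution does not run on into the next case.

```c
/* Hypothetical helper: returns the name of a one digit number. */
const char *digit_in_words(int digit)
{
    const char *word;
    switch (digit)
    {
    case 0: word = "zero";  break;
    case 1: word = "one";   break;
    case 2: word = "two";   break;
    case 3: word = "three"; break;
    case 4: word = "four";  break;
    case 5: word = "five";  break;
    case 6: word = "six";   break;
    case 7: word = "seven"; break;
    case 8: word = "eight"; break;
    case 9: word = "nine";  break;
    default: word = "not a single digit"; break;  /* no case matched */
    }
    return word;
}
```

A call such as printf("%s\n", digit_in_words(3)); would print three, and any value outside 0 to 9 falls through to the default branch.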
3.2.4 break, continue, return and goto
The while and for statements test the condition at the top of the loop, while
do...while checks it at the bottom for quitting the loop. The break statement
helps immediate exit from any part of the loop, as demonstrated with the switch
statement. It can be used inside any loop construct or inside a switch, but
not elsewhere in the program. When the break statement is executed, control
goes to the statement following the innermost enclosing loop or switch, i.e.,
just past the closing brace of its block. Recall that a block is a group of
statements enclosed between an opening brace and the corresponding closing
brace.
The continue statement is related to break. When continue is
executed, it causes the next iteration of the corresponding for, do...while
or while loop to begin. In a while or do...while loop, continue takes the
program to the test of the condition; in the for loop, it will cause the next
increment operation, followed by checking whether the condition is true or
false in order to decide the next course of action. This is similar to
skipping the current execution and continuing with the next operation after
incrementing. The statement continue skips the rest of the statements in the
loop for that iteration, whereas break terminates the loop.
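A minimal sketch contrasting the two statements is given here. The function name sum_odd_up_to and the fixed upper bound of 100 are assumptions made for illustration.

```c
/* Hypothetical helper: adds the odd numbers from 1 up to 'limit',
   never looking beyond 100. */
int sum_odd_up_to(int limit)
{
    int i, sum = 0;
    for (i = 1; i <= 100; i++)
    {
        if (i > limit)
            break;       /* leave the loop altogether */
        if (i % 2 == 0)
            continue;    /* skip the rest of this pass; go on to i++ */
        sum = sum + i;
    }
    return sum;
}
```

For a limit of 10 the even values are skipped by continue and the loop is ended by break when i reaches 11, giving 1 + 3 + 5 + 7 + 9 = 25.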
Write a program to check whether a given number is positive or negative. If
it is zero, the program should terminate after printing the value. If it is a
positive integer above zero and <=20000, the value will be printed; if
negative, the program will go on to fetch the next number. If the number is
>20000, the program terminates. The program is given below:
/*Example 3.13
Program to demonstrate continue*/
#include <stdio.h>
int main(void)
{
    int a;
    do
    {
        printf("enter a number-enter 0 to end session\n");
        scanf("%d", &a);
        if (a > 20000)
        {
            printf("you entered a high value-going out of range\n");
            break;
        }
        else
            if (a >= 0)
                printf("you entered %d\n", a);
        if (a < 0)
        {
            printf("you entered a negative number\n");
            continue;
        }
    }
    while (a != 0);
    printf("End of session\n");
    return 0;
}
Result of the program
enter a number-enter 0 to end session
33
you entered 33
enter a number-enter 0 to end session
-60
you entered a negative number
enter a number-enter 0 to end session
45
you entered 45
enter a number-enter 0 to end session
25000
you entered a high value-going out of range
End of session
If the number typed is > 20000, or if it is equal to zero, the program comes
out of the loop and prints “End of session”. If the number is negative, a
< 0 and hence, continue will be executed. In a do...while loop, continue
transfers control to the while condition at the bottom of the loop; since a is
not 0, the loop is repeated from the top and the next integer is received. The
program, therefore, terminates when a = 0 as well as when a > 20000, but there
is a difference. If the number entered is zero, the program checks whether
a > 20000. Since the condition fails, it checks whether a >= 0 and since it is
True, 0 will be printed and then the while condition is checked. The program
terminates after the while condition is checked.
However, if the number entered is > 20000, the loop terminates instantly
without transacting any business except printing messages as shown.
The return statement can appear anywhere in a function, and when it is
encountered, control (and usually a value) is returned to the calling function.
The return statement may or may not return a value, as shown:
return ; /*returns no value*/
return (0) ; /*returns the value 0*/
The return statement need not be at the end of the function. Whenever return
is executed, the program returns to the function that called the current
function, at the place from where the function was called. Thus, return is
also used to exit immediately from a function or from a loop in a function.
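As a minimal sketch of return leaving a loop and a function at once, the hypothetical function is_prime below returns as soon as a divisor is found. The name and the logic are illustrative and are not taken from the text.

```c
/* Hypothetical helper: returns 1 if n is a prime number, 0 otherwise. */
int is_prime(int n)
{
    int d;
    if (n < 2)
        return 0;            /* exit before the loop is even reached */
    for (d = 2; d * d <= n; d++)
    {
        if (n % d == 0)
            return 0;        /* exit from inside the loop */
    }
    return 1;                /* normal exit at the end of the function */
}
```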
exit Function
There is a library function exit(), declared in <stdlib.h>, which causes the
termination of the current program. Note that exit() terminates the execution
of the program itself, and not just the block. The statement break enables
coming out of the loop or switch in which it is executed, but exit()
terminates the program at whatever stage the program may be. The exit() is a
powerful function.
goto Statement
The goto statement in C goes against structured programming principles. This
statement leads to ‘spaghetti’ code, which is difficult to understand. The
syntax is written as follows:
goto Syscrash;
//Other statements
Syscrash:
//Control will resume here following goto
The label is always terminated by a colon. The goto statement is a jump
statement, which jumps from one point to another point within a function. The
destination of the jump is marked by a label statement. The label statement
can be used anywhere in the function, above or below the goto statement. The
following C code displays the list of numbers from 0 to 9. For this, the label
statement loop is defined above the goto statement. The given program declares
a variable n initialized to 0. The statement following the label prints the
value of n, and n++ then increments the value of n by 1. As long as n is less
than 10, the goto statement jumps back to the label statement, so that the
next value of n is printed. The code is written in the C language as follows:
/*Example 3.14
Program to demonstrate goto statement.
Note: conio.h, clrscr() and getch() are compiler-specific
(for example, Turbo C) and are not part of standard C*/
#include <stdio.h>
#include <conio.h>
int main()
{
    int n = 0;
    clrscr(); //Clear the screen
loop:
    printf("\n%d", n);
    n++;
    if (n<10)
    {
        goto loop;
    }
    getch();
    return 0;
}
The result of above code is as follows:
0
1
2
3
4
5
6
7
8
9
A goto statement causes your program to unconditionally transfer control to
the statement associated with the label specified on the goto statement.
Because the goto statement can interfere with the normal sequence of
processing, it makes a program more difficult to read and maintain. Often, a
break statement, a continue statement, or a function call can eliminate the
need for a goto statement. If an active block is exited using a goto
statement, any local variables of that block are destroyed when control is
transferred from that block. Avoid using a goto statement to jump over the
initialization of a variable, since the variable would then be left without a
defined value.

Check Your Progress


6. What is a break statement?
7. How are continue and break statements related?
8. Define the exit() function.

3.3 ANSWERS TO CHECK YOUR PROGRESS QUESTIONS

1. The syntax of the if statement is given below.
if (condition)
{statements}
If the condition is True, then a single statement or group of statements
following the if will be executed. If more than one statement is to be
executed, then the statements are grouped within braces. If it is a single
statement, then curly braces are not required.
2. ‘C’ provides three logical operators for combining more than one condition.
These are as follows:
Logical and represented as &&
Logical or represented as ||
Negation or not represented as !(exclamation).
3. Usage of goto is considered to be bad programming practice since it
leads to errors when changes are made in the program and also affects
readability. It is always possible to write a program without using goto.
The program can be rewritten without goto by using a for statement.
4. The for statement is meant for the easy implementation of iterations unlike
if. The syntax of for is given below:
for (exp1; exp2; exp3)
{statements;}
5. The while loop is a simpler form of the for loop. The syntax for the while
loop is given below:
while (expression)
{statements;}
6. The break statement takes the program to the end of the switch
statement. The end is just the closing brace corresponding to switch
after the default printf() statement.
7. The continue statement is related to break. When continue is
executed, it causes the next iteration of the corresponding for,
do...while or while loop to begin. In a while or do...while loop, continue
takes the program to the test of the condition; in the for loop, it will
cause the next increment operation, followed by checking whether the
condition is true or false in order to decide the next course of action.
This is similar to skipping the current execution and continuing with the
next operation after incrementing. The statement continue skips the rest
of the statements in the loop for that iteration, whereas break terminates
the loop.
8. There is a library function exit(), which causes the termination of the
current program. Note that exit() terminates the execution of the
program itself, and not the block. The statement break enables coming
out of the block or loop in which it is executed but exit() terminates the
program at whatever stage the program may be. The exit() is a powerful
function.

3.4 SUMMARY

• The syntax of if statement is, if (condition) {statements}.
If the condition is True, then a single statement or group of statements
following if will be executed. If more than one statement is to be executed,
then the statements are grouped within braces. If it is a single statement then
curly braces are not required.
• Each opening curly brace has to have a matching closing curly brace.
• The syntax of if .. else is if (condition true)
{statements s1} else {statements s2}. The statement
else is always associated with an if. If the condition is True, then statements
s1 will be executed else statements s2 will be executed.
• ‘C’ provides three logical operators for combining more than one condition.
These are logical AND represented as &&, logical OR represented as || and
Negation or NOT represented as ! (Exclamation).
• The syntax for the conditional operator is, (Condition)?
statement1: statement2;. If the condition is True, execute
statement1; else execute statement2.
• The for statement is meant for the easy implementation of iterations unlike
if. The syntax for for loop is, for (exp1; exp2; exp3){statements;}.
• The while loop is a simpler form of the for loop. The syntax for the while
loop is, while (expression){statements;}.
• Linear search is the process of finding out whether a given number or string
is present in an array of data. The entire array is compared from the beginning
till the end. If the item to be searched matches with an item in the array of
data, then the search is stopped; otherwise, the search is continued till the
end of the array.
• switch statements are used for implementation of multiway decision-
making. Every switch statement contains a condition in the form of an
expression.
• The break statement is used if we do not want the program to execute
irrelevant statements.
• The return statement can appear anywhere in a function and when it is
encountered a value is returned to the calling function.
• A library function exit() causes the termination of the current program.
It terminates the execution of the program itself and not the block.
• A comma to declare more than one data type.

3.5 KEY WORDS

• Loop or iteration: It is used to perform the same operation a number of
times by repeating the same operation with one or more of the values
changed.
• Switch: When the switch keyword is encountered, the associated expression
is evaluated.

3.6 SELF ASSESSMENT QUESTIONS AND EXERCISES

Short-Answer Questions
1. Define the term loop.
2. What is a while loop?
3. What is a switch statement?
Long-Answer Questions
1. What is the significance of branching in a C program?
2. What are the three components of a for loop?
3. Discuss the difference between while and do...while loop.

3.7 FURTHER READINGS

Gottfried, Byron S. 1996. Programming With C, Schaum’s Outline Series. New
York: McGraw-Hill.
Jeyapoovan T. 2006. Computer Programming - Theory & Practice. New Delhi:
Vikas Publishing.
Khurana, Rohit. 2005. Object Oriented Programming. New Delhi: Vikas
Publishing.
Kanetkar, Yashwant. 2003. Let Us C. New Delhi: BPB Publication.
Saxena, Sanjay. 2003. A First Course In Computers. New Delhi: Vikas
Publishing.
Subburaj, R. 2000. Programming In C. New Delhi: Vikas Publishing.

BLOCK - II
FUNCTIONS, ARRAYS AND POINTERS

UNIT 4 FUNCTION
Structure
4.0 Introduction
4.1 Objectives
4.2 Function
4.2.1 Defining and Accessing a Function
4.2.2 Function Arguments
4.2.3 Arrays and Functions
4.2.4 Recursive Function
4.3 Storage Classes
4.3.1 Automatic Variables
4.3.2 Register Variables
4.3.3 External (Global) Variables
4.3.4 Static Variables
4.3.5 External (Global) Static Variable
4.3.6 Multi-file Program
4.4 Macros
4.5 Preprocessor Directives
4.6 Answers to Check Your Progress Questions
4.7 Summary
4.8 Key Words
4.9 Self Assessment Questions and Exercises
4.10 Further Readings

4.0 INTRODUCTION

In this unit, you will learn about user-defined functions, and about function
declaration, function call and how to define a function. A function, when
declared in a program, is a function prototype which may be declared at the
beginning of a main program. A function, on execution, is supposed to do
something: either return a value as integer, character or float; or perform
some operation. A function consists of two parts, declarator and declaration.
A function declaration has the format in which the type of data returned, name
of the function and arguments on which the function is to operate, are
mentioned. Arguments declared as part of the function prototype are called
formal parameters, which are enclosed in a pair of parentheses. A function may
not contain any parameter, in which case an empty pair of parentheses should
follow the name of the function. A function may not return a value, in which
case void is written as the return data type. A function may be called directly
or indirectly by another function. For this, there should be one-to-one
correspondence between the formal arguments declared and the actual arguments
sent, and they should be of the same data type. A function declarator is a
replica of a function declaration; the difference lies in the way they are
written inside a program body. A declaration in a calling function will end
with a semicolon and a declarator in a called function will not end with a
semicolon. You will also learn about the different storage classes used to
define the scope of variables. A variable may be local to a particular function
where it is defined or global for all the functions in that program body if
defined before all the functions in that program. There are four types of
storage classes: auto, static, extern and register.

4.1 OBJECTIVES
After going through this unit, you will be able to:
• Understand user-defined functions
• Define, call and declare a function
• Explain formal and actual parameters
• Use recursion in writing more compact programs
• Use different storage classes for variables in writing a program
• Define macros
• Understand preprocessor directives

4.2 FUNCTION

Using a function in a program involves three elements:


(a) Function prototype
(b) Function call
(c) Function definition
Function Prototype
A function prototype is also called a function declaration. A function may be declared
at the beginning of the main function. A function declaration has the following form:
return-data-type function-name (formal argument 1,
formal argument 2, ............. );
A function after execution may return a value to the function which called it.
It may not return a value at all but may perform some operations instead. It may
return an integer, character, or float. If it returns a float, we may declare the function
as
float f1(float arg1, int arg2);
If it does not return any value we may write the above as
void fun2(float arg1, int arg2); /*void means nothing*/
If it returns a character, we may write
char fun3(float arg1, int arg2);
If no arguments are passed into a function, an empty pair of parentheses
must follow the function name. For example,
char fun4( );
The arguments declared as part of the prototype are also known as formal
parameters. The formal arguments indicate the type of data to be transferred from
the calling function.
Function Call – Passing Arguments to a Function
We may call a function either directly or indirectly. When we call the function, we
pass the actual arguments or values. Calling a function is also known as function
reference.
There must be a one-to-one correspondence between formal arguments
declared and the actual arguments sent. They should be of the same data type and
in the same order. For example,
sum=f1 (20.5, 10); fun4( );
4.2.1 Defining and Accessing a Function
A function definition can be written anywhere in the file. It begins with a proper
declarator, followed by the declarations of local variables and the statements. A
function definition thus consists of two parts, namely the function declarator (or
heading) and the function body. The function heading is similar to the function
declaration but does not terminate with a semicolon.
The use of functions will be demonstrated with simple programs in this unit.
Suppose you wish to get two integers. Pass them to a function add. Add
them in the add function. Return the value to the main function and print it. The
algorithm for solving the problem will be as follows:
Main Function
Step 1: Declare function add
Step 2: Get 2 integers
Step 3: Call add and pass the 2 values
Step 4: Get the sum
Step 5: Print the value
Function add
Step 1: Get the values
Step 2: Add them
Step 3: Return the sum to main

Thus, you have divided the problem. The program is as follows:
/* Example 4.1 */
/* use of function */
#include <stdio.h>
int main()
{
    int a=0, b=0, sum=0;
    int add(int a, int b); /*function declaration*/
    printf("enter 2 integers\n");
    scanf("%d%d", &a, &b);
    sum = add(a, b); /*function call*/
    printf("sum of %d and %d =%d", a, b, sum);
    return 0;
}
/*function definition*/
int add (int c, int d) /*function declarator*/
{
    int e;
    e = c+d;
    return e;
}
Result of the program
enter 2 integers
6667 4445
sum of 6667 and 4445 =11112
The explanation as to how the program works is given below:
On the fifth statement (seventh line), the declaration of the function add is
given. Note that the function will return an integer. Hence, the return type is defined
as int. The formal arguments are defined as int a and int b. The function
name is add. Just as you cannot use a variable without declaring it, you cannot use a
function without telling the compiler about it. Note also that the function declaration
ends with a semicolon, similar to the declaration of any other variable. The function
declaration should appear before the first call, typically at the beginning of the calling
function. It tells the compiler that the calling function is going to call the function add
later in the program. If the calling function is not going to pass any arguments, then
empty parentheses are to be written after the function name. The parentheses must be
present in the function declaration. This happens when the function is called to perform
an operation without passing arguments. In this case, if a and b are part of the called
function (add) itself, then we need not pass any parameters. In such a case, the
function declaration will be as follows, assuming that the called function returns an
integer:
int add ( ) ;

In Example 4.1, you get the values of a and b. After that you call the function
add and assign the value returned by the function to an already defined int
variable sum as follows:
sum = add ( a , b );
Note that add(a, b) is the function call or function reference. Here, the
return type is not to be given. The types of the arguments are also not to be given. It is
a simple statement without all the elements of the function declaration. However,
the function name and the names of the arguments passed, if any, should be present
in the function call. When the program encounters a function reference or function
call, it looks for and calls the function and transfers the arguments.
The function definition consists of two parts, i.e., the function declarator
and the function body.
The function declarator is a replica of the function declaration. The only
difference is that while the declaration in the calling function will end with a semicolon,
the declarator in the called function will not end with a semicolon. As in main(),
the entire function's body will be enclosed within braces. The whole function can
be assumed to be one program statement. This means that all the statements within
the body will be executed one after another before the program execution returns
to the place in the main function from where it was called.
The important points to be noted are:
(a) The declarator must agree totally with the declaration in the calling
function, i.e., the return data type, the function name and the argument
types should all appear in the same order. The declarator will not end
with a semicolon.
(b) You can give the arguments in the function declarator the same names
as in the calling function (in the declaration statement or function call)
or different names. Here, we have given the names
c and d. What is important, however, is that the types of the arguments
should appear as they do in the declaration in the calling program. They
must also appear in the same order.
(c) At the time of execution, when the function encounters the closing
brace }, it returns control to the calling program and returns to the
same place at which the function was called.
In this program, you have a specific statement return e; before the
closing brace. Therefore, the program will go back to the main function with the
value of e. This value will be substituted as
sum = (returned value)

Therefore, sum gets the value which is printed in the next statement. This is
how the function works.
Assume now that the program gets a and b values, gets their sum1, gets c
and d and gets their sum2 and then both the sums are passed to the function to
get their total. The program for doing this is as follows:
/* Example 4.2 */
/* A function called many times */
#include <stdio.h>
int main()
{
    float a, b, c, d, sum1, sum2, sum3;
    float add(float a, float b); /*function declaration*/
    printf("enter 2 float numbers\n");
    scanf("%f%f", &a, &b);
    sum1 = add(a, b); /*function call*/
    printf("enter 2 more float numbers\n");
    scanf("%f%f", &c, &d);
    sum2 = add(c, d); /*function call*/
    sum3 = add(sum1, sum2); /*function call*/
    printf("sum of %f and %f =%f\n", a, b, sum1);
    printf("sum of %f and %f =%f\n", c, d, sum2);
    printf("sum of %f and %f =%f\n", sum1, sum2, sum3);
    return 0;
}
/*function definition*/
float add (float c, float d) /*function declarator*/
{
    float e;
    e = c+d;
    return e;
}
Result of the program
enter 2 float numbers
1.5 3.7
enter 2 more float numbers
5.6 8.9
sum of 1.500000 and 3.700000 =5.200000
sum of 5.600000 and 8.900000 =14.500000
sum of 5.200000 and 14.500000 =19.70000
You have defined sum1, sum2 and sum3 as float variables.

You are calling the function add three times with the following assignment
statements:
sum1 = add( a, b );
sum2 = add( c, d );
sum3 = add( sum1, sum2 );
Thus, the program goes back and forth between main() and add as
given below:
main()
add (a, b)
main()
add (c, d)
main()
add (sum1, sum2)
main()
Had you not used the function add, you would have to write statements
pertaining to add 3 times in the main program. Such a program would be large
and difficult to read. In this method, you have to code for add only once and
hence, the program size is small. This is one of the reasons for the usage of functions.
In Example 4.2, you could add another function call by add (10.005,
3.1125); This statement will also work perfectly. After the function is executed,
the sum will be returned to the main() function. Therefore, both variables and
constants can be passed to a function by making use of the same function declaration.
4.2.2 Function Arguments
You know now that an argument is a parameter or value. It could be of any of the
valid types, such as all forms of integers or a float or char. You come across two
types of arguments when you deal with functions:
formal arguments
actual arguments
Formal arguments are defined in the function declaration in the calling
function. What is an actual argument? Data which is passed from the calling function
to the called function is called the actual argument. The actual arguments are
passed to the called function through a function call.
Each actual argument supplied by the calling function should correspond to
the formal arguments in the same order. The ANSI standard permits the data
type of each argument to be written within the function declaration, followed by the
argument name. Only this type of declaration is used here, as it will also help students
follow C++ programs easily. It helps in understanding the one-to-one correspondence
between the actual arguments supplied and those received in the function, and
enables the compiler to verify that the one-to-one correspondence exists and that the
right number of parameters have been passed. It may be noted that the formal
arguments in a declaration cannot be used for any other purpose. They only give a
prototype for the function. Thus, the names of the formal arguments in a declaration
are dummy names and will not be recognized elsewhere, even in the function in
which the declaration appears.
Although the types of variables in the function declaration, also known as the
prototype, and in the function call are to be the same, the names need not be the same.
You have already used this concept in Example 4.2. After defining float a and
float b in the function's prototype, you called add (a, b), add (c,
d) and then add (sum1, sum2). Thus, the formal arguments defined in the
prototype and the actual arguments were not the same in two of the above cases.
When the actual arguments are passed to a function, the function notes the
order in which they are received and appropriately stores them in different locations.
You must note that even if you use a and b in the add function, they will be stored
in different locations. They will have no relationship with a and b of the main
function. Therefore, even if a and b are assigned different values in the called
function, the corresponding values in the calling function would not have changed.
You will verify this point in the program in the next section.
Scope: Rules for Functions
The scope of a variable is local to the function, unless it is a global variable. For
instance,
int function1(int i)
{
    int j=100;
    double function2 (int j);
    function2 (j);
    return j;
}
double function2 (int p)
{
    double m=0.0;
    return m;
}
The variable j in function1 is not known to function2. You pass it
to function2 through the argument j, and its value is copied to int p.
Similarly, m in function2 is not known to function1. It can be made known
to function1 through the return statement. This makes the scope rules of
variables in functions quite clear. The scope of variables is local to the function
where they are defined. However, global variables are accessible by all the functions
in the program if they are defined above all functions.
/* Example 4.3 */
/* To demonstrate that the scope of a
variable is local to the function */
#include <stdio.h>
int main()
{
    float a=100.250, b=200.50;
    void change (float a, float b);
    change(a, b);
    printf("a= %f b= %f\n", a, b);
    printf("these are the original values");
    return 0;
}
/*function definition*/
void change (float a, float b) /*function declarator*/
{
    a += 1000;
    b -= 200.5;
}
Result of the program
a= 100.250000 b= 200.500000
these are the original values

We passed a = 100.25 and b = 200.5 to the function. In the function,
you modified a to 1100.25 and b to zero. However, when you print a and b
in the main function, you get the same old values. This confirms that variables are
local to the function unless otherwise specified.
Notice, however, that in the calling function, the type declaration of formal
parameters is symbolic and used only to indicate the format. You will notice, for
example in Example 4.1, that the int a has been declared and assigned a value
of 0. This has no relationship with int a in the function declaration. You could
even omit the variable names and declare it as int add(int, int). It will still
work. Here a and b have been given for better readability.
This is the reverse in the case of a called function. In the same program,
int c and int d are explicitly defined in function add, in the declarator. The
variables are used further in the function add. This is not the case with the variables
in the declaration statement or prototype of the calling function, which will never
be used further.
This method of invoking a function is called call by value, i.e., you call the
functions with values as arguments.
Return Values
The return data type is given in the function declaration in the main() function
or the calling function, and again in the declarator, which is the first line of the function
definition. If no value is to be returned, the return data type void is specified.
void simply means that nothing is returned. It is therefore distinct from the other data
types, such as integer, float or char.
The return value, as you have seen, is the result of computation in the called
function. You return a value which is stored in a variable in the called function.
Returning the value means that the value thus stored in the called function is assigned
or copied to a variable in the main() or calling function. Therefore, to receive
the result, a variable of the matching data type should have been declared and
preferably initialized in the calling function.
The return statement can be any of the following types:
return (sum);
return V1;
return "true";
return 'Z';
return 0;
return 4.0 + 3.0;
In some examples, you have returned variables whose values are known
when they are returned and in other examples, you return constants. You can even
return expressions. A function whose return data type is void need not contain a
return statement.
You can also have multiple return statements in a function. However, for
every call, only one of the return statements will be active and only one value will
be returned.
4.2.3 Arrays and Functions
There is no restriction in passing any number of values to a function; the restriction
is only in the return of values from a function. Therefore, arrays can be passed to
a function without any difficulty, one element at a time, as follows:
#include <stdio.h>
int main()
{
int a[]={1,2,3,4,5};
int j;
int func(int a);
for (j=0; j<=4; j++)
func (a[j]);
..........
}
int func(int c)
{
......
}
Here, func has been declared as a function that is passed a single integer. Note
here that the declaration or the prototype gives only the format of the parameters
passed. The names are only indicative and do not hold actual values. They are the
formal names. Therefore, the parameters declared inside the parentheses act only
as a checklist. They cannot be used elsewhere in the main function without actually
declaring them at the top of the function. But for this rule, there would have been a
conflict between a[], which is an array, and a, which is a simple variable. Here, no
conflict arises because the a of the prototype is not recognized in the main function. It
is only a checklist to see that whenever the function calls func, an integer is passed. If
we try to pass a float, the compiler will convert it to an integer and may issue a warning.
This is not so in the case of variables defined in the function declarator above the
function's body, as they are recognized as actual names. In this case, int c is declared
as a variable in func. Its initial value will be the value passed by the calling function.
Thus, since a simple integer is used in the function declaration, only one integer can be
passed to the function func at a time. Actually, the entire array can be passed to a
function irrespective of its size, by suitable declaration, as the following example indicates.
/* Example 4.4 */
/* To find the greatest number in an array */
#include <stdio.h>
int main()
{
    int array[]= {8, 45, 5, 911, 2};
    int size=5, max;
    int fung(int array[], int size);
    max = fung(array, size);
    printf("%d\n", max);
    return 0;
}
int fung(int a1[], int size)
{
    int j, maxp=a1[0]; /* start with the first element so that negative values also work */
    for (j=1; j<size; j++)
    {
        if (a1[j] > maxp)
        {
            maxp = a1[j];
        }
    }
    return maxp;
}
Result of the program
911
The objective of Example 4.4 is to find the greatest number in an array. In
the program, an array called array is initialized with 5 values as given below:
int array[]= {8, 45, 5, 911, 2};
size is declared as 5 and a function called fung has been declared. It will pass an
array and an integer to the called function. The array size has been kept open and the
called function will return an integer. The next statement calls fung and passes the
array and an integer 5 equal to size. When an array is passed, the function receives
the address of its first element rather than a copy, so it works on the elements of the
original array. The integer size is passed by value. The maximum value in the array is
found in the for loop and stored in maxp. The value maxp is returned to the main
function and printed there.
Call by Value
In this section, you have been calling functions by passing values. For example,
function calls in some of the above programs are as follows:
change(a, b);
rev = reverse(num);
The values passed to the function change are a and b, which are known.
Similarly, while calling the function reverse, we pass num. This is called call by
value. When you call functions by value, the called function can return only one
value.
Call by Reference
You can enable a function to return more than one value. One way of accomplishing
it is by call by reference.
Formal and Actual Parameters
Parameters are written in the function prototype and function header of the definition.
They are local variables which are assigned values from the arguments when the
function is called. When a function is called, the values (expressions) that are
passed in the call are called the arguments or actual parameters. At the time of
the function call each actual parameter is assigned to the corresponding formal
parameter in the function definition. For value parameters (default) the value of the
actual parameter is assigned to the formal parameter variable. For reference
parameters, the memory address of the actual parameter is assigned to the formal
parameter. By default, argument values are simply copied to the formal parameter
variables at the time of the call. This type of parameter passing is called pass-by-
value. It is the only kind of parameter passing used in C language. Thus, parameters
define information that is passed to a function and are of the following types:
• Actual parameters: parameters that appear in function calls.
• Formal parameters: parameters that appear in function declarations.
A parameter cannot be both a formal and an actual parameter, but both formal
parameters and actual parameters can be either value parameters or variable
parameters. The following example shows how the actual parameters work with
calc_consumer_bill function:
/* Example 4.5 */
#include <stdio.h>
#include <stdlib.h>
int main (void);
int calc_consumer_bill (int, int, int);
int main (void)
{
    int bill;
    int a = 25;
    int b = 32;
    int c = 27;

    bill = calc_consumer_bill (a, b, c);

    printf("The total bill comes to %d\n", bill);
    exit (0);
}
int calc_consumer_bill (int consumer1, int consumer2,
                        int consumer3)
{
    int total;
    total = consumer1 + consumer2 + consumer3;
    return total;
}
In the function main() in this example a, b, and c are actual parameters in the
function call calc_consumer_bill. On the other hand, the corresponding
variables in calc_consumer_bill, namely consumer1,
consumer2 and consumer3, are formal parameters because they appear in
a function definition.

Formal parameters are always variables; this does not mean that they are always
variable parameters. You can use numbers, expressions or function calls as actual
parameters. Here are some examples of valid actual parameters in the function
call calc_consumer_bill:
bill = calc_consumer_bill (25, 32, 27);
bill = calc_consumer_bill (50+60, 25*2, 100-75);
bill = calc_consumer_bill (a, b, (int) sqrt(25));
The last line in this example code requires the math.h header file because sqrt is the
square root function. It returns a double value, so the value is cast into an int before
being passed to calc_consumer_bill.
4.2.4 Recursive Function

Basic Concepts
The previous section dealt with the concept of a function calling another function,
as well as multiple functions being called by a number of functions. A function
calling itself is called recursion, and the function may call itself either directly or
indirectly. This concept is difficult to understand unless explained through examples.
Every program that uses recursion can also be written without it.
Some problems, however, are naturally suited to recursion. For instance, the factorial
problem can be solved using recursion as shown in the program below:
/* Example 4.6
To find the factorial of a given number*/
#include <stdio.h>
int main()
{
    int n;
    long int result;
    long int fact(int n);
    printf("Enter the number whose ");
    printf("factorial is to be found\n");
    scanf("%d", &n);
    result = fact(n);
    printf("result=%ld", result);
    return 0;
}
long int fact(int n)
{
    if (n<0) return 0; /* factorial is not defined for negative numbers */
    else
    if (n<=1) return 1; /* 0! and 1! are both 1 */
    else
    return (n*fact(n-1));
}
Result of the program
Enter the number whose factorial is to be found
10
result=3628800
Now, let us analyse how the program proceeds. You get an integer n from
the keyboard. In order to find factorial n, you call fact(n), where fact
is the function for finding the factorial of number n. The recursion takes place in
function fact. Assume that n=1. The main function calls fact(1), whose value will
be assigned to result in the main function after return from the function. In the
function, since n is equal to 1, 1 is returned and printed in main().
Next, assume you want to find out the factorial of, say, 2 and fact(2) is
called. In the function fact, since n is not equal to 1, n * fact(n-1) is
returned, i.e., 2 * fact(1) is returned to result. The inner call fact(1) returns 1,
so result = 2 * 1 = 2.

Check Your Progress
1. What are formal parameters?
2. Where should a function definition be written in a program?
3. What does a function definition consist of?
4. What is ‘recursion’?

4.3 STORAGE CLASSES

A variable has two specifiers, namely data type and storage class.
• data type (e.g. int, float, char, etc.)
• storage class (e.g. auto, static, etc.)
Data type specifies the type of data stored in a variable. Storage class
specifies the segments of the program where the variable is recognized and how
long the storage of the value of the variable will last.
There are four types of storage class specifications as follows:
• automatic
• register
• static
• extern
You have so far been defining only the data type of the variables but not the
storage class. You may wonder, then, how your programs worked. You, as the
programmer, have to specify the storage class only when the variable is required to
behave in a particular manner. If the storage class is not specified, the compiler will
assume one on its own. The storage class is applicable to all types of variables and is
prefixed to the data type declaration as given below:
auto char z ;
extern int a, b, c ;
static float x ;
register char y ;
The basic characteristics of each storage class are discussed in the following
sections:
4.3.1 Automatic Variables
Any variable declared in a function is assumed to be an automatic variable by
default.
(i) Storage location: Except for register variables, the other three types
will be stored in memory.
(ii) Scope: Auto variables are declared within a function and are local to the
function. This means the value will not be available in other functions.
Any variable declared within a function is interpreted as an auto variable,
unless a different type of storage class is specified.
Auto variables defined in different functions will be independent of each
other, even if they have the same name.
Auto variables are local to the block in a function. If an auto variable is
defined on the top of the function after the opening brace, then it is available for
the entire function. If it is defined later in a block after another opening brace, it
will be valid only till the end of the block, i.e., up to the corresponding closing
brace.
The following program illustrates the concept.
/* Example 4.7 */
/* To demonstrate the use of auto variable */
#include <stdio.h>
int main()
{
    auto int x=10;
    void f1(int x);
    int f2(int x);
    {
        auto int x=20;
        printf("x = %d in the first block\n", x);
        x = f2(x); /* 20 is passed to f2 and the returned value is assigned to x */
        printf("x = %d after the return from f2\n", x);
    }
    printf("x = %d after the first block\n", x);
    {
        auto int x=30;
        printf("x = %d in the second block\n", x);
    }
    printf("x = %d after the second block\n", x);
    f1(x); /* x=10 is passed to the function f1 */
    printf("x = %d after return from function will be 10\n", x);
    return 0;
}
void f1(int a)
{
    auto float x=5.555f; /* a separate x; the integer x of main is not visible here */
    printf("x = %f in the function\n", x);
}
int f2(int x)
{
    auto int y=100;
    y += x; /* y will be 120 */
    printf("y = %d in the function\n", y);
    return y;
}
Execute the program and you will get the following results:
Result of the program
x = 20 in the first block
y = 120 in the function
x = 120 after the return from f2
x = 10 after the first block
x = 30 in the second block
x = 10 after the second block
x = 5.555000 in the function
x = 10 after return from function will be 10
This gives a clear idea about the scope of the auto variables.
(iii) Initial values: The auto variable will contain some garbage values
unless initialized. Therefore, they must be initialized before use.
(iv) Life: How long will the value stored in the auto variable last?
It will last as long as the function or block in which it is defined is active. Once
the entire function has been executed and the value has been returned, the values of
the auto variables of the function are lost. You cannot use them later. This point
should be noted.
4.3.2 Register Variables
Register variables have characteristics similar to auto variables. The only
difference between them is that while auto variables are stored in memory,
register variables are stored in the register of the CPU. The initial value will
be an unpredictable or garbage value; the variables are local to the block and they
will be available as long as the blocks are active.
Why then do you need to declare one more storage class? The CPU registers
respond much faster than the memory. After all, you want to access, store and
retrieve the stored variables faster so that the computing time is reduced.
Registers are faster than memory. Therefore, those variables which are used
frequently can be declared as register variables.
They are declared as,
register int i ;
A memory's basic unit may be 1 byte, but depending on the size of the
variable even 10 contiguous bytes of memory can be used to store a long double
variable. Such an extension of size is not, however, possible in the case of registers.
The registers are of fixed length, such as 2 bytes or 4 bytes, and therefore only small
variables such as integer or char type variables are normally kept in registers. Since
registers have many other tasks to do, register variables should be defined sparingly.
If a register variable is declared and it is not possible to accept it as a register
variable for whatever reason, the compiler will treat it as an auto variable.
Therefore, the programmer may specify a frequently used variable in a program
as a register variable in order to speed up the execution of the program.
4.3.3 External (Global) Variables
External variables are also known as global variables. What is global? The scope
of external variables extends to all the functions of a program. You have so far
created all the functions in a single file. You can create functions in more than one
file. However, for the sake of simplicity, assume that all the functions are in one
file. Global variables will be declared like other variables with a storage class
specifier extern. For example,
extern int a, b;
extern float c, d;
The scope of the variables starts from the point of declaration to the end of
the program. The value of the external variable at any point of time is that of the
last assignment. Assume that the main function assigns a = 10. The function
z may then use it, perform a calculation and at the end assign it the value 20. If
printed at that point of time, the value will be 20. The variable may then be used by
another function p where its value may become zero. If, at this point of time, the main
function or z uses it, the value will be 0. Thus, an external variable is accessible
and transparent to all the functions below it. We will write a program to demonstrate
this concept.
/* Example 4.8 */
/* To demonstrate use of external variable */
#include <stdio.h>
int ext_a=10; /* external (global) variable; the keyword extern is optional here */
int main()
{
    int f1(int a);
    void f2(int a);
    void f3();
    printf("ext_a = %d in the main function\n", ext_a);
    f1(ext_a);
    printf("ext_a = %d after the return from f1\n", ext_a);
    f2(ext_a);
    printf("ext_a = %d after the return from f2\n", ext_a);
    ext_a *= ext_a;
    printf("ext_a = %d \n", ext_a);
    f3();
    printf("ext_a = %d after the return from f3\n", ext_a);
    return 0;
}
int f1(int x)
{
    ext_a -= 10;
    return ext_a;
}
void f2(int x)
{
    ext_a += 20;
}
void f3()
{
    ext_a /= 100;
}
Result of the program
ext_a = 10 in the main function
ext_a = 0 after the return from f1
ext_a = 20 after the return from f2
ext_a = 400
ext_a = 4 after the return from f3
How does the program work?
ext_a is declared as an external variable with value 10 before main().
Therefore, ext_a will be recognized all through the program.
f1, f2 and f3 are functions.
f1 returns an integer and f2 returns void, i.e., it does not return a value.
f3 neither receives nor returns any value.
In the first printf(), we get ext_a = 10.
Now f1 is called, and ext_a = 10 is passed to it as an argument.
Inside f1, 10 is subtracted from ext_a, so its value is now 0. The second printf in
main prints ext_a = 0.
Now f2 is called.
ext_a becomes 20 now. f2 does not return any value. However, the third
printf prints the value as 20. How does this happen? It is because the current
value of ext_a is known to main even without f2 passing it.
We discussed that a function can return only one value. A global variable offers
a way around this limitation, because any function can modify it directly, as the
rest of the program shows.
Back in main, you square ext_a, i.e., ext_a = 400 now.
The 4th printf() statement confirms this. Now, you call f3. In spite of
the fact that you neither passed an argument nor returned any value from f3,
ext_a is known to f3 as 400. Then, ext_a is divided by 100. Therefore, ext_a
will be 4 as confirmed by the fifth printf() statement.
This program illustrates the concept of external variables in a simple situation
where no function has a local variable with the same name as the global variable.
It is perfectly legal to reuse the name of a global variable for a local variable
in a function. We can even declare the local variable with another data type.
For example, you can define ext_a as a float inside the function f1.
Then how is the conflict to be resolved? We will return to this question shortly.
You should be careful while handling external variables because the variables
may be disturbed in a remote corner inadvertently. Global variables when declared
on top of main() can be identified easily and therefore, the storage class specifier
extern need not be specified in such situations. If it cannot be easily recognized
by declaration elsewhere in the program, it should be specified clearly.
The initial value of an external variable is zero, if not assigned. The life of the
variable is till the termination of program execution. The scope extends from the
point of declaration till the end. It will be stored in memory.
4.3.4 Static Variables
The initial value of static variables is zero unless otherwise specified. This is
also stored in memory. Static variables are declared as follows:
static int x, y, z ;
static char a ; etc.
Static variables declared inside a function are local to that function but exist till
the termination of the program. Therefore, when the program leaves the function, the
value of the static variable is not lost. If the program calls the function again, the
function continues with the value the static variable already possesses. Assume that
f1 is a function containing a static variable as given below:
int f1(void);
int main()
{
    f1();
    f1();
    return 0;
}
int f1(void)
{
    static int var = 0;   /* initialized only once, before the first call */
    /* statements that use and modify var */
    return var;
}
When f1 is called the first time, var has its initial value of zero. If var is
finally assigned the value 10 at the end of f1, then var = 10 will remain till the
program stops execution, and if main calls f1 again, var will not be
initialized to 0 again but will still be 10. The initializer var = 0 takes effect
only once. However, var can be modified further depending on the statements in
f1. Had it been an auto variable, var would have been initialized to zero on each
call, because the value of an auto variable is lost as soon as the program leaves
the block. This is one of the differences between static and auto variables.
Static variables, however, are not known outside the function in which they are
declared, i.e., they cannot be accessed from other functions, such as main() or
any other functions called by main().
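The difference between static and auto variables described above can be seen in a
short sketch. The function names count_with_static, count_with_auto and show_counts
are hypothetical and chosen only for this illustration.

```c
#include <stdio.h>

/* Hypothetical helper: counts its own calls using a static local. */
int count_with_static(void)
{
    static int calls = 0;   /* initialized once; value survives between calls */
    calls = calls + 1;
    return calls;
}

/* Hypothetical helper: the same logic with an auto variable. */
int count_with_auto(void)
{
    int calls = 0;          /* re-initialized on every call */
    calls = calls + 1;
    return calls;
}

/* Hypothetical demo routine that calls each helper three times. */
void show_counts(void)
{
    int i;
    for (i = 0; i < 3; i++)
        printf("static: %d  auto: %d\n", count_with_static(), count_with_auto());
}
```

If show_counts() is called from main(), the static counter goes 1, 2, 3 while the
auto counter reports 1 every time.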
Now, consider the conflict arising out of external variables and local variables
(auto/register, static) having the same names. In such cases, the local
variables take precedence over the external variables. This means that in a function
the local variable of the same name would only be recognized. The function will be
blind to the external variable of the same name. However, the global variables of
other names will be recognized as explained already. All other conditions remain
the same. The value of a local variable does not affect the global variable and vice
versa. The initial value of the local variable will be dependent upon whether it is
static or auto. The program below explains the concept of the working of the
different storage classes.
/* Example 4.9 */
/* To demonstrate scope of variables */
#include <stdio.h>
char chara, charb, charc;   /* global variables */
int main()
{
    int disp();
    char prn(char m);
    chara = 'x';
    charb = 'y';
    charc = 'z';
    prn(charc);
    putchar(chara);
    disp();
    putchar(charb);
    disp();
    return 0;
}
int disp()
{
    static int charb;       /* hides the global charb; starts at 0 */
    charb = charb + 1;
    printf("%d\n", charb);
    return charb;
}
char prn(char charn)
{
    auto char chara;        /* hides the global chara */
    putchar(charc);
    chara = charn;
    putchar(chara);
    return chara;
}
Result of the program
zzx1
y2
Let us understand how the program functions.
You declare chara, charb, and charc as global variables and in
main(), you assign chara = 'x', charb = 'y', charc = 'z'.
You have declared disp as a function receiving no argument but returning an
integer.
You have declared prn as a function receiving and returning a character.
You call function prn. You have a chara of type auto in prn. The
statement putchar(charc) displays the value of the global variable charc
assigned in the main function, which is z.
The next putchar(chara) in prn also displays z, because the parameter charn
received the value of charc and the statement chara = charn copied it into the
local chara. z is returned to main.
Now putchar(chara) in main prints x and not z since, in the main function,
chara refers to the global variable.
Now you call disp(). Although you have not initialized the static variable
charb explicitly, its initial value is zero and therefore, disp prints 1. Now the
program returns to main().
The next putchar(charb) prints y because the global variable is
active in the main function. Now, you call disp again. Since the old value of the
static charb is not lost, this time the program prints 2. Thus, the value of a
static variable is maintained between function calls. Local variables get
precedence over global variables.
4.3.5 External (Global) Static Variable
A static variable can be placed outside all functions. Then, it is called an external
static variable. An ordinary external variable is accessible by all functions in
any file. But an external static variable is accessible only by functions in the same
file where the variable was declared. An external static variable can be defined
outside all functions as follows:
#include <stdio.h>
static int ext_a = 10;   /* visible only within this file */
int main()
{
Initialization
External and static variables are automatically initialized to zero. On the other
hand, auto and register variables are not initialized automatically; they hold
garbage values until assigned and should, therefore, be initialized with constants
or with expressions. For example,
auto int a = 10;
auto char ch = 'z';
When it is an expression, the contents of the expression should have been
defined previously. For example,
auto int a = 5;
auto int b = a + a * 5 ;
auto int c = a * b ;
Static and external variables can only be initialized with a constant expression. For
example,
char z = 'A';               /* external variable, defined outside all functions */
static double d = 343.25;
Note also that static and external or global variables are initialized only
once, i.e., before program execution. However, in the case of auto variables, the
initialization is carried out every time the function or block is entered. Arrays can
also be initialized with statements like
int Z [ ] = { 1, 2, 3, 4, 5, 6 } ;
Similarly, a string can be initialized as follows:
char string1[] = " peter ";
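The initialization rules above can be collected in one small sketch. All the names
used here (total, primes, name, next_id, triple, show_initial_values) are
hypothetical and serve only as an illustration.

```c
#include <stdio.h>
#include <string.h>

int total;                          /* external: starts at 0 without an initializer */
int primes[] = { 2, 3, 5, 7, 11 };  /* size deduced by the compiler: 5 elements */
char name[] = "peter";              /* 5 characters plus the terminating '\0' */

/* Hypothetical helper: a static local also starts at 0 and keeps its value. */
int next_id(void)
{
    static int id;                  /* implicitly initialized to 0, once */
    return ++id;
}

/* Hypothetical helper: an auto variable is initialized on every call. */
int triple(int a)
{
    int b = a + a * 2;              /* expression built from the already defined a */
    return b;
}

/* Hypothetical demo routine. */
void show_initial_values(void)
{
    printf("total=%d count=%d len=%d\n", total,
           (int)(sizeof primes / sizeof primes[0]), (int)strlen(name));
}
```

Note that the array sizes are never written out; the compiler counts the
initializers, and the string receives one extra element for the terminating null
character.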
4.3.6 Multi-file Program
Programs so far seen were contained in one file. Large programs may reside in
more than one file. You may need a global variable to be accessible in all functions
in all files. If such a need arises, you have to write the defining declaration in one
file and referencing declaration in all other files. The defining declaration of the
variable should not use the extern keyword. It can be declared on top of the file
without using the extern keyword as shown below:
/* File 1: defining declaration */
#include <stdio.h>
int ext_a = 10;
int main()
{
The other files should declare the variable by prefixing the extern
keyword (in lowercase, since C is case-sensitive) as shown below.
/* File 2: referencing declaration */
#include <stdio.h>
extern int ext_a;
int sort()
{
/* File 3: referencing declaration */
#include <stdio.h>
extern int ext_a;
int arrange()
{
It should be noted that the initial value can be assigned in the defining
declaration and nowhere else as shown in the above program segments.
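The difference between a referencing declaration and a defining declaration can be
sketched as follows. In a real multi-file program the two declarations would sit in
separate files; here both are placed in one file, as an assumption, so that the
sketch compiles on its own. The function body is hypothetical.

```c
#include <stdio.h>

extern int ext_a;        /* referencing declaration: no storage, no initial value */

/* Hypothetical function standing in for code in another file. */
int arrange(void)
{
    ext_a = ext_a + 5;   /* uses the one shared variable */
    return ext_a;
}

int ext_a = 10;          /* defining declaration: storage and initial value */
```

The function arrange() can use ext_a although the definition appears later,
because the extern declaration has already told the compiler the name and type
of the variable.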
Scope and Lifetime of Variables
In everyday programming, it is not necessary to explicitly remove variables from
memory. All local variables of a function die on exit from that function
anyhow, and global variables usually do not need special treatment. Scope refers
to the part or parts of the program over which a variable is visible, while
lifetime refers to how long the variable exists. Both depend on the storage class
of the variable in C language. A variable belongs to one of the four storage
classes, i.e., automatic, external, static or register. Variables can also be
categorized as local or global. Local variables are declared within a function or
block and are accessible only inside it, while global variables are declared
outside all functions and are accessible to all the functions that follow the
declaration. The following are the types of scope in C:
Block Scope: Block refers to any set of statements enclosed in curly braces
({ and }). A variable declared within a block has block scope. Thus, the variable is
active and accessible from its declaration point to the end of the block. Block
scope is also called local scope. For example, the variable i declared within the
block of the following main function has block scope:
int main()
{
int i; /* block scope */
.
.
.
return 0;
}
A variable with block scope is called a local variable.
Function Scope: Indicates that a name is active and visible from the beginning
to the end of a function. In C, only a goto label has function scope.
For example, the label start in the following code has function scope:
int main()
{
int i; /* block scope */
.
.
.
start: /* A goto label has function scope */
.
.
.
goto start; /* the goto statement */
.
.
.
return 0;
}
Here, the label start is visible from the beginning to the end of the main()
function. Therefore, there should not be more than one label having the same
name within the main() function.
Program Scope: A variable is said to have program scope when it is declared
outside all functions. The following code is an example:
int x = 0; /* program scope */
float y = 0.0; /* program scope */
int main()
{
int i; /* block scope */
.
.
.
return 0;
}
Here, the int variable x and the float variable y have program scope. Variables
with program scope are also called global variables, and they can be made visible
among different files, i.e., all the source files that make up an executable
program. A global variable is declared outside all functions, optionally with an
initializer. The following program shows the relationship between variables with
program scope and variables with block scope.
/* Example 4.10 */
/* Program for relationship between program scope
   and block scope */
#include <stdio.h>
int x = 1234;             /* program scope */
double y = 1.234567;      /* program scope */
void function_1()
{
    printf("From function_1:\n x=%d, y=%f\n", x, y);
}
int main()
{
    int x = 4321;         /* block scope 1 */
    function_1();
    printf("Within the main block:\n x=%d, y=%f\n", x, y);
    /* a nested block */
    {
        double y = 7.654321;   /* block scope 2 */
        function_1();
        printf("Within the nested block:\n x=%d, y=%f\n", x, y);
    }
    return 0;
}
Result of the Program
From function_1:
x=1234, y=1.234567
Within the main block:
x=4321, y=1.234567
From function_1:
x=1234, y=1.234567
Within the nested block:
x=4321, y=7.654321
As you can see in this program, there are two global variables x and y with
program scope, declared before any function. A function called
function_1() is then defined. The function_1() function contains only
one statement, which prints out the values held by both x and y. Because no
variable named x or y is declared within the function block, the values of
the global variables x and y are used by the statement inside the function. To
prove this, function_1() is called twice in the above program, once from the
block of main() and once from the nested block. The output shows that the values
of the two global variables x and y are passed to printf() in the
function_1() body both times. In main(), another integer variable x is defined
with block scope, which hides the global variable x within the block of the
main() function. The output of the printf() statement in the main block shows
that the value of x is the value of the local variable x with block scope while
the value of y is still that of the global variable y. Inside the nested block,
a local y is defined as well, so the last printf() prints the local values of
both x and y.
The lifetime of a variable is the period of time during which that variable
exists, and retains its value, during execution of the program. Some variables
exist briefly. Some variables are repeatedly created and destroyed while others
exist for the entire execution of a program. Whether a variable is automatic or
static determines its lifetime.
An automatic variable’s memory location is created when the block in which
it is declared is entered. An automatic variable exists while the block is active and
is destroyed when the block is exited. Since a local variable is, by default, created
when the block in which it is declared is entered and destroyed when the block
is left, one can see that a local variable is an automatic variable unless declared
otherwise.
A static variable is a variable that exists from the point at which the program
begins execution and continues to exist during the duration of the program. Storage
for a static variable is allocated and initialized once when the program begins
execution. A global variable is similar to a static variable since a global variable
exists during the duration of the program. Storage for the global variable is
likewise allocated and initialized once, before the program begins execution.
Thus, both the global variable and the static variable have a
history preserving feature and they continue to exist and their contents are preserved
throughout the lifetime of the program. A programmer can also declare a local
variable to be static by using the keyword static in the variable’s declaration as in
the following example:
static int num=0;
For most applications, the use of automatic variables works just fine. Sometimes,
however, programmers want a function to remember values between function
calls. This is the purpose of a static variable. A local variable that is declared as
static causes the program to keep the variable and its latest value even when the
function that declared it is not executing. It is usually better to declare a local
variable as static than to use a global variable. A static variable is similar to a
global variable in that its memory remains for the lifetime of the entire program.
However, a static variable is different from a global variable because a static
variable’s scope is local to the function in which it is defined. Thus, other
functions in the program cannot modify a static variable’s value because only the
function in which it is declared can access the variable.
4.4 MACROS

A macro is a named fragment of text that can be used any number of times in a
program and increases the flexibility of programming. The syntax is given below:
#define symbolic_name replacement_constant
For example, #define MAX 10
Wherever MAX is found, the preprocessor replaces it with 10. This happens
before the program is actually compiled, and such statements are
called macros or macro definitions. MAX is called a macro template and 10, its
corresponding macro expansion. The rules of macro definition are:
(a) No semicolon at the end of the statement.
(b) A macro template is usually written in capital letters for ease of identification.
(c) No commas in between.
The advantages of such macro definitions are clear:
(a) No accidental change of constants.
(b) If a constant occurs at a number of places, like the size of an array or the
rate of interest, and you later want to increase the size of the array or change
the rate of interest, you would otherwise need to make changes
at a number of places in the program code. Using macros simplifies the
procedure, as a change carried out at the top is reflected throughout the
program.
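The second advantage can be sketched as follows. The macro values, the array
contents and the function name total_interest are assumptions made up for this
illustration.

```c
#include <stdio.h>

#define MAX  5          /* hypothetical array size; change it in one place */
#define RATE 8.5        /* hypothetical rate of interest in percent */

float deposits[MAX] = { 1000.0f, 2500.0f, 400.0f, 750.0f, 1200.0f };

/* Hypothetical helper: total yearly interest on all deposits. */
float total_interest(void)
{
    float sum = 0.0f;
    int i;
    for (i = 0; i < MAX; i++)          /* loop bound follows MAX */
        sum += deposits[i] * RATE / 100.0;
    return sum;
}
```

If the rate changes, only the #define RATE line needs to be edited; the
calculation picks up the new value after recompilation.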
We can use macros for substituting complex statements such as those given below:
#define INPUT(a) scanf("%f", &a);
#define OUTPUT printf("Enter a value");
#define AREACYL(r, h) (2*3.14 * r *( r+h))
#define volsp(r) ( 3.14 * r * r * r * 4.0/3.0 )
Note the last two statements carefully. We are passing arguments. For example,
AREACYL refers to the area of the cylinder, r is the radius and h is the height.
Whenever you want to calculate the area of the cylinder, you can specify the
symbolic constant AREACYL which is the macro with the arguments namely radius
and height. For example,
area = AREACYL(3.0, 2.0);
Now AREACYL will be replaced by (2*3.14 * 3.0 *( 3.0+2.0)).
Of course you should have defined area as a float in the function. Thus you
have used the macro like a function. Similarly the #define volsp calculates
the volume of a sphere of radius r. You can give any radius. It will substitute the r
with the given radius. For example, you can call,
x = volsp(4.0);

Remember that, in the #define line, there should not be a space between the macro
name and the opening parenthesis of the argument list. If a space is left there, the
argument list will be treated as part of the replacement text. Whenever you define
a macro in the form of a function, the entire macro expansion should be enclosed
within parentheses.
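Why the parentheses matter can be seen in a small sketch. The macros SQUARE_BAD
and SQUARE and the routine show_squares are hypothetical names introduced only
for this illustration.

```c
#include <stdio.h>

#define SQUARE_BAD(x)  x * x          /* no parentheses: fragile */
#define SQUARE(x)      ((x) * (x))    /* arguments and whole expansion parenthesized */

/* Hypothetical demo routine comparing the two expansions. */
void show_squares(void)
{
    /* SQUARE_BAD(2 + 3) expands to 2 + 3 * 2 + 3, which is 11 */
    printf("bad:  %d\n", SQUARE_BAD(2 + 3));
    /* SQUARE(2 + 3) expands to ((2 + 3) * (2 + 3)), which is 25 */
    printf("good: %d\n", SQUARE(2 + 3));
    /* 100 / SQUARE(5) expands to 100 / ((5) * (5)), which is 4 */
    printf("quot: %d\n", 100 / SQUARE(5));
}
```

Since a macro is substituted as text, operator precedence is applied only after
the substitution; the parentheses keep the argument and the result together.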
Now look at Example 4.11 given below:
/* Example 4.11 - to demonstrate macros */
#include <stdio.h>
#define INPUT(a) scanf("%f", &a);
#define OUTPUT printf("Enter radius of sphere\n");
#define volsp(r) ( 3.14 * r * r * r * 4.0/3.0 )
#define AREACYL(r,h) (2.0*3.14*r*(r+h))
int main()
{
    float a, v, r, h;
    OUTPUT
    INPUT(a)
    v = volsp(a);
    printf("volume of sphere of radius %f = %f\n", a, v);
    printf("Enter radius and after a space height of cylinder\n");
    INPUT(r)
    INPUT(h)
    printf("area of cylinder=%f\n", AREACYL(r,h));
    return 0;
}
Result of the program
Enter radius of sphere
2.0
volume of sphere of radius 2.000000 = 33.493332
Enter radius and after a space height of cylinder
4
5
area of cylinder=226.080000
Four macros have been defined in the program. In the main(), OUTPUT
will be replaced by printf(“Enter radius of sphere\n”); wherever
it is found, at the time of preprocessing, but before compilation. Similarly, the
other three macros will also be substituted at the time of preprocessing. Parameters
are passed in the macros AREACYL(r,h) and volsp(r). This is similar to
passing values in a function. The program prints the volume of the sphere and then
the area of the cylinder for the dimensions given at run time.
There are, however, differences between macros and functions although a
macro resembles a function in the above examples. As you may have guessed, the
executable code of a macro-based program will be larger than that of a
function-based program if the macro is used more than once. This is because the
macro text is substituted wherever it appears, whereas the code of the
corresponding function exists at one place and is not duplicated at each call.
Macros are usually faster than functions, as they do not incur the overhead of a
call, with its passing of arguments and return of values. This calling and
returning does not arise with macros, since macros are substituted at every place
of occurrence. Therefore, depending on the context, a macro or a function could
be used, i.e., if speed is the criterion, a macro could be used, and if program
size is the criterion, a function could be used.
By now you know how a program is converted into an executable code.
The program statements you write are the source code, which is made complete
during preprocessing, and then compiled. These two operations take place when
you say compile, and the object code is produced after compilation with a file
name.OBJ extension. The linker in the system links all files and gives you an
executable file. The code in turn is called executable code and stored with file
name.EXE extension. When you say run, the .EXE file is executed. Thus you may
have four files in the same name with different extensions as given below:
name.c
name.bak /*back up file*/
name.OBJ
name.EXE
Whenever you want to execute the program, use the executable file.
#undef MACROS
‘C’ is a flexible language. We have defined certain macros. We can always undefine
them at some point later in the program. The syntax is given below:
#undef INPUT
#undef AREACYL
Whenever the preprocessor comes across an #undef directive, it ceases to recognize
the corresponding macro definition from that point onwards.

4.5 PREPROCESSOR DIRECTIVES

The preprocessor is a special text processor that performs lexical conversions on
the source text. It basically handles macros and constants. In C, all preprocessor
commands begin with the ‘#’ (hash mark) character. A preprocessor directive is
terminated by the end of the line, without ‘;’. The preprocessor is the part of the
C compilation process that recognizes statements preceded by a # sign. Although most
C compilers have the preprocessor integrated into them, preprocessing is considered
an independent phase, since it works on the source code before compilation.
Preprocessor statements have a different syntax from that of normal C statements
and are used for including header files, conditional compilation and macro
definitions. An #include directive must appear in your program before any of the
definitions contained in the header file are referenced. The preprocessor searches
for the header file on the system and includes its contents in the program at the
point where the #include statement appears. If you want to continue a preprocessor
directive on the next line, add a backslash ‘\’ to the end of the line. The
following example shows how the backslash ‘\’ is used in a macro definition.
/* macro parameters carry no types; every line but the last ends with a backslash */
#define Swap_Function(val1, val2) \
{                                 \
    (val1) ^= (val2);             \
    (val2) ^= (val1);             \
    (val1) ^= (val2);             \
}
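A multi-line macro continued with backslashes can also be sketched with a temporary
variable instead of the exclusive-OR trick. The macro SWAP_INT and the function
order_pair are hypothetical names; the do { } while (0) wrapper is a common idiom,
not something prescribed by this unit.

```c
#include <stdio.h>

/* Hypothetical macro: swaps two int variables using a temporary.
   Every line except the last ends with a backslash. The do { } while (0)
   wrapper lets the macro be used like a single statement. */
#define SWAP_INT(a, b)      \
    do {                    \
        int swap_tmp = (a); \
        (a) = (b);          \
        (b) = swap_tmp;     \
    } while (0)

/* Hypothetical helper: leaves the smaller value in *lo and the larger in *hi. */
void order_pair(int *lo, int *hi)
{
    if (*lo > *hi)
        SWAP_INT(*lo, *hi);   /* works without braces thanks to do/while */
    else
        printf("already ordered\n");
}
```

Because the macro expands to one statement, it can be followed by a semicolon and
used in an if...else construct without extra braces.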
Use of Preprocessor
A very common use of the preprocessor is to define a constant. Table 4.1
shows two methods of defining a constant: as a macro with the #define directive
and as a variable with the ‘const’ keyword.
Table 4.1 Methods to Define C Constant as Preprocessor

Method Declaration
Method I # define PI 3.141592654
Method II const double PI=3.141592654;

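Both #define and ‘const’ serve the same purpose of giving a constant a name, but
they work differently: the preprocessor replaces every occurrence of the macro PI
with its value before compilation, whereas const creates a typed variable that
cannot be modified. In either case, the value needs to be written only once even
if it is used many times in the program. The preprocessor ignores extra white
space, such as spaces and tabs, between the macro name and its replacement text.
The name PI is written in capital letters to mark it as a macro or named
constant. The following code shows how we create macros using the preprocessor.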
/* Example 4.12 */
#include <stdio.h>
#define OPEN {
#define CLOSE }
#define PRINT printf("This is Preprocessor Macro.");
int main()
OPEN
PRINT
CLOSE
Result of the Program
This is Preprocessor Macro.
Here OPEN, PRINT and CLOSE stand for {, the printf() statement
and } respectively. Modern C compilers have a built-in preprocessor.
A section of a program can be compiled conditionally with the #ifdef directive:
the section is compiled only if the named macro has been defined. For example,
#ifdef File_Size
int file[File_Size];
#endif
The second line of the code, int file[File_Size];, is compiled only
if File_Size has been defined earlier with #define along with its value.
The #ifndef directive is the exact opposite of #ifdef. The code
written between the #ifndef and #endif directives is compiled only if the
specified identifier is not defined. For this, consider the following example:
#ifndef File_Size
#define File_Size 50
int file[File_Size];
#endif
In the abovementioned code, if File_Size has not been defined earlier, the macro
is defined with the value 50 before it is used as the array size.
The #if, #elif and #else directives are evaluated in the order in which they are
written, and only the section following the first true condition is compiled. The
#if and #elif directives evaluate constant expressions, which may contain macros.
#if File_Size>500
#undef File_Size
#define File_Size 300
#elif File_Size<100
#undef File_Size
#define File_Size 100
#else
#undef File_Size
#define File_Size 200
#endif
A chain of #if, #elif and #else directives is always ended with #endif.
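The chain above can be tried out in a short sketch. The starting value 700 and the
function name file_elements are assumptions made for this illustration.

```c
#include <stdio.h>

#define File_Size 700            /* hypothetical requested size */

/* Adjust File_Size at preprocessing time, using the chain from the text. */
#if File_Size > 500
#undef File_Size
#define File_Size 300
#elif File_Size < 100
#undef File_Size
#define File_Size 100
#else
#undef File_Size
#define File_Size 200
#endif

int file[File_Size];             /* array size fixed after the chain above */

/* Hypothetical helper: number of elements the array ended up with. */
int file_elements(void)
{
    return (int)(sizeof file / sizeof file[0]);
}
```

Since 700 is greater than 500, only the first branch is compiled and the array is
created with 300 elements; the other two branches are skipped by the preprocessor.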
Use #undef to remove a definition made with the #define directive, as shown in
the following example code:
#include <stdio.h>
#define VAL 40
#undef VAL
#define VAL 50
int main()
{
    printf("%d\n", VAL);
    return 0;
}
Result of the Program
50
The #error directive displays the error message written after the directive on
the same line and stops compilation. It is used to alert a programmer that the
program is being compiled under conditions it does not support.
#ifdef DEBUG
#ifdef RELEASE
#error DEBUG and RELEASE both are defined
#endif
#endif
The #warning directive is similar to #error, but it issues a warning
without stopping compilation. It is not available in every compiler.
The #pragma directive has no fixed meaning in the language; its effect is compiler
vendor specific. In the GNU C preprocessor, #pragma once is used to include a
header file only once. Older GNU documentation describes this directive as obsolete.

Check Your Progress


5. Which is the default storage class?
6. What is the difference between register variables and auto variables?
7. How are global variables declared?
8. Mention one characteristic of static variables.
9. List the rules for defining macros.

4.6 ANSWERS TO CHECK YOUR PROGRESS QUESTIONS

1. Arguments declared as part of the function prototype are known as ‘formal
parameters’.
2. A function definition can be written anywhere in the file with a proper
declarator, followed by the declaration of local variables and statements.
3. A function definition consists of a function declarator and function declarations.
A function declaration is terminated by a semicolon, but a function declarator
is not terminated with a semicolon.
4. When a function calls itself, directly or indirectly, it is known as recursion.
5. Automatic or auto is the default storage class.
6. auto variables are stored in the memory, while register variables
are stored in the CPU register.
7. Global variables are declared with a storage class specifier, extern.

8. Static variables are local to the function and exist till the termination of
the program. When the program exits the function, the value of the static
variable is not lost.
9. The rules of macro definition are:
(a) No semicolon at the end of the statement.
(b) A macro template is usually written in capital letters for ease of
identification.
(c) No commas in between.

4.7 SUMMARY

• A function may be called directly or indirectly. You pass actual arguments or
values to the function when you call it.
• A function definition consists of function declarator and function declaration.
• Formal arguments are defined in the function declaration in the calling function.
• Parameters are written in the function prototype and function header of the
definition.
• When a function calls itself directly or indirectly it is called recursion.
• Storage class specifies the segments of the program where the variable is
recognized and how long the variable will be stored.
• The four types of storage classes are: auto, static, extern and register.
• Except for register variables, the other three types are stored in the memory.
• The initial value of static variable is 0, unless otherwise specified.
• When a static variable is placed outside all functions, it is called external
static variable.
• The scope and lifetime of a variable depends on the storage class of the
variable in C language.
• Macros are used to increase the flexibility of programming.
• Preprocessor is a special text editor that performs lexical conversions on
the text.

4.8 KEY WORDS

• Function: It is a self-contained and well-defined named group of statements
that is aimed at accomplishing a specific task or action in the program.
• Lifetime of variable: It is the period of time during which that variable
exists during execution.

4.9 SELF ASSESSMENT QUESTIONS AND EXERCISES

Short-Answer Questions


1. What is a function declarator?
2. How do you call a function?
3. Write a short note on function arguments.
4. What are the features of auto variables?
5. What are storage classes?
6. What are preprocessor directives?
Long-Answer Questions
1. Explain the following:
(a) Function
(b) Formal vs actual parameters
(c) Return statement
(d) A function calling multiple functions
2. Explain the recursive factorial algorithm for finding the factorial of 7.
3. Explain the differences between static and auto variables.
4. What are the scope rules for the four storage classes?

4.10 FURTHER READINGS

Gottfried, Byron S. 1996. Programming With C, Schaum’s Outline Series. New
York: McGraw-Hill.
Jeyapoovan T. 2006. Computer Programming - Theory & Practice. New Delhi:
Vikas Publishing.
Khurana, Rohit. 2005. Object Oriented Programming. New Delhi: Vikas
Publishing.
Kanetkar, Yashwant. 2003. Let Us C. New Delhi: BPB Publication.
Saxena, Sanjay. 2003. A First Course In Computers. New Delhi: Vikas
Publishing.
Subburaj, R. 2000. Programming In C. New Delhi: Vikas Publishing.


UNIT 5 ARRAYS
Structure
5.0 Introduction
5.1 Objectives
5.2 One Dimensional Array
5.2.1 Two-Dimensional Arrays
5.3 Strings and Characters Array
5.3.1 String Manipulation Using Library Functions
5.4 Answers to Check Your Progress Questions
5.5 Summary
5.6 Key Words
5.7 Self Assessment Questions and Exercises
5.8 Further Readings

5.0 INTRODUCTION

C facilitates the arrangement of data items of the same type in memory as an array.
Arrays are also used in other computer languages, such as Pascal and BASIC. In C,
an array is the usual way of building a larger unit out of items of the same data
type. It makes repetitive tasks easy when elements of the same data type need to be
processed one after another. In essence, an array holds a fixed number of data
elements of the same type. Data items grouped in an array are held in memory while
the program runs. In C, an array is declared by writing the element type, such as
int or char, followed by the array name and its size in square brackets, for
example int a[5]; and char Student_Name[] = "Student"; respectively. In daily life,
similar objects are grouped into units; for example, in a library, all English
fiction and non-fiction works are kept on separate shelves. The same concept is
used in C arrays. Subscripts or indices are written in square brackets to address
the individual memory locations of an array. Each indexed value is called an
element. The types of arrays used in C are one-dimensional, two-dimensional and
multi-dimensional arrays.

5.1 OBJECTIVES
After going through this unit, you will be able to:
• Explain the basics of arrays
• Define strings
• Explain array of strings
• Do string manipulation using library functions
• Write programs on arrays
• Write programs using array strings

5.2 ONE DIMENSIONAL ARRAY

Definition
An array is a vector defined as a simple data structure. It holds a fixed number of
equal size data elements of the same data type.
If the array elements are known beforehand, they can be defined right at the
beginning. If the ages of the employees are known beforehand, they can be declared
as:
int emp_age [5] = {40, 32, 45, 22, 27};
The data elements are written within braces separated by commas. In this
case, when data elements are declared, there is no need to declare the size; we
can write:
int emp_age [] = { 40, 32, 45, 22, 27 };
The latter is advantageous. If the size is not declared, the compiler will
count the number of elements and automatically allot the size. On the other hand,
if we specify the size and give fewer elements, the compiler will set
the remaining elements to zero.
For example, if we declare
int Marks [ 5 ] = { 100, 70, 80 };
In this case, Marks [0]= 100
Marks [1] = 70
Marks [2] = 80
What happens to the other elements? The computer will assign 0 to them.
Marks [3] = 0
Marks [4] = 0
If you declare the size as 5 and give 7 elements, the compiler will report the
excess elements (as an error or, with some compilers, as a warning).
Therefore, if you know the data elements in advance, you can allow the compiler
to calculate the size.
Now let us try some programs.
For example, a program that reads 10 integers, one at a time, and prints them after
they have all been collected is as follows:
/* Example 5.1
   Ten integers of an array are read with the
   scanf function and printed */
#include <stdio.h>
int main()
{
    int s1[10];
    int i;
    printf("Enter 10 integers \n");
    for (i=0; i<=9; i++)
    {
        scanf("%d", &s1[i]);
    }
    printf("you have entered:\n");
    for (i=0; i<=9; i++)
    {
        printf("%d\n", s1[i]);
    }
    return 0;
}
Result of the program
Enter 10 integers
1 2 3 4 5 6 7 8 9 0
you have entered:
1
2
3
4
5
6
7
8
9
0
Analyse the program carefully. We declare s1 as an integer array of size 10.
Next, we use the first for loop to read the integers entered from the
keyboard. At the first iteration, the value for s1[0] will be received and stored at
location &s1[0]. This is similar to a simple variable where '&' denotes the address
of the variable. This is repeated till s1[9] is received and stored at &s1[9].
The next for loop prints the values of s1[0] to s1[9], i.e., 10 integers,
one at a time, each on a new line.

5.2.1 Two-Dimensional Arrays
Multidimensional arrays operate on the same principle as one-dimensional arrays.
You have to give the size of each of the two subscripts or indices in the case of a
two-dimensional array. For example,
w [10][5]
is a two-dimensional array whose two subscripts have different ranges. Here, there
will be 50 elements. The first element can be denoted as w [0][0].
The next element will be w[0][1].
The fifth element will be w[0][4].
The sixth element will be w[1][0].
The last element will be w[9][4].
This can be considered as a row and column representation. There are ten
rows and five columns in the above example. When data is stored in the array, the
second subscript will change from 0 to 4, one at a time, with the first subscript
remaining constant at 0. Then, the first subscript will become 1 and the second
subscript will keep increasing from 0 to 4. This is repeated till the first subscript
becomes 9 and the second 4. This array can be used to represent the names of 10
persons, with each name occupying 5 characters. If the names are to be handled as
C strings, one of the 5 positions is needed for the terminating null character, so
each name can have at most 4 visible characters. The first subscript refers to the
name of the 0th person, 1st person, 2nd person, and so on. The second subscript
refers to the 1st character, 2nd character, and so on of the name of a person.
Thus, ten such names can be stored in this array.
The dimension of the array can be increased to 3 with 3 square brackets as
given below:
Marks [ 50 ][ 3 ][3 ];
The name of the first element will be Marks [ 0 ] [ 0 ] [ 0 ]
The last element will be Marks [ 49 ] [ 2 ] [2 ].
It is easy to add more dimensions to an array, but the array also becomes
harder to visualize. Arrays with three or more dimensions are, however,
useful in complicated scientific applications. Now, let us
understand the concept of multidimensional arrays using a simple problem.
Assume that we need to write a program to read two arrays (both two-
dimensional) and multiply the corresponding elements and store them in another
two-dimensional array. To make the problem simpler, we will use [2][2] arrays.
Let us call the arrays x, y and z.
We have
x = { x[0][0]  x[0][1] }     y = { y[0][0]  y[0][1] }
    { x[1][0]  x[1][1] }         { y[1][0]  y[1][1] }
We want to multiply x[0][0] and y[0][0] and store the result in
z[0][0], and so on.
The values of x and y are given in the program itself.

/* Example 5.2
   Element-by-element multiplication of two 2-dimensional arrays */
#include <stdio.h>
int main()
{
    int i, j;
    int z[2][2];
    int x[2][2] = {1, 2, 3, 4};
    int y[2][2] = {5, 6, 7, 8};
    for (i=0; i<=1; i++)
    {
        for (j=0; j<=1; j++)
        {
            z[i][j] = x[i][j] * y[i][j];
            printf("z[%d][%d]=%d\n", i, j, z[i][j]);
        }
    }
    return 0;
}
We have declared two arrays x[2][2] and y[2][2] as follows:
x = {1 2}     y = {5 6}
    {3 4}         {7 8}
x [0][0] = 1     x [1][1] = 4
y [0][0] = 5     y [1][1] = 8
Therefore, after multiplication of the respective elements, we get
z = {5 12}
{21 32}
The program prints out the values of the products stored in array z.
Result of the program
z[0][0]=5
z[0][1]=12
z[1][0]=21
z[1][1]=32
Note that the elements are stored row-by-row contiguously.
In the above example, we declared each array as two-dimensional but
initialized it with a single, one-dimensional list of values, as given below:
int x [2] [2] = {1, 2, 3, 4} ;
The compiler assigned the values row by row, so the program gave the correct
element-by-element products. We can actually present this in another manner as follows:
int x [2] [2] = {
    {1, 2},
    {3, 4}
};
In this method, we can write the elements in a form closer to a matrix. Both
the above definitions are equivalent. In the latter definition, we can easily visualize
a two-dimensional array. The values in the first inner pair of braces form the first
row of the two-dimensional array, and the values in the second inner pair form the
second row.
Following are the characteristics of arrays.
1. An array holds elements of the same data type.
2. Array elements are stored sequentially in memory.
3. Two-dimensional array elements are stored row by row.
4. The array name represents the address of the starting element.
5. The size of an array must be known when it is declared, either stated
explicitly or counted from the list of initial values.

Check Your Progress


1. What is an array?
2. How are arrays declared and named?
3. How are data elements written in an array declaration?

5.3 STRINGS AND CHARACTERS ARRAY


A string is a one-dimensional array of characters terminated by the null
character '\0'. The names of students in a class can be denoted by a
two-dimensional array like b[50][20], with the first subscript, 50, denoting the
number of students and the second denoting the width of the names.
This is nothing but an array of strings.
In earlier examples, you created strings, which are essentially one-dimensional
arrays of characters. You can also create an array of strings. This will be a
two-dimensional array of characters. For instance,
char name[5][10];
declares a two-dimensional array of characters. This can be used to deal
with 5 strings of up to 9 characters each, since one position in every row is
needed for the terminating null character.

To read five names and display them, the program will be as follows:

/* Example 5.3
   Two-dimensional array */
#include <stdio.h>
int main()
{
    int i;
    char name[5][10];
    /* Receiving strings */
    for (i=0; i<5; i++)
    {
        printf("Enter name[%d]: \n", i+1);
        scanf("%9s", name[i]);   /* at most 9 characters plus '\0' */
    }
    /* Displaying strings */
    for (i=0; i<5; i++)
    {
        printf("name[%d]: %s\n", i+1, name[i]);
    }
    return 0;
}
Look at the program. It declares a two-dimensional character array, called
name[5][10].
Then you receive the names. Look at the ease with which you receive them:
each call to scanf() uses the %s conversion and is given name[i] as the place
to store the string.
You do not even give the second dimension. This is possible only in the case
of strings. Recall that str was a one-dimensional array. You read it just by specifying
str. However, since %s stops reading at the first white space, you cannot give
white spaces within a name. A field width, as in %9s, additionally keeps scanf()
from storing more characters than a row of 10 can hold.
As you know, the elements of an array will be stored contiguously. In
name[i], you are specifying the address of the 0th location of the ith row. The
string received will be stored from there onwards. In the second loop, you print
each string the same way by specifying the first subscript alone. The result of the
program is given below.
Result of the program
Enter name[1]:
Ganapathy
Enter name[2]:
Subramani
Enter name[3]:
Narayanan
Enter name[4]:
Joseph
Enter name[5]:
Mohammed
name[1]: Ganapathy
name[2]: Subramani
name[3]: Narayanan
name[4]: Joseph
name[5]: Mohammed
The result demonstrates the use of two-dimensional character arrays.
Be cautious not to exceed the size of the array. Although you did not enter
10 characters, the computer recognizes the end of each string because scanf()
stops reading when [Enter] (or any other white space) is typed and stores a
null character after the last character read.
5.3.1 String Manipulation Using Library Functions
The following header files declare library functions used to carry out operations
on strings and characters:
<string.h>
<ctype.h>
These header files contain a number of functions for string manipulation.
Some of them are illustrated as follows:
1. String Copy
The library function strcpy is used to copy one string to another. Let us write a
program for copying a string to another using the library function. The program is
given below:
/* Example 5.4
   String copy using library function */
#include <stdio.h>
#include <string.h>
int main()
{
    /* str2 needs 12 elements: 11 characters plus the terminating '\0' */
    char str1[] = "Subramaniar", str2[12];
    strcpy(str2, str1);
    printf("you Entered:\n");
    puts(str1);
    printf("copied string is:\n");
    puts(str2);
    return 0;
}
Look at the declaration statement. You have declared and initialized str1
and declared str2 in one statement. There is no need to give the dimension
when you initialize the array as given in the program. The compiler will count the
number of characters, add one for the terminating null character and set the
dimension by itself. The destination array must be at least that large. Look at the
string copy statement, reproduced below:
strcpy (str2, str1) ;
The contents of the second named string will be copied to the first named
string. strcpy() is a library function that receives two strings: the
first one is the destination string and the second, the source string. Look at the result
of execution of the program.
Result of the program
you Entered:
Subramaniar
copied string is:
Subramaniar
The program works correctly. Thus, you can use the library function easily
without writing a program for copying strings.
The library function is declared in the header file <string.h>. Therefore,
you should include <string.h> in any program that uses the
function.
2. String Length
The library function strlen() can be used to find the length of a string, although
you could also write a program for finding out the length yourself. Let us use it
and modify the above program. The modified program is as follows:
/* Example 5.5
   String length using library function strlen() */
#include <stdio.h>
#include <string.h>
int main()
{
    char str1[] = "Shri Rama Jeyam";
    printf("you Entered: ");
    puts(str1);
    /* strlen() returns size_t; the cast matches it to %d */
    printf("\nits length=%d", (int)strlen(str1));
    return 0;
}
Look at the program. We find the length and print it in the last printf statement,
reproduced as follows:
printf("\nits length=%d", (int)strlen(str1));
When program execution reaches the line, the cursor will go to the beginning
of the next line because of the appearance of \n. Then the text 'its length='
will be printed. Then, because of the appearance of %d, printf will look for an
integer given after the quoted text. Here, the integer comes from executing
the library function strlen(str1). Since strlen() returns a value of type size_t,
it is converted to int for printing with %d. Thus, we get the result shown as
follows.
Result of the program
you Entered: Shri Rama Jeyam
its length=15
From the above examples, it will be clear that the use of library functions
saves effort and also results in shorter programs.
The library functions available in <string.h> for string manipulations are
listed below:
Example 1
strncpy function
#include <string.h>
char *strncpy(char *s1, const char *s2, size_t n);
The strncpy function copies not more than n characters from the array
pointed to by s2 to the array pointed to by s1. If s2 has n or more characters,
no terminating null character is added to s1.
Example 2
strcat function
char *strcat(char *s1, const char *s2);
The strcat function appends a copy of the string pointed to by s2 to the
end of the string pointed to by s1.
Example 3
strncat (s1, s2, n)
Appends not more than n characters from s2 at the end of s1.
Example 4
strcmp (s1, s2)
Compares the string s1 to s2. The integer returned is greater than, equal to or
less than zero according to whether s1 is greater than, equal to or less than s2.

Example 5
strcpy (s1, s2)
Copies s2 to s1.
Example 6
strchr (s, c)
The function locates the first occurrence of c in the string s. Returns a
pointer to the character or null if c does not occur in s.
Example 7
strstr (s1, s2)
The function locates the first occurrence of s2 in s1. It returns a pointer to
the located string or null if not found.
Example 8
strlen (s)
Computes the length of string s.
3. Character Handling <ctype.h>
Each testing function in this header returns non-zero (true) if and only if the value
of the argument c of type int conforms to the description of the function.
For example,
#include <ctype.h>
int isalnum (int c);
The function isalnum tests for any character (the argument c) for which
isalpha or isdigit is true. The function isalpha will be true if the character
received is a letter (A..Z, a..z). The function isdigit will be true if the
character received is a digit (0..9).
Other functions in <ctype.h>
isalpha(c) – Tests for any letter
iscntrl(c) – Tests for any control character
isdigit(c) – Tests for any decimal digit
isgraph(c) – Tests for any printing character except space
islower(c) – Tests for any lowercase letter
isprint(c) – Tests for any printing character including space
ispunct(c) – Tests for any printing character that is neither space
nor a character for which isalnum is true
isspace(c) – Tests for any white-space character
isupper(c) – Tests for any uppercase letter
isxdigit(c) – Tests for any hexadecimal digit character
tolower(c) – Converts an uppercase letter to the corresponding lowercase letter
toupper(c) – Converts a lowercase letter to the corresponding uppercase letter

Check Your Progress


4. Define ‘string’ with reference to ‘array’.
5. What is the library function strcpy() used for?

5.4 ANSWERS TO CHECK YOUR PROGRESS QUESTIONS

1. An array is a fixed-size collection of elements of the same data type. There
can be arrays of integers, arrays of characters and arrays of floating point
numbers, etc.
2. An array is a variable and hence must be declared like any other variable.
The naming convention is the same as in the case of other variables. The
array name cannot be a reserved word and must be unique in the program.
An array variable name and another ordinary variable name cannot be
identical. Since there is no limit to variable names, similar names for a variable
and an array should not be used.
3. The data elements are written within braces separated by commas.
4. Strings are a special type of one-dimensional array. A string is an array of
characters ending with a null character, meaning that it can contain zero to
many characters. A string constant is enclosed within double quotes.
5. The library function strcpy() is used to copy one string to another.

5.5 SUMMARY

• An array is a form of data type and is used to define arrays of integers,
arrays of characters and arrays of floating point numbers.
• The most important feature of an array is that it contains data of the same
type.
• An array is a variable and hence it should be declared like any other variable
of C programming.
• Arrays can be given specific names.
• An array variable is distinguished from a single variable on the basis of
declaration.
• Array dimensions are declared within square brackets, with each dimension
in its own pair of brackets.
• In a two-dimensional array, the sizes of the two subscripts have to be
given.
• Arrays can be used to represent matrices. Thus, you can transpose a matrix
by interchanging rows and columns in a matrix.
• Matrix multiplication is possible when the number of columns in the first
matrix is equal to the number of rows in the second matrix.
• Strings are a special type of one-dimensional array.
• The library function strcpy is used to copy one string to another.

5.6 KEY WORDS

• Arrays: It is the fixed-size sequence of elements of the same data type.
• Traversal: It is the process of accessing each element of an array.
• Multi-dimensional array: It is a collection of elements which are accessed
with the help of n subscript values.
• Two-dimensional array: It is defined as an array in which two subscript
values are used to access an array element.

5.7 SELF ASSESSMENT QUESTIONS AND EXERCISES

Short-Answer Questions
1. Which of the following declarations are invalid? Give reasons.
(a) float x [+20];
(b) age [20] of float;
(c) double d[50];
(d) int a[20]
(e) int number [0–50];
2. Find the number of elements in the following array declarations:
(a) int a[2][3];
(b) int x[6];
Long-Answer Questions
1. Write a C program to add two matrices.
2. Write a C program to find the smallest and largest values in an array.
3. Write a C program to check whether the row and column of matrix1 is
equal to the row and column of matrix2. Multiply both the matrices and
store the result in a matrix3.

5.8 FURTHER READINGS

Gottfried, Byron S. 1996. Programming With C, Schaum’s Outline Series. New
York: McGraw-Hill.
Jeyapoovan T. 2006. Computer Programming - Theory & Practice. New Delhi:
Vikas Publishing.
Khurana, Rohit. 2005. Object Oriented Programming. New Delhi: Vikas
Publishing.
Kanetkar, Yashwant. 2003. Let Us C. New Delhi: BPB Publication.
Saxena, Sanjay. 2003. A First Course In Computers. New Delhi: Vikas
Publishing.
Subburaj, R. 2000. Programming In C. New Delhi: Vikas Publishing.

UNIT 6 POINTERS
Structure
6.0 Introduction
6.1 Objectives
6.2 Concept of Pointers
6.2.1 Pointer Arithmetic
6.2.2 Passing Pointer to a Function
6.2.3 Pointers and Strings
6.3 Answers to Check Your Progress Questions
6.4 Summary
6.5 Key Words
6.6 Self Assessment Questions and Exercises
6.7 Further Readings

6.0 INTRODUCTION

In this unit, you will learn about pointers. A pointer is a variable that contains
the address of another variable. The pointer is one of the distinctive concepts of C
since it gives direct access to memory and results in compact coding. It also
facilitates returning more than one value from a function. Pointers are closely
associated with memory addressing. We know that constants and variables are
stored in memory, and that each location in memory has an address just as a
person has an address. Memory is organized in groups of 8 bits, or
bytes. Each byte in memory has an address. Therefore, each location has an
address and stores a value.

6.1 OBJECTIVES
After going through this unit, you will be able to:
• Explain the use of pointers
• Write compact programs using pointers
• Use the feature, ‘call by reference’ to improve your programs
• Write simple programs using dynamic memory allocation

6.2 CONCEPT OF POINTERS


Definition and Features of Pointer
A pointer is a variable that contains the address of another variable. As you
know, any variable has the following four properties:
(a) Name (b) Value (c) Address (d) Datatype
For instance, consider the following declaration of a simple integer:
int var = 10;
The name here is var and its value is 10. Its address is not declared here
since you want to give the compiler the flexibility to store it wherever it wants. If
you specified an address, the value would have to be stored at exactly that address.
Specifying actual addresses is carried out in machine language programming.
However, this is not required in High-Level Language (HLL) programming, and by
printing the value of &var, you can find out the address of the variable. When the
program is run at different times, different addresses may
be printed. What is important is that each variable is given an address
and retains it for as long as the variable exists. For auto variables, this is only
while the block in which the variable is declared is being executed; the address is
no longer valid once the program comes out of that block.
At this point, you may also recall that in the case of function declarations in the
calling function, the compiler does not allocate memory to the parameters in the
declaration. That is the reason why the parameters in the declaration part are not
recognized in the calling function. It is only a prototype.
The fourth feature of a variable is its datatype. In the above example, var is
an integer. A pointer has all the four properties of variables. The value
of every pointer is a memory address, and on most machines an address is stored
as a whole number; it cannot be a float. The datatype of a pointer, however, is not
int: it is 'pointer to' the type of the object it points to, which
may be an integer or a float or a character or a function, etc. A pointer has
a name and it has a value. For instance, the following is a valid declaration of a
pointer to an integer.
int * ip;
Here, ip is the name of a pointer. Its value is the address of an
integer. The pointer itself is also stored in a location in memory like
any other variable. Its datatype is 'pointer to int', written int *.
Pointer to void
A pointer to void can be declared as you declare any other pointer, such as
pointer to integer, float, etc.
void* void_pointer;
Note that void is a keyword and it means ‘nothing’. Note also that
void_pointer holds a memory address like
any other pointer. It does not, however, say what type of data is stored at that
address. Then what is the use of this pointer?
If you try to assign the address of an integer variable to a float pointer or
any other type of pointer other than a pointer to an integer, then the compiler will
report an error or a warning. This rule applies to the addresses of variables of other
types, such as double variable, char variable, etc. However, there is an exception
to the rule and that is where the void pointer is useful. You can assign the address
of any datatype to the pointer to void. For instance, you can declare as follows:
void* void_pointer;
int int_var;
void_pointer = &int_var;
The above assignment of the address of an integer variable to the pointer to
void is permissible. Similarly, you can assign a pointer of any type, such as float,
double, etc. to a void pointer.
There are times when you write a function but do not know the datatype of
the data it will handle or return. When this is the case, you can use a void pointer.
A void pointer is a special type of pointer that has the flexibility of pointing to any
datatype.
However, there is one limitation in the use of void pointers as compared to
pointers of other types, namely direct dereferencing of a void pointer is not permitted.
The programmer must first convert the pointer to void into a pointer type that
points to a valid datatype, such as int, char or float, and then dereference it.
This conversion is achieved by using
the concept of type-casting.
NULL Pointer
The concept of NULL pointer is different from the above concept of void pointer.
A NULL pointer is a value that a pointer of any datatype can take; it is written as
NULL or as the constant zero. It denotes that the pointer does not
point to any valid memory address.
For example:
int* var;
var = 0;
The above statements declare var as a pointer to an integer that does not
point to a valid memory address. This shows that var has a NULL pointer value.
The difference between void pointers and NULL pointers:
A void pointer is a special type of pointer, pointer to void, and denotes that it can
point to any datatype. A NULL pointer can be of any pointer type but does not point to
any valid object or memory address. It is important to note that a NULL pointer
is different from a pointer that is not initialized: an uninitialized pointer has an
unpredictable value, whereas a NULL pointer can be tested for.
6.2.1 Pointer Arithmetic
Take a look at some examples involving pointers:
Example 6.1
int dat = 100;
int * var;
var = &dat;
Here, dat is an integer variable. Its value is 100; its name is dat; it
will be stored in memory in a location with an address.
The next declaration means that var is a pointer to an integer and is a
variable. It will itself be stored at a location in memory with an
address. The value of var is the address of the integer variable it points to.
We do not know as yet which integer it points to. It can, however, be made to
point to any integer we like, by a proper assignment.
Now, look at the next assignment. The variable var is assigned the value
of &dat. This means var has the same value as the address of dat. By taking
into consideration the previous statements, we can conclude that var is a pointer
and it points to dat.
Now, if you specify dat or * var, both give the same value, 100.
Similarly, if you specify &dat or var, it is the address or, to be precise, the
starting address of dat.
Example 6.2
int * var;
* var = 100;
The above statements declare var as a pointer to an integer and then
propose to assign 100 to the integer it points to. We do not know or do not want
to make public the name of that integer. However, we can always access
it as * var. Since var has not been made to point to any variable, the behaviour of
this assignment is undefined: it may appear to work with some compilers, such as
Borland C++, but can lead to runtime errors with others.
The two statements cannot be combined into one as int * var = 100;
because this would try to initialize the pointer itself, not the integer, with 100.
It is flagged as an error even by the Borland C++ compiler. Therefore, it is
not possible to combine the declaration and the assignment insofar as a pointer
variable and integer constant are concerned. It is safer to make var
point to another variable as given in Example 6.1.
Example 6.3
int * var;
* var = 100;
Note: If this gives an error while compiling or while running the program, modify
this as in Example 6.1 already given.
What will be the output of the following statements?
printf ("%d", * var);
printf ("%d", (* var) ++);
printf ("%d", * var);
printf ("%d", var);
You can easily guess that the first printf will give the value of * var as 100.
What is the significance of the parentheses and the increment operation in
the second statement? Because of the parentheses, the increment applies to * var,
the value pointed to, and not to the pointer var. As the increment is postfix, the
value of * var is first printed as 100 and then incremented to 101. The next
statement confirms this when it prints 101.
The fourth printf prints the value of var, that is, the address of the integer it
points to. It printed 1192 when
I executed the program. You are unlikely to get the same address on execution.
The location where the variable is stored will not vary till the execution of the
program is completed. If you run the program again, you may get a different address.
Don’t worry. It does not affect our work. However, you may note down the value
and substitute it for the values mentioned here for understanding the concept of
pointers. Printing an address with %d works on the 16-bit compiler used here; the
portable way is to use %p and cast the pointer to void *.
Remember to enclose the dereferenced pointer within parentheses as given in the
example. The postfix form of the increment operator causes the
variable to be incremented after printing.
Example 6.4
After execution of the above four printf statements, you can execute the following
statements:
printf ("%d", * var ++);
printf ("%d", * var);
What happens? The value of * var, i.e. 101, is the first to be printed
out and then var will be incremented, i.e. instead of incrementing the value
as desired, the address is incremented and therefore, var now points to the
next location. Remember var was pointing to 1192. Will it go to 1193? No.
Since var points to an integer, which occupies 2 bytes on this compiler, 1192 and
1193 are already used. Hence, var now points to 1194. On compilers where an
integer occupies 4 bytes, var would move on by 4. The next statement prints the
value of * var. You had stored 101 in location 1192. You don’t know what is
contained in 1194. Hence, the next printf will print a garbage value; strictly
speaking, reading a location that does not belong to any variable you declared is
undefined behaviour. Note carefully what happened.
You wanted to increment * var in the first statement of Example 6.4. However,
the compiler has taken it that you wanted to increment the address, which
underlines the importance of parentheses. Also note that whenever you use
postfix notation, the postfix operation takes effect only after the value has been
used in the statement.
Is everything lost now? Can you not go back to address 1192? Yes, you
can, as the following indicates. Note that the following statements are executed in
continuation of all the above statements.
Example 6.5
printf ("%d", var) ;
printf ("%d", --var) ;
printf ("%d", * var) ;
The first statement in this example prints the address of the location in memory
pointed to by var. As expected, the pointer is at location 1194. The second
statement carries out two operations in the sequence shown:
(a) Decrement the pointer
(b) Print the new address
Decrementing takes place before printing because it is a prefix operator.
Now, it prints 1192; the original address is restored. Now the third statement
prints the value of * var or the value stored in location 1192, i.e., 101.
Note that prefix carries out the increment or decrement before printing
or any desired operation, whereas postfix does that after printing. Note that
the fundamentals of pointers are being discussed and that they should be understood
clearly before you proceed further.
You now have 101 stored in address 1192 and pointed to by var. Let us
see what happens on execution of the following statements, in continuation.
Example 6.6
printf ("%d", ++ (* var)) ;
printf ("%d", var) ;
These statements are perfectly correct. * var is incremented and the new
value printed in the first statement. Therefore, 102 will be printed. Has the address
been changed? No. Hence, the second statement will print the address as 1192.
Can you increment and decrement addresses? Yes, as the following indicates.
Example 6.7
printf ("%d", var ++);
printf ("%d", var) ;
What will be printed in the first statement above, 1192 or 1194? It will be
1192 because the incrementing of var takes effect only after its old value has been
passed to the first printf. Obviously, the second statement will print the
incremented address, viz. 1194. Let us not lose track but get back to the old
address and try prefixing increment / decrement operators to the address.
Example 6.8
var --;                     /* (a) */
printf ("%d", var) ;        /* (b) */
printf ("%d", ++ var) ;     /* (c) */
printf ("%d", -- var) ;     /* (d) */
printf ("%d", var) ;        /* (e) */
printf ("%d", * var) ;      /* (f) */
Before execution of the first statement in this example, you have:
var = 1194
location 1192 contains 102.

Now, analyse the execution statement-wise:
(a) decrements var to 1192
(b) confirms that var is 1192 indeed
(c) ++ var increments var and then prints it as 1194
(d) -- var decrements var and then prints it as 1192
(e) confirms var is 1192
(f) The value of * var is 102
Now, the concepts are becoming clearer. Let us carry out one more
example. Assume that all the above statements have been executed
and the following are now executed:
Example 6.9
printf ("%d", * (var ++)) ;   /* (g) */
printf ("%d", * (-- var)) ;   /* (h) */
printf ("%d", var) ;          /* (i) */
printf ("%d", * var) ;        /* (j) */
(g) Here, * var is printed as 102 because of the postfix operator.
Then, var, i.e. the address is incremented to 1194.
(h) Here, because of the prefix, var is decremented to 1192 and
then, the value at 1192, i.e., 102 is printed.
(i) confirms that var is back at 1192, and (j) that the value stored there is
still 102.
Now you would be familiar with the intricacies of pointers, prefix, postfix
and parentheses.
The program involving all these statements and the output is given in Example
6.10:
/* Example 6.10
   pointers
   Addresses are printed with %p and a cast to void *, the portable form,
   so they appear in an implementation-defined (usually hexadecimal) form
   rather than as the decimal numbers of the listing that follows. */
#include <stdio.h>
int main()
{
    int * var;
    int a = 100;
    var = &a;
    printf("value of * var=%d\n", *var);
    printf("value of (*var)++=%d\n", (*var)++);
    printf("value of * var=%d\n", *var);
    printf("address var=%p\n", (void *)var);
    printf("value of *var++=%d\n", *var++);
    /* var now points just past a: reading *var here is undefined
       behaviour and is shown only to illustrate a garbage value */
    printf("value of * var=%d\n", *var);
    printf("address var=%p\n", (void *)var);
    printf("original address var again=%p\n", (void *)--var); /* original address restored */
    printf("value of * var=%d\n", *var);
    printf("value of ++(*var)=%d\n", ++(*var));
    printf("address var=%p\n", (void *)var);
    printf("address var++=%p\n", (void *)var++);
    printf("address var=%p\n", (void *)var);
    var--;
    printf("address var after decrementing=%p\n", (void *)var);
    printf("address ++var=%p\n", (void *)++var);
    printf("address --var=%p\n", (void *)--var);
    printf("address var=%p\n", (void *)var);
    printf("value of * var=%d\n", *var);
    printf("value of *(var++)=%d\n", *(var++));
    printf("value of *(--var) = %d\n", *(--var));
    printf("address var=%p\n", (void *)var);
    printf("value of * var=%d\n", *var);
    return 0;
}
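Result of the program
(The addresses are illustrative and vary from machine to machine. The value printed immediately after *var++ is read from beyond a and is unpredictable.)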
value of * var=100
value of (*var)++=100
value of * var=101
address var=9106
value of *var++=101
value of * var=9108
address var=9108
original address var again=9106
value of * var=101
value of ++(*var)=102
address var=9106
address var++=9106
address var=9108
address var after decrementing=9106
address ++var=9108
address --var=9106
address var=9106
value of * var=102
value of *(var++)=102
value of *(--var) = 102
address var=9106
value of * var=102
Pointer Comparison
The addresses or pointers can be stepped up or stepped down. For example,
float *fp;
float f;
fp = &f;
fp++;
If the original address held in fp was 1000, fp++ will take the pointer to 1004,
since fp is a float pointer and a float occupies 4 bytes here.
What happens when fp = fp + 2; is executed? The pointer skips 2 float
locations, i.e., 8 bytes. You can verify this for yourself.
Similarly, fp = fp - 4; is also valid, provided the result still lies within
the same array. Two pointers of the same type can also be compared. Assuming that
we also declare float *fp1, we can compare the two pointers with if statements
as given below:
if (fp == fp1) ...
or if (fp < fp1) ...
Thus, when we say we are comparing pointers, we are comparing their
addresses. Comparisons with < and > are meaningful only when both pointers point
into the same array.
Call by Reference
You have been calling functions and passing actual values to them. When you call
functions, you pass actual arguments as per the list provided in the declaration.
These are all values which are passed to a function and this method is called call
by value. In call by value, you can return only one value from a function and
therefore it puts restrictions on the usage of functions. Global variables help by
facilitating the indirect return of more than one value but their excessive usage
reduces understanding of the program and increases the chance of errors in coding.
This can be overcome by using call by reference, where any number of values can
be indirectly returned without taking the help of global variables. Call by reference
can be implemented by using pointers.
Assume that you want to pass a and b to a function, divide a by b and return
both the quotient and the remainder to the main function. It is not possible to do
this by call by value. Call by value can return only a single value, either the quotient
or the remainder but not both. This can be achieved by call by reference as Example
6.11 demonstrates:
/* Example 6.11 */
/* to demonstrate function call by reference */
#include <stdio.h>
int main()
{
    int a=100, b=13;
    void div(int *p, int *q); /* indicates call by reference */
    div(&a, &b); /* addresses of a and b are passed */
    printf("quotient= %d remainder= %d\n", a, b);
    return 0;
}
/* function definition */
void div(int *px, int *py) /* function declarator */
{
    int temp1, temp2;
    temp1=*px;
    temp2=*py;
    *px=temp1/temp2;
    *py=temp1%temp2;
}
Result of the program
quotient= 7 remainder= 9
How Does the Program Work?
In the declaration part, you have declared div as a function passing two pointer
variables and getting back void or none. You call div(&a, &b). You do not
pass the values but reference to the values. You actually pass the address of a and
b to function div.
The function div receives the reference, i.e., addresses of a & b.
px = &a;
py = &b;
px points to a and py points to b. Hence, *px gets the value of a and
*py gets the value of b. Now, the contents of *px and *py, i.e., a and b are
copied to temp1 and temp2.
temp1 is divided by temp2 and the result is placed in *px, whose
address is known to both the main function and the function div. In the main function, the
address corresponds to a and in div it corresponds to *px. Therefore,
the quotient is returned to the main function indirectly. Similarly,
*py receives the remainder, which is stored in b through the reference. Thus, the
values of the quotient and remainder are stored in locations &a and &b and you
have indirectly returned 2 values to the main function through call by reference. The concept,
though it may seem hazy at this point, will become clear as you see more examples.
Some more points are to be noted carefully.
The function declaration indicates that there is a function div, which returns
nothing. It receives two pointers to integers. The pointers are declared as int *p
and int *q. You will notice that p and q are not declared as variables in the main function;
they have no significance except to indicate that the parameters are pointers
to integers. They also indicate that the addresses of two integers are to be passed
while calling the function.
6.2.2 Passing Pointer to a Function
The function call using pointers is known as call by reference. Here, the address of
the variable is passed. Note that calling by reference has to be indicated in the
function declaration, such as this:
fun (int *p, char *cp, float *fp, int *array);
This is an indication that the function is to be called by reference for those
parameters which are pointers. A mixed declaration could also be used as given
below:
fun1(int a, char *cp);
Here, you are indicating that an integer is passed by value and a character
variable is passed by reference. In the above example, the second parameter can
receive the address of either a single character or a character array (string), whereas
the first parameter can receive only one integer value.
The function declarator above the function body has to match the declaration.
Hence, when fun1 is called, an integer followed by an address of character will
be passed. In the function fun1, you may have a declarator as follows:
fun1(int d, char * ch)
Here, the value of the integer variable will automatically get assigned to d and
the ch will be assigned the address of the character variable. This means both ch
and the address of the character variable in the calling function will point to the same
location. Thus, both in the calling function and in the called function, the variable is
accessible, although under different names. Any modification made to the character
variable either in the calling function or the called function affects both.
While calling by reference, you have to pass the address of the variable. If
the variable is declared as an ordinary variable, such as int a, then you have to pass &a. If it
is declared as a pointer to, say, an integer, such as int *ip, then ip itself has to be
passed.
Return by Reference
So far, we have seen functions returning only a value, irrespective of whether they
were called by value or by reference. Functions can also return references or
pointers.
Whenever a function is to return a pointer, this has to be indicated by the
following:
Function Declaration
Declare the function as returning pointers. For instance,
int * fun1() ;
char * fun2() ;
float * fun3 (int a, float * b, char * c) ;
The difference is the insertion of the pointer symbol (*) between the return datatype
and the function name.
Function Declarator
The function declarator will be in the same format as the prototype or function
declaration. For instance,
float * fun3 (int a, float * b, char * c)
to match the third function declaration in the above example.
Function Call
You may call,
fun3 (x, &y, z) ;
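Here, x is an integer variable and &y is the address of a float. Although z looks like a value, it
must in fact be a pointer to character (for example, the name of a character array), since the prototype has the declaration char *c.
Return Statements
The function must return an address, i.e., a pointer of the declared type. The address should refer to
an object that still exists after the function returns, and not to a local (automatic) variable of the function.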
Pointer and One-Dimensional Arrays
At this point, you must note another way of representing the elements of an array.
Suppose a pointer p2 holds the address of the 0th element of an integer array. The
0th element is then at address p2 or p2 + 0. The element with subscript 1 is one
location above, at p2 + 1. Thus, the element with subscript n is at address p2
+ n. Once the address is known, the value follows by placing * before it: the
integer stored at the nth location is *(p2 + n), just as the value at address (p2 + 0)
is *(p2 + 0) or *p2. This notation is, therefore, quite handy.
Pointer to Array
If you now declare
int a[5];
int *ip;
ip = &a[0];
You have defined an array a of 5 integers. When you
assign the address of a[0], i.e., the 0th element of a, to ip, ip will point to the
array. The addresses of the other elements follow from the datatype of the array
and the size occupied by each element.
The following example also explains the concept of a function returning a pointer.
/* Example 6.12 to find the greatest number in an array */
#include <stdio.h>
int main()
{
    int array[]= {8, 45, 5, 131, 2};
    int size=5, *max;
    int *fung(int *p1, int size); /* function returns pointer to int */
    max=fung(&array[0], size); /* max is a pointer */
    printf("%d\n", *max);
    return 0;
}
int *fung(int *p2, int size)
{
    /* start with the 0th element as the maximum, so that i is always
       valid and arrays of negative numbers are handled too */
    int i=0, j, maxp=*p2;
    for (j=1; j<size; j++)
    {
        if (*(p2+j) > maxp)
        {
            maxp=*(p2+j);
            i=j;
        }
    }
    return (p2+i); /* pointer returned */
}

Result of the program
131
The example involved arrays and pointers. The usage of a pointer made the
passing of an array to a function a simple task. You define array as an array of integers.
In order to pass an address, the prototype was defined with a pointer
argument int *p1. This means an address will be passed to the function while
calling it. No distinction is made between the address of a simple integer variable and that of an
integer array: int *p1 can point to a single integer or to the 0th element of an array. This is
possible because arrays are stored contiguously in memory. If the address of the
0th element and the datatype are known, it is not difficult to calculate
the address of any element in the array. Therefore, it is enough if the address of the
0th element is passed to a function, together with the number of elements. p1 holds the address of a variable or of
the 0th element of an array. Note the flexibility of arrays in ‘C’ and how intelligently
the language uses pointers.
The called function returns the address of the greatest number in the array.
Look at the function declaration. The function returning a pointer is indicated by
the following (a star mark between the return datatype and function name):
int * fung(...)
The address of array[0] is received by fung() and stored in p2.
In other words, p2 points to the 0th location of the array.
While calling a function, the address has to be passed and this is achieved in
the above example by passing &array[0].
In the called function, p2 is treated as an array without any additional
effort. By adding the index to p2, you get the addresses of the various elements in
the array. The value is obtained by placing * before the address.
The if statement compares maxp with *(p2 + j), i.e., p2[j]
or array[j]. Whenever a larger value is found, it is copied to maxp and its index
is saved in i. At the end of the iterations, *(p2 + i), which is stored at location
p2 + i, is the maximum value in the
array and therefore p2 + i, an address or reference, is returned to the calling function.
In this example, (p2 + 3) is the address of the greatest number in the array.
After the return from the function, max gets the value of p2 + i. In the printf,
*max, which is the value stored at the address held in max, is printed. We have already defined max
as a pointer to an integer in the main function. Thus, the function fung
returns the address of a value, i.e., a pointer, and the main function retrieves the value
by dereferencing that pointer.
Now, look at another example using arrays and pointers, as given below:
/* Example 6.13 passing an array of integers to function -
method 2 */
#include <stdio.h>
int main()
{
    int array[]={10, 20, 30, 40, 50};
    int *a;
    void pass(int *a, int k);
    a=&array[0];
    pass(a, 4);
    return 0;
}
void pass(int *b, int j)
{
    int k=0;
    while (k <= j)
    {
        (*b)=(*b)/2;
        printf("value %d @ address %p\n", *b, (void *)b);
        k++;
        b++;
    }
}

Result of the program
value 5 @ address 8694
value 10 @ address 8696
value 15 @ address 8698
value 20 @ address 8700
value 25 @ address 8702
Here, a has been declared as a pointer to integer. The variable a has
been assigned the address of the 0th element of the array. Now, the function call is
made using pass(a, 4); here a holds the address of array[0] and 4 is the highest subscript.
In the function pass, b receives the address held in a and the next element of
the array is accessed each time by incrementing b. Note that the values
of the elements in the array are divided by 2 in the called function. The addresses
printed will differ from machine to machine. As you become
familiar with pointers, you can write such programs very easily.
6.2.3 Pointers and Strings
A string is an array of characters. Since we have seen an array
of numbers, we may want to represent a string in the same manner. We can initialize
a character array as given below:
char str[] = { 'c', 'h', 'a', 'r', '\0' };
Note the NULL character '\0' at the end. When the characters are listed one by one, it is not
inserted automatically and has to be written explicitly; it is added automatically only when a
string constant in double quotes is used. Strings can therefore be written more easily as given
below:
char *str = "char";
The compiler accepts this initialization because a string constant yields the address of its 0th character. str is
the address of the 0th location and str[0] is the character stored there. Since the character array is stored contiguously,
there is no need to know the address of each element. A program to find the length
of a string is given below:
/* Example 6.14 to find the length of a string */
#include <stdio.h>
int main()
{
    int wlength;
    char *wp = "shri durgaya namaha";
    int wlen(char *wp);
    wlength=wlen(wp);
    printf("length of the word=%d",wlength);
    return 0;
}
/* function to find length */
int wlen(char *w)
{
    int n;
    for (n=0; *w!='\0'; w++)
        n++;
    return n;
}
Result of program
length of the word=19
We have a library function strlen() to find the length of the string. This
program carries out the same task although strlen() could be used to find the
length of any string. The program serves to illustrate the concept of strings and
pointers easily. It prints the length of the string in terms of the number of characters.
The white spaces in between will also be counted.
We have initialized wp while declaring it as a pointer to a character. A
string constant such as "shri durgaya namaha" is stored by the compiler as an array of
characters and yields the address of its 0th character, so wp is initialized with that
address. The string constant itself should not be modified through wp.
Note the statements in the function. The address w as well as n are incremented
as long as *w is not NULL ('\0'). NULL indicates the end of the string since it is
appended to every string constant. When w points to NULL, there will be no further
increment of w or n. Therefore, n indicates the length of the string wp; n is
returned to the main function. You will notice that just by incrementing the address,
we can scan the string.
The same program to find length of string is modified and shown below:
/* Example 6.15 - alternate method
to find the length of a string */
#include <stdio.h>
int main()
{
    int wlength;
    char *wp = "shri durgaya namaha";
    int wlen(char *wp);
    wlength=wlen(wp);
    printf("length of the word=%d",wlength);
    return 0;
}
/* function to find length */
int wlen(char *w)
{
    char *p = w;
    while (*p!='\0')
        p++;
    return (int)(p-w); /* difference of two addresses, in characters */
}
You get the same result.
Here, the address w is assigned to p through the declaration
statement in the called function. Hence, p also points to the same string, that is, p
points to the first character in the word. When p points to the first character or 0th
element, p is incremented to the 1st location. When p points to the (n-1)th element,
p is incremented to the nth location. Here, only p is incremented, not w, which continues
to point to the 0th location. Therefore, at the time of termination of the while loop, p
will point to the location corresponding to NULL. Thus p - w, i.e., the difference
in the addresses, which is equal to the number of characters in wp, is returned to the
main function and printed there.
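/* Example 6.16 gets substring beginning with
specified character position */
#include <stdio.h>
#include <string.h>
void substring(char *str,char *substr,int len);
int main()
{
    char text[80],substr[20];
    int len,pos;
    printf("Enter any Text :");
    if (fgets(text, sizeof text, stdin) == NULL) /* safe replacement for gets() */
        return 1;
    text[strcspn(text, "\n")] = '\0'; /* remove the newline kept by fgets */
    printf("Enter the Length of Substring Required :");
    if (scanf("%d",&len) != 1 || len < 0)
        return 1;
    if (len > (int)sizeof substr - 1)
        len = (int)sizeof substr - 1; /* substr holds at most 19 characters */
    printf("Enter the Position From which Required :");
    if (scanf("%d",&pos) != 1 || pos < 1 || pos > (int)strlen(text))
        return 1; /* the position must lie inside text */
    substring(text+(pos-1),substr,len);
    printf("Substring Is %s \n",substr);
    return 0;
}
void substring(char *str,char *substr,int len)
{
    int cnt=0;
    while(*str && cnt<len)
    {
        *(substr++)=*(str++);
        cnt++;
    }
    *(substr)='\0';
}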
Result of program
Enter any Text :This is a program to get a substring
Enter the Length of Substring Required :7
Enter the Position From which Required :11
Substring Is program

Try to understand how the program works. The following statement, which
is a call to the function, needs explanation.
substring(text+(pos-1),substr,len);
Usually we count the first character of a word as the 1st position, whereas
C will recognize it as the 0th position. text is the address of the 0th position
of the text. Therefore, text+(pos-1) is the address of the character in text
from which the substring starts.
Although substr is empty, we are passing the address of the 0th location of
the substring as the next argument. We will store the substring in substr, which is
empty when it is passed to the called function.
The third argument is the length of the substring.
Now look at the called function:
In the while loop we carry out the following operations:
• Copy *str to *substr, i.e., from the starting position indicated by the
user, copy one character of text to substring
• Increment str
• Increment substr
• Increment local counter cnt
Copying will terminate either on reaching the end of str or when the
required number of characters have been copied. A NULL character is then placed at the end of substr.
Thus, the program works correctly.
Note the following feature:
Since the strings are passed by reference, there is no need to return the
addresses or values, as the addresses of the strings are known to the main function.

Check Your Progress


1. What is a pointer?
2. How does the use of pointers economize memory space?
3. How is call by reference implemented?

6.3 ANSWERS TO CHECK YOUR PROGRESS QUESTIONS

1. A pointer is a variable that contains the address of another variable.
2. Pointers contain the addresses of variables. An address is allocated
at runtime for each variable and is retained till program execution is
completed. Thus, the entire memory is not used at a time and only that part of
memory is used that is required for execution.
3. The function call using pointers is known as call by reference. Here, the
address of the variable is passed. Calling by reference has to be indicated
in the function declaration, such as this:
fun (int *p, char *cp, float *fp, int *array);

6.4 SUMMARY

• A pointer is a variable that contains the address of another variable.
• A pointer to void can be declared as you declare any other pointer, such
as pointer to integer, float, etc.
• The concept of NULL pointer is different from the above concept of void
pointer. A NULL pointer can be of any pointer type and generally
has the value zero. This is, however, not mandatory. A
NULL pointer does not point to any valid memory address.
• The function call using pointers is known as call by reference. Here, the
address of the variable is passed.

6.5 KEY WORDS

• Pointer: A pointer is a variable that contains the address of another variable.
• Void pointer: It is a special type of pointer that can be pointed at objects
of any data type.

6.6 SELF ASSESSMENT QUESTIONS AND EXERCISES

Short-Answer Questions
1. What is the data type of pointers?
2. What is the main limitation of call by value? How is call by reference better?
Write in brief.
3. How are one-dimensional and two-dimensional arrays represented using
pointers?
Long-Answer Questions
1. What is printed in each of the print statements given below if they are
executed one after another? After evaluation, confirm your answers by trying
them in a program.
Declaration
* np = 10 ;
printf statement arguments
(a) np
(b) np++
(c) np
(d) np--
(e) *np
(f) *(np--)
(g) np
(h) *np
(i) * (++ np)
(j) ++np
(k) np++
(l) np = np - 2
(m) *np
(n) (* np) ++
(o) np
2. What does the following program do?
/* Exercise1*/
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int leng(char *str1,char *str2);
int main()
{
char *str1,*str2;
str1=(char *)(malloc(20));
str2=(char *)(malloc(20));
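printf("Enter String 1:");
fgets(str1, 20, stdin); /* reads at most 19 characters */
str1[strcspn(str1, "\n")] = '\0';
printf("Enter String 2:");
fgets(str2, 20, stdin);
str2[strcspn(str2, "\n")] = '\0';
printf("Length Is %d\n",leng(str1,str2));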
}
int leng(char *str1,char *str2)
{

int llfound,cnt,i,j,p,q,lg,lr;
p=strlen(str1);
q=strlen(str2);
if(p>q)
lg=p,lr=q;
else
lg=q,lr=p;
cnt=0;
llfound=0;
if(p>q)
{
for(i=0;i<lg&&llfound==0;i++)
for(j=0;j<lr;j++)
{
if(*(str1+i+j)==*(str2+j))
llfound=1;
if(llfound)
cnt++;
}
}
else
{
for(i=0;i<lg&&llfound==0;i++)
for(j=0;j<lr;j++)
{
if(*(str2+i+j)==*(str1+j))
llfound=1;
if(llfound)
cnt++;
}
}
return(cnt);
}
3. Describe the following with examples:
(a) Pointers and two-dimensional arrays
(b) Advantages and operation of:
(i) malloc
(ii) calloc
(c) Call by reference and similarity with global variables

BLOCK - III
STRUCTURES, UNIONS AND FILES

UNIT 7 STRUCTURES AND UNION
Structure
7.0 Introduction
7.1 Objectives
7.2 Structures
7.2.1 Processing a Structure
7.2.2 Array of Structures
7.2.3 Structure Elements Passing to Functions
7.2.4 Structure passing to Functions
7.2.5 Structure Within Structure
7.2.6 Structure Containing Arrays
7.2.7 Pointers to Structures
7.3 Union Definition
7.4 Enumerated Data Types
7.5 Answers to Check Your Progress Questions
7.6 Summary
7.7 Key Words
7.8 Self Assessment Questions and Exercises
7.9 Further Readings

7.0 INTRODUCTION

In this unit, you will learn about structures and union. A structure is a user-defined
data type like an array. While an array contains elements of the same data type, a
structure contains members of varying data types. A structure declaration ends
with a semicolon and the keyword is struct. The tag for structure is optional.
Pointers to structures are similar to other pointers. Structures can be passed by
reference and can be nested.
Unions help in conserving memory as only one of the many variables will be
used at a time. The syntax of a union is similar to that of structure. The unit also
discusses enumerated data types.

7.1 OBJECTIVES
After going through this unit, you will be able to:
• Explain the basic concept of structure
• Declare a structure
• Process a structure
• Explain structures and their relation to pointers
• Define a union
• Explain enumerated data types
• Define your own types with typedef

7.2 STRUCTURES
A structure is synonymous with a record. A structure, similar to a record, contains
a number of fields or variables. The variables can be of any of the valid data types.
In order to use structures, you have to first define a unique structure. The definition
of the record book, which you call a structure, is as follows:
struct book
{
char title [25];
char author [15];
char publisher [25];
float price ;
unsigned year;
};
struct is a keyword or a reserved word of ‘C’. It is followed by a structure tag or name,
which is book in this case. The tag is not a must, but giving a tag to a structure
improves the reader’s understanding and allows variables of that type to be declared
later. The beginning of the structure is indicated by
an opening brace. Thereafter, the fields of the record or data elements are declared
one by one. The variables or fields declared are also called members of the structure.
Unlike an array, a structure can consist of data elements of different types.
Let us now look at the members of struct book.
The title of the book is declared as a character array of size 25, which can hold a
string of up to 24 characters followed by the terminating NULL; similarly, the
author and publisher are arrays of characters of the specified
size. The price is defined as a float to take care of the fractional part of the
currency. The year is defined as an unsigned integer.
Note carefully the appearance of a semicolon after the closing brace. This is
unlike other blocks. The semicolon indicates the end of the declaration of the
structure. Thus, you have to understand the following when you want to declare a
structure:
(a) struct is the header of a structure definition.
(b) It can be followed by an optional name for the structure.
(c) Then the members of the structure are declared one by one within a
block.
(d) The block starts with an opening brace but ends with a closing brace
followed by a semicolon.
(e) The members can be of any data type.
(f) The members have a relation with the structure; i.e., all of them belong
to the defined structure and they have identity only as members of the
structure and not otherwise. Therefore, if you try to assign a name to
author by itself, it will not be accepted. You can only assign values to a member of a
structure variable, such as s1.author, where s1 is a variable of type struct book.
Declaration
The structure definition given above is similar to the prototype in a function in so
far as memory allocation is considered. The system does not allocate memory as
soon as it finds the structure definition, which is for information and checking
consistency later on. The allocation of memory takes place only when structure
variables are declared. What is a structure variable? It is similar to other variables.
For example, int I means that I is an integer variable. Similarly, the following
is a structure variable declaration:
struct book s1;
Here s1 is a variable of type structure book. Suppose, you define,
struct book s1, s2 ;
This means that there are two variables s1 and s2 of type struct book.
These variables can hold different values for their members.
Another point to be noted is that the structure definition must appear before the
declaration of any variable of that structure type. An example which does nothing but define a structure and
declare structure variables is as follows:
int main()
{
    struct book
    {
        char title [25];
        char author [15];
        char publisher [25];
        float price;
        unsigned year;
    };
    struct book s1, s2, s3;
    return 0;
}
If you want to define a large number of books, then how will you modify the
structure variable declaration? It will be as follows:
struct book s[1000];
This will allocate space for storing 1000 structures or records of books.
However, how much storage space will be allocated for each element of the array?
It will be at least the sum of the storage spaces required for the members. In struct book,
the storage space required will be as follows:
title 25 (including the NULL that indicates the end of the string)
author 15
publisher 25
price 4
year 2
The sizes of price and year shown are those of a 16-bit compiler; on most present-day
systems an unsigned integer occupies 4 bytes, and the compiler may add a few padding bytes
between members. The exact size can be obtained with sizeof(struct book).
Therefore, the system allots space for 1000 structure variables, each with
the above requirement. Space is allocated only after seeing the structure variable
declaration.
Look at another example to make the concept clear. You know that the
bank account of each account holder is a record. Now, define a structure for it.
struct account
{
unsigned number;
char name [15];
int balance ;
} a1, a2;
Instead of declaring separate structure variables, such as struct account
a1, a2; we can use coding as in the example given above. Here, the variables
are declared just after the closing brace of the structure declaration and terminated
with a semicolon. This is perfectly correct. The declaration of the members of the
structure is clear; the balance has been declared as an integer instead of a float
to make it simple. This means that the minimum transaction is a rupee.
7.2.1 Processing a Structure
The structure variable declaration is of no use unless the variables are assigned
values. Here, each member has to be specifically accessed for each structure
variable. For example, to assign the account number for variable a1, you have to
specify as follows:
a1. number = 0001 ;
There is a dot operator between the structure variable name and the member
name or tag.
Suppose, you then want to assign account no.2 to a2, it can be assigned as
follows:
a2. number = 2;
If you want to know the address where a2.number is stored, you can use
printf ("%p", (void *) &a2.number);
This is similar to other data types. The structure is a complex data type and
therefore, you have to indicate the structure variable to which the member
belongs; otherwise, it is not clear whether number refers to
a1, a2, a3, etc. Therefore, it is necessary to be specific. Assuming that you
want to get the value from the keyboard, you can use scanf() as follows:
scanf ("%u", &a1.number);
You can also assign initial values directly as follows:
struct account a1 = { 0001, "Vasu", 1000 };
struct account a2 = { 0002, "Ram", 1500 };
All the members are specified. This is similar to the declaration of initial
values for arrays. However, note the semicolon after the closing brace. The structure variable
a1 will, therefore, receive the values for the members in the order in which they
appear. Therefore, you must give the values in the right order and they will be
accepted automatically as follows:
a1.number = 0001
a1.name = Vasu
a1.balance = 1000
Note that a number written with leading zeros, such as 0001, is treated by C as an
octal constant. This makes no difference for 1 and 2, but 0008 would be rejected.
Note too that older (pre-ANSI) compilers allowed initial values to be assigned as above inside a function
only when the variables were declared static, which is why Example 7.1 uses static; ANSI C
permits it for ordinary local variables as well. If the variables are declared before main, they
will be treated as global variables.
Let us write a program to create a structure account, open 2 accounts with
initial deposits in the accounts. Deposit Rs 1000 to Vasu’s account, withdraw 500
from Ram’s account and print the balance. The following example demonstrates
the above.
/* Example 7.1 - structures */
#include <stdio.h>
int main()
{
    struct account
    {
        unsigned number;
        char name[15];
        int balance;
    };
    static struct account a1= {001, "VASU", 1000};
    static struct account a2= {002, "RAM", 2000};
    a1.balance+=1000;
    a2.balance-=500;
    printf("A/c No:=%u\t Name:=%s\t Balance:=%d\n",
           a1.number, a1.name, a1.balance);
    printf("A/c No:=%u\t Name:=%s\t balance:=%d\n",
           a2.number, a2.name, a2.balance);
    return 0;
}
Result of the program
A/c No:=1 Name:=VASU Balance:=2000
A/c No:=2 Name:=RAM balance:=1500
A simple program was written for a bank transaction. For a deposit, we
write
a1.balance = a1.balance + 1000;
which the program expresses with the shorthand a1.balance += 1000;
Therefore, the balance is updated. Similarly, when an amount is withdrawn,
the balance is adjusted. However, in practice, the user cannot write a program for
each deposit and withdrawal. You will soon develop a program which will not require
the user to do programming.
User-Defined Data Type
A structure is also a data type. It is not a basic type defined in the C language like
int, float, etc., but is defined by the programmer. Hence, it is called a
user-defined data type.
7.2.2 Array of Structures
Let us now create an array of structures for the account. This is nothing but an
array of accounts, declared with size. Let us restrict the size to 5. The records will
be created by using keyboard entry. The program is given below:
/* Example 7.2 - to demonstrate an array of structures */
#include <stdio.h>
int main()
{
    struct account
    {
        unsigned number;
        char name[15];
        int balance;
    } a[5];
    int i;
    for (i = 0; i <= 4; i++)
    {
        printf("A/c No:=\t Name:=\t Balance:=\n");
        /* %14s leaves room for the terminating '\0' in name[15] */
        scanf("%u%14s%d", &a[i].number, a[i].name,
              &a[i].balance);
    }
    for (i = 0; i <= 4; i++)
    {
        printf("A/c No:=%u\t Name:=%s\t Balance:=%d\n",
               a[i].number, a[i].name, a[i].balance);
    }
    return 0;
}
Result of the program
A/c No:= Name:= Balance:=
1 suresh 5000
A/c No:= Name:= Balance:=
2 Lesley 3000
A/c No:= Name:= Balance:=
3 Ahmed 5500
A/c No:= Name:= Balance:=
4 Lakshmi 10900
A/c No:= Name:= Balance:=
5 Thomas 29000
A/c No:=1 Name:=suresh Balance:=5000
A/c No:=2 Name:=Lesley Balance:=3000
A/c No:=3 Name:=Ahmed Balance:=5500
A/c No:=4 Name:=Lakshmi Balance:=10900
A/c No:=5 Name:=Thomas Balance:=29000
The structure array has been declared as a part of the structure declaration as
a[5]. The individual elements of the 5 accounts are scanned and printed in the
same order. Note that when you scan a name, you do not use the address operator &
but give the name of the member itself, as in a[i].name, since it is a character
array and its name already stands for the address of its first element. Remember
this difference. This program basically gets the 5 structures or records pertaining
to 5 account holders. Thereafter, the details of the 5 accounts are printed using the
for statement. The first half of the result was typed by the user and the last 5 lines
are the output of the program.
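Once the records are held in an array, they can be processed in a loop like any other array. The following is a minimal sketch; the function name total_balance is hypothetical and is not part of the original program.

```c
struct account
{
    unsigned number;
    char name[15];
    int balance;
};

/* Hypothetical helper: adds up the balance member of the first n records. */
long total_balance(const struct account a[], int n)
{
    long total = 0;
    int i;
    for (i = 0; i < n; i++)
        total += a[i].balance;
    return total;
}
```

For the five records entered in Example 7.2, total_balance(a, 5) would return 53400.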
7.2.3 Structure Elements Passing to Functions
Structures can be copied individually, member wise as well as at one go.
For example, let a3 and a1 be struct account. The following are
valid:
a3.number = a1.number;
a3.balance = a1.balance;
Here, the members of a1 are copied into a3, one by one.
You can also write a3 = a1; in which case all the elements of a1 are copied to
a3. The latter form can be used if all the elements are to be copied and the
former, if some members are to be copied selectively.
Note that structures cannot be compared as a whole; for example, if (a4 == a2)
is not a valid operation.
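Since == cannot be applied to whole structures, a comparison has to be written member by member. The sketch below assumes the account structure of this unit; the function name same_account is hypothetical.

```c
#include <string.h>

struct account
{
    unsigned number;
    char name[15];
    int balance;
};

/* Hypothetical helper: returns 1 when every member matches, 0 otherwise.
   The name member is a character array, so it is compared with strcmp(). */
int same_account(struct account p, struct account q)
{
    return p.number == q.number
        && strcmp(p.name, q.name) == 0
        && p.balance == q.balance;
}
```

In the same way, a character array member such as name cannot be copied with = on its own; when copying member-wise, strcpy() would be needed for it.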
Let us pass a structure into a function, element by element. This is implemented
and shown in Example 7.3.
/* Example 7.3 - passing structure element
   to function */
#include <stdio.h>
#include <string.h>
int main()
{
    struct account
    {
        unsigned number;
        char name[15];
        int balance;
    };
    static struct account a1 = {001, "VASU", 1000};
    int credit(unsigned a, char *n, int d);
    printf("A/c No:=%u\t Name:=%s\t Balance:=%d\n",
           a1.number, a1.name, a1.balance);
    a1.balance = credit(a1.number, a1.name, a1.balance);
    printf("A/c No:=%u\t Name:=%s\t new balance:=%d\n",
           a1.number, a1.name, a1.balance);
    return 0;
}
int credit(unsigned a, char *name, int b)
{
    int d;
    unsigned num;
    char client[15];   /* an array, so that scanf() has storage to write into */
    printf("Enter account number\n");
    scanf("%u", &num);
    if (a == num)
    {
        printf("Enter name in caps\n");
        scanf("%14s", client);
        if (strcmp(name, client) == 0)
        {
            printf("enter deposit made\n");
            scanf("%d", &d);
            b += d;
            return b;
        }
        else
        {
            printf("name does not match\n");
            return b;
        }
    }
    else
    {
        printf("account number does not match\n");
        return b;
    }
}
Result of the program
A/c No:=1 Name:=VASU Balance:=1000
Enter account number
1
Enter name in caps
VASU
enter deposit made
4600
A/c No:=1 Name:=VASU new balance:=5600
Now, look at the program carefully. A function prototype has been declared
in main() as given below:
int credit(unsigned a, char *n, int d);
Here, there is no reference to structure at all. A structure a1 is passed to
function credit by passing individual elements of a structure. In function credit,
these values are received. Then, the deposit is entered and added to the balance
after checking the correctness of the details of the account. The new balance is
returned to main and stored in a1.
7.2.4 Structure Passing to Functions
Passing each member of the structure is a tedious job. The entire structure can
instead be passed to the function making for easy handling. The above example
can be altered by passing an entire structure, as follows:
/* Example 7.4 - passing entire structure to function */
#include <stdio.h>
struct account
{
    unsigned number;
    char name[15];
    int balance;
};
int main()
{
    static struct account a1 = {001, "Vasu", 1000};
    struct account credit(struct account x);
    printf("A/c No:=%u\t Name:=%s\t Balance:=%d\n",
           a1.number, a1.name, a1.balance);
    a1 = credit(a1);
    printf("A/c No:=%u\t Name:=%s\t Balance:=%d\n",
           a1.number, a1.name, a1.balance);
    return 0;
}
struct account credit(struct account y)
{
    int x;
    printf("enter deposit made");
    scanf("%d", &x);
    y.balance += x;
    return y;
}
Result of the program
A/c No:=1 Name:=Vasu Balance:=1000
enter deposit made 6700
A/c No:=1 Name:=Vasu Balance:=7700
If you want to pass a structure, the called function should also know the
structure type and hence, the structure has to be declared before the main function.
Therefore, structure account has been declared globally. The function
credit is declared with a structure as its return data type as follows:
struct account credit(struct account x);
Thus, you pass and return struct account. Then, credit is called
by simply passing structure a1. In the called function, the deposit is added to the
balance. The updated structure is returned to main(), where the updated record
is printed.
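When a whole structure is passed in this way, the called function works on a copy, which is why Example 7.4 has to return the structure and assign it back to a1. The following sketch illustrates the point; credit_copy and credit_return are hypothetical names used only here.

```c
struct account
{
    unsigned number;
    char name[15];
    int balance;
};

/* Works on a private copy; the caller's record is left untouched. */
void credit_copy(struct account y, int amount)
{
    y.balance += amount;
}

/* Returns the updated copy so that the caller can store it. */
struct account credit_return(struct account y, int amount)
{
    y.balance += amount;
    return y;
}
```

After credit_copy(a1, 6700) the balance of a1 is unchanged, whereas a1 = credit_return(a1, 6700) updates it.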
7.2.5 Structure Within Structure

You have been nesting if statements and loops so far. You can now create structures
within structures. Here, a structure defined earlier can become a member of another
structure. For example, you can create a structure called deposit using other data
types and structure account. The declaration of the basic structure should precede
the desired structure as follows:
struct account
{
unsigned number;
char name[15];
int balance;
};
struct deposit
{
struct account ac ;
int amount;
int years;
};
You can write a program to demonstrate the concept.
/* Example 7.5 - structure within structure */
#include <stdio.h>
int main()
{
    struct account
    {
        unsigned number;
        char name[15];
        int balance;
    };
    struct deposit
    {
        struct account ac;
        unsigned amount;
        int years;
    } d2;
    /* the inner braces group the three members of ac */
    static struct deposit d1 = {{001, "VASU", 1000}, 50000, 3};
    d2 = d1;   /* structure copy */
    printf("A/c No:=%u\t Name:=%s\t Balance:=%d\tdeposit=%u\tterm=%d\n",
           d2.ac.number, d2.ac.name, d2.ac.balance, d2.amount,
           d2.years);
    return 0;
}
Result of the program
A/c No:=1 Name:=VASU Balance:=1000 deposit=50000 term=3
You have created the structure account. Then, you have created another
structure deposit. In the deposit structure, struct account ac is one of
the members and 2 more members, amount and years have been declared.
Next structure deposit d1 is initialized. The first 3 elements pertain to
the members of struct account and the last two for amount and years,
respectively.
Now d1 is copied to d2 in a simple manner and the deposit details of d2
are printed.
Whenever members of included structures are accessed, you will find two
dots, instead of the usual one dot. This is essential since d1.name is invalid.
Since name is in ac, you have to access it as d1.ac.name.
However, nesting can be up to any level. You can create one more level of
nesting as shown:
struct account
{
unsigned number ;
char name [15];
int balance ;
};
struct deposit
{
struct account ac;
int amount ;
int years ;
};
struct loan
{
struct deposit dep ;
int amount;
char date [10];
};
You see that struct deposit is included as a member of loan.
Let us declare:
struct loan loan1;
Now, to access loan amount, we have to specify:
loan1.amount
To access deposit amount, we have to specify:
loan1.dep.amount
To access the balance, we have to specify:
loan1.dep.ac.balance
Therefore, usage of the same member name amount in both structures has not resulted
in a conflict, since each member name is resolved within the structure that contains
it. Thus, structures can be used within structures without difficulty.
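The three levels of access can be tried out with a short sketch. The structures are those declared above; the function net_position and the values used with it are hypothetical and serve only to exercise the dot notation.

```c
struct account
{
    unsigned number;
    char name[15];
    int balance;
};
struct deposit
{
    struct account ac;
    int amount;
    int years;
};
struct loan
{
    struct deposit dep;
    int amount;
    char date[10];
};

/* Hypothetical helper: balance plus deposit amount minus loan amount.
   Note that l.dep.amount and l.amount are two different members. */
int net_position(struct loan l)
{
    return l.dep.ac.balance + l.dep.amount - l.amount;
}
```

With a balance of 1000, a deposit of 50000 and a loan of 20000, net_position would return 31000.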
7.2.6 Structure Containing Arrays
You saw some members of structures being arrays. You used them to represent an
array of characters. You can also have integer, float arrays as members of
structures. For example,
struct fixed_deposit
{
unsigned Ac_Number;
char name[25];
double deposit[4];
}rama;
Here, you have defined a structure fixed_deposit and declared a
variable rama of the same type. In the structure definition, you have a member
deposit as an array of double of size 4. This means, you can store 4 deposits of
rama. This is called array within structure.
You can access the first deposit of rama by the following:
rama.deposit[0]
The last deposit of rama can be accessed by:
rama.deposit[3]
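An array member can be processed with an ordinary loop over its index. The sketch below assumes the fixed_deposit structure shown above; the function total_deposit is a hypothetical helper.

```c
struct fixed_deposit
{
    unsigned Ac_Number;
    char name[25];
    double deposit[4];
};

/* Hypothetical helper: adds the four deposits held in the array member. */
double total_deposit(struct fixed_deposit fd)
{
    double sum = 0.0;
    int i;
    for (i = 0; i < 4; i++)
        sum += fd.deposit[i];
    return sum;
}
```

If the four deposits of rama were 1000, 2000, 3000 and 4000, total_deposit(rama) would return 10000.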

Application of structures
Structures, as you know, can be used for database management. They can be
used in several applications, such as libraries, departmental stores, banking, etc.
Structures are also used in C++. The syntax of structures is similar to classes in
C++. Structures contain data but classes contain data and functions.
Structures are also used in a variety of other applications, such as:
• Graphics
• Formatting floppy discs
• Mouse movement
• Payroll
Thus, structure is a very useful construct.
7.2.7 Pointers to Structures
You know how to declare pointers to various data types and arrays. Similarly,
pointers to structures can also be declared. For instance, in the case of the structure
account, you can declare a pointer as follows:
struct account a1 = { 1, "Vasu", 1000 };
struct account *sp;
sp = &a1;
Now, sp is a pointer to structure. Therefore, if you regard a structure as
another basic data type, declaring it as an array, declaring a pointer to it, etc.
follow the same rules. A structure is, in fact, a user-defined data type.
Access to individual elements of a structure through a pointer is similar, but
instead of the dot, we use the arrow operator ->. The arrow operator is formed
by typing a minus sign followed by the greater-than sign, with no space between
them. On the left of the arrow operator, there must be a pointer to a structure.
The following example will clarify the point:
/* Example 7.6 - structure pointers */
#include <stdio.h>
int main()
{
    struct account
    {
        unsigned number;
        char name[15];
        int balance;
    } a5;
    static struct account a1 = {001, "VASU", 1000};
    struct account *sp;
    sp = &a1;
    printf("A/c No:=%u\t Name:=%s\t Balance:=%d\n",
           sp->number, sp->name, sp->balance);
    a5 = *sp;
    printf("A/c No:=%u\t Name:=%s\t Balance:=%d\n",
           a5.number, a5.name, a5.balance);
    return 0;
}
Result of the program
A/c No:=1 Name:=VASU Balance:=1000
A/c No:=1 Name:=VASU Balance:=1000
In this program, sp is a pointer to structure account and therefore,
sp is assigned the address of structure a1. Then, the contents of structure *sp are
printed. The elements of *sp are copied to a5 and then printed (to demonstrate
copying of structures). Note the difference between the notations when accessing
elements of a structure and a structure pointer.
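The arrow notation is a shorthand: sp->balance means the same as (*sp).balance, where the parentheses are needed because the dot binds more tightly than *. A minimal sketch with two hypothetical helper functions shows both forms:

```c
struct account
{
    unsigned number;
    char name[15];
    int balance;
};

/* Access through the arrow operator. */
int balance_arrow(struct account *sp)
{
    return sp->balance;
}

/* The same access written with * and the dot operator. */
int balance_star(struct account *sp)
{
    return (*sp).balance;
}
```

Both functions return the same value for the same pointer.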

Passing Structure by Reference

Structures can be passed by reference, that is, by passing a pointer to the
structure. Remember, however, that the structure type should be declared globally
so that the called function also knows it. The following program passes a structure
by reference, and the withdrawal of money is processed in the called function.
/* Example 7.7 - structure pointers & functions */
#include <stdio.h>
struct account
{
    unsigned number;
    char name[15];
    int balance;
};
int main()
{
    static struct account a1 = {001, "VASU", 1000};
    struct account debit(struct account *sp, int y);
    int deb;
    printf("Enter amount to be withdrawn");
    scanf("%d", &deb);
    debit(&a1, deb);
    printf("A/c No:=%u\t Name:=%s\t Balance:=%d\n",
           a1.number, a1.name, a1.balance);
    return 0;
}
struct account debit(struct account *x, int y)
{
    x->balance -= y;
    return *x;
}
Result of the program
Enter amount to be withdrawn299
A/c No:=1 Name:=VASU Balance:=701
Note that the address of a1 is passed. The prototype declares passing the
structure by reference and the amount of debit by value. In the called function,
the debit is adjusted directly in a1 through the pointer, so the structure returned
by debit need not be used in main. Note the pointer notation in subtracting the
debit amount from the balance.
Now, look at some more examples to understand structure pointers, but
before that you must know how to allocate memory dynamically for
structures.
Let us say, struct cycle *cp; then to allocate memory dynamically,
we can write:
cp = (struct cycle *) malloc(sizeof(struct cycle) * n);
where n is the number of structure variables of type struct cycle. The function
malloc() is declared in <stdlib.h>.
In the above, you have treated a structure like any other data type. The size
of a structure is not fixed like that of a basic data type; it depends on its members.
However, you can use the sizeof operator to get the size of the structure.
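The allocation described above can be sketched as follows. The text does not define struct cycle, so the members gears and price are assumptions made purely for illustration, as is the function name make_cycles.

```c
#include <stdlib.h>

/* Hypothetical members; the text does not define struct cycle. */
struct cycle
{
    int gears;
    double price;
};

/* Allocates room for n records and clears them.
   Returns NULL if the memory could not be obtained.
   The caller releases the memory with free(). */
struct cycle *make_cycles(size_t n)
{
    size_t i;
    struct cycle *cp = (struct cycle *) malloc(sizeof(struct cycle) * n);
    if (cp == NULL)
        return NULL;   /* allocation failed */
    for (i = 0; i < n; i++)
    {
        cp[i].gears = 0;
        cp[i].price = 0.0;
    }
    return cp;
}
```

The records can then be reached either as cp[i].gears or as (cp + i)->gears, and the block should be released with free(cp) when it is no longer needed.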

Check Your Progress


1. What is the basic difference between a structure and an array?
2. What is the keyword to declare a structure?
3. What do you understand by members of a structure?
4. How is a structure declared?

7.3 UNION DEFINITION


A union is a variable that can hold, in one common storage area, values of different
data types and sizes at different points in time. Assume that a program, at different
points in time of execution, uses a double, a float, an integer and a string. In
the normal course, you would have to allocate memory space for each data type.
Assuming that you want to use only one of them at any time and you do not mind
losing the previous value, you can save memory space by declaring a common area
for storing them. If you use dedicated memory for each variable, much of that space
would remain unutilized during program execution. This common storage
area can be declared as a union as shown here:
union uname
{
double d;
float f;
char s[20]; /* an array member needs a size; 20 is chosen here for illustration */
int i;
} un1;
See the resemblance between structure declaration and union
declaration. The common properties are:
(i) They can have a name optionally, such as uname.
(ii) They can contain members of varying data types.
(iii) The declarations end with a semicolon.
(iv) Union members can be accessed in the same way as structure
members, as shown:
union_name.member or union_pointer->member
(v) A union can be assigned to another union of the same type, such as
un2 = un1;
where the contents of un1 are copied to un2.
However, the differences are:
(i) The memory size of a structure variable is at least the sum of the sizes of its
members (the compiler may add padding between members), whereas a union is
only large enough to hold its largest member.
(ii) It is the programmer’s responsibility to keep track of which member of a union
is currently in use, unlike in a structure, where no member is lost.
(iii) In structures, all members can be initialized, whereas a union can
only be initialized with a value of the type of its first member.
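The difference in storage can be checked with sizeof. The following is a minimal sketch; the names s_sel, u_sel and the three helper functions are hypothetical, and the exact sizes reported depend on the compiler and the machine.

```c
#include <stddef.h>

struct s_sel { int n; float f; double d; };
union  u_sel { int n; float f; double d; };

/* Returns 1 when the members n, f and d of the union start at the same address. */
int members_share_address(void)
{
    union u_sel u;
    return (void *) &u.n == (void *) &u.f
        && (void *) &u.f == (void *) &u.d;
}

/* Size of the structure: room for all three members. */
size_t struct_size(void)
{
    return sizeof(struct s_sel);
}

/* Size of the union: room for the largest member only. */
size_t union_size(void)
{
    return sizeof(union u_sel);
}
```

On a typical machine the union would be as large as a double, while the structure would be at least as large as an int, a float and a double together.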
A program using union to store either an int value or float value is
given as follows:
/* Example 7.8 - Demonstrate union */
#include <stdio.h>
union sel
{
    int n;
    float f;
};
int main()
{
    union sel m1;
    void printval(union sel *m1, char type);
    char type = 'p';
    char cont = 'y';
    while (cont == 'y')
    {
        printf("\nWant to enter Integer Or Float (i/f):");
        /* scanf() is used in place of the non-standard getche() of <conio.h> */
        scanf(" %c", &type);
        if (type == 'i')
        {
            printf("\nEnter Integer Value :");
            scanf("%d", &m1.n);
        }
        else
        {
            printf("\nEnter Float Value :");
            scanf("%f", &m1.f);
        }
        printval(&m1, type);
        printf("\nWant to continue- enter (y/n):");
        if (scanf(" %c", &cont) != 1)
            break;   /* stop when no more input is available */
    }
    return 0;
}
void printval(union sel *m1, char type)
{
    if (type == 'i')
        printf("Integer Value Is %d\n", m1->n);
    else
        printf("Float Value Is %f\n", m1->f);
}
Result of the program
Want to enter Integer Or Float (i/f):i
Enter Integer Value :456
Integer Value Is 456
Want to continue- enter (y/n):y
Want to enter Integer Or Float (i/f):3
Enter Float Value :300.30
Float Value Is 300.299988
Want to continue- enter (y/n):n
The union sel is declared before main, with two members int n
and float f. A function printval is also declared. Depending upon whether
the user wants an integer value or a float value, type is set to i or f. If type
is i, an integer value is received and if it is f, a float value is received. The value
received is printed in the function printval. Note carefully how the pointer to
union is declared in the function prototype and header. You have to use the keyword
union together with the union name whenever you pass a union to a function. The
parameter is declared as union sel *m1. Just as you would have declared int *i,
the type here is union sel.
If you have perused the result, you would find that the program asks for a
float value, although 3 has been typed instead of f. It is because of the design
of the program. It will ask for a float value when any character other than i is
typed. You can take this as an exercise to correct the program to ask for a float
value only when ‘f’ is typed.
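Since a union does not record which of its members was stored last, a common approach is to keep the type character together with the union inside a structure. The following is a sketch under that assumption; the structure tagged_sel and the three functions are hypothetical and are not part of Example 7.8.

```c
union sel
{
    int n;
    float f;
};

/* Hypothetical wrapper: type records which member of val is in use,
   'i' for the int member and 'f' for the float member. */
struct tagged_sel
{
    char type;
    union sel val;
};

/* Stores an int and marks the record accordingly. */
struct tagged_sel make_int(int n)
{
    struct tagged_sel t;
    t.type = 'i';
    t.val.n = n;
    return t;
}

/* Stores a float and marks the record accordingly. */
struct tagged_sel make_float(float f)
{
    struct tagged_sel t;
    t.type = 'f';
    t.val.f = f;
    return t;
}

/* Reads back only the member that was stored last. */
double as_double(struct tagged_sel t)
{
    if (t.type == 'i')
        return (double) t.val.n;
    return (double) t.val.f;
}
```

With this arrangement, the type travels with the value, so a function receiving the structure does not need a separate type argument.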
7.4 ENUMERATED DATA TYPES

An enumerated data type is a user-defined data type whose values are named integer
constants. The keyword for this data type is enum. You can define guardian as follows:
enum guardian
{
father,
husband,
guardian
};
Note that there are no semicolons inside the braces, but only a comma after each
member except the last. There is not even a comma after the last member. There is no
conflict between the two guardians. The top one is the name (tag) of the enum and the
bottom one is a member of enum guardian. See the similarity between structure, union
and enum. The enum variables can be declared as follows:
enum guardian emp1, emp2, emp3;

This is similar to structures and unions. The first part is the declaration of the data
type. The second part is the declaration of the variable.
The initial values can be assigned in a simpler manner as given below:
emp1 = husband;
emp2 = guardian;
emp3 = father;

You should assign only those names declared as part of the enum declaration. Using
a name that has not been declared will cause a compilation error. The compiler treats
the enumerators given within the braces as integer constants. The first enumerator
father will be treated as 0, husband as 1 and guardian as 2. Therefore, the values
strictly follow the natural order starting from 0.
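The numbering of the enumerators can be verified with a very small sketch. The function enumerator_value is a hypothetical helper that simply hands back the integer behind an enumerator.

```c
enum guardian
{
    father,
    husband,
    guardian
};

/* An enumerator is an integer constant, so it converts to int directly. */
int enumerator_value(enum guardian g)
{
    return g;
}
```

Here enumerator_value(father) gives 0, enumerator_value(husband) gives 1 and enumerator_value(guardian) gives 2.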
The enumerated data type is seldom used alone. It is usually used in conjunction
with other data types. You can write a program using enum and struct. It is as
follows:
/* Example 7.9
   enum within structure */
#include <stdio.h>
int main()
{
    enum guardian
    {
        father,
        husband,
        relative
    };
    struct employee
    {
        char *name;
        float basic;
        char *birthdate;
        enum guardian guard;
    } emp[2];
    int i;
    emp[0].name = "RAM";
    emp[0].basic = 20000.00;
    emp[0].birthdate = "19/11/1948";
    emp[0].guard = father;
    emp[1].name = "SITA";
    emp[1].basic = 12000.00;
    emp[1].birthdate = "19/11/1958";
    emp[1].guard = husband;
    for (i = 0; i < 2; i++)
    {
        if (emp[i].basic == 12000)
        {
            printf("Name:%s\nbirthdate:%s\nguardian: %d\n",
                   emp[i].name, emp[i].birthdate, emp[i].guard);
        }
    }
    return 0;
}
Result of the program
Name:SITA
birthdate:19/11/1958
guardian: 1

The program clearly assigns the relationships between the employee and the
guardian; enum guardian is the data type, and guard is a variable of
this type.
However, when you are printing emp [i]. guard, you are printing an
integer. Hence 0, 1 or 2 will only be printed for the status, and this is a limitation.
However, this can be overcome by modifying the program. The program modified
with a switch statement is given below:
/* Example 7.10 expanding enum */
#include <stdio.h>
int main()
{
    enum guardian
    {
        father,
        husband,
        relative
    };
    struct employee
    {
        char *name;
        float basic;
        char *birthdate;
        enum guardian guard;
    } emp[2];
    int i;
    emp[0].name = "RAM";
    emp[0].basic = 20000.00;
    emp[0].birthdate = "19/11/1948";
    emp[0].guard = father;
    emp[1].name = "SITA";
    emp[1].basic = 12000.00;
    emp[1].birthdate = "19/11/1958";
    emp[1].guard = husband;
    for (i = 0; i < 2; i++)
    {
        if (emp[i].basic == 12000)
        {
            printf("Name:%s\nbirthdate:%s\nguardian:",
                   emp[i].name, emp[i].birthdate);
            switch (emp[i].guard)
            {
            case father:      /* same as case 0 */
                printf("father\n");
                break;
            case husband:     /* same as case 1 */
                printf("husband\n");
                break;
            case relative:    /* same as case 2 */
                printf("relative\n");
                break;
            }
        }
    }
    return 0;
}
Result of the program
Name:SITA
birthdate:19/11/1958
guardian:husband
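As an alternative to the switch statement, the enumerator can be used as an index into an array of strings. This is only a sketch of one possible approach; the function guardian_name and its "unknown" fallback are assumptions, not part of the original program.

```c
enum guardian
{
    father,
    husband,
    relative
};

/* Hypothetical helper: maps an enumerator to its printable name.
   The order of the strings must match the order of the enumerators. */
const char *guardian_name(enum guardian g)
{
    static const char *names[] = { "father", "husband", "relative" };
    if ((int) g < 0 || (int) g > 2)
        return "unknown";   /* value outside the declared enumerators */
    return names[g];
}
```

The program could then print the relationship with printf("%s\n", guardian_name(emp[i].guard)); in place of the switch statement.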

Check Your Progress

5. What is union? How is it declared?
6. What is the difference between a structure and a union in the context of
initialization of variables?

7.5 ANSWERS TO CHECK YOUR PROGRESS QUESTIONS

1. A structure can contain data elements of different types, whereas an array
contains data elements of the same type.
2. Keyword struct is used to declare a structure.
3. A structure is like a record that contains fields as name of variables. These
variables are known as members.
4. A structure is created with the keyword struct followed by a name
which is optional. The beginning of a structure is denoted by an opening
brace in which members of the structure are declared. These members can
be of different data types. Declaration of structure is completed by a closing
brace followed by a semicolon.
5. A union is a variable that holds a common assigned area for different
data types but only one is used at a time. It is declared with a keyword
union followed by a name (which is optional) above the opening brace
and defining members as done in case of structure.
6. In structures all members can be initialized, whereas a union can only be
initialized with a value of the type of its first member.

7.6 SUMMARY

• A structure is similar to a record and contains a number of fields or variables.
• Structure variable declaration is of no use unless the variables are assigned
values.
• Structures find application in departmental stores, graphics, formatting floppy
discs and mouse movement, among others.
• Structures can be passed by reference after declaring the structure type globally.
• A union is a variable that can hold, in one common storage area, values of
different data types and sizes at different instances of time.
• Enumerated data types are seldom used alone; rather, they are used in
conjunction with other data types.

7.7 KEY WORDS

• Union: In C, unions are related to structures and are defined as objects that
may hold, at different times, objects of different types and sizes. They are
analogous to variant records in other programming languages.
• Enumeration: It is a user-defined data type in C. It is mainly used to assign
names to integral constants; the names make a program easy to read and
maintain.

7.8 SELF ASSESSMENT QUESTIONS AND EXERCISES

Short-Answer Questions
1. Write the rules to declare a structure.
2. How do you assign pointer to a structure?
3. Write a short note on enumerated data types.
Long-Answer Questions
1. Give descriptive answers with examples for the following:
(a) Structures vs array
(b) Passing structures to functions
(c) Nested structures
2. Deduce what the following program does:
#include<stdio.h>
int main()
{
    struct autos
    {
        char brand[5];
        float price;
    };
    int sale1, sale2;
    static struct autos auto1 = { "xyz", 50000.50 };
    static struct autos auto2 = { "abc", 50000.00 };
    if (auto2.price > auto1.price)
        printf("xyz is cheaper\n");
    else
        printf("abc is cheaper\n");
}

7.9 FURTHER READINGS

Gottfried, Byron S. 1996. Programming With C, Schaum’s Outline Series. New
York: McGraw-Hill.
Jeyapoovan T. 2006. Computer Programming - Theory & Practice. New Delhi:
Vikas Publishing.
Khurana, Rohit. 2005. Object Oriented Programming. New Delhi: Vikas
Publishing.
Kanetkar, Yashwant. 2003. Let Us C. New Delhi: BPB Publication.
Saxena, Sanjay. 2003. A First Course In Computers. New Delhi: Vikas
Publishing.
Subburaj, R. 2000. Programming In C. New Delhi: Vikas Publishing.

UNIT 8 DATA FILE

Structure
8.0 Introduction
8.1 Objectives
8.2 Data File
8.2.1 Opening and Closing a Data File
8.2.2 Concept of Binary Files
8.2.3 Formatted I/O Operations with Files
8.2.4 Writing and Reading a Data File
8.2.5 Unformatted Data Files
8.2.6 Processing a Data File
8.2.7 Use of the Command Line Argument
8.3 Answers to Check Your Progress Questions
8.4 Summary
8.5 Key Words
8.6 Self Assessment Questions and Exercises
8.7 Further Readings

8.0 INTRODUCTION

In this unit, you will learn about data files. In C programming, you include
<stdio.h>, a header file essential for any program to read from a standard input
device or to write to a standard output device. It declares pointers to three
files, namely stdin, stdout and stderr, which are opened automatically when
the program starts executing. In this unit, you
will learn how to use either the hard disk drive or the floppy disk drive as the input/
output medium. In day-to-day usage of large applications, the standard input/
output is neither convenient nor adequate to handle large volumes of data and
hence, disk drives serve as the Input/Output (I/O) devices instead. You will also learn
about the usage of files for storing data, popularly known as data files. Data files
stored in the secondary or auxiliary storage devices, such as hard disk drives or
floppy disks, are permanent unless deleted. In contrast, what is written to the
monitor is only for immediate use. The data stored in disk drives can be accessed
later and modified, if necessary. Further, in this unit, you will learn about file pointers,
binary mode and text mode operations, reading a data file and processing a data
file.

8.1 OBJECTIVES
After going through this unit, you will be able to:
• Understand the basic concept of data files
• Explain the significance of file pointers
• Discuss the steps to open and close a data file
• Understand the importance of binary files
• Understand formatted I/O operations with files
• Discuss how to write and read a data file
• Identify unformatted data files
• Know how to process a data file

8.2 DATA FILE

In C programs, <stdio.h> is included in almost every source file. This header file
is essential for any program to read from the standard input device or to write to the
standard output device. The file <stdio.h> has declarations of the pointers to three
files, namely stdin, stdout and stderr. These three files are opened automatically
when the program starts executing. Each of the files performs an
essential task as follows:
(a) stdin facilitates usage of the standard input device for program execution
and normally points to the keyboard, which is the default input device.
(b) stdout facilitates the usage of a standard output device where program
output is displayed and points to the video monitor.
(c) stderr facilitates sending error messages to the standard error device, which
is again the monitor by default.
stdin, stdout and stderr are file pointers and are declared
in <stdio.h>. So far you have been using stdin and stdout for input
and output. In day-to-day usage of large applications, the standard input/output is
neither convenient nor adequate to handle large volumes of data and hence,
disk drives serve as the Input/Output (I/O) devices instead. Data files stored in the
secondary or auxiliary storage devices, such as hard disk drives or floppy disks,
are permanent unless deleted. In contrast, what is written to the monitor is only for
immediate use. The data stored in disk drives can be accessed later and
modified, if necessary.
In C, we come across two types of files:
(a) Stream Oriented
(b) System Oriented
System oriented files or low-level files are more closely related to the
operating system and hence, require more complex programming skills to use
them. They may be found to be more efficient than stream oriented files in some
cases, but we will not discuss them further because of their complexity.
Stream oriented files are also called standard files. Data can be stored in the
standard files in two ways as given below:
• Storing characters or numerals consecutively. Each character is interpreted
as an individual data item.
• The data items are arranged in blocks in an unformatted manner. Each
block may be an array or a structure.
Let us see how disk I/O is organized. If the file is stored in a floppy or hard disk
drive, the following actions are involved in reading from the file:
• Finding out where the data is.
• Positioning the head over the correct location on the disk.
• Reading the content.
• Transmitting to the main memory.
Similar activities are involved in writing to a disk as well. If the computer, or
more specifically the operating system, which handles files in a computer, reads or
writes one character at a time, going through the four steps listed above for each
character, then it will be inefficient and the response will be too slow. It may also
wear out the storage system quickly. Therefore, it is better to read a large volume of
data or characters into a buffer in the computer system and then perform whatever
actions are dictated by the program. Similarly, all characters to be written can be
collected in a buffer and written on to the disk, either after the buffer is full or after
the operation is completed. This will minimize the overheads required for the read
or write operations. The buffer is an area of memory, which is used to store data
temporarily without the knowledge of the user. In fact, you created a buffer and
stored values into it before printing them using the sprintf() function. The
concept is similar here also. Therefore, the characters are
read or written through a buffer assigned by the system. The operations are
essentially performed as depicted below:

[Figure: FILE, BUFFER, SYSTEM]
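The role of the buffer can be sketched with the standard functions fputc() and fflush(). The function buffered_write below is a hypothetical helper, and whether a particular character is held in the buffer or written immediately is decided by the system, not by this code.

```c
#include <stdio.h>

/* Hypothetical helper: writes text one character at a time and returns the
   number of characters written, or -1 on a write error. */
long buffered_write(FILE *fp, const char *text)
{
    long count = 0;
    while (*text != '\0')
    {
        if (fputc(*text, fp) == EOF)   /* normally lands in the stream's buffer */
            return -1;
        text++;
        count++;
    }
    if (fflush(fp) == EOF)             /* hand the buffered characters to the system */
        return -1;
    return count;
}
```

Although the program writes one character at a time, the system is free to collect the characters and transfer them to the disk in larger blocks.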

File Pointer
A file pointer is a pointer to a file, just like other pointers to arrays, structures, etc. It
points to a structure that contains information about the file. The information
connected with a file is as follows:
 Location of the buffer.
 The position in the file of the character currently being pointed to.
 Whether the file is being read or written.
 Whether an error has occurred or the end of the file has been reached.

Self-Instructional
Material 169
Data File You do not need to know the details of these because stdio.h handles it
elegantly. There is a structure with typedef FILE in stdio.h, which
handles all file-related tasks as above, whether it is in the floppy or the hard disk
drive. Therefore, in order to use a file without difficulty, you have to include
NOTES stdio.h and declare a file pointer, which points to FILE as shown below:
FILE * fp;
Therefore, the file pointer points to a structure that contains the information
needed to manage the file. When you open a file and the open is successful,
reading begins at the first character in the file. NULL is a macro defined in
<stdio.h> that denotes a null pointer; fopen() returns NULL to indicate that the
file open has failed. Therefore, when the file open is successful, the file pointer
holds the address of the FILE structure for that file, which is a non-null value.
If not, the file pointer gets the value NULL. After a character is fetched, the
current position moves on to the next character. The structure FILE keeps track
of the current position in the file at any point in time after opening the file. It
also records whether the end of the file has been reached or an error has occurred.
You do not have to worry about the file management tasks once a file pointer has
been declared in your program to point to FILE. Since FILE is defined in <stdio.h>,
you do not have to bother about its details. This declaration of structure FILE
relieves the programmer of most of the mundane jobs.
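The check on the value returned by fopen() can be sketched as below. This is only an illustrative sketch: the function name file_can_be_read is hypothetical and is not part of any library.

```c
#include <stdio.h>

/* Hypothetical helper: returns 1 if the named file can be opened
   for reading and 0 otherwise. */
int file_can_be_read(const char *filename)
{
    FILE *fp = fopen(filename, "r");   /* fopen() returns NULL on failure */

    if (fp == NULL)
        return 0;                      /* file absent or not accessible */

    fclose(fp);                        /* close what was opened */
    return 1;
}
```

A program would typically call such a check, or test fp against NULL directly, before attempting any read on the file.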
8.2.1 Opening and Closing a Data File
Any file has to be opened for any further processing, such as reading, writing or
appending, i.e., writing at the end of the file. The characters will be written or read
one after another from the beginning to the end, unless otherwise specified. You
have to open the file and assign the file pointer to take care of further operations.
Hence, you can declare,
FILE * fp;
fp = fopen ("filename", "r");
The filename is the name of the file that you want to open. You must give the
path name correctly so that the file can be opened. The mode "r" indicates that the
file is to be opened for reading.
fp = fopen ("Ex1.C", "r"); will open the file Ex1.C for reading.
Therefore, the arguments to fopen() are the name of the file and the
mode string. Similarly, "w" is for write and "a" is for append, i.e., adding at the
end of the file. The mode determines which operations can be performed after
opening the file. It is, therefore, essential to decide on the operations
to be performed before opening the file. When the file is opened in the "w" mode,
the data will be written to the file from the beginning. This means that if the named
file is already in existence, the previous contents will be lost due to overwriting. If
the file does not exist, then a file with the assigned name will be created. When the
append mode is specified, the writing will start after the last entry; in other
words, the previous contents of the file will be preserved.
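The difference between the write and append modes can be tried out with a short sketch such as the one below. The helper names write_text and count_chars are assumptions made for this illustration only.

```c
#include <stdio.h>

/* Hypothetical helper: writes text to the file using the given mode
   ("w" or "a"). Returns 0 on success and -1 if the open fails. */
int write_text(const char *filename, const char *mode, const char *text)
{
    FILE *fp = fopen(filename, mode);

    if (fp == NULL)
        return -1;
    fputs(text, fp);
    fclose(fp);
    return 0;
}

/* Hypothetical helper: returns the number of characters in the file,
   or -1 if the file cannot be opened. */
long count_chars(const char *filename)
{
    long n = 0;
    int ch;
    FILE *fp = fopen(filename, "r");

    if (fp == NULL)
        return -1;
    while ((ch = fgetc(fp)) != EOF)
        n++;
    fclose(fp);
    return n;
}
```

Writing "abc" in the "w" mode and then "de" in the "a" mode leaves five characters in the file, whereas writing again in the "w" mode discards whatever was there before.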
FILE provides the link between the operating system and the program
currently being executed. FILE is a structure containing information about the
corresponding files, including information, such as:
 The location of the file.
 The location of the buffer.
 The size of the file.
After fopen() is executed in the read mode, the file is opened and a buffer is
associated with it, if the file is present. If the file is absent, or the file
specification is not correct, then the file will not be opened. If the opening of
the file is successful, reading starts at the first character in the file, and if
not, NULL is returned, meaning that the access was not successful. The fopen()
function returns a pointer to the FILE structure associated with the file, and this
value is assigned to the file pointer, fp in this case.
After the operations are completed, the file has to be closed. The syntax for
closing file is given below:
fclose(filepointer);
fclose() also flushes (empties) the buffer. The function fputc() writes
one character to a file. If the computer wrote to the disk for every fputc(),
the overhead would be very high. Therefore, it collects all the characters to be
written to a file in the buffer. When the buffer is full or when fclose() is executed,
the buffer is emptied by writing its contents to the named file on the hard disk drive.
8.2.2 Concept of Binary Files
We can open files in the text mode or the binary mode. In the binary mode,
everything will be stored in the binary form and the storage space will be equal to
the number of bytes required for the storage of various data types. In the text
mode, they will be stored as alphanumeric characters. If you need to use the file
in the binary mode, you must use "rb" for reading, "wb" for writing, and "ab"
for appending. With some compilers (Turbo C, for example), you can request the
text mode explicitly by appending t to the mode string, as in "rt", "wt", "at", etc.;
this suffix is not part of standard C. Since the default is the text mode, the text
mode is assumed if nothing is specified after the mode character.
Therefore, mode "w" means opening a text file for writing.

The differences between opening files in the binary mode and the text mode
are given below in Table 8.1:
Table 8.1 Difference between Binary Mode and Text Mode Operations

Text Mode                                  Binary Mode

The newline character (\n) is converted    No such conversion.
to a CR-LF combination while writing
to the file.

While reading, CR-LF is converted back     Does not arise.
to \n.

A special character is inserted at the     There is no such arrangement.
end of the file. While reading the file,
EOF is detected.

Text mode needs more than 2 bytes for      In binary mode, numbers are stored in
storing an integer, since it treats each   the width of their data type, e.g.,
digit as a character, e.g., 30000 needs    30000 needs only 2 bytes (for a
5 bytes.                                   2-byte int).

Therefore, binary files and text files are to be processed taking into account
their properties as above, although the file could be opened in any mode. The file
I/O functions, such as fgetc, fputc, fscanf, fprintf, fgets, fputs,
are applicable to the operations in any of the modes.
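The last row of Table 8.1 can be checked with a small sketch. It assumes the standard library function fwrite(), which is not used elsewhere in this unit, to store the bytes of an int exactly as they are held in memory; the helper names are hypothetical. Note that fprintf() always writes the digits as characters, whichever mode the file was opened in.

```c
#include <stdio.h>

/* Hypothetical helper: stores value as decimal digits (text form). */
int save_as_text(const char *filename, int value)
{
    FILE *fp = fopen(filename, "w");

    if (fp == NULL)
        return -1;
    fprintf(fp, "%d", value);          /* one character per digit */
    fclose(fp);
    return 0;
}

/* Hypothetical helper: stores the bytes of value as held in memory. */
int save_as_binary(const char *filename, int value)
{
    FILE *fp = fopen(filename, "wb");

    if (fp == NULL)
        return -1;
    fwrite(&value, sizeof(value), 1, fp);   /* sizeof(int) bytes */
    fclose(fp);
    return 0;
}

/* Hypothetical helper: counts the bytes in a file, or returns -1. */
long file_size(const char *filename)
{
    long n = 0;
    FILE *fp = fopen(filename, "rb");

    if (fp == NULL)
        return -1;
    while (fgetc(fp) != EOF)
        n++;
    fclose(fp);
    return n;
}
```

For the value 30000, the text file holds 5 bytes, while the binary file holds sizeof(int) bytes, which is 2 on the 16-bit compilers assumed in the table and usually 4 on current systems.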
The files can be used to store employee records using structures in a payroll
program. Book records can be stored in a file in a library database. Inventories
can be stored in a file. However, storing all these in the text mode will consume
more space on the file. Hence, the binary mode can be used to create the files.
Some files cannot be stored in the text mode at all, such as executable files.
8.2.3 Formatted I/O Operations with Files
We are familiar with reading and writing. So far we were reading from and writing
to standard input/output. Therefore, we used functions for the formatted I/O with
stdio such as scanf() and printf(). We also used unformatted I/O,
such as getch(), putch() and other statements. When dealing with files,
there are similar functions for I/O. The functions getc(), fgetc(), fputc()
and putc() are unformatted file I/O functions similar to getch() and
putch(). We will consider the formatted file operations in this section. When it
pertains to standard input or output, we use scanf() and printf(). To
handle formatted I/O with files, we have to use fscanf() and fprintf().
We can write numbers, characters, etc. to the file using fprintf(). This
helps in writing to the file neatly with a proper format. In fact, any output can be
directed to a file instead of the monitor. However, we have to indicate which file we
are writing to by giving the file pointer. The following syntax has to be followed for
fprintf():

fprintf (filepointer, "format specifier", variable names);
We are only adding the file pointer as a parameter before the
format specifier. This is similar to sprintf(), which writes to a buffer.
In the case of sprintf(), the buffer was a pointer to a string variable. Here,
instead of a pointer to a string variable, a pointer to a file is given in the
fprintf() statement. Like the string pointer in sprintf(), the file pointer
should have been declared in the function and should be pointing to an open file.
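The similarity between sprintf() and fprintf() can be seen in the following sketch, in which only the first argument differs. The function names format_marks and write_marks are hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: formats "name marks" into a character buffer.
   Returns the number of characters written. */
int format_marks(char *buf, const char *name, int marks)
{
    return sprintf(buf, "%s %d\n", name, marks);
}

/* Hypothetical helper: writes the same text to an already open file.
   Returns the number of characters written. */
int write_marks(FILE *fp, const char *name, int marks)
{
    return fprintf(fp, "%s %d\n", name, marks);
}
```

Both functions produce the same characters; one places them in a string and the other sends them to the file to which fp points.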
Before writing to a file, the file must be opened in the write mode. You can
declare the following:
FILE * fp ;
fp = fopen ("filename", "wb");
You have to write wb within double quotes to open a file for writing in
the binary mode. fopen() then searches for the named file. If the file is
found, it is opened for writing from the beginning, and the previous contents will
be lost. If the file is not found, a new file will be created. If the file cannot be
opened for writing, NULL will be returned.
We can also append data to the file after the existing contents. In this manner,
we will be able to preserve the contents of a file. If you open the file
in the append mode and the file is not present, a new file will be created. If the file
is present, then writing is carried out from the current end of the file. On some
systems, such as DOS, a special end-of-file character may be placed at the end of
a text file after writing is completed; in the case of binary files, no special
character is appended. In either case, the input functions return the value EOF
when the end of the file is reached. Usually EOF is -1, but it is implementation
dependent. Hence, it is safer to compare against EOF than against a literal value
when checking for the end of a text file.
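As an illustration of reading until the end of the file instead of counting the items in advance, the sketch below relies on the value returned by fscanf(): it returns the number of items converted, and EOF when the end of the file is reached first. The function name sum_integers is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: adds up all the integers stored as text in a file.
   Returns how many integers were read, or -1 if the file cannot be opened. */
int sum_integers(const char *filename, long *sum)
{
    int value, count = 0;
    FILE *fp = fopen(filename, "r");

    if (fp == NULL)
        return -1;
    *sum = 0;
    /* the loop stops at the end of the file or at the first bad item */
    while (fscanf(fp, "%d", &value) == 1)
    {
        *sum += value;
        count++;
    }
    fclose(fp);
    return count;
}
```

Comparing the return value with 1 covers both the end-of-file case and the case of input that is not a number.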
8.2.4 Writing and Reading a Data File
Let us look at a program to write numbers to a binary file using fprintf()
and then read from the file using fscanf(). It is given in Example 8.1.
/* Example 8.1
Writing digits to a binary file and then reading*/
#include <stdio.h>
int main()
{
    int alpha, i;
    FILE *fp;
    fp = fopen("ss.doc", "wb");
    if (fp == NULL)
        printf("could not open file\n");
    else
    {
        for (i = 0; i <= 99; i++)
            fprintf(fp, " %d", i);
        fclose(fp);
        /*now read the contents*/
        fp = fopen("ss.doc", "rb");
        for (i = 0; i < 100; i++)
        {
            fscanf(fp, "%d", &alpha);
            printf(" %d", alpha);
        }
        fclose(fp);
    }
    return 0;
}
Result of the program
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44
45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86
87 88 89 90 91 92 93 94 95 96 97 98 99
The program performs the following tasks:
(a) The file ss.doc is opened in the binary mode for writing. If the opening of
the file was not successful, the message is displayed and the program
ends. If successful, the program enters the else block. Numbers 0
to 99 are generated one after another and written immediately to the file
using the fprintf() function. There should be a space before %d as shown
in fprintf(); otherwise the digits run together in the file and fscanf()
cannot separate the numbers when reading them back.
(b) The file is closed using fclose().
(c) Now the same file is opened for reading in the binary mode.
(d) Next the text is scanned using fscanf(), one at a time, and written on
the monitor using simple printf(). The difference between scanf()
and fscanf() is the specification of the file pointer before the format
specifier.
(e) After reading, the file is closed.
The result of the program is read from the file ss.doc and printed on the monitor.
In all the programs involving files, a similar check to see that file opening
was successful should be made. For the sake of improved readability, this statement
has been skipped in the rest of the programs.
Let us look at one more example of writing, appending and then reading
one integer at a time with the help of the for loop. Look at the program below:
/* Example 8.2
Writing, then appending digits to a file and then reading*/
#include <stdio.h>
int main()
{
    int alpha, i;
    FILE *fp;
    fp = fopen("ss.doc", "wb");
    for (i = 0; i < 20; i++)
        fprintf(fp, " %d", i);
    fclose(fp);
    fp = fopen("ss.doc", "ab");
    for (i = 20; i < 100; i++)
        fprintf(fp, " %d", i);
    fclose(fp);
    /*now read the contents*/
    fp = fopen("ss.doc", "rb");
    for (i = 0; i < 100; i++)
    {
        fscanf(fp, "%d", &alpha);
        printf(" %d", alpha);
    }
    fclose(fp);
    return 0;
}
Result of the program
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44
45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86
87 88 89 90 91 92 93 94 95 96 97 98 99
A binary file is opened in the write mode, and the numbers from 0 to 19 are written
to the file. The file is then closed using fclose(). The same file is opened again
in the append mode, and the numbers from 20 to 99 are appended to the file. After
the file is closed, it is opened in the read mode. The contents of the file are
read using fscanf() and written to the monitor. Remember to leave a space
before %d in fprintf(), as otherwise the numbers cannot be separated when they
are read back. The file is closed again. We have used the same file pointer
throughout, since only one file is in use at any time. If more than one file is to
be kept open simultaneously, a separate file pointer is needed for each.
8.2.5 Unformatted Data Files
After having worked with the formatted I/O, let us now look at the unformatted
I/O. If you want to read a character from the file, you can use the getc() or
fgetc() functions. If alpha is the variable that receives the character (declared
as int, so that it can also hold EOF), you can write,
alpha = fgetc (fp);
This means the next character in the file associated with fp is read and assigned
to alpha. A summary of header files and functions is given in Annexure 3. You can also go to
the help screen of the ‘C’ language system to get more details as well as search
for help on any of the library functions. The help screen gives the syntax of the
functions and also provides examples in which the function or command is used.
Even after reading this book or any other book on ‘C’, you will not be able to use
all the functions. Hence, the best way is to take the help from the help screen
whenever other functions are to be used.
fgetc() reads the next character from the file associated with fp. It then
advances the position in the file, so that the following call reads the next
character. We can keep on reading in this way till the end of the file, i.e., the
end of the data, is reached. When the last character has been read and the end of
the file is reached, fgetc() returns EOF. The following program reads one character
at a time from an already created text file, ss.doc, till EOF is reached. The program
is implemented using the do...while statement.
/* Example 8.3 - Reading characters from a file */
#include <stdio.h>
int main()
{
    int alpha;
    FILE *fp;
    fp = fopen("ss.doc", "r");
    do
    {
        alpha = fgetc(fp);
        if (alpha != EOF)   /* EOF itself is not a character to print */
            putchar(alpha);
    } while (alpha != EOF);
    fclose(fp);
    return 0;
}
Result of the program
Since file ss.doc is read, the output will be the same as in Example 8.1, if no
change has been made in the file. When reading a binary file, looking for a special
end-of-file character is not reliable, since no such character is stored. Therefore,
a counter can be set up to read a predefined number of items, as in the previous
examples.

Check Your Progress


1. Why is the <stdio.h> file essential?
2. Define the tasks performed by stdin, stdout and stderr.
3. Where are the pointers stdin, stdout and stderr declared?
4. Explain system oriented and stream oriented files.
5. What does NULL macro define?

8.2.6 Processing a Data File


A data file is processed in the following way.
File Copy
File copy can be achieved by reading one character at a time and writing to another
file either in the write mode or the append mode. Here it is proposed to read from
a file and write to two different files, one in the write mode and another in the
append mode. This means we have to open three files in the following manner:
FILE *fr, *fw, *fa;
You can assign three file pointers as given above. Three files are then opened. You
can use any name for the file pointers and there may be as many file pointers as the
number of files to be used. The program is given below:
/*Example 8.4
Reading from file ss, writing to file ws and appending
to file as, all at a time*/
#include <stdio.h>
int main()
{
    int alpha;
    FILE *fr, *fw, *fa;
    fr = fopen("ss.doc", "r");
    fw = fopen("ws.doc", "w");
    fa = fopen("as.doc", "a");
    do
    {
        alpha = fgetc(fr);
        if (alpha != EOF)   /* do not copy the EOF value itself */
        {
            fputc(alpha, fw);
            fputc(alpha, fa);
            putchar(alpha);
        }
    } while (alpha != EOF);
    fclose(fr);
    fclose(fw);
    fclose(fa);
    return 0;
}
After opening the three files, fgetc() reads a character into alpha, which is written
to both the output files using fputc(), and the character is also displayed on the
screen. This is continued till EOF is received in alpha from ss.doc, the source file.
Finally, the files are closed. Verify that the program has worked correctly by
examining ws.doc and as.doc.
Since, we are also writing to the monitor, in addition to writing and appending
to files, the program output appears as follows.
Result of the program
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36
37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53
54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70
71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87
88 89 90 91 92 93 94 95 96 97 98 99
There are some more mode specifiers for fopen(), namely r+, w+ and a+,
which are given in Table 8.2.

Table 8.2 Mode Specifier

Mode Specifier   Purpose

r+               Opens an already existing file for reading and writing.
w+               Creates a new file (or empties an existing one) for writing
                 as well as reading.
a+               Opens a file for appending and reading; the file is created
                 if it does not exist.
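A minimal sketch of the w+ mode is given below: a number is written to a new file and read back through the same file pointer. The function name write_then_read is hypothetical. The standard library function rewind() moves the position back to the start of the file; such a repositioning call is required between a write and a following read on the same stream.

```c
#include <stdio.h>

/* Hypothetical helper: writes value to a new file and reads it back
   through the same pointer. Returns the value read, or -1 on failure. */
int write_then_read(const char *filename, int value)
{
    int result = -1;
    FILE *fp = fopen(filename, "w+");   /* writing as well as reading */

    if (fp == NULL)
        return -1;
    fprintf(fp, "%d", value);
    rewind(fp);                          /* back to the start before reading */
    if (fscanf(fp, "%d", &result) != 1)
        result = -1;
    fclose(fp);
    return result;
}
```

With the "w" mode alone, the fscanf() call would not be permitted, since the file would be open for writing only.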

Line Input/Output
We have discussed writing to and reading from a file, one character at a time,
using both the unformatted and formatted I/O for the purpose. We can also read
one line at a time. This is enabled by the fgets() function. This is a standard
library function with the following syntax:
char * fgets (char * buf, int maxline, FILE * fp);
fgets() reads the next line from the file pointed to by fp into the character
array buf. It stops after a newline character or after maxline - 1 characters,
whichever comes first; i.e., if maxline is 80, each execution of the function will
permit reading of up to 79 characters of the next line. Here 79 is the maximum, but
you can read fewer characters at a time by specifying a smaller value.
fgets(alpha, 10, fr);
Here alpha is the name of the buffer into which up to 9 characters are read at a
time. The file pointer fr points to the file from which the line is read, and the line
read is terminated with the null character ('\0'). fgets() returns a pointer to the
line if one is available, and NULL if the end of the file is reached or an error
occurs while reading the file.
The complementary function to fgets() is fputs(). Obviously
fputs() will write a line of text into the file. The syntax is as follows:
int fputs (const char * buf, FILE * fp);
The contents of array buf are written to the file pointed to by fp. It returns
EOF on error and a non-negative value otherwise. Note that a successful execution
of fgets() returns a pointer to the line read, whereas fputs() returns a
non-negative value after a normal operation.
The functions gets() and puts() were used with stdio, whereas
fgets() and fputs() operate on files.
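The two functions can be combined to copy a text file one line at a time, as in the sketch below. The function name copy_lines and the buffer size of 80 are assumptions made for this illustration.

```c
#include <stdio.h>

/* Hypothetical helper: copies src to dst using fgets() and fputs().
   Returns the number of fgets() reads made, or -1 on failure. */
int copy_lines(const char *src, const char *dst)
{
    char buf[80];
    int reads = 0;
    FILE *fr, *fw;

    fr = fopen(src, "r");
    if (fr == NULL)
        return -1;
    fw = fopen(dst, "w");
    if (fw == NULL)
    {
        fclose(fr);
        return -1;
    }
    /* fgets() returns NULL at the end of the file */
    while (fgets(buf, sizeof(buf), fr) != NULL)
    {
        fputs(buf, fw);      /* the newline kept by fgets() is copied too */
        reads++;
    }
    fclose(fr);
    fclose(fw);
    return reads;
}
```

A line longer than 79 characters is simply copied in more than one read, so the contents of the destination file are the same in either case.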
We can write a program to transfer two lines of text from the buffer to a file
and then read the contents of the file to the monitor. This is shown in Example 8.5.
/* Example 8.5
Writing and reading lines on files */
#include <stdio.h>
#include <string.h>
int main()
{
    int i;
    char alpha[82];   /* 80 characters + newline + terminating null */
    FILE *fr, *fw;
    fw = fopen("ws.doc", "wb");
    for (i = 0; i < 2; i++)
    {
        printf("Enter a line up to 80 characters\n");
        /* fgets() is used instead of gets(), which cannot limit the input */
        fgets(alpha, sizeof(alpha), stdin);
        alpha[strcspn(alpha, "\n")] = '\0';   /* drop the newline kept by fgets() */
        fputs(alpha, fw);
    }
    fclose(fw);
    fr = fopen("ws.doc", "rb");
    while (fgets(alpha, 20, fr) != NULL)
        puts(alpha);
    fclose(fr);
    return 0;
}
Note carefully the fgets() statements. Here alpha is a buffer large enough for a
line of 80 characters. Two lines of up to 80 characters each are entered
through alpha and written to ws.doc. Later on, up to 19 characters are read into
alpha at a time from the same file till NULL is returned.
Result of the program
Enter a line upto 80 characters
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
Enter a line upto 80 characters
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
aaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaabbbb
bbbbbbbbbbbbbbbbbbb
bbbbbbbbbbbbbbbbbbb
bb
More than 20 a's and b's were written to the file. However, since fgets() was
called with a limit of 20, at most 19 characters are read at a time, and the output
appears in 6 lines. Had we specified reading more characters at a time, the number
of reads would have been smaller. You can try this yourself.
Thus you can read and write one line at a time.
8.2.7 Use of the Command Line Argument
Command line arguments can be used, for example, to copy one file to another.
Assume that the first named file is to be copied to the second named file. We may
write a program and convert it into an executable file, specifying the arguments
on the DOS command line.
We may specify as follows at the C> prompt:
C> prgname.exe f1.cpp f2.cpp
This means that we want to copy the contents of f1.cpp to f2.cpp. Here the
number of arguments is 3, and therefore argc will contain 3.
argv[0] = prgname.exe
argv[1] = f1.cpp - source to copy from
argv[2] = f2.cpp - file to copy to
A character at a time is to be fetched from f1.cpp and put into f2.cpp.
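The copy described above might be sketched as follows. The function name copy_from_args is hypothetical; it takes the same argc and argv that main() receives, so that a program's main(int argc, char *argv[]) can simply return copy_from_args(argc, argv).

```c
#include <stdio.h>

/* Hypothetical helper: copies the file named in argv[1] to the file
   named in argv[2], one character at a time.
   Returns 0 on success and 1 on any failure. */
int copy_from_args(int argc, char *argv[])
{
    int ch;
    FILE *src, *dst;

    if (argc != 3)                       /* program name + two file names */
    {
        printf("Usage: %s source destination\n", argv[0]);
        return 1;
    }
    src = fopen(argv[1], "rb");
    if (src == NULL)
    {
        printf("Cannot open %s\n", argv[1]);
        return 1;
    }
    dst = fopen(argv[2], "wb");
    if (dst == NULL)
    {
        printf("Cannot create %s\n", argv[2]);
        fclose(src);
        return 1;
    }
    while ((ch = fgetc(src)) != EOF)     /* fetch a character ... */
        fputc(ch, dst);                  /* ... and put it in the copy */
    fclose(src);
    fclose(dst);
    return 0;
}
```

The files are opened in the binary mode here so that every byte is copied unchanged; this is a choice made for the sketch rather than a requirement of the text above.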
Personal File of an Employee
A menu-based program to create employee records on file and calculate the age
of any employee on date is given below:
/* Example 8.6
Create a Personal File for Employees & calculate the age
of any employee ON DATE*/
#include <stdio.h>
#include <dos.h>    /* Turbo C: struct date and getdate() */
#include <string.h>
#include <stdlib.h>
#include <conio.h>  /* Turbo C: clrscr() and getch() */
typedef struct
{
    char name[40];
    char code[5];
    char dob[9];
    char qual[40];
}employee;
FILE *fp;
struct date today;
int main()
{
    int create_emp();
    int calc_age();
    int ch,onscrn=1;
    char ok;   /* reply to the date confirmation prompt */
    getdate(&today);
    printf("Today's Date Is %d/%d/%d\nIs It O.K :",
           today.da_day,today.da_mon,today.da_year);
    scanf(" %c",&ok);
    while(onscrn)
    {
        clrscr();
        printf("1: Create Employee Data File\n");
        printf("2: Calculate Age Of Employee\n");
        printf("3: Exit From Program\nEnter Your Choice :");
        scanf("%d",&ch);
        switch(ch)
        {
            case 1:
                create_emp();
                break;
            case 2:
                calc_age();
                break;
            case 3:
                onscrn=0;
                break;
        }
    }
    /* each function closes the file it opens, so nothing is closed here */
    return 0;
}
int create_emp()
{
    employee emp1;
    int i,n;
    fp=fopen("emp.dat","a");
    clrscr();
    printf("How Many Employees :");
    scanf("%d",&n);
    for(i=0;i<n;i++)
    {
        clrscr();
        printf("Employee %d Details :\n",i+1);
        /* field widths leave room for the terminating null character */
        printf("\n\nEmployee Name :");
        scanf("%39s",emp1.name);
        printf("Employee Code :");
        scanf("%4s",emp1.code);
        printf("Date Of Birth :(dd/mm/yy)");
        scanf("%8s",emp1.dob);
        printf("Qualification :");
        scanf("%39s",emp1.qual);
        fprintf(fp, "%40s%5s%9s%40s\n", emp1.name,
                emp1.code,emp1.dob, emp1.qual);
    }
    fclose(fp);
    return(0);
}
int calc_age()
{
    int ret,nyob,age,llfound=0,onscrn=1;
    employee emp1;
    char nam[40],*sear,*ori;
    char yob[5];
    fp=fopen("emp.dat","r");
    clrscr();
    printf("Employee Name To Search :");
    scanf("%39s",nam);
    sear=strlwr(nam);   /* strlwr() is a Turbo C library function */
    while(onscrn)
    {
        /* field widths leave room for the terminating null character */
        ret=fscanf(fp, "%39s%4s%8s%39s\n", emp1.name,
                   emp1.code, emp1.dob, emp1.qual);
        if(ret!=4)   /* end of file or incomplete record */
        {
            onscrn=0;
            continue;
        }
        ori=strlwr(emp1.name);
        if(strcmp(sear,ori)==0)
        {
            clrscr();
            printf("Employee Name :%s\n",emp1.name);
            printf("Employee Code :%s\n",emp1.code);
            printf("Date of Birth :%s\n",emp1.dob);
            printf("Qualification :%s\n",emp1.qual);
            strcpy(yob,"19");   /* the year of birth is assumed to be 19yy */
            strncat(yob,emp1.dob+6,2);
            yob[4]=0;
            nyob=atoi(yob);
            age = today.da_year - nyob;
            printf("Age of Employee :%d\n",age);
            getch();
            onscrn=0;
            llfound=1;
        }
    }
    fclose(fp);
    if (!llfound)
    {
        printf("%s Not found in emp.dat\n",nam);
        getch();
    }
    return(0);
}
Result of the program
Employee Name : saravanan
Employee Code : 06
Date of Birth : 02/06/63
Qualification : MBA
Age of Employee : 36

You should be able to understand the program by reading the following:
The header file <dos.h> is included. Look at the online help to see what
it provides. You will find that it defines various constants and declarations needed
for DOS and 8086 specific calls. It is used here to get the system date in order to
calculate the age of the employee. After the system date is confirmed, the menu
appears. If you choose 1, the program calls create_emp and asks for the number of
employees. Then it accepts the records of the specified number of employees. The
employee record is a structure.
After creating the records, you can opt to calculate an employee's age by
entering 2. This invokes the function calc_age. The function asks for the name
of the employee it must search for. If the name matches, the age will be calculated and
displayed. The records are written in the append mode, so you will not lose the
records. Note that the structures are written to the file using fprintf() and
read from the file using fscanf(). Note also that date is a structure with
three members da_day, da_mon, da_year.
This example has demonstrated that structures can be written to a file.
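As an alternative sketch, a whole structure can also be stored in the binary form discussed in Section 8.2.2. The example below assumes the standard library functions fwrite() and fread(), which are not used in Example 8.6; the type name record (a cut-down employee record) and the function names are hypothetical.

```c
#include <stdio.h>

/* Hypothetical cut-down employee record used only for this sketch. */
typedef struct
{
    char name[40];
    char code[5];
} record;

/* Writes one record to a binary file. Returns 0 on success, -1 on failure. */
int save_record(const char *filename, const record *r)
{
    size_t written;
    FILE *fp = fopen(filename, "wb");

    if (fp == NULL)
        return -1;
    written = fwrite(r, sizeof(record), 1, fp);   /* all bytes in one call */
    fclose(fp);
    return (written == 1) ? 0 : -1;
}

/* Reads one record back. Returns 0 on success, -1 on failure. */
int load_record(const char *filename, record *r)
{
    size_t got;
    FILE *fp = fopen(filename, "rb");

    if (fp == NULL)
        return -1;
    got = fread(r, sizeof(record), 1, fp);
    fclose(fp);
    return (got == 1) ? 0 : -1;
}
```

Each record then occupies exactly sizeof(record) bytes in the file, whatever the lengths of the strings stored in it. A file written this way is generally readable only by a program compiled with the same structure layout.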

Check Your Progress


6. What does mode w mean?
7. Which functions are used to read a character from a file?
8. How is file copy achieved?
9. Why are command line arguments used?

8.3 ANSWERS TO CHECK YOUR PROGRESS QUESTIONS

1. The <stdio.h> file is essential for any program to read from the standard
input device or to write to the standard output device. The file <stdio.h>
has declarations of the pointers to three files, namely stdin, stdout
and stderr. This means that these declarations become part of the
program when <stdio.h> is included.
2. The tasks performed by stdin, stdout and stderr are as follows:
(a) stdin facilitates usage of the standard input device for program
execution and normally points to the keyboard, which is the default input
device.
(b) stdout facilitates the usage of a standard output device where program
output is displayed and points to the video monitor.
(c) stderr facilitates sending error messages to the standard error device,
which is again the monitor.
3. stdin, stdout and stderr are file pointers and are
declared in <stdio.h>.
4. System oriented files or low-level files are more closely related to the
operating system and hence, require more complex programming skills to
use them.
Stream oriented files are also called standard files. Data can be stored in
the standard files in two ways as given below:
 Storing characters or numerals consecutively. Each character is
interpreted as an individual data item.
 The data items are arranged in blocks in an unformatted manner. Each
block may be an array or a structure.
5. NULL is a macro defined in <stdio.h> that represents a null pointer;
fopen() returns NULL to indicate that the file open has failed.
6. Mode "w" means opening a text file for writing.
7. To read a character from the file, you can use the getc() or fgetc()
functions. If alpha is the name of the character variable, you can write,
alpha = fgetc (fp);
8. File copy can be achieved by reading one character at a time and writing to
another file either in the write mode or the append mode.
9. Command line arguments are used to pass information to a program when it
is started, e.g., the names of the source and destination files for a file copy.

8.4 SUMMARY

 <stdio.h> is included in almost every C program. This file is essential
for any program to read from the standard input device or to write to the standard
output device.
 The file <stdio.h> has declarations of the pointers to three files, namely
stdin, stdout and stderr.
 stdin facilitates usage of the standard input device for program execution
and normally points to the keyboard, which is the default input device.
 stdout facilitates the usage of a standard output device where program
output is displayed and points to the video monitor.
 stderr facilitates sending error messages to the standard error device, which is
again the monitor.
 System oriented files or low-level files are more closely related to the
operating system and hence require more complex programming skills to
use them.

 Stream oriented files are also called standard files. Data can be stored in
the standard files in two ways: by storing characters or numerals consecutively,
where each character is interpreted as an individual data item, or by arranging
the data items in blocks in an unformatted manner, where each block may
be an array or a structure.
 A file pointer is a pointer to a file. It points to a structure that contains
information about the file.
 Any file has to be opened for further processing, such as reading, writing or
appending. The characters will be written or read one after another from
the beginning to the end unless otherwise specified.
 When the file is opened in the ‘w’ mode, the data will be written to the file
from the beginning. If the named file is already in existence, the previous
contents will be lost due to overwriting.
 When the append mode is specified, the writing will start after the last entry
or in other words, previous contents of the file will be preserved.
 The files can be opened in the text mode or the binary mode. In the text
mode, the files will be stored as alphanumeric characters.
 In the binary mode, everything will be stored in the binary form and the
storage space will be equal to the number of bytes required for the storage
of various data types.
 The functions getc(), fgetc(), fputc() and putc() are
unformatted file I/O functions similar to getch() and putch().
 File copy can be achieved by reading one character at a time and writing to
another file either in the write mode or the append mode.
 The functions gets() and puts() are used with stdio whereas
fgets() and fputs() operate on files.

8.5 KEY WORDS

 stdin: It facilitates usage of the standard input device for program
execution
 stdout: It facilitates the usage of a standard output device where the
program output is to be displayed
 stderr: It sends error messages to the standard output device
 System oriented files or low-level files: These are closely related to the
operating system and require complex programming
 Stream oriented files: These are standard files and store data

8.6 SELF ASSESSMENT QUESTIONS AND EXERCISES

Short-Answer Questions
1. Name the two types of files used in C.
2. What is a file pointer?
3. What is the function of a file buffer?
4. How is data stored in binary format?
Long-Answer Questions
1. Write and modify programs to extend the personal file of employees for the
following:
(a) To modify employee record.
(b) To delete employee records.
(c) To sort employee records on their name.
(d) To maintain a leave record for each employee.
(e) To calculate the superannuation of each employee.
2. Explain system oriented and stream oriented files with the help of example.
3. What do you mean by formatted I/O with files? Discuss in brief.
4. Write a program to write and read a data file.
5. Write a program that reads one character at a time till EOF is reached.
6. Write a program to transfer two lines of text from the buffer to a file and
then read the contents of the file to the monitor.

8.7 FURTHER READINGS

Gottfried, Byron S. 1996. Programming With C, Schaum’s Outline Series. New
York: McGraw-Hill.
Jeyapoovan T. 2006. Computer Programming - Theory & Practice. New Delhi:
Vikas Publishing.
Khurana, Rohit. 2005. Object Oriented Programming. New Delhi: Vikas
Publishing.
Kanetkar, Yashwant. 2003. Let Us C. New Delhi: BPB Publication.
Saxena, Sanjay. 2003. A First Course In Computers. New Delhi: Vikas
Publishing.
Subburaj, R. 2000. Programming In C. New Delhi: Vikas Publishing.

BLOCK - IV
LINEAR DATA STRUCTURE

UNIT 9 INTRODUCTION TO DATA
STRUCTURE
Structure
9.0 Introduction
9.1 Objectives
9.2 Stack Related Terms and Operations on Stack
9.3 Application and Implementation of Stack
9.3.1 Converting Infix Notation to Postfix and Prefix or Polish Notations
9.4 Answers to Check Your Progress Questions
9.5 Summary
9.6 Key Words
9.7 Self Assessment Questions and Exercises
9.8 Further Readings

9.0 INTRODUCTION

A stack is a linear data structure in which an element can be added or removed
only at one end called the top of the stack. In the terminology related to stacks,
the insert and delete operations are known as PUSH and POP operations
respectively. The last element added to the stack is the first element to be removed,
that is, the elements are removed in the reverse of the order in which they were
added to the stack. Hence, a stack works on the principle of last in first out, and
is also known as a last-in-first-out (LIFO) list. In this unit, you will learn about
the stack organisation and operations on stack.

9.1 OBJECTIVES
After going through this unit, you will be able to:
• Define a stack
• Explain the various stack-related terms
• Analyse the operations on a stack
• Discuss the applications of stacks
• Analyse the implementation of stacks

9.2 STACK RELATED TERMS AND OPERATIONS ON STACK

A stack can be organized (represented) in the memory either as an array or as a
singly-linked list. In both cases, insertion and deletion of elements are allowed
only at one end; insertion or deletion in the middle of the stack is not allowed.
An array representation of a stack is static, whereas a linked list representation
is dynamic in nature. Though array representation is a simple technique, it provides
less flexibility and is not very efficient with respect to memory utilization. This is
because if the number of elements to be stored in a stack is less than the allocated
memory, the unused memory space is wasted. Conversely, if the number of
elements to be handled by a stack is more than the size of the stack, it is not
possible to increase the size of the stack to store these elements. In this section,
only the array organization of a stack will be discussed.
When a stack is organized as an array, a variable named Top is used to
point to the top element of the stack. Initially, the value of Top is set as -1 to
indicate an empty stack. Before inserting a new element onto a stack, it is necessary
to test the condition of overflow. Overflow occurs when a stack is full and there
is no space for a new element and an attempt is made to push a new element. If a
stack is not full then the push operation can be performed successfully. To push an
item onto a stack, Top is incremented by one and the element is inserted at that
position.
Similarly, before removing the top element from the stack, it is necessary to
check the condition of underflow. Underflow occurs when a stack is empty and
an attempt is made to pop an element. If a stack is not empty, POP operation can
be performed successfully. To POP (or remove) an element from a stack, the
element at the top of the stack is assigned to a local variable and then Top is
decremented by one.
The total number of elements in a stack at a given point of time can be
calculated from the value of Top as follows:
number of elements = Top + 1
Figure 9.1 shows an empty stack with size 3 and Top = –1.
[Figure: an array of three slots, indexed 0 to 2, all empty; Top = –1]

Fig. 9.1 An Empty Stack
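
The relation between Top and the number of elements can be checked with a short sketch. The helper names stack_count, stack_is_empty and stack_is_full are illustrative and do not appear in the text, and MAX is assumed to be 3 so that the structure matches the stack of Figure 9.1.

```c
#define MAX 3                 /* assumed capacity, matching Figure 9.1 */

struct stack {
    int item[MAX];
    int Top;                  /* index of the top element, -1 when empty */
};

/* Number of elements currently stored: Top + 1. */
int stack_count(const struct stack *s) { return s->Top + 1; }

/* The stack is empty when Top is -1, that is, when the count is 0. */
int stack_is_empty(const struct stack *s) { return s->Top == -1; }

/* The stack is full when Top is MAX-1, that is, when the count is MAX. */
int stack_is_full(const struct stack *s) { return s->Top == MAX - 1; }
```

With Top set to -1 the count is 0; after three pushes Top is 2, the count is 3 and the stack is full.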


To insert an element 1 in a stack, Top is incremented by one and the element 1 is
stored at stack[Top]. Similarly, other elements can be added to the same stack
until Top reaches 2, as shown in Figure 9.2. To POP an element from the stack
(data element 3), Top is decremented by one, which removes the element 3 from
the stack. Similarly, other elements can be removed from the stack until Top reaches
–1. Figure 9.2 shows the different states of the stack after performing PUSH and POP
operations on it.

[Figure: six states of a three-slot stack]
(a) Stack after pushing the element 1: contents 1; Top = 0
(b) Stack after pushing the element 2: contents 1, 2; Top = 1
(c) Stack after pushing the element 3: contents 1, 2, 3; Top = 2
(d) Stack after popping the element 3: contents 1, 2; Top = 1
(e) Stack after popping the element 2: contents 1; Top = 0
(f) Stack after popping the element 1: empty stack; Top = –1

Fig. 9.2 Various States of Stack after Push and Pop Operations

To implement a stack as an array in C language, a structure named
stack needs to be defined as follows:

struct stack
{
    int item[MAX]; /*MAX is the maximum size of the array*/
    int Top;       /*index of the top element; -1 when the stack is empty*/
};

Algorithm 9.1 Push Operation on Stack
push(s, element) //s is a pointer to stack

1. If s->Top = MAX-1 //checking for stack overflow
       Print “Overflow: Stack is full!” and go to step 5
   End If
2. Set s->Top = s->Top + 1 //incrementing Top by 1
3. Set s->item[s->Top] = element //inserting element in the stack
4. Print “Value is pushed onto the stack…”
5. End

Algorithm 9.2 Pop Operation on Stack
pop(s) //s is a pointer to stack

1. If s->Top = -1 //checking for stack underflow
       Print “Underflow: Stack is empty!”
       Return 0 and go to step 5
   End If
2. Set popped = s->item[s->Top] //taking off the top element from the stack
3. Set s->Top = s->Top - 1 //decrementing Top by 1
4. Return popped
5. End

Example 9.1: A program to implement a stack as an array is as follows:


#include <stdio.h>
#include <stdlib.h>
#define MAX 10
#define True 1
#define False 0
typedef struct stack
{
    int item[MAX];
    int Top;
}stk;
/*Function prototypes*/
void createstack(stk *);  /*to create an empty stack*/
void push(stk *, int);    /*to push an element onto the stack*/
int pop(stk *);           /*to pop the top element from the stack*/
int isempty(stk *);       /*to check for the underflow condition*/
int isfull(stk *);        /*to check for the overflow condition*/
int main(void)
{
    int choice;
    int value;
    stk s;
    createstack(&s);
    do{
        printf("\n\tMain Menu");
        printf("\n1. Push");
        printf("\n2. Pop");
        printf("\n3. Exit\n");
        printf("\nEnter your choice: ");
        if (scanf("%d", &choice) != 1)
            exit(0); /*stop on end of input or invalid input*/
        switch(choice)
        {
            case 1: printf("\nEnter the value to be inserted: ");
                    if (scanf("%d", &value) == 1)
                        push(&s, value);
                    break;
            case 2: /*test for underflow first, so that a stored 0 is
                      not mistaken for an empty stack*/
                    if (isempty(&s))
                        printf("\nUnderflow: Stack is empty!");
                    else
                    {
                        value=pop(&s);
                        printf("\nPopped item is: %d", value);
                    }
                    break;
            case 3: exit(0);
            default: printf("\nInvalid choice!");
        }
    }while(1);
    return 0;
}
void createstack(stk *s)
{
    s->Top=-1;
}
void push(stk *s, int element)
{
    if (isfull(s))
    {
        printf("\nOverflow: Stack is full!");
        return;
    }
    s->Top++;
    s->item[s->Top]=element;
    printf("\nValue is pushed onto the stack...");
}
int pop(stk *s)
{
    int popped;
    if (isempty(s))
        return 0; /*callers should test isempty() first*/
    popped=s->item[s->Top];
    s->Top--;
    return popped;
}
int isempty(stk *s)
{
    if (s->Top==-1)
        return True;
    else return False;
}
int isfull(stk *s)
{
    if (s->Top==MAX-1)
        return True;
    else return False;
}
The output of the program is as follows:
Main Menu
1. Push
2. Pop
3. Exit
Enter your choice: 1
Enter the value to be inserted: 23
Value is pushed onto the stack...
Main Menu
1. Push
2. Pop
3. Exit
Enter your choice: 1
Enter the value to be inserted: 35
Value is pushed onto the stack...
Main Menu
1. Push
2. Pop
3. Exit
Enter your choice: 1
Enter the value to be inserted: 40
Value is pushed onto the stack...
Main Menu
1. Push
2. Pop
3. Exit
Enter your choice: 2
Popped item is: 40
Main Menu
1. Push
2. Pop
3. Exit
Enter your choice: 2
Popped item is: 35
Main Menu
1. Push
2. Pop
3. Exit
Enter your choice: 2
Popped item is: 23
Main Menu
1. Push
2. Pop
3. Exit
Enter your choice: 2
Underflow: Stack is empty!

Main Menu
1. Push
2. Pop
3. Exit
Enter your choice: 3

Check Your Progress


1. How can a stack be organized?
2. How can you POP an element from a stack?

9.3 APPLICATION AND IMPLEMENTATION OF STACK

Stacks are used wherever the last-in-first-out principle is required, for example in
reversing strings, checking whether an arithmetic expression is properly
parenthesized, converting infix notation to postfix and prefix notations, evaluating
postfix expressions, and implementing recursion and function calls. This section
discusses some of these applications.
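
The parenthesis check mentioned above is not developed further in this unit, so a minimal sketch is given here. It assumes that only round brackets need to be matched; the function name is_balanced and the limit MAX are illustrative choices, not part of the text. Each ‘(’ is pushed, each ‘)’ pops one ‘(’, and the expression is balanced only if the stack is empty at the end.

```c
#include <stddef.h>

#define MAX 100               /* assumed maximum nesting depth */

/* Returns 1 if every '(' in expr has a matching ')', otherwise 0. */
int is_balanced(const char *expr)
{
    char stack[MAX];
    int top = -1;             /* -1 marks an empty stack */
    size_t i;

    for (i = 0; expr[i] != '\0'; i++) {
        if (expr[i] == '(') {
            if (top == MAX - 1)
                return 0;     /* overflow: treated as a failure in this sketch */
            stack[++top] = expr[i];
        } else if (expr[i] == ')') {
            if (top == -1)
                return 0;     /* underflow: ')' without a preceding '(' */
            top--;
        }
    }
    return top == -1;         /* a leftover '(' means the expression is unbalanced */
}
```

For example, is_balanced("(a+b)*(c-d)") reports a balanced expression, whereas "((a+b)" and ")a+b(" are both rejected.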
Reversing Strings
A simple application of stacks is reversing strings. To reverse a string, the characters
of a string are pushed onto a stack one by one as the string is read from left to
right. Once all the characters of the string are pushed onto the stack, they are
popped one by one. Since the character last pushed in comes out first, subsequent
POP operations result in reversal of the string.
For example, to reverse a string ‘REVERSE’, the string is read from left to
right and its characters are pushed onto a stack, starting from the letter R, then E,
V, E, and so on, as shown in Figure 9.3.
[Figure: the letters R, E, V, E, R, S, E are pushed onto the stack in turn, leaving E at the top; they are then popped one at a time, building Result = E, ES, ESR, and finally ESREVER, after which Top = –1]

Fig. 9.3 Reversing a String using a Stack


Once all the letters are stored in a stack, they are popped one by one. Since the
letter at the top of the stack is E, it is the first letter to be popped. The subsequent
POP operations take out the letters S, R, E, and so on. Thus, the resultant string is
the reverse of the original one, as shown in Figure 9.3.
Algorithm 9.3 String Reversal Using Stack
reversal(s, str)

1. Set i = 0
2. While(i < length_of_str)
       Push str[i] onto the stack
       Set i = i + 1
   End While
3. Set i = 0
4. While(i < length_of_str)
       Pop the top element of the stack and store it in str[i]
       Set i = i + 1
   End While
5. Print “The reversed string is: ”, str
6. End

Example 9.2: The following is a program to reverse a given string using stacks:
#include <stdio.h>
#include <string.h>
#define MAX 101
typedef struct stack
{
    char item[MAX];
    int Top;
}stk;
/*Function prototypes*/
void createstack(stk *);
void reversal(stk *, char *);
void push(stk *, char);
char pop(stk *);
int main(void)
{
    stk s;
    char str[MAX];
    int i, ch;
    createstack(&s);
    do
    {
        printf("Enter any string (max %d characters): ", MAX-1);
        /*read at most MAX-1 characters so that the terminating
          '\0' always fits in str*/
        for(i=0;i<MAX-1;i++)
        {
            ch=getchar();
            if(ch=='\n' || ch==EOF)
                break;
            str[i]=(char)ch;
        }
        str[i]='\0';
    }while(strlen(str)==0 && ch!=EOF);
    reversal(&s, str);
    return 0;
}
/*Function definitions*/
void createstack(stk *s)
{
    s->Top=-1;
}
void reversal(stk *s, char *str)
{
    int i;
    /*str holds at most MAX-1 characters, so the stack cannot overflow*/
    int len=(int)strlen(str);
    for(i=0;i<len;i++)
        push(s, str[i]);
    for(i=0;i<len;i++)
        str[i]=pop(s);
    printf("\nThe reversed string is: %s", str);
}
void push(stk *s, char item)
{
    s->Top++;
    s->item[s->Top]=item;
}
char pop(stk *s)
{
    char popped;
    popped=s->item[s->Top];
    s->Top--;
    return popped;
}
The output of the program is as follows:
Enter any string (max 100 characters): Hello World
The reversed string is: dlroW olleH
9.3.1 Converting Infix Notation to Postfix and Prefix or Polish Notations
Another important application of stacks is the conversion of expressions from
infix notation to postfix and prefix notations. The general way of writing arithmetic
expressions is known as infix notation, where a binary operator is placed between
the two operands on which it operates. For simplicity, expressions containing unary
operators have been ignored. For example, the expressions ‘a+b’, ‘(a–c)*d’ and
‘[(a+b)*(d/f)–f]’ are in infix notation. The order of evaluation of these expressions
depends on the parentheses and the precedence of operators. For example, the
order of evaluation of the expression ‘(a+b)*c’ is different from that of ‘a+(b*c)’.
As a result, it is difficult to evaluate an expression in infix notation directly. Thus,
arithmetic expressions in infix notation are converted to another notation which
can be easily evaluated by a computer system to produce a correct result.
The Polish mathematician Jan Łukasiewicz introduced a notation in which an
operator is written before the operands on which it operates. A companion notation,
in which the operator is written after its operands, was later derived from it. The
notation in which an operator occurs before its operands is known as the prefix
notation (also known as Polish notation). For example, the expressions ‘+ab’ and
‘*–acd’ are in prefix notation. On the other hand, the notation in which an operator
occurs after its operands is known as the postfix notation (also known as
Reverse Polish or suffix notation). For example, the expressions ‘ab+’ and
‘ac–d*’ are in postfix notation.
A characteristic feature of prefix and postfix notations is that the order of
evaluation of expression is determined by the position of the operator and operands
in the expression. That is, the operations are performed in the order in which the
operators are encountered in the expression. Hence, parentheses are not required
for the prefix and postfix notations. Moreover, while evaluating the expression, the
precedence of operators is insignificant. As a result, they are compiled faster than
the expressions in infix notation. Note that the expressions in an infix notation can
be converted to both prefix and postfix notations. The subsequent sections will
discuss both the types of conversions.
Conversion of infix to postfix notation
To convert an arithmetic expression from an infix notation to a postfix notation, the
precedence and associativity rules of operators should always be kept in mind. The
operators of the same precedence are evaluated from left to right. This conversion
can be performed either manually (without using stacks) or by using stacks.
Following are the steps for converting the expression manually:
(i) The actual order of evaluation of the expression in infix notation is determined.
This is done by inserting parentheses in the expression according to the
precedence and associativity of operators.
(ii) The expression in the innermost parentheses is converted into postfix notation
by placing the operator after the operands on which it operates.
(iii) Step (ii) is repeated until the entire expression is converted into a postfix
notation.
For example, to convert the expression ‘a+b*c’ into an equivalent postfix
notation, the steps will be as follows:
(i) Since the precedence of * is higher than +, the expression b*c has to be
evaluated first. Hence, the expression is written as follows:
(a+(b*c))
(ii) The expression in the innermost parentheses, that is, b*c is converted
into its postfix notation. Hence, it is written as bc*. The expression now
becomes as follows:
(a+bc*)
(iii) Now the operator + has to be placed after its operands. The two operands
for + operator are a and the expression bc*. The expression now becomes
as follows:
(abc*+)
Hence, the equivalent postfix expression will be as follows:
abc*+
When expressions are complex, manual conversion becomes difficult. On
the other hand, the conversion of an infix expression into a postfix expression is
simple when it is implemented through stacks. In this method, the infix expression
is read from left to right and a stack is used to temporarily store the operators and
the left parenthesis. The order in which the operators are pushed on to and popped
from the stack depends on the precedence of operators and the occurrence of
parentheses in the infix expression. The operands in the infix expression are not
pushed on to the stack, rather they are directly placed in the postfix expression.
Note that the operands maintain the same order as in the original infix notation.
Algorithm 9.4 Infix to Postfix Conversion
infixtopostfix(s, infix, postfix)

1. Push ‘(’ onto the stack and append ‘)’ to the end of infix
2. Set i = 0
3. While (i < number_of_symbols_in_infix)
       If infix[i] is a whitespace or comma
           Set i = i + 1 and go to step 3
       If infix[i] is an operand, add it to postfix
       Else If infix[i] = ‘(’, push it onto the stack
       Else If infix[i] is an operator, follow these steps:
           i. While the operator on the top of the stack has precedence greater
              than or equal to the precedence of the current operator, pop that
              operator from the stack and add it to postfix
           ii. Push the current operator onto the stack
       Else If infix[i] = ‘)’, follow these steps:
           i. Pop each operator from the top of the stack and add it to postfix
              until ‘(’ is encountered in the stack
           ii. Remove ‘(’ from the stack and do not add it to postfix
       End If
       Set i = i + 1
   End While
4. End
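
Algorithm 9.4 can be sketched in C as follows. This is an illustrative version under stated assumptions, not the text's own program: operands are single letters or digits, only the operators +, -, * and / are handled, and the names prec, infix_to_postfix and MAX are hypothetical. Instead of wrapping the expression in an extra pair of parentheses, the sketch pops any operators left on the stack once the input is exhausted, which has the same effect.

```c
#include <ctype.h>

#define MAX 100               /* assumed maximum number of stacked symbols */

/* Precedence of a binary operator; 0 for anything else, including '('. */
static int prec(char op)
{
    switch (op) {
    case '*': case '/': return 2;
    case '+': case '-': return 1;
    default:            return 0;
    }
}

/* Converts infix to postfix. The postfix buffer must have room for
   strlen(infix) + 1 characters. Returns 0 on success and -1 on
   unbalanced parentheses, an unknown symbol or stack overflow. */
int infix_to_postfix(const char *infix, char *postfix)
{
    char stack[MAX];
    int top = -1, i, j = 0;

    for (i = 0; infix[i] != '\0'; i++) {
        char c = infix[i];
        if (isspace((unsigned char)c) || c == ',')
            continue;                         /* skip separators */
        if (isalnum((unsigned char)c)) {
            postfix[j++] = c;                 /* operands go straight to the output */
        } else if (c == '(') {
            if (top == MAX - 1) return -1;
            stack[++top] = c;
        } else if (c == ')') {
            while (top >= 0 && stack[top] != '(')
                postfix[j++] = stack[top--];
            if (top < 0) return -1;           /* no matching '(' */
            top--;                            /* discard the '(' */
        } else if (prec(c) > 0) {
            /* pop operators of greater or equal precedence first */
            while (top >= 0 && prec(stack[top]) >= prec(c))
                postfix[j++] = stack[top--];
            if (top == MAX - 1) return -1;
            stack[++top] = c;
        } else {
            return -1;                        /* unknown symbol */
        }
    }
    while (top >= 0) {                        /* flush the remaining operators */
        if (stack[top] == '(') return -1;     /* unmatched '(' */
        postfix[j++] = stack[top--];
    }
    postfix[j] = '\0';
    return 0;
}
```

Called with the infix string "a-(b+c)*d/f", the function fills the output buffer with "abc+d*f/-".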

For example, consider the conversion of the following infix expression to a postfix
expression:
a-(b+c)*d/f
Initially, a left parenthesis ‘(’ is pushed onto the stack and the infix expression is
appended with a right parenthesis, ‘)’. The initial state of the stack, infix expression
and postfix expression are shown in Figure 9.4.
[Figure: the infix expression a - ( b + c ) * d / f ), shown together with the postfix expression and the stack in their initial states]

Fig. 9.4 Initial State of the Stack, Infix Expression, and Postfix Expression

infix is read from left to right and the following steps are performed:
1. The operand a is encountered, which is directly put to postfix.
2. The operator – is pushed on to the stack.
3. The left parenthesis ‘(’ is pushed onto the stack.
4. The next element is b, which being an operand is directly put to postfix.
5. +, being an operator, is pushed onto the stack.
6. Next, c is put to postfix.
7. The next element is the right parenthesis ‘)’ and hence, the operators at the
top of the stack are popped until ‘(’ is encountered in the stack. Till then,
the only operator in the stack above the ‘(’ is +, which is popped and put
to postfix. ‘(’ is popped and removed from the stack, as shown in
Figure 9.5(a). Figure 9.5(b) shows the current position of stack.
8. Then, the next element * is an operator and hence, it is pushed onto the
stack.
9. Then, d is put to postfix.
10. The next element is /. Since the precedence of / is the same as the precedence
of *, the operator * is popped from the stack and put to postfix, and / is pushed
onto the stack, as shown in Figure 9.6.
11. The operand f is directly put to postfix after which, ‘)’ is encountered.

[Figure: (a) the infix expression a - ( b + c ) * d / f ) with postfix = a b c and the stack holding (, -, ( and +, at the point where + is popped and ‘(’ is removed; (b) the stack afterwards, holding ( and -]
(a) Postfix Expression when + is Popped (b) State of the Stack

Fig. 9.5 Intermediate States of Postfix and Infix Expressions and the Stack

12. On reaching ‘)’, the operators in the stack are popped until the next ‘(’ is
reached. Hence, / and – are popped and put to postfix, as shown in
Figure 9.6.
13. ‘(’ is removed from the stack. Since the stack is empty, the algorithm is
terminated and postfix is printed.

Fig. 9.6 The State when – and / are Popped

The step-wise conversion of expression a-(b+c)*d/f into its equivalent postfix
expression is shown in Table 9.1.
Table 9.1 Conversion of Infix Expression into Postfix

Element   Action Performed                      Stack Status   Postfix Expression
a         Put to postfix                        (              a
-         Push                                  (-             a
(         Push                                  (-(            a
b         Put to postfix                        (-(            ab
+         Push                                  (-(+           ab
c         Put to postfix                        (-(+           abc
)         Pop +, put to postfix, pop (          (-             abc+
*         Push                                  (-*            abc+
d         Put to postfix                        (-*            abc+d
/         Pop *, put to postfix, push /         (-/            abc+d*
f         Put to postfix                        (-/            abc+d*f
)         Pop / and -, put to postfix, pop (    Empty          abc+d*f/-

Conversion of infix to prefix notation


The conversion of an infix expression to a prefix expression is similar to the
conversion of infix to postfix expression. The only difference is that the expression
in an infix notation is scanned in reverse order, that is, from right to left. Therefore,
the stack in this case stores the operators and the closing (right) parenthesis.
Algorithm 9.5 Infix to Prefix Conversion
infixtoprefix(s, infix, prefix)

1. Push ‘)’ onto the stack and insert ‘(’ at the beginning of infix
2. Set i = number_of_symbols_in_infix - 1
3. While (i >= 0)
       If infix[i] is a whitespace or comma
           Set i = i - 1 and go to step 3
       If infix[i] is an operand, add it to prefix
       Else If infix[i] = ‘)’, push it onto the stack
       Else If infix[i] is an operator, follow these steps:
           i. While the operator on the top of the stack has precedence greater
              than the precedence of the current operator, pop that operator
              from the stack and add it to prefix
           ii. Push the current operator onto the stack
       Else If infix[i] = ‘(’, follow these steps:
           i. Pop each operator from the top of the stack and add it to prefix
              until ‘)’ is encountered in the stack
           ii. Remove ‘)’ from the stack and do not add it to prefix
       End If
       Set i = i - 1
   End While
4. Reverse the prefix expression
5. End
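
The right-to-left scan can be sketched in C in the same way. The names prec, infix_to_prefix and MAX are hypothetical, operands are assumed to be single letters or digits, and only +, -, * and / are handled. Following the trace in Table 9.2, the sketch pops only operators of strictly higher precedence than the current one, which keeps operators of equal precedence grouped from left to right. Operators left on the stack are flushed at the end instead of adding an extra pair of parentheses.

```c
#include <ctype.h>
#include <string.h>

#define MAX 100               /* assumed maximum number of stacked symbols */

/* Precedence of a binary operator; 0 for anything else, including ')'. */
static int prec(char op)
{
    switch (op) {
    case '*': case '/': return 2;
    case '+': case '-': return 1;
    default:            return 0;
    }
}

/* Converts infix to prefix. The prefix buffer must have room for
   strlen(infix) + 1 characters. Returns 0 on success and -1 on
   unbalanced parentheses, an unknown symbol or stack overflow. */
int infix_to_prefix(const char *infix, char *prefix)
{
    char stack[MAX];
    int top = -1, j = 0, i, k;
    int n = (int)strlen(infix);

    for (i = n - 1; i >= 0; i--) {            /* scan from right to left */
        char c = infix[i];
        if (isspace((unsigned char)c) || c == ',')
            continue;
        if (isalnum((unsigned char)c)) {
            prefix[j++] = c;
        } else if (c == ')') {
            if (top == MAX - 1) return -1;
            stack[++top] = c;
        } else if (c == '(') {
            while (top >= 0 && stack[top] != ')')
                prefix[j++] = stack[top--];
            if (top < 0) return -1;           /* no matching ')' */
            top--;                            /* discard the ')' */
        } else if (prec(c) > 0) {
            /* strictly greater: equal-precedence operators stay stacked */
            while (top >= 0 && prec(stack[top]) > prec(c))
                prefix[j++] = stack[top--];
            if (top == MAX - 1) return -1;
            stack[++top] = c;
        } else {
            return -1;                        /* unknown symbol */
        }
    }
    while (top >= 0) {                        /* flush the remaining operators */
        if (stack[top] == ')') return -1;     /* unmatched ')' */
        prefix[j++] = stack[top--];
    }
    prefix[j] = '\0';
    for (i = 0, k = j - 1; i < k; i++, k--) { /* reverse the result in place */
        char t = prefix[i];
        prefix[i] = prefix[k];
        prefix[k] = t;
    }
    return 0;
}
```

For the infix string "a-(b+c)*d/f" the function produces "-a/*+bcdf", and for "a-b-c" it produces "--abc", that is, (a-b)-c.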

For example, consider the conversion of the following infix expression to a prefix
expression:
a-(b+c)*d/f

The step-wise conversion of the expression a-(b+c)*d/f into its equivalent
prefix expression is shown in Table 9.2. Note that initially ‘)’ is pushed onto the
stack, and ‘(’ is inserted at the beginning of the infix expression. Since the infix
expression is scanned from right to left, but elements are inserted in the resultant
expression from left to right, the prefix expression needs to be reversed.
Table 9.2 Conversion of Infix Expression into Prefix Expression

Element   Action Performed                          Stack Status   Prefix Expression
f         Put to expression                         )              f
/         Push                                      )/             f
d         Put to expression                         )/             fd
*         Push                                      )/*            fd
)         Push                                      )/*)           fd
c         Put to expression                         )/*)           fdc
+         Push                                      )/*)+          fdc
b         Put to expression                         )/*)+          fdcb
(         Pop + and put to expression, pop )        )/*            fdcb+
-         Pop * and /, put to expression, push -    )-             fdcb+*/
a         Put to expression                         )-             fdcb+*/a
(         Pop - and put to expression, pop )        Empty          fdcb+*/a-
Reverse the resultant expression                                   -a/*+bcdf

The equivalent prefix expression is –a/*+bcdf.


Evaluation of Postfix Expression
In a computer system, when an arithmetic expression in an infix notation needs to
be evaluated, it is first converted into its postfix notation. The equivalent postfix
expression is then evaluated. Evaluation of postfix expressions is also implemented
through stacks. Since the postfix expression is evaluated in the order of appearance
of operators, parentheses are not required in the postfix expression. During
evaluation, a stack is used to store the intermediate results of evaluation.
Since an operator appears after its operands in a postfix expression, the
expression is evaluated from left to right. Each element in the expression is checked
to find out whether it is an operator or an operand. If the element is an operand, it
is pushed onto the stack. On the other hand, if the element is an operator, the top
two operands are popped from the stack and the operation is performed on them.
The result of this operation is then pushed back to the stack. This process is
repeated until the entire expression is evaluated.
Algorithm 9.6 Evaluation of a Postfix Expression
evaluationofpostfix(s, postfix)

1. Set i = 0, RES=0.0
2. While (i < number_of_characters_in_postfix)
       If postfix[i] is a whitespace or comma
           Set i = i + 1 and continue
       If postfix[i] is an operand, push it onto the stack
       Else If postfix[i] is an operator, follow these steps:
           i. Pop the top element from stack and store it in operand2
           ii. Pop the next top element from stack and store it in operand1
           iii. Evaluate operand1 op operand2, and store the result in
                RES (op is the current operator)
           iv. Push RES back to stack
       End If
       Set i = i + 1
   End While
3. Pop the top element and store it in RES
4. Return RES
5. End
For example, consider the evaluation of the following postfix expression using
stacks:
abc+d*f/-
where,
a=6
b=3
c=6
d=5
f=9
After substituting the values of a, b, c, d and f, the postfix expression becomes
as follows:
636+5*9/-
The following are the steps performed to evaluate an expression:
1. The expression to be evaluated is read from left to right and each element is
checked to find out if it is an operand or an operator.
2. First element is 6, which being an operand is pushed onto the stack.
3. Similarly, the operands 3 and 6 are pushed onto the stack.
4. Next element is +, which is an operator. Hence, the element at the top of
stack 6 and the next top element 3 are popped from the stack, as shown in
Figure 9.7.

Fig. 9.7 Evaluation of the Expression using Stacks

5. Expression 3+6 is evaluated and the result, that is 9, is pushed back to the
stack, as shown in Figure 9.8.
6. Next element in the expression, that is 5, is pushed to the stack.

7. Next element is *, which is a binary operator. Hence, the stack is popped
twice and the elements 5 and 9 are taken off from the stack, as shown in
Figure 9.8.

Fig. 9.8 Popping 9 and 5 from the Stack

8. Expression 9*5 is evaluated and the result, that is 45, is pushed back onto
the stack.
9. Next element in the postfix expression is 9, which is pushed onto the stack.
10. Next element is the operator /. Therefore, the two operands from the top
of the stack, that is 9 and 45, are popped from the stack and the operation
45/9 is performed. Result 5 is again pushed to the stack.
11. Next element in the expression is –. Hence, 5 and 6 are popped from the
stack and the operation 6–5 is performed. The resulting value, that is 1, is
pushed to the stack (see Figure 9.9).

Fig. 9.9 Final State of Stack with the Result

12. There are no more elements to be processed in the expression. Element on
top of the stack is popped, which is the result of the evaluation of the postfix
expression. Thus, the result of the expression is 1.
The step-wise evaluation of the expression 636+5*9/- is shown in Table 9.3.
Table 9.3 Evaluation of the Postfix Expression

Element   Action Performed     Stack Status
6         Push to stack        6
3         Push to stack        6 3
6         Push to stack        6 3 6
+         Pop 6                6 3
          Pop 3                6
          Evaluate 3+6=9       6
          Push 9 to stack      6 9
5         Push to stack        6 9 5
*         Pop 5                6 9
          Pop 9                6
          Evaluate 9*5=45      6
          Push 45 to stack     6 45
9         Push to stack        6 45 9
/         Pop 9                6 45
          Pop 45               6
          Evaluate 45/9=5      6
          Push 5 to stack      6 5
-         Pop 5                6
          Pop 6                EMPTY
          Evaluate 6-5=1       EMPTY
          Push 1 to stack      1
          Pop, VALUE=1         EMPTY
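
The evaluation traced in Table 9.3 can be reproduced with the following sketch. It assumes that every operand is a single digit, as in the expression 636+5*9/-, and that only +, -, * and / occur; the name eval_postfix and the limit MAX are illustrative. The element popped first is treated as the right-hand operand, so that 45 and 9 are combined as 45/9.

```c
#include <ctype.h>

#define MAX 100               /* assumed maximum number of stacked operands */

/* Evaluates a postfix expression whose operands are single digits.
   Stores the value in *result and returns 0; returns -1 on a malformed
   expression, stack overflow or division by zero. */
int eval_postfix(const char *postfix, double *result)
{
    double stack[MAX];
    int top = -1, i;

    for (i = 0; postfix[i] != '\0'; i++) {
        char c = postfix[i];
        if (isspace((unsigned char)c) || c == ',')
            continue;                         /* skip separators */
        if (isdigit((unsigned char)c)) {
            if (top == MAX - 1) return -1;
            stack[++top] = c - '0';           /* push the operand's value */
        } else {
            double operand1, operand2, res;
            if (top < 1) return -1;           /* an operator needs two operands */
            operand2 = stack[top--];          /* popped first: right operand */
            operand1 = stack[top--];          /* popped second: left operand */
            switch (c) {
            case '+': res = operand1 + operand2; break;
            case '-': res = operand1 - operand2; break;
            case '*': res = operand1 * operand2; break;
            case '/':
                if (operand2 == 0.0) return -1;
                res = operand1 / operand2;
                break;
            default:
                return -1;                    /* unknown symbol */
            }
            stack[++top] = res;               /* push the intermediate result */
        }
    }
    if (top != 0) return -1;                  /* exactly one value must remain */
    *result = stack[top];
    return 0;
}
```

For the string "636+5*9/-" the function stores 1 in the result, in agreement with the table.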

Check Your Progress


3. What is a characteristic feature of prefix and postfix notation?
4. Where are stacks used?
5. Where is stack used during evaluation?

9.4 ANSWERS TO CHECK YOUR PROGRESS QUESTIONS

1. A stack can be organized (represented) in the memory either as an array or
as a singly-linked list.
2. To POP (or remove) an element from a stack, the element at the top of the
stack is assigned to a local variable and then Top is decremented by one.
3. A characteristic feature of prefix and postfix notations is that the order of
evaluation of expression is determined by the position of the operator and
operands in the expression.
4. Stacks are used where the last-in-first-out principle is required, such as in
reversing strings.
5. During evaluation, a stack is used to store the intermediate results of
evaluation.

9.5 SUMMARY

• A stack can be organized (represented) in the memory either as an array or
as a singly-linked list.
• An array representation of a stack is static, but a linked list representation is
dynamic in nature.
• Though array representation is a simple technique, it provides less flexibility
and is not very efficient with respect to memory utilization.
• When a stack is organized as an array, a variable named Top is used to
point to the top element of the stack. Initially, the value of Top is set as -1 to
indicate an empty stack.
• Overflow occurs when a stack is full and there is no space for a new element
and an attempt is made to push a new element.
• Before removing the top element from the stack, it is necessary to
check the condition of underflow.
• To POP (or remove) an element from a stack, the element at the top of the
stack is assigned to a local variable and then Top is decremented by one.
• Stacks are used where the last-in-first-out principle is required, such as in
reversing strings.
• To reverse a string, the characters of the string are pushed onto a stack one
by one as the string is read from left to right. Once all the characters are
pushed onto the stack, they are popped one by one. Since the character last
pushed in comes out first, the POP operations result in reversal of the string.
• The general way of writing arithmetic expressions is known as infix notation,
where the binary operator is placed between the two operands on which it
operates.
• A characteristic feature of prefix and postfix notations is that the order of
evaluation of expression is determined by the position of the operator and
operands in the expression.
• To convert an arithmetic expression from an infix notation to a postfix notation,
the precedence and associativity rules of operators should always be kept in
mind.
• The conversion of an infix expression to a prefix expression is similar to the
conversion of infix to postfix expression.
• In a computer system, when an arithmetic expression in an infix notation
needs to be evaluated, it is first converted into its postfix notation.
• Evaluation of postfix expressions is also implemented through stacks. During
evaluation, a stack is used to store the intermediate results.
• Since the postfix expression is evaluated in the order of appearance of
operators, parentheses are not required in the postfix expression.
• Since an operator appears after its operands in a postfix expression, the
expression is evaluated from left to right.

9.6 KEY WORDS

• Overflow: It occurs when a stack is full and there is no space for a new
element and an attempt is made to push a new element.
• Stack: It is an abstract data type that serves as a collection of elements,
with two principal operations: push, which adds an element to the collection,
and pop, which removes the most recently added element that has not yet
been removed.
• Reversing Strings: It is a simple application of stacks. To reverse a string,
the characters of a string are pushed onto a stack one by one as the string is
read from left to right.

9.7 SELF ASSESSMENT QUESTIONS AND EXERCISES

Short-Answer Questions
1. What is a stack?
2. How is a stack organized?
3. What do you mean by overflow in a stack?
4. Write the algorithm for reversing a string.
5. Write a short note on stacks.
6. Discuss the application of stacks.
Long-Answer Questions
1. Write a program to implement a stack as an array.
2. How can you insert elements in a stack? Explain.
3. Write a note on the operations on stack.
4. What do you mean by implementation of stack? Discuss in detail.
5. Write a program to reverse a given string using stacks.
6. Write a detailed note on Conversion of infix to postfix notation.
7. Write a program to convert an expression from infix notation to postfix
notation.

9.8 FURTHER READINGS

Gottfried, Byron S. 1996. Programming With C, Schaum’s Outline Series. New
York: McGraw-Hill.
Jeyapoovan T. 2006. Computer Programming - Theory & Practice. New Delhi:
Vikas Publishing.
Khurana, Rohit. 2005. Object Oriented Programming. New Delhi: Vikas
Publishing.
Kanetkar, Yashwant. 2003. Let Us C. New Delhi: BPB Publication.
Saxena, Sanjay. 2003. A First Course In Computers. New Delhi: Vikas
Publishing.
Subburaj, R. 2000. Programming In C. New Delhi: Vikas Publishing.

UNIT 10 QUEUES
Structure
10.0 Introduction
10.1 Objectives
10.2 Queues
10.3 Representation of Queues
10.4 Circular Queue
10.5 Applications of Queues
10.6 Answers to Check Your Progress Questions
10.7 Summary
10.8 Key Words
10.9 Self Assessment Questions and Exercises
10.10 Further Readings

10.0 INTRODUCTION

In this unit, you will learn about queues, their representation and their applications.
A queue is an abstract data structure that is somewhat similar to a stack. Unlike
a stack, however, a queue is open at both its ends. One end of a queue is always used
to insert data (an operation called enqueue) and the other is used to remove data
(called dequeue). A queue follows the First-In-First-Out (FIFO) methodology,
which means that the data item stored first will be accessed first.

10.1 OBJECTIVES

After going through this unit, you will be able to:


• Understand queues
• Discuss the representation of queues
• Analyze circular queues
• List the applications of queues

10.2 QUEUES

A queue is a linear data structure in which a new element is inserted at one end
and an element is deleted from the other end. The end of the queue from which the
element is deleted is known as the Front and the end at which a new element is
added is known as the Rear. Figure 10.1 shows a queue.

[Figure: a queue, with Deletion at the Front end and Insertion at the Rear end]

Fig. 10.1 A Queue

The following are the basic operations that can be performed on queues:
• Insert Operation: To insert an element at the rear of the queue
• Delete Operation: To delete an element from the front of the queue
Before inserting a new element in the queue, it is necessary to check whether there
is space for the new element. If no space is available, the queue is said to be in the
condition of overflow. Similarly, before deleting an element from the queue, it is
necessary to check whether there is an element in the queue. If there is no element
in the queue, the queue is said to be in the condition of underflow.

10.3 REPRESENTATION OF QUEUES

Like stacks, queues can be represented in the memory by using an array or a
singly linked list. In this section, we will discuss how a queue can be implemented
using an array and using a linked list.
Array Implementation of a Queue
When a queue is implemented as an array, all the characteristics of an array are
applicable to the queue. Since an array is a static data structure, the array
representation of a queue requires the maximum size of the queue to be
predetermined and fixed. Because the number of elements in a queue keeps changing
as elements are inserted or deleted, the maximum size should be large enough to
hold the largest number of elements the queue is expected to contain.
The representation of a queue as an array needs an array to hold the elements
of the queue and two variables Rear and Front to keep track of the rear and
the front ends of the queue, respectively. Initially, the values of Rear and Front
are set to –1 to indicate an empty queue. Before we insert a new element in the
queue, it is necessary to test the condition of overflow. A queue is in a condition of
overflow (full) when Rear is equal to MAX–1, where MAX is the maximum size
of the array. If the queue is not full, the insert operation can be performed. To
insert an element in the queue, Rear is incremented by one and the element is
inserted at that position.
Similarly, before we delete an element from a queue, it is necessary to test
the condition of underflow. A queue is in the condition of underflow (empty) when
the value of Front is –1. If a queue is not empty, the delete operation can be
performed. To delete an element from a queue, the element referred to by Front is
assigned to a local variable and then Front is incremented by one.
The total number of elements in a non-empty queue at a given point of time can be
calculated from the values of Rear and Front as follows:
Number of elements = Rear – Front + 1
To understand the implementation of a queue as an array in detail, consider
a queue stored in the memory as an array named Queue that has MAX as its
maximum number of elements. Rear and Front store the indices of the rear
and front elements of Queue. Initially, Rear and Front are set to –1 to indicate
an empty queue (refer Figure 10.2(a)).
Whenever a new element has to be inserted in a queue, Rear is incremented
by one and the element is stored at Queue[Rear]. Suppose an element 9 is to
be inserted in the queue. In this case, the rear is incremented from –1 to 0 and the
element is stored at Queue[0]. Since it is the first element to be inserted, Front
is also incremented by one to make it to refer to the first element of the queue
(refer Figure 10.2(b)). For subsequent insertions, the value of Rear is incremented
by one and the element is stored at Queue[Rear]. However, Front remains
unchanged (refer Figure 10.2(c)). Observe that the front and rear elements of the
queue are the first and last elements of the list, respectively.
Whenever an element is to be deleted from a queue, Front is incremented
by one. Suppose that an element is to be deleted from Queue. This element must
be 9, because deletion is always made at the front end of a queue. Deletion
of the first element results in the queue shown in Figure 10.2(d). Similarly,
deletion of the second element results in the queue shown in Figure 10.2(e).
Observe that after deleting the second element from the queue, the values of Rear
and Front are equal. When the values of Front and Rear are equal and other
than –1, there is only one element in the queue. When this
only element of the queue is deleted, both Rear and Front are again made
equal to –1 to indicate an empty queue.
Further, suppose that some more elements are inserted and Rear reaches MAX–1,
the last index of the array (refer Figure 10.2(f)). The queue is then treated as full
and no more elements can be inserted in it, even though there is vacant space on the
left of Front.

Array indices run from 0 to MAX-1; "_" marks a vacant cell.

State                                                  Contents from index 0      Front   Rear
(a) An Empty Queue                                     (all cells vacant)         –1      –1
(b) Queue after Inserting the First Element            9                          0       0
(c) Queue after Inserting a few Elements               9, 5, 3                    0       2
(d) Queue after Deleting the First Element             _, 5, 3                    1       2
(e) Queue after Deleting the Second Element            _, _, 3                    2       2
(f) Queue having Vacant Space though Rear = MAX – 1    _, _, 3, 8, 2, ..., 7      2       MAX – 1

Fig. 10.2 Various States of a Queue after the Insert and Delete Operations

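To implement a queue as an array in the C language, the following structure named
queue is used:
struct queue
{
    int item[MAX];   /* array holding the elements of the queue */
    int Front;       /* index of the front element; -1 when the queue is empty */
    int Rear;        /* index of the rear element; -1 when the queue is empty */
};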

Algorithm 10.1 Insert Operation on a Queue


qinsert(q, val)   //q is a pointer to structure type queue and val is the value to be inserted

1. If q->Rear = MAX-1                 //check if queue is full
       Print “Overflow: Queue is full!” and go to step 5
   End If
2. If q->Front = -1                   //check if queue is empty
       Set q->Front = 0               //make front refer to first element
   End If
3. Set q->Rear = q->Rear + 1          //increment Rear by one
4. Set q->item[q->Rear] = val         //insert val
5. End

Algorithm 10.2 Delete Operation on a Queue


qdelete(q)

1. If q->Front = -1                       //check if queue is empty
       Print “Underflow: Queue is empty!”
       Return 0 and go to step 5
   End If
2. Set del_val = q->item[q->Front]        //del_val is the value to be deleted
3. If q->Front = q->Rear                  //check if there is only one element
       Set q->Front = q->Rear = -1
   Else
       Set q->Front = q->Front + 1        //increment Front by one
   End If
4. Return del_val
5. End
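
Algorithms 10.1 and 10.2 can be sketched in C roughly as follows. This is only an illustrative sketch: the value of MAX, the helper qinit() and the success/failure return value of qinsert() are assumptions added here and are not part of the algorithms above.

```c
#include <stdio.h>

#define MAX 5   /* assumed capacity, chosen only for illustration */

struct queue {
    int item[MAX];
    int Front;   /* index of the front element, -1 when empty */
    int Rear;    /* index of the rear element, -1 when empty */
};

/* Hypothetical helper: put the queue into the empty state. */
void qinit(struct queue *q)
{
    q->Front = -1;
    q->Rear = -1;
}

/* Algorithm 10.1. Returns 1 on success and 0 on overflow (assumed convention). */
int qinsert(struct queue *q, int val)
{
    if (q->Rear == MAX - 1) {            /* queue is full */
        printf("Overflow: Queue is full!\n");
        return 0;
    }
    if (q->Front == -1)                  /* first insertion into an empty queue */
        q->Front = 0;
    q->Rear = q->Rear + 1;
    q->item[q->Rear] = val;
    return 1;
}

/* Algorithm 10.2. Returns the deleted value, or 0 on underflow. */
int qdelete(struct queue *q)
{
    int del_val;

    if (q->Front == -1) {                /* queue is empty */
        printf("Underflow: Queue is empty!\n");
        return 0;
    }
    del_val = q->item[q->Front];
    if (q->Front == q->Rear)             /* only one element was left */
        q->Front = q->Rear = -1;
    else
        q->Front = q->Front + 1;
    return del_val;
}
```

Because qdelete() returns 0 on underflow, as in Algorithm 10.2, a caller cannot tell an empty queue from a stored value of 0 by the return value alone; checking Front before calling is one way around this.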

Linked Implementation of a Queue


A queue implemented as a linked list is known as a linked queue. A linked queue
is represented using two pointer variables Front and Rear that point to the
first and the last node of the queue, respectively. Initially, Rear and Front are
set to NULL to indicate an empty queue.
To understand the implementation of a linked queue, consider a linked queue,
say Queue. The info and next fields of each node represent the element of
the queue and a pointer to the next element in the queue, respectively. Whenever
a new element is to be inserted in the queue, a new node nptr is created and the
element is inserted into the node. If it is the first element being inserted in the
queue, both Front and Rear are modified to point to this new node. On the
other hand, in subsequent insertions, only Rear is modified to point to the new
node; Front remains unchanged.
Whenever an element is deleted from the queue, a temporary pointer is
created, which is made to point to the node pointed to by Front. Then Front
is modified to point to the next node in the queue, and the node referred to by
the temporary pointer is released from the memory. Figure 10.3 shows the various
states of a queue after the insert and delete operations.
Note: Since the memory is allocated dynamically, a linked queue reaches the overflow
condition when no more free memory space is available to be dynamically allocated.

[Figure panels: (a) Queue after Inserting the First Element; (b) Queue after Inserting
Element 2; (c) Queue after One more Insertion; (d) Queue after Deleting One Element]

Fig. 10.3 Various States of a Linked Queue after the Insert and Delete Operations

Algorithm 10.3 Insert Operation on a Linked Queue
qinsert(q, val)   //val is the value to be inserted

1. Allocate memory for nptr           //nptr is a pointer to the new node to be inserted
2. If nptr = NULL                     //checking for queue overflow
       Print “Overflow: Memory not allocated!” and go to step 6
   End If
3. Set nptr->info = val
4. Set nptr->next = NULL
5. If q->Front = NULL                 //check if queue is empty
       Set q->Rear = q->Front = nptr  //rear and front are made to point to new node
   Else
       Set q->Rear->next = nptr
       Set q->Rear = nptr             //Rear is made to point to new node
   End If
6. End

Algorithm 10.4 Delete Operation on a Linked Queue


qdelete(q)

1. If q->Front = NULL
       Print “Underflow: Queue is empty!”
       Return 0 and go to step 7
   End If
2. Set del_val = q->Front->info           //del_val is the element pointed to by Front
3. Set temp = q->Front                    //temp is the temporary pointer to Front
4. If q->Front = q->Rear                  //checking if there is one element in the queue
       Set q->Front = q->Rear = NULL
   Else
       Set q->Front = q->Front->next      //making Front point to next node
   End If
5. De-allocate temp                       //de-allocating memory
6. Return del_val
7. End
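
A possible C rendering of Algorithms 10.3 and 10.4 is sketched below. The text does not give the structure definitions for a linked queue, so the type names qnode and lqueue, the function names lq_insert() and lq_delete(), and the 1/0 return value of the insert function are assumptions made for this example.

```c
#include <stdio.h>
#include <stdlib.h>

/* Assumed node and queue types; the text does not define them. */
struct qnode {
    int info;
    struct qnode *next;
};

struct lqueue {
    struct qnode *Front;   /* first node, NULL when the queue is empty */
    struct qnode *Rear;    /* last node, NULL when the queue is empty */
};

/* Algorithm 10.3. Returns 1 on success, 0 if memory could not be allocated. */
int lq_insert(struct lqueue *q, int val)
{
    struct qnode *nptr = malloc(sizeof *nptr);

    if (nptr == NULL) {
        printf("Overflow: Memory not allocated!\n");
        return 0;
    }
    nptr->info = val;
    nptr->next = NULL;
    if (q->Front == NULL) {          /* empty queue: new node is both ends */
        q->Rear = q->Front = nptr;
    } else {
        q->Rear->next = nptr;        /* link after the current last node */
        q->Rear = nptr;
    }
    return 1;
}

/* Algorithm 10.4. Returns the deleted value, or 0 on underflow. */
int lq_delete(struct lqueue *q)
{
    struct qnode *temp;
    int del_val;

    if (q->Front == NULL) {
        printf("Underflow: Queue is empty!\n");
        return 0;
    }
    del_val = q->Front->info;
    temp = q->Front;
    if (q->Front == q->Rear)         /* only one node in the queue */
        q->Front = q->Rear = NULL;
    else
        q->Front = q->Front->next;
    free(temp);                      /* release the removed node */
    return del_val;
}
```

A queue of this kind starts out with both pointers set to NULL, for example `struct lqueue q = { NULL, NULL };`.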

Check Your Progress


1. When is the queue said to be in the condition of underflow?
2. What happens when a queue is implemented as an array?

10.4 CIRCULAR QUEUE

As discussed earlier, in the case of a queue represented as an array, once the
value of Rear reaches MAX–1, the last index of the array, no more elements can
be inserted. However, the space on the left of the Front index may be vacant.
Hence, in spite of space on the left of Front being
empty, the queue is considered full. This wastage of space in the array
implementation of a queue can be avoided by shifting the elements to the
beginning of the array if space is available. In order to do this, the values of the
Rear and Front indices have to be changed accordingly. However, shifting the
elements in this way is time-consuming and awkward to implement. An alternative
solution to this problem is to implement the queue as a circular queue.
The array implementation of a circular queue is similar to the array
implementation of the queue. The only difference is that as soon as the Rear index
of the queue reaches MAX–1, the last index of the array, Rear is reset to the beginning
of the array, provided that location is free. The circular queue is full only when all
the locations in the array are occupied. A circular queue is shown in Figure 10.4.

Fig. 10.4 A Circular Queue


Note: A circular queue is generally implemented as an array though it can also be implemented
as a circular linked list.

To understand the operations on a circular queue, consider a circular queue
represented in the memory by the array CQueue[MAX]. Rear and Front
are used to store the indices of the rear and front elements of CQueue, respectively.
Initially, both Rear and Front are set to –1 to indicate an empty queue.
Whenever an element is to be inserted in a circular queue, Rear is
incremented by one. However, if the value of Rear is MAX-1, instead of
incrementing Rear, it is reset to the first index of the array if space is available at
the beginning. Hence, if any location to the left of the Front index is empty,
elements can be added to the queue starting from index 0. A circular queue is
considered full in the following cases:
• When the value of Rear equals MAX-1, the last index of the array, and Front
is at the beginning of the array (index 0)
• When the value of Front is one more than the value of Rear
Whenever an element is to be deleted from the queue, Front is incremented
by one. However, if the value of Front is MAX-1, it is reset to the 0th position
in the array. When the value of Front equals the value of Rear (other than –1),
it indicates that there is only one element in the queue. On deleting this last element,
both Rear and Front are reset to –1 to indicate an empty queue. Figure
10.5 shows the various states of a queue after some insert and delete operations.

[Figure panels: (a) An Empty Queue; (b) Queue after Inserting a few Elements;
(c) Queue after Deleting a few Elements; (d) Queue when Rear = MAX–1;
(e) Rear is Reset to Zero; (f) Queue Full; (g) Front is Reset to Zero;
(h) Queue after Deletion of Only Element]

Fig. 10.5 Various States of a Circular Queue after the Insert and Delete Operations

The total number of elements in a non-empty circular queue at any point of time can
be calculated from the current values of the rear and front indices of the queue.
In case Front ≤ Rear, the total number of elements = Rear – Front + 1. For instance,
in Figure 10.6(a), Front = 3 and Rear = 7. Hence, the total number
of elements in CQueue at this point of time is 7 – 3 + 1 = 5. In case Front >
Rear, the total number of elements = MAX + (Rear – Front) + 1. For
instance, in Figure 10.6(b), Front = 3 and Rear = 0. Hence, with MAX = 8, the total
number of elements in CQueue is 8 + (0 – 3) + 1 = 6.

[Figure panels: (a) Front = 3, Rear = 7; (b) Front = 3, Rear = 0]

Fig. 10.6 Number of Elements in a Circular Queue

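The two counting cases can be folded into one small C function, sketched here under the assumption that an empty queue is marked by Front = –1 as in this unit. The function name cq_count() and its parameter names are hypothetical.

```c
/* Number of elements in a circular queue held in an array of size max.
 * front and rear are array indices; front == -1 is taken to mean "empty". */
int cq_count(int front, int rear, int max)
{
    if (front == -1)                     /* empty queue */
        return 0;
    if (front <= rear)                   /* elements lie in front..rear */
        return rear - front + 1;
    return max + (rear - front) + 1;     /* queue wraps past the end of the array */
}
```

With the values used in the text, cq_count(3, 7, 8) gives 5 and cq_count(3, 0, 8) gives 6.
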
Algorithm 10.5 Insert Operation on a Circular Queue
qinsert(q, val)

1. If ((q->Rear = MAX-1 AND q->Front = 0) OR (q->Rear + 1 = q->Front))
       //check if circular queue is full
       Print “Overflow: Queue is full!” and go to step 5
   End If
2. If q->Rear = MAX-1                     //check if rear is MAX-1
       Set q->Rear = 0
   Else
       Set q->Rear = q->Rear + 1          //increment rear by one
   End If
3. Set q->CQueue[q->Rear] = val           //val is the value to be inserted in the queue
4. If q->Front = -1                       //check if queue is empty
       Set q->Front = 0
   End If
5. End

Algorithm 10.6 Delete Operation on a Circular Queue


qdelete(q)

1. If q->Front = -1
       Print “Underflow: Queue is empty!”
       Return 0 and go to step 5
   End If
2. Set del_val = q->CQueue[q->Front]      //del_val is the value to be deleted
3. If q->Front = q->Rear                  //check if there is one element in the queue
       Set q->Front = q->Rear = -1
   Else
       If q->Front = MAX-1
           Set q->Front = 0
       Else
           Set q->Front = q->Front + 1
       End If
   End If
4. Return del_val
5. End
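
Algorithms 10.5 and 10.6 might be written in C along the following lines. The structure name cqueue, the function names cq_init(), cq_insert() and cq_delete(), the small value of MAX and the 1/0 return value of the insert function are assumptions made for this sketch.

```c
#include <stdio.h>

#define MAX 4   /* assumed capacity, kept small so wrap-around is easy to see */

struct cqueue {
    int CQueue[MAX];
    int Front;   /* -1 when the queue is empty */
    int Rear;    /* -1 when the queue is empty */
};

/* Hypothetical helper: mark the circular queue as empty. */
void cq_init(struct cqueue *q)
{
    q->Front = q->Rear = -1;
}

/* Algorithm 10.5. Returns 1 on success, 0 on overflow (assumed convention). */
int cq_insert(struct cqueue *q, int val)
{
    if ((q->Rear == MAX - 1 && q->Front == 0) || (q->Rear + 1 == q->Front)) {
        printf("Overflow: Queue is full!\n");
        return 0;
    }
    if (q->Rear == MAX - 1)
        q->Rear = 0;                 /* wrap around to the start of the array */
    else
        q->Rear = q->Rear + 1;
    q->CQueue[q->Rear] = val;
    if (q->Front == -1)              /* first element of an empty queue */
        q->Front = 0;
    return 1;
}

/* Algorithm 10.6. Returns the deleted value, or 0 on underflow. */
int cq_delete(struct cqueue *q)
{
    int del_val;

    if (q->Front == -1) {
        printf("Underflow: Queue is empty!\n");
        return 0;
    }
    del_val = q->CQueue[q->Front];
    if (q->Front == q->Rear)         /* the only element is being removed */
        q->Front = q->Rear = -1;
    else if (q->Front == MAX - 1)
        q->Front = 0;                /* wrap around to the start of the array */
    else
        q->Front = q->Front + 1;
    return del_val;
}
```

After four insertions and one deletion with MAX equal to 4, the next insertion is stored at index 0, which is exactly the reuse of vacant space that the linear array queue cannot offer.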

10.5 APPLICATIONS OF QUEUES

There are numerous applications of queues in computer science. Various real-life
applications such as railway ticket reservation and banking systems are
implemented using queues. One of the most useful applications of a queue is in
simulation. Another application of a queue is in the operating system, where queues
are used to implement functions such as CPU scheduling in a multiprogramming
environment and device management (printer or disk). Besides, several algorithms,
such as level-order traversal of a binary tree, use queues to solve problems efficiently.
This section discusses some of the applications of queues.
Simulation
Simulation is the process of modelling a real-life situation through a computer
program. Its main use is to study a real-life situation without actually making it
occur. It is mainly used in areas like military operations, scientific research, etc.,
where it is expensive or dangerous to experiment with the real system. In simulation,
corresponding to each object and action, there is a counterpart in the program.
The objects that are studied are represented as data and the actions are represented
as operations on the data. By supplying different data, we can observe the result
of the program. If the simulation is accurate, the result of the program represents
the behaviour of the actual system accurately.
Consider a ticket reservation system having four counters. If a customer
arrives at time ta and a counter is free, then the customer will get the ticket
immediately. However, a counter is not always free. In that case,
a new customer joins the queue having the fewest customers. Assume that the time
required to issue the ticket is t. Then the total time spent by the customer is equal
to the time t (time required to issue the ticket) plus the time spent waiting in line.
The average time spent in the line by a customer can be computed by a program
simulating the customers' actions. This program can be implemented using a queue,
since while one customer is being serviced, the others are kept waiting.
CPU Scheduling in a Multiprogramming Environment
As we know, in a multiprogramming environment, multiple processes run
concurrently to increase CPU utilization. All the processes that are residing in the
memory and are ready to execute are kept in a list referred to as a ready queue.
It is the job of the scheduling algorithm to select a process from the ready queue and
allocate the CPU to it.
Let us consider a multiprogramming environment where the processes are
classified into three different groups, namely system processes, interactive processes
and batch processes. Some priority is associated with each group of processes.
The system processes have the highest priority, whereas the batch processes have
the least priority. To implement a multiprogramming environment, a multi-level
queue scheduling algorithm is used. In this algorithm, the ready queue is partitioned
into multiple queues (refer Figure 10.7). The processes are assigned to the respective
queues. The higher priority processes are executed before the lower priority
processes. For example, no batch process can run until all the system processes
and interactive processes have been executed. If a batch process is running and a system
process enters the queue, the batch process is preempted so that the
system process can be executed.

Highest priority: System Processes
Medium priority: Interactive Processes
Lowest priority: Batch Processes

Fig. 10.7 Multi-Level Queue Scheduling

In this algorithm, the processes of a lower priority may starve if the number of
processes in a higher-priority queue is high. Starvation can be prevented in two
ways. One way is to time-slice between the queues, that is, each queue gets a
certain interval of time. Another way is to use a multi-level feedback queue
algorithm. In this algorithm, processes are not assigned permanently to a queue;
instead, they are allowed to move between the queues. If a process uses too
much CPU time, it is moved to a lower-priority queue. Similarly, a process that has been
waiting for too long in a lower-priority queue is moved to a higher-priority queue.
To implement a multiprogramming environment, a priority queue built from multiple
queues can be used.
Round Robin Algorithm
The Round Robin algorithm is one of the CPU scheduling algorithms designed for
time-sharing systems. In this algorithm, the CPU is allocated to a process for a
small time interval called the time quantum (generally from 10 to 100 milliseconds).
Whenever a new process enters, it is inserted at the end of the ready queue. The
CPU scheduler picks the first process from the ready queue and runs it until
the time quantum elapses. Then, the CPU switches to the next process in the
queue and the first process is inserted at the end of the queue if it has not
finished. If a process finishes before its time quantum elapses, it releases
the CPU voluntarily and is deleted from the ready queue. This
cycle continues until all the processes are finished. To implement the Round Robin
algorithm, a circular queue can be used.
Suppose there are n processes, P1, P2, …, Pn, to be served by the CPU. Different
processes require different execution times. Suppose the processes arrive in the
order of their subscripts, i.e., P1 comes first, then P2, and in general Pi
comes after Pi–1, where 1 < i ≤ n. The Round Robin algorithm first fixes a small unit of
time called the time quantum or time slice, represented by τ. A time quantum generally
ranges from 10 to 100 milliseconds. The CPU starts with P1, which gets the CPU
for τ units of time; afterwards the CPU switches to process P2, and so on.
If a process finishes its execution before the end of its time
quantum, it simply releases the CPU and the next waiting process
gets the CPU immediately. When the CPU reaches the end of the time quantum of Pn, it
returns to the first unfinished process and the same cycle is repeated. For an
illustration, consider Table 10.1 for the set of processes.
Table 10.1 Processes and Their Burst Times
Process    Burst Time (time units)
P1         7
P2         18
P3         5

The total CPU time required is 7 + 18 + 5 = 30 units of burst time, as summarized
in Table 10.1 and depicted in Figure 10.8.

Fig. 10.8 Round Robin Scheduling

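An advantage of the Round Robin algorithm is that short processes do not have to
wait for long ones to finish, which can reduce the average turn-around
time. The turn-around time of a process is the time of its completion minus the time of
its arrival. The ready queue itself is served in first come, first served (FCFS) order,
with the running process being preempted when its time quantum expires.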
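
The scheduling of the processes in Table 10.1 can be traced with a short C sketch that keeps the ready queue in a circular array. The text does not state the time quantum used in Figure 10.8, so a quantum of 5 units is assumed here purely for illustration; all processes are also assumed to arrive at time 0, and the names round_robin() and MAXPROC are hypothetical.

```c
#define MAXPROC 8   /* assumed upper limit on the number of processes */

/* Simulates Round Robin for n processes (n <= MAXPROC) that all arrive at time 0.
 * burst[i] is the burst time of process i; completion[i] receives its finishing
 * time. Returns the total time taken to finish every process. */
int round_robin(const int burst[], int n, int quantum, int completion[])
{
    int remaining[MAXPROC];
    int ready[MAXPROC];          /* circular ready queue of process indices */
    int front = 0, count = n, time = 0, i;

    for (i = 0; i < n; i++) {
        remaining[i] = burst[i];
        ready[i] = i;            /* processes enter in order of arrival */
    }

    while (count > 0) {
        int p = ready[front];    /* take the process at the front of the queue */
        int slice = remaining[p] < quantum ? remaining[p] : quantum;

        front = (front + 1) % MAXPROC;
        count--;

        time += slice;
        remaining[p] -= slice;

        if (remaining[p] > 0) {
            /* not finished: put it back at the rear of the queue */
            ready[(front + count) % MAXPROC] = p;
            count++;
        } else {
            completion[p] = time;
        }
    }
    return time;
}
```

Under these assumptions, the burst times 7, 18 and 5 give completion times of 17, 30 and 15 units for P1, P2 and P3, and the total comes to the 30 units mentioned above.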

Check Your Progress


3. What is ready queue?
4. What is simulation?

10.6 ANSWERS TO CHECK YOUR PROGRESS QUESTIONS

1. If there is no element in the queue, the queue is said to be in the condition of
underflow.
2. When a queue is implemented as an array, all the characteristics of an array
are applicable to the queue.
3. All the processes that are residing in the memory and are ready to execute
are kept in a list referred to as a ready queue.
4. Simulation is the process of modelling a real-life situation through a computer
program.

10.7 SUMMARY

• A queue is a linear data structure in which a new element is inserted at one
end and an element is deleted from the other end.
• The end of the queue from which the element is deleted is known as the
Front and the end at which a new element is added is known as the Rear.
• Before inserting a new element in the queue, it is necessary to check whether
there is space for the new element.
• If there is no element in the queue, the queue is said to be in the condition of
underflow.
• Like stacks, queues can be represented in the memory by using an array or
a singly linked list.
• When a queue is implemented as an array, all the characteristics of an array
are applicable to the queue.
• Since an array is a static data structure, the array representation of a queue
requires the maximum size of the queue to be predetermined and fixed.
• Whenever a new element has to be inserted in a queue, Rear is
incremented by one and the element is stored at Queue[Rear].
• Whenever an element is to be deleted from a queue, Front is
incremented by one.
• Whenever an element is deleted from a linked queue, a temporary pointer is
created, which is made to point to the node pointed to by Front.
• In a linked queue, the info and next fields of each node represent the element
of the queue and a pointer to the next element in the queue, respectively.
• In the case of a queue represented as an array, once the value of Rear
reaches MAX–1, the last index of the array, no more elements can be inserted.
• The array implementation of a circular queue is similar to the array
implementation of the queue. The only difference is that as soon as the Rear
index of the queue reaches MAX–1, Rear is reset
to the beginning of the array, provided that location is free.
• One of the most useful applications of a queue is in simulation.
• Simulation is the process of modelling a real-life situation through a computer
program.
• If the simulation is accurate, the result of the program represents the behaviour
of the actual system accurately.
• To implement a multiprogramming environment, a multi-level queue
scheduling algorithm can be used.
• If a batch process is running and a system process enters the queue, the
batch process is preempted so that the system process can be executed.
• An advantage of the Round Robin algorithm is that it can reduce the average
turn-around time.

10.8 KEY WORDS

• Simulation: It is the process of modelling a real-life situation through a
computer program.
• FIFO: It is the methodology used in a queue for organising and manipulating
data, where the first entry is processed first.

10.9 SELF ASSESSMENT QUESTIONS AND EXERCISES

Short-Answer Questions
1. What are queues?
2. Write a short note about representation of queues.
3. What is a circular queue?
4. List a few basic operations performed on queues.
Long-Answer Questions
1. Write a program to implement a queue as an array.
2. Write a program to implement a circular queue.
3. Write a program to illustrate the insertion and deletion operations on a queue.

10.10 FURTHER READINGS

Gottfried, Byron S. 1996. Programming With C, Schaum’s Outline Series. New
York: McGraw-Hill.
Jeyapoovan T. 2006. Computer Programming - Theory & Practice. New Delhi:
Vikas Publishing.
Khurana, Rohit. 2005. Object Oriented Programming. New Delhi: Vikas
Publishing.
Kanetkar, Yashwant. 2003. Let Us C. New Delhi: BPB Publication.
Saxena, Sanjay. 2003. A First Course In Computers. New Delhi: Vikas
Publishing.
Subburaj, R. 2000. Programming In C. New Delhi: Vikas Publishing.


UNIT 11 LIST
Structure
11.0 Introduction
11.1 Objectives
11.2 Merging List and Linked List
11.3 Singly-Linked Lists
11.3.1 Traversing
11.3.2 Insertion
11.3.3 Deletion
11.4 Doubly-Linked Lists
11.4.1 Insertion
11.4.2 Deletion
11.5 Header List
11.6 Representation of Linked List
11.7 Answers to Check Your Progress Questions
11.8 Summary
11.9 Key Words
11.10 Self Assessment Questions and Exercises
11.11 Further Readings

11.0 INTRODUCTION

A list or sequence is an abstract data type that represents a countable number of
ordered values, where the same value may occur more than once. A linked list is
a sequence of data items, called links or nodes, which are held together by pointers.
Each link contains an item and a connection to another
link. After the array, the linked list is one of the most commonly used data structures.

11.1 OBJECTIVES
After going through this unit, you will be able to:
• Discuss linked lists
• Explain singly-linked lists
• Analyze doubly-linked lists
• Understand merging lists and header linked lists
• Discuss lists
• Analyze insertion and deletion operations in linked lists
• Understand insertion and deletion in circular and doubly-linked lists
• Discuss traversing linked lists
• Explain the representation of linked lists
• Analyze memory allocation

11.2 MERGING LIST AND LINKED LIST

Merge algorithms are a family of algorithms that take multiple sorted lists as
input and produce a single list as output. This output contains
all the elements of the input lists in sorted order. These algorithms are
used as subroutines in various sorting algorithms, most
famously merge sort.
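
As an illustration of the idea, the following C sketch merges two sorted integer arrays into one sorted array. Arrays are used here only to keep the example short; the function name merge_sorted() and its parameters are hypothetical.

```c
#include <stddef.h>

/* Merges the sorted arrays a[0..na-1] and b[0..nb-1] into out,
 * which must have room for na + nb integers. */
void merge_sorted(const int a[], size_t na, const int b[], size_t nb, int out[])
{
    size_t i = 0, j = 0, k = 0;

    /* Repeatedly copy the smaller of the two current elements. */
    while (i < na && j < nb) {
        if (a[i] <= b[j])
            out[k++] = a[i++];
        else
            out[k++] = b[j++];
    }
    /* One input is exhausted; copy whatever is left of the other. */
    while (i < na)
        out[k++] = a[i++];
    while (j < nb)
        out[k++] = b[j++];
}
```

Each element of either input is copied exactly once, so the work done is proportional to the total number of elements.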
Linked List
A linked list is a linear collection of homogeneous elements called nodes.
Successive nodes of a linked list need not occupy adjacent memory locations.
The linear order between nodes is maintained by means of pointers. In linked
lists, insertion or deletion of a node does not require shifting of existing nodes as in
the case of arrays; nodes can be inserted or deleted merely by adjusting the pointers
or links.
Depending on the number of pointers in a node or the purpose for which
the pointers are maintained, a linked list can be classified into various types such
as singly-linked list, circular-linked list and doubly-linked list. The unit will discuss
these types in detail in the subsequent sections.

11.3 SINGLY-LINKED LISTS

A singly-linked list is also known as a linear linked list. In it, each node consists of
two fields, viz. ‘info’ and ‘next’, as shown in Figure 11.1. The ‘info’ field contains
the data and the ‘next’ field contains the address of memory location where the
subsequent node is stored. The last node of the singly-linked list contains NULL in
its ‘next’ field which indicates the end of the list.

Fig. 11.1 Node of a Linked List

Note: The data stored in the ‘info’ field may be a single data item of any data type
or a complete record representing a student, or an employee, or any other entity.
In this unit, however, it is assumed that the ‘info’ field contains an integer.

A linked list contains a list pointer variable ‘Start’ that stores the address of
the first node of the list. If ‘Start’ contains NULL, the list is called
an empty list or a null list. Since each node of the list contains only a single
pointer, which points to the next node and not to the previous node, the list can be
traversed in only one direction. Hence, it is also referred to as a one-way list.
Figure 11.2 shows a singly-linked list with four nodes.

Fig. 11.2 A Singly-Linked List with Four Nodes

Operations
A number of operations can be performed on singly-linked lists. These operations
include traversing, searching, inserting and deleting nodes, reversing, sorting, and
merging linked lists. Before implementing these operations, it is important to
understand how the node of a linked list is created.
Creating a node means defining its structure, allocating memory to it, and
initializing it. As discussed earlier, the node of a linked list consists of data and a
pointer to the next node. To define a node containing an integer and a pointer
to the next node in the C language, a self-referential structure can be used, whose
definition is as follows:
typedef struct node
{
    int info;            /*to store integer type data*/
    struct node *next;   /*to store a pointer to next node*/
}Node;
Node *nptr;              /*nptr is a pointer to node*/
After declaring a pointer nptr to new node, the memory needs to be allocated
dynamically to it. If the memory is allocated successfully (means no overflow), the
node is initialized. The info field is initialized with a valid value and the next
field is initialized with NULL.
Algorithm 11.1 Creation of a Node
create_node()

1. Allocate memory for nptr           //nptr is a pointer to new node
2. If nptr = NULL
       Print “Overflow: Memory not allocated!” and go to step 7
   End If
3. Read item                          //item is the value to be inserted in the new node
4. Set nptr->info = item
5. Set nptr->next = NULL
6. Return nptr                        //returning pointer nptr
7. End
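
A C version of Algorithm 11.1 might look like the sketch below. As a simplifying assumption, the value for the new node is passed in as a parameter instead of being read from the user in step 3, and NULL is returned when memory cannot be allocated.

```c
#include <stdio.h>
#include <stdlib.h>

typedef struct node {
    int info;             /* integer data stored in the node */
    struct node *next;    /* pointer to the next node */
} Node;

/* Allocates and initialises a new node holding item.
 * Returns a pointer to the node, or NULL if memory is not available. */
Node *create_node(int item)
{
    Node *nptr = malloc(sizeof *nptr);

    if (nptr == NULL) {
        printf("Overflow: Memory not allocated!\n");
        return NULL;
    }
    nptr->info = item;
    nptr->next = NULL;    /* the node is not linked to anything yet */
    return nptr;
}
```

The caller is responsible for releasing the node with free() once it is no longer part of any list.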
Now, the linked list can be formed by creating several nodes of type Node and
inserting them either in the beginning or at the end or at a specified position in the
list.
11.3.1 Traversing
Traversing a list means accessing the elements of a linked list, one by one, to
process all or some of the elements. For example, traversing is required to display
the values of the nodes, to count the number of nodes, or to search for a particular
item in the list. A list can be traversed by using a temporary pointer variable
temp, which points to the node that is currently being processed. Initially,
temp points to the first node. After that element is processed, temp is moved to the
next node using the statement temp=temp->next, that element is processed,
and so on until the last node has been processed, that is, until temp becomes
NULL.
Algorithm 11.2 Traversing a List
display(Start)

1. If Start = NULL //Start points to the first node of list


Print “List is empty!!” and go to step 4
End If
2. Set temp = Start //initialising temp with Start
3. While temp != NULL
Print temp->info //displaying value of each node
Set temp = temp->next //moving temp to point to next node
End While
4. End

Another example of traversing a linked list is counting the number of nodes in the
linked list, as described in Algorithm 11.3.
Algorithm 11.3 Counting the Number of Nodes
count_node(Start)

1. Set count = 0
2. Set temp = Start //initialising temp with Start
3. While temp != NULL //traversing the list
Set count = count + 1 //incrementing count
Set temp = temp->next
End While
4. Return count //returning total number of nodes in the
list
5. End
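Searching for an item, mentioned above as another use of traversal, follows the same pattern as Algorithms 11.2 and 11.3. The following is a minimal sketch; the function name search_node and the convention of returning 0 when the item is absent are assumptions, not part of the original algorithms.

```c
#include <stddef.h>

typedef struct node
{
    int info;
    struct node *next;
} Node;

/* Returns the 1-based position of the first node whose info field equals
   item, or 0 if no such node exists (hypothetical convention). */
int search_node(const Node *start, int item)
{
    const Node *temp = start;   /* temp points to the node being examined */
    int pos = 1;
    while (temp != NULL)
    {
        if (temp->info == item)
            return pos;
        temp = temp->next;      /* move to the next node */
        pos++;
    }
    return 0;                   /* reached NULL without finding item */
}
```

Because the list is examined node by node from Start, the time taken grows with the position of the item in the list.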

11.3.2 Insertion
To insert a node in a linked list, a new node is created and then placed at the
desired position by adjusting the pointers. Nodes can be inserted either in the
beginning or at the end or at any specified position in the list. All three situations
have been discussed in this section.
Insertion in the beginning
To insert a node in the beginning of a list, the next field of the new node (pointed
to by nptr) is made to point to the existing first node and the Start pointer is
modified to point to the new node as shown in Figure 11.3.

Fig. 11.3 Insertion in the Beginning of a Linked List

Algorithm 11.4 Insertion in Beginning


insert_beg(Start)

1. Call create_node() //creating a new node pointed to by nptr


2. Set nptr->next = Start
3. Set Start = nptr //Start pointing to new node
4. End

Insertion at end
To insert a node at the end of a linked list, the list is traversed up to the last node
and the next field of this node is modified to point to the new node. However, if
the linked list is initially empty then the new node becomes the first node and
Start points to it. Figure 11.4(a) shows a linked list with a pointer variable
temp pointing to its first node and Figure 11.4(b) shows temp pointing to the
last node and the next field of last node pointing to the new node.

(a)


(b)

Fig. 11.4 Insertion at the End of a Linked List

Algorithm 11.5 Insertion at the End


insert_end(Start)

1. Call create_node() //creating a new node pointed to by nptr
2. If Start = NULL //checking for empty list
Set Start = nptr //inserting new node as the first node
Else
Set temp = Start
While temp->next != NULL //traversing up to the last node
Set temp = temp->next
End While
Set temp->next = nptr //appending new node at the end
End If
3. End

Insertion at a specified position


To insert a node at a position pos as specified by the user, the list is traversed up
to pos-1 position. Then the next field of the new node is made to point to the
node that is already at the pos position and the next field of the node at pos-
1 position is made to point to the new node. Figure 11.5 shows the insertion of the
new node pointed to by nptr at the third position.

Fig. 11.5 Insertion at a Specified Position in a Linked List

Algorithm 11.6 Insertion at a Specified Position
insert_pos(Start)

1. Call create_node() //creating a new node pointed to by nptr
2. Set temp = Start
3. Read pos //position at which the new node is to be inserted
4. Set count = count_node(Start) //counting total number of nodes
5. If (pos > count + 1 OR pos < 1)
Deallocate nptr //releasing the node that will not be inserted
Print “Invalid position!” and go to step 7
End If
6. If pos = 1
Set nptr->next = Start
Set Start = nptr //inserting new node as the first node
Else
Set i = 1
While i < pos - 1 //traversing up to the node at pos-1 position
Set temp = temp->next
Set i = i + 1
End While
Set nptr->next = temp->next //inserting new node at pos position
Set temp->next = nptr
End If
7. End
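Insertion at a specified position can also be sketched without counting the nodes first, by walking a pointer to the link that has to be changed. This is an alternative formulation, not the method of Algorithm 11.6; the name insert_at, its parameters, and its 1/0 return value are assumptions made for this illustration.

```c
#include <stdlib.h>

typedef struct node
{
    int info;
    struct node *next;
} Node;

/* Inserts item at 1-based position pos. Returns 1 on success and 0 if
   pos is invalid or memory cannot be allocated (hypothetical convention). */
int insert_at(Node **start, int pos, int item)
{
    Node **link = start;   /* address of the pointer that will be rewired */
    Node *nptr;
    int i;

    if (pos < 1)
        return 0;
    for (i = 1; i < pos; i++)
    {
        if (*link == NULL)
            return 0;              /* pos is greater than count + 1 */
        link = &(*link)->next;     /* advance to the next link */
    }
    nptr = malloc(sizeof *nptr);
    if (nptr == NULL)
        return 0;                  /* overflow */
    nptr->info = item;
    nptr->next = *link;            /* new node points to the old node at pos */
    *link = nptr;                  /* predecessor (or Start) points to new node */
    return 1;
}
```

Since the node is allocated only after the position has been validated, nothing has to be de-allocated when the position is invalid.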

11.3.3 Deletion
Like insertion, nodes can be deleted from the linked list at any point of time and
from any position. Whenever a node is deleted, the memory occupied by the
node is de-allocated. While performing a deletion, the immediate predecessor of
the node to be deleted must be kept track of. Thus, two temporary pointer
variables are used while traversing the list (except when deleting from the
beginning).
Note: A situation where the user tries to delete a node from an empty linked list is termed as underflow.

Deletion from beginning


To delete a node from the beginning of a linked list, the address of the first node is
stored in a temporary pointer variable temp and Start is modified to point to
the second node in the linked list. After this, the memory occupied by the node
pointed to by temp is de-allocated. Figure 11.6 shows the deletion of a node
from the beginning of a linked list.

Fig. 11.6 Deletion from the Beginning of a Linked List

Algorithm 11.7 Deletion from the Beginning
delete_beg(Start)

1. If Start = NULL //checking for underflow


Print “Underflow: List is empty!” and go to step 5
End If
2. Set temp = Start //temp pointing to the first node
3. Set Start = Start->next //moving Start to point to the second node
4. Deallocate temp //deallocating memory
5. End

Deletion from end


To delete a node from the end of a linked list, the list is traversed up to the last
node. Two pointer variables save and temp are used to traverse the list where
save points to the node as previously pointed to by temp. At the end of
traversing, temp points to the last node and save points to the second last
node. Then the next field of the node pointed to by save is made to point to
NULL and the memory occupied by the node pointed to by temp is de-allocated.
Figure 11.7 shows the deletion of a node from the end of a linked list.

Fig. 11.7 Deletion from the End of a Linked List

Algorithm 11.8 Deletion from the End


delete_end(Start)

1. If Start = NULL //checking for underflow


Print “Underflow: List is empty!” and go to step 5
End If
2. Set temp = Start //temp pointing to the first node
3. If temp->next = NULL //deleting the only node of the list
Set Start = NULL
Else
While (temp->next) != NULL //traversing up to the last node
Set save = temp //save pointing to node previously pointed to by temp
Set temp = temp->next //moving temp to point to next node
End While
Set save->next = NULL //making new last node to point to NULL
End If
4. Deallocate temp //deallocating memory
5. End

Deletion from a specified position


To delete a node from a position pos as specified by the user, the list is traversed
up to pos position using pointer variables temp and save. At the end of
traversing, temp points to the node at pos position and save points to the
node at pos-1 position. Then the next field of the node pointed to by save
is made to point to the node at pos+1 position and the memory occupied by the
node as pointed to by temp is de-allocated. Figure 11.8 shows the deletion of a
node at the third position.

Fig. 11.8 Deletion from a Specified Position in a Linked List

Algorithm 11.9 Deletion from a Specified Position


delete_pos(Start)

1. If Start = NULL //checking for underflow


Print “Underflow: List is empty!” and go to step 8
End If
2. Set temp = Start
3. Read pos //position of the node to be deleted
4. Set count = count_node(Start) //counting total number of nodes
5. If pos > count OR pos < 1
Print “Invalid position!” and go to step 8
End If
6. If pos = 1
Set Start = temp->next //deleting the first node
Else
Set i = 1
While i < pos //traversing up to the node at position pos
Set save = temp
Set temp = temp->next
Set i = i + 1
End While
Set save->next = temp->next //deleting the node at position
pos
End If
7. Deallocate temp //deallocating memory
8. End
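The use of the two pointers temp and save in Algorithm 11.9 can be sketched in C as follows. The function name delete_at, the position passed as a parameter, and the 1/0 return value are assumptions made for this illustration. The position is validated during the traversal itself instead of by a separate call to count_node.

```c
#include <stdlib.h>

typedef struct node
{
    int info;
    struct node *next;
} Node;

/* Deletes the node at 1-based position pos. Returns 1 on success and 0 on
   underflow or invalid position (hypothetical convention). */
int delete_at(Node **start, int pos)
{
    Node *temp = *start;   /* node currently being visited */
    Node *save = NULL;     /* immediate predecessor of temp */
    int i;

    if (temp == NULL || pos < 1)
        return 0;                  /* underflow or invalid position */
    for (i = 1; i < pos; i++)
    {
        save = temp;
        temp = temp->next;
        if (temp == NULL)
            return 0;              /* pos is greater than the node count */
    }
    if (save == NULL)
        *start = temp->next;       /* deleting the first node */
    else
        save->next = temp->next;   /* bypassing the node at pos */
    free(temp);                    /* de-allocating memory */
    return 1;
}
```

The same function covers deletion from the beginning (pos equal to 1) and deletion from the end (pos equal to the number of nodes).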

Program 11.1: A program to illustrate the implementation of a singly-linked list is
as follows:
#include<stdio.h>
#include<stdlib.h> /* malloc(), free(), exit() */
#include<conio.h> /* clrscr(), getch(): non-standard, available in Turbo C */
#define True 1
#define False 0
typedef struct node
{
    int info;
    struct node *next;
}Node;
/* Function prototypes */
Node * create_node();
int isempty(Node *);
void display(Node *);
int count_node(Node *);
void insert_beg(Node **);
void insert_end(Node **);
void insert_pos(Node **);
void delete_beg(Node **);
void delete_end(Node **);
void delete_pos(Node **);
int main(void)
{
    int ch,ch1;
    Node *Start=NULL;
    do
    {
        clrscr(); /* Turbo C only: clears the screen */
        printf("\n\n\tMain Menu");
        printf("\n1. Insert");
        printf("\n2. Delete");
        printf("\n3. Display");
        printf("\n4. Exit\n");
        printf("\nEnter your choice: ");
        scanf("%d",&ch);
        switch(ch)
        {
            case 1: printf("\n1. Insert in the beginning");
                    printf("\n2. Insert at the end");
                    printf("\n3. Insert at a specified position");
                    printf("\n4. Back to main menu\n");
                    printf("\nEnter your choice: ");
                    scanf("%d",&ch1);
                    switch(ch1)
                    {
                        case 1: insert_beg(&Start);
                                break;
                        case 2: insert_end(&Start);
                                break;
                        case 3: insert_pos(&Start);
                                break;
                        case 4: break;
                        default: printf("\nInvalid choice!");
                    }
                    break;
            case 2: printf("\n1. Delete from the beginning");
                    printf("\n2. Delete from the end");
                    printf("\n3. Delete from a specified position");
                    printf("\n4. Back to main menu\n");
                    printf("\nEnter your choice: ");
                    scanf("%d",&ch1);
                    switch(ch1)
                    {
                        case 1: delete_beg(&Start);
                                break;
                        case 2: delete_end(&Start);
                                break;
                        case 3: delete_pos(&Start);
                                break;
                        case 4: break;
                        default: printf("\nInvalid choice!");
                    }
                    break;
            case 3: display(Start);
                    break;
            case 4: exit(0); /* exit() requires a status argument */
            default: printf("\nInvalid choice!");
        }
        getch(); /* Turbo C only: waits for a key press */
    }while(1);
}
Node * create_node()
{
    Node *nptr;
    int item;
    nptr=(Node *)malloc(sizeof(Node));
    if(nptr==NULL)
    {
        printf("\nOverflow: Memory not allocated!");
        exit(1); /* non-zero status reports the failure */
    }
    printf("\nEnter the value to be inserted: ");
    scanf("%d",&item);
    nptr->info=item;
    nptr->next=NULL;
    return(nptr);
}
int isempty(Node *Start)
{
if(Start==NULL)
return True;
else
return False;
}
void display(Node *Start)
{
    Node *temp=Start;
    if(isempty(temp))
    {
        printf("\nList is empty!!");
        return;
    }
    printf("\nThe linked list is: ");
    while(temp != NULL)
    {
        printf("%d ",temp->info);
        temp=temp->next;
    }
}
int count_node(Node *Start)
{
Node *temp=Start;
int count=0;
while(temp != NULL)
{
count++;
temp=temp->next;
}
return(count);
}
void insert_beg(Node **Start)
{
    Node *nptr=create_node();
    nptr->next=*Start;
    *Start=nptr;
    printf("\nNode inserted.");
}
void insert_end(Node **Start)
{
    Node *temp=*Start;
    Node *nptr=create_node();
    if(isempty(temp))
        *Start=nptr;
    else
    {
        while(temp->next != NULL)
            temp=temp->next;
        temp->next=nptr;
    }
    printf("\nNode inserted.");
}
void insert_pos(Node **Start)
{
    int i,pos,count;
    Node *nptr=create_node();
    Node *temp=*Start;
    printf("\nEnter the position at which you want to insert: ");
    scanf("%d",&pos);
    count=count_node(temp);
    if(pos>count+1 || pos<1) /* pos<1 also rejects negative positions */
    {
        free(nptr); /* release the node that will not be inserted */
        printf("\nInvalid position!");
        return;
    }
    if(pos==1)
    {
        nptr->next=*Start;
        *Start=nptr;
    }
    else
    {
        for(i=1;i<pos-1;i++)
            temp=temp->next;
        nptr->next=temp->next;
        temp->next=nptr;
    }
    printf("\nNode inserted.");
}
void delete_beg(Node **Start)
{
    Node *temp=*Start;
    if(isempty(temp))
    {
        printf("\nUnderflow: List is empty!");
        return;
    }
    *Start=temp->next;
    free(temp);
    printf("\nNode deleted.");
}
void delete_end(Node **Start)
{
    Node *temp=*Start;
    Node *save;
    if(isempty(temp))
    {
        printf("\nUnderflow: List is empty!");
        return;
    }
    if(temp->next==NULL)
        *Start=NULL;
    else
    {
        while(temp->next != NULL)
        {
            save=temp;
            temp=temp->next;
        }
        save->next=NULL;
    }
    free(temp);
    printf("\nNode deleted.");
}
void delete_pos(Node **Start)
{
    Node *temp=*Start,*save;
    int i,pos,count;
    if(isempty(temp))
    {
        printf("\nUnderflow: List is empty!");
        return;
    }
    printf("\nEnter the position of the node to be deleted: ");
    scanf("%d",&pos);
    count=count_node(temp);
    if(pos>count || pos<1) /* pos<1 also rejects negative positions */
    {
        printf("\nInvalid position!");
        return;
    }
    if(pos==1)
        *Start=temp->next;
    else
    {
        for(i=1;i<pos;i++)
        {
            save=temp;
            temp=temp->next;
        }
        save->next=temp->next;
    }
    free(temp);
    printf("\nNode deleted.");
}
The output of the program is as follows:
Main Menu
1. Insert
2. Delete
3. Display
4. Exit
Enter your choice: 1
1. Insert in the beginning
2. Insert at the end
3. Insert at a specified position
4. Back to main menu
Enter your choice: 1
Enter the value to be inserted: 1
Node inserted.
Main Menu
1. Insert
2. Delete
3. Display
4. Exit
Enter your choice: 1
1. Insert in the beginning
2. Insert at the end
3. Insert at a specified position
4. Back to main menu
Enter your choice: 2
Enter the value to be inserted: 3
Node inserted.
Main Menu
1. Insert
2. Delete
3. Display
4. Exit
Enter your choice: 1
1. Insert in the beginning
2. Insert at the end
3. Insert at a specified position
4. Back to main menu
Enter your choice: 3
Enter the value to be inserted: 2
Enter the position at which you want to insert: 2
Node inserted.
Main Menu
1. Insert
2. Delete
3. Display
4. Exit
Enter your choice: 3
The linked list is: 1 2 3
Main Menu
1. Insert
2. Delete
3. Display
4. Exit
Enter your choice: 2
1. Delete from the beginning
2. Delete from the end
3. Delete from a specified position
4. Back to main menu
Enter your choice: 1
Node deleted.
Main Menu
1. Insert
2. Delete
3. Display
4. Exit
Enter your choice: 3
The linked list is: 2 3
Main Menu
1. Insert
2. Delete
3. Display
4. Exit
Enter your choice: 2
1. Delete from the beginning
2. Delete from the end
3. Delete from a specified position
4. Back to main menu
Enter your choice: 3
Enter the position of the node to be deleted: 2
Node deleted.
Main Menu
1. Insert
2. Delete
3. Display
4. Exit
Enter your choice: 2
1. Delete from the beginning
2. Delete from the end
3. Delete from a specified position
4. Back to main menu
Enter your choice: 2
Node deleted.
Main Menu
1. Insert
2. Delete
3. Display
4. Exit
Enter your choice: 3
List is empty!!
Main Menu
1. Insert
2. Delete
3. Display
4. Exit
Enter your choice: 4
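
Program 11.1 does not de-allocate nodes that are still in the list when the user chooses Exit. A function along the following lines could be called before exit(0) to release them. The name free_list and its return value are assumptions made for this sketch; the function is not part of the original program.

```c
#include <stdlib.h>

typedef struct node
{
    int info;
    struct node *next;
} Node;

/* Frees every node of the list and sets *start to NULL.
   Returns the number of nodes that were freed (hypothetical convention). */
int free_list(Node **start)
{
    int freed = 0;
    while (*start != NULL)
    {
        Node *temp = *start;    /* temp points to the current first node */
        *start = temp->next;    /* Start moves to the second node */
        free(temp);             /* de-allocating memory */
        freed++;
    }
    return freed;
}
```

This is deletion from the beginning (Algorithm 11.7) repeated until the list is empty.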

Check Your Progress


1. What is a dynamic data structure?
2. What is a singly-linked list also known as?

11.4 DOUBLY-LINKED LISTS

In a singly-linked list, each node contains a pointer to the next node and it has no
information about its previous node. Thus, one can traverse only in one direction,
i.e., from beginning to end. However, sometimes it is required to traverse in the
backward direction, i.e., from end to beginning. This can be implemented by
maintaining an additional pointer in each node of the list that points to the previous
node. Such a linked list is called a doubly-linked list.
Each node of a doubly-linked list consists of three fields—prev, info,
and next (see Figure 11.9). The info field contains the data, the prev field
contains the address of the previous node, and the next field contains the address
of the next node.

Fig. 11.9 Node of a Doubly-Linked List


Since a doubly-linked list allows traversing in both forward and backward
directions, it is also referred to as a two-way list. Figure 11.10 shows an example
of a doubly-linked list having four nodes. It must be noted that the prev field of
the first node and the next field of the last node in a doubly-linked list are both
NULL.

Fig. 11.10 A Doubly-Linked List with Four Nodes

To define the node of a doubly-linked list in C language, the structure used to
represent the node of a singly-linked list is extended to have an extra pointer
which points to the previous node. The structure of a node of a doubly-linked list
is as follows:
typedef struct node
{
int info; /*to store integer type data*/
struct node *next; /*to store a pointer to next node*/
struct node *prev; /*to store a pointer to previous node*/
}Node;
Node *nptr; /*nptr is a pointer to node*/
When memory is allocated successfully to a node, i.e., when there is no condition
of overflow, the node is initialized. The info field is initialized with a valid value
and the prev and next fields are initialized with NULL.
Algorithm 11.10 Creating a Node of a Doubly Linked List
create_node()

1. Allocate memory for nptr //nptr is a pointer to new node


2. If nptr = NULL
Print “Overflow: Memory not allocated!” and go to step 8
End If
3. Read item //item is the value stored in the node
4. Set nptr->info = item
5. Set nptr->next = NULL
6. Set nptr->prev = NULL
7. Return nptr
8. End

Note that all the operations that are performed on singly-linked lists can also be
performed on doubly-linked lists. In the subsequent sections, only the insertion
and deletion operations on doubly-linked lists are discussed.
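
Backward traversal, which is the reason for keeping the prev field, can be sketched as follows. The function name display_reverse and its return value are assumptions made for this illustration. Since only Start is available, the sketch first moves to the last node and then follows the prev pointers.

```c
#include <stdio.h>

typedef struct node
{
    int info;
    struct node *next;
    struct node *prev;
} Node;

/* Prints the list from the last node to the first.
   Returns the number of nodes visited (hypothetical convention). */
int display_reverse(const Node *start)
{
    const Node *temp = start;
    int visited = 0;

    if (temp == NULL)
        return 0;                   /* empty list */
    while (temp->next != NULL)      /* moving forward to the last node */
        temp = temp->next;
    while (temp != NULL)            /* moving backward using prev */
    {
        printf("%d ", temp->info);
        visited++;
        temp = temp->prev;
    }
    printf("\n");
    return visited;
}
```

If a pointer to the last node were maintained along with Start, the first loop would not be needed.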

11.4.1 Insertion

Insertion in the beginning

To insert a new node in the beginning of a doubly-linked list, a new node, pointed
to by, for example, nptr, is created. The next field of the new node is made to point
to the existing first node and the prev field of the existing first node (which now
becomes the second node) is made to point to the new node. After that, Start is
modified to point to the new node. Figure 11.11 shows the insertion of a node in the
beginning of a doubly-linked list.

Fig. 11.11 Insertion in the Beginning

Algorithm 11.11 Insertion in the Beginning


insert_beg(Start)

1. Call create_node() //creating a new node pointed to by nptr
2. If Start != NULL
Set nptr->next = Start //inserting node in the beginning
Set Start->prev = nptr
End If
3. Set Start = nptr //making Start to point to new node
4. End

Insertion at the end


To insert a new node at the end of a doubly-linked list, the list is traversed up to
the last node using a pointer variable, for example, temp. At the end of
traversing, temp points to the last node. Then, the next field of the last node
(pointed to by temp) is made to point to the new node and the prev field of the
new node is made to point to the node pointed to by temp. However, if the list is
empty, the new node is inserted as the first node in the list. Figure 11.12 shows the
insertion of a new node at the end of a doubly-linked list.


Fig. 11.12 Insertion at the End

Algorithm 11.12 Insertion at the End


insert_end(Start)

1. Call create_node() //creating a new node pointed to by nptr
2. If Start = NULL
Set Start = nptr //inserting new node as the first
node
Else
Set temp = Start //pointer temp used for traversing
While temp->next != NULL
Set temp = temp->next
End While
Set temp->next = nptr
Set nptr->prev = temp
End If
3. End

Insertion at a specified position


To insert a new node (pointed to by nptr) at a specified position, for example,
pos in a doubly-linked list, the list is traversed up to pos-1 position. At the end
of traversing, temp points to the node at pos-1 position. For simplicity, another
pointer variable, ptr is used to point to the node that is already at position pos.
Then, field prev of the node, pointed to by ptr is made to point to the new node
and field next of the new node is made to point to the node pointed to by ptr.
Also, field prev of the new node is made to point to the node pointed to by
temp and field next of the node pointed to by temp is made to point to the new
node. Figure 11.13 shows the insertion of a new node at the third position in a
doubly-linked list.


Fig. 11.13 Insertion at a Specified Position

Algorithm 11.13 Insertion at a Specified Position


insert_pos(Start)

1. Call create_node() //creating a new node pointed to by nptr
2. Set temp = Start
3. Read pos
4. Set count = count_node(Start) //counting number of nodes in count
5. If pos < 1 OR pos > count + 1
Deallocate nptr //releasing the node that will not be inserted
Print “Invalid position!” and go to step 7
End If
6. If pos = 1
Set nptr->next = Start //inserting node at the beginning
If Start != NULL
Set Start->prev = nptr //old first node points back to new node
End If
Set Start = nptr //Start pointing to new node
Else
Set i = 1
While i < pos-1 //traversing up to the node at pos-1 position
Set temp = temp->next
Set i = i + 1
End While
Set ptr = temp->next
If ptr != NULL //ptr is NULL when inserting after the last node
Set ptr->prev = nptr
End If
Set nptr->next = ptr
Set nptr->prev = temp
Set temp->next = nptr
End If
7. End
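
The pointer adjustments for insertion at a specified position in a doubly-linked list can be sketched in C as follows. The name dinsert_at, its parameters, and its 1/0 return value are assumptions made for this illustration. The sketch takes care of the two boundary cases: a non-empty list when pos is 1, and the absence of a following node when the new node becomes the last one.

```c
#include <stdlib.h>

typedef struct node
{
    int info;
    struct node *next;
    struct node *prev;
} Node;

/* Inserts item at 1-based position pos. Returns 1 on success and 0 if
   pos is invalid or memory cannot be allocated (hypothetical convention). */
int dinsert_at(Node **start, int pos, int item)
{
    Node *temp = *start;   /* will point to the node at pos-1 */
    Node *nptr;
    int i;

    if (pos < 1)
        return 0;
    if (pos > 1)
    {
        if (temp == NULL)
            return 0;                  /* empty list accepts only pos 1 */
        for (i = 1; i < pos - 1; i++)
        {
            temp = temp->next;
            if (temp == NULL)
                return 0;              /* pos is greater than count + 1 */
        }
    }
    nptr = malloc(sizeof *nptr);
    if (nptr == NULL)
        return 0;                      /* overflow */
    nptr->info = item;
    if (pos == 1)
    {
        nptr->prev = NULL;
        nptr->next = *start;
        if (*start != NULL)
            (*start)->prev = nptr;     /* old first node points back */
        *start = nptr;
    }
    else
    {
        nptr->prev = temp;
        nptr->next = temp->next;
        if (temp->next != NULL)
            temp->next->prev = nptr;   /* node formerly at pos points back */
        temp->next = nptr;
    }
    return 1;
}
```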

11.4.2 Deletion
To delete a node from the beginning of a doubly-linked list, a pointer variable, for
example, temp is used to point to the first node. Then Start is modified to
point to the next node and, if such a node exists, its prev field is set to NULL.
After that, the memory occupied by the node pointed to by temp is de-allocated.
Figure 11.14 shows the deletion of a node from the beginning of a doubly-linked
list.


Fig. 11.14 Deletion from the Beginning

Algorithm 11.14 Deletion from the Beginning


delete_beg(Start)

1. If Start = NULL
Print “Underflow: List is empty!” and go to step 6
End If
2. Set temp = Start //temp points to the node to be deleted
3. Set Start = Start->next //making Start to point to next node
4. If Start != NULL //the list had more than one node
Set Start->prev = NULL
End If
5. Deallocate temp //de-allocating memory
6. End

Note: The process of deleting a node from the end of a doubly-linked list is the same
as that for a singly-linked list.

Deletion from a specified position


To delete a node from a position, for example, pos, as specified by the user, the
list is traversed up to the position pos, using pointer variables temp and
save. At the end of traversing, temp points to the node at pos position and
save points to the node at pos-1 position. For simplicity, another pointer
variable ptr is used to point to the node at pos+1 position. Then the next
field of the node at pos-1 position (pointed to by save) is made to point to the
node at pos+1 position (pointed to by ptr). In addition, the field prev of the
node at pos+1 position (pointed to by ptr) is made to point to the node at
pos-1 position (pointed to by save). After that, the memory occupied by the
node pointed to by temp is de-allocated. Figure 11.15 shows the deletion of a
node at the third position from a doubly-linked list.

Fig. 11.15 Deletion from a Specified Position


Algorithm 11.15 Deletion from a Specified Position
delete_pos(Start)

1. If Start = NULL
Print “Underflow: List is empty!” and go to step 8
End If
2. Set temp = Start
3. Read pos
4. Set count = count_node(Start) //counting total number of nodes
5. If pos > count OR pos < 1
Print “Invalid position!” and go to step 8
End If
6. If pos = 1
Set Start = Start->next //deleting the first node
If Start != NULL
Set Start->prev = NULL
End If
Else
Set i = 1
While i < pos //traversing up to the node at pos position
Set save = temp //save pointing to the node at pos-1 position
Set temp = temp->next //making temp to point to next node
Set i = i + 1
End While
Set ptr = temp->next
Set save->next = ptr
If ptr != NULL //ptr is NULL when the last node is deleted
Set ptr->prev = save
End If
End If
7. Deallocate temp //de-allocating memory
8. End
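
Because each node of a doubly-linked list already stores the address of its predecessor, deletion can also be sketched without the separate pointer save. The following is such a variant, not the method of Algorithm 11.15; the name ddelete_at, its parameters, and its 1/0 return value are assumptions made for this illustration.

```c
#include <stdlib.h>

typedef struct node
{
    int info;
    struct node *next;
    struct node *prev;
} Node;

/* Deletes the node at 1-based position pos. Returns 1 on success and 0 on
   underflow or invalid position (hypothetical convention). */
int ddelete_at(Node **start, int pos)
{
    Node *temp = *start;
    int i;

    if (temp == NULL || pos < 1)
        return 0;                          /* underflow or invalid position */
    for (i = 1; i < pos; i++)
    {
        temp = temp->next;
        if (temp == NULL)
            return 0;                      /* pos is greater than the node count */
    }
    if (temp->prev != NULL)
        temp->prev->next = temp->next;     /* predecessor skips temp */
    else
        *start = temp->next;               /* temp was the first node */
    if (temp->next != NULL)
        temp->next->prev = temp->prev;     /* successor points back past temp */
    free(temp);                            /* de-allocating memory */
    return 1;
}
```

The two NULL checks handle deletion of the first node and of the last node, so no special case is required for a list with a single node.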

Note: A doubly-linked list, in which the next field of the last node points to the first
node and the prev field of the first node points to the last node, instead of ‘NULL’,
is termed as a doubly circular linked list.
Program 11.2: A program to illustrate the implementation of a doubly-linked list
is as follows:
#include<stdio.h>
#include<stdlib.h> /* malloc(), free(), exit() */
#include<conio.h> /* clrscr(), getch(): non-standard, available in Turbo C */
#define True 1
#define False 0
typedef struct node
{
    int info;
    struct node *next;
    struct node *prev;
}Node; /* node of a doubly-linked list */
/* Function prototypes */
Node * create_node();
int isempty(Node *);
void display(Node *);
int count_node(Node *);
void insert_beg(Node **);
void insert_end(Node **);
void insert_pos(Node **);
void delete_beg(Node **);
void delete_end(Node **);
void delete_pos(Node **);
/*Main Function*/
int main(void)
{
    int ch,ch1;
    Node *Start=NULL;
    do
    {
        clrscr(); /* Turbo C only: clears the screen */
        printf("\n\n\tMain Menu");
        printf("\n1. Insert");
        printf("\n2. Delete");
        printf("\n3. Display");
        printf("\n4. Exit\n");
        printf("\nEnter your choice: ");
        scanf("%d",&ch);
        switch(ch)
        {
            case 1: printf("\n1. Insert in the beginning");
                    printf("\n2. Insert at the end");
                    printf("\n3. Insert at a specified position");
                    printf("\n4. Back to main menu\n");
                    printf("\nEnter your choice: ");
                    scanf("%d",&ch1);
                    switch(ch1)
                    {
                        case 1: insert_beg(&Start);
                                break;
                        case 2: insert_end(&Start);
                                break;
                        case 3: insert_pos(&Start);
                                break;
                        case 4: break;
                        default: printf("\nInvalid choice!");
                    }
                    break;
            case 2: printf("\n1. Delete from the beginning");
                    printf("\n2. Delete from the end");
                    printf("\n3. Delete from a specified position");
                    printf("\n4. Back to main menu\n");
                    printf("\nEnter your choice: ");
                    scanf("%d",&ch1);
                    switch(ch1)
                    {
                        case 1: delete_beg(&Start);
                                break;
                        case 2: delete_end(&Start);
                                break;
                        case 3: delete_pos(&Start);
                                break;
                        case 4: break;
                        default: printf("\nInvalid choice!");
                    }
                    break;
            case 3: display(Start);
                    break;
            case 4: exit(0); /* exit() requires a status argument */
            default: printf("\nInvalid choice!");
        }
        getch(); /* Turbo C only: waits for a key press */
    }while(1);
}
Node * create_node()
{
    Node *nptr;
    int item;
    nptr=(Node *)malloc(sizeof(Node));
    if(nptr==NULL)
    {
        printf("\nOverflow: Memory not allocated!");
        exit(1); /* non-zero status reports the failure */
    }
    printf("\nEnter the value to be inserted: ");
    scanf("%d",&item);
    nptr->info=item;
    nptr->next=NULL;
    nptr->prev=NULL;
    return(nptr);
}
int isempty(Node *Start)
{
    if(Start==NULL)
        return True;
    else
        return False;
}
void display(Node *Start)
{
    Node *temp=Start;
    if(temp==NULL)
        printf("\nList is empty!!");
    else
    {
        printf("\nThe linked list is: ");
        while(temp != NULL)
        {
            printf("%d ",temp->info);
            temp=temp->next;
        }
    }
}
int count_node(Node *Start)
{
Node *temp=Start;
int count=0;
while(temp != NULL)
{
count++;
temp=temp->next;
}
return(count);
}
void insert_beg(Node **Start)
{
    Node *nptr=create_node();
    if (*Start != NULL)
    {
        nptr->next=*Start;
        (*Start)->prev=nptr;
    }
    *Start=nptr;
    printf("\nNode inserted.");
}
void insert_end(Node **Start)
{
    Node *temp;
    Node *nptr=create_node();
    if(*Start==NULL)
        *Start=nptr;
    else
    {
        temp=*Start;
        while(temp->next != NULL)
            temp=temp->next;
        temp->next=nptr;
        nptr->prev=temp;
    }
    printf("\nNode inserted.");
}
void insert_pos(Node **Start)
{
    int i,pos,count;
    Node *nptr=create_node();
    Node *temp=*Start,*ptr;
    printf("\nEnter the position at which you want to insert: ");
    scanf("%d",&pos);
    count=count_node(temp);
    if(pos<1 || pos>count+1) /* pos<1 also rejects negative positions */
    {
        free(nptr); /* release the node that will not be inserted */
        printf("\nInvalid position!");
        return;
    }
    if(pos==1)
    {
        nptr->next=*Start;
        if(*Start != NULL)
            (*Start)->prev=nptr; /* old first node points back to new node */
        *Start=nptr;
    }
    else
    {
        for(i=1;i<pos-1;i++)
            temp=temp->next;
        ptr=temp->next;
        if(ptr != NULL) /* ptr is NULL when inserting after the last node */
            ptr->prev=nptr;
        nptr->next=ptr;
        nptr->prev=temp;
        temp->next=nptr;
    }
    printf("\nNode inserted.");
}
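void delete_beg(Node **Start)
{
    Node *temp=*Start;
    if(isempty(temp)) /* checking for underflow */
    {
        printf("\nUnderflow: List is empty!");
        return;
    }
    *Start=temp->next;
    if(*Start != NULL) /* the list had more than one node */
        (*Start)->prev=NULL;
    free(temp);
    printf("\nNode deleted.");
}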
void delete_end(Node **Start)
{
    Node *temp=*Start;
    Node *save;
    if(isempty(temp))
    {
        printf("\nUnderflow: List is empty!");
        return;
    }
    if(temp->next==NULL)
        *Start=NULL;
    else
    {
        while(temp->next != NULL)
        {
            save=temp;
            temp=temp->next;
        }
        save->next=NULL;
    }
    free(temp);
    printf("\nNode deleted.");
}
void delete_pos(Node **Start)
{
    Node *temp=*Start,*save,*ptr;
    int i,pos,count;
    if(isempty(temp)) /* checking for underflow */
    {
        printf("\nUnderflow: List is empty!");
        return;
    }
    printf("\nEnter the position of the node to be deleted: ");
    scanf("%d",&pos);
    count=count_node(temp);
    if(pos>count || pos<1) /* pos<1 rejects zero and negative positions */
    {
        printf("\nInvalid position!\n");
        return;
    }
    if(pos==1)
    {
        *Start=temp->next;
        if(*Start != NULL) /* the list had more than one node */
            (*Start)->prev=NULL;
    }
    else
    {
        for(i=1;i<pos;i++)
        {
            save=temp;
            temp=temp->next;
        }
        ptr=temp->next;
        save->next=ptr;
        if(ptr != NULL) /* ptr is NULL when the last node is deleted */
            ptr->prev=save;
    }
    free(temp);
    printf("\nNode deleted.\n");
}
The output of the program is as follows:
Main Menu
1. Insert
2. Delete
3. Display
4. Exit
Enter your choice: 1
1. Insert in the beginning
2. Insert at the end
3. Insert at a specified position
4. Back to main menu
Enter your choice: 1
Enter the value to be inserted: 6
Node inserted.
Main Menu
1. Insert
2. Delete
3. Display
4. Exit
Enter your choice: 1
1. Insert in the beginning
2. Insert at the end
3. Insert at a specified position
4. Back to main menu
Enter your choice: 2
Enter the value to be inserted: 5
Node inserted.
Main Menu
1. Insert
2. Delete
3. Display
4. Exit
Enter your choice: 1
1. Insert in the beginning
2. Insert at the end
3. Insert at a specified position
4. Back to main menu
Enter your choice: 3
Enter the value to be inserted: 8
Enter the position at which you want to insert: 2
Node inserted.
Main Menu
1. Insert
2. Delete
3. Display
4. Exit

Enter your choice: 3
The linked list is: 6 8 5
Main Menu
1. Insert
2. Delete
3. Display
4. Exit
Enter your choice: 2
1. Delete from the beginning
2. Delete from the end
3. Delete from a specified position
4. Back to main menu
Enter your choice: 3
Enter the position of the node to be deleted: 4
Invalid position!
Main Menu
1. Insert
2. Delete
3. Display
4. Exit
Enter your choice: 2
1. Delete from the beginning
2. Delete from the end
3. Delete from a specified position
4. Back to main menu
Enter your choice: 1
Node deleted.
Main Menu
1. Insert
2. Delete
3. Display
4. Exit
Enter your choice: 3
The linked list is: 8 5
Main Menu
1. Insert
2. Delete
3. Display
4. Exit

Enter your choice: 2
1. Delete from the beginning
2. Delete from the end
3. Delete from a specified position
4. Back to main menu
Enter your choice: 2
Node deleted.
Main Menu
1. Insert
2. Delete
3. Display
4. Exit
Enter your choice: 3
The linked list is: 8
Main Menu
1. Insert
2. Delete
3. Display
4. Exit
Enter your choice: 4

11.5 HEADER LIST

A header linked list is a linked list that contains a special node at the front of the list.
This special node is called a header node. It does not contain any actual data
item that is included in the list but generally contains some useful information about
the entire linked list, for example, the number of nodes in it.
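
A header list can be sketched by reusing the node structure of a singly-linked list. In the following illustration it is assumed that the info field of the header node stores the number of data nodes; this choice, and the names create_header and header_insert_beg, are hypothetical and are not prescribed by the text.

```c
#include <stdlib.h>

typedef struct node
{
    int info;
    struct node *next;
} Node;

/* Creates the header node of an empty header list. Its info field holds
   the node count (an assumed choice of "useful information"). */
Node *create_header(void)
{
    Node *head = malloc(sizeof *head);
    if (head != NULL)
    {
        head->info = 0;      /* no data nodes yet */
        head->next = NULL;   /* no first data node */
    }
    return head;
}

/* Inserts item as the first data node, just after the header node.
   Returns 1 on success and 0 on overflow. */
int header_insert_beg(Node *head, int item)
{
    Node *nptr = malloc(sizeof *nptr);
    if (nptr == NULL)
        return 0;
    nptr->info = item;
    nptr->next = head->next;
    head->next = nptr;
    head->info++;            /* keeping the count up to date */
    return 1;
}
```

With such a header node, the number of nodes is available without traversing the list, and the pointer to the list never changes when the first data node is inserted or deleted.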

11.6 REPRESENTATION OF LINKED LIST

To maintain a linked list in the memory, two parallel arrays of equal size are used.
One array (INFO) is used for the ‘info’ field and another array (NEXT) is used
for the ‘next’ field of the nodes of a list. Values in the arrays are stored such that the
‘ith’ location in arrays ‘INFO’ and ‘NEXT’ contain the ‘info’ and ‘next’ fields of a
node of the list respectively. In addition, a pointer variable ‘Start’ is maintained in
memory that stores the starting address of the list. Figure 11.16 shows the memory
representation of a linked list where each node contains an integer.


Fig. 11.16 Memory Representation of a Linked List

In Figure 11.16, the pointer variable Start contains 25, which is the address of
the first node of the list. This node stores the value 37 in array INFO and its
corresponding element in array NEXT stores 49, which is the address of the next
node in the list, and so on. Finally, the node at address 24 stores value 69 in array
INFO and NULL in array NEXT; thus, it is the last node of the list. It must be
noted that the values in array INFO are not stored in any particular order; array
NEXT is used to keep track of the order of the values in the list.
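
The parallel-array representation can be sketched in C as follows. The array size, the indices, and the use of -1 (named NIL) in place of NULL are assumptions made for this small illustration; they do not reproduce the contents of Figure 11.16. The list stored here is 37, 12, 69.

```c
#define SIZE 8
#define NIL  (-1)   /* plays the role of NULL: not a valid array index */

/* INFO[i] and NEXT[i] together form the node stored at location i.
   Locations 0, 2, 4, 6, and 7 are not part of the list. */
static const int INFO[SIZE] = {  0, 37,   0,  69,   0, 12,   0,   0};
static const int NEXT[SIZE] = {NIL,  5, NIL, NIL, NIL,  3, NIL, NIL};
static const int Start = 1;      /* location of the first node */

/* Copies the values of the list, in list order, into out[] (at most max
   values). Returns the number of values copied. */
int list_to_array(int out[], int max)
{
    int n = 0;
    int i = Start;
    while (i != NIL && n < max)
    {
        out[n++] = INFO[i];   /* processing the node at location i */
        i = NEXT[i];          /* moving to the next node */
    }
    return n;
}
```

The traversal visits locations 1, 5, and 3 in that order, which shows that the order of the list is determined by NEXT and not by the positions of the values in INFO.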
Memory allocation
As memory is allocated dynamically to the linked list, a new node can be inserted
anytime in the list. For this, the memory manager maintains a special linked list
known as a free-storage list or memory bank or free pool which consists of
unused memory cells. This list keeps a track of the free space available in the
memory and a pointer to this list is stored in a pointer variable Avail (see Figure
11.17). Note that the end of the free-storage list is also denoted by storing NULL
in the last available block of memory.


Fig. 11.17 Free-Storage List

In Figure 11.17, Avail contains 22, hence, INFO[22] is the starting point of
the free-storage list. Since NEXT[22] contains 26, INFO[26] is the next
free memory location. Similarly, other free spaces can be accessed and the NULL
in NEXT[23] indicates the end of free-storage list.
While creating a linked list or inserting an element into a linked list, if a
request for a new node arrives, the memory manager searches through the free-
storage list for the block of desired size. If the block of desired size is found, it
returns a pointer to that block. However, sometimes there is no space available,
i.e., the free-storage list is empty. This situation is termed as overflow and the
memory manager replies accordingly.
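The role of Avail and the overflow condition can be sketched in C with the same parallel arrays. This is a simplified illustration under stated assumptions: every cell has the same size, NIL (-1) stands in for NULL, and the names init_free_list, get_node and insert_front are hypothetical.

```c
#define CAPACITY 6
#define NIL (-1)   /* assumption: -1 plays the role of NULL */

int INFO[CAPACITY];
int NEXT[CAPACITY];
int Start = NIL;   /* the list in use, initially empty */
int Avail = NIL;   /* first cell of the free-storage list */

/* Chain every cell into the free-storage list: 0 -> 1 -> ... -> NIL. */
void init_free_list(void)
{
    for (int i = 0; i < CAPACITY - 1; i++)
        NEXT[i] = i + 1;
    NEXT[CAPACITY - 1] = NIL;
    Avail = 0;
    Start = NIL;
}

/* Take one cell from the free-storage list; NIL signals overflow. */
int get_node(void)
{
    if (Avail == NIL)
        return NIL;            /* overflow: the free-storage list is empty */
    int p = Avail;
    Avail = NEXT[Avail];       /* Avail now points to the next free cell */
    return p;
}

/* Insert value at the front of the list.
   Returns 0 on success and -1 on overflow. */
int insert_front(int value)
{
    int p = get_node();
    if (p == NIL)
        return -1;
    INFO[p] = value;
    NEXT[p] = Start;
    Start = p;
    return 0;
}
```

After init_free_list(), six insertions succeed; the seventh finds Avail equal to NIL and reports overflow.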

Check Your Progress


3. What are the three fields of each node in a doubly linked list?
4. What is done to insert a node at the end of a linked list?
5. What is done to maintain a linked list in the memory?

11.7 ANSWERS TO CHECK YOUR PROGRESS QUESTIONS

1. A dynamic data structure is one in which the memory for elements is allocated
dynamically during run-time.
2. A singly-linked list is also known as a linear linked list.
3. Each node of a doubly linked list consists of three fields: prev, info
and next.
4. To insert a node at the end of a linked list, the list is traversed up to the last
node and the next field of this node is modified to point to the new node.
5. To maintain a linked list in the memory, two parallel arrays of equal size are
used.

11.8 SUMMARY

• A dynamic data structure is one in which the memory for elements is allocated
dynamically during run-time. The successive elements of a dynamic data
structure need not be stored in contiguous memory locations but they are
still linked together by means of some linkages or references.
• A linked list is a linear collection of homogeneous elements called nodes.
Successive nodes of a linked list need not occupy adjacent memory locations.
• A singly-linked list is also known as a linear linked list. In it, each node
consists of two fields, viz. ‘info’ and ‘next’.
• A number of operations can be performed on singly-linked lists. These
operations include traversing, searching, inserting and deleting nodes,
reversing, sorting, and merging linked lists.
• Traversing a list means accessing the elements of a linked list, one by one,
to process all or some of the elements.
• Each node of a doubly-linked list consists of three fields: prev, info,
and next.
• To maintain a linked list in the memory, two parallel arrays of equal size are
used. One array (INFO) is used for the ‘info’ field and another array (NEXT)
is used for the ‘next’ field of the nodes of a list.
• As memory is allocated dynamically to the linked list, a new node can be inserted
anytime in the list.

11.9 KEY WORDS

• Linked list: It is a linear collection of homogeneous elements called nodes.
• Dynamic data structure: It is one in which the memory for elements is
allocated dynamically during run-time.
• Doubly-linked list: It is a list in which each node consists of three fields:
prev, info and next.

11.10 SELF ASSESSMENT QUESTIONS AND EXERCISES

Short-Answer Questions
1. What is a linked list?
2. Explain singly-linked list.
3. What do you mean by doubly-linked list?
4. Differentiate between merging list and header linked list.
Long-Answer Questions
1. “A dynamic data structure is one in which the memory for elements is allocated
dynamically during run-time.” Explain.
2. “The last node of the singly-linked list contains NULL in its ‘next’ field
which indicates the end of the list.” Explain with examples.
3. “A number of operations can be performed on singly-linked lists.” Elaborate.
4. “All the operations that are performed on singly-linked lists can also be
performed on doubly-linked lists.” Explain.

11.11 FURTHER READINGS

Gottfried, Byron S. 1996. Programming With C, Schaum’s Outline Series. New
York: McGraw-Hill.
Jeyapoovan T. 2006. Computer Programming - Theory & Practice. New Delhi:
Vikas Publishing.
Khurana, Rohit. 2005. Object Oriented Programming. New Delhi: Vikas
Publishing.
Kanetkar, Yashwant. 2003. Let Us C. New Delhi: BPB Publication.
Saxena, Sanjay. 2003. A First Course In Computers. New Delhi: Vikas
Publishing.
Subburaj, R. 2000. Programming In C. New Delhi: Vikas Publishing.
BLOCK - V
NON-LINEAR DATA STRUCTURE

UNIT 12 INTRODUCTION TO TREES
Structure
12.0 Introduction
12.1 Objectives
12.2 Trees
12.2.1 Forms of Binary Tree
12.3 Answers to Check Your Progress Questions
12.4 Summary
12.5 Key Words
12.6 Self Assessment Questions and Exercises
12.7 Further Readings

12.0 INTRODUCTION

A tree is a widely used abstract data type (ADT), or data structure.
This ADT simulates a hierarchical tree structure that has a root value and subtrees
of children with a parent node; these are represented as a set of linked nodes.
A binary tree is made up of nodes, where each node contains a ‘left’ and ‘right’
reference, and a data element. The topmost node in the tree is called the root.
Nodes that share the same parent are called siblings. In this unit, you will learn in
detail about trees.

12.1 OBJECTIVES
After going through this unit, you will be able to:
• Discuss binary trees
• Analyze the nature of binary trees
• Explain the forms of binary trees
• Understand the concept of extended binary tree

12.2 TREES

A binary tree is a special type of tree, which can either be empty or have a finite set of
nodes such that one of the nodes is designated as the root node and the remaining nodes
are partitioned into two sub trees of the root node, known as the left sub tree and the right sub
tree. The nonempty left sub tree and right sub tree are also binary trees. Unlike a
general tree, each node in a binary tree is restricted to have at most two child nodes.
Consider a sample binary tree T shown in Figure 12.1.

[Figure 12.1: binary tree T. Level 0: root A. Level 1: B, C. Level 2: D, E, F. Level 3: G, H. T1 is the left sub tree and T2 is the right sub tree of A.]

Fig. 12.1 A Binary Tree

In Figure 12.1, the topmost node A is the root node of the tree T. Each
node in this tree has zero or at the most two child nodes. The nodes A, B and D
have two child nodes, node C has only one child node, and the nodes G, H, E and
F are leaf nodes having no child nodes. The nodes B, C, D are internal nodes
having child as well as parent nodes. Some basic terms associated with binary
trees are:
• Ancestor and descendant: Node N1 is said to be an ancestor of
node N2 if N1 is the parent node of N2, or the parent of the parent of N2,
and so on, whereas node N2 is said to be a descendant of node N1. The
node N2 is said to be a left descendant of node N1 if it belongs to the left
sub tree of N1 and is said to be a right descendant of N1 if it belongs to
the right sub tree of N1. In the binary tree shown in Figure 12.1, node A is
an ancestor of node H and node H is a left descendant of node A.
• Degree of a node: Degree of a node is equal to the number of its child
nodes. In the binary tree shown in Figure 12.1, the nodes A, B and D have
degree 2, node C has degree 1 and nodes G, H, E and F have degree 0.
• Level: Since a binary tree is a multilevel data structure, each node belongs
to a particular level number. In the binary tree shown in Figure 12.1, the
root node A belongs to level 0, its child nodes belong to level 1, child
nodes of nodes B and C belong to level 2, and so on.
• Depth (or height): Depth of the binary tree is the highest level number
of any node in a binary tree. In the binary tree shown in Figure 12.1, the
nodes G and H are nodes with highest level number 3. Hence, the depth
of the binary tree is 3.
• Siblings: The nodes belonging to the same parent node are known as
sibling nodes. In the binary tree shown in Figure 12.1, nodes B and C are
sibling nodes as they have the same parent node, that is, A. Similarly, the
nodes D and E are also sibling nodes.
• Edge: An edge is a line connecting two nodes. In the binary tree shown in
Figure 12.1, there exists an edge between nodes A and B, whereas
there is no edge between the nodes B and C.
• Path: Path between the two nodes x and y is a sequence of consecutive
edges being followed from node x to y. In the binary tree shown in Figure
12.1, the path between the nodes A and H is A->B->D->H. Similarly,
the path from A to F is A->C->F.
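The terms degree and depth can be checked on a small linked version of the tree in Figure 12.1. The following is a minimal sketch in C, not a definitive implementation: the Node structure, the helper names new_node, degree, depth and build_figure_12_1, the value -1 for the depth of an empty tree, and the choice of F as the right child of C are all assumptions made for illustration.

```c
#include <stdlib.h>

typedef struct node {
    char info;
    struct node *left;
    struct node *right;
} Node;

/* Allocate a node with the given children; returns NULL if malloc fails. */
Node *new_node(char info, Node *left, Node *right)
{
    Node *n = malloc(sizeof *n);
    if (n != NULL) {
        n->info = info;
        n->left = left;
        n->right = right;
    }
    return n;
}

/* Degree of a node: the number of its child nodes (0, 1 or 2). */
int degree(const Node *n)
{
    return (n->left != NULL) + (n->right != NULL);
}

/* Depth of a tree: the highest level number, with the root at level 0.
   Assumption: an empty tree is given depth -1. */
int depth(const Node *root)
{
    if (root == NULL)
        return -1;
    int l = depth(root->left);
    int r = depth(root->right);
    return 1 + (l > r ? l : r);
}

/* Build the tree of Figure 12.1 (F is assumed to be the right child of C). */
Node *build_figure_12_1(void)
{
    Node *d = new_node('D', new_node('G', NULL, NULL), new_node('H', NULL, NULL));
    Node *b = new_node('B', d, new_node('E', NULL, NULL));
    Node *c = new_node('C', NULL, new_node('F', NULL, NULL));
    return new_node('A', b, c);
}
```

For this tree, depth gives 3, degree gives 2 for node A, 1 for node C and 0 for node E, which agrees with the values stated in the definitions.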
12.2.1 Forms of Binary Tree
There are various forms of binary trees that are formed by imposing certain
restrictions on them. Some of the variations of binary trees are—complete binary
tree and extended binary tree.
Complete Binary Tree
A binary tree is said to be a complete binary tree if all the leaf nodes of the tree are
at the same level and every non-leaf node has two child nodes. Thus, the tree has the
maximum number of nodes at all the levels (Figure 12.2).
At any level n of a binary tree, there can be at most 2^n nodes. That is,
At n = 0, there can be at most 2^0 = 1 node.
At n = 1, there can be at most 2^1 = 2 nodes.
At n = 2, there can be at most 2^2 = 4 nodes.
At level n, there can be at most 2^n nodes.

[Figure 12.2: complete binary tree. Level 0: A. Level 1: B, C. Level 2: D, E, F, G. Level 3: H, I, J, K, L, M, N, O.]

Fig. 12.2 Complete Binary Tree

Extended Binary Tree


A binary tree is said to be an extended binary tree (also known as 2-tree) if all of its
nodes are of degree either zero or two. In this type of binary tree, the nodes with
degree two (also known as internal nodes) are represented as circles and nodes
with degree zero (also known as external nodes) are represented as squares (see
Figure 12.3).

[Figure 12.3: extended binary tree; internal nodes are drawn as circles and external nodes as squares.]

Fig. 12.3 Extended Binary Tree

Check Your Progress


1. What is the depth of a binary tree?
2. What are sibling nodes?

12.3 ANSWERS TO CHECK YOUR PROGRESS QUESTIONS

1. Depth of the binary tree is the highest level number of any node in a binary
tree.
2. The nodes belonging to the same parent node are known as sibling nodes.

12.4 SUMMARY

• A binary tree is a special type of tree, which can either be empty or have a
finite set of nodes. The nonempty left sub tree and the right sub tree are also
binary trees.
• Degree of a node is equal to the number of its child nodes.
• Since a binary tree is a multilevel data structure, each node belongs to a
particular level number.
• Depth of the binary tree is the highest level number of any node in a binary
tree.
• The nodes belonging to the same parent node are known as sibling nodes.
• There are various forms of binary trees that are formed by imposing certain
restrictions on them. Some of the variations of binary trees are the complete
binary tree and the extended binary tree.
• A binary tree is said to be a complete binary tree if all the leaf nodes of the
tree are at the same level and every non-leaf node has two child nodes.
• A binary tree is said to be an extended binary tree (also known as 2-tree) if all
of its nodes are of degree either zero or two.

12.5 KEY WORDS

• Complete Binary tree: It is a binary tree in which all the leaf nodes are
at the same level and every non-leaf node has two child nodes.
• Extended Binary tree: It is a binary tree in which every node is of degree
either zero or two.

12.6 SELF ASSESSMENT QUESTIONS AND EXERCISES

Short-Answer Questions
1. Write a short note on binary trees.
2. Show a diagrammatic representation of a binary tree.
3. What do you mean by ancestor and descendant?
Long-Answer Questions
1. Write a detailed note on the forms of binary trees.
2. How is a complete binary tree different from an extended binary tree? Explain.
3. Write a detailed note on Binary tree representations.

12.7 FURTHER READINGS

Gottfried, Byron S. 1996. Programming With C, Schaum’s Outline Series. New
York: McGraw-Hill.
Jeyapoovan T. 2006. Computer Programming - Theory & Practice. New Delhi:
Vikas Publishing.
Khurana, Rohit. 2005. Object Oriented Programming. New Delhi: Vikas
Publishing.
Kanetkar, Yashwant. 2003. Let Us C. New Delhi: BPB Publication.
Saxena, Sanjay. 2003. A First Course In Computers. New Delhi: Vikas
Publishing.
Subburaj, R. 2000. Programming In C. New Delhi: Vikas Publishing.

UNIT 13 BINARY TREE REPRESENTATION
Structure
13.0 Introduction
13.1 Objectives
13.2 Binary Tree
13.2.1 Array Representation
13.2.2 Linked Representation
13.3 Binary Tree Traversals
13.4 Answers to Check Your Progress Questions
13.5 Summary
13.6 Key Words
13.7 Self Assessment Questions and Exercises
13.8 Further Readings

13.0 INTRODUCTION

In the previous unit, you learnt about binary trees. This unit explains
binary tree representation, operations and applications. A binary tree can be
traversed in three different ways: in-order, pre-order and post-order traversal.
You will learn these in detail along with the basic concepts of binary search tree.

13.1 OBJECTIVES
After going through this unit, you will be able to:
• Discuss about traversing a binary tree
• Analyze what happens when a root node is visited in pre-order traversal
• Understand the different ways in which a tree can be traversed

13.2 BINARY TREE

Like stacks and queues, binary trees can also be represented in two ways in
memory—array (sequential) representation and linked representation. In array
representation, memory is allocated at compile time and in linked representation,
memory is allocated dynamically.
13.2.1 Array Representation
In array representation, a binary tree is represented sequentially in memory by using
a single one-dimensional array. A binary tree of height n may comprise at most
2^(n+1)-1 nodes, hence an array of maximum size 2^(n+1)-1 is used for
representing such a tree. All the nodes of the tree are assigned a sequence number
(from 0 to (2^(n+1)-1)-1) level by level. That is, the root node at level 0 is
assigned sequence number 0, and then nodes at level 1 are assigned sequence
numbers in ascending order from left to right, and so on. For example, the nodes of
a binary tree of height 2, having 7 (2^(2+1)-1) nodes can be numbered as shown in
Figure 13.1(a).

Level 0: M (0)
Level 1: G (1), R (2)
Level 2: D (3), I (4), P (5), T (6)

(a) Ordering of Nodes of Binary Tree

Index: 0 1 2 3 4 5 6
Node:  M G R D I P T

(b) Nodes of Binary Tree Stored in an Array

Fig. 13.1 Array Representation of Binary Tree

The numbers assigned to the nodes indicate the position (index value) of an array
at which that particular node is stored. The array representation of this tree is
shown in Figure 13.1(b). It can be observed that if any node is stored at position
p, then its left child node is stored at position 2*p+1 and its right child node is
stored at position 2*p+2. For example, in Figure 13.1(b), the node G is stored at
position 1, its left child node D is stored at position 3 (2*1+1) and its right child
node I is stored at position 4 (2*1+2). Notice that if any of the nodes in the tree has
empty sub trees (except the leaf nodes), the nodes forming the part of these empty
sub trees are also numbered and their values in the corresponding positions in the
array are NULL. For example, consider a binary tree shown in Figure 13.2(a); its
array representation is shown in Figure 13.2(b).

[Figure 13.2(a): binary tree with nodes A, B, C, D, E, F, G and H; positions 0 to 14 are numbered level by level, including the positions of the empty sub trees.]

(a) Ordering of Nodes with Empty Sub Trees

[Figure 13.2(b): array with index positions 0 to 14 holding the nodes A to H; the positions that belong to empty sub trees hold NULL.]

(b) Nodes with Empty Sub Trees Stored in an Array

Fig. 13.2 Array Representation of Binary Tree with Empty Sub Trees
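The index arithmetic of the array representation can be sketched in C as follows. This is a minimal illustration; the function names and the use of '\0' as the marker for an empty position are assumptions, not part of the text.

```c
#define EMPTY '\0'   /* assumption: '\0' marks an unused array position */

/* Positions of the child nodes and the parent of the node at position p. */
int left_child(int p)  { return 2 * p + 1; }
int right_child(int p) { return 2 * p + 2; }
int parent(int p)      { return (p - 1) / 2; }   /* meaningful only for p > 0 */

/* Maximum number of nodes in a binary tree of the given height: 2^(h+1) - 1. */
int max_nodes(int height)
{
    return (1 << (height + 1)) - 1;
}

/* Value stored at position p, or EMPTY if p lies outside the array. */
char node_at(const char tree[], int size, int p)
{
    return (p >= 0 && p < size) ? tree[p] : EMPTY;
}
```

With the array {M, G, R, D, I, P, T} of Figure 13.1(b), node_at(tree, 7, left_child(1)) yields D and node_at(tree, 7, right_child(1)) yields I, while max_nodes(2) gives 7.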

In this representation, an array of maximum size is declared (to accommodate the
maximum number of nodes for a binary tree of a given height) before run-time,
which leads to wastage of a lot of memory space in case of unbalanced trees. In
unbalanced trees the number of nodes is very small as compared to the maximum
number of nodes for a given height. For example, consider an unbalanced tree
shown in Figure 13.3(a). Since this tree is of height 3, an array of size 15 (2^(3+1)-1)
will be declared to store nodes of this tree. The array representation of this tree is
shown in Figure 13.3(b).

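[Figure 13.3(a): unbalanced binary tree with nodes A, B, D and G; positions 0 to 14 are numbered level by level.]

(a) An Unbalanced Binary Tree

[Figure 13.3(b): array with index positions 0 to 14; only four positions hold nodes (A, B, D and G) and the remaining positions are NULL.]

(b) Nodes of an Unbalanced Binary Tree Stored in an Array

Fig. 13.3 Array Representation of an Unbalanced Binary Tree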

It can be observed from this array representation that most of the array
positions are NULL, leading to wastage of memory space. Due to this disadvantage
of array representation of binary trees, the linked representation of binary trees is
preferred.
13.2.2 Linked Representation
Linked representation is one of the most common and important ways of representing
a binary tree in memory. The linked representation of a binary tree is implemented
by using linked nodes, each having an info part and two pointers. The info part contains
the data value, and the two pointers, left and right, are used to point to the left
and right sub trees of a node, respectively. The structure of such a node is shown in
Figure 13.4.

[Figure 13.4: a node with three parts: a left pointer pointing to the left sub tree, the info part (A) and a right pointer pointing to the right sub tree.]

Fig. 13.4 Structure of a Node of Binary Tree

To define a node of a binary tree in ‘C’ language, a self-referential structure
can be used whose definition is shown as follows:
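typedef struct node
{
    int info;              /* data value stored in the node */
    struct node *left;     /* pointer to the left sub tree */
    struct node *right;    /* pointer to the right sub tree */
} Node;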
In linked representation, a pointer variable Root of Node type is used to
point to the root node of a tree. Root variable is used for accessing the root and
the subsequent nodes of a binary tree. Since binary tree is empty in the beginning,
the pointer variable Root is initialized with NULL. The linked representation of
a sample binary tree (see Figure 12.1) is shown in Figure 13.5.

[Figure 13.5: linked representation of the binary tree of Figure 12.1; Root points to the root node, and every missing child is represented by a NULL pointer.]

Fig. 13.5 Linked Representation of a Binary Tree
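A linked tree can be built and released with a few helper functions. The following is a minimal sketch in C under stated assumptions: the Node structure repeats the definition given above, while the helper names create_node, count_nodes and free_tree are hypothetical.

```c
#include <stdlib.h>

typedef struct node {
    int info;
    struct node *left;
    struct node *right;
} Node;

/* Allocate a node with no children; returns NULL if allocation fails. */
Node *create_node(int info)
{
    Node *n = malloc(sizeof *n);
    if (n != NULL) {
        n->info = info;
        n->left = NULL;
        n->right = NULL;
    }
    return n;
}

/* Count the nodes reachable from root; an empty tree (NULL) has 0 nodes. */
int count_nodes(const Node *root)
{
    if (root == NULL)
        return 0;
    return 1 + count_nodes(root->left) + count_nodes(root->right);
}

/* Release every node; both sub trees are freed before their parent. */
void free_tree(Node *root)
{
    if (root == NULL)
        return;
    free_tree(root->left);
    free_tree(root->right);
    free(root);
}
```

A tree starts as Node *Root = NULL; nodes are then attached by assigning the result of create_node to Root, Root->left, Root->right and so on. Unlike the array representation, memory is used only for the nodes that actually exist.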

13.3 BINARY TREE TRAVERSALS

Traversing a binary tree refers to the process of visiting each and every node of the
tree exactly once. The three different ways in which a tree can be traversed are—
in-order, pre-order and post-order traversal. The main difference in these traversal
methods is based on the order in which the root node is visited. Note that in all the
traversals the left sub tree is always traversed before the traversal of the right sub
tree. To understand these traversal methods, consider a simple binary tree T, shown
in Figure 13.6.

[Figure 13.6: binary tree T with root A. Left sub tree T1: root B with child nodes D and E. Right sub tree T2: root C with child nodes F and G.]

Fig. 13.6 A Binary Tree

Pre-order
In pre-order traversal, the root node is visited before traversing its left and right
sub trees. Steps for traversing a nonempty binary tree in pre-order are as follows:
1. Visit the root node R.
2. Traverse the left sub tree of root node R in pre-order.
3. Traverse the right sub tree of root node R in pre-order.
For example, in binary tree T (shown in Figure 13.6), the root node A is
traversed before traversing its left sub tree and the right sub tree. In the left sub
tree T1, the root node B (of left sub tree T1) is traversed before traversing the
nodes D and E. After traversing the root node of binary tree T and traversing the
left sub tree T1, the right sub tree T2 is also traversed following the same procedure.
Hence, the resultant pre-order traversal of the binary tree T is A, B, D, E, C, F, G.
In-order
In in-order traversal, the root node is visited after the traversal of its left sub tree
and before the traversal of its right sub tree. Steps for traversing a nonempty
binary tree in in-order are as follows:
1. Traverse the left sub tree of root node R in in-order.
2. Visit the root node R.
3. Traverse the right sub tree of root node R in in-order.
For example, in binary tree T (shown in Figure 13.6), the left sub tree T1
is traversed before traversing the root node A. In the left sub tree T1, the node D
is traversed before traversing its root node B (of left sub tree T1). After traversing
the node D and B, the node E is traversed. Once the traversal of left sub tree T1
and the root node A of binary tree T is complete, the right sub tree T2 is traversed
following the same procedure. Hence, the resultant in-order traversal of the binary
tree T is D, B, E, A, F, C, G.

Post-order
In post-order traversal, the root node is visited after traversing its left and right sub
trees. Steps for traversing a nonempty binary tree in post-order are as follows:
1. Traverse the left sub tree of root node R in post-order.
2. Traverse the right sub tree of root node R in post-order.
3. Visit the root node R.
For example, in binary tree T (shown in Figure 13.6), the root node A is
traversed after traversing its left sub tree and the right sub tree. In the left sub tree
T1, the root node B (of left sub tree T1) is traversed after traversing the nodes D
and E. Similarly, the nodes of right sub tree T2 are traversed following the same
procedure. After traversing the left sub tree (T1) and right sub tree (T2), the root
node A of binary tree T is traversed. Hence, the resultant post-order traversal of
the binary tree T is D, E, B, F, G, C, A.
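The three sets of steps translate directly into recursive functions. The following is a minimal sketch in C, assuming a node that stores a single character; the names make, preorder, inorder, postorder and build_figure_13_6 are hypothetical. Each traversal writes the visited values into a caller-supplied buffer and returns a pointer to the next free position, so that the result can be inspected.

```c
#include <stdlib.h>

typedef struct node {
    char info;
    struct node *left;
    struct node *right;
} Node;

/* Allocate a node with the given children; returns NULL if malloc fails. */
Node *make(char info, Node *left, Node *right)
{
    Node *n = malloc(sizeof *n);
    if (n != NULL) {
        n->info = info;
        n->left = left;
        n->right = right;
    }
    return n;
}

/* Pre-order: root, then left sub tree, then right sub tree. */
char *preorder(const Node *r, char *out)
{
    if (r == NULL)
        return out;
    *out++ = r->info;
    out = preorder(r->left, out);
    return preorder(r->right, out);
}

/* In-order: left sub tree, then root, then right sub tree. */
char *inorder(const Node *r, char *out)
{
    if (r == NULL)
        return out;
    out = inorder(r->left, out);
    *out++ = r->info;
    return inorder(r->right, out);
}

/* Post-order: left sub tree, then right sub tree, then root. */
char *postorder(const Node *r, char *out)
{
    if (r == NULL)
        return out;
    out = postorder(r->left, out);
    out = postorder(r->right, out);
    *out++ = r->info;
    return out;
}

/* Build the binary tree T of Figure 13.6. */
Node *build_figure_13_6(void)
{
    Node *b = make('B', make('D', NULL, NULL), make('E', NULL, NULL));
    Node *c = make('C', make('F', NULL, NULL), make('G', NULL, NULL));
    return make('A', b, c);
}
```

A call such as *preorder(t, buf) = '\0'; fills buf with the pre-order sequence and terminates it as a string. For the tree of Figure 13.6, the three functions produce ABDECFG, DBEAFCG and DEBFGCA respectively, matching the sequences derived above.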
Similarly, the pre-order, in-order and post-order traversals of a binary tree
shown in Figure 12.2 are as follows:
Pre-order traversal: A B D H I E J K C F L M G N O
In-order traversal: H D I B J E K A L F M C N G O
Post-order traversal: H I D J K E B L M F N O G C A
In addition to these traversals, there is another way of traversing a tree
known as level-order traversal. In this traversal, every node at one level is visited
before moving on to the next level. The level-order traversal of a binary tree shown
in Figure 12.2 is given as follows:
Level-order traversal: A B C D E F G H I J K L M N O
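Level-order traversal is usually written with a queue rather than recursion: a node is removed from the front of the queue, visited, and its child nodes are added at the rear. The following is a minimal sketch in C under stated assumptions: the queue is a fixed-size array of MAX_NODES entries, and the names build_from_levels and level_order are hypothetical. The helper build_from_levels creates a linked tree from a string that lists the nodes level by level, using the 2*p+1 and 2*p+2 rule of the array representation.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_NODES 64   /* assumption: capacity of the queue */

typedef struct node {
    char info;
    struct node *left;
    struct node *right;
} Node;

/* Recursively link the node at position p of labels to its child nodes. */
static Node *build(const char *labels, size_t n, size_t p)
{
    if (p >= n)
        return NULL;
    Node *node = malloc(sizeof *node);
    if (node == NULL)
        return NULL;
    node->info = labels[p];
    node->left = build(labels, n, 2 * p + 1);
    node->right = build(labels, n, 2 * p + 2);
    return node;
}

/* Build a linked tree from a level-by-level string such as "ABCDEFG". */
Node *build_from_levels(const char *labels)
{
    return build(labels, strlen(labels), 0);
}

/* Visit the nodes level by level, writing the values to out as a string.
   out must have room for MAX_NODES + 1 characters.
   Returns the number of nodes visited. */
int level_order(const Node *root, char *out)
{
    const Node *queue[MAX_NODES];
    int head = 0, tail = 0, count = 0;

    if (root != NULL)
        queue[tail++] = root;
    while (head < tail) {
        const Node *cur = queue[head++];      /* remove from the front */
        out[count++] = cur->info;             /* visit the node */
        if (cur->left != NULL && tail < MAX_NODES)
            queue[tail++] = cur->left;        /* add child nodes at the rear */
        if (cur->right != NULL && tail < MAX_NODES)
            queue[tail++] = cur->right;
    }
    out[count] = '\0';
    return count;
}
```

Because each node enters the queue exactly once, a plain array with head and tail indices is sufficient here; a circular queue would be needed only if the array were smaller than the number of nodes.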

Check Your Progress


1. What does traversing a binary tree refer to?
2. When is the root node visited in pre-order traversal?

13.4 ANSWERS TO CHECK YOUR PROGRESS QUESTIONS

1. Traversing a binary tree refers to the process of visiting each and every
node of the tree exactly once.
2. In pre-order traversal, the root node is visited before traversing its left and
right subtrees.

13.5 SUMMARY

• Like stacks and queues, binary trees can also be represented in two ways
in memory: array (sequential) representation and linked representation.
• In array representation, a binary tree is represented sequentially in memory
by using a one-dimensional array.
• The numbers assigned to the nodes indicate the position (index value) of
an array at which that particular node is stored.
• Linked representation is one of the most common and important ways of
representing a binary tree in memory.
• Traversing a binary tree refers to the process of visiting each and every
node of the tree exactly once. The three different ways in which a tree can
be traversed are in-order, pre-order and post-order traversal.
• In pre-order traversal, the root node is visited before traversing its left and
right sub trees.
• In in-order traversal, the root node is visited after the traversal of its left sub
tree and before the traversal of its right sub tree.
• In post-order traversal, the root node is visited after traversing its left and
right sub trees.

13.6 KEY WORDS

• Traversing: It refers to the process of visiting each and every node of the
tree exactly once.
• Stack: It is a linear data structure which follows a particular order in which
the operations are performed. The order may be LIFO (Last in First Out)
or FILO (First in Last Out).

13.7 SELF ASSESSMENT QUESTIONS AND EXERCISES

Short-Answer Questions
1. What are the various ways of representing binary tree?
2. What do you mean by traversing binary trees?
3. How do you search a node in binary tree?

Long-Answer Questions
1. “Linked representation is one of the most common and important ways of
representing a binary tree in memory.” Explain.
2. Write the steps for traversing the tree in pre-order, in-order and post-order.

13.8 FURTHER READINGS

Gottfried, Byron S. 1996. Programming With C, Schaum’s Outline Series. New
York: McGraw-Hill.
Jeyapoovan T. 2006. Computer Programming - Theory & Practice. New Delhi:
Vikas Publishing.
Khurana, Rohit. 2005. Object Oriented Programming. New Delhi: Vikas
Publishing.
Kanetkar, Yashwant. 2003. Let Us C. New Delhi: BPB Publication.
Saxena, Sanjay. 2003. A First Course In Computers. New Delhi: Vikas
Publishing.
Subburaj, R. 2000. Programming In C. New Delhi: Vikas Publishing.

UNIT 14 BINARY SEARCH TREE

Structure
14.0 Introduction
14.1 Objectives
14.2 Binary Search Tree
14.2.1 Inserting a Node
14.2.2 Deleting a Node
14.3 Applications of Trees
14.4 Hashing Techniques
14.5 Answers to Check Your Progress Questions
14.6 Summary
14.7 Key Words
14.8 Self Assessment Questions and Exercises
14.9 Further Readings

14.0 INTRODUCTION

In this unit, we consider a particular kind of binary tree called a
Binary Search Tree (BST). A binary tree is a binary search tree (BST) if and only
if an in-order traversal of the binary tree results in a sorted sequence. The idea of
a binary search tree is that data is stored according to an order, so that it can be
retrieved very efficiently. This unit will explain the various types of operations on
binary search trees.

14.1 OBJECTIVES
After going through this unit, you will be able to:
• Discuss the operations in binary tree
• Understand the insertion and deletion operations
• List the hashing techniques
• Analyze the division remainder method

14.2 BINARY SEARCH TREE

Binary search tree, also known as binary sorted tree, is a kind of binary tree
which satisfies the following conditions (Figure 14.1):
1. The data value in each node is a key (unique) value, that is, no two nodes
can have identical values.
2. The data values in the nodes of the left sub tree, if it exists, are smaller than the
value in the root node.
3. The data values in the nodes of the right sub tree, if it exists, are greater than the
value in the root node.
4. The left and the right sub trees, if they exist, are also binary search trees.

[Figure 14.1: binary search tree. Level 0: 66. Level 1: 40, 90. Level 2: 30, 50, 75, 110. Level 3: 20, 35, 45, 55, 70, 80, 100, 120.]

Fig. 14.1 Binary Search Tree
In other words, values in the left sub tree of a root node are smaller than the
value of the root node, and the values in the right sub tree are greater than the
value of the root node. This rule is applicable on all the subsequent sub trees in a
binary search tree. In addition, each and every value in binary search tree is unique,
that is, no two nodes in it can have identical values.
There are various operations that can be performed on the binary search
trees. Some of these are searching a node, insertion of a new node, traversal of a
tree and deletion of a node.
Searching a Node
Searching an element in a binary search tree is easy, since the elements in this tree
are arranged in a sorted order. The element to be searched is compared with the
value in the root node. If the element is smaller than the value in the root node, then
the searching will proceed to the left sub tree and if the element is greater than the
value in the root node, then the searching will proceed to the right sub tree. This
process is repeated until either the element to be searched is found or NULL value
is encountered.
For example, consider a sample binary search tree given in Figure 14.1.
The steps to search element 45 are given as follows:
1. Compare the element 45 with the value in the root node (66). Since 45 is
smaller than 66, move to its left sub tree.
2. Compare the element 45 with the value (40) appearing in the left sub
tree. Since 45 is greater than 40, move to its right sub tree.
3. Now, compare the element 45 with the value (50) appearing in the
right sub tree. Since 45 is smaller than 50, move to its left sub tree.
4. In the next step, compare the element 45 with the value (45) appearing
in the left sub tree. Since 45 is equal to the value (45) stored in this
node, the required element is found. Therefore, terminate the
procedure.
In case the value 48 is to be searched, the first three steps are the same. In
step 4, 48 is found to be greater than 45, so the right sub tree of 45 is accessed.
This is NULL, indicating the end of the tree. Therefore, the element is not found
in the tree and the search is unsuccessful.
Algorithm 14.1 Searching in a Binary Search Tree
search(item, ptr)

1. If !(ptr)
Print "Element not found!" and go to step 3
End If
2. If item < ptr->info
Call search(item, ptr->left)
Else If item > ptr->info
Call search(item, ptr->right)
Else
Print "Element found."
End If
3. End
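Algorithm 14.1 can be sketched in C as follows. This is a minimal illustration under stated assumptions: instead of printing a message, search returns a pointer to the node found, or NULL when the element is absent; the iterative variant search_iter and the helper build_figure_14_1 (which links the values of Figure 14.1 level by level in a static array) are hypothetical additions.

```c
#include <stddef.h>

#define FIG_NODES 15

typedef struct node {
    int info;
    struct node *left;
    struct node *right;
} Node;

/* Recursive search following Algorithm 14.1.
   Returns the node holding item, or NULL if item is not in the tree. */
Node *search(int item, Node *ptr)
{
    if (ptr == NULL)
        return NULL;                          /* element not found */
    if (item < ptr->info)
        return search(item, ptr->left);
    if (item > ptr->info)
        return search(item, ptr->right);
    return ptr;                               /* element found */
}

/* Iterative variant that also reports the number of nodes compared. */
Node *search_iter(int item, Node *ptr, int *steps)
{
    int n = 0;
    while (ptr != NULL) {
        n++;
        if (item == ptr->info)
            break;
        ptr = (item < ptr->info) ? ptr->left : ptr->right;
    }
    if (steps != NULL)
        *steps = n;
    return ptr;
}

/* Build the tree of Figure 14.1 from its level-by-level listing. */
Node *build_figure_14_1(void)
{
    static const int values[FIG_NODES] = {
        66, 40, 90, 30, 50, 75, 110, 20, 35, 45, 55, 70, 80, 100, 120
    };
    static Node nodes[FIG_NODES];
    for (int p = 0; p < FIG_NODES; p++) {
        nodes[p].info = values[p];
        nodes[p].left = (2 * p + 1 < FIG_NODES) ? &nodes[2 * p + 1] : NULL;
        nodes[p].right = (2 * p + 2 < FIG_NODES) ? &nodes[2 * p + 2] : NULL;
    }
    return &nodes[0];
}
```

Searching for 45 in this tree compares four nodes (66, 40, 50 and 45) before succeeding; searching for 48 compares the same four nodes and then reaches a NULL pointer.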

14.2.1 Inserting a Node


Insertion in a binary search tree is similar to the procedure for searching an element
in a binary search tree. The difference is that in case of insertion, appropriate null
pointer is searched where new node can be inserted. The process of inserting a
node in a binary search tree can be divided into two steps—in the first step, the
tree is searched to determine the appropriate position where the node is to be
inserted and in the second step, the node is inserted at this searched position.
There are two cases of insertion in a tree: first, insertion into an empty tree
and second, insertion into a nonempty tree. In case the tree is initially empty, the
new node to be inserted becomes its root node. In case the tree is nonempty,
appropriate position is determined for insertion. For this, first of all the value in the
new node is compared with the root node of the tree. If the value in the new node
is less than the value in the root node, the new node is added as the left leaf if the
left sub tree is empty, otherwise the search continues in the left sub tree. On the
other hand, if the value in the new node is greater than the value in the root node,
the new node is added as the right leaf if the right sub tree is empty, otherwise the
search continues in the right sub tree.
For example, consider a sample binary search tree shown in Figure 14.2.

[Figure 14.2: binary search tree. Level 0: 66. Level 1: 40, 90. Level 2: 30, 50, 75, 110. Level 3: 35 (right child of 30) and 120 (right child of 110).]

Fig. 14.2 A Sample Binary Search Tree

For inserting elements 20 and 80, follow the steps given:
1. Compare 20 with the value in the root node, that is, 66. Since 20 is
smaller than 66, move to the left sub tree.
2. Finding that the left pointer of root node is non-null, compare 20 with
the value (40) in this node. Since 20 is smaller than 40, move to the
left sub tree.
3. Again, the left pointer of the current node is non-null, compare 20
with the value (30) in this node. Since 20 is smaller than 30, move to
the left sub tree.
4. Now, the left pointer is null, thus 20 will be inserted at this position.
After insertion, the tree will appear as shown in Figure 14.3.
[Figure 14.3: the tree of Figure 14.2 after the new node 20 is inserted as the left child of 30.]

Fig. 14.3 Insertion of a Node with Value 20

Steps for inserting the element 80 are as follows:
1. Compare 80 with the value in root node 66. Since 80 is greater than
66, move to the right sub tree.
2. Finding that the right pointer of root node is non-null, compare 80
with the value (90) in this node. Since 80 is smaller than 90, move to
the left sub tree.
3. Again, the left pointer of the current node is non-null, compare 80
with the value (75) in this node. Since 80 is greater than 75, move to
the right sub tree.
4. Now, the right pointer is null, thus 80 will be inserted at this position.
After insertion, the tree will appear as shown in Figure 14.4.

[Figure 14.4: the tree of Figure 14.3 after the new node 80 is inserted as the right child of 75.]

Fig. 14.4 Insertion of a Node with Value 80

Algorithm 14.2 Insertion into a Binary Search Tree


insert_node(item, ptr)

1. If !(ptr)
Allocate memory for ptr
Set ptr->info = item
Set ptr->left = NULL
Set ptr->right = NULL
Else
If item < ptr->info
Call insert_node(item, ptr->left)
Else
Call insert_node(item, ptr->right)
End If
End If
2. End
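Algorithm 14.2 can be sketched in C as follows. Two points in this sketch are assumptions rather than part of the algorithm as printed: because C passes pointers by value, insert_node returns the root of the updated sub tree and the caller stores it back; and a value already present in the tree is ignored, in line with the condition that keys are unique (the printed algorithm would place it in the right sub tree). The helper inorder is hypothetical and is used only to inspect the result.

```c
#include <stdlib.h>

typedef struct node {
    int info;
    struct node *left;
    struct node *right;
} Node;

/* Insert item into the sub tree rooted at ptr.
   Returns the root of the updated sub tree. */
Node *insert_node(int item, Node *ptr)
{
    if (ptr == NULL) {
        /* The appropriate null pointer has been found: create the node. */
        ptr = malloc(sizeof *ptr);
        if (ptr != NULL) {
            ptr->info = item;
            ptr->left = NULL;
            ptr->right = NULL;
        }
        return ptr;
    }
    if (item < ptr->info)
        ptr->left = insert_node(item, ptr->left);
    else if (item > ptr->info)
        ptr->right = insert_node(item, ptr->right);
    /* item == ptr->info: duplicate key, the tree is left unchanged */
    return ptr;
}

/* Copy the values into out[] in in-order sequence, starting at position
   count; returns the position after the last value written. */
int inorder(const Node *ptr, int out[], int count)
{
    if (ptr == NULL)
        return count;
    count = inorder(ptr->left, out, count);
    out[count++] = ptr->info;
    return inorder(ptr->right, out, count);
}
```

Starting from Node *root = NULL and calling root = insert_node(value, root); for 66, 40, 90, 30, 50, 75, 110, 35 and 120 produces the tree of Figure 14.2. Inserting 20 and 80 afterwards places them as the left child of 30 and the right child of 75, and an in-order traversal returns the values in ascending order.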

14.2.2 Deleting a Node


Deletion of a node from a binary search tree involves two steps: first,
searching for the desired node and second, deleting the node. Whenever a
node is deleted from a tree, it must be ensured that the tree remains a
binary search tree, that is, the sorted order of the tree must not be
disturbed. The node being deleted may have zero, one or two child nodes.
On the basis of the number of child nodes of the node to be deleted,
there are three cases of deletion, which are discussed as follows:
Case 1: If the node to be deleted has no child node, it is deleted by
setting its parent's pointer to NULL and de-allocating the memory
allocated to it.
For example, the node with value 75 is to be deleted from the tree shown
in Figure 14.5. Since this node has no child node, the left pointer of
its parent (90) is set to NULL and the memory space of the node (75) is
de-allocated.
                 66
              /      \
           40          90
         /    \       /  \
       30      50   75    110
      /  \    /  \          \
    20    35 45   55        120

Fig. 14.5 Deletion of a Node Having No Child Node

Case 2: If the node to be deleted has only one child node, it is deleted
by making its parent's pointer point to its only child and de-allocating
the memory allocated to it.
For example, the node with value 110 is to be deleted from the tree shown
in Figure 14.6. Since this node has one child node, the right pointer of
its parent (90) is made to point to its child node (120) and the memory
space of the node (110) is de-allocated.
                 66
              /      \
           40          90
         /    \          \
       30      50         110
      /  \    /  \          \
    20    35 45   55        120

Fig. 14.6 Deletion of a Node Having Only One Child

Case 3: If the node to be deleted has two child nodes, it is deleted by
replacing its value with the largest value in the left sub tree (in-order
predecessor) or with the smallest value in the right sub tree (in-order
successor). The node whose value is used for replacement is then deleted
using case 1 or case 2.
                 66
              /      \
           40          90
         /    \          \
       30      50         120
      /  \    /  \
    20    35 45   55

    (the value 35 is copied into the node containing 40)

Fig. 14.7 Deletion of a Node Having Two Child Nodes

For example, the node with value 40 is to be deleted from the tree shown
in Figure 14.7. Since this node has two sub trees or child nodes, a value
that can be used for its replacement has to be found in its sub trees.
The replacement value can be either the largest value in its left sub
tree (35) or the smallest value in its right sub tree (45). Suppose the
value 35 is selected for this purpose; then the value 35 is copied into
the node with the value 40. After this, the right pointer of the parent
node (30) of the node used for replacement (35) is set to NULL and the
memory allocated to the node with value 35 is de-allocated. As a result,
the order of the tree is maintained after the deletion. The final
structure of the tree after deletion of node 40 will be as shown in
Figure 14.8.
                 66
              /      \
           35          90
         /    \          \
       30      50         120
      /       /  \
    20      45    55

Fig. 14.8 Binary Search Tree after Deletion

Algorithm 14.3 Deletion from Binary Search Tree


del_node(item, ptr)

1. If !(ptr)
       Print "Item does not exist." and go to step 3
2. If item < ptr->info
       Call del_node(item, &(ptr->left))
   Else
       If item > ptr->info
           Call del_node(item, &(ptr->right))
       Else
           If item = ptr->info
               Set save = ptr
               If save->right = NULL
                   Set ptr = save->left
                   Deallocate save
               Else
                   If save->left = NULL
                       Set ptr = save->right
                       Deallocate save
                   Else
                       Call del(&(save->left), save)
                   End If
               End If
           End If
       End If
   End If
3. End

del(p, q)   // q is the node to be deleted, p is the node whose value
            // is used for replacing the value in q and p is de-allocated

1. If p->right != NULL
       Call del(&(p->right), q)
   Else
       Set delnode = p
       Set q->info = p->info
       Set p = p->left
       Deallocate delnode
   End If
2. End
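
Algorithm 14.3 and its helper del can be sketched in C as follows. This is only an illustrative sketch under stated assumptions: the pointer-to-pointer parameters, the int key type and the integer return value of del_node are choices made here, and make_node is a hypothetical helper that exists only to build a small tree for trying the functions out. As in the algorithm, the two-child case uses the in-order predecessor.

```c
#include <stdio.h>
#include <stdlib.h>

struct node {
    int info;
    struct node *left;
    struct node *right;
};

/* Two-child case: walks to the rightmost node of the sub tree *p (the
   in-order predecessor of q), copies its value into q, then unlinks and
   frees that node. */
static void del(struct node **p, struct node *q)
{
    if ((*p)->right != NULL) {
        del(&(*p)->right, q);
    } else {
        struct node *delnode = *p;
        q->info = delnode->info;
        *p = delnode->left;      /* predecessor has at most a left child */
        free(delnode);
    }
}

/* Deletes item from the tree whose root pointer is *ptr.
   Returns 1 if the item was deleted and 0 if it does not exist. */
int del_node(int item, struct node **ptr)
{
    if (*ptr == NULL)
        return 0;
    if (item < (*ptr)->info)
        return del_node(item, &(*ptr)->left);
    if (item > (*ptr)->info)
        return del_node(item, &(*ptr)->right);

    struct node *save = *ptr;
    if (save->right == NULL) {          /* no child, or only a left child */
        *ptr = save->left;
        free(save);
    } else if (save->left == NULL) {    /* only a right child */
        *ptr = save->right;
        free(save);
    } else {                            /* two children */
        del(&save->left, save);
    }
    return 1;
}

/* Hypothetical helper: allocates a node with the given children. */
struct node *make_node(int item, struct node *left, struct node *right)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL) {
        fprintf(stderr, "out of memory\n");
        exit(EXIT_FAILURE);
    }
    n->info = item;
    n->left = left;
    n->right = right;
    return n;
}
```

Applied to the tree of Figure 14.5, deleting 75, then 110, and then 40 exercises cases 1, 2 and 3 in turn and leaves the tree of Figure 14.8.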

14.3 APPLICATIONS OF TREES

Trees have found applications in various fields of computer science. Some
of these applications, such as set representation, B trees, and B+ trees,
are discussed in this section.
Set Representation
A set is a collection of distinct elements of the same type. The elements
can be of any type, such as numbers, letters of the alphabet, names of
cities, etc. Such sets can be represented in the form of trees. For
example, consider three disjoint sets (sets are disjoint when they have no
common elements) S, P, and R of integers such that S = {2, 4, 6, 8},
P = {3} and R = {5, 9, 11}. These sets can be represented as trees (one
tree for each set) as shown in Figure 14.9.
       S            P            R

       2            3            5
     / | \                      / \
    4  6  8                    9   11

Fig. 14.9 Sets Represented as Trees

In this representation, one of the elements of a set becomes the root node
(also known as the parent node) and all other elements of the set become
its child nodes. Each child node has a pointer to its parent node (see
Figure 14.9). Since these sets are disjoint, the trees representing them
can be stored in a single array. The value at the ith position of the
array is the parent node of node i. A zero value in the array indicates
that the node is a root node. The array representation of the disjoint
sets S, P, and R is shown in Figure 14.10. Here, the zero value at the 2nd
position of the array indicates that node 2 is a root node, the parent
node of 4 is 2, the parent node of 9 is 5, and so on.
Index          1   2   3   4   5   6   7   8   9  10  11  12
Set S, P, R        0   0   2   0   2       2   5       5

Fig. 14.10 Disjoint Sets Stored in an Array

The union of two sets can be obtained by setting the pointer of the root
node of one set to the root node of the other set. For example, the union
of sets S and R (S U R) is obtained by setting the pointer of the root
node of set R (5) to point to the root node (2) of the set S. This can be
represented in the form of a tree as shown in Figure 14.11(a) and this
tree can be stored in the array as shown in Figure 14.11(b).

           S U R

             2
        /  |   |  \
       4   6   8   5
                  / \
                 9   11

(a) S U R

Index          1   2   3   4   5   6   7   8   9  10  11  12
Set S U R          0       2   2   2       2   5       5

(b) S U R stored in an array

Fig. 14.11 Representing S U R as Trees and its Storage in an Array

Another useful operation that can be performed on these sets is finding
the set to which a given element belongs. For example, suppose the set to
which the node 11 belongs is to be found [see Figure 14.11(b)]. First, the
value at the 11th index of the array is accessed to find the parent node
of 11. Here, the value is 5, so the 5th index of the array is accessed
next. This process is repeated until a 0 value is found. The value at the
5th index is 2, and the value at the 2nd index is 0. Here the search
terminates. The result of the search is the set having 2 as its root
node, to which the node 11 belongs.
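
The array representation and the two operations described above can be sketched in C as follows. The names parent, find_set, union_sets and load_example are hypothetical and chosen for illustration. One simplifying assumption is made: positions that belong to no set (1, 7, 10 and 12) also hold 0, so in this sketch they behave like single-element sets.

```c
#define MAX_ELEM 12

/* parent[i] is the parent of element i; 0 marks a root node.
   Index 0 is unused so that elements can be numbered from 1. */
static int parent[MAX_ELEM + 1];

/* Follows parent links from x until a root (parent 0) is reached and
   returns that root, which identifies the set containing x. */
int find_set(int x)
{
    while (parent[x] != 0)
        x = parent[x];
    return x;
}

/* Forms the union of the sets containing a and b by making the root of
   b's set point to the root of a's set. */
void union_sets(int a, int b)
{
    int ra = find_set(a);
    int rb = find_set(b);
    if (ra != rb)
        parent[rb] = ra;
}

/* Loads S = {2, 4, 6, 8}, P = {3} and R = {5, 9, 11} as in Figure 14.10. */
void load_example(void)
{
    for (int i = 0; i <= MAX_ELEM; i++)
        parent[i] = 0;
    parent[4] = parent[6] = parent[8] = 2;   /* children of root 2 */
    parent[9] = parent[11] = 5;              /* children of root 5 */
}
```

After load_example(), find_set(11) returns 5. Calling union_sets(2, 5) sets parent[5] to 2, after which find_set(11) follows 11, 5, 2 and returns 2, matching the search traced above.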
This type of tree representation of sets can be used for representing the
hierarchical structure of a client-server environment. Each node in such a
structure represents a client-server system, in which there are a number
of clients connected to its server.
B Trees
A B tree is a type of tree in which each node can store multiple values and can
point to multiple sub trees. A B tree of order m satisfies the following conditions
(see Figure 14.12):
1. The root node must have at least two sub trees and can have at the most m
sub trees.
2. Each node, except root and leaf node, must have at least Ceiling(m/2)
nonempty sub trees and can have at the most m sub trees.
3. All the elements stored in a node are in ascending order, that is,
V1 < V2 < V3 < ... < Vk
where Vi is the ith value stored in a node and k, the number of values in
the node, varies from 1 to m-1.
4. All the elements stored in the left sub tree of Vk are less than Vk, and all the
elements stored in the right sub tree of Vk are greater than Vk.

5. All the sub trees pointed by any node are also B trees of order m.
6. All the leaf nodes are at same level.
7. Each leaf node must have at least (Ceiling(m/2) – 1) number of
elements.

                                 [66]
                 /                                \
            [35 45]                            [80 100]
        /      |       \                 /        |        \
  [20 30]  [38 40]  [48 52 56]   [68 70 73 77] [85 90] [110 115]

Fig. 14.12 B Tree of Order 5

B trees are useful for very large data sets that cannot be accommodated in
the main memory of the computer and are stored on disks in the form of
files. Data is stored in the form of records in files. In B trees, these
records are represented by a key value and a record pointer. Whenever a
record is required to be accessed, first its corresponding key value is
searched in the B tree. After the key is found, the record pointer
associated with that key value is used for accessing the desired record
from the file stored on the disk. The main aim of B trees is to minimize
the number of disk accesses for accessing a record. Since each node of a
tree corresponds to a disk block, m represents the maximum number of
records that can be stored in a single disk block. Therefore, m should be
made as large as possible, so that as many records as possible can be
retrieved in a single disk access.
Note: Each internal node of a B tree of order 3 can have either 2 or 3 sub
trees, hence it is also known as a 2–3 tree.
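
Searching for a key in a B tree, as described above, can be sketched in C as follows. This is a minimal in-memory sketch under stated assumptions: the structure btree_node, its field names, and the function btree_search are hypothetical, the order is fixed at 5 to match Figure 14.12, and the record pointers and disk blocks mentioned in the text are left out.

```c
#include <stddef.h>

#define ORDER 5   /* m: maximum number of sub trees per node */

struct btree_node {
    int count;                         /* number of keys stored in the node */
    int keys[ORDER - 1];               /* keys in ascending order */
    struct btree_node *child[ORDER];   /* child[i] holds keys smaller than
                                          keys[i]; all NULL in a leaf node */
};

/* Searches for key starting at node. Returns the node that contains the
   key and stores its position in *pos, or returns NULL if it is absent. */
const struct btree_node *btree_search(const struct btree_node *node,
                                      int key, int *pos)
{
    while (node != NULL) {
        int i = 0;
        /* Skip the keys that are smaller than the one being searched. */
        while (i < node->count && key > node->keys[i])
            i++;
        if (i < node->count && key == node->keys[i]) {
            *pos = i;
            return node;
        }
        /* Descend into the sub tree lying between keys[i-1] and keys[i]. */
        node = node->child[i];
    }
    return NULL;
}
```

Each pass of the outer loop visits one node, so the number of nodes examined is at most the height of the tree, which is what keeps the number of disk accesses small when each node is one disk block.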

Insertion in B tree
In a B tree, all the insertions take place at the leaf nodes. To insert an element, first
the appropriate leaf node is found and then the element is inserted in that node.
Now, while inserting an element in the searched leaf node—one of the two cases
may arise. First, there may be a space in the leaf node to accommodate more
elements. In that case, element is simply added to the node in such a way that the
order of elements is maintained. Second, there may not be a space in the leaf node
to accommodate more elements. In that case, after inserting new element in the
full leaf node, a single middle element is selected from these elements and is shifted
to the parent node and the leaf is split into two nodes—left node and right node (at
the same level). All the elements less than the middle element are placed in the left
node and all the elements greater than the middle element are placed in the right
node. If there is no space for the middle element in the parent node, the splitting of
the parent node takes place using the same procedure.

The process of splitting may be repeated all the way to the root. In case
the splitting of the root node takes place, a new root node is created
that comprises the middle element from the old root node. The rest of the
elements of the old root node are distributed into two nodes that are
created as a result of splitting. The splitting of the root node increases
the height of the B tree by one.
For example, consider the following step-by-step procedure for inserting elements
in the B tree of order 4, that is, any node can store at the most 3 elements and can
point to at the most 4 sub trees. The elements to be inserted in the B tree are—66,
90, 40, 75, 30, 35, 80, 70, 20, 50, 45, 55, 110, 100, and 120.
1. The element 66 forms a part of the new root node, as the B tree is empty
initially [see Figure 14.13(a)].
2. Since each node of the B tree can store up to 3 elements, the elements 90
and 40 also become a part of the root node [see Figure 14.13(b)].
3. Now, since the root node is full, it is split into two nodes. The left node
stores 40, the right node stores 90, and the middle element 66 becomes the
new root node. Since, 75 is less than 90 and greater than 66, it is placed
before 90 in the right node [see Figure 14.13(c)].
4. The elements 30 and 35 are inserted in the left sub tree and the element 80
is inserted in the right sub tree such that the order of elements is maintained
[see Figure 14.13(d)].
5. The appropriate position for the element 70 is in the right sub tree, and
since there is no space for more elements, the splitting of this node takes
place. As a result, the middle element 80 is moved to the parent node, the
element 75 forms the part of the left sub tree (of element 80) and the element
90 forms the part of the right sub tree (of element 80). The new element 70
is placed before the element 75 [see Figure 14.13(e)].
6. The appropriate position for the element 20 is in the left most sub tree, and
since there is no space for more elements, the splitting of this node takes
place as discussed in the previous step. The new element 20 is placed
before the element 30 [see Figure 14.13(f)].
7. This tree can be used for future insertions, but a situation may arise when
any of the sub trees may split and it may then be required to adjust the
middle element from that sub tree to the root node where there is no space
for more elements. Hence, keeping in mind the future requirements, as soon
as the root node becomes full, the splitting of root node must take place
[see Figure 14.13(g)]. This splitting of root node increases the height of the
tree by one.
8. Similarly, other elements 50, 45, 55, 110, 100, and 120 can be inserted in
this B tree. The resultant B tree is shown in Figure 14.13(h).

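(a)  [66]

(b)  [40 66 90]

(c)  Before inserting 75:          After inserting 75:
            [66]                          [66]
           /    \                        /    \
        [40]    [90]                  [40]    [75 90]

(d)              [66]
                /    \
       [30 35 40]    [75 80 90]

(e)  Before inserting 70:          After inserting 70:
            [66 80]                       [66 80]
          /    |    \                   /    |     \
 [30 35 40]  [75]  [90]        [30 35 40] [70 75]  [90]

(f)  Before inserting 20:          After inserting 20:
          [35 66 80]                    [35 66 80]
        /   |     |   \               /    |     |    \
     [30] [40] [70 75] [90]     [20 30] [40] [70 75] [90]

(g)              [66]
               /      \
           [35]        [80]
          /    \      /    \
     [20 30]  [40] [70 75] [90]

(h)                    [66]
               /                \
          [35 45]              [80 100]
         /   |    \           /   |     \
   [20 30] [40] [50 55]  [70 75] [90] [110 120]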

Fig. 14.13 Insertion in B Tree

Deletion in B tree
Deletion of an element from a B tree involves two steps—first searching the desired
element and second deleting the element. Whenever an element is deleted from a
B tree, it must be ensured that no property of B tree is violated after the deletion.
The element to be deleted may belong to either leaf node or internal node. These
two cases have been discussed here.

                                 [66]
                 /                                \
            [35 45]                            [80 100]
        /      |       \                 /        |        \
  [20 30]  [38 40]  [48 52 56]   [68 70 73 77] [85 90] [110 115]

Fig. 14.14 B Tree of Order 5

• Element belongs to a leaf node: The element can be easily deleted from
the node if the number of elements stored in the node is greater than the
minimum required number of elements. However, if the number of elements
stored in the node is equal to the minimum required number of elements,
deleting the element causes underflow. To avoid this underflow, some
additional changes are required in the tree to bring the node back to the
minimum size. These changes depend on the number of elements in the left
sibling or right sibling of the node. The following cases may arise.
o The right sibling node has more elements than the minimum required
number of elements: In such a situation, the element of the parent node
that lies to the right of the searched node is shifted to the current
node, and the leftmost element of the right sibling node is shifted to the
parent node. For example, suppose the element 40 is to be deleted from the
B tree shown in Figure 14.14. The node having the element 40 contains only
two elements, hence the deletion of the element 40 will lead to underflow.
The right sibling node of this node has more than two elements. In this
case, the element 45 from the parent node (which is to the right of this
sub tree) is shifted to this node and the element 48 from the right
sibling node is shifted to the parent node (see Figure 14.15).
                                 [66]
                 /                                \
            [35 48]                            [80 100]
        /      |      \                  /        |        \
  [20 30]  [38 45]  [52 56]      [68 70 73 77] [85 90] [110 115]

Fig. 14.15 Deletion in B Tree

o The left sibling node has more elements than the minimum required
number of elements: In such a situation, the element of the parent node
that lies to the left of the searched node is shifted to the current node,
and the rightmost element of the left sibling node is shifted to the
parent node. For example, suppose the element 85 is to be deleted from the
B tree shown in Figure 14.15. The node that has the element 85 contains
only two elements, hence the deletion of the element 85 will lead to an
underflow. The left sibling node of this node has more than two elements.
In this case, the element 80 from the parent node (which is to the left of
this sub tree) is shifted to this node and the element 77 from the left
sibling node is shifted to the parent node (see Figure 14.16).

                                 [66]
                 /                                \
            [35 48]                            [77 100]
        /      |      \                  /        |       \
  [20 30]  [38 45]  [52 56]        [68 70 73]  [80 90]  [110 115]

Fig. 14.16 Deletion in B Tree

o Both the sibling nodes have the minimum number of elements: In such a
situation, a new node is formed containing the elements from the searched
node (the node containing the element to be deleted), the elements from
one of the sibling nodes, and the element from the parent node separating
these two nodes (searched node and sibling node). If this concatenation
leads to an underflow in the parent node, the process of concatenation is
repeated for that parent node. This process of concatenation may propagate
upwards until the root node is reached. For example, in the B tree shown
in Figure 14.16, the element 45 is to be deleted. In this case, after
deleting the element 45, the element 38 from the current node (having
elements 38 and 45) is merged with the elements 52, 56 (from the right
sibling) and 48 (from the parent node). This concatenation leads to
underflow in the parent node, hence further concatenation is required to
overcome this underflow (see Figure 14.17).

                        [35 66 77 100]
        /           |            |          |         \
  [20 30]  [38 48 52 56]   [68 70 73]   [80 90]   [110 115]

Fig. 14.17 Deletion in B Tree

• Element belongs to an internal node: When the element to be deleted
belongs to an internal node, it is replaced by its successor from the
right sub tree or its predecessor from the left sub tree, whichever has
more elements than the minimum required number of elements. If both the
sub trees contain exactly the minimum required number of elements, then
the sub trees are merged using the same process as discussed above.
B+ Trees
A B+ tree is a variation of the B tree. Unlike a B tree, it stores all the
key values and record pointers only in the leaf nodes, and key values are
duplicated in internal nodes to define the paths that are used for
searching. In addition, each leaf node contains a pointer to the next leaf
node, that is, the leaf nodes are linked sequentially (see Figure 14.18).
Hence, a B+ tree supports fast sequential access of records in addition to
the random access feature of a B tree. Note that in a B+ tree, if a key
corresponding to the desired record is found in an internal node, the
traversal continues until the respective leaf node is reached to access
the appropriate record pointer. A sample B+ tree is shown in Figure 14.18.
(Internal nodes hold duplicated key values)
40

30 40 77 90

20 30 38 40 48 52 56 68 70 73 77 85 90 110 115
(Only leaf nodes have record pointers along with the key values)

Fig. 14.18 A sample B+ Tree

14.4 HASHING TECHNIQUES

Hashing (also known as hash addressing) is generally applied to a file F
containing R records. Each record contains many fields, out of which one
particular field may uniquely identify the records in the file. Such a
field is known as the primary key (denoted by k). The values k1, k2, … in
the key field are known as keys or key values.
Whenever a key is to be inserted in the hash table, a hash function is applied
on it, which yields an index for the key. It is then inserted at that index in the hash
table. However, since there is a finite number of locations in the hash table and
virtually an infinite number of keys to be stored, it is quite possible that two distinct
keys hash to the same index in the hash table. The situation in which a key hashes
to an index which is already occupied by another key is called collision. It should
be resolved by finding some other location to insert the new key. This process of
finding another location is called collision resolution.
Although a variety of collision resolution techniques have been developed,
the possibility of collision should still be minimized. This makes the
study of hash functions essential, because the hash function is
responsible for specifying the location of a new key. Therefore, the topic
of hashing comprises two major sub-parts, namely, hash functions and
collision resolution.
Since the keys are inserted by applying a hash function to them, searching
for a key in the hash table is straightforward. The same hash function is
applied to the key to be searched, which yields the index at which it may
be found. Then the key at that location is compared with the desired key
and, if they match, the search is successful.


Hash Functions
A hash function h is simply a mathematical formula that maps the key to
some slot in the hash table T. Thus, we can say that the key k hashes to
slot h(k), or h(k) is the hash value of key k. If the size of the hash
table is N, then the index of the hash table ranges from 0 to N-1. A hash
table with N slots is denoted by T[N].
If the input keys are integers, then applying hash function on them is simple.
However, if the input keys are strings, then they are first converted into integers
before applying the hash function. For this, the numeric (ASCII) code associated
with characters can be used in converting character values into integers.
There are a number of hash functions available. A desirable hash function
is one that is easy to compute and ensures that two distinct values hash
to different locations in the hash table. It is quite simple to design a
hash function that is efficient and easy to compute, but no hash function
can guarantee the second condition in all cases. A good hash function
should nevertheless keep the number of collisions as low as possible. The
following are some of the commonly used hash functions.
Division-remainder method
The division-remainder is the simplest and most commonly used method. In this
method, the key k is divided by the number of slots N in the hash table, and the
remainder obtained after division is used as an index in the hash table. That is, the
hash function is
h(k) = k mod N
where, mod is the modulus operator. Different languages have different
operators for calculating the modulus. In C/C++, % operator is used for computing
the modulus.
Note that this function works well if the index ranges from 0 to N-1 (like
in C/C++). However, if the index ranges from 1 to N, the function will be
h(k) = (k mod N) + 1
This technique works very well if N is a prime number not too close to a
power of two. Moreover, since this technique requires only a single
division operation, it is quite fast.
For example, consider a hash table with N=101. The hash value of the key
value 132437 can be calculated as follows:
h(132437)=132437 mod 101 = 26
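
The division-remainder method maps directly onto the % operator of C. The two function names below are hypothetical, and the keys are assumed to be non-negative integers; this is a small sketch rather than a complete hashing routine.

```c
/* Division-remainder method for a table indexed from 0 to n-1.
   Assumes n > 0. */
unsigned long hash_division(unsigned long k, unsigned long n)
{
    return k % n;
}

/* Variant for a table indexed from 1 to n: h(k) = (k mod n) + 1. */
unsigned long hash_division_from_one(unsigned long k, unsigned long n)
{
    return k % n + 1;
}
```

With k = 132437 and n = 101, hash_division returns 26, which agrees with the value computed above.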
Multiplication method
In the multiplication method, first the key k is multiplied by a constant
C, where 0 < C < 1, and the fractional part of the product kC is
extracted. In the second step, this fractional part is multiplied by N,
and the floor of the result is taken as the hash value. The floor of a
value x, denoted by ⌊x⌋, is the largest integer less than or equal to x.
That is, the hash function is
h(k) = ⌊N (kC mod 1)⌋
where kC mod 1 represents the fractional part of kC, calculated as
kC - ⌊kC⌋.
Although the constant C can take any value between 0 and 1, the function
gives better performance for some values of C. In his study, Knuth has
suggested that the following value of C is likely to work reasonably well.
C = (√5 - 1) / 2 = 0.618033988749894848…
For example, consider a hash table with N=101 and C=0.6180339…. The hash
value of the key 132437 can be calculated as follows:

h(132437) = ⌊101 * ((132437 * 0.6180339…) mod 1)⌋
          = ⌊101 * (81850.5673680698… mod 1)⌋
          = ⌊101 * 0.5673680698…⌋
          = ⌊57.3041750498…⌋
          = 57
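
The multiplication method can be sketched in C using double-precision arithmetic. The function name hash_multiplication is hypothetical, and the constant is the value of C suggested by Knuth, written out to double precision. The sketch assumes the product kC is small enough for a double to retain its fractional part accurately, which holds for keys of the size used in this example but not for very large keys.

```c
/* Multiplication method: h(k) = floor(n * (k*C mod 1)). */
unsigned long hash_multiplication(unsigned long k, unsigned long n)
{
    const double C = 0.6180339887498949;   /* (sqrt(5) - 1) / 2 */
    double product = (double)k * C;

    /* kC mod 1: subtract the integer part to keep the fraction. */
    double fraction = product - (double)(unsigned long long)product;

    /* The value is non-negative, so truncation equals the floor. */
    return (unsigned long)((double)n * fraction);
}
```

For k = 132437 and n = 101, the function returns 57, as in the calculation above.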
Folded key method
The folded key method is also a two-step process. In the first step, the
key k is divided into several groups starting from the leftmost digits,
where each group contains n digits, except the last one, which may contain
fewer digits. In the next step, these groups are added together, and the
hash value is obtained by ignoring the last carry (if any).
For example, if the hash table has 100 slots, then each group will have two
digits; and the sum of the groups after ignoring the last carry will also be a 2-digit
number between 0 and 99. The hash value for the key value 132437 is computed
as follows:
1. The key value is first broken down into a group of 2-digit numbers
from the left most digits. Therefore, the groups are 13, 24, and 37.
2. These groups are then added like 13 + 24 + 37 = 74. The sum 74 is
now used as the hash value for the key value 132437.
Similarly, the hash value of another key value, say 6217569, can be calculated
as follows:
1. The key value is first broken down into a group of 2-digit numbers
from the left most digits. Therefore, the groups are 62, 17, 56, and 9.
2. These groups are then added, like, 62 + 17 + 56 + 9 = 144. The sum
44 after ignoring the last carry 1 is now used as the hash value for the
key value 6217569.
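
The folded key method can be sketched in C as shown below. The function name hash_fold and its parameter group_digits are hypothetical; the sketch assumes the table has 10^group_digits slots, so that ignoring the last carry amounts to taking the sum modulo 10^group_digits.

```c
#include <stdio.h>

/* Folded key method: splits the decimal digits of k into groups of
   group_digits digits starting from the left, adds the groups, and drops
   any carry beyond group_digits digits. */
unsigned long hash_fold(unsigned long k, int group_digits)
{
    char digits[32];
    int len = snprintf(digits, sizeof digits, "%lu", k);
    unsigned long modulus = 1;
    unsigned long sum = 0;

    for (int i = 0; i < group_digits; i++)
        modulus *= 10;

    for (int i = 0; i < len; i += group_digits) {
        unsigned long group = 0;
        /* The last group may contain fewer digits. */
        for (int j = i; j < i + group_digits && j < len; j++)
            group = group * 10 + (unsigned long)(digits[j] - '0');
        sum += group;
    }
    return sum % modulus;   /* ignore the last carry */
}
```

With group_digits set to 2, the key 132437 gives 13 + 24 + 37 = 74, and the key 6217569 gives 62 + 17 + 56 + 9 = 144, which becomes 44 once the carry is dropped.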
Midsquare method
This method also operates in two steps. First, the square of the key k
(that is, k^2) is calculated, and then some of the digits from the left
and right ends of k^2 are removed. The number obtained after removing the
digits is used as the hash value. Note that the digits at the same
positions of k^2 must be used for all the keys.
For example, consider a hash table with N=1000. The hash value of the
key value 132437 can be calculated as follows:
1. The square of the key value is calculated, which is, 17539558969.
2. The hash value is obtained by taking 5th, 6th and 7th digits counting
from right, which is, 955.
Similarly, the hash value of another key value, say 6217569, can be calculated
as follows:
1. The square of the key value is calculated, which is, 38658164269761
2. The hash value is obtained by taking 5th, 6th and 7th digits counting
from right, which is, 426.
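
The midsquare method, with the same digit positions as in the two examples above, can be sketched in C as follows. The function name hash_midsquare is hypothetical, and the choice of the 5th, 6th and 7th digits from the right is fixed to match the examples for N=1000. The sketch assumes that k^2 fits in an unsigned long long, which is true for keys below 2^32.

```c
/* Midsquare method for a table with 1000 slots: squares the key and
   takes the 5th, 6th and 7th digits of the square, counted from the
   right, as the hash value. */
unsigned long hash_midsquare(unsigned long long k)
{
    unsigned long long square = k * k;

    /* Dividing by 10^4 drops the four rightmost digits; taking the
       remainder modulo 10^3 keeps the next three digits. */
    return (unsigned long)((square / 10000ULL) % 1000ULL);
}
```

For the key 132437 the square is 17539558969 and the function returns 955; for the key 6217569 the square is 38658164269761 and the function returns 426.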

Collision Resolution Techniques


The main problem associated with most hashing functions is that they do not yield
distinct hash addresses for distinct keys, because the number of key values is
much larger than the number of available locations in the hash table. Due to this,
sometimes the problem called collision occurs. Since one cannot eliminate collisions
altogether, one needs some mechanisms to deal with them. There are several ways
for resolving collisions, the two most common techniques used are separate chaining
and open addressing.
Separate chaining
In this technique, a linked list of all the key values that hash to the same hash value
is maintained. Each node of the linked list contains a key value and the pointer to
the next node. Each index i (0<=i<N) in the hash table contains the address of the
first node of the linked list containing all the keys that hash to the index i. If there is
no key value that hashes to the index i, the slot contains NULL value. Therefore,
in this method, a slot in the hash table does not contain the actual key values;
rather it contains the address of the first node of the linked list containing the
elements that hash to this slot.
Consider the key values 20, 32, 41, 66, 72, 80, 105, 77, 56, and 53 that
need to be hashed using the simple hash function h(k) = k mod 10. The keys
20 and 80 hash to index 0, key 41 hashes to index 1, keys 32 and 72 hash
to index 2, key 53 hashes to index 3, key 105 hashes to index 5, keys 66
and 56 hash to index 6, and finally key 77 hashes to index 7. The
collisions are handled using the separate chaining (also known as synonyms
chaining) technique as shown in Figure 14.19.

Fig. 14.19 Collision resolution by Separate Chaining

Note that a new element can be inserted either at the beginning or at the end
of the list. Generally, the elements are inserted in the beginning of the list because
it is simpler to implement, and moreover, it frequently happens that the elements
which are added recently are the most likely to be accessed in the near future.
The main disadvantage of separate chaining is that it makes use of pointers,
which slows down the algorithm a bit because of the time required in allocating
and deallocating the memory. Moreover, maintaining another data structure (that
is, linked list) in addition to the hash table causes extra overheads.
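
Separate chaining with insertion at the beginning of each list can be sketched in C as follows. The names chain_node, table, chain_insert and chain_search are hypothetical, the table size is fixed at 10 to match the example with h(k) = k mod 10, and the keys are assumed to be non-negative integers.

```c
#include <stdlib.h>

#define N 10   /* number of slots in the hash table */

struct chain_node {
    int key;
    struct chain_node *next;
};

/* Each slot holds the address of the first node of its list, or NULL
   if no key hashes to that slot. */
static struct chain_node *table[N];

static int hash(int k)
{
    return k % N;
}

/* Inserts k at the beginning of the list for its slot.
   Returns 1 on success and 0 if memory could not be allocated. */
int chain_insert(int k)
{
    struct chain_node *n = malloc(sizeof *n);
    if (n == NULL)
        return 0;
    n->key = k;
    n->next = table[hash(k)];
    table[hash(k)] = n;
    return 1;
}

/* Returns the node holding k, or NULL if k is not in the table. */
const struct chain_node *chain_search(int k)
{
    for (const struct chain_node *p = table[hash(k)]; p != NULL; p = p->next)
        if (p->key == k)
            return p;
    return NULL;
}
```

Inserting the keys 20, 32, 41, 66, 72, 80, 105, 77, 56 and 53 in that order leaves, for example, the list 80, 20 in slot 0 and the list 56, 66 in slot 6, because each new key is placed in front of the keys already in its list.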
Open addressing
Unlike the separate chaining method, no separate data structure is used in open
addressing because all the key values are stored in the hash table itself. Since,
each slot in the hash table contains the key value rather than the address value, a
bigger hash table is required in this case as compared to separate chaining. Some
value is used to indicate an empty slot. For example, if it is known that all the keys
are positive values, then -1 can be used to represent a free or empty slot.
To insert a key value, first the slot in the hash table to which the key value
hashes, is determined, using any hash function. If the slot is free, the key value is
inserted into that slot. In case the slot is already occupied, then the subsequent
slots, starting from the occupied slot, are examined systematically in the forward
direction, until an empty slot is found. If no empty slot is found, then an overflow
condition occurs.
In case of searching, first the slot in the hash table to which the key value
hashes is determined, using any hash function. Then, the key value stored in that
slot is compared with the key value to be searched. If they match, the search
operation is successful; otherwise alternative slots are examined systematically in
the forward direction to find the slot containing the desired key value. If no such
slot is found, then the search is unsuccessful.
The process of examining the slots in the hash table to find the location
of a key value is known as probing. The types of probing used in the open
addressing method are linear probing, quadratic probing, and double
hashing.

Linear probing
Linear probing is the simplest approach for resolving collisions. It uses
the following hash function:
h(k, i) = [h’(k) + i] mod N
where,
h’(k) is any hash function (for simplicity we use k mod N)
i is the probe number ranging from 0 to N-1
To insert a key k in the hash table, first the slot T[h’(k)] is probed. If this slot
is empty, the key is inserted into the slot. Otherwise, the slots T[h’(k)+1],
T[h’(k)+2], T[h’(k)+3], and so on (up to T[N-1]) are probed sequentially
until an empty slot is found. If no empty slot is found up to T[N-1], we wrap
around to slots T[0], T[1], T[2], and so on until an empty slot is found or we
finally reach the slot T[h’(k)-1]. The main advantage of linear probing is that
as long as the hash table is not full, a free slot can always be found, however, the
time taken to find an empty slot can be quite large.
To understand linear probing, consider the insertion of the following keys
into the hash table with N=10.
126, 75, 37, 56, 29, 154, 10, 99
Further consider that the basic hash function is h’(k)=k mod N.
Step 1: The key value 126 hashes to the slot 6 as follows:
h(126, 0) = (126 mod 10 + 0) mod 10 = (6 + 0) mod 10 = 6
mod 10 = 6
Since slot 6 is empty, it is inserted into this slot.
0 1 2 3 4 5 6 7 8 9
126

Step 2: Next, the key value 75 hashes to the slot 5 as follows:


h(75, 0) = (75 mod 10 + 0) mod 10 = (5 + 0) mod 10 = 5 mod
10 = 5
Since slot 5 is empty, it is inserted into this slot.
0 1 2 3 4 5 6 7 8 9
75 126
Step 3: Next, the key value 37 hashes to the slot 7 as follows:
h(37, 0) = (37 mod 10 + 0) mod 10 = (7 + 0) mod 10 = 7 mod
10 = 7
Since slot 7 is empty, it is inserted into this slot.
0 1 2 3 4 5 6 7 8 9
75 126 37

Self-Instructional
Material 295
Step 4: Now, the key value 56 hashes to the slot 6 as follows:
h(56, 0) = (56 mod 10 + 0) mod 10 = (6 + 0) mod 10 = 6 mod 10 = 6
Since slot 6 is not empty, the next probe is computed as follows:
h(56, 1) = (56 mod 10 + 1) mod 10 = (6 + 1) mod 10 = 7 mod 10 = 7
Slot 7 is also not empty, so the next probe is computed as follows:
h(56, 2) = (56 mod 10 + 2) mod 10 = (6 + 2) mod 10 = 8 mod 10 = 8
Since slot 8 is empty, 56 is inserted into this slot.
Slot  0    1    2    3    4    5    6    7    8    9
Key   -    -    -    -    -    75   126  37   56   -
Step 5: Next, the key value 29 hashes to the slot 9 as follows:
h(29, 0) = (29 mod 10 + 0) mod 10 = (9 + 0) mod 10 = 9 mod 10 = 9
Since slot 9 is empty, 29 is inserted into this slot.
Slot  0    1    2    3    4    5    6    7    8    9
Key   -    -    -    -    -    75   126  37   56   29
Step 6: Now, the key value 154 hashes to the slot 4 as follows:
h(154, 0) = (154 mod 10 + 0) mod 10 = (4 + 0) mod 10 = 4 mod 10 = 4
Since slot 4 is empty, 154 is inserted into this slot.
Slot  0    1    2    3    4    5    6    7    8    9
Key   -    -    -    -    154  75   126  37   56   29
Step 7: Now, the key value 10 hashes to the slot 0 as follows:
h(10, 0) = (10 mod 10 + 0) mod 10 = (0 + 0) mod 10 = 0 mod 10 = 0
Since slot 0 is empty, 10 is inserted into this slot.
Slot  0    1    2    3    4    5    6    7    8    9
Key   10   -    -    -    154  75   126  37   56   29
Step 8: Now, the key value 99 hashes to the slot 9 as follows:
h(99, 0) = (99 mod 10 + 0) mod 10 = (9 + 0) mod 10 = 9 mod 10 = 9
Since slot 9 is not empty, the next probe is computed as follows:
h(99, 1) = (99 mod 10 + 1) mod 10 = (9 + 1) mod 10 = 10 mod 10 = 0
Slot 0 is also not empty, so the next probe is computed as follows:
h(99, 2) = (99 mod 10 + 2) mod 10 = (9 + 2) mod 10 = 11 mod 10 = 1
Since slot 1 is empty, 99 is inserted into this slot.
Slot  0    1    2    3    4    5    6    7    8    9
Key   10   99   -    -    154  75   126  37   56   29
In case of searching also, the same process is followed. The only difference
is that instead of finding an empty slot to store a given key value, you find the slot
containing the desired key value. The number of probes required in both the cases
(insertion and searching) is not more than the number of slots in the hash table.
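Searching with linear probing might be sketched in C as shown below. The names lp_search and EMPTY are assumptions of this sketch. It also assumes that keys are never deleted from the table, which is what allows the search to stop as soon as it meets an empty slot; the text itself only states that at most N slots are probed.

```c
#define N 10         /* table size assumed for this sketch */
#define EMPTY (-1)   /* assumed marker for a free slot; keys must be >= 0 */

/* Search for k using the same probe sequence as insertion.
   Returns the slot holding k, or -1 if k is not in the table. */
int lp_search(const int T[N], int k)
{
    for (int i = 0; i < N; i++) {
        int slot = (k % N + i) % N;
        if (T[slot] == k)
            return slot;        /* found the desired key */
        if (T[slot] == EMPTY)
            return -1;          /* k would have been stored here (no deletions assumed) */
    }
    return -1;  /* all N slots examined without finding k */
}
```

With the final table of the example above, searching for 99 probes slots 9, 0 and 1 and succeeds at slot 1, exactly the sequence followed when 99 was inserted.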
Linear probing is easy to implement, but it has a disadvantage: even if the
hash table is relatively empty, blocks (clusters) of occupied slots start forming.
This problem is known as primary clustering, in which many such blocks are
separated by free slots. For example, in the previous example, the slots 0 and 1
form one cluster of occupied slots, and slots 4 to 9 form another cluster of occupied
slots. These two clusters are separated by free slots 2 and 3.
Once the clusters are formed, there are more chances that subsequent
insertions will also end up in one of the clusters. This further increases the size of the
cluster, thereby increasing the number of probes required to find a free slot. The
performance gets worse as you insert more and more values in the table. To avoid
this problem, two techniques, namely, quadratic probing and double hashing, are
used.
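The cost of primary clustering can be made concrete by counting how many probes a new key needs before it reaches a free slot. The following C sketch is only an illustration; the function name lp_probes_to_free and the sentinel EMPTY are assumptions, and the table is expected to be in the final state of the linear probing example (keys 10 and 99 in slots 0 and 1, and keys 154, 75, 126, 37, 56, 29 in slots 4 to 9).

```c
#define N 10         /* table size assumed for this sketch */
#define EMPTY (-1)   /* assumed marker for a free slot */

/* Count the probes linear probing makes, beginning at slot `start`,
   up to and including the first free slot.
   Returns 0 if the table has no free slot. */
int lp_probes_to_free(const int T[N], int start)
{
    for (int i = 0; i < N; i++) {
        if (T[(start + i) % N] == EMPTY)
            return i + 1;
    }
    return 0;
}
```

For that table, a key whose initial slot is 2 or 3 is placed after a single probe, whereas a key whose initial slot is 4 has to pass through slots 4 to 9 and then slots 0 and 1 before it reaches the free slot 2, which takes nine probes.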
Quadratic probing
In quadratic probing, the increment added on a collision is a quadratic function of
i instead of a linear one as in linear probing. That is, it uses the following hash function:
h(k, i) = [h’(k) + i^2] mod N
where,
h’(k) is any hash function (for simplicity we use k mod N)
i is the probe number ranging from 0 to N-1
To insert a key k in the hash table, first the slot T[h’(k)] is probed. If this
slot is empty, the key is inserted into the slot. Otherwise, the slots at indexes
(h’(k)+i^2) mod N are probed. That is, the indexes h’(k)+1, h’(k)+4, h’(k)+9,
and so on (each taken mod N) are considered until an empty slot is found.
Quadratic probing can also guarantee a successful insertion of a key as long as
the hash table is at most half full and the size of the table is a prime number.
The same probe sequences are followed to search for a desired key value in the
hash table.
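A possible C sketch of insertion with quadratic probing is given below. It is written under assumptions rather than taken from the text: the prime table size N = 11, the sentinel EMPTY (-1, so keys must be non-negative) and the names qp_init, qp_hash and qp_insert are chosen only for illustration.

```c
#define N 11         /* prime table size assumed for this sketch */
#define EMPTY (-1)   /* assumed marker for a free slot; keys must be >= 0 */

/* Mark every slot of the table as free. */
void qp_init(int T[N])
{
    for (int i = 0; i < N; i++)
        T[i] = EMPTY;
}

/* h(k, i) = [h'(k) + i^2] mod N, with h'(k) = k mod N */
int qp_hash(int k, int i)
{
    return (k % N + i * i) % N;
}

/* Insert k using quadratic probing, trying probe numbers 0 to N-1.
   Returns the slot used, or -1 if no free slot lies on the probe sequence. */
int qp_insert(int T[N], int k)
{
    for (int i = 0; i < N; i++) {
        int slot = qp_hash(k, i);
        if (T[slot] == EMPTY) {
            T[slot] = k;
            return slot;
        }
    }
    return -1;  /* the probe sequence never reached a free slot */
}
```

Unlike linear probing, the quadratic probe sequence does not visit every slot, so qp_insert can return -1 even when free slots remain. This is why the guarantee of a successful insertion is stated only for a table that is at most half full.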
To understand quadratic probing, consider the insertion of the following
keys into the hash table with N=11.
126, 75, 37, 56, 29, 154, 10, 99
Further consider that the basic hash function is h’(k)=k mod N.
Step 1: The key value 126 hashes to the slot 5 as follows:
h(126, 0) = (126 mod 11 + 0^2) mod 11 = (5 + 0) mod 11 = 5
Since slot 5 is empty, 126 is inserted into this slot.
Slot  0    1    2    3    4    5    6    7    8    9    10
Key   -    -    -    -    -    126  -    -    -    -    -
Step 2: Next, the key value 75 hashes to the slot 9 as follows:
h(75, 0) = (75 mod 11 + 0^2) mod 11 = (9 + 0) mod 11 = 9
Since slot 9 is empty, 75 is inserted into this slot.
Slot  0    1    2    3    4    5    6    7    8    9    10
Key   -    -    -    -    -    126  -    -    -    75   -
Step 3: Next, the key value 37 hashes to the slot 4 as follows:
h(37, 0) = (37 mod 11 + 0^2) mod 11 = (4 + 0) mod 11 = 4
Since slot 4 is empty, 37 is inserted into this slot.
Slot  0    1    2    3    4    5    6    7    8    9    10
Key   -    -    -    -    37   126  -    -    -    75   -
Step 4: Now, the key value 56 hashes to the slot 1 as follows:
h(56, 0) = (56 mod 11 + 0^2) mod 11 = (1 + 0) mod 11 = 1
Since slot 1 is empty, 56 is inserted into this slot.
Slot  0    1    2    3    4    5    6    7    8    9    10
Key   -    56   -    -    37   126  -    -    -    75   -
Step 5: Next, the key value 29 hashes to the slot 7 as follows:
h(29, 0) = (29 mod 11 + 0^2) mod 11 = (7 + 0) mod 11 = 7
Since slot 7 is empty, 29 is inserted into this slot.
Slot  0    1    2    3    4    5    6    7    8    9    10
Key   -    56   -    -    37   126  -    29   -    75   -
Step 6: Now, the key value 154 hashes to the slot 0 as follows:
h(154, 0) = (154 mod 11 + 0^2) mod 11 = (0 + 0) mod 11 = 0
Since slot 0 is empty, 154 is inserted into this slot.
Slot  0    1    2    3    4    5    6    7    8    9    10
Key   154  56   -    -    37   126  -    29   -    75   -
Step 7: Now, the key value 10 hashes to the slot 10 as follows:
h(10, 0) = (10 mod 11 + 0^2) mod 11 = (10 + 0) mod 11 = 10
Since slot 10 is empty, 10 is inserted into this slot.
Slot  0    1    2    3    4    5    6    7    8    9    10
Key   154  56   -    -    37   126  -    29   -    75   10
Step 8: Now, the key value 99 hashes to the slot 0 as follows:
h(99, 0) = (99 mod 11 + 0^2) mod 11 = (0 + 0) mod 11 = 0
Since slot 0 is not empty, the next probe is computed as follows:
h(99, 1) = (99 mod 11 + 1^2) mod 11 = (0 + 1) mod 11 = 1
Slot 1 is also not empty, so the next probe is computed as follows:
h(99, 2) = (99 mod 11 + 2^2) mod 11 = (0 + 4) mod 11 = 4
Slot 4 is also not empty, so the next probe is computed as follows:
h(99, 3) = (99 mod 11 + 3^2) mod 11 = (0 + 9) mod 11 = 9
Slot 9 is also not empty, so the next probe is computed as follows:
h(99, 4) = (99 mod 11 + 4^2) mod 11 = (0 + 16) mod 11 = 5
Slot 5 is also not empty, so the next probe is computed as follows:
h(99, 5) = (99 mod 11 + 5^2) mod 11 = (0 + 25) mod 11 = 3
Since slot 3 is empty, 99 is inserted into this slot.
Slot  0    1    2    3    4    5    6    7    8    9    10
Key   154  56   -    99   37   126  -    29   -    75   10
Though quadratic probing eliminates primary clustering, it sometimes results
in a milder form of clustering known as secondary clustering, where the key values
that initially hash to the same position will probe the same sequence of slots. For
example, consider a key value 88 that is to be inserted into the hash table. Initially,
it hashes to slot 0 (the same slot as the key 99). Therefore, it will follow the same probe
sequence, that is, 1, 4, 9, 5, 3.
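Secondary clustering can be observed by listing the slots that quadratic probing visits for a given key. The helper below is a hypothetical one written for this illustration (the name qp_probe_sequence and its parameters are assumptions); it uses N = 11 and h’(k) = k mod N as in the example.

```c
#define N 11   /* table size used in the quadratic probing example */

/* Store in seq[0..count-1] the first `count` slots probed for key k,
   that is, h(k, i) = (k mod N + i^2) mod N for i = 0, 1, ..., count-1. */
void qp_probe_sequence(int k, int count, int seq[])
{
    for (int i = 0; i < count; i++)
        seq[i] = (k % N + i * i) % N;
}
```

For the keys 99 and 88, which both hash to slot 0, the first six slots come out as 0, 1, 4, 9, 5, 3 in both cases. A key such as 12, which initially hashes to slot 1, follows a different sequence (1, 2, 5, 10, 6, 4).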
Double hashing
As you have seen, both linear and quadratic probing add increments to the
initial hash value h’(k) to define a probe sequence. Linear probing adds i, and
quadratic probing adds i^2 to the initial hash value to find an alternative slot. Both
these increments are independent of the key k. The double hashing method, on
the other hand, uses a different hash function h’’(k) to compute these increments.
Therefore, the increments are dependent on the key. Double hashing uses the
following hash function:
h(k, i) = [h’(k) + i*h’’(k)] mod N
where,
h’(k) is any hash function (for simplicity we use k mod N)
h’’(k) is another hash function (for simplicity we use k mod N’
where N’ is slightly less than N (say N-1 or N-2))
i is the probe number ranging from 0 to N-1
Initially, when a key k is to be inserted into the hash table, the first slot
probed is T[h’(k)]. If this slot is empty, the key is inserted into the slot. Otherwise,
alternative slots are searched using another independent hash function (hence the
name double hashing). In case of searching also, the same process is followed
until the desired key value is found, or all the key values in the table are examined.
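Insertion with double hashing could be sketched in C as follows. The values N = 13 and N’ = N-2 = 11, the sentinel EMPTY and the names dh_init, dh_hash and dh_insert are assumptions made for this sketch. One caveat that the formula leaves open: when h’’(k) evaluates to 0, the increment is zero and every probe lands on the same slot, so practical implementations often use a second hash function that can never be zero, such as 1 + (k mod N’). The sketch keeps the plain k mod N’ form given in the text.

```c
#define N 13         /* table size assumed for this sketch */
#define N2 11        /* N' = N - 2, modulus of the second hash function */
#define EMPTY (-1)   /* assumed marker for a free slot; keys must be >= 0 */

/* Mark every slot of the table as free. */
void dh_init(int T[N])
{
    for (int i = 0; i < N; i++)
        T[i] = EMPTY;
}

/* h(k, i) = [h'(k) + i*h''(k)] mod N,
   with h'(k) = k mod N and h''(k) = k mod N' */
int dh_hash(int k, int i)
{
    return (k % N + i * (k % N2)) % N;
}

/* Insert k using double hashing, trying probe numbers 0 to N-1.
   Returns the slot used, or -1 if no free slot was reached. */
int dh_insert(int T[N], int k)
{
    for (int i = 0; i < N; i++) {
        int slot = dh_hash(k, i);
        if (T[slot] == EMPTY) {
            T[slot] = k;
            return slot;
        }
    }
    return -1;  /* no free slot on this key's probe sequence */
}
```

Because the step size k mod N’ differs from key to key, two keys that collide on their first probe usually separate on the second one.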
To understand double hashing, consider the insertion of the following keys
into the hash table with N=13.
126, 75, 37, 56, 29, 152, 35, 99
Further, consider that the basic hash function is h’(k)=k mod N and
h’’(k)=k mod (N-2).
Step 1: The key value 126 hashes to the slot 9 as follows:
h(126, 0) = (126 mod 13 + 0*(126 mod 11)) mod 13 = (9 + 0) mod 13 = 9
Since slot 9 is empty, 126 is inserted into this slot.
Slot  0    1    2    3    4    5    6    7    8    9    10   11   12
Key   -    -    -    -    -    -    -    -    -    126  -    -    -
Step 2: Next, the key value 75 hashes to the slot 10 as follows:
h(75, 0) = (75 mod 13 + 0*(75 mod 11)) mod 13 = (10 + 0) mod 13 = 10
Since slot 10 is empty, 75 is inserted into this slot.
Slot  0    1    2    3    4    5    6    7    8    9    10   11   12
Key   -    -    -    -    -    -    -    -    -    126  75   -    -
Step 3: Next, the key value 37 hashes to the slot 11 as follows:
h(37, 0) = (37 mod 13 + 0*(37 mod 11)) mod 13 = (11 + 0) mod 13 = 11
Since slot 11 is empty, 37 is inserted into this slot.
Slot  0    1    2    3    4    5    6    7    8    9    10   11   12
Key   -    -    -    -    -    -    -    -    -    126  75   37   -
Step 4: Now, the key value 56 hashes to the slot 4 as follows:
h(56, 0) = (56 mod 13 + 0*(56 mod 11)) mod 13 = (4 + 0) mod 13 = 4
Since slot 4 is empty, 56 is inserted into this slot.
Slot  0    1    2    3    4    5    6    7    8    9    10   11   12
Key   -    -    -    -    56   -    -    -    -    126  75   37   -
Step 5: Next, the key value 29 hashes to the slot 3 as follows:
h(29, 0) = (29 mod 13 + 0*(29 mod 11)) mod 13 = (3 + 0) mod 13 = 3
Since slot 3 is empty, 29 is inserted into this slot.
Slot  0    1    2    3    4    5    6    7    8    9    10   11   12
Key   -    -    -    29   56   -    -    -    -    126  75   37   -
Step 6: Now, the key value 152 hashes to the slot 9 as follows:
h(152, 0) = (152 mod 13 + 0*(152 mod 11)) mod 13 = (9 + 0) mod 13 = 9
Since slot 9 is not empty, the next probe is computed as follows:
h(152, 1) = (152 mod 13 + 1*(152 mod 11)) mod 13 = (9 + 9) mod 13 = 5
Since slot 5 is empty, 152 is inserted into this slot.
Slot  0    1    2    3    4    5    6    7    8    9    10   11   12
Key   -    -    -    29   56   152  -    -    -    126  75   37   -
Step 7: Now, the key value 35 hashes to the slot 9 as follows:
h(35, 0) = (35 mod 13 + 0*(35 mod 11)) mod 13 = (9 + 0) mod 13 = 9
Since slot 9 is not empty, the next probe is computed as follows:
h(35, 1) = (35 mod 13 + 1*(35 mod 11)) mod 13 = (9 + 2) mod 13 = 11
Slot 11 is also not empty, so the next probe is computed as follows:
h(35, 2) = (35 mod 13 + 2*(35 mod 11)) mod 13 = (9 + 4) mod 13 = 0
Since slot 0 is empty, 35 is inserted into this slot.
Slot  0    1    2    3    4    5    6    7    8    9    10   11   12
Key   35   -    -    29   56   152  -    -    -    126  75   37   -
Step 8: Now, the key value 99 hashes to the slot 8 as follows:
h(99, 0) = (99 mod 13 + 0*(99 mod 11)) mod 13 = (8 + 0) mod 13 = 8
Since slot 8 is empty, 99 is inserted into this slot.
Slot  0    1    2    3    4    5    6    7    8    9    10   11   12
Key   35   -    -    29   56   152  -    -    99   126  75   37   -
Since the increment in double hashing depends on the value of key k, the
values that hash to the same initial slot may have different probe sequences. Thus,
double hashing almost eliminates the problem of primary and secondary clustering
and its performance is very close to the ideal hashing. For example, the key value
35 initially hashes to slot 9 (the same slot as the key 152). However, the next slot
probed for 35 is 11 (not 5 as in the case of 152).
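The difference between the probe sequences of 152 and 35 can be checked with a small helper. This is an illustrative sketch only: the function name dh_probe_sequence is an assumption, and the constants N = 13 and N’ = 11 are those of the example above.

```c
#define N 13    /* table size used in the double hashing example */
#define N2 11   /* N' = N - 2, modulus of the second hash function */

/* Store in seq[0..count-1] the first `count` slots probed for key k,
   that is, h(k, i) = (k mod N + i*(k mod N')) mod N. */
void dh_probe_sequence(int k, int count, int seq[])
{
    for (int i = 0; i < count; i++)
        seq[i] = (k % N + i * (k % N2)) % N;
}
```

Both keys start at slot 9, but 152 advances in steps of 9 (slots 9, 5, 1, 10) while 35 advances in steps of 2 (slots 9, 11, 0, 2), so after the first collision the two keys no longer compete for the same slots.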

Check Your Progress


1. What is hashing also known as?
2. What is a hash function?
3. What is the main problem associated with most hashing functions?

14.5 ANSWERS TO CHECK YOUR PROGRESS
QUESTIONS

1. Hashing is also known as hash addressing.
2. A hash function h is simply a mathematical formula that maps the key to
some slot in the hash table T.
3. The main problem associated with most hashing functions is that they do
not yield distinct hash addresses for distinct keys.

14.6 SUMMARY

• Insertion in a binary search tree is similar to the procedure for searching an
element in a binary search tree.
• The process of inserting a node in a binary search tree can be divided into
two steps: in the first step, the tree is searched to determine the appropriate
position where the node is to be inserted and in the second step, the node
is inserted at this searched position.
• There are two cases of insertion in a tree: first, insertion into an empty tree
and second, insertion into a nonempty tree.
• Deletion of a node from a binary search tree involves two steps: first,
searching the desired node and second, deleting the node.
• If the node to be deleted has two child nodes, it is deleted by replacing its
value by the largest value in the left sub tree (in-order predecessor) or by the smallest
value in the right sub tree (in-order successor).
• Hashing (also known as hash addressing) is generally applied to a file F
containing R records.
• Whenever a key is to be inserted in the hash table, a hash function is applied
on it, which yields an index for the key.
• Since the keys are inserted by applying hash functions on them, searching for
a key in the hash table is straightforward.
• A hash function h is simply a mathematical formula that maps the key to
some slot in the hash table T.
• There are a number of hash functions available; however, the desirable one is
easy to compute and ensures that two distinct values hash to different locations
in the hash table.
• The main problem associated with most hashing functions is that they do
not yield distinct hash addresses for distinct keys, because the number of
key values is much larger than the number of available locations in the hash
table.
• There are several ways of resolving collisions; the two most common
techniques are separate chaining and open addressing.
• In separate chaining, a linked list of all the key values that hash to the same hash
value is maintained.
• The main disadvantage of separate chaining is that it makes use of pointers,
which slows down the algorithm a bit because of the time required in allocating
and deallocating the memory.
• Unlike the separate chaining method, no separate data structure is used in
open addressing because all the key values are stored in the hash table
itself.

14.7 KEY WORDS

• Hash table: It is a data structure that is used to store key/value pairs. It
uses a hash function to compute an index into an array in which an element
will be inserted or searched.
• Division-remainder method: It is the simplest and most commonly used
method. In this method, the key k is divided by the number of slots N in the
hash table, and the remainder obtained after division is used as an index in
the hash table.

14.8 SELF ASSESSMENT QUESTIONS AND
EXERCISES

Short-Answer Questions
1. What do you mean by division remainder method?
2. List a few hashing techniques.
3. Draw a diagram of deletion of a node having two child nodes.
4. Write a short note about insertion and deletion operations.
Long-Answer Questions
1. “Insertion in a binary search tree is similar to the procedure for searching an
element in a binary search tree.” Explain in detail.
2. What are the various cases of insertion in a binary search tree? Explain.

3. What do you mean by deleting a node?
4. Explain the Division-remainder method in detail.

14.9 FURTHER READINGS

Gottfried, Byron S. 1996. Programming With C, Schaum’s Outline Series. New
York: McGraw-Hill.
Jeyapoovan, T. 2006. Computer Programming - Theory & Practice. New Delhi:
Vikas Publishing.
Khurana, Rohit. 2005. Object Oriented Programming. New Delhi: Vikas
Publishing.
Kanetkar, Yashwant. 2003. Let Us C. New Delhi: BPB Publication.
Saxena, Sanjay. 2003. A First Course In Computers. New Delhi: Vikas
Publishing.
Subburaj, R. 2000. Programming In C. New Delhi: Vikas Publishing.
