DC Lab Manual 2024-25
LABORATORY MANUAL
Final Year Semester-VII
Odd Semester
Institutional Vision, Mission and Quality Policy
Our Vision
To foster and permeate higher and quality education with value-added engineering and technology programs, providing all facilities in terms of technology and platforms for all-round development with societal awareness, and to nurture the youth with international competencies and an exemplary level of employability even in a highly competitive environment, so that they are innovative, adaptable, and capable of handling problems faced by our country and the world at large.
RAIT’s firm belief in a new form of engineering education that lays equal stress on academics and leadership-building extracurricular skills has been a major contributor to the success of RAIT as one of the most reputed institutions of higher learning. The challenges faced by our country and the world in the 21st century need a whole new range of thought and action leaders, which a conventional educational system in engineering disciplines is ill equipped to produce. Our reputation for providing good engineering education with additional life skills ensures that high-grade and highly motivated students join us. Our laboratories and practical sessions reflect the latest practices followed in industry. The project work and summer projects make our students adept at handling real-life problems and industry ready. Our students are well placed in industry, and their performance makes reputed companies visit us with renewed demand and vigour.
Our Mission
The Institution is committed to mobilizing resources and equipping itself with men and materials of excellence, thereby ensuring that the Institution becomes a pivotal center of service to industry, academia, and society with the latest technology. RAIT engages different platforms such as technology-enhancing Student Technical Societies, Cultural platforms, Sports excellence centers, an Entrepreneurial Development Center and a Societal Interaction Cell. To develop the college into an autonomous Institution and deemed university at the earliest, with facilities for advanced research and development programs on par with international standards. To invite international and reputed national Institutions and Universities to collaborate with our institution on issues of common interest in teaching and learning sophistication.
RAIT’s Mission is to produce engineering and technology professionals who are innovative and
inspiring thought leaders, adept at solving problems faced by our nation and world by providing
quality education.
The Institute is working closely with all stakeholders, such as industry and academia, to foster knowledge generation, acquisition and dissemination using the best available resources to address the great challenges being faced by our country and the world. RAIT is fully dedicated to providing its students with skills that make them leaders and solution providers who are industry ready when they graduate from the Institution.
We at RAIT assure our main stakeholders, the students, 100% quality for the programmes we deliver.
This quality assurance stems from the teaching and learning processes we have at work on our campus and the teachers, who are handpicked from reputed institutions such as IIT/NIT/MU and who inspire the students to be innovative in thinking and practical in approach. We have installed internal procedures to improve the skill sets of instructors by sending them to training courses, workshops, seminars and conferences. We also have a full-fledged course curriculum with deliveries planned in advance for a structured semester-long programme. We have a well-developed feedback system from employers, alumni, students and parents to fine-tune the learning and teaching processes. These tools help us ensure the same quality of teaching independent of any individual instructor. Each classroom is equipped with Internet access and other digital learning resources.
The effective learning process on the campus comprises a clean and stimulating classroom environment and the availability of lecture notes and digital resources prepared by the instructor, accessible from the comfort of home. In addition, each student is provided with a good number of assignments that trigger the thinking process. The testing process involves an objective test paper that gauges the students' understanding of concepts. The quality assurance process also ensures that the learning process is effective. Summer internships and project-based training ensure that the learning process includes practical and industry-relevant aspects. Various technical events, seminars and conferences make student learning complete.
Our Quality Policy
It is our earnest endeavour to produce high quality engineering professionals who are
innovative and inspiring, thought and action leaders, competent to solve problems
faced by society, nation and world at large by striving towards very high standards in
learning, teaching and training methodologies.
Departmental Vision, Mission
Vision
To impart higher and quality education in computer science with value-added engineering and technology programs to prepare technically sound, ethically strong engineers with social awareness. To extend the facilities to meet the fast-changing requirements and nurture the youth with international competencies and an exemplary level of employability and research under highly competitive environments.
Mission
To mobilize the resources and equip the institution with men and materials of excellence to provide knowledge and develop technologies in the thrust areas of Computer Science and Engineering. To provide diverse platforms of sports, technical, co-curricular and extracurricular activities for the overall development of students with an ethical attitude. To prepare the students to sustain the impact of computer education for social needs encompassing industry, educational institutions and public service. To collaborate with IITs, reputed universities and industries for the technical and overall upliftment of students for continued learning and entrepreneurship.
Departmental Program Educational Objectives
(PEOs)
3. Broad Base
To provide broad education necessary to understand the science of computer engineering and
the impact of it in a global and social context.
4. Techno-leader
To provide exposure to emerging cutting edge technologies, adequate training &
opportunities to work as teams on multidisciplinary projects with effective communication
skills and leadership qualities.
5. Practice citizenship
To provide knowledge of professional and ethical responsibility and to contribute to society
through active engagement with professional societies, schools, civic organizations or other
community activities.
Departmental Program Outcomes (POs)
PO10. Communication: Communicate effectively on complex engineering activities with the engineering community and with society at large, such as being able to comprehend and write effective reports and design documentation, make effective presentations, and give and receive clear instructions.
PO11. Project management and finance: Demonstrate knowledge and understanding of
the engineering and management principles and apply these to one’s own work, as a
member and leader in a team, to manage projects and in multidisciplinary
environments.
PO12. Life-long learning: Recognize the need for, and have the preparation and ability to
engage in independent and life-long learning in the broadest context of technological
change.
PSO2: To build an appreciation and knowledge of current computer techniques, with an ability to use the skills and tools necessary for computing practice.
PSO3: To be able to match the industry requirements in the area of computer science and engineering, and to be equipped with skills to adopt and imbibe new technologies.
Index
Sr. No. Contents Page No.
1. Experiment No. 1
2. Experiment No. 2
3. Experiment No. 3
4. Experiment No. 4
5. Experiment No. 5
6. Experiment No. 6
7. Experiment No. 7
8. Experiment No. 8
9. Experiment No. 9
10. Experiment No. 10
11. Experiment No. 11
List of Experiments
Sr. No. | Experiments Name | Course Outcome
Experiment Plan, Course Objectives &
Course Outcome
Course Objectives:
Course Outcomes:
CO6 Apply the knowledge of Distributed File Systems to analyze various file systems like NFS and AFS, and gain experience in building large-scale distributed applications.
Module No. | Week No. | Experiments Name | Course Outcome
Mapping of Course outcomes with Program outcomes
Subject Weight | Contribution to Program outcomes (PO) 1–12
Mapping of Course outcomes with Program Specific outcomes:
Course Outcomes | Contribution to Program Specific outcomes (PSO) 1–3
Mapping Course Outcomes (CO) -
Program Outcomes (PO)
Study and Evaluation Scheme
Course Code | Course Name | Teaching Scheme | Credits Assigned
Term Work:
Laboratory work will be based on the above syllabus, with a minimum of 10 experiments to be incorporated.
Practical Exam:
Distributed Computing
Experiment No. : 1
Experiment No. 1
1. Aim: Program to demonstrate datagram Socket for Chat Application using Java
2. Objectives:
3. Outcomes:
5. Theory:
Client/Server Communication
At a basic level, network based systems consist of a server, client, and a media for
communication as shown in Figure 1. A computer running a program that makes
request for services is called client machine. A computer running a program that
offers requested services from one or more clients is called server machine. The
media for communication can be wired or wireless network.
Client Server
6. Procedure/ Program:
TCP Server Algorithm:
1. Create a socket and bind to the well-known address for the service being offered
2. Place the socket in passive mode
3. Accept the next connection request from the socket, and obtain a new socket for the
connection
4. Repeatedly read a request from the client, formulate a response, and send a reply
back to the client according to the application protocol
5. When finished with a particular client, close the connection and return to step 3 to
accept a new connection
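The steps above describe the TCP variant of the server. Since the stated aim uses a datagram socket, a minimal UDP chat sketch in Java is shown below; the class name, port 9876 and the "bye" termination convention are illustrative assumptions, not a prescribed solution. A matching client is symmetric: it creates an unbound DatagramSocket, sends the first message to localhost:9876, and then alternates receive and send.

// Minimal sketch of one side of a datagram (UDP) chat; names and port are assumptions.
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.util.Scanner;

public class UDPChatServer {
    public static void main(String[] args) throws Exception {
        DatagramSocket socket = new DatagramSocket(9876);   // bind to a well-known port
        Scanner scanner = new Scanner(System.in);
        byte[] buffer = new byte[1024];
        System.out.println("Server ready. Waiting for client...");
        while (true) {
            // Receive one message from the client
            DatagramPacket request = new DatagramPacket(buffer, buffer.length);
            socket.receive(request);
            String received = new String(request.getData(), 0, request.getLength());
            System.out.println("Client: " + received);
            if (received.trim().equalsIgnoreCase("bye")) break;

            // Read a reply from the keyboard and send it back to the client's address
            String reply = scanner.nextLine();
            byte[] data = reply.getBytes();
            InetAddress clientAddress = request.getAddress();
            int clientPort = request.getPort();
            socket.send(new DatagramPacket(data, data.length, clientAddress, clientPort));
            if (reply.trim().equalsIgnoreCase("bye")) break;
        }
        socket.close();
        scanner.close();
    }
}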
9. References:
Andrew S. Tanenbaum and Maarten Van Steen, “Distributed Systems: Principles and Paradigms”, 2nd edition, Pearson Education, Inc., 2007, ISBN: 0-13-239227-5.
George Coulouris, Jean Dollimore, Tim Kindberg, “Distributed Systems: Concepts and Design”, 4th edition, Addison Wesley/Pearson Education.
Pradeep K. Sinha, “Distributed Operating Systems: Concepts and Design”, IEEE Computer Society Press.
Herbert Schildt, “Java: The Complete Reference”.
Distributed Computing
Experiment No. : 2
Experiment No. 2
1. Aim: Program to implement RMI Application using Java.
RMI (Remote Method Invocation) is an API that provides a mechanism to create distributed applications in Java.
RMI allows an object to invoke methods on an object running in another JVM.
3. Outcomes:
5. Theory:
Java Remote Method Invocation (Java RMI) enables the programmer to create distributed Java technology-based to Java technology-based applications, in which the methods of remote Java objects can be invoked from other Java virtual machines, possibly on different hosts. RMI uses object serialization to marshal and unmarshal parameters and does not truncate types, supporting true object-oriented polymorphism.
Java RMI is a mechanism that allows one to invoke a method on an object that exists in
another address space. The other address space could be on the same machine or on a
different one. The RMI mechanism is basically an object-oriented RPC mechanism.
A client can use the Object Registry to obtain access to a remote object using the name of the object.
6. Procedure/ Program:
Create Registry
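A minimal sketch of the four RMI pieces (remote interface, implementation, server that creates the registry, and client) is shown below, written as one file for readability; the service name "HelloService" and port 1099 are assumptions, and in the lab each class would normally go in its own source file.

// Minimal RMI sketch; service name and port are illustrative assumptions.
import java.rmi.Naming;
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.server.UnicastRemoteObject;

// Remote interface: every remote method must declare RemoteException
interface Hello extends Remote {
    String sayHello(String name) throws RemoteException;
}

// Remote object implementation
class HelloImpl extends UnicastRemoteObject implements Hello {
    protected HelloImpl() throws RemoteException { super(); }
    public String sayHello(String name) throws RemoteException {
        return "Hello, " + name + " (from the server JVM)";
    }
}

// Server: creates the registry and binds the remote object under a name
public class RMIServer {
    public static void main(String[] args) throws Exception {
        LocateRegistry.createRegistry(1099);          // start the registry in-process
        Naming.rebind("rmi://localhost:1099/HelloService", new HelloImpl());
        System.out.println("HelloService bound; server ready.");
    }
}

// Client (run in a separate JVM): looks up the stub and invokes the remote method
class RMIClient {
    public static void main(String[] args) throws Exception {
        Hello stub = (Hello) Naming.lookup("rmi://localhost:1099/HelloService");
        System.out.println(stub.sayHello("RAIT"));
    }
}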
9. References:
Andrew S. Tanenbaum and Maarten Van Steen, “Distributed Systems: Principles and Paradigms”, 2nd edition, Pearson Education, Inc., 2007, ISBN: 0-13-239227-5.
George Coulouris, Jean Dollimore, Tim Kindberg, “Distributed Systems: Concepts and Design”, 4th edition, Addison Wesley/Pearson Education.
Pradeep K. Sinha, “Distributed Operating Systems: Concepts and Design”, IEEE Computer Society Press.
Herbert Schildt, “Java: The Complete Reference”.
Distributed Computing
Experiment No. : 3
Experiment No. 3
1. Aim: Program to demonstrate Bully Election Algorithm using Java
2. Objective:
3. Outcomes:
Election Algorithms
- Peer-to-peer communication: every process can send messages to every other process.
- Assume that processes have unique IDs, such that one is the highest.
6. Procedure/Program
Pi starts an election by sending an ELECTION message to every process with a higher ID. If no higher process responds, Pi wins the election and announces itself as the coordinator to all processes. If a process with a higher ID answers, it takes over the election and Pi's part is done.
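Below is a minimal single-JVM simulation of this procedure, assuming process IDs 1..n and a simple alive/failed flag instead of real message passing; the class name and input format are illustrative assumptions.

// Minimal Bully election simulation; input format is an assumption.
import java.util.Scanner;

public class BullyElection {
    static int n;
    static boolean[] alive;   // alive[i] == true if process i is up (1-based)

    // Process 'starter' initiates an election; returns the elected coordinator
    static int election(int starter) {
        System.out.println("Process " + starter + " starts an election.");
        for (int higher = starter + 1; higher <= n; higher++) {
            if (alive[higher]) {
                // A higher-ID live process answers and takes over the election
                System.out.println("Process " + higher + " responds and takes over.");
                return election(higher);
            }
        }
        // No higher live process answered: starter becomes coordinator
        return starter;
    }

    public static void main(String[] args) {
        Scanner sc = new Scanner(System.in);
        System.out.print("Number of processes: ");
        n = sc.nextInt();
        alive = new boolean[n + 1];
        for (int i = 1; i <= n; i++) {
            System.out.print("Is process " + i + " alive (1/0)? ");
            alive[i] = sc.nextInt() == 1;
        }
        System.out.print("Which process detects the coordinator failure? ");
        int initiator = sc.nextInt();
        System.out.println("Process " + election(initiator) + " is elected as the new coordinator.");
        sc.close();
    }
}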
7. Conclusion
Hence from above experiment student understood the working of Bully election
algorithm for coordinator election in distributed system.. This method requires atmost
five stages, and the probability of detecting a crashed process during the execution of
algorithm is lowered in contrast to other algorithms.
8. QUIZ / Viva Questions:
Explain Bully algorithm in brief.
What are different election algorithms?
9. References:
M.R. Bhujade, “Parallel Computing”, 2nd edition, New Age International Publishers, 2009.
Andrew S. Tanenbaum and Maarten Van Steen, “Distributed Systems: Principles and Paradigms”, 2nd edition, Pearson Education, Inc., 2007, ISBN: 0-13-239227-5.
George Coulouris, Jean Dollimore, Tim Kindberg, “Distributed Systems: Concepts and Design”, 4th edition, Addison Wesley/Pearson Education.
Pradeep K. Sinha, “Distributed Operating Systems: Concepts and Design”, IEEE Computer Society Press.
Herbert Schildt, “Java: The Complete Reference”.
Distributed Computing
Experiment No. : 4
Experiment No. 4
1. Aim: Program to demonstrate Berkeley Clock Synchronization Algorithm using
Java
2. Objective: Student will understand
6. Procedure/ Program:
Algorithm
1. A master is chosen via an election process such as Chang and Roberts
algorithm.
2. The master polls the slaves who reply with their time in a similar way to
Cristian's algorithm.
3. The master observes the round-trip time (RTT) of the messages and estimates
the time of each slave and its own.
4. The master then averages the clock times, ignoring any values it receives far
outside the values of the others.
5. Instead of sending the updated current time back to the other process, the
master then sends out the amount (positive or negative) that each slave must
adjust its clock. This avoids further uncertainty due to RTT at the slave
processes.
With this method the average cancels out the individual clocks' tendencies to drift.
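A minimal sketch of the averaging step is given below, assuming the clock readings have already been polled and ignoring RTT estimation; the sample values and the outlier threshold are illustrative assumptions.

// Minimal sketch of the Berkeley averaging step; sample data is assumed.
import java.util.HashMap;
import java.util.Map;

public class BerkeleySync {
    public static void main(String[] args) {
        // Clock values (in seconds) polled by the master, including its own clock
        Map<String, Double> clocks = new HashMap<>();
        clocks.put("master", 10.00);
        clocks.put("slave1", 10.10);
        clocks.put("slave2",  9.85);
        clocks.put("slave3", 25.00);   // faulty clock, should be ignored

        double threshold = 5.0;        // ignore readings too far from the master's clock
        double sum = 0; int count = 0;
        for (double t : clocks.values()) {
            if (Math.abs(t - clocks.get("master")) <= threshold) { sum += t; count++; }
        }
        double average = sum / count;

        // Send each machine the adjustment it must apply, not the absolute time
        for (Map.Entry<String, Double> e : clocks.entrySet()) {
            double adjustment = average - e.getValue();
            System.out.printf("%s: adjust clock by %+.2f seconds%n", e.getKey(), adjustment);
        }
    }
}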
7. Conclusion
A physical clock is not present in distributed system for synchronization. To achieve the
synchronization all the machines timestamp is collected and the server broadcast the
appropriate time to all using berkely algorithm.
9. References:
M.R. Bhujade, “Parallel Computing”, 2nd edition, New Age International Publishers, 2009.
Andrew S. Tanenbaum and Maarten Van Steen, “Distributed Systems: Principles and Paradigms”, 2nd edition, Pearson Education, Inc., 2007, ISBN: 0-13-239227-5.
George Coulouris, Jean Dollimore, Tim Kindberg, “Distributed Systems: Concepts and Design”, 4th edition, Addison Wesley/Pearson Education.
Pradeep K. Sinha, “Distributed Operating Systems: Concepts and Design”, IEEE Computer Society Press.
Herbert Schildt, “Java: The Complete Reference”.
Distributed Computing
Experiment No. : 5
Experiment No. 5
1. Aim: Program to implement Token Ring Algorithm for distributed mutual
exclusion.
2. Objective:
Understand the different approaches to achieving mutual exclusion in a distributed system.
3. Outcomes:
From this experiment, the student will be able to learn
Analyze the various techniques used for clock synchronization and mutual
exclusion
4. Software Required: JDK 1.6
5. Theory
Mutual exclusion:
Concurrent access by processes to a shared resource or data is executed in a mutually exclusive manner.
Only one process is allowed to execute the critical section (CS) at any given time.
In a distributed system, shared variables (semaphores) or a local kernel cannot be used to implement mutual exclusion.
Two basic approaches for distributed mutual exclusion:
1. Token based approach
2. Non-token based approach
Token-based approach:
If a site wants to enter the CS and it does not have the token, it broadcasts a REQUEST message for
the token to all other sites. A site which possesses the token sends it to the requesting site upon the
receipt of its REQUEST message. If a site receives a REQUEST message when it is executing the
CS, it sends the token only after it has completed the execution of the CS.
This algorithm must efficiently address the following two design issues:
(1) How to distinguish an outdated REQUEST message from a current REQUEST message: Due to variable message delays, a site may receive a token request message after the corresponding request has been satisfied. If a site cannot determine whether the request corresponding to a token request has been satisfied, it may dispatch the token to a site that does not need it. This does not violate correctness; however, it may seriously degrade performance.
(2) How to determine which site has an outstanding request for the CS: After a site has finished the execution of the CS, it must determine which sites have an outstanding request for the CS so that the token can be dispatched to one of them.
6. Procedure/ Program:
Releasing the critical section: Having finished the execution of the CS, site Si takes the following actions:
(a) It sets the LN[i] element of the token array equal to RNi[i].
(b) For every site Sj whose id is not in the token queue, it appends that id to the token queue if RNi[j] = LN[j] + 1.
(c) If the token queue is nonempty after the above update, Si deletes the top site id from the token queue and sends the token to the site indicated by that id.
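The release step above follows the token-based (Suzuki–Kasami style) description from the theory. For the ring variant named in the Aim, a minimal single-JVM simulation in which the token simply circulates around a logical ring is sketched below; all names and the sample request pattern are illustrative assumptions.

// Minimal token-ring mutual exclusion simulation; sample data is assumed.
public class TokenRing {
    public static void main(String[] args) throws InterruptedException {
        int n = 5;                                              // number of sites in the ring
        boolean[] wantsCS = {true, false, true, false, true};   // pending CS requests
        int tokenHolder = 0;                                    // site currently holding the token
        int pending = 3;                                        // number of true entries above

        // Circulate the token around the ring until every request is served
        while (pending > 0) {
            if (wantsCS[tokenHolder]) {
                System.out.println("Site " + tokenHolder + " holds the token and enters its CS");
                Thread.sleep(100);                              // simulate work inside the CS
                System.out.println("Site " + tokenHolder + " exits its CS");
                wantsCS[tokenHolder] = false;
                pending--;
            }
            // Pass the token to the next site in the ring
            tokenHolder = (tokenHolder + 1) % n;
        }
        System.out.println("All requests served; the token keeps circulating.");
    }
}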
7. Conclusion:
9. References:
M.R. Bhujade, “Parallel Computing”, 2nd edition, New Age International Publishers, 2009.
Andrew S. Tanenbaum and Maarten Van Steen, “Distributed Systems: Principles and Paradigms”, 2nd edition, Pearson Education, Inc., 2007, ISBN: 0-13-239227-5.
George Coulouris, Jean Dollimore, Tim Kindberg, “Distributed Systems: Concepts and Design”, 4th edition, Addison Wesley/Pearson Education.
Pradeep K. Sinha, “Distributed Operating Systems: Concepts and Design”, IEEE Computer Society Press.
Distributed Computing
Experiment No. : 6
Experiment No. 6
1. Aim: Program to implement a Non-Token-Based Algorithm for distributed mutual exclusion.
2. Objective:
3. Outcomes:
From this experiment, the student will be able to learn
Analyze the various techniques used for clock synchronization and mutual exclusion
4. Software Required: JDK 1.6
5. Theory
Mutual exclusion:
Concurrent access by processes to a shared resource or data is executed in a mutually exclusive manner.
Only one process is allowed to execute the critical section (CS) at any given time.
In a distributed system, shared variables (semaphores) or a local kernel cannot be used to implement mutual exclusion.
Two basic approaches for distributed mutual exclusion:
1. Token based approach
2. Non-token based approach
Non-token-based approach:
Two types of messages (REQUEST and REPLY) are used, and communication channels are assumed to follow FIFO order.
A site sends a REQUEST message to all other sites to get their permission to enter the critical section.
A site sends a REPLY message to another site to give its permission to enter the critical section.
A timestamp is given to each critical section request using Lamport's logical clock. The timestamp is used to determine the priority of critical section requests: a smaller timestamp gets higher priority than a larger timestamp. Critical section requests are always executed in the order of their timestamps.
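A minimal single-JVM sketch of this reply/defer rule and the resulting CS entry order is shown below, assuming all requests and their (timestamp, site id) tags are known up front; a real implementation would exchange REQUEST and REPLY messages between separate processes, and all names and sample values here are illustrative.

// Minimal sketch of the Ricart–Agrawala priority rule; sample data is assumed.
import java.util.Arrays;
import java.util.Comparator;

public class RicartAgrawalaSketch {
    static class Request {
        final int timestamp, siteId;
        Request(int timestamp, int siteId) { this.timestamp = timestamp; this.siteId = siteId; }
    }

    // Request a has priority over request b if its timestamp is smaller,
    // with the site id breaking ties.
    static boolean hasPriority(Request a, Request b) {
        return a.timestamp < b.timestamp
            || (a.timestamp == b.timestamp && a.siteId < b.siteId);
    }

    public static void main(String[] args) {
        Request[] requests = {
            new Request(4, 1),   // site 1 requests at Lamport time 4
            new Request(2, 2),   // site 2 requests at Lamport time 2
            new Request(2, 3)    // site 3 requests at the same time; lower id wins the tie
        };

        // A site replies immediately to a request that has priority over its own
        // pending request; otherwise it defers the reply until it leaves its CS.
        for (Request i : requests) {
            for (Request j : requests) {
                if (i == j) continue;
                String decision = hasPriority(i, j) ? "replies immediately" : "defers its reply";
                System.out.println("Site " + j.siteId + " " + decision + " to site " + i.siteId);
            }
        }

        // CS entry order is the total order induced by (timestamp, siteId)
        Request[] order = requests.clone();
        Arrays.sort(order, Comparator.comparingInt((Request r) -> r.timestamp)
                                     .thenComparingInt(r -> r.siteId));
        for (Request r : order) {
            System.out.println("Site " + r.siteId + " (ts=" + r.timestamp + ") enters the CS using "
                + 2 * (requests.length - 1) + " messages.");
        }
    }
}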
7. Conclusion:
In a distributed environment there is no centralized control over the distributed resources. In a token-based scheme, the machine which wants to acquire a resource must hold the token, and if the token is lost it must be regenerated using an election algorithm. Here we have studied the working of the Ricart–Agrawala algorithm, a non-token-based scheme, which requires the invocation of 2(N – 1) messages per critical section execution. These 2(N – 1) messages involve (N – 1) request messages and (N – 1) reply messages.
9. References:
M.R. Bhujade, “Parallel Computing”, 2nd edition, New Age International Publishers, 2009.
Andrew S. Tanenbaum and Maarten Van Steen, “Distributed Systems: Principles and Paradigms”, 2nd edition, Pearson Education, Inc., 2007, ISBN: 0-13-239227-5.
George Coulouris, Jean Dollimore, Tim Kindberg, “Distributed Systems: Concepts and Design”, 4th edition, Addison Wesley/Pearson Education.
Pradeep K. Sinha, “Distributed Operating Systems: Concepts and Design”, IEEE Computer Society Press.
Distributed Computing
Experiment No. : 7
Experiment No. 7
1. Aim: Program to Simulate a Load Balancing Algorithm using Java
2. Objective:
Understand the different approaches to load balancing in a distributed system.
3. Outcomes:
From this experiment, the student will be able to learn
Analyze the various techniques used for load balancing in distributed system
4. Software Required: JDK 1.6, Python
5. Theory
Load balancing is the way of distributing load units (jobs or tasks) across a set of processors which are connected to a network and which may be distributed across the globe. The excess load, or remaining unexecuted load, on a processor is migrated to other processors which have load below the threshold load. The threshold load is the amount of load up to which a processor can accept further load. In a system with multiple nodes there is a very high chance that some nodes will be idle while others will be overloaded. The processors in a system can therefore be classified according to their present load as heavily loaded processors (enough jobs are waiting for execution), lightly loaded processors (fewer jobs are waiting) and idle processors (no job to execute). With a load balancing strategy it is possible to make every processor equally busy and to finish the work at approximately the same time. A load balancing operation consists of three rules: the location rule, the distribution rule and the selection rule.
6. Procedure/ Program:
routine Load_balance(n, p)
// We have list of n nodes initialized to 0 and is returned at the end of the algorithm.
//Round Robin Algorithm is used to balance the load with time quantum as 1 process.
Create a list of n nodes with each node having 0 processes allocated currently.
Consider i processes, and assign j<-0
while i not equals 0, do
add a process to jth node(considering 1 process as time quantum of Round
Robin Algo )
j<-(j+1)%n
decrement i
Return the list.
Main routine:
User inputs the n nodes and p processes
Call routine Load_balance(n, p) to get the balanced list
MENU:
  add a new node    -> call routine Load_balance(n+1, p)
  remove a node     -> call routine Load_balance(n-1, p)
  add a new process -> call routine Load_balance(n, p+1)
  remove a process  -> call routine Load_balance(n, p-1)
  QUIT
Display the returned list
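A compact Java version of the round-robin routine above is sketched below; the class and method names and the sample values of n and p are illustrative assumptions.

// Minimal round-robin load balancing sketch; sample values are assumed.
import java.util.Arrays;

public class LoadBalanceSim {
    // Distribute p processes over n nodes in round-robin fashion (time quantum = 1 process)
    static int[] loadBalance(int n, int p) {
        int[] nodes = new int[n];          // nodes[j] = number of processes on node j
        int j = 0;
        for (int i = p; i != 0; i--) {     // assign one process at a time
            nodes[j]++;
            j = (j + 1) % n;               // move to the next node in the ring
        }
        return nodes;
    }

    public static void main(String[] args) {
        int n = 4, p = 10;
        System.out.println("Initial:         " + Arrays.toString(loadBalance(n, p)));
        // Menu operations simply re-run the routine with updated n or p
        System.out.println("Add a node:      " + Arrays.toString(loadBalance(n + 1, p)));
        System.out.println("Remove a node:   " + Arrays.toString(loadBalance(n - 1, p)));
        System.out.println("Add a process:   " + Arrays.toString(loadBalance(n, p + 1)));
        System.out.println("Remove a process:" + Arrays.toString(loadBalance(n, p - 1)));
    }
}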
7. Conclusion:
In computing, load balancing is a technique that improves the distribution of workloads across multiple resources such as computers, clusters, servers and disks. Thus we have studied and implemented a load balancing technique to optimize the use of the available resources, maximize throughput, minimize response time, and avoid overloading any single resource.
8. QUIZ / Viva Questions:
What are the benefits of using load balancing?
9. References:
“The Study on Load Balancing Strategies in Distributed Computing System”, International Journal of Computer Science & Engineering Survey (IJCSES), Vol. 3, 2012, pp. 19-30, DOI: 10.5121/ijcses.2012.3203.
Andrew S. Tanenbaum and Maarten Van Steen, “Distributed Systems: Principles and Paradigms”, 2nd edition, Pearson Education, Inc., 2007, ISBN: 0-13-239227-5.
George Coulouris, Jean Dollimore, Tim Kindberg, “Distributed Systems: Concepts and Design”, 4th edition, Addison Wesley/Pearson Education.
Pradeep K. Sinha, “Distributed Operating Systems: Concepts and Design”, IEEE Computer Society Press.
Distributed Computing
Experiment No. : 8
Experiment No. 8
1. Aim: Program to implement Deadlock management in distributed system.
2. Objective:
To equip students with skills to analyze and design distributed applications
To study deadlock situations in operating systems.
To understand the Banker's algorithm for deadlock avoidance and detection.
Construct a resource allocation graph for a deadlock condition and verify using the
simulator.
3. Outcomes:
From this experiment, the student will be able to learn
Demonstrate the concepts of Consistency and Replication Management
4. Software Required: JDK 1.6
5. Theory
A deadlock is a condition in a system where a set of processes (or threads) have requests
for resources that can never be satisfied. Essentially, a process cannot proceed because it
needs to obtain a resource held by another process; but, it itself is holding a resource that the
other process needs. There are four conditions to be met for a deadlock to occur in a system:
1. Mutual exclusion: A resource can be held by at most one process.
2. Hold and wait: Processes that already hold resources can wait for another resource.
3. No preemption: A resource can be released only voluntarily by the process holding it.
4. Circular wait: Two or more processes are waiting for resources held by one of the other processes.
The Banker's algorithm is a resource allocation and deadlock avoidance algorithm used in distributed systems. Developed by Edsger Dijkstra, it tests for safety by simulating the allocation of the predetermined maximum possible amounts of all resources, and then makes an "s-state" check to test for possible deadlock conditions for all other pending activities, before deciding whether allocation should be allowed to continue. The Banker's algorithm is run by the operating system whenever a process requests resources. The algorithm avoids deadlock by denying or postponing the request if it determines that accepting the request could put the system in an unsafe state. When a new process enters the system, it must declare the maximum number of instances of each resource type that it may ever claim; clearly, that number may not exceed the total number of resources in the system. Also, when a process gets all its requested resources, it must return them in a finite amount of time.
6. Procedure/ Program:
The Banker's (safety) algorithm is as follows:
STEP 1: Initialize
    Work := Available
    for i = 1, 2, ..., n: Finish[i] := false
STEP 2: Find an i such that both
    a. Finish[i] is false
    b. Need_i <= Work
    If no such i exists, go to STEP 4
STEP 3:
    Work := Work + Allocation_i
    Finish[i] := true
    go to STEP 2
STEP 4:
    If Finish[i] = true for all i, the system is in a safe state
Procedure:
1) Enter the number of processes along with their allocation, maximum need and the available resources.
2) Find out whether each process's remaining need can be satisfied with the currently available resources.
3) If every process can finish in some such order, the system is in a safe state; otherwise it is in an unsafe state.
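A minimal Java sketch of the safety check above, with a hard-coded sample allocation, is given below; in the lab program the matrices would be read from the user, and all names and sample values here are illustrative assumptions.

// Minimal Banker's safety check; sample matrices are assumed.
import java.util.Arrays;

public class BankersSafety {
    public static void main(String[] args) {
        int n = 5, m = 3;                         // n processes, m resource types
        int[] available = {3, 3, 2};
        int[][] allocation = {{0,1,0},{2,0,0},{3,0,2},{2,1,1},{0,0,2}};
        int[][] max       = {{7,5,3},{3,2,2},{9,0,2},{2,2,2},{4,3,3}};

        // Need[i][j] = Max[i][j] - Allocation[i][j]
        int[][] need = new int[n][m];
        for (int i = 0; i < n; i++)
            for (int j = 0; j < m; j++)
                need[i][j] = max[i][j] - allocation[i][j];

        int[] work = Arrays.copyOf(available, m);
        boolean[] finish = new boolean[n];
        int[] safeSequence = new int[n];
        int count = 0;

        // Repeatedly pick an unfinished process whose need fits within work
        while (count < n) {
            boolean found = false;
            for (int i = 0; i < n && !found; i++) {
                if (!finish[i] && fits(need[i], work)) {
                    for (int j = 0; j < m; j++) work[j] += allocation[i][j]; // it finishes and releases resources
                    finish[i] = true;
                    safeSequence[count++] = i;
                    found = true;
                }
            }
            if (!found) { System.out.println("System is NOT in a safe state."); return; }
        }
        System.out.println("System is in a safe state. Safe sequence: " + Arrays.toString(safeSequence));
    }

    static boolean fits(int[] need, int[] work) {
        for (int j = 0; j < need.length; j++) if (need[j] > work[j]) return false;
        return true;
    }
}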
7. Conclusion:
8. QUIZ / Viva Questions:
What necessary conditions can lead to a deadlock situation in a system?
What is concurrency? Explain deadlock and starvation with an example.
What are your solution strategies for the “Dining Philosophers Problem”?
What is the Banker's Algorithm?
9. References:
M.R. Bhujade, “Parallel Computing”, 2nd edition, New Age International Publishers, 2009.
Andrew S. Tanenbaum and Maarten Van Steen, “Distributed Systems: Principles and Paradigms”, 2nd edition, Pearson Education, Inc., 2007, ISBN: 0-13-239227-5.
George Coulouris, Jean Dollimore, Tim Kindberg, “Distributed Systems: Concepts and Design”, 4th edition, Addison Wesley/Pearson Education.
Pradeep K. Sinha, “Distributed Operating Systems: Concepts and Design”, IEEE Computer Society Press.
Distributed Computing
Experiment No. : 9
Experiment No. 9
1. Aim: Case Study of Google File System
2. Objective:
To understand the approaches for designing a Google file system and understand its
architecture.
3. Outcomes:
Student will learn
Apply the knowledge of Distributed File System to analyse various file systems
and experience in building large-scale distributed applications.
4. Software Required: Microsoft Office, Internet
5. Theory
The Google File System (GFS), developed in the late 1990s, uses thousands of storage systems built from inexpensive commodity components to provide petabytes of storage to a large user community with diverse needs.
Some of the most important aspects of this analysis reflected in the GFS design
are:
• Scalability and reliability are critical features of the system; they must be
considered from the beginning, rather than at later design stages.
• The vast majority of files range in size from a few GB to hundreds of TB.
• The most common operation is to append to an existing file; random write
operations to a file are extremely infrequent.
• Sequential read operations are the norm.
• Users process the data in bulk and are less concerned with the response time.
• To simplify the system implementation, the consistency model should be relaxed without placing an additional burden on the application developers.
As a result of this analysis several design decisions were made:
1. Segment a file in large chunks.
2. Implement an atomic file append operation allowing multiple applications
operating concurrently to append to the same file.
3. Build the cluster around a high-bandwidth rather than low-latency
interconnection network. Separate the flow of control from the data flow; schedule
the high-bandwidth data flow by pipelining the data transfer over TCP connections
to reduce the response time. Exploit network topology by sending data to the
closest node in the network.
4. Eliminate caching at the client site; caching increases the overhead for
maintaining consistency among cashed copies at multiple client sites and it is not
likely to improve performance.
5. Ensure consistency by channeling critical file operations through a master
controlling the entire system.
6. Minimize master’s involvement in file access operations to avoid hot-spot
contention and to ensure scalability.
7. Support efficient checkpointing and fast recovery mechanisms.
8. Support efficient garbage collection mechanisms.
GFS files are collections of fixed-size segments called chunks; at the time of file creation each chunk is assigned a unique chunk handle. A chunk consists of 64 KB blocks and each block has a 32-bit checksum. Chunks are stored on Linux file systems and are replicated on multiple sites; a user may change the number of replicas from the standard value of three to any desired value. The chunk size is 64 MB; this choice is motivated by the desire to optimize the performance for large files and to reduce the amount of metadata maintained by the system.
Leases are used to order mutations, operations such as write or append which occur frequently. In such cases the master grants a lease for a particular chunk to one of the chunk servers, called the primary; then the primary creates a serial order for the updates of that chunk. When the data of a write operation straddles a chunk boundary, two operations are carried out, one for each chunk.
The following steps of a write request illustrate the process which buffers data and
decouples the control flow from the data flow for efficiency:
1. The client contacts the master, which assigns a lease to one of the chunk servers for the particular chunk if no lease for that chunk exists; the master then replies with the IDs of the primary and the secondary chunk servers holding replicas of the chunk. The client caches this information.
2. The client sends the data to all chunk servers holding replicas of the chunk; each
one of the chunk servers stores the data in an internal LRU buffer and then sends
an acknowledgment to the client.
3. The client sends the write request to the primary chunk server once it has
received the acknowledgments from all chunk servers holding replicas of the
chunk. The primary chunk server identifies mutations by consecutive sequence
numbers.
4. The primary chunk server sends the write requests to all secondaries.
5. Each secondary chunk server applies the mutations in the order of the sequence
number and then sends an acknowledgment to the primary chunk server.
6. Finally, after receiving the acknowledgments from all secondaries, the primary
informs the client.
The system supports an efficient checkpointing procedure based on copy-on-write
to construct system snapshots.
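Although this experiment is a case study rather than a programming exercise, the write control flow above can be illustrated with a toy single-JVM sketch; all class and method names are illustrative assumptions, and real GFS uses RPC between separate machines.

// Toy illustration of the GFS write control flow: data is pushed to all replicas,
// then the primary orders the mutation and forwards it to the secondaries.
import java.util.ArrayList;
import java.util.List;

public class GfsWriteFlowSketch {
    static class ChunkServer {
        final String name;
        final List<String> buffer = new ArrayList<>();   // stand-in for the internal LRU buffer
        final List<String> chunk = new ArrayList<>();    // applied mutations
        ChunkServer(String name) { this.name = name; }
        void push(String data) { buffer.add(data); System.out.println(name + " buffered data, ACK to client"); }
        void apply(int seq, String data) { chunk.add(seq + ":" + data); System.out.println(name + " applied mutation " + seq); }
    }

    public static void main(String[] args) {
        ChunkServer primary = new ChunkServer("primary");
        List<ChunkServer> secondaries = List.of(new ChunkServer("secondary-1"), new ChunkServer("secondary-2"));
        String data = "append-record";

        // Step 2: the client sends the data to all replicas (decoupled data flow)
        primary.push(data);
        secondaries.forEach(s -> s.push(data));

        // Steps 3-5: the client sends the write request to the primary, which picks a
        // sequence number and forwards the mutation to every secondary
        int seq = 1;
        primary.apply(seq, data);
        for (ChunkServer s : secondaries) s.apply(seq, data);

        // Step 6: after all secondaries acknowledge, the primary informs the client
        System.out.println("Primary reports success to the client for mutation " + seq);
    }
}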
6. Conclusion:
From this experiment we have studied the concept of GFS, which demonstrates the qualities essential for supporting large-scale data processing workloads on commodity hardware.
7. QUIZ / Viva Questions:
List the different features of the Google File System.
8. References:
Dan C. Marinescu, “Chapter 7 - Cloud Applications”, in Cloud Computing (Second Edition), Morgan Kaufmann, 2018, pp. 237-279, ISBN 9780128128107.
M.R. Bhujade, “Parallel Computing”, 2nd edition, New Age International Publishers, 2009.
Andrew S. Tanenbaum and Maarten Van Steen, “Distributed Systems: Principles and Paradigms”, 2nd edition, Pearson Education, Inc., 2007, ISBN: 0-13-239227-5.
George Coulouris, Jean Dollimore, Tim Kindberg, “Distributed Systems: Concepts and Design”, 4th edition, Addison Wesley/Pearson Education.
Pradeep K. Sinha, “Distributed Operating Systems: Concepts and Design”, IEEE Computer Society Press.