
Version Control: What is it, and why should you care?

Dr. Remudin Reshid Mekuria

March 2, 2022
Local Version Control Systems

- Many people's version-control method of choice is to copy files into another directory (perhaps a time-stamped directory, if they're clever).
- It is easy to forget which directory you're in and accidentally write to the wrong file or copy over files you don't mean to.
- To deal with this issue, programmers long ago developed local VCSs that had a simple database that kept all the changes to files under revision control.
Local Version Control Systems (cont.)

- One of the most popular VCS tools was a system called Revision Control System (RCS), which is still distributed with many computers today.
- RCS works by keeping patch sets (that is, the differences between files) in a special format on disk; it can then re-create what any file looked like at any point in time by adding up all the patches.
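
To make the patch-set idea concrete, here is a toy sketch in Python (not RCS itself; the file contents are made up for illustration): only the first version of a file is kept in full, each later version is stored as a diff, and any revision can be rebuilt by replaying the diffs in order.

```python
# Toy delta-based storage: keep the base version plus one diff per revision,
# and rebuild any revision by replaying ("adding up") the diffs.
import difflib

versions = [
    ["print('hello')\n"],
    ["print('hello')\n", "print('world')\n"],
    ["print('hello, world')\n", "print('world')\n"],
]

base = versions[0]
# Store one ndiff delta per step instead of each full version.
deltas = [list(difflib.ndiff(old, new)) for old, new in zip(versions, versions[1:])]

def rebuild(revision):
    """Re-create the file as it looked at `revision` (0 is the stored base)."""
    lines = base
    for delta in deltas[:revision]:
        lines = list(difflib.restore(delta, 2))  # 2 = take the "after" side of the diff
    return lines

assert rebuild(2) == versions[2]
print("".join(rebuild(2)))
```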
Local Version Control Systems (cont.)
Centralized Version Control Systems

- What are known as Centralized Version Control Systems (CVCSs) were developed to allow people to collaborate with developers on other systems.
- These systems (such as CVS, Subversion, and Perforce) have a single server that contains all the versioned files, and a number of clients that check out files from that central place.
- For many years, this has been the standard for version control.
Advantages of CVCSs

- This setup offers many advantages, especially over local VCSs.
- For example, everyone knows to a certain degree what everyone else on the project is doing.
- Administrators have fine-grained control over who can do what, and it's far easier to administer a CVCS than it is to deal with local databases on every client.
Centralized Version Control: Illustration
Centralized Version Control: Disadvantages

- However, this setup also has some serious downsides.
- The most obvious is the single point of failure that the centralized server represents.
- If that server goes down for an hour, then during that hour nobody can collaborate at all or save versioned changes to anything they're working on.
- If the hard disk the central database is stored on becomes corrupted, and proper backups haven't been kept, you lose absolutely everything – the entire history of the project except whatever single snapshots people happen to have on their local machines.
Distributed Version Control Systems

- Local VCSs suffer from this same problem – whenever you have the entire history of the project in a single place, you risk losing everything.
- This is where Distributed Version Control Systems (DVCSs) step in.
- In a DVCS (such as Git, Mercurial, Bazaar or Darcs), clients don't just check out the latest snapshot of the files; rather, they fully mirror the repository, including its full history.
- Thus, if any server dies, and these systems were collaborating via that server, any of the client repositories can be copied back up to the server to restore it.
- Every clone is really a full backup of all the data.
Distributed Version Control: Illustration
Distributed Version Control Systems (cont.)

- Many of these systems deal pretty well with having several remote repositories they can work with.
- Thus, one can collaborate with different groups of people in different ways simultaneously within the same project.
- This allows you to set up several types of workflows that aren't possible in centralized systems, such as hierarchical models.
A Short History of Git

- As with many great things in life, Git began with a bit of creative destruction and fiery controversy.
- The Linux kernel is an open source software project of fairly large scope.
- During the early years of the Linux kernel maintenance (1991–2002), changes to the software were passed around as patches and archived files.
- In 2002, the Linux kernel project began using a proprietary DVCS called BitKeeper.
- In 2005, the relationship between the community that developed the Linux kernel and the commercial company that developed BitKeeper broke down, and the tool's free-of-charge status was revoked.
A Short History of Git (cont.)

- This prompted the Linux development community (and in particular Linus Torvalds, the creator of Linux) to develop their own tool based on some of the lessons they learned while using BitKeeper. Some of the goals of the new system were:
  - Speed
  - Simple design
  - Strong support for non-linear development (thousands of parallel branches)
  - Fully distributed
  - Able to handle large projects like the Linux kernel efficiently (speed and data size)
- Git has evolved and matured to be easy to use and yet retain these initial qualities.
- It's amazingly fast and very efficient with large projects.
So what is Git in a nutshell?

- This is an important section to absorb, because if you understand what Git is and the fundamentals of how it works, then using Git effectively will probably be much easier for you.
- As you learn Git, try to clear your mind of the things you may know about other VCSs, as this will help you avoid subtle confusion when using the tool.
- You will see that Git stores and thinks about information in a very different way, and understanding these differences will help you avoid becoming confused while using it.
- Conceptually, most other systems store information as a list of file-based changes.
Storing data as changes to a base version of each file

- These other systems (CVS, Subversion, Perforce, Bazaar, and so on) think of the information they store as a set of files and the changes made to each file over time (this is commonly described as delta-based version control).
Storing data as snapshots of the project over time

- Git doesn't think of or store its data in the delta-based way illustrated on the previous slide.
- Instead, Git thinks of its data more like a series of snapshots of a miniature filesystem.
Storing data as snapshots of the project over time (cont.)

- With Git, every time you commit, or save the state of your project, Git basically takes a picture of what all your files look like at that moment and stores a reference to that snapshot.
- To be efficient, if files have not changed, Git doesn't store the file again, just a link to the previous identical file it has already stored (see the sketch after this list).
- Git thinks about its data more like a stream of snapshots.
- This is an important distinction between Git and nearly all other VCSs.
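
A minimal sketch of the snapshot idea in Python, assuming nothing about Git's real on-disk format (the file names and contents below are made up): each commit records, for every file, the hash of its content, and unchanged files simply reuse content that is already stored.

```python
# Toy snapshot store: each commit maps file names to content hashes, and the
# actual content lives in one content-addressed dictionary, so unchanged files
# are never stored twice.
import hashlib

objects = {}   # hash -> file content ("blobs")
commits = []   # each commit: {file name: hash of its content at that moment}

def store(content: bytes) -> str:
    digest = hashlib.sha1(content).hexdigest()
    objects.setdefault(digest, content)  # identical content is stored only once
    return digest

def commit(files: dict) -> None:
    commits.append({name: store(content) for name, content in files.items()})

commit({"a.txt": b"version 1 of A\n", "b.txt": b"version 1 of B\n"})
commit({"a.txt": b"version 2 of A\n", "b.txt": b"version 1 of B\n"})  # b.txt unchanged

# Two snapshots, but only three blobs: b.txt's content is shared between them.
print(len(commits), "snapshots,", len(objects), "blobs stored")
```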
Storing data as snapshots of the project over time (cont.)

- It makes Git reconsider almost every aspect of version control that most other systems copied from the previous generation.
- This makes Git more like a mini filesystem with some incredibly powerful tools built on top of it, rather than simply a VCS.
- We will explore some of the benefits you gain by thinking of your data this way when we cover Git branching later on.
Nearly Every Operation Is Local

- Most operations in Git need only local files and resources to operate – generally no information is needed from another computer on your network.
- If you're used to a CVCS where most operations have that network latency overhead, this aspect of Git will make you think that the gods of speed have blessed Git with unworldly powers.
- Because you have the entire history of the project right there on your local disk, most operations seem almost instantaneous.
Nearly Every Operation Is Local: Example

- To browse the history of the project, Git doesn't need to go out to the server to get the history and display it for you – it simply reads it directly from your local database.
- This means you see the project history almost instantly.
- Let us say one needs to see the changes introduced between the current version of a file and the file a month ago.
- Then Git can look up the file a month ago and do a local difference calculation, instead of having to either ask a remote server to do it or pull an older version of the file from the remote server to do it locally.
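
One hedged way to script exactly this from Python, assuming you are inside a local clone and that the file of interest is called README.md (both the repository and the file name are illustrative assumptions):

```python
# Show how a file changed over the last month using only local data: find the
# newest commit that is at least a month old, then diff it against the working tree.
import subprocess

def changes_over_last_month(path="README.md"):
    # Newest commit older than one month, read entirely from the local history.
    old = subprocess.run(
        ["git", "rev-list", "-1", "--before=1 month ago", "HEAD"],
        capture_output=True, text=True, check=True,
    ).stdout.strip()
    # Local difference calculation; no server is contacted.
    diff = subprocess.run(
        ["git", "diff", old, "--", path],
        capture_output=True, text=True, check=True,
    ).stdout
    print(diff)

if __name__ == "__main__":
    changes_over_last_month()
```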
Nearly Every Operation Is Local: Example (cont.)
- This also means that there is very little you can't do if you're offline or off VPN.
- If you get on an airplane or a train and want to do a little work, you can commit happily (to your local copy, remember?) until you get to a network connection to upload.
- If you go home and can't get your VPN client working properly, you can still work.
- In many other systems, doing so is either impossible or painful.
- In Perforce, for example, you can't do much when you aren't connected to the server; in Subversion and CVS, you can edit files, but you can't commit changes to your database (because your database is offline).
- This may not seem like a huge deal, but you may be surprised what a big difference it can make.
Git Has Integrity

- Everything in Git is checksummed before it is stored and is then referred to by that checksum.
- This means it's impossible to change the contents of any file or directory without Git knowing about it.
- This functionality is built into Git at the lowest levels and is integral to its philosophy.
- You can't lose information in transit or get file corruption without Git being able to detect it.
Git Has Integrity (cont.)

- The mechanism that Git uses for this checksumming is called a SHA-1 hash.
- This is a 40-character string composed of hexadecimal characters (0–9 and a–f) and calculated based on the contents of a file or directory structure in Git. A SHA-1 hash looks something like this:
  24b9da6552252987aa493b52f8696cd6d3b00373
- You will see these hash values all over the place in Git because it uses them so much.
- In fact, Git stores everything in its database not by file name but by the hash value of its contents.
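
The sketch below recomputes, in Python, the SHA-1 identifier Git assigns to a file's contents (a "blob"): Git hashes a short header followed by the raw bytes. The file content here is made up; running git hash-object on the same content should print the same digest.

```python
# Recompute the SHA-1 that Git uses to identify a file's contents ("blob" object):
# the digest covers the header "blob <size>\0" plus the raw bytes of the file.
import hashlib

content = b"hello world\n"                   # made-up file content
header = f"blob {len(content)}\0".encode()   # Git's object header
digest = hashlib.sha1(header + content).hexdigest()

print(digest)  # 3b18e512dba79e4c8300dd08aeb37f8e728b8dad
# Renaming the file changes nothing here: the hash depends only on the contents,
# which is why Git can store objects by hash rather than by file name.
```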
Git Generally Only Adds Data

- When you do actions in Git, nearly all of them only add data to the Git database.
- It is hard to get the system to do anything that is not undoable or to make it erase data in any way.
- As with any VCS, you can lose or mess up changes you haven't committed yet, but after you commit a snapshot into Git, it is very difficult to lose, especially if you regularly push your database to another repository.
- This makes using Git a joy because we know we can experiment without the danger of severely screwing things up.
The Three States

- Pay attention now – here is the main thing to remember about Git if you want the rest of your learning process to go smoothly. Git has three main states that your files can reside in: modified, staged, and committed:
  - Modified means that you have changed the file but have not committed it to your database yet.
  - Staged means that you have marked a modified file in its current version to go into your next commit snapshot.
  - Committed means that the data is safely stored in your local database.
- This leads us to the three main sections of a Git project: the working tree, the staging area, and the Git directory.
Working tree, staging area, and Git directory
Working tree, staging area, and Git directory (cont.)

- The working tree is a single checkout of one version of the project.
- These files are pulled out of the compressed database in the Git directory and placed on disk for you to use or modify.
- The staging area is a file, generally contained in your Git directory, that stores information about what will go into your next commit.
- Its technical name in Git parlance is the "index", but the phrase "staging area" works just as well.
- The Git directory is where Git stores the metadata and object database for your project.
- This is the most important part of Git, and it is what is copied when you clone a repository from another computer.
The basic Git workflow goes something like this:

1. You modify files in your working tree.
2. You selectively stage just those changes you want to be part of your next commit, which adds only those changes to the staging area.
3. You do a commit, which takes the files as they are in the staging area and stores that snapshot permanently to your Git directory (a minimal sketch of this cycle follows below).

- If a particular version of a file is in the Git directory, it is considered committed.
- If it has been modified and was added to the staging area, it is staged.
- And if it was changed since it was checked out but has not been staged, it is modified.
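
To tie the three steps together, here is a hedged sketch that drives the cycle from Python through the git command line; it assumes you run it inside an existing repository, and the file name "demo.txt" and the commit message are made up.

```python
# Walk through the basic workflow: modify a file in the working tree, stage it
# into the index, then commit the staged snapshot into the Git directory.
import subprocess
from pathlib import Path

def git(*args):
    # Run a git command in the current repository and echo its output.
    result = subprocess.run(["git", *args], capture_output=True, text=True, check=True)
    print(result.stdout)

# 1. Modify a file in the working tree (the file is now in the "modified" state).
Path("demo.txt").write_text("a change made in the working tree\n")

# 2. Stage just that change (it is now recorded in the staging area / index).
git("add", "demo.txt")

# 3. Commit: the staged snapshot is stored permanently in the Git directory.
git("commit", "-m", "Describe the change")

# The change is now "committed"; git status should report nothing left to stage.
git("status", "--short")
```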
