
Unit-5 Notes

The document provides an introduction to Git, a distributed version control system that enables collaboration among developers by tracking changes in source code. It outlines Git's architecture, core concepts such as repositories, commits, branches, and merging, as well as essential commands for using Git effectively. Additionally, it discusses branching strategies, particularly GitFlow, and the process of merging branches, including conflict resolution and the importance of remote repositories for project management.


Introduction to GIT and Its Architecture

What is GIT?
GIT is a distributed version control system used to track changes in source code during
software development. It allows multiple developers to work on the same project without
overwriting each other’s work.

Features of GIT:

• Distributed system (every developer has a complete local copy)
• Fast performance
• Data integrity
• Supports non-linear workflows (branches, merges)

GIT Architecture:

• Working Directory: Local workspace where files are modified.
• Staging Area (Index): Prepares changes for commit.
• Repository (.git): Where all commits and branches are stored.
• Remote Repository: Shared server for team collaboration (e.g., GitHub, GitLab).

Important GIT Commands:

• git init, git clone, git add, git commit, git push, git pull, git status, git log
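These commands combine into a short session. The sketch below is illustrative (the directory, file name, and identity values are invented for the example); it initializes a repository, stages a file, and records a commit:

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"

git init -q demo && cd demo               # create a new repository
git config user.email "dev@example.com"   # identity for commits (example values)
git config user.name  "Demo Dev"

echo "hello" > readme.txt
git add readme.txt                        # stage the new file
git commit -q -m "Initial commit"         # record a snapshot

git status --short                        # nothing left to commit
git log --oneline                         # shows the single recorded commit
```

Running git log --oneline at the end shows one commit with its abbreviated hash and message.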


Git is a powerful and widely used version control system that helps developers track changes
in their code, collaborate with others, and manage project history effectively. Whether you are
a professional developer or just starting out, understanding Git is important for modern
software development. This article will introduce you to Git, explain its core concepts, and
provide practical examples to help you get started.

Why Use Git?

1. Collaboration: Git enables multiple developers to work on the same project simultaneously. Changes can be merged seamlessly, and conflicts can be resolved easily.
2. History Tracking: Every change is recorded, allowing you to revert to previous versions of your code if something goes wrong.
3. Branching and Merging: Git allows you to create branches for new features or experiments without affecting the main codebase. Once the feature is ready, it can be merged back into the main branch.
4. Distributed Development: Each developer has a complete copy of the repository, including its history. This decentralization enhances collaboration and backup capabilities.

Core Concepts of Git

1. Repositories
A repository (or repo) is a storage space where your project files and their history are kept.
There are two types of repositories in Git:
• Local Repository: A copy of the project on your local machine.
• Remote Repository: A version of the project hosted on a server, often on platforms like
GitHub, GitLab, or Bitbucket.

2. Commits
A commit is a snapshot of your project at a specific point in time. Each commit has a unique
identifier (hash) and includes a message describing the changes made. Commits allow you to
track and review the history of your project.

3. Branches
A branch is a separate line of development. The default branch is called main or master. You
can create new branches to work on features or fixes independently. Once the work is
complete, the branch can be merged back into the main branch.

4. Merging
Merging is the process of integrating changes from one branch into another. It allows you to
combine the work done in different branches and resolve any conflicts that arise.

5. Cloning
Cloning a repository means creating a local copy of a remote repository. This copy includes
all files, branches, and commit history.

6. Pull and Push
• Pull: Fetches updates from the remote repository and integrates them into your local repository.
• Push: Sends your local changes to the remote repository, making them available to others.
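Pull and push can be tried out entirely on one machine by letting a bare repository stand in for the shared server. Everything below (paths, branch name, identities) is example scaffolding, not a required setup:

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"

# A bare repository plays the role of the shared remote (what GitHub would host)
git init -q --bare -b main central.git

# Developer A clones the remote, commits, and pushes
git clone -q central.git dev-a && cd dev-a
git config user.email "a@example.com" && git config user.name "Dev A"
git checkout -q -b main
echo "v1" > app.txt
git add app.txt && git commit -q -m "Add app.txt"
git push -q -u origin main

# Developer B clones and receives the pushed history
cd .. && git clone -q central.git dev-b

# Developer A pushes an update; developer B pulls to re-synchronize
cd dev-a && echo "v2" > app.txt && git commit -q -am "Update app.txt" && git push -q
cd ../dev-b && git pull -q origin main
cat app.txt        # prints: v2
```

The same clone/commit/push/pull cycle works unchanged against a hosted remote; only the clone URL differs.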

Getting Started with Git

1. Installing Git
Git can be installed on Windows, macOS, and Linux, either from the official Git website or through your system's package manager.

2. Basic Git Commands

1. Initialize a Repository – git init
To start using Git in a project, you need to initialize a repository:
git init
This command creates a new Git repository in your project's directory.

2. Clone a Repository – git clone
Creates a local copy of an existing repository:
git clone https://github.com/username/repo.git

3. Check Repository Status – git status
To check the status of your repository:
git status
This command shows changes, staged files, and the current branch.

4. Add Changes – git add
To stage changes for the next commit:
git add <file-name>
Or to add all changes:
git add .

5. Commit Changes – git commit
Records the staged changes with a message:
git commit -m "Initial commit"
Include a descriptive message to explain what changes were made.

6. Create a Branch – git branch
Creates a new branch without switching to it:
git branch <branch-name>

7. Switch Branches – git checkout
To switch to a different branch:
git checkout <branch-name>

8. Pull Changes – git pull
Fetches and integrates changes from a remote repository:
git pull origin main

9. Merge Branches – git merge
Merges changes from the named branch into the current branch:
git merge <branch-name>

10. Push Changes – git push
To push your changes to the remote repository:
git push origin <branch-name>

Getting Started with Git

1. Install Git: Download and install Git from the official Git website.
2. Configure Git: Set up your username and email.
git config --global user.name "Your Name"
git config --global user.email "your.email@example.com"
3. Create a Repository: Navigate to your project directory and initialize a Git repository.
git init
4. Make Your First Commit: Add files to the staging area and commit your changes.
git add .
git commit -m "Initial commit"

Branching strategies In Git



Branches are independent lines of work, stemming from the original codebase. Developers
create separate branches for independently working on features so that changes from other
developers don't interfere with an individual's line of work. Developers can easily pull changes
from different branches and also merge their code with the main branch. This allows easier
collaboration for developers working on one codebase.
Git branching strategies are essential for efficient code management and collaboration within
development teams. In this comprehensive guide, we will delve into the various Git branching
strategies, their benefits, implementation steps, and best practices.

Key Terminologies

• Git Branch: A parallel version of the code within a Git repository, allowing for separate
development and experimentation.
• Main Branch (formerly Master Branch): The primary branch of a Git repository where
the production-ready code resides.
• Feature Branch: A branch created to work on a specific feature or task isolated from the
main branch.
• Merge: The process of combining changes from one branch into another.
• Pull Request (PR): A request made by a developer to merge their changes into another
branch, often used for code review.
• CI/CD Pipeline: Continuous Integration and Continuous Deployment pipeline,
automating the process of building, testing, and deploying code changes.

What Is A Branching Strategy?

A branching strategy is the set of conventions a software development team adopts for writing, merging, and deploying code with the help of a version control system like Git. It lays down rules that guide developers through the development process and their interactions with a shared codebase. Such strategies are essential: they keep project repositories organized and error-free, and they help avoid the dreaded merge conflicts that arise when multiple developers simultaneously push and pull code from the same repository.
Encountering merge conflicts can impede the swift delivery of code, thereby obstructing the
establishment and upkeep of an efficient DevOps workflow. DevOps aims to facilitate a rapid
process for releasing incremental code changes. Therefore, implementing a structured
branching strategy can alleviate this challenge, enabling developers to collaborate seamlessly
and minimize conflicts. This approach fosters parallel workstreams within teams, promoting
quicker releases and reduced likelihood of conflicts through a well-defined process for source
control modifications.
Branching strategies provide the following benefits:
• Parallel development
• Enhanced productivity through efficient collaboration
• Organized and structured feature releases
• A clear path for the software development process
• Isolated bug fixing that does not disrupt the development workflow

Step By Step Implementation Of Creating A Branch


The following are the steps for creating a branch:

Step 1: Create a Branch
• Create a branch with the name you want to specify; here we name the branch "new-feature".
git branch new-feature

Step 2: Navigate to the Branch
• Now switch to the new-feature branch from the current branch with the following command:
git checkout new-feature

Step 3: Create and Navigate to a Branch in One Command
• Alternatively, the following single command creates the branch and switches to it:
git checkout -b new-feature

Step 4: Check the Current Branch
• Execute the following command to check which branch you are currently on:
git branch

Step 5: Delete a Branch
• Ensure you are not on the branch you want to delete (switch to another branch first), then run:
git branch -d <branch-to-delete>
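The five steps above can be run end-to-end as the following sketch (the branch name new-feature matches the example above; an empty commit stands in for real work, and you must switch away from a branch before deleting it):

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q demo && cd demo
git config user.email "dev@example.com" && git config user.name "Demo Dev"
git commit -q --allow-empty -m "Initial commit"   # a branch needs at least one commit

git branch new-feature            # Step 1: create the branch
git checkout -q new-feature       # Step 2: switch to it
git branch --show-current         # Step 4: prints: new-feature

git checkout -q -                 # switch back to the original branch...
git branch -d new-feature         # Step 5: ...then delete the fully merged branch
```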

Common Git Branching Strategies


The following are the common git branching strategies:

Gitflow Workflow
GitFlow enables parallel development, where developers can work separately on feature
branches, where a feature branch is created from a master branch. After completion of
changes, the feature branch is merged with the master branch.
The types of branches that can be present in GitFlow are:
• Master - Used for product release
• Develop - Used for ongoing development
• Feature Branching - branches off the develop branch to develop new features.
• Release - Assist in preparing a new production release and bug fixing, typically branched
from the develop branch, and necessitating merges back into both develop and master
branches.
• Hotfix - Hotfix branches aid in addressing discovered bugs swiftly, allowing developers
to continue their work on the develop branch while the issue is resolved. Unlike release
branches, hotfix branches are created from master branch specifically for critical bug
resolution in the production release.
The Master and Develop branches are the main branches, and persist throughout the journey
of the software. The other branches are essentially supporting branches and are short-lived.
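A minimal sketch of this branch layout is shown below. The branch names (feature/login, release/1.0) are illustrative, and empty commits stand in for real work; a real GitFlow setup would also merge the release back into develop:

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q -b master product && cd product
git config user.email "dev@example.com" && git config user.name "Demo Dev"
git commit -q --allow-empty -m "Initial release"

git branch develop                          # long-lived development branch
git checkout -q -b feature/login develop    # feature branches come off develop
git commit -q --allow-empty -m "Add login feature"

git checkout -q develop                     # finished feature merges into develop
git merge -q --no-ff -m "Merge feature/login into develop" feature/login

git checkout -q -b release/1.0 develop      # release branch prepared from develop
git checkout -q master                      # release merges into master for shipping
git merge -q --no-ff -m "Release 1.0" release/1.0
git branch --list                           # shows develop, feature/login, master, release/1.0
```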
Pros Of Gitflow
• Facilitates parallel development, ensuring production code stability while developers work on separate branches.
• Organizes work effectively with separate branches for specific purposes.
• Ideal for managing multiple versions of production code.
• GitFlow streamlines the release management process, expediting the rollout of new features and bug fixes.
• By advocating for feature-based development through individual branches, GitFlow fosters independent feature implementation. This approach allows seamless merging of features into the main codebase, minimizing conflicts.
• GitFlow offers a well-defined procedure for addressing bugs and deploying hotfixes, facilitating their rapid integration into production environments.
Cons Of Gitflow
• Complexity increases as more branches are added, potentially leading to difficulties in management.
• Merging changes from development branches to the main branch requires multiple steps, increasing the chance of errors and merge conflicts.
• Debugging issues becomes challenging due to the extensive commit history.
• GitFlow's complexity may slow down the development process and release cycle, making it less suitable for continuous integration and continuous delivery.

Git – Merge

Git is a powerful version control system that helps developers manage code versions
efficiently. One of the most fundamental operations in Git is merging, which allows you to
integrate changes from one branch into another.
What is Git Merge?
Git merge is a command used to combine the changes from two different branches into one.
It helps in integrating work from multiple branches, ensuring that all changes are integrated
and up-to-date in the target branch without losing any progress made on either branch.
• Preserves History: Keeps commit history of both branches.
• Automatic and Manual: Automatically merges unless there are conflicts.
• Fast-Forward Merge: Moves the branch pointer forward if no diverging changes exist.
• Merge Commit: Creates a special commit to combine histories.
• No Deletion: Branches remain intact after merging.
• Used for Integration: Commonly integrates feature branches into main branches.

Syntax
git merge <branch-name>

How Does Git Merge Work?


Git merging combines the sequences of commits stored in multiple branches into a unified history, that is, into a single branch.

• Common Base: When merging two branches, Git looks for the common base commit.
• Merge Commit: Once it finds the common base, Git creates a new “merge commit” to
combine the changes.
• Conflict Resolution: If there are any conflicts between the branches, Git will prompt you
to resolve them.

• In our case, we have two branches: the default branch called "main" and another branch named "dev".
• Git finds the common base of the two branches, creates a new merge commit, and merges them.
• A git merge operation is performed with the command below. Git always merges into the current branch, the one from which we run the command (in our case, "main"); the branch being merged ("dev") is not affected.
git merge dev

Types of Merging in Git


Git supports several types of merging. The two primary types are:

1. Fast-forward merging
• This occurs when the current branch (e.g., main) has not moved since the feature branch (e.g., dev) was created, so the feature branch is directly ahead of it.
• Instead of creating a merge commit, Git simply moves the current branch's tip forward to the feature branch's tip.
• Fast-forward merging is only possible when the branches haven't diverged.


2. Three-way merging
• This type occurs when the base branch has changed since the branch was first created.
• Git generates a new merge commit by comparing changes in both branches with the base
branch.
Note: Git also supports other merge strategies such as recursive and octopus merging. With the help of a single merge commit, "octopus merging" can merge multiple branches at once. "Recursive merging" is similar to three-way merging but can handle more complex merge operations.
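Both merge types can be observed in a scratch repository. The branch names below are illustrative, and empty commits stand in for real changes:

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q demo && cd demo
git config user.email "dev@example.com" && git config user.name "Demo Dev"
git commit -q --allow-empty -m "Base"

# Fast-forward: the original branch has not moved since dev branched off
git checkout -q -b dev
git commit -q --allow-empty -m "Work on dev"
git checkout -q -
git merge -q dev                   # fast-forward: no merge commit is created
[ "$(git rev-parse HEAD)" = "$(git rev-parse dev)" ] && echo "fast-forwarded"

# Three-way: both branches have new commits since the common base
git checkout -q -b dev2
git commit -q --allow-empty -m "Work on dev2"
git checkout -q -
git commit -q --allow-empty -m "Work on the original branch"
git merge -q -m "Merge dev2" dev2  # creates a merge commit with two parents
git log -1 --pretty=%P             # the merge commit lists two parent hashes
```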
Essential Commands To Perform Git Merging
To perform a git merge, you need a Git repository with at least two branches. Here's how to proceed:
• Create a new branch:
git branch <new-branch-name>
• Merge two branches: First, check out the target branch, then run:
git merge <branch-to-merge>
• After a successful merge, both branches contain the same commits.

Steps To Merge a Branch


To ensure smooth merging, follow these steps:

Step 1: Create a New Branch
Create a new branch for the changes you want to merge:
git checkout -b <new-branch-name>

Step 2: Pull the Latest Changes
Before merging, ensure that you pull the latest changes from both branches (e.g., main and the feature branch):
git checkout <target-branch>
git pull origin <target-branch>
git checkout <feature-branch>
git pull origin <feature-branch>

Step 3: Merge the Branch
git checkout <target-branch>
git merge <feature-branch>
If any conflicts arise, Git will notify you. Resolve them manually before proceeding.

Step 4: Test the Merged Code
Make sure the merged code functions correctly by testing it either automatically or manually.
# Run tests or manually test your application

Step 5: Commit the Merged Code
If the merge paused for conflict resolution, commit the result once you are satisfied:
git commit -m "Merge branch 'dev' into main"

Step 6: Push the Merged Branch
Push the changes to the remote repository to share the merged result:
git push origin main

How To Resolve Merge Conflicts?
Git prompts you to resolve conflicts manually when changes are made to the same part of the code in both branches. Here's how to resolve conflicts:

Step 1: Identify the conflicted files.
Git will display the files that have merge conflicts. These files need manual resolution.

Step 2: Open the conflicted files.
Use your preferred editor to open the conflicting files. Look for conflict markers:
<<<<<<< HEAD
// Code from the current branch
=======
// Code from the merging branch
>>>>>>> branch-name

Step 3: Resolve the conflicts.
Edit each conflicted section, keeping the relevant changes and removing the conflict markers.

Step 4: Stage the resolved files.
Once resolved, add the files to the staging area:
git add <file-name>

Step 5: Commit and push the changes.
After resolving the conflicts, commit the changes with a message describing how they were resolved:
git commit -m "message"
Then push the changes to the remote repository:
git push
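The whole conflict cycle, from creating the conflict to committing the resolution, can be sketched as follows (file contents and branch names are invented for the example; the push step is omitted since no remote exists in this scratch repository):

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q demo && cd demo
git config user.email "dev@example.com" && git config user.name "Demo Dev"

echo "original line" > app.txt
git add app.txt && git commit -q -m "Base"

git checkout -q -b dev
echo "change from dev" > app.txt
git commit -q -am "Dev change"

git checkout -q -
echo "change from the original branch" > app.txt
git commit -q -am "Original-branch change"

# Both branches edited the same line, so the merge stops with a conflict
git merge dev || true
grep -c '<<<<<<<' app.txt            # the file now contains conflict markers

# Resolve by choosing the final content, then stage and commit
echo "merged line" > app.txt
git add app.txt
git commit -q -m "Resolve merge conflict with dev"
```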

What Is a Remote Repository?



GitHub repositories act as central places for storing project files while maintaining the versions created during development. Using GitHub repositories, developers can organize, monitor, and save code changes to their projects in a remote environment. Files in a GitHub repository are cloned onto the user's local machine for further updates and modifications to their content. In this article, we will go through a detailed understanding of the GitHub repository and its workflow.

What Is Git?
Git is a distributed version control system that is used to store the source code in software
development to track its changes. It facilitates the developers to work collaboratively with
teams on repositories to manage the codebase versions i.e., maintaining the history of project
modifications. On using git, developers can seamlessly move through the different project
states, and merge changes efficiently ensuring a streamlined and organized approach to
software development.

Features Of Git
Efficient local operations, secure version control, flexible workflows, and collaborative tools let developers use Git effectively in diverse ways. The following sections discuss some of Git's features.

1. Performance Of Git
• Local Operations: Git performs most operations locally, boosting speed and efficiency.
• Lightweight Branching: Branches can be created and merged quickly, enabling parallel development.
• Optimized Merging: Git provides efficient methods for integrating modifications from several branches.

2. Git Security
• Data Integrity: Git's history is tamper-resistant, ensured via cryptographic hashing (every commit is identified by a hash of its contents).
• Access Controls: User access can be restricted by defining permissions, preventing unauthorized modifications.
• Secure Protocols: Data can be exchanged securely over the SSH and HTTPS protocols.

3. Flexibility Of Git
• Decentralized Development: Developers can work independently using complete local copies, promoting flexibility.
• Branching Strategies: Git provides flexible branching techniques designed to accommodate a wide range of development processes.
• Experimentation And Rollback: It is simple to experiment in separate branches and roll back changes if needed.

Version Control With Git

A VCS, or Version Control System, is used to track versions of files and store them in a specific place known as a repository. The process of copying the content of an existing Git repository with the help of Git tools is termed cloning. Once the cloning process is done, the user has the complete repository on their local machine, and Git assumes the user intends to work on this copy. Users can also create a new repository or delete an existing one; to delete a repository, the simplest way is to delete the folder containing it. Based on how they are used on a server, repositories can be divided into two types:

• Bare Repositories: These repositories are used to share the changes that are done by
different developers. A user is not allowed to modify this repository or create a new
version for this repository based on the modifications done.
• Non-bare Repositories: Non-bare repositories are user-friendly and hence allow the user
to create new modifications of files and also create new versions for the repositories. The
cloning process by default creates a non-bare repository if any parameter is not specified
during the clone operation.
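The difference can be seen directly with git rev-parse. The repository names below are illustrative; a bare clone contains only the version history (HEAD, refs, objects), with no checked-out project files:

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q work && cd work                        # non-bare: has a working tree
git config user.email "dev@example.com" && git config user.name "Demo Dev"
git commit -q --allow-empty -m "Initial commit"
cd ..

git clone -q --bare work shared.git                # bare: history only, for sharing
ls shared.git                                      # HEAD, refs, objects, ... no project files
git -C work rev-parse --is-bare-repository         # prints: false
git -C shared.git rev-parse --is-bare-repository   # prints: true
```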

Understanding The Working Tree In A Git Repository


A working tree in a Git repository is the collection of files that originate from a certain version of the repository. It helps in tracking the changes made by a specific user on one version of the repository. When the user commits, Git considers only the files placed in the staging area, not every modified file. The user of the working tree can change files by modifying existing files and by removing or creating files. A file in the working tree of a repository can be in one of a few states:

• Untracked: The file has never been staged or committed. It is present in the working directory, but Git is unaware of its existence.
• Tracked: The file has been committed at some point in the repository's history and has no changes waiting to be staged.
• Staged: The file's changes have been placed in the staging area and are ready to be included in the next commit.
• Modified/Dirty: The file has been changed, but the change is not yet staged.
After the changes are made in the working area, the user can either commit them to the Git repository or revert them.
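These file states map directly onto the two-letter codes of git status --porcelain, as the following sketch shows (file names and contents are examples):

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q demo && cd demo
git config user.email "dev@example.com" && git config user.name "Demo Dev"

echo "v1" > notes.txt
git status --porcelain            # "?? notes.txt"  -> untracked

git add notes.txt
git status --porcelain            # "A  notes.txt"  -> staged (new file)

git commit -q -m "Add notes"
git status --porcelain            # empty output    -> tracked and clean

echo "v2" > notes.txt
git status --porcelain            # " M notes.txt"  -> modified but not staged
```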

Overview Of Git Repository Operations


A GIT repository allows performing various operations on it, to create different versions of a
project. These operations include the addition of files, creating new repositories, committing
an action, deleting a repository, etc. These modifications will result in the creation of different
versions of a project.
Git Repository Areas (Working Area, Staging Area And Commit Area)

After performing various modifications on a file in the Working Area, GIT needs to follow two
more steps to save these changes in the local repository. These steps are:
1. Adding the changes to the Index(Staging Area)
2. Committing the indexed changes into the repository

Moving From Working Area To Staging Area Of A Git Repository

Adding changes to the index is done with the git add command. When changes have been made in the working tree, they need to be added to the staging area before they can be committed. The git add command stages the specified files for the next commit.

Syntax And Usage Of `git add`

$ git add file_name
The following are the different ways to use the add command:
• To add all changed files in the repository (including deletions) to the staging area:
$ git add --all
• To add all changes in the current directory to the staging area:
$ git add .
• To add all files with the .txt extension in the current directory to the staging area:
$ git add *.txt
• To add all .txt files in the docs directory to the staging area:
$ git add docs/*.txt
• To add all files in a particular directory (docs) to the staging area:
$ git add docs/
• To add all .txt files in the entire project (including subdirectories) to the staging area:
$ git add "*.txt"
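Two of these pathspec forms can be observed in a scratch repository (file and directory names below are examples): a quoted "*.txt" reaches into subdirectories, while docs/ stages only that directory:

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q demo && cd demo
git config user.email "dev@example.com" && git config user.name "Demo Dev"
git commit -q --allow-empty -m "Initial commit"

mkdir docs
echo a > a.txt
echo b > docs/b.txt
echo c > main.c

git add "*.txt"                    # quoted pathspec: .txt files in the whole project
git diff --cached --name-only      # lists a.txt and docs/b.txt

git reset -q                       # unstage everything again
git add docs/                      # stage everything under docs/
git diff --cached --name-only      # lists only docs/b.txt
```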

Moving From Staging Area To Commit Area In A Git Repository

Committing is performed on the files that were added to the index with the git add command. It is done with the git commit command, which records the staged changes in the local repository.
Syntax And Usage Of `git commit`
$ git commit -m "Add existing file"
• The -m flag provides a commit message describing the change, making it easy to remember what was done.
Cloning And Synchronizing With Remote Repositories
Git allows the users to perform operations on the Repositories by cloning them on the local
machine. This will result in the creation of various different copies of the project. These copies
are stored on the local machine and hence, the users will not be able to sync their changes with
other developers. To overcome this problem, Git allows performing syncing of these local
repositories with the remote repositories. This synchronization can be done by the use of two
commands in the Git listed as follows:
• push
• pull

Git Push And Pull Commands

Git Push
This command uploads the commits of the current branch to the tracked remote repository, making them available to other developers.

Syntax
$ git push -u origin master
This pushes all the contents of the local repository that belong to the master branch to the server (the remote repository). The -u flag sets origin/master as the upstream branch for future pushes and pulls.

Git Pull

The pull command fetches commits from a remote repository and merges them into the current local branch. Other users may make changes on their copies of the repository and upload them to the remote repository; in that case, your copy of the repository becomes out of date. To re-synchronize your copy with the remote repository, simply run git pull to fetch and integrate the remote content.

Syntax
$ git pull
Additional Git Commands
Git Status
It is used for checking the status of the Git repository: whether files are committed, staged, or untracked.
Syntax
$ git status
Git Log
It is used to view the history of changes made in the repository, providing information on contributors and their contributions.
Syntax
$ git log
.gitignore
Use a .gitignore file if you want Git to ignore certain files (for example, build artifacts or credentials). Simply create a .gitignore file and list the file names or patterns you want to ignore.
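A quick sketch (file names invented): files matching .gitignore patterns simply never show up as untracked, and git check-ignore confirms which rule applies:

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q demo && cd demo

echo "app.log" > .gitignore        # ignore a specific file
echo "*.tmp"  >> .gitignore        # ignore a whole pattern

touch app.log scratch.tmp keep.txt
git status --porcelain             # only .gitignore and keep.txt appear as untracked
git check-ignore -v app.log        # shows which .gitignore rule matched
```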
Git Merge
It is used to merge two branches without losing data. It merges the specified branch into the current branch.
Syntax
$ git merge <branch-name>
Git Checkout
It can be used to roll back to a previous version of the project that was committed earlier. Copy the commit hash from the git log output and use it to check out that version.
Syntax
$ git checkout <hash-code>
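A short sketch of rolling back (file contents invented): checking out a hash puts you in a detached HEAD state at that old version, and checkout - returns to the branch tip:

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q demo && cd demo
git config user.email "dev@example.com" && git config user.name "Demo Dev"

echo "v1" > app.txt && git add app.txt && git commit -q -m "Version 1"
echo "v2" > app.txt && git commit -q -am "Version 2"

old=$(git log --pretty=%H | tail -n 1)    # hash of the first commit, as seen in git log
git checkout -q "$old"                    # detached HEAD at the old version
cat app.txt                               # prints: v1

git checkout -q -                         # return to the branch tip
cat app.txt                               # prints: v2
```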

How to Install MySQL on Linux?



MySQL is an open-source relational database management system that is based on SQL
queries. Here, “My” represents the name of the co-founder Michael Widenius’s
daughter and “SQL” represents the Structured Query Language. MySQL is used for data
operations like querying, filtering, sorting, grouping, modifying, and joining the tables
present in the database.
In this article, we are going to download and install MySQL on Linux and verify the
installation by creating a database. But before that, let's see some features of
MySQL.

Features of MySQL

MySQL is one of the most popular RDBMSs. It offers various features:
• It is easy to use and free of cost to download.
• It contains a solid data security layer to protect important data.
• It is based on client and server architecture.
• It supports multithreading which makes it more scalable.
• It is highly flexible and supported by multiple applications.
• MySQL is fast, efficient, and reliable.
• It is compatible with many operating systems like Windows, MacOS, Linux, etc.

Steps to Install MySQL on Linux


For almost every Linux system, the following commands are used to install MySQL:

Installing MySQL on Linux using Terminal

Step 1: Open terminal using Ctrl+Alt+T. Now copy and paste the following command in the
terminal to install MySQL in Linux.
sudo apt install mysql-server

Then give your password and hit ENTER.


Step 2: Press “y” to continue.

It will take some time to download and install MySQL in Linux.


Verify MySQL Installation
Step 3: To verify the MySQL installation or to know the version enter the following commands
in your Terminal.
mysql --version

Protecting and Securing MySQL


Step 4: Now we will set the VALIDATE PASSWORD component.
sudo mysql_secure_installation
Step 5: Then press “y” to set the password. Next press “0” for the low-level password or
choose as you want to set the password.

Step 6: Create a password, re-enter it, and then press "y" to continue.
Now the whole setup is done. Hence, the MySQL installation is successfully completed!

Start Using MySQL


To get started with MySQL, type the following command to log in as the MySQL root user:
sudo mysql -u root
Let’s create a database using the following two commands:
Command 1: create database database_name;
Command 2: show databases;

Hence, we have successfully created a database using the create database command. You are now ready to start using MySQL. Many companies use MySQL because of its solid data security and broad application support.

How to Install and Configure MongoDB in Ubuntu?


MongoDB is a popular NoSQL database offering flexibility, scalability, and ease of use. Installing and configuring MongoDB in Ubuntu is a straightforward process, but it requires careful attention to detail to ensure a smooth setup.
In this article, we'll learn how to install and configure MongoDB in Ubuntu, walking through each step from installation to configuration. Let's look at the requirements for installing MongoDB in Ubuntu.

Requirements to Install and Configure MongoDB in Ubuntu


MongoDB 7.0 Community Edition supports the following 64-bit Ubuntu LTS (long-term
support) releases on x86_64 architecture:

• Ubuntu 22.04 LTS (“Jammy”)


• Ubuntu 20.04 LTS (“Focal”)

Steps to Install and Configure MongoDB in Ubuntu


MongoDB can be installed on Ubuntu with a few terminal commands, which makes the
installation process straightforward. Follow the steps given
below to install MongoDB:

Step 1: First, update and upgrade the system repositories. Type
the following command in your terminal and then press Enter.
sudo apt update && sudo apt upgrade

Step 2: Now, install the MongoDB package using ‘apt‘. Type the following command and
press Enter. (Note: on newer Ubuntu releases the distribution’s “mongodb” package may be
unavailable; in that case, install MongoDB’s official “mongodb-org” packages by following
the instructions on mongodb.com.)
sudo apt install -y mongodb
Step 3: Check the service status for MongoDB with the help of following command:
sudo systemctl status mongodb

systemctl shows that the MongoDB server is up and running.


Step 4: Now check if the installation process is done correctly and everything is working
fine. Go through the following command:
mongo --eval 'db.runCommand({ connectionStatus: 1 })'
The value “1” in the “ok” field indicates that the server is working properly, with no errors.
Step 5: MongoDB services can be started and stopped with the use of following commands:
To stop running the MongoDB service, use command :
sudo systemctl stop mongodb
MongoDB service has been stopped and can be checked by using the status command:
sudo systemctl status mongodb

As can be seen, the service has stopped; to start it again, we can use:
sudo systemctl start mongodb
Step 6: Accessing the MongoDB Shell
MongoDB provides a command-line interface called the MongoDB shell, which allows us to
interact with the database.
To access the MongoDB shell, simply type the following command in your terminal:
mongo
You are now connected to the MongoDB server and can start executing commands to
create databases, collections, and documents.
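To make the shell workflow concrete, the insert/find pattern you would use in the MongoDB shell can be sketched with plain Python dicts. The Collection class below is a hypothetical stand-in for illustration only, not MongoDB's real API (for real access from Python you would use the pymongo library).

```python
# Hypothetical in-memory stand-in for a MongoDB collection.
# This is NOT MongoDB's API; it only illustrates the document model.

class Collection:
    def __init__(self):
        self._docs = []

    def insert_one(self, doc):
        """Store a copy of a JSON-like document (a plain dict)."""
        self._docs.append(dict(doc))

    def find(self, query=None):
        """Return documents matching every key/value pair in query."""
        query = query or {}
        return [d for d in self._docs
                if all(d.get(k) == v for k, v in query.items())]

users = Collection()
users.insert_one({"name": "alice", "age": 30})
users.insert_one({"name": "bob", "city": "Pune"})  # different fields are fine

print(users.find({"name": "alice"}))  # -> [{'name': 'alice', 'age': 30}]
```

Note how the two documents have different fields: this is the flexible-schema idea described in the features below.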

Features of MongoDB

1. Document-Oriented: MongoDB stores data in flexible, JSON-like documents called BSON


(Binary JSON), which allows for easy storage and retrieval of complex data structures.
2. Scalability: MongoDB is designed to scale horizontally across multiple servers, making it
suitable for handling large volumes of data and high-traffic applications.
3. High Performance: MongoDB’s architecture is optimized for high performance, with
features such as indexing, sharding, and replica sets to ensure fast data retrieval and processing.
4. Flexible Schema: MongoDB does not enforce a strict schema like traditional relational
databases, allowing for dynamic schema evolution and easy handling of changing data
requirements.
5. Rich Query Language: MongoDB supports a powerful query language that allows for
complex queries, aggregations, and data manipulation operations.
6. High Availability: MongoDB provides features such as replica sets and automatic failover
to ensure high availability and data durability.
7. Horizontal Scaling: MongoDB supports horizontal scaling through sharding, allowing data
to be distributed across multiple servers to handle increased load and storage requirements.

Use Cases of MongoDB

1. Content Management and Delivery: MongoDB is well-suited for content management


systems and delivery platforms that require storing and serving large volumes of unstructured
content such as articles, images, and videos.
2. Real-Time Analytics: MongoDB’s flexible data model and high-performance capabilities
make it an excellent choice for real-time analytics applications, allowing organizations to
analyze large volumes of data and derive insights quickly.
3. Mobile and IoT Applications: MongoDB’s flexible schema and support for geospatial
queries make it ideal for storing and processing data from mobile devices and Internet of Things
(IoT) sensors.
4. E-commerce: MongoDB can be used to build scalable e-commerce platforms that require
handling complex product catalogs, user profiles, and transaction data.
5. Social Networking: MongoDB is well-suited for social networking applications that need
to store and process large amounts of user-generated content, relationships, and activity feeds.
6. Log and Event Data: MongoDB’s high write throughput and flexible data model make it
suitable for storing and analyzing log and event data generated by web servers, applications,
and IoT devices.
7. Catalog and Inventory Management: MongoDB can be used to build catalog and
inventory management systems for e-commerce, retail, and supply chain management
applications.

Introduction to Kali Linux



Kali Linux is a Debian-derived Linux distribution maintained by Offensive Security.
It was developed by Mati Aharoni and Devon Kearns. Kali Linux is an OS specially designed
for network analysts and penetration testers, or in simple words, for those who work under
the umbrella of cybersecurity and analysis. The official website of Kali Linux is Kali.org. It
gained popularity when it was practically used in the Mr. Robot series. It was not designed for
general-purpose use; it is meant for professionals and for those who know how to
operate Linux/Kali.

Advantages:
• It has 600+ penetration testing and network security tools pre-installed.
• It is completely free and open source, so you can use it for free and even contribute to its
development.
• It supports many languages.
• Great for those at an intermediate level in Linux who have hands-on experience with Linux commands.
• Can easily be used with a Raspberry Pi.

Disadvantages:
• It is not recommended for those who are new to Linux and want to learn Linux (as it is
penetration-oriented).
• It is a bit slower.
• Some software may malfunction.

Kali Linux – Information Gathering Tools



Information gathering means collecting different kinds of information about the target. It is
the first stage of ethical hacking, in which penetration testers or hackers (black hat or
white hat alike) try to gather all available information about the target in order to use it for
hacking. The more relevant information we gather about the target, the higher the
probability of a successful attack.
Information gathering is an art that every penetration tester (pen-tester) and hacker should
master for a better experience in penetration testing. It is also a method used by analysts to
determine the needs of customers and users; techniques that offer safety, utility, usability,
and learnability to collaborators encourage their collaboration, commitment, and honesty.
Various tools and techniques are available, including public sources such as Whois and
nslookup, which can help hackers gather user information. This step is very important
because attack techniques such as password guessing (brute force) require specific details
about the target, for example a pet’s name, a best friend’s name, an age, or a phone number.
Information gathering can be classified into the following categories:
• Footprinting
• Scanning
• Enumeration
• Reconnaissance

1. Nmap Tool
Nmap is an open-source network scanner that is used to recon/scan networks. It is used to
discover hosts, ports, and services along with their versions over a network. It sends packets
to the host and then analyzes the responses in order to produce the desired results. It could
even be used for host discovery, operating system detection, or scanning for open ports. It is
one of the most popular reconnaissance tools.
To use nmap:
• Ping the host with the ping command to get the IP address
ping hostname
• Open the terminal and enter the following command there.
nmap -sV ipaddress
Replace the IP address with the IP address of the host you want to scan.
• It will display all the captured details of the host.
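At its core, a port scan tries connections and interprets the responses. The sketch below is a minimal TCP "connect scan" in Python; it only hints at what nmap does (nmap adds SYN scanning, service/version detection, OS fingerprinting, and much more), and the host and ports shown are just examples.

```python
import socket

def scan_ports(host, ports, timeout=0.5):
    """Minimal TCP 'connect scan': return the ports that accept a connection.

    Real nmap goes much further (SYN scans, service probes, OS detection);
    this only shows the basic send-and-interpret-response idea.
    """
    open_ports = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            s.settimeout(timeout)
            # connect_ex returns 0 when the TCP handshake succeeds
            if s.connect_ex((host, port)) == 0:
                open_ports.append(port)
    return open_ports

# Example: scan a few well-known ports on the local machine.
# The result depends entirely on what is running on your system.
print(scan_ports("127.0.0.1", [22, 80, 443]))
```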

2. ZenMAP

Zenmap is another useful tool for the scanning phase of ethical hacking in Kali Linux. It is
the official graphical user interface (GUI) version of the Nmap tool: it performs the same
functions as Nmap, but through a GUI rather than the command line. It is a free utility for
network discovery and security auditing. Tasks such as network inventory, managing service
upgrade schedules, and monitoring host or service uptime are considered really useful by
systems and network administrators.
To use Zenmap, enter the target URL in the target field to scan the target.
3. whois lookup

whois is a database of records for all the registered domains on the internet. It is used for many
purposes, a few of which are listed below.
• It is used by network administrators to identify and fix DNS or domain-related
issues.
• It is used to check the availability of domain names.
• It is used to identify trademark infringement.
• It can even be used to track down the registrants of a fraudulent domain.
To use whois lookup, enter the following command in the terminal
whois geeksforgeeks.org
Replace geeksforgeeks.org with the name of the website you want to lookup.
4. SPARTA

SPARTA is a Python-based graphical user interface tool used in the scanning and
enumeration phases of information gathering. It is a toolkit with a collection of useful
tools for information gathering. It is used for many purposes, a few of which are listed below.
• It can export Nmap output to an XML file.
• It can automate running the Nikto tool against every HTTP service, or any other
service.
• It saves the scans of hosts you have scanned earlier, in order to save time.
• It can reuse passwords that have already been found and are not present in the wordlist.
To use SPARTA, enter the IP address of the host you want to scan in the host section to start
scanning.
5. nslookup
nslookup stands for nameserver lookup, which is a command used to get the information
from the DNS server. It queries DNS to obtain a domain name, IP address mapping, or any
other DNS record. It even helps in troubleshooting DNS-related problems. It is used for many
purposes, a few of them are listed below.
• To get the IP address of a domain.
• For reverse DNS lookup
• For lookup of any record
• Lookup for an SOA record
• Lookup for an NS record
• Lookup for an MX record
• Lookup for a TXT record
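For simple forward and reverse lookups, the effect of nslookup can be approximated with Python's standard socket module. This is a sketch only: querying MX, TXT, SOA, or NS records would need a full resolver library such as dnspython, and the hostname used here is a placeholder.

```python
import socket

def lookup(host):
    """Forward lookup: hostname -> IPv4 address (like `nslookup host`)."""
    return socket.gethostbyname(host)

def reverse_lookup(ip):
    """Reverse lookup: IPv4 address -> primary hostname (PTR-style)."""
    return socket.gethostbyaddr(ip)[0]

# "localhost" resolves via /etc/hosts, so no network is needed here;
# replace it with a real domain to query your configured DNS server.
print(lookup("localhost"))
```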
6. Osintgram

Osintgram is an OSINT tool for performing reconnaissance on Instagram, collecting and
analyzing account data. It offers an interactive shell to perform analysis on any user’s
account via their nickname. One can get:
• addrs: gets all addresses tagged in the target’s photos.
• captions: gets the captions of the target’s photos.
• comments: gets the total comments on the target’s posts.
• followers: gets the target’s followers.
• followings: gets the users followed by the target.
• fwersemail: gets the email addresses of the target’s followers.
• fwingsemail: gets the email addresses of users followed by the target.
• fwersnumber: gets the phone numbers of the target’s followers.
• fwingsnumber: gets the phone numbers of users followed by the target.
• hashtags: gets the hashtags used by the target.
Amazon Web Services (AWS)

Amazon Web Services (AWS) is one of the world’s most widely adopted cloud computing
platforms, offering Infrastructure as a Service (IaaS) and Platform as a Service (PaaS). AWS
offers on-demand computing services, such as virtual servers and storage, that can be used to build and
run applications and websites. AWS is known for its security, reliability, and flexibility,
which makes it a popular choice for organizations that need to store and process sensitive
data.

This AWS tutorial is designed for beginners and professionals to learn AWS’s basic and
advanced concepts. Learn about the various topics of AWS, such as introduction, history of
AWS, global infrastructure, features of AWS, IAM, storage services, database services,
application Services, etc., and other AWS products such as S3, EC2, Lambda, and more. By
the end of this tutorial, readers will have a basic understanding of what AWS is and how it can
be used to support their computing needs.

What is Amazon Web Service?


Amazon Web Services (AWS) is a cloud computing platform offered by Amazon. It provides
a wide range of on-demand services like computing power, storage, and databases, allowing
businesses to scale and manage their IT resources efficiently. AWS offers services such as EC2
for virtual servers, S3 for scalable storage, RDS for managed databases, and Lambda for
serverless computing. By using AWS, companies can reduce infrastructure costs, improve
flexibility, and deploy applications globally with ease.
Prerequisites to Learn AWS
Before jumping to the AWS Tutorial, it’s recommended to have a basic foundational
understanding of operating systems, computer networking, basic coding commands in Linux
terminals, and some prior knowledge of cloud computing.
What is Elastic Compute Cloud (EC2)?


EC2 stands for Elastic Compute Cloud, a service from Amazon Web Services (AWS). EC2 is
an on-demand computing service on the AWS cloud platform. It lets you rent virtual computers
to run your applications, and you pay only for what you use.
Instead of buying and managing your own servers, EC2 gives you a virtual machine where
you can run websites, apps, or even big data tasks. You can choose how much memory,
storage, and processing power you need, and stop the machine when you’re done. EC2 offers
security, reliability, high performance, and cost-effective infrastructure to meet demanding
business needs.
You can deploy your applications on EC2 servers without worrying about the underlying
infrastructure. You can configure an EC2 instance in a very secure manner by using VPCs,
subnets, and security groups, and you can scale it based on the demand of the application by
attaching an Auto Scaling group. The instance can then scale up and down with the incoming
traffic of the application.
The following figure shows the EC2-Instance which is deployed in VPC (Virtual Private
Cloud).

Use Cases of Amazon EC2 (Elastic Compute Cloud)


The following are the use cases of Amazon EC2:
1. Deploying Applications
You can deploy applications (such as .jar, .war, or .ear files) on an AWS EC2 instance
without maintaining the underlying infrastructure.
2. Scaling Applications
Once you have deployed your web application on an EC2 instance, you can scale it up or
down based on demand by scaling the EC2 instance.
3. Deploying ML Models
You can train and deploy your ML models on EC2 instances, because EC2 offers networking
of up to 400 Gbps and storage services purpose-built to optimize price performance for ML projects.
4. Hybrid Cloud Environment
You can deploy your web application on an EC2 instance and connect it to a database
deployed on on-premises servers.
5. Cost-Effectiveness
Amazon EC2 instances are cost-effective, so you can deploy even demanding workloads
such as gaming applications on them.
AWS EC2 Instance Types
Different Amazon EC2 instance types are designed for certain activities. Consider the unique
requirements of your workloads and applications when choosing an instance type. This might
include needs for computing, memory, or storage.
The AWS EC2 Instance types are as follows:
• General Purpose Instances
• Compute Optimized Instances
• Memory-Optimized Instances
• Storage Optimized Instances
• Accelerated Computing Instances

1. General Purpose Instances


• It provides balanced resources for a wide range of workloads.
• It is suitable for web servers, development environments, and small databases.
Examples: T3, M5 instances.

2. Compute Optimized Instances


• It provides high-performance processors for compute-intensive applications.
• Ideal for high-performance web servers, scientific modeling, and batch
processing.
Examples: C5, C6g instances.

3. Memory-Optimized Instances
• High memory-to-CPU ratios for large data sets.
• Perfect for in-memory databases, real-time big data analytics, and high-performance
computing (HPC).
Examples: R5, X1e instances.

4. Storage Optimized Instances


• It provides instances optimized for high, sequential read and write access to
large data sets.
• Best for data warehousing, Hadoop, and distributed file systems.
Examples: I3, D2 instances.

5. Accelerated Computing Instances


• It provides hardware accelerators or co-processors for graphics
processing and parallel computations.
• It is ideal for machine learning, gaming, and 3D rendering.
Examples: P3, G4 instances.

Features of AWS EC2 (Elastic Compute Cloud)


The following are the features of AWS EC2:

1. AWS EC2 Functionality


EC2 provides its users with a true virtual computing platform, where they can run various
operations and even launch another EC2 instance from this virtually created environment,
which increases the security of the virtual devices. Beyond creation, EC2 also lets us
customize our environment as per our requirements at any point during the life span of the
virtual machine. Amazon EC2 comes with a set of default AMI (Amazon Machine Image)
options supporting various operating systems along with some pre-configured resources like
RAM, ROM, storage, etc. Besides these default options, we can also create an AMI curated
with a combination of default and user-defined configurations. We can store this user-defined
AMI for future use, so that the next time the user won’t have to reconfigure a new AMI from
scratch; instead, the user can simply reference the stored image while creating a new EC2
machine.

2. AWS EC2 Operating Systems


Amazon EC2 includes a wide range of operating systems to choose from while selecting your
AMI. Users are not limited to these preset options: they can even upload their own operating
systems and select them while choosing an AMI during the launch of an EC2 instance.
Currently, AWS offers the following most-preferred operating systems on the EC2 console.

• Amazon Linux
• Windows Server
• Ubuntu Server
• SUSE Linux
• Red Hat Linux

3. AWS EC2 Software


Amazon leads the cloud computing market in part because of the variety of software options
EC2 offers its users. Users can choose from a wide range of software packages to run on
their EC2 machines, a service allocated to the AWS Marketplace on the AWS platform.
Numerous software products like SAP, LAMP, and Drupal are available on AWS to use.

4. AWS EC2 Scalability and Reliability


EC2 provides the facility to scale up or scale down as per need, so all dynamic scenarios
can be easily handled. Because of the flexibility of volumes and snapshots, it is highly
reliable for its users. Due to the scalable nature of the service, many organizations such as
Flipkart and Amazon rely on EC2 whenever huge traffic surges occur on their portals.

Pricing of AWS EC2 (Elastic Compute Cloud) Instance


Amazon EC2 offers several ways to pay for the cloud computing power you need, whether
you’re just getting started or running large-scale workloads. Here’s a breakdown of all the
pricing models available:

1. Free Tier
If you’re new to AWS, you can try EC2 for free with the Free Tier. You get up to 750 hours
per month of t2.micro instances for one year, which is perfect for learning, experimenting,
or running lightweight applications. If you exceed the free limits, you’ll only pay for what’s
above the Free Tier.

2. On-Demand Instances
With On-Demand Instances, you pay for the compute power you use, by the second, with
a minimum of 60 seconds. There’s no need to commit to a long-term contract or make any
upfront payments. This is ideal for applications that are unpredictable or for short-term use
cases where you only want to pay for what you need.

3. Savings Plans
If you know you’ll need consistent computing power, Savings Plans let you commit to a
certain level of usage over a 1- or 3-year term. By making this commitment, you can save
significantly compared to On-Demand pricing. The best part is that you can apply the
discount across a wide range of instances, so it’s flexible based on your needs.

4. Reserved Instances
Reserved Instances give you the chance to commit to a specific instance type and region
for 1 or 3 years. This helps reduce costs by reserving capacity in advance. It’s great for
steady applications that need reliable performance over a long period. Reserved Instances can
also give you significant discounts compared to On-Demand pricing.

5. Spot Instances
Spot Instances let you use spare EC2 capacity at a discount of up to 90% off the
On-Demand price. They’re perfect for flexible, interruptible tasks, like batch
processing or big data analysis. However, AWS can reclaim these instances with little
notice, so they work best for non-urgent workloads.

6. Dedicated Hosts
Dedicated Hosts provide you with a physical EC2 server fully dedicated to your use. This is
ideal if you have server-bound software licenses or need to meet
specific compliance requirements. You can also use this option as part of a Savings Plan to
save on costs. Dedicated Hosts allow you to have more control over the physical
infrastructure for your applications.

7. On-Demand Capacity Reservations


With On-Demand Capacity Reservations, you can reserve EC2 capacity in a
specific Availability Zone for any period of time you choose. This ensures you have the
compute resources you need during peak times or for workloads that need guaranteed
availability. Unlike Reserved Instances, you don’t have to commit to a long-term contract.

8. Per-Second Billing
EC2’s per-second billing means that you only pay for the exact compute time you use, down
to the second. There’s no need to pay for unused minutes or extra time, making it a more
cost-effective option for short-lived tasks or workloads that are dynamic in nature.
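The trade-offs between these pricing models come down to simple arithmetic. The sketch below compares them with placeholder figures: the hourly rate and discount percentages are illustrative assumptions, not real AWS prices.

```python
# Rough EC2 cost comparison. The hourly rate and discount percentages
# below are ILLUSTRATIVE ASSUMPTIONS, not real AWS prices.

ON_DEMAND_HOURLY = 0.10   # assumed on-demand $/hour for some instance type
HOURS_PER_MONTH = 730     # average hours in a month

def monthly_cost(hourly_rate, hours=HOURS_PER_MONTH):
    return hourly_rate * hours

def per_second_cost(hourly_rate, seconds):
    """Per-second billing, with the 60-second minimum charge applied."""
    return hourly_rate / 3600 * max(seconds, 60)

on_demand = monthly_cost(ON_DEMAND_HOURLY)
reserved = on_demand * (1 - 0.40)  # assume ~40% off for a 1-year commitment
spot = on_demand * (1 - 0.90)      # spot can reach up to ~90% off

print(f"On-Demand: ${on_demand:.2f}/month")
print(f"Reserved : ${reserved:.2f}/month")
print(f"Spot     : ${spot:.2f}/month")
print(f"10-minute job, per-second billed: ${per_second_cost(ON_DEMAND_HOURLY, 600):.4f}")
```

Running a short job under per-second billing costs a tiny fraction of a month of always-on capacity, which is why the billing model matters as much as the instance type.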

Create AWS Free Tier Account


Amazon Web Service(AWS) is the world’s most comprehensive and broadly adopted cloud
platform, offering over 200 fully featured services from data centers globally. Millions of
customers including the fastest-growing startups, largest enterprises, and leading government
agencies are using AWS to lower costs, become more agile, and innovate faster. AWS offers
new subscribers a 12-month free tier to get hands-on experience with all AWS cloud services.
To know more about how to create an AWS account for free refer to Amazon Web Services
(AWS) – Free Tier Account Set up.

Get Started With Amazon EC2 (Elastic Compute Cloud) Linux Instances
Step 1: First, log in to your AWS account. Once you are directed to the management console,
click “Services” on the left and, from the listed options, click EC2.

Step 2: Afterward, you will be redirected to the EC2 console, where you can explore its
various features.

Step 3: For a step-by-step guide to creating an EC2 instance, refer to Amazon EC2 –
Creating an Elastic Cloud Compute Instance.
Benefits of Amazon EC2
The following are the benefits of Amazon EC2:
• Scalability: Instances can easily be scaled up or down based on demand, ensuring
optimal performance and cost-efficiency.
• Flexibility: A wide variety of instance types and configurations match
different workload requirements and operating systems.
• Cost-Effectiveness: A pay-as-you-go model with options like On-Demand,
Reserved, and Spot Instances helps manage costs efficiently.
• High Availability and Reliability: Multiple geographic regions and Availability
Zones provide strong fault tolerance and disaster recovery.

What is Amazon S3?


Amazon S3 is the Simple Storage Service in AWS. It stores files of different types, like
photos, audio, and videos, as objects, providing high scalability and security. It allows
users to store and retrieve any amount of data at any point in time from anywhere on the
web, and it offers features such as extremely high availability, security, and simple
integration with other AWS services.

What is Amazon S3 Used for?


Amazon S3 is used for various purposes in the cloud because of its robust scaling and
data-security features. It supports use cases in fields such as
mobile/web applications, big data, machine learning, and many more. The following are a
few of the widest uses of the Amazon S3 service.

• Data Storage: Amazon S3 is a strong option for both small and large
storage workloads, letting data-intensive applications store and retrieve
data on demand with low latency.
• Backup and Recovery: Many Organizations are using Amazon S3 to backup their
critical data and maintain the data durability and availability for recovery needs.
• Hosting Static Websites: Amazon S3 can store HTML, CSS, and other web
content, allowing users and developers to host static websites with
low-latency access and cost-effectiveness. To know more, refer to this
article – How to host static websites using Amazon S3
• Data Archiving: Integration with the Amazon S3 Glacier service provides a
cost-effective solution for long-term storage of data that is accessed infrequently.
• Big Data Analytics: Amazon S3 is often used as a data lake because of its capacity
to store large amounts of both structured and unstructured data, offering seamless
integration with AWS analytics and AWS machine learning services.

Difference between Relational database and NoSQL


1. Relational Database :
RDBMS stands for Relational Database Management System. It is the most popular kind of
database. In it, data is stored in the form of rows, that is, in the form of tuples. It contains a
number of tables, and data can be easily accessed because it is stored in those tables. This
model was proposed by E.F. Codd.
2. NoSQL :
NoSQL stands for a non-SQL (non-relational) database. A NoSQL database doesn’t use
tables to store data the way a relational database does. It is used for storing and fetching
data, generally in large amounts, and it supports simple query interfaces designed to provide
better performance at scale.
Difference between Relational database and NoSQL :
• Velocity: A relational database handles data arriving at low velocity; NoSQL handles data
arriving at high velocity.
• Scalability: A relational database gives only read scalability; NoSQL gives both read and
write scalability.
• Data model: A relational database manages structured data; NoSQL manages all types of
data.
• Data sources: In a relational database, data arrives from one or a few locations; in NoSQL,
data arrives from many locations.
• Transactions: A relational database supports complex transactions; NoSQL supports
simple transactions.
• Failure: A relational database has a single point of failure; NoSQL has no single point of
failure.
• Volume: A relational database handles data in lower volumes; NoSQL handles data in high
volumes.
• Write locations: In a relational database, transactions are written in one location; in
NoSQL, transactions are written in many locations.
• ACID: A relational database supports ACID properties compliance; NoSQL generally
doesn’t support ACID properties.
• Schema changes: It is difficult to make changes to a relational database once its schema is
defined; NoSQL enables easy and frequent changes to the database.
• Schema: In a relational database, a schema is mandatory to store data; in NoSQL, schema
design is not required.
• Scaling: A relational database is deployed (scaled) vertically; NoSQL is deployed
horizontally.
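The schema difference in this comparison can be demonstrated with Python's built-in sqlite3 module standing in for a relational database, against a plain list of dicts standing in for a NoSQL document store. This is an illustration only; real NoSQL systems differ widely in how they store documents.

```python
import sqlite3

# Relational side: the schema is mandatory and strictly enforced.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES ('alice')")

schema_error = None
try:
    # 'hobby' is not in the schema, so the insert is rejected.
    conn.execute("INSERT INTO users (name, hobby) VALUES ('bob', 'chess')")
except sqlite3.OperationalError as e:
    schema_error = str(e)
print("Relational rejects unknown column:", schema_error)

# NoSQL-style side: documents need no fixed schema.
docs = []
docs.append({"name": "alice"})
docs.append({"name": "bob", "hobby": "chess"})  # extra field, no error
print("Document store holds:", docs)
```

The relational insert fails because the column was never declared, while the document store accepts records with any mix of fields, which is the schema row of the table above in action.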
