Cloud Computing Lab Manual Ccs335

cloud computing lab manual ccs335

Uploaded by

Shekina Satheesh

GOOD SHEPHERD

COLLEGE OF ENGINEERING AND TECHNOLOGY

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

LAB MANUAL

CCS335 – CLOUD COMPUTING LABORATORY

Regulation 2021

Year / Semester : III / V

Jun 2024 – Dec 2024

PREPARED BY

Mrs. K. SHEKINA, M.E.,
Assistant Professor / CSE

LIST OF EXPERIMENTS

1. Install VirtualBox/VMware Workstation with different flavours of Linux or Windows OS on top of Windows 7 or 8.
2. Install a C compiler in the virtual machine created using VirtualBox and execute simple programs.
3. Install Google App Engine. Create a hello world app and other simple web applications using Python/Java.
4. Use the GAE launcher to launch the web applications.
5. Simulate a cloud scenario using CloudSim and run a scheduling algorithm that is not present in CloudSim.
6. Find a procedure to transfer files from one virtual machine to another virtual machine.
7. Install a Hadoop single-node cluster and run simple applications like wordcount.
8. Create and execute your first container using Docker.


TABLE OF CONTENTS

S.NO. | DATE | EXPERIMENT TITLE | MARKS/10 | SIGN.
1.    |      | Install VirtualBox/VMware Workstation with different flavours of Linux or Windows OS on top of Windows 7 or 8. | |
2.    |      | Install a C compiler in the virtual machine created using VirtualBox and execute simple programs. | |
3.    |      | Install Google App Engine. Create a hello world app and other simple web applications using Python/Java. | |
4.    |      | Use the GAE launcher to launch the web applications. | |
5.    |      | Simulate a cloud scenario using CloudSim and run a scheduling algorithm that is not present in CloudSim. | |
6.    |      | Find a procedure to transfer files from one virtual machine to another virtual machine. | |
7.    |      | Install a Hadoop single-node cluster and run simple applications like wordcount. | |
8.    |      | Create and execute your first container using Docker. | |

EX.NO:1    INSTALL VIRTUALBOX/VMWARE WORKSTATION WITH DIFFERENT FLAVOURS OF LINUX OR WINDOWS OS ON TOP OF WINDOWS 7 OR 8
DATE:

AIM:
To install VirtualBox/VMware Workstation with different flavours of Linux or Windows OS on top of Windows 7 or 8.

PROCEDURE:

VirtualBox installation

1. Download VirtualBox from https://www.virtualbox.org. This manual uses VirtualBox 5.1.14.
2. Run the executable and follow the prompts to complete the installation. The defaults are sufficient for our purposes. Before the wizard completes, it warns that the network connection will be interrupted temporarily, so make sure nothing that depends on the network is in progress, such as a large download that cannot be resumed.

Create an Ubuntu virtual machine

1. Download the latest Ubuntu release from https://www.ubuntu.com/download/desktop. This manual uses Ubuntu 16.04.1.
2. Open VirtualBox and click New.
3. Type a Name for the virtual machine, such as Ubuntu 16. VirtualBox tries to predict the Type and Version from the name you enter. If it does not, select:

   • Type: Linux
   • Version: Ubuntu (64-bit)

   and click Next.
4. Specify how much memory to allocate to the virtual machine. The Ubuntu system requirements call for 2GB, but allocate more if your host can handle it. The higher you set the memory without severely impacting the host machine, the better the guest machine performs. If you are not sure, stay with 2GB.
5. On the Hard disk screen, select Create a virtual hard disk now and click Create.
6. Accept the default option VDI for Hard disk file type (or change it if you wish) and click Next.
7. Next you are prompted for Storage on physical hard disk. The options are Dynamically allocated and Fixed size. Keep the default, Dynamically allocated, and click Next.
8. Choose the hard disk size and storage location. The Ubuntu system requirements recommend 25GB. Because we chose Dynamically allocated in the previous step, this space is not consumed immediately; VirtualBox allocates it as required, up to the 25GB maximum we specified. Click Create.
9. The wizard finishes and returns to the main VirtualBox window. Click Settings.
10. In the left pane select Storage, then in the right pane select the CD icon with the word Empty beside it.
11. Under Attributes, click the CD icon, select Choose Virtual Optical Disk File, and browse to the downloaded file ubuntu-16.04.1-desktop-amd64.iso.
12. Click OK to close the Settings dialog window. The virtual machine is now ready to start.

Install Ubuntu

In VirtualBox the VM should be shown as Powered Off, with the optical drive pointing to the Ubuntu ISO file downloaded previously.

1. In VirtualBox, select the virtual machine Ubuntu 16 and click Start. VirtualBox launches a new window with the VM and boots from the ISO.
2. Click Install Ubuntu.
3. Select Download updates while installing Ubuntu and click Continue.
4. On the next screen, accept the default of Erase disk and install Ubuntu and click Install Now.
5. A warning states that the changes will be written to disk. Click Continue.
6. Select your time zone and click Continue.
7. Select your keyboard layout (the default, English (US), is used here) and click Continue.
8. Enter a username and password, then click Continue.
9. The Ubuntu installation may take several minutes to complete.
10. When the installation is finished you are prompted to restart. Save and close anything else you have open and click Restart Now.
11. When the VM reboots, you may see a message asking you to remove the installation medium. If so, select Machine > Settings from the menu and navigate back to the Storage settings where the ISO file was selected earlier. If the Ubuntu ISO file is still attached, remove it. Otherwise close the Settings window and press Enter in the VM to proceed.
12. If all went well, the VM boots to the Ubuntu login screen. Enter your password to continue.

Ubuntu should run normally in the VirtualBox environment. If everything appears too small, adjust the zoom by selecting View > Scale Factor > 200%.

Result:
Thus VirtualBox was installed and an Ubuntu virtual machine was created successfully.

EX.NO:2    INSTALL A C COMPILER IN THE VIRTUAL MACHINE AND EXECUTE A SAMPLE PROGRAM
DATE:

AIM:
To install a C compiler in the virtual machine and execute a sample program.

PROCEDURE:

Step 1:
Install CentOS or Ubuntu in VMware or Oracle VirtualBox as described in Experiment 1.

Step 2:
Log in to the VM of the installed OS.

Step 3:
On Ubuntu, install the GNU compilers. The sample program below is written in C++, so g++ is installed alongside gcc:
$ sudo add-apt-repository ppa:ubuntu-toolchain-r/test
$ sudo apt-get update
$ sudo apt-get install gcc-6 g++-6 gcc-6-base

Step 4:
Write a sample program and save it as welcome.cpp:
#include <iostream>
using namespace std;

int main()
{
    cout << "Hello world" << endl;
    return 0;
}

Step 5:
Compile and link the program with the GNU C++ compiler, then run the resulting executable:
$ g++-6 -Wall -o welcome welcome.cpp
$ ./welcome

Result:
Thus the GCC compiler was installed in the virtual machine and the sample program was compiled and executed successfully.

EX.NO:3    INSTALL GOOGLE APP ENGINE. CREATE HELLO WORLD APP AND OTHER SIMPLE WEB APPLICATIONS USING PYTHON/JAVA
DATE:

AIM:
To install Google App Engine and create a hello world app and other simple web applications using Python/Java.

PROCEDURE:

To create a new App Engine standard project in Eclipse:

1. Click the Google Cloud Platform toolbar button.
2. Select Create New Project > Google App Engine Standard Java Project.
3. To create a Maven-based App Engine project, check Create as Maven Project and
enter a Maven Group ID and Artifact ID of your choosing to set the coordinates for
this project. The Group ID is often the same as the package name, but does not have to
be. The Artifact ID is often the same as or similar to the project name, but does not
have to be.
4. Click Next.
5. Select any libraries you need in the project.
6. Click Finish.

The wizard generates a native Eclipse project, with a simple servlet, that you can run and deploy from
the IDE.

Running the project locally

1. Select the project in the Project Explorer or Package Explorer.
2. Open the context menu.
3. Select Run As > App Engine.
4. Log messages appear in the console as the server starts up.
5. Eclipse opens its internal web browser to your application. You can also open an external browser and navigate to http://localhost:8080. Either way, you will see a static HTML page with a link to the servlet.

Note: You may see a message that says Port 8080 already in use. If so, you can run your application on a different host or port.

Debugging the project locally

To debug your project locally, follow the steps for running the project locally, but select Debug As > App Engine instead of Run As > App Engine in the context menu.

The server stops at the breakpoints you set and shows the Eclipse debugger view.

Running App Engine Standard Apps on a Different Host or Port

To run your App Engine standard application on a different host or port:

1. Right-click your project.
2. Select Run As > Run on Server.
   Note: You can also select Debug As > Debug on Server to debug your application on a different host or port.
3. In the dialog, select Manually define a new server.
4. Select App Engine Standard as the server type.
5. Enter the hostname in the Server's host name field.
6. Enter the port in the Server port field.
7. Click Finish.

Configuring Eclipse

To configure Cloud Tools for Eclipse to use Objectify:

1. In Eclipse, select Run > Run Configurations.
2. In the Run Configurations dialog, select an existing App Engine Local Server launch configuration, or click the New launch configuration button to create one.
3. Select the Cloud Platform tab of your run configuration.
4. Select an account.
5. Select a project to assign a project ID to be used in the local run. It does not matter which project you select because you will not actually connect to it.
6. As an alternative, if you are not logged in or do not have a Cloud project, set the GOOGLE_CLOUD_PROJECT environment variable to a legal string, such as MyProjectId, in the Environment tab of the run configuration.
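
The procedure above covers the Java route. For the Python option named in the aim, a hello world application can be sketched as a plain WSGI callable, which is the interface the App Engine Python standard runtime serves. This is only a minimal sketch under stated assumptions: the names app and serve are illustrative, the standard-library wsgiref server stands in for the App Engine local server, and the app.yaml file needed for an actual deployment is not shown.

```python
from wsgiref.simple_server import make_server


def app(environ, start_response):
    """Minimal WSGI application: answer every request with a plain-text greeting."""
    body = b"Hello, World!"
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


def serve(host="localhost", port=8080):
    """Serve the app with the standard-library development server (blocks until interrupted)."""
    with make_server(host, port, app) as server:
        print(f"Serving on http://{host}:{port}")
        server.serve_forever()
```

Calling serve() and opening http://localhost:8080 in a browser should show the greeting; pass a different port if 8080 is already in use.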

Result:
Thus Google App Engine was installed and the sample application was created and executed successfully.

EX.NO:4    USE GAE LAUNCHER TO LAUNCH THE WEB APPLICATIONS
DATE:

AIM:
To use the GAE launcher to launch the web applications (Eclipse).

PROCEDURE:

Step 1:
Install Eclipse and create a GAE web application as described in Experiment 3.

Step 2:
Deploying App Engine Standard Applications from Eclipse

The following steps cover creating a new App Engine app in the Google Cloud Console, authenticating with Google, and deploying your project to App Engine.

Before you begin

You need a Google Cloud project with an App Engine application to deploy to. If you do not already have one, use the Google Cloud Console to set up your Cloud project:

1. Select or create a new Cloud project.
2. Sign in to a Google account that is used to deploy your project to App Engine.
3. Select File > Sign in to Google. If you see Manage Google Accounts instead of the Sign in to Google option, you are already signed in and can skip the sign-in steps.
4. Your system browser opens outside of Eclipse and asks for the permissions it needs to manage your App Engine application.
5. Click Allow and close the window. Eclipse is now signed in to your account.
6. Ensure that the appengine-web.xml file is in the WEB-INF folder of your web application.
7. Ensure that the project has the App Engine Project facet. If you created it using the wizard, it should already have this facet. Otherwise:
   a. Right-click the project in the Package Explorer to bring up the context menu.
   b. Select Configure > Convert to App Engine Project.

Deploy the Project to App Engine

To deploy the project to the App Engine standard environment:

1. Right-click the project in the Package Explorer to open the context menu.
2. Select Deploy to App Engine Standard.
3. A dialog pops up.
4. Select the account you want to deploy with, or add a new account.
5. The list of projects the account has access to loads. Select the one you want to deploy to.
6. Click OK.

A background job launches that deploys the project to App Engine. The output of the job is visible in the Eclipse Console view.

By default, App Engine stops the previous version of your application and immediately promotes your new code to receive all traffic. If you would rather promote it manually later using gcloud or the Google Cloud Console, uncheck Promote the deployed version to receive all traffic. If you do not want to stop the previous version, uncheck Stop previous version.
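
The two checkboxes described above have command-line counterparts in gcloud app deploy, namely the --no-promote and --no-stop-previous-version flags. As a rough sketch, the helper below assembles the matching argument list; the function name gcloud_deploy_args and its parameters are hypothetical, and how the Eclipse project is staged for a command-line deployment is outside its scope.

```python
def gcloud_deploy_args(promote=True, stop_previous=True, project=None):
    """Build the argument list for `gcloud app deploy` matching the two deploy options."""
    args = ["gcloud", "app", "deploy"]
    if project:
        # --project is the global gcloud flag that selects the Cloud project.
        args.append(f"--project={project}")
    if not promote:
        # Deploy the new version without routing traffic to it.
        args.append("--no-promote")
    if not stop_previous:
        # Leave the previously running version up.
        args.append("--no-stop-previous-version")
    return args
```

The returned list can be handed to subprocess.run(args, check=True) on a machine where the Google Cloud SDK is installed and authenticated.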

Result:
Thus the web application was launched on Google App Engine successfully.

EX.NO:5    SIMULATE A CLOUD SCENARIO USING CLOUDSIM AND RUN A SCHEDULING ALGORITHM THAT IS NOT IN CLOUDSIM
DATE:

AIM:
To simulate a cloud scenario using CloudSim and run a scheduling algorithm.

PROCEDURE:

1. Before you start, CloudSim must already be installed and set up on your local machine. If it is not, follow the CloudSim setup procedure for the Eclipse IDE first.

2. The main() method is the point from which execution of this example starts:

public static void main(String[] args)

3. Each example follows eleven steps, with some variation between examples:

Step 1: Set the number of users for the current simulation. This user count is directly proportional to the number of brokers in the current simulation.

int num_user = 1; // number of cloud users
Calendar calendar = Calendar.getInstance();
boolean trace_flag = false;

Step 2: Initialize the simulation, providing the current time, the number of users and the trace flag.

CloudSim.init(num_user, calendar, trace_flag);

Step 3: Create a Datacenter.

Datacenter datacenter0 = createDatacenter("Datacenter_0");

The createDatacenter() method initializes the datacenter characteristics along with the host list. The datacenter is the most important entity: without it, there is nothing on which the simulated virtual machines can be hosted.

private static Datacenter createDatacenter(String name) {
    List<Host> hostList = new ArrayList<Host>();

    // One processing element (CPU core) rated at 1000 MIPS
    List<Pe> peList = new ArrayList<Pe>();
    int mips = 1000;
    peList.add(new Pe(0, new PeProvisionerSimple(mips)));

    int hostId = 0;
    int ram = 2048;         // host memory (MB)
    long storage = 1000000; // host storage
    int bw = 10000;

    hostList.add(
        new Host(
            hostId,
            new RamProvisionerSimple(ram),
            new BwProvisionerSimple(bw),
            storage,
            peList,
            new VmSchedulerTimeShared(peList)
        )
    );

    String arch = "x86";
    String os = "Linux";
    String vmm = "Xen";
    double time_zone = 10.0;
    double cost = 3.0;
    double costPerMem = 0.05;
    double costPerStorage = 0.001;
    double costPerBw = 0.0;
    LinkedList<Storage> storageList = new LinkedList<Storage>();

    DatacenterCharacteristics characteristics = new DatacenterCharacteristics(
            arch, os, vmm, hostList, time_zone, cost, costPerMem,
            costPerStorage, costPerBw);

    Datacenter datacenter = null;
    try {
        datacenter = new Datacenter(name, characteristics,
                new VmAllocationPolicySimple(hostList), storageList, 0);
    } catch (Exception e) {
        e.printStackTrace();
    }
    return datacenter;
}

Step 4: Create a datacenter broker.

DatacenterBroker broker = createBroker();
int brokerId = broker.getId();

The createBroker() method initializes the entity object from the DatacenterBroker class:

private static DatacenterBroker createBroker() {
    DatacenterBroker broker = null;
    try {
        broker = new DatacenterBroker("Broker");
    } catch (Exception e) {
        e.printStackTrace();
        return null;
    }
    return broker;
}

Step 5: Create a Virtual Machine(s).

vmlist = new ArrayList<Vm>();

int vmid = 0;
int mips = 1000;
long size = 10000;  // image size (MB)
int ram = 512;      // VM memory (MB)
long bw = 1000;
int pesNumber = 1;  // number of CPUs
String vmm = "Xen"; // VMM name, same value as in createDatacenter()

Vm vm = new Vm(vmid, brokerId, mips, pesNumber, ram, bw, size, vmm,
        new CloudletSchedulerTimeShared());
vmlist.add(vm);

Step 6: Submit the virtual machine list to the datacenter broker.


broker.submitVmList(vmlist);

Step 7: Create Cloudlet(s) by specifying their characteristics.

cloudletList = new ArrayList<Cloudlet>();

int id = 0;
long length = 400000;
long fileSize = 300;
long outputSize = 300;
UtilizationModel utilizationModel = new UtilizationModelFull();

Cloudlet cloudlet = new Cloudlet(id, length, pesNumber, fileSize, outputSize,
        utilizationModel, utilizationModel, utilizationModel);
cloudlet.setUserId(brokerId);
cloudlet.setVmId(vmid);

Step 8: Submit Cloudlets to the datacenter broker.

cloudletList.add(cloudlet);
broker.submitCloudletList(cloudletList);

Step 9: Send the call to start the simulation.

CloudSim.startSimulation();

Step 10: Once there are no more events to execute, send the call to stop the simulation.

CloudSim.stopSimulation();

Step 11: Finally, print the final status of the simulation.

List<Cloudlet> newList = broker.getCloudletReceivedList();
printCloudletList(newList);

The printCloudletList() method formats the output for display on the console:

private static void printCloudletList(List<Cloudlet> list) {
    int size = list.size();
    Cloudlet cloudlet;
    String indent = "    ";

    Log.printLine();
    Log.printLine("========== OUTPUT ==========");
    Log.printLine("Cloudlet ID" + indent + "STATUS" + indent
            + "Data center ID" + indent + "VM ID" + indent + "Time" + indent
            + "Start Time" + indent + "Finish Time");

    DecimalFormat dft = new DecimalFormat("###.##");
    for (int i = 0; i < size; i++) {
        cloudlet = list.get(i);
        Log.print(indent + cloudlet.getCloudletId() + indent + indent);

        // Only successfully finished cloudlets get a full result row
        if (cloudlet.getCloudletStatus() == Cloudlet.SUCCESS) {
            Log.print("SUCCESS");
            Log.printLine(indent + indent + cloudlet.getResourceId()
                    + indent + indent + indent + cloudlet.getVmId()
                    + indent + indent + dft.format(cloudlet.getActualCPUTime())
                    + indent + indent + dft.format(cloudlet.getExecStartTime())
                    + indent + indent + dft.format(cloudlet.getFinishTime()));
        }
    }
}

Once you run the example, the output for CloudSimExample1.java is displayed on the console in the format above.

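
The walkthrough above uses the schedulers that ship with CloudSim (VmSchedulerTimeShared and CloudletSchedulerTimeShared). For the part of the exercise that asks for a scheduling algorithm not present in CloudSim, one possible policy is shortest job first (SJF). The sketch below shows only the policy logic, in plain Python and outside CloudSim. The class Cloudlet and the function schedule_sjf are hypothetical names, and the sketch assumes each VM runs one cloudlet at a time at a fixed MIPS rating. Porting the idea into CloudSim, for example through a custom broker, is left as part of the exercise.

```python
from dataclasses import dataclass


@dataclass
class Cloudlet:
    cloudlet_id: int
    length: int  # amount of work in million instructions (MI)


def schedule_sjf(cloudlets, num_vms, mips):
    """Shortest job first: take cloudlets in order of length and give each one
    to the VM that becomes free earliest.

    Returns a list of (cloudlet_id, vm_id, start_time, finish_time) tuples.
    """
    vm_free_at = [0.0] * num_vms  # time at which each VM finishes its queue
    schedule = []
    for c in sorted(cloudlets, key=lambda c: (c.length, c.cloudlet_id)):
        vm_id = min(range(num_vms), key=lambda v: vm_free_at[v])
        start = vm_free_at[vm_id]
        finish = start + c.length / mips  # execution time = MI / MIPS
        vm_free_at[vm_id] = finish
        schedule.append((c.cloudlet_id, vm_id, start, finish))
    return schedule
```

With a single 1000 MIPS VM, the 400000 MI cloudlet from Step 7 would run for 400 time units; when shorter cloudlets are submitted alongside it, SJF runs them first, which lowers the average waiting time compared with submission order.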
Result:
Thus the simulation of a cloud scenario using CloudSim, running a scheduling algorithm, was implemented successfully.

EX.NO:6    FIND A PROCEDURE TO TRANSFER THE FILES FROM ONE VIRTUAL MACHINE TO ANOTHER VIRTUAL MACHINE
DATE:

AIM:
To find a procedure to transfer files from one virtual machine to another virtual machine.

PROCEDURE:

Step 1: Start the OpenNebula Sunstone service as the root user and open it in a browser at localhost:9869.

root@linux:$ /etc/init.d/opennebula-sunstone restart

Step 2: Create an image, a template and a VM as before.

Creating the image:
oneadmin@linux:~/datastores$ oneimage create --name "Ubuntu" --path "/home/linux/Downloads/source/tubuntu1404-5.0.1.qcow2c" --driver qcow2 --datastore default

Creating the template:
oneadmin@linux:~/datastores$ onetemplate create --name "ubuntu1" --cpu 1 --vcpu 1 --memory 1024 --arch x86_64 --disk "Ubuntu" --nic "private" --vnc --ssh

Instantiating the VM from the template:
oneadmin@linux:~/datastores$ onetemplate instantiate "ubuntu1"

Step 3: Perform a migration. Use the onevm command to move the VM with ID 0 (VID=0) to host02 (HID=1).

oneadmin@linux:~/datastores$ onevm migrate --live 0 1

This moves the VM from host01 to host02. The onevm list command then shows something like the following:

oneadmin@linux:~/datastores$ onevm list
ID USER     GROUP    NAME  STAT CPU MEM HOSTNAME TIME
 0 oneadmin oneadmin one-0 runn   0  0k host02   00:00:48
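
The OpenNebula steps above move a whole VM between hosts. To copy an individual file between two running VMs, the basic idea can be sketched with a plain TCP socket: one side listens and writes what it receives, the other side connects and streams the file. This is an illustrative sketch only. The function names are hypothetical, the demo helper uses the loopback address so that it runs on a single machine, and there is no authentication or encryption; between two real VMs the sender would connect to the receiver's IP address on a port both sides agree on.

```python
import socket
import threading

CHUNK_SIZE = 4096


def receive_file(server_sock, dest_path):
    """Accept one connection and write everything it sends to dest_path."""
    conn, _addr = server_sock.accept()
    with conn, open(dest_path, "wb") as out:
        while True:
            chunk = conn.recv(CHUNK_SIZE)
            if not chunk:  # empty read: the sender closed the connection
                break
            out.write(chunk)


def send_file(host, port, src_path):
    """Connect to the receiver and stream the file in fixed-size chunks."""
    with socket.create_connection((host, port)) as sock, open(src_path, "rb") as src:
        while True:
            chunk = src.read(CHUNK_SIZE)
            if not chunk:
                break
            sock.sendall(chunk)


def transfer_locally(src_path, dest_path):
    """Demo on one machine: receiver in a background thread, sender in the foreground."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
        server.listen(1)
        port = server.getsockname()[1]
        receiver = threading.Thread(target=receive_file, args=(server, dest_path))
        receiver.start()
        send_file("127.0.0.1", port, src_path)
        receiver.join()
```

In a two-VM setup, receive_file would run on the destination VM and send_file on the source VM; comparing checksums of the two files afterwards is a simple way to confirm the copy.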

Result:
Thus the procedure to transfer files by migrating the virtual machine from one node to the other was executed successfully.

EX.NO:7    INSTALL HADOOP SINGLE NODE CLUSTER AND RUN SIMPLE APPLICATIONS LIKE WORDCOUNT
DATE:

AIM:
To install a Hadoop single-node cluster and run simple applications like wordcount.

PROCEDURE:

Install Hadoop

Step 1: Download the Java 8 package (the archive used in the next step is jdk-8u101-linux-i586.tar.gz). Save this file in your home directory.

Step 2: Extract the Java Tar File.

Command: tar -xvf jdk-8u101-linux-i586.tar.gz

Fig: Hadoop Installation – Extracting Java Files

Step 3: Download the Hadoop 2.7.3 Package.

Command: wget https://archive.apache.org/dist/hadoop/core/hadoop-2.7.3/hadoop-2.7.3.tar.gz

Fig: Hadoop Installation – Downloading Hadoop


Step 4: Extract the Hadoop tar File.

Command: tar -xvf hadoop-2.7.3.tar.gz

Fig: Hadoop Installation – Extracting Hadoop Files

Step 5: Add the Hadoop and Java paths in the bash file (.bashrc).

Open the .bashrc file and add the Hadoop and Java paths as shown below.

Command: vi .bashrc

Fig: Hadoop Installation – Setting Environment Variable

Then save the file and close it.

To apply these changes to the current terminal, execute the source command.

Command: source .bashrc

Fig: Hadoop Installation – Refreshing environment variables

To make sure that Java and Hadoop have been installed properly and can be accessed from the terminal, execute the java -version and hadoop version commands.

Command: java -version


Fig: Hadoop Installation – Checking Java Version

Command: hadoop version

Fig: Hadoop Installation – Checking Hadoop Version

Step 6: Edit the Hadoop Configuration files.

Command: cd hadoop-2.7.3/etc/hadoop/

Command: ls

All the Hadoop configuration files are located in the hadoop-2.7.3/etc/hadoop directory, as you can see in the snapshot below:
Fig: Hadoop Installation – Hadoop Configuration Files

Step 7: Open core-site.xml and edit the property mentioned below inside the configuration tag:

core-site.xml informs the Hadoop daemons where the NameNode runs in the cluster. It contains configuration settings of Hadoop core, such as I/O settings that are common to HDFS and MapReduce.

Command: vi core-site.xml

Fig: Hadoop Installation – Configuring core-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>

Step 8: Edit hdfs-site.xml and edit the property mentioned below inside the configuration tag:

hdfs-site.xml contains configuration settings of the HDFS daemons (i.e. NameNode, DataNode, Secondary NameNode). It also includes the replication factor and block size of HDFS.

Command: vi hdfs-site.xml

Fig: Hadoop Installation – Configuring hdfs-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.permissions.enabled</name>
    <value>false</value>
  </property>
</configuration>

Step 9: Edit the mapred-site.xml file and edit the property mentioned below inside the configuration tag:

mapred-site.xml contains configuration settings of MapReduce applications, such as the number of JVMs that can run in parallel, the size of the mapper and reducer processes, and the CPU cores available for a process.

In some cases the mapred-site.xml file is not available, so it has to be created from the mapred-site.xml.template file.

Command: cp mapred-site.xml.template mapred-site.xml

Command: vi mapred-site.xml

Fig: Hadoop Installation – Configuring mapred-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>

Step 10: Edit yarn-site.xml and edit the property mentioned below inside the configuration tag:

yarn-site.xml contains configuration settings of the ResourceManager and NodeManager, such as application memory management sizes and the auxiliary services the NodeManager must run.

Command: vi yarn-site.xml

Fig: Hadoop Installation – Configuring yarn-site.xml

<?xml version="1.0"?>
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
</configuration>

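
Hadoop does not reject a misspelled property name; the setting is simply ignored. It can therefore help to list the properties actually present in each *-site.xml file before starting the daemons. The following is a minimal sketch using only the Python standard library; read_hadoop_properties is a hypothetical helper name, and the sample text passed to it would normally be the contents of one of the files edited in Steps 7 to 10.

```python
import xml.etree.ElementTree as ET


def read_hadoop_properties(xml_text):
    """Return a {name: value} dict for every <property> inside <configuration>."""
    root = ET.fromstring(xml_text)
    properties = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            properties[name.strip()] = (value or "").strip()
    return properties
```

For a file on disk, the text can be obtained with open(path).read(); comparing the returned keys against the names in this manual is a quick check for typing mistakes.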
Step 11: Edit hadoop-env.sh and add the Java path as mentioned below:

hadoop-env.sh contains the environment variables used by the scripts that run Hadoop, such as the Java home path.

Command: vi hadoop-env.sh

Fig: Hadoop Installation – Configuring hadoop-env.sh

Step 12: Go to the Hadoop home directory and format the NameNode.

Command: cd
Command: cd hadoop-2.7.3
Command: bin/hadoop namenode -format

Fig: Hadoop Installation – Formatting NameNode

This formats HDFS via the NameNode. The command is executed only once, before the first start. Formatting the file system means initializing the directory specified by the dfs.name.dir variable.

Never format a Hadoop file system that is up and running. You will lose all the data stored in HDFS.

Step 13: Once the NameNode is formatted, go to the hadoop-2.7.3/sbin directory and start all the daemons.

Command: cd ~/hadoop-2.7.3/sbin

You can either start all the daemons with a single command or start them individually.

Command: ./start-all.sh

The above command runs start-dfs.sh and start-yarn.sh. The JobHistoryServer is not covered by it and is started separately with mr-jobhistory-daemon.sh, as shown further below.

Or you can run all the services individually as below:

Start NameNode:

The NameNode is the centerpiece of an HDFS file system. It keeps the directory tree of all files stored in HDFS and tracks where the data of each file is kept across the cluster.

Command: ./hadoop-daemon.sh start namenode


Fig: Hadoop Installation – Starting NameNode

Start DataNode:

On startup, a DataNode connects to the NameNode and then responds to requests from the NameNode for different operations.

Command: ./hadoop-daemon.sh start datanode

Fig: Hadoop Installation – Starting DataNode

Start ResourceManager:

ResourceManager is the master that arbitrates all the available cluster resources and thus helps in managing the distributed applications running on the YARN system. Its work is to manage the NodeManagers and each application's ApplicationMaster.

Command: ./yarn-daemon.sh start resourcemanager


Fig: Hadoop Installation – Starting ResourceManager

Start NodeManager:

The NodeManager is the agent on each machine that is responsible for managing containers, monitoring their resource usage and reporting it to the ResourceManager.

Command: ./yarn-daemon.sh start nodemanager

Fig: Hadoop Installation – Starting NodeManager

Start JobHistoryServer:

JobHistoryServer is responsible for servicing all job history related requests from client.

Command: ./mr-jobhistory-daemon.sh start historyserver

Step 14: To check that all the Hadoop services are up and running, run the below command.
Command: jps

Fig: Hadoop Installation – Checking Daemons
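
jps prints one line per running JVM in the form of a process ID followed by the class name. A small helper can compare that listing with the daemons started in Step 13. This is a sketch under stated assumptions: the name missing_daemons is hypothetical, the expected list covers only the five daemons started individually above, and any jps text fed to it for illustration is made up rather than captured from a real cluster.

```python
EXPECTED_DAEMONS = (
    "NameNode",
    "DataNode",
    "ResourceManager",
    "NodeManager",
    "JobHistoryServer",
)


def missing_daemons(jps_output, expected=EXPECTED_DAEMONS):
    """Return the expected daemon names that do not appear in the jps listing."""
    running = set()
    for line in jps_output.splitlines():
        parts = line.split()
        if len(parts) == 2:  # "<pid> <name>"
            running.add(parts[1])
    return [name for name in expected if name not in running]
```

An empty list means every expected daemon is up; otherwise the returned names indicate which start command to rerun.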

Step 15: Now open the Mozilla browser and go to localhost:50070/dfshealth.html to check the NameNode interface.

Fig: Hadoop Installation – Starting WebUI

Congratulations, you have successfully installed a single node Hadoop cluster
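
The aim also mentions running wordcount. Hadoop 2.7.3 ships a ready-made wordcount job in share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar, which takes an HDFS input path and an output path of your choosing. To show what that job computes, the sketch below mimics the map, shuffle and reduce phases in plain Python on a single machine. It is not Hadoop code, and the function names are illustrative.

```python
from collections import defaultdict


def map_phase(lines):
    """Map: emit a (word, 1) pair for every word in every input line."""
    for line in lines:
        for word in line.split():
            yield word.lower(), 1


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: sum the counts collected for each word."""
    return {word: sum(counts) for word, counts in groups.items()}


def word_count(lines):
    return reduce_phase(shuffle_phase(map_phase(lines)))
```

For example, word_count(["Hello Hadoop", "hello world"]) counts "hello" twice and the other two words once each. On a real cluster the map and reduce functions run in parallel on different blocks of the input, and the shuffle moves data between nodes.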

Result:
Thus the Hadoop single-node cluster was installed and simple applications were executed successfully.

EX.NO:8    CREATING AND EXECUTING YOUR FIRST CONTAINER USING DOCKER
DATE:

AIM:
To create and execute a container using Docker.

STEP:1 Installing Docker on the System

To begin, install Docker on your system. Docker provides installers for Windows, macOS, and various flavors of Linux, making it accessible to a wide range of users.
Below are the commands you can use to install Docker on Ubuntu:

sudo apt update
sudo apt install apt-transport-https ca-certificates curl software-properties-common
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
sudo apt update
sudo apt install docker-ce

Once the installation is complete, verify it by checking the Docker version and making sure the Docker daemon is running.

STEP:2 Verifying the Installation and Accessing the Docker CLI

On Ubuntu, verify the Docker installation by running the following commands:

docker --version
sudo systemctl status docker

With Docker installed, you can use the Docker command-line interface (CLI) to create and manage containers. The CLI provides a set of commands for building, running, and managing containers.

STEP:3 Crafting Your First Dockerfile

A key concept in Docker is the Dockerfile, a text document that contains all the commands a user could call on the command line to assemble an image. It holds all the information Docker needs to build the image. The next steps define a simple Dockerfile and list some best practices for writing one.

STEP:4 Defining a Simple Dockerfile

Start by creating a basic Dockerfile. In this example, the Dockerfile simply prints "Hello, World!" when run as a container.

FROM alpine
CMD echo "Hello, World!"

Keep a simple Dockerfile as minimal as possible. Include only the dependencies and commands your application needs to run within the container. This keeps the image small and reduces the attack surface, making it more secure.

STEP:5 Best Practices for Writing Dockerfiles

Dockerfiles should follow best practices to ensure consistency, maintainability, and reusability. One of them is to use the official base images from Docker Hub, as they are well maintained and regularly updated. It is also important to pin a specific version of the base image to avoid unexpected changes.

FROM node:14
COPY . /app
WORKDIR /app
RUN npm install
CMD ["npm", "start"]

Another best practice is to use a .dockerignore file to list files and directories to exclude from the context when building the image. This reduces the build context and improves build performance.

Additional best practices include avoiding running commands as root, using multi-stage builds for smaller images, and using environment variables for configuration.

STEP6: Building and Running Your Container

To build and run your Docker container, you will need to follow a few simple steps.
First, you will need to build the Docker image from your Dockerfile.
Once the image is built, you can run your container using the Docker run command. In this section, we will
walk through each step in detail.

Building the Docker Image from Your Dockerfile

To build the Docker image from your Dockerfile, you will need to navigate to the directory where your
Dockerfile is located and run the following command:
docker build -t your-image-name .

This command will build the Docker image using the instructions specified in your Dockerfile.
Once the build process is complete, you will have a new Docker image ready for use.

Running Your Docker Container

To run your Docker container, you will need to use the docker run command followed by the name of the
image you want to run.
For example:

docker run your-image-name


Running this command will start a new container based on the specified image.
Depending on your application, you may need to specify additional options for the docker run command,
such as port bindings or environment variables. For example, the following command publishes port 80 of
the container on port 8080 of the host:
docker run -p 8080:80 your-image-name

Your Docker container is now up and running, ready to serve your application to the world.
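
The docker run options described above can also be assembled from a script. The following is a minimal Python sketch, not part of Docker itself: the helper name docker_run_args and its parameters are assumptions made for illustration, and it relies only on the -p and -e options of docker run. It builds the argument list without executing anything.

```python
def docker_run_args(image, ports=None, env=None):
    """Build the argument list for a `docker run` command (hypothetical helper).

    ports maps host port -> container port; env maps variable name -> value.
    """
    args = ["docker", "run"]
    for host_port, container_port in (ports or {}).items():
        # -p HOST:CONTAINER publishes a container port on the host
        args += ["-p", f"{host_port}:{container_port}"]
    for name, value in (env or {}).items():
        # -e NAME=VALUE sets an environment variable inside the container
        args += ["-e", f"{name}={value}"]
    args.append(image)
    return args


print(" ".join(docker_run_args("your-image-name", ports={8080: 80})))
# prints: docker run -p 8080:80 your-image-name
```

On a machine where Docker is installed, the returned list could be passed to subprocess.run(args, check=True) to start the container.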

Managing Your Docker Container

Unlike traditional virtual machines, where you need to manually install and configure software, Docker
containers are designed to be easily managed and manipulated.
Let’s take a look at some key ways to manage your Docker containers.

Monitoring Container Performance

With Docker, you can easily monitor the performance of your containers using built-in commands.
By running docker stats , you can view real-time CPU, memory, and network usage for all running
containers.
This can help you identify any resource bottlenecks and optimize your container performance.

Stopping, Starting, and Removing Containers

The Docker CLI provides simple commands for stopping, starting, and removing containers.
The command

docker stop [container_name]

will gracefully stop a running container, while

docker start [container_name]

will restart a stopped container.

To remove a container entirely, use the command


docker rm [container_name]

Additionally, you can use the docker ps command to list all running containers, and docker ps -a to see all
containers, including those that are stopped.
This gives you full visibility and control over your containers.
RESULT:

Thus, a container was created and executed in Docker successfully.

VIVA QUESTIONS AND ANSWERS

1. Define Cloud Computing with example.


Cloud computing is a model for enabling convenient, on-demand network access to a
shared pool of configurable computing resources (e.g., networks, servers, storage, applications,
and services) that can be rapidly provisioned and released with minimal management effort or
service provider interaction.
2. What is the working principle of Cloud Computing?
The cloud is a collection of computers and servers that are publicly accessible via the
Internet. This hardware is typically owned and operated by a third party on a consolidated
basis in one or more data center locations. The machines can run any combination of
operating systems.

3. What are the advantages and disadvantages of Cloud Computing?


Advantages
Lower-Cost Computers for Users
Improved Performance
Lower IT Infrastructure Costs
Fewer Maintenance Issues
Lower Software Costs
Instant Software Updates
Increased Computing Power
Unlimited Storage Capacity
Increased Data Safety
Improved Compatibility Between Operating Systems
Improved Document Format Compatibility
Easier Group Collaboration
Universal Access to Documents
Latest Version Availability
Removes the Tether to Specific Devices
Disadvantages
Requires a Constant Internet Connection
Doesn’t Work Well with Low-Speed Connections
Can Be Slow
Features Might Be Limited
Stored Data Might Not Be Secure
If the Cloud Loses Your Data, It May Not Be Recoverable
4. What is a distributed system?
A distributed system is a software system in which components located on networked computers
communicate and coordinate their actions by passing messages. The components interact with each other in
order to achieve a common goal.
Three significant characteristics of distributed systems are:
• Concurrency of components
• Lack of a global clock
• Independent failure of components
What is a cluster?
A computing cluster consists of interconnected stand-alone computers which work cooperatively as a
single integrated computing resource. In the past, clustered computer systems have demonstrated
5. What is grid computing?
Grid Computing enables virtual organizations to share geographically distributed resources as they
pursue common goals, assuming the absence of central location, central control, omniscience, and an
existing trust relationship.
(or)
• Grid technology demands new distributed computing models, software/middleware support, network
protocols, and hardware infrastructures.
• National grid projects are followed by industrial grid platform development by IBM, Microsoft, Sun,
HP, Dell, Cisco, EMC, Platform Computing, and others. New grid service providers (GSPs) and new grid
applications have emerged rapidly, similar to the growth of Internet and web services in the past two
decades.
• Grid systems are classified in essentially two categories: computational or data grids and P2P grids.
6. Which business areas need Grid computing?
• Life Sciences
• Financial services
• Higher Education
• Engineering Services
• Government
• Collaborative games

7. List out the Grid Applications:
• Application partitioning that involves breaking the problem into discrete pieces
• Discovery and scheduling of tasks and workflow
• Data communications distributing the problem data where and when it is required
• Provisioning and distributing application codes to specific system nodes
• Autonomic features such as self-configuration, self-optimization, self-recovery and self-management
8. List some grid computing toolkits and frameworks.
• Globus Toolkit: Globus Resource Allocation Manager (GRAM)
• Grid Security Infrastructure (GSI)
• Information Services
• Legion, Condor and Condor-G
• NIMROD, UNICORE, NMI
9. What are Desktop Grids?
These are grids that leverage the compute resources of desktop computers.
Because of the true (but unfortunate) ubiquity of Microsoft® Windows® operating
system in corporations, desktop grids are assumed to apply to the Windows environment.
The Mac OS™ environment is supported by a limited number of vendors.
10. What are Server Grids?
• Some corporations, while adopting Grid Computing, keep it limited to server resources that are
within the purview of the IT department.
• Special servers, in some cases, are bought solely for the purpose of creating an internal “utility
grid” with resources made available to various departments.
• No desktops are included in server grids. These usually run some flavor of the Unix/Linux
operating system.
11. Define OpenNebula.
OpenNebula is an open source management tool that helps virtualized data centers oversee private clouds,
public clouds and hybrid clouds ..... OpenNebula is vendor neutral, as well as platform- and API-agnostic. It
can use KVM, Xen or VMware hypervisors.
12. Define Eclipse.
Eclipse is an integrated development environment (IDE) used in computer programming, and is the most
widely used Java IDE. It contains a base workspace and an extensible plug-in system for customizing the
environment.
13. Define Netbeans.
NetBeans is an open-source integrated development environment (IDE) for developing with Java, PHP,
C++, and other programming languages. NetBeans is also referred to as a platform of modular components
used for developing Java desktop applications.
14. Define Apache Tomcat.
Apache Tomcat (or Jakarta Tomcat or simply Tomcat) is an open source servlet container developed by
the Apache Software Foundation (ASF). Tomcat implements the Java Servlet and the JavaServer Pages
(JSP) specifications from Sun Microsystems, and provides a "pure Java" HTTP web server environment for
Java code to run.
15. What is private cloud?
A private cloud is built within the domain of an intranet owned by a single organization.
Therefore, it is client owned and managed, and access to it is limited to the owning clients and their
partners. It is not deployed to sell capacity over the Internet through publicly accessible
interfaces. Private clouds give local users a flexible and agile private infrastructure to run service
workloads within their administrative domains.

16. What is public cloud?


A public cloud is built over the Internet, which can be accessed by any user who has paid for the
service. Public clouds are owned by service providers. They are accessed by subscription. Many companies
have built public clouds, namely Google App Engine, Amazon AWS, Microsoft Azure, IBM Blue Cloud,
and Salesforce Force.com. These are commercial providers that offer a publicly accessible remote interface
for creating and managing VM instances within their proprietary infrastructure.

17. What is hybrid cloud?


A hybrid cloud is built with both public and private clouds. Private clouds can also support
a hybrid cloud model by supplementing local infrastructure with computing capacity from an external
public cloud. For example, the research compute cloud (RC2) is a private cloud built by IBM.

18. What is a Community Cloud ?


A community cloud in computing is a collaborative effort in which infrastructure is shared between
several organizations from a specific community with common concerns (security, compliance,
jurisdiction, etc.), whether managed internally or by a third-party and hosted internally or externally. This
is controlled and used by a group of organizations that have a shared interest. The costs are spread over
fewer users than a public cloud (but more than a private cloud).
19. Define IaaS.
The IaaS layer offers the storage and infrastructure resources that are needed to deliver the Cloud
services. It comprises only the infrastructure or physical resources. Top IaaS Cloud Computing
Companies: Amazon (EC2), Rackspace, GoGrid, Microsoft, Terremark and Google.
20. Define PaaS.
PaaS provides a combination of both infrastructure and application. Hence, organisations
using PaaS do not have to worry about either infrastructure or services. Top PaaS Cloud Computing
Companies: Salesforce.com, Google, Concur Technologies, Ariba, Unisys and Cisco.

21. Define SaaS?


In the SaaS layer, the Cloud service provider hosts the software on its servers. It can be defined
as a model in which applications and software are hosted on the server and made available to
customers over a network. Top SaaS Cloud Computing Companies: Amazon Web Services,
AppScale, CA Technologies, Engine Yard, Salesforce and Windows Azure.

22. What is meant by virtualization?


Virtualization is a computer architecture technology by which multiple virtual machines (VMs) are
multiplexed in the same hardware machine. The idea of VMs can be dated back to the 1960s. The purpose
of a VM is to enhance resource sharing by many users and improve computer performance in terms of
resource utilization and application flexibility.

23. What are the implementation levels of virtualization?


The implementation levels of virtualization are as follows:
1. OS-level virtualization
2. ISA-level virtualization
3. User-application-level virtualization
4. Hardware-level virtualization
5. Library-level virtualization
24. List the requirements of a VMM.
There are three requirements for a VMM.
First, a VMM should provide an environment for programs which is essentially identical to the
original machine.
Second, programs run in this environment should show, at worst, only minor decreases in speed.
Third, a VMM should be in complete control of the system resources.
25. Explain Host OS and Guest OS?
A host system and a guest system play different roles within a virtual infrastructure.
A host system (host operating system) is the primary, first-installed operating system. If
you are using a bare-metal virtualization platform like Hyper-V or ESX, there really isn’t a host
operating system besides the hypervisor. If you are using a Type-2 hypervisor like VMware Server or
Virtual Server, the host operating system is whatever operating system those applications are installed
into.
A guest system (guest operating system) is a virtual guest or virtual machine (VM) that is installed
under the host operating system. The guests are the VMs that you run in your virtualization platform.

26. Write the steps for live VM migration?


Live VM migration proceeds through the following six stages:
Stage 0: Pre-Migration
Active VM on Host A
Alternate physical host may be preselected for migration
Block devices mirrored and free resources maintained
Stage 1: Reservation
Initialize a container on the target host
Stage 2: Iterative pre-copy
Enable shadow paging
Copy dirty pages in successive rounds
Stage 3: Stop and copy
Suspend VM on Host A
Generate ARP to redirect traffic to Host B
Synchronize all remaining VM state to Host B
Stage 4: Commitment
VM state on Host A is released
Stage 5: Activation
VM starts on Host B
Connects to local devices
Resumes normal operation
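
The iterative pre-copy and stop-and-copy stages can be illustrated with a small simulation. The following Python sketch rests on simplifying assumptions that are not part of the stages listed above: memory is counted in pages, the VM dirties pages at a constant rate, the link copies pages at a constant rate, and pre-copy ends once the remaining dirty set is at or below a threshold. The function name and all numbers are hypothetical.

```python
def precopy_rounds(total_pages, dirty_rate, bandwidth, stop_threshold, max_rounds=30):
    """Simulate iterative pre-copy (hypothetical model, constant rates).

    dirty_rate and bandwidth are in pages per second.
    Returns (pages sent in each pre-copy round, pages left for stop-and-copy).
    """
    to_send = total_pages          # round 1 copies all memory pages
    sent_per_round = []
    for _ in range(max_rounds):
        if to_send <= stop_threshold:
            break                  # small enough: suspend the VM and copy the rest
        sent_per_round.append(to_send)
        # Pages dirtied while this round was being copied
        # (copy time = to_send / bandwidth), capped at the VM's memory size.
        to_send = min(total_pages, to_send * dirty_rate // bandwidth)
    return sent_per_round, to_send


rounds, final = precopy_rounds(100_000, dirty_rate=2_000, bandwidth=20_000, stop_threshold=50)
print(rounds, final)  # prints: [100000, 10000, 1000, 100] 10
```

In this model each round shrinks the dirty set only when the copy rate exceeds the dirty rate; otherwise the max_rounds limit forces the switch to stop-and-copy.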

27. Define Globus Toolkit (Grid Computing Middleware).
• Globus is open source grid software that addresses the most challenging problems in distributed
resource sharing.
• The Globus Toolkit includes software services and libraries for distributed security, resource
management, monitoring and discovery, and data management.
28. Define Blocks in HDFS
• A disk has a block size, which is the minimum amount of data that it can read or write. Filesystems for
a single disk build on this by dealing with data in blocks, which are an integral multiple of the disk
block size. Filesystem blocks are typically a few kilobytes in size, while disk blocks are normally 512
bytes. This is generally transparent to the filesystem user, who is simply reading or writing a file of
whatever length.
• HDFS has the same concept of a block, but as a much larger unit: 128 MB by default in Hadoop 2 and
later (64 MB in Hadoop 1). Files in HDFS are broken into block-sized chunks that are stored as
independent units.
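
As an illustration of how a file maps onto blocks, the following Python sketch splits a file size into block-sized chunks. It assumes a 128 MB HDFS block size (the default in Hadoop 2 and later; the value is configurable), and the function name is hypothetical rather than part of any Hadoop API.

```python
MB = 1024 * 1024


def split_into_blocks(file_size, block_size=128 * MB):
    """Return the sizes in bytes of the blocks that a file of file_size bytes occupies."""
    full_blocks, remainder = divmod(file_size, block_size)
    sizes = [block_size] * full_blocks
    if remainder:
        # The last block holds only the leftover bytes, not a full block.
        sizes.append(remainder)
    return sizes


blocks = split_into_blocks(300 * MB)
print(len(blocks), [size // MB for size in blocks])  # prints: 3 [128, 128, 44]
```

A 300 MB file therefore occupies three blocks, and the last block takes up only 44 MB of underlying storage.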
29. Define Namenodes and Datanodes
• An HDFS cluster has two types of node operating in a master-worker pattern: a namenode (the master)
and a number of datanodes (workers).
• The namenode manages the filesystem namespace. It maintains the filesystem tree and the metadata
for all the files and directories in the tree. This information is stored persistently on the local disk in
the form of two files: the namespace image and the edit log.
• The namenode also knows the datanodes on which all the blocks for a given file are located.
However, it does not store block locations persistently, since this information is reconstructed from
datanodes when the system starts.
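
The split between persistent namespace metadata and in-memory block locations can be sketched with a toy model. The Python class below is an illustration only: the class name, method names and data layout are assumptions and do not reflect the real HDFS implementation or its API.

```python
class ToyNameNode:
    """Toy model: the namespace is persisted, block locations are rebuilt at startup."""

    def __init__(self, namespace):
        # Persistent metadata: file path -> ordered list of block ids.
        self.namespace = namespace
        # In-memory only: block id -> set of datanodes holding a replica.
        self.block_locations = {}

    def receive_block_report(self, datanode, block_ids):
        """Record which blocks a datanode reports holding when it registers."""
        for block_id in block_ids:
            self.block_locations.setdefault(block_id, set()).add(datanode)

    def locate(self, path):
        """Return (block id, sorted datanodes) pairs for every block of a file."""
        return [(b, sorted(self.block_locations.get(b, ()))) for b in self.namespace[path]]


nn = ToyNameNode({"/logs/a.txt": ["blk_1", "blk_2"]})
nn.receive_block_report("dn1", ["blk_1", "blk_2"])
nn.receive_block_report("dn2", ["blk_2"])
print(nn.locate("/logs/a.txt"))
# prints: [('blk_1', ['dn1']), ('blk_2', ['dn1', 'dn2'])]
```

Before any datanode has reported, locate returns empty location lists, which mirrors the point that block locations are not stored persistently.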

30. Define HADOOP.


Hadoop is an open source, Java-based programming framework that supports the processing and storage of
extremely large data sets in a distributed computing environment. It is part of the Apache project sponsored
by the Apache Software Foundation.

31. Define HDFS.


Hadoop Distributed File System (HDFS) is a Java-based file system that provides scalable and reliable data
storage that is designed to span large clusters of commodity servers. HDFS, MapReduce, and YARN form
the core of Apache™ Hadoop®.

32. Write about HADOOP.


Hadoop was created by Doug Cutting and Mike Cafarella in 2005. Cutting, who was working at Yahoo! at
the time, named it after his son's toy elephant. It was originally developed to support distribution for the
Nutch search engine project.

33. Definition of Grid Portal:


A Grid Portal provides an efficient infrastructure to put Grid-empowered applications on corporate
Intranet/Internet.

34. Define GAE.


Google App Engine (often referred to as GAE or simply App Engine) is a Platform as a Service and cloud
computing platform for developing and hosting web applications in Google-managed data centers.
Applications are sandboxed and run across multiple servers. App Engine offers automatic scaling for web
applications—as the number of requests increases for an application, App Engine automatically allocates
more resources for the web application to handle the additional demand.
35. What is CloudSim?
CloudSim is a simulation toolkit that supports the modeling and simulation of the core functionality of a
cloud, such as job/task queues, processing of events, creation of cloud entities (datacenters, datacenter
brokers, etc.), communication between different entities, and implementation of broker policies. This
toolkit allows you to:
• Test application services in a repeatable and controllable environment.
• Tune the system bottlenecks before deploying apps in an actual cloud.
• Experiment with different workload mixes and resource performance scenarios on simulated
infrastructure for developing and testing adaptive application provisioning techniques.
36. Core features of CloudSim are:
• Support for modeling and simulation of large-scale computing environments such as federated cloud
data centers and virtualized server hosts, with customizable policies for provisioning host resources
to virtual machines, and energy-aware computational resources.
• A self-contained platform for modeling a cloud’s service brokers, provisioning, and allocation policies.
• Support for simulation of network connections among simulated system elements.
• Support for simulation of a federated cloud environment that inter-networks resources from both
private and public domains.
• Availability of a virtualization engine that aids in the creation and management of multiple
independent and co-hosted virtual services on a data center node.
• Flexibility to switch between space-shared and time-shared allocation of processing cores to
virtualized services.
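
The difference between space-shared and time-shared allocation can be shown with a small calculation. The Python sketch below is not CloudSim code and does not use its API: it assumes a single processing core rated in MIPS, tasks whose lengths are given in million instructions, and all tasks submitted at time 0. The function names are hypothetical.

```python
def space_shared_finish_times(lengths, mips):
    """Tasks get the whole core one at a time, in submission order."""
    clock = 0.0
    finish = []
    for length in lengths:
        clock += length / mips
        finish.append(clock)
    return finish


def time_shared_finish_times(lengths, mips):
    """All unfinished tasks share the core equally at every instant."""
    order = sorted(range(len(lengths)), key=lambda i: lengths[i])
    finish = [0.0] * len(lengths)
    clock = 0.0
    done = 0          # instructions already executed by every unfinished task
    for rank, i in enumerate(order):
        active = len(lengths) - rank           # tasks still sharing the core
        clock += (lengths[i] - done) * active / mips
        done = lengths[i]
        finish[i] = clock
    return finish


tasks = [1000, 2000, 3000]
print(space_shared_finish_times(tasks, mips=1000))  # prints: [1.0, 3.0, 6.0]
print(time_shared_finish_times(tasks, mips=1000))   # prints: [3.0, 5.0, 6.0]
```

Both policies finish the whole batch at the same time, but space sharing completes short tasks earlier, while time sharing lets every task make progress from the start.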

37. Uses of CloudSim.
• Load balancing of resources and tasks
• Task scheduling and its migrations
• Optimizing the virtual machine allocation and placement policies
• Energy-aware consolidations or migrations of virtual machines
• Optimizing schemes for network latencies for various cloud scenarios

38. Define OpenStack.


OpenStack is a cloud operating system that controls large pools of compute, storage, and
networking resources throughout a datacenter, all managed and provisioned through APIs
with common authentication mechanisms.
A dashboard is also available, giving administrators control while empowering their users to
provision resources through a web interface.

39. Define TryStack.


TryStack is a great way to take OpenStack for a spin without having to
commit to a full deployment.
This free service lets you test what the cloud can do for you, offering networking, storage
and compute instances, without having to go all in with your own hardware.
It’s a labor of love spearheaded by three Red Hat OpenStack experts: Will
Foster, Kambiz Aghaiepour and Dan Radez.
TryStack’s set-up must bear the load of anyone who wants to use it, but instead of an
equally boundless budget and paid staff, it was originally powered by donated equipment and
volunteers from Cisco, Dell, Equinix, NetApp, Rackspace and Red Hat who pulled together
for this OpenStack Foundation project.
40. Define Hadoop.
Hadoop is an open-source software framework for storing data and running
applications on clusters of commodity hardware. It provides massive storage for any kind of
data, enormous processing power and the ability to handle virtually limitless concurrent tasks
or jobs.
