GOOD SHEPHERD
COLLEGE OF ENGINEERING AND TECHNOLOGY
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
LAB MANUAL
CCS335 – CLOUD COMPUTING
LABORATORY
Regulation 2021
Year / Semester : III / V
Jun 2024 – Dec 2024
PREPARED BY
Mrs. K. SHEKINA, M.E.,
Assistant Professor / CSE
LIST OF EXPERIMENTS
1. Install VirtualBox/VMware Workstation with different flavours of Linux or Windows OS on top of Windows 7 or 8.
2. Install a C compiler in the virtual machine created using VirtualBox and execute simple programs.
3. Install Google App Engine. Create a hello world app and other simple web applications using Python/Java.
4. Use the GAE launcher to launch the web applications.
5. Simulate a cloud scenario using CloudSim and run a scheduling algorithm that is not present in CloudSim.
6. Find a procedure to transfer files from one virtual machine to another virtual machine.
7. Install a Hadoop single node cluster and run simple applications like wordcount.
8. Creating and executing your first container using Docker.
TABLE OF CONTENTS

S.NO. | DATE | EXPERIMENT TITLE | MARKS/10 | SIGN.
1. Install VirtualBox/VMware Workstation with different flavours of Linux or Windows OS on top of Windows 7 or 8.
2. Install a C compiler in the virtual machine created using VirtualBox and execute simple programs.
3. Install Google App Engine. Create a hello world app and other simple web applications using Python/Java.
4. Use the GAE launcher to launch the web applications.
5. Simulate a cloud scenario using CloudSim and run a scheduling algorithm that is not present in CloudSim.
6. Find a procedure to transfer files from one virtual machine to another virtual machine.
7. Install a Hadoop single node cluster and run simple applications like wordcount.
8. Creating and executing your first container using Docker.
EX.NO: 1
DATE:

INSTALL VIRTUALBOX/VMWARE WORKSTATION WITH DIFFERENT FLAVOURS OF LINUX OR WINDOWS OS ON TOP OF WINDOWS 7 OR 8

AIM:
To install VirtualBox/VMware Workstation with different flavours of Linux or Windows OS on top of Windows 7 or 8.

PROCEDURE:
VirtualBox installation
1. First we need to download VirtualBox from https://www.virtualbox.org. I've downloaded VirtualBox 5.1.14.
2. Run the executable and follow the prompts to complete the installation. We don't really need to change anything for our purposes, and can accept the defaults. Before completing the wizard you will get a warning that the network connection will temporarily be interrupted, so make sure you're not doing anything that would be impacted, like being half-way through downloading a 16GB file that can't be resumed.
Create an Ubuntu virtual machine
1. Download the latest Ubuntu release from https://www.ubuntu.com/download/desktop.
I’ve downloaded Ubuntu 16.04.1
2. Open Virtual Box and click New
3. Type the Name for the virtual machine, like Ubuntu 16. VirtualBox will try to predict the
Type and Version based on the name you enter. Otherwise, select:
Type: Linux
Version: Ubuntu (64-bit)
and click Next.
4. Next we need to specify how much memory to allocate to the virtual machine. According to the Ubuntu system requirements we need 2GB, but I'd recommend more if your host can handle it. Basically, the higher you can set the memory without severely impacting your host machine, the better the performance of the guest machine. If you're not sure, stick with 2GB.
5. On the Hardware screen select Create a virtual hard disk now and click Create
6. Accept the default option VDI for Hard disk file type (or change it if you wish…) and click
Next
7. Next we are prompted for Storage on physical hard disk. The options are Dynamically
allocated and Fixed size. We’ll use the default of Dynamically allocated. Click Next
8. Choose the hard disk size and storage location. The Ubuntu system requirements recommend
25GB. Remember, we choose Dynamically allocated as our storage option in the last step, so we
won’t consume all this disk space immediately. Rather, VirtualBox will allocate it as required,
up to the maximum 25GB we specified. Click Create
9. The wizard will finish and we are returned to the main VirtualBox window. Click Settings
10. In the left pane select Storage, then in the right select the CD icon with the word Empty
beside it.
11. Under Attributes click the CD icon (highlighted in the screenshot above) and select Choose Virtual Optical Disk File and browse to the downloaded file ubuntu-16.04.1-desktop-amd64.iso.
12. Click OK to close the Settings dialog window. The virtual machine should now be ready
to start.
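As an optional aside, the same VM can also be created from the command line using the VBoxManage tool that ships with VirtualBox. This is a minimal sketch assuming VirtualBox 5.x; the VM name and file names are illustrative and mirror the wizard choices above:

VBoxManage createvm --name "Ubuntu 16" --ostype Ubuntu_64 --register
VBoxManage modifyvm "Ubuntu 16" --memory 2048
VBoxManage createhd --filename "Ubuntu 16.vdi" --size 25000
VBoxManage storagectl "Ubuntu 16" --name "SATA" --add sata
VBoxManage storageattach "Ubuntu 16" --storagectl "SATA" --port 0 --device 0 --type hdd --medium "Ubuntu 16.vdi"
VBoxManage storageattach "Ubuntu 16" --storagectl "SATA" --port 1 --device 0 --type dvddrive --medium ubuntu-16.04.1-desktop-amd64.iso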
Install Ubuntu
In VirtualBox your VM should be showing as Powered Off, and the optical drive configured to point to
the Ubuntu ISO file we downloaded previously.
1. In VirtualBox, select the virtual machine Ubuntu 16 and click Start. VirtualBox will launch a new window with the VM and boot from the ISO.
2. Click Install Ubuntu
3. Select Download updates while installing Ubuntu and click Continue
4. On the next screen accept the default of Erase disk and install Ubuntu and click Install Now
5. You will be prompted with a warning saying the changes will be written to disk. Click
Continue
6. Select your timezone and click Continue
7. Select your keyboard layout. I accepted the default of English (US) and click Continue
8. Enter a username and password, then click Continue
9. The Ubuntu installation may take several minutes to run, so have another coffee.
10. When the installation is finished you will be prompted to restart. Save and close
anything else you may have open and click Restart Now
11. Now when the VM reboots you may see a message asking you to remove the installation medium and press Enter.
From the menu select Machine > Settings.
Navigate back into the Storage settings where we previously selected the ISO file. If the Ubuntu ISO file is still there, remove it. Otherwise close the Settings window and in the VM press Enter to proceed.
12. If all went well the VM should boot to the Ubuntu login screen. Enter your password to
continue.
Ubuntu should run normally in the VirtualBox environment. If everything is far too small, you can adjust
the ‘zoom’ by selecting View > Scale Factor > 200%.
Have fun!
Result:
Thus the VirtualBox/Oracle virtual machine was installed successfully.
EX.NO: 2
DATE:

INSTALL A C COMPILER IN THE VIRTUAL MACHINE AND EXECUTE A SAMPLE PROGRAM

Aim:
To install a C compiler in the virtual machine and execute a sample program.
Procedure:
Step 1:
Install CentOS or Ubuntu in VMware or Oracle VirtualBox as in the previous experiment.
Step 2:
Log in to the VM of the installed OS.
Step 3:
If it is Ubuntu, then for GCC installation:
$ sudo add-apt-repository ppa:ubuntu-toolchain-r/test
$ sudo apt-get update
$ sudo apt-get install gcc-6 gcc-6-base
Step 4:
Write a sample program, for example:
Welcome.cpp
#include <iostream>
using namespace std;
int main()
{
    cout << "Hello world";
    return 0;
}
Step 5:
First we need to compile and link our program. Assuming the source code is saved in a file Welcome.cpp, we can do that using the GNU C++ compiler g++, for example g++ -Wall -o welcome Welcome.cpp, and the output can be executed by running ./welcome
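Since the experiment title asks for a C compiler, an equivalent plain-C program can be compiled with gcc itself. A minimal sketch (the file name welcome.c is illustrative):

welcome.c
#include <stdio.h>
int main(void)
{
    printf("Hello world\n"); /* print the greeting */
    return 0;
}

Compile and run with gcc -Wall -o welcome welcome.c followed by ./welcome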
Result:
Thus the GCC compiler was installed and a sample program was compiled and executed successfully.
EX.NO: 3
DATE:

INSTALL GOOGLE APP ENGINE. CREATE HELLO WORLD APP AND OTHER SIMPLE WEB APPLICATIONS USING PYTHON/JAVA

AIM:
To install Google App Engine and create a hello world app and other simple web applications using Python/Java.
PROCEDURE:
To create a new App Engine standard project in Eclipse:
Steps to creation:
1. Click the Google Cloud Platform toolbar button.
2. Select Create New Project > Google App Engine Standard Java Project.
3. To create a Maven-based App Engine project, check Create as Maven Project and
enter a Maven Group ID and Artifact ID of your choosing to set the coordinates for
this project. The Group ID is often the same as the package name, but does not have to
be. The Artifact ID is often the same as or similar to the project name, but does not
have to be.
4. Click Next.
5. Select any libraries you need in the project.
6. Click Finish.
The wizard generates a native Eclipse project, with a simple servlet, that you can run and deploy from
the IDE.
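For reference, the generated servlet looks roughly like the following minimal sketch (the class name and URL pattern are illustrative, not the exact code the wizard emits):

import java.io.IOException;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// A hello-world servlet mapped to /hello
@WebServlet(name = "HelloAppEngine", urlPatterns = {"/hello"})
public class HelloAppEngine extends HttpServlet {
    @Override
    public void doGet(HttpServletRequest request, HttpServletResponse response)
            throws IOException {
        response.setContentType("text/plain");
        response.getWriter().println("Hello App Engine!");
    }
}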
Running the project locally
Steps for Running the Project locally:
1. Select the project in the Project Explorer or Package Explorer.
2. Open the context menu.
3. Select Run As > App Engine.
Log messages appear in the console as the server starts up. Eclipse then opens its internal web browser to your application. You can also open an external browser and navigate to http://localhost:8080. Either way, you'll see a static HTML page with a link to the servlet.
Note: You may see a message that says, Port 8080 already in use. If so, you can run
your application on a different host or port.
Debugging the project locally
To debug your project locally, complete the running the project locally steps, except select
Debug As > App Engine instead of Run As > App Engine in the context menu.
The server stops at the breakpoints you set and shows the Eclipse debugger view.
Running App Engine Standard Apps on a Different Host or Port
To run your App Engine standard application on a different host or port:
1. Right-click your project.
2. Select Run As > Run on Server.
Note: You can also select Debug As > Debug on Server to debug your application on a
different host or port.
3. In the dialog, select Manually define a new server.
4. Select App Engine Standard as the server type.
5. Enter the hostname in the Server's host name field.
6. Enter the port in the Server port field.
7. Click Finish.
Configuring Eclipse
To configure Cloud Tools for Eclipse to use Objectify:
1. In Eclipse, select Run > Run Configurations.
2. In the Run Configurations dialog, select an existing App Engine Local Server
launch configuration, or click the New launch configuration button to create one.
3. Select the Cloud Platform tab of your run configuration.
4. Select an account.
5. Select a project to assign a project ID to be used in the local run. It doesn't matter
which project you select because you won't actually connect to it.
6. As an alternative, if you aren't logged in or don't have a Cloud Project, you can instead set the GOOGLE_CLOUD_PROJECT environment variable to a legal string, such as MyProjectId, in the Environment tab of the run configuration.
Result:
Thus Google App Engine was installed and a sample web application was created and executed successfully.
EX.NO: 4
DATE:

USE GAE LAUNCHER TO LAUNCH THE WEB APPLICATIONS

AIM:
To use the GAE launcher to launch the web applications (Eclipse).
Procedure:
Step 1:
Install Eclipse and create a GAE web application as in the previous experiment.
Step 2:
Deploying App Engine Standard Applications from Eclipse
The following steps cover creating a new App Engine app in the Google Cloud Console, authenticating with Google, and deploying your project to App Engine.
Before you begin
You need a Google Cloud project with an App Engine application to deploy to. If you don't
already have one, use the Google Cloud Console to set up your Cloud project:
1. Select or create a new Cloud project.
2. Sign in to a Google account that is used to deploy your project to App Engine.
3. Select File > Sign in to Google.
If you see Manage Google Accounts instead of the Sign in to Google option, that means you
are already signed in, so you can skip these account sign in steps.
4. Your system browser opens outside of Eclipse and asks for the permissions it needs to
manage your App Engine Application.
5. Click Allow and close the window. Eclipse is now signed in to your account.
6. Ensure that the appengine-web.xml file is in the WEB-INF folder of your web application (a minimal sketch of this file appears after this list).
7. Ensure that the project has the App Engine Project facet. If you created it using the wizard, it should already have this facet. Otherwise:
8. Right-click the project in the Package Explorer to bring up the context menu.
9. Select Configure > Convert to App Engine Project.
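A minimal sketch of appengine-web.xml (the runtime value depends on the Java version you target; java8 is assumed here):

<?xml version="1.0" encoding="utf-8"?>
<appengine-web-app xmlns="http://appengine.google.com/ns/1.0">
  <runtime>java8</runtime>
  <threadsafe>true</threadsafe>
</appengine-web-app>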
Deploy the Project to App Engine
To deploy the project to App Engine standard environment:
1. Right click the project in the Package Explorer to open the context menu.
2. Select Deploy to App Engine Standard.
3. A dialog pops up.
4. Select the account you want to deploy with, or add a new account.
5. The list of projects the account has access to loads. Select the one you want to deploy to.
6. Click OK.
A background job launches that deploys the project to App Engine. The output of the job is
visible in the Eclipse Console view.
By default, App Engine stops the previous version of your application and immediately
promotes your new code to receive all traffic. If you'd rather manually promote it later
using gcloud or the Google Cloud Console, uncheck Promote the deployed version to receive
all traffic. If you don't want to stop the previous version, uncheck Stop previous version.
Result:
Thus the web application was deployed and launched on Google App Engine successfully.
EX.NO: 5
DATE:

SIMULATE A CLOUD SCENARIO USING CLOUDSIM AND RUN A SCHEDULING ALGORITHM THAT IS NOT IN CLOUDSIM

Aim:
To simulate a cloud scenario using CloudSim and run a scheduling algorithm that is not present in CloudSim.
Procedure:
1. Before you start, it is essential that CloudSim is already installed/set up on your local machine. In case you are yet to install it, you may follow the process of CloudSim setup using the Eclipse IDE.
2. The main() method is the entry point from where the execution of this example starts:
public static void main(String[] args)
3. There are eleven steps that are followed in each example, with some variation in them, specified as follows:
Step 1: Set the number of users for the current simulation. This user count is directly proportional to the number of brokers in the current simulation.
int num_user = 1; // number of cloud users
Calendar calendar = Calendar.getInstance();
boolean trace_flag = false;
Step 2: Initialize the simulation, provided with the current time, number of users and trace flag.
CloudSim.init(num_user, calendar, trace_flag);
Step 3: Create a Datacenter.
Datacenter datacenter0 = createDatacenter("Datacenter_0");
4. The createDatacenter() method itself initializes the various datacenter characteristics along with the host list. This is the most important entity; without it there is no way to simulate hosting virtual machines.

private static Datacenter createDatacenter(String name) {
    List<Host> hostList = new ArrayList<Host>();
    List<Pe> peList = new ArrayList<Pe>();
    int mips = 1000;
    peList.add(new Pe(0, new PeProvisionerSimple(mips)));
    int hostId = 0;
    int ram = 2048;         // host memory (MB)
    long storage = 1000000; // host storage
    int bw = 10000;
    // create the Host with its id, provisioners, storage, PE list and scheduler
    hostList.add(new Host(hostId, new RamProvisionerSimple(ram),
            new BwProvisionerSimple(bw), storage, peList,
            new VmSchedulerTimeShared(peList)));
    String arch = "x86";
    String os = "Linux";
    String vmm = "Xen";
    double time_zone = 10.0;
    double cost = 3.0;
    double costPerMem = 0.05;
    double costPerStorage = 0.001;
    double costPerBw = 0.0;
    LinkedList<Storage> storageList = new LinkedList<Storage>();
    DatacenterCharacteristics characteristics = new DatacenterCharacteristics(
            arch, os, vmm, hostList, time_zone, cost, costPerMem,
            costPerStorage, costPerBw);
    Datacenter datacenter = null;
    try {
        datacenter = new Datacenter(name, characteristics,
                new VmAllocationPolicySimple(hostList), storageList, 0);
    } catch (Exception e) {
        e.printStackTrace();
    }
    return datacenter;
}
Step 4: Create a Datacenter Broker.
DatacenterBroker broker = createBroker();
int brokerId = broker.getId();
where the createBroker() method initializes the entity object from the DatacenterBroker class:
private static DatacenterBroker createBroker()
{
DatacenterBroker broker = null;
try {
broker = new DatacenterBroker("Broker");
} catch (Exception e) {
e.printStackTrace();
return null;
}
return broker;
}
Step 5: Create a Virtual Machine(s).
vmlist = new ArrayList<Vm>();
int vmid = 0;
int mips = 1000;
long size = 10000;  // image size (MB)
int ram = 512;      // VM memory (MB)
long bw = 1000;
int pesNumber = 1;  // number of CPUs
String vmm = "Xen"; // VMM name
Vm vm = new Vm(vmid, brokerId, mips, pesNumber, ram, bw, size, vmm,
        new CloudletSchedulerTimeShared());
vmlist.add(vm);
Step 6: Submit Virtual Machine to Datacenter broker.
broker.submitVmList(vmlist);
Step 7: Create Cloudlet(s) by specifying their characteristics.
cloudletList = new ArrayList<Cloudlet>();
int id = 0;
long length = 400000;
long fileSize = 300;
long outputSize = 300;
UtilizationModel utilizationModel = new UtilizationModelFull();
Cloudlet cloudlet = new Cloudlet(id, length, pesNumber, fileSize,
        outputSize, utilizationModel, utilizationModel, utilizationModel);
cloudlet.setUserId(brokerId);
cloudlet.setVmId(vmid);
Step 8: Submit Cloudlets to Datacenter broker.
broker.submitCloudletList(cloudletList);
Step 9: Send call to Start Simulation.
CloudSim.startSimulation();
Step 10: Once there are no more events to execute, send the call to stop the simulation.
CloudSim.stopSimulation();
Step 11 : Finally, print the final status of the Simulation.
List<Cloudlet> newList = broker.getCloudletReceivedList();
printCloudletList(newList);
Where the printCloudletList() method formats the output to correctly display it on the console:

private static void printCloudletList(List<Cloudlet> list) {
    int size = list.size();
    Cloudlet cloudlet;
    String indent = "    ";
    Log.printLine();
    Log.printLine("========== OUTPUT ==========");
    Log.printLine("Cloudlet ID" + indent + "STATUS" + indent
            + "Data center ID" + indent + "VM ID" + indent + "Time"
            + indent + "Start Time" + indent + "Finish Time");
    DecimalFormat dft = new DecimalFormat("###.##");
    for (int i = 0; i < size; i++) {
        cloudlet = list.get(i);
        Log.print(indent + cloudlet.getCloudletId() + indent + indent);
        if (cloudlet.getCloudletStatus() == Cloudlet.SUCCESS) {
            Log.print("SUCCESS");
            Log.printLine(indent + indent + cloudlet.getResourceId()
                    + indent + indent + indent + cloudlet.getVmId()
                    + indent + indent + dft.format(cloudlet.getActualCPUTime())
                    + indent + indent + dft.format(cloudlet.getExecStartTime())
                    + indent + indent + dft.format(cloudlet.getFinishTime()));
        }
    }
}

Once you run the example, the output for CloudSimExample1.java (a table of cloudlet IDs, status, resource IDs, VM IDs and timings) is displayed on the console.
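The experiment asks for a scheduling algorithm that is not present in CloudSim. One minimal sketch, assuming the setup above is extended to several cloudlets and VMs, is a Shortest-Job-First (SJF) binding: sort the cloudlets by length so the shortest jobs are dispatched first, then bind each cloudlet to a VM in round-robin order. Insert this after Step 7 (cloudlet creation) and in place of Step 8 (submission); it uses the stock DatacenterBroker.bindCloudletToVm() call, and java.util.Collections and java.util.Comparator must be imported:

// Shortest-Job-First: order cloudlets by ascending length
Collections.sort(cloudletList, new Comparator<Cloudlet>() {
    public int compare(Cloudlet c1, Cloudlet c2) {
        return Long.compare(c1.getCloudletLength(), c2.getCloudletLength());
    }
});
// Bind each cloudlet to a VM in round-robin order
for (int i = 0; i < cloudletList.size(); i++) {
    broker.bindCloudletToVm(cloudletList.get(i).getCloudletId(),
            vmlist.get(i % vmlist.size()).getId());
}
broker.submitCloudletList(cloudletList);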
Result:
Thus a cloud scenario was simulated using CloudSim and a scheduling algorithm that is not present in CloudSim was implemented successfully.
EX.NO: 6
DATE:

FIND A PROCEDURE TO TRANSFER THE FILES FROM ONE VIRTUAL MACHINE TO ANOTHER VIRTUAL MACHINE

AIM:
To find a procedure to transfer files from one virtual machine to another virtual machine.
PROCEDURE:
Step 1: Start the OpenNebula Sunstone service as the root user and view it at localhost:9869.
root@linux:$ /etc/init.d/opennebula-sunstone restart
Step 2: Create an image, a template and a VM as earlier.
Creating an image:
oneadmin@linux:~/datastores$ oneimage create --name "Ubuntu" --path "/home/linux/Downloads/source/tubuntu1404-5.0.1.qcow2c" --driver qcow2 --datastore default
Creating a template:
oneadmin@linux:~/datastores$ onetemplate create --name "ubuntu1" --cpu 1 --vcpu 1 --memory 1024 --arch x86_64 --disk "Ubuntu" --nic "private" --vnc --ssh
Instantiating the template:
oneadmin@linux:~/datastores$ onetemplate instantiate "ubuntu1"
Step 3: To perform a live migration, we use the onevm command, moving the VM with ID VID=0 to host02 (HID=1).
oneadmin@linux:~/datastores$ onevm migrate --live 0 1
This will move the VM from host01 to host02. The onevm list shows something like the following:
oneadmin@linux:~/datastores$ onevm list
ID USER GROUP NAME STAT CPU MEM HOSTNAME TIME
0 oneadmin oneadmin one-0 runn 0 0k host02 00:00:48
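The migration above moves an entire VM between hosts. To transfer individual files between two running VMs, a common approach (assuming both VMs run an SSH server; the user name and IP address below are illustrative) is scp:

# copy a file from the local VM to the remote VM
scp /home/user/sample.txt user@192.168.1.20:/home/user/
# copy a file from the remote VM back to the local VM
scp user@192.168.1.20:/home/user/report.log /home/user/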
Result:
Thus the procedure to transfer files from one virtual machine to another, and to migrate a VM from one node to another, was executed successfully.
EX.NO: 7
DATE:

INSTALL HADOOP SINGLE NODE CLUSTER AND RUN SIMPLE APPLICATIONS LIKE WORDCOUNT

Aim:
To install a Hadoop single node cluster and run simple applications like wordcount.
Steps:
Install Hadoop
Step 1: Download the Java 8 package and save the file in your home directory.
Step 2: Extract the Java Tar File.
Command: tar -xvf jdk-8u101-linux-i586.tar.gz
Fig: Hadoop Installation – Extracting Java Files
Step 3: Download the Hadoop 2.7.3 Package.
Command: wget https://archive.apache.org/dist/hadoop/core/hadoop-2.7.3/hadoop-2.7.3.tar.gz
Fig: Hadoop Installation – Downloading Hadoop
Step 4: Extract the Hadoop tar File.
Command: tar -xvf hadoop-2.7.3.tar.gz
Fig: Hadoop Installation – Extracting Hadoop Files
Step 5: Add the Hadoop and Java paths in the bash file (.bashrc).
Open the .bashrc file and add the Hadoop and Java paths as shown below.
Command: vi .bashrc
Fig: Hadoop Installation – Setting Environment Variable
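The referenced figure is not reproduced here; a typical set of entries (paths are illustrative, assuming Java and Hadoop were extracted into the home directory) is:

export HADOOP_HOME=$HOME/hadoop-2.7.3
export JAVA_HOME=$HOME/jdk1.8.0_101
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$JAVA_HOME/bin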
Then, save the bash file and close it.
For applying all these changes to the current Terminal, execute the source command.
Command: source .bashrc
Fig: Hadoop Installation – Refreshing environment variables
To make sure that Java and Hadoop have been properly installed on your system and can be accessed through the Terminal, execute the java -version and hadoop version commands.
Command: java -version
Fig: Hadoop Installation – Checking Java Version
Command: hadoop version
Fig: Hadoop Installation – Checking Hadoop Version
Step 6: Edit the Hadoop Configuration files.
Command: cd hadoop-2.7.3/etc/hadoop/
Command: ls
All the Hadoop configuration files are located in hadoop-2.7.3/etc/hadoop directory as
you can see in the snapshot below:
Fig: Hadoop Installation – Hadoop Configuration Files
Step 7: Open core-site.xml and edit the property mentioned below inside
configuration tag:
core-site.xml informs Hadoop daemon where NameNode runs in the cluster. It contains
configuration settings of Hadoop core such as I/O settings that are common to HDFS &
MapReduce.
Command: vi core-site.xml
Fig: Hadoop Installation – Configuring core-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
Step 8: Edit hdfs-site.xml and edit the property mentioned below inside
configuration tag:
hdfs-site.xml contains configuration settings of HDFS daemons (i.e. NameNode,
DataNode, Secondary NameNode). It also includes the replication factor and block size
of HDFS.
Command: vi hdfs-site.xml
Fig: Hadoop Installation – Configuring hdfs-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
</configuration>
Step 9: Edit the mapred-site.xml file and edit the property mentioned below
inside configuration tag:
mapred-site.xml contains configuration settings of MapReduce application like number
of JVM that can run in parallel, the size of the mapper and the reducer process, CPU
cores available for a process, etc.
In some cases, mapred-site.xml file is not available. So, we have to create the mapred-
site.xml file using mapred-site.xml template.
Command: cp mapred-site.xml.template mapred-site.xml
Command: vi mapred-site.xml
Fig: Hadoop Installation – Configuring mapred-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
Step 10: Edit yarn-site.xml and edit the property mentioned below inside
configuration tag:
yarn-site.xml contains configuration settings of ResourceManager and NodeManager
like application memory management size, the operation needed on program &
algorithm, etc.
Command: vi yarn-site.xml
Fig: Hadoop Installation – Configuring yarn-site.xml
<?xml version="1.0"?>
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
</configuration>
Step 11: Edit hadoop-env.sh and add the Java Path as mentioned below:
hadoop-env.sh contains the environment variables that are used in the script to run Hadoop, like the Java home path, etc.
Command: vi hadoop-env.sh
Fig: Hadoop Installation – Configuring hadoop-env.sh
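The referenced figure is not reproduced here; the relevant line (path illustrative, matching the .bashrc entry above) is:

export JAVA_HOME=$HOME/jdk1.8.0_101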
Step 12: Go to Hadoop home directory and format the NameNode.
Command: cd
Command: cd hadoop-2.7.3
Command: bin/hadoop namenode -format
Fig: Hadoop Installation – Formatting NameNode
This formats HDFS via the NameNode. This command is only executed the first time you set up the cluster. Formatting the file system means initializing the directory specified by the dfs.name.dir variable.
Never format an up-and-running Hadoop filesystem; you will lose all the data stored in HDFS.
Step 13: Once the NameNode is formatted, go to hadoop-2.7.3/sbin directory and start all the
daemons.
Command: cd hadoop-2.7.3/sbin
Either you can start all daemons with a single command or do it individually.
Command: ./start-all.sh
The above command is a combination of start-dfs.sh, start-yarn.sh & mr-jobhistory-
daemon.sh
Or you can run all the services individually as below:
Start NameNode:
The NameNode is the centerpiece of an HDFS file system. It keeps the directory tree of all files stored in HDFS and tracks all the files stored across the cluster.
Command: ./hadoop-daemon.sh start namenode
Fig: Hadoop Installation – Starting NameNode
Start DataNode:
On startup, a DataNode connects to the Namenode and it responds to the requests
from the Namenode for different operations.
Command: ./hadoop-daemon.sh start datanode
Fig: Hadoop Installation – Starting DataNode
Start ResourceManager:
ResourceManager is the master that arbitrates all the available cluster resources and
thus helps in managing the distributed applications running on the YARN system.
Its work is to manage each NodeManager and each application's ApplicationMaster.
Command: ./yarn-daemon.sh start resourcemanager
Fig: Hadoop Installation – Starting ResourceManager
Start NodeManager:
The NodeManager in each machine framework is the agent which is responsible for
managing containers, monitoring their resource usage and reporting the same to the
ResourceManager.
Command: ./yarn-daemon.sh start nodemanager
Fig: Hadoop Installation – Starting NodeManager
Start JobHistoryServer:
JobHistoryServer is responsible for servicing all job history related requests from client.
Command: ./mr-jobhistory-daemon.sh start historyserver
Step 14: To check that all the Hadoop services are up and running, run the below command.
Command: jps
Fig: Hadoop Installation – Checking Daemons
Step 15: Now open the Mozilla browser and go
to localhost:50070/dfshealth.html to check the NameNode interface.
Fig: Hadoop Installation – Starting WebUI
Congratulations, you have successfully installed a single node Hadoop cluster.
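To complete the aim, run the wordcount example that ships with Hadoop 2.7.3 from the hadoop-2.7.3 home directory (the input file name and HDFS paths below are illustrative):

Command: bin/hdfs dfs -mkdir -p /user/input
Command: bin/hdfs dfs -put ~/sample.txt /user/input
Command: bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar wordcount /user/input /user/output
Command: bin/hdfs dfs -cat /user/output/part-r-00000

The last command prints each word in sample.txt together with its count.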
Result:
Thus a Hadoop single node cluster was installed and a simple application (wordcount) was executed successfully.
EX.NO: 8
DATE:

CREATING AND EXECUTING YOUR FIRST CONTAINER USING DOCKER

AIM:
To create and execute a container using Docker.
STEP:1 Installing Docker on the System
To begin, you will need to install Docker on your system.
Docker provides installers for Windows, macOS, and various flavors of Linux, making it accessible to a wide range
of users.
Below are the commands you can use to install Docker on Ubuntu:
sudo apt update
sudo apt install apt-transport-https ca-certificates curl software-properties-common
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
sudo apt update
sudo apt install docker-ce
Once the installation is complete, you can verify it by checking the Docker version and making sure the
Docker daemon is running.
STEP:2 Verifying the Installation and Accessing the Docker CLI
For those using the Ubuntu operating system, you can verify the Docker installation by running the following
command:
docker --version
sudo systemctl status docker
With Docker successfully installed, you can now access the Docker command-line interface (CLI) to start
creating and managing containers.
The CLI provides a set of commands for interacting with Docker, allowing you to build, run, and manage
containers with ease.
STEP:3 Crafting Your First Dockerfile
Some of the key concepts in Docker revolve around creating a Dockerfile, which is a text document that
contains all the commands a user could call on the command line to assemble an image.
The Dockerfile contains all the information Docker needs to build the image. Let’s take a look at how to
define a simple Dockerfile and some best practices for writing it.
STEP:4 Defining a Simple Dockerfile
First, let’s start by creating a basic Dockerfile.
In this example, we’ll create a Dockerfile that simply prints “Hello, World!” when run as a container.
FROM alpine
CMD echo "Hello, World!"
When defining a simple Dockerfile, it’s important to keep it as minimal as possible.
Only include the necessary dependencies and commands required for your application to run within the
container.
This helps to keep the image size small and reduces the attack surface, making it more secure.
STEP:5 Best Practices for Writing Dockerfiles
Dockerfiles should follow best practices to ensure consistency, maintainability, and reusability.
One of the best practices is to use the official base images from Docker Hub, as they are well-maintained and
regularly updated. It’s also important to use specific versions of the base images to avoid unexpected changes.
FROM node:14
COPY . /app
WORKDIR /app
RUN npm install
CMD ["npm", "start"]
Best practices for writing Dockerfiles also include using a .dockerignore file to specify files and directories to
exclude from the context when building the image.
This helps to reduce the build context and improve build performance.
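A minimal sketch of a .dockerignore file for the Node.js image above (the entries are illustrative):

node_modules
.git
*.log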
Some additional best practices for writing Dockerfiles include avoiding running commands as root, using
multi-stage builds for smaller images, and using environment variables for configuration.
STEP6: Building and Running Your Container
To build and run your Docker container, you will need to follow a few simple steps.
First, you will need to build the Docker image from your Dockerfile.
Once the image is built, you can run your container using the Docker run command. In this section, we will
walk through each step in detail.
Building the Docker Image from Your Dockerfile
To build the Docker image from your Dockerfile, you will need to navigate to the directory where your
Dockerfile is located and run the following command:
docker build -t your-image-name .
This command will build the Docker image using the instructions specified in your Dockerfile.
Once the build process is complete, you will have a new Docker image ready for use.
Running Your Docker Container
To run your Docker container, you will need to use the Docker run command followed by the name of the
image you want to run.
For example:
docker run your-image-name
Running this command will start a new container based on the specified image.
Depending on your application, you may need to specify additional options for the docker run command,
such as port bindings or environment variables.
docker run -p 8080:80 your-image-name
Your Docker container is now up and running, ready to serve your application to the world.
Managing Your Docker Container
Unlike traditional virtual machines, where you need to manually install and configure software, Docker
containers are designed to be easily managed and manipulated.
Let’s take a look at some key ways to manage your Docker containers.
Monitoring Container Performance
With Docker, you can easily monitor the performance of your containers using built-in commands.
By running docker stats , you can view real-time CPU, memory, and network usage for all running
containers.
This can help you identify any resource bottlenecks and optimize your container performance.
Stopping, Starting, and Removing Containers
The Docker CLI provides simple commands for stopping, starting, and removing containers.
The command
docker stop [container_name]
will gracefully stop a running container, while
docker start [container_name]
will restart a stopped container.
To remove a container entirely, use the command
docker rm [container_name]
Additionally, you can use the docker ps command to list all running containers, and docker ps -a to see all
containers, including those that are stopped.
This gives you full visibility and control over your containers.
RESULT:
Thus a container was created and executed using Docker successfully.
VIVA QUESTIONS AND ANSWERS
1. Define Cloud Computing with example.
Cloud computing is a model for enabling convenient, on-demand network access to a
shared pool of configurable computing resources (e.g., networks, servers, storage, applications,
and services) that can be rapidly provisioned and released with minimal management effort or
service provider interaction.
2. What is the working principle of Cloud Computing?
The cloud is a collection of computers and servers that are publicly accessible via the
Internet. This hardware is typically owned and operated by a third party on a consolidated
basis in one or more data center locations. The machines can run any combination of
operating systems.
3. What are the advantages and disadvantages of Cloud Computing?
Advantages
Lower-Cost Computers for Users
Improved Performance
Lower IT Infrastructure Costs
Fewer Maintenance Issues
Lower Software Costs
Instant Software Updates
Increased Computing Power
Unlimited Storage Capacity
Increased Data Safety
Improved Compatibility Between Operating Systems
Improved Document Format Compatibility
Easier Group Collaboration
Universal Access to Documents
Latest Version Availability
Removes the Tether to Specific Devices
Disadvantages
Requires a Constant Internet Connection
Doesn’t Work Well with Low-Speed Connections
Can Be Slow
Features Might Be Limited
Stored Data Might Not Be Secure
If the Cloud Loses Your Data, You’re Screwed
4. What is distributed system?
A distributed system is a software system in which components located on networked computers
communicate and coordinate their actions by passing messages. The components interact with each other in
order to achieve a common goal.
Three significant characteristics of distributed systems are:
Concurrency of components
Lack of a global clock
Independent failure of components
What is a cluster?
A computing cluster consists of interconnected stand-alone computers which work cooperatively as a single integrated computing resource. In the past, clustered computer systems have demonstrated impressive results in handling heavy workloads with large data sets.
5. What is grid computing?
Grid Computing enables virtual organizations to share geographically distributed resources as they pursue common goals, assuming the absence of central location, central control, omniscience, and an existing trust relationship.
(or)
Grid technology demands new distributed computing models, software/middleware support, network protocols, and hardware infrastructures. National grid projects are followed by industrial grid platform development by IBM, Microsoft, Sun, HP, Dell, Cisco, EMC, Platform Computing, and others. New grid service providers (GSPs) and new grid applications have emerged rapidly, similar to the growth of Internet and web services in the past two decades. Grid systems are classified in essentially two categories: computational or data grids and P2P grids.
6. What are the business areas that need Grid computing?
Life Sciences
Financial services
Higher Education
Engineering Services
Government
Collaborative games
7. List out the Grid Applications:
Application partitioning that involves breaking the problem into discrete pieces
Discovery and scheduling of tasks and workflow
Data communications distributing the problem data where and when it is required
Provisioning and distributing application codes to specific system nodes
Autonomic features such as self-configuration, self-optimization, self-recovery and self-
management
8. List some grid computing toolkits and frameworks?
Globus Toolkit
Globus Resource Allocation Manager (GRAM)
Grid Security Infrastructure (GSI)
Information Services
Legion, Condor and Condor-G
NIMROD, UNICORE, NMI.
9.What are Desktop Grids?
These are grids that leverage the compute resources of desktop computers.
Because of the true (but unfortunate) ubiquity of Microsoft® Windows® operating
system in corporations, desktop grids are assumed to apply to the Windows environment.
The Mac OS™ environment is supported by a limited number of vendors.
10. What are Server Grids?
Some corporations, while adopting Grid Computing , keep it limited to server resources that are
within the purview of the IT department.
Special servers, in some cases, are bought solely for the purpose of creating an internal “utility
grid” with resources made available to various departments.
No desktops are included in server grids. These usually run some flavor of the Unix/Linux
operating system.
11. Define OpenNebula.
OpenNebula is an open source management tool that helps virtualized data centers oversee private clouds, public clouds and hybrid clouds. OpenNebula is vendor neutral, as well as platform- and API-agnostic. It can use KVM, Xen or VMware hypervisors.
12.Define Eclipse.
Eclipse is an integrated development environment (IDE) used in computer programming, and is the most
widely used Java IDE. It contains a base workspace and an extensible plug-in system for customizing the
environment.
13. Define Netbeans.
NetBeans is an open-source integrated development environment (IDE) for developing with Java, PHP,
C++, and other programming languages. NetBeans is also referred to as a platform of modular components
used for developing Java desktop applications.
14. Define Apache Tomcat.
Apache Tomcat (or Jakarta Tomcat or simply Tomcat) is an open source servlet container developed by the Apache Software Foundation (ASF). Tomcat implements the Java Servlet and the JavaServer Pages (JSP) specifications from Sun Microsystems, and provides a "pure Java" HTTP web server environment for Java code to run.
15. What is private cloud?
The private cloud is built within the domain of an intranet owned by a single organization.
Therefore, they are client owned and managed. Their access is limited to the owning clients and their
partners. Their deployment was not meant to sell capacity over the Internet through publicly accessible
interfaces. Private clouds give local users a flexible and agile private infrastructure to run service
workloads within their administrative domains.
16. What is public cloud?
A public cloud is built over the Internet, which can be accessed by any user who has paid for the
service. Public clouds are owned by service providers. They are accessed by subscription. Many companies
have built public clouds, namely Google App Engine, Amazon AWS, Microsoft Azure, IBM Blue Cloud,
and Salesforce Force.com. These are commercial providers that offer a publicly accessible remote interface
for creating and managing VM instances within their proprietary infrastructure.
17. What is hybrid cloud?
A hybrid cloud is built with both public and private clouds, Private clouds can also support
a hybrid cloud model by supplementing local infrastructure with computing capacity from an external
public cloud. For example, the research compute cloud (RC2) is a private cloud built by IBM.
18. What is a Community Cloud ?
A community cloud in computing is a collaborative effort in which infrastructure is shared between
several organizations from a specific community with common concerns (security, compliance,
jurisdiction, etc.), whether managed internally or by a third-party and hosted internally or externally. This
is controlled and used by a group of organizations that have shared interest. The costs are spread over
fewer users than a public cloud (but more than a private cloud).
19. Define IaaS?
The IaaS layer offers storage and infrastructure resources that is needed to deliver the Cloud
services. It only comprises of the infrastructure or physical resource. Top IaaS Cloud Computing
Companies: Amazon (EC2), Rackspace, GoGrid, Microsoft, Terremark and Google.
20. Define PaaS?
PaaS provides the combination of both infrastructure and application. Hence, organisations using PaaS don't have to worry about infrastructure or services. Top PaaS Cloud Computing Companies: Salesforce.com, Google, Concur Technologies, Ariba, Unisys and Cisco.
21. Define SaaS?
In the SaaS layer, the Cloud service provider hosts the software upon their servers. It can be defined as a model in which applications and software are hosted on the server and made available to customers over a network. Top SaaS Cloud Computing Companies: Amazon Web Services, AppScale, CA Technologies, Engine Yard, Salesforce and Windows Azure.
22. What is meant by virtualization?
Virtualization is a computer architecture technology by which multiple virtual machines (VMs) are multiplexed in the same hardware machine. The idea of VMs can be dated back to the 1960s. The purpose of a VM is to enhance resource sharing by many users and improve computer performance in terms of resource utilization and application flexibility.
23. What are the implementation levels of virtualization?
The virtualization types are the following:
1. OS-level virtualization
2. ISA-level virtualization
3. User-application-level virtualization
4. Hardware-level virtualization
5. Library-level virtualization
24.List the requirements of VMM?
There are three requirements for a VMM.
First, a VMM should provide an environment for programs which is essentially identical to the
original machine.
Second, programs run in this environment should show, at worst, only minor decreases in speed.
Third, a VMM should be in complete control of the system resources.
25. Explain Host OS and Guest OS?
A comparison of the differences between a host system, a guest system, and a virtual machine within
a virtual infrastructure.
A host system (host operating system) would be the primary & first installed operating system. If
you are using a bare metal Virtualization platform like Hyper-V or ESX, there really isn’t a host
operating system besides the Hypervisor. If you are using a Type-2 Hypervisor like VMware Server or
Virtual Server, the host operating system is whatever operating system those applications are installed
into.
A guest system (guest operating system) is a virtual guest or virtual machine (VM) that is installed
under the host operating system. The guests are the VMs that you run in your virtualization platform.
26. Write the steps for live VM migration?
The steps for live VM migration are:
Stage 0: Pre-Migration
Active VM on Host A
Alternate physical host may be preselected for migration
Block devices mirrored and free resources maintained
Stage 1: Reservation
Initialize a container on the target host
Stage 2: Iterative pre-copy
Enable shadow paging
Copy dirty pages in successive rounds.
Stage 3: Stop and copy
Suspend VM on host A
Generate ARP to redirect traffic to Host B
Synchronize all remaining VM state to Host B
Stage 4: Commitment
VM state on Host A is released
Stage 5: Activation
VM starts on Host B
Connects to local devices
Resumes normal operation
27. Define Globus Toolkit: Grid Computing Middleware
Globus is open source grid software that addresses the most challenging problems in distributed resource sharing.
The Globus Toolkit includes software services and libraries for distributed security, resource
management, monitoring and discovery, and data management.
28. Define Blocks in HDFS
A disk has a block size, which is the minimum amount of data that it can read or write. Filesystems for
a single disk build on this by dealing with data in blocks, which are an integral multiple of the disk
block size. Filesystem blocks are typically a few kilobytes in size, while disk blocks are normally 512
bytes. This is generally transparent to the filesystem user who is simply reading or writing a file—of
whatever length.
29. Define Namenodes and Datanodes
An HDFS cluster has two types of node operating in a master-worker pattern:
a namenode (the master) and
a number of datanodes(workers).
The namenode manages the filesystem namespace. It maintains the filesystem tree and the metadata
for all the files and directories in the tree. This information is stored persistently on the local disk in
the form of two files: the namespace image and the edit log.
The namenode also knows the datanodes on which all the blocks for a given file are located,
however, it does not store block locations persistently, since this information is reconstructed from
datanodes when the system starts.
30. Define HADOOP.
Hadoop is an open source, Java-based programming framework that supports the processing and storage of
extremely large data sets in a distributed computing environment. It is part of the Apache project sponsored
by the Apache Software Foundation.
31. Define HDFS.
Hadoop Distributed File System (HDFS) is a Java-based file system that provides scalable and reliable data
storage that is designed to span large clusters of commodity servers. HDFS, MapReduce, and YARN form
the core of Apache™ Hadoop®.
32. Write about HADOOP.
Hadoop was created by Doug Cutting and Mike Cafarella in 2005. Cutting, who was working at Yahoo! at
the time, named it after his son's toy elephant. It was originally developed to support distribution for the
Nutch search engine project.
33. Definition of Grid Portal:
A Grid Portal provides an efficient infrastructure to put Grid-empowered applications on corporate
Intranet/Internet.
34. Define GAE.
Google App Engine (often referred to as GAE or simply App Engine) is a Platform as a Service and cloud
computing platform for developing and hosting web applications in Google-managed data centers.
Applications are sandboxed and run across multiple servers. App Engine offers automatic scaling for web
applications—as the number of requests increases for an application, App Engine automatically allocates
more resources for the web application to handle the additional demand.
35. What is Cloudsim?
CloudSim is a simulation toolkit that supports the modeling and simulation of the core functionality of
cloud, like job/task queue, processing of events, creation of cloud entities(datacenter, datacenter brokers,
etc.), communication between different entities, implementation of broker policies, etc. This toolkit allows you to:
Test application services in a repeatable and controllable environment.
Tune the system bottlenecks before deploying apps in an actual cloud.
Experiment with different workload mix and resource performance scenarios on simulated
infrastructure for developing and testing adaptive application provisioning techniques
36. Core features of CloudSim are:
Support for modeling and simulation of large-scale computing environments such as federated cloud data centers and virtualized server hosts, with customizable policies for provisioning host resources to virtual machines and energy-aware computational resources.
It is a self-contained platform for modeling cloud service brokers, provisioning, and allocation policies.
It supports the simulation of network connections among simulated system elements.
Support for simulation of a federated cloud environment that inter-networks resources from both private and public domains.
Availability of a virtualization engine that aids in the creation and management of multiple independent and co-hosted virtual services on a data center node.
Flexibility to switch between space-shared and time-shared allocation of processing cores to virtualized services.
37. Uses of Cloudsim.
Load Balancing of resources and tasks
Task scheduling and its migrations
Optimizing the Virtual machine allocation and placement policies
Energy-aware Consolidations or Migrations of virtual machines
Optimizing schemes for Network latencies for various cloud scenarios
38. Define OpenStack.
OpenStack is a cloud operating system that controls large pools of compute, storage, and networking resources throughout a datacenter, all managed and provisioned through APIs with common authentication mechanisms.
A dashboard is also available, giving administrators control while empowering their users to provision resources through a web interface.
39. Define Trystack.
TryStack is a great way to take OpenStack for a spin without having to commit to a full deployment.
This free service lets you test what the cloud can do for you, offering networking, storage and compute instances, without having to go all in with your own hardware.
It's a labor of love spearheaded by three Red Hat OpenStack experts: Will Foster, Kambiz Aghaiepour and Dan Radez.
TryStack's set-up must bear the load of anyone who wants to use it, but instead of an equally boundless budget and paid staff, it was originally powered by donated equipment and volunteers from Cisco, Dell, Equinix, NetApp, Rackspace and Red Hat who pulled together for this OpenStack Foundation project.
40. Define Hadoop.
Hadoop is an open-source software framework for storing data and running
applications on clusters of commodity hardware. It provides massive storage for any kind of
data, enormous processing power and the ability to handle virtually limitless concurrent tasks
or jobs.