Deepshikha Agrawal Pushp B.Sc. (IT), MBA (IT) Certification-Hadoop, Spark, Scala, Python, Tableau, ML (Assistant Professor JLBS)

This document provides instructions for setting up a single-node Hadoop installation on Ubuntu Linux. It outlines prerequisites like installing Java JDK 7 and configuring SSH access. It then describes basic configuration steps like editing configuration files and properties for core-site.xml, hdfs-site.xml and mapred-site.xml. Finally, it explains formatting and checking the HDFS filesystem, starting and stopping the single-node cluster, and adding a user.

Uploaded by

Ashita Punjabi
Deepshikha Agrawal Pushp

B.Sc. (IT), MBA (IT)
Certification: Hadoop, Spark, Scala, Python, Tableau, ML
(Assistant Professor, JLBS)
HDFS Commands

By Prof. Deepshikha Pushp


Interfaces to HDFS

Java API (DistributedFileSystem)
C wrapper (libhdfs)
HTTP protocol
WebDAV protocol
Shell commands
However, the command line is one of the simplest
and most familiar interfaces.
A table of all the operations is shown below. The following
conventions are used for parameters:

"<path>" means any file or directory name.
"<path>..." means one or more file or directory names.
"<file>" means any filename.
"<src>" and "<dest>" are path names in a directed operation.
"<localSrc>" and "<localDest>" are paths as above, but on the local file system.
HDFS – Shell Commands

mkdir
HDFS Command to create the directory in HDFS.
Usage: hadoop fs -mkdir /directory_name
Command: hadoop fs -mkdir /newfold
Note: Here we are creating a directory named “newfold” in HDFS.
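Every command in this deck has the same shape: hadoop fs, then an operation name prefixed with a plain ASCII hyphen, then one or more paths. A minimal sketch of that shape in Python is given here; the helper name hadoop_fs is hypothetical, and nothing is executed, so it can be tried on a machine without Hadoop. It also rejects the typographic dash that slide software tends to substitute for the hyphen, since the shell would not treat it as an option.

```python
import shlex


def hadoop_fs(operation: str, *paths: str) -> list:
    """Build the argument list for `hadoop fs -<operation> <paths...>`.

    `hadoop_fs` is a hypothetical helper for illustration only.
    """
    # Operation names such as mkdir, ls, put are plain ASCII letters.
    # Anything else (including a pasted en dash) is rejected early.
    if not (operation.isascii() and operation.isalnum()):
        raise ValueError("operation must be a plain name such as 'mkdir'")
    return ["hadoop", "fs", "-" + operation, *paths]


cmd = hadoop_fs("mkdir", "/newfold")
# Show the command as it would be typed in a terminal.
print(" ".join(shlex.quote(part) for part in cmd))
```

To actually run the command on a node where Hadoop is installed, the returned list could be passed to subprocess.run(cmd, check=True).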
ls
HDFS Command to display the list of Files and Directories in HDFS.
Command: hadoop fs -ls /
Note: Here we are listing the directories in HDFS.
lsr
HDFS Command to display the list of Files and Directories with subdirectories in HDFS.
Command: hadoop fs -lsr /
Note: Here we are listing the directories with their subdirectories in
HDFS. In Hadoop 2 and later, -lsr is deprecated in favor of -ls -R.
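Each line of an ls or lsr listing carries the permissions, replication factor, owner, group, size, modification date and time, and path, with "-" in the replication column for directories. The sketch below parses such a listing into records. The sample text, including the owner hduser and the sizes, is made up for illustration, and the function name parse_ls_output is hypothetical.

```python
from typing import NamedTuple, Optional


class Entry(NamedTuple):
    is_dir: bool
    replication: Optional[int]  # None for directories
    owner: str
    group: str
    size: int
    path: str


def parse_ls_output(text: str) -> list:
    """Parse the text printed by `hadoop fs -ls` or `hadoop fs -lsr`."""
    entries = []
    for line in text.splitlines():
        # Split into at most 8 fields so a path containing spaces stays whole.
        fields = line.split(None, 7)
        if len(fields) != 8:
            continue  # skips blank lines and the "Found N items" header
        perms, repl, owner, group, size, _date, _time, path = fields
        entries.append(Entry(
            is_dir=perms.startswith("d"),
            replication=None if repl == "-" else int(repl),
            owner=owner,
            group=group,
            size=int(size),
            path=path,
        ))
    return entries


# Hypothetical listing, not captured from a real cluster.
sample = """\
Found 2 items
drwxr-xr-x   - hduser supergroup          0 2013-05-01 10:15 /newfold
-rw-r--r--   1 hduser supergroup         42 2013-05-01 10:16 /newfold/doc1
"""
for entry in parse_ls_output(sample):
    print("dir " if entry.is_dir else "file", entry.size, entry.path)
```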
du
HDFS Command to check the file size.
Usage: hadoop fs -du /directory/filename
Command: hadoop fs -du /new_fold
count
HDFS Command to count the number of directories, files and bytes under the paths that
match the specified file pattern.
Usage: hadoop fs -count <path>
Command: hadoop fs -count /user/deepshikha1
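The count command prints one line per path with four columns: directory count, file count, content size in bytes, and the path name. A small sketch of reading such a line follows; the numbers in the sample are invented for illustration and the name parse_count_line is hypothetical.

```python
from typing import NamedTuple


class CountRow(NamedTuple):
    dir_count: int
    file_count: int
    content_size: int  # bytes
    path: str


def parse_count_line(line: str) -> CountRow:
    """Parse one output line of `hadoop fs -count <path>`."""
    # At most 4 fields, so a path that contains spaces is kept intact.
    dirs, files, size, path = line.split(None, 3)
    return CountRow(int(dirs), int(files), int(size), path.rstrip())


# Hypothetical output line; real values depend on the cluster contents.
sample_line = "           3            5               1024 /user/deepshikha1"
row = parse_count_line(sample_line)
print(row.dir_count, "directories,", row.file_count, "files,",
      row.content_size, "bytes under", row.path)
```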
text
HDFS Command that takes a source file and outputs the file in text format.
Usage: hadoop fs -text /directory/filename
Command: hadoop fs -text /new4/doc1
put
HDFS Command to copy single source, or multiple sources from local file system to the
destination file system.
Usage: hadoop fs -put <localsrc> <destination>
Command: hadoop fs -put /home/deepshikha1/doc2 /new
Note: The copyFromLocal command is similar to the put command, except that the source is restricted to a
local file reference.
get
HDFS Command to copy files from hdfs to the local file system.
Usage: hadoop fs -get <src> <localdst>
Command: hadoop fs -get /newfold/doc3 /home/deepshikha1
copyToLocal
HDFS Command to copy the file from HDFS to Local File System.
Usage: hadoop fs -copyToLocal <hdfs source> <localdst>
Command: hadoop fs -copyToLocal /newfold/doc2 /home/deepshikha1
Note: Here doc2 is a file in the newfold directory of HDFS. After the command is executed,
doc2 is copied to the local directory /home/deepshikha1.
copyFromLocal
HDFS Command to copy the file from Local file system to HDFS.
Usage: hadoop fs -copyFromLocal <localsrc> <hdfs destination>
Command: hadoop fs -copyFromLocal /home/deepshikha1/doc2 /newfold
touchz
HDFS Command to create a file in HDFS with file size 0 bytes.
Usage: hadoop fs -touchz /directory/filename
Command: hadoop fs -touchz /newfold/smple
moveToLocal
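HDFS Command to move the file from HDFS to Local File System.
Usage: hadoop fs -moveToLocal <hdfs source> <localdst>
Command: hadoop fs -moveToLocal /newfold/doc3 /home/deepshikha1
Note: In Hadoop 1.x this option only prints a "not implemented yet" message. To get the same effect, copy the file with get or copyToLocal and then remove it from HDFS with rm.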
mv
HDFS Command to move files from source to destination. This command allows multiple
sources as well, in which case the destination needs to be a directory.
Usage: hadoop fs -mv <src> <dest>
Command: hadoop fs -mv /newfold/doc3 /new
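getmerge
HDFS Command to concatenate the files in an HDFS source directory into a single file on the local file system.
Usage: hadoop fs -getmerge <src> <localdst>
Command: hadoop fs -getmerge /newfold/doc1 /new/doc3
Note: The destination is a path on the local file system, not in HDFS.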
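To make the effect of getmerge concrete, the sketch below imitates it on the local file system only: every regular file in a source directory is appended to one destination file. It does not talk to HDFS. The function name merge_directory, the part file names, and the choice to process files in name order are assumptions of this sketch.

```python
import tempfile
from pathlib import Path


def merge_directory(src_dir: Path, dest_file: Path) -> int:
    """Concatenate all regular files in src_dir into dest_file.

    Returns the number of bytes written. Files are taken in name
    order, which is an assumption made for this local illustration.
    """
    total = 0
    with dest_file.open("wb") as out:
        for part in sorted(p for p in src_dir.iterdir() if p.is_file()):
            data = part.read_bytes()
            out.write(data)
            total += len(data)
    return total


with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for an HDFS directory holding several part files.
    src = Path(tmp) / "doc1"
    src.mkdir()
    (src / "part-00000").write_bytes(b"alpha\n")
    (src / "part-00001").write_bytes(b"beta\n")

    merged = Path(tmp) / "doc3"
    written = merge_directory(src, merged)
    merged_bytes = merged.read_bytes()

print(written, merged_bytes)
```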
rm
HDFS Command to remove the file from HDFS.
Usage: hadoop fs -rm <path>
Command: hadoop fs -rm /newfold/doc2
rmr
HDFS Command to remove the entire directory and all of its content from HDFS.
Usage: hadoop fs -rmr <path>
Command: hadoop fs -rmr /new
Note: In Hadoop 2 and later, -rmr is deprecated in favor of -rm -r.
expunge
HDFS Command that empties the trash.
Command: hadoop fs -expunge
cat
HDFS Command that copies source paths to stdout.
Usage: hadoop fs -cat /path/to/file_in_hdfs
Command: hadoop fs -cat /new4/doc1
cp
HDFS Command to copy files from source to destination.
This command allows multiple sources as well, in which
case the destination must be a directory.
Usage: hadoop fs -cp <src> <dest>
Command: hadoop fs -cp /user/hadoop/file1 /user/hadoop/file2
Command: hadoop fs -cp /newfold/doc/file1 /user/hadoop/file2 /user/hadoop/dir  
Thank You
Hadoop Installation
Purpose

This presentation describes how to set up and configure a single-node Hadoop installation so
that you can quickly perform simple operations using Hadoop MapReduce and the
Hadoop Distributed File System (HDFS).
Prerequisites

Supported Platform
Prerequisites
1. Ubuntu Linux 12.10
2. Installing Java JDK 7 update
3. Adding a dedicated Hadoop system user
4. Configuring SSH access
Basic Configuration – Java jdk7

Hadoop requires a working Java 1.5+ installation


Update the source list:
user@ubuntu:~$ sudo apt-get update
(If you already have the Java JDK installed on your system, you do not need to run the above command.)
To install it:
user@ubuntu:~$ sudo apt-get install openjdk-7-jdk
Command to check the Java version:
user@ubuntu:~$ java -version
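The version check can also be scripted. The following sketch extracts the Java feature version from the text that java -version prints (the JVM writes it to standard error) and compares it with the 1.5 minimum stated above. The sample strings and the function name java_feature_version are illustrative assumptions, not output captured from a real machine.

```python
import re


def java_feature_version(version_output: str) -> int:
    """Return 7 for '1.7.0_21', 11 for '11.0.2', and so on."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        raise ValueError("no version string found")
    major = int(match.group(1))
    minor = int(match.group(2) or 0)
    # Old releases are numbered 1.x, newer ones use the feature number first.
    return minor if major == 1 else major


# Hypothetical first line of `java -version` output.
sample_output = 'java version "1.7.0_21"'
feature = java_feature_version(sample_output)
print("Java", feature, "is new enough" if feature >= 5 else "is too old")
```

On a real system the text could be obtained with subprocess.run(["java", "-version"], capture_output=True, text=True).stderr.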
Hadoop on Ubuntu (Single node cluster setup)
Terminal
Create a folder named work in your home directory
Download the hadoop-1.1.2.tar.gz file and save it in the work folder
Extract hadoop-1.1.2.tar.gz in the work folder
Open the hadoop-1.1.2 folder and then open the conf folder
Open the core-site.xml file as a text file
Now set the properties in core-site.xml
Set the properties in the hdfs-site.xml file
Set the properties in the mapred-site.xml file
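The property values themselves were shown on the slides as screenshots and are not in the text. As a hedged illustration, the sketch below generates the three files with the values commonly used for a Hadoop 1.x single-node setup: fs.default.name, dfs.replication, and mapred.job.tracker. The host and port numbers are assumptions and may differ from the ones the original slides used; the function name render_site_xml is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed single-node values; adjust host and ports to your own setup.
SITE_PROPERTIES = {
    "core-site.xml": {"fs.default.name": "hdfs://localhost:9000"},
    "hdfs-site.xml": {"dfs.replication": "1"},
    "mapred-site.xml": {"mapred.job.tracker": "localhost:9001"},
}


def render_site_xml(properties: dict) -> str:
    """Render a Hadoop <configuration> document from name/value pairs."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return '<?xml version="1.0"?>\n' + ET.tostring(root, encoding="unicode")


for filename, props in SITE_PROPERTIES.items():
    print("---", filename)
    print(render_site_xml(props))
```

Each rendered string would be saved under the matching file name in the conf folder of the extracted hadoop-1.1.2 directory.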
Open ~/.bashrc
Configuration
Formatting the HDFS filesystem via the NameNode
Starting single-node cluster
Run jps command to check daemons
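On a Hadoop 1.x single-node cluster, jps is expected to list five Hadoop daemons: NameNode, DataNode, SecondaryNameNode, JobTracker, and TaskTracker. A small sketch for spotting a daemon that failed to start is shown here; the process IDs in the sample are made up, and missing_daemons is a hypothetical helper name.

```python
EXPECTED_DAEMONS = {
    "NameNode",
    "DataNode",
    "SecondaryNameNode",
    "JobTracker",
    "TaskTracker",
}


def missing_daemons(jps_output: str) -> set:
    """Return the expected Hadoop 1.x daemons absent from jps output."""
    running = set()
    for line in jps_output.splitlines():
        parts = line.split()
        if len(parts) == 2:  # each line looks like "<pid> <main class>"
            running.add(parts[1])
    return EXPECTED_DAEMONS - running


# Hypothetical jps output in which the NameNode did not come up.
sample_jps = """\
2287 TaskTracker
2149 JobTracker
1938 DataNode
2085 SecondaryNameNode
2349 Jps
"""
print(sorted(missing_daemons(sample_jps)))  # prints ['NameNode']
```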
To add a user, run this command in the terminal:

sudo adduser username


Stopping the single-node cluster
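The format, start, and stop steps above map to three scripts in the bin folder of a Hadoop 1.x installation: hadoop namenode -format, start-all.sh, and stop-all.sh. The sketch below builds those command lines for a given installation folder and runs nothing unless dry_run is switched off. The folder ~/work/hadoop-1.1.2 follows the steps above but is still an assumption, and the helper names are hypothetical. Formatting erases the HDFS metadata, so it is normally done once, before the first start.

```python
import subprocess
from pathlib import Path


def lifecycle_commands(hadoop_home: str) -> dict:
    """Map each lifecycle step to its Hadoop 1.x command line."""
    bin_dir = Path(hadoop_home).expanduser() / "bin"
    return {
        "format": [str(bin_dir / "hadoop"), "namenode", "-format"],
        "start": [str(bin_dir / "start-all.sh")],
        "stop": [str(bin_dir / "stop-all.sh")],
    }


def run_step(name: str, hadoop_home: str, dry_run: bool = True) -> list:
    """Return the command for a step, executing it only if dry_run is False."""
    argv = lifecycle_commands(hadoop_home)[name]
    if not dry_run:
        subprocess.run(argv, check=True)
    return argv


# Dry run: only shows what would be executed.
for step in ("start", "stop"):
    print(step, "->", " ".join(run_step(step, "~/work/hadoop-1.1.2")))
```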
Thank You
