Running a MapReduce Program
MR Practical
Eclipse is available on the Cloudera desktop
Create a new Java project
Give your project a name
Here the project is named wordcount
Copy all 3 source files into the src folder
The project now shows many errors because the required libraries are missing, so we need to add the relevant jars.
Adding the relevant jars to resolve the errors
Click Add External JARs under Libraries
Select all of the jars provided
Once all the relevant jars are added, the errors go away
Now we need to create an input file with some content
Open a terminal -> navigate to Desktop -> create a new directory -> inside the new directory, create a file using gedit
Put some content in the file
Make sure some words repeat so that the results are easy to see
Get the complete input file path using the pwd command in the terminal
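The same input file can also be produced from Java instead of gedit. The sketch below is only an illustration: the class name InputFileSketch, the file name input.txt, the sample sentences, and the temporary directory are all assumptions, not part of the original walkthrough.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

class InputFileSketch {
    // Creates the directory if needed and writes a small text file
    // in which some words repeat, then returns the absolute file path.
    static Path createInput(Path dir, String fileName) throws IOException {
        Files.createDirectories(dir);
        Path file = dir.resolve(fileName);
        List<String> lines = List.of(
                "big data is big",
                "hadoop processes big data");
        Files.write(file, lines);
        return file.toAbsolutePath();
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical location; the slides use a new directory on the Desktop.
        Path dir = Files.createTempDirectory("wordcount_input");
        // Printing the absolute path plays the role of pwd in the terminal.
        System.out.println(createInput(dir, "input.txt"));
    }
}
```

The printed path is the value to use as the input argument in the run configuration.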
Now it is time to set the arguments and run the project
Click Run -> Run Configurations
Setting up Run Configurations
Give your configuration a name
Also set your project name and main class
Setting up the runtime arguments and running the job
Make sure the output folder does not already exist
Give 2 parameters separated by a space: the first is the input folder path and the second is the output folder path
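The two rules above (exactly 2 arguments, and an output folder that does not exist yet) can be checked before the job is launched. This is a minimal sketch, assuming both paths are on the local file system as they are when running from Eclipse; the class name RunArgsSketch and the messages are hypothetical, and this is not the driver code from the 3 source files. Hadoop's file output formats normally reject an existing output directory themselves, so this only surfaces the problem earlier.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class RunArgsSketch {
    // Returns null when the arguments look usable,
    // otherwise a short message describing the problem.
    static String validate(String[] args) {
        if (args.length != 2) {
            return "Expected 2 arguments: <input path> <output path>";
        }
        Path input = Paths.get(args[0]);
        Path output = Paths.get(args[1]);
        if (!Files.exists(input)) {
            return "Input path does not exist: " + input;
        }
        if (Files.exists(output)) {
            return "Output folder already exists: " + output;
        }
        return null;
    }

    public static void main(String[] args) {
        String problem = validate(args);
        System.out.println(problem == null ? "Arguments look fine" : problem);
    }
}
```

If the output folder is left over from an earlier run, delete it or pick a new output path before running again.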
Check that there are no error messages in the logs
These are the log messages
If the MR job is successful, a new output folder is created
You can see that the new output folder has been created
See the files in the output folder
Since we used one reducer, there should be one part file
The _SUCCESS file indicates that the job was successful
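The same check can be done in code. The sketch below looks for the _SUCCESS marker and lists the part files in an output folder. The class name OutputFolderSketch is hypothetical, and main builds a mock folder only for demonstration; the part file name part-r-00000 is an assumption about the naming your job uses.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class OutputFolderSketch {
    // True when the job left its success marker in the output folder.
    static boolean hasSuccessMarker(Path outputDir) {
        return Files.exists(outputDir.resolve("_SUCCESS"));
    }

    // Names of all files starting with "part-", sorted alphabetically.
    static List<String> partFiles(Path outputDir) throws IOException {
        List<String> names = new ArrayList<>();
        try (DirectoryStream<Path> stream =
                     Files.newDirectoryStream(outputDir, "part-*")) {
            for (Path p : stream) {
                names.add(p.getFileName().toString());
            }
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) throws IOException {
        // Mock output folder for demonstration (assumed file names).
        Path out = Files.createTempDirectory("wordcount_output");
        Files.createFile(out.resolve("_SUCCESS"));
        Files.createFile(out.resolve("part-r-00000"));
        System.out.println("success marker: " + hasSuccessMarker(out));
        System.out.println("part files: " + partFiles(out));
    }
}
```

With one reducer, the list of part files is expected to contain a single entry.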
See the content of the output file
You can see that an empty string is also counted as a word. When 2 spaces appear consecutively, the gap between them is treated as an empty word.
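This behaviour is what Java's String.split produces when a line is split on a single space. The sketch below assumes the mapper tokenizes lines that way, which the slides do not confirm; the class and method names are hypothetical. It also shows one possible alternative, splitting on runs of whitespace, which does not yield the empty token.

```java
import java.util.Arrays;
import java.util.List;

class SplitSketch {
    // Splitting on exactly one space keeps an empty token
    // between two consecutive spaces.
    static List<String> splitOnSingleSpace(String line) {
        return Arrays.asList(line.split(" "));
    }

    // Splitting on runs of whitespace avoids the empty token.
    static List<String> splitOnWhitespaceRuns(String line) {
        return Arrays.asList(line.trim().split("\\s+"));
    }

    public static void main(String[] args) {
        String line = "hello  world"; // two spaces between the words
        System.out.println(splitOnSingleSpace(line).size());    // prints 3
        System.out.println(splitOnWhitespaceRuns(line).size()); // prints 2
    }
}
```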
The output is in ascending order of the words (keys)
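The ordering can be imitated in plain Java with a TreeMap, which keeps its keys sorted. This is only an in-memory analogy, not how Hadoop performs its sort; the class name SortedCountSketch is hypothetical and the single-space split is the same assumption as before.

```java
import java.util.Map;
import java.util.TreeMap;

class SortedCountSketch {
    // Counts words and keeps them in ascending key order,
    // similar to how the reducer output is ordered by key.
    static Map<String, Integer> count(String text) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String word : text.split(" ")) {
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Keys come out sorted regardless of input order.
        System.out.println(count("data big data")); // prints {big=1, data=2}
    }
}
```

Note that this ordering is case-sensitive: for plain ASCII text, uppercase letters sort before lowercase ones.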
You can also check the output from the terminal using Linux commands
Use the cat command to see the output content
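The lines that cat prints can also be read back in Java. The sketch below assumes each output line has the form word, a tab, then the count, which is the default layout of Hadoop's text output but is not confirmed by the slides; the class name ReadOutputSketch and the sample lines are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ReadOutputSketch {
    // Parses lines of the assumed form "word<TAB>count",
    // keeping them in the order they appear in the file.
    static Map<String, Integer> parse(List<String> lines) {
        Map<String, Integer> result = new LinkedHashMap<>();
        for (String line : lines) {
            int tab = line.lastIndexOf('\t');
            if (tab < 0) {
                continue; // skip lines without a tab separator
            }
            String word = line.substring(0, tab);
            int count = Integer.parseInt(line.substring(tab + 1).trim());
            result.put(word, count);
        }
        return result;
    }

    public static void main(String[] args) {
        // Sample lines (made up); the first one has an empty word as its key.
        List<String> lines = List.of("\t1", "big\t3", "data\t2");
        parse(lines).forEach((w, c) -> System.out.println("[" + w + "] " + c));
    }
}
```

To read a real part file, the lines can be loaded with Files.readAllLines and passed to parse.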
We have successfully executed a MapReduce program
Happy Learning!!!
Trainer Mr. Sumit Mittal