Matrix Multiplication Using Hadoop Map-Reduce

This document describes how to perform matrix multiplication using MapReduce on Hadoop. It involves the following steps: 1. Installing Hadoop in standalone mode and configuring Java settings. 2. Developing MapReduce programs including mappers, reducers, and a driver class for matrix multiplication. The mapper outputs key-value pairs of matrix elements and the reducer calculates and sums the products. 3. Compiling the programs, creating a JAR file, and executing it on Hadoop to multiply sample matrices stored in HDFS and output the results.

Uploaded by

Niri

Matrix Multiplication using Hadoop Map-Reduce

Step 1: Install Hadoop in Stand-Alone Mode

Step 2: Matrix Multiplication Using MapReduce Programming

1.1 Installing Java

Check the existing Java version by running the command:

java -version

1.2 Create Hadoop home directory

We will use hadoop-3.1.2.tar.gz here.

Extract the Hadoop archive using the following command:

tar -xzvf hadoop-3.1.2.tar.gz

Move the extracted directory to /usr/local:

sudo mv hadoop-3.1.2 /usr/local/hadoop

1.3 Configuring Hadoop's JAVA_HOME

Hadoop requires that you set the path to Java, either as an environment variable or
in the Hadoop configuration file.

The path to Java, /usr/bin/java, is a symlink to /etc/alternatives/java, which is in
turn a symlink to the default Java binary. We will use readlink with the -f flag to
follow every symlink in every part of the path, recursively. Then we'll use sed to
trim bin/java from the output to give us the correct value for JAVA_HOME.

To find the default Java path:

readlink -f /usr/bin/java | sed "s:bin/java::"

Output :

/usr/lib/jvm/java-11-openjdk-amd64/
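For readers who prefer to see the same two steps in Java, here is a minimal sketch. The class name JavaHomeSketch and the method stripBinJava are hypothetical names chosen for illustration; they are not part of Hadoop or of this manual's source files. It assumes that resolving symlinks with Path.toRealPath() is close enough to readlink -f for this purpose.

```java
import java.io.IOException;
import java.nio.file.Paths;

public class JavaHomeSketch {
    // Mirrors sed "s:bin/java::" by removing the first occurrence of "bin/java".
    static String stripBinJava(String resolvedPath) {
        return resolvedPath.replaceFirst("bin/java", "");
    }

    public static void main(String[] args) throws IOException {
        if (args.length > 0) {
            // toRealPath() resolves every symlink in the path, much like readlink -f.
            String resolved = Paths.get(args[0]).toRealPath().toString();
            System.out.println(stripBinJava(resolved));
        } else {
            // Sample value: the resolved binary that matches the output shown above.
            System.out.println(stripBinJava("/usr/lib/jvm/java-11-openjdk-amd64/bin/java"));
        }
    }
}
```

Running it with /usr/bin/java as the only argument should print the value that the shell pipeline prints on the same machine.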

Use readlink to Set the Value Dynamically

sudo nano /usr/local/hadoop/etc/hadoop/hadoop-env.sh

Add this line:

export JAVA_HOME=$(readlink -f /usr/bin/java | sed "s:bin/java::")

1.4 Running Hadoop

Now we should be able to run Hadoop:

/usr/local/hadoop/bin/hadoop

The command prints its usage help, which means we've successfully configured Hadoop to run
in stand-alone mode. We'll ensure that it is functioning properly by running the example
MapReduce program it ships with. To do so, create a directory called input in our home
directory and copy Hadoop's configuration files into it to use those files as our
data.

mkdir ~/input

cp /usr/local/hadoop/etc/hadoop/*.xml ~/input

Next, we can use the following command to run hadoop-mapreduce-examples, a Java archive
that bundles several example programs. We'll invoke its grep program, followed by the input
directory, input, and the output directory, grep_example. The MapReduce grep program counts
the matches of a literal word or regular expression. Finally, we'll supply a regular
expression to find occurrences of the word principal within or at the end of a declarative
sentence. The expression is case-sensitive, so we wouldn't find the word if it were
capitalized at the beginning of a sentence:

/usr/local/hadoop/bin/hadoop jar \
/usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.2.jar grep \
~/input ~/grep_example 'principal[.]*'
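To see what the regular expression matches without running a Hadoop job, the following stand-alone sketch applies the same pattern with java.util.regex. The class GrepSketch, the method countMatches, and the sample sentence are assumptions made for illustration; this is not the code of the Hadoop grep example, only a small model of counting each distinct matched string.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GrepSketch {
    // Counts how often each distinct matched string occurs in the text.
    static Map<String, Integer> countMatches(String regex, String text) {
        Map<String, Integer> counts = new TreeMap<>();
        Matcher matcher = Pattern.compile(regex).matcher(text);
        while (matcher.find()) {
            counts.merge(matcher.group(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical sample text; the capitalized word is not matched.
        String text = "Principal names matter. Set the principal here. This is the principal.";
        System.out.println(countMatches("principal[.]*", text));
    }
}
```

With the sample text, the map holds one count for principal and one for principal. (with the trailing period), while the capitalized Principal is skipped because the pattern is case-sensitive.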

When the task completes, it prints a summary of what has been processed and any errors it
encountered, but this summary doesn't contain the actual results.

Results are stored in the output directory and can be checked by running cat on the output
directory:

cat ~/grep_example/*

Step 2: Matrix Multiplication Using MapReduce Programming

2.1. In mathematics, matrix multiplication or the matrix product is a binary operation that
produces a matrix from two matrices. The definition is motivated by linear equations and linear
transformations on vectors, which have numerous applications in applied mathematics, physics, and
engineering. In more detail, if A is an n × m matrix and B is an m × p matrix, their matrix product
AB is an n × p matrix, in which the m entries across a row of A are multiplied with the m entries
down a column of B and summed to produce an entry of AB. When two linear transformations are
represented by matrices, then the matrix product represents the composition of the two
transformations.
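As a reference point for the MapReduce version, the definition above can be written as a plain sequential Java method. This is a minimal sketch: the class MatrixProduct, the use of int entries, and the 2 × 2 sample matrices are assumptions for illustration and do not come from the manual's files.

```java
import java.util.Arrays;

public class MatrixProduct {
    // Multiplies an n x m matrix a by an m x p matrix b and returns the n x p product.
    static int[][] multiply(int[][] a, int[][] b) {
        int n = a.length;
        int m = b.length;
        int p = b[0].length;
        if (a[0].length != m) {
            throw new IllegalArgumentException("columns of A must equal rows of B");
        }
        int[][] c = new int[n][p];
        for (int i = 0; i < n; i++) {
            for (int k = 0; k < p; k++) {
                // Entry (i,k) sums the products of row i of A with column k of B.
                for (int j = 0; j < m; j++) {
                    c[i][k] += a[i][j] * b[j][k];
                }
            }
        }
        return c;
    }

    public static void main(String[] args) {
        int[][] a = {{1, 2}, {3, 4}};
        int[][] b = {{5, 6}, {7, 8}};
        System.out.println(Arrays.deepToString(multiply(a, b))); // prints [[19, 22], [43, 50]]
    }
}
```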
Algorithm for Map Function.
a. for each element mij of M do
   produce (key,value) pairs as ((i,k), (M,j,mij)), for k = 1,2,3,... up to the number of
   columns of N
b. for each element njk of N do
   produce (key,value) pairs as ((i,k), (N,j,njk)), for i = 1,2,3,... up to the number of
   rows of M
c. return the set of (key,value) pairs, in which each key (i,k) has a list with values
   (M,j,mij) and (N,j,njk) for all possible values of j
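The map steps can be tried out in plain Java, without Hadoop on the classpath. The sketch below is not the manual's Map.java, which is not reproduced here. The class MapSketch, the string encoding of keys ("i,k") and values ("M,j,value"), and the 0-based indices are all assumptions for illustration. The grouping map stands in for the shuffle that Hadoop performs between the map and reduce phases.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class MapSketch {
    // Emits the (key,value) pairs of steps a and b and groups them by key (step c).
    static Map<String, List<String>> mapAndGroup(int[][] m, int[][] n) {
        int rowsOfM = m.length;
        int colsOfN = n[0].length;
        Map<String, List<String>> grouped = new TreeMap<>();
        // Step a: every element m[i][j] is sent to all keys (i,k).
        for (int i = 0; i < m.length; i++) {
            for (int j = 0; j < m[i].length; j++) {
                for (int k = 0; k < colsOfN; k++) {
                    grouped.computeIfAbsent(i + "," + k, key -> new ArrayList<>())
                           .add("M," + j + "," + m[i][j]);
                }
            }
        }
        // Step b: every element n[j][k] is sent to all keys (i,k).
        for (int j = 0; j < n.length; j++) {
            for (int k = 0; k < n[j].length; k++) {
                for (int i = 0; i < rowsOfM; i++) {
                    grouped.computeIfAbsent(i + "," + k, key -> new ArrayList<>())
                           .add("N," + j + "," + n[j][k]);
                }
            }
        }
        return grouped;
    }

    public static void main(String[] args) {
        int[][] m = {{1, 2}, {3, 4}};
        int[][] n = {{5, 6}, {7, 8}};
        mapAndGroup(m, n).forEach((key, values) -> System.out.println(key + " -> " + values));
    }
}
```

For the sample matrices, key 0,0 receives M,0,1 and M,1,2 from the first row of M, plus N,0,5 and N,1,7 from the first column of N, which is exactly what the reducer needs to compute that entry.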
Algorithm for Reduce Function.
for each key (i,k) do
   sort the values that begin with M by j into listM
   sort the values that begin with N by j into listN
   multiply mij and njk for the jth value of each list
   sum up the products mij x njk
   return ((i,k), Σj mij x njk)
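The reduce step for a single key can be sketched in plain Java as well. This is not the manual's Reduce.java. The class ReduceSketch and the "M,j,value" / "N,j,value" string encoding are assumptions for illustration. Instead of sorting two lists by j, the sketch indexes the values by j in two hash maps, which pairs up the same entries and also tolerates a missing j.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ReduceSketch {
    // Computes the sum over j of m_ij * n_jk from the values grouped under one key (i,k).
    static int reduce(List<String> values) {
        Map<Integer, Integer> fromM = new HashMap<>();
        Map<Integer, Integer> fromN = new HashMap<>();
        for (String value : values) {
            String[] parts = value.split(",");
            int j = Integer.parseInt(parts[1]);
            int entry = Integer.parseInt(parts[2]);
            if (parts[0].equals("M")) {
                fromM.put(j, entry);
            } else {
                fromN.put(j, entry);
            }
        }
        int sum = 0;
        for (Map.Entry<Integer, Integer> e : fromM.entrySet()) {
            Integer nEntry = fromN.get(e.getKey());
            // A missing partner is treated as zero and contributes nothing.
            if (nEntry != null) {
                sum += e.getValue() * nEntry;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        // Values for key (0,0) when M = {{1,2},{3,4}} and N = {{5,6},{7,8}}.
        List<String> values = Arrays.asList("M,0,1", "M,1,2", "N,0,5", "N,1,7");
        System.out.println(reduce(values)); // prints 19
    }
}
```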

2.2 Download the Hadoop jar files with these links.

Download the Hadoop Common jar file:

wget https://goo.gl/G4MyHp -O hadoop-common-3.1.2.jar

Download the Hadoop MapReduce jar file:

wget https://goo.gl/KT8yfB -O hadoop-mapreduce-client-core-3.1.2.jar

2.3 Creating Mapper file for Matrix Multiplication.
Refer Map.java

2.4 Creating Reducer file for Matrix Multiplication.
Refer Reduce.java

2.5 Creating MatrixMultiply.java file
Refer MatrixMultiply.java

2.6 Compiling the programs into a folder named operation/

javac -cp hadoop-common-3.1.2.jar:hadoop-mapreduce-client-core-3.1.2.jar:operation/:. -d operation/ Map.java

javac -cp hadoop-common-3.1.2.jar:hadoop-mapreduce-client-core-3.1.2.jar:operation/:. -d operation/ Reduce.java

javac -cp hadoop-common-3.1.2.jar:hadoop-mapreduce-client-core-3.1.2.jar:operation/:. -d operation/ MatrixMultiply.java

2.7 Let's list the contents of the directory after compilation.

ls -R operation/

2.8 Creating Jar file for the Matrix Multiplication.

jar -cvf MatrixMultiply.jar -C operation/ .

2.9 Uploading the files M and N, which contain the matrix data, to HDFS.
Refer File ‘M’
Refer File ‘N’

hadoop fs -mkdir Matrix/

hadoop fs -copyFromLocal M Matrix/
hadoop fs -copyFromLocal N Matrix/

2.10 Executing the jar file using the hadoop command, which reads the input records from
HDFS and stores the output in HDFS.

hadoop jar MatrixMultiply.jar MatrixMultiply Matrix result

NOTE: The output of the mapper and reducer will be generated here.

2.11 Getting the output from part-r-00000, which was generated by the execution of the
hadoop command.

hadoop fs -cat result/part-r-00000
