Big Data Analytics


Handbook of B.Tech. Programmes offered by USICT at Affiliated Institutions of the University.

Big Data Analytics

L    P    C
3    -    3

Discipline(s) / EAE / OAE    Semester    Group     Sub-group    Paper Code
CSE-DS                       7           PC        PC           DS-429T
EAE                          7           DS-EAE    DS-EAE-4     DS-429T

Marking Scheme:
1. Teacher's Continuous Evaluation: 25 marks
2. Term-end Theory Examination: 75 marks
Instructions for paper setter:
1. There should be 9 questions in the term-end examination question paper.
2. The first question should be compulsory and cover the entire syllabus. It should consist of
objective, single-line-answer, or short-answer-type questions totalling 15 marks.
3. Apart from the compulsory question 1, the rest of the paper shall consist of 4 units as per the syllabus.
Every unit shall have two questions covering the corresponding unit of the syllabus. However, the student
shall be asked to attempt only one of the two questions in each unit. Individual questions may contain up to
5 sub-parts / sub-questions. Each unit shall have a marks weightage of 15.
4. The questions are to be framed keeping in view the learning outcomes of the course / paper. The standard
/ level of the questions should be that of the prescribed textbook.
5. The requirement of (scientific) calculators / log tables / data tables may be specified if required.
Course Objectives:
1. Understand the Big Data platform and its use cases
2. Introduce HDFS concepts and interfacing with HDFS
3. Provide hands-on experience with the Hadoop ecosystem
4. Provide exposure to data analytics with R
Course Outcomes (CO)
CO 1 Identify Big Data and its Business Implications
CO 2 List the components of Hadoop and Hadoop Eco-System
CO 3 Develop Big Data Solutions using Hadoop Eco System
CO 4 Manage Job Execution in Hadoop Environment
Course Outcomes (CO) to Programme Outcomes (PO) mapping (scale: 1 = Low, 2 = Medium, 3 = High)
        PO01  PO02  PO03  PO04  PO05  PO06  PO07  PO08  PO09  PO10  PO11  PO12
CO 1    2     -     -     2     3     -     -     -     1     -     -     -
CO 2    -     2     -     -     -     2     3     -     1     2     -     -
CO 3    -     1     -     -     -     2     -     -     -     2     -     -
CO 4    -     3     -     -     -     2     -     -     -     2     -     -

UNIT-I

Introduction to Big Data: Introduction to Big Data, Big Data characteristics, Challenges of Conventional
Systems, Types of Big Data, Intelligent data analysis, Traditional vs. Big Data business approach, Case Study of
Big Data Solutions.

UNIT-II

Hadoop: History of Hadoop, Hadoop Distributed File System: Physical organization of Compute Nodes,
Components of Hadoop, Analyzing the Data with Hadoop, Scaling Out, Hadoop Streaming, Design of HDFS, Java
interfaces to HDFS Basics, Developing a MapReduce Application, How MapReduce Works, Anatomy of a MapReduce
Job Run, Failures, Job Scheduling, Shuffle and Sort, Task Execution, MapReduce Types and Formats,
MapReduce Features, Hadoop environment. Setting up a Hadoop Cluster, Cluster Specification, Cluster Setup
and Installation, Hadoop Configuration, Security in Hadoop, Administering Hadoop, Monitoring, Maintenance,
Hadoop Benchmarks, Hadoop in the Cloud.

Applicable from Batch Admitted in Academic Session 2021-22 Onwards Page 566

UNIT-III

NoSQL: What is NoSQL? NoSQL business drivers; NoSQL case studies; NoSQL data architecture patterns: Key-
value stores, Graph stores, Column family (Bigtable) stores, Document stores, Variations of NoSQL
architectural patterns; Using NoSQL to manage big data: What is a big data NoSQL solution? Understanding
the types of big data problems; Analyzing big data with a shared-nothing architecture; Choosing distribution
models: master-slave versus peer-to-peer; Four ways that NoSQL systems handle big data problems.

UNIT-IV

Frameworks: Applications on Big Data Using Pig and Hive, Data processing operators in Pig, Hive services,
HiveQL, Querying Data in Hive, fundamentals of HBase and ZooKeeper, IBM InfoSphere BigInsights and
Streams. Machine Learning: Introduction, Supervised Learning, Unsupervised Learning, Collaborative Filtering.
Big Data Analytics with BigR.

Textbook(s):
1. Jiawei Han, Micheline Kamber, Jian Pei, "Data Mining: Concepts and Techniques", 3rd Edition, Morgan Kaufmann.
2. Tom White, "Hadoop: The Definitive Guide", 3rd Edition, O'Reilly Media, 2012.

References:
1. Seema Acharya, Subhashini Chellappan, "Big Data Analytics", Wiley, 2015.
2. Michael Berthold, David J. Hand, "Intelligent Data Analysis", Springer, 2007.


Big Data Analytics Lab

L    P    C
-    2    1

Discipline(s) / EAE / OAE    Semester    Group     Sub-group    Paper Code
CSE-DS                       7           PC        PC           DS-429P
EAE                          7           DS-EAE    DS-EAE-4     DS-429P

Marking Scheme:
1. Teacher's Continuous Evaluation: 40 marks
2. Term-end Theory Examination: 60 marks
Instructions:
1. The course objectives and course outcomes are identical to those of Big Data Analytics, as this is the practical
component of the corresponding theory paper.
2. The practical list shall be notified by the teacher in the first week of the class commencement, under
intimation to the office of the Head of Department / Institution in which the paper is being offered, from the
list of practicals below. At least 10 experiments must be performed by the students; they may be asked to
do more. At least 5 experiments must be from the given list.

1. Downloading and installing Hadoop; understanding the different Hadoop modes, startup scripts, and configuration
files.
2. Implement the following file management tasks in Hadoop:
i. Adding files and directories
ii. Retrieving files
iii. Deleting files
Hint: A typical Hadoop workflow creates data files (such as log files) elsewhere and copies them into HDFS
using one of the HDFS command-line utilities.
3. Implementation of matrix multiplication with Hadoop MapReduce.
4. Write a MapReduce program that mines weather data. Hint: Weather sensors collecting data every hour at
many locations across the globe gather a large volume of log data, which is a good candidate for analysis
with MapReduce, since it is semi-structured and record-oriented.
5. Run a basic Word Count MapReduce program to understand the MapReduce paradigm.
6. Implementation of K-means clustering using MapReduce.
7. Installation of Hive along with practice examples.
8. Installation of HBase and Thrift along with practice examples.
9. Run Pig Latin scripts to find the word count.
10. Run Pig Latin scripts to find the maximum temperature for each year.
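
Illustrative note for experiments 4, 5, 9 and 10: the map, shuffle and reduce phases behind the word count and yearly maximum temperature tasks can be sketched in plain Python before moving to a cluster. This is a minimal in-memory sketch only, not the Hadoop or Pig API. The function names (run_map_reduce, word_count_mapper, max_temp_mapper and the reducers), the "year,temperature" record layout, and the sample inputs are all assumptions made up for illustration; real weather records and a real Hadoop job will differ.

```python
from collections import defaultdict


def run_map_reduce(records, mapper, reducer):
    """Imitate the map, shuffle and reduce phases in memory (no Hadoop involved)."""
    grouped = defaultdict(list)
    for record in records:
        # Map phase: each input record yields zero or more (key, value) pairs.
        for key, value in mapper(record):
            # Shuffle phase: collect all values that share the same key.
            grouped[key].append(value)
    # Reduce phase: collapse each key's values into one result, in key order.
    return {key: reducer(key, grouped[key]) for key in sorted(grouped)}


def word_count_mapper(line):
    # Emit (word, 1) for every whitespace-separated word, lower-cased.
    for word in line.lower().split():
        yield word, 1


def sum_reducer(_key, values):
    # Total the counts emitted for one word.
    return sum(values)


def max_temp_mapper(line):
    # Hypothetical record layout "year,temperature"; malformed lines are skipped.
    parts = line.split(",")
    if len(parts) != 2:
        return
    year, temp = parts[0].strip(), parts[1].strip()
    try:
        yield year, int(temp)
    except ValueError:
        return


def max_reducer(_key, values):
    # Keep the highest temperature seen for one year.
    return max(values)


if __name__ == "__main__":
    # Made-up sample inputs, for illustration only.
    lines = ["big data big ideas", "data moves"]
    print(run_map_reduce(lines, word_count_mapper, sum_reducer))

    readings = ["1949,111", "1949,78", "1950,22", "bad line", "1950,-11"]
    print(run_map_reduce(readings, max_temp_mapper, max_reducer))
```

The same mapper and reducer logic is what a Hadoop MapReduce job or a Pig Latin script expresses at scale; in this sketch the grouping step stands in for the framework's shuffle and sort.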
