Data Science and Big Data Masters Course

Intellipaat offers a Master's Course in Big Data and Data Science in collaboration with IBM and Microsoft, designed to equip learners with industry-relevant skills through hands-on projects and mentorship. The program includes 322 hours of instructor-led training, certification, and job assistance, targeting the growing demand for data professionals. With a projected 2.4 million job postings by 2022, this course aims to prepare students for lucrative careers in a rapidly expanding field.

Uploaded by

Shyam Prakash

Master’s Course in Big Data and Data Science

Master Big Data and Data Science skills and take your career to the next level!

2 Million Learners
1:1 Personalized Mentorship
55% Average Salary Hike

IND: +91 7022374614 US: 1-800-216-8930 sales@intellipaat.com www.intellipaat.com


Master’s Course in Big Data and Data Science

This master’s course by Intellipaat, in collaboration with IBM and Microsoft, is designed by industry experts and will help you gain proficiency in Big Data and Data Science. Gain real-world knowledge of market-relevant concepts and technologies by working on industry-based projects.

Hottest Job of 21st Century

2.4 Million Job Postings
There is a global estimate of 2.4 million job postings for Big Data and Data Science roles by 2022.

Skill Development
Big Data and Data Science professionals are equipped with various relevant skills, fetching lucrative job offers.

Growing Big Data and Data Science Industry
48.2% CAGR in the global Big Data and Data Science industry.

Future-oriented Career
Big Data and Data Science is a budding field; a head start will prove to be beneficial.

Popular Degree
48% of Big Data and Data Science professionals have a Master’s degree.

High Demand
By 2022, India and the US will face a demand-supply gap of 610,000 Big Data and Data Science professionals.

Our Credentials

2 Million+ Aspiring Active Students
1,000+ Industry-expert Instructors
400+ Hiring Partners
500+ Corporates Upskilled
55% Average Salary Hike
155+ Countries’ Learners



About Program
Intellipaat’s Big Data and Data Science Master’s Course will provide you with in-depth
knowledge of designing, developing, and deploying Data Science and Big Data
applications in the real world. You will also get to work with real-time analytics, statistical
computing, parsing machine-generated data, and creating NoSQL applications.

Learning Format: Online
Duration: 322 Hours
Career Services: by Intellipaat
Certifications: IBM and Microsoft

Key Highlights
322 Hrs Instructor-led Training
381 Hrs Self-paced Videos
528 Hrs Projects and Exercises
19+ Courses
Certification from IBM and Microsoft
Flexible Schedule
Job Assistance
Lifetime Free Upgradation
Assured Interviews
1:1 with Industry Mentors
24*7 Support
No-cost EMI Option

Program Pedagogy

Instructor-led Training
Get trained by top industry experts.

Hackathons
Get a sense of how real projects are built.

24*7 Support
Speak to subject matter experts anytime and clarify your queries instantly.

Peer Networking and Group Learning
Improve your professional network and learn from peers through our innovative Peer Chat tool.

Self-paced Videos
Learn at your own pace with world-class content.

Gamified Learning
Get involved in group activities to solve real-world problems.

Projects and Exercises
Get real-world experience through projects.

1:1 Personalized Learning
Hands-on exercises, project work, quizzes, and capstone projects.



Learning Path

Live Courses

1. Big Data Hadoop and Spark
2. Apache Spark and Scala
3. Python for Data Science
4. Tableau Desktop 10
5. Splunk Developer & Admin
6. Artificial Intelligence & Deep Learning Course with TensorFlow
7. MongoDB
8. Microsoft Azure Training
9. AWS

Self-paced Courses

Data Science with R
Apache HBase
Apache Cassandra
Couchbase
Machine Learning
Solr
Linux
Java
Apache Kafka
SQL



Program Curriculum

Module 1: Big Data Hadoop and Spark

Hadoop Installation and Setup
Introduction to Big Data Hadoop and Understanding HDFS and MapReduce
Deep Dive in MapReduce
Introduction to Hive
Advanced Hive and Impala
Introduction to Pig
Flume, Sqoop and HBase
Writing Spark Applications Using Scala
Use Case Bobsrockets Package
Introduction to Spark
Spark Basics
Working with RDDs in Spark
Aggregating Data with Pair RDDs
Writing and Deploying Spark Applications
Project Solution Discussion and Cloudera Certification Tips and Tricks
Parallel Processing
Spark RDD Persistence
Spark MLlib
Integrating Apache Flume and Apache Kafka
Spark Streaming
Improving Spark Performance
Spark SQL and Data Frames
Scheduling/Partitioning

Following topics will be available only in self-paced mode:
Hadoop Administration – Multi-node Cluster Setup using Amazon EC2
Hadoop Administration – Cluster Configuration
Hadoop Administration – Maintenance, Monitoring, and Troubleshooting
ETL Connectivity with Hadoop Ecosystem (Self-paced)
Hadoop Application Testing
Roles and Responsibilities of Hadoop Testing Professional
Framework called MRUnit for Testing of MapReduce Programs
Unit Testing
Test Execution
Test Plan Strategy and Writing Test Cases for Testing Hadoop Application

Module 2: Apache Spark and Scala

Scala Course Content
Introduction to Scala
Pattern Matching
Executing the Scala Code
Classes Concept in Scala
Case Classes and Pattern Matching
Concepts of Traits with Examples
Scala–Java Interoperability
Scala Collections
Mutable Collections vs Immutable Collections
Use Case Bobsrockets Package

Spark Course Content
Introduction to Spark
Spark Basics
Working with RDDs in Spark
Aggregating Data with Pair RDDs
Writing and Deploying Spark Applications
Parallel Processing
Spark RDD Persistence
Spark MLlib
Integrating Apache Flume and Apache Kafka
Spark Streaming
Improving Spark Performance
Spark SQL and Data Frames
Scheduling/Partitioning

Module 3: Python for Data Science

Introduction to Data Science using Python
Python basic constructs
Maths for DS-Statistics & Probability
OOPs in Python (Self-paced)
NumPy for mathematical computing
SciPy for scientific computing
Data manipulation
Data visualization with Matplotlib
Machine Learning using Python
Supervised learning
Unsupervised Learning
Python integration with Spark (Self-paced)
Dimensionality Reduction
Time Series Forecasting


Module 4: Tableau Desktop 10

Introduction to Data Visualization and The Power of Tableau
Architecture of Tableau
Charts and Graphs
Working with Metadata and Data Blending
Advanced Data Manipulations
Working with Filters
Organizing Data and Visual Analytics
Working with Mapping
Working with Calculations and Expressions
Working with Parameters
Dashboards and Stories
Tableau Prep
Integration of Tableau with R

Module 5: Splunk Developer and Admin

Splunk Development Concepts
Basic Searching
Using Fields in Searches
Saving and Scheduling Searches
Creating Alerts
Scheduled Reports
Tags and Event Types
Creating and Using Macros
Workflow
Splunk Search Commands
Transforming Commands
Reporting Commands
Mapping and Single Value Commands
Splunk Reports and Visualizations
Analyzing, Calculating and Formatting Results
Correlating Events
Enriching Data with Lookups
Creating Reports and Dashboards
Getting Started with Parsing
Using Pivot
Common Information Model (CIM) Add-On

Splunk Administration Topics
Overview of Splunk
Splunk Installation
Splunk Installation in Linux
Distributed Management Console
Introduction to Splunk App
Splunk Indexes and Users
Splunk Configuration Files
Splunk Deployment Management
Splunk Indexes
User Roles and Authentication
Splunk Administration Environment
Basic Production Environment
Splunk Search Engine
Various Splunk Input Methods
Splunk User and Index Management
Machine Data Parsing
Search Scaling and Monitoring
Splunk Cluster Implementation

Module 6: Artificial Intelligence & Deep Learning Course with TensorFlow

Introduction to Deep Learning and Neural Networks
Multi-layered Neural Networks
Artificial Neural Networks and Various Methods
Deep Learning Libraries
Keras API
TFLearn API for TensorFlow
Deep Neural Networks (DNNs)
Convolutional Neural Networks (CNNs)
Recurrent Neural Networks (RNNs)
GPU in Deep Learning
Autoencoders and Restricted Boltzmann Machine (RBM)
Deep Learning Applications
Chatbots

Module 7: MongoDB

Introduction to NoSQL and MongoDB
MongoDB Installation
Importance of NoSQL
CRUD Operations
Data Modeling and Schema Design
Data Management and Administration
Data Indexing and Aggregation
MongoDB Security
Working with Unstructured Data


Module 8: Microsoft Azure Training

Introduction to Microsoft Azure
Introduction to ARM & Azure Storage
Introduction to Azure storage
Azure Virtual Machines
Azure App and Container services
Azure Networking – I
Azure Networking – II
Authentication and Authorization in Azure using RBAC
Microsoft Azure Active Directory
Azure Monitoring

Module 9: AWS

Introduction to Cloud Computing & AWS
Elastic Compute and Storage Volumes
Load Balancing, Autoscaling, and DNS
Virtual Private Cloud
Storage – Simple Storage Service (S3)
Databases and In-memory Datastores
Management and Application Services
Access Management and Monitoring Services
Automation and Configuration management
AWS Migration

Self-paced
Architecting AWS – Whitepaper
DevOps on AWS
Amazon FSx and Global Accelerator
AWS Architect Interview Questions

Module 10: Data Science with R

Introduction to Data Science with R
Data Exploration
Data Manipulation
Data Visualization
Introduction to Statistics
Machine Learning
Logistic Regression
Decision Trees and Random Forest
Unsupervised Learning
Association Rule Mining and Recommendation Engines

Self-paced Course Content
Introduction to Artificial Intelligence
Time Series Analysis
Support Vector Machine (SVM)
Naïve Bayes
Text Mining

Module 11: Apache HBase

HBase Overview
Architecture of NoSQL
HBase Data Modeling
HBase Cluster Components
HBase API and Advanced Operations
Integration of Hive with HBase
File Loading with Both Load Utilities

Module 12: Apache Cassandra

Advantages and Usage of Cassandra
CAP Theorem and NoSQL Database
Cassandra Fundamentals, Data model, Installation, and Setup
Cassandra Configuration
Summarization, Node Tool Commands, Cluster, Indexes, Cassandra & MapReduce, Installing Ops-center
Multi-cluster Setup
Thrift/Avro/Json/Hector Client
Datastax Installation Part, Secondary index
Advance Modeling
Deploying IDE for Cassandra applications
Cassandra Administration
Cassandra API and Summarization and Thrift

Module 13: Couchbase

Introduction to Couchbase
Single-node Implementation
Couchbase Web Console
Couchbase Multi-node Cluster
Couchbase Command-line Interface

Module 14: Machine Learning

Introduction to Machine Learning
Supervised Learning and Linear Regression
Classification and Logistic Regression
Decision Tree and Random Forest
Naïve Bayes and Support Vector Machine (self-paced)
Unsupervised Learning
Natural Language Processing and Text Mining (self-paced)
Introduction to Deep Learning
Time Series Analysis (self-paced)

Module 15: Solr

Fundamentals of Search Engine and Apache Lucene
Analyzers in Lucene
Exploring Apache Lucene
Apache Lucene Demonstration
Apache Lucene advanced
Advanced Topics of Apache Lucene (Practical)
Apache Solr
Apache Solr Indexing
Solr Indexing continued
Apache Solr Searching
Deep Dive into Apache Solr
Apache Solr continued
Extended Features
Multicore
Administration & SolrCloud

Module 16: Linux

Introduction to Linux
File Management
Files and Processes
Introduction to Shell Scripting
Conditional, Looping Statements, and Functions
Text Processing
Scheduling Tasks
Advanced Shell Scripting
Database Connectivity
Linux Networking

Module 17: Java

Core Java Concepts
Writing Java Programs using Java Principles
Language Conceptuals
Operating with Java Statements
Concept of Objects and Classes
Introduction to Core Classes
Inheritance in Java
Exception Handling in Detail
Getting started with Interfaces and Abstract Classes
Overview of Nested Classes
Getting started with Java Threads
Overview of Java Collections
Understanding JDBC
Java Generics
Input/Output in Java
Getting started with Java Annotations
Reflection and its Usage

Module 18: Apache Kafka

What is Kafka – An Introduction
Multi-broker Kafka Implementation
Multi-node Cluster Setup
Integrate Flume with Kafka
Kafka API
Producers & Consumers

Module 19: SQL

Introduction to SQL
Database Normalization and Entity Relationship Model
SQL Operators
Working with SQL - Join, Tables, and Variables
Deep Dive into SQL Functions
Working with Subqueries
SQL Views, Functions, and Stored Procedures
Deep Dive into User-defined Functions
SQL Optimization and Performance
Advanced Topics
Managing Database Concurrency
Programming Databases using Transact-SQL
Microsoft Courses - Study Material


Skills to Master
Big Data Hadoop
Apache Spark & Scala
Data Science with R
Python for Data Science
Tableau Desktop 10
Splunk Developer & Admin
Azure
Artificial Intelligence and Deep Learning with TensorFlow
MongoDB
AWS

Tools to Master



Course Projects

Projects cover the following industries:

Retail, Social Media, Supply Chain, Entrepreneurship, E-commerce, Banking, Healthcare, Insurance

Working with NumPy (Beginner)
Get hands-on learning to successfully work with NumPy to solve various Python problems. The project also requires learners to efficiently create 2D arrays and perform simple arithmetic operations on the arrays.

Visualizing and Analyzing the Customer Churn Dataset using Python (Beginner)
Analyze data by building aesthetic graphs to make better sense of it. Work with bar plots and their applications, which also includes histogram graphs for data analysis and box plots and the outliers in them.

Deal with Financial Data (Beginner)
Deploy Apache Spark and work with Spark MLlib. Perform regression, clustering, dimensionality reduction, and collaborative filtering to build a movie recommendation system.

Analyzing the Naming Trends using Python (Intermediate)
Analyze the naming trends using Python. Use Python to extract data and understand the applications of data manipulation and the concepts of data visualization.

Performing Analysis on Customer Churn Dataset (Intermediate)
Analyze employment reliability in the telecom industry. In this project, learners will also get to work on real-time analysis of data with multiple labels and data visualization for reliability.

Work with Customer Data of a UK Bank (Intermediate)
Analyze and create a dashboard to gain insight from a UK bank’s customer data in this real-life project. Build an asymmetric drop-down consisting of the region and the corresponding customer details.

Market Basket Analysis (Advanced)
Identify trends in a company's inventory dataset to increase sales numbers. The learners will also get to implement data extraction, data manipulation, etc., for market basket analysis.

Credit Card Fraud Detection (Advanced)
Perform data analysis on a banking dataset for several parameters. Use V4 predictor analysis, V7 predictor analysis, and data visualization techniques to calculate the probability of fraud activities.

Prediction of Loan Approval (Advanced)
Use banking datasets to analyze, clean, process, and visualize data. Once data analysis is completed, you will learn to implement Naive Bayes and Principal Component Analysis.

Analyzing the Trends of COVID-19 with Python (Advanced)
Use pandas to collect data from various files. Use Plotly to build interactive visualizations. Use Prophet from Facebook to efficiently develop time-series models.



Course Advisors / Faculty

Raghu Raman A V
Big Data Consultant
With 10+ years of IT experience and corporate training, he is a subject matter expert in Big Data and
Hadoop along with Cloud Computing. He has an exhaustive track record of training for major
organizations including HCL, IBM, Capgemini, and Wipro.

Suresh Paritala
Solutions Architect at Microsoft
Suresh is responsible for overseeing and directing Big Data and Analytics projects at Microsoft. He is
experienced in advanced analytics, big data, and prescriptive and predictive analysis, with expertise in
Hadoop and Spark.

David Callaghan
Big Data Solutions Architect, USA
A Blockchain expert, David has been bringing integrated blockchain, particularly Hyperledger and Ethereum,
and Big Data solutions to the cloud. He has previously worked on Hadoop, AWS Cloud, Big Data and
Pentaho Projects.



Meet the Batch

Industries Our Learners Come From

50% - Information Technology
17% - BPO/KPO
12% - BFSI
10% - Consulting
6% - Others
5% - Healthcare

Work Experience

13% - 12+ years
15% - 9-12 years
25% - 6-9 years
27% - 3-6 years
20% - 0-3 years



Glimpse of our Successful Transitions

Career Transitions

Rajesh Venaganti, Senior Software Engineer
From Associate Consultant to Senior Software Engineer

Pratik Kumar, Product Manager
From Marketing Manager to Product Manager

Yogesh Kumar, Senior Software Engineer
From Associate Consultant to Senior Software Engineer

Gaurav Saboo, Senior Technical Associate
From Technical Analyst to Senior Technical Associate

Ankit Kumar, Data Scientist
From B.Com Graduate to Data Scientist

Melwin Rodrigues, Data Scientist
From Customer Service Agent to Data Scientist

Shehzin Mulla, Research Analyst
From Data Analyst to Research Analyst

D Rajesh Sahoo, Machine Learning Engineer
From B.Tech (Civil Engineering) to Machine Learning Engineer



Intellipaat Career Services

500+ Webinars
600+ Job Shares
400+ Hiring Partners
55% Avg. Salary Hike*

What Makes Us Tick

Career-oriented Sessions
Attend 25+ career-oriented sessions by industry mentors and plan your career trajectory.

Assured Interviews
Get job opportunities with top MNCs and startups upon movement to the Placement Pool.

Profile Building
Craft a Big Data and Data Science resume and LinkedIn profile and make an impression on top employers.

Dedicated Job Portal Access
Get exclusive access to 200 job postings per month on Intellipaat’s job portal.

Mock Interview Preparation
Prepare with mock interviews, including the questions most asked by top employers.

Job Fairs
Job fairs are conducted regularly to introduce learners to major organizations.

1:1 Mentoring Sessions
Get 1:1 guidance at every step in your career transition to Big Data and Data Science.

Hackathons
Work in teams and get exclusive access to hackathons.

Learner Reviews
DEVESH SINHA
Communication cell at JGEC
The course is comprehensive and has a variety of material like videos, PPTs and PDFs, that are neatly organized. Also, the support I received from the trainer during my learning was great.

Srikanth Penumadula
Senior Manager at Fidelity Investments
Best platform to learn. Course structure is well designed and on request I got access to two more projects that made my foundational knowledge stronger. Overall, I rate the course 4 on a scale of 5.

Vikrant Singh
Senior Big Data Consultant at Amazon
It was a wonderful learning experience to learn from the trainers at Intellipaat. They were hands-on and provided real-time scenarios. Intellipaat is the right place for learning latest technology.

* Past record is no guarantee of future job prospects


Program Partners

About Intellipaat
Intellipaat is one of the leading online training providers with more than 1.2 million learners in over 155
countries. We are on a mission to democratize education as we believe that everyone has the right to
quality education.

We create employability-focused courses in collaboration with top universities and MNCs such as IIT Madras,
University of Essex, University of Liverpool, IIT Roorkee, IIT Guwahati, SPJIMR, IBM, and Microsoft.

Our courses are delivered by SMEs, and our pedagogy enables quick learning of difficult topics. 24/7
technical support and career services help learners jump-start their careers.

Collaboration with IBM and Microsoft


IBM and Microsoft are leading innovators and the biggest players in creating
Big Data and Data Science tools. In this Master’s course, top subject matter experts will
share knowledge in the domain of Big Data and Data Science.

Benefits of this collaboration for learners:


Industry-recognized certification from IBM

Industry-recognized certification from Microsoft

Industry-specific case studies and project work



2 Million Learners & 500+ corporates across 155+ countries upskilling on the Intellipaat platform

Contact Us
INDIA

AMR Tech Park 3, Ground Floor, Tower B, Hongasandra Village,
Bommanahalli, Hosur Road, Bangalore, Karnataka 560068, India

Phone No: +91-7022374614


UK

Flat 16 Bluepoint Court, 203 Station Road, Harrow,

Middlesex HA1 2TS, UK


USA

1219 E. Hillsdale Blvd. Suite 205, Foster City, CA 94404

Phone No: 1-800-216-8930

sales@intellipaat.com

www.intellipaat.com
