IITM Pravartak certified
Advanced Professional
course in Big Data and
Cloud Analytics
Live Classes with Placement Guidance
About GUVI
GUVI is India’s first Vernacular EdTech platform of its
kind. GUVI stands for ‘Grab Ur Vernacular Imprint’,
dedicated to making technical education accessible
and effective by breaking down language barriers. Our
pioneering EdTech company is incubated by India's
premier institutions, ensuring the highest standards of
quality and innovation. We aim to make a significant
impact in the field of tech upskilling, opening doors for
learners across India to acquire valuable technical skills
in their vernacular languages. By democratizing tech
education online through prominent partnerships with
Google-for-Education, UiPath, NASSCOM, & AICTE, GUVI
has made it possible to impart job-ready tech skills to
the ambitious aspirants.
About IIT-M Pravartak
IITM Pravartak Technologies Foundation is a Section 8
company housed within the Indian Institute of Technology
Madras. It operates as the Technology Innovation Hub on
Sensors, Networking, Actuators, and Control Systems
(SNACS), funded by the Department of Science and
Technology, Government of India, under its National
Mission on Interdisciplinary Cyber-Physical Systems.
Dedicated to preparing India's youth for the forefront of
technological advancement, our core activities include
technology development, entrepreneurship development,
human resource development, and international
collaboration. By fostering an ecosystem that integrates
academia, industry, government, and international
organizations, IIT-M Pravartak facilitates the translation of
fundamental research into practical products, driving
innovation and shaping the future of technology.
We are Accredited by
About the Co-Founders
Sridevi
Co-Founder at GUVI
20+ years of Technical Expertise & more
Tech Women Entrepreneur who was selected
for Google Developers’ Launchpad Program
Arun Prakash
CEO & Founder at GUVI
20+ years of Technical Expertise & more
Built 7 Products from Scratch, Mentored 1000+
students, Hosted 200+ sessions & 25+ webinars
Bala Murugan
Co-Founder at GUVI
17+ years of experience with IT industry
Technologist with 9+ years of Entrepreneurial
experience & Member of the Syllabus
Sub-Committee at Anna University
Big Data and Cloud Analytics Program
Delivered as 5 months of weekend live online classes, our
Big Data and Cloud Analytics Program covers every
technology in depth. Hands-on training with industry
projects, mock interviews, and unlimited practice sessions on
coding practice platforms such as CodeKata and MicroArc help
students with no prior tech knowledge skill up
and get ready for Big Data and Cloud Analytics job roles.
Along with an industry-oriented curriculum and
industry-recognized certification, we offer Placement
Guidance.
Why GUVI’s
Big Data and Cloud
Analytics Class?
Industry Led Curriculum
Hands on Training
Placement Guidance
IIT-M Pravartak Certification
Pre-Bootcamp Phase
Book a Seat with ₹8000 (100% Refundable)
Attend Pre-Bootcamp Session (2 sessions)
Take Assessment
If Selected/Interested: Proceed to Data Engineering Program with
₹1,23,900 (Course Fee) - ₹8000 (Booking Fee) = ₹1,15,900 (Remaining Fee)
If Not Selected/Not Interested: Immediate Refund of ₹8000
Big Data and Cloud
Analytics Program
5-Months Weekend Live Online Class
Hands-on Industry Projects
Technical Mentorship by Industry Experts
Practise on Coding Practice Platforms (CodeKata, MicroArc)
Live Cumulative test & Mock Interviews
Proceed to Placement Guidance
Top skills you’ll learn!
Great command in Python.
Solid Foundation in Database.
Cloud Services.
Hands-on in Big Data.
Excellent knowledge of Data Cleaning & Data
Visualization techniques.
Technologies covered
Python Database Shell Script
(Primary)
Orchestrator Cloud Services Big Data
Data Cleaning Data Pipelines
Infrastructure as code Data security & Privacy Capstone Project
Program Curriculum
Module 1 : Python
We will explore Python, a versatile and beginner-
friendly programming language. Python is known
for its readability and wide range of applications,
from web development and data analysis to
artificial intelligence and automation. It offers a
rich ecosystem of libraries and tools, making it a
popular choice for both novice and
experienced programmers.
Why Python?
Python IDE
Hello World Program
Variables & Names
String Basics
List
Tuple
Dictionaries
Conditional Statements
For and While Loops, Try and Except
Numbers and Math Functions
Common Errors in Python
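As a rough illustration of how several Module 1 topics fit together (dictionaries, a for loop, a conditional, and try/except), here is a minimal sketch. The names, the marks, and the pass mark of 50 are made-up sample values, not part of the course material.

```python
# Hypothetical sample data: marks arrive as text, and one entry is not a number.
marks = {"Asha": "82", "Ravi": "absent", "Meena": "47"}
PASS_MARK = 50  # assumed threshold, for illustration only

results = {}
for name, raw in marks.items():
    try:
        score = int(raw)          # raises ValueError for "absent"
    except ValueError:
        results[name] = "no score"
        continue
    # Conditional expression: pick a label based on the score.
    results[name] = "pass" if score >= PASS_MARK else "fail"

print(results)
```

Catching only ValueError, rather than using a bare except, keeps other unexpected errors visible.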
Module 2 : Python (Advanced)
We will dive into some advanced concepts like
comprehension, file handling, regular
expressions, object oriented programming,
pickling and many more essential concepts.
Functions, Lambda, Filter and Map
Functions as Arguments
List Comprehension
Debugging in Python
Class and Objects
Inheritance, polymorphism, abstraction
Linear and non-linear data structures
Singly, doubly, circular linked lists, binary tree
Bubble, insertion, merge, quick, heap sorting
File Handling (Text, JSON, CSV)
Iterators
Pickling, Multi Threading
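To give a flavour of the functional topics listed above (lambda, filter, map, functions as arguments, and list comprehensions), the following is a small sketch. The order amounts, the 10% discount, and the helper name apply_to_all are hypothetical and chosen only for illustration.

```python
# Hypothetical order amounts, used only for illustration.
amounts = [1200, 450, 3100, 980, 2750]


def apply_to_all(func, items):
    """Take a function as an argument and call it on each item."""
    return [func(item) for item in items]


# A lambda is a small anonymous function; the 10% discount is made up.
discounted = lambda amount: amount * 90 // 100

# filter() keeps the items for which the predicate returns True.
large_orders = list(filter(lambda amount: amount >= 1000, amounts))

# map() applies a function to every item.
via_map = list(map(discounted, large_orders))

# A list comprehension expresses the same filter-then-map in one expression.
via_comprehension = [discounted(a) for a in amounts if a >= 1000]

# Passing a function as an argument to our own helper gives the same result.
via_helper = apply_to_all(discounted, large_orders)

print(large_orders)
print(via_map)
```

All three approaches produce the same list; which one reads best is largely a matter of style.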
Module 3 : RDBMS & SQL
We will explore RDBMS (Relational Database
Management System) to understand the database
technology that organizes data into structured
tables with defined relationships, and will
explore some core SQL concepts.
Database: Introduction and Installation
Data Definition Language
(Create, Drop, Truncate, Alter)
Data Manipulation Language
(Select, Delete, Update, Insert)
Data Control Language (Grant, Revoke)
Transaction Control Language (Commit, Rollback)
SQL Keys and Constraints (Primary Key, Foreign
Key, Unique, Not NULL, CHECK, DEFAULT)
Operators (Arithmetic, Logical, Bitwise,
Comparison, Compound)
Clauses in SQL (Where, Having, Group by, Order by)
Module 4 : SQL -continued
We will dive into SQL (Structured Query Language) to
acquire the skills needed for managing and querying
relational databases. SQL lets us retrieve,
update, and manipulate data, making it a fundamental
tool for working with structured data in various
applications.
Joins
SQL Outer Join
SQL Left Join
SQL Right Join
SQL Full Join
SQL Cross Join
Integrating Python with SQL
Window functions (rank, dense rank, row
number, etc.)
Data Types, Variables, and
Constants
Conditional Structures (IF, CASE, GOTO, and NULL)
Stored Procedures and Functions
Sub queries
Triggers
Indexes
Transaction
Views
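The sketch below illustrates two of the topics listed above together: integrating Python with SQL, and the RANK, DENSE_RANK and ROW_NUMBER window functions. It uses Python's built-in sqlite3 module with an in-memory database purely so that it runs without any setup; the course itself may use a different database. The sales table and its rows are hypothetical, and window functions assume the bundled SQLite is version 3.25 or newer.

```python
import sqlite3

# In-memory database so the example needs no installation (an assumption;
# any RDBMS with window-function support would behave similarly).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (rep TEXT, region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("Asha", "South", 500),
        ("Ravi", "South", 700),
        ("Meena", "South", 700),
        ("Kiran", "North", 300),
        ("Divya", "North", 900),
    ],
)

query = """
SELECT rep, region, amount,
       RANK()       OVER (PARTITION BY region ORDER BY amount DESC)      AS rnk,
       DENSE_RANK() OVER (PARTITION BY region ORDER BY amount DESC)      AS dense_rnk,
       ROW_NUMBER() OVER (PARTITION BY region ORDER BY amount DESC, rep) AS row_num
FROM sales
ORDER BY region, row_num
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
conn.close()
```

In the South region, the two tied reps share rank 1. RANK then skips to 3 for the next rep, DENSE_RANK continues with 2, and ROW_NUMBER always assigns distinct consecutive numbers.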
Module 5 : MongoDB
We delve into MongoDB to understand this popular
NoSQL database, which stores data in flexible,
JSON-like documents. We learn how MongoDB's scalability
and speed make it suitable for handling large volumes
of unstructured data.
CAP Theorem
Structured and unstructured data
OLTP vs OLAP
Schema vs Schema less
Dimensional modeling
Cluster setup and Monitoring
Insert First Data
CRUD Operations
Insert Many
Update and Update Many
Delete and Delete Many
Projection
Intro to Embed Documents
Embed Documents in Action
Adding Arrays
Fetching Data From Structured Data
Schema Types
Types of Data in MongoDB
Relationships between data
Aggregation
One to One using Embed Method
One to One using Reference Method
One to Many Embedded
One to Many Reference Method
Assessment
Module 6 : Shell Script
We explore shell scripting in the Linux
environment, where we learn to write and
execute scripts using the command-line
interface. Shell scripts are text files containing
a series of commands, and we discover how to
use them to automate tasks.
Introduction to Linux
Basic Shell script commands
Creating Frameworks
Cron jobs, Email alerts
Running Batch jobs
Module 7 : Git
We will study Git, a distributed version control
system, to learn how it tracks changes in software
code. Git allows collaborative development, enabling
multiple people to work on the same project
simultaneously while managing different versions of
code. It is essential for software development, as it
tracks revisions, facilitates collaboration, and helps in
code management.
Introduction to Git
Git commands
Cloning repository in VS Code
Working on cloning branches, commit, push,
add, merge from VS Code
Module 8 : AWS Cloud
We delve into cloud computing, which involves
delivering various computing services (such
as servers, storage, databases, networking,
software, and analytics) over the internet.
Introduction to Cloud
AWS Services overview
Server vs serverless
IAM, roles, policies
EC2, VM’s
S3
RDS – MySQL Free tier database,
Integrating RDS to Local System and
Integrating RDS to Python Environment
Cloud data warehouse
Cloud Data Lake
Cloud database (DynamoDB)
Lambda
CloudWatch
Integrating All the Above components and RDS
Monitoring ETL pipeline with Step function
Glue, Data crawler, Athena
Monitoring ETL pipeline with Step function
System Design
Load balancer and High availability
Horizontal vs Vertical Scaling
Monolithic vs microservice
Distributed messaging service and AWS SQS
CDN (Content Delivery Network)
Caching, scalability
AWS API gateway
Module 9 : Snowflake
We study Snowflake to grasp modern
cloud-based data warehousing, focusing on its
architecture, data sharing, scalability, and data
analytics applications.
Introduction to Snowflake
Difference between Data Lake, Data Warehouse,
Delta Lake, Database
Dimension and Fact Tables
Roles and users
Data modeling, Snowpipe
MOLAP and ROLAP
Partitioning and indexing
Data mart and data cubes & caching
Data masking
Handling JSON files
Data loading from S3 and transformation
Module 10 : Airflow
We explore Airflow to understand its role in
orchestrating and automating workflows,
scheduling tasks, managing data pipelines, and
monitoring job execution.
Why and what is airflow
airflow UI
Run first dag
grid view
graph view
landing times view
calendar view
gantt view
Code view
Core concepts of airflow
DAGs
Scope
Operators
control flow
Task and task instance
Database and executors
ETL/ ELT process implementation
monitoring ETL pipeline with airflow
Module 11 : BigData
We delve into big data to learn about handling
and analyzing vast datasets, using tools like
Hadoop, HDFS, Hive, and Pig for insights and
decision-making.
Installation and configuration of Apache Hive
and MySQL locally
Running Hive Query to integrate Local and HDFS
file system
Installing Pig
Working with Pig script and integrating with local
and HDFS file system
Installing HBase and working with HBase queries
Installing Cassandra and working with Cassandra
Installing Sqoop and Flume and performing data
migration:
Local RDBMS to HDFS ,
Local RDBMS to Hive,
Local RDBMS to HBase,
HDFS to local RDBMS
Hive to RDBMS
Module 12 : Kafka
We learn about Kafka, an open-source stream
processing platform. Kafka is used for ingesting,
storing, processing, and distributing real-time
data streams. We explore Kafka's architecture,
topics, producers, consumers, and its role in
handling large volumes of data with low latency.
Introduction to Kafka
Producers, consumers, consumer groups
Topics, offsets, partitions, brokers
ZooKeeper, replication
Batch vs real time streaming
Real Streaming Process
Assignment and Task
Module 13 : Spark
We will explore Spark, which is an open-source,
distributed computing framework that provides
high-speed, in-memory data processing for big
data analytics.
Introduction to Apache Spark
Spark architecture
Hadoop vs Spark
RDDs, DAG, transformations, actions
Data Partitioning and Shuffling
DataFrame & Spark SQL
Streaming data handling in Spark
Spark batch data processing (CSV, JSON,
Parquet files)
AWS Data Management Tools [AWS EMR,
Glue jobs]
Assessments
Module 14 : Data Cleaning
We will engage in data cleaning to understand
the process of identifying and correcting errors
or inconsistencies in datasets, ensuring data
accuracy and reliability for analysis
and reporting.
Structured vs Unstructured Data using Pandas
Common Data issues and how to clean them
Data cleaning with Pandas and PySpark
Handling JSON Data
Meaningful data transformation (Scaling and
Normalization)
Example: Movies Data Set Cleaning
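The cleaning and transformation steps listed above can be sketched in plain Python, without Pandas or PySpark, to show the underlying idea. The movie records, field names, and the choice to fill a missing rating with the mean are all hypothetical; the module's actual Movies data set and method may differ. The terms scaling and normalization are also used differently by different authors, so this sketch treats min-max scaling and z-score standardization as two common examples.

```python
from statistics import mean, pstdev

# Hypothetical raw records with typical issues: stray whitespace,
# inconsistent casing, a missing value, and a duplicate row.
raw_movies = [
    {"title": " Movie A ", "rating": "8.0", "runtime_min": "120"},
    {"title": "movie b", "rating": "", "runtime_min": "90"},
    {"title": "Movie C", "rating": "6.0", "runtime_min": "150"},
    {"title": "Movie A", "rating": "8.0", "runtime_min": "120"},
]


def clean(records):
    """Normalize titles, convert types, and drop duplicate titles."""
    seen, cleaned = set(), []
    for rec in records:
        title = rec["title"].strip().title()
        if title in seen:
            continue
        seen.add(title)
        rating = float(rec["rating"]) if rec["rating"] else None
        cleaned.append(
            {"title": title, "rating": rating, "runtime_min": int(rec["runtime_min"])}
        )
    return cleaned


def min_max_scale(values):
    """Rescale values to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def z_score(values):
    """Center values on 0 with unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


movies = clean(raw_movies)

# Fill the missing rating with the mean of the known ratings (one simple option).
known = [m["rating"] for m in movies if m["rating"] is not None]
for m in movies:
    if m["rating"] is None:
        m["rating"] = mean(known)

runtimes = [m["runtime_min"] for m in movies]
scaled_runtimes = min_max_scale(runtimes)
standardized_runtimes = z_score(runtimes)
print(movies)
print(scaled_runtimes)
```

With Pandas, the same steps would typically map to string methods, drop_duplicates, fillna, and column arithmetic.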
Module 15 : Prometheus
We will study Prometheus to explore its role as
an open-source monitoring and alerting toolkit,
used for collecting and visualizing metrics from
various systems, aiding in performance
optimization and issue detection.
Server, architecture
Installation
Understanding the Prometheus UI
Node exporters
PromQL (aggregations, functions, operators, data types)
Integrating Python with Prometheus
Counter, gauge, summary, histogram
Recording rules
Alerting rules
Alertmanager, installation of Alertmanager
Grouping, inhibiting, throttling, silencing alerts
Slack integration with Prometheus via Alertmanager
PagerDuty integration with Alertmanager
Blackbox exporters, installation
MySQL exporter
Integrating AWS and Prometheus
AWS CloudWatch and Prometheus
Implementing a Grafana dashboard for Prometheus
Module 16 : Datadog
Datadog is a monitoring and analytics platform for
cloud-scale applications. It provides developers,
operations teams, and business users with insights
into their applications, infrastructure, and overall
performance.
Metrics
Dashboards
Alerts
Monitors
Tracing
Logs monitoring
Integrations
Module 17 : Docker
Docker is an open-source platform used to develop,
ship, and run applications in containers. Containers
are lightweight, portable, and self-sufficient units
that package an application along with its
dependencies, libraries, and configuration
files, enabling consistent deployment across
different environments.
What is Docker
Installation of Docker
Docker images, containers
Dockerfile
Docker volume
Docker registry
Containerizing applications with docker hands-on
Module 18 : Kubernetes
Kubernetes is an open-source container
orchestration platform that automates the
deployment, scaling, and management of
containerized applications.
Nodes
Pods
ReplicaSets
Deployments
Namespaces
Ingress
Hear it from our learners
“I am Vinothkumar Gurusamy. I discovered GUVI
through advertisements and after a thorough
discussion on various courses, I chose Data
Engineering. The course was instrumental in
upskilling my knowledge and has proven to be
incredibly useful in my work in the data
engineering field. Thank you, GUVI, for providing
such a valuable learning experience!”
Vinothkumar Gurusamy
“I have recently completed my master data
engineering program from GUVI GEEK and it has
been a great learning, hands on project is
effective and gives real time experience. Both
mentors and coordinators staffs are supportive
and quick in response.”
Abinesh RD
“The Data Engineering course at GUVI GEEK was
an exceptional learning experience. The hands-on
projects provided real-world insights and
valuable practical skills. The mentors and
coordinators were incredibly supportive and
responsive, guiding me throughout the course.
The program offered endless help in developing
my skills, making it a truly valuable investment
in my career. Thank you, GUVI, for this
incredible opportunity!”
Lohit Shakthi Kosigi
“GUVI is one of the best platforms to
start a new course and a new career.
Advanced Programming and Master Data
Science is one of the best programs
which are been trained with industry
experts. It has its own software to
practise and a huge number of exercises
to master any topic.”
Tejas Samanthapudi
“Guvi helps me to improve my self-confidence
in coding skills. The zoom classes are totally
comfortable, friendly and easy to learn. It helps
me to understand the basic and the core
concepts and it helped me to build logical
skills. I got great mentors which helped me to
bridge between the academics. I'm very proud.
Thanks to Guvi.”
Gokila
“I have attended several classes of Masters in Data
science course conducted by Guvi. It is really
helpful to gain knowledge as it is different from
other online courses. Here, we have mentors in live
sessions, so we will be more concentrated than
other online courses where we watch pre
recorded videos. Also we are getting weekly tasks
that would make us learn even if there is no class.
I am thankful for all the people in Guvi for building
up such a valuable program for our career.”
Gokul
“Guvi offers a cordial, supportive and friendly
environment to learners. With excellent support
and 24*7 assistance from the mentors guvi
does not leave any stone unturned to improvise
your learning. Thanks for being such an
inspiration to us.”
Gokak Mohd Ishtiyaque
“I always liked coding but I didn't really get a
good platform to learn things as per industrial
requirements. When I was in search I got to know
about Guvi, I really felt trustworthy by their
response. When I joined the Data Science course
the weekend live classes and recorded course
videos has made learning easy to me. Eventually
I started spending more time practicing in
Codekata. I loved the way Guvi took care of
clarifying doubts asap. Thank you!”
Sonia Kola
“Hello folks, if you are thinking of a career transition in
the ‘Data Science’ field then, “GUVI” is the best
platform to get nourished, indulged and protruded in
this upcoming field and also, it doesn’t matter from
which engineering background you are or whether
you are a working fellow. The best thing I found here
is you will always get motivated unknowingly and
become curious to learn more & more from the
tutorial videos conducted by the IITM professors.
GUVI helps me to think about the problem in
multidimensional ways. Thanks to the GUVI team”
Shubham Nehete
“The datascience course is very good,
the concepts are being explained in a
crisp manner. The instructors have good
depth in the subject and solve every
doubt one might have. Thanks to GUVI for
setting a great structured program.”
Diliban Sibi
“The course videos help you to learn the tools by
yourself and you can track the progress. The
mentors are very patient and ensure that
students understand the concept, sometimes
going the extra mile and explaining. Sometimes
the mentors try to teach in your native language, if
needed. The practice platforms are easy to learn
and practice. By completing this data science
course, sure you can become a Data Scientist.”
Sridharan K
“This course is designed being dynamic, interactive
and range of materials to refer. This is very well
structured in such a way that it makes the
participants to perform, discuss, and to participate in
assessments that will help the participants to
maximize the utilization. This program is suitable for all
students, freshers and working professionals. This
course is excellent for those who would like to learn
the basics of program like Python and would like to
broaden their knowledge in Data Science. I enjoyed
seeing videos in GUVI website from experts that also
explains the concepts in a detailed manner.”
Anbazhagan
“They are very approachable and friendly when
we ask any doubt or any clarification. Before
joining guvi I have already done a course of data
science in another institution. When comparing
these two institutions, there is a lot of difference
in teaching. I love that the mentor who is
teaching the course is not only a mentor but a
professional too. This is a very unique thing
about guvi. I will rate 5/5 to Guvi.”
Vishally
Instructors
Learn from India’s top Industry Leaders
Balachandar Kasilingam
Shabarinath
Technical architect
Manager
Incedo solutions and
Resprolabs technology
Nethaji Nirmal
Senior Skill Development Engineer - Iamneo
Aishwarya S
Data Engineer
Roche
Rajaguru Kanagasabai
Data Engineer 3
PayPal
Vinish Vivek
Consultant - Python
Freelance
Thillaikkarasan M
Lead Data Scientist
Wells Fargo
Rajesh M Venkatesan
Manager
Quotient technology
Our Placements
"I got a 57% Hike,
Thank you GUVI Team"
Sonia Kola
Data Scientist
“Every topic was
covered from scratch”
Rakesh
Python Developer
Program Details
5-Months Weekend Live Online Classes
Please contact our Big Data and Cloud Analytics
coordinator Deepak: +91-97360 97320
Total Course Fee ₹1,23,900
Pre-BootCamp Booking Fees -₹8000
Remaining Fee ₹1,15,900
Become proficient in Big Data and Cloud Analytics with
affordable installments! Master Data Engineering
at just ₹10893/month,
for up to 12 months.
Note: Valid documents are required for the EMI process.
An additional processing fee will be applied. The EMI amount
might vary with vendors.
No Eligibility Restrictions!
Open to students and working professionals seeking
opportunities to upskill their Data Engineering
proficiency for faster career growth.
Develop your Data
Engineering skills
&
Unlock a challenging &
rewarding Career
Begin your Skill Development Journey Today!
For further information:
Deepak@guvi.in
+91 9736097320
IITM Research Park - Phase 2,
Module #9, 3rd Floor, D Block,
Kanagam Rd, Tharamani, Chennai,
Tamil Nadu, India. 600113