Syllabus Fundamentals of Data Science

Uploaded by

vargiyaaryan2005

Course Code    Course Name                     L  T  P  C
CSDS2001P      Fundamentals of Data Science    4  0  0  4

Total Units to be Covered: 06
Total Contact Hours: 60
Prerequisite(s): Database Management Systems (CSEG2046)
Syllabus version: 1.0

Course Objectives
1. To understand the concept of data science.

2. To understand data science techniques and methods and their use in real-world
applications.

Course Outcomes

After completing the course, students will be able to:

CO1: Understand the fundamentals of data processing.
CO2: Understand and apply mathematical concepts in the field of data science.
CO3: Employ the techniques and methods of data science in a variety of applications.
CO4: Apply logical thinking to understand and solve problems in context.
CO5: Apply the concepts covered in the course using data analysis tools.

CO-PO Mapping

Course Outcomes (rows) against Program Outcomes (columns)

CO       PO1  PO2  PO3  PO4  PO5  PO6  PO7  PO8  PO9  PO10  PO11  PO12  PSO1  PSO2  PSO3
CO1      -    3    2    2    -    -    -    -    -    -     -     -     -     3     3
CO2      -    3    2    2    -    -    -    -    -    -     -     -     -     3     3
CO3      -    3    2    3    -    -    -    -    -    -     -     -     -     3     3
CO4      -    3    2    3    -    -    -    -    -    -     -     -     -     3     3
CO5      -    3    2    2    -    -    -    -    -    -     -     -     -     2     3
Average  -    3    2    2.4  -    -    -    -    -    -     -     -     -     2.8   3

1 – Weakly Mapped (Low)
2 – Moderately Mapped (Medium)
3 – Strongly Mapped (High)
“-” means there is no correlation

Syllabus

Unit I: Introduction to Data Science 8 Lecture Hours

Fundamentals of Data Science, Real-World Applications, Data Science Challenges,
Software Engineering for Data Science (DataOps, introduction to MLOps). Data science
process roles, Stages in data science.

Defining Analytics, Types of data analytics (Descriptive, Diagnostic, Predictive,
Prescriptive).

Data Science Process: CRISP-DM Methodology, SEMMA, Big Data Life Cycle, SMAM.

Unit II: Probability and Statistics for Data Science 12 Lecture Hours

Probability: Introduction, finite sample spaces, conditional probability, independence;
Random variables, distribution functions, probability mass and density functions,
standard univariate discrete and continuous distributions; Mathematical expectations,
moments; Random vectors, joint, marginal, and conditional distributions, independence,
covariance, correlation, standard multivariate distributions, functions of random vectors;
central limit theorem.

Statistics: Sampling distributions of the sample mean and the sample variance for a
normal population; Point and interval estimation; Sampling distributions (Chi-square,
t, F, Z); Hypothesis testing; One-tailed and two-tailed tests; Analysis of variance
(ANOVA): one-way and two-way classifications.

Unit III: Data, Data Sources and Visualization 15 Lecture Hours

Types of Data and Datasets, Data Quality and Issues, Data Models, General
Framework of Formal Modeling, Association Analyses, Prediction Analyses, Data
Pipelines and Patterns, Data from files and working with relational databases. Diverse
data sources: data warehouses, data mining, cloud, and data lakes (characteristics,
components). Data Streaming Ingestion, Batch Data Ingestion, Data Cataloging, Data
Pipeline Stages (extraction, ingestion, cleaning, exploration, wrangling, versioning,
data transformation, feature management). Data Visualization: Overview of visualization
techniques for exploratory data analysis.

Unit IV: Feature Engineering and Optimization 10 Lecture Hours

Feature Extraction, Feature Construction, Feature Subset Selection, Feature Learning,
Feature Reduction (Dimensionality Reduction), Case Study involving feature engineering
tasks, and Feature Engineering techniques for text, images, audio, and video. Necessary
and sufficient conditions for optima; Gradient descent methods; Constrained optimization;
Introduction to non-gradient techniques; Introduction to least squares optimization;
Optimization view of machine learning.

Unit V: Supervised and Unsupervised Learning 10 Lecture Hours

Introduction to Machine Learning and its types. Supervised Learning: Overview, workflow,
data processing, Linear Regression, Logistic Regression, Decision Trees, Random Forest,
Support Vector Machines (SVM), k-Nearest Neighbors (k-NN).

Unsupervised Learning: Overview; clustering algorithms: K-Means Clustering,
Hierarchical Clustering, DBSCAN, Gaussian Mixture Models (GMM).

Dimensionality Reduction: Principal Component Analysis (PCA), t-Distributed Stochastic
Neighbor Embedding (t-SNE).

Association Rule Mining: Apriori Algorithm, FP-Growth Algorithm. Anomaly Detection.
Model Evaluation (Silhouette Score, Inertia, etc.).

Use Cases and Practical Applications.

Unit VI: Data Analysis Tool 5 Lecture Hours

Reading and getting data into R; ordered and unordered factors; arrays and matrices;
lists and data frames; reading data from files; probability distributions; statistical
models in R; manipulating objects; data distribution.

Total Lecture Hours: 60

Textbooks
1. Avrim Blum, John Hopcroft, and Ravindran Kannan, “Foundations of Data Science”,
2018. Available online at: https://www.cs.cornell.edu/jeh/book.pdf.
2. G. Strang, “Introduction to Linear Algebra”, 5th Edition, Wellesley-Cambridge Press,
USA, 2016.
3. D. C. Montgomery and G. C. Runger, “Applied Statistics and Probability for
Engineers”, 5th Edition, John Wiley & Sons, Inc., NY, USA, 2011.
4. Nina Zumel and John Mount, “Practical Data Science with R”, Manning
Publications, 2014.
Reference Books
1. Mark Gardener, “Beginning R - The Statistical Programming Language”, John Wiley
& Sons, Inc., 2012.
2. W. N. Venables, D. M. Smith, and the R Core Team, “An Introduction to R”, 2013.
Available online at: https://cran.r-project.org/doc/manuals/R-intro.pdf.
3. S. Abiteboul, R. Hull, and V. Vianu, “Foundations of Databases”, Addison Wesley, 1995.
4. J. S. Bendat and A. G. Piersol, “Random Data: Analysis and Measurement
Procedures”, 4th Edition, John Wiley & Sons, Inc., NY, USA, 2010.
5. Cathy O’Neil and Rachel Schutt, “Doing Data Science”, O’Reilly Media, 2013.
Modes of Evaluation: Quiz / Assignment / Presentation / Extempore / Written Examination

Examination Scheme
Components      IA   Mid Sem   End Sem   Total
Weightage (%)   50   20        30        100

Course Code    Course Name                         L  T  P  C
CSDS2101P      Fundamentals of Data Science Lab    0  0  2  1

Total Units to be Covered: 10
Total Contact Hours: 30
Prerequisite(s): Database Management Systems Lab (CSEG2146)
Syllabus version: 1.0

Course Objectives

1. Learn to collect, clean, and preprocess data from diverse sources for analysis.
2. Understand core statistical concepts to extract valuable insights from data.
3. Gain a foundational understanding of machine learning algorithms and their
applications.
4. Develop coding skills to perform data analysis and visualization.

Course Outcomes

CO 1. Know the importance of data analytics in relation to various statistical measures.

CO 2. Employ statistical techniques to extract insights from data.
CO 3. Demonstrate proficiency in using R for data analysis.

CO-PO Mapping

Course Outcomes (rows) against Program Outcomes (columns)

CO       PO1  PO2  PO3  PO4  PO5  PO6  PO7  PO8  PO9  PO10  PO11  PO12  PSO1  PSO2  PSO3
CO1      1    -    -    1    1    -    -    -    -    -     -     -     -     -     3
CO2      1    -    -    1    1    -    -    -    -    -     -     -     -     -     3
CO3      1    2    -    2    1    -    -    -    -    -     -     -     -     -     3
Average  1    0.67 -    1.3  1    -    -    -    -    -     -     -     -     -     3

1 – Weakly Mapped (Low)
2 – Moderately Mapped (Medium)
3 – Strongly Mapped (High)
“-” means there is no correlation
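
The syllabus does not say how the Average row of the mapping table is computed. The printed figures are consistent with a plain column mean in which an unmapped cell counts as 0, so the following Python sketch works under that assumption. The function name `column_average` and the use of `None` for an unmapped cell are illustrative choices, not part of the syllabus.

```python
from typing import Optional


def column_average(scores: list[Optional[int]]) -> float:
    """Mean of one PO/PSO column; unmapped cells (None) count as 0 (assumption)."""
    total = sum(s for s in scores if s is not None)
    return total / len(scores)


# Lab-course columns, read top to bottom for CO1, CO2, CO3.
po2 = [None, None, 2]
po4 = [1, 1, 2]
pso3 = [3, 3, 3]

print(round(column_average(po2), 2))   # 0.67
print(round(column_average(po4), 2))   # 1.33
print(round(column_average(pso3), 2))  # 3.0
```

Under this reading the PO4 average is 4/3, which the table prints as 1.3.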

List of Experiments
Experiment no. 1: Conduct basic data exploration by calculating summary statistics,
creating histograms, and generating scatterplots.

Experiment no. 2: Learn data cleaning techniques, including handling missing data,
outliers, and data imputation.

Experiment no. 3: Perform hypothesis tests, such as t-tests or chi-squared tests, to
make inferences about data.

Experiment no. 4: Implement simple linear regression to analyze relationships
between variables and make predictions.

Experiment no. 5: Create a variety of visualizations, including bar charts, line graphs,
heatmaps, and box plots.

Experiment no. 6: Use clustering algorithms to group similar data points together.

Experiment no. 7: Build a random forest model for more advanced classification and
regression tasks.

Experiment no. 8: Discover frequent item sets and association rules in transactional
data.

Experiment no. 9: Project 1 (Sentiment analysis)

Experiment no. 10: Project 2 (Recommendation systems)

Total Lab hours 30
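
The course outcomes name R as the tool for the lab. The sketch below uses Python purely to illustrate the calculations behind Experiments 1 and 4 (summary statistics and simple linear regression by ordinary least squares). The data set and the names `fit_line`, `hours`, and `marks` are made up for illustration and are not part of the syllabus.

```python
from statistics import mean, median, stdev


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of cross-deviations and sum of squared x-deviations.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Hypothetical data: hours studied against marks obtained.
hours = [1, 2, 3, 4, 5]
marks = [52, 55, 61, 64, 68]

# Experiment 1 style summary statistics.
print(f"mean={mean(marks)}, median={median(marks)}, sd={stdev(marks):.2f}")

# Experiment 4 style regression and a prediction for an unseen x value.
slope, intercept = fit_line(hours, marks)
predicted = intercept + slope * 6
print(f"slope={slope:.2f}, intercept={intercept:.2f}, predicted={predicted:.2f}")
```

In R, the same fit would normally be obtained with the built-in `lm` function rather than written by hand.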

Textbooks
1. G. Strang, “Introduction to Linear Algebra”, 5th Edition, Wellesley-Cambridge Press,
USA, 2016.
2. D. C. Montgomery and G. C. Runger, “Applied Statistics and Probability for
Engineers”, 5th Edition, John Wiley & Sons, Inc., NY, USA, 2011.
Reference Books
1. Mark Gardener, “Beginning R - The Statistical Programming Language”, John Wiley
& Sons, Inc., 2012.
2. W. N. Venables, D. M. Smith, and the R Core Team, “An Introduction to R”, 2013.
Available online at: https://cran.r-project.org/doc/manuals/R-intro.pdf.

Modes of Evaluation: Quiz / Assignment / Presentation / Extempore / Written Examination

Examination Scheme: Continuous Assessment

Components      Quiz & Viva   Performance & Lab Report
Weightage (%)   50            50
