INT375: DATA SCIENCE TOOLBOX: PYTHON PROGRAMMING
L:2 T:0 P:2 Credits:3
Course Outcomes: Through this course students should be able to
CO1 :: understand and apply Python programming fundamentals
CO2 :: utilize NumPy and Pandas for efficient data manipulation, cleaning, and preparation.
CO3 :: create clear and effective data visualizations using Matplotlib and Seaborn to analyze and
communicate data insights.
CO4 :: execute exploratory data analysis to uncover data insights using Python
CO5 :: perform statistical analysis and hypothesis testing using Python
CO6 :: explain the role of machine learning in data science
Unit I
Introduction to Python for Data Science : Overview of Data Science, Basic Syntax and Data
Types, Control Structures (if statements, loops), Functions and Modules
Unit II
Data Manipulation with NumPy and Pandas : Introduction to NumPy: Arrays, Operations, Data
Manipulation with Pandas: Series and DataFrames, Data Cleaning and Preparation, Handling Missing
Data
Unit III
Data Visualization with Matplotlib and Seaborn : Principles of Data Visualization, Creating Plots
with Matplotlib, Advanced Visualization with Seaborn, Customizing Visualizations
Unit IV
Exploratory Data Analysis (EDA) : Understanding EDA and its Importance, Summary Statistics,
Correlation and Covariance, Outlier Detection
Unit V
Introduction to Statistical Analysis : Descriptive and Inferential Statistics, Hypothesis Testing: Z-test,
t-test, p-test, chi-squared test, variance inflation factor (VIF), Shapiro-Wilk test, Probability
Distributions: Uniform Distribution, Normal Distribution, Binomial Distribution, Poisson Distribution,
Introduction to A/B Testing
Unit VI
Exploring the role of machine learning in data science : Introduction to Machine Learning
Concepts, Supervised vs. Unsupervised Learning, Understanding the CRISP-DM framework using a Linear
Regression model, Introduction to Classification
Recent Trends : Generative AI and Its Applications: GPT-4, DALL-E, Synthetic Data Generation
List of Practicals / Experiments:
• Exploring and understanding Basics of Python Language
• Exploring and understanding the basic concepts of Data Science and components of Python
• Exploring different Control Structures and functions in Python
• Practical on NumPy Package
• Practical to demonstrate working with Data in Python
• Practical to demonstrate working with NumPy Arrays
• Practical on Pandas Package
• Practical on Visualization with Matplotlib
• Practical demonstration on EDA, Summary Statistics
• Practical demonstration on Correlation and Covariance, Outlier Detection
• Practical demonstration on Outlier Detection
Session 2024-25 Page:1/2
• Practical Demonstration on Descriptive and Inferential Statistics, Hypothesis testing
• Practical Demonstration on Hypothesis testing, Probability Distributions
• Practical Demonstration on the CRISP-DM framework using a Linear Regression model
Text Books:
1. PYTHON FOR DATA SCIENCE by MOHD. ABDUL HAMEED, WILEY
2. DATA SCIENCE AND MACHINE LEARNING USING PYTHON by REEMA THAREJA, MCGRAW HILL
References:
1. FOUNDATIONAL PYTHON FOR DATA SCIENCE, 1ST EDITION by KENNEDY BEHRMAN, PEARSON
2. DATA SCIENCE FROM SCRATCH by JOEL GRUS, O'REILLY