Data Science & Analysis Training Manual
Module 1: Python for Data Science
Week 1: Introduction to Python
Class 1 (Wednesday) - Understanding Python and Setting Up the Environment
1.1 What is Python?
Python is a high-level, interpreted programming language known for its simplicity and readability. It is
widely used in Data Science, Web Development, Automation, and Artificial Intelligence.
Why is Python popular for Data Science?
• Simple syntax, making it easy to learn.
• Large community support with many libraries.
• Used in Machine Learning, AI, and data analysis.
1.2 Installing Python and Setting Up the Environment
To get started with Python, we will use Anaconda, which includes:
• Python
• Jupyter Notebook
• Essential libraries (NumPy, Pandas, Matplotlib, etc.)
Installation Steps:
1. Download Anaconda from https://www.anaconda.com
2. Install using default settings.
3. Open Jupyter Notebook from the Anaconda Navigator.
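Once the installation steps above are complete, a quick check in a notebook cell can confirm which interpreter is running. This is a minimal sketch using only the standard library; the variable name is illustrative.

```python
import sys

# Report the version of the interpreter that runs this notebook.
version = sys.version_info
print(f"Python {version.major}.{version.minor}.{version.micro}")

# Show where the interpreter is installed; with Anaconda this path
# usually points inside the Anaconda installation folder.
print(sys.executable)
```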
1.3 Writing Your First Python Program
Open Jupyter Notebook and create a new Python 3 notebook. Type the following into the first cell and run it with Shift+Enter:
print("Hello, Data Science!")
Task: Modify the program to print your name instead.
Class 2 (Sunday) - Python Basics
2.1 Variables and Data Types
A variable is a container for storing data values. Example:
name = "John"
age = 25
score = 89.5
is_student = True
Data Types in Python:
• int (Integer): Whole numbers
• float (Floating point): Decimal numbers
• str (String): Text
• bool (Boolean): True or False values
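To see which of the types listed above a value has, Python provides the built-in type() function. The sketch below reuses the sample values from this section; the names age_as_text and score_as_int are illustrative.

```python
# Sample values mirroring the four basic types listed above.
name = "John"
age = 25
score = 89.5
is_student = True

# type() returns the class of a value; __name__ gives its short name.
for value in (name, age, score, is_student):
    print(value, "->", type(value).__name__)

# Conversions between types are explicit.
age_as_text = str(age)      # "25"
score_as_int = int(score)   # 89 (the fractional part is dropped)
print(age_as_text, score_as_int)
```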
Task: Create a variable to store your age, name, and whether you are a student.
2.2 Operators in Python
Operators allow us to perform operations on variables.
# Arithmetic Operators
x = 10
y = 5
print(x + y) # Addition
print(x - y) # Subtraction
Other operator types include comparison (==, !=, >, <, >=, <=) and logical (and, or, not) operators.
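The remaining arithmetic operators, together with comparison and logical operators, can be tried out in the same way. This is a small sketch using the same x and y as above; the result variable names are illustrative.

```python
x = 10
y = 5

# More arithmetic operators
product = x * y      # 50
quotient = x / y     # 2.0 (true division always returns a float)
remainder = x % 3    # 1
power = y ** 2       # 25

# Comparison operators produce booleans
is_equal = x == y    # False
is_greater = x > y   # True

# Logical operators combine booleans
both = is_greater and y > 0   # True
either = is_equal or x < 0    # False
negated = not is_equal        # True

print(product, quotient, remainder, power)
print(is_equal, is_greater, both, either, negated)
```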
Task: Write a program that takes two numbers as input and prints their sum and product.
2.3 Input and Output Functions
The input() function allows user input:
name = input("Enter your name: ")
print("Hello, " + name)
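Note that input() always returns text (a str), so numeric input has to be converted before doing arithmetic. The sketch below uses a hypothetical helper, to_number, and a fixed string in place of a real input() call so that it runs without waiting for the keyboard.

```python
def to_number(text):
    """Convert text such as "42" or " 3.5 " to a float (hypothetical helper)."""
    return float(text.strip())

# In a notebook, the text would normally come from input(), for example:
#     raw = input("Enter a number: ")
raw = "7.5"  # stand-in for what a user might type
value = to_number(raw)

print("Twice your number is", value * 2)
# An f-string is another way to build the output message.
print(f"You entered {value}")
```

Without the conversion, "7.5" * 2 would repeat the text instead of doubling the number.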
Task: Ask the user for their favorite programming language and print a message including their
response.
Week 2: Control Flow and Loops
Class 3 (Wednesday) - Conditional Statements
3.1 If, Elif, Else Statements
Conditional statements allow decision-making in Python.
age = int(input("Enter your age: "))
if age >= 18:
    print("You are an adult.")
elif age >= 13:
    print("You are a teenager.")
else:
    print("You are a child.")
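Python checks the conditions from top to bottom and runs only the first branch that matches, which is why the order of the tests matters. The sketch below wraps the same thresholds in a hypothetical function, describe_age, so that several ages can be tried without typing input.

```python
def describe_age(age):
    """Return a label for an age, using the same thresholds as above."""
    if age >= 18:
        return "adult"
    elif age >= 13:
        # Reached only when age < 18, because the first test failed.
        return "teenager"
    else:
        return "child"

for sample in (30, 15, 8):
    print(sample, "->", describe_age(sample))
```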
Task: Write a program that checks if a number is positive, negative, or zero.
Class 4 (Sunday) - Loops in Python
4.1 For Loop & While Loop
Loops are used to repeat code multiple times.
for i in range(5):
    print("Hello!")
Task: Write a loop that prints numbers from 1 to 10.
x = 0
while x < 5:
    print(x)
    x += 1
Task: Modify the program to print numbers from 10 to 1.
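As a further illustration of both loop types, range() also accepts a start, stop, and step, and a while loop is handy when the number of repetitions is not known in advance. The values and variable names in this sketch are illustrative only.

```python
# range(start, stop, step): the stop value itself is excluded.
evens = []
for n in range(0, 10, 2):
    evens.append(n)
print(evens)  # [0, 2, 4, 6, 8]

# Count how many doublings it takes for 1 to reach at least 100.
value = 1
doublings = 0
while value < 100:
    value *= 2
    doublings += 1
print(doublings, value)  # 7 128
```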
Week 3: Data Structures in Python
Class 5 (Wednesday) - Lists & Tuples
5.1 Lists
Lists are used to store multiple items in a variable.
fruits = ["apple", "banana", "cherry"]
print(fruits[0]) # Accessing elements
fruits.append("mango") # Adding a new element
print(fruits)
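A few more list operations, followed by a tuple for comparison, are sketched below. A tuple is written with parentheses and, unlike a list, cannot be changed after it is created. The sample values are illustrative.

```python
fruits = ["apple", "banana", "cherry"]

print(len(fruits))   # number of items: 3
print(fruits[-1])    # last item: cherry
print(fruits[0:2])   # slice: ['apple', 'banana']

fruits[1] = "blueberry"   # lists are mutable, so items can be replaced
fruits.remove("apple")    # remove an item by value
print(fruits)             # ['blueberry', 'cherry']

# A tuple looks similar but is immutable.
point = (3, 4)
x, y = point              # unpack the tuple into two variables
try:
    point[0] = 10         # attempting to change a tuple raises TypeError
except TypeError as error:
    print("Tuples are immutable:", error)
```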
Task: Create a list of 5 numbers and print their sum.
Class 6 (Sunday) - Dictionaries & Sets
6.1 Dictionaries
A dictionary stores data in key-value pairs.
student = {"name": "John", "age": 21, "score": 85}
print(student["name"]) # Accessing values
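Dictionaries can be updated and looped over, and a set is a related structure that keeps only unique items. The sketch below extends the student example; the "passed" key and the set contents are made up for illustration.

```python
student = {"name": "John", "age": 21, "score": 85}

student["score"] = 90     # update an existing value
student["passed"] = True  # add a new key-value pair

# get() returns a default instead of raising KeyError for a missing key.
email = student.get("email", "not provided")
print(email)

for key, value in student.items():
    print(key, ":", value)

# A set stores each item only once.
grades = {85, 90, 85, 70}
print(len(grades))    # 3
print(90 in grades)   # True

# Set intersection: items present in both sets.
passed = {"Alice", "Bob"}
enrolled = {"Bob", "Carol"}
print(passed & enrolled)  # {'Bob'}
```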
Task: Create a dictionary to store details about a book (title, author, year).
Week 4: Functions and File Handling
Class 7 (Wednesday) - Functions in Python
7.1 Defining Functions
A function is a reusable block of code.
def greet(name):
    print("Hello, " + name)

greet("Alice")
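The greet function above prints a message but does not hand a value back to the caller. The sketch below contrasts that with a function that uses return; rectangle_area and its default height are hypothetical examples, not part of the course material.

```python
def rectangle_area(width, height=1.0):
    """Return the area of a rectangle; height defaults to 1.0."""
    return width * height

area = rectangle_area(3, 4)
print(area)    # 12

strip = rectangle_area(5)   # height falls back to the default
print(strip)   # 5.0

def greet(name):
    print("Hello, " + name)

# A function without a return statement gives back None.
result = greet("Alice")
print(result)  # None
```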
Task: Write a function that takes two numbers and returns their sum.
Class 8 (Sunday) - Working with Files
8.1 Reading and Writing Files
with open("file.txt", "w") as file:
    file.write("Hello, world!")
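The second argument to open() is the mode: "w" writes (replacing any existing content), "a" appends, and "r" reads. The sketch below shows all three on a hypothetical file, notes.txt, created in a temporary folder so that no existing file is overwritten.

```python
import os
import tempfile

# Work in a temporary folder that is deleted automatically afterwards.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "notes.txt")

    # "w" creates the file, or replaces its content if it already exists.
    with open(path, "w", encoding="utf-8") as file:
        file.write("first line\n")

    # "a" adds to the end of the file instead of replacing it.
    with open(path, "a", encoding="utf-8") as file:
        file.write("second line\n")

    # "r" (the default mode) opens the file for reading.
    with open(path, "r", encoding="utf-8") as file:
        lines = file.read().splitlines()

print(lines)  # ['first line', 'second line']
```

The with statement closes the file automatically, even if an error occurs inside the block.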
Task: Write a program to read and print content from a file.
Week 5: Working with Data in Python
Class 9 (Wednesday) - Introduction to Pandas
9.1 Pandas DataFrames
import pandas as pd
data = {"Name": ["Alice", "Bob"], "Age": [25, 30]}
df = pd.DataFrame(data)
print(df)
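Once a DataFrame exists, columns can be selected by name, summarised, filtered, and extended. This is a brief sketch on the same data, assuming pandas is installed (it ships with Anaconda); the AgeNextYear column is made up for illustration.

```python
import pandas as pd

data = {"Name": ["Alice", "Bob"], "Age": [25, 30]}
df = pd.DataFrame(data)

print(df.shape)           # (rows, columns): (2, 2)
print(df["Age"].mean())   # average of the Age column: 27.5

# Keep only the rows where the condition is true.
older = df[df["Age"] > 26]
print(older)

# Create a new column from an existing one.
df["AgeNextYear"] = df["Age"] + 1
print(df)
```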
Task: Create a Pandas DataFrame with names and test scores.
Class 10 (Sunday) - Data Cleaning & Processing
10.1 Handling Missing Data
df = df.dropna()  # dropna() returns a new DataFrame, so assign the result to keep it
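To see the effect of dropna(), it helps to have a DataFrame that actually contains a missing value. The sketch below builds a small, made-up dataset, counts the gaps, and shows fillna() as one possible alternative to dropping rows.

```python
import pandas as pd

# Illustrative data; None becomes NaN (missing) in a numeric column.
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Carol"],
    "Score": [88.0, None, 92.0],
})

# Count the missing values in each column.
print(df.isna().sum())

# dropna() returns a new DataFrame without rows that contain NaN;
# the original df is left unchanged.
cleaned = df.dropna()
print(cleaned)

# Alternative: replace missing scores with the column average.
filled = df.fillna({"Score": df["Score"].mean()})
print(filled)
```

Whether to drop or fill missing values depends on the dataset and on how the results will be used.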
Task: Write a Pandas script to remove missing values from a dataset.