
Assignment 1

The document outlines a series of questions related to Exploratory Data Analysis (EDA), covering basic concepts, descriptive statistics, data cleaning, visualization, real-world applications, and advanced thinking. It addresses the importance of EDA, the differences between EDA and data cleaning, and the steps involved in EDA, as well as techniques for handling data and visualizing relationships. Additionally, it emphasizes the role of EDA in making data-driven decisions and highlights potential pitfalls and tools for effective analysis.

A. Basic Conceptual Questions


1. What is Exploratory Data Analysis (EDA) and why is it important?
2. What is the difference between EDA and data cleaning?
3. What are the key steps involved in performing EDA?
4. Name the types of data (categorical, numerical) and explain how EDA differs for each.
5. What is the difference between univariate, bivariate, and multivariate analysis?

B. Descriptive Statistics Questions


6. How do mean, median, and mode help in understanding data?
7. What is standard deviation and what does it tell you?
8. When would you prefer using median over mean?
9. What does a high variance in a feature suggest?
10. How can skewness in data affect analysis?
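
As an illustrative sketch for questions 6 to 10, the snippet below computes these summary statistics with Python's standard `statistics` module. The sample values and variable names are made up for illustration; one large value is included on purpose to show how an outlier pulls the mean away from the median and produces positive skew.

```python
import statistics

# Hypothetical sample: mostly small values plus one large outlier.
values = [2, 3, 3, 4, 5, 6, 40]

mean = statistics.mean(values)          # sensitive to the outlier
median = statistics.median(values)      # robust to the outlier
mode = statistics.mode(values)          # most frequent value
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
std_dev = statistics.stdev(values)      # sample standard deviation

# Skewness as the standardized third moment (population form).
n = len(values)
pop_std = statistics.pstdev(values)
skewness = sum((x - mean) ** 3 for x in values) / (n * pop_std ** 3)

print(f"mean={mean:.2f} median={median} mode={mode}")
print(f"variance={variance:.2f} std_dev={std_dev:.2f} skewness={skewness:.2f}")
```

Here the mean sits well above the median, which is one common signal that the median may be the better summary of a typical value.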

C. Data Cleaning & Handling Questions


11. How do you handle missing values in a dataset?
12. What is an outlier, and how can it be detected and treated?
13. What is the purpose of data normalization or standardization in EDA?
14. How do you treat duplicate records?
15. What are common data types in pandas, and why do they matter during EDA?
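
A minimal sketch of the cleaning steps raised in questions 11 to 15, using pandas. The toy DataFrame, its column names, and the specific choices (median imputation, the 1.5 × IQR rule, z-score standardization) are assumptions made for illustration; other strategies may suit a given dataset better.

```python
import numpy as np
import pandas as pd

# Hypothetical toy dataset with a missing value, a duplicate row, and an outlier.
df = pd.DataFrame({
    "customer": ["a", "b", "b", "c", "d", "e"],
    "amount": [10.0, 12.0, 12.0, np.nan, 11.0, 500.0],
})

# Data types determine which summaries and plots make sense for each column.
print(df.dtypes)

# Duplicates: drop exact repeated rows.
df = df.drop_duplicates()

# Missing values: fill with the median (one option among several).
df["amount"] = df["amount"].fillna(df["amount"].median())

# Outliers: flag values more than 1.5 * IQR beyond the quartiles.
q1, q3 = df["amount"].quantile([0.25, 0.75])
iqr = q3 - q1
df["is_outlier"] = (df["amount"] < q1 - 1.5 * iqr) | (df["amount"] > q3 + 1.5 * iqr)

# Standardization: rescale to zero mean and unit (sample) standard deviation.
df["amount_z"] = (df["amount"] - df["amount"].mean()) / df["amount"].std()

print(df)
```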

D. Visualization and Interpretation Questions


16. Which plot is best suited for showing the distribution of a single numerical variable?
17. How would you visualize the relationship between two numerical variables?
18. What is a boxplot and what insights can you derive from it?
19. How do heatmaps help in EDA?
20. When should you use pie charts and why are they generally discouraged?

E. Real-World Application Questions


21. If you were given a customer transaction dataset, what EDA steps would you perform?
22. In what ways can EDA help a business make data-driven decisions?
23. How would you use EDA to detect seasonality or trends in sales data?
24. What insights can you gain by analyzing time-based features like "Day of Week" or "Hour of Purchase"?
25. What role does correlation analysis play in EDA?
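
As a hedged illustration of questions 24 and 25, the snippet below derives "Day of Week" and "Hour of Purchase" features from a timestamp column and computes a Pearson correlation with pandas. The transactions, column names, and values are hypothetical.

```python
import pandas as pd

# Hypothetical transactions; timestamps, amounts, and item counts are made up.
tx = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2024-01-01 09:15", "2024-01-01 18:40", "2024-01-02 12:05",
        "2024-01-06 10:30", "2024-01-06 20:10", "2024-01-07 11:45",
    ]),
    "amount": [20.0, 35.0, 15.0, 60.0, 80.0, 55.0],
    "items": [2, 3, 1, 5, 7, 5],
})

# Derive time-based features from the timestamp.
tx["day_of_week"] = tx["timestamp"].dt.day_name()
tx["hour"] = tx["timestamp"].dt.hour

# Aggregate spending by each derived feature.
by_day = tx.groupby("day_of_week")["amount"].sum()
by_hour = tx.groupby("hour")["amount"].mean()

# Pearson correlation between two numerical columns.
corr = tx["amount"].corr(tx["items"])

print(by_day)
print(by_hour)
print(f"correlation(amount, items) = {corr:.3f}")
```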

F. Advanced Thinking Questions


26. How do you choose which features to visualize first?
27. Can visualizations be misleading? Give an example.
28. What are some pitfalls to avoid during EDA?
29. How would you perform EDA on a dataset with thousands of features (e.g., genomic data)?
30. What tools or libraries do you use for EDA in Python and why?
