EASTERN AFRICA STATISTICAL TRAINING CENTRE
HIGHER DIPLOMA IN DATA SCIENCE
FUNDAMENTALS OF DATA MINING
ASSIGNMENT II (June 2024)
1. Importance of Data Preprocessing (Group 1 – 6):
• Question: Explain the significance of data preprocessing in the data mining
process. Discuss the main steps involved in data preprocessing and provide
examples of how poor data preprocessing can affect the outcomes of data mining
tasks.
• Objective: To understand the critical role of data preprocessing and the potential
impact of inadequate preprocessing on data mining results.
2. Comparison of Data Mining Techniques (Group 7 – 12):
• Question: Compare and contrast classification, clustering, and association rule
mining. Discuss the primary objectives, typical algorithms, and types of datasets
suitable for each technique.
• Objective: To highlight the differences and similarities between major data mining
techniques and understand their applications and appropriate contexts.
3. Challenges in Mining Different Types of Data (Group 13 – 18):
• Question: Describe the challenges associated with mining structured, semi-
structured, and unstructured data. Provide examples of each type of data and discuss
the specific techniques used to handle these challenges.
• Objective: To explore the complexities of dealing with various data types and the
methods used to address these challenges in data mining.
4. Role of Evaluation Metrics in Data Mining (Group 19 – 24):
• Question: Discuss the importance of evaluation metrics in data mining. Choose
three common evaluation metrics (e.g., accuracy, precision, recall, F1-score,
support, confidence) and explain how each one is calculated and interpreted.
• Objective: To understand the purpose and application of evaluation metrics in
assessing the performance of data mining models and techniques.
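For orientation only, the metrics named in this question can all be computed from simple counts. The Python sketch below is a minimal illustration under stated assumptions, not a model answer: the function names (classification_metrics, support, confidence) are hypothetical, and it assumes binary class labels and transactions represented as sets of items. It uses the standard definitions: accuracy, precision, recall and F1-score are derived from the confusion-matrix counts, while support and confidence are derived from the fraction of transactions that contain an itemset.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    # Confusion-matrix counts with respect to the chosen positive class.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / base


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    y_true = [1, 0, 1, 1, 0, 1, 0, 0]
    y_pred = [1, 0, 0, 1, 1, 1, 0, 0]
    print(classification_metrics(y_true, y_pred))

    baskets = [{"bread", "milk"}, {"bread", "butter"},
               {"bread", "milk", "butter"}, {"milk"}]
    print(support(baskets, {"bread", "milk"}))
    print(confidence(baskets, {"bread"}, {"milk"}))
```

Interpretation is part of the question and is left to each group: for example, what a high recall with a low precision implies for a given task, or why a rule with high confidence but very low support may not be useful.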
5. Ethical Considerations in Data Mining (Group 25 – 30):
• Question: Analyze the ethical implications of data mining. Discuss issues such as
privacy, consent, bias, and data security. Provide examples of ethical dilemmas that
data miners might face and suggest possible solutions.
• Objective: To reflect on the ethical aspects of data mining and the responsibilities
of data miners in handling sensitive information.