School of Professional Advancement Doc. No: QA-WI-01
Course Outlines Issue No: 01
Rev : 00
Resource Person: Ansif Arooj
Trimester: Spring 2020
Course Title: Data Mining
Course Code: XC-470
Course Type:
Pre-Requisite:
Counseling Hours: Class Days
Program: MCS
Program Head: Mr. Imran Saleem
Name Signature Date
Checked By
(Program Head) Mr. Imran Saleem
Approved By
(Director SPA) Dr. Naveed Yazdani
Course Description
Data Mining studies algorithms and computational paradigms that allow computers to find
patterns and regularities in databases, perform prediction and forecasting, and generally improve
their performance through interaction with data. It is currently regarded as the key element of a
more general process called Knowledge Discovery that deals with extracting useful knowledge
from raw data. The knowledge discovery process includes data selection, cleaning, coding, using
different statistical and machine learning techniques, and visualization of the generated
structures. The course will cover all these issues and will illustrate the whole process by
examples. Special emphasis will be given to the Machine Learning methods as they provide the
real knowledge discovery tools. Important related technologies, such as data warehousing and on-line
analytical processing (OLAP), will also be discussed. The students will use recent Data Mining
software.
Format of the Course:
Weekly readings with supplementary lecture material.
Instructional Goals
To introduce students to the basic concepts and techniques of Data Mining.
To develop skills of using recent data mining software for solving practical problems.
To gain experience of doing independent study and research.
Course (Student) Objectives
Upon completion of the course, students will be able to:
Understand the roles and types of different data mining techniques;
Extract and analyze different data mining tools;
Develop basic classification techniques;
Describe different clustering techniques.
Brief Course Content
Session 1 Introduction to Data Mining
What is Data mining?
Need of Data Mining
Related Terms: Data Science, Big Data, Knowledge Discovery
Related technologies - Machine Learning, DBMS, OLAP, Statistics
Advanced and Conventional Database Systems
Data Mining Goals
Data Mining Functions
Knowledge Discovery in Databases (KDD)
Learning Objectives
Participants will be able to understand the basic concepts of data mining and OLTP
techniques, as well as data mining goals and applications.
Session 2 Data Mining Process
Knowledge Representation Methods
Stages of the Data Mining Process
Getting to know your Data
Data Types
Learning Objectives
Participants will be able to understand the knowledge discovery process model and how such models
are implemented in industry and academia.
Session 3 Getting to know your Data
Basic Statistics for Data Mining (five-number summary, box plot, quartiles)
Data Similarity & Dissimilarity Matrix
Proximity measures for binary and nominal data
Learning Objectives
Participants will explore different data mining techniques, understand different data
transformation and data reduction methods, and gain a basic understanding of generating data
hierarchies.
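As an illustration of the five-number summary listed for this session, the following is a minimal sketch rather than course material. It assumes Python's standard `statistics` module and the "inclusive" quartile convention; other textbooks and tools compute quartiles slightly differently. The function name and the sample values are hypothetical.

```python
import statistics

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    data = sorted(values)
    # quantiles(n=4) returns the three quartile cut points; "inclusive"
    # treats the data as the full population (an assumed convention).
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return data[0], q1, median, q3, data[-1]

# Hypothetical sample values, used only to show the call.
sample = [30, 36, 47, 50, 52, 52, 56, 60, 63, 70, 70, 110]
minimum, q1, median, q3, maximum = five_number_summary(sample)
iqr = q3 - q1  # interquartile range, the height of the box in a box plot
print(minimum, q1, median, q3, maximum, iqr)
```

The interquartile range computed at the end is what a box plot draws as its box; values far outside it are the usual outlier candidates.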
Session 4 Data preprocessing
Proximity measures for numeric and ordinal data
Proximity measures for mixed data
Cosine Similarity
Data cleaning
Learning Objectives
Participants will explore different data reduction techniques and generate different
hierarchies of data, as applied in industry and academia.
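To make the cosine similarity topic of this session concrete, here is a minimal sketch in plain Python. The function name and the two term-frequency vectors are hypothetical; the formula itself is the standard dot product divided by the product of the vector lengths.

```python
import math

def cosine_similarity(x, y):
    """Cosine of the angle between two equal-length numeric vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_x * norm_y)

# Hypothetical term-frequency vectors for two documents.
doc1 = [5, 0, 3, 0, 2, 0, 0, 2, 0, 0]
doc2 = [3, 0, 2, 0, 1, 1, 0, 1, 0, 1]
print(round(cosine_similarity(doc1, doc2), 2))
```

A value close to 1 means the two documents use terms in similar proportions; a value of 0 means they share no terms.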
Session 5 Data Preprocessing and Data mining knowledge representation
Data Integration
Data transformation
Data reduction
Discretization and generating concept hierarchies
Learning Objectives
Participants will explore different task-relevant data and gain a basic understanding of data input
and output in mining.
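Discretization, listed among the topics of this session, can be sketched with equal-width binning. This is only one of several discretization methods and is chosen here as an assumption; the function name and sample values are hypothetical.

```python
def equal_width_bins(values, k):
    """Assign each value a bin index 0..k-1 using k equal-width intervals."""
    if k < 1:
        raise ValueError("k must be at least 1")
    low, high = min(values), max(values)
    if low == high:
        # All values are identical, so everything falls in the first bin.
        return [0] * len(values)
    width = (high - low) / k
    # The maximum value would land on the upper edge, so clamp it to bin k-1.
    return [min(int((v - low) / width), k - 1) for v in values]

# Hypothetical attribute values split into three intervals.
prices = [4, 8, 15, 21, 21, 24, 25, 28, 34]
print(equal_width_bins(prices, 3))
```

Replacing each value by its bin index (or by an interval label) is the first level of a concept hierarchy for a numeric attribute.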
Session 6 Data mining knowledge representation
Task relevant data
Background knowledge
Interestingness measures
Representing input data and output knowledge
Learning Objectives
Participants will understand different data interestingness measures and gain a basic
understanding of data input and output in mining.
Session 7 Attribute-oriented analysis
Attribute generalization
Attribute relevance
Learning Objectives
Participants will be able to understand attribute relevance, class comparison, and common
features.
Session 8 Attribute-oriented analysis
Class comparison
Statistical measures
Learning Objectives
Participants will be able to apply different statistical techniques to measure attribute
relevance.
Session 9 Mid Term
Session 10 Data mining algorithms: Association rules
Motivation and terminology
Example: mining weather data
Apriori Algorithm
Learning Objectives
Participants will be able to apply association rule techniques and generate
different item sets.
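The frequent item set stage of the Apriori algorithm named in this session can be sketched as follows. This is a compact illustration under stated assumptions, not an optimized implementation: support is an absolute transaction count, rule generation is left out, and the function name and transactions are hypothetical.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: support count} for every frequent itemset."""
    transactions = [frozenset(t) for t in transactions]

    def frequent_only(candidates):
        # Count how many transactions contain each candidate itemset.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        return {c: n for c, n in counts.items() if n >= min_support}

    current = frequent_only({frozenset([i]) for t in transactions for i in t})
    frequent = dict(current)
    k = 2
    while current:
        previous = set(current)
        # Join step: combine frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in previous for b in previous if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in previous for s in combinations(c, k - 1))}
        current = frequent_only(candidates)
        frequent.update(current)
        k += 1
    return frequent

# Hypothetical market-basket transactions.
baskets = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
for itemset, count in sorted(apriori(baskets, 3).items(), key=lambda kv: sorted(kv[0])):
    print(sorted(itemset), count)
```

The prune step is what makes the approach efficient: an item set is only counted if all of its smaller subsets were already found frequent.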
Session 11 Data mining algorithms: Association rules
Basic idea: item sets
Generating item sets and rules efficiently
Correlation analysis
FP-Growth Algorithm
Learning Objectives
Participants will be able to generate rules efficiently. They will be able to
understand correlation analysis.
Session 12 Data mining algorithms: Classification
Basic learning/mining tasks
Inferring rudimentary rules: 1R algorithm
Learning Objectives
This session will introduce the basic concepts of classification and different
classification algorithms such as 1R.
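The 1R algorithm listed for this session can be sketched in a few lines. The version below assumes nominal attributes only and no missing values; the function name and the small weather-style rows are hypothetical.

```python
from collections import Counter, defaultdict

def one_r(rows, target):
    """Return (attribute, rule, errors) for the single best attribute.

    rows is a list of dicts; target is the key holding the class label.
    """
    best = None
    for attribute in (a for a in rows[0] if a != target):
        # Class frequencies for each value of this attribute.
        table = defaultdict(Counter)
        for row in rows:
            table[row[attribute]][row[target]] += 1
        # Each attribute value predicts its most frequent class.
        rule = {value: counts.most_common(1)[0][0] for value, counts in table.items()}
        # Errors are the rows that do not belong to the predicted class.
        errors = sum(sum(c.values()) - max(c.values()) for c in table.values())
        if best is None or errors < best[2]:
            best = (attribute, rule, errors)
    return best

# Hypothetical training rows.
weather = [
    {"outlook": "sunny", "windy": "no", "play": "no"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "yes", "play": "yes"},
]
print(one_r(weather, "play"))
```

On these rows the attribute with the fewest training errors is selected, and its value-to-class mapping is the whole classifier.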
Session 13 Data mining algorithms: Classification
Decision trees- DTE
Demonstration and numerical example of DTE
Learning Objectives
This session will introduce the working of decision trees and Naïve Bayes, compare
which one is better for text similarity, and show how classification learners are implemented in industry.
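When a decision tree is built, each node needs a criterion for choosing the attribute to split on. The outline does not say which criterion the course uses; the sketch below assumes information gain (entropy reduction), a common choice. The function names and sample rows are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def information_gain(rows, attribute, target):
    """Entropy of the target minus the weighted entropy after the split."""
    base = entropy([row[target] for row in rows])
    remainder = 0.0
    for value in {row[attribute] for row in rows}:
        subset = [row[target] for row in rows if row[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder

# Hypothetical training rows.
rows = [
    {"outlook": "sunny", "windy": "no", "play": "no"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "yes", "play": "yes"},
]
for attribute in ("outlook", "windy"):
    print(attribute, information_gain(rows, attribute, "play"))
```

The attribute with the highest gain becomes the split at the current node, and the same calculation is repeated on each resulting subset.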
Session 14 Data mining algorithms: Classification
Naïve Bayes
Demonstration and numerical example of Naïve Bayes Classifier
Session 14 Data mining algorithms: Clustering
Types of Clustering
K-Means Algorithm
Learning Objectives
This session will introduce the basic concepts of rule-based classification and how support
vector machines are used to compare different documents.
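The K-Means algorithm listed for this session alternates between assigning points to their nearest centroid and recomputing the centroids. The sketch below is a minimal illustration that assumes Euclidean distance and takes the initial centroids as an argument so the result is repeatable; the function name and sample points are hypothetical. It needs Python 3.8 or later for `math.dist`.

```python
import math

def k_means(points, centroids, max_iter=100):
    """Return (centroids, clusters) after Lloyd-style iterations."""
    centroids = [tuple(c) for c in centroids]
    clusters = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster;
        # an empty cluster keeps its previous centroid.
        updated = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if updated == centroids:
            break  # assignments can no longer change
        centroids = updated
    return centroids, clusters

# Hypothetical 2-D points forming two visible groups.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = k_means(points, [(1, 1), (8, 8)])
print(centroids)
print(clusters)
```

In practice the initial centroids are usually picked at random, so different runs can converge to different clusterings.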
Session 15 Data mining algorithms: Clustering
Hybrid Clustering
Agglomerative Clustering Algorithm
Learning Objectives
All participants will learn the basic techniques of prediction and clustering algorithms, and how
clustering is applied to real-world datasets.
Recommended Book(s) & Text:
Jiawei Han, Micheline Kamber, and Jian Pei, Data Mining: Concepts and Techniques
(Third Edition), Morgan Kaufmann.
Ian H. Witten and Eibe Frank, Data Mining: Practical Machine Learning Tools and
Techniques (Second Edition), Morgan Kaufmann, 2005, ISBN: 0-12-088407-0.
E-Resources:
ASSESSMENT METHODOLOGY
Assignments 20
Quizzes 20
Mid Term 20
Final Term Exam 40
Total 100
CALENDAR OF ACTIVITIES
Session  Sub-Topic                                   Readings  Activities
1        Introduction to Data Mining                 Ch-1
2        Introduction to Data Mining process         Ch-1
3        Getting to know your Data                   Ch-2
4        Data preprocessing                          Ch-3      Assignment 1
5        Data mining knowledge representation        Ch-4      Quiz 1
6        Data mining knowledge representation        Ch-4      Discussion
7        Attribute-oriented analysis                 Ch-5      Quiz 2
8        Attribute-oriented analysis                 Ch-5      Assignment 2
9        Mid Term
10       Data mining algorithms: Association rules   Ch-6      Discussion
11       Data mining algorithms: Association rules   Ch-6      Quiz 3
12       Data mining algorithms: Classification      Ch-7      Assignment 3
13       Data mining algorithms: Classification      Ch-7
14       Data mining algorithms: Clustering          Ch-8      Assignment 4
15       Data mining algorithms: Clustering          Ch-8      Presentation