BA 404: DATA MINING
MODULE I:
1. Introduction to Data Mining: Data mining, Related technologies (Machine Learning, DBMS,
OLAP, Statistics), Data Mining Goals, Stages of the Data Mining Process, Data Mining
Techniques, Knowledge Representation Methods, Applications [4L]
2. Data Warehouse and OLAP: Data Warehouse and DBMS, Multidimensional data model,
OLAP operations [2L]
3. Data Preprocessing: Data cleaning, Data transformation, Data reduction, Discretization and
generating concept hierarchies, Installing the Weka 3 Data Mining System [4L]
4. Data Mining Knowledge Representation: Task relevant data, Background knowledge,
Interestingness measures, Representing input data and output knowledge, Visualization
techniques, Experiments with Weka: visualization [6L]
5. Attribute-Oriented Analysis: Attribute generalization, Attribute relevance, Class comparison,
Statistical measures [4L]
MODULE II:
6. Data Mining Algorithms I: Association rules, Motivation and terminology,
Generating item sets and rules efficiently, Correlation analysis [4L]
7. Data Mining Algorithms II: Classification, Basic learning/mining tasks, Inferring rudimentary
rules: 1R algorithm, Decision trees, Covering rules [6L]
8. Data Mining Algorithms III: Prediction, The prediction task, Statistical (Bayesian)
classification, Bayesian networks, Instance-based methods (nearest neighbor), Linear
models [4L]
9. Clustering: Basic issues in clustering, Conceptual clustering system, Partitioning methods:
k-means, expectation maximization (EM), Hierarchical methods: distance-based agglomerative
and divisive clustering, Conceptual clustering: Cobweb [4L]
10. Case Studies [2L]
Suggested Readings:
1. Cristianini N. and Shawe-Taylor J.: An Introduction to Support Vector Machines and Other
Kernel-based Learning Methods, Cambridge University Press, 2000.
2. Hand D., Mannila H. and Smyth P.: Principles of Data Mining, MIT Press, 2001.
3. Langley P.: Elements of Machine Learning, Morgan Kaufmann Publishers, 1996.
4. Larose D.T.: Discovering Knowledge in Data: An Introduction to Data Mining,
Wiley-Interscience, 2005.