MCPE2008 DATA MINING AND DATA WAREHOUSE (3-0-0)
Course Objectives:
1. To be familiar with the mathematical foundations of data mining tools.
2. To understand and implement classical models and algorithms in data warehouses and data mining.
3. To characterize the kinds of patterns that can be discovered by association rule mining, classification and
clustering.
4. To master data mining techniques in various application contexts, such as social, scientific and environmental.
5. To develop skill in selecting the appropriate data mining algorithm for solving practical problems.
MODULE I
Data Warehousing and Business Analysis: - Data Warehousing Components – Building a Data Warehouse – Data
Warehouse Architecture – DBMS Schemas for Decision Support – Data Extraction, Cleanup, and Transformation
Tools – Metadata – Reporting – Query Tools and Applications – Online Analytical Processing (OLAP) – OLAP and
Multidimensional Data Analysis.
MODULE II
Data Mining: - Data Mining Functionalities – Data Preprocessing – Data Cleaning – Data Integration and
Transformation – Data Reduction – Data Discretization and Concept Hierarchy Generation – Architecture of a
Typical Data Mining System – Classification of Data Mining Systems.
Association Rule Mining: - Efficient and Scalable Frequent Itemset Mining Methods – Mining Various Kinds of
Association Rules – From Association Mining to Correlation Analysis – Constraint-Based Association Mining.
MODULE III
Classification and Prediction: - Issues Regarding Classification and Prediction – Classification by Decision Tree
Induction – Bayesian Classification – Rule-Based Classification – Classification by Backpropagation –
Support Vector Machines – Associative Classification – Lazy Learners – Other Classification Methods –
Prediction – Accuracy and Error Measures – Evaluating the Accuracy of a Classifier or Predictor – Ensemble
Methods – Model Selection.
MODULE IV
Cluster Analysis: - Types of Data in Cluster Analysis – A Categorization of Major Clustering Methods –
Partitioning Methods – Hierarchical Methods – Density-Based Methods – Grid-Based Methods – Model-Based
Clustering Methods – Clustering High-Dimensional Data – Constraint-Based Cluster Analysis – Outlier
Analysis.
Mining Object, Spatial, Multimedia, Text and Web Data: - Multidimensional Analysis and Descriptive
Mining of Complex Data Objects – Spatial Data Mining – Multimedia Data Mining – Text Mining – Mining the
World Wide Web.
Course Outcomes: Upon successful completion of this course, students should be able to:
CO1: Understand the functionality of the various data mining and data warehousing components
CO2: Appreciate the strengths and limitations of various data mining and data warehousing models
CO3: Perform classification and prediction to analyze various kinds of data
CO4: Describe different methodologies used in data mining and data warehousing
CO5: Apply technical know-how of data mining principles and techniques to real-time applications
BOOKS:
1. Jiawei Han, Micheline Kamber and Jian Pei, “Data Mining: Concepts and Techniques”, Third Edition, Elsevier,
2011.
2. Alex Berson and Stephen J. Smith, “Data Warehousing, Data Mining & OLAP”, Tata McGraw-Hill Edition,
Tenth Reprint 2007.
3. K.P. Soman, Shyam Diwakar and V. Ajay, “Insight into Data Mining: Theory and Practice”, Eastern Economy
Edition, Prentice Hall of India, 2006.
4. G. K. Gupta, “Introduction to Data Mining with Case Studies”, Eastern Economy Edition, Prentice Hall of India,
2006.
5. Pang-Ning Tan, Michael Steinbach and Vipin Kumar, “Introduction to Data Mining”, Pearson Education, 2007.