MCA303 Data Mining
Course Code: MCC303
Course Title: Data Mining
Course Type: Core
Contact Hours: 4 Hours per Week
Credit: 3
Domain: Professional Core
Syllabus
I
Introduction
Data Warehousing, Multidimensional Data Model, OLAP Operations, Introduction to the KDD
Process, Data Mining, Data Mining: On What Kinds of Data, Data Mining Functionalities,
Classification of Data Mining Systems.
Data Preprocessing
Data Cleaning, Data Integration and Transformation, Data Reduction, Data Discretization and
Concept Hierarchy Generation
II
Exploring Data and Visualization Techniques
General Concepts, Techniques, Visualizing Higher Dimensional Data, Tools
Association Analysis
Basic Concepts, Efficient and Scalable Frequent Itemset Mining Methods: Apriori Algorithm,
Generating Association Rules from Frequent Itemsets, Improving the Efficiency of Apriori,
Mining Frequent Itemsets without Candidate Generation, Evaluation of Association Patterns,
Visualization.
A Case Study on Association using Orange Tool
III
Classification
Introduction to Classification and Prediction, Classification by Decision Tree Induction:
Decision Tree induction, Attribute Selection Measures, Tree Pruning, Bayesian Classification:
Bayes’ Theorem, Naïve Bayesian Classification,
Rule-Based Algorithms: Using IF-THEN Rules for Classification, Rule Extraction from a
Decision Tree, Rule Induction Using a Sequential Covering Algorithm, k-Nearest Neighbour
Classifiers, Support Vector Machine, Evaluating the Performance of a Classifier, Methods for
Comparing Classifiers, Visualization.
A Case Study on Classification using Orange Tool
IV
Prediction
Linear Regression, Nonlinear Regression, Other Regression-Based Methods
Cluster Analysis I: Basic Concepts and Algorithms
Cluster Analysis, Requirements of Cluster Analysis, Types of Data in Cluster Analysis,
Categorization of Major Clustering Methods, Partitioning Methods: k-Means and k-Medoids,
From k-Medoids to CLARANS
A Case Study on Clustering using Orange Tool
V
Cluster Analysis II: Hierarchical Method: Agglomerative and Divisive Hierarchical
Clustering.
Comparison of data mining methods. Applicability of data mining methods for different
scenarios. Considerations for mining unstructured data.
TEXT/REFERENCE BOOKS:
Pang-Ning Tan, Michael Steinbach, Vipin Kumar, “Introduction to Data Mining”.
Jiawei Han and Micheline Kamber, “Data Mining: Concepts and Techniques”, Second Edition,
Elsevier, 2006.
G. K. Gupta, “Introduction to Data Mining with Case Studies”, Eastern Economy Edition,
Prentice Hall of India, 2006.
Glenn J. Myatt, “Making Sense of Data: A Practical Guide to Exploratory Data Analysis and
Data Mining”.
COURSE PRE-REQUISITES:
MCA101, MCA104
COURSE OBJECTIVES:
1. Acquire knowledge of data mining and data warehousing.
2. Learn the different techniques for discovering patterns hidden in large data sets and for
visualizing them.
3. Learn data mining tasks such as classification, estimation, prediction, affinity grouping and
clustering.
COURSE OUTCOMES:
CO No.    Course Outcome Description
MCA303.1  Understand the basic concepts and techniques of data mining, data warehousing and data pre-processing.
MCA303.2  Understand association mining algorithms for the discovery of frequent item patterns in large data sets, and their visualization.
MCA303.3  Understand classification analysis algorithms for the discovery and generation of rules in large data sets, and their visualization.
MCA303.4  Understand basic and advanced clustering analysis algorithms and visualization in data mining.