BCO 029A DATA MINING & WAREHOUSING 3:0:0 [3]
OBJECTIVE:
To compare and contrast different conceptions of data mining.
To explain the role of finding associations in commercial market basket data.
To characterize the kinds of patterns that can be discovered by association rule mining.
To describe how to extend a relational system to find patterns using association rules.
To evaluate methodological issues underlying the effective application of data mining.
UNIT 1 Introduction: Basic concepts of data mining, including motivation
and definition; different types of data repositories; data mining
functionalities; concept of interesting patterns; data mining tasks;
current trends, major issues and ethics in data mining.
UNIT 2 Data: Types of data and data quality; Data Preprocessing: data
cleaning, data integration and transformation, data reduction,
discretization and concept hierarchy generation; Exploring Data:
summary statistics, visualization, multidimensional data analysis.
UNIT 3 Association and Correlation Analysis: Basic concepts: frequent
patterns, association rules - support and confidence; Frequent
itemset generation - Apriori algorithm, FP-Growth algorithm; Rule
generation, Applications of Association rules; Correlation analysis.
UNIT 4 Clustering Algorithms and Cluster Analysis: Concept of clustering,
measures of similarity; Clustering algorithms: Partitioning methods
- k-means and k-medoids, CLARANS; Hierarchical methods -
agglomerative and divisive clustering, BIRCH; Density-based
methods - Subspace clustering, DBSCAN; Graph-based clustering
- MST clustering; Cluster evaluation; Outlier detection and
analysis.
UNIT 5 Classification: Binary Classification - Basic concepts, Bayes
theorem and Naïve Bayes classifier, Association based
classification, Rule based classifiers, Nearest neighbor classifiers,
Decision Trees, Random Forest; Perceptrons; Multi-category
classification; Model overfitting, Evaluation of classifier
performance - cross validation, ROC curves.
Applications: Text mining, Web data analysis, Recommender systems.
Prerequisites: Familiarity with basic Linear Algebra and Probability will be assumed.
OUTCOMES: At the end of the course, the student should be able to:
Compare and contrast different conceptions of data mining.
Explain the role of finding associations in commercial market basket data.
Characterize the kinds of patterns that can be discovered by association rule mining.
Describe how to extend a relational system to find patterns using association rules.
Evaluate methodological issues underlying the effective application of data mining.
MAPPING COURSE OUTCOMES LEADING TO THE ACHIEVEMENT OF PROGRAM
OUTCOMES AND PROGRAM SPECIFIC OUTCOMES:
Course     Program Outcomes                                        Program Specific Outcomes
Outcome    PO1 PO2 PO3 PO4 PO5 PO6 PO7 PO8 PO9 PO10 PO11 PO12      PSO1 PSO2 PSO3
CO1
CO2
CO3
CO4
CO5
H = Highly Related; M = Medium; L = Low
Text Books:
1. Pang-Ning Tan, Michael Steinbach and Vipin Kumar, Introduction to Data Mining,
Pearson (2005), India. ISBN 978-8131714720
2. Jiawei Han and Micheline Kamber, Data Mining: Concepts and Techniques, Morgan
Kaufmann, 3rd edition (July 2011), 744 pages. ISBN 978-0123814791
Reference Books:
1. T. Hastie, R. Tibshirani and J. H. Friedman, The Elements of Statistical Learning: Data
Mining, Inference, and Prediction, Springer, 2nd edition, 2009, 768 pages. ISBN 978-0387848570
2. C. M. Bishop, Pattern Recognition and Machine Learning, Springer, 1st edition, 2006, 738
pages. ISBN 978-0387310732
3. Ian H. Witten and Eibe Frank, Data Mining: Practical Machine Learning Tools and
Techniques, Morgan Kaufmann, 3rd edition (January 2011), 664 pages. ISBN 978-0123748560