
Data Warehouse and Data Mining Syllabus

This document provides information on the IT-32 Subject: Data Warehouse and Data Mining course. The course is worth 4 credits and involves 60 hours of lectures and is evaluated through 30% internal and 70% external assessments. The course aims to teach concepts of data warehousing, OLAP, data mining techniques including classification, clustering, and association rule mining. It is divided into 6 units covering topics such as data preprocessing, data warehousing, OLAP, association rule mining, classification, clustering, and other data mining approaches and applications.

Uploaded by

Altamash
Copyright: © All Rights Reserved

Subject Code: IT-32

Subject: Data Warehouse and Data Mining

Credit Scheme                      Evaluation Scheme
Lecture   Practical   Credit       Internal   External   Total
60        -           4            30         70         100

Course Description:
Prerequisites:
• Basic concepts of databases / RDBMS
• Basic knowledge of statistics and data structures
Course Objectives:
• To study data warehouse architectures, OLAP, and the project planning aspects
of building a data warehouse
• To introduce the concepts, techniques, design, and applications of data warehousing
and data mining
• To enable students to understand and implement classical algorithms in data mining
• To understand the various approaches to data warehousing and data mining
implementations
• To understand how to analyze the data, identify the problems, and choose the relevant
algorithms to apply

Course Outcomes with Bloom's Taxonomy:

Students will be able to:
CO1: Learn and understand techniques for preprocessing various kinds of data. - Understand
CO2: Understand data warehouse concepts. - Understand
CO3: Apply association mining techniques to large data sets. - Apply
CO4: Apply classification and clustering techniques to large data sets. - Analyze
CO5: Understand other approaches to data mining. - Understand

Course Structure:
Unit No.   Topics / Details   Weightage in %   No. of Sessions

1 Know your Data & Data Pre-processing: 15 6
Data objects, attribute types (Nominal, Ordinal,
Interval, Ratio scale), descriptions of data, measuring
data similarity and dissimilarity (clustering, outlier
analysis, and nearest-neighbor classification)
Data Pre-processing: Data quality (Incompleteness,
Accuracy, Inconsistency, Invalidity, Redundancy,
Non-standard), major tasks in preprocessing, Data
cleaning: Missing values, Noisy data (Binning methods,
Clustering or outlier analysis, Regression), Data cleaning
as a process
Data Integration: Entity identification problem,
Redundancy and correlation analysis, Tuple
duplication, Data value conflict detection &
resolution
Data Reduction: Data reduction strategies (Dimensionality
reduction, Numerosity reduction (Histogram,
Clustering, Data cube aggregation, Sampling), Data
compression), wavelet transforms, principal component
analysis, linear regression and log-linear regression
models, discriminant analysis and logistic regression
Data Transformation (Smoothing, Attribute
construction, Aggregation, Normalization,
Generalization) & Data Discretization
2 Data Warehousing & Online Analytical Processing: 15 8
Introduction to data warehousing, Need for a Data
Warehouse (DW), Operational database versus
DW
Data warehouse life cycle, Building a data warehouse,
Data warehousing components, Data warehousing
architecture, DW models
Extraction, Transformation & Loading, Metadata
repository, feature selection & creation
Multi-dimensional data modeling: Star schema, snowflake
schema & fact constellation schema, Online Analytical
Processing, Categorization of OLAP tools, Data cubes &
operations on cubes
Design and usage of a data warehouse (at least one system
diagram)
3 Association Rule Mining: Basic Concepts and Algorithms: 20 6
Data mining versus the Knowledge Discovery process,
Introduction to machine learning and data mining
techniques, Data mining issues and challenges
Why association mining is necessary, Pros and cons of
association rules
Frequent itemset generation, Rule generation, Compact
representation of frequent itemsets, Apriori algorithm
Alternative methods for generating frequent itemsets,
FP-Growth algorithm
Extracting the best possible rules from a real data set and
evaluation of association patterns

4 Classification and Prediction: 20 8
Basics, General approach to solving a classification problem,
Classification by decision tree induction
Bayesian classification, Rule-based classification,
k-Nearest-Neighbor classifiers (lazy learners),
Prediction, Classifier accuracy
Classification by backpropagation (Artificial Neural
Network), Support Vector Machines, Associative
classification
Performing classification and evaluating the efficiency
of the model: a case study
5 Clustering Techniques: 20 6
Overview, Features of cluster analysis, Types of data and
computing distance
Categorization of major clustering methods: Partitioning
methods, Hierarchical methods, Density-based methods,
K-means algorithm, Quality and validity of cluster
analysis, Outlier analysis
A case study on finding efficient clusters in a set of
documents / a case study on a real data set
6 Other Approaches to Data Mining and Data Mining Applications: 10 6
Discovery of sequential patterns, Discovery of patterns in
time series
Bayesian networks, Genetic algorithms, Rough set &
fuzzy set approaches
Text mining (NLP), Web mining
Temporal and spatial data mining
Data mining trends and Business Intelligence (BI)
applications
Data visualization: Dashboards and KPIs, BI and analytics tools

Total: 100 40
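
Illustration for Unit 3 (frequent itemset generation and rule evaluation): the short Python sketch below counts itemset support level by level, in the spirit of the Apriori algorithm, and computes the confidence of a rule. The transactions, the support threshold, and the function names are hypothetical and chosen only for illustration; they are not prescribed by the syllabus.

```python
from itertools import combinations

# Toy transactions (hypothetical data, for illustration only).
TRANSACTIONS = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "butter"},
    {"bread", "milk", "butter", "jam"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def frequent_itemsets(transactions, min_support):
    """Level-wise search: size-k candidates come only from frequent (k-1)-sets."""
    items = sorted({i for t in transactions for i in t})
    level = [frozenset([i]) for i in items
             if support([i], transactions) >= min_support]
    result = {}
    k = 1
    while level:
        for s in level:
            result[s] = support(s, transactions)
        k += 1
        # Join step: union pairs of frequent sets that give a set of size k.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are
        # frequent, then check its own support.
        level = [
            c for c in candidates
            if all(frozenset(sub) in result for sub in combinations(c, k - 1))
            and support(c, transactions) >= min_support
        ]
    return result


def confidence(antecedent, consequent, transactions):
    """confidence(A -> B) = support(A and B together) / support(A)."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


if __name__ == "__main__":
    for itemset, s in frequent_itemsets(TRANSACTIONS, min_support=0.5).items():
        print(sorted(itemset), round(s, 2))
    print(round(confidence({"bread"}, {"milk"}, TRANSACTIONS), 2))
```

With a minimum support of 0.5 on this toy data, the three single items bread, milk, and butter and their three pairs are frequent, while the triple {bread, butter, milk} (support 0.4) and anything containing jam are not. FP-Growth, also listed in Unit 3, reaches the same frequent itemsets without generating candidates explicitly.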

Recommended Reference Books:


• Data Warehousing Fundamentals: A Comprehensive Guide for IT
Professionals, by Paulraj Ponniah, Latest Edition, Wiley India
• Building the Data Warehouse, 3rd edition, by W. H. Inmon, Wiley India
• Data Mining: Concepts and Techniques, by Jiawei Han and Micheline Kamber,
Elsevier
• Data Mining: Practical Machine Learning Tools and Techniques, by Ian H.
Witten, Eibe Frank, and Mark Hall, Elsevier
• Introduction to Data Mining with Case Studies, by G. K. Gupta, Prentice Hall of
India
• Data Mining: Introductory and Advanced Topics, by Margaret Dunham, Pearson
Education
• Data Mining, by Arun K. Pujari, Universities Press
• Data Mining for Business Intelligence, by Galit Shmueli and Nitin
Patel, Wiley-Interscience

Recommended Website References:


• www.ibm.com/in/en/
• www.pentaho.com/
• www.jaspersoft.com/
• www.amazon.com/Data-Mining-Business-Intelligence-Applications
• www.ibm.com/insights/in
• www.sas.com
• Weka: Data Mining with Open Source Machine
Learning Software, www.cs.waikato.ac.nz/ml/weka
• https://cloud.google.com/bigquery/
• https://www.rstudio.com/
• https://aws.amazon.com/redshift/
• www.Kaagal.com
