Data mining, also known as knowledge discovery, involves extracting useful patterns from large data sources and is essential for gaining competitive advantages in today's data-rich environment. Key tasks include classification, association rule mining, clustering, and data visualization, with applications ranging from marketing to fraud detection. The process encompasses understanding the domain, data preprocessing, and post-processing to incorporate discovered patterns into real-world tasks.
WHAT IS DATA MINING?
Data mining is also called knowledge discovery in
databases (KDD).
Data mining is the
extraction of useful patterns from data sources, e.g.,
databases, texts, the web, images.
Patterns must be:
valid, novel, potentially useful, understandable
This PPT presented By - Pawneshwar Datt Rai
EXAMPLE OF DISCOVERED PATTERNS
Association rules:
“80% of customers who buy cheese and milk also buy
bread, and 5% of customers buy all of them together”
Cheese, Milk → Bread [sup = 5%, conf = 80%]
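How the two numbers in the rule above are computed can be sketched in a few lines. This is a minimal illustration, not a mining algorithm: the transactions are made-up toy data, and the helper names `support` and `confidence` are assumptions introduced here.

```python
# Toy market-basket data (hypothetical, for illustration only).
transactions = [
    {"cheese", "milk", "bread"},
    {"cheese", "milk"},
    {"milk", "bread"},
    {"bread"},
    {"cheese", "milk", "bread", "eggs"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(antecedent, consequent, transactions):
    """support(X and Y together) divided by support(X)."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)

# Rule: Cheese, Milk -> Bread
rule_support = support({"cheese", "milk", "bread"}, transactions)
rule_confidence = confidence({"cheese", "milk"}, {"bread"}, transactions)
print(f"sup = {rule_support:.0%}, conf = {rule_confidence:.0%}")
```

On this toy data, 2 of 5 baskets contain all three items, and 2 of the 3 baskets with cheese and milk also contain bread.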
MAIN DATA MINING TASKS
Classification:
mining patterns that can classify future data into known
classes.
Association rule mining:
mining any rule of the form X → Y, where X and Y are
sets of data items.
Clustering:
identifying a set of similarity groups in the data
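As one illustration of the clustering task, here is a minimal k-means sketch on one-dimensional data. K-means is only one of many clustering methods and is chosen here as an assumption; the spend values, the starting centers, and the function name `kmeans_1d` are hypothetical.

```python
def kmeans_1d(points, centers, iterations=10):
    """Plain k-means on 1-D data with caller-supplied starting centers."""
    centers = list(centers)
    groups = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest center.
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            groups[nearest].append(p)
        # Update step: move each center to the mean of its group
        # (an empty group keeps its previous center).
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups

# Hypothetical customer spend values forming two visible groups.
spend = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centers, groups = kmeans_1d(spend, centers=[1.0, 12.0])
print(centers, groups)
```

The two groups found are the "similarity groups" of the slide: no class labels were supplied, unlike in classification.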
MAIN DATA MINING TASKS
Sequential pattern mining:
A sequential rule A → B says that event A will be
immediately followed by event B with a certain confidence.
Deviation detection:
discovering the most significant changes in data
Data visualization: using graphical methods to show
patterns in data.
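The confidence of a sequential rule can be estimated by counting. The sketch below assumes one simple definition (occurrences of A immediately followed by B, divided by all occurrences of A); the event logs and the function name `sequential_rule_confidence` are hypothetical.

```python
def sequential_rule_confidence(sequences, a, b):
    """Fraction of occurrences of event a that are immediately followed by b."""
    occurrences = 0
    followed = 0
    for seq in sequences:
        for i, event in enumerate(seq):
            if event == a:
                occurrences += 1
                # Only the very next event counts as "immediately followed".
                if i + 1 < len(seq) and seq[i + 1] == b:
                    followed += 1
    return followed / occurrences if occurrences else 0.0

# Hypothetical per-user event sequences.
sessions = [
    ["login", "search", "buy"],
    ["login", "search", "logout"],
    ["search", "buy"],
    ["login", "logout"],
]
conf = sequential_rule_confidence(sessions, "search", "buy")
print(f"search -> buy, confidence = {conf:.2f}")
```

Here "search" occurs three times and is followed directly by "buy" twice.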
WHY IS DATA MINING IMPORTANT?
Rapid computerization of businesses produces huge
amounts of data
How to make best use of data?
A growing realization: knowledge discovered from
data can be used for competitive advantage.
WHY IS DATA MINING NECESSARY?
Make use of your data assets
There is a big gap from stored data to knowledge; and
the transition won’t occur automatically.
Many interesting things you want to find cannot be found
using database queries
“Find me people likely to buy my products”
“Who is likely to respond to my promotion?”
WHY DATA MINING NOW?
The data is abundant.
The data is being warehoused.
The computing power is affordable.
The competitive pressure is strong.
Data mining tools have become available.
RELATED FIELDS
Data mining is an emerging multi-disciplinary field:
Statistics
Machine learning
Databases
Information retrieval
Visualization
etc.
DATA MINING (KDD) PROCESS
Understand the application domain
Identify data sources and select target data
Pre-process: cleaning, attribute selection
Data mining to extract patterns or models
Post-process: identifying interesting or useful patterns
Incorporate patterns in real world tasks
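The middle steps of the process above can be sketched as a small pipeline. Everything in it is an assumption made for illustration: the records, the attribute names, the choice of simple frequency counting as the "mining" step, and the `min_count` threshold.

```python
from collections import Counter

# Hypothetical raw records; None marks a missing value.
raw = [
    {"id": 1, "region": "north", "product": "bread", "note": "x"},
    {"id": 2, "region": "north", "product": "bread", "note": None},
    {"id": 3, "region": "south", "product": None, "note": "y"},
    {"id": 4, "region": "south", "product": "milk", "note": "z"},
    {"id": 5, "region": "north", "product": "bread", "note": "w"},
]

def select_attributes(records, attributes):
    """Attribute selection: keep only the columns relevant to the task."""
    return [{a: r[a] for a in attributes} for r in records]

def clean(records):
    """Cleaning: drop records with a missing value in any kept attribute."""
    return [r for r in records if all(v is not None for v in r.values())]

def mine(records):
    """Mining: count how often each (region, product) pair occurs."""
    return Counter((r["region"], r["product"]) for r in records)

def post_process(patterns, min_count):
    """Post-processing: keep only patterns frequent enough to be interesting."""
    return {p: c for p, c in patterns.items() if c >= min_count}

# Attributes are selected before cleaning here, so a missing value in an
# unused column ("note") does not cause a record to be dropped.
selected = select_attributes(raw, ["region", "product"])
cleaned = clean(selected)
patterns = post_process(mine(cleaned), min_count=2)
print(patterns)
```

Understanding the domain and putting the resulting patterns to use in real-world tasks remain human steps that sit outside a sketch like this.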
DATA MINING APPLICATIONS
Marketing, customer profiling and retention,
identifying potential customers, market
segmentation.
Fraud detection
identifying credit card fraud, intrusion detection
Scientific data analysis
Text and web mining
Any application that involves a large amount of data.
OPINION ANALYSIS
Word-of-mouth on the Web
The Web has dramatically changed the way that
consumers express their opinions.
One can post reviews of products at merchant
sites, Web forums, discussion groups, blogs
Techniques are being developed to exploit these
sources.
Benefits of Review Analysis
Potential Customer: No need to read many reviews
Product manufacturer: market intelligence, product
benchmarking
FEATURE-BASED ANALYSIS & SUMMARIZATION
Extracting product features (called Opinion Features) that
have been commented on by customers.
Identifying opinion sentences in each review and
deciding whether each opinion sentence is positive or
negative.
Summarizing and comparing results.
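The three steps above can be sketched in a simplified form. This sketch assumes the product features are already known and uses tiny hand-written opinion-word lists; it does not show how features are extracted from reviews. The names `FEATURES`, `POSITIVE`, `NEGATIVE`, and `summarize`, and the sample reviews, are all hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical inputs: a fixed feature list and tiny opinion-word lexicons.
FEATURES = {"battery", "screen", "camera"}
POSITIVE = {"great", "good", "sharp", "excellent"}
NEGATIVE = {"poor", "bad", "dim", "terrible"}

def summarize(reviews):
    """Count positive and negative opinion sentences per product feature."""
    summary = defaultdict(lambda: {"positive": 0, "negative": 0})
    for review in reviews:
        # Split each review into sentences on end punctuation.
        for sentence in re.split(r"[.!?]+", review):
            words = set(re.findall(r"[a-z]+", sentence.lower()))
            score = len(words & POSITIVE) - len(words & NEGATIVE)
            if score == 0:
                continue  # no clear opinion in this sentence
            label = "positive" if score > 0 else "negative"
            # Credit the sentence's polarity to every feature it mentions.
            for feature in words & FEATURES:
                summary[feature][label] += 1
    return dict(summary)

reviews = [
    "The battery is great. The screen is dim.",
    "Excellent camera! Battery life is poor.",
    "The screen is sharp and the camera is good.",
]
result = summarize(reviews)
for feature in sorted(result):
    print(feature, result[feature])
```

The per-feature counts are the kind of summary a potential customer or a manufacturer could compare across products.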
A happy and prosperous day to all friends.