Lec.01 Introduction To DM

The document outlines the course structure for Data Mining, including assessment components and topics covered such as the importance of data mining, types of data, functionalities, and applications. It emphasizes the explosive growth of data and the necessity for automated analysis to extract valuable insights. Additionally, it discusses the evolution of database technology and various data mining tasks like classification, clustering, and fraud detection.

Course: 505043

Data Mining

Lecture 1. Introduction to Data Mining

Types of Data

Dr. Anh HOANG


Report
• QT1 (10%): attending classes
• QT2 (20%): Homework #1-2-3
• Midterm (20%)
  • Group presentation
  • Individual performance
• Final report (50%)
  • Group presentation
  • Individual performance
• Requirement:
  • Submit HW, report, … before the deadline
• Presentation:
  • 1) Understand the problem clearly
  • 2) Solution/algorithm
  • 3) Demo code

Contents
• Why data mining?
• What is data mining?
• What types of data can be mined?
• Data mining functionalities/tasks
• Interesting patterns
• Classification of data mining systems
• Major issues in data mining

Large-scale Data is Everywhere!
• There has been enormous data growth in both commercial and scientific databases due to advances in data generation and collection technologies.
• New mantra
  • Gather whatever data you can whenever and wherever possible.
• Expectations
  • Gathered data will have value either for the purpose collected or for a purpose not envisioned.

[Figure panels: Cyber Security, E-Commerce, Social Networking: Twitter, Traffic Patterns, Sensor Networks, Computational Simulations]

Introduction to Data Mining, 2nd Edition
Tan, Steinbach, Karpatne, Kumar

Q1. Why Data Mining?
• The explosive growth of data: from terabytes to petabytes
  • Data collection and data availability
    • Automated data collection tools, database systems, Web, computerized society
  • Major sources of abundant data
    • Business: Web, e-commerce, transactions, stocks, …
    • Science: remote sensing, bioinformatics, scientific simulation, …
    • Society and everyone: news, digital cameras, …
• We are drowning in data but starving for knowledge!
• “Necessity is the mother of invention”: data mining is the automated analysis of massive data sets

Why Data Mining? Commercial Viewpoint
• Lots of data is being collected and warehoused
  • Web data
    • Google has petabytes of web data
    • Facebook has billions of active users
  • Purchases at department/grocery stores, e-commerce
    • Amazon handles millions of visits/day
  • Bank/credit card transactions
• Computers have become cheaper and more powerful
• Competitive pressure is strong
  • Provide better, customized services for an edge (e.g. in Customer Relationship Management)

Why Data Mining? Scientific Viewpoint
• Data collected and stored at enormous speeds
  • Remote sensors on a satellite
    • NASA EOSDIS archives over petabytes of earth science data / year
  • Telescopes scanning the skies
    • Sky survey data
  • High-throughput biological data
  • Scientific simulations
    • Terabytes of data generated in a few hours
• Data mining helps scientists
  • In automated analysis of massive datasets
  • In hypothesis formation

[Figure panels: fMRI Data from Brain, Sky Survey Data, Gene Expression Data, Surface Temperature of Earth]

Great opportunities to improve productivity in all walks of life

Great Opportunities to Solve Society’s Major Problems
• Improving health care and reducing costs
• Predicting the impact of climate change
• Finding alternative/green energy sources
• Reducing hunger and poverty by increasing agriculture production

Evolution of Database Technology
• 1960s:
  • Data collection, database creation, IMS and network DBMS
• 1970s:
  • Relational data model, relational DBMS implementation
• 1980s:
  • RDBMS, advanced data models (extended-relational, OO, deductive, etc.)
  • Application-oriented DBMS (spatial, scientific, engineering, etc.)
• 1990s:
  • Data mining, data warehousing, multimedia databases, and Web databases
• 2000s:
  • Stream data management and mining
  • Data mining and its applications
  • Web technology (XML, data integration) and global information systems

Why Data Mining? Potential Applications
• Data analysis and decision support/making
  • Market analysis and management
    • Target marketing, customer relationship management (CRM), market basket analysis, market segmentation
  • Risk analysis and management
    • Forecasting, customer retention, quality control, competitive analysis
  • Fraud detection and detection of unusual patterns (outliers)
• Other applications
  • Text mining (news group, email, documents) and Web mining
  • Stream data mining
  • Bioinformatics and bio-data analysis

Market Analysis and Management
• Where does the data come from?
  • Credit card transactions, discount coupons, customer complaint calls
• Target marketing
  • Find clusters of “model” customers who share the same characteristics: interest, income level, spending habits, etc.
  • Determine customer purchasing patterns over time
• Cross-market analysis
  • Associations/correlations between product sales, and prediction based on such associations
• Customer profiling
  • What types of customers buy what products
• Customer requirement analysis
  • Identify the best products for different customers
  • Predict what factors will attract new customers

Fraud Detection & Mining Unusual Patterns
• Approaches: clustering & model construction for frauds, outlier analysis
• Applications: health care, retail, credit card service, telecom.
  • Medical insurance
    • Professional patients and rings of doctors
    • Unnecessary or correlated screening tests
  • Telecommunications
    • Phone call model: destination of the call, duration, time of day or week. Analyze patterns that deviate from an expected norm
  • Retail industry
    • Analysts estimate that 38% of retail shrink is due to dishonest employees

Other Applications
• Internet Web Surf-Aid
  • IBM Surf-Aid applies data mining algorithms to Web access logs for market-related pages to discover customer preferences and behavior, analyze the effectiveness of Web marketing, improve Web site organization, etc.
• …

Q2. What Is Data Mining?
• Data mining (knowledge discovery from data)
  • Extraction of interesting (non-trivial, implicit, previously unknown and potentially useful) patterns or knowledge from huge amounts of data
• Alternative name
  • Knowledge discovery in databases (KDD)
• Watch out: Is everything “data mining”?
  • Query processing
  • Expert systems
  • Statistical programs

Data Mining: KDD Process
• Data mining: core of the knowledge discovery process

[Figure: KDD process, bottom to top: Databases -> Data Cleaning and Data Integration -> Data Warehouse -> Selection -> Task-relevant Data -> Data Mining -> Pattern Evaluation]

Steps of a KDD Process
• Learning the application domain
  • Relevant prior knowledge and goals of application
• Creating a target data set: data selection
• Data cleaning and preprocessing (may take 60% - 80% of effort!)
• Data reduction and transformation
  • Find useful features, dimensionality/variable reduction.
• Choosing functions of data mining
  • Summarization, classification, regression, association, clustering.
• Choosing the mining algorithm(s)
• Data mining: search for patterns of interest
• Pattern evaluation and knowledge presentation
  • Visualization, transformation, removing redundant patterns, etc.
• Use of discovered knowledge
• …

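The KDD steps can be sketched as a chain of small functions. This is only an illustration under stated assumptions: the records, the field names (`age`, `income`, `bought`), the income cut-off, and every helper function are hypothetical and do not come from the lecture.

```python
from collections import Counter

# Hypothetical raw records; None marks a missing value.
raw = [
    {"age": 25, "income": 30, "bought": "yes"},
    {"age": 47, "income": None, "bought": "no"},
    {"age": 52, "income": 80, "bought": "yes"},
    {"age": 33, "income": 45, "bought": "no"},
    {"age": 61, "income": 95, "bought": "yes"},
]

def clean(records):
    """Data cleaning: drop records that contain a missing value."""
    return [r for r in records if all(v is not None for v in r.values())]

def select(records, fields):
    """Data selection: keep only the task-relevant attributes."""
    return [{f: r[f] for f in fields} for r in records]

def transform(records):
    """Data transformation: discretize income into two bands (cut-off is arbitrary)."""
    return [
        {"income_band": "high" if r["income"] >= 50 else "low", "bought": r["bought"]}
        for r in records
    ]

def mine(records):
    """Data mining: count how often each (income_band, bought) pair occurs."""
    return Counter((r["income_band"], r["bought"]) for r in records)

def evaluate(patterns, min_count):
    """Pattern evaluation: keep only patterns seen at least min_count times."""
    return {p: c for p, c in patterns.items() if c >= min_count}

patterns = evaluate(
    mine(transform(select(clean(raw), ["income", "bought"]))),
    min_count=2,
)
print(patterns)
```

Each function stands for one step, so a step can be swapped out (for example, imputing missing values instead of dropping records) without touching the rest of the chain.
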
Architecture: Typical Data Mining System

[Figure: layered architecture, top to bottom: Graphical user interface; Pattern evaluation; Data mining engine; Database or data warehouse server; data cleaning & data integration, filtering; Databases and Data Warehouse. A Knowledge-base sits alongside these layers.]

What is Data Mining?
• Many definitions
  • Non-trivial extraction of implicit, previously unknown and potentially useful information from data
  • Exploration & analysis, by automatic or semi-automatic means, of large quantities of data in order to discover meaningful patterns

Origins of Data Mining
• Draws ideas from machine learning/AI, pattern recognition, statistics, and database systems
• Traditional techniques may be unsuitable due to data that is
  • Large-scale
  • High dimensional
  • Heterogeneous
  • Complex
  • Distributed
• A key component of the emerging field of data science and data-driven discovery

Q3. What types of data can be mined?
• Database data (RDBMS)
• Data warehouse
• Transactional data
• Other types of data:
  • Sequence data, data streams (cont.), spatial data (maps), engineering design data, hypertext, multimedia, web data, etc.
• Advanced database and information repository
  • Spatial and temporal data
  • Time-series data
  • Stream data
  • Multimedia database
  • Text databases & WWW

Database data (RDBMS): Relational -> tables
• RDBMS
  • Set of tables, each with rows (tuples) and columns (attributes)
  • While mining databases, we can search for trends or data patterns
• Example:
  • Analysing customer data to predict the credit risk of new customers (based on previous data)
  • Analysing sales data to detect any deviations

Data warehouse data
• Collection of data integrated from different sources, supporting querying and decision making on the data
• In a data warehouse, data is stored in a multidimensional structure (data cube) in which each dimension corresponds to an attribute

[Figure: Data Source-1, Data Source-2, and Data Source-3 feed the Data Warehouse; Client-1 and Client-2 use it for querying and analysis]

Transactional data
• Each record is called a transaction
  • sales,
  • flight booking,
  • user clicks on a web page
• A transaction has a transaction ID and a list of the items making up the transaction
• From a transaction database, we can mine frequent patterns
• Other types of data:
  • Sequence data, data streams (cont.), spatial data (maps), engineering design data, hypertext, multimedia, web data, etc.

Q4. Data Mining Functionalities
• Class/concept descriptions (data is always associated with classes/concepts):
  • Data characterisation:
    • Refers to the summary of the class/concept
    • Output -> general overview
  • Data discrimination:
    • Compares the common features of the classes
    • Output -> bar charts, curves, etc.
• Mining frequent patterns, associations, and correlations
  • Frequent patterns:
    • Things that are found most commonly in data
    • Frequent itemsets (data items/data objects)
    • Frequent subsequences
    • Frequent substructures
  • Association analysis (relationship):
    • A way of identifying the relation between various items
    • Example: used to determine sales of items that are frequently purchased together

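To make "frequent itemsets" concrete, the sketch below counts every pair of items across a handful of transactions and keeps the pairs that reach a minimum count. The transactions and the threshold are made up for illustration, and brute-force pair counting is only reasonable for tiny data; it is not presented as the method used in the course.

```python
from collections import Counter
from itertools import combinations

# Hypothetical market-basket transactions (not from the lecture).
transactions = [
    {"bread", "milk"},
    {"bread", "diaper", "beer"},
    {"milk", "diaper", "beer"},
    {"bread", "milk", "diaper"},
    {"bread", "milk", "beer"},
]

def frequent_pairs(transactions, min_count):
    """Return item pairs that occur together in at least min_count transactions."""
    counts = Counter()
    for items in transactions:
        # sorted() gives each pair one canonical order, e.g. ("beer", "bread").
        for pair in combinations(sorted(items), 2):
            counts[pair] += 1
    return {pair: c for pair, c in counts.items() if c >= min_count}

result = frequent_pairs(transactions, min_count=3)
print(result)
```

With these five transactions only bread and milk appear together three times, so that is the single pair that survives the threshold.
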
Q4. Data Mining Functionalities
• Correlation analysis:
  • Mathematical technique
  • Shows how strongly a pair of attributes is related
  • Example: tall people tend to weigh more
• Classification and regression for predictive analysis
  • Classification:
    • Process of finding a model that distinguishes data items
    • A decision tree can be used for classification
  • Regression:
    • Statistical methodology used for numeric prediction of missing values, based on previous data

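One common way to quantify how strongly two attributes are related is the Pearson correlation coefficient, r = Σ(x - x̄)(y - ȳ) / sqrt(Σ(x - x̄)² · Σ(y - ȳ)²). The lecture does not name a specific measure, so choosing Pearson here is an assumption, and the height/weight values below are invented to mirror the "tall people" example.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long numeric lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant attribute")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical measurements: height in cm, weight in kg.
heights_cm = [150, 160, 170, 180, 190]
weights_kg = [52, 58, 69, 77, 90]

r = pearson(heights_cm, weights_kg)
print(f"r = {r:.3f}")
```

A value of r close to +1 means the two attributes rise together, close to -1 means one falls as the other rises, and close to 0 means no linear relationship.
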
Q4. Data Mining Functionalities
• Cluster analysis (group)
  • Class label is unknown: group data to form new classes, e.g., cluster houses to find distribution patterns
  • Maximizing intra-class similarity & minimizing interclass similarity
• Outlier analysis
  • Outlier: a data object that does not comply with the general behavior of the data
  • Useful in fraud detection, rare events analysis
• Trend and evolution analysis
  • Trend and deviation: regression analysis
  • Sequential pattern mining, periodicity analysis

Data Mining Tasks …

[Figure: a data table surrounded by four tasks: Clustering, Predictive Modeling, Anomaly Detection, Association Rules]

Tid  Refund  Marital Status  Taxable Income  Cheat
1    Yes     Single          125K            No
2    No      Married         100K            No
3    No      Single          70K             No
4    Yes     Married         120K            No
5    No      Divorced        95K             Yes
6    No      Married         60K             No
7    Yes     Divorced        220K            No
8    No      Single          85K             Yes
9    No      Married         75K             No
10   No      Single          90K             Yes
11   No      Married         60K             No
12   Yes     Divorced        220K            No
13   No      Single          85K             Yes
14   No      Married         75K             No
15   No      Single          90K             Yes

Predictive Modeling: Classification
• Find a model for the class attribute as a function of the values of other attributes

Tid  Employed  Level of Education  # years at present address  Credit Worthy (class)
1    Yes       Graduate            5                           Yes
2    Yes       High School         2                           No
3    No        Undergrad           1                           No
4    Yes       High School         10                          Yes
…    …         …                   …                           …

[Figure: model for predicting credit worthiness, a decision tree]
Employed?
  No  -> No
  Yes -> Education?
    Graduate                 -> Number of years: > 3 yr -> Yes; < 3 yr -> No
    {High school, Undergrad} -> Number of years: > 7 yrs -> Yes; < 7 yrs -> No

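The credit-worthiness tree on this slide can be written as a few nested conditions and checked against the four training rows. The tree structure below is read off the flattened figure, so it is an assumption; the pairing of "Graduate" with the 3-year threshold is the one that agrees with the training labels (row 1 is a graduate with 5 years and is labelled Yes). The figure does not say what happens at exactly 3 or 7 years, so treating those as "No" is also an assumption.

```python
def credit_worthy(employed, education, years_at_address):
    """Decision tree as read from the slide's figure (structure is an assumption)."""
    if employed != "Yes":
        return "No"
    if education == "Graduate":
        # Boundary case (exactly 3 years) is not specified on the slide.
        return "Yes" if years_at_address > 3 else "No"
    # Remaining education levels: High School or Undergrad.
    return "Yes" if years_at_address > 7 else "No"

# Training rows from the slide: (Employed, Education, years, Credit Worthy).
training = [
    ("Yes", "Graduate", 5, "Yes"),
    ("Yes", "High School", 2, "No"),
    ("No", "Undergrad", 1, "No"),
    ("Yes", "High School", 10, "Yes"),
]

predictions = [credit_worthy(e, edu, yrs) for e, edu, yrs, _ in training]
correct = sum(pred == row[3] for pred, row in zip(predictions, training))
accuracy = correct / len(training)
print(predictions, accuracy)
```

The model reproduces all four training labels, which is what "a model for the class attribute as a function of the other attributes" means in practice; how well it generalizes can only be judged on separate test data.
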
Classification Example

Attribute types: Employed (categorical), Level of Education (categorical), # years at present address (quantitative), Credit Worthy (class)

Training set:
Tid  Employed  Level of Education  # years at present address  Credit Worthy
1    Yes       Graduate            5                           Yes
2    Yes       High School         2                           No
3    No        Undergrad           1                           No
4    Yes       High School         10                          Yes
…    …         …                   …                           …

Test set:
Tid  Employed  Level of Education  # years at present address  Credit Worthy
1    Yes       Undergrad           7                           ?
2    No        Graduate            3                           ?
3    Yes       High School         2                           ?
…    …         …                   …                           …

[Figure: Training Set -> Learn Classifier -> Model; the model is then applied to the Test Set]

Examples of Classification Task
• Classifying credit card transactions as legitimate or fraudulent
• Classifying land covers (water bodies, urban areas, forests, etc.) using satellite data
• Categorizing news stories as finance, weather, entertainment, sports, etc.
• Identifying intruders in cyberspace
• Predicting tumor cells as benign or malignant
• Classifying secondary structures of protein as alpha-helix, beta-sheet, or random coil

Classification: Application 1
• Fraud Detection
  • Goal: Predict fraudulent cases in credit card transactions.
  • Approach:
    • Use credit card transactions and the information on the account holder as attributes.
      • When does a customer buy, what do they buy, how often do they pay on time, etc.
    • Label past transactions as fraud or fair transactions. This forms the class attribute.
    • Learn a model for the class of the transactions.
    • Use this model to detect fraud by observing credit card transactions on an account.

Classification: Application 2
• Churn prediction for telephone customers
  • Goal: To predict whether a customer is likely to be lost to a competitor.
  • Approach:
    • Use detailed records of transactions with each of the past and present customers to find attributes.
      • How often the customer calls, where they call, what time of day they call most, their financial status, marital status, etc.
    • Label the customers as loyal or disloyal.
    • Find a model for loyalty.

From [Berry & Linoff] Data Mining Techniques, 1997

Classification: Application 3
• Sky Survey Cataloging
  • Goal: To predict class (star or galaxy) of sky objects, especially visually faint ones, based on the telescopic survey images (from Palomar Observatory).
    • 3000 images with 23,040 x 23,040 pixels per image.
  • Approach:
    • Segment the image.
    • Measure image attributes (features), 40 of them per object.
    • Model the class based on these features.
    • Success story: could find 16 new high red-shift quasars, some of the farthest objects that are difficult to find!

From [Fayyad, et.al.] Advances in Knowledge Discovery and Data Mining, 1996

Classifying Galaxies
Courtesy: http://aps.umn.edu

Class:
• Stages of formation: Early, Intermediate, Late

Attributes:
• Image features
• Characteristics of light waves received, etc.

Data Size:
• 72 million stars, 20 million galaxies
• Object Catalog: 9 GB
• Image Database: 150 GB

Regression
• Predict a value of a given continuous-valued variable based on the values of other variables, assuming a linear or nonlinear model of dependency.
• Extensively studied in the statistics and neural network fields.
• Examples:
  • Predicting sales amounts of a new product based on advertising expenditure.
  • Predicting wind velocities as a function of temperature, humidity, air pressure, etc.
  • Time series prediction of stock market indices.

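For the linear case with a single predictor, the model y = intercept + slope · x can be fitted by ordinary least squares: slope = Σ(x - x̄)(y - ȳ) / Σ(x - x̄)² and intercept = ȳ - slope · x̄. The sketch below applies this to the "sales from advertising expenditure" example; the numbers are invented and the function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(x, slope, intercept):
    """Apply the fitted line to a new x value."""
    return intercept + slope * x

# Hypothetical data: advertising spend vs. sales, in arbitrary units.
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [3.1, 4.9, 7.2, 8.8, 11.0]

slope, intercept = fit_line(ad_spend, sales)
print(f"sales ~ {intercept:.2f} + {slope:.2f} * ad_spend")
print(f"predicted sales at ad_spend=6: {predict(6.0, slope, intercept):.2f}")
```
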
Clustering
• Finding groups of objects such that the objects in a group will be similar (or related) to one another and different from (or unrelated to) the objects in other groups

[Figure: intra-cluster distances are minimized; inter-cluster distances are maximized]

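One widely used way to find such groups is k-means, which alternates between assigning each point to its nearest centroid and moving each centroid to the mean of its assigned points. The sketch below is a minimal version under stated assumptions: the 2-D points are invented, the initial centroids are picked by hand so the run is deterministic, and `kmeans` is a hypothetical helper rather than a library call.

```python
import math

def kmeans(points, initial_centroids, max_iter=100):
    """Minimal k-means (Lloyd's algorithm) on tuples of coordinates."""
    centroids = [tuple(c) for c in initial_centroids]
    clusters = [[] for _ in centroids]
    for _ in range(max_iter):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        new_centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    return centroids, clusters

# Hypothetical data: two visually separate groups of points.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centroids, clusters = kmeans(points, [points[0], points[3]])
print(centroids)
print(clusters)
```

The result depends on the initial centroids; practical implementations usually try several random starts and keep the best one.
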
Applications of Cluster Analysis
• Understanding
  • Customer profiling for targeted marketing
  • Group related documents for browsing
  • Group genes and proteins that have similar functionality
  • Group stocks with similar price fluctuations
• Summarization
  • Reduce the size of large data sets

Courtesy: Michael Eisen

[Figure: Clusters for Raw SST and Raw NPP (axes: longitude, latitude; legend: Land Cluster 1, Land Cluster 2, Ice or No NPP, Sea Cluster 1, Sea Cluster 2). Use of K-means to partition Sea Surface Temperature (SST) and Net Primary Production (NPP) into clusters that reflect the Northern and Southern Hemispheres.]

Clustering: Application 1
• Market Segmentation:
  • Goal: subdivide a market into distinct subsets of customers where any subset may conceivably be selected as a market target to be reached with a distinct marketing mix.
  • Approach:
    • Collect different attributes of customers based on their geographical and lifestyle related information.
    • Find clusters of similar customers.
    • Measure the clustering quality by observing buying patterns of customers in the same cluster vs. those from different clusters.

Clustering: Application 2
• Document Clustering:
  • Goal: To find groups of documents that are similar to each other based on the important terms appearing in them.
  • Approach: To identify frequently occurring terms in each document. Form a similarity measure based on the frequencies of different terms. Use it to cluster.

[Figure: Enron email dataset]

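A similarity measure based on term frequencies can be sketched with cosine similarity between term-count vectors. The slide does not say which measure is meant, so cosine similarity is an assumption here; the three short documents are invented, and splitting on whitespace is a deliberately crude stand-in for real tokenization.

```python
import math
from collections import Counter

def term_frequencies(text):
    """Count how often each lowercase, whitespace-separated term occurs."""
    return Counter(text.lower().split())

def cosine_similarity(a, b):
    """Cosine of the angle between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_a * norm_b)

# Hypothetical documents.
docs = {
    "d1": "stock market prices fall",
    "d2": "market prices rise as stock rallies",
    "d3": "team wins football match",
}
tf = {name: term_frequencies(text) for name, text in docs.items()}

sim_12 = cosine_similarity(tf["d1"], tf["d2"])
sim_13 = cosine_similarity(tf["d1"], tf["d3"])
print(f"d1 vs d2: {sim_12:.3f}")
print(f"d1 vs d3: {sim_13:.3f}")
```

Documents that share many terms score close to 1 and documents with no terms in common score 0, so the pairwise scores can be handed to any clustering routine that accepts a similarity or distance.
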
Association Rule Discovery: Definition
• Given a set of records, each of which contains some number of items from a given collection
  • Produce dependency rules which will predict occurrence of an item based on occurrences of other items.

TID  Items
1    Bread, Coke, Milk
2    Beer, Bread
3    Beer, Coke, Diaper, Milk
4    Beer, Bread, Diaper, Milk
5    Coke, Diaper, Milk

Rules Discovered:
{Milk} --> {Coke}
{Diaper, Milk} --> {Beer}

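The two discovered rules can be checked against the slide's five transactions using the standard definitions of support (the fraction of transactions containing every item of the rule) and confidence (among transactions containing the left-hand side, the fraction that also contain the right-hand side). The function names are hypothetical; the transaction data is taken from the table.

```python
# Transactions from the slide's table (TID 1 to 5).
transactions = [
    {"Bread", "Coke", "Milk"},
    {"Beer", "Bread"},
    {"Beer", "Coke", "Diaper", "Milk"},
    {"Beer", "Bread", "Diaper", "Milk"},
    {"Coke", "Diaper", "Milk"},
]

def support(itemset):
    """Fraction of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(lhs, rhs):
    """Estimated P(rhs | lhs): support of the whole rule over support of lhs."""
    return support(lhs | rhs) / support(lhs)

rules = [({"Milk"}, {"Coke"}), ({"Diaper", "Milk"}, {"Beer"})]
for lhs, rhs in rules:
    print(sorted(lhs), "-->", sorted(rhs),
          f"support={support(lhs | rhs):.2f}",
          f"confidence={confidence(lhs, rhs):.2f}")
```

Counting by hand gives the same figures: Milk and Coke occur together in 3 of 5 transactions and Milk occurs in 4, so {Milk} --> {Coke} has support 3/5 and confidence 3/4; {Diaper, Milk} --> {Beer} has support 2/5 and confidence 2/3.
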
Association Analysis: Applications
• Market-basket analysis
  • Rules are used for sales promotion, shelf management, and inventory management
• Telecommunication alarm diagnosis
  • Rules are used to find combinations of alarms that occur together frequently in the same time period
• Medical informatics
  • Rules are used to find combinations of patient symptoms and test results associated with certain diseases

Association Analysis: Applications
• An example subspace differential co-expression pattern from a lung cancer dataset
  • Three lung cancer datasets: [Bhattacharjee et al. 2001], [Stearman et al. 2005], [Su et al. 2007]
  • Enriched with the TNF/NF-κB signaling pathway, which is well known to be related to lung cancer
  • P-value: 1.4 × 10^-5 (6/10 overlap with the pathway)

[Fang et al PSB 2010]

Deviation/Anomaly/Change Detection
• Detect significant deviations from normal behavior
• Applications:
  • Credit card fraud detection
  • Network intrusion detection
  • Identify anomalous behavior from sensor networks for monitoring and surveillance.
  • Detecting changes in the global forest cover.

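A simple way to flag "significant deviations from normal behavior" in a single numeric attribute is the z-score: how many standard deviations a value lies from the mean. This is only one of many possible techniques and is not attributed to the lecture; the transaction amounts and the threshold of 2.0 are invented for illustration.

```python
import statistics

def zscore_outliers(values, threshold=2.0):
    """Return the values whose absolute z-score exceeds threshold."""
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)  # population standard deviation
    if stdev == 0:
        return []  # all values identical: nothing deviates
    return [v for v in values if abs(v - mean) / stdev > threshold]

# Hypothetical card transaction amounts with one unusually large purchase.
amounts = [20, 22, 19, 21, 23, 20, 18, 22, 21, 500]

print(zscore_outliers(amounts))
```

Because the mean and standard deviation are themselves pulled toward extreme values, this approach works best when anomalies are rare; more robust variants use the median instead.
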
Q5. Are All the “Discovered” Patterns Interesting?
• Data mining may generate thousands of patterns: not all of them are interesting
  • Suggested approach: human-centered, query-based, focused mining
• Interestingness measures
  • A pattern is interesting if it is easily understood by humans, valid on new or test data with some degree of certainty, potentially useful, novel, or validates some hypothesis that a user seeks to confirm
• Objective vs. subjective interestingness measures
  • Objective: based on statistics and structures of patterns, e.g., support, confidence, etc.
  • Subjective: based on user’s belief in the data, e.g., unexpectedness, novelty.

Q6. Data Mining: Classification Schemes
• Different views, different classifications
  • Kinds of data to be mined
  • Kinds of knowledge to be discovered
  • Kinds of techniques utilized
  • Kinds of applications adapted

Multi-Dimensional View of Data Mining
• Data to be mined
  • Relational, data warehouse, transactional, stream, object-oriented/relational, active, spatial, time-series, text, multi-media, heterogeneous, WWW
• Knowledge to be mined
  • Characterization, discrimination, association, classification, clustering, trend/deviation, outlier analysis, etc.
  • Multiple/integrated functions and mining at multiple levels
• Techniques utilized
  • Database-oriented, data warehouse (OLAP), machine learning, statistics, visualization, etc.
• Applications adapted
  • Retail, telecommunication, banking, fraud analysis, bio-data mining, stock market analysis, Web mining, etc.

OLAP Mining: Integration of Data Mining and Data Warehousing
• Coupling of data mining systems, DBMS, and data warehouse systems
• On-line analytical mining of data
  • Integration of mining and OLAP technologies
• Interactive mining of multi-level knowledge
  • Necessity of mining knowledge and patterns at different levels of abstraction.
• Integration of multiple mining functions
  • Characterized classification, first clustering and then association

Data Mining: Confluence of Multiple Disciplines

[Figure: Data Mining at the center, drawing on Database Systems, Statistics, Machine Learning, Visualization, Algorithm, and Other Disciplines]

Q7. Major Issues in Data Mining
• Mining methodology
  • Mining different kinds of knowledge from diverse data types, e.g., bio, stream, Web
  • Performance: efficiency, effectiveness, and scalability
  • Pattern evaluation: the interestingness problem
  • Incorporation of background knowledge
  • Handling noise and incomplete data
  • Parallel, distributed and incremental mining methods
  • Integration of the discovered knowledge with existing knowledge: knowledge fusion
• User interaction
  • Data mining query languages and ad-hoc mining
  • Expression and visualization of data mining results
  • Interactive mining of knowledge at multiple levels of abstraction
• Applications and social impacts
  • Domain-specific data mining & invisible data mining
  • Protection of data security, integrity, and privacy

Summary
• Data mining: discovering interesting patterns from large amounts of data
• A natural evolution of database technology, in great demand, with wide applications
• A KDD process includes data cleaning, data integration, data selection, transformation, data mining, pattern evaluation, and knowledge presentation
• Mining can be performed in a variety of information repositories
• Data mining functionalities: characterization, discrimination, association, classification, clustering, outlier and trend analysis, etc.
• Data mining systems and architectures
• Major issues in data mining

Where to Find References?
• More conferences on data mining
  • PAKDD (1997), PKDD (1997), SIAM-Data Mining (2001), (IEEE) ICDM (2001), etc.
• Data mining and KDD
  • Conferences: ACM-SIGKDD, IEEE-ICDM, SIAM-DM, PKDD, PAKDD, etc.
  • Journals: Data Mining and Knowledge Discovery, KDD Explorations
• Database systems
  • Conferences: ACM-SIGMOD, ACM-PODS, VLDB, IEEE-ICDE, EDBT, ICDT, DASFAA
  • Journals: ACM-TODS, IEEE-TKDE, JIIS, J. ACM, etc.
• AI & Machine Learning
  • Conferences: Machine learning (ML), AAAI, IJCAI, COLT (Learning Theory), etc.
  • Journals: Machine Learning, Artificial Intelligence, etc.
• Statistics
  • Conferences: Joint Stat. Meeting, etc.
  • Journals: Annals of Statistics, etc.
• Visualization
  • Conference proceedings: CHI, ACM-SIGGraph, etc.
  • Journals: IEEE Trans. Visualization and Computer Graphics, etc.
