Data Mining Summary

A brief description of all the terminologies used under data mining.

Uploaded by

natemutua01
Copyright
© All Rights Reserved

Data Mining Explained: A Summary

This document summarizes data mining: what it is, the standard processes used to carry it out,
its main tasks, and example applications. The key takeaways follow.

What is Data Mining?

• Data mining is the process of uncovering valuable insights, patterns, and relationships in
  large datasets using various techniques and algorithms.
• It helps extract meaningful information that is often hidden within vast amounts of data.
• It is a crucial tool for decision-making, prediction, and optimization across many fields.

Examples of Data Mining Applications:

• Customer profiling: Identifying profitable customer segments.
• Targeting: Determining characteristics of profitable customers acquired by competitors.
• Market-basket analysis: Discovering product purchase patterns for product positioning
  and cross-selling.
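
Market-basket analysis can be illustrated with a small sketch. The Python below counts how often item pairs are bought together (support) and how often one item appears given another (confidence). The transactions, item names, and function names are hypothetical, chosen only for illustration; real projects typically use a dedicated algorithm such as Apriori on far larger data.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions, invented for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk", "eggs"},
]

def pair_support(baskets):
    """Return the fraction of baskets that contain each item pair."""
    counts = Counter()
    for basket in baskets:
        # Sort so each pair is always stored in the same order.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    total = len(baskets)
    return {pair: count / total for pair, count in counts.items()}

def confidence(baskets, antecedent, consequent):
    """Estimate how often `consequent` appears in baskets containing `antecedent`."""
    with_antecedent = [b for b in baskets if antecedent in b]
    if not with_antecedent:
        return 0.0
    both = sum(1 for b in with_antecedent if consequent in b)
    return both / len(with_antecedent)

support = pair_support(transactions)
print(support[("bread", "milk")])               # share of baskets with both items
print(confidence(transactions, "bread", "milk"))  # share of bread baskets that also hold milk
```

A pair with high support and high confidence is a candidate for cross-selling or shelf placement, although what counts as "high" is a business decision rather than something the code can settle.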

Data Mining Process:

• CRISP-DM (Cross-Industry Standard Process for Data Mining): A six-phase industry
  standard process for data mining projects.
  o Phases:
    ▪ Business Understanding: Defining business objectives and project goals.
    ▪ Data Understanding: Assessing data requirements and quality.
    ▪ Data Preparation: Cleaning, transforming, and preparing data for analysis.
    ▪ Modeling: Applying data mining tools and models to identify patterns.
    ▪ Evaluation: Evaluating model results in the context of business objectives.
    ▪ Deployment: Implementing models for prediction or identification purposes.
• SEMMA (Sample, Explore, Modify, Model, Assess): An alternative data mining
  process focusing on modeling.
  o Phases:
    ▪ Sample: Extracting a representative portion of the data for analysis.
    ▪ Explore: Searching for unexpected trends and anomalies in the data.
    ▪ Modify: Creating, selecting, and transforming variables for model building.
    ▪ Model: Building models that explain data patterns.
    ▪ Assess: Evaluating the model's usefulness and reliability.
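
The five SEMMA phases can be sketched as a chain of small functions. The Python below is a minimal illustration, not a definitive implementation: the data set, the function names, the choice of a least-squares line as the model, and the hold-out split used in the assessment step are all assumptions made for this example.

```python
import random
from statistics import mean

# Hypothetical records (x, y) following y = 2x + 1; invented for illustration.
records = [(x, 2 * x + 1) for x in range(100)]

def sample(data, k, seed=0):
    """Sample: draw k records at random (seeded so the run is repeatable)."""
    return random.Random(seed).sample(data, k)

def explore(data):
    """Explore: compute simple summary statistics of the input variable."""
    xs = [x for x, _ in data]
    return {"min_x": min(xs), "max_x": max(xs), "mean_x": mean(xs)}

def modify(data, max_x):
    """Modify: rescale x by the largest training value."""
    return [(x / max_x, y) for x, y in data]

def model(data):
    """Model: fit a least-squares line y = slope * x + intercept."""
    xs = [x for x, _ in data]
    ys = [y for _, y in data]
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in data)
    slope = sxy / sxx
    return slope, my - slope * mx

def assess(params, data):
    """Assess: mean absolute error of the fitted line on unseen records."""
    slope, intercept = params
    return mean(abs(y - (slope * x + intercept)) for x, y in data)

subset = sample(records, 40)
train, holdout = subset[:30], subset[30:]  # hold-out split is an assumption
stats = explore(train)
train_m = modify(train, stats["max_x"])
holdout_m = modify(holdout, stats["max_x"])
params = model(train_m)
error = assess(params, holdout_m)
print(stats, params, error)
```

Because the invented data is perfectly linear, the assessed error is close to zero; with real data the Assess phase is where a poor model would show up and send the analyst back to the earlier phases.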

Comparison of CRISP-DM and SEMMA:

• CRISP-DM: More comprehensive, emphasizes business understanding and iterative
  processes.
• SEMMA: More focused on the modeling process itself.
• The choice depends on project needs and organizational preferences.

Data Mining Tasks:

• Classification: Assigning data points to predefined categories (e.g., email spam
  detection).
• Clustering: Grouping data points with similar characteristics (e.g., customer
  segmentation).
• Regression: Predicting a real-valued variable based on other variables (e.g., financial
  forecasting).
• Time Series Analysis: Examining the value of an attribute over time (e.g., stock price
  prediction).
• Prediction: Forecasting future data states based on past and current data (e.g., flood
  prediction).
• Summarization: Creating concise summaries of data subsets (e.g., financial statements).
• Association Rules: Discovering relationships between data items (e.g., market basket
  analysis).
• Sequence Discovery: Identifying sequential patterns in data (e.g., web browsing
  analysis).
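
The difference between two of the tasks above, clustering and classification, can be shown in a few lines. In the Python sketch below, clustering groups unlabeled values with a simple one-dimensional k-means, while classification assigns a predefined label using the nearest labeled example. The data, the starting centers, and the function names are hypothetical and chosen only to keep the example small.

```python
def kmeans_1d(values, centers, iterations=10):
    """Clustering: group values around the nearest center, then update the centers."""
    centers = list(centers)
    groups = [[] for _ in centers]
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Keep the old center if a group ends up empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups

def nearest_neighbor_label(labeled, query):
    """Classification: return the label of the closest labeled example."""
    _, label = min(labeled, key=lambda pair: abs(pair[0] - query))
    return label

# Hypothetical monthly spend per customer (no labels available).
spend = [10, 12, 11, 95, 100, 105]
centers, groups = kmeans_1d(spend, centers=[0, 50])
print(centers, groups)

# Hypothetical count of suspicious words per email, with known labels.
labeled_emails = [(1, "ham"), (2, "ham"), (8, "spam"), (9, "spam")]
print(nearest_neighbor_label(labeled_emails, 7))
```

Clustering discovers the two spending groups without being told they exist, whereas classification needs labeled examples up front; that is the practical distinction between the two tasks.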

Conclusion:

Data mining helps businesses and organizations use their data for informed decision-making,
improved efficiency, and competitive advantage. Understanding the processes (CRISP-DM and
SEMMA), tasks, and applications summarized above makes it easier to apply the technique
effectively.
