DATA MINING
Presented by: Shweta Kumari
M.Sc. Bioinformatics
1st semester
Roll no. 21
Central University of Bihar
CONTENTS
1. Introduction
2. Conditions of Data Mining
3. Properties of Data Mining
4. Objectives of Data Mining
5. Techniques of Data Mining
6. Applications of Data Mining in Bioinformatics
7. Conclusion & Challenges
INTRODUCTION
• Data mining refers to extracting, or mining, knowledge
from large amounts of data.
• It digs out hidden characteristics in the data in order to
predict future trends.
Conditions of Data Mining
• The data set should be very large.
• In general, the larger the data set, the more
accurate the predictions.
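
One way to see why more data helps, in the simple case of estimating a mean from n independent samples: the standard error of the estimate is sigma / sqrt(n), so it shrinks as n grows. The sketch below is illustrative only. The function name, the spread and the sample sizes are assumptions, not part of the original slides, and real mining tasks do not always improve this cleanly.

```python
import math

def standard_error(sigma, n):
    """Standard error of a sample mean: sigma / sqrt(n)."""
    if n <= 0:
        raise ValueError("n must be positive")
    return sigma / math.sqrt(n)

# Hypothetical spread (sigma = 1.0) and sample sizes, chosen for illustration.
errors = {n: standard_error(1.0, n) for n in (100, 10_000, 1_000_000)}
```

Under these assumptions, a 100-fold increase in the number of samples makes the standard error 10 times smaller.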
Properties of Data Mining
• Automatic discovery of patterns
• Prediction of likely outcomes
• Creation of actionable information
• Focus on large data sets and databases
Objectives of Data Mining
• To predict future trends
• To find hidden trends, characteristics and patterns
Techniques of Data Mining
• ASSOCIATIVE LEARNING: Techniques that learn how the outcome of
one entity is influenced by another.
• ARTIFICIAL NEURAL NETWORK: A computational model inspired by the
animal central nervous system, capable of machine learning as
well as pattern recognition.
• CLUSTERING: The task of discovering groups and structures of items in
the data that are similar in some way, without using known
structure in the data (that is, without predefined labels).
• GENETIC ALGORITHM: An optimization technique that mimics the process
of evolution, viz. inheritance, mutation, selection and crossover.
• HIDDEN MARKOV MODEL: Provides a mathematical framework for
multiple sequence alignment and for finding periodic patterns in a
single sequence.
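
As an illustration of the clustering idea above, here is a minimal sketch of k-means, one common clustering algorithm (the slides do not name a specific one). The toy points, the choice of two starting centroids and the function name are assumptions made for this example.

```python
from math import dist

def k_means(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then move
    each centroid to the mean of the points assigned to it."""
    centroids = [tuple(c) for c in centroids]
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            # Index of the centroid closest to p (Euclidean distance).
            nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                # Mean of the members, computed axis by axis.
                new_centroids.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                new_centroids.append(centroids[i])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # assignments have stabilised
        centroids = new_centroids
    return centroids, clusters

# Hypothetical toy data: two visibly separate groups of 2-D points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, clusters = k_means(points, centroids=[points[0], points[3]])
```

No labels are supplied: the two groups emerge only from the distances between points. The result depends on the starting centroids, so practical implementations usually try several initialisations.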
Applications of Data Mining in Bioinformatics
• Gene finding
• Protein function domain detection
• Function motif detection
• Protein function inference
• Disease diagnosis
• Disease prognosis
• Disease treatment optimization
• Protein subcellular location prediction
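
To make one of these applications concrete, the sketch below shows the simplest form of motif detection: scanning a sequence for a known consensus pattern. The DNA sequence, the pattern and the function name are hypothetical examples, not taken from the slides. Discovering previously unknown motifs generally needs statistical methods beyond this.

```python
import re

def find_motif(sequence, pattern):
    """Return the 0-based start position of every match of `pattern`,
    including overlapping matches."""
    # The lookahead matches at a position without consuming characters,
    # so overlapping occurrences are reported as well.
    return [m.start() for m in re.finditer(f"(?=({pattern}))", sequence)]

# Hypothetical sequence and a TATA-box-like pattern, for illustration only.
sequence = "GGCTATAAAGGCTATATAAGC"
positions = find_motif(sequence, "TATA[AT]A")
```

Here `positions` is `[3, 12]`: the pattern matches "TATAAA" starting at index 3 and "TATATA" starting at index 12.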
Conclusion & Challenges
Bioinformatics is data rich but lacks a
comprehensive theory of life's organization
at the molecular level. The extensive databases of
biological information create both challenges
and opportunities for the development of novel
KDD (Knowledge Discovery in Databases)
methods.
Thank You
