Introduction to Bioinformatics and Databases
Central dogma of Molecular Biology
DNA → RNA → PROTEIN
Central dogma of Bioinformatics
SEQUENCE → STRUCTURE → FUNCTION
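The DNA → RNA → protein flow can be illustrated with a minimal Python sketch. It assumes the input is a coding-strand DNA sequence read in frame from its first base, and it uses the standard genetic code; the function names `transcribe` and `translate` are illustrative choices, not part of the original slides.

```python
# Standard genetic code, built from the conventional T, C, A, G codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def transcribe(dna):
    """Coding-strand DNA -> mRNA: replace every T with U."""
    return dna.upper().replace("T", "U")


def translate(mrna):
    """mRNA -> protein (one-letter codes), stopping at the first stop codon."""
    protein = []
    # Walk the sequence codon by codon; a trailing partial codon is ignored.
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3].upper().replace("U", "T")
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "*":  # "*" marks a stop codon in the table
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    dna = "ATGGCCAAGTAA"  # made-up example sequence
    mrna = transcribe(dna)
    print(mrna, translate(mrna))
```

Real pipelines would also handle reverse strands, alternative genetic codes and ambiguous bases, which this sketch deliberately leaves out.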
WHAT IS BIOINFORMATICS?
Can bioinformatics really be defined?
Bio: life / living organisms
Informatics: information systems or information flow
Bioinformatics: the application of information technology to the storage, retrieval and analysis of biological information, facilitated by the use of computers.
What is Bioinformatics? An integrative science
[Diagram: Bioinformatics at the intersection of Biology, Computer Science & Information Technology, Mathematics & Statistics, and Physics & Chemistry]
Information Transduction (Data converge flow)
✔ The development of bioinformatics started with the networking of computers and the accumulation of data on genes and proteins.
Definitions of Bioinformatics
• Bioinformatics is the field of science in which biology, computer science, and information technology merge into a single discipline.
• Bioinformatics is the science of managing and analyzing biological data using advanced computing techniques.
• Bioinformatics comprises the mathematical, statistical and computing methods that aim to solve biological problems using DNA and amino acid sequences and related information.
• Bioinformatics is a computer-assisted interface discipline dealing with the storage, management, access and processing of molecular biology data.
History of Bioinformatics
• 1933 Process of electrophoresis for separating proteins in a solution introduced by Tiselius
• 1951 Structure for the alpha-helix and beta-sheet proposed by Pauling and Corey
• 1953 Double helix model for DNA proposed by Watson and Crick
• 1965 Margaret Dayhoff's Atlas of Protein Sequences
• 1970 Details of the Needleman-Wunsch algorithm for sequence comparison published
• 1973 Announcement of the Brookhaven Protein Data Bank
• 1981 Smith-Waterman algorithm developed
• 1981 The concept of a sequence motif (Doolittle)
• 1982 GenBank Release 3 made public
• 1985 FASTP/FASTN: fast sequence similarity searching
• 1986 Term "Genomics" appeared for the first time
• 1986 PCR (Polymerase Chain Reaction) described by Kary Mullis and co-workers
• 1988 National Center for Biotechnology Information (NCBI) created at NIH/NLM
• 1988 EMBnet network for database distribution
• 1990 BLAST: fast sequence similarity searching by Altschul
• 1991 EST: expressed sequence tag sequencing
• 1993 Sanger Centre, Hinxton, UK
• 1994 EMBL European Bioinformatics Institute, Hinxton, UK
• 1996 Yeast genome completely sequenced
• 1997 PSI-BLAST
• 1999 Fly genome completely sequenced
• 2000 The A. thaliana genome (100 Mb)
• 2001 The human genome (3 giga base pairs) is published
• 2003 Human Genome Project completed
• 2004 The draft genome sequence of the brown Norway laboratory rat, Rattus norvegicus, was completed by the Rat Genome Sequencing Project Consortium
Biological Databases
A biological database is a collection of both experimental and theoretical data, organized so that its contents can be easily:
1. Accessed
2. Managed
3. Updated
4. Retrieved
Categories:
• Sequence databases
• Structure databases
• Specialized databases
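The four operations listed above (access, manage, update, retrieve) can be made concrete with a toy in-memory store. This is only a sketch under stated assumptions: the class `SequenceDatabase`, its method names and the `DEMO...` accession strings are hypothetical and do not correspond to the interface of GenBank or any other real database.

```python
class SequenceDatabase:
    """Toy in-memory sequence store keyed by a hypothetical accession string."""

    def __init__(self):
        self._records = {}

    def add(self, accession, organism, sequence):
        """Manage: insert a new record, refusing to overwrite an existing one."""
        if accession in self._records:
            raise KeyError(f"{accession} already exists")
        self._records[accession] = {
            "organism": organism,
            "sequence": sequence.upper(),
        }

    def get(self, accession):
        """Access: fetch a single record by its accession."""
        return self._records[accession]

    def update(self, accession, **fields):
        """Update: change selected fields of an existing record."""
        self._records[accession].update(fields)

    def search(self, organism):
        """Retrieve: list the accessions of all records for one organism."""
        return sorted(
            acc for acc, rec in self._records.items()
            if rec["organism"] == organism
        )


if __name__ == "__main__":
    db = SequenceDatabase()
    db.add("DEMO0001", "Homo sapiens", "atggcc")        # made-up record
    db.add("DEMO0002", "Rattus norvegicus", "atggca")   # made-up record
    db.update("DEMO0001", sequence="ATGGCCAAG")
    print(db.search("Homo sapiens"), db.get("DEMO0001")["sequence"])
```

Production databases add persistence, indexing, versioning and curation on top of these basic operations.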
Some genome sizes
⦿ HIV-2 virus: 9671 bp
⦿ Mycoplasma genitalium: 5.8 × 10^5 bp
⦿ Haemophilus influenzae: 1.83 × 10^6 bp
⦿ Saccharomyces cerevisiae: 1.21 × 10^7 bp
⦿ Drosophila melanogaster: 1.65 × 10^8 bp
⦿ Homo sapiens: 3.14 × 10^9 bp
⦿ Some amphibians: 8 × 10^10 bp
⦿ Amoeba dubia: 6.7 × 10^11 bp
Types of Databases
Based on their contents, databases can be divided into:
• Nucleotide sequence databases: GenBank
• Protein sequence databases: Swiss-Prot
• Macromolecular 3D structures: PDB, NDB
• Gene expression data: EST, STS
• Metabolic pathways: KEGG
• Complete genomes: TIGR
• Literature databases: PubMed
Omics Series
• Genomics – Gene identification & characterization
• Transcriptomics – Expression profiles of mRNA
• Proteomics – Functions & interactions of proteins
• Structural Genomics – Large-scale structure determination
• Cellinomics – Metabolic pathways, cell-cell interactions
• Pharmacogenomics – Genome-based drug design
Major Research Efforts & Applications
Applications of sequence analysis
⦿ Assembly of sequence data
⦿ Identification of functional elements in sequences
⦿ Gene prediction
⦿ Sequence comparison
⦿ Classification of proteins
⦿ Comparative genomics
⦿ RNA structure prediction
⦿ Protein structure prediction
⦿ Evolutionary history
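As one concrete instance of sequence comparison, the sketch below computes a global alignment score by Needleman-Wunsch dynamic programming, the algorithm listed in the history above. The scoring values (match +1, mismatch -1, linear gap -1) are assumptions chosen for illustration; the slides do not specify a scoring scheme, and the function name is hypothetical.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-1):
    """Return the optimal global alignment score of sequences a and b.

    Uses a linear gap penalty; the default scores are illustrative assumptions.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    # First column and first row: aligning a prefix against only gaps.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    # Each cell takes the best of: align two residues, gap in b, or gap in a.
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )
    return score[-1][-1]


if __name__ == "__main__":
    print(needleman_wunsch_score("ACGT", "ACT"))
```

This version returns only the score; recovering the alignment itself requires a traceback through the matrix, and searching for local rather than global similarity is the job of the Smith-Waterman variant.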
Why Structure Analysis?
◆ Structure is believed to be more closely related to the function of proteins than sequence is
◆ Predicting the function of proteins is a key challenge facing computational biology
◆ Many of the benefits of molecular biology will depend on predicting and understanding the functions of proteins
◆ The potential benefits of computationally predicting function are huge: it is faster and cheaper than experimentation
MOLECULAR VISUALIZATION
• Molecular visualization tools allow the user to load and view, in three-dimensional detail, the structure of molecules, both chemical and biological.
• They are a powerful teaching tool.
RASMOL
Roger Sayle, Glaxo Wellcome Research and Development, Stevenage, Hertfordshire, U.K.
• RasMol is a molecular graphics program intended for the visualisation of proteins, nucleic acids and small molecules.
Thrust areas of Bioinformatics
❑ Genomics
❑ Proteomics
❑ Pharmacogenomics
❑ Drug design
❑ Medical informatics
❑ Agro-informatics
❑ Phylogeny
❑ DNA microarrays
❑ Neural networks
❑ Large genome projects
Challenges of working in bioinformatics
⦿ Need to feel comfortable in an interdisciplinary area
⦿ Dependence on others for primary data
⦿ Need to address important problems in both biology and computer science
Conclusion
• First there was in vivo biology, then came in vitro, and now the discipline is in silico.
