Introduction to bioinformatics and databases .pptx

Central dogma of Molecular Biology
DNA → RNA → PROTEIN
Central dogma of Bioinformatics
SEQUENCE → STRUCTURE → FUNCTION
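
The DNA to RNA to protein flow can be illustrated with a minimal Python sketch. The codon table here is deliberately partial (only the codons needed for a made-up example sequence), and the function names are illustrative, not taken from any library.

```python
# Partial standard genetic code: only the codons used by the toy example below.
CODON_TABLE = {
    "AUG": "M",  # methionine (start codon)
    "GCC": "A",  # alanine
    "UGG": "W",  # tryptophan
    "AAA": "K",  # lysine
    "UAA": "*",  # stop codon
}


def transcribe(coding_dna: str) -> str:
    """Return the mRNA for a coding-strand DNA sequence (T is replaced by U)."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Translate mRNA codon by codon until a stop codon is reached."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        # Raises KeyError for codons missing from the partial table.
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


dna = "ATGGCCTGGAAATAA"        # hypothetical example sequence
mrna = transcribe(dna)         # "AUGGCCUGGAAAUAA"
protein = translate(mrna)      # "MAWK"
print(dna, "->", mrna, "->", protein)
```

A real translation would use the full 64-codon table; the partial table keeps the sketch short.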

WHAT IS BIOINFORMATICS?

Can Bioinformatics really be defined?

WHAT IS BIOINFORMATICS?
Bio: life / living organisms
Informatics: information systems or information flow
Bioinformatics: the application of information technology to the storage, retrieval and analysis of biological information, facilitated by the use of computers

What is Bioinformatics?
Mathematics and Statistics
Biology
Computer Science

Integrative Science
Computer Science & Information Technology
Biology
Mathematics & Statistics
Physics & Chemistry
Bioinformatics
Information Transduction (Data converge flow)

✔ The development of Bioinformatics started with the networking of computers and the accumulation of data on genes and proteins.

Definitions of Bioinformatics
• Bioinformatics is the field of science in which biology, computer science, and information technology merge into a single discipline.
• It is the science of managing and analyzing biological data using advanced computing techniques.

• It comprises the mathematical, statistical and computing methods that aim to solve biological problems using DNA and amino acid sequences and related information.
• Bioinformatics is a computer-assisted interface discipline dealing with the storage, management, access and processing of molecular biology data.

History of Bioinformatics
• 1933 Process of electrophoresis for separating proteins in a solution introduced by Tiselius
• 1951 Structure for the alpha-helix and beta-sheet proposed by Pauling and Corey
• 1953 Double helix model for DNA proposed by Watson and Crick
• 1965 Margaret Dayhoff's Atlas of protein sequences
• 1970 Details of the Needleman-Wunsch algorithm for sequence comparison published
• 1973 Announcement of the Brookhaven Protein Data Bank
• 1981 Smith-Waterman algorithm developed

History of Bioinformatics (contd.)
• 1981 The concept of a sequence motif (Doolittle)
• 1982 GenBank Release 3 made public
• 1985 FASTP/FASTN: fast sequence similarity searching
• 1986 Term "Genomics" appeared for the first time
• 1986 PCR (Polymerase Chain Reaction) described by Kary Mullis and co-workers
• 1988 National Center for Biotechnology Information (NCBI) created at NIH/NLM
• 1988 EMBnet network for database distribution
• 1990 BLAST: fast sequence similarity searching by Altschul and co-workers
• 1991 EST: expressed sequence tag sequencing
• 1993 Sanger Centre, Hinxton, UK

History of Bioinformatics (contd.)
• 1994 EMBL European Bioinformatics Institute, Hinxton, UK
• 1996 Yeast genome completely sequenced
• 1997 PSI-BLAST
• 1999 Fly genome completely sequenced
• 2000 The A. thaliana genome (100 Mb)
• 2001 Draft sequence of the human genome (3 gigabase pairs) published
• 2003 Human Genome Project completed
• 2004 The draft genome sequence of the Brown Norway laboratory rat, Rattus norvegicus, was completed by the Rat Genome Sequencing Project Consortium

Biological Databases
A biological database is a collection of both experimental and theoretical data that is organized so that its contents can be easily
1. Accessed
2. Managed
3. Updated
4. Retrieved
Sequence Database
Structure Database
Specialized Database
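
The four operations listed above (access, manage, update, retrieve) can be sketched with Python's built-in sqlite3 module. This is only a toy illustration: the table name, column names, accession numbers and sequences are all hypothetical and do not reflect the schema of any real biological database.

```python
import sqlite3

# In-memory database so the sketch leaves no files behind.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sequences ("
    "accession TEXT PRIMARY KEY, organism TEXT, sequence TEXT)"
)

# Managed: add new records (hypothetical accession numbers and sequences).
conn.executemany(
    "INSERT INTO sequences VALUES (?, ?, ?)",
    [
        ("DEMO0001", "Example organism A", "ATGGCC"),
        ("DEMO0002", "Example organism B", "ATGAAA"),
    ],
)

# Updated: revise an existing record.
conn.execute(
    "UPDATE sequences SET sequence = ? WHERE accession = ?",
    ("ATGGCCTGG", "DEMO0001"),
)

# Accessed / retrieved: look up a record by its accession number.
row = conn.execute(
    "SELECT organism, sequence FROM sequences WHERE accession = ?",
    ("DEMO0001",),
).fetchone()
count = conn.execute("SELECT COUNT(*) FROM sequences").fetchone()[0]
conn.close()

print(row, count)
```

Public databases are far larger and are usually queried through web interfaces or dedicated tools, but the underlying idea of keyed, updatable records is the same.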

Some genome sizes
⦿ HIV2 virus 9671 bp
⦿ Mycoplasma genitalium 5.8 × 10⁵ bp
⦿ Haemophilus influenzae 1.83 × 10⁶ bp
⦿ Saccharomyces cerevisiae 1.21 × 10⁷ bp
⦿ Drosophila melanogaster 1.65 × 10⁸ bp
⦿ Homo sapiens 3.14 × 10⁹ bp
⦿ Some amphibians 8 × 10¹⁰ bp
⦿ Amoeba dubia 6.7 × 10¹¹ bp
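
To get a feel for these magnitudes, the short calculation below uses a subset of the sizes listed above. It assumes an idealised encoding of 2 bits per base (four letters, no ambiguity codes or metadata); that encoding is an assumption made for illustration, not something stated in the slides.

```python
# Genome sizes in base pairs, copied from the list above (subset).
GENOME_SIZES_BP = {
    "HIV2 virus": 9671,
    "Haemophilus influenzae": 1.83e6,
    "Saccharomyces cerevisiae": 1.21e7,
    "Drosophila melanogaster": 1.65e8,
    "Homo sapiens": 3.14e9,
    "Amoeba dubia": 6.7e11,
}


def megabytes_at_2_bits_per_base(base_pairs: float) -> float:
    """Storage needed at 2 bits per base, in megabytes (10^6 bytes)."""
    return base_pairs * 2 / 8 / 1e6


human_mb = megabytes_at_2_bits_per_base(GENOME_SIZES_BP["Homo sapiens"])
amoeba_to_human = GENOME_SIZES_BP["Amoeba dubia"] / GENOME_SIZES_BP["Homo sapiens"]

print(f"Human genome: about {human_mb:.0f} MB at 2 bits per base")
print(f"Amoeba dubia / Homo sapiens size ratio: about {amoeba_to_human:.0f}x")
```

The ratio shows that genome size does not track organismal complexity: the amoeba figure listed above is roughly two hundred times the human one.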

Types Of Databases
Based on their contents, databases can be divided into:
• Nucleotide sequence databases: GenBank
• Protein sequence databases: Swiss-Prot
• Macromolecular 3D structures: PDB, NDB
• Gene expression data: EST, STS
• Metabolic pathways: KEGG
• Complete genomes: TIGR
• Literature databases: PubMed

Omics Series
• Genomics – Gene identification & characterization
• Transcriptomics – Expression profiles of mRNA
• Proteomics – Functions & interactions of proteins
• Structural Genomics – Large-scale structure determination
• Cellinomics – Metabolic pathways, cell-cell interactions
• Pharmacogenomics – Genome-based drug design
Major Research Efforts & Applications

Applications of sequence analysis
⦿ Assembly of sequence data
⦿ Identification of functional elements in sequences
⦿ Gene prediction
⦿ Sequence comparison
⦿ Classification of proteins
⦿ Comparative genomics
⦿ RNA structure prediction
⦿ Protein structure prediction
⦿ Evolutionary history
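
Sequence comparison, one of the applications listed above, is classically done with dynamic programming. The sketch below computes a global alignment score in the style of the Needleman-Wunsch algorithm named in the history timeline. The scoring values (match +1, mismatch -1, gap -1) and the function name are arbitrary choices for illustration, and only the score is returned, not the alignment itself.

```python
def global_alignment_score(a: str, b: str,
                           match: int = 1,
                           mismatch: int = -1,
                           gap: int = -1) -> int:
    """Return the optimal global alignment score of sequences a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    # First column and first row: a prefix aligned entirely against gaps.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    # Each cell takes the best of: align two residues, or open a gap
    # in one sequence or the other.
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,  # residue aligned with residue
                score[i - 1][j] + gap,       # gap in b
                score[i][j - 1] + gap,       # gap in a
            )
    return score[-1][-1]


print(global_alignment_score("GATTACA", "GATTACA"))  # 7: seven matches
print(global_alignment_score("ACGT", "ACT"))         # 2: three matches, one gap
```

Practical tools use substitution matrices and more elaborate gap penalties, and heuristic programs such as BLAST trade exactness for speed on large databases.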

Structure Analysis: Why?
◆ Structure is believed to be more closely related to the function of proteins than sequence is
◆ Predicting the function of proteins is a key challenge facing computational biology
◆ Many of the benefits of molecular biology will depend on predicting and understanding the functions of proteins
◆ The potential benefits of computationally predicting function are huge: it is faster and cheaper than experimentation

MOLECULAR VISUALIZATION
• Molecular visualization tools allow the user to load and view, in three-dimensional detail, the structure of molecules, both chemical and biological.
• They are a powerful teaching tool.

RASMOL
Roger Sayle
Glaxo Wellcome Research and
Development
Stevenage, Hertfordshire, U.K
• RasMol is a molecular graphics program intended for
the visualisation of proteins, nucleic acids and small
molecules.

Thrust areas of Bioinformatics
❑ Genomics
❑ Proteomics
❑ Pharmacogenomics
❑ Drug Designing
❑ Medical Informatics
❑ Agro Informatics
❑ Phylogeny
❑ DNA Microarrays
❑ Neural Networks
❑ Large Genome projects

Challenges of working in bioinformatics
⦿ Need to feel comfortable in an interdisciplinary area
⦿ Depend on others for primary data
⦿ Need to address important biological and computer science problems

Conclusion
• First there was in vivo biology, then came in vitro, and the discipline now is in silico.
