Pcmd bioinformatics-lecture i

1
Molecular Medicine in Collaboration with
Bioinformatics
M. Kamran Azim, Ph.D.
International Center for Chemical and Biological Sciences
H.E.J. Research Institute of Chemistry,
Dr. Panjwani Center for Molecular Medicine and Drug Research
University of Karachi

2
What is Bioinformatics?
Bioinformatics is the science of storing, retrieving and
analyzing large amounts of biological information.
It cuts across many disciplines, including biology,
computer science and mathematics. (as defined by EBI)

3
Application of Bioinformatics in
Molecular Medicine
Molecular basis of pathogenicity;
e.g. Amyloid protein in neurodegenerative
diseases
Novel targets of therapeutic
intervention;
e.g. Caspase inhibitors in diseases
characterized by tissue degradation
Molecular Diagnostics;
e.g. Bird Flu
Host-pathogen interaction;
e.g. Bacterial adherence factors
Novel Research tools;
e.g. GFP-based techniques

4
How can Bioinformatics support
Molecular Medicine?
• Genome-level sequence analysis of medically important
organisms in order to:
gain comprehensive knowledge of their life cycles,
characterize disease-causing factors,
identify new targets for therapeutic intervention
• Development of bioinformatics resources such as novel
algorithms, specialized databases and Java-based
tools for application in genomics and proteomics.

5
Catalysts for Bioinformatics
• Large-scale DNA/genome sequencing projects have led
to an explosion of DNA and protein sequence data.
• Developments in computer technology, including
computerized databases for storing, retrieving and
comparing sequences, and computer graphics for
displaying and manipulating three-dimensional
structures.

6
http://www.ncbi.nlm.nih.gov/Genbank/genbankstats.html
The explosion in sequence information
billions of bases from over 100,000 species

7
Bioinformatics
and
Molecular basis of life

8
Central paradigms of
Molecular Biology and Bioinformatics
DNA → RNA → Protein → Function
Genetic Information → Protein → Function → Cell → Tissues → Organism → Population
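
The DNA → RNA → Protein flow can be illustrated with a few lines of Python. This is only a minimal sketch, assuming the standard genetic code and a made-up coding-strand sequence chosen for illustration; the function names `transcribe` and `translate` are hypothetical and not part of the lecture.

```python
from itertools import product

# Standard genetic code, written in the conventional U, C, A, G codon order.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(dna):
    """Return the mRNA for a DNA coding strand (T becomes U)."""
    return dna.upper().replace("T", "U")


def translate(rna):
    """Translate an mRNA codon by codon, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(rna) - 2, 3):
        amino_acid = CODON_TABLE[rna[i:i + 3]]
        if amino_acid == "*":  # stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


# Illustrative sequence only (not taken from any database).
example_dna = "ATGAAAGCGATGCGTGCGAACTAA"
example_protein = translate(transcribe(example_dna))
```

Here `example_protein` holds the one-letter amino acid string encoded by the toy sequence; real sequences would also need handling for ambiguous bases and alternative reading frames.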

10
DNA for Information
Protein for Execution
Bioinformatics as
the Science of Sequence

11
Molecular Biology in Urdu poetry

21
Frederick Sanger and the Science of Sequence
at MRC, Cambridge University
• First Nobel Prize (1958): awarded for developing
methods to determine the order (sequence) of the
building blocks of the protein insulin.
• Second Nobel Prize (1980): awarded for developing
several ever-improving methods to sequence
nucleic acids (DNA and RNA).

22
Prof. Zafar H. Zaidi and Bioinformatics
• Pioneered Protein Chemistry:
Protein Sequencing;
Sequence analysis
(1975-2001)
• Initiated Bioinformatics:
Protein Structure Prediction;
Homology modeling
(1991-2001)

23
Scope of topics
• Biological databases (utilization, development and
integration etc.)
• Analyses of nucleotide and protein sequence information
• Analyses of 3D structural data of macromolecules
• Assessment of how small molecules interact with
macromolecules in biological systems
• Studies on networks of protein-protein interactions
• Simulation of biological processes
• More

25
Bioinformatics Resources
Sequence Databases
• 1960s: The first sequences to be collected
were those of proteins, by Margaret Dayhoff
at the NBRF, Washington, USA.
[Protein sequence atlas; PIR]
• 1970s: The first DNA sequence databases were
(a) GenBank at Los Alamos National
Laboratory, New Mexico, USA
(b) EMBL at the European Molecular Biology
Laboratory, Heidelberg, Germany.

26
Primary Bioinformatics Databases
• DNA sequence databases:
GenBank, EMBL and DDBJ
• Genome center databases:
Sanger Center, TIGR
• Protein sequence databases:
SwissProt, PIR, UniProt
• Protein 3D structure databases:
PDB, SCOP, CATH
• Specialized databases:
MEROPS, Protein Kinase Resource

27
Accessing Bioinformatics Databases
• ENTREZ: a window-based program with
a web-based interface developed at
the NCBI, USA.
• SRS: a similar service at the EBI, UK.
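
Besides the interactive interface, Entrez can also be queried from a script through NCBI's E-utilities web service, which the slides do not cover. The sketch below only builds an `efetch` request URL for a FASTA record and makes no network call; the helper name `efetch_fasta_url` and the accession used are illustrative assumptions.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities service.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def efetch_fasta_url(accession, db="protein"):
    """Build an efetch URL that requests one record in FASTA text format."""
    params = {
        "db": db,            # Entrez database to query, e.g. "protein" or "nucleotide"
        "id": accession,     # accession or UID of the record
        "rettype": "fasta",  # return the sequence in FASTA format
        "retmode": "text",   # plain text rather than XML
    }
    return f"{EUTILS_BASE}/efetch.fcgi?{urlencode(params)}"


# Example accession chosen only for illustration; substitute any identifier.
example_url = efetch_fasta_url("NP_000198")
```

The resulting URL could be opened in a browser or fetched with `urllib.request.urlopen`; NCBI's usage guidelines on request frequency should be checked before running anything in bulk.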

29
Specialized databases useful in Molecular Medicine
• OMIM: Online Mendelian Inheritance in Man, a catalog
of human genes and genetic disorders.
• ENSEMBL: designed to allow free access to all the genetic
information available about the human genome.
• Human Gene Mutation DB: contains sequences and
phenotypes of human disease-causing mutations.
• KEGG: computerizes knowledge of molecular interactions,
namely metabolic pathways, regulatory pathways and molecular
assemblies.
• dbSNP: Single Nucleotide Polymorphisms DB.
• GeneCards: an integrated DB of human genes that includes
automatically mined genomic, proteomic and transcriptomic
information, as well as orthologies, disease relationships, SNPs,
gene expression, gene function etc.

31
Sequence Analysis
Sequence Analysis Programs
• As more DNA sequences became available in the late
1970s, interest in developing computer programs to
analyze the sequences also increased.
• In the early 1980s, the Genetics Computer Group (GCG)
was started at the University of Wisconsin, USA, offering
a set of programs for sequence analysis.

32
Sequence Analysis
Methods for Comparing Sequences
• The Dot Matrix method (DOTPLOT, COMPARE)
• Dynamic programming matrices
• Word or k-tuple methods (FASTA, BLAST)
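
The word (k-tuple) idea can be sketched in a few lines: index every word of length k in one sequence, then look up the words of the other. This is a simplified illustration of the first step only, not the FASTA or BLAST algorithm itself; the function names and the choice k=2 are assumptions.

```python
from collections import defaultdict


def index_words(seq, k):
    """Map every word of length k in seq to the positions where it starts."""
    index = defaultdict(list)
    for i in range(len(seq) - k + 1):
        index[seq[i:i + k]].append(i)
    return index


def shared_words(query, subject, k=2):
    """Return (query_pos, subject_pos, word) for every word found in both."""
    subject_index = index_words(subject, k)
    hits = []
    for i in range(len(query) - k + 1):
        word = query[i:i + k]
        for j in subject_index.get(word, []):
            hits.append((i, j, word))
    return hits


hits = shared_words("KAMRAN", "KEMRAN", k=2)
# Hits with the same offset (query_pos - subject_pos) lie on one diagonal,
# which is where a heuristic search would then try to extend an alignment.
```

Because only exact word matches are looked up, this step is much cheaper than filling a full dynamic programming matrix, which is the motivation for word methods in database searches.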

33
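Sequence analysis by DotPlots
(* = identical residues, . = no match)

Alignment: KAMRAN vs KAMRAN
  K A M R A N
K * . . . . .
A . * . . * .
M . . * . . .
R . . . * . .
A . * . . * .
N . . . . . *

Substitution: KAMRAN vs KEMRAN
  K A M R A N
K * . . . . .
E . . . . . .
M . . * . . .
R . . . * . .
A . * . . * .
N . . . . . *

Insertion/deletion: KAMRAAN vs KEMRA-N
  K A M R A A N
K * . . . . . .
E . . . . . . .
M . . * . . . .
R . . . * . . .
A . * . . * * .
N . . . . . . *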
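
A dot matrix of this kind is simple to compute. The sketch below marks every identical residue pair with `*` and every other cell with `.`; the function names are hypothetical, and real dot plot programs usually add a sliding window and threshold to suppress noise, which is omitted here.

```python
def dot_matrix(seq_rows, seq_cols):
    """Return a grid of booleans: True where the row and column residues match."""
    return [[r == c for c in seq_cols] for r in seq_rows]


def render(seq_rows, seq_cols):
    """Draw the dot matrix as text, one row per residue of seq_rows."""
    lines = ["  " + " ".join(seq_cols)]
    for residue, row in zip(seq_rows, dot_matrix(seq_rows, seq_cols)):
        cells = " ".join("*" if hit else "." for hit in row)
        lines.append(residue + " " + cells)
    return "\n".join(lines)


plot = render("KEMRAN", "KAMRAN")
print(plot)
```

An unbroken diagonal indicates an aligned region, a gap in the diagonal a substitution, and a shift of the diagonal an insertion or deletion.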

34
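DotPlot analysis; repetitive sequences
KAMRANKAMRAN vs KEMRANKEMRAN (* = identical residues, . = no match)
  K A M R A N K A M R A N
K * . . . . . * . . . . .
E . . . . . . . . . . . .
M . . * . . . . . * . . .
R . . . * . . . . . * . .
A . * . . * . . * . . * .
N . . . . . * . . . . . *
K * . . . . . * . . . . .
E . . . . . . . . . . . .
M . . * . . . . . * . . .
R . . . * . . . . . * . .
A . * . . * . . * . . * .
N . . . . . * . . . . . *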

35
Dynamic Programming for sequence alignment
identity and substitution scoring, gap penalty
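
How identity scores, substitution scores and a gap penalty combine can be shown with a small global alignment in the style of Needleman and Wunsch. The scoring values below (match +1, mismatch -1, gap -2) are arbitrary assumptions for illustration; real protein alignments would normally use a substitution matrix such as PAM or BLOSUM and often separate gap-opening and gap-extension penalties.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align a and b; return (score, aligned_a, aligned_b)."""

    def pair_score(x, y):
        return match if x == y else mismatch

    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + pair_score(a[i - 1], b[j - 1]),  # align pair
                score[i - 1][j] + gap,                                 # gap in b
                score[i][j - 1] + gap,                                 # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0
                and score[i][j] == score[i - 1][j - 1] + pair_score(a[i - 1], b[j - 1])):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


best_score, aligned_a, aligned_b = global_align("KAMRAAN", "KEMRAN")
```

Several alignments can share the best score; the traceback here returns one of them, and which one depends on the order in which the three moves are checked.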

36
Sequence Analysis
• Sequence comparison and alignment
Pairwise sequence alignment: FASTA, BLAST
Multiple sequence alignment: PILEUP, ClustalW
• Pattern search: PROSITE
• Phylogenetic analysis: Phylip
• Genome-level sequence analysis
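
Pattern search can be illustrated by converting a PROSITE-style pattern into a regular expression. The sketch below handles only the common elements (single residues, `x`, `[...]`, `{...}`, repeat counts, and `<` / `>` anchors at the ends); it is a simplified assumption-laden converter, not the PROSITE scanning software. The test sequence is made up.

```python
import re


def prosite_to_regex(pattern):
    """Convert a simple PROSITE pattern such as N-{P}-[ST]-{P} to a regex string."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        at_start = element.startswith("<")   # pattern anchored at the N-terminus
        at_end = element.endswith(">")       # pattern anchored at the C-terminus
        element = element.strip("<>")
        # Split off an optional repeat count: x(3) or x(2,4)
        core, low, high = re.fullmatch(
            r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element
        ).groups()
        if core == "x":
            regex = "."                      # any residue
        elif core.startswith("{"):
            regex = "[^" + core[1:-1] + "]"  # any residue except those listed
        else:
            regex = core                     # single residue or [ABC] set
        if low:
            regex += "{" + low + ("," + high if high else "") + "}"
        parts.append(("^" if at_start else "") + regex + ("$" if at_end else ""))
    return "".join(parts)


# PROSITE N-glycosylation site pattern, searched in a made-up sequence.
glyco_regex = prosite_to_regex("N-{P}-[ST]-{P}.")
match_starts = [m.start() for m in re.finditer(glyco_regex, "MKNGSAANPTQ")]
```

Note that `re.finditer` reports non-overlapping matches only, so overlapping sites would need a lookahead-based search.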

37
Pairwise sequence alignment of
(a) human and chicken cathepsin B and
(b) human and hookworm cathepsin B.
Identical residues are indicated as dark blocks.
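
The identical residues highlighted in such a figure can be counted to give a percent identity. The sketch below assumes two already-aligned strings of equal length with `-` for gaps, and divides by the number of gap-free columns; other conventions for the denominator exist, so this choice is an assumption. The toy sequences stand in for the cathepsin B alignments, which are not reproduced here.

```python
def identity_mask(aligned_a, aligned_b):
    """Return a string with '|' under identical residues and ' ' elsewhere."""
    return "".join(
        "|" if x == y and x != "-" else " "
        for x, y in zip(aligned_a, aligned_b)
    )


def percent_identity(aligned_a, aligned_b):
    """Percent of gap-free alignment columns that hold identical residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if x != "-" and y != "-"]
    if not columns:
        return 0.0
    identical = sum(x == y for x, y in columns)
    return 100.0 * identical / len(columns)


mask = identity_mask("KAMRAAN", "KEMRA-N")
identity = percent_identity("KAMRAAN", "KEMRA-N")
```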

38

39
Multiple sequence alignment
of the family of Kunitz-type proteinase inhibitors

40
Phylogenetic Analysis
of kunitz-type proteinase inhibitors based on multiple sequence alignment
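
One common way to go from a multiple sequence alignment to a tree is to first compute pairwise distances. The sketch below computes a simple p-distance (fraction of differing residues in gap-free columns) for every pair of aligned sequences. The three toy sequences and their names are hypothetical placeholders, not Kunitz-type inhibitor data, and the slide's own analysis may have used a different method.

```python
from itertools import combinations


def p_distance(a, b):
    """Fraction of gap-free alignment columns where the residues differ."""
    columns = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    return sum(x != y for x, y in columns) / len(columns)


def distance_matrix(alignment):
    """Return {(name1, name2): distance} for all pairs in the alignment."""
    names = list(alignment)
    distances = {(name, name): 0.0 for name in names}
    for p, q in combinations(names, 2):
        d = p_distance(alignment[p], alignment[q])
        distances[(p, q)] = distances[(q, p)] = d
    return distances


def closest_pair(alignment):
    """Return the two sequence names separated by the smallest distance."""
    distances = distance_matrix(alignment)
    return min(combinations(list(alignment), 2), key=lambda pair: distances[pair])


# Hypothetical toy alignment used only to exercise the functions.
toy_alignment = {"seq1": "KAMRAN", "seq2": "KEMRAN", "seq3": "KEMQ-N"}
toy_distances = distance_matrix(toy_alignment)
first_cluster = closest_pair(toy_alignment)
```

A clustering method such as UPGMA or neighbor joining would join the closest pair first and repeat on the reduced matrix; tree-building packages automate those steps.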

41
Scope of topics
• Three-dimensional structures and Structural Bioinformatics
• More

42
End Note
• Bioinformatics is the body of knowledge:
a wealth of data on sequences and
structures.
• The key resource is KNOWLEDGE
• And the key technology is INFORMATION
HANDLING

43
Leading Bioinformatics Institutions
European Bioinformatics Institute, Cambridge, UK
National Center for Biotechnology Information, USA
National Human Genome Research Institute, USA
EMBL, Heidelberg, Germany
J. Craig Venter Institute, USA
[formerly The Institute for Genomic Research (TIGR)]
The Sanger Institute, UK
Bioinformatics Journals and Books
Bioinformatics
Genome Research
Nucleic Acids Research
Bioinformatics by D.W. Mount
Introduction to Bioinformatics by Attwood
Structural Bioinformatics by P.E. Bourne
Bioinformatics: A Beginner's Guide by Claverie
Bioinformatics Computing by B. Bergeron
Bioinformatics Societies
International Society for Computational Biology (ISCB)
Asia Pacific Bioinformatics Network (APBioNet)
European Conference on Computational Biology (ECCB)
