Bioinformatics
The application of advanced computing techniques to the analysis and management of biological data.
OR
"The mathematical, statistical and computing methods that aim to solve biological problems using DNA and amino acid sequences and related information." - Frank Tekaia
Three important sub-disciplines within bioinformatics:
The development of new algorithms and statistics with which to assess relationships among members of large data sets;
The analysis and interpretation of various types of data, including nucleotide and amino acid sequences, protein domains, and protein structures, etc.
The development and implementation of tools that enable efficient access and management of different types of information.
Computers emerged as important tools in molecular biology during the early 1960s. Margaret Dayhoff gathered all the available protein sequence data to create the first bioinformatics database in 1965.
Dayhoff and co-workers organized the proteins into families and superfamilies based on their degree of sequence similarity.
Sequence alignment was introduced, along with special tables (later known as PAM matrices) that recorded how frequently each amino acid change was observed among the sequences of a group of closely related proteins.
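The idea behind such tables can be illustrated with a minimal sketch that counts how often each pair of residues shares a column in an alignment. This is a simplified illustration only: the function name count_substitutions and the toy alignment are hypothetical, and Dayhoff's actual procedure (which relied on phylogenetic trees and inferred ancestral sequences) is not reproduced here.

```python
from collections import Counter
from itertools import combinations


def count_substitutions(aligned_seqs):
    """Count unordered residue pairs observed in the same alignment column.

    Assumes all sequences are already aligned and of equal length,
    with '-' marking gaps.
    """
    counts = Counter()
    for column in zip(*aligned_seqs):
        for a, b in combinations(column, 2):
            if a == "-" or b == "-":
                continue  # gaps carry no substitution information
            counts[tuple(sorted((a, b)))] += 1
    return counts


# Hypothetical toy alignment, made up for illustration (not real protein data).
toy_alignment = [
    "MKVLA",
    "MKILA",
    "MRVLA",
]

counts = count_substitutions(toy_alignment)
for pair, n in sorted(counts.items()):
    print(pair, n)
```

Pairs of different residues (for example K with R) correspond to observed changes, while identical pairs show conservation; a real substitution table would normalize such counts by the background frequency of each amino acid.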
1965 Margaret Dayhoff's Atlas of Protein Sequence and Structure
1970 Needleman-Wunsch algorithm
1977 DNA sequence assembly, editing, and analysis tools (Staden package)
1981 Smith-Waterman algorithm developed
1981 The concept of a sequence motif (RF Doolittle)
1982 GenBank Release 3 made public
1982 Phage lambda genome (a bacterial virus) sequenced
The genome contains 48,502 base pairs of double-stranded, linear DNA.
1983 Sequence database searching algorithm (Wilbur-Lipman)
1985 FASTP/FASTN: fast sequence similarity searching
1987 EMBL, GenBank, Swiss-Prot databases
1988 National Center for Biotechnology Information (NCBI) created at NIH/NLM
1988 The European Molecular Biology network (EMBnet) was founded for database distribution
As of 2011, EMBnet had 37 nodes spread over 32 countries.
1990 BLAST: fast sequence similarity searching
1991 EST (expressed sequence tag) sequencing
1993 Sanger Centre, Hinxton, UK
1994 EMBL European Bioinformatics Institute, Hinxton, UK
1995 First bacterial genomes completely sequenced
1996 Yeast genome completely sequenced
1997 PSI-BLAST
1998 Worm (multicellular) genome completely sequenced
1999 Fly genome completely sequenced
The Human Genome Project
The publication of the draft sequence of the human genome in 2001 signaled the future of bioinformatics.
Diagnosis of disease and disease risks
DNA sequencing can detect the absence of a particular gene, or a mutation.
Identification of specific gene sequences associated with diseases will permit fast and reliable diagnosis of conditions.
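As a rough sketch of how such detection works at the sequence level, the example below checks whether a gene sequence is present in a longer sequence and lists point differences between a reference and a sample. The function names and the short sequences are hypothetical, made up for illustration; real diagnostic pipelines rely on read alignment and variant calling rather than direct string comparison.

```python
def contains_gene(genome, gene):
    """Return True if the gene sequence occurs exactly within the genome string."""
    return gene in genome


def find_point_mutations(reference, sample):
    """List (position, reference_base, sample_base) for each mismatch.

    Positions are 1-based. Assumes both sequences have equal length and
    are already aligned (no insertions or deletions).
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must have the same length")
    return [
        (i + 1, r, s)
        for i, (r, s) in enumerate(zip(reference, sample))
        if r != s
    ]


# Hypothetical toy sequences (not taken from any real gene).
reference = "ATGGCCATTGTA"
sample = "ATGGCCTTTGTA"

mutations = find_point_mutations(reference, sample)
print(mutations)
print(contains_gene("CCCATGGCCATTGTACCC", reference))
```

Reporting 1-based positions follows the usual convention for describing variants; the equal-length check makes the no-indel assumption explicit instead of silently truncating the comparison.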
Genetics of responses to therapy
Because people differ in their ability to metabolize drugs, different patients with the same condition may require different dosages.
Sequence analysis permits selecting drugs and dosages optimal for individual patients.
Computational Biology
Primarily concerned with evolutionary, population and theoretical biology, rather than the cellular or molecular level.
Cheminformatics
The study and application of computing methods, along with chemical and biological technology, for drug design and development.
Biomedical Informatics
Generally concerned with how the data is manipulated rather than the data itself.
ABOUT THE AUTHOR
JBML is a Bioinformatician with 3+ years of experience in Bioinformatics Research and Machine Learning, with a demonstrated command of Object Oriented programming and data structures. Focused on further developing acquired skills by learning from top teams and completing development projects. He loves to share his knowledge and experience and is open to collaborations.