Introduction to Bioinformatics: Genes and Blue Genes
I400 - Fall 2004
▪ Midterm exam: October 25, 2004; 11:15-12:45
▪ Final project presentations: December 15, 2004; 10:15-12:15
▪ Final project report (5-10 pages): due December 15, 2004; 10:15-12:15
Class 1: August 30, 2004
Topics
Overview of the course
Introduction to bioinformatics
Reading material
Hunter, L. Molecular Biology for Computer Scientists. In: Artificial Intelligence and Molecular Biology, Ed. L. Hunter, pp. 1-46, AAAI Press, 1993. (pdf)
(Lecture Notes) (Homework Assignment)
Class 2: September 1, 2004
Topics
The logic of biological phenomena
Class 3: September 6, 2004
Topics
Organization and structure of cells
The central dogma of molecular biology
Class 4: September 8, 2004
Topics
Biological sequences: DNA, RNA, protein
Reading material
Textbook: Molecular biology and biological chemistry (Chapter 1)
(Lecture Notes) (Homework Assignment)
Class 5: September 13, 2004
Topics
Major biological sequence databases: GenBank, Swiss-Prot, PDB, SCOP (presentation by Henry Paik, graduate student at Indiana University School of Informatics)
(Lecture Notes by Henry Paik)
Class 6: September 15, 2004
Topics
Pairwise sequence alignment: Importance of sequence alignment, Needleman-Wunsch algorithm
Reading material
Textbook: Data searches and pairwise alignments (Chapter 2)
Class 7: September 20, 2004
Topics
Pairwise sequence alignment: Smith-Waterman algorithm, FASTA and BLAST
Class 8: September 22, 2004
Topics
Scoring matrices: PAM and BLOSUM series
Database searches
Class 9: September 27, 2004
Topics
Sequence profiles
Multiple sequence alignment: optimal algorithm, ClustalW algorithm
Class 10: September 29, 2004
Topics
Substitution patterns (presentation by Adrian Padilla)
'Oming in on function (presentation by Ashley Kowaleski)
Reading material
Textbook: Substitution patterns (Chapter 3)
Greenbaum, D. et al. Interrelating different types of genomic data, from proteome to secretome: 'oming in on function. Genome Research, pp. 1463-1468, 2001. (pdf)
Class 11: October 4, 2004
Topics
Exploring dead genes (presentation by Adrienne Manuel)
Detecting protein function and protein-protein interactions (presentation by TuyetLinh Nguyen)
Gapped BLAST and PSI-BLAST (presentation by Sean Boyle)
Database
DIP: database of interacting proteins
Reading material
Harrison, PM et al. Digging for dead genes: an analysis of the characteristics of the pseudogene population in the Caenorhabditis elegans genome. Nucleic Acids Res. 2001; 29(3): 818-830. (pdf)
Marcotte, EM et al. Detecting protein function and protein-protein interactions from genome sequences. Science. 1999; 285(5428): 751-753. (pdf)
Altschul, SF et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997; 25(17): 3389-3402. (pdf)
Class 12: October 6, 2004
Topics
Distance-based methods of phylogenetics
Reading material
Textbook: Distance-based methods of phylogenetics (Chapter 4)
Class 13: October 11, 2004
Topics
Character-based methods of phylogenetics
Doing sequence alignment and phylogeny over the Internet
Sequence alignment and phylogenetic analysis
Reading material
Textbook: Character-based methods of phylogenetics (Chapter 5)
Class 14: October 13, 2004
Topics
Methods of phylogenetics
Guest presentation
How to use the PHYLIP package (presentation by Kiran Annaiah, graduate student at Indiana University School of Informatics)
(Lecture Notes) (Lecture Notes by David Swofford, FSU - first 15 pages) (Homework Assignment)
Class 15: October 18, 2004
Topics
Prokaryotic gene structure
Reading material
Textbook: Genomics and gene recognition (Chapter 6)
Classes 16-17: October 20-25, 2004
Review for midterm exam (October 20) and midterm exam (October 25)
Class 19: November 1, 2004
Topics
Prokaryotic gene structure
(Lecture Notes) (DNA polymerase)
Class 20: November 3, 2004
Topics
Eukaryotic gene structure
Class 21: November 8, 2004
Topics
Introduction to statistics
Permutation test
Class 22: November 10, 2004
Topics
Introduction to statistical learning
K-nearest neighbor algorithm
(Lecture Notes) (Homework Assignment)
Class 23: November 15, 2004
Topics
Logistic regression method
Class 24: November 17, 2004
Topics
Logistic regression method - summary
Prediction of protein secondary structure - statistical approach
Information
Makeup class (class 18): presentation by Stephen Fodor, CEO, Affymetrix.
"Windows on the genome"
Friday, November 19, 2004; Jordan Hall 102
(Lecture Notes not available)
Class 25: November 22, 2004
Topics
New research directions in predicting protein function
Guest presentation
Sean Mooney, assistant professor at Indiana University School of Medicine
(Lecture Notes not available)
Class 27: November 29, 2004
Topics
Protein folding problem
Chou-Fasman algorithm for prediction of secondary structure
Reading material
Textbook: Protein and RNA structure prediction (Chapter 7) pp. 155-167
(Lecture Notes) (Anfinsen's experiment)
Class 28: December 1, 2004
Topics
Protein sequencing and identification with mass spectrometry
Guest presentation
Haixu Tang, assistant professor at Indiana University School of Informatics
Reading material
Textbook: Proteomics (Chapter 8) pp. 184-187
(Lecture Notes not available)
Class 29: December 6, 2004
Topics
DNA arrays
Reading material
Textbook: Genomics and gene recognition (Chapter 6) pp. 143-147
Useful material
Animation of the DNA array experiment
Class 30: December 8, 2004
Topics
DNA curvature
Guest presentation
Alexander Bolshoy, visiting associate professor, Indiana University School of Informatics
(Lecture Notes not available)
Final Presentations: December 15, 2004
Topics
MES-4 (presentation by Sean Boyle)
Understanding the services from NCBI (presentation by TuyetLinh Nguyen)
Herpesviridae and you (presentation by Adrienne Manuel)
DNA forensic identification (presentation by Ashley Kowaleski)
Out-of-Africa theory (presentation by Adrian Padilla)
Last updated: 12/15/2004 11:31 PM