- EECS 458, Fall 2004: Introduction to Computational Biology/Bioinformatics
- This course covers fundamental algorithmic and statistical methods in computational molecular biology and bioinformatics.
Topics include DNA sequencing, sequence analysis and gene prediction, pairwise and multiple alignment, motif identification,
polymorphism, gene mapping and haplotyping algorithms, phylogenetic analysis, microarray data analysis, and protein sequencing,
structure and function. The course focuses on the algorithmic side: dynamic programming, string algorithms, graph
theory, hidden Markov models, etc. Knowledge of molecular biology is a plus but is not assumed.
- Instructor:
- Dr. Jing Li
Office: 509 Olin Bldg
Phone: X0356
Email: jingli@eecs.cwru.edu
Office hours: MW 10:30-11:30am or by appt
- Teaching Assistant:
- Class Meeting:
- MWF 11:30am-12:20pm, SEARS 356
- Reference books
- Algorithms on Strings, Trees, and Sequences: Computer Science and Computational Biology, Dan Gusfield, Cambridge University Press, 1997. ISBN: 0-521-58519-8
- Computational Molecular Biology: An Algorithmic Approach, Pavel Pevzner, 2000, the MIT Press. ISBN 0-262-16197-4
- Introduction to Computational Biology: Maps, Sequences and Genomes, Michael S. Waterman, Chapman and Hall, 1995. ISBN: 0-412-99391-0
- Pierre Baldi, Soren Brunak, Bioinformatics: The Machine Learning Approach, MIT Press, 1998
- Richard Durbin, A. Krogh, G. Mitchison, and S. Eddy, Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids, Cambridge University Press, 1999. QP620.B576
- Prerequisites
- EECS 340, EECS 233 or equivalent knowledge.
- Course Format
- The course will include lectures by the instructor, guest lectures, directed readings, discussions and student projects/presentations.
The actual format of the course will depend on enrollment and on students' backgrounds.
- Homework
- There will be four assignments. Most of the assignments will focus on the design and analysis of algorithms, but some may require programming.
- Projects
- Students are expected to form teams of 2-3 and work on research topics provided by the instructor or chosen by the students themselves.
The purpose of the project is to familiarize students with the methodology of doing research on important problems in computational biology. Examples of possible
projects include proposing new methods, improving on previous results, and applying methods to new formulations of important problems. Implementing
existing algorithms as software tools, or surveying a topic not covered in class, may also be acceptable. For an implementation project, however, the maximum group size is 2, and
for a survey you are expected to work alone. Each team will give a presentation in class and turn in its term paper by the last class.
- Student Project Homepages
- Grading
- Homework 40% and project/presentation 60%.
- Slides
- Course overview (pdf)
- Introduction to molecular biology (pdf)
- DNA sequencing (pdf)
- Sequence alignment (pdf)
- Guest Lecture by Dr. M Adams on Genome annotation (slides and exercise)
- Introduction to Probability, HMM (pdf)
- String algorithms (pdf)
- Pattern discovery (pdf)
- Statistical genetics (pdf)
- Haplotype inference, SNPs and genetic variation in humans (ppt)
- Phylogeny reconstruction (pdf), courtesy of Mona Singh
- Gene expression analysis (pdf)
- Downloadable papers
- Collins, F. S. et al. A vision for the future of genomics research, Nature 422:835-47(pdf)
- Pattern Discovery Papers: A review,
TEIRESIAS,
WINNOWER,
Projection,
Weeder,
Weeder1,
Gibbs Sampler,
Sagot paper,
Footprinter,
A Tutorial of EM algorithm,
MEME
- Statistical Genetics Papers: Human Molecular Genetics, 2nd ed.,
Strachan and Read,
GeneHunter,
Allegro,
Merlin,
Superlink (ISMB02),
Superlink (RECOMB03)
- Haplotyping Papers: Three review papers 1,
2, 3.
- Suffix tree and reverse prefix tree (pdf).
- Calendar (preliminary): Aug 23-Dec 15 2004
- Week 1 (Aug 23/25/27): Overview, introduction to molecular biology
- Week 2 (Aug 30/Sep 1/3): Sequence alignment, dynamic programming
- Week 3 (Sep 6(H)/8/10): Multiple sequence alignment, Guest Lecture by Dr. M Adams on Genome annotation (slides and exercise)
- Week 4 (Sep 13/15/17): Intro to probability, hidden Markov models
- Week 5 (Sep 20/22/24): HMM with applications, exact string matching
- Week 6 (Sep 27/29/Oct 1): Exact string matching, suffix tree
- Week 7 (Oct 4/6/8): Suffix tree, pattern/motif identification
- Week 8 (Oct 11/13/15): Pattern/motif identification, probabilistic models
- Week 9 (Oct 18(H)/20/22): Gene mapping, linkage/association
- Week 10 (Oct 25/27/29): Haplotype-based association mapping
- Week 11 (Nov 1/3/5): SNP variation, haplotyping, Guest lecture by Dr. Joseph Nadeau on systems biology, genetic perturbations, and network discovery (paper and slides)
- Week 12 (Nov 8/10/12): Haplotyping; phylogeny reconstruction methods; microarray data analysis, clustering, classification; proteomics: sequencing, structure and functions; protein interaction networks
- Week 13 (Nov 15/17/19): Guest lectures by Dr. Mireya Diaz on coalescent theory (slides), student presentations
- Week 14 (Nov 22/24/26(H)): Student presentations
- Week 15 (Nov 29/Dec 1/3): Student presentations
- Relevant Courses at Case
- EECS 433: Database Systems
- EECS 435: Data Mining
- EECS 454: Analysis of Algorithms
- EPBI 452: Statistical Methods in Human Genetics
- EPBI 457: Genetic Linkage Analysis
- EPBI 471: Statistical Aspects of Data Mining
- EPBI 491: Epidemiology: Application of Theory and Methods
- EPBI 492: Epidemiology: Statistical Methods and Modeling
- GENE 508: Bioinformatics and Computational Genomics
- GENE 509: Complex Genetic Traits
- STAT 445/446: Theoretical Statistics I/II
- BIOC 618: Biology and Mathematics of Microarray Studies
- Resources
- more books
- Current Topics in Computational Biology, Tao Jiang, Ying Xu and Michael Zhang (co-ed), the MIT Press Series on Computational Molecular Biology, Feb. 2002. ISBN: 0-262-10092-4.
- João Setubal and João Carlos Meidanis, Introduction to Computational Molecular Biology, PWS Publishing Co., 1997
- Jason Wang, Bruce A. Shapiro, and Dennis Shasha, Pattern Discovery in Biomolecular Data: Tools, Techniques, and Applications, Oxford University Press, 1999
- David Mount, Bioinformatics: Sequence and Genome Analysis, Cold Spring Harbor Laboratory Press, 2002
- Dan E. Krane, Michael L. Raymer, Fundamental Concepts of Bioinformatics, Benjamin Cummings, 2002
- Warren J. Ewens, Gregory R. Grant, Statistical Methods in Bioinformatics: An Introduction, Springer, 2001
- Statistical genetics: handbook, ott, lague, QTL, Terry's new book on microarray