Syllabus [PDF]
CIT Lubrano • Tuesday and Thursday, 2:30-3:50pm
Prof. Sorin Istrail
401-863-6196 • sorin@cs.brown.edu
Office Hours: TBA or by appointment
- Introduction. Comparative genomics: genomes (DNA and protein sequence), protein structures (geometry), gene regulation (logic, systems), immunology (systems). The nature and complexity of bio-molecular data. The intertwining of algorithms and statistics in the design of genomics tools. The “Gold-Bug” – a metaphor for Bioinformatics.
Genomics
- Alignment of two bio-molecular sequences. Local and global alignment. Dynamic Programming algorithms. Edit graph theory and visualization of alignments. The fundamental Dynamic Programming recurrence. The Smith-Waterman algorithm. Probability and statistical significance. Evolutionary models. Information theory and the genetic code. The PAM matrices of Margaret Dayhoff, the “mother and father” of Bioinformatics. Statistical assumptions for bio-molecular data. Statistical hypothesis testing. How Sir R.A. Fisher caught Mendel “cheating.”
- BLAST. An outline of the BLAST statistical theory. Algorithmic speedup: a linear-time approximation of the quadratic Smith-Waterman algorithm.
- Gene prediction. Hidden Markov Model algorithms.
- Genome Assembly. Assembly algorithms. Comparing assemblies: Of Mice and Dogs and Chimps and Men.
- Genomic Regulation. Regulatory motifs. Transcription factors. Position weight matrix algorithms. Sea urchin: the First Genome of genomic regulation. A visit to the Sea Urchin Assembly. Suffix tree data structures and algorithms. Compressing genomic regulatory information. Designing DNA arrays.
- Protein folding. The computational protein folding problem. Secondary structure prediction algorithms. Classification of protein folds. Protein structure alignment algorithms. Protein misfolding and Mad Cow Disease.
- Genetic variation. Single Nucleotide Polymorphisms (SNPs). Haplotypes. Informative SNPs. The Minimum Informative Subset Problem. Guilt by association. Statistical power and disease associations.
Systems Biology
- Biological complexity. Complex systems and Herbert Simon's Hora and Tempus problem.
- Humans and pathogens. Comparative immuno-peptidomics of humans and their pathogens. A tale and a tour of two genomes: the virus genome and the bacterial genome. Do pathogens evolve their proteome to evade the human immune system?
- Cancer genomics. Tumor complexity.
- Gene regulatory networks. Logic functions of genomic cis-regulatory code. Davidson vs. von Neumann: an information processing parallel between the genomic regulatory system and the nervous system.