Syllabus
CSCI2950-L: Algorithmic Foundations of Computational Biology II
CIT 368 • Tuesday and Thursday, 10:30-11:50
Prof. Sorin Istrail
401-863-6196 • sorin@cs.brown.edu
In the post-genome-sequence phase of the Human Genome Project, the HapMap Project has been focusing on the study of inherited genetic variation and its critical but as yet largely uncharacterized role in human disease. Most common diseases, such as diabetes, cancer, and heart disease, are affected by many genes and environmental factors. Although any two unrelated people are the same at about 99.9% of their DNA sequences, the remaining 0.1% is important because it contains the genetic variants that influence how people differ in their risk of disease or their response to drugs. This course focuses on genome-wide disease association studies and the computational challenges of revealing the genetic determinants of disease. In this exploration we will use the haplotype map of the human genome, the HapMap, which describes the common patterns of human DNA sequence variation.
- Population Genetics
- Introduction
- Population Genetics Models
- Population Genetics Simulations
- Hardy-Weinberg Principle (Equilibrium)
- Haplotypes and SNPs
- Maximum Likelihood Estimation
- Functional meaning of SNPs
- SNP Datasets benchmarks
- The HapMap Project
- General articles on SNPs and Haplotypes
- Haplotype Maps
- Linkage Disequilibrium
- Linkage Disequilibrium measures
- OPEN PROBLEM1: The search for an "optimal" LD measure
- OPEN PROBLEM2: The interpretation of the intermediate values of D'
- Linkage Disequilibrium theory and implications for disease associations
- LD patterns across populations
- Linkage Analysis
- Haplotype Blocks, Block partitioning, and Block-free methods
- Haplotype Phasing
- Phasing: Expectation Maximization/Maximum Likelihood
- OPEN PROBLEM3: A combinatorial algorithm for the Global ML
- Phasing: Parsimony I - Clark methods
- OPEN PROBLEM4: A combinatorial algorithm for Clark parsimony
- Phasing: (Pure) Parsimony (min # of haps)
- OPEN PROBLEM5: A combinatorial algorithm for (pure) parsimony
- Phasing: Bayesian methods
- OPEN PROBLEM6: A combinatorial algorithm for Bayesian methods
- Phasing: Perfect Phylogeny
- OPEN PROBLEM7: A unification theory and algorithm for the Haplotype Phasing
- Problem complexity
- Parsimony Phasing is NP-complete
- Maximum Likelihood Phasing is NP-complete
- Clark-type Parsimony (Maximum resolution) is NP-complete
- Haplotype Reconstruction methods
- SNPs Problems Surveys: Algorithms and Complexity
- Tagging SNPs
- SNP selection/Tagging SNPs algorithms
- Maximum Informative set of SNPs
- OPEN PROBLEM8: Better algorithms for tagging SNPs
- Disease Associations
- Significance Testing - uses and misuses
- Fisher
- Where is Fisher when we need him?
- Fisher and the Likelihood: "Hypotheses (testing)" do not obey the laws of probability!
- Association Studies
- Guilt by Association
- Heart Disease association studies
- Diabetes 2 association studies
- Cancer association studies
- Power in Genome-wide association studies
- Uses and misuses in Disease Association studies
- Tough critiques of Association Design studies
- Common Disease Common Variant Hypothesis
- More Rigorous P-values in Disease Association Studies
- Data compression
- OPEN PROBLEM9-10: Algorithms for the Minimum Informative SNP/HAP Based on Genetics and Genomics for non-random associations
- Coalescent Model Theory
- Missing Data in haplotype analysis
- Recombination Rates estimation
- SNP typing/calling algorithms
- QTL (quantitative trait loci) analysis
- Pedigree models and methods

