CSCI2950-L: Algorithmic Foundations of Computational Biology II

Class Notes

Note: Most documents are only accessible from within the campus network.


Scribe template and LaTeX tutorials



Lecture Date Topic
Lecture 1 Population Genetics Introduction
Lecture 2 SNPs, Haplotypes, and Disease Association
Lecture 3 Introduction to Chi^2
Lecture 4 Statistical Tests
Lecture 5 The International HapMap Project
Lecture 6 Introduction to Linkage Disequilibrium
Lecture 7 Linkage Disequilibrium Measure r^2
Lecture 8 Hardy Weinberg for Two Loci
Lecture 9 LD and Recombination, Tests of Association
Lecture 10 Population Methods and Haplotype Phasing
Lecture 11 Haplotype Phasing and Clark's Algorithm
Lecture 12 Maximum Likelihood and Expectation Maximization
Lecture 13 Complexity of Parsimony Haplotype Phasing
Lecture 14 Complexity of Likelihood Haplotype Phasing
Lecture 15 Presentation of Four Papers on Genetic Variation
Lecture 16 Association Study on Diabetes
Lecture 17 The Four Gamete Condition, Perfect Phylogenies, and Applications in Population Genetics
Lecture 18 Summary of "Mapping Using Inferred ARGs" (Minichiello & Durbin)
Lecture 19 Genomewide Association Studies
Lecture 20 Presentations

Genomewide association studies
Presentation agenda

Type II diabetes

Multiple Sclerosis

Prostate Cancer
Lecture 21 Presentations

Special Topics
Tagging SNPs

Bayesian methods for phasing




Introduction to Population Genetics
Principles of Population Genetics (Fourth Edition, 2007), Daniel L. Hartl and Andrew G. Clark, Chapters 1 and 2



SNPs, Haplotypes, and Disease Association
SNP Intro Presentation

SNPs Problems, Complexity and Algorithms, G. Lancia, V. Bafna, S. Istrail, R. Lippert, R. Schwartz

Optimal Haplotype Block-Free Selection of Tagging SNPs for Genome-Wide Association Studies, Bjarni V. Halldorsson, Vineet Bafna, Ross Lippert, Russell Schwartz, Francisco M. De La Vega, Andrew G. Clark, and Sorin Istrail, Genome Research 14:1633-1640, 2004

Variation is the spice of life, Leonid Kruglyak and Deborah A. Nickerson, Nature Genetics, Volume 27 March 2001.

Guilt by association, David Altshuler, Mark Daly & Leonid Kruglyak, Nature Genetics, Volume 26, October 2000

Whole-Genome Patterns of Common DNA Variation in Three Human Populations, David A. Hinds, Laura L. Stuve, Geoffrey B. Nilsen, Eran Halperin, Eleazar Eskin, Dennis G. Ballinger, Kelly A. Frazer, David R. Cox, Science, 18 FEB 2005 Volume 307



Statistical Tests
Efficiency and Power in Genetic Association Studies, de Bakker et al., Nature Genetics, Volume 37, Number 11, November 2005



International HapMap Project
A haplotype map of the human genome, The International HapMap Consortium, NATURE, Vol 437, 27 October 2005



Introduction to Linkage Disequilibrium
Linkage disequilibrium in the human genome, David E. Reich, Michele Cargill, Stacey Bolk, James Ireland, Pardis C. Sabeti, Daniel J. Richter, Thomas Lavery, Rose Kouyoumjian, Shelli F. Farhadian, Ryk Ward & Eric S. Lander, Nature, VOL 411, 10 MAY 2001

Linkage disequilibrium and the mapping of complex human traits, Kenneth M. Weiss and Andrew G. Clark, TRENDS in Genetics Vol.18 No.1 January 2002

Patterns of Linkage Disequilibrium in the Human Genome, Kristin G. Ardlie, Leonid Kruglyak and Mark Seielstad, Nature Reviews Genetics 3, 299, April 2002

The Interaction of Selection and Linkage. I. General Considerations; Heterotic Models, R.C. Lewontin, Genetics 49: 49-67, January 1964



Linkage Disequilibrium Measure r^2
A first-generation linkage disequilibrium map of human chromosome 22, Elisabeth Dawson et al., Nature Volume 418, 1 August 2002

A Comparison of Linkage Disequilibrium Measures for Fine-Scale Mapping, B. Devlin and N. Risch, Genomics 29, 311-322 (1995)

Linkage Disequilibrium in Humans: Models and Data, Jonathan K. Pritchard and Molly Przeworski, Am. J. Hum. Genet. 69:1-14, 2001



Hardy Weinberg for Two Loci
Principles of Population Genetics (Fourth Edition, 2007), Daniel L. Hartl and Andrew G. Clark, Chapters 2 and 3



LD and Recombination, Tests of Association
Gametic Disequilibrium Measures: Proceed With Caution, Philip W. Hedrick, Genetics 117: 331-341 (October, 1987)

The optimal measure of allelic association, N. E. Morton, W. Zhang, P. Taillon-Miller, S. Ennis, P.-Y. Kwok, and A. Collins, PNAS April 24, 2001, vol. 98 no. 9, 5217-5221

Linkage Disequilibrium and the Search for Complex Disease Genes, L.B. Jorde, Genome Res. 2000 10: 1435-1444



Population Methods and Haplotype Phasing
Principles of Population Genetics (Fourth Edition, 2007), Daniel L. Hartl and Andrew G. Clark, Chapter 6

A Comparison of Bayesian Methods for Haplotype Reconstruction from Population Genotype Data, Matthew Stephens and Peter Donnelly, Am. J. Hum. Genet. 73:1162-1169, 2003

A Fast and Flexible Statistical Model for Large-Scale Population Genotype Data: Applications to Inferring Missing Genotypes and Haplotypic Phase, Paul Scheet and Matthew Stephens, The American Journal of Human Genetics Volume 78 April 2006




Haplotype Phasing and Clark's Algorithm
A Comparison of Phasing Algorithms for Trios and Unrelated Individuals, Jonathan Marchini et al., Am. J. Hum. Genet. 2006;78:437-450



Maximum Likelihood and Expectation Maximization
Maximum-Likelihood Estimation of Molecular Haplotype Frequencies in a Diploid Population, Laurent Excoffier and Montgomery Slatkin, Mol. Biol. Evol. 12(5):921-927, 1995



Complexity of Haplotype Phasing
A Fast and Flexible Statistical Model for Large-Scale Population Genotype Data: Applications to Inferring Missing Genotypes and Haplotypic Phase, Paul Scheet and Matthew Stephens, Am. J. Hum. Genet. 2006;78: 629-644



Summary of "Mapping Using Inferred ARGs" (Minichiello & Durbin)
Mapping Trait Loci by Use of Inferred Ancestral Recombination Graphs, Mark J. Minichiello and Richard Durbin, Am. J. Hum. Genet. 2006;79:910-922

Coalescent-based association mapping and fine mapping of complex trait loci, Sebastian Zollner and Jonathan K. Pritchard



Genomewide Association Studies
Old Suspects Found Guilty: The First Genome Profile of Multiple Sclerosis, Leena Peltonen, New England Journal of Medicine, 357:9, August 30, 2007

Risk Alleles for Multiple Sclerosis Identified by a Genomewide Study, The International Multiple Sclerosis Genetics Consortium, N Engl J Med 2007: 357



Special Topics
Coalescent
Mark J. Minichiello and Richard Durbin, Mapping Trait Loci by Use of Inferred Ancestral Recombination Graphs, The American Journal of Human Genetics, volume 79 (2006), pages 910-922

Sebastian Zollner and Jonathan Pritchard, Coalescent-based association mapping and fine mapping of complex trait loci, Genetics. 2005 Feb;169(2):1071-92. Epub 2004 Oct 16

Admixture
Hoggart CJ et al., Design and analysis of admixture mapping studies, Am J Hum Genet. 2004 May;74(5):965-78

David Reich and Nick Patterson, Will admixture mapping work to find disease genes?, Philos Trans R Soc Lond B Biol Sci. 2005 August 29; 360(1460): 1605-1607

Michael W. Smith and Stephen J. O'Brien, Mapping by admixture linkage disequilibrium: advances, limitations and guidelines, Nature Reviews Genetics 6, 623-632 (August 2005)

Patterson N et al., Methods for high-density admixture mapping of disease genes, Am J Hum Genet. 2004 May;74(5):979-1000

SNP selection and haplotype blocks
V. Bafna, B. V. Halldorsson, R. Schwartz, A. Clark, and S. Istrail, Haplotypes and Informative SNP selection: Don't block out information, RECOMB 2003: 19-27

Bjarni V. Halldorsson, Vineet Bafna, Ross Lippert, Russell Schwartz, Francisco M. De La Vega, Andrew G. Clark, and Sorin Istrail, Optimal haplotype block free selection of tagging SNPs for genome-wide association studies, Genome Research, 2004 14:1633-1640

Eran Halperin, Gad Kimmel and Ron Shamir, Tag SNP selection in genotype data for maximizing SNP prediction accuracy, Bioinformatics, conference version in The Annual Meeting of the International Society for Computational Biology (ISMB), 2005

Bayesian Methods for Phasing
Matthew Stephens and Peter Donnelly, A Comparison of Bayesian Methods for Haplotype Reconstruction from Population Genotype Data, Am J Hum Genet. 2003 November; 73(5): 1162-1169

P Scheet and M Stephens, A fast and flexible statistical model for large-scale population structure in genetic association studies, Genome Research 2006 vol:16 pg:290

Olle Haggstrom, Finite Markov Chains and Algorithmic Applications






Association Study Papers
Prostate Cancer
Multiple prostate cancer risk variants on 8q24, Witte, Nature Genetics 39, 579 - 580 (2007)

Genome-wide association study of prostate cancer identifies a second risk locus at 8q24, Yeager et al. Nature Genetics 39, 645 - 649 (2007)

Admixture mapping identifies 8q24 as a prostate cancer risk locus in African-American men, Freedman et al. PNAS vol. 103, no. 38, pg 14068-14073 (2006)

A common variant associated with prostate cancer in European and African populations, Amundadottir et al. Nature Genetics 38, 652 - 658 (2006)
Heart Disease
Ruth McPherson et al., A Common Allele on Chromosome 9 Associated with Coronary Heart Disease, Science 8 June 2007, Vol. 316 no. 5830, pp. 1488-1491

Helgadottir et al., A common variant on chromosome 9p21 affects the risk of myocardial infarction, Science 2007 Jun 8;316(5830):1491-3. Epub 2007 May 3

Samani et al., Genomewide Association Analysis of Coronary Artery Disease, New England Journal of Medicine, July 2007 (10.1056/NEJMoa072366)

Diabetes
Helgason A et al. Refining the impact of TCF7L2 gene variants on type 2 diabetes and adaptive evolution. Nature Genetics 39, Feb 2007, 218-225

Valeriya Lyssenko et al., Mechanisms by which common variants in the TCF7L2 gene increase risk of type 2 diabetes, J Clin Invest. 2007 August 1; 117(8): 2155-2163

Multiple Sclerosis
Simon G Gregory et al., Interleukin 7 receptor alpha chain (IL7R) shows allelic and functional association with multiple sclerosis, Nature Genetics 39, 1083 - 1091 (2007)

The International Multiple Sclerosis Genetics Consortium, Risk Alleles for Multiple Sclerosis Identified by a Genomewide Study, New England Journal of Medicine, Volume 357:851-862, August 30, 2007, Number 9

Leena Peltonen, Old Suspects Found Guilty: The First Genome Profile of Multiple Sclerosis, New England Journal of Medicine, Volume 357:927-929, August 30, 2007, Number 9