Coalescent Gene Genealogies

The coalescent describes the genealogical relations of the lineages ancestral to a sample of genes. The sample size, $n$, may vary from 1 to 50 gene copies; each tree shows one possible genealogy among these copies, with nodes representing the common ancestors of the sampled genes. Branch lengths are drawn at random according to the probability of coalescence in a finite population of gene copies.

Contributed by: John Hawks (March 2011)
Open content licensed under CC BY-NC-SA


Details

The coalescent is an algorithmic approach to simulating gene genealogies. It is also a way of looking at the history of genes in a population, which has given rise to considerable theoretical development in population genetics.

The basic idea is that instead of following the reproduction of genes forward in time, we trace the ancestry of a sample of genes backward in time. We assume a diploid Wright–Fisher population model with $N$ individuals, and therefore $2N$ gene copies. In each generation, two genes have a probability $1/(2N)$ of sharing a single common ancestor in the previous generation, and a probability $1 - 1/(2N)$ of having two distinct ancestors. The number of generations that have elapsed since their common ancestor is then approximately exponentially distributed with mean $2N$. With a sample of size $n = 2$, this Demonstration yields this distribution.
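The pairwise case can be checked numerically. The following is a minimal Python sketch, not the Demonstration's own code: it assumes a coalescence probability of $1/(2N)$ per generation, steps backward one generation at a time, and averages the waiting time over many replicates. The function names `pairwise_coalescence_time` and `mean_pairwise_time` are hypothetical, chosen only for illustration.

```python
import random


def pairwise_coalescence_time(N, rng):
    """Generations back until two gene copies share a common ancestor.

    Assumes a diploid Wright-Fisher population of N individuals (2N gene
    copies), so the two lineages coalesce with probability 1/(2N) in each
    generation.
    """
    p = 1.0 / (2 * N)
    t = 1
    # Keep stepping back while the two lineages pick distinct ancestors.
    while rng.random() >= p:
        t += 1
    return t


def mean_pairwise_time(N, replicates, seed=0):
    """Average coalescence time over independent replicates."""
    rng = random.Random(seed)
    total = sum(pairwise_coalescence_time(N, rng) for _ in range(replicates))
    return total / replicates


if __name__ == "__main__":
    N = 50
    # The sample mean should lie close to 2N = 100 generations.
    print(mean_pairwise_time(N, 20000))
```

The discrete waiting time here is geometric; for large $N$ it is well approximated by the exponential distribution described above.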

If we extend the sample to $n$ gene copies, the expected time in generations until two of these coalesce is

$$\frac{4N}{n(n-1)},$$

again exponentially distributed. At this time, the algorithm chooses two of the lineages at random and combines them into a single ancestral lineage. Iterating this process across successive epochs, until a single lineage remains, yields a bifurcating tree. The times of the nodes and the total depth of the tree are one realization consistent with evolution by genetic drift alone in a finite population.
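The procedure described above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions rather than the Demonstration's implementation or Hudson's C code: waiting times are drawn from an exponential distribution with mean $4N/(k(k-1))$ while $k$ lineages remain, and the function name `simulate_coalescent` and the tuple layout of the returned events are hypothetical.

```python
import random


def simulate_coalescent(n, N, seed=None):
    """Simulate one random genealogy for n sampled gene copies.

    Returns (events, depth). Each event is a tuple
    (time, child_a, child_b, parent), with time measured in generations
    before the present. Sampled copies are numbered 0..n-1 and ancestral
    nodes are numbered from n upward. depth is the time of the last
    coalescence, that is, the total depth of the tree.
    """
    rng = random.Random(seed)
    active = list(range(n))  # lineages that have not yet coalesced
    next_id = n
    t = 0.0
    events = []
    while len(active) > 1:
        k = len(active)
        # Expected waiting time with k lineages (assumed diploid model).
        mean_wait = 4.0 * N / (k * (k - 1))
        t += rng.expovariate(1.0 / mean_wait)
        # Choose two lineages at random and merge them into one ancestor.
        a, b = rng.sample(active, 2)
        active.remove(a)
        active.remove(b)
        active.append(next_id)
        events.append((t, a, b, next_id))
        next_id += 1
    return events, t


if __name__ == "__main__":
    events, depth = simulate_coalescent(n=5, N=100, seed=1)
    for time, a, b, parent in events:
        print(f"t={time:8.1f}: lineages {a} and {b} -> node {parent}")
    print(f"tree depth: {depth:.1f} generations")
```

A sample of $n$ copies always produces $n - 1$ coalescence events. As a sanity check, summing the expected waiting times gives an expected tree depth of $4N(1 - 1/n)$ generations, which the average depth over many replicates should approach.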

A fundamental review reference on the coalescent, including the algorithm used here and a C implementation, is

R. R. Hudson, "Gene Genealogies and the Coalescent Process," Oxford Surveys in Evolutionary Biology, 7, 1990, pp. 1–44.


