# Coalescent Gene Genealogies

The coalescent describes the genealogical relations of the lineages ancestral to a sample of genes. The sample size, $n$, may vary from 1 to 50 gene copies; each tree shows a possible genealogy among these copies, with nodes representing the common ancestors of the sampled genes. The branch lengths are drawn at random according to the probability of coalescence in a population of $2N$ gene copies.

Contributed by: John Hawks (March 2011)

Open content licensed under CC BY-NC-SA

## Details

The coalescent is an algorithmic approach to simulating gene genealogies. It is also a way of looking at the history of genes in a population, which has given rise to considerable theoretical development in population genetics.

The basic idea is that instead of following the reproduction of genes forward in time, we trace the ancestry of a sample of genes backward in time. We assume a diploid Wright–Fisher population model with $N$ individuals, and hence $2N$ gene copies. In each generation, two genes have a probability $1/(2N)$ of sharing a single common ancestor and a probability $1 - 1/(2N)$ of having two distinct ancestors. The number of generations that have elapsed since their common ancestor is then approximately exponentially distributed with mean $2N$. With a sample of size $n = 2$, this Demonstration draws from this distribution.
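The pairwise case can be checked with a short simulation. The following is a minimal Python sketch, not the Demonstration's own code; the function name `pairwise_coalescence_time` and the parameter values are illustrative assumptions. It steps back one generation at a time, letting the two lineages coalesce with probability $1/(2N)$, and compares the average waiting time with the expected $2N$.

```python
import random

def pairwise_coalescence_time(N, rng):
    """Generations back until two gene copies share a common ancestor.

    Assumes a diploid Wright-Fisher population of N individuals (2N gene
    copies), so the two lineages coalesce with probability 1/(2N) in each
    generation. The function name is hypothetical, chosen for illustration.
    """
    p = 1.0 / (2 * N)
    generations = 1
    while rng.random() >= p:  # no coalescence this generation; step back again
        generations += 1
    return generations

def mean_pairwise_time(N, replicates, seed=1):
    """Average coalescence time over independent replicate pairs."""
    rng = random.Random(seed)
    total = sum(pairwise_coalescence_time(N, rng) for _ in range(replicates))
    return total / replicates

if __name__ == "__main__":
    N = 50
    # The printed average should be close to 2N = 100.
    print(mean_pairwise_time(N, replicates=20000))
```

Strictly, the waiting time simulated here is geometric; the exponential distribution with mean $2N$ is its continuous-time approximation, which is close when $N$ is large.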

If we extend the sample to $n$ gene copies, the expected time in generations until two of these will coalesce is given as

$$E[T_n] = \frac{4N}{n(n-1)},$$

again approximately exponentially distributed. Once this waiting time has elapsed, the algorithm chooses two of the lineages at random and combines them into a single ancestral lineage. Iterating this process across epochs, with one fewer lineage in each, until a single lineage remains yields a bifurcating tree. The node times and the total depth of the tree are one realization consistent with evolution by genetic drift alone in a finite population.
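The procedure described above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the Demonstration's code or Hudson's C implementation: the function name `simulate_coalescent`, the node-labelling scheme, and the event-list output format are all choices made here for the example. With $k$ lineages remaining, it draws an exponential waiting time with mean $4N/(k(k-1))$, then merges two lineages chosen at random.

```python
import random

def simulate_coalescent(n, N, rng):
    """Simulate one genealogy for n sampled gene copies.

    Returns (events, tmrca). Each event is a tuple
    (time, child_a, child_b, parent), with time measured in generations
    before the present. Leaves are labelled 0..n-1 and internal nodes
    n..2n-2 (a labelling convention assumed for this sketch).
    """
    lineages = list(range(n))
    next_label = n
    t = 0.0
    events = []
    while len(lineages) > 1:
        k = len(lineages)
        # Waiting time with k lineages: exponential with mean 4N / (k(k-1)).
        rate = k * (k - 1) / (4.0 * N)
        t += rng.expovariate(rate)
        # Choose two distinct lineages uniformly at random and merge them.
        a, b = rng.sample(lineages, 2)
        lineages.remove(a)
        lineages.remove(b)
        lineages.append(next_label)
        events.append((t, a, b, next_label))
        next_label += 1
    return events, t

if __name__ == "__main__":
    events, tmrca = simulate_coalescent(n=5, N=1000, rng=random.Random(0))
    for time, a, b, parent in events:
        print(f"t = {time:9.1f}: lineages {a} and {b} coalesce into node {parent}")
    print("tree depth:", round(tmrca, 1))
```

Summing the expected waiting times over $k = n, n-1, \ldots, 2$ gives an expected tree depth of $4N(1 - 1/n)$ generations, so averaging `tmrca` over many runs should approach that value.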

A fundamental review reference on the coalescent, including the algorithm used here and a C implementation, is

R. R. Hudson, "Gene Genealogies and the Coalescent Process," *Oxford Surveys in Evolutionary Biology*, 7, 1990 pp. 1–44.
