Preprint · not peer-reviewed Researchers Students

New statistical framework corrects for recurrent mutation in large-scale allele frequency analysis

A preprint posted on bioRxiv, the preprint server operated by Cold Spring Harbor Laboratory, introduces the single mutation frequency spectrum, a revised approach to analysing rare allele data that accounts for identical-by-state variants arising from recurrent mutation events.

Published · AI-drafted summary based on 1 public source

As whole-genome and whole-exome sequencing datasets grow to hundreds of thousands of samples, they routinely surface alleles at frequencies too low to have been captured in earlier studies. A preprint posted on bioRxiv argues that conventional site frequency spectrum (SFS) analysis breaks down at these very low frequencies, because some rare alleles that appear identical do not descend from a shared ancestral mutation. Instead, they arose independently at the same site, a phenomenon known as recurrent mutation.
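
For readers unfamiliar with the SFS, the following is a minimal Python sketch of how such a spectrum is typically tallied. It assumes a hypothetical 0/1 haplotype matrix in which rows are sampled haplotypes, columns are sites, and 1 marks the derived allele. The function name and toy data are illustrative only and are not taken from the preprint.

```python
import numpy as np

def site_frequency_spectrum(haplotypes):
    """Return an array where entry k is the number of sites at which
    the derived allele is carried by exactly k sampled haplotypes."""
    haplotypes = np.asarray(haplotypes, dtype=int)
    n_haplotypes = haplotypes.shape[0]
    # Derived-allele count per site (column sums).
    derived_counts = haplotypes.sum(axis=0)
    # Tally how many sites fall into each count class 0..n.
    return np.bincount(derived_counts, minlength=n_haplotypes + 1)

# Hypothetical toy data: 4 haplotypes (rows) by 4 sites (columns).
toy = np.array([
    [1, 0, 0, 1],
    [0, 0, 0, 1],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
])
sfs = site_frequency_spectrum(toy)
print(sfs)  # one monomorphic site, two singletons, one doubleton
```

Note that this tally sees only allelic state: a doubleton produced by one mutation and a doubleton produced by two independent mutations at the same site land in the same class.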

To address this, the authors define the single mutation frequency spectrum (SMFS), which restricts analysis to alleles that are identical by descent from a single mutational event rather than identical merely by state. Because the standard SFS is strongly dependent on underlying mutation rates when recurrent mutations are present, the authors argue that using the SFS without correction introduces systematic biases into downstream demographic inference and tests of selection.
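
The difference between counting by state and counting by mutational event can be illustrated with a toy tally. The sketch below is an assumption-laden illustration, not the preprint's estimator: it supposes that the origin of every mutation is known (as it would be in a simulation, but not in real sequence data), that independent mutations at a site sit on disjoint haplotypes, and that a per-event spectrum simply counts each mutational event separately. The function and variable names are hypothetical.

```python
from collections import Counter

def pooled_and_per_event_spectra(sites):
    """sites: one entry per site; each entry lists the number of carrier
    haplotypes for every independent mutation event at that site
    (assumed known, and assumed to affect disjoint haplotypes)."""
    pooled = Counter()      # identical by state: events at a site are merged
    per_event = Counter()   # identical by descent: each event counted alone
    for event_carrier_counts in sites:
        pooled[sum(event_carrier_counts)] += 1
        for carriers in event_carrier_counts:
            per_event[carriers] += 1
    return pooled, per_event

# Hypothetical toy input: the second site was hit by two independent
# mutations, each carried by a single haplotype.
toy_sites = [[1], [1, 1], [2], [1]]
pooled, per_event = pooled_and_per_event_spectra(toy_sites)
print(dict(pooled))     # the recurrent site is pooled into the doubleton class
print(dict(per_event))  # the same site contributes two singletons
```

In this toy case the pooled tally reports two doubletons, while only one doubleton traces back to a single mutation. That is the kind of distortion at the rare end of the spectrum that the article describes.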

The work is a methods contribution aimed at population geneticists who work with large biobank-scale datasets or ultra-rare variant catalogues. The preprint has not yet undergone peer review. If the framework is validated and adopted, it could improve the accuracy of demographic modelling, mutation rate estimation, and tests for natural selection applied to the rare end of the allele frequency spectrum.

Sources

Read the original reporting — these are the public sources this summary draws from.

  1. Primary source · Preprint · bioRxiv (Cold Spring Harbor Laboratory) · 2026-05-31
    Accounting for recurrent mutation in the frequency spectrum of rare alleles

Tags

site-frequency-spectrum recurrent-mutation population-genetics rare-alleles statistical-genetics demographic-inference

About Genetic Current

Educational summaries of public genetics news

Genetic Current is the news section of Evagene, an academic, research, and educational pedigree-modelling platform. Stories are AI-drafted summaries of items from trusted public sources, written for researchers, clinicians, educators, students, genealogists, and patients with an interest in genetics. Summaries are for educational and research purposes only and are not medical advice.
