Functional consequences of mutation
An educational reference on the functional consequences of mutation. The page covers loss of function, gain of function, dominant-negative effects, haploinsufficiency vs recessive inheritance, splice and regulatory variation, and the role of population allele frequency (gnomAD) in variant interpretation. Examples are drawn from the published literature and from canonical pedigree-textbook conditions.
Short version. A variant's molecular consequence is one question; its functional consequence is another. Loss-of-function (LoF) variants reduce or abolish protein output, and their phenotypic impact depends on whether one functional allele is enough: haploinsufficient genes produce dominant phenotypes from a single LoF variant, whereas non-haploinsufficient genes need both alleles knocked out, producing a recessive phenotype. Gain-of-function variants change protein behaviour rather than removing it. Dominant-negative variants poison a multimeric complex with a defective subunit. Splice-site and regulatory variants act on how the product is made (RNA processing and expression level) rather than on the protein product directly. Population allele frequency, taken from gnomAD, is a foundational variant-interpretation input.
Loss of function
Loss-of-function (LoF) variants reduce or abolish the protein product of the affected allele. The molecular categories are well defined: nonsense (premature termination codon), frameshift (insertion or deletion whose length is not a multiple of three), canonical splice-site (disrupting the GT donor or AG acceptor dinucleotide of an intron), large deletions removing an exon or the whole gene, and in some genes start-loss variants destroying the initiator methionine. Most transcripts carrying a premature termination codon are degraded by nonsense-mediated decay (NMD), which depends on the position of the premature stop relative to the last exon-exon junction, where the final exon-junction complex is deposited; transcripts whose stop falls in the last exon, or within roughly 50 nt upstream of the last exon-exon junction, tend to escape NMD and may produce truncated protein.
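The positional NMD rule described above can be sketched in a few lines. This is a minimal illustration, not a validated predictor: the function name, the transcript-coordinate convention, and the exact 50 nt window are assumptions made for the example, and real transcripts have well-documented exceptions to the rule.

```python
def escapes_nmd(ptc_pos: int, exon_ends: list, window_nt: int = 50) -> bool:
    """Apply the positional rule of thumb for NMD escape.

    ptc_pos   -- 1-based transcript position of the premature stop codon
    exon_ends -- 1-based transcript position of the last base of each exon,
                 in 5'-to-3' order (hypothetical input format)
    window_nt -- size of the escape window upstream of the last junction
    """
    if len(exon_ends) < 2:
        # Single-exon transcript: there is no exon-exon junction at all.
        return True
    last_junction = exon_ends[-2]  # last base of the penultimate exon
    if ptc_pos > last_junction:
        return True  # the stop lies in the final exon
    # The stop lies upstream of the last junction: it escapes only when close to it.
    return last_junction - ptc_pos <= window_nt


exons = [200, 500, 900]           # three exons ending at these positions
print(escapes_nmd(300, exons))    # 200 nt upstream of the last junction
print(escapes_nmd(470, exons))    # 30 nt upstream of the last junction
print(escapes_nmd(650, exons))    # inside the final exon
```

Under these assumptions the first call predicts NMD and the other two predict escape, which is why a nonsense variant near the 3′ end of a gene cannot be assumed to behave as a null allele.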
The systematic survey of LoF variants in apparently healthy individuals by MacArthur et al. 2012 (Science 335:823) showed that even healthy individuals carry roughly 100 LoF variants, of which approximately 20 are homozygous and inactivate both copies of a gene. This foundational observation overturned the assumption that any nonsense or frameshift variant is automatically pathogenic. The follow-up large-scale work, culminating in the gnomAD v2 paper (Karczewski et al. 2020, Nature 581:434), derived gene-level LoF intolerance metrics (LOEUF, the upper bound of the observed-to-expected ratio of LoF variants) that allow a researcher to ask whether a gene is depleted of LoF variation in the healthy population. Such depletion is a strong indicator that LoF is selected against and is therefore likely to cause disease when it occurs.
Haploinsufficiency vs recessive
Whether a single LoF allele produces a phenotype is a property of the gene. Genes with very tight dosage requirements (transcription factors at narrow concentration optima, structural proteins making fixed-stoichiometry complexes, secreted proteins where a 50% reduction crosses a clinical threshold) are haploinsufficient: one wild-type allele is not enough, and a single LoF variant produces a dominant phenotype. Most genes are not haploinsufficient: a single LoF reduces output by 50% without phenotypic consequence, and the disease state requires both alleles to be lost (autosomal recessive inheritance).
The conceptual framework for haploinsufficiency was articulated by Reiner Veitia in 2002 (BioEssays 24:175), building on earlier work by Sture Karlsson and others, and tied to the dosage-balance hypothesis: complexes assembled from multiple subunits in fixed ratios are vulnerable to changes in any one subunit's dose. The empirical mapping of haploinsufficient genes is now operationalised through gnomAD's pLI score (probability of LoF intolerance) and LOEUF metric, which rank genes by the strength of selection against LoF in the healthy population.
Gain of function
Gain-of-function (GoF) variants change protein behaviour in a way that produces a phenotype. The mechanisms are heterogeneous: constitutive activation of a signalling component (the canonical example is FGFR3 G380R in achondroplasia — a single recurrent missense substitution that produces ligand-independent receptor activation); novel function, where the mutant protein gains an activity its wild-type counterpart does not have (the polyglutamine-expanded huntingtin protein in Huntington disease is a paradigmatic example); increased expression or stability, where the same protein product accumulates to higher levels with phenotypic consequence; and change in localisation or partner specificity, where the mutant protein operates in the wrong cellular compartment or interacts with the wrong substrates.
Operationally, GoF variants are recognised by their tendency to recur at specific residues across unrelated families (a hotspot pattern that randomly distributed LoF variants would not produce), by their inheritance (almost always dominant), and by the failure of LoF variants in the same gene to produce the same phenotype. FGFR3 illustrates this clearly: G380R causes achondroplasia, K650E causes thanatophoric dysplasia type II, K650M causes SADDAN (severe achondroplasia with developmental delay and acanthosis nigricans), and loss of FGFR3 function causes a different syndrome (CATSHL syndrome: camptodactyly, tall stature, and hearing loss). This is a classic GoF/LoF dichotomy at the same locus.
Dominant-negative effects
A dominant-negative (DN) variant is one whose mutant protein interferes with the function of the wild-type protein produced by the other allele, typically by being incorporated into a multimeric complex and destabilising it, or by binding a substrate non-productively and sequestering it. Dominant-negative variants behave like loss of function on the surface (the gene's activity is reduced), but they are typically more severe than simple haploinsufficiency: the mutant subunit actively interferes rather than merely failing to contribute, so residual activity falls below the 50% left by a null allele.
The canonical examples come from collagens. Type I collagen is a triple helix of two pro-α1 chains (encoded by COL1A1) and one pro-α2 chain (encoded by COL1A2); a glycine-substitution missense variant in either chain is incorporated into the helix and destabilises it. If mutant and wild-type chains are made in equal amounts and assemble at random, a heterozygous COL1A1 variant leaves three-quarters of helices with at least one mutant chain, and a heterozygous COL1A2 variant leaves half. This is the molecular basis of osteogenesis imperfecta types II, III, and IV. Type IV collagen, the principal protein of basement membranes, behaves similarly in Alport syndrome: glycine-substitution variants in COL4A3, COL4A4, or COL4A5 destabilise the [α3.α4.α5] heterotrimer.
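The arithmetic behind the dominant-negative effect in a fixed-stoichiometry complex is short enough to write out. The sketch below assumes that mutant and wild-type chains are produced in equal amounts and are incorporated at random, which is a simplification; the function name is illustrative.

```python
def fraction_with_mutant(chains_from_gene: int, mutant_share: float = 0.5) -> float:
    """Fraction of complexes containing at least one mutant chain.

    chains_from_gene -- how many chains per complex the affected gene supplies
    mutant_share     -- share of that gene's chains that are mutant
                        (0.5 for a heterozygote with equal expression)
    """
    # Probability that every chain drawn from this gene is wild type.
    all_wild_type = (1 - mutant_share) ** chains_from_gene
    return 1 - all_wild_type


print(fraction_with_mutant(2))  # COL1A1 supplies two chains per helix -> 0.75
print(fraction_with_mutant(1))  # COL1A2 supplies one chain per helix  -> 0.5
```

For comparison, a null allele that simply removes half the chains leaves every assembled helix normal, which is why a glycine substitution can be more damaging than a premature stop in the same gene.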
The TGF-β pathway provides a second class of dominant-negative examples. Marfan syndrome is caused by missense variants in FBN1 (fibrillin-1) that have both a structural-microfibril effect and a TGF-β-pathway dysregulation effect. Loeys-Dietz syndrome (variants in TGFBR1, TGFBR2, SMAD3) shares pathway biology and produces a related connective-tissue phenotype with vascular involvement. Veitia's framework treats DN as the most severe of three dosage-related failure modes (haploinsufficiency, dominant-negative, and gain of function) of the same dosage-sensitive complex.
Splice mutations
Most exons are flanked by canonical splice-site sequences: the GT donor at the 5′ end of the intron, the AG acceptor at the 3′ end, the branch-point adenosine inside the intron, and a polypyrimidine tract just upstream of the acceptor. Variants disrupting any of these tend to abolish or shift splicing of the affected exon. Beyond the canonical sites, splicing depends on a host of exonic splicing enhancers (ESEs), exonic splicing silencers (ESSs), intronic splicing enhancers (ISEs), and intronic splicing silencers (ISSs) recognised by SR proteins and hnRNPs; missense or synonymous variants that disrupt these can have a splicing effect indistinguishable in consequence from a canonical splice-site variant.
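As a rough illustration of the canonical sites, the sketch below checks whether an intron carries the GT...AG dinucleotides and whether a variant position falls on one of them. It assumes plus-strand, 1-based inclusive coordinates and a hypothetical intron; it covers only the two essential dinucleotides, not the branch point, polypyrimidine tract, or enhancer and silencer elements.

```python
from typing import Optional


def has_canonical_dinucleotides(intron_seq: str) -> bool:
    """True if the intron starts with GT (donor) and ends with AG (acceptor)."""
    seq = intron_seq.upper()
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")


def canonical_site_hit(intron_start: int, intron_end: int, pos: int) -> Optional[str]:
    """Name the canonical site that a position falls in, or return None.

    Coordinates are 1-based and inclusive on the plus strand (an assumption
    of this example); the donor is the first two intronic bases and the
    acceptor is the last two.
    """
    if intron_start <= pos <= intron_start + 1:
        return "donor"
    if intron_end - 1 <= pos <= intron_end:
        return "acceptor"
    return None


print(has_canonical_dinucleotides("GTAAGTCCTTTTTCAG"))  # True
print(canonical_site_hit(101, 500, 102))                 # donor
print(canonical_site_hit(101, 500, 499))                 # acceptor
print(canonical_site_hit(101, 500, 300))                 # None
```

A position-only check like this deliberately returns None for everything outside the two dinucleotides, which is exactly the class of variant that needs a sequence-based predictor.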
Predicting splicing impact has been transformed by deep learning. SpliceAI (Jaganathan et al. 2019, Cell 176:535) is a deep residual neural network trained on pre-mRNA sequence with annotated splice junctions; it uses up to 10 kb of flanking sequence context to score a candidate variant for the probability that it creates or abolishes a nearby splice donor or acceptor. SpliceAI has been incorporated into ACMG/AMP supplementary frameworks (the ClinGen splicing-impact framework of Walker et al. 2023). The practical effect is that variants previously annotated only as missense or synonymous, or ignored as deep-intronic, are now systematically re-evaluated for splicing consequence.
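In practice a splicing predictor's output is usually reduced to a single number per variant. The sketch below parses a pipe-delimited annotation laid out the way SpliceAI VCF annotations are commonly described (allele, gene symbol, four delta scores for acceptor gain, acceptor loss, donor gain, donor loss, then four positions) and takes the maximum delta score. The field order should be checked against the tool's own documentation, the gene symbol is a placeholder, and the 0.5 cutoff is an illustrative parameter rather than a recommended threshold.

```python
def max_delta_score(annotation: str) -> float:
    """Return the largest of the four delta scores in one annotation string.

    Assumed layout: ALLELE|SYMBOL|DS_AG|DS_AL|DS_DG|DS_DL|DP_AG|DP_AL|DP_DG|DP_DL
    """
    fields = annotation.split("|")
    if len(fields) < 6:
        raise ValueError("annotation has fewer fields than expected")
    return max(float(value) for value in fields[2:6])


def flag_for_review(annotation: str, cutoff: float = 0.5) -> bool:
    """True if the variant's maximum delta score reaches the chosen cutoff."""
    return max_delta_score(annotation) >= cutoff


example = "A|GENE1|0.00|0.00|0.91|0.08|-28|-46|-2|-31"  # made-up values
print(max_delta_score(example))   # 0.91 (a predicted donor gain)
print(flag_for_review(example))   # True
```

Taking the maximum across the four scores treats gain and loss of either site type as equally worth a second look; a curation workflow may prefer to keep them separate.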
Regulatory variation
Most of the genome is non-coding and most common variation is regulatory. Variants in promoters, enhancers, silencers, and chromatin-architecture elements can change gene expression without changing protein sequence. The great majority of common-disease-associated variants identified in genome-wide association studies fall in regulatory rather than coding regions, a picture reinforced by the mapping of expression quantitative trait loci (eQTLs) by the GTEx consortium across roughly 50 human tissues; their effect is typically modest (a small fold-change in expression of one or a few nearby genes).
Rare regulatory variants have been harder to demonstrate as Mendelian causes, but a handful of cancer-associated promoter mutations are by now textbook. Horn et al. 2013 (Science 339:959) and Huang et al. 2013 (Science 339:957) independently identified two recurrent somatic substitutions in the TERT promoter (C228T and C250T) in melanoma; each substitution creates a novel ETS-family transcription-factor binding site that drives TERT expression and confers replicative immortality. The same TERT-promoter mutations are now recognised across glioblastoma, hepatocellular carcinoma, and bladder cancer, and are among the most frequent recurrent point mutations in human cancer.
Allele frequency in variant interpretation
A variant present in 2% of the healthy population cannot be highly penetrant for a serious early-onset Mendelian disorder: at that allele frequency roughly 4% of people would carry it, far more than the prevalence of any such disorder, and purifying selection would not let a strongly deleterious allele become that common. The corollary (that very rare variants in genes under strong selection are more likely to be pathogenic than common ones) underpins the use of population allele-frequency catalogues in variant interpretation.
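The frequency argument can be made concrete with a back-of-envelope calculation for a dominant disorder. The sketch below is a simplified version of the maximum-credible-allele-frequency reasoning used in variant interpretation: it ignores homozygotes and confidence intervals, and the prevalence, penetrance, and contribution values are illustrative assumptions rather than figures for any real condition.

```python
def implied_prevalence(allele_freq: float, penetrance: float) -> float:
    """Disease prevalence a dominant variant alone would imply.

    About 2 * allele_freq of people are heterozygous carriers (homozygotes
    are ignored), and a fraction `penetrance` of them are affected.
    """
    return 2 * allele_freq * penetrance


def max_credible_af_dominant(prevalence: float, penetrance: float,
                             allelic_contribution: float = 1.0) -> float:
    """Highest allele frequency compatible with a dominant disorder.

    allelic_contribution -- largest share of all cases that one variant
                            could plausibly explain (1.0 is the most permissive)
    """
    return prevalence * allelic_contribution / (2 * penetrance)


# A 2% allele with 90% penetrance would affect about 3.6% of the population.
print(implied_prevalence(0.02, 0.9))

# A disorder affecting 1 in 10,000 with 50% penetrance caps the allele
# frequency at about 1 in 10,000, some 200-fold below 2%.
print(max_credible_af_dominant(1e-4, 0.5))
```

Lower penetrance or genetic heterogeneity loosens or tightens the cap respectively, which is why thresholds are set per gene and per disorder rather than globally.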
The reference catalogue is gnomAD (the Genome Aggregation Database), which aggregates exome and genome sequencing data from hundreds of thousands of individuals across populations and reports allele frequencies for hundreds of millions of variants. Karczewski et al. 2020 (Nature 581:434), the gnomAD v2 paper, also reports gene-level constraint metrics: pLI (probability of LoF intolerance, ranging 0 to 1, with values > 0.9 indicating selection against LoF), LOEUF (upper bound of the observed/expected LoF ratio, with values < 0.35 indicating intolerance), and missense Z-scores. In the ACMG/AMP framework these metrics inform, but do not by themselves satisfy, the PVS1 (very strong pathogenic) criterion, which applies to null variants in genes where loss of function is an established disease mechanism; they also serve as supporting evidence for missense pathogenicity in genes where missense variation is depleted overall.
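The two LoF thresholds quoted above translate directly into a pair of flags. The sketch below uses made-up gene symbols and values, not figures from gnomAD, and the class and function names are illustrative; it reports the two metrics separately because they do not always agree.

```python
from dataclasses import dataclass


@dataclass
class GeneConstraint:
    symbol: str   # gene symbol (placeholders in this example)
    pli: float    # probability of LoF intolerance, 0 to 1
    loeuf: float  # upper bound of the observed/expected LoF ratio


def lof_constraint_flags(gene: GeneConstraint,
                         pli_cutoff: float = 0.9,
                         loeuf_cutoff: float = 0.35) -> dict:
    """Apply the pLI > 0.9 and LOEUF < 0.35 cutoffs to one gene."""
    return {
        "pli_intolerant": gene.pli > pli_cutoff,
        "loeuf_intolerant": gene.loeuf < loeuf_cutoff,
    }


genes = [
    GeneConstraint("GENE_A", pli=0.99, loeuf=0.12),  # hypothetical constrained gene
    GeneConstraint("GENE_B", pli=0.00, loeuf=1.10),  # hypothetical tolerant gene
]
for gene in genes:
    print(gene.symbol, lof_constraint_flags(gene))
```

A flag of this kind says that LoF variation is depleted in the reference population; whether LoF is the mechanism of a particular disease still has to be established from other evidence.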
The frequency-based filter rules out variants that are too common to be pathogenic given the prevalence and inheritance of the disorder under consideration. ClinGen variant-curation expert panels publish gene-specific frequency thresholds (typically of the order 0.01% to 0.1% for autosomal recessive carrier-state, lower for autosomal dominant) that calibrate the BS1 / BA1 (benign) ACMG/AMP criteria for each gene. The interpretive framework is covered on the mutation detection and interpretation page.
Where Evagene fits
Evagene is an educational, research, and academic pedigree modelling platform. The platform draws pedigrees and runs implementations of published family-history-based risk-model algorithms; it does not interpret variants, return ACMG/AMP classifications, or filter sequencing output by allele frequency. Where this page touches the platform, it is via the inheritance patterns produced by different functional categories of variant: dominant-negative collagen variants in osteogenesis imperfecta, gain-of-function FGFR3 in achondroplasia, loss-of-function in many of the conditions documented across our disease pages, and de novo mutation patterns visible in the corresponding pedigree examples.
Sources cited on this page
- MacArthur DG, et al. A systematic survey of loss-of-function variants in human protein-coding genes. Science 2012;335:823 — PMID 22344438.
- Karczewski KJ, et al. The mutational constraint spectrum quantified from variation in 141,456 humans. Nature 2020;581:434 — PMID 32461654 (gnomAD v2).
- Veitia RA. Exploring the etiology of haploinsufficiency. BioEssays 2002;24:175 — PMID 11835283.
- Jaganathan K, et al. Predicting splicing from primary sequence with deep learning. Cell 2019;176:535 — PMID 30661751 (SpliceAI).
- Horn S, et al. TERT promoter mutations in familial and sporadic melanoma. Science 2013;339:959 — PMID 23348503.
- Huang FW, et al. Highly recurrent TERT promoter mutations in human melanoma. Science 2013;339:957 — PMID 23348506.
- gnomAD — gnomad.broadinstitute.org (Broad Institute).
- OMIM entries: achondroplasia (100800); osteogenesis imperfecta type II (166210); Alport syndrome X-linked (301050).
Related reading
- Mutation biology and consequences (pillar)
- Types of mutation
- Mutation detection and interpretation
- Achondroplasia pedigree (FGFR3 G380R, gain of function)
- Osteogenesis imperfecta pedigree (collagen, dominant negative)
- Germline mosaicism calculator
- Mendelian inheritance calculator
- Hereditary cancer risk assessment