Heritability and the Liability-Threshold Model

Short version. Heritability h² is V_A / V_P: the proportion of phenotypic variance attributable to additive genetic variance. It is estimated from twin studies (Falconer's formula h² = 2(r_MZ − r_DZ)), from genotyped SNPs in unrelated individuals (GREML, Yang et al. 2010), or from GWAS summary statistics (LD-score regression, Bulik-Sullivan et al. 2015). For binary disease phenotypes, the liability-threshold model of Falconer 1965 maps a population's prevalence and the recurrence-risk pattern in relatives onto an underlying continuous liability; Carter's 1961 multifactorial-threshold framing adds sex-differential thresholds (the Carter effect, classically observed in pyloric stenosis). All framing is research and education; outputs of these computations on this site are illustrative.

Defining heritability

Heritability decomposes the variance of an observable phenotype into a genetic and an environmental part. The decomposition is V_P = V_G + V_E, where V_G = V_A + V_D + V_I separates additive, dominance, and epistatic-interaction variance components. Two heritability statistics are routinely used.

Narrow-sense heritability h² = V_A / V_P. The proportion of variance attributable to additive effects only. Determines parent-offspring resemblance and the response of the trait to artificial or natural selection.
Broad-sense heritability H² = V_G / V_P. The proportion of variance attributable to the total genetic contribution including dominance and epistasis. Estimated, in principle, from the correlation between monozygotic twins reared apart, who share all genetic effects but not a rearing environment.

By construction h² ≤ H². Both are population-specific quantities — heritability depends on the allele-frequency distribution and the environmental variance in the reference population, and is not a property of the trait or the gene set in isolation. Visscher, Hill & Wray 2008 (Nat Rev Genet 9:255) is the canonical review of heritability misconceptions and the careful interpretation of these statistics.
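The two definitions can be checked numerically. The following is a minimal sketch; the function name `heritabilities` and the variance components are hypothetical values chosen only for illustration, not estimates for any real trait.

```python
def heritabilities(v_a, v_d, v_i, v_e):
    """Return (h2, H2) from additive, dominance, epistatic and
    environmental variance components."""
    v_g = v_a + v_d + v_i          # total genetic variance V_G
    v_p = v_g + v_e                # phenotypic variance V_P = V_G + V_E
    if v_p <= 0:
        raise ValueError("phenotypic variance must be positive")
    return v_a / v_p, v_g / v_p    # narrow-sense h2, broad-sense H2


# Hypothetical variance components (arbitrary units).
h2, H2 = heritabilities(v_a=4.0, v_d=1.0, v_i=0.5, v_e=4.5)
print(f"h2 = {h2:.2f}, H2 = {H2:.2f}")
```

Because V_D and V_I cannot be negative, the second value returned is never smaller than the first, which is the h² ≤ H² inequality stated above.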

Twin-study estimation: Falconer's formula

The classical estimator of human heritability is Falconer's formula, derived from the contrast between monozygotic (MZ) and dizygotic (DZ) twin correlations:

h² = 2 · (r_MZ − r_DZ)

The intuition is simple. MZ twins share all of their segregating genetic variation; DZ twins share, on average, half of the additive genetic variance. Under the equal-environments assumption both kinds of pair share their common environment to the same degree, so r_MZ = h² + c² and r_DZ = ½h² + c², where c² is the shared-environment proportion of variance. The difference (r_MZ − r_DZ) therefore equals half of the additive genetic proportion, and doubling gives h².
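As a quick illustration, Falconer's formula can be written as a small function. This is a sketch only: the shared-environment (c²) and unique-environment (e²) terms assume the simple ACE decomposition with no dominance, and the twin correlations used are hypothetical.

```python
def falconer(r_mz, r_dz):
    """Falconer-style estimates from MZ and DZ twin correlations.

    Assumes r_MZ = h2 + c2 and r_DZ = h2/2 + c2 (ACE model, no dominance).
    """
    h2 = 2.0 * (r_mz - r_dz)   # additive genetic proportion
    c2 = 2.0 * r_dz - r_mz     # shared-environment proportion
    e2 = 1.0 - r_mz            # unique environment plus measurement error
    return h2, c2, e2


# Hypothetical twin correlations, for illustration only.
h2, c2, e2 = falconer(r_mz=0.8, r_dz=0.5)
print(f"h2 = {h2:.2f}, c2 = {c2:.2f}, e2 = {e2:.2f}")
```

A negative c² from this calculation (r_MZ more than twice r_DZ) is usually read as a hint of dominance or other non-additive effects, which is where the ADE family of models comes in.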

The equal-environments assumption is the standard caveat: MZ and DZ twin pairs are assumed to share environment to the same degree, which can be questioned for traits where MZ co-twins are treated more similarly than DZ co-twins (effects on adolescent behavioural traits are a recurring controversy). Adoption studies, multi-generational pedigree studies, and structural-equation modelling (the ACE / ADE / ACDE model families) extend the basic Falconer-formula machinery. Mayhew & Meyre 2017 review the modern landscape of heritability estimation in human disease genetics.

SNP heritability: GREML

Twin-study heritability is a property of the whole genome. SNP heritability is the heritability captured by a particular set of genotyped variants. Yang et al. 2010 (Nat Genet 42:565; PMID 20562875) introduced GREML — genomic-relatedness-matrix restricted maximum likelihood — which estimates the proportion of phenotypic variance explained by all genotyped SNPs simultaneously, in samples of unrelated individuals. The procedure builds a genomic relationship matrix (GRM) from genotyped SNPs and estimates the proportion of phenotypic variance attributable to the GRM by REML, implemented in the GCTA software package.
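To make the GRM step concrete, here is a minimal pure-Python sketch of a genomic relationship matrix built from standardised allele counts. It is not the GCTA implementation: the function name `genomic_relationship_matrix` and the toy genotypes are hypothetical, allele frequencies are taken from the sample itself, and the diagonal uses the same cross-product as the off-diagonal entries for simplicity. The REML step is not shown.

```python
from math import sqrt


def genomic_relationship_matrix(genotypes):
    """genotypes[i][j] is the allele count (0, 1 or 2) of person i at SNP j."""
    n = len(genotypes)
    m = len(genotypes[0])
    # Sample allele frequency at each SNP.
    freqs = [sum(person[j] for person in genotypes) / (2.0 * n) for j in range(m)]
    # Monomorphic SNPs carry no information and would divide by zero.
    usable = [j for j in range(m) if 0.0 < freqs[j] < 1.0]
    if not usable:
        raise ValueError("no polymorphic SNPs")
    # Standardise each genotype: (x - 2p) / sqrt(2p(1 - p)).
    z = [
        [(person[j] - 2.0 * freqs[j]) / sqrt(2.0 * freqs[j] * (1.0 - freqs[j]))
         for j in usable]
        for person in genotypes
    ]
    # A[i][k] is the average product of standardised genotypes across SNPs.
    return [
        [sum(a * b for a, b in zip(z[i], z[k])) / len(usable) for k in range(n)]
        for i in range(n)
    ]


# Toy data: 4 people x 4 SNPs (hypothetical).
GENOTYPES = [[0, 1, 2, 1], [1, 1, 0, 2], [2, 0, 1, 1], [1, 2, 1, 0]]
for row in genomic_relationship_matrix(GENOTYPES):
    print([round(value, 2) for value in row])
```

In a real analysis the matrix would span thousands of individuals and hundreds of thousands of SNPs, and the variance explained by it would be estimated by REML in dedicated software.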

Applied to height, GREML recovered SNP heritability of approximately 0.45 from common SNPs in unrelated Australian adults: substantial, but well below the twin-study heritability of ~0.8. Yang et al. 2011 (AJHG 88:76; PMID 21167468) described the GCTA software, and subsequent applications of the same approach to multiple complex diseases recovered SNP heritabilities consistent with the polygenic synthesis but well short of twin-study estimates. This shortfall is part of the missing-heritability gap.

LD-score regression

GREML requires individual-level genotype data. Bulik-Sullivan et al. 2015 (Nat Genet 47:291; PMID 25642630) introduced LD-score regression (LDSC), which estimates SNP heritability directly from GWAS summary statistics. The method exploits the observation that, under polygenic inheritance, a SNP's expected GWAS χ² statistic increases linearly with its LD score: the sum of its squared correlations (r²) with the SNPs in its neighbourhood, which measures how much nearby variation it tags. The slope of the regression of χ² on LD score is proportional to the SNP heritability (it equals N·h²/M for sample size N and M SNPs); the intercept separates polygenic signal from confounding such as population stratification, which inflates test statistics regardless of LD score.
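The regression idea can be sketched in a few lines. This is an unweighted ordinary-least-squares toy, not the LDSC software, which uses weighted regression and block-jackknife standard errors. It assumes the usual model E[χ²] = (N·h²/M)·ℓ + intercept for a SNP with LD score ℓ, and all numbers below are synthetic and noise-free; the function name `ldsc_ols` is hypothetical.

```python
def ldsc_ols(chi2, ld_scores, n_samples, n_snps):
    """Regress chi-square statistics on LD scores; return (h2, intercept)."""
    k = len(chi2)
    mean_l = sum(ld_scores) / k
    mean_c = sum(chi2) / k
    sxx = sum((l - mean_l) ** 2 for l in ld_scores)
    sxy = sum((l - mean_l) * (c - mean_c) for l, c in zip(ld_scores, chi2))
    slope = sxy / sxx
    intercept = mean_c - slope * mean_l
    # Under the assumed model the slope equals N * h2 / M.
    return slope * n_snps / n_samples, intercept


# Synthetic, noise-free example (hypothetical values).
N, M = 50_000, 1_000_000
TRUE_H2, TRUE_INTERCEPT = 0.4, 1.02
LD_SCORES = [10.0, 50.0, 100.0, 200.0, 400.0]
CHI2 = [TRUE_INTERCEPT + N * TRUE_H2 / M * l for l in LD_SCORES]

h2_hat, intercept_hat = ldsc_ols(CHI2, LD_SCORES, N, M)
print(f"h2 = {h2_hat:.3f}, intercept = {intercept_hat:.3f}")
```

An intercept above 1, as in this toy, is the pattern attributed to confounding rather than polygenicity, because it lifts every SNP's statistic by the same amount.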

LDSC is computationally cheap, scales to any GWAS with summary statistics, and underlies a large family of follow-on methods including stratified LDSC for partitioning heritability across functional categories, cross-trait LDSC for genetic correlations, and partitioning-by-cell-type analyses. The LD Hub repository hosts LDSC-derived heritability estimates and genetic correlations across hundreds of traits.

The missing heritability problem

Manolio et al. 2009 (Nature 461:747; PMID 19812666) framed the missing-heritability problem: even after the first decade of GWAS, the genome-wide significant variants for most complex traits accounted for only a small fraction of the trait's twin-study heritability. For height, ~40 GWAS-significant loci circa 2009 explained ~5% of phenotypic variance against a twin-study h² of ~0.8 — a gap of more than an order of magnitude.

The candidate explanations have largely been validated in the years since. Most of the gap reflects a long polygenic tail of variants below GWAS significance: when the contribution of all common SNPs is summed (GREML, LDSC), a large part of the missing heritability is recovered (for height, roughly 0.45 of the ~0.8 twin-study estimate). The remainder is divided across rare variants of moderate effect, structural variation poorly captured by SNP arrays, gene-environment interaction, gene-gene interaction (epistasis), parent-of-origin effects, and miscalibration of the twin-study heritability itself (the equal-environments assumption, assortative mating).

The contemporary consensus is that the gap is not "missing" but distributed across the genome at sub-significance and across genetic architectures the early GWAS designs were not powered to detect. The polygenic synthesis was right; the empirical signal-to-noise ratio per locus was lower than first hoped.

The liability-threshold model

For a binary disease phenotype, heritability is not defined directly on the observable affected/unaffected dichotomy — the variance of a 0/1 indicator is just K(1−K) and is not a useful target for genetic decomposition. Falconer 1965 (Ann Hum Genet 29:51) introduced the device of an underlying continuous liability — an unobserved Gaussian latent trait that determines who is affected and who is not. The population's liability is normally distributed with mean zero and unit variance; the affected fraction K is the area in the right tail above a threshold T such that Φ(T) = 1 − K, where Φ is the standard normal cumulative distribution.
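The threshold follows directly from the prevalence. A minimal sketch using Python's standard library; the function name `liability_threshold` and the prevalences looped over are illustrative choices.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def liability_threshold(prevalence):
    """Return T such that the upper tail above T has area equal to prevalence."""
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must lie strictly between 0 and 1")
    return STANDARD_NORMAL.inv_cdf(1.0 - prevalence)


# Illustrative prevalences: the rarer the condition, the higher the threshold.
for k in (0.1, 0.01, 0.005, 0.001):
    print(f"K = {k:<6} T = {liability_threshold(k):.3f}")
```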

Heritability is defined on the liability scale, h²_L, as the proportion of liability variance attributable to additive genetic variance. The recurrence risk to a relative R is the integral of the conditional liability distribution above T given the relative's expected genetic resemblance to the proband, a routine normal-distribution computation for any pair of related individuals once h²_L and K are known. Conversion between liability-scale and observed-scale heritability uses the standard Robertson formula, h²_obs = h²_L · z² / [K(1 − K)], where z = φ(T) is the height of the standard normal density at the threshold.
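The scale conversion can be sketched numerically. The code below assumes the usual form of the transformation, h²_obs = h²_L · z² / [K(1 − K)], taking z as the height of the standard normal density at T; the function names and the example values are hypothetical, and no ascertainment correction for case-control sampling is included.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()


def scale_factor(prevalence):
    """z^2 / (K (1 - K)), with z the normal density at the threshold."""
    t = STANDARD_NORMAL.inv_cdf(1.0 - prevalence)
    z = STANDARD_NORMAL.pdf(t)
    return z * z / (prevalence * (1.0 - prevalence))


def observed_scale_h2(h2_liability, prevalence):
    return h2_liability * scale_factor(prevalence)


def liability_scale_h2(h2_observed, prevalence):
    return h2_observed / scale_factor(prevalence)


# Hypothetical condition: K = 0.005, liability-scale heritability 0.6.
h2_obs = observed_scale_h2(0.6, 0.005)
print(f"observed-scale h2 = {h2_obs:.4f}")
```

For a rare condition the observed-scale value is far smaller than the liability-scale value, which is why the scale must always be stated when a disease heritability is reported.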

Recurrence risks to relatives: lambda_R

The relative recurrence risk λ_R is defined as P(affected | relative-of-class-R-affected) divided by K: λ₁ for first-degree relatives, λ₂ for second-degree, and so on. Under purely additive polygenic inheritance, λ_R falls toward 1 as the degree of relationship increases, with the precise pattern determined by h²_L and K. Departures from this pattern (e.g. λ₁ much higher than the polygenic prediction implies) suggest dominance variance, epistasis, shared environment, or major-locus contributions on top of the polygenic background, and so offer clues to a trait's genetic architecture. The original treatment of λ_R in human disease genetics is in Risch's 1990 paper sequence in AJHG.
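Under the liability-threshold model, λ_R can be computed by numerical integration. The sketch below assumes that proband and relative liabilities are bivariate normal with correlation equal to the coefficient of relationship times h²_L (additive effects only, no shared environment). The function names, the midpoint-rule integration and the parameter values are illustrative choices.

```python
from math import sqrt
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()


def recurrence_risk(prevalence, h2_liability, relatedness, steps=4000):
    """P(relative affected | proband affected) under a bivariate normal model."""
    t = STANDARD_NORMAL.inv_cdf(1.0 - prevalence)
    rho = relatedness * h2_liability        # liability correlation
    resid_sd = sqrt(1.0 - rho * rho)
    width = 10.0 / steps                    # integrate proband liability from T to T + 10
    joint = 0.0
    for i in range(steps):
        x = t + (i + 0.5) * width           # midpoint rule
        p_rel = 1.0 - STANDARD_NORMAL.cdf((t - rho * x) / resid_sd)
        joint += STANDARD_NORMAL.pdf(x) * p_rel * width
    return joint / prevalence


def relative_risk(prevalence, h2_liability, relatedness):
    """lambda_R: recurrence risk divided by population prevalence."""
    return recurrence_risk(prevalence, h2_liability, relatedness) / prevalence


# Hypothetical condition: K = 0.005, h2_L = 0.6.
for label, r in (("lambda_1", 0.5), ("lambda_2", 0.25), ("lambda_3", 0.125)):
    print(label, round(relative_risk(0.005, 0.6, r), 2))
```

Comparing observed λ values against a curve like this one is the kind of check that points to non-additive or shared-environment contributions when the fit is poor.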

Carter 1961 and sex-differential thresholds

Carter 1961 (Br Med Bull 17:251) studied congenital pyloric stenosis, a disorder with a marked sex bias (~5:1 male:female in mid-twentieth-century British studies). Carter observed that recurrence risk in relatives of an affected proband depended on the sex of the proband: relatives of an affected female had higher recurrence risk than relatives of an affected male, even after conditioning on the relative's own sex. The interpretation, set out in the multifactorial-threshold framework of Carter 1961, is that both sexes share one liability distribution but females must cross a higher threshold to be affected, which is why female prevalence is lower. An affected female therefore carries on average more genetic load than an affected male, and her relatives inherit a correspondingly larger share of that load.
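The direction of the Carter effect can be reproduced with a short calculation. This sketch uses a simple mean-shift approximation (the relative's liability is treated as normal with unit variance, shifted by relatedness × h²_L × the mean liability of affected probands) rather than an exact integral. The sex-specific prevalences, the heritability and the function names are hypothetical values chosen to mimic a 5:1 male excess; they are not estimates for pyloric stenosis.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()


def mean_liability_of_affected(prevalence):
    """Mean liability among affected individuals: density at T divided by K."""
    t = STANDARD_NORMAL.inv_cdf(1.0 - prevalence)
    return STANDARD_NORMAL.pdf(t) / prevalence


def approx_risk(k_proband_sex, k_relative_sex, h2_liability, relatedness=0.5):
    """Approximate risk to a relative, with sex-specific thresholds."""
    shift = relatedness * h2_liability * mean_liability_of_affected(k_proband_sex)
    t_relative = STANDARD_NORMAL.inv_cdf(1.0 - k_relative_sex)
    return 1.0 - STANDARD_NORMAL.cdf(t_relative - shift)


# Hypothetical parameters: males five times more often affected than females.
K_MALE, K_FEMALE, H2_L = 0.005, 0.001, 0.6

brother_of_male = approx_risk(K_MALE, K_MALE, H2_L)
brother_of_female = approx_risk(K_FEMALE, K_MALE, H2_L)
sister_of_male = approx_risk(K_MALE, K_FEMALE, H2_L)
sister_of_female = approx_risk(K_FEMALE, K_FEMALE, H2_L)

print(f"brother of affected male:   {brother_of_male:.4f}")
print(f"brother of affected female: {brother_of_female:.4f}")
print(f"sister of affected male:    {sister_of_male:.4f}")
print(f"sister of affected female:  {sister_of_female:.4f}")
```

Holding the relative's sex fixed, the risk is higher when the proband is of the less frequently affected sex, which is the pattern Carter described.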

The Carter effect is now the canonical illustration of sex-differential thresholds in multifactorial inheritance and is observed in cleft lip with or without cleft palate, in autism (with the well-known male:female prevalence ratio and the corresponding "female protective effect"), in some forms of congenital heart defect, and in several other multifactorial disorders. The phenomenon is reproduced in Evagene's complex-disease pedigree software as one of the four classical counselling modifiers applied on top of the empirical recurrence-risk table or the Falconer-1965 fallback.

Putting it together: a worked example

Consider a hypothetical multifactorial condition with population prevalence K = 0.005 and liability-scale heritability h²_L = 0.6. Under the additive liability model:

The threshold T satisfies Φ(T) = 0.995, so T ≈ 2.576 standard deviations above the population mean.
A first-degree relative of an affected proband shares 0.5 of additive genetic variance and inherits, in expectation, half the proband's genetic liability above the population mean. Their conditional liability distribution is treated as approximately normal, shifted right by an amount proportional to h²_L and the proband's expected liability.
Integrating the shifted conditional distribution above T gives the recurrence risk to a first-degree relative: for the parameters above, close to 4%, or about seven to nine times K depending on the approximation used.
If the proband is an affected female in a sex-biased disorder, the conditional liability distribution shifts further right (because she crossed a higher threshold), and the implied recurrence risk in her relatives is larger than for an affected male's relatives at the same overall prevalence.
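The steps above can be traced numerically for the sex-neutral case. This is a sketch under stated assumptions: the relative's liability is approximated as normal, with its mean shifted by relatedness × h²_L × the mean liability of affected probands and its variance slightly reduced because probands are a selected group. The function and variable names are illustrative and this is not Evagene's implementation.

```python
from math import sqrt
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()


def worked_example(prevalence=0.005, h2_liability=0.6, relatedness=0.5):
    # Step 1: threshold T with upper-tail area K.
    t = STANDARD_NORMAL.inv_cdf(1.0 - prevalence)
    # Mean liability of affected probands (truncated-normal mean).
    proband_mean = STANDARD_NORMAL.pdf(t) / prevalence
    # Step 2: shift of the relative's liability distribution.
    rho = relatedness * h2_liability
    relative_mean = rho * proband_mean
    # Variance shrinks slightly because probands are selected above T.
    relative_var = 1.0 - rho * rho * proband_mean * (proband_mean - t)
    # Step 3: area of the shifted distribution above T.
    risk = 1.0 - STANDARD_NORMAL.cdf((t - relative_mean) / sqrt(relative_var))
    return {
        "threshold": t,
        "relative_mean": relative_mean,
        "risk": risk,
        "lambda_1": risk / prevalence,
    }


for name, value in worked_example().items():
    print(f"{name:>14}: {value:.4f}")
```

Dropping the variance adjustment (setting the relative's variance to 1) gives a somewhat higher risk, so the two versions bracket the range quoted for this example.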

Evagene's complex-disease engine implements the Falconer-1965 fallback for any catalogued condition where prevalence and liability-scale heritability are recorded, with the four classical counselling modifiers (severity grade, sex bias by Carter, multiple affected relatives, parental consanguinity) applied on top. The Smith / Carter / Harper empirical recurrence-risk tables take precedence where they exist; Falconer's calculation is the fallback. As stated throughout this site, all such outputs are research and education only.

Heritability is not destiny — the standard caveats

Several caveats deserve attention because they are routinely misunderstood in the secondary literature:

Heritability is a population statistic, not a personal one. A height heritability of 0.8 does not mean that 80% of an individual's height is "genetic" — it means that 80% of the population's variance in height is attributable to additive genetic variance.
Heritability does not measure malleability. A high heritability is consistent with strong environmental modifiability of the trait; the classical example is phenylketonuria, which has very high heritability and is also fully manageable by dietary intervention.
Heritability is environment-conditional. Reduce environmental variance and heritability rises; introduce a new environmental factor and heritability falls. Twentieth-century changes in childhood nutrition are estimated to have reshaped the heritability of adult height across decades.
Liability-threshold heritability is conventional. Reporting h² on the liability scale rather than the observed scale matters; the conversion depends on prevalence K, and observed-scale heritabilities can be very small for rare diseases even when liability-scale h² is high.

None of this content is clinical recommendation; on this site and on this platform, heritability and liability-threshold computations are research and education materials.
