CRBC News

Scientists Find 'Mutation Hotspots' at Gene Start Sites — A Hidden Source of Genetic Change

Researchers discovered mutation hotspots at transcription start sites (TSSs), where RNA polymerase opens DNA to begin transcription. Extremely rare variant (ERV) data across nearly 15,000 genes in more than 220,000 people showed a strong hotspot, which was missing from de novo mutation (DNM) studies but reappeared in 11 mosaic-mutation datasets. The authors conclude that early embryonic (mosaic) mutations cluster at TSSs and recommend re-examining filtered calls and using mosaic-aware analysis to improve genetic-disease studies.

Key finding: Researchers have identified concentrated mutation hotspots at transcription start sites (TSSs) — the locations where RNA polymerase opens DNA to begin copying genes — and traced why these hotspots appear in some datasets but not others.

What the researchers did

Using very large human-genome datasets, the team analyzed extremely rare variants (ERVs) across nearly 15,000 genes from more than 220,000 individuals. They also examined data from 10 trio studies (parent–child trios used to detect de novo mutations, DNMs) and reviewed 11 published mosaic-mutation datasets. The study was led by geneticist Donate Weghorn of the Centre for Genomic Regulation.

What they found

ERV data revealed a strong, consistent mutation hotspot centered on transcription start sites. Surprisingly, this signature was absent from the DNM datasets. The discrepancy was resolved when the team inspected mosaic-mutation data: the same hotspot reappeared, concentrated in early embryonic (mosaic) mutations.

Early embryonic mutations cluster at transcription start sites but are often patchy and can resemble sequencing noise, so standard DNM pipelines frequently filter them out.

Proposed mechanism

The beginning of a gene is a busy and fragile region where RNA polymerase frequently pauses and briefly unwinds DNA. Those pauses increase exposure to damage or allow the transcription machinery to misfire. When damage is imperfectly repaired, it can leave permanent changes — the observed mutation hotspots.

Why this matters

This finding helps explain why some analyses of new mutations miss biologically important hotspots and suggests ways to improve mutation-calling pipelines. Because an estimated 300 million people worldwide live with rare genetic disorders, more accurate detection of these early embryonic and inherited mutations could refine models for genetic disease research and diagnostics.

Recommendations

The authors recommend re-evaluating filtered variant calls near transcription starts, searching for co-occurrence patterns that indicate mosaicism, and incorporating mosaic-aware approaches into DNM analyses to avoid systematic blind spots.
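
To make the first recommendation concrete, here is a minimal Python sketch of how filtered-out calls near transcription starts might be pulled back for a second look. It is an illustration only, not the authors' pipeline: the function name, the (chromosome, position, passed_filter) tuple layout, and the 100-base-pair window are all assumptions made for this example.

```python
from bisect import bisect_left


def filtered_calls_near_tss(calls, tss_positions, window=100):
    """Return filtered-out calls that lie within `window` bp of a TSS.

    calls: iterable of (chrom, pos, passed_filter) tuples (hypothetical layout).
    tss_positions: dict mapping chrom -> sorted list of TSS coordinates.
    window: distance threshold in base pairs (an assumed value, not from the study).
    """
    flagged = []
    for chrom, pos, passed in calls:
        if passed:
            continue  # only calls that the pipeline discarded are of interest
        sites = tss_positions.get(chrom, [])
        i = bisect_left(sites, pos)
        # The nearest TSS sits just left or just right of the insertion point.
        neighbours = sites[max(i - 1, 0):i + 1]
        if any(abs(pos - site) <= window for site in neighbours):
            flagged.append((chrom, pos))
    return flagged


example_calls = [
    ("chr1", 1050, False),  # discarded, 50 bp from a TSS
    ("chr1", 1050, True),   # kept by the pipeline, so skipped here
    ("chr1", 3000, False),  # discarded, but far from any TSS
    ("chr2", 990, False),   # discarded, 10 bp from a TSS
]
example_tss = {"chr1": [1000, 5000], "chr2": [1000]}
flagged_calls = filtered_calls_near_tss(example_calls, example_tss)
```

A list like flagged_calls would only be a starting point; deciding whether a flagged call is a genuine mosaic mutation or sequencing noise would still need the co-occurrence and mosaic-aware checks the authors describe.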

Publication: The work appears in Nature Communications.
