The OpenAI-led report presents case studies showing GPT-5 assisting research in physics, math and biomedical labs: it reproduced known black-hole calculations, sped up fusion modeling, and helped interpret immune-cell data that matched lab-confirmed results. With expert guidance, GPT-5 also aided several verified, modest mathematical discoveries, including progress on a 1992 Erdős problem. The authors stress that while GPT-5 accelerates coding and literature synthesis, human oversight is essential because the model can hallucinate or misattribute references.
How GPT-5 Is Accelerating Breakthroughs in Math, Physics and Cancer Research

A new report from OpenAI and a team of independent scientists documents how GPT-5, the company's latest large language model, is being used as a research assistant across disciplines, from black-hole physics and nuclear fusion to cancer immunology and mathematical puzzles.
The paper is organized as a set of case studies. In each, a researcher stuck on a problem or seeking to verify results asked GPT-5 for help. Sometimes the model made mistakes; sometimes it suggested faster ways to reach known conclusions; and sometimes, under careful human guidance, it helped produce new and verifiable findings.
Physics and fusion
In one experiment about how waves behave around black holes, GPT-5 worked through the required mathematics and independently produced results that aligned with previously established conclusions, demonstrating the model's ability to carry out complex scientific calculations.
In a separate project focused on nuclear fusion, the model generated a computational approach that accelerated parts of the research workflow. As Floor Broekgaarden, an astronomer at the University of California, San Diego (not involved in the study), observes, GPT-5 can dramatically reduce the time required for coding—compressing tasks that traditionally take days into minutes.
Biomedical applications
Researchers studying immune cells used GPT-5 to interpret experimental data, and the model's interpretation matched outcomes already validated by the lab. Derya Unutmaz, the physician leading that project, wrote that "GPT-5 Pro can function as a true mechanistic co-investigator in biomedical research, compressing months of reasoning into minutes, uncovering non-obvious hypotheses, and directly shaping experimentally testable strategies." The authors, however, emphasize that such results require careful experimental follow-up and human oversight.
Mathematical discoveries
Guided by human experts, GPT-5 helped produce several modest but genuine mathematical results. The model contributed to solving a long-standing problem posed in 1992 by Paul Erdős, derived a clearer statement about limits on decision procedures in computation, found a rule for the appearance of certain small patterns in branching diagrams, and identified a method to detect hidden structures as a network grows. Each result was checked and confirmed by human mathematicians before being reported.
Strengths and limits
One of GPT-5's notable strengths is its ability to search and synthesize vast scientific literature. For example, when presented with a math problem listed online as unsolved, the model located a solution in a paper from the 1980s; in another case it found decisive lines in a German paper from the 1960s, bridging language and stylistic gaps.
But the authors are careful not to portray GPT-5 as a replacement for researchers. The model can be confidently wrong, hallucinate nonexistent papers, or misattribute citations. As Broekgaarden puts it, human expertise remains crucial—AI can take on time-consuming tasks like collating data, summarizing articles, and performing complex calculations, but human judgment is essential for validation, interpretation and experimental design.
What comes next
The report reads more like a set of promising case studies than a conventional reproducible research paper: some critics note it lacks full experimental detail and counterfactual comparisons. Nevertheless, the examples indicate a sharp advance in capability over the past year and illustrate how AI can help researchers combine prior results, generate hypotheses and accelerate repeatable parts of the scientific process. As models evolve rapidly, their future contributions to discovery are likely to grow, provided they remain paired with rigorous human oversight.
