CRBC News

18-Year-Old Uncovers 1.5 Million Previously Uncataloged Space Sources in NASA Archive

Matteo Paz (center) at Regeneron Science Talent Search holding up prize - Society for Science / YouTube

An 18-year-old high school student, Matteo Paz, used a machine-learning pipeline to mine the NEOWISE archive and identified about 1.5 million previously uncataloged variable infrared sources. The pipeline, developed in six weeks as part of Caltech's Planet Finder Academy, scanned nearly 200 billion detection rows. Paz's results were peer-reviewed and published in 2024 in The Astronomical Journal, and he received a $250,000 Regeneron prize. The project shows how modern AI and archival mining can extend the scientific return of past missions and guide future follow-up observations.

An 18-year-old high school student from California, Matteo Paz, turned the archive of a retired NASA mission into a major scientific result: his machine-learning analysis of NEOWISE data flagged roughly 1.5 million previously uncataloged variable infrared sources. His work was peer-reviewed and published in 2024 in The Astronomical Journal, and Paz was awarded a $250,000 prize in the Regeneron Science Talent Search.

From Classroom Project to Published Science

The discovery began as a project for the Caltech Planet Finder Academy, a program that gives students hands‑on experience with real astronomy problems. Rather than examine a small sample, Paz and his mentors accessed the full NEOWISE detection table — an archive that scientists describe as approaching 200 billion recorded rows of observations collected over more than a decade.

How the Pipeline Worked

Over six weeks, Paz built and trained a machine‑learning pipeline to scan the massive database automatically. The pipeline was designed to detect faint, variable light sources in the infrared — subtle flickers, pulses or fading signals that are easy to miss by eye. These signatures can indicate objects such as binary stars, quasars, or other astrophysical sources that were not previously cataloged in the archive. The team refined the software to classify periodic behaviors, reduce false positives, and scale the method to the entire dataset.
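The article does not describe the pipeline's internals, so the following is only a minimal sketch of one classical way to flag a "flickering" source, not Paz's method. It assumes each source has a light curve of magnitudes with per-point uncertainties and tests whether the scatter is larger than measurement noise alone would explain (a reduced chi-square test against a constant-brightness model). The function names, the threshold of 3.0, and the simulated light curves are illustrative assumptions.

```python
import numpy as np


def reduced_chi2(mag, mag_err):
    """Reduced chi-square of a light curve against a constant-brightness model."""
    mag = np.asarray(mag, dtype=float)
    mag_err = np.asarray(mag_err, dtype=float)
    weights = 1.0 / mag_err**2
    # Inverse-variance weighted mean is the best-fit constant brightness.
    mean = np.sum(weights * mag) / np.sum(weights)
    chi2 = np.sum(((mag - mean) / mag_err) ** 2)
    return chi2 / (mag.size - 1)


def is_candidate_variable(mag, mag_err, threshold=3.0):
    """Flag a source whose scatter clearly exceeds its measurement noise.

    The threshold is a hypothetical choice for illustration only.
    """
    return reduced_chi2(mag, mag_err) > threshold


# Simulated data (not NEOWISE measurements): one steady and one pulsing source.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 10.0, 200)
err = np.full(t.size, 0.05)
steady = 12.0 + rng.normal(0.0, 0.05, t.size)
pulsing = 12.0 + 0.5 * np.sin(2 * np.pi * t / 1.7) + rng.normal(0.0, 0.05, t.size)

print("steady flagged: ", is_candidate_variable(steady, err))
print("pulsing flagged:", is_candidate_variable(pulsing, err))
```

A statistic like this only says that something varies. Classifying periodic behavior and rejecting false positives, as the article describes, would need further steps such as period searches and a trained classifier.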

A ground-based telescope pointed at Comet NEOWISE. - Nailzchap/Getty Images

"We were creeping up towards 200 billion rows in the table of every single detection that we had made over the course of over a decade," said Davy Kirkpatrick, an IPAC senior scientist and one of Paz's mentors.

Results and Next Steps

The pipeline produced a catalog of roughly 1.5 million candidate variable sources. Each entry remains a candidate until follow-up observations and community verification determine its nature (for example, whether it is a previously unknown minor planet, a variable star, a quasar, or an unrelated artifact).

The project highlights how modern data science methods can extract new science from archival telescope records. The same approach could be applied to other rich datasets, such as Kepler's exoplanet archives or future surveys from the Nancy Grace Roman Space Telescope. Similar machine‑learning and AI techniques are also being incorporated into operations for large observatories, helping astronomers find faint or complex signals that are otherwise difficult to detect.

Why This Matters

Paz's work is a clear example of the scientific value locked in archival data and shows that small teams, and even students, can make significant discoveries with modern algorithms. The catalog of candidate sources opens new avenues for follow-up study and illustrates the power of combining citizen and student engagement with professional astronomical pipelines.
