Thoughts on Healthcare Markets & Technology

A Reasoning Model Reread 376 Unsolved Rare Disease Genomes at Boston Children’s & Surfaced 18 New Diagnoses: What the 4.8% Yield Means, and Why the Real Prize Is Reanalysis as Standing Infrastructure

Jun 22, 2026

Table of Contents

  1. Why a closed genome is really an open file

  2. What Boston Children’s actually ran

  3. How to read a 4.8 percent yield without kidding yourself

  4. The cases where the model ignored instructions

  5. The vitiligo hypothesis, where research tool and diagnostic part ways

  6. The part nobody is pricing, reanalysis as a standing service

  7. The consumer-AI question the editorial won’t say loudly

  8. Who actually captures this, and the bear case

  9. What to watch next

Abstract

What happened: Boston Children’s Manton Center, Harvard, and OpenAI pointed the o3 Deep Research model at 376 already-unsolved rare disease cases. After expert review and CLIA confirmation, physicians landed 18 new diagnoses. Published June 18, 2026 in NEJM AI.

The headline number: 4.8% additional yield on cases that had already been worked to a dead end.

Cohort spread (cases / new diagnoses / yield): neurodevelopmental 100 / 10 / 10.0%; neuromuscular 61 / 4 / 6.6%; early psychosis 15 / 2 / 13.3% (tiny n); sudden unexpected death in pediatrics 200 / 2 / 1.0%.
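For readers who want to check the arithmetic, here is a minimal sketch that recomputes each cohort's yield and the pooled figure from the counts quoted above. The Wilson score interval is my addition, not something the abstract reports; it is included only to show how little a 15-case cohort can support on its own. The function and variable names are illustrative.

```python
from math import sqrt

# Counts as quoted in the abstract: cohort -> (cases, new diagnoses).
COHORTS = {
    "neurodevelopmental": (100, 10),
    "neuromuscular": (61, 4),
    "early psychosis": (15, 2),
    "sudden unexpected death in pediatrics": (200, 2),
}


def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a binomial proportion.

    This interval is an assumption of this sketch, not a figure from the study.
    """
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


def yield_table(cohorts):
    """Return rows of (name, cases, diagnoses, yield, ci_low, ci_high)."""
    rows = []
    for name, (n, dx) in cohorts.items():
        lo, hi = wilson_interval(dx, n)
        rows.append((name, n, dx, dx / n, lo, hi))
    # Pooled row: sum cases and diagnoses across all cohorts.
    total_n = sum(n for n, _ in cohorts.values())
    total_dx = sum(dx for _, dx in cohorts.values())
    lo, hi = wilson_interval(total_dx, total_n)
    rows.append(("all cohorts", total_n, total_dx, total_dx / total_n, lo, hi))
    return rows


if __name__ == "__main__":
    for name, n, dx, rate, lo, hi in yield_table(COHORTS):
        print(f"{name:<40} {dx:>3}/{n:<4} {rate:6.1%}  (95% CI {lo:.1%} to {hi:.1%})")
```

The pooled row reproduces the 18 of 376 headline. The early psychosis interval comes out far wider than the others, which is the practical meaning of the "tiny n" caveat.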

The catch: 7 of 18 were rediscoveries, answers that existed elsewhere (some already pathogenic in public databases) but weren’t in the chart the team had.

The setup: per-case packet of HPO terms, metadata, clinician notes, filtered variant table, mostly trios. Model was told to argue, not rank. Humans graded under ACMG/AMP.

The thesis of this piece: the diagnoses are the press release. The durable asset is reanalysis as continuous infrastructure, and the open question is who owns the workflow when the model itself is rentable by everyone.

Why a closed genome is really an open file

Start with the uncomfortable baseline. Even after full sequencing and a specialist workup, roughly half of people with a rare disease never get a clean genetic answer. The test comes back, the answer is some flavor of inconclusive, and the family goes home with a maybe. The instinct is to treat that as a permanent verdict. It is not. The genome is fixed, but everything around it moves. New gene-disease links get published, labs reclassify old variants, case reports pile up, and a locus that meant nothing in 2019 can mean something specific by Tuesday. A negative result, in other words, has a shelf life, and almost nobody is tracking the expiration date.
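The shelf-life idea can be made concrete with a toy staleness check. This is a sketch under loud assumptions: the record types, field names, gene labels, and dates are all hypothetical, and a real pipeline would match on variants, phenotypes, and classification changes, not only gene symbols. It shows just the core comparison, whether anything relevant was published after a case was last looked at.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class UnsolvedCase:
    # Hypothetical record for a sequenced but undiagnosed patient.
    case_id: str
    candidate_genes: frozenset
    last_analyzed: date


@dataclass(frozen=True)
class KnowledgeUpdate:
    # Hypothetical entry from a feed of new gene-disease links or reclassifications.
    gene: str
    published: date
    note: str


def stale_cases(cases, updates):
    """Map case_id -> updates published after that case's last analysis."""
    flagged = {}
    for case in cases:
        hits = [
            u for u in updates
            if u.gene in case.candidate_genes and u.published > case.last_analyzed
        ]
        if hits:
            flagged[case.case_id] = hits
    return flagged


# Made-up example data; GENE_A and GENE_B are placeholders, not real loci.
cases = [
    UnsolvedCase("case-001", frozenset({"GENE_A"}), date(2019, 5, 1)),
    UnsolvedCase("case-002", frozenset({"GENE_B"}), date(2025, 1, 15)),
]
updates = [
    KnowledgeUpdate("GENE_A", date(2024, 3, 10), "new gene-disease link"),
    KnowledgeUpdate("GENE_B", date(2021, 7, 4), "variant reclassified"),
]

if __name__ == "__main__":
    for case_id, hits in stale_cases(cases, updates).items():
        print(case_id, [h.note for h in hits])
```

Only the first case is flagged here, because its last analysis predates the relevant update. The second was analyzed after its update appeared, so its negative result has not yet expired by this crude test.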

That reframes the whole problem. Reanalysis is usually described as a scientific challenge, and partly it is, but mostly it is a maintenance problem wearing a lab coat. Each institution that sequences kids inherits a growing pile of genomes that drift out of sync with a knowledge base that updates daily. Keeping that pile current is grunt work at a scale no human department can staff. An Australian audit of every diagnostic lab in the country found about twenty-five thousand new genomic tests between 2018 and 2021 against only nine hundred fifty reanalyses, almost all of them triggered one at a time by a clinician asking nicely. The professionals surveyed named the bottleneck plainly, and it was not science. It was workforce capacity, plus the small detail that there were no national guidelines telling anyone when or how to re-run an old case.

The stakes are not academic. The diagnostic odyssey runs five to eight years on average, with something like seventeen clinical encounters before anyone names the thing. The collective annual burden of rare disease in the United States has been pegged at nearly a trillion dollars, with more than half of that absorbed by families and society rather than the health system. The avoidable share attributable purely to diagnostic delay has been estimated at roughly eighty-six thousand to five hundred seventeen thousand dollars per patient, depending on the condition. So the backlog of stale, unsolved genomes is not a rounding error or an edge case. It is the steady state of the field, and it has a price tag.
