AI discovers 18 diagnoses of rare genetic diseases in children

TL;DR: Researchers used an OpenAI reasoning model to help diagnose rare genetic diseases in children, identifying 18 new diagnoses in previously unsolved cases. The approach could shorten the path to diagnosis and ease the burden on families.

What happened?

A team of researchers from Children's National Hospital and the University of Colorado has employed an OpenAI reasoning model (possibly an advanced version of GPT-4 or a specific model like o1) to help diagnose rare genetic diseases in children. According to an article published on OpenAI's official blog on September 19, 2024, the system analyzed complex genomic and clinical data from pediatric patients with undiagnosed conditions, successfully identifying 18 new diagnoses in previously unsolved cases. These diagnoses include disorders such as KBG syndrome, Langer-type skeletal dysplasia, and other ultra-rare conditions. The study, led by Dr. Matthew Might and Dr. Rachel Eastwood, used the reasoning model to interpret variants of uncertain significance (VUS) and link them to clinical phenotypes, a process that traditionally requires months of manual work by experienced geneticists.

The model not only identified mutations but also generated explanatory reports detailing the evidence for and against each candidate diagnosis, allowing physicians to assess its plausibility. This approach contrasts with previous AI methods in genomics, which often function as black boxes. The system was tested on a cohort of 100 undiagnosed patients and produced 18 new diagnoses, an 18% diagnostic yield comparable to that of top human teams at reference centers. Results were validated through independent laboratory tests, such as Sanger sequencing and functional studies.

Why is this important?

Rare genetic diseases affect approximately 300 million people worldwide, according to the World Health Organization. In the United States, an estimated 30% of children with rare diseases die before age 5, and the average diagnosis takes 5 to 7 years. This period, known as the 'diagnostic odyssey', involves consultations with multiple specialists, costly tests, and an immense emotional and financial burden on families. AI's ability to analyze large volumes of data (whole exomes, genomes, electronic health records) and find patterns humans might miss promises to shorten this process from years to weeks or even days.

This advance is particularly relevant because it addresses the bottleneck in variant interpretation. Currently, clinical genomics labs generate hundreds of VUS per patient, and only a small percentage are classified as pathogenic. OpenAI's reasoning model, by integrating knowledge from databases like ClinVar, OMIM, and scientific literature, can prioritize variants more likely to be causal. Additionally, the system provides understandable explanations for clinicians, facilitating clinical adoption and communication with families. This represents a paradigm shift: from reactive to proactive and personalized medicine.
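To make the idea of variant prioritization concrete, here is a minimal toy sketch. It is not the method used in the study, which the article does not describe in detail. The weights, the gene names (GENE_A, GENE_B, GENE_C), the phenotype terms, and the scoring rule are all hypothetical, chosen only to show how a database-style classification and a phenotype match might be combined into a ranking.

```python
from dataclasses import dataclass

# Hypothetical weights for ClinVar-style classification labels (illustrative only).
CLASSIFICATION_WEIGHT = {
    "pathogenic": 1.0,
    "likely_pathogenic": 0.8,
    "uncertain_significance": 0.4,
    "likely_benign": 0.1,
    "benign": 0.0,
}


@dataclass(frozen=True)
class Variant:
    gene: str                   # placeholder gene name
    classification: str         # ClinVar-style label
    gene_phenotypes: frozenset  # phenotype terms linked to the gene (e.g. from OMIM)


def phenotype_overlap(patient_terms, gene_terms):
    """Jaccard overlap between the patient's and the gene's phenotype terms."""
    union = patient_terms | gene_terms
    return len(patient_terms & gene_terms) / len(union) if union else 0.0


def score(variant, patient_terms):
    """Toy score: classification weight multiplied by phenotype overlap."""
    weight = CLASSIFICATION_WEIGHT.get(variant.classification, 0.0)
    return weight * phenotype_overlap(patient_terms, variant.gene_phenotypes)


def prioritize(variants, patient_terms):
    """Return variants sorted by toy score, highest first."""
    return sorted(variants, key=lambda v: score(v, patient_terms), reverse=True)


# Made-up patient and variants, for illustration only.
patient = frozenset({"short stature", "developmental delay", "macrodontia"})
variants = [
    Variant("GENE_B", "pathogenic", frozenset({"seizures"})),
    Variant("GENE_A", "uncertain_significance",
            frozenset({"short stature", "developmental delay",
                       "macrodontia", "hearing loss"})),
    Variant("GENE_C", "likely_benign", patient),
]

ranked = prioritize(variants, patient)
print([v.gene for v in ranked])
```

In this toy example, a variant of uncertain significance whose gene closely matches the patient's phenotype ranks above a pathogenic variant in a gene with no phenotype match. A real pipeline would rely on far richer evidence (inheritance pattern, allele frequency, literature) and on expert review.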

What consequences will it have?

In the short term, this study validates the use of reasoning models in clinical genomics and lays the groundwork for larger clinical trials. The research team plans to expand the cohort to 500 patients and collaborate with the NIH's Undiagnosed Diseases Network (UDN). In the long term, we could see broader integration of AI into diagnostic workflows, reducing costs and wait times. For example, the cost of whole genome sequencing has fallen below $1,000, but interpretation remains expensive (between $5,000 and $15,000 per case). AI could reduce these costs by 50-70%, according to industry estimates.
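The cost figures above can be turned into a quick back-of-the-envelope range. The sketch below only applies the quoted 50-70% reduction to the quoted $5,000-$15,000 interpretation cost. The function name is hypothetical, and the result is an illustration of the arithmetic, not a forecast.

```python
def reduced_cost_range(low, high, cut_min_pct, cut_max_pct):
    """Return (best_case, worst_case) cost in dollars after a percentage reduction.

    Integer percentages are used so the arithmetic stays exact.
    """
    best_case = low * (100 - cut_max_pct) // 100    # cheapest case, largest cut
    worst_case = high * (100 - cut_min_pct) // 100  # priciest case, smallest cut
    return best_case, worst_case


# Figures quoted in the article: $5,000-$15,000 per case, reduced by 50-70%.
best_case, worst_case = reduced_cost_range(5_000, 15_000, 50, 70)
print(f"Interpretation cost after reduction: ${best_case:,} to ${worst_case:,} per case")
```

Under these assumptions, interpretation would cost between $1,500 and $7,500 per case, which would still be more than the sub-$1,000 sequencing itself.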

However, significant challenges remain. First, high-quality labeled data is needed: AI models require large annotated clinical and genomic datasets, which are often scarce for rare diseases. Second, interpreting variants of uncertain significance remains an open problem, and models may generate false positives or false negatives. Third, access must be equitable: if these tools are only available at elite centers, they could widen health disparities. Finally, models must be rigorously validated in diverse populations to avoid bias, as most current genomic data comes from individuals of European ancestry. A 2023 study in Nature Communications showed that AI models for genetic diagnosis had 20% lower accuracy in non-European populations.

From a regulatory standpoint, the FDA has not yet approved any AI system for de novo genetic diagnosis, although devices that assist in interpretation exist. This study could accelerate dialogue with regulatory agencies. There are also ethical implications: who is responsible if a diagnosis is incorrect? How is genomic data privacy protected? OpenAI has stated that patient data was anonymized and the model does not retain information, but public trust is crucial.

What should readers know?

This advance does not replace doctors but assists them. AI is a tool that can suggest diagnoses, but final confirmation requires clinical evaluation, functional tests, and a geneticist's judgment. Patients and families should be aware that the technology is still developing and not all cases will have answers. However, it is a promising step toward more personalized medicine. For healthcare professionals, it is important to understand that AI is not infallible: models can hallucinate or miss rare diagnoses. The combination of human expertise and computational power is key.

Compared to previous efforts, such as IBM Watson in oncology (which had mixed results), this approach is more promising because it focuses on a specific problem (rare disease diagnosis) and uses reasoning models that can explain their decisions. It also differs from tools like Google's DeepVariant, which focuses on variant calling rather than clinical interpretation. The market impact could be significant: companies like Illumina, Fabric Genomics, and Congenica are already integrating AI, but OpenAI's entry could democratize access. The AI in genomics market is expected to grow from $1.2 billion in 2023 to $9.8 billion by 2030, according to Grand View Research.
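For context, the market projection above implies a compound annual growth rate that can be derived from the two endpoints alone. The sketch below uses only the figures quoted in the article and assumes smooth compounding over the seven years from 2023 to 2030. The function name is hypothetical.

```python
def implied_cagr(start_value, end_value, years):
    """Compound annual growth rate implied by two endpoint values."""
    return (end_value / start_value) ** (1 / years) - 1


# Figures quoted in the article: $1.2 billion (2023) to $9.8 billion (2030).
rate = implied_cagr(1.2, 9.8, 2030 - 2023)
print(f"Implied annual growth: {rate:.1%}")
```

With these endpoints, the implied rate works out to roughly 35% per year.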

“AI won't replace doctors, but doctors who use AI will replace those who don't.” (Adapted from a common saying in the sector.)

In summary, this milestone demonstrates that reasoning models can be powerful allies in the fight against rare diseases, but their responsible implementation will require collaboration among technologists, clinicians, regulators, and patients.
