By Tudor Groza (Bioinformatics Institute, A*STAR & Maternal and Child Health Research Institute, KK Women’s and Children’s Hospital, Singapore) and Ian Green (SNOMED International)
Rare diseases affect up to 300 million people worldwide – yet for most patients, the journey to a diagnosis is long, fragmented, and frustrating. It often takes 5–6 years and dozens of healthcare encounters before a rare condition is identified, during which patients navigate uncertainty, delayed care, unnecessary referrals, and emotional burden. For health systems, this delay represents not only clinical risk but also an opportunity cost: patients with rare disorders frequently cycle through services without coordinated recognition or intervention.
As nations build digital health infrastructure at population scale, new tools are needed to proactively identify individuals likely to have a rare disease and connect them to specialist pathways sooner.
In a recent study published in npj Digital Medicine, Tudor Groza et al. introduced a novel screening approach grounded in information content (IC) – a measure of how "unusual" or specific a patient's clinical profile appears in electronic health records.
Applied across a nationwide cohort of 1.27 million patients in Singapore, this method demonstrated that rare disease patients exhibit distinct, higher IC patterns from their earliest clinical encounters. Using SNOMED CT – the clinical terminology embedded in Singapore's electronic health records – the authors showed that IC can reliably stratify patients and help surface candidates who may benefit from rare disease assessment or genetic evaluation. At the health-system level, this approach achieved approximately 95% sensitivity using simple IC thresholds, while keeping the follow-up burden manageable.
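To make the idea concrete, here is a minimal sketch of IC-based flagging. It is an illustration under stated assumptions, not the method used in the study: it assumes the textbook definition of information content (the negative base-2 logarithm of how often a code occurs in the cohort), averages IC over a patient's distinct codes, and ignores the SNOMED CT hierarchy. The function names, the threshold value, and the codes (`common_a`, `rare_x`) are hypothetical placeholders, not real SNOMED CT identifiers.

```python
import math
from collections import Counter


def concept_information_content(patient_codes):
    """Return IC per code, assuming IC = -log2(share of patients with the code)."""
    n_patients = len(patient_codes)
    # Count each code once per patient, however often it was recorded.
    carriers = Counter(
        code for codes in patient_codes.values() for code in set(codes)
    )
    return {
        code: -math.log2(count / n_patients)
        for code, count in carriers.items()
    }


def patient_score(codes, ic):
    """Mean IC over a patient's distinct codes (an assumed aggregation)."""
    distinct = set(codes)
    if not distinct:
        return 0.0
    return sum(ic[code] for code in distinct) / len(distinct)


def flag_patients(patient_codes, threshold):
    """Return sorted IDs of patients whose score meets the threshold."""
    ic = concept_information_content(patient_codes)
    return sorted(
        pid
        for pid, codes in patient_codes.items()
        if patient_score(codes, ic) >= threshold
    )


# Toy cohort with placeholder codes (hypothetical data).
patients = {
    "p1": ["common_a"],
    "p2": ["common_a", "common_b"],
    "p3": ["common_a", "common_b"],
    "p4": ["common_a", "rare_x"],
}

flagged = flag_patients(patients, threshold=1.0)
print(flagged)
```

In this toy cohort only the patient carrying the uncommon code is flagged, because codes shared by everyone contribute no information. A real deployment would need to choose the threshold against known rare disease cases, which is where a sensitivity figure such as the one reported above would come from.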
Beyond identifying known rare disease patients, their work also revealed 71 potentially underdiagnosed rare conditions within the population – the majority likely genetic. This highlights a powerful secondary benefit of this methodology: enabling health systems not just to detect individual patients earlier, but also to uncover hidden clinical patterns and unmet needs across their population.
Importantly, the approach does not require complex machine learning infrastructure. IC can be embedded as a lightweight screening layer on top of existing electronic health records, making it implementable even in early-stage digital health environments.
"Rare disease patients often spend years navigating the healthcare system before receiving a diagnosis, even though many of the clues are already present in their medical records," says Associate Professor Saumya Jamuar, Senior Consultant in the Genetics Service at KK Women's and Children's Hospital, Director of the SingHealth Duke-NUS Institute of Precision Medicine and Lead Principal Investigator of the Singapore Childhood Undiagnosed Disease Programme. "By harnessing information content from routinely collected clinical data, this work shows how we can surface patients earlier and more systematically – not by adding burden to clinicians, but by letting the health system itself flag individuals who may benefit from specialist review. Our vision is to shorten the diagnostic odyssey and ensure that every patient with a rare condition has a fair chance to be identified, assessed, and supported as early as possible."
A key enabler of this work is the growing alignment between terminologies used in clinical care and rare disease research. SNOMED CT offers significantly richer representation for rare conditions than previous standards like ICD-10, and its official mappings to Orphanet – the global authority for rare disease classification – made it possible to systematically identify rare disease patients and evaluate IC-driven screening performance. This collaboration underscores the value of modern, semantically rich terminologies in accelerating precision medicine at scale.
"SNOMED CT's rich clinical vocabulary and its alignment with Orphanet provide a powerful foundation for understanding rare diseases within health systems," says Ian Green, Global Clinical Engagement Lead at SNOMED International. "This study demonstrates how standardized terminology and high-quality mappings can unlock new analytical approaches that were previously out of reach. We are proud to support research that turns structured clinical data into meaningful insights – and helps health systems proactively identify and care for patients with rare conditions."
Looking ahead, Groza et al. envision this as a foundation for a broader ecosystem of rare disease early-detection tools. Integrating information-theoretic screening with genomics, machine learning, and clinical workflows could further shorten the diagnostic journey, enabling faster specialist referral and earlier therapeutic intervention. The authors encourage health systems, clinicians, and patient advocacy groups to explore the adoption of proactive rare disease screening – because every year shaved off the diagnostic odyssey has the potential to change a life.
The study is available here: https://doi.org/10.1038/s41746-025-02096-x
