Approximately 350 million people worldwide are living with one of as many as 8,000 rare diseases. For perspective, imagine the entire population of the US and millions more, including children, debilitated by disease and often unable to receive optimal care because of underdiagnosis or initial misdiagnosis.


Copyright: “Combatting Longstanding Challenges in Rare Disease Detection with Innovative Deep Learning Models”


The average time to an accurate diagnosis of a rare disease is four to five years [1,2,3]. In some cases, it can take more than a decade [4,5]. This greatly delays patients' access to effective treatment options and increases the financial burden on them and their families. According to a recent National Institutes of Health analysis, people with rare diseases can have up to five times the healthcare costs of those without a rare disease.

Though hundreds of millions of people globally are living with rare diseases, recorded prevalence rates for individual diseases are very low, ranging from one in 1,000 to one in 20,000 patients, in part because of high levels of underdiagnosis and misdiagnosis. This is challenging not only for patients and their loved ones, but also for clinical trial sponsors and their clinical research organisation partners, who are trying to plan and execute clinical trials to further examine these diseases and potential treatments. Rare disease clinical trial design poses several unique challenges to sponsors and study teams, including:

  • Defining meaningful endpoints, given limited knowledge and history of rare diseases and their progression.
  • Identifying patients to engage for trial participation.
  • Ensuring enrolled patients accurately represent the target patient population.
  • Working with small sample sizes.
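
To make the sample-size challenge concrete, here is a minimal sketch of how few patients a data source might contain at the prevalence rates quoted above. The database size of one million records is a hypothetical figure chosen for illustration, not one taken from the article, and the function name is likewise an assumption.

```python
from fractions import Fraction


def expected_cases(population: int, prevalence: Fraction) -> float:
    """Expected number of affected people in a population of the given size."""
    return float(population * prevalence)


# Hypothetical size of a health-record database (an assumption for illustration).
HYPOTHETICAL_DATABASE_SIZE = 1_000_000

# Prevalence bounds quoted in the article: one in 1,000 and one in 20,000.
for label, prevalence in [
    ("1 in 1,000", Fraction(1, 1_000)),
    ("1 in 20,000", Fraction(1, 20_000)),
]:
    cases = expected_cases(HYPOTHETICAL_DATABASE_SIZE, prevalence)
    print(f"{label}: about {cases:.0f} expected patients")
```

Under these assumptions, a million records would hold roughly 1,000 patients at the higher prevalence and roughly 50 at the lower one, before accounting for underdiagnosis or trial eligibility criteria.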

Through advances in machine learning and deep learning methodologies, sponsors can leverage expansive datasets available in today’s broader healthcare system (e.g., genomic sequencing data and electronic health records) to equip themselves with the right expertise to improve rare disease detection and accelerate much-needed trials.

Rare disease patient identification: The traditional AI approach

In recent years, researchers have used tremendous amounts of EHR data to train deep learning models that extract disease progression insights, but this predictive modelling has primarily focused on chronic and prevalent diseases, such as Parkinson's disease or cardiovascular conditions, not rare diseases. This is understandable: these models include only patients with a confirmed diagnosis, which limits the available data and makes extracting disease patterns difficult.

Many patients spend years without a definitive diagnosis because of the time it takes to accurately identify a rare disease. Given their similarities to patients with confirmed diagnoses, these individuals could help improve learning models' performance. But it is also challenging to include them in learning models for rare disease detection without generating false positive cases, because the uncertain group covers a much broader range of classifications. Individuals with uncertain diagnoses may be healthy or have a similar disease, but not the specific rare disease being evaluated, making it difficult for learning models to distinguish between patients with the target rare disease and those with similar conditions.[…]
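
The false positive concern can be illustrated with a short calculation of positive predictive value, the share of flagged patients who truly have the disease, using Bayes' rule. This is a sketch under stated assumptions rather than a description of any model in the article: the sensitivity and specificity values are hypothetical, and only the two prevalence figures come from the text.

```python
def positive_predictive_value(
    prevalence: float, sensitivity: float, specificity: float
) -> float:
    """Fraction of positive predictions that are true positives (Bayes' rule)."""
    true_positive_rate = prevalence * sensitivity
    false_positive_rate = (1.0 - prevalence) * (1.0 - specificity)
    return true_positive_rate / (true_positive_rate + false_positive_rate)


# Hypothetical classifier performance (assumptions, not figures from the article).
ASSUMED_SENSITIVITY = 0.95
ASSUMED_SPECIFICITY = 0.99

# Prevalence bounds quoted in the article: one in 1,000 and one in 20,000.
for one_in in (1_000, 20_000):
    ppv = positive_predictive_value(
        1.0 / one_in, ASSUMED_SENSITIVITY, ASSUMED_SPECIFICITY
    )
    print(f"prevalence 1 in {one_in:,}: PPV = {ppv:.2%}")
```

Even with the optimistic performance assumed here, fewer than one in ten flagged patients would have the disease at a prevalence of one in 1,000, and fewer than one in a hundred at one in 20,000. At such low prevalence, small losses in specificity translate into large numbers of false positives.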
