NIH Research Festival
Approximately 80% of the more than 7,000 rare diseases have a genetic association, but most lack approved treatments. One challenge for researchers, patients, and other community partners is accessing and coordinating data from disparate sources, where the data are siloed and neither organized nor harmonized, making them difficult to integrate in a useful manner. To address this need, the Therapeutic Development Branch in the intramural Division of Preclinical Innovation at NCATS conceptualized Rare-SOURCE™ and, in collaboration with the Advanced Biomedical and Computational Science, developed and launched this user-centric bioinformatics resource platform for rare disease information. The main objective is to facilitate data mining through a searchable interface that integrates bioinformatics databases and enables users to navigate rare disease information quickly and efficiently.
This resource allows users to browse rare diseases with known associations to genes. The disease and gene catalogs include names, symbols, synonyms, and links to rare disease data sources. The resource also incorporates a Literature AI feature, which presents results from mining mentions of rare diseases and their associated genes across all of MEDLINE™ using transformer models. The workflow has the added benefit of combining the names with known disease and gene aliases, which makes it easier to find literature on diseases or genes of interest. A new variant annotation module includes millions of variants for rare disease-associated genes, drawn from public data sources such as gnomAD and ClinVar and annotated using OpenCRAVAT. Variant-level details are presented in tabular formats as well as in 2D and 3D visualizations on protein structures.
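The idea of combining names with known aliases can be illustrated with a minimal sketch. This is not the Rare-SOURCE™ implementation, which the abstract does not describe in detail; the function names (`expand_terms`, `build_query`), the in-memory alias dictionaries, and the boolean query format are all assumptions made for illustration, and the sample entries are a small hand-picked example rather than catalog data.

```python
# Illustrative sketch only: hypothetical helpers, not the Rare-SOURCE API.

def expand_terms(name, alias_catalog):
    """Return the name followed by its known aliases, de-duplicated case-insensitively."""
    seen = set()
    terms = []
    for term in [name, *alias_catalog.get(name, [])]:
        key = term.casefold()
        if key not in seen:
            seen.add(key)
            terms.append(term)
    return terms


def build_query(disease, gene, disease_aliases, gene_aliases):
    """Build a boolean query matching any disease alias AND any gene alias."""
    disease_part = " OR ".join(f'"{t}"' for t in expand_terms(disease, disease_aliases))
    gene_part = " OR ".join(f'"{t}"' for t in expand_terms(gene, gene_aliases))
    return f"({disease_part}) AND ({gene_part})"


# Hypothetical alias catalogs (assumed structure: canonical name -> list of aliases).
DISEASE_ALIASES = {"cystic fibrosis": ["mucoviscidosis", "CF"]}
GENE_ALIASES = {"CFTR": ["ABCC7", "cftr"]}  # "cftr" duplicates the symbol and is dropped

query = build_query("cystic fibrosis", "CFTR", DISEASE_ALIASES, GENE_ALIASES)
print(query)
```

With alias expansion of this kind, a search for one name can retrieve literature that mentions only a synonym, without the user having to enumerate every alias by hand.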
Scientific Focus Area: Computational Biology
This page was last updated on Tuesday, August 6, 2024