NIH Research Festival
PubTator 3.0 is a biomedical literature resource whose search capabilities are powered by state-of-the-art AI. It extracts six key types of biomedical entities (genes, diseases, chemicals, genetic variants, species, and cell lines) with substantially improved performance over its predecessors, and it identifies 12 common types of relationships between those entities. PubTator 3.0 currently contains 1.6 billion entity annotations and 33 million relationship annotations, and new articles are added weekly.
The online interface of PubTator 3.0 provides a unified search system across both PubMed abstracts and full-text articles from the PMC Open Access Subset, supporting literature exploration through semantic, relation, keyword, and Boolean queries. PubTator 3.0’s API offers the same search options and can also export text and annotations in multiple formats. The API supports more flexible queries as well; for instance, the question ‘What chemicals are known to reduce expression of JAK1?’ can be answered in a single API call.
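As an illustration of the single-call JAK1 question above, here is a minimal Python sketch. Everything specific in it is an assumption rather than something stated in this abstract: the base URL, the `relations` path, the `e1`/`type`/`e2` parameter names, the `@GENE_JAK1` identifier, and the `negative_correlate` relation label should all be checked against the official PubTator 3.0 API documentation. The helper names `build_relations_url` and `fetch_relations` are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# ASSUMPTION: base URL of the PubTator 3.0 API; verify against the official docs.
BASE_URL = "https://www.ncbi.nlm.nih.gov/research/pubtator3-api"


def build_relations_url(entity_id, relation_type, target_type, base_url=BASE_URL):
    """Build a relation-query URL (hypothetical helper).

    ASSUMPTION: the endpoint path and the e1/type/e2 parameter names
    are illustrative and may differ from the real API.
    """
    params = {"e1": entity_id, "type": relation_type, "e2": target_type}
    # urlencode percent-encodes reserved characters such as '@'.
    return f"{base_url}/relations?{urlencode(params)}"


def fetch_relations(url, timeout=30):
    """Send one GET request and decode the JSON body (hypothetical helper)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


# "What chemicals are known to reduce expression of JAK1?"
# ASSUMPTION: "@GENE_JAK1" and "negative_correlate" are illustrative values.
jak1_url = build_relations_url("@GENE_JAK1", "negative_correlate", "chemical")

# Uncomment to perform the actual network request:
# results = fetch_relations(jak1_url)
```

Building the URL separately from sending the request lets the query be inspected or logged before any network traffic occurs; the structure of the returned JSON is not described here and should be taken from the API documentation.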
Previous versions of PubTator have fulfilled over one billion API requests, supporting a wide range of research applications, including prioritizing candidate genes, determining gene–phenotype associations, creating gene and genetic variant resources, and enriching disease knowledge graphs. PubTator 3.0’s enhanced accuracy will better support these use cases, and its new relation extraction feature shows substantially higher performance than previous state-of-the-art systems. Moreover, integrating ChatGPT with the PubTator APIs dramatically improves the factuality and verifiability of ChatGPT’s responses. With an expanded set of features and substantially improved performance, PubTator 3.0 will unlock valuable insights for scientific discovery.
Scientific Focus Area: Computational Biology
This page was last updated on Tuesday, August 6, 2024