
Performance comparison of MEDLINE structured abstracts to unstructured abstracts

Monday, September 22, 2014 — Poster Session I

12:00 p.m. – 2:00 p.m.

FAES Academic Center




  • AM Ripple
  • JG Mork
  • HJ Thompson
  • SC Schmidt
  • LS Knecht


BACKGROUND: Structured abstracts are abstracts with distinct, labeled sections (e.g., RESULTS) that pinpoint content. Approximately 30% of all abstracts added to MEDLINE® in PubMed are structured.

OBJECTIVE: To compare the performance of structured abstracts (SAs) with that of unstructured abstracts (UAs) when processed by the Medical Text Indexer (MTI), a software application that parses journal article titles and abstracts to suggest possible Medical Subject Headings (MeSH®) terms for human indexers to select.

METHODS: MTI statistical reports covering the main outcome measures of Recall, Precision, and F1 measure (which combines Recall and Precision) were run on two sets of data: 1) 577,901 journal citations from the 2009 MEDLINE indexing year; 2) a smaller, preliminary set of 180,387 journal citations from November 2013 through February 2014.

RESULTS: 2013 SAs perform significantly better for processing MeSH term suggestions than 2013 UAs, 2009 SAs, and 2009 UAs. The F1 measures are:

  • 2013 SA F1 = 65.11%
  • 2013 UA F1 = 60.95%
  • 2009 SA F1 = 40.57%
  • 2009 UA F1 = 38.81%

CONCLUSIONS: Structured abstracts perform better than unstructured abstracts for the discovery of corresponding MeSH terms. Authors and publishers should continue to adopt the structured abstract format as an aid to data mining applications.
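
The F1 measure named in METHODS is conventionally the harmonic mean of Recall and Precision. The sketch below shows how the three measures could be computed for a single citation by comparing suggested terms against the terms an indexer selected. It is illustrative only: the function name and the example term sets are hypothetical, and the abstract does not state whether MTI's reports aggregate per citation or over the whole data set.

```python
def precision_recall_f1(suggested, selected):
    """Score a set of suggested terms against the terms actually selected.

    Hypothetical helper for illustration; not MTI's reporting code.
    """
    suggested, selected = set(suggested), set(selected)
    true_positives = len(suggested & selected)

    # Precision: share of suggested terms that the indexer kept.
    precision = true_positives / len(suggested) if suggested else 0.0
    # Recall: share of the indexer's terms that were suggested.
    recall = true_positives / len(selected) if selected else 0.0

    if precision + recall == 0:
        return precision, recall, 0.0
    # F1: harmonic mean of Precision and Recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up term sets for one citation (assumed data, not from the study).
suggested_terms = {"Humans", "Abstracting and Indexing", "MEDLINE", "Software"}
indexer_terms = {
    "Humans",
    "Abstracting and Indexing",
    "MEDLINE",
    "Medical Subject Headings",
    "PubMed",
}

precision, recall, f1 = precision_recall_f1(suggested_terms, indexer_terms)
print(f"Precision={precision:.2%} Recall={recall:.2%} F1={f1:.2%}")
```

Because the harmonic mean is pulled toward the lower of the two values, a high F1 requires both Recall and Precision to be reasonably high, which is why it serves as a single summary figure in the RESULTS.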
