Surveying the NIH Biomedical Informatics and Computational Biology (BICB) Portfolio through Supervised Machine Learning

Wednesday, September 24, 2014 — Poster Session IV
10:00 a.m. – 12:00 p.m.	FAES Academic Center	NIGMS	COMPBIO-10

Authors

P.M. Lyster
K.A. Smith
W.W. Lau
C.A. Johnson

Abstract

We describe an approach for classifying NIH research funding in “Biomedical Informatics and Computational Biology” (BICB) and, with it, a method for obtaining an inventory of grants and contracts, including dollar amounts. We first developed a parsimonious set of categories describing the types of BICB research projects that NIH funds: applications and modeling, informatics, image and signal analysis, high-throughput tools, software and productivity, biostatistics, and high-end computing. One of us (PML) then acted as a ‘rater’, identifying from the broad NIH portfolio of research grants a gold-standard set of projects assessed to be the best fit to the BICB classes; this set is called the ‘training set’. We then developed a support vector machine (SVM) based classifier, referred to here as SAE-SVM, to retrieve a complete inventory of grants and contracts based on the training set. The SAE-SVM is a flexible and extensible framework for retrieving general research inventories. We show that the knowledge gained from the model-based survey of the BICB portfolio allows for a greater understanding of NIH Institute and Center (IC) investments in the BICB categories.
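
The workflow described above (a rater-labeled training set, an SVM-based classifier, and a dollar-weighted inventory) can be illustrated with a minimal sketch. This is not the SAE-SVM itself, whose features and implementation are not given here. It assumes a binary "in portfolio / not in portfolio" decision, bag-of-words features, and a linear SVM trained with Pegasos-style subgradient descent. All function names, project IDs, descriptions, and dollar amounts are hypothetical toy values.

```python
from collections import Counter

def featurize(text):
    """Bag-of-words term counts for a project description (assumed feature set)."""
    return Counter(text.lower().split())

def score(w, x):
    """Linear decision value: dot product of the weights and the term counts."""
    return sum(w.get(term, 0.0) * count for term, count in x.items())

def train_linear_svm(examples, epochs=50, lam=0.01):
    """Pegasos-style subgradient descent on the L2-regularized hinge loss.

    examples: list of (text, label) pairs with label +1 (in portfolio) or -1.
    Examples are visited in a fixed order so the sketch is deterministic.
    """
    w = {}
    t = 0
    for _ in range(epochs):
        for text, y in examples:
            t += 1
            eta = 1.0 / (lam * t)              # decaying step size
            x = featurize(text)
            violated = y * score(w, x) < 1     # hinge-loss margin check on current w
            for term in w:                     # regularization shrink, factor 1 - eta*lam
                w[term] *= 1.0 - 1.0 / t
            if violated:                       # push weights toward the correct side
                for term, count in x.items():
                    w[term] = w.get(term, 0.0) + eta * y * count
    return w

# Hypothetical rater-labeled training set: +1 = fits a BICB class, -1 = does not.
training_set = [
    ("genome sequence alignment software", +1),
    ("clinical trial hypertension drug", -1),
    ("image segmentation algorithm", +1),
    ("mouse tumor growth assay", -1),
]

# Hypothetical unlabeled portfolio: (project id, description, dollars awarded).
projects = [
    ("P1", "genome alignment software pipeline", 250000),
    ("P2", "hypertension drug trial cohort", 400000),
    ("P3", "image segmentation software", 150000),
]

w = train_linear_svm(training_set)

# Inventory: every project the classifier places in the portfolio, with its dollars.
inventory = [(pid, dollars) for pid, text, dollars in projects
             if score(w, featurize(text)) > 0]
total = sum(dollars for _, dollars in inventory)
print(inventory, total)
```

On this toy data the sketch retrieves P1 and P3, for a total of 400000 dollars. A real survey would need richer text features, one classifier per BICB category, and validation against held-out rater labels, none of which are attempted here.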
