
A novel statistical tool for regression analysis of a skewed biomarker subject to pooling.

Tuesday, September 23, 2014 — Poster Session III

12:00 p.m. – 2:00 p.m.

FAES Academic Center



* FARE Award Winner


  • E.M. Mitchell
  • R.H. Lyles
  • M. Danaher
  • A.K. Manatunga
  • E.F. Schisterman


Epidemiological studies involving biomarkers are often hindered by prohibitively expensive laboratory tests. Strategically pooling specimens before performing these lab assays can reduce cost with minimal information loss in a logistic regression setting. When the goal is instead to perform regression with a continuous biomarker as the outcome, analysis of pooled specimens may not be straightforward, particularly if the outcome is right-skewed. In such cases, we demonstrate that a slight modification of standard multiple linear regression for poolwise data can provide valid and precise coefficient estimates when pools are formed by combining biospecimens from subjects with identical covariate values. When these x-homogeneous pools cannot be formed, we propose a Monte Carlo expectation maximization algorithm to compute maximum likelihood estimates. Simulation studies demonstrate that these analytical methods provide essentially unbiased estimates of coefficient parameters and standard errors when appropriate assumptions are met. Furthermore, we show how the fully observed covariate data can be used to inform the pooling strategy, yielding a high level of statistical efficiency at a fraction of the total lab cost. By adding to the methods available for analyzing pooled specimens, this work should allow researchers to consider pooling as a study design with more confidence.
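To make the x-homogeneous case concrete, the following is a minimal simulation sketch, not the authors' estimator. It assumes a log-normal outcome model, log(Y) = beta0 + beta1*x + error, equal pool sizes, and a pooled assay value equal to the arithmetic mean of the pool members; none of these specifics are stated in the abstract. The function names and parameter values are hypothetical. Under these assumptions, every member of a pool shares the same x, so the log of the pooled value equals beta0 + beta1*x plus a term whose distribution does not depend on x, and ordinary least squares on the log pooled values should recover the slope.

```python
import numpy as np


def simulate_x_homogeneous_pools(n_pools_per_level, pool_size, x_levels,
                                 beta0, beta1, sigma, rng):
    """Simulate pools whose members all share one covariate value.

    Assumed model (hypothetical, for illustration only):
        log(Y_individual) = beta0 + beta1 * x + N(0, sigma^2)
    The assay on a pool is assumed to return the arithmetic mean of the
    individual values in that pool.
    """
    # One row per pool; each pool is assigned a single covariate level.
    x = np.repeat(np.asarray(x_levels, dtype=float), n_pools_per_level)
    eps = rng.normal(0.0, sigma, size=(x.size, pool_size))
    y_individual = np.exp(beta0 + beta1 * x[:, None] + eps)
    pooled_y = y_individual.mean(axis=1)
    return x, pooled_y


def fit_log_pooled(x, pooled_y):
    """Ordinary least squares of log(pooled value) on x.

    Returns an array [intercept, slope].
    """
    design = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(design, np.log(pooled_y), rcond=None)
    return coef


rng = np.random.default_rng(0)
true_beta0, true_beta1, true_sigma = 1.0, 0.5, 0.8
x, pooled_y = simulate_x_homogeneous_pools(
    n_pools_per_level=2000, pool_size=3, x_levels=[0, 1, 2, 3],
    beta0=true_beta0, beta1=true_beta1, sigma=true_sigma, rng=rng,
)
intercept_hat, slope_hat = fit_log_pooled(x, pooled_y)
print(f"slope estimate: {slope_hat:.3f} (true value {true_beta1})")
print(f"intercept estimate: {intercept_hat:.3f} (true value {true_beta0})")
```

In this sketch the slope estimate should land close to the true value, while the naive intercept is shifted upward, because the log of an arithmetic mean is at least the mean of the logs. That shift is one reason an unmodified log-scale regression on pooled values would need an adjustment; the specific modification used in the poster is not described in the abstract and is not reproduced here.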
