Quantitation of Dispersion in Principal Component Analysis (PCA) Plots: Change with Age and Disease

Wednesday, September 16, 2015 — Poster Session I

3:30 p.m. – 5:00 p.m.
FAES Terrace


  • P Bastian
  • Y Zhang
  • K Becker


Principal Component Analysis (PCA) is commonly used to identify and classify subgroups within high-throughput datasets, such as microarray data, by projecting samples into a reduced-dimensional space that captures the patterns of greatest variance in the data. Typically, a subgroup is represented by a cloud of dots (samples). In longitudinal microarray studies of aging, we observe that young subgroups generally group tightly in PCA space, reflecting low sample-to-sample variability, while older subgroups are typically more dispersed. We devised an approach to quantitate the dispersion of PCA subgroups, assigning each subgroup a value that can be compared between classes and between studies. We hope this approach will enable investigators who use PCA for grouping and classification to obtain quantitative, comparable dispersion data on subpopulations within large datasets.
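
The abstract does not state how dispersion is computed, so the following is only an illustrative sketch of one plausible measure and not necessarily the authors' method. It assumes dispersion is taken as the mean Euclidean distance from each sample to its subgroup centroid in the space of the first two principal components. The function names (`pca_scores`, `group_dispersion`), the choice of two components, and the synthetic "young" and "old" data are all hypothetical.

```python
import numpy as np

def pca_scores(X, n_components=2):
    """Project samples (rows of X) onto the top principal components via SVD."""
    Xc = X - X.mean(axis=0)  # center each feature (column)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :n_components] * S[:n_components]

def group_dispersion(scores, labels):
    """Assumed measure: mean distance of a group's samples to the group centroid."""
    labels = np.asarray(labels)
    result = {}
    for g in np.unique(labels):
        pts = scores[labels == g]
        centroid = pts.mean(axis=0)
        result[str(g)] = float(np.linalg.norm(pts - centroid, axis=1).mean())
    return result

# Synthetic stand-in for expression data: 20 samples per group, 50 features.
# The "old" group is simulated with larger sample-to-sample variability.
rng = np.random.default_rng(0)
young = rng.normal(0.0, 0.5, size=(20, 50))
old = rng.normal(0.0, 2.0, size=(20, 50))
X = np.vstack([young, old])
labels = ["young"] * 20 + ["old"] * 20

scores = pca_scores(X)
dispersion = group_dispersion(scores, labels)
print(dispersion)
```

A single number per subgroup of this kind can be compared between classes within one study. Comparing between studies would likely need some additional normalization (for example, scaling by the total variance captured by the retained components), which is not specified in the abstract.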

Category: Computational Biology