When: Thursdays from 4:00–4:50 p.m.
Where: 1170 TMCB
Dean Billheimer
University of Utah
Huntsman Cancer Institute
2007-10-25
Topic: Compositional Data in Biomedical Research
Abstract: Compositional data are vectors of proportions or percentages. That is, they describe the relative amounts attributable to each of a number of key categories. A consequence of this structure is that every category must have a non-negative value, and the values across all categories must sum to 1. These non-negativity and summation constraints require special statistical methods for analyzing such data. Unfortunately, modern methods of compositional data analysis are not well known in biomedical research. Yet biomedicine, like other areas of science, has many problems in which the relevant scientific information is encoded in the relative amounts of key categories.
I present a statistical approach to compositional data analysis, and touch on several recent advances. In addition, I describe two studies in cancer research in which analysis of compositions plays an important role. The studies involve 1) the subcellular localization of the BRCA1 protein, and its role in breast cancer patient prognosis, and 2) the classification of serum proteomic profiles for early detection of lung cancer. Neither of the problems is "solved" in the sense of having a completely satisfactory solution (either statistical or medical). However, both problems demonstrate the utility of compositional analysis methods for biomedical problems. This talk contains a tutorial component, and should be accessible to students with exposure to multivariate statistics.
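For readers new to the topic, the non-negativity and unit-sum constraints described in the abstract can be illustrated with a minimal sketch. The abstract does not say which methods the talk covers; the centered log-ratio transform shown here is one commonly used tool in compositional data analysis and is included only as an assumption for illustration. The function names and the sample values are hypothetical.

```python
import math


def closure(amounts):
    """Rescale non-negative amounts so that they sum to 1."""
    if any(a < 0 for a in amounts):
        raise ValueError("amounts must be non-negative")
    total = sum(amounts)
    if total <= 0:
        raise ValueError("at least one amount must be positive")
    return [a / total for a in amounts]


def clr(composition):
    """Centered log-ratio transform (requires strictly positive parts).

    Each part is expressed as its log minus the mean of all logs,
    so the transformed values always sum to zero.
    """
    if any(p <= 0 for p in composition):
        raise ValueError("clr requires strictly positive parts")
    logs = [math.log(p) for p in composition]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


# Hypothetical raw measurements for three categories (illustrative only).
raw = [30.0, 50.0, 20.0]
comp = closure(raw)   # proportions that satisfy the unit-sum constraint
coords = clr(comp)    # unconstrained log-ratio coordinates
```

Because only relative amounts matter, rescaling the raw measurements by any positive constant leaves both `comp` and `coords` unchanged (up to floating-point rounding).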