A DATA ANALYTIC TOOL FOR MEASURING COMPOSITIONAL VARIABILITY
Degree awarded: Ph.D. Mathematics and Statistics. American University
Compositional data are vectors of non-negative proportions that sum to one. Because of this unit-sum constraint, the standard statistical techniques devised for unconstrained variables cannot be applied to compositional data. Aitchison (1986) developed a widely used method based on logratio transformations of compositional data. That method is limited by its assumption of strictly positive components, or else requires special treatments to accommodate zero components. We propose a new data analytic measure of compositional data variability based on the Sum of Coefficients of Variation. The measure addresses a common objective in compositional data analysis: identifying a subset of the variables that retains most of the variability of the full composition. In selecting these subcompositions, the new method resolves the difficulty of zeros in compositional data without requiring any special consideration of zeros. The new technique is investigated analytically and illustrated with real and simulated data sets.
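To make the contrast with logratio methods concrete, the following is a minimal Python sketch, not the procedure developed in the thesis. It assumes that the measure is the plain sum, over components, of each component's sample standard deviation divided by its mean, and that a subcomposition is scored without re-closing it to unit sum; both are assumptions, since the abstract does not give the exact definition. The function names and the small data set `rows` are hypothetical and purely illustrative.

```python
import math
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean of one component."""
    mean = statistics.fmean(values)
    if mean == 0:
        # A component that is zero in every observation has no defined CV.
        raise ValueError("component is identically zero; CV is undefined")
    return statistics.stdev(values) / mean


def sum_of_cvs(rows, parts=None):
    """Sum of per-component CVs over the selected parts (assumed definition).

    rows  : list of compositions, each a list of proportions summing to one
    parts : indices of the components to include; all components if None
    """
    parts = range(len(rows[0])) if parts is None else parts
    return sum(coefficient_of_variation([row[j] for row in rows]) for j in parts)


def logratio_is_defined(rows):
    """Check whether an additive logratio transform (last part as divisor) exists."""
    try:
        for row in rows:
            [math.log(x / row[-1]) for x in row[:-1]]
    except (ValueError, ZeroDivisionError):
        # log(0) or division by a zero component: the transform breaks down.
        return False
    return True


# Made-up illustrative compositions; the last component contains zeros.
rows = [
    [0.5, 0.3, 0.2, 0.0],
    [0.4, 0.4, 0.1, 0.1],
    [0.6, 0.2, 0.2, 0.0],
    [0.3, 0.5, 0.1, 0.1],
]

full = sum_of_cvs(rows)
subset = sum_of_cvs(rows, parts=[2, 3])
print("logratio defined:", logratio_is_defined(rows))
print("share of variability retained by parts 2 and 3:", subset / full)
```

Under these assumptions, zeros in individual observations need no replacement or amalgamation step: a component's coefficient of variation is defined whenever the component is not zero in every observation, whereas the logratio transform of the same data fails.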