• Title/Summary/Keyword: Compositional analysis

A guideline for the statistical analysis of compositional data in immunology

  • Yoo, Jinkyung;Sun, Zequn;Greenacre, Michael;Ma, Qin;Chung, Dongjun;Kim, Young Min
    • Communications for Statistical Applications and Methods, v.29 no.4, pp.453-469, 2022
  • The study of immune cellular composition has been of great scientific interest in immunology because of the generation of multiple large-scale datasets. From the statistical point of view, such immune cellular data should be treated as compositional. In compositional data, each element is positive, and all the elements sum to a constant, which can generally be set to one. Standard statistical methods are not directly applicable to the analysis of compositional data because they do not appropriately handle correlations between the compositional elements. In this paper, we review statistical methods for compositional data analysis and illustrate them in the context of immunology. Specifically, we focus on regression analyses using log-ratio transformations and the alternative approach using Dirichlet regression analysis, discuss their theoretical foundations, and illustrate their applications with immune cellular fraction data generated from colorectal cancer patients.
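
As a rough illustration of the sum constraint and of one common log-ratio transformation, the sketch below rescales positive parts to sum to one and then applies the centered log-ratio (CLR). The abstract does not say which log-ratio transformation the paper uses, so the choice of CLR, the function names, and the input values are assumptions made only for this example.

```python
import math

def closure(parts):
    """Rescale positive parts so that they sum to one."""
    total = sum(parts)
    return [p / total for p in parts]

def clr(composition):
    """Centered log-ratio: log of each part minus the mean of the logs."""
    logs = [math.log(p) for p in composition]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]

# Hypothetical cell-type counts, used only to exercise the functions.
fractions = closure([30.0, 50.0, 20.0])
coords = clr(fractions)
```

The CLR coordinates sum to zero and do not change if the input is rescaled, which is why the constant in the sum constraint can be chosen freely.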

Ranking subjects based on paired compositional data with application to age-related hearing loss subtyping

  • Nam, Jin Hyun;Khatiwada, Aastha;Matthews, Lois J.;Schulte, Bradley A.;Dubno, Judy R.;Chung, Dongjun
    • Communications for Statistical Applications and Methods, v.27 no.2, pp.225-239, 2020
  • Analysis approaches for single compositional data are well established; however, effective analysis strategies for paired compositional data remain to be investigated. The current project was motivated by studies of age-related hearing loss (presbyacusis), where subjects are classified into four audiometric phenotypes that need to be ranked within these phenotypes based on their paired compositional data. We address this challenge by formulating this problem as a classification problem and integrating a penalized multinomial logistic regression model with compositional data analysis approaches. We utilize Elastic Net for a penalty function, while considering average, absolute difference, and perturbation operators for compositional data. We applied the proposed approach to the presbyacusis study of 532 subjects with probabilities that each ear of a subject belongs to each of four presbyacusis subtypes. We further investigated the ranking of presbyacusis subjects using the proposed approach based on previous literature. The data analysis results indicate that the proposed approach is effective for ranking subjects based on paired compositional data.
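
Of the operators named in the abstract, perturbation has a standard definition in compositional data analysis: multiply two compositions part by part and rescale the result to sum to one. The sketch below shows only that standard definition. How the paper combines the two ears, and the example probabilities, are not given in the abstract and are assumptions here.

```python
def closure(parts):
    """Rescale positive parts so that they sum to one."""
    total = sum(parts)
    return [p / total for p in parts]

def perturb(x, y):
    """Perturbation: elementwise product of two compositions, then closure."""
    return closure([a * b for a, b in zip(x, y)])

# Hypothetical subtype probabilities for the two ears of one subject.
left_ear = [0.1, 0.2, 0.3, 0.4]
right_ear = [0.25, 0.25, 0.25, 0.25]
combined = perturb(left_ear, right_ear)
```

Perturbing by the uniform composition leaves a composition unchanged, so in this example `combined` matches `left_ear` up to rounding.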

Comparison of Methods for Reducing the Dimension of Compositional Data with Zero Values

  • Song, Taeg-Youn;Choi, Byung-Jin
    • Communications for Statistical Applications and Methods, v.19 no.4, pp.559-569, 2012
  • Compositional data consist of compositions that are non-negative vectors of proportions with the unit-sum constraint. In disciplines such as petrology and archaeometry, it is fundamental to statistically analyze this type of data. Aitchison (1983) introduced a log-contrast principal component analysis that involves log-ratio-transformed data, as a dimension-reduction technique to understand and interpret the structure of compositional data. However, the analysis cannot be used when zero values are present in the data, because the logarithm of zero is undefined. In this paper, we introduce four possible methods to reduce the dimension of compositional data with zero values. Two real data sets are analyzed using the methods and the obtained results are compared.
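
The abstract does not name the four methods, so the sketch below should not be read as one of them. It shows multiplicative zero replacement, a widely used way to make a composition strictly positive before taking log-ratios: each zero becomes a small value `delta`, and the nonzero parts are shrunk so the total stays at one. The function name, the value of `delta`, and the example composition are assumptions.

```python
def multiplicative_replacement(composition, delta=1e-3):
    """Replace zeros with delta and shrink nonzero parts to keep the unit sum.

    Assumes the input already sums to one.
    """
    n_zeros = sum(1 for p in composition if p == 0)
    shrink = 1.0 - n_zeros * delta
    return [delta if p == 0 else p * shrink for p in composition]

# Hypothetical three-part composition with one zero.
replaced = multiplicative_replacement([0.6, 0.4, 0.0])
```

Because every nonzero part is multiplied by the same factor, the ratios among the nonzero parts are preserved.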

Binary classification on compositional data

  • Joo, Jae Yun;Lee, Seokho
    • Communications for Statistical Applications and Methods, v.28 no.1, pp.89-97, 2021
  • Because compositional data are bounded and subject to a sum constraint, they are often log-ratio transformed, and the transformed data are then passed to traditional binary classification or discriminant analysis. However, directly applying traditional multivariate approaches to the transformed data may be problematic because the class distributions are not Gaussian and the Bayes decision boundaries are not polynomial in the transformed space. In this study, we propose applying flexible classification approaches to the transformed data for compositional data classification. Empirical studies using synthetic and real examples demonstrate that flexible approaches outperform traditional multivariate classification or discriminant analysis.

Statistical analysis of metagenomics data

  • Calle, M. Luz
    • Genomics & Informatics, v.17 no.1, pp.6.1-6.9, 2019
  • Understanding the role of the microbiome in human health and how it can be modulated is becoming increasingly relevant for preventive medicine and for the medical management of chronic diseases. The development of high-throughput sequencing technologies has boosted microbiome research by enabling the study of microbial genomes and a more precise quantification of microbiome abundances and function. Microbiome data analysis is challenging because it involves high-dimensional, structured, sparse multivariate data and because of its compositional nature. In this review we outline some of the procedures that are most commonly used for microbiome analysis and that are implemented in R packages. We place particular emphasis on the compositional structure of microbiome data. We describe the principles of compositional data analysis and distinguish between standard methods and those that fit into compositional data analysis.

Principal Component Analysis of Compositional Data using Box-Cox Contrast Transformation (Box-Cox 대비변환을 이용한 구성비율자료의 주성분분석)

  • 최병진;김기영
    • The Korean Journal of Applied Statistics, v.14 no.1, pp.137-148, 2001
  • Compositional data found in many practical applications consist of non-negative vectors of proportions with the constraint that the elements of each vector sum to unity. It is well known that the statistical analysis of compositional data is complicated by the unit-sum constraint. Moreover, the non-linear pattern frequently displayed by the data hinders the application of linear multivariate techniques such as principal component analysis. In this paper we develop a new type of principal component analysis for compositional data using the Box-Cox contrast transformation. Numerical illustrations are provided for comparative purposes.
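
The abstract does not give the exact form of the Box-Cox contrast transformation, so the following is only a sketch of the general idea under an assumption: the ordinary Box-Cox power transform is applied to the ratio of each part to the last part. With `lam` equal to zero this reduces to the additive log-ratio. The function names and the choice of the last part as reference are hypothetical.

```python
import math

def box_cox(value, lam):
    """Ordinary Box-Cox power transform; the log is the lam = 0 case."""
    if lam == 0:
        return math.log(value)
    return (value ** lam - 1.0) / lam

def box_cox_ratio_transform(composition, lam):
    """Apply Box-Cox to each part divided by the last part (assumed reference)."""
    reference = composition[-1]
    return [box_cox(p / reference, lam) for p in composition[:-1]]

# Hypothetical three-part composition.
x = [0.2, 0.3, 0.5]
linear_case = box_cox_ratio_transform(x, 1.0)
log_case = box_cox_ratio_transform(x, 0)
```

Principal component analysis would then be run on the transformed vectors, with `lam` chosen to straighten the non-linear pattern in the data.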

A study on the shoulder composition methods of power shoulder jackets and corresponding details (파워숄더 재킷의 어깨 구성 방법과 디테일 대응 분석)

  • Park, Jeongah;Lee, Jeongran
    • The Research Journal of the Costume Culture, v.29 no.3, pp.388-405, 2021
  • This study classifies the compositional methods of power shoulder jackets from 1980 to the present. It analyzes the relevance of jacket details according to how the power shoulder changes and its compositional methods by era. The research subdivides shoulder compositional techniques into seven types, based on shoulder variation, sleeve variation, and the body and sleeve combination. The researcher investigates the frequency and trends of composing shoulders and analyzes details pertaining to the silhouette, jacket length, collar shape, and front closure. The most common method of shoulder composition is an angular shoulder variation. The others are a rounded shoulder variation, puffed sleeve, sleeve variation using pattern incision, raglan and kimono sleeves, and a shoulder variation that expanded the angle and width. The frequency differs slightly for each era. The relationship between shoulder compositional methods and details of power shoulder jackets is statistically significant, showing period-related differences. The homogeneity analysis results reveal that the shoulder composition of power shoulder jackets, the times, and details fall into distinct groups. This analysis shows that the silhouette, length, collar, and front closure of the power shoulder jacket differ depending on the power shoulder's compositional methods. Moreover, the shape of the power shoulder jacket is distinctly different. These data can help in developing power shoulder jacket designs that reflect the details of shoulder compositional methods and changing trends over time.

Compositional rules of Korean auxiliary predicates for sentiment analysis

  • Lee, Kong Joo
    • Journal of Advanced Marine Engineering and Technology, v.37 no.3, pp.291-299, 2013
  • Most sentiment analysis systems count the number of occurrences of sentiment expressions in a text and evaluate the text by summing the polarity values of the extracted sentiment expressions. However, the linguistic contexts of the expressions should be taken into account in order to analyze the sentiment orientation of the text accurately. Korean auxiliary predicates modify the meaning of the main verb or adjective to which they are attached. In this paper, we introduce a new approach that handles Korean auxiliary predicates for sentiment analysis. We classify the auxiliary predicates according to the strength of their impact on sentiment polarity values. We also define compositional rules of auxiliary predicates to update polarity values when the predicates appear along with sentiment expressions. This approach is implemented in a sentiment analysis system that extracts opinions about a specific individual from review documents collected from various web sites. The experimental results show approximately 72.6% precision and 52.7% recall for correctly detecting sentiment expressions in a text.
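
The scoring scheme described above (sum the polarity of each sentiment expression, after an attached auxiliary predicate has updated it) can be sketched as follows. The lexicon, the rule classes `NEGATE` and `WEAKEN`, and their numeric effects are all hypothetical placeholders. The abstract does not list the paper's actual predicate classes or rules.

```python
# Hypothetical polarity lexicon (not taken from the paper).
LEXICON = {"good": 1.0, "bad": -1.0}

# Hypothetical rule classes standing in for auxiliary predicates.
AUX_RULES = {
    "NEGATE": lambda polarity: -polarity,
    "WEAKEN": lambda polarity: 0.5 * polarity,
    None: lambda polarity: polarity,  # no auxiliary predicate attached
}

def score_text(expressions):
    """Sum polarity values after applying each expression's rule."""
    total = 0.0
    for word, aux in expressions:
        total += AUX_RULES[aux](LEXICON.get(word, 0.0))
    return total

# Each pair is (sentiment expression, attached rule class or None).
review = [("good", "NEGATE"), ("bad", None), ("good", "WEAKEN")]
review_score = score_text(review)
```

A plain count-and-sum system would score this review as positive overall, while the rule-based update flips the first expression and damps the third.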

Compositional and Contextual Factors Related to Area Differentials in Suicide (지역의 자살률 차이와 관련된 구성적 요인과 상황적 요인)

  • Kang, Eunjeong
    • Korean Journal of Health Education and Promotion, v.30 no.1, pp.41-52, 2013
  • Objectives: Rural-urban differences in suicide have been observed in many settings. However, there has been little research addressing what factors can explain these differences. The purpose of this study was to analyze which compositional factors and contextual factors in local areas might be related to local suicide. Methods: The study design was cross-sectional. Data for 251 primary local governments on age-standardized suicide mortality and predefined indicators of compositional and contextual factors were obtained from the Korean Statistical Information Service for the year 2010. Bivariate analyses, including one-way ANOVA and chi-square tests, were used to identify differences in local features by area type. Seven Poisson regression models for each of the total, male, and female groups were used to analyze which compositional and contextual factors were related to suicide. Results: There were differences in suicide between gu and goon in the total, male, and female groups. For the total group, compositional factors including divorce and smoking rate, and contextual factors including financial independence, water and wastewater coverage, and number of wastewater discharge factories were found to explain the urban-rural differences. Conclusions: This study provided some evidence that contextual factors at the local level as well as compositional factors are useful for predicting local suicide mortality.

Microbiome Study of Initial Gut Microbiota from Newborn Infants to Children Reveals that Diet Determines Its Compositional Development

  • Ku, Hye-Jin;Kim, You-Tae;Lee, Ju-Hoon
    • Journal of Microbiology and Biotechnology, v.30 no.7, pp.1067-1071, 2020
  • To understand the formation of initial gut microbiota, three initial fecal samples were collected from two groups of two breast milk-fed (BM1) and seven formula milk-fed (FM1) infants, and the compositional changes in gut microbiota were determined using metagenomics. Compositional change analysis during week one showed that Bifidobacterium increased from the first to the third fecal samples in the BM1 group (1.3% to 35.1%), while Klebsiella and Serratia were detected in the third fecal sample of the FM1 group (4.4% and 34.2%, respectively), suggesting the beneficial effect of breast milk intake. To further understand the compositional changes during progression from infancy to childhood (i.e., from three weeks to five years of age), additional fecal samples were collected from four groups of two breast milk-fed infants (BM2), one formula milk-fed toddler (FM2), three weaning food-fed toddlers (WF), and three solid food-fed children (SF). Subsequent compositional change analysis and principal coordinates analysis (PCoA) revealed that the composition of the gut microbiota changed from an infant-like composition to an adult-like one in conjunction with dietary changes. Interestingly, overall gut microbiota composition analyses during the period of progression from infancy to childhood suggested increasing complexity of gut microbiota as well as emergence of a new species of bacteria capable of digesting complex carbohydrates in WF and SF groups, substantiating that diet type is a key factor in determining the composition of gut microbiota. Consequently, this study may be useful as a guide to understanding the development of initial gut microbiota based on diet.