• Title/Abstract/Keywords: compositional data analysis

57 search results (processing time: 0.019 seconds)

A guideline for the statistical analysis of compositional data in immunology

  • Yoo, Jinkyung;Sun, Zequn;Greenacre, Michael;Ma, Qin;Chung, Dongjun;Kim, Young Min
    • Communications for Statistical Applications and Methods / Vol. 29, No. 4 / pp.453-469 / 2022
  • The study of immune cellular composition has been of great scientific interest in immunology because of the generation of multiple large-scale datasets. From the statistical point of view, such immune cellular data should be treated as compositional. In compositional data, each element is positive, and all the elements sum to a constant, which can generally be set to one. Standard statistical methods are not directly applicable to the analysis of compositional data because they do not appropriately handle correlations between the compositional elements. In this paper, we review statistical methods for compositional data analysis and illustrate them in the context of immunology. Specifically, we focus on regression analyses using log-ratio transformations and the alternative approach using Dirichlet regression analysis, discuss their theoretical foundations, and illustrate their applications with immune cellular fraction data generated from colorectal cancer patients.
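The unit-sum constraint and a log-ratio transformation mentioned in this abstract can be illustrated with a minimal sketch. It uses the centered log-ratio (CLR) as one common choice; the function names and the cell counts are hypothetical and are not taken from the paper.

```python
import math

def closure(parts):
    """Rescale positive parts so that they sum to one."""
    total = sum(parts)
    return [p / total for p in parts]

def clr(composition):
    """Centered log-ratio: log of each part minus the mean of the logs."""
    logs = [math.log(p) for p in composition]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

# Hypothetical counts for three immune cell types in one sample.
counts = [120.0, 60.0, 20.0]
fractions = closure(counts)   # a composition: positive parts summing to one
coords = clr(fractions)       # unconstrained coordinates that sum to zero
```

Because CLR depends only on ratios between parts, applying it to the raw counts or to the closed fractions gives the same coordinates, which is why the choice of the constant sum does not matter.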

Comparison of Methods for Reducing the Dimension of Compositional Data with Zero Values

  • Song, Taeg-Youn;Choi, Byung-Jin
    • Communications for Statistical Applications and Methods / Vol. 19, No. 4 / pp.559-569 / 2012
  • Compositional data consist of compositions that are non-negative vectors of proportions with the unit-sum constraint. In disciplines such as petrology and archaeometry, it is fundamental to statistically analyze this type of data. Aitchison (1983) introduced a log-contrast principal component analysis, which operates on log-ratio transformed data, as a dimension-reduction technique to understand and interpret the structure of compositional data. However, the analysis is not usable when zero values are present in the data. In this paper, we introduce four possible methods to reduce the dimension of compositional data with zero values. Two real data sets are analyzed using the methods and the obtained results are compared.

Ranking subjects based on paired compositional data with application to age-related hearing loss subtyping

  • Nam, Jin Hyun;Khatiwada, Aastha;Matthews, Lois J.;Schulte, Bradley A.;Dubno, Judy R.;Chung, Dongjun
    • Communications for Statistical Applications and Methods / Vol. 27, No. 2 / pp.225-239 / 2020
  • Analysis approaches for single compositional data are well established; however, effective analysis strategies for paired compositional data remain to be investigated. The current project was motivated by studies of age-related hearing loss (presbyacusis), where subjects are classified into four audiometric phenotypes that need to be ranked within these phenotypes based on their paired compositional data. We address this challenge by formulating this problem as a classification problem and integrating a penalized multinomial logistic regression model with compositional data analysis approaches. We utilize Elastic Net for a penalty function, while considering average, absolute difference, and perturbation operators for compositional data. We applied the proposed approach to the presbyacusis study of 532 subjects with probabilities that each ear of a subject belongs to each of four presbyacusis subtypes. We further investigated the ranking of presbyacusis subjects using the proposed approach based on previous literature. The data analysis results indicate that the proposed approach is effective for ranking subjects based on paired compositional data.
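As a rough illustration of combining paired compositions, the sketch below shows the standard Aitchison perturbation (component-wise product followed by closure) and a plain component-wise average. The paper's exact operator definitions may differ, and the four-part probability vectors for the two ears are hypothetical.

```python
def closure(parts):
    """Rescale positive parts so that they sum to one."""
    total = sum(parts)
    return [p / total for p in parts]

def perturb(x, y):
    """Aitchison perturbation: component-wise product, then closure."""
    return closure([a * b for a, b in zip(x, y)])

def average(x, y):
    """Component-wise mean; the mean of two compositions still sums to one."""
    return [(a + b) / 2 for a, b in zip(x, y)]

# Hypothetical subtype probabilities for the two ears of one subject.
left_ear = [0.5, 0.2, 0.2, 0.1]
right_ear = [0.4, 0.3, 0.2, 0.1]

combined_perturb = perturb(left_ear, right_ear)
combined_average = average(left_ear, right_ear)
```

Both results are again compositions, so they can be passed to the same downstream transformations as a single composition.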

Binary classification on compositional data

  • Joo, Jae Yun;Lee, Seokho
    • Communications for Statistical Applications and Methods / Vol. 28, No. 1 / pp.89-97 / 2021
  • Due to boundedness and the sum constraint, compositional data are often transformed by a log-ratio transformation, and the transformed data are then passed to traditional binary classification or discriminant analysis. However, it may be problematic to directly apply traditional multivariate approaches to the transformed data because the class distributions are not Gaussian and the Bayes decision boundaries are not polynomial in the transformed space. In this study, we propose applying flexible classification approaches to the transformed data for compositional data classification. Empirical studies using synthetic and real examples demonstrate that flexible approaches outperform traditional multivariate classification or discriminant analysis.

Statistical analysis of metagenomics data

  • Calle, M. Luz
    • Genomics & Informatics / Vol. 17, No. 1 / pp.6.1-6.9 / 2019
  • Understanding the role of the microbiome in human health and how it can be modulated is becoming increasingly relevant for preventive medicine and for the medical management of chronic diseases. The development of high-throughput sequencing technologies has boosted microbiome research by enabling the study of microbial genomes and a more precise quantification of microbiome abundances and function. Microbiome data analysis is challenging because it involves high-dimensional, structured, sparse multivariate data and because of its compositional nature. In this review we outline some of the procedures that are most commonly used for microbiome analysis and that are implemented in R packages. We place particular emphasis on the compositional structure of microbiome data. We describe the principles of compositional data analysis and distinguish between standard methods and those that fit into compositional data analysis.

Principal Component Analysis of Compositional Data using Box-Cox Contrast Transformation

  • 최병진;김기영
    • 응용통계연구 / Vol. 14, No. 1 / pp.137-148 / 2001
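  • Compositional data, which consist of components expressed as proportions, are constrained so that each row sums to one, which makes them difficult to handle statistically. Moreover, because the structure of such data is not linear, applying linear multivariate techniques such as principal component analysis to compositional data can lead to incorrect interpretation and inference. In this paper, we propose a new form of analysis based on the Box-Cox contrast transformation to address the problems of existing methods for principal component analysis of compositional data. We then compare its performance with that of the method proposed by Aitchison (1983) through real-data analysis and a simulation study.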


Compositional data analysis by the square-root transformation: Application to NBA USG% data

  • Jeseok Lee;Byungwon Kim
    • Communications for Statistical Applications and Methods / Vol. 31, No. 3 / pp.349-363 / 2024
  • Compositional data are data in which the values of the components sum to a constant; the sample space is therefore a simplex, which makes it impossible to directly apply statistical methods developed for the usual Euclidean vector space. A natural way to overcome this restriction is to apply a transformation that moves the sample space onto Euclidean space, and log-ratio type transformations, such as the additive log-ratio (ALR), the centered log-ratio (CLR) and the isometric log-ratio (ILR) transformations, have been the most commonly used. However, in scenarios with sparsity, where certain components take on exact zero values, these log-ratio type transformations may not be effective. In this work, we propose an alternative, the square-root transformation, which moves the original sample space onto the directional space. We compare the square-root transformation with the log-ratio type transformations through a simulation study and a real data example. In the real data example, we applied both types of transformations to the USG% data obtained from the NBA and used a density-based clustering method, DBSCAN (density-based spatial clustering of applications with noise), to show the result.
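The contrast between a log-ratio transformation and the square-root transformation on sparse data can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names and the example composition are hypothetical, and ALR here uses the last part as the reference.

```python
import math

def alr(composition):
    """Additive log-ratio: log of each part relative to the last part."""
    ref = composition[-1]
    return [math.log(p / ref) for p in composition[:-1]]

def sqrt_transform(composition):
    """Component-wise square root of a unit-sum composition."""
    return [math.sqrt(p) for p in composition]

# Hypothetical sparse composition with one exact zero.
sparse = [0.7, 0.0, 0.3]

# The squared entries of the result add up to the original unit sum,
# so the transformed point lies on the unit sphere, zero or not.
point = sqrt_transform(sparse)
norm = math.sqrt(sum(v * v for v in point))

# The log-ratio is undefined for a zero part: math.log(0.0) raises ValueError.
try:
    alr_coords = alr(sparse)
except (ValueError, ZeroDivisionError):
    alr_coords = None
```

The square-root image has unit Euclidean norm, which is why methods for directional data become applicable, while the log-ratio route needs the zero to be handled first.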

A study on the shoulder composition methods of power shoulder jackets and corresponding details

  • 박정아;이정란
    • 복식문화연구 / Vol. 29, No. 3 / pp.388-405 / 2021
  • This study classifies the compositional methods of power shoulder jackets from 1980 to the present. It analyzes the relevance of jacket details according to how the power shoulder changes and its compositional methods by era. The research subdivides shoulder compositional techniques into seven, based on shoulder variation, sleeve variation, and the body and sleeve combination. The researcher investigates the frequency and trends of composing shoulders and analyzes details pertaining to the silhouette, jacket length, collar shape, and front closure. The most common method of shoulder composition is an angular shoulder variation. The others are a rounded shoulder variation, puffed sleeve, sleeve variation using pattern incision, raglan and kimono sleeves, and a shoulder variation that expanded the angle and width. The frequency differs slightly for each era. The relationship between shoulder compositional methods and details of power shoulder jackets is statistically significant, showing period-related differences. The homogeneity analysis results reveal that the shoulder composition of power shoulder jackets, the times, and details fall into distinct groups. This analysis shows that the silhouette, length, collar, and front closure of the power shoulder jacket differ depending on the power shoulder's compositional methods. Moreover, the shape of the power shoulder jacket is distinctly different. One can use this data to help develop the power shoulder jacket design by reflecting the details of shoulder compositional methods and changing trends over time.

Compositional and Contextual Factors Related to Area Differentials in Suicide

  • 강은정
    • 보건교육건강증진학회지 / Vol. 30, No. 1 / pp.41-52 / 2013
  • Objectives: Rural-urban differences in suicide have been observed in many settings. However, there has been little research addressing what factors can explain these differences. The purpose of this study was to analyze which compositional factors and contextual factors in local areas might be related to local suicide. Methods: The study design was cross-sectional. Data for 251 primary local governments on age-standardized suicide mortality and predefined indicators of compositional and contextual factors were obtained from the Korean Statistical Information Service for the year 2010. Bivariate analyses, including one-way ANOVA and the chi-square test, were used to identify differences in local features by area type. Seven Poisson regression models for each of the total, male, and female groups were used to analyze which compositional and contextual factors were related to suicide. Results: There were differences in suicide between gu and goon in the total, male, and female groups. For the total group, compositional factors including divorce and smoking rate, and contextual factors including financial independence, water and wastewater coverage, and the number of wastewater discharge factories were found to explain the urban-rural differences. Conclusions: This study provided some evidence that contextual factors at the local level, as well as compositional factors, are useful for predicting local suicide mortality.

A Study on Model applied to Logic of Systematic Composition on the Space and Shape in Architecture

  • 이상화
    • 한국주거학회논문집 / Vol. 10, No. 2 / pp.175-183 / 1999
  • This study aims to model the compositional method of architectural space and shape. The composition of space consists of position and area within the space; these elements are therefore tied to a functional program derived from behavioral data. The elements of spatial structure are position and scale in space, and the composition of spatial shape is developed repeatedly through combinations of build-up methods. As functional programs become more specialized and buildings become larger, the importance of the functional program increases. When the data of the functional program are applied to the spatial structure, the process develops into a method of combination. The purpose of this study is to examine the degree to which modeling of the compositional method can be applied to architectural space and shape, which is a fundamental aspect of the quantitative analysis of architectural space.
