• Title/Abstract/Keywords: compositional regression

Search results: 14 items; processing time: 0.024 s

A guideline for the statistical analysis of compositional data in immunology

  • Yoo, Jinkyung;Sun, Zequn;Greenacre, Michael;Ma, Qin;Chung, Dongjun;Kim, Young Min
    • Communications for Statistical Applications and Methods / Vol. 29, No. 4 / pp. 453-469 / 2022
  • The study of immune cellular composition has attracted great scientific interest in immunology as multiple large-scale datasets have been generated. From the statistical point of view, such immune cellular data should be treated as compositional. In compositional data, each element is positive, and all the elements sum to a constant, which can generally be set to one. Standard statistical methods are not directly applicable to the analysis of compositional data because they do not appropriately handle correlations between the compositional elements. In this paper, we review statistical methods for compositional data analysis and illustrate them in the context of immunology. Specifically, we focus on regression analyses using log-ratio transformations and the alternative approach using Dirichlet regression analysis, discuss their theoretical foundations, and illustrate their applications with immune cellular fraction data generated from colorectal cancer patients.
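
The abstract above mentions log-ratio transformations without naming a specific one. As a rough illustration only, the sketch below applies the centered log-ratio (CLR) transformation, one common choice, to a composition. The function names and the example fractions are hypothetical and are not taken from the paper.

```python
import math

def closure(parts):
    """Rescale positive parts so that they sum to one."""
    total = sum(parts)
    return [p / total for p in parts]

def clr(parts):
    """Centered log-ratio: log of each part minus the mean of the logs."""
    logs = [math.log(p) for p in closure(parts)]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]

# Hypothetical cell-type fractions for one sample (not data from the paper).
cell_fractions = [0.5, 0.3, 0.2]
clr_values = clr(cell_fractions)
```

CLR coordinates always sum to zero and do not depend on the constant the composition sums to, so the same values result whether the parts are given as fractions or as percentages.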

Ranking subjects based on paired compositional data with application to age-related hearing loss subtyping

  • Nam, Jin Hyun;Khatiwada, Aastha;Matthews, Lois J.;Schulte, Bradley A.;Dubno, Judy R.;Chung, Dongjun
    • Communications for Statistical Applications and Methods / Vol. 27, No. 2 / pp. 225-239 / 2020
  • Analysis approaches for single compositional data are well established; however, effective analysis strategies for paired compositional data remain to be investigated. The current project was motivated by studies of age-related hearing loss (presbyacusis), where subjects are classified into four audiometric phenotypes that need to be ranked within these phenotypes based on their paired compositional data. We address this challenge by formulating this problem as a classification problem and integrating a penalized multinomial logistic regression model with compositional data analysis approaches. We utilize Elastic Net for a penalty function, while considering average, absolute difference, and perturbation operators for compositional data. We applied the proposed approach to the presbyacusis study of 532 subjects with probabilities that each ear of a subject belongs to each of four presbyacusis subtypes. We further investigated the ranking of presbyacusis subjects using the proposed approach based on previous literature. The data analysis results indicate that the proposed approach is effective for ranking subjects based on paired compositional data.
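
Of the three operators named in the abstract above, the perturbation operator has a standard definition in compositional data analysis: multiply two compositions part by part, then rescale to sum to one. The sketch below shows that definition only. The ear-wise probabilities are made up for illustration, and how the authors combine the operator with the penalized model is not shown here.

```python
def closure(parts):
    """Rescale positive parts so that they sum to one."""
    total = sum(parts)
    return [p / total for p in parts]

def perturb(x, y):
    """Perturbation of two compositions: element-wise product, then closure."""
    return closure([a * b for a, b in zip(x, y)])

# Hypothetical subtype probabilities for the two ears of one subject.
left_ear = [0.7, 0.1, 0.1, 0.1]
right_ear = [0.4, 0.3, 0.2, 0.1]
combined = perturb(left_ear, right_ear)
```

The uniform composition is the neutral element of this operation: perturbing any composition by it returns the same composition.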

Compositional and Contextual Factors Related to Area Differentials in Suicide

  • 강은정
    • 보건교육건강증진학회지 / Vol. 30, No. 1 / pp. 41-52 / 2013
  • Objectives: Rural-urban differences in suicide have been observed in many settings. However, little research has addressed which factors can explain these differences. The purpose of this study was to analyze which compositional and contextual factors in local areas might be related to local suicide. Methods: The study design was cross-sectional. Data for 251 primary local governments on age-standardized suicide mortality and predefined indicators of compositional and contextual factors were obtained from the Korean Statistical Information Service as of 2010. Bivariate analyses, including one-way ANOVA and the chi-square test, were used to identify differences in local features by area type. Seven Poisson regression models for each of the total, male, and female groups were used to analyze which compositional and contextual factors were related to suicide. Results: There were differences in suicide between gu and goon in the total, male, and female groups. For the total group, compositional factors including divorce and smoking rate, and contextual factors including financial independence, water and wastewater coverage, and number of wastewater discharge factories, were found to explain the urban-rural differences. Conclusions: This study provided some evidence that contextual factors at the local level, as well as compositional factors, are useful for predicting local suicide mortality.

Prediction of Transition Temperature and Magnetocaloric Effects in Bulk Metallic Glasses with Ensemble Models

  • 남충희
    • 한국재료학회지 / Vol. 34, No. 7 / pp. 363-369 / 2024
  • In this study, the magnetocaloric effect and transition temperature of bulk metallic glass, an amorphous material, were predicted through machine learning based on compositional features. From the Python module 'Matminer', 174 compositional features were obtained, and prediction performance was compared while reducing the number of compositional features to prevent overfitting. After optimization using RandomForest, an ensemble model, changes in prediction performance were analyzed according to the number of compositional features. The R² score was used as the performance metric in the regression prediction, and the best prediction performance was found using only 90 features when predicting transition temperature and 20 features when predicting magnetocaloric effects. The most important feature when predicting magnetocaloric effects was the 'Fe' compositional ratio. The feature importance method provided by 'scikit-learn' was applied to sort the compositional features. The feature importance method was found to be appropriate by comparing the prediction performance of the Fe-containing dataset with that of the full dataset.
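
The abstract above reports performance as an R² score. As a minimal sketch of that metric, assuming the usual definition (one minus the ratio of the residual sum of squares to the total sum of squares), the snippet below computes it in plain Python. The function name and the temperatures are hypothetical; the study itself used scikit-learn.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical transition temperatures (K) and model predictions.
measured = [300.0, 320.0, 350.0, 400.0]
predicted = [305.0, 318.0, 345.0, 402.0]
score = r2_score(measured, predicted)
```

A score of 1 means the predictions match the measurements exactly, and a score of 0 means the model does no better than always predicting the mean of the measurements.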

Sources of Residential Satisfaction of the Apartment Households in Seoul: A Contextual Analysis

  • 김용일;여홍구
    • 한국조경학회지 / Vol. 16, No. 3 / pp. 47-58 / 1989
  • Residential satisfaction by apartment housing type and by location was examined in Seoul, Korea, for a sample of 303 housewives disaggregated into four housing subgroups. These groups differ in their personal characteristics by housing type and location. They showed significant differences in their levels of satisfaction and in their perception and evaluation of several community, neighborhood, and housing unit attributes. A regression model of satisfaction for the entire sample explains about 45% of the variation, but this conceals the compositional and contextual differences between groups. Separate regressions for the four groups explain an average of 63% of the variation in residential satisfaction. Residents of high-rise and low-rise apartments in both central and peripheral locations differ significantly from each other. Results show that certain dwelling, neighborhood, and community contexts elicit dissatisfaction across the full sample. The objective contextual factor of housing type proves significant in most compositional subsamples, indicating that the sources of residential satisfaction are not the same everywhere.

Compositional Analysis of Naphtha by FT-Raman Spectroscopy

  • 구민식;정호일
    • Bulletin of the Korean Chemical Society / Vol. 20, No. 2 / pp. 159-162 / 1999
  • Three different chemical compositions in naphtha (total paraffin, total naphthene, and total aromatic content) have been successfully analyzed using FT-Raman spectroscopy. Partial least squares (PLS) regression has been utilized to develop calibration models for each composition from Raman spectral bands. The PLS calibration results showed good correlation with those of gas chromatography (GC). Using PLS regression, the spectral information related to each composition has been successfully extracted from the highly overlapped Raman spectra of naphtha.

Rapid Compositional Analysis of Naphtha by Near-Infrared Spectroscopy

  • 구민식;정호일;이준식
    • Bulletin of the Korean Chemical Society / Vol. 19, No. 11 / pp. 1189-1193 / 1998
  • The determination of total paraffin, naphthene, and aromatic (PNA) contents in naphtha samples, which were obtained directly from an actual refining process, has been studied using near-infrared (NIR) spectroscopy. Each of the total PNA concentrations in naphtha has been successfully analyzed using NIR spectroscopy. The partial least squares (PLS) regression method has been utilized to quantify the total PNA contents in naphtha from the NIR spectral bands. The NIR calibration results showed an excellent correlation with those of conventional gas chromatography (GC). Owing to its rapidity and accuracy, NIR spectroscopy appears to be a new analytical technique that can substitute for the conventional GC method in the quantitative analysis of petrochemical products, including naphtha.

Compositional differences of Bojungikgi-tang decoctions using pressurized or non-pressurized extraction methods with variable extraction times

  • Kim, Jung-Hoon;Seo, Chang-Seob;Kim, Seong-Sil;Shin, Hyeun-Kyoo
    • 대한본초학회지 / Vol. 28, No. 4 / pp. 1-6 / 2013
  • Objectives: In order to determine the optimal extraction conditions, various Bojungikgi-tang (BJIGT) decoctions prepared at different pressure levels and with different extraction times were compared and evaluated in terms of the extract yield and the total soluble solid content. Methods: Decoctions were prepared at pressure levels of 0 (non-pressurized) and 1 $kgf/cm^2$ (pressurized) for 60, 120, and 180 min. The extract yield and the total soluble solid content of the decoctions were measured, and the amounts of the reference compounds in the decoctions were determined using high performance liquid chromatography. Results: The extract yield and the total soluble solid content were higher in decoctions extracted by the pressurized method than in those extracted by the non-pressurized method. The yield and content increased in proportion to the extraction time. In the linear regression analysis of the four reference compounds (liquiritin, nodakenin, hesperidin, and glycyrrhizin), good linearity, with correlation coefficients above 0.9999, was observed. The highest contents of the four reference compounds were observed at 180 min for both the pressurized and the non-pressurized method. Conclusions: This study suggests that the pressure used in the extraction method and the extraction time affect the compositional constituents of BJIGT decoctions. An extraction time of 180 min could be chosen as the optimal extraction condition for both the pressurized and the non-pressurized method.

Control of the Composition of $Ba_{1-x}Sr_xTiO_3$ Single Crystals

  • 노건배;양상돈;유상임
    • 한국결정학회지 / Vol. 14, No. 2 / pp. 73-78 / 2003
  • $(Ba_{1-x}Sr_x)TiO_3$ (BST, 0.4 < x < 0.65) single crystals were successfully grown by the TSSG (Top-Seeded Solution Growth) method, using commercial [100] $SrTiO_3$ or as-grown [100] BST single crystals as seed crystals. To obtain BST single crystals with various compositions x, the Ba/Sr molar ratios in the solutions were systematically controlled while the Ti ion content among all cations was fixed at 67 mol%. A linear regression curve between the x values and the molar ratios of Sr/(Ba + Sr) in the solutions could be obtained, which in turn could be used to select the initial composition needed to produce a BST crystal with a target x value. In addition, isothermal growth was found to be more effective than a slow cooling process for obtaining compositional uniformity.
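
The calibration idea described in the abstract above (fit a line relating the solution's Sr/(Ba + Sr) ratio to the crystal composition x, then invert it to choose a starting solution) can be sketched as follows. All numbers are invented for illustration and are not the paper's measurements, and the function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def solution_ratio_for_target(target_x, slope, intercept):
    """Invert the calibration line to get the solution ratio for a target x."""
    return (target_x - intercept) / slope

# Hypothetical calibration points: solution Sr/(Ba + Sr) ratio vs. crystal x.
solution_ratios = [0.40, 0.50, 0.60, 0.70]
crystal_x = [0.41, 0.50, 0.59, 0.68]

slope, intercept = fit_line(solution_ratios, crystal_x)
ratio_needed = solution_ratio_for_target(0.55, slope, intercept)
```

Such an inversion is only meaningful within the range of compositions covered by the calibration points.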

Unmanned Aerial Vehicles Images Based Tidal Flat Surface Sedimentary Facies Mapping Using Regression Kriging

  • 곽근호;김근용;이진교;유주형
    • 대한원격탐사학회지 / Vol. 39, No. 5_1 / pp. 537-549 / 2023
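  • Because the distribution characteristics of tidal flat sediment components are used as basic data in coastal environment analysis and environmental impact assessment, producing a reliable map of tidal flat surface sedimentary facies is very important. This study evaluated the applicability of regression kriging for generating tidal flat sedimentary facies maps. To this end, we investigated the influence of various factors on the classification of tidal flat surface sedimentary facies: the number of field survey samples, the type of auxiliary data, the regression model used in regression kriging, and a comparison with other prediction methods (univariate kriging and regression analysis). To evaluate the applicability of regression kriging, a case study using unmanned aerial vehicle (UAV) data was conducted on the Hwangdo tidal flat, located on Anmyeondo in Taean-gun, Korea. The case study showed that, to produce a reliable map of tidal flat surface sedimentary facies, securing an adequate number of field survey samples and using topographic elevation and a tidal channel density map as auxiliary data were most important. In addition, regression kriging, which can account for detailed characteristics of the sediment distribution by using ultra-high-resolution UAV data, showed the best predictive performance among the compared methods. These results are expected to serve as a guideline for producing tidal flat surface sedimentary facies maps.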