• Title/Abstract/Keywords: multivariate data analysis

Search results: 1,402 items (processing time 0.037 s)

Gallbladder Carcinoma: Analysis of Prognostic Factors in 132 Cases

  • Wang, Rui-Tao;Xu, Xin-Sen;Liu, Jun;Liu, Chang
    • Asian Pacific Journal of Cancer Prevention
    • /
    • Vol. 13, No. 6
    • /
    • pp.2511-2514
    • /
    • 2012
  • Objective: To evaluate the prognostic factors of gallbladder carcinoma. Methods: Presentation, operative data, complications, and survival outcome were examined for 132 gallbladder carcinoma patients who underwent gallbladder surgery in our unit during 2002-2007, and follow-up results were obtained from every patient for univariate and multivariate survival analysis. Results: The univariate analysis showed that gallbladder lesion history, tumor cell differentiation, Nevin staging, preoperative lymph node metastasis and the surgical approach significantly correlated with the prognosis of the patients (p<0.05). The results of the multivariate analysis (Cox regression) showed that gallbladder lesion history, Nevin staging and the surgical approach were independent predictors with relative risks of 6.9, 4.4 and 2.8, respectively (p=0.002, 0.003, 0.008). Conclusion: Gallbladder lesion history, Nevin staging and the surgical approach are independent prognostic factors for gallbladder carcinoma, a rapidly fatal disease. Therefore, early diagnosis, anti-infective therapy and radical surgery are greatly needed to improve the prognosis of gallbladder carcinoma.

Unsupervised Clustering of Multivariate Time Series Microarray Experiments based on Incremental Non-Gaussian Analysis

  • Ng, Kam Swee;Yang, Hyung-Jeong;Kim, Soo-Hyung;Kim, Sun-Hee;Anh, Nguyen Thi Ngoc
    • International Journal of Contents
    • /
    • Vol. 8, No. 1
    • /
    • pp.23-29
    • /
    • 2012
  • Multiple expression levels of genes obtained using time series microarray experiments have been exploited effectively to enhance understanding of a wide range of biological phenomena. However, microarray data usually take the form of large, high-dimensional matrices of gene expression values. Among the huge number of genes presented in microarrays, only a small number of genes are expected to be effective for performing a certain task. Hence, discounting the majority of unaffected genes is the crucial goal of gene selection to improve accuracy for disease diagnosis. In this paper, a non-Gaussian weight matrix obtained from an incremental model is proposed to extract useful features of multivariate time series microarrays. The proposed method can automatically identify a small number of significant features by discovering hidden variables among a huge number of features. A representative unsupervised hierarchical clustering method is then used to evaluate the effectiveness of the proposed methodology. The proposed method achieves promising results in terms of the predictive accuracy of clustering compared to existing methods of analysis. Furthermore, the proposed method offers a robust approach with low memory and computation costs.

Differentiation of Roots of Glycyrrhiza Species by 1H Nuclear Magnetic Resonance Spectroscopy and Multivariate Statistical Analysis

  • Yang, Seung-Ok;Hyun, Sun-Hee;Kim, So-Hyun;Kim, Hee-Su;Lee, Jae-Hwi;Whang, Wan-Kyun;Lee, Min-Won;Choi, Hyung-Kyoon
    • Bulletin of the Korean Chemical Society
    • /
    • Vol. 31, No. 4
    • /
    • pp.825-828
    • /
    • 2010
  • To classify Glycyrrhiza species, samples of different species were analyzed using a $^{1}$H NMR-based metabolomics technique. Partial least squares discriminant analysis (PLS-DA) was used for multivariate statistical analysis of the $^{1}$H NMR data sets. There was a clear separation between the Glycyrrhiza species in the PLS-DA-derived score plots. The PLS-DA model was validated, and the key metabolites contributing to the separation in the score plots were lactic acid, alanine, arginine, proline, malic acid, asparagine, choline, glycine, glucose, sucrose, 4-hydroxyphenylacetic acid, and formic acid. The compounds present at relatively high levels were glucose and 4-hydroxyphenylacetic acid in G. glabra; lactic acid, alanine, and proline in G. inflata; and arginine, malic acid, and sucrose in G. uralensis. This is the first study to perform global metabolomic profiling and differentiation of Glycyrrhiza species using $^{1}$H NMR and multivariate statistical analysis.

다변량 해석기법을 이용한 인천연안해역의 수질평가 (The Evaluation of Water Quality in Coastal Sea of Incheon Using a Multivariate Analysis)

  • 김종구
    • 한국환경과학회지
    • /
    • Vol. 15, No. 11
    • /
    • pp.1017-1025
    • /
    • 2006
  • This study evaluated the characteristics of water quality in the coastal sea of Incheon using multivariate analysis. The data were acquired from NFRDI surveys conducted from March 1997 to November 2003, with eleven water quality parameters determined in each survey. The results are summarized as follows. Water quality in the Incheon coastal sea could be explained up to 64.62% by three factors: loading of fresh water and nutrients from land (36.98%), seasonal variation (16.19%), and internal metabolism (11.24%). In the time series analysis of factor scores, for factor 1, station 1, which is influenced by the Han River, showed high factor scores, and station 3, located toward the outer sea, showed low factor scores. For factor 2, station 1 showed high variation and station 3 showed low variation. Cluster analysis by station yielded three groups with different water quality characteristics. In particular, station 1, which is affected by the Han River, and station 4, which is affected by a sewage treatment plant, showed water quality characteristics markedly different from those of the other stations. The yearly cluster analysis also yielded three groups, and water quality in 2003 differed from that of the other years because of high precipitation. These results suggest that controlling the discharge of fresh water from the Han River and the sewage treatment plant is important for water quality management in the coastal sea of Incheon.

Evaluation of Water Quality Using Multivariate Statistic Analysis in Busan Coastal Area

  • Kim, Sang-Soo;Cho, Jang-Sik
    • Journal of the Korean Data and Information Science Society
    • /
    • Vol. 15, No. 3
    • /
    • pp.531-542
    • /
    • 2004
  • Principal component analysis and cluster analysis were conducted to comprehensively evaluate the water quality of the Busan coastal area, using data collected seasonally from surface water at 10 stations from 1997 to 2003. The first principal component was interpreted as a factor related to the input of nutrient-rich fresh water, and the second principal component as reflecting meteorological characteristics. The water quality at stations 4 and 9 also differed from that at the other stations in the Busan coastal area.
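
As a rough illustration of the principal component step described in this abstract, the sketch below standardizes a small station-by-parameter table and extracts the first principal component by power iteration on the correlation matrix. The three parameters (salinity, nitrogen, phosphorus), the `freshwater` index, and all numbers are synthetic assumptions made for this example, not data or methods from the paper.

```python
import math

def standardize(rows):
    """Center each column and scale it to unit sample standard deviation."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
           for j in range(p)]
    return [[(r[j] - means[j]) / sds[j] for j in range(p)] for r in rows]

def correlation_matrix(z):
    """Correlation matrix of already standardized rows."""
    n, p = len(z), len(z[0])
    return [[sum(r[i] * r[j] for r in z) / (n - 1) for j in range(p)]
            for i in range(p)]

def first_component(c, iters=200):
    """Leading eigenvector and eigenvalue of c via power iteration."""
    p = len(c)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iters):
        w = [sum(c[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * c[i][j] * v[j] for i in range(p) for j in range(p))
    return v, eigenvalue

# Synthetic example: 10 hypothetical stations with a made-up freshwater index.
# Columns: salinity, dissolved nitrogen, dissolved phosphorus (all invented).
freshwater = [0.9, 0.1, 0.5, 0.95, 0.3, 0.2, 0.6, 0.4, 0.85, 0.15]
rows = [
    (33.0 - 5.0 * f + 0.2 * ((i % 3) - 1),
     0.2 + 0.8 * f + 0.03 * ((i % 2) * 2 - 1),
     0.02 + 0.05 * f + 0.002 * (((i + 1) % 3) - 1))
    for i, f in enumerate(freshwater)
]

c = correlation_matrix(standardize(rows))
loadings, eigenvalue = first_component(c)
# The eigenvalues of a correlation matrix sum to the number of variables.
explained_share = eigenvalue / len(c)
print("PC1 loadings:", [round(x, 3) for x in loadings])
print("share of variance explained by PC1:", round(explained_share, 3))
```

In this synthetic table the nutrient loadings share one sign and salinity takes the opposite sign, which is the kind of pattern one might read as a "nutrient-rich fresh water" component. Whether the paper's first component has this structure cannot be determined from the abstract.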

동적 다변량 그래프의 연속적 분석을 위한 질의 모델 설계 및 구현 (A Query Model for Consecutive Analyses of Dynamic Multivariate Graphs)

  • 배예찬;함도영;김태양;정혜진;김동윤
    • 컴퓨터교육학회논문지
    • /
    • Vol. 17, No. 6
    • /
    • pp.103-113
    • /
    • 2014
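  • In this study, we designed and implemented a query model that enables continuous analysis of dynamic multivariate graph data. First, the query model was designed in two stages, setting a discriminant function and selecting a method of integration over time, and was implemented as a query system consisting of a query panel, a graph visualization panel, and an attribute panel. For graph representation, we used node-link diagrams and the Force-Directed Graph Drawing algorithm, and visual effects were applied to the targets selected by a query so that users can distinguish them visually. Finally, we validated the proposed dynamic multivariate graph query model using world small arms trade volume data. The significance of this study is that, by designing a new query model capable of continuous analysis of dynamic graphs, it overcomes a limitation of existing models, which can analyze dynamic graphs only discretely, one time point at a time. This study is expected to contribute to research that uses dynamic graphs, such as trend analysis and complex network analysis.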
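
The two-stage design described in this abstract (a discriminant function, then a choice of how to integrate over time) can be sketched roughly as follows. This is only one possible reading of the abstract: the function `run_query`, the integration rules, the `exports` attribute, and the sample values are all hypothetical and are not taken from the paper's system.

```python
from typing import Callable, Dict, List

Attributes = Dict[str, float]
Snapshot = Dict[str, Attributes]  # node id -> attributes at one time step

def run_query(snapshots: List[Snapshot],
              predicate: Callable[[Attributes], bool],
              integrate: Callable[[List[bool]], bool]) -> List[str]:
    """Stage 1: apply the discriminant function to each node at each time step.
    Stage 2: combine the per-step results with the chosen integration rule."""
    node_ids = sorted(set().union(*snapshots))
    selected = []
    for node in node_ids:
        # A node that is absent from a snapshot counts as not matching there.
        hits = [node in snap and predicate(snap[node]) for snap in snapshots]
        if integrate(hits):
            selected.append(node)
    return selected

# Three hypothetical integration rules over the time axis.
def every_step(hits: List[bool]) -> bool:
    return all(hits)

def any_step(hits: List[bool]) -> bool:
    return any(hits)

def majority(hits: List[bool]) -> bool:
    return sum(hits) / len(hits) >= 0.5

# Invented yearly snapshots of a trade graph (node attributes only).
snapshots = [
    {"A": {"exports": 120}, "B": {"exports": 40}, "C": {"exports": 90}},
    {"A": {"exports": 150}, "B": {"exports": 110}, "C": {"exports": 80}},
    {"A": {"exports": 130}, "B": {"exports": 105}},
]

def is_large_exporter(attrs: Attributes) -> bool:
    return attrs["exports"] >= 100

always = run_query(snapshots, is_large_exporter, every_step)
ever = run_query(snapshots, is_large_exporter, any_step)
mostly = run_query(snapshots, is_large_exporter, majority)
print(always, ever, mostly)
```

Separating the predicate from the integration rule lets the same condition be asked "at every time step", "at some time step", or "most of the time" without rewriting it, which is one way to move from per-snapshot queries to queries over a time range.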

불연속지반의 연속체 모델 적용범위에 대한 수치해석적 연구 (A Study on Application Range of Continuum Model to Discontinuous Rock mass with Numerical Analysis)

  • 이경우;노상림;윤지선
    • 한국지반공학회:학술대회논문집
    • /
    • Proceedings of the 2002 Spring Conference of the Korean Geotechnical Society
    • /
    • pp.197-204
    • /
    • 2002
  • In this study, we performed multivariate analysis on domestic road tunnel data (958 records) and propose a simple prediction equation for the Q-system. Using this equation, we generated Q-values applicable to numerical analysis and investigated the excavation-induced behavior of the rock mass for varying Q-values with the discontinuum numerical analysis code UDEC. The results indicate that the continuum model can be applied to a discontinuous rock mass when the Q-value is below 0.7, and we verified the applicability of the continuum model at that Q-value with the continuum numerical analysis code FLAC.

Hadoop기반의 공개의료정보 빅 데이터 분석을 통한 한국여성암 검진 요인분석 서비스 (Analysis of Factors for Korean Women's Cancer Screening through Hadoop-Based Public Medical Information Big Data Analysis)

  • 박민희;조영복;김소영;박종배;박종혁
    • 한국정보통신학회논문지
    • /
    • Vol. 22, No. 10
    • /
    • pp.1277-1286
    • /
    • 2018
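  • In this paper, for big data analysis of public medical information, we introduced an Apache Hadoop-based cloud environment to provide flexible scalability of computing resources, including the ability to expand resources such as storage and memory quickly and flexibly when log data accumulate over a long period or increase rapidly. In addition, to overcome the processing limits of existing analysis tools when real-time analysis of accumulated unstructured log data is required, the system adopts a Hadoop-based analysis module that processes large volumes of log data quickly and reliably in a parallel, distributed manner. For the big data analysis, frequency analysis and chi-square tests were performed, followed by univariate logistic regression at a significance level of 0.05 and multivariate logistic regression on the significant variables for each model (p<0.05). When the significant variables were divided by model and analyzed by multivariate logistic regression, the goodness of fit increased toward Model 3.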
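
To make the screening step in this abstract concrete, here is a minimal sketch of a Pearson chi-square test on a 2x2 table at the 0.05 significance level. The counts are invented for illustration, no continuity correction is applied, and nothing here reproduces the paper's data or its Hadoop pipeline.

```python
import math

def chi_square_2x2(a: int, b: int, c: int, d: int):
    """Pearson chi-square statistic (no continuity correction) and p-value
    for the 2x2 table [[a, b], [c, d]], which has 1 degree of freedom."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical counts: rows = factor present / absent,
# columns = screened / not screened.
chi2, p_value = chi_square_2x2(60, 40, 45, 55)
significant = p_value < 0.05  # candidate for the later regression models
print(round(chi2, 3), round(p_value, 4), significant)
```

A variable that passes this kind of screen would then be a candidate for the univariate and multivariate logistic regression models mentioned in the abstract.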

주암호의 조류 발생 특성과 수질요인의 상관성 연구 (Relationships Between the Characteristics of Algae Occurrence and Environmental Factors in Lake Juam, Korea)

  • 서경애;정수정;박종환;황경섭;임병진
    • 한국물환경학회지
    • /
    • Vol. 29, No. 3
    • /
    • pp.317-328
    • /
    • 2013
  • The purpose of this study was to investigate phytoplankton fluctuations and long-term changes in the water quality of Lake Juam and to evaluate the relationship between phytoplankton patterns and environmental factors. Correlation and factor analyses were employed to identify key environmental factors affecting phytoplankton dynamics. Of 18 parameters, pH, temperature, COD, BOD and T-P were highly correlated with Chl-a. Phytoplankton data showed that cyanobacteria were dominant, accounting for more than 60% of total algal density. Lake Juam was also strongly influenced by the Asian monsoon climate. This study shows the need for multivariate statistical techniques to evaluate the complex data set of Lake Juam, both to obtain better information and to manage the water source effectively.

Shelf-life prediction of fresh ginseng packaged with plastic films based on a kinetic model and multivariate accelerated shelf-life testing

  • Jong-Jin Park;Jeong-Hee Choi;Kee-Jai Park;Jeong-Seok Cho;Dae-Yong Yun;Jeong-Ho Lim
    • 한국식품저장유통학회지
    • /
    • Vol. 30, No. 4
    • /
    • pp.573-588
    • /
    • 2023
  • The purpose of this study was to monitor changes in the quality of ginseng and predict its shelf-life. As the storage period of ginseng increased, some quality indicators, such as water-soluble pectin (WSP), CDTA-soluble pectin (CSP), cellulose, weight loss, and microbial growth increased, while others (Na$_2$CO$_3$-soluble pectin/NSP, hemicellulose, starch, and firmness) decreased. Principal component analysis (PCA) was performed using the quality attribute data and the principal component 1 (PC1) scores extracted from the PCA results were applied to the multivariate analysis. The reaction rate at different temperatures and the temperature dependence of the reaction rate were determined using kinetic and Arrhenius models, respectively. Among the kinetic models, zeroth-order models with cellulose and a PC1 score provided an adequate fit for reaction rate estimation. Hence, the prediction model was constructed by applying the cellulose and PC1 scores to the zeroth-order kinetic and Arrhenius models. The prediction model with PC1 score showed higher $R^2$ values (0.877-0.919) than those of cellulose (0.797-0.863), indicating that multivariate analysis using PC1 score is more accurate for the shelf-life prediction of ginseng. The predicted shelf-life using the multivariate accelerated shelf-life test at 5, 20, and 35℃ was 40, 16, and 7 days, respectively.
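
The combination of zeroth-order kinetics and the Arrhenius model named in this abstract can be illustrated with a back-of-envelope calculation. Under a zeroth-order model the quality index changes linearly in time, so the shelf-life is inversely proportional to the rate constant k; with k = A·exp(-Ea/(R·T)), ln(shelf-life) is linear in 1/T with slope Ea/R. The sketch below fits that line to the three shelf-life values quoted in the abstract. It assumes the same starting quality and the same quality limit at every temperature, and the resulting apparent activation energy is an illustration only, not a value reported by the authors.

```python
import math

R_GAS = 8.314  # gas constant, J/(mol*K)

# Shelf-life values quoted in the abstract (days) at 5, 20, and 35 deg C.
temps_c = [5.0, 20.0, 35.0]
shelf_days = [40.0, 16.0, 7.0]

def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx

inv_t = [1.0 / (t + 273.15) for t in temps_c]   # 1/T in 1/K
ln_life = [math.log(d) for d in shelf_days]     # ln(shelf-life)
slope, intercept = fit_line(inv_t, ln_life)

# Assumption: shelf-life is proportional to 1/k, so slope = Ea / R.
apparent_ea = slope * R_GAS  # J/mol

def predict_shelf_life(temp_c: float) -> float:
    """Shelf-life in days from the fitted line (interpolation only)."""
    return math.exp(intercept + slope / (temp_c + 273.15))

print("apparent Ea (kJ/mol):", round(apparent_ea / 1000.0, 1))
print("fitted shelf-life at 20 deg C (days):", round(predict_shelf_life(20.0), 1))
```

With only three temperatures the fit is loose, and `predict_shelf_life` should be used only within the 5 to 35℃ range covered by the quoted values.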