• Title/Summary/Keyword: Multivariate Statistical Methods


Immunohistochemical Assay for Lymph-Node Micrometastasis in Gastric Cancer and Correlation with Survival Rate (위암에서 림프절 미세전이의 면역조직화학적 방법에 의한 측정 및 생존율과의 상관관계)

  • Moon Chul;Park Kyung-Kyu;Lee Moon Soo;Hur Kyung Yul;Jang Yong Seog;Kim Jae Joon;Lee Min Hyuk;Jin So-Young;Lee Dong Wha
    • Journal of Gastric Cancer
    • /
    • v.2 no.1
    • /
    • pp.5-11
    • /
    • 2002
  • Purpose: The purpose of this study is to identify immunohistochemical evidence of lymph-node micrometastasis in histologic node-negative gastric cancer patients and to evaluate the prognostic significance of lymph-node micrometastasis. Materials and Methods: A retrospective study of 50 gastric cancer patients who underwent curative resections from October 1990 to November 1994 was performed. Two consecutive sections were prepared: one for ordinary hematoxylin and eosin staining, and the other for immunohistochemical staining with Pan cytokeratin antibody (Novocastra, UK). In the univariate analysis, the survival rate was calculated using the life table method, and the multivariate analysis was performed using a Cox proportional hazards model. The relationships between the clinicopathologic factors and micrometastases were analyzed using a chi-square test. Results: Of 2522 harvested lymph nodes, 81 ($4.1\%$) nodes and 19 ($38\%$) of 50 patients were identified as having lymph-node micrometastases by immunohistochemical staining for cytokeratin. The incidence of lymph-node micrometastases was significantly higher in diffuse-type carcinomas ($54\%$, P=0.024) and in patients with serosal invasion ($52.2\%$, P=0.05). For patients with lymph-node micrometastases (n=19), the 5-year survival rate was significantly decreased ($73.7\%$, P=0.015). Lauren's classification (P=0.021) and the depth of invasion (P=0.035) were shown by multivariate analysis to have a significant relationship with the presence of micrometastases. Multivariate analysis revealed that lymph-node micrometastasis was independently correlated with survival in histologic node-negative gastric cancer patients. Conclusion: The presence of cytokeratin-detected lymph-node micrometastases correlates with a worse prognosis for patients with histologic node-negative gastric cancer.

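The life table (actuarial) method named in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' analysis: the interval counts are hypothetical, the function name `life_table_survival` is an assumption, and censored patients are assumed to be at risk for half of the interval in which they leave.

```python
def life_table_survival(intervals):
    """Cumulative survival by the actuarial (life table) method.

    intervals: list of (n_entering, deaths, censored) tuples, one per
    follow-up interval, in time order.
    Returns the cumulative survival proportion at the end of each interval.
    """
    survival = []
    cumulative = 1.0
    for n_entering, deaths, censored in intervals:
        # Censored patients are counted as at risk for half the interval.
        effective_at_risk = n_entering - censored / 2.0
        if effective_at_risk <= 0:
            raise ValueError("no patients at risk in interval")
        cumulative *= 1.0 - deaths / effective_at_risk
        survival.append(cumulative)
    return survival


# Hypothetical counts for two yearly intervals (not data from the study).
toy_intervals = [(10, 2, 0), (8, 1, 2)]
curve = life_table_survival(toy_intervals)
print([round(s, 4) for s in curve])
```

Comparing such curves between groups, or adjusting for covariates as a Cox model does, would need additional steps that are not shown here.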

An Alternative Parametric Estimation of Sample Selection Model: An Application to Car Ownership and Car Expense (비정규분포를 이용한 표본선택 모형 추정: 자동차 보유와 유지비용에 관한 실증분석)

  • Choi, Phil-Sun;Min, In-Sik
    • Communications for Statistical Applications and Methods
    • /
    • v.19 no.3
    • /
    • pp.345-358
    • /
    • 2012
  • In a parametric sample selection model, the distribution assumption is critical to obtain consistent estimates. Conventionally, the normality assumption has been adopted for both error terms in selection and main equations of the model. The normality assumption, however, may excessively restrict the true underlying distribution of the model. This study introduces the $S_U$-normal distribution into the error distribution of a sample selection model. The $S_U$-normal distribution can accommodate a wide range of skewness and kurtosis compared to the normal distribution. It also includes the normal distribution as a limiting distribution. Moreover, the $S_U$-normal distribution can be easily extended to multivariate dimensions. We provide the log-likelihood function and expected value formula based on a bivariate $S_U$-normal distribution in a sample selection model. The results of simulations indicate the $S_U$-normal model outperforms the normal model for the consistency of estimators. As an empirical application, we provide the sample selection model for car ownership and a car expense relationship.
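The transformation behind Johnson's $S_U$ family can be sketched as below. This assumes the standard Johnson form, $Z = \gamma + \delta \sinh^{-1}((X - \xi)/\lambda)$ with $Z$ standard normal; the paper's $S_U$-normal parameterization may differ, and the function names are hypothetical.

```python
import math


def su_from_normal(z, gamma, delta, xi, lam):
    """Map a standard normal value z to a Johnson S_U value x."""
    return xi + lam * math.sinh((z - gamma) / delta)


def normal_from_su(x, gamma, delta, xi, lam):
    """Inverse map: z = gamma + delta * asinh((x - xi) / lam)."""
    return gamma + delta * math.asinh((x - xi) / lam)


# Non-normal case: gamma != 0 introduces skewness, smaller delta gives heavier tails.
x = su_from_normal(1.0, gamma=-0.5, delta=1.2, xi=0.0, lam=1.0)
z_back = normal_from_su(x, gamma=-0.5, delta=1.2, xi=0.0, lam=1.0)

# Near-normal limit: with gamma = 0 and lam = delta very large, x approaches z.
x_limit = su_from_normal(1.0, gamma=0.0, delta=1e6, xi=0.0, lam=1e6)
print(x, z_back, x_limit)
```

The last line illustrates, under these assumptions, why the normal distribution appears as a limiting case of the family.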

Chemical Oxygen Demand (COD) Model for the Assessment of Water Quality in the Han River, Korea (한강수질 평가를 위한 COD (화학적 산소 요구량) 모델 평가)

  • Kim, Jae Hyoun;Jo, Jinnam
    • Journal of Environmental Health Sciences
    • /
    • v.42 no.4
    • /
    • pp.280-292
    • /
    • 2016
  • Objectives: The objective of this study was to build COD regression models for the Han River and evaluate water quality. Methods: Water quality data sets for the dry season (as of January) during a four-year period (2012-2015) were collected from the database of the Han River automatic water quality monitoring stations. Statistical techniques, including a combined genetic algorithm-multiple linear regression (GA-MLR) approach, were used to build five-descriptor COD models. Multivariate statistical techniques such as principal component analysis (PCA) and cluster analysis (CA) are useful tools for extracting meaningful information. Results: The $r^2$ values of the best COD models were high (> 0.8) between 2012 and 2015. Total organic carbon (TOC) was a surrogate indicator for COD (as COD/TOC) with high reliability ($r^2=0.63$ in 2012, $r^2=0.75$ in 2013, $r^2=0.79$ in 2014, and $r^2=0.85$ in 2015). The COD/TOC ratios were calculated as 2.08 in 2012, 1.79 in 2013, 1.52 in 2014, and 1.45 in 2015, indicating that biodegradability in the water body of the Han River was being sustained, thereby further improving water quality. The BOD/COD ratio supported these findings. The cluster analysis revealed higher annual levels of microorganisms and phosphorus at stations along the Hangang-Seoul and Hantangang areas. Nevertheless, the overall water quality over the last four years showed an observable trend toward continuous improvement. These findings also suggest that non-point pollution control strategies should consider the influence of upstream and downstream areas to protect water quality in the Han River. Conclusion: This data analysis procedure provided an efficient and comprehensive tool to interpret complex water quality data matrices. Results from a trend analysis provided important information about sources and parameters for Han River water quality management.

MEAT SPECIATION USING A HIERARCHICAL APPROACH AND LOGISTIC REGRESSION

  • Arnalds, Thosteinn;Fearn, Tom;Downey, Gerard
    • Proceedings of the Korean Society of Near Infrared Spectroscopy Conference
    • /
    • 2001.06a
    • /
    • pp.1245-1245
    • /
    • 2001
  • Food adulteration is a serious consumer fraud and a matter of concern to food processors and regulatory agencies. A range of analytical methods has been investigated to facilitate the detection of adulterated or mislabelled foods and food ingredients, but most of these require sophisticated equipment and highly qualified staff and are time-consuming. Regulatory authorities and the food industry require a screening technique which will facilitate fast and relatively inexpensive monitoring of food products with a high level of accuracy. Near infrared spectroscopy has been investigated for its potential in a number of authenticity issues including meat speciation (McElhinney, Downey & Fearn (1999) JNIRS, 7(3), 145-154; Downey, McElhinney & Fearn (2000). Appl. Spectrosc. 54(6), 894-899). This report describes further analysis of these spectral sets using a hierarchical approach and binary decisions solved using logistic regression. The sample set comprised 230 homogenized meat samples, i.e. chicken (55), turkey (54), pork (55), beef (32) and lamb (34), purchased locally as whole cuts of meat over a 10-12 week period. NIR reflectance spectra were recorded over the wavelength range 400-2498 nm at 2 nm intervals on a NIR Systems 6500 scanning monochromator. The problem was defined as a series of binary decisions, i.e. is the meat red or white? Is the red meat beef or lamb? Is the white meat pork or poultry? etc. Each of these decisions was made using an individual binary logistic model based on scores derived from principal component or partial least squares (PLS1 and PLS2) analysis. The results obtained were equal to or better than previous reports using factorial discriminant analysis, K-nearest neighbours and PLS2 regression. This new approach using a combination of exploratory and logistic analyses also appears to have advantages of transparency and the use of inherent structure in the spectral data. Additionally, it allows for the use of different data transforms and multivariate regression techniques at each decision step.

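The series of binary decisions described in the abstract above can be sketched as a small decision hierarchy. This is an illustration only: the coefficients are hypothetical rather than fitted to spectra, the input is assumed to be a vector of already computed component scores (for example from PCA or PLS), and the final chicken-versus-turkey split is an assumption about what the abstract's "etc." covers.

```python
import math


def logistic(score):
    """Logistic function mapping a linear score to a probability."""
    return 1.0 / (1.0 + math.exp(-score))


def make_binary_model(weights, intercept):
    """Return a binary logistic model over a vector of component scores."""
    def predict_probability(scores):
        linear = intercept + sum(w * s for w, s in zip(weights, scores))
        return logistic(linear)
    return predict_probability


def classify_meat(scores, models):
    """Walk the decision hierarchy, one binary logistic model per node."""
    if models["red_vs_white"](scores) >= 0.5:
        return "beef" if models["beef_vs_lamb"](scores) >= 0.5 else "lamb"
    if models["pork_vs_poultry"](scores) >= 0.5:
        return "pork"
    return "chicken" if models["chicken_vs_turkey"](scores) >= 0.5 else "turkey"


# Hypothetical coefficients on two component scores (not fitted to real spectra).
models = {
    "red_vs_white": make_binary_model([2.0, 0.0], 0.0),
    "beef_vs_lamb": make_binary_model([0.0, 1.5], 0.0),
    "pork_vs_poultry": make_binary_model([0.0, -1.0], 0.5),
    "chicken_vs_turkey": make_binary_model([1.0, 1.0], 1.0),
}
print(classify_meat([1.0, -0.5], models))
```

Because each node holds its own model, each decision could in principle use a different data transform or score type, which is the flexibility the abstract points to.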

MEAT SPECIATION USING A HIERARCHICAL APPROACH AND LOGISTIC REGRESSION

  • Arnalds, Thosteinn;Fearn, Tom;Downey, Gerard
    • Proceedings of the Korean Society of Near Infrared Spectroscopy Conference
    • /
    • 2001.06a
    • /
    • pp.1152-1152
    • /
    • 2001
  • Food adulteration is a serious consumer fraud and a matter of concern to food processors and regulatory agencies. A range of analytical methods has been investigated to facilitate the detection of adulterated or mislabelled foods and food ingredients, but most of these require sophisticated equipment and highly qualified staff and are time-consuming. Regulatory authorities and the food industry require a screening technique which will facilitate fast and relatively inexpensive monitoring of food products with a high level of accuracy. Near infrared spectroscopy has been investigated for its potential in a number of authenticity issues including meat speciation (McElhinney, Downey & Fearn (1999) JNIRS, 7(3), 145-154; Downey, McElhinney & Fearn (2000). Appl. Spectrosc. 54(6), 894-899). This report describes further analysis of these spectral sets using a hierarchical approach and binary decisions solved using logistic regression. The sample set comprised 230 homogenized meat samples, i.e. chicken (55), turkey (54), pork (55), beef (32) and lamb (34), purchased locally as whole cuts of meat over a 10-12 week period. NIR reflectance spectra were recorded over the wavelength range 400-2498 nm at 2 nm intervals on a NIR Systems 6500 scanning monochromator. The problem was defined as a series of binary decisions, i.e. is the meat red or white? Is the red meat beef or lamb? Is the white meat pork or poultry? etc. Each of these decisions was made using an individual binary logistic model based on scores derived from principal component or partial least squares (PLS1 and PLS2) analysis. The results obtained were equal to or better than previous reports using factorial discriminant analysis, K-nearest neighbours and PLS2 regression. This new approach using a combination of exploratory and logistic analyses also appears to have advantages of transparency and the use of inherent structure in the spectral data. Additionally, it allows for the use of different data transforms and multivariate regression techniques at each decision step.


Sample Size Determination of Univariate and Bivariate Ordinal Outcomes by Nonparametric Wilcoxon Tests (단변량 및 이변량 순위변수의 비모수적 윌콕슨 검정법에 의한 표본수 결정방법)

  • Park, Hae-Gang;Song, Hae-Hiang
    • The Korean Journal of Applied Statistics
    • /
    • v.22 no.6
    • /
    • pp.1249-1263
    • /
    • 2009
  • The power function in sample size determination has to be characterized by an appropriate statistical test for the hypothesis of interest. Nonparametric tests are suitable for the analysis of ordinal data or frequency data with ordered categories, which appear frequently in the biomedical research literature. In this paper, we study sample size calculation methods for the Wilcoxon-Mann-Whitney test for one- and two-dimensional ordinal outcomes. While the sample size formula for the univariate outcome that is based on the variances of the test statistic under both the null and alternative hypotheses performs well, this formula requires additional information on probability estimates that appear in the variance of the test statistic under the alternative hypothesis, and the values of these probabilities are generally unknown. We study the advantages and disadvantages of different sample size formulas with simulations. Sample sizes are calculated for the two-dimensional ordinal outcomes of efficacy and safety, for which the bivariate Wilcoxon-Mann-Whitney test is more appropriate than the multivariate parametric test.
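As a point of comparison for the formulas discussed in the abstract above, the sketch below uses Noether's approximation, which relies only on the null variance of the test statistic and so needs a single quantity, $p = P(X < Y)$. It is a swapped-in, simpler formula rather than the paper's method; it assumes no ties, which is a strong assumption for ordinal outcomes, and the function name and default values are hypothetical.

```python
import math
from statistics import NormalDist


def noether_total_sample_size(p, alpha=0.05, power=0.80, fraction_group1=0.5):
    """Total N for a two-sided Wilcoxon-Mann-Whitney test (Noether's formula).

    p: assumed P(X < Y) under the alternative (p = 0.5 under the null).
    fraction_group1: share of the total sample allocated to the first group.
    """
    if not 0.0 < p < 1.0 or p == 0.5:
        raise ValueError("p must lie in (0, 1) and differ from 0.5")
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_beta = NormalDist().inv_cdf(power)
    c = fraction_group1
    n_total = (z_alpha + z_beta) ** 2 / (12.0 * c * (1.0 - c) * (p - 0.5) ** 2)
    return math.ceil(n_total)


total = noether_total_sample_size(p=0.65)
print(total)
```

The formula makes the trade-off visible: the required sample size grows with the inverse square of the distance of $p$ from 0.5, and unequal allocation increases it through the $c(1-c)$ term.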

Study on Vacuum Pump Monitoring Using MPCA Statistical Method (MPCA 기반의 통계기법을 이용한 진공펌프 상태진단에 관한 연구)

  • Sung D.;Kim J.;Jung W.;Lee S.;Cheung W.;Lim J.;Chung K.
    • Journal of the Korean Vacuum Society
    • /
    • v.15 no.4
    • /
    • pp.338-346
    • /
    • 2006
  • In semiconductor processing, it is hard to predict the exact failure point of a vacuum pump because of its harsh operating conditions and nonlinear properties, and unpredicted failures can cause problems such as defective products or wasted materials. It is therefore urgent to develop diagnostic models that can monitor the operating conditions appropriately and identify the failure point accurately, indicating when to replace the vacuum pump. In this study, many influencing factors are considered together, and a monitoring model using multivariate statistical methods is proposed. The core algorithms are Multiway Principal Component Analysis (MPCA) and the Dynamic Time Warping (DTW) algorithm, among others.
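Dynamic time warping, one of the algorithms named in the abstract above, can be sketched with the textbook dynamic-programming recurrence. The absolute-difference local cost and the function name are assumptions, and the toy series are hypothetical; how the authors combine DTW with MPCA is not described in the abstract and is not shown.

```python
def dtw_distance(series_a, series_b):
    """Dynamic time warping distance with absolute-difference local cost."""
    n, m = len(series_a), len(series_b)
    if n == 0 or m == 0:
        raise ValueError("both series must be non-empty")
    inf = float("inf")
    # cost[i][j]: best cumulative cost aligning the first i points of
    # series_a with the first j points of series_b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(series_a[i - 1] - series_b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # reuse the current point of series_b
                cost[i][j - 1],      # reuse the current point of series_a
                cost[i - 1][j - 1],  # advance both series
            )
    return cost[n][m]


# Hypothetical traces: the second is the first with one sample repeated.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

Aligning batches of unequal duration in this way is a common preprocessing step before unfolding them into the fixed-size matrix that multiway PCA expects.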

The Effect of Occurrence and Reoccurrence of Catastrophic Health Expenditure on Transition to Poverty and Persistence of Poverty in South Korea (재난적 의료비 발생과 재발생이 빈곤화와 빈곤지속에 미치는 영향)

  • Kim, Eunkyoung;Kwon, Soonman
    • Health Policy and Management
    • /
    • v.26 no.3
    • /
    • pp.172-184
    • /
    • 2016
  • Background: The objective of this study was to examine the effect of occurrence and reoccurrence of catastrophic health expenditure (CHE) on transition to poverty and persistence of poverty in South Korea. Methods: Data for 2008-2011 from the Korea Health Panel were used. CHE was defined as the share of a household's total health expenditure in its total income exceeding various threshold levels (more than 5%, 10%, 15%, and 20%). The effect of catastrophic expenditure on transition to poverty and persistence of poverty was analyzed through multivariate logistic regression. Results: The shares of households facing CHE at the various threshold levels increased gradually, reaching 37.7%, 21%, 13.1%, and 9.5% in 2011. Households facing CHE were more likely to experience transition to poverty at threshold levels of more than 5% and 20% in the 2010 set. Households facing CHE seemed to experience persistence of poverty, but this was not statistically significant. About 40% of households facing CHE in 2009 encountered another shock of CHE in 2010. Households without CHE seemed to experience more transition to poverty and persistence of poverty, but this was not statistically significant. Among households with multiple CHE, those with medical aid were significantly more likely to experience transition to poverty, but the statistical significance disappeared for persistence of poverty. Conclusion: The Korean health system needs to be improved to serve as a social safety net that addresses transition to poverty and persistence of poverty caused by CHE.
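The CHE definition in the abstract above reduces to a share-versus-threshold comparison, sketched below. The household figures are hypothetical and the function name `che_flags` is an assumption; only the four threshold levels come from the abstract.

```python
THRESHOLDS_PERCENT = (5, 10, 15, 20)


def che_flags(health_expenditure, household_income):
    """Flag catastrophic health expenditure at each threshold level.

    Returns a dict mapping threshold (percent) to True when the share of
    health expenditure in household income exceeds that threshold.
    """
    if household_income <= 0:
        raise ValueError("household income must be positive")
    share_percent = 100.0 * health_expenditure / household_income
    return {t: share_percent > t for t in THRESHOLDS_PERCENT}


# Hypothetical household: 160 spent on health out of an income of 1000.
flags = che_flags(160, 1000)
print(flags)
```

Each flag could then serve as the exposure variable in a separate logistic regression, one per threshold level.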

A Study on Fault Detection of Cycle-based Signals using Wavelet Transform (웨이블릿을 이용한 주기 신호 데이터의 이상 탐지에 관한 연구)

  • Lee, Jae-Hyun;Kim, Ji-Hyun;Hwang, Ji-Bin;Kim, Sung-Shick
    • Journal of the Korea Society for Simulation
    • /
    • v.16 no.4
    • /
    • pp.13-22
    • /
    • 2007
  • Fault detection of cycle-based signals is typically performed using statistical approaches. Univariate SPC using a few representative statistics and multivariate analysis methods such as PCA and PLS are the most popular methods for analyzing cycle-based signals. However, such approaches are limited when dealing with information-rich cycle-based signals. In this paper, a process fault detection method based on wavelet analysis is proposed. Using the Haar wavelet, coefficients that reflect the process condition well are selected. Next, a Hotelling's $T^2$ chart using the selected coefficients is constructed to assess the process condition. To enhance the overall efficiency of fault detection, two further steps are suggested: a denoising method based on the wavelet transform and a coefficient selection method using variance difference. For performance evaluation, various types of abnormal process conditions are simulated and the proposed algorithm is compared with other methodologies.

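The Haar decomposition that the abstract above starts from can be sketched as follows. This uses the orthonormal Haar transform with a hypothetical cycle signal and hypothetical function names; the denoising, coefficient selection, and Hotelling's $T^2$ charting steps are not shown.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform.

    Returns (approximation, detail) coefficient lists, each half the
    length of the input; the input length must be even.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    root2 = math.sqrt(2.0)
    approximation = [(signal[i] + signal[i + 1]) / root2
                     for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / root2
              for i in range(0, len(signal), 2)]
    return approximation, detail


def haar_decompose(signal, levels):
    """Multi-level decomposition: final approximation plus details per level."""
    details = []
    current = list(signal)
    for _ in range(levels):
        current, detail = haar_step(current)
        details.append(detail)
    return current, details


# Hypothetical cycle signal with a step change halfway through.
cycle = [1.0, 1.0, 1.0, 1.0, 3.0, 3.0, 3.0, 3.0]
approx, details = haar_decompose(cycle, levels=3)
print(approx, details)
```

In this toy signal the step change is captured by a single coarse-level detail coefficient while the finer details are zero, which illustrates why a few selected coefficients can summarize a cycle.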

Plant breeding in the 21st century: Molecular breeding and high throughput phenotyping

  • Sorrells, Mark E.
    • Proceedings of the Korean Society of Crop Science Conference
    • /
    • 2017.06a
    • /
    • pp.14-14
    • /
    • 2017
  • The discipline of plant breeding is experiencing a renaissance that is impacting crop improvement as a result of new technologies; however, fundamental questions remain about predicting the phenotype and how the environment and genetics shape it. Inexpensive DNA sequencing, genotyping, new statistical methods, high throughput phenotyping and gene editing are revolutionizing breeding methods and strategies for improving both quantitative and qualitative traits. Genomic selection (GS) models use genome-wide markers to predict performance for both phenotyped and non-phenotyped individuals. Aerial and ground imaging systems generate data on correlated traits such as canopy temperature and normalized difference vegetation index that can be combined with genotypes in multivariate models to further increase prediction accuracy and reduce the cost of advanced trials with limited replication in time and space. Design of a GS training population is crucial to the accuracy of prediction models and can be affected by many factors including population structure and composition. Prediction models can incorporate performance over multiple environments and assess GxE effects to identify a highly predictive subset of environments. We have developed a methodology for analyzing unbalanced datasets using genome-wide marker effects to group environments and identify outlier environments. Environmental covariates can be identified using a crop model and used in a GS model to predict GxE in unobserved environments and to predict performance in climate change scenarios. These new tools and knowledge challenge the plant breeder to ask the right questions and choose the tools that are appropriate for their crop and target traits. Contemporary plant breeding requires teams of people with expertise in genetics, phenotyping and statistics to improve efficiency and increase prediction accuracy in terms of genotypes, experimental design and environment sampling.
