• Title/Summary/Keyword: Multivariate Statistics

Prediction of Axillary Lymph Node Metastasis in Early Breast Cancer Using Dynamic Contrast-Enhanced Magnetic Resonance Imaging and Diffusion-Weighted Imaging

  • Jeong, Eun Ha;Choi, Eun Jung;Choi, Hyemi;Park, Eun Hae;Song, Ji Soo
    • Investigative Magnetic Resonance Imaging
    • /
    • v.23 no.2
    • /
    • pp.125-135
    • /
    • 2019
  • Purpose: The purpose of this study was to evaluate dynamic contrast-enhanced breast magnetic resonance imaging (DCE-MRI) and diffusion-weighted imaging (DWI) variables for predicting axillary lymph node (ALN) metastasis in early-stage breast cancer. Materials and Methods: From January 2011 to April 2015, 787 patients with early-stage breast cancer were retrospectively reviewed. Only cases of invasive ductal carcinoma were included in the patient population. Among them, 240 patients who underwent 3.0-T DCE-MRI, including DWI with b values of 0 and $800\,s/mm^2$, were enrolled. MRI variables (adjacent vessel sign, whole-breast vascularity, initial enhancement pattern, quantitative kinetic parameters, signal enhancement ratio (SER), tumor apparent diffusion coefficient (ADC), peritumoral ADC, and peritumor-tumor ADC ratio) and clinico-pathologic variables (age, T stage, multifocality, extensive intraductal carcinoma component (EIC), estrogen receptor, progesterone receptor, HER-2 status, Ki-67, molecular subtype, histologic grade, and nuclear grade) were compared between patients with ALN metastasis and those without lymph node metastasis. Multivariate regression analysis was performed to determine independent variables associated with ALN metastasis, and the area under the receiver operating characteristic curve (AUC) for predicting ALN metastasis was analyzed for those variables. Results: On breast MRI, moderate or prominent ipsilateral whole-breast vascularity (moderate, odds ratio [OR] = 3.45, 95% confidence interval [CI] 1.28-9.51; prominent, OR = 15.59, 95% CI 2.52-96.46), SER (OR = 1.68, 95% CI 1.09-2.59), and peritumor-tumor ADC ratio (OR = 6.77, 95% CI 2.41-18.99) were independently associated with ALN metastasis. Among clinico-pathologic variables, HER-2 positivity was independently associated with ALN metastasis (OR = 23.71, 95% CI 10.50-53.54).
The AUC for the combination of selected MRI variables and clinico-pathologic variables was higher than that of clinico-pathologic variables alone (P < 0.05). Conclusion: SER, moderately or prominently increased whole-breast vascularity, and peritumor-tumor ADC ratio on breast MRI are valuable in predicting ALN metastasis in patients with early-stage breast cancer.
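
The odds ratios and the AUC reported in this abstract can be tied back to their underlying quantities with a short sketch. It assumes the intervals are Wald intervals from a logistic model, which the abstract does not state, and it computes the AUC as the fraction of (positive, negative) pairs that are ranked correctly. The function names and the toy risk scores are hypothetical and are not data from the study.

```python
import math

def coef_from_or(or_value, ci_low, ci_high, z=1.96):
    """Recover the log-odds coefficient and its standard error from an OR
    and its 95% CI, assuming a Wald interval exp(beta +/- z * se)."""
    beta = math.log(or_value)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
    return beta, se

def auc(scores_pos, scores_neg):
    """AUC as the share of (positive, negative) pairs where the positive
    case has the higher score; ties count as half."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# SER as reported in the abstract: OR = 1.68, 95% CI 1.09-2.59
beta, se = coef_from_or(1.68, 1.09, 2.59)
print(round(beta, 3), round(se, 3))

# Hypothetical predicted risks, for illustration only
print(auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2]))
```

A wide interval such as the one for prominent vascularity (2.52-96.46) corresponds to a large standard error on the log-odds scale, which is typical when a category contains few patients.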

Integrated calibration weighting using complex auxiliary information (통합 칼리브레이션 가중치 산출 비교연구)

  • Park, Inho;Kim, Sujin
    • The Korean Journal of Applied Statistics
    • /
    • v.34 no.3
    • /
    • pp.427-438
    • /
    • 2021
  • Two-stage sampling allows us to estimate population characteristics at both the unit and cluster levels. Given complex auxiliary information, integrated calibration weighting can better reflect the level-wise characteristics as well as the multivariate characteristics between levels. This paper explores the integrated calibration weighting methods of Estevao and Särndal (2006) and Kim (2019) through a simulation study in which the efficiency of those weighting methods is compared using artificial population data. Two weighting methods are shown to be efficient: single-step calibration at the unit level with stacked individualized auxiliary information, and iterative integrated calibration at each level. Under both methods, cluster calibrated weights are defined as the average of the calibrated weights of the unit(s) within the cluster. Both performed very well in terms of goodness of fit in estimating the population totals of the auxiliary information common to clusters and units, and showed small relative bias and relative root mean square error in estimating the population totals of survey variables not included in the calibration adjustments.
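
As a rough illustration of the calibration idea, the sketch below applies plain linear calibration with a single auxiliary variable and then averages the calibrated unit weights within each cluster, as the abstract describes. It is a simplified stand-in, not the integrated methods of Estevao and Särndal (2006) or Kim (2019); the function names, design weights, and totals are all hypothetical.

```python
def linear_calibrate(design_weights, x, x_total):
    """Linear calibration with one auxiliary variable:
    w_i = d_i * (1 + lam * x_i), with lam chosen so that
    sum(w_i * x_i) reproduces the known population total x_total."""
    dx = sum(d * xi for d, xi in zip(design_weights, x))
    dxx = sum(d * xi * xi for d, xi in zip(design_weights, x))
    lam = (x_total - dx) / dxx
    return [d * (1 + lam * xi) for d, xi in zip(design_weights, x)]

def cluster_weights(unit_weights, cluster_ids):
    """Cluster weight = average calibrated weight of the units in the cluster."""
    sums, counts = {}, {}
    for w, c in zip(unit_weights, cluster_ids):
        sums[c] = sums.get(c, 0.0) + w
        counts[c] = counts.get(c, 0) + 1
    return {c: sums[c] / counts[c] for c in sums}

# Hypothetical sample: four units in two clusters
d = [10.0, 10.0, 20.0, 20.0]
x = [1.0, 2.0, 1.0, 3.0]
clusters = ["A", "A", "B", "B"]

w = linear_calibrate(d, x, x_total=120.0)
print(sum(wi * xi for wi, xi in zip(w, x)))  # weighted total of x after calibration
print(cluster_weights(w, clusters))
```

With several auxiliary variables the scalar `lam` becomes a vector obtained by solving a small linear system, but the structure of the adjustment is the same.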

Multivariate Analysis of Predictive Factors for the Severity in Stable Patients with Severe Injury Mechanism (중증 손상 기전의 안정된 환자에서 중증도 예측 인자들에 대한 다변량 분석)

  • Lee, Jae Young;Lee, Chang Jae;Lee, Hyoung Ju;Chung, Tae Nyoung;Kim, Eui Chung;Choi, Sung Wook;Kim, Ok Jun;Cho, Yun Kyung
    • Journal of Trauma and Injury
    • /
    • v.25 no.2
    • /
    • pp.49-56
    • /
    • 2012
  • Purpose: In determining the prognosis of critically injured patients, it is vital to transport patients to medical facilities capable of providing proper assessment and management, to perform rapid assessment and make rapid decisions, and to provide aggressive resuscitation. Considering the high mortality and morbidity rates in critically injured patients, various studies have been conducted in efforts to reduce those rates. However, studies related to diagnostic factors for predicting severity in critically injured patients are still lacking. Furthermore, patients showing stable vital signs and alert mental status who are injured via a severe trauma mechanism may be at risk of not receiving rapid assessment and management. Thus, this study investigates diagnostic factors, including physical examination and laboratory results, that may help predict severity in trauma patients injured via a severe trauma mechanism but showing stable vital signs. Methods: From March 2010 to December 2011, all trauma patients who fit into a diagnostic category that activated a major trauma team in CHA Bundang Medical Center were analyzed retrospectively. The retrospective analysis was based on prospective medical records completed at the time of arrival in the emergency department and on sequential laboratory test results. PASW Statistics 18 (SPSS Inc., Chicago, IL, USA) was used for the statistical analysis. Patients with relatively stable vital signs and alert mental status were selected based on a revised trauma score of more than 7 points. The final diagnosis of major trauma was made based on an injury severity score of greater than 16 points. Diagnostic variables included systolic blood pressure and respiratory rate, the Glasgow Coma Scale, the initial result from focused abdominal sonography for trauma, and laboratory results from blood tests and urine analyses.
To check the normality of the measured values, we applied the Kolmogorov-Smirnov one-sample test and the Shapiro-Wilk test. When normality was confirmed, Student's t-test was used for comparison; otherwise, the Mann-Whitney U test was used. The results of focused abdominal sonography for trauma (FAST) and factors of urine analysis were analyzed using the Chi-square test or Fisher's exact test. Variables with statistical significance were selected as prognostic factors, and they were analyzed using a multivariate logistic regression model. Results: A total of 269 patients activated the major trauma team. Excluding 91 patients who scored a revised trauma score of less than 7 points, 178 patients were subdivided by injury severity score to determine the final major trauma patients. Twenty-one (21) of the 106 major trauma patients and 9 of the 72 minor trauma patients were also excluded due to missing medical records or missing blood and urine test results. The investigated variables with p-values less than 0.05 included the Glasgow Coma Scale, respiratory rate, white blood cell count (WBC), serum AST and ALT, serum creatinine, blood in spot urine, and protein in spot urine. These variables could, thus, be prognostic factors in major trauma patients. A multivariate logistic regression analysis on those 8 variables showed the respiratory rate (p=0.034), WBC (p=0.005) and blood in spot urine (p=0.041) to be independent prognostic factors for predicting the clinical course of major trauma patients. Conclusion: In trauma patients injured via a severe trauma mechanism but showing stable vital signs and alert mental status, the respiratory rate, WBC count and blood in the urine can be used as predictive factors for severity. Using those laboratory results, rapid assessment of major trauma patients may shorten the time to diagnosis and the time to management.
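
For the categorical variables (FAST result, urine findings), the abstract names Fisher's exact test. A minimal sketch of the two-sided test for a 2x2 table follows, written with the standard library only. The study itself used PASW Statistics, so this is an illustration of the technique and not the authors' procedure; the function name and the example counts are hypothetical.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].
    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(k):
        # Probability that the top-left cell equals k given fixed margins
        return comb(row1, k) * comb(row2, col1 - k) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Small relative tolerance guards against floating-point ties
    return sum(prob(k) for k in range(lo, hi + 1)
               if prob(k) <= p_obs * (1 + 1e-9))

# Hypothetical counts: rows = FAST positive / negative,
# columns = major / minor trauma
print(fisher_exact_two_sided(12, 5, 30, 60))
```

The exact test is usually preferred over the Chi-square test when expected cell counts are small, which is the likely reason both are listed in the abstract.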

Dynamic forecasts of bankruptcy with Recurrent Neural Network model (RNN(Recurrent Neural Network)을 이용한 기업부도예측모형에서 회계정보의 동적 변화 연구)

  • Kwon, Hyukkun;Lee, Dongkyu;Shin, Minsoo
    • Journal of Intelligence and Information Systems
    • /
    • v.23 no.3
    • /
    • pp.139-153
    • /
    • 2017
  • Corporate bankruptcy can cause great losses not only to stakeholders but also to many related sectors in society. Through successive economic crises, bankruptcies have increased and bankruptcy prediction models have become more and more important. Therefore, corporate bankruptcy has been regarded as one of the major topics of research in business management, and many studies in industry are also in progress. Previous studies attempted to utilize various methodologies to improve bankruptcy prediction accuracy and to resolve the overfitting problem, such as Multivariate Discriminant Analysis (MDA) and the Generalized Linear Model (GLM). These methods are based on statistics. Recently, researchers have used machine learning methodologies such as the Support Vector Machine (SVM) and the Artificial Neural Network (ANN). Fuzzy theory and genetic algorithms have also been used. As a result, many bankruptcy models have been developed and their performance has improved. In general, a company's financial and accounting information changes over time. Likewise, the market situation also changes, so it is difficult to predict bankruptcy using only information from a single point in time. However, although traditional research suffers from not taking the time effect into account, dynamic models have not been studied much. Ignoring the time effect yields biased results, so a static model may not be suitable for predicting bankruptcy. Thus, a dynamic model may improve bankruptcy prediction. In this paper, we propose the RNN (Recurrent Neural Network), which is one of the deep learning methodologies. The RNN learns from time series data, and its performance is known to be good. Prior to the experiment, we selected non-financial firms listed on the KOSPI, KOSDAQ and KONEX markets from 2010 to 2016 for the estimation of the bankruptcy prediction model and the comparison of forecasting performance.
To avoid predicting bankruptcy from financial information that already reflects the deterioration of the company's financial condition, the financial information was collected with a lag of two years, and the default period was defined as January to December of the year. We then defined bankruptcy as delisting due to sluggish earnings. We confirmed delistings on KIND, a corporate stock information website. We then selected variables from previous papers. The first set of variables consists of Z-score variables, which have become traditional variables in predicting bankruptcy. The second set is a dynamic variable set. Finally, we selected 240 normal companies and 226 bankrupt companies for the first variable set. Likewise, we selected 229 normal companies and 226 bankrupt companies for the second variable set. We created a model that reflects dynamic changes in time-series financial data, and by comparing the suggested model with existing bankruptcy prediction models, we found that the suggested model could help to improve the accuracy of bankruptcy predictions. We used financial data from KIS Value (a financial database) and selected Multivariate Discriminant Analysis (MDA), the Generalized Linear Model in the form of logistic regression (GLM), the Support Vector Machine (SVM), and the Artificial Neural Network (ANN) as benchmarks. The experimental results showed that the RNN performed better than the comparative models. The accuracy of the RNN was high for both sets of variables, and the Area Under the Curve (AUC) value was also high. In the hit-ratio table, the proportion of poor companies that the RNN predicted to be bankrupt was also higher than that of the other comparative models. However, a limitation of this paper is that an overfitting problem occurs during RNN learning.
We expect to be able to solve the overfitting problem by selecting more training data and appropriate variables. From these results, it is expected that this research will contribute to the development of bankruptcy prediction by proposing a new dynamic model.
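
To make the "dynamic" aspect concrete, the sketch below runs the forward pass of a single-layer Elman RNN over a sequence of yearly financial ratios and returns a bankruptcy probability. The abstract does not describe the network architecture, so the layer type, sizes, weights, and input ratios here are all assumptions chosen for illustration; the weights are untrained.

```python
import math

def rnn_predict(sequence, w_x, w_h, b_h, w_o, b_o):
    """Forward pass of a single-layer Elman RNN with a sigmoid output.
    sequence: list of feature vectors, one per year (oldest first).
    w_x: hidden-by-input weights; w_h: hidden-by-hidden weights."""
    hidden = [0.0] * len(b_h)
    for x_t in sequence:
        new_hidden = []
        for j in range(len(b_h)):
            z = b_h[j]
            z += sum(w_x[j][i] * x_t[i] for i in range(len(x_t)))
            z += sum(w_h[j][k] * hidden[k] for k in range(len(hidden)))
            new_hidden.append(math.tanh(z))
        hidden = new_hidden  # the state carries information across years
    logit = b_o + sum(w_o[j] * hidden[j] for j in range(len(hidden)))
    return 1.0 / (1.0 + math.exp(-logit))

# Hypothetical firm: three years of two ratios (e.g. profitability, leverage)
firm = [[0.10, 0.50], [0.05, 0.60], [-0.02, 0.75]]

# Hypothetical, untrained weights for two hidden units
w_x = [[-1.0, 0.5], [0.3, 0.8]]
w_h = [[0.5, 0.0], [0.0, 0.5]]
b_h = [0.0, 0.0]
w_o = [1.0, 1.0]
b_o = -0.5

print(rnn_predict(firm, w_x, w_h, b_h, w_o, b_o))
```

Because the hidden state is updated year by year, the same ratios presented in a different order generally give a different output, which is what distinguishes this from a static model fed a single snapshot.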

A Meta Analysis of Using Structural Equation Model on the Korean MIS Research (국내 MIS 연구에서 구조방정식모형 활용에 관한 메타분석)

  • Kim, Jong-Ki;Jeon, Jin-Hwan
    • Asia pacific journal of information systems
    • /
    • v.19 no.4
    • /
    • pp.47-75
    • /
    • 2009
  • Recently, research on Management Information Systems (MIS) has laid out theoretical foundations and academic paradigms by introducing diverse theories, themes, and methodologies. In particular, academic paradigms of MIS encourage a user-friendly approach by developing technologies from the users' perspectives, which reflects the existence of strong causal relationships between information systems and users' behavior. As in other areas of social science, the use of structural equation modeling (SEM) has rapidly increased in recent years, especially in the MIS area. The SEM technique is important because it provides powerful ways to address key IS research problems. It also has a unique ability to examine a series of causal relationships simultaneously while analyzing multiple independent and dependent variables at the same time. Despite providing many benefits to MIS researchers, the analytical technique has some potential pitfalls. The research objective of this study is to provide some guidelines for the appropriate use of SEM based on an assessment of the current practice of using SEM in MIS research. This study focuses on several statistical issues related to the use of SEM in MIS research. Selected articles are assessed in three parts through the meta analysis. The first part is related to the initial specification of the theoretical model of interest. The second is about data screening prior to model estimation and testing. The last part concerns estimation and testing of theoretical models based on empirical data. This study reviewed the use of SEM in 164 empirical research articles published in four major MIS journals in Korea (APJIS, ISR, JIS and JITAM) from 1991 to 2007. APJIS, ISR, JIS and JITAM accounted for 73, 17, 58, and 16 of the total number of applications, respectively. The number of published applications has increased over time.
LISREL was the most frequently used SEM software among MIS researchers (97 studies (59.15%)), followed by AMOS (45 studies (27.44%)). In the first part, regarding issues related to the initial specification of the theoretical model of interest, all of the studies used cross-sectional data. Studies that use cross-sectional data may be able to better explain their structural model as a set of relationships. Most SEM studies, meanwhile, employed confirmatory-type analysis (146 articles (89%)). Regarding model formulation, 159 (96.9%) of the studies used the full structural equation model. In only 5 studies was SEM used for the measurement model with a set of observed variables. The average sample size for all models was 365.41, with some models retaining a sample as small as 50 and as large as 500. The second part of the issue is related to data screening prior to model estimation and testing. Data screening is important for researchers, particularly in defining how they deal with missing values. Overall, discussion of data screening was reported in 118 (71.95%) of the studies, while no study discussed evidence of multivariate normality for the models. In the third part, on issues related to the estimation and testing of theoretical models on empirical data, assessing model fit is one of the most important issues because it provides adequate statistical power for research models. Multiple fit indices were used in the SEM applications. The $\chi^2$ test was reported in most of the studies (146 (89%)), whereas the normed $\chi^2$ test was reported less frequently (65 studies (39.64%)). It is important that a normed $\chi^2$ of 3 or lower is required for adequate model fit. The most popular model fit indices were GFI (109 (66.46%)), AGFI (84 (51.22%)), NFI (44 (47.56%)), RMR (42 (25.61%)), CFI (59 (35.98%)), RMSEA (62 (37.80%)), and NNFI (48 (29.27%)).
Regarding the test of construct validity, convergent validity was examined in 109 studies (66.46%) and discriminant validity in 98 (59.76%). Eighty-one studies (49.39%) reported the average variance extracted (AVE). However, there was little discussion of direct (47 (28.66%)), indirect, and total effects in the SEM models. Based on these findings, we suggest general guidelines for the use of SEM and propose some recommendations concerning latent variable models, raw data, sample size, data screening, reporting of parameter estimates, model fit statistics, multivariate normality, confirmatory factor analysis, reliabilities, and the decomposition of effects.
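
Several of the reporting items counted in this abstract are simple to compute once the model output is available. The sketch below shows the normed chi-square (with the cutoff of 3 mentioned in the abstract), the average variance extracted, and composite reliability from standardized loadings. The formulas are the commonly used ones; the function names, the chi-square value, and the loadings are hypothetical.

```python
def normed_chi_square(chi_square, df):
    """Chi-square divided by its degrees of freedom."""
    return chi_square / df

def average_variance_extracted(loadings):
    """AVE = mean of the squared standardized loadings of one construct."""
    return sum(l * l for l in loadings) / len(loadings)

def composite_reliability(loadings):
    """CR = (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances),
    with error variance 1 - loading^2 for standardized loadings."""
    s = sum(loadings)
    error = sum(1 - l * l for l in loadings)
    return s * s / (s * s + error)

# Hypothetical model output
ratio = normed_chi_square(250.0, 100)
print(ratio, ratio <= 3)  # cutoff of 3 as stated in the abstract

# Hypothetical standardized loadings for one latent construct
loadings = [0.8, 0.7, 0.9]
print(round(average_variance_extracted(loadings), 3))
print(round(composite_reliability(loadings), 3))
```

An AVE of at least 0.5 is a widely used rule of thumb for convergent validity, although the abstract itself does not state a threshold.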

Survival and Prognostic Factors for Hepatocellular Carcinoma: an Egyptian Multidisciplinary Clinic Experience

  • Abdelaziz, Ashraf Omar;Elbaz, Tamer Mahmoud;Shousha, Hend Ibrahim;Ibrahim, Mostafa Mohamed;El-Shazli, Mostafa Abdel Rahman;Abdelmaksoud, Ahmed Hosni;Aziz, Omar Abdel;Zaki, Hisham Atef;Elattar, Inas Anwar;Nabeel, Mohamed Mahmoud
    • Asian Pacific Journal of Cancer Prevention
    • /
    • v.15 no.9
    • /
    • pp.3915-3920
    • /
    • 2014
  • Background: Hepatocellular carcinoma (HCC) is a dismal tumor with high incidence and prevalence and poor prognosis and survival. Management of HCC necessitates multidisciplinary clinics due to the wide heterogeneity in its presentation, different therapeutic options, variable biologic behavior and background presence of chronic liver disease. We studied the different prognostic factors that affected survival of our patients to improve future HCC management and patient survival. Materials and Methods: This study was performed in a specialized multidisciplinary clinic for HCC in Kasr El Eini Hospital, Cairo University, Egypt. We retrospectively analyzed the different patient and tumor characteristics and the primary mode of management applied to our patients. Further analysis was performed using univariate and multivariate statistics. Results: From February 2009 to February 2013, 290 HCC patients presented to our multidisciplinary clinic. They were predominantly male and the mean age was $56.5{\pm}7.7$ years. All cases developed HCC on top of cirrhosis that was mainly due to HCV (71%). Most of our patients were Child-Pugh A (50%) or B (36.9%) and commonly presented with small single lesions. Transarterial chemoembolization was the most common line of treatment used (32.4%). The overall survival was 79.9% at 6 months, 54.5% at 1 year and 22.4% at 2 years. Serum bilirubin, site of the tumor and type of treatment were the significant independent prognostic factors for survival. Conclusions: Our main prognostic variables are the bilirubin level, the bilobar hepatic affection and the application of specific treatment (either curative or palliative). Multidisciplinary clinics enhance HCC management.
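
Survival percentages at fixed time points, such as those quoted in this abstract, are typically read off a Kaplan-Meier curve. The abstract does not name the estimator, so the sketch below is an assumption about the general approach, and the follow-up times and event indicators are hypothetical.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.
    times: follow-up time per patient; events: 1 = death, 0 = censored.
    Returns a list of (time, survival probability) at each death time."""
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        leaving = sum(1 for ti in times if ti == t)
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # deaths and censored patients both leave the risk set
    return curve

# Hypothetical follow-up in months for six patients
times = [3, 6, 6, 9, 12, 12]
events = [1, 1, 0, 1, 0, 1]
for t, s in kaplan_meier(times, events):
    print(t, round(s, 3))
```

Censored patients contribute to the risk set up to their last follow-up, which is why the estimate differs from a simple proportion of patients alive at each time point.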

Predicting Health-Promoting Behaviors in Patients with Stomach Cancer (건강증진행위의 영향요인 분석 -위암환자중심 -)

  • 오복자
    • Journal of Korean Academy of Nursing
    • /
    • v.25 no.4
    • /
    • pp.681-695
    • /
    • 1995
  • It has been noted that genetic alteration of cells, influenced by an unhealthy lifestyle in addition to a series of other carcinogens, increases various neoplastic diseases. Therefore, the importance of a lifestyle that minimizes such impact on health should be emphasized. Since stomach cancer, the most common neoplastic disease in Korea, is related to the Korean lifestyle, and as there is a possibility of its recurrence, people with stomach cancer need to lead a healthy lifestyle. The purpose of this study is to provide a basis for nursing intervention strategies to promote health-promoting behaviors that are constructive to a healthy lifestyle. A multivariate model was constructed based on Pender's health promotion model and Becker's health belief model by including influential factors such as hope. The sample was composed of 164 patients with stomach cancer who visited outpatient clinics of a university hospital in Seoul. The following instruments were used in the study after some adaptation: Wallston and others' multidimensional health locus of control scale, Laffrey's health conception scale, Lawton and others' health self-rating scale, Walker and others' health promotion lifestyle profile, and Rosenberg's self-esteem scale. In addition, Moon's health belief scale was used with some modification. For self-efficacy, the present author constructed a self-efficacy scale based on previous research. The above-mentioned instruments were tested in a pilot study with 24 patients with stomach cancer. The reliabilities of the instruments were tested with Cronbach's alpha (0.574-0.949). Data were analyzed using a SAS program for Pearson correlation coefficients, descriptive correlational statistics and stepwise multiple regression. The results are as follows: 1. The scores on the health-promoting behavior scale ranged from 55 to 145 with a mean of 107.91 (SD: 16.50).
The mean scores (range 1-4) on the different dimensions were nutrition 3.14, exercise 2.48, stress management 2.69, health responsibility 2.65, interpersonal relationship 2.87, and self-actualization 2.85. 2. There were significant correlations among all the predictive variables and the health-promoting behavior (r=.20-.55, p<.01). 3. Stepwise multiple regression analysis showed that: 1) Hope was the main predictor and accounted for 29.8% of the total variance. 2) Self-efficacy, perceived barriers and self-esteem accounted for an additional 14.6% of the total variance. 3) Hope, self-efficacy, perceived barriers and self-esteem altogether accounted for 44.3% of the total variance. In conclusion, hope, self-efficacy, perceived barriers and self-esteem were identified as important variables that contributed to promoting health-promoting behavior.
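
The instrument reliabilities in this abstract are reported as Cronbach's alpha. A minimal sketch of the standard formula follows, using only the standard library; the study itself used SAS, and the function name and item responses below are hypothetical.

```python
from statistics import variance

def cronbach_alpha(items):
    """Cronbach's alpha for a scale.
    items: one list of scores per item, all in the same respondent order.
    alpha = k / (k - 1) * (1 - sum of item variances / variance of totals)."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]
    item_var = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_var / variance(totals))

# Hypothetical responses of five patients to a three-item scale (1-4 points)
items = [
    [3, 4, 2, 4, 1],
    [3, 3, 2, 4, 2],
    [4, 4, 1, 3, 1],
]
print(round(cronbach_alpha(items), 3))
```

Values toward the lower end of the reported range (0.574) are usually read as weak internal consistency, so results based on those instruments deserve more caution.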

Application of Statistical Analysis to Analyze the Spatial Distribution of Earthquake-induced Strain Data (지진유발 변형률 데이터의 분포 특성 분석을 위한 응용통계기법의 적용)

  • Kim, Bo-Ram;Chae, Byung-Gon;Kim, Yongje;Seo, Yong-Seok
    • The Journal of Engineering Geology
    • /
    • v.23 no.4
    • /
    • pp.353-361
    • /
    • 2013
  • To analyze the distribution of earthquake-induced strain data in rock masses, statistical analysis was performed on four-directional strain data obtained from a ground movement monitoring system installed in Korea. Strain data related to the 2011 Tohoku-oki earthquake and two aftershocks of >M7.0 in 2011 were used in x-MR control chart analysis, a type of univariate statistical analysis that can detect an abnormal distribution. The analysis revealed different dispersion times for each measurement orientation. For a more comprehensive analysis, the strain data were re-evaluated using multivariate statistical analysis (MSA), which considers correlations among the data from the different measurement orientations. $T^2$ and Q-statistics, based on principal component analysis, were used to analyze the time-series strain data in real time. The procedures were performed with 99.9%, 99.0%, and 95.0% control limits. The MSA data can be used to successfully detect an abnormal distribution caused by earthquakes because the dispersion time using the 99.9% control limit is concurrent with or earlier than that from the x-MR analysis. In addition, the analysis using the 99.0% and 95.0% control limits detected an abnormal distribution in advance. This finding indicates the potential use of MSA for recognizing abnormal distributions of strain data.
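
The univariate step can be sketched as an individuals and moving-range (x-MR) chart. The limits below use the conventional three-sigma constants for a moving range of two consecutive points (2.66 and 3.267); the abstract does not state which limits the authors applied to the x-MR chart, and the strain readings and function names are hypothetical.

```python
def xmr_limits(values):
    """Control limits for an individuals (x) and moving-range (MR) chart,
    using the standard constants for moving ranges of size 2."""
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    mr_bar = sum(moving_ranges) / len(moving_ranges)
    center = sum(values) / len(values)
    return {
        "x_lcl": center - 2.66 * mr_bar,
        "x_ucl": center + 2.66 * mr_bar,
        "mr_ucl": 3.267 * mr_bar,
    }

def out_of_control(values, limits):
    """Indices of readings outside the individuals-chart limits."""
    return [i for i, v in enumerate(values)
            if v < limits["x_lcl"] or v > limits["x_ucl"]]

# Hypothetical strain readings: a quiet baseline, then one disturbed reading
baseline = [10.0, 10.2, 9.9, 10.1, 10.0, 9.8, 10.1]
monitored = baseline + [12.5]

limits = xmr_limits(baseline)
print(limits)
print(out_of_control(monitored, limits))
```

A chart like this treats each measurement orientation separately, which is the limitation the multivariate $T^2$ and Q-statistics are meant to address by using the correlations among orientations.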

Factors to Predict Successful Harvest during Autologous Peripheral Hematopoietic Stem Cell Collection

  • Kim, Mun-Ja;Jin, Soo-He;Lee, Duk-Hee;Park, Dae-Weon;Koh, Sung-Ae;Lee, Kyung-Hee;Hyun, Myung-Soo;Kim, Min-Kyoung
    • Biomedical Science Letters
    • /
    • v.18 no.2
    • /
    • pp.131-138
    • /
    • 2012
  • Autologous peripheral blood stem cell transplantation (PBSCT) has been used as a major treatment strategy for hematological malignancies. The number of CD34 positive cells in the harvested product is a very important factor for achieving successful transplantation. We studied the factors that can predict the number of CD34 positive cells in the harvested product of acute myelocytic leukemia (AML), multiple myeloma (MM) and Non-Hodgkin's lymphoma (NHL) patients after mobilizing them with chemotherapy plus G-CSF. A total of 73 patients (AML 19 patients, MM 28 patients, NHL 26 patients) with hematological malignancies were mobilized with chemotherapy and granulocyte colony-stimulating factor from April 2000 to February 2012. Group characteristics, peripheral blood test findings on the day of harvest, and PBSC outcomes were analyzed and evaluated using the SPSS statistics program after grouping patients as follows: group 1: CD34 cell counts < $2{\times}10^6/kg$ (n=16); group 2: $2{\times}10^6/kg{\leq}CD34$ cell counts < $6{\times}10^6/kg$ (n=32); group 3: CD34 cell counts ${\geq}6{\times}10^6/kg$ (n=25). We analyzed the clinical characteristics, the peripheral blood (PB) parameters and the number of CD34 positive cells in the PB and their correlation with the yield of CD34 positive cells collected from the mobilized patients. The total number of leukapheresis sessions was 263 (mean: 3.55 sessions per patient), and the mean number of harvested CD34 positive cells per patient was $7.37{\times}10^6/kg$. The number of CD34 positive cells in the product was significantly correlated with the platelet count and the number of CD34 positive cells in peripheral blood (P<0.05). The number of PB CD34 positive cells was the most significant factor for the quantity of harvested CD34 positive cells in the linear regression analysis (P<0.05). Many factors could influence the mobilization of peripheral blood stem cells.
Platelet count and PB CD34 positive cell count were the two variables that remained significant in multivariate analysis. Therefore, the platelet count and the number of CD34 positive cells in peripheral blood on the day of harvest can be used as accurate predictors of successful peripheral blood stem cell collection.
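
The three-way grouping by harvested CD34 cell count can be written as a small helper. The thresholds are those given in the abstract; the function name and the example yields are hypothetical.

```python
from collections import Counter

def cd34_group(cells_per_kg):
    """Assign a harvest to group 1, 2, or 3 using the abstract's thresholds
    (CD34 positive cells per kg of body weight)."""
    if cells_per_kg < 2e6:
        return 1
    if cells_per_kg < 6e6:
        return 2
    return 3

# Hypothetical harvest yields (cells/kg), for illustration only
yields = [1.2e6, 2.0e6, 4.5e6, 6.0e6, 9.8e6]
print(Counter(cd34_group(y) for y in yields))
```

Note that the boundaries are half-open, so a yield of exactly $2{\times}10^6/kg$ falls in group 2 and exactly $6{\times}10^6/kg$ falls in group 3, matching the inequalities in the abstract.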

Color Component Analysis For Image Retrieval (이미지 검색을 위한 색상 성분 분석)

  • Choi, Young-Kwan;Choi, Chul;Park, Jang-Chun
    • The KIPS Transactions:PartB
    • /
    • v.11B no.4
    • /
    • pp.403-410
    • /
    • 2004
  • Recently, studies of image analysis as a preprocessing stage for medical image analysis or image retrieval have been actively carried out. This paper proposes a way of utilizing color components for image retrieval. Retrieval is based on color components, and the color analysis uses the CLCM (Color Level Co-occurrence Matrix) and statistical techniques. The CLCM proposed in this paper projects color components onto 3D space through a geometric rotation transform and then interprets the distribution formed by their spatial relationship. The CLCM is a 2D histogram built in a color model that is created through a geometric rotation transform of a color model, and a statistical technique is used to analyze it. Like the CLCM, the GLCM (Gray Level Co-occurrence Matrix) [1] and Invariant Moment [2,3] use a 2D distribution chart and apply basic statistical techniques to interpret 2D data. However, even though the GLCM and Invariant Moment are optimized in each domain, they cannot perfectly interpret irregular data on the spatial coordinates. That is, because the GLCM and Invariant Moment use only basic statistical techniques, the reliability of the extracted features is low. To interpret the spatial relationship and weight of the data, this study uses Principal Component Analysis [4,5], a technique from multivariate statistics. To increase the accuracy of the data, it proposes projecting color components onto 3D space, rotating them, and then extracting features of the data from all angles.
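
For orientation, the sketch below builds a plain gray-level co-occurrence matrix (GLCM), the baseline the abstract compares against, and derives one basic statistic from it. It does not implement the proposed CLCM or its rotation transform; the function names and the tiny test image are hypothetical.

```python
def cooccurrence_matrix(image, levels, dx=1, dy=0):
    """Count how often level i at (row, col) co-occurs with
    level j at (row + dy, col + dx)."""
    matrix = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                matrix[image[r][c]][image[r2][c2]] += 1
    return matrix

def contrast(matrix):
    """Contrast feature: mean squared level difference over all counted pairs."""
    total = sum(sum(row) for row in matrix)
    weighted = sum((i - j) ** 2 * v
                   for i, row in enumerate(matrix)
                   for j, v in enumerate(row))
    return weighted / total

# Hypothetical 3x3 image quantized to three gray levels
image = [
    [0, 0, 1],
    [1, 2, 2],
    [0, 1, 2],
]
glcm = cooccurrence_matrix(image, levels=3)  # horizontal neighbor pairs
print(glcm)
print(contrast(glcm))
```

Features such as contrast summarize the matrix with a single number, which is the kind of basic statistic the abstract argues is too coarse and proposes to supplement with Principal Component Analysis.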