• Title/Abstract/Keywords: Multivariate Statistical Analysis

Search results: 639 items; processing time: 0.025 s

Multiple Testing in Genomic Sequences Using Hamming Distance

  • Kang, Moonsu
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 19, No. 6
    • /
    • pp.899-904
    • /
    • 2012
  • High-dimensional categorical data models with small sample sizes have not been used extensively for genomic sequences that involve count (or discrete) or purely qualitative responses. A basic task is to identify differentially expressed genes (or positions) among a large number of genes. This requires an appropriate test statistic and a corresponding multiple testing procedure, because a multivariate analysis of variance is not feasible. The familywise error rate (FWER) is not appropriate for testing thousands of genes simultaneously in a multiple testing procedure. The false discovery rate (FDR) is better suited than the FWER to such multiple testing problems. Data from the 2002-2003 SARS epidemic show that a conventional FDR procedure and a proposed test statistic based on a pseudo-marginal approach with Hamming distance perform better.
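
As a rough illustration of the two ingredients named in this abstract, the sketch below computes a Hamming distance between two equal-length sequences and applies the standard Benjamini-Hochberg step-up rule to a list of p-values. It is a minimal sketch only: the function names and the example p-values are hypothetical, and the pseudo-marginal test statistic proposed in the paper is not reproduced here.

```python
from typing import List, Sequence


def hamming_distance(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def benjamini_hochberg(p_values: Sequence[float], alpha: float = 0.05) -> List[bool]:
    """Benjamini-Hochberg step-up procedure.

    Returns one flag per hypothesis (in the input order); True means the
    hypothesis is rejected while controlling the FDR at level alpha.
    """
    m = len(p_values)
    # Indices of the hypotheses sorted by increasing p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest rank k such that p_(k) <= k * alpha / m.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            max_rank = rank
    rejected = [False] * m
    # Reject every hypothesis whose rank is at most max_rank.
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected


if __name__ == "__main__":
    # Hypothetical per-position p-values, for illustration only.
    example = [0.039, 0.001, 0.205, 0.008]
    print(hamming_distance("ACGT", "AGGA"))
    print(benjamini_hochberg(example, alpha=0.05))
```

In a genomic setting, each p-value would come from a per-position test; the step-up rule then decides which positions to flag.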

Quality and Productivity Improvement by Clustering Product Database Information in Semiconductor Testing Floor

  • Lim, Ik-Sung;Koo, Il-Sup;Kim, Tae-Sung
    • 산업경영시스템학회지
    • /
    • Vol. 23, No. 60
    • /
    • pp.73-81
    • /
    • 2000
  • The testing processes for finished VLSI devices are considerably complex because they require different types of ATE to be linked together. Due to interaction effects between two or more linked ATEs, it is difficult to trace the cause of the unexpectedly long ATE setup times and random yields that frequently occur in the VLSI circuit-testing laboratory. The goal of this paper is to develop and demonstrate a methodology designed to eliminate the possible interaction factors that might affect the random yields and/or unexpectedly long setup times, as well as to increase productivity. Statistical methods such as design of experiments or multivariate analysis cannot be applied directly to the final testing floor because of environmental constraints. Expanded product data information (PDI) is constructed by combining product data information and ATE control information. An architecture utilizing expanded PDI is designed, which enables the engineer to conduct statistical investigations, reduce setup time, and increase yield.


A class of accelerated sequential procedures with applications to estimation problems for some distributions useful in reliability theory

  • Joshi, Neeraj;Bapat, Sudeep R.;Shukla, Ashish Kumar
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 28, No. 5
    • /
    • pp.563-582
    • /
    • 2021
  • This paper develops a general class of accelerated sequential procedures and obtains the associated second-order approximations for the expected sample size and the 'regret' function (the difference between the risks of the proposed accelerated sequential procedure and the optimum fixed sample size procedure). We establish that estimation problems based on various lifetime distributions can be tackled with the help of the proposed class of accelerated sequential procedures. Extensive simulation analysis using the Pareto distribution is presented in support of the accuracy of the proposed methodology, and a real data set on carbon fibers is also analyzed to demonstrate its practical utility. We also provide brief details of some other inferential problems that can be seen as applications of the proposed class of accelerated sequential procedures.

A Kullback-Leibler divergence based comparison of approximate Bayesian estimations of ARMA models

  • Amin, Ayman A
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 29, No. 4
    • /
    • pp.471-486
    • /
    • 2022
  • Autoregressive moving average (ARMA) models involve nonlinearity in the model coefficients because of unobserved lagged errors, which complicates the likelihood function and makes the posterior density analytically intractable. To overcome this problem of posterior analysis, several approximation methods have been proposed in the literature. In this paper we first review the main analytic approximations proposed to make the posterior density of ARMA models analytically tractable, namely the Newbold, Zellner-Reynolds, and Broemeling-Shaarawy approximations. We then use the Kullback-Leibler divergence to study the relation between these three analytic approximations and to measure the distance between their derived approximate posteriors for ARMA models. In addition, we evaluate the impact of the distance between the approximate posteriors on Bayesian estimates of the mean and precision of the model coefficients by generating a large number of Monte Carlo simulations from the approximate posteriors. Simulation results show that the approximate posteriors of Newbold and Zellner-Reynolds are very close to each other, and that their estimates have higher precision than those of the Broemeling-Shaarawy approximation. The same results are obtained from the application to real-world time series datasets.
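
To make the notion of "distance between approximate posteriors" concrete, the sketch below evaluates the closed-form Kullback-Leibler divergence between two univariate normal densities. This is a simplified stand-in chosen for illustration: the approximate posteriors compared in the paper are not assumed to be univariate normal, and the function name `kl_normal` is hypothetical.

```python
import math


def kl_normal(mu_p: float, sigma_p: float, mu_q: float, sigma_q: float) -> float:
    """KL(P || Q) for P = N(mu_p, sigma_p^2) and Q = N(mu_q, sigma_q^2).

    Closed form:
        ln(sigma_q / sigma_p)
        + (sigma_p^2 + (mu_p - mu_q)^2) / (2 * sigma_q^2)
        - 1/2
    """
    if sigma_p <= 0 or sigma_q <= 0:
        raise ValueError("standard deviations must be positive")
    return (
        math.log(sigma_q / sigma_p)
        + (sigma_p ** 2 + (mu_p - mu_q) ** 2) / (2 * sigma_q ** 2)
        - 0.5
    )


if __name__ == "__main__":
    # Identical densities have zero divergence.
    print(kl_normal(0.0, 1.0, 0.0, 1.0))
    # The divergence is not symmetric in its arguments.
    print(kl_normal(0.0, 1.0, 0.0, 2.0), kl_normal(0.0, 2.0, 0.0, 1.0))
```

Because the divergence is asymmetric, a comparison of two approximations generally needs to state which density plays the role of the reference.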

Discrimination model of cultivation area of Corni Fructus using a GC-MS-Based metabolomics approach

  • 임재윤
    • 분석과학
    • /
    • Vol. 29, No. 1
    • /
    • pp.1-9
    • /
    • 2016
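  • Developing a logical set of criteria for identifying the origin of herbal medicines would allow the herbal drugs currently on the market to be managed more scientifically. Such efforts are expected to contribute to the development of the traditional herbal medicine industry. To develop a method for discriminating the origin of Corni Fructus, this study first steam-distilled Korean and Chinese Corni Fructus separately and analyzed the resulting volatile components by GC/MS. Based on qualitative identification against the NIST mass spectral library database, the data were binned to obtain variables, and multivariate statistical analyses such as PCA and OPLS-DA were performed on them, establishing an origin discrimination model that can rapidly and accurately distinguish Korean from Chinese Corni Fructus. To develop the model, a training set (n=53) was analyzed to build the origin discrimination model, and its validity was then confirmed by applying a test set (n=12) to the model. In addition, six marker compounds were selected: 1-ethylbutyl-hydroperoxide, nonadecane, butylated hydroxytoluene, 5β,7βH,10α-Eudesm-11-en-1α-ol, 7,9-bis (2-methyl-2-propanyl)-1-oxaspiro[4.5]deca-6,9-diene-2,8-dione, and 2-decyldodecyl-benzene. Although NMR-based discrimination of the origin of Corni Fructus has recently been reported, this is the first report to present an origin discrimination model using a GC/MS-based metabolomics approach, which makes it significant. The results of this study are expected to be applicable to establishing origin discrimination models for herbal medicines and to the scientific management of the origin of Corni Fructus.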

Predicting Unknown Composition of a Mixture Using Independent Component Analysis

  • Lee, Hye-Seon;Park, Hae-Sang;Jun, Chi-Hyuck
    • 한국데이터정보과학회:학술대회논문집
    • /
    • Korean Data and Information Science Society 2005 Spring Conference
    • /
    • pp.127-134
    • /
    • 2005
  • A suitable, conceptually simple representation of data is essential in statistics and signal processing for subsequent analyses such as prediction, pattern recognition, and spatial analysis. Independent component analysis (ICA) is a statistical method for transforming observed high-dimensional multivariate data into statistically independent components. ICA has been applied increasingly in a wide range of spectral applications because it can extract the unknown components of a mixture from spectra. We focus on applying ICA to separate independent sources and to predict each composition using the extracted components. The theory of ICA is introduced and an application to metal surface spectral data is described, in which a subsequent analysis using the non-negative least squares method is performed to predict the composition ratio of each sample. Furthermore, simulation experiments are performed to demonstrate the performance of the proposed approach.


The Building Strategies of Natural Park Integration Monitoring System Based on Geographic Information Analysis System

  • Bae, Min-Ki;Lee, Ju-Hee
    • 한국산림과학회지
    • /
    • Vol. 95, No. 5
    • /
    • pp.605-613
    • /
    • 2006
  • The goal of this study was to propose strategies for building a web-based national park monitoring system (WNPMS) using a geographic information analysis system. First, the study selected and constructed integrated management indicators that consider the physical, ecological, and socio-psychological carrying capacity of national parks. Second, it built an integrated management system with a statistical analysis program for carrying out various multivariate and spatial analyses. Finally, the WNPMS could identify the relationships among visitors, natural resources, and recreation facilities in national parks and forecast the future management status of each national park in Korea. The results of this study will help prevent damage to natural resources and facilities, improve visitor satisfaction, prevent carrying capacity from being exceeded in national parks, and establish tailored management strategies for each national park.

Loss of Expression of PTEN is Associated with Worse Prognosis in Patients with Cancer

  • Qiu, Zhi-Xin;Zhao, Shuang;Li, Lei;Li, Wei-Min
    • Asian Pacific Journal of Cancer Prevention
    • /
    • Vol. 16, No. 11
    • /
    • pp.4691-4698
    • /
    • 2015
  • Background: The tumor suppressor phosphatase and tensin homolog (PTEN) is an important negative regulator of cell-survival signaling. However, available results on the prognostic value of PTEN expression in patients with cancer remain controversial. Therefore, a meta-analysis of published studies investigating this issue was performed. Materials and Methods: A literature search of the PubMed and EMBASE databases was conducted. Statistical analysis was performed using STATA 12.0 (StataCorp, College Station, TX). Data from eligible studies were extracted and included in the meta-analysis using a random effects model. Results: A total of 3,810 patients from 27 studies were included in the meta-analysis, 22 investigating the relationship between PTEN expression and overall survival (OS) using univariate analysis, and nine using multivariate analysis. The pooled hazard ratio (HR) for OS was 1.64 (95% confidence interval (CI): 1.32-2.05) by univariate analysis and 1.56 (95% CI: 1.20-2.03) by multivariate analysis. In addition, eight papers including two disease-free-survival analyses (DFSs), four relapse-free-survival analyses (RFSs), three progression-free-survival analyses (PFSs) and one metastasis-free-survival analysis (MFS) reported the effect of PTEN on survival. The results showed that loss of PTEN expression was significantly correlated with poor prognosis, with a combined HR of 1.74 (95% CI: 1.24-2.44). Furthermore, in the analysis stratified by year of publication, ethnicity, cancer type, method, cut-off value, median follow-up time and neoadjuvant therapy, we found that ethnicity, cancer type, method, median follow-up time and neoadjuvant therapy were associated with prognosis. Conclusions: Our study shows that negative or lost expression of PTEN is associated with worse prognosis in patients with cancer. However, adequately designed prospective studies need to be performed for confirmation.
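
The pooling step described in this abstract can be sketched as follows. The example recovers each study's standard error from its 95% CI on the log scale and combines log hazard ratios with the DerSimonian-Laird random-effects estimator. It is a minimal sketch under stated assumptions: the abstract says only that a random effects model was used in STATA, so the choice of DerSimonian-Laird, the function name, and the study values below are all hypothetical.

```python
import math
from typing import List, Tuple

Z_95 = 1.96  # normal quantile used for 95% confidence intervals


def pool_hazard_ratios(
    studies: List[Tuple[float, float, float]]
) -> Tuple[float, float, float]:
    """Random-effects pooling of (HR, CI lower, CI upper) triples.

    Uses inverse-variance weights on the log-HR scale with the
    DerSimonian-Laird estimate of the between-study variance tau^2.
    Returns the pooled HR with its 95% CI.
    """
    if not studies:
        raise ValueError("at least one study is required")
    log_hr = [math.log(hr) for hr, _, _ in studies]
    # Standard error recovered from the CI width on the log scale.
    se = [(math.log(hi) - math.log(lo)) / (2 * Z_95) for _, lo, hi in studies]
    w = [1.0 / s ** 2 for s in se]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, log_hr)) / sum_w
    # Cochran's Q and the DerSimonian-Laird tau^2 (truncated at zero).
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_hr))
    k = len(studies)
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if k > 1 and c > 0 else 0.0
    # Random-effects weights add tau^2 to each within-study variance.
    w_re = [1.0 / (s ** 2 + tau2) for s in se]
    sum_w_re = sum(w_re)
    pooled = sum(wi * yi for wi, yi in zip(w_re, log_hr)) / sum_w_re
    se_pooled = math.sqrt(1.0 / sum_w_re)
    return (
        math.exp(pooled),
        math.exp(pooled - Z_95 * se_pooled),
        math.exp(pooled + Z_95 * se_pooled),
    )


if __name__ == "__main__":
    # Hypothetical study-level results, not taken from the meta-analysis.
    hypothetical = [(1.5, 1.0, 2.25), (1.8, 1.1, 2.9), (1.2, 0.8, 1.8)]
    print(pool_hazard_ratios(hypothetical))
```

A pooled HR above 1 whose confidence interval excludes 1 is read, as in the abstract, as evidence that loss of expression is associated with worse survival.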

Service Quality assessment for Food & Beverage Product of Hotel

  • 김승희
    • 한국조리학회지
    • /
    • Vol. 5, No. 2
    • /
    • pp.447-467
    • /
    • 1999
  • Most published work on product quality focuses on manufactured goods. The subject of service quality has received less attention. This distinction is important because some of the quality-improving strategies available to manufacturers may be inappropriate for service firms. Services are performances, not objects. They are often produced in the presence of the customer; as in the case of hotel restaurant services, quality occurs during service delivery, usually in an interaction between the customer and the contact personnel of the service firm. For this reason, service quality is highly dependent on the performance of employees, an organizational resource that cannot be controlled to the degree that components of tangible goods can be engineered. This study was undertaken as a basic study for customer satisfaction-oriented management, through understanding and systematically analyzing the service quality of food and beverage products. The major purpose of the study was to examine the relationship between customer satisfaction and service quality in terms of reliability, empathy, responsiveness, tangibility, and assurance. An empirical study was conducted based on previous theoretical studies. A sample of 286 customers at first-class hotels in Seoul was selected. The survey period was February through March 1999, and the responses were processed with SAS using frequency analysis, multivariate statistical analysis, and regression analysis. The statistical treatments comprised frequencies, factor analysis, multiple regression analysis, and path analysis. The SERVQUAL method was used to evaluate service quality. Factor analysis yielded three factors: factor 1 (assurance, empathy, responsiveness), factor 2 (reliability), and factor 3 (tangibility). The findings are as follows. First, the performance-based attribute measure of service quality was affected by customer satisfaction. Second, the performance-based attribute measure of service quality was affected by repurchase intention. Third, the performance-based attribute measure of customer satisfaction was affected by repurchase intention. In the study model, service quality affected repurchase intention more than customer satisfaction, and service quality and customer satisfaction also affected repurchase intention through indirect effects.


Forecasting Korean housing price index: application of the independent component analysis

  • 박노진
    • 응용통계연구
    • /
    • Vol. 30, No. 2
    • /
    • pp.271-280
    • /
    • 2017
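  • Real estate is probably a topic that never fails to appear in the daily news in Korea. Many people are interested in experts' forecasts of changes in real estate prices. The method most commonly used to forecast sale prices or jeonse (lump-sum deposit lease) prices is the autoregressive moving average model based on the Box-Jenkins approach. In this paper, we attempt a forecasting method that combines an autoregressive model with independent component analysis, which is used in multivariate data analysis. Sale prices and jeonse prices are re-expressed as two independent components, forecasts are made using the independent components, and the sale and jeonse prices are then forecast through the inverse transformation. The results show that forecasts using independent components yield values closer to the actual indices than those from the usual autoregressive moving average model.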