• Title/Abstract/Keyword: LASSO

173 search results (processing time: 0.027 s)

Controlling the false discovery rate in sparse VHAR models using knockoffs

  • 박민수;이재원;백창룡
    • 응용통계연구
    • /
    • Vol. 35, No. 6
    • /
    • pp.685-701
    • /
    • 2022
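  • Unlike the family-wise error rate (FWER), a very conservative criterion for controlling type I errors, the false discovery rate (FDR) allows more liberal variable selection and is therefore widely used for inference on high-dimensional data. This paper proposes a method for estimating the sparse VHAR model, a high-dimensional long-memory time series model, while controlling the FDR at a prescribed level using the knockoff methodology proposed by Barber and Candès (2015). Its strengths and weaknesses are compared with those of the existing adaptive Lasso (AL) through a simulation study. The results show that AL has good overall properties, such as sparsity consistency, but yields a relatively high FDR; that is, AL tends to estimate zero coefficients as nonzero. In contrast, the knockoff method kept the FDR at the prescribed level but was very conservative in identifying nonzero coefficients when the sample size was small. However, the performance of the knockoff method improved markedly as the model became sparser, so the knockoff method is superior when the sample size is large and the model is sparse.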
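The FDR control described in this abstract rests on a data-dependent threshold applied to knockoff statistics. The following is a minimal sketch of the knockoff+ threshold rule of Barber and Candès (2015). The statistics `w`, the target level `q`, and the function names are illustrative assumptions, not taken from the paper, and the construction of knockoff variables for the VHAR model is not shown.

```python
def knockoff_plus_threshold(w, q):
    """Smallest t > 0 with (1 + #{w_j <= -t}) / max(1, #{w_j >= t}) <= q.

    Returns infinity when no candidate threshold satisfies the bound,
    in which case nothing is selected.
    """
    # Only the distinct nonzero magnitudes of w need to be checked.
    candidates = sorted({abs(v) for v in w if v != 0})
    for t in candidates:
        negatives = sum(1 for v in w if v <= -t)
        positives = sum(1 for v in w if v >= t)
        if (1 + negatives) / max(1, positives) <= q:
            return t
    return float("inf")


def select(w, q):
    """Indices of variables whose statistic reaches the threshold."""
    t = knockoff_plus_threshold(w, q)
    return [j for j, v in enumerate(w) if v >= t]


# Hypothetical knockoff statistics: large positive values suggest a signal.
w = [5.0, 4.2, -0.3, 3.1, 0.0, 2.5, -1.0, 0.8, 6.3, 1.9]
selected = select(w, q=0.25)
print(selected)
```

The `1 +` in the numerator is what makes the rule conservative: with few variables or a small target level, no threshold may qualify and the selection is empty, which is consistent with the conservative behavior reported for small samples.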

A Study of the Application of Machine Learning Methods in the Low-GloSea6 Weather Prediction Solution

  • 박혜성;조예린;신대영;윤은옥;정성욱
    • 한국정보전자통신기술학회논문지
    • /
    • Vol. 16, No. 5
    • /
    • pp.307-314
    • /
    • 2023
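  • As supercomputing and hardware technologies advance, climate prediction models are also becoming more sophisticated. The Korea Meteorological Administration adopted GloSea5 from the UK Met Office and now operates GloSea6, updated to suit Korean weather conditions. Universities and research institutes have built and use Low-GloSea6, a low-resolution coupled model, to run on small and medium-sized servers with lower specifications than supercomputers. In this paper, we analyze the Low-GloSea6 software to improve the efficiency of weather research on small and medium-sized servers, and identify the tri_sor_dp_dp subroutine of the tri_sor.F90 module in the atmospheric model, which consumes the most CPU time, as the hotspot. We apply a linear regression model, one type of machine learning, to this function to assess the feasibility of the approach. After outliers were removed, the trained linear regression model achieved an RMSE of 2.7665e-08 and an MAE of 1.4958e-08, outperforming Lasso regression and ElasticNet regression. This confirms that machine learning techniques can be applied to the tri_sor.F90 module identified as the hotspot during Low-GloSea6 execution.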
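For readers unfamiliar with the two error measures reported above, the sketch below shows how RMSE and MAE are computed. The data values and function names are made up for illustration and have no connection to the Low-GloSea6 experiments.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error: penalizes large errors more heavily."""
    n = len(y_true)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / n)


def mae(y_true, y_pred):
    """Mean absolute error: every error counts in proportion to its size."""
    n = len(y_true)
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / n


# Toy values (hypothetical): three exact predictions and one large miss.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 8.0]
print(rmse(y_true, y_pred), mae(y_true, y_pred))
```

RMSE is never smaller than MAE on the same data, and the gap widens when a few errors dominate, which is one reason outlier removal affects the two measures differently.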

Study on Failure Classification of Missile Seekers Using Inspection Data from Production and Manufacturing Phases

  • 정예은;김기현;김성목;이연호;김지원;용화영;정재우;박정원;김용수
    • 산업경영시스템학회지
    • /
    • Vol. 47, No. 2
    • /
    • pp.30-39
    • /
    • 2024
  • This study introduces a novel approach for identifying potential failure risks in missile manufacturing by leveraging Quality Inspection Management (QIM) data to address the challenges presented by a dataset comprising 666 variables and data imbalances. The utilization of the SMOTE for data augmentation and Lasso Regression for dimensionality reduction, followed by the application of a Random Forest model, results in a 99.40% accuracy rate in classifying missiles with a high likelihood of failure. Such measures enable the preemptive identification of missiles at a heightened risk of failure, thereby mitigating the risk of field failures and enhancing missile life. The integration of Lasso Regression and Random Forest is employed to pinpoint critical variables and test items that significantly impact failure, with a particular emphasis on variables related to performance and connection resistance. Moreover, the research highlights the potential for broadening the scope of data-driven decision-making within quality control systems, including the refinement of maintenance strategies and the adjustment of control limits for essential test items.

Comparison of radiomics prediction models for lung metastases according to four semiautomatic segmentation methods in soft-tissue sarcomas of the extremities

  • Heesoon Sheen;Han-Back Shin;Jung Young Kim
    • Journal of the Korean Physical Society
    • /
    • Vol. 80
    • /
    • pp.247-256
    • /
    • 2022
  • Our objective was to investigate radiomics signatures and prediction models defined by four segmentation methods using 2-[18F]fluoro-2-deoxy-d-glucose positron emission tomography (18F-FDG PET) imaging of lung metastases of soft-tissue sarcomas (STSs). For this purpose, three fixed threshold methods using the standardized uptake value (SUV) and gradient-based edge detection (ED) were used for tumor delineation on the PET images of STSs. The Dice coefficients (DCs) of the segmentation methods were compared. Least absolute shrinkage and selection operator (LASSO) regression, Spearman's rank correlation, and Friedman's ANOVA test were used for selection and validation of radiomics features. The developed radiomics models were assessed using receiver operating characteristic (ROC) curves and confusion matrices. According to the results, the DC values showed the biggest difference between SUV40% and the other segmentation methods (DC: 0.55 and 0.59). Grey-level run-length matrix_run-length nonuniformity (GLRLM_RLNU) was a common radiomics signature extracted by all segmentation methods. The multivariable logistic regression of ED showed the highest area under the ROC curve (AUC), sensitivity, specificity, and accuracy (AUC: 0.88, sensitivity: 0.85, specificity: 0.74, accuracy: 0.81). In our research, the ED method was able to derive a significant radiomics model. GLRLM_RLNU, which was selected as a meaningful feature under all segmentation methods, was considered a clear radiomics feature associated with heterogeneity and aggressiveness. Our results show that radiomics signatures have the potential to uncover tumor characteristics.
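The Dice coefficient and the confusion-matrix measures named in this abstract follow standard definitions, sketched below. The voxel index sets and the counts are hypothetical and are not the study's data.

```python
def dice(a, b):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) of two sets of voxel indices."""
    if not a and not b:
        return 1.0  # convention: two empty segmentations agree perfectly
    return 2 * len(a & b) / (len(a) + len(b))


def classification_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, and accuracy from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


# Hypothetical segmentations that share two of their four voxels.
seg_a = {1, 2, 3, 4}
seg_b = {3, 4, 5, 6}
overlap = dice(seg_a, seg_b)

# Hypothetical confusion-matrix counts.
metrics = classification_metrics(tp=8, fn=2, tn=6, fp=4)
print(overlap, metrics)
```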

The Doubly Regularized Quantile Regression

  • Choi, Ho-Sik;Kim, Yong-Dai
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 15, No. 5
    • /
    • pp.753-764
    • /
    • 2008
  • The $L_1$ regularized estimator in quantile problems conducts parameter estimation and model selection simultaneously and has been shown to enjoy nice performance. However, the $L_1$ regularized estimator has a drawback: when there are several highly correlated variables, it tends to pick only a few of them. To make up for this, the proposed method adopts a doubly regularized framework with a mixture of $L_1$ and $L_2$ norms. As a result, the proposed method can select significant variables and encourage highly correlated variables to be selected together. One of the most appealing features of the new algorithm is that it constructs the entire solution path of the doubly regularized quantile estimator. We investigate its performance through simulations and real data analysis.
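To make the "mixture of $L_1$ and $L_2$ norms" concrete, the sketch below evaluates one plausible form of such an objective: the quantile check loss plus an $L_1$ penalty and a squared $L_2$ penalty. The squared form of the $L_2$ term, the unscaled sum, and the names `lam1` and `lam2` are assumptions for illustration; the paper's exact formulation may differ.

```python
def check_loss(u, tau):
    """Quantile check loss rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def doubly_regularized_objective(x, y, beta0, beta, tau, lam1, lam2):
    """Check loss + lam1 * ||beta||_1 + lam2 * ||beta||_2^2 (assumed form)."""
    loss = 0.0
    for row, target in zip(x, y):
        fitted = beta0 + sum(b * v for b, v in zip(beta, row))
        loss += check_loss(target - fitted, tau)
    l1 = sum(abs(b) for b in beta)   # encourages sparsity
    l2 = sum(b * b for b in beta)    # encourages correlated variables to enter together
    return loss + lam1 * l1 + lam2 * l2


# Toy data (hypothetical): one predictor, median regression (tau = 0.5).
x = [[1.0], [2.0], [3.0]]
y = [1.0, 2.0, 4.0]
value = doubly_regularized_objective(x, y, beta0=0.0, beta=[1.0],
                                     tau=0.5, lam1=0.1, lam2=0.2)
print(value)
```

This only evaluates the objective at a given coefficient vector; computing the solution path mentioned in the abstract requires a dedicated algorithm that is not sketched here.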

Prediction of Quantitative Traits Using Common Genetic Variants: Application to Body Mass Index

  • Bae, Sunghwan;Choi, Sungkyoung;Kim, Sung Min;Park, Taesung
    • Genomics & Informatics
    • /
    • Vol. 14, No. 4
    • /
    • pp.149-159
    • /
    • 2016
  • With the success of the genome-wide association studies (GWASs), many candidate loci for complex human diseases have been reported in the GWAS catalog. Recently, many disease prediction models based on penalized regression or statistical learning methods were proposed using candidate causal variants from significant single-nucleotide polymorphisms of GWASs. However, there have been only a few systematic studies comparing existing methods. In this study, we first constructed risk prediction models, such as stepwise linear regression (SLR), least absolute shrinkage and selection operator (LASSO), and Elastic-Net (EN), using a GWAS chip and GWAS catalog. We then compared the prediction accuracy by calculating the mean square error (MSE) value on data from the Korea Association Resource (KARE) with body mass index. Our results show that SLR provides a smaller MSE value than the other methods, while the numbers of selected variables in each model were similar.

Relative Error Prediction via Penalized Regression

  • 정석오;이서은;신기일
    • 응용통계연구
    • /
    • Vol. 28, No. 6
    • /
    • pp.1103-1111
    • /
    • 2015
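  • This paper presents a new prediction method that combines the concept of relative error with penalized regression. The proposed method maintains stable predictive power even when the error distribution departs substantially from normality and produces outliers, or when the error distribution is severely asymmetric, and it also performs well in variable selection via penalized regression. It is also conceptually simple, computationally fast, and very easy to implement using existing algorithms. The good properties of the proposed method are confirmed through an analysis of real daily traffic volume data from the Korea Transport Institute and through simulation studies.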

Two-Stage Penalized Composite Quantile Regression with Grouped Variables

  • Bang, Sungwan;Jhun, Myoungshic
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 20, No. 4
    • /
    • pp.259-270
    • /
    • 2013
  • This paper considers a penalized composite quantile regression (CQR) that performs a variable selection in the linear model with grouped variables. An adaptive sup-norm penalized CQR (ASCQR) is proposed to select variables in a grouped manner; in addition, the consistency and oracle property of the resulting estimator are also derived under some regularity conditions. To improve the efficiency of estimation and variable selection, this paper suggests the two-stage penalized CQR (TSCQR), which uses the ASCQR to select relevant groups in the first stage and the adaptive lasso penalized CQR to select important variables in the second stage. Simulation studies are conducted to illustrate the finite sample performance of the proposed methods.

A note on standardization in penalized regressions

  • Lee, Sangin
    • Journal of the Korean Data and Information Science Society
    • /
    • Vol. 26, No. 2
    • /
    • pp.505-516
    • /
    • 2015
  • We consider sparse high-dimensional linear regression models. Penalized regressions have been used as effective methods for variable selection and estimation in high-dimensional models. In penalized regressions, it is common practice to standardize the variables and then fit the penalized model to the standardized variables. Finally, the estimated coefficients are transformed back to the scale of the original variables. However, these procedures produce a slightly different solution compared to the corresponding original penalized problem. In this paper, we investigate issues in the standardization of variables in penalized regressions and formulate the definition of the standardized penalized estimator. In addition, we compare the original penalized estimator with the standardized penalized estimator through simulation studies and real data analysis.
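The discrepancy described in this abstract can be seen in a toy case. The sketch below assumes a single centered predictor, no intercept, and the lasso objective (1/2n)·Σ(y − x·b)² + λ|b|, which has a closed-form soft-thresholding solution. The data, the penalty level, and the function names are hypothetical, and this is not the estimator definition formulated in the paper.

```python
import math


def soft_threshold(z, lam):
    """Shrink z toward zero by lam; return 0.0 inside [-lam, lam]."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_1d(x, y, lam):
    """Minimizer of (1/2n)*sum((y - x*b)^2) + lam*|b| for one centered predictor."""
    n = len(x)
    xy = sum(a * b for a, b in zip(x, y)) / n
    xx = sum(a * a for a in x) / n
    return soft_threshold(xy, lam) / xx


# Toy centered data (hypothetical).
x = [-4.0, -2.0, 0.0, 2.0, 4.0]
y = [-2.0, -1.5, 0.5, 1.0, 2.0]
lam = 0.5

# Route 1: fit the penalized problem on the original scale.
beta_original = lasso_1d(x, y, lam)

# Route 2: standardize x, fit with the same lam, then rescale the coefficient.
s = math.sqrt(sum(a * a for a in x) / len(x))  # standard deviation of centered x
z = [a / s for a in x]
beta_back = lasso_1d(z, y, lam) / s

print(beta_original, beta_back)
```

In this toy case the back-transformed coefficient equals the original-scale solution with penalty `lam * s`, so standardizing amounts to rescaling the penalty on each variable by its standard deviation.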

A study on the Characteristic of Piezoelectric Transformer for the Fluorescent Lamp ballast

  • 이용우;윤광희;류주현;서성제
    • 한국전기전자재료학회:학술대회논문집
    • /
    • Proceedings of the 한국전기전자재료학회 1999 Spring Conference
    • /
    • pp.621-625
    • /
    • 1999
  • The Rosen-type piezoelectric transformer for LCD backlights, which operates at high voltage and low current, may not be successfully used for illuminating general fluorescent lamps because low voltage and high current are required. In this study, a piezoelectric transformer with a width vibration mode, operating at low voltage and high current, was designed for application to fluorescent lamp ballasts. The step-up ratio and efficiency of the piezoelectric transformer as a function of load resistance indicated that the transformer can be used effectively as an electronic ballast for low-profile fluorescent lamps.
