Search | Korea Science

Development of Export Volume and Export Amount Prediction Models Based on Supervised Learning (지도학습 기반 수출물량 및 수출금액 예측 모델 개발)

Dong-Gil Na;Yeong-Woong Yu
- Journal of Korean Society of Industrial and Systems Engineering
- v.46 no.2
- pp.152-159
- 2023
Due to COVID-19, consumption trends in the distribution sector are changing, with an increase in non-face-to-face consumption and rapid growth in the online shopping market. However, it is difficult for small and medium-sized export sellers to obtain forecast information on export markets by country, compared to large distributors who can easily build a global sales network. This study predicts export amount and export volume by country and item to support market information analysis for small and medium-sized export sellers. Prediction models were developed using Lasso, XGBoost, and MLP, covering supervised learning and deep learning, and export trends for clothing, cosmetics, and household electronic devices were predicted for Korea's major export destinations: the United States, China, and Vietnam. The Lasso model showed excellent performance in terms of MAE and RMSE, and based on these results, a market analysis system for small and medium-sized sellers was developed.
https://doi.org/10.11627/jksie.2023.46.2.152
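
The abstract above compares models by MAE and RMSE. As a minimal sketch of how these two error metrics are computed, the following uses hypothetical export figures; the function names and sample values are illustrative assumptions and are not taken from the paper.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error: the average of |actual - predicted|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error: square root of the mean squared residual."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


# Hypothetical monthly export amounts (arbitrary units), not data from the paper.
actual = [100.0, 120.0, 90.0, 110.0]
predicted = [110.0, 115.0, 95.0, 100.0]

print("MAE:", mae(actual, predicted))
print("RMSE:", rmse(actual, predicted))
```

RMSE penalizes large individual errors more heavily than MAE, so reporting both gives a rough sense of whether a model's error is spread evenly or concentrated in a few observations.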

Prediction of the Following BCI Performance by Means of Spectral EEG Characteristics in the Prior Resting State (뇌신호 주파수 특성을 이용한 CNN 기반 BCI 성능 예측)

Kang, Jae-Hwan;Kim, Sung-Hee;Youn, Joosang;Kim, Junsuk
- KIPS Transactions on Computer and Communication Systems
- v.9 no.11
- pp.265-272
- 2020
In brain-computer interface (BCI) research, one major problem is how to deal with the so-called BCI-illiteracy group: people who cannot control a BCI system. To approach this problem efficiently, we investigated how spectral EEG characteristics in the prior resting state are associated with BCI performance in the following BCI task. First, spectral powers of EEG signals in the resting state were extracted separately for the eyes-open and eyes-closed conditions. Second, a convolutional neural network (CNN) based binary classifier discriminated the binary motor imagery intention in the BCI task. Both the linear correlation and binary prediction methods confirmed that the spectral EEG characteristics in the prior resting state were highly related to BCI performance in the following BCI task. Linear regression analysis demonstrated that the relative ratio of the spectral power below and above 13 Hz in the resting state, in the eyes-open condition only and not the eyes-closed condition, was significantly correlated with the quantified metrics of BCI performance (r=0.544). A binary classifier based on linear regression with L1 regularization was able to discriminate the high-performance group from the low-performance group in the following BCI task by using the spectral EEG features from the preceding resting state (AUC=0.817). These results strongly support the use of spectral EEG characteristics in the frontal regions during the eyes-open resting state as a good predictor of performance in the following BCI task.
https://doi.org/10.3745/KTCCS.2020.9.11.265 KSCI
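
The abstract above reports classifier quality as an AUC. As a minimal sketch, AUC can be computed as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The labels and scores below are hypothetical, and `roc_auc` is an illustrative name rather than anything from the paper.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison: fraction of (positive, negative) pairs
    in which the positive example scores higher; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical group labels (1 = high performance) and classifier scores.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
print("AUC:", roc_auc(labels, scores))
```

The pairwise form is quadratic in the number of samples, which is acceptable for small cohorts; rank-based formulas give the same value more efficiently on large data.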

A study on entertainment TV show ratings and the number of episodes prediction (국내 예능 시청률과 회차 예측 및 영향요인 분석)

Kim, Milim;Lim, Soyeon;Jang, Chohee;Song, Jongwoo
- The Korean Journal of Applied Statistics
- v.30 no.6
- pp.809-825
- 2017
The number of TV entertainment shows is increasing. Competition among programs in the entertainment market is intensifying since cable channels air many entertainment TV shows. There is now a need for research on program ratings and the number of episodes. This study presents predictive models for entertainment TV show ratings and number of episodes. We use various data mining techniques such as linear regression, logistic regression, LASSO, random forests, gradient boosting, and support vector machines. The analysis results show that the average program ratings before the first broadcast are affected by the broadcasting company, the average ratings of the previous season, the starting year, and the number of articles. The average program ratings after the first broadcast are influenced by the rating of the first broadcast, the broadcasting company, and the program type. We also found that the predicted average ratings, starting year, type, and broadcasting company are important variables in predicting the number of episodes.
https://doi.org/10.5351/KJAS.2017.30.6.809 KSCI

Prediction of Greenhouse Strawberry Production Using Machine Learning Algorithm (머신러닝 알고리즘을 이용한 온실 딸기 생산량 예측)

Kim, Na-eun;Han, Hee-sun;Arulmozhi, Elanchezhian;Moon, Byeong-eun;Choi, Yung-Woo;Kim, Hyeon-tae
- Journal of Bio-Environment Control
- v.31 no.1
- pp.1-7
- 2022
Strawberry is a prominent cultivated fruit in Korea. Optimum strawberry production is highly dependent on the growing environment. Smart farm technology and automatic monitoring and control systems maintain a favorable environment for strawberry growth in greenhouses and play an important role in improving production. Moreover, the physiological parameters of the strawberry plant and its surrounding environment may give an indication of strawberry production. Therefore, this study aims to build a machine learning model to predict the yield of strawberries cultivated in greenhouses. Environmental parameters such as temperature, humidity, and CO₂, and physiological parameters such as leaf length, number of flowers and fruits, and chlorophyll content of 'Seolhyang' (a widely grown strawberry cultivar in Korea) were collected from three strawberry greenhouses located in Sacheon, Gyeongsangnam-do, during 2019-2020. A Lasso regression predictive model was designed and validated through 5-fold cross-validation. The study found that the Lasso regression model performed well in predicting the number of flowers and fruits, with MAPE values of 0.511 and 0.488, respectively, during model validation. Overall, the study demonstrates that an AI-based regression model may be convenient for farms and agricultural companies to predict crop yield with fewer input attributes.
https://doi.org/10.12791/KSBEC.2022.31.1.001 KSCI
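
The abstract above validates its model with 5-fold cross-validation and reports MAPE. As a minimal sketch of that evaluation loop, the code below splits sample indices into five folds and scores a placeholder predictor (the training-set mean) with MAPE expressed as a fraction. The data, the helper names, and the mean predictor are assumptions for illustration only; the paper's own model is Lasso regression, and whether it reports MAPE as a fraction or a percentage is not stated here.

```python
def mape(y_true, y_pred):
    """Mean absolute percentage error as a fraction (0.5 means 50%).
    Every value in y_true must be nonzero."""
    return sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def k_fold_indices(n_samples, k=5):
    """Yield (train_idx, val_idx) pairs over contiguous folds (no shuffling).
    Each sample appears in exactly one validation fold."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, val
        start += size


# Hypothetical flower counts per plant, not data from the paper.
y = [4.0, 5.0, 6.0, 5.0, 4.0, 6.0, 5.0, 5.0, 4.0, 6.0]

fold_scores = []
for train_idx, val_idx in k_fold_indices(len(y), k=5):
    # Placeholder model: predict the mean of the training targets.
    mean_pred = sum(y[i] for i in train_idx) / len(train_idx)
    fold_scores.append(mape([y[i] for i in val_idx], [mean_pred] * len(val_idx)))

print("mean MAPE over folds:", sum(fold_scores) / len(fold_scores))
```

In practice the rows would usually be shuffled before splitting so that any ordering in the collected data does not bias individual folds.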

Cox Model Improvement Using Residual Blocks in Neural Networks: A Study on the Predictive Model of Cervical Cancer Mortality (신경망 내 잔여 블록을 활용한 콕스 모델 개선: 자궁경부암 사망률 예측모형 연구)

Nang Kyeong Lee;Joo Young Kim;Ji Soo Tak;Hyeong Rok Lee;Hyun Ji Jeon;Jee Myung Yang;Seung Won Lee
- The Transactions of the Korea Information Processing Society
- v.13 no.6
- pp.260-268
- 2024
Cervical cancer is the fourth most common cancer in women worldwide, and more than 604,000 new cases were reported in 2020 alone, resulting in approximately 341,831 deaths. The Cox regression model is widely adopted in cancer research, but because it assumes linearity, it is limited when nonlinear associations exist. To address this problem, this paper proposes ResSurvNet, a new model that improves the accuracy of cervical cancer mortality prediction using ResNet's residual learning framework. The model outperformed the DNN, CPH, CoxLasso, Cox Gradient Boost, and RSF models compared in this study. This predictive performance demonstrates great value for early diagnosis and for establishing treatment strategies in the management of cervical cancer patients, and represents significant progress in the field of survival analysis.
https://doi.org/10.3745/TKIPS.2024.13.6.260

Performance of Prediction Models for Diagnosing Severe Aortic Stenosis Based on Aortic Valve Calcium on Cardiac Computed Tomography: Incorporation of Radiomics and Machine Learning

Nam gyu Kang;Young Joo Suh;Kyunghwa Han;Young Jin Kim;Byoung Wook Choi
- Korean Journal of Radiology
- v.22 no.3
- pp.334-343
- 2021
Objective: We aimed to develop a prediction model for diagnosing severe aortic stenosis (AS) using computed tomography (CT) radiomics features of aortic valve calcium (AVC) and machine learning (ML) algorithms.
Materials and Methods: We retrospectively enrolled 408 patients who underwent cardiac CT between March 2010 and August 2017 and had echocardiographic examinations (240 patients with severe AS on echocardiography [the severe AS group] and 168 patients without severe AS [the non-severe AS group]). Data were divided into a training set (312 patients) and a validation set (96 patients). Using non-contrast-enhanced cardiac CT scans, AVC was segmented, and 128 radiomics features for AVC were extracted. After feature selection was performed with three ML algorithms (least absolute shrinkage and selection operator [LASSO], random forests [RFs], and eXtreme Gradient Boosting [XGBoost]), model classifiers for diagnosing severe AS on echocardiography were developed in combination with three different model classifier methods (logistic regression, RF, and XGBoost). The performance (c-index) of each radiomics prediction model was compared with predictions based on AVC volume and score.
Results: The radiomics scores derived from LASSO were significantly different between the severe AS and non-severe AS groups in the validation set (median, 1.563 vs. 0.197, respectively, p < 0.001). A radiomics prediction model based on feature selection by LASSO + model classifier by XGBoost showed the highest c-index of 0.921 (95% confidence interval [CI], 0.869-0.973) in the validation set. Compared to prediction models based on AVC volume and score (c-indexes of 0.894 [95% CI, 0.815-0.948] and 0.899 [95% CI, 0.820-0.951], respectively), eight and three of the nine radiomics prediction models, respectively, showed higher discrimination abilities for severe AS. However, the differences were not statistically significant (p > 0.05 for all).
Conclusion: Models based on the radiomics features of AVC and ML algorithms may perform well for diagnosing severe AS, but the added value compared to AVC volume and score should be investigated further.
https://doi.org/10.3348/kjr.2020.0099

Estimation for misclassified data with ultra-high levels

Kang, Moonsu
- Journal of the Korean Data and Information Science Society
- v.27 no.1
- pp.217-223
- 2016
Outcome misclassification is widespread in classification problems, but methods to account for it are rarely used. This paper addresses the problem of inference with misclassified multinomial logit data when the number of multinomial parameters is large. There has been a significant swell of interest in developing novel methods for inference with misclassified data. A simulation study shows how serious the misclassification issue becomes as the number of categories increases. Then, using group lasso regression, we show how the best model should be fitted for this kind of multinomial regression problem.
https://doi.org/10.7465/jkdi.2016.27.1.217 KSCI

Pure additive contribution of genetic variants to a risk prediction model using propensity score matching: application to type 2 diabetes

Park, Chanwoo;Jiang, Nan;Park, Taesung
- Genomics & Informatics
- v.17 no.4
- pp.47.1-47.12
- 2019
The achievements of genome-wide association studies have suggested ways to predict diseases, such as type 2 diabetes (T2D), using single-nucleotide polymorphisms (SNPs). Most T2D risk prediction models have used SNPs in combination with demographic variables. However, it is difficult to evaluate the pure additive contribution of genetic variants to classically used demographic models. Since prediction models include some heritable traits, such as body mass index, the contribution of SNPs using unmatched case-control samples may be underestimated. In this article, we propose a method that uses propensity score matching to avoid underestimation by matching case and control samples, thereby determining the pure additive contribution of SNPs. To illustrate the proposed propensity score matching method, we used SNP data from the Korea Association Resources project and reported SNPs from the genome-wide association study catalog. We selected various SNP sets via stepwise logistic regression (SLR), least absolute shrinkage and selection operator (LASSO), and the elastic-net (EN) algorithm. Using these SNP sets, we made predictions using SLR, LASSO, and EN as logistic regression modeling techniques. The accuracy of the predictions was compared in terms of area under the receiver operating characteristic curve (AUC). The contribution of SNPs to T2D was evaluated by the difference in the AUC between models using only demographic variables and models that included the SNPs. The largest difference among our models showed that the AUC of the model using genetic variants with demographic variables could be 0.107 higher than that of the corresponding model using only demographic variables.
https://doi.org/10.5808/GI.2019.17.4.e47 KSCI
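
The abstract above matches case and control samples on propensity scores. One common way to do this is greedy 1:1 nearest-neighbor matching with a caliper, sketched below. The abstract does not say which matching scheme the authors used, so this is an assumed illustration; the scores, the caliper value, and the function name are hypothetical.

```python
def match_on_propensity(case_scores, control_scores, caliper=0.05):
    """Greedy 1:1 matching without replacement.
    Each case takes the closest unused control, provided the score
    difference is within the caliper. Returns (case_idx, control_idx) pairs."""
    available = dict(enumerate(control_scores))
    pairs = []
    for case_idx, score in enumerate(case_scores):
        if not available:
            break
        best = min(available, key=lambda c: abs(available[c] - score))
        if abs(available[best] - score) <= caliper:
            pairs.append((case_idx, best))
            del available[best]  # a control can be used only once
    return pairs


# Hypothetical propensity scores (e.g. from a logistic model on demographics).
cases = [0.30, 0.62, 0.90]
controls = [0.10, 0.31, 0.60, 0.70]
print(match_on_propensity(cases, controls))  # the third case has no control within the caliper
```

Cases without a sufficiently close control are dropped, which trades sample size for better balance on the matched covariates.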

A Study for the Drivers of Movie Box-office Performance (영화흥행 영향요인 선택에 관한 연구)

Kim, Yon Hyong;Hong, Jeong Han
- The Korean Journal of Applied Statistics
- v.26 no.3
- pp.441-452
- 2013
This study analyzed the relationship between key film success factors and box-office records, based on movies released in the first quarter of 2013 in Korea. Over-fitting can occur if too many explanatory variables are included in a regression model; in addition, there is a risk that the estimator is unstable when there is multicollinearity among the explanatory variables. For this reason, selecting the variables with high explanatory power for box-office performance is important. Among the numerous ways to select variables, LASSO estimation applied to a generalized linear model has the smallest prediction error and can efficiently and quickly identify, in order, the variables with the highest explanatory power for box-office performance.
https://doi.org/10.5351/KJAS.2013.26.3.441 KSCI
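
The abstract above uses LASSO for variable selection. As a minimal sketch of why LASSO drops weak variables, the code below fits a linear LASSO by coordinate descent with soft thresholding, minimizing (1/(2n))·||y − Xw||² + α·||w||₁. It assumes centered features with no intercept and an ordinary linear model rather than the generalized linear model used in the paper; the data and function names are illustrative.

```python
def soft_threshold(rho, alpha):
    """Shrink rho toward zero by alpha; values within [-alpha, alpha] become 0."""
    if rho > alpha:
        return rho - alpha
    if rho < -alpha:
        return rho + alpha
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iters=100):
    """Cyclic coordinate descent for the linear LASSO (no intercept)."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(n_iters):
        for j in range(d):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # mean squared value of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * w[k] for k in range(d) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            rho /= n
            z /= n
            w[j] = soft_threshold(rho, alpha) / z if z > 0 else 0.0
    return w


# Hypothetical design: the first feature has a strong effect, the second a weak one.
X = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
y = [2.0 * a + 0.05 * b for a, b in X]

print(lasso_coordinate_descent(X, y, alpha=0.1))
```

With this penalty the weak coefficient is set exactly to zero while the strong one is only shrunk, which is the selection behavior the abstract relies on.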

On sampling algorithms for imbalanced binary data: performance comparison and some caveats (불균형적인 이항 자료 분석을 위한 샘플링 알고리즘들: 성능비교 및 주의점)

Kim, HanYong;Lee, Woojoo
- The Korean Journal of Applied Statistics
- v.30 no.5
- pp.681-690
- 2017
Various imbalanced binary classification problems exist, such as fraud detection in banking operations, spam mail detection, and prediction of defective products. Several sampling methods, such as over-sampling, under-sampling, and SMOTE, have been developed to overcome the poor prediction performance of binary classifiers when one group is dominant. In this study, we investigate the prediction performance of logistic regression, Lasso, random forest, boosting, and support vector machine in combination with these sampling methods for imbalanced binary data. Four real data sets are analyzed to see if there is a substantial improvement in prediction performance. We also emphasize some precautions to take when the sampling methods are implemented.
https://doi.org/10.5351/KJAS.2017.30.5.681 KSCI
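
The abstract above lists over-sampling among the methods it compares. As a minimal sketch of the simplest variant, random over-sampling, the code below duplicates minority-class rows at random until every class matches the size of the largest one. The data and the function name are hypothetical, and this is not the SMOTE algorithm, which synthesizes new points by interpolation instead of copying.

```python
import random


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen rows of each smaller class until all
    classes have as many rows as the largest class."""
    rng = random.Random(seed)
    by_class = {}
    for features, label in zip(X, y):
        by_class.setdefault(label, []).append(features)
    target = max(len(rows) for rows in by_class.values())
    X_out, y_out = [], []
    for label, rows in by_class.items():
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        for features in rows + extra:
            X_out.append(features)
            y_out.append(label)
    return X_out, y_out


# Hypothetical imbalanced data: eight majority rows, two minority rows.
X = [[float(i)] for i in range(10)]
y = [0] * 8 + [1] * 2

X_bal, y_bal = random_oversample(X, y)
print(y_bal.count(0), y_bal.count(1))
```

One widely cited precaution, which may or may not be among those the authors discuss, is to resample only the training split so that duplicated rows never leak into the data used for evaluation.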
