• Title/Summary/Keyword: Predictive


Evaluation of the Congenital Hypothyroidism for Newborn Screening Program in Korea: A 14-year Retrospective Cohort Study (한국인 선천성 갑상선기능저하증에 대한 신생아선별검사의 14년간의 후향적 연구; 발생빈도와 유효성)

  • Yoon, Hye-Ran; Ahn, Sunhyun; Lee, Hyangja
    • Journal of The Korean Society of Inherited Metabolic disease / v.19 no.1 / pp.1-11 / 2019
  • Purpose: Congenital hypothyroidism (CH) is the most common congenital endocrine disorder. The purpose of the present study was to determine the incidence of CH in South Korea during the period from January 1991 to March 2004. Methods: Central data from each city branch of SCL (Seoul Clinical Reference Laboratories) in Yongin, South Korea, were gathered and collectively analyzed. Newborn screening (NBS) for CH was based on measuring the levels of neonatal thyroid stimulating hormone (TSH) and free T4 (cut-offs of 20 mIU/L and less than 0.8 ng/dL, respectively). Results: During the study period, 671,805 live births were screened for CH based on TSH and free T4 ELISA assays. A total of 159 newborns out of 671,805 were deemed positive for CH, corresponding to an incidence of 1 in 4,225. When a cut-off of 20 mIU/L was used in the TSH assay, the associated sensitivity, specificity, and positive predictive value (PPV) were 100.0%, 99.7%, and 10.8%, respectively. When a cut-off of 0.8 ng/dL was used in the free T4 assay, the associated sensitivity, specificity, and PPV were 100.0%, 98.5%, and 3.9%, respectively. Conclusion: The incidence of CH in South Korea, as evidenced by the NBS results, was comparable to that reported in other countries prior to 2004.
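
The sensitivity, specificity, and PPV values above follow directly from the counts of true/false positives and negatives at a given cut-off. A minimal sketch of that arithmetic in Python, using hypothetical counts rather than the study's actual screening data:

```python
# Screening-metric arithmetic; the counts below are hypothetical placeholders,
# not the study's confusion-matrix counts.

def screening_metrics(tp, fp, fn, tn):
    """Return sensitivity, specificity, and positive predictive value (PPV)."""
    sensitivity = tp / (tp + fn)   # detected cases among all true CH cases
    specificity = tn / (tn + fp)   # negative results among all unaffected newborns
    ppv = tp / (tp + fp)           # confirmed CH among all screen-positive newborns
    return sensitivity, specificity, ppv

# Example: a cut-off that catches every true case (sensitivity 100%) can still
# yield a low PPV when false positives greatly outnumber true positives.
sens, spec, ppv = screening_metrics(tp=10, fp=90, fn=0, tn=9900)
print(f"sensitivity={sens:.1%}, specificity={spec:.1%}, PPV={ppv:.1%}")
```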


Prediction of a hit drama with a pattern analysis on early viewing ratings (초기 시청시간 패턴 분석을 통한 대흥행 드라마 예측)

  • Nam, Kihwan; Seong, Nohyoon
    • Journal of Intelligence and Information Systems / v.24 no.4 / pp.33-49 / 2018
  • The impact of a TV drama's success on ratings and channel promotion is very high, and its cultural and business impact has also been demonstrated through the Korean Wave. Therefore, early prediction of a blockbuster TV drama is very important from the strategic perspective of the media industry. Previous studies have tried to predict audience ratings and the success of dramas with various methods, but most have made simple predictions using intuitive factors such as the main actor and time slot, and these studies have limitations in prediction. In this study, we propose a model for predicting the popularity of a drama by analyzing viewers' viewing patterns on the basis of various theories. This is not only a theoretical contribution but also a practical one, in that the model can be used by actual broadcasting companies. We collected data on 280 TV mini-series dramas broadcast on terrestrial channels over the 10 years from 2003 to 2012. From these data, we selected the most highly ranked and the least highly ranked 45 TV dramas and analyzed their viewing patterns in 11 steps. The assumptions and conditions for modeling are based on existing studies, on the opinions of actual broadcasters, and on data mining techniques. We then developed a prediction model by measuring the viewing-time distance (difference) using Euclidean and correlation methods; the sum of these distances is termed similarity in our study. Through this similarity measure, we predicted the success of dramas from the viewers' initial viewing-time pattern distribution over episodes 1-5. To confirm that the model does not depend on the particular measurement method, various distance measures were applied and the robustness of the model was checked, and once the model was established, we improved its predictive power using a grid search. Furthermore, when a new drama is broadcast, we classify viewers who watched more than 70% of the total airtime as "passionate viewers." We then compared the percentage of passionate viewers between the most highly ranked and the least highly ranked dramas, so that the possibility of a blockbuster TV mini-series can be determined. We find that the initial viewing-time pattern is the key factor in predicting blockbuster dramas: with our model, blockbuster dramas were correctly classified with 75.47% accuracy from the initial viewing-time pattern analysis (see the sketch below). This paper shows a high prediction rate while suggesting an audience-rating method different from existing ones. Currently, broadcasters rely heavily on a few famous actors, the so-called star system, and face more severe competition than ever due to rising production costs, a long-term recession, and aggressive investment by comprehensive programming channels and large corporations; everyone is in a financially difficult situation. The basic revenue model of these broadcasters is advertising, and advertising is sold on the basis of audience ratings. In the drama market, demand is difficult to forecast owing to the nature of the product, while drama success makes a high financial contribution to a broadcaster's various content businesses, so the risk of failure needs to be minimized.
Thus, analyzing the distribution of initial viewing time can be of practical help in establishing a response strategy (programming, marketing, story changes, etc.) for the related company. We also found that audience behavior is crucial to the success of a program. In this paper, we define viewing loyalty as a measure of how enthusiastically a program is watched; by calculating the loyalty of these passionate viewers, the success of a program can be predicted. This way of calculating loyalty can also be applied to other platforms, and it can be used for marketing activities such as highlights, script previews, making-of videos, characters, games, and other marketing projects.
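
As a rough illustration of the similarity idea described above, the sketch below compares a new drama's early viewing-time distribution with hypothetical average patterns of past hits and non-hits using Euclidean and correlation distances. The 11-step profiles, weights, and nearest-profile decision rule are illustrative assumptions standing in for the paper's grid-searched model, not a reproduction of it.

```python
import numpy as np

# Hypothetical 11-step viewing-time distributions (share of viewers per step):
# past hits skew toward long viewing times, past non-hits toward short ones.
hit_profile  = np.array([0.02, 0.03, 0.04, 0.05, 0.06, 0.08, 0.10, 0.12, 0.14, 0.16, 0.20])
flop_profile = np.array([0.20, 0.16, 0.14, 0.12, 0.10, 0.08, 0.06, 0.05, 0.04, 0.03, 0.02])

def euclidean(a, b):
    return np.linalg.norm(a - b)

def correlation_distance(a, b):
    return 1.0 - np.corrcoef(a, b)[0, 1]

def similarity(pattern, reference, w_euclid=1.0, w_corr=1.0):
    """'Similarity' here is the (weighted) sum of the two distances; smaller = more similar."""
    return w_euclid * euclidean(pattern, reference) + w_corr * correlation_distance(pattern, reference)

def predict_hit(early_pattern):
    """Classify a drama from its episodes 1-5 viewing-time distribution."""
    return "hit" if similarity(early_pattern, hit_profile) < similarity(early_pattern, flop_profile) else "non-hit"

new_drama = np.array([0.05, 0.05, 0.06, 0.07, 0.08, 0.09, 0.10, 0.11, 0.12, 0.13, 0.14])
print(predict_hit(new_drama))
```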

Distress and Associated Factors in Patients with Breast Cancer Surgery : A Cross-Sectional Study (유방암 수술환자의 디스트레스 및 연관인자 : 단면연구)

  • Lee, Sang-Shin; Rim, Hyo-Deog; Woo, Jungmin
    • Korean Journal of Psychosomatic Medicine / v.26 no.2 / pp.77-85 / 2018
  • Objectives : This study aimed to investigate the level of distress using the distress thermometer (DT) and the factors associated with distress in postoperative breast cancer (BC) patients. Methods : The DT and the WHOQOL-BREF (World Health Organization Quality of Life Scale Abbreviated Version), along with sociodemographic variables, were assessed within one week postoperatively in patients undergoing surgery as their first treatment for BC. The distress group consisted of participants with a DT score ≥ 4. The prevalence and associated factors of distress were examined by descriptive, univariable, and logistic regression analyses. Results : Three hundred seven women were recruited, and 264 subjects were finally analyzed. A total of 173 (65.5%) were classified into the distress group. The distress group showed significantly younger age (p=0.045), living without a spouse (p=0.032), and worse quality of life (QOL) as measured by overall QOL (p=0.009), general health (p=0.005), the physical health domain (p<0.001), and the psychological health domain (p=0.002). The logistic regression analysis showed that patients aged 40-49 years were more likely to experience distress than those aged ≥ 60 years (odds ratio [OR]=2.992, 95% confidence interval [CI] 1.241-7.215). Moreover, the WHOQOL-BREF physical health domain was a predictive factor for distress (OR=0.777, 95% CI 0.692-0.873). Conclusions : A substantial proportion of patients experience significant distress after BC surgery. Distress management from the early stage of BC treatment, especially in middle-aged patients and in the domain of physical QOL (e.g., pain, insomnia, fatigue), would be expected to reduce chronic distress.
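
The odds ratios above come from a binary logistic regression of distress status (DT ≥ 4) on candidate factors, with OR = exp(coefficient). A hedged sketch of that kind of analysis on synthetic data; the predictors, coefficients, and sample below are illustrative and are not the study's dataset:

```python
import numpy as np
import statsmodels.api as sm

# Synthetic illustration: distress (DT >= 4) modeled on an age-group indicator
# and a WHOQOL-BREF physical-domain-like score. All values are made up.
rng = np.random.default_rng(0)
n = 264
age_40_49 = rng.integers(0, 2, n)                      # 1 if aged 40-49, else 0
phys_qol  = rng.normal(13.0, 2.5, n)                   # physical-health domain score
true_logit = 0.8 * age_40_49 - 0.25 * (phys_qol - 13) + 0.6
distress = (rng.random(n) < 1 / (1 + np.exp(-true_logit))).astype(int)

X = sm.add_constant(np.column_stack([age_40_49, phys_qol]))
fit = sm.Logit(distress, X).fit(disp=0)
odds_ratios = np.exp(fit.params)                       # exp(coef) = odds ratio per predictor
print(dict(zip(["const", "age_40_49", "phys_QOL"], odds_ratios.round(3))))
```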

Psychological Characteristics of Living Liver Transplantation Donors using MMPI-2 Profiles (MMPI-2를 이용한 생체 간 공여자들의 심리적 특성에 대한 연구)

  • Lee, Jin Hyeok; Choi, Tae Young; Yoon, Seoyoung
    • Korean Journal of Psychosomatic Medicine / v.27 no.1 / pp.42-49 / 2019
  • Objectives : Living donor liver transplantation (LDLT) is a life-saving therapy for patients with terminal liver disease. Many studies have focused on recipients rather than donors. The aim of this study was to assess the emotional status and personality characteristics of LDLT donors. Methods : We evaluated 218 subjects (126 male, 92 female) who visited Daegu Catholic University Medical Center from August 2012 to July 2018. A retrospective review of their preoperative psychological evaluations was done. We investigated epidemiological data and Minnesota Multiphasic Personality Inventory-2 (MMPI-2) results. Subanalyses were done depending on whether subjects actually underwent surgery, their relationship with the recipient, and their gender. Results : The mean age of subjects was 32.19 ± 10.91 years. A total of 187 subjects underwent LDLT surgery (actual donors), while 31 did not (potential donors). Donor-recipient relationships included husband-wife, parent-child, brother-sister, etc. Subjects showed statistically significant differences on validity scales L, F, and K and on all clinical scales compared to the control group. Potential donors differed significantly from actual donors on the F(b), F(p), K, S, Pa, AGGR, PSYC, DISC, and NEGE scales. The F, D, and NEGE scales were found to be predictive of actual donation. Subanalyses of the donor-recipient relationship and gender also showed significant differences on certain scales. Conclusions : Under-reporting of psychological problems should be considered when evaluating living liver donors. Information about the donor's overall psychosocial background, mental status, and donation process should also be acquired.

The Effect of Data Size on the k-NN Predictability: Application to Samsung Electronics Stock Market Prediction (데이터 크기에 따른 k-NN의 예측력 연구: 삼성전자주가를 사례로)

  • Chun, Se-Hak
    • Journal of Intelligence and Information Systems / v.25 no.3 / pp.239-251 / 2019
  • Statistical methods such as moving averages, Kalman filtering, exponential smoothing, regression analysis, and ARIMA (autoregressive integrated moving average) have been used for stock market prediction. However, these statistical methods have not produced superior performance. In recent years, machine learning techniques have been widely used in stock market prediction, including artificial neural networks, SVM, and genetic algorithms. In particular, a case-based reasoning method known as k-nearest neighbor is also widely used for stock price prediction. Case-based reasoning retrieves several similar cases from previous cases when a new problem occurs and combines the class labels of the similar cases to create a classification for the new problem. However, case-based reasoning has some problems. First, it tends to search for a fixed number of neighbors in the observation space and always selects the same number of neighbors rather than the best similar neighbors for the target case, so it may take more cases into account even when fewer applicable cases exist for the subject. Second, it may select neighbors that are far away from the target case. Thus, case-based reasoning does not guarantee an optimal pseudo-neighborhood for various target cases, and predictability can be degraded by deviation from the desired similar neighbors. This paper examines how the size of the learning data affects stock price predictability with k-nearest neighbor and compares the predictability of k-nearest neighbor with that of the random walk model according to the size of the learning data and the number of neighbors. In this study, Samsung Electronics stock prices were predicted by dividing the learning dataset into two types. For the prediction of the next day's closing price, we used four variables: opening value, daily high, daily low, and daily close. In the first experiment, data from January 1, 2000 to December 31, 2017 were used for the learning process. In the second experiment, data from January 1, 2015 to December 31, 2017 were used for the learning process. The test data cover January 1, 2018 to August 31, 2018 for both experiments. We compared the performance of k-NN with the random walk model using the two learning datasets. The mean absolute percentage error (MAPE) was 1.3497 for the random walk model and 1.3570 for k-NN in the first experiment, when the learning data were small. However, the MAPE for the random walk model was 1.3497 and for k-NN was 1.2928 in the second experiment, when the learning data were large. These results show that predictive power is higher when more learning data are used than when less learning data are used. This paper also shows that k-NN generally produces better predictive power than the random walk model for larger learning datasets and does not when the learning dataset is relatively small. Future studies need to consider macroeconomic variables related to stock price forecasting in addition to the opening, low, high, and closing prices. Also, to produce better results, it is recommended that k-nearest neighbor find nearest neighbors using a second-step filtering method that considers fundamental economic variables as well as a sufficient amount of learning data.
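
A minimal sketch of the setup described above: predict the next day's close from today's open/high/low/close with k-nearest-neighbor regression and compare MAPE against a random-walk baseline. The price series below is synthetic and merely stands in for the Samsung Electronics data; k, the test window, and the scaling are illustrative choices.

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

def mape(actual, pred):
    return np.mean(np.abs((actual - pred) / actual)) * 100

def knn_vs_random_walk(ohlc, k=5, n_test=160):
    """ohlc: array of shape (n_days, 4) with columns open, high, low, close."""
    X, y = ohlc[:-1], ohlc[1:, 3]            # features: today's OHLC; target: tomorrow's close
    X_tr, X_te, y_tr, y_te = X[:-n_test], X[-n_test:], y[:-n_test], y[-n_test:]

    knn_pred = KNeighborsRegressor(n_neighbors=k).fit(X_tr, y_tr).predict(X_te)
    rw_pred = X_te[:, 3]                     # random walk: tomorrow's close = today's close
    return mape(y_te, knn_pred), mape(y_te, rw_pred)

# Synthetic OHLC series standing in for actual stock prices.
rng = np.random.default_rng(1)
close = 45000 * np.exp(np.cumsum(rng.normal(0, 0.01, 1000)))
ohlc = np.column_stack([close * 0.995, close * 1.01, close * 0.99, close])
print("MAPE (k-NN, random walk):", knn_vs_random_walk(ohlc))
```

Varying the length of the training slice and the value of k in this sketch mirrors the paper's comparison across learning-data sizes and neighbor counts.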

Predicting stock movements based on financial news with systematic group identification (시스템적인 군집 확인과 뉴스를 이용한 주가 예측)

  • Seong, NohYoon; Nam, Kihwan
    • Journal of Intelligence and Information Systems / v.25 no.3 / pp.1-17 / 2019
  • Because stock price forecasting is an important issue both academically and practically, research on stock price prediction has been actively conducted. Stock price forecasting research is classified into approaches using structured data and approaches using unstructured data. With structured data such as historical stock prices and financial statements, past studies usually used technical analysis and fundamental analysis. In the big data era, the amount of information has rapidly increased, and artificial intelligence methodologies that can find meaning by quantifying text, the unstructured data that accounts for a large share of this information, have developed rapidly. With these developments, many attempts are being made to predict stock prices from online news by applying text mining to stock price forecasting. The methodology adopted in many papers is to forecast a stock's price using the news of the target company to be forecast. However, according to previous research, not only news about a target company affects its stock price; news about related companies can also affect the stock price. Finding highly relevant companies is not easy, however, because of market-wide effects and random signals. Thus, existing studies have found highly relevant companies based primarily on pre-determined international industry classification standards. However, according to recent research, the Global Industry Classification Standard has varying homogeneity within its sectors, which leads to a limitation: forecasting stock prices by taking all firms in a sector together, without restricting attention to truly relevant companies, can adversely affect predictive performance. To overcome this limitation, we first apply random matrix theory together with text mining for stock prediction. When the dimension of the data is large, classical limit theorems are no longer suitable because statistical efficiency is reduced; therefore, a simple correlation analysis in the financial market does not capture the true correlation. To solve this issue, we adopt random matrix theory, which is mainly used in econophysics, to remove market-wide effects and random signals and find the true correlations between companies. With the true correlations, we perform cluster analysis to find relevant companies. Based on the clustering, we then use a multiple kernel learning algorithm, an ensemble of support vector machines, to incorporate the effects of the target firm and its relevant firms simultaneously. Each kernel is assigned to predict stock prices with features from the financial news of the target firm or of its relevant firms. The results of this study are as follows. (1) Following the existing research flow, we confirmed that forecasting stock prices using news from relevant companies is an effective approach. (2) Searching for relevant companies in the wrong way can lower AI prediction performance. (3) The proposed approach with random matrix theory shows better performance than previous studies when cluster analysis is performed on the true correlations obtained by removing market-wide effects and random signals. The contributions of this study are as follows. First, this study shows that random matrix theory, which is used mainly in econophysics, can be combined with artificial intelligence to produce good methodologies.
This suggests that it is important not only to develop AI algorithms but also to adopt physics theory, extending existing research that integrated artificial intelligence with complex system theory through transfer entropy. Second, this study stresses that finding the right companies in the stock market is an important issue; it is not only important to study artificial intelligence algorithms but also how to theoretically adjust the input values. Third, we confirmed that firms classified under the Global Industry Classification Standard (GICS) may have low relevance to one another, and we suggest that it is necessary to define relevance theoretically rather than simply taking it from the GICS.
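
A hedged sketch of the random-matrix-theory filtering step described above: eigenvalues of the return-correlation matrix that fall inside the Marchenko-Pastur noise band are discarded, the largest eigenvalue is dropped as the market-wide mode, and clustering is run on the filtered correlations. The returns are synthetic, the cluster count is arbitrary, and the paper's news features and multiple kernel learning stage are not shown.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
T, N = 500, 40                                    # T return observations for N firms (synthetic)
returns = rng.normal(size=(T, N))
corr = np.corrcoef(returns, rowvar=False)

# Marchenko-Pastur upper bound for a pure-noise correlation matrix with q = N/T.
q = N / T
lam_max = (1 + np.sqrt(q)) ** 2

vals, vecs = np.linalg.eigh(corr)
keep = vals > lam_max                             # modes above the noise band carry real structure
keep[np.argmax(vals)] = False                     # drop the largest eigenvalue as the market-wide mode
filtered = (vecs * np.where(keep, vals, 0.0)) @ vecs.T
np.fill_diagonal(filtered, 1.0)

# Hierarchical clustering on a distance derived from the filtered correlations.
dist = np.sqrt(np.clip(2.0 * (1.0 - filtered), 0.0, None))
condensed = dist[np.triu_indices(N, k=1)]
clusters = fcluster(linkage(condensed, method="average"), t=4, criterion="maxclust")
print(clusters)                                   # cluster label per firm; clusters define "relevant" firms
```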

A Study on the Effect of the Document Summarization Technique on the Fake News Detection Model (문서 요약 기법이 가짜 뉴스 탐지 모형에 미치는 영향에 관한 연구)

  • Shim, Jae-Seung; Won, Ha-Ram; Ahn, Hyunchul
    • Journal of Intelligence and Information Systems / v.25 no.3 / pp.201-220 / 2019
  • Fake news has emerged as a significant issue over the last few years, igniting discussions and research on how to solve this problem. In particular, studies on automated fact-checking and fake news detection using artificial intelligence and text analysis techniques have drawn attention. Fake news detection research entails a form of document classification; thus, document classification techniques have been widely used in this type of research. However, document summarization techniques have been inconspicuous in this field. At the same time, automatic news summarization services have become popular, and a recent study found that using news summarized through abstractive summarization strengthened the predictive performance of fake news detection models. Therefore, the need to study the integration of document summarization technology in the domestic news data environment has become evident. In order to examine the effect of extractive summarization on the fake news detection model, we first summarized news articles through extractive summarization. Second, we created a summarized-news-based detection model. Finally, we compared our model with the full-text-based detection model. The study found that BPN (Back Propagation Neural Network) and SVM (Support Vector Machine) did not exhibit a large difference in performance; however, for DT (Decision Tree), the full-text-based model demonstrated somewhat better performance. In the case of LR (Logistic Regression), our model exhibited superior performance. Nonetheless, the results did not show a statistically significant difference between our model and the full-text-based model. Therefore, when summarization is applied, at least the core information of the fake news is preserved, and the LR-based model suggests the possibility of performance improvement. This study features an experimental application of extractive summarization to fake news detection research, employing various machine learning algorithms. The study's limitations are, essentially, the relatively small amount of data and the lack of comparison between various summarization technologies. Therefore, an in-depth analysis that applies various analytical techniques to a larger data volume would be helpful in the future.
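
As a rough illustration of the pipeline above, the sketch below uses a simple TF-IDF sentence-scoring extractive summarizer feeding a logistic-regression detector on two placeholder articles. The summarizer, corpus, labels, and classifier settings are illustrative assumptions and do not reproduce the paper's Korean news data or model configurations.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def extractive_summary(article, n_sentences=2):
    """Keep the n sentences whose TF-IDF vectors carry the most total weight."""
    sentences = [s.strip() for s in article.split(".") if s.strip()]
    if len(sentences) <= n_sentences:
        return ". ".join(sentences)
    weights = np.asarray(TfidfVectorizer().fit_transform(sentences).sum(axis=1)).ravel()
    top = sorted(np.argsort(weights)[-n_sentences:])      # keep original sentence order
    return ". ".join(sentences[i] for i in top)

# Placeholder articles with real(0)/fake(1) labels; a real experiment would use a news corpus.
articles = [
    "The ministry confirmed the budget today. Officials said funding will rise next year. The plan covers rural schools.",
    "Scientists discovered a miracle cure overnight. The secret pill melts away every disease instantly. Doctors reportedly hate this trick.",
]
labels = [0, 1]

summaries = [extractive_summary(a) for a in articles]
detector = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
detector.fit(summaries, labels)    # summary-based model; the full-text model would be fit on `articles`
print(detector.predict(summaries))
```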

Prediction of Air Temperature and Relative Humidity in Greenhouse via a Multilayer Perceptron Using Environmental Factors (환경요인을 이용한 다층 퍼셉트론 기반 온실 내 기온 및 상대습도 예측)

  • Choi, Hayoung; Moon, Taewon; Jung, Dae Ho; Son, Jung Eek
    • Journal of Bio-Environment Control / v.28 no.2 / pp.95-103 / 2019
  • Temperature and relative humidity are important factors in crop cultivation and should be properly controlled to improve crop yield and quality. In order to control the environment accurately, we need to predict how the environment will change in the future. The objective of this study was to predict air temperature and relative humidity at a future time by using a multilayer perceptron (MLP). The data required to train the MLP were collected every 10 min from Oct. 1, 2016 to Feb. 28, 2018 in an eight-span greenhouse (1,032 m²) cultivating mango (Mangifera indica cv. Irwin). The inputs for the MLP were greenhouse inside and outside environment data and the set-up and operating values of the environment control devices. Using these data, the MLP was trained to predict the air temperature and relative humidity 10 to 120 min into the future. Considering the four typical seasons in Korea, three days of data from each season were compared as test data. The MLP was optimized with four hidden layers and 128 nodes for air temperature (R² = 0.988) and with four hidden layers and 64 nodes for relative humidity (R² = 0.990). Due to the characteristics of the MLP, the accuracy decreased as the prediction time became longer. However, air temperature and relative humidity were properly predicted regardless of the environmental changes that varied from season to season. For specific events such as spray irrigation, however, the number of training samples was too small, resulting in poor predictive accuracy. In this study, air temperature and relative humidity were appropriately predicted through optimization of the MLP, but the results were limited to the experimental greenhouse. Therefore, it is necessary to collect more data from greenhouses in various places and to modify the structure of the neural network for generalization.
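
A minimal sketch of the MLP setup described above, using scikit-learn's MLPRegressor with four hidden layers of 128 nodes to map current readings (here, synthetic placeholder features plus the current air temperature) to the temperature several 10-min steps ahead. The feature set, horizon, and training settings are illustrative and do not reproduce the study's configuration.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

HORIZON = 6  # 6 steps of 10 min = predict 60 min ahead (the study covers 10-120 min)

# Synthetic stand-ins for inside/outside environment data and device set-points per 10-min step.
rng = np.random.default_rng(0)
n_steps = 5000
other_features = rng.normal(size=(n_steps, 7))
air_temp = 20 + 5 * np.sin(np.arange(n_steps) / 144 * 2 * np.pi) + rng.normal(0, 0.3, n_steps)

X = np.column_stack([other_features, air_temp])[:-HORIZON]
y = air_temp[HORIZON:]                      # air temperature HORIZON steps into the future
split = int(0.8 * len(X))

model = make_pipeline(
    StandardScaler(),
    MLPRegressor(hidden_layer_sizes=(128, 128, 128, 128), max_iter=500, random_state=0),
)
model.fit(X[:split], y[:split])
print("held-out R^2:", round(model.score(X[split:], y[split:]), 3))
```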

Predicting Healthy Lifestyle Patterns in Older Community Dwelling Adults: A Latent Profile Analysis (잠재프로파일 분석을 활용한 한국 노인 라이프스타일 유형화와 영향요인 분석)

  • Park, Kang-Hyun; Yang, Min Ah; Won, Kyung-A; Park, Ji-Hyuk
    • Therapeutic Science for Rehabilitation / v.10 no.2 / pp.75-93 / 2021
  • Objective : The aim of this study was to identify subgroups of older adults with respect to their lifestyle patterns and examine the characteristics of each subgroup in order to provide basic evidence for improving their health and quality of life. Methods : This cross-sectional study was conducted in South Korea. Community-dwelling older adults (n=184) above the age of 65 years were surveyed from April 2019 to May 2019. This study used latent profile analysis to identify the subgroups. Chi-squared (χ²) tests and multinomial logistic regression were then used to analyze individual characteristics and influencing factors. Results : The pattern of physical activity, one of the lifestyle domains in the elderly, was categorized into three types: 'passive exercise type (31.1%)', 'low-intensity exercise type (54.5%)', and 'balanced exercise type (14.5%)'. Activity participation was divided into three patterns: 'inactive type (12%)', 'self-management type (61%)', and 'balanced activity participation type (27%)'. In terms of nutrition, there were only two groups: 'overall malnutrition type (13.5%)' and 'balanced nutrition type (86.5%)'. Furthermore, the multinomial logistic regression analysis of the effects of lifestyle types on the health and quality of life of the elderly confirmed that health and quality of life were higher in those following an active and balanced lifestyle. In addition, gender, education level, and residential area were identified as predictive factors. Conclusion : The health and quality of life of the elderly can be improved when they have a balanced lifestyle. Therefore, empirical and policy intervention strategies should be developed and implemented to enhance the health and quality of life of the elderly.
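
Latent profile analysis assigns respondents to subgroups by fitting mixture models with different numbers of profiles and selecting among them with an information criterion. A hedged sketch of that idea using scikit-learn's GaussianMixture as a simplified stand-in for dedicated LPA software, on made-up lifestyle scores; the variables, group sizes, and BIC-based selection are illustrative assumptions.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Made-up per-person scores on three lifestyle domains:
# physical activity, activity participation, and nutrition.
rng = np.random.default_rng(0)
scores = np.vstack([
    rng.normal([1, 1, 2], 0.5, size=(60, 3)),    # roughly a "passive" group
    rng.normal([3, 3, 4], 0.5, size=(100, 3)),   # a "self-management" style group
    rng.normal([5, 5, 4], 0.5, size=(24, 3)),    # a "balanced" group
])

# Fit 1- to 5-profile solutions and keep the one with the lowest BIC.
candidates = [GaussianMixture(n_components=k, random_state=0).fit(scores) for k in range(1, 6)]
best = min(candidates, key=lambda m: m.bic(scores))
profiles = best.predict(scores)
print("profiles:", best.n_components, "sizes:", np.bincount(profiles))
```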

Abnormal Water Temperature Prediction Model Near the Korean Peninsula Using LSTM (LSTM을 이용한 한반도 근해 이상수온 예측모델)

  • Choi, Hey Min; Kim, Min-Kyu; Yang, Hyun
    • Korean Journal of Remote Sensing / v.38 no.3 / pp.265-282 / 2022
  • Sea surface temperature (SST) is a factor that greatly influences ocean circulation and ecosystems in the Earth system. As global warming causes changes in the SST near the Korean Peninsula, abnormal water temperature phenomena (high and low water temperatures) occur, causing continuous damage to the marine ecosystem and the fishery industry. Therefore, this study proposes a methodology to predict the SST near the Korean Peninsula and to prevent damage by predicting abnormal water temperature phenomena. The study area was set near the Korean Peninsula, and ERA5 data from the European Centre for Medium-Range Weather Forecasts (ECMWF) were used to obtain SST data for the same period. As the research method, the Long Short-Term Memory (LSTM) algorithm, a deep learning model specialized for time-series prediction, was used in consideration of the time-series characteristics of the SST data. The model predicts the SST near the Korean Peninsula 1 to 7 days ahead and predicts high or low water temperature phenomena. To evaluate the accuracy of the SST prediction, the coefficient of determination (R²), root mean squared error (RMSE), and mean absolute percentage error (MAPE) were used. For 1-day predictions, the model achieved R²=0.996, RMSE=0.119℃, and MAPE=0.352% in summer (JAS), and R²=0.999, RMSE=0.063℃, and MAPE=0.646% in winter (JFM). Using the predicted SST, the accuracy of abnormal sea surface temperature prediction was evaluated with the F1 score (F1=0.98 for high water temperature prediction in summer (2021/08/05), F1=1.0 for low water temperature prediction in winter (2021/02/19)). As the prediction period increased, the model showed a tendency to underestimate the SST, which also reduced the accuracy of the abnormal water temperature prediction. Therefore, future work should analyze the cause of this underestimation and study ways to improve the prediction accuracy.
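
A minimal sketch of the LSTM setup described above, using Keras to predict SST one day ahead from a sliding window of past daily values and scoring with RMSE and MAPE. The synthetic series, window length, network size, and training epochs are illustrative placeholders; the study itself trains on ERA5 SST fields and evaluates 1- to 7-day horizons per season.

```python
import numpy as np
import tensorflow as tf

LOOKBACK, HORIZON = 30, 1        # 30 past days -> SST 1 day ahead (the study covers 1-7 days)

# Synthetic daily SST series standing in for an ERA5 grid point near the Korean Peninsula.
days = np.arange(4000)
sst = 15 + 8 * np.sin(2 * np.pi * days / 365.25) + np.random.default_rng(0).normal(0, 0.3, days.size)

X = np.stack([sst[i:i + LOOKBACK] for i in range(len(sst) - LOOKBACK - HORIZON + 1)])[..., None]
y = sst[LOOKBACK + HORIZON - 1:]             # target: SST HORIZON days after each window
split = int(0.8 * len(X))

model = tf.keras.Sequential([
    tf.keras.layers.LSTM(64, input_shape=(LOOKBACK, 1)),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X[:split], y[:split], epochs=5, batch_size=64, verbose=0)

pred = model.predict(X[split:], verbose=0).ravel()
rmse = float(np.sqrt(np.mean((pred - y[split:]) ** 2)))
mape = float(np.mean(np.abs((pred - y[split:]) / y[split:])) * 100)
print(f"RMSE={rmse:.3f} degC, MAPE={mape:.3f}%")
```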