• Title/Summary/Keyword: Score Prediction


Conditional Generative Adversarial Network based Collaborative Filtering Recommendation System (Conditional Generative Adversarial Network(CGAN) 기반 협업 필터링 추천 시스템)

  • Kang, Soyi;Shin, Kyung-shik
    • Journal of Intelligence and Information Systems / v.27 no.3 / pp.157-173 / 2021
  • With the development of information technology, the amount of available information increases daily. However, having access to so much information makes it difficult for users to find the information they seek. Users want a visualized system that reduces information retrieval and learning time, saving them from personally reading and judging all available information. As a result, recommendation systems are an increasingly important technology that is essential to business. Collaborative filtering is used in various fields with excellent performance because recommendations are made based on similar user interests and preferences. However, limitations do exist. Sparsity, which occurs when user-item preference information is insufficient, is the main limitation of collaborative filtering. The rating values in the user-item matrix may be distorted depending on the popularity of the product, or there may be new users who have not yet rated any items. The lack of historical data to identify consumer preferences is referred to as data sparsity, and various methods have been studied to address this problem. However, most attempts to solve the sparsity problem are not optimal because they can only be applied when additional data such as users' personal information, social networks, or characteristics of items are included. Another problem is that real-world score data are mostly biased toward high scores, resulting in severe imbalances. One cause of this imbalanced distribution is purchasing bias: users who expect to rate a product highly are the ones who purchase it, while those with low expected ratings are less likely to purchase and thus do not leave negative reviews. Due to these characteristics, reviews by users who purchase products are more likely to be positive than most users' actual preferences would suggest.
Therefore, because of this bias, models trained on actual rating data over-learn the high-incidence classes, distorting the view of the market. Applying collaborative filtering to these imbalanced data leads to poor recommendation performance due to excessive learning of the biased classes. Traditional oversampling techniques that address this problem are likely to cause overfitting because they repeat the same data, which acts as noise in learning and reduces recommendation performance. In addition, most existing pre-processing methods for data imbalance are designed and used for binary classes. Binary class imbalance techniques are difficult to apply to multi-class problems because they cannot model multi-class phenomena such as objects at cross-class boundaries or objects overlapping multiple classes. To solve this problem, research has been conducted on converting multi-class problems into binary class problems. However, simplifying multi-class problems can cause classification errors when the results are combined with those of classifiers learned from other sub-problems, resulting in the loss of important information about relationships beyond the selected items. Therefore, it is necessary to develop more effective methods to address multi-class imbalance problems. We propose a collaborative filtering model that uses CGAN to generate realistic virtual data to populate the empty user-item matrix. The conditional vector y identifies the distributions of minority classes, and data reflecting their characteristics are generated. Collaborative filtering is then applied, with hyperparameter tuning to maximize recommendation performance. This process should improve the accuracy of the model by addressing the sparsity problem of collaborative filtering implementations while mitigating the data imbalances arising from real data. Our model shows superior recommendation performance compared with existing oversampling techniques and with the original sparse real-world data.
SMOTE, Borderline SMOTE, SVM-SMOTE, ADASYN, and GAN were used as comparison models, and the proposed model achieved the highest prediction accuracy on the RMSE and MAE evaluation metrics. This study suggests that deep-learning-based oversampling can further improve the performance of recommendation systems using actual data and can be used to build business recommendation systems.
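
The abstract above reports accuracy in terms of RMSE and MAE. As a minimal sketch of how these two metrics are computed for predicted ratings, assuming plain Python lists of held-out ratings (the function names and sample values are illustrative and do not come from the paper):

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length rating lists."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, predicted):
    """Mean absolute error between two equal-length rating lists."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical held-out ratings on a 1-5 scale (illustrative values only).
actual = [5, 4, 1, 3]
predicted = [4, 4, 2, 5]
print("RMSE:", rmse(actual, predicted))
print("MAE:", mae(actual, predicted))
```

RMSE penalizes large errors more heavily than MAE, so reporting both gives a fuller picture when rating classes are imbalanced.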

Radiomics Analysis of Gray-Scale Ultrasonographic Images of Papillary Thyroid Carcinoma > 1 cm: Potential Biomarker for the Prediction of Lymph Node Metastasis (Radiomics를 이용한 1 cm 이상의 갑상선 유두암의 초음파 영상 분석: 림프절 전이 예측을 위한 잠재적인 바이오마커)

  • Hyun Jung Chung;Kyunghwa Han;Eunjung Lee;Jung Hyun Yoon;Vivian Youngjean Park;Minah Lee;Eun Cho;Jin Young Kwak
    • Journal of the Korean Society of Radiology / v.84 no.1 / pp.185-196 / 2023
  • Purpose: This study aimed to investigate radiomics analysis of ultrasonographic images to develop a potential biomarker for predicting lymph node metastasis in papillary thyroid carcinoma (PTC) patients. Materials and Methods: This study included 431 PTC patients from August 2013 to May 2014 and classified them into training and validation sets. A total of 730 radiomics features, including texture matrices of gray-level co-occurrence matrix and gray-level run-length matrix, single-level discrete two-dimensional wavelet transform, and other functions, were obtained. The least absolute shrinkage and selection operator method was used to select the most predictive features in the training data set. Results: Lymph node metastasis was associated with the radiomics score (p < 0.001). It was also associated with other clinical variables such as young age (p = 0.007) and large tumor size (p = 0.007). The area under the receiver operating characteristic curve was 0.687 (95% confidence interval: 0.616-0.759) for the training set and 0.650 (95% confidence interval: 0.575-0.726) for the validation set. Conclusion: This study showed the potential of ultrasonography-based radiomics to predict cervical lymph node metastasis in patients with PTC; thus, ultrasonography-based radiomics can act as a biomarker for PTC.
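
The area under the ROC curve reported above can be read as the probability that a randomly chosen patient with metastasis has a higher score than a randomly chosen patient without it. A minimal sketch of that pairwise computation follows; the function name and the toy labels and scores are hypothetical and are not the study's data:

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 for metastasis, 0 for no metastasis.
    scores: a continuous score, e.g. a radiomics score.
    Tied pairs count as half a correct ranking.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (illustrative values only).
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
print("AUC:", roc_auc(labels, scores))
```

The pairwise form is quadratic in the number of cases, which is acceptable for a few hundred patients; rank-based formulas give the same value more efficiently.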

Prediction of Salvaged Myocardium in Patients with Acute Myocardial Infarction after Primary Percutaneous Coronary Angioplasty using early Thallium-201 Redistribution Myocardial Perfusion Imaging (급성심근경색증의 일차적 관동맥성형술 후 조기 Tl-201 재분포영상을 이용한 구조심근 예측)

  • Choi, Joon-Young;Yang, You-Jung;Choi, Seung-Jin;Yeo, Jeong-Seok;Park, Seong-Wook;Song, Jae-Kwan;Moon, Dae-Hyuk
    • The Korean Journal of Nuclear Medicine / v.37 no.4 / pp.219-228 / 2003
  • Purpose: The amount of salvaged myocardium is an important prognostic factor in patients with acute myocardial infarction (MI). We investigated whether early Tl-201 SPECT imaging could be used to predict the salvaged myocardium and functional recovery in acute MI after primary PTCA. Materials and Methods: In 36 patients with first acute MI treated with primary PTCA, serial echocardiography and Tl-201 SPECT imaging (5.8 ± 2.1 days after PTCA) were performed. Regional wall motion and perfusion were quantified on a 16-segment myocardial model with 5-point and 4-point scoring systems, respectively. Results: Wall motion improved in 78 of the 212 dyssynergic segments on 1-month follow-up echocardiography and in 97 on 7-month follow-up echocardiography; these segments were considered salvaged myocardium. The areas under the receiver operating characteristic curves of the Tl-201 perfusion score for detecting salvaged myocardial segments were 0.79 for the 1-month follow-up and 0.83 for the 7-month follow-up. The sensitivity and specificity of Tl-201 redistribution images with an optimum cutoff of 40% of peak thallium activity for detecting salvaged myocardium were 84.6% and 55.2% for the 1-month follow-up, and 87.6% and 64.3% for the 7-month follow-up, respectively. There was a linear relationship between the percentage of peak thallium activity on early redistribution imaging and the likelihood of segmental functional improvement 7 months after reperfusion. Conclusion: Tl-201 myocardial perfusion SPECT imaging performed early, within 10 days after reperfusion, can be used to predict the salvaged myocardium and functional recovery with high sensitivity during the 7 months following primary PTCA in patients with acute MI.
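
The sensitivity and specificity figures above come from thresholding segmental thallium activity at 40% of peak. The sketch below shows how such values are derived from per-segment data. It assumes a segment is predicted as salvaged when its activity is at or above the cutoff (the abstract does not state whether the comparison is inclusive), and the function name and sample values are hypothetical:

```python
def sensitivity_specificity(uptake_percent, recovered, cutoff=40.0):
    """Sensitivity and specificity of a fixed uptake cutoff.

    uptake_percent: segment activity as a percentage of peak activity.
    recovered: True if wall motion improved on follow-up (salvaged).
    Assumption: uptake >= cutoff is read as "predicted salvaged".
    """
    tp = fn = tn = fp = 0
    for uptake, is_recovered in zip(uptake_percent, recovered):
        predicted = uptake >= cutoff
        if is_recovered and predicted:
            tp += 1
        elif is_recovered:
            fn += 1
        elif predicted:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need both recovered and non-recovered segments")
    return tp / (tp + fn), tn / (tn + fp)


# Six hypothetical segments (illustrative values only).
uptake = [80, 55, 35, 45, 20, 30]
recovered = [True, True, True, False, False, False]
sens, spec = sensitivity_specificity(uptake, recovered)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")
```

Lowering the cutoff raises sensitivity at the cost of specificity, which is the trade-off an ROC curve summarizes.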

Construction of Consumer Confidence index based on Sentiment analysis using News articles (뉴스기사를 이용한 소비자의 경기심리지수 생성)

  • Song, Minchae;Shin, Kyung-shik
    • Journal of Intelligence and Information Systems / v.23 no.3 / pp.1-27 / 2017
  • The economic sentiment index and macroeconomic indicators are known to be closely related because economic agents' judgments and forecasts of business conditions affect economic fluctuations. For this reason, consumer sentiment or confidence provides steady fodder for business and is treated as an important piece of economic information. In Korea, the consumer sentiment index is highly relevant to both private consumption and GDP, which makes it a very important economic indicator for evaluating and forecasting the domestic economic situation. However, despite offering relevant insights into private consumption and GDP, the traditional survey-based approach to measuring consumer confidence has several limitations. One weakness is that it takes considerable time to research, collect, and aggregate the data. If urgent issues arise, timely information is not announced until the end of each month. In addition, the survey only contains information derived from questionnaire items, which makes it difficult to capture the direct effects of newly arising issues. The survey also faces potential declines in response rates and erroneous responses. Therefore, it is necessary to find a way to complement it. For this purpose, we construct and assess an index designed to measure consumer economic sentiment using sentiment analysis. Unlike the survey-based measures, our index relies on textual analysis to extract sentiment from economic and financial news articles. In particular, text data such as news articles and SNS are timely and cover a wide range of issues; because such sources can quickly capture the economic impact of specific economic issues, they have great potential as economic indicators. Of the two main approaches to the automatic extraction of sentiment from text, we apply the lexicon-based approach, which uses sentiment lexicon dictionaries of words annotated with their semantic orientations.
In creating the sentiment lexicon dictionaries, we enter the semantic orientation of individual words manually, though we do not attempt a full linguistic analysis (one that involves analysis of word senses or argument structure); this is a limitation of our research, and further work in that direction remains possible. In this study, we generate a time series index of economic sentiment in the news. The construction of the index consists of three broad steps: (1) collecting a large corpus of economic news articles on the web, (2) applying lexicon-based methods for sentiment analysis of each article to score the article in terms of sentiment orientation (positive, negative, or neutral), and (3) constructing an economic sentiment index of consumers by aggregating monthly time series for each sentiment word. In line with existing scholarly assessments of the relationship between the consumer confidence index and macroeconomic indicators, any new index should be assessed for its usefulness. We examine the new index's usefulness by comparing other economic indicators to the CSI. To check the usefulness of the new sentiment-based index, trend and cross-correlation analyses are carried out to analyze the relations and lag structure. Finally, we analyze the forecasting power using one-step-ahead out-of-sample prediction. In almost all experiments, the news sentiment index correlates strongly with related contemporaneous key indicators. Furthermore, in most cases, news sentiment shocks predict future economic activity; in head-to-head comparisons, the news sentiment measures outperform the survey-based sentiment index (CSI). Policy makers want to understand consumer or public opinions about existing or proposed policies.
Monitoring various web media, SNS, and news articles enables relevant government decision-makers to respond quickly to such opinions. Textual data, such as news articles and social networks (Twitter, Facebook, and blogs), are generated at high speed and cover a wide range of issues; because such sources can quickly capture the economic impact of specific economic issues, they have great potential as economic indicators. Although research using unstructured data in economic analysis is in its early stages, the utilization of such data is expected to increase greatly once its usefulness is confirmed.
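
The three-step index construction described in this abstract (collect articles, score each with a lexicon, aggregate monthly) can be sketched as follows. This is only an illustration under stated assumptions: the tiny English lexicon, the whitespace tokenizer, and the balance formula (positive minus negative articles, divided by their sum) are hypothetical choices and not the authors' actual dictionary or aggregation rule.

```python
from collections import defaultdict

# Hypothetical lexicon: word -> semantic orientation (+1 or -1).
LEXICON = {"growth": 1, "recovery": 1, "recession": -1, "unemployment": -1}


def score_article(text, lexicon=LEXICON):
    """Return +1 (positive), -1 (negative), or 0 (neutral) for one article."""
    tokens = (tok.strip(".,;:!?").lower() for tok in text.split())
    total = sum(lexicon.get(tok, 0) for tok in tokens)
    return (total > 0) - (total < 0)


def monthly_index(articles, lexicon=LEXICON):
    """Aggregate article labels into a monthly balance index in [-1, 1].

    articles: iterable of (month, text) pairs, e.g. ("2017-01", "...").
    Assumed formula: (positive - negative) / (positive + negative).
    Months with only neutral articles are omitted.
    """
    counts = defaultdict(lambda: [0, 0])  # month -> [positive, negative]
    for month, text in articles:
        label = score_article(text, lexicon)
        if label > 0:
            counts[month][0] += 1
        elif label < 0:
            counts[month][1] += 1
    return {m: (p - n) / (p + n) for m, (p, n) in sorted(counts.items())}


# Toy corpus (illustrative only).
articles = [
    ("2017-01", "Growth and recovery continue."),
    ("2017-01", "Recession fears rise."),
    ("2017-01", "Strong growth reported."),
    ("2017-02", "Unemployment rises amid recession."),
]
print(monthly_index(articles))
```

A series built this way can then be compared with a survey-based index through cross-correlation at different lags, as the abstract describes.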