• Title/Summary/Keyword: Supervised prediction

Arabic Stock News Sentiments Using the Bidirectional Encoder Representations from Transformers Model

  • Eman Alasmari;Mohamed Hamdy;Khaled H. Alyoubi;Fahd Saleh Alotaibi
    • International Journal of Computer Science & Network Security
    • /
    • v.24 no.2
    • /
    • pp.113-123
    • /
    • 2024
  • Stock market news sentiment analysis (SA) aims to identify the attitudes that stock news on official platforms expresses toward companies' stocks. It supports sound investment decisions and analysts' evaluations. However, research on Arabic SA is limited compared with English SA because of the complexity of the Arabic language and its limited corpora. This paper develops a sentiment classification model to predict the polarity of Arabic stock news in microblogs. It also aims to extract the reasons that lead to the polarity categorization, expressed as the main economic causes or aspects based on semantic units. To this end, the paper presents an Arabic SA approach based on the logistic regression model and the Bidirectional Encoder Representations from Transformers (BERT) model. The proposed model classifies articles as positive, negative, or neutral. It was trained on data collected from an official Saudi stock market article platform, which were then preprocessed and labeled. Moreover, the economic reasons for the articles' polarity were investigated based on semantic units and divided into seven economic aspects. The supervised BERT model obtained 88% article classification accuracy based on SA, and the unsupervised mean Word2Vec encoder obtained 80% economic-aspect clustering accuracy. Predicting the polarity of Arabic stock market news and the economic reasons behind it would provide valuable benefits to the stock SA field.
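
The abstract above names a "mean Word2Vec encoder" without giving details. One common reading, assumed here, is that a document is represented by the average of its tokens' word vectors. The sketch below illustrates only that averaging step; the function name `mean_embedding` and the two-dimensional toy vectors are hypothetical and are not real Word2Vec output or the authors' code.

```python
def mean_embedding(tokens, vectors):
    """Average the vectors of the tokens that have an entry in `vectors`.

    Returns None when no token is known, so callers can skip such documents.
    """
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return None
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


# Hypothetical toy vectors standing in for trained Word2Vec embeddings.
toy_vectors = {
    "profit": [1.0, 0.0],
    "loss": [0.0, 1.0],
    "dividend": [1.0, 1.0],
}

# Out-of-vocabulary tokens ("unknown") are ignored in the average.
doc_vector = mean_embedding(["profit", "dividend", "unknown"], toy_vectors)
print(doc_vector)
```

Document vectors built this way could then be passed to any clustering method; the abstract does not say which one was used.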

A Study on Customer Review Rating Recommendation and Prediction through Online Promotional Activity Analysis - Focusing on "S" Company Wearable Products - (온라인 판매촉진활동 분석을 통한 고객 리뷰평점 추천 및 예측에 관한 연구 : S사 Wearable 상품중심으로)

  • Shin, Ho-cheol
    • The Journal of the Korea Contents Association
    • /
    • v.22 no.4
    • /
    • pp.118-129
    • /
    • 2022
  • The purpose of this study is to develop a strategic model of promotional activities through varied analysis and sales forecasting, using sales data collected for wearable products from domestic online companies. Several algorithms were applied to the data, and the best-performing one was selected as the final model. The gradient boosting model, which gave the best results, takes nine independent variables as input (promotion type, price, amount, gender, model, company, grade, sales date, and region) when predicting the dependent variable through supervised learning. The review ratings, set as the dependent variable for each type of sales promotion, were examined in more detail with ensemble analysis techniques; analyzing and predicting these ratings is the main purpose of the study. The evaluation yielded an AUC of 95% and an F1 score of about 93%. Finally, it was confirmed that, among the types of sales promotion activities, value-added benefits affected the number of reviews and the review ratings, and that the major variables affected the reviews and review ratings.

3D Quantitative Analysis of Cell Nuclei Based on Digital Image Cytometry (디지털 영상 세포 측정법에 기반한 세포핵의 3차원 정량적 분석)

  • Kim, Tae-Yun;Choi, Hyun-Ju;Choi, Heung-Kook
    • Journal of Korea Multimedia Society
    • /
    • v.10 no.7
    • /
    • pp.846-855
    • /
    • 2007
  • Significant feature extraction in cancer cell image analysis is an important process for grading cell carcinoma. In this study, we propose a method for 3D quantitative analysis of cell nuclei based upon digital image cytometry. First, we acquired volumetric renal cell carcinoma data for each grade using confocal laser scanning microscopy and segmented cell nuclei employing color features based upon a supervised learning scheme. For 3D visualization, we used a contour-based method for surface rendering and a 3D texture mapping method for volume rendering. We then defined and extracted the 3D morphological features of cell nuclei. To evaluate which quantitative features of 3D analysis could contribute diagnostic information, we analyzed the statistical significance of the extracted 3D features in each grade using an analysis of variance (ANOVA). Finally, we compared the 2D and 3D features of cell nuclei and analyzed the correlations between them. We found statistically significant correlations between nuclear grade and 3D morphological features. The proposed method has potential for use as fundamental research in developing a new nuclear grading system for accurate diagnosis and prediction of prognosis.

Predicting Program Code Changes Using a CNN Model (CNN 모델을 이용한 프로그램 코드 변경 예측)

  • Kim, Dong Kwan
    • Journal of the Korea Convergence Society
    • /
    • v.12 no.9
    • /
    • pp.11-19
    • /
    • 2021
  • A software system is required to change during its life cycle due to various requirements such as adding functionalities, fixing bugs, and adjusting to new computing environments. Such program code modification should be considered as carefully as new system development because unexpected software errors could be introduced. In addition, when reusing open source programs, we can expect higher quality software if code changes of the open source program are predicted in advance. This paper proposes a Convolutional Neural Network (CNN)-based deep learning model to predict source code changes. The prediction of code changes is treated as a binary classification problem in deep learning, and labeled datasets are used for supervised learning. Java projects and code change logs are collected from GitHub for the training and testing datasets. Software metrics are computed from the collected Java source code and used as input data for the proposed model to detect code changes. The performance of the proposed model has been measured using evaluation metrics such as precision, recall, F1-score, and accuracy. The experimental results show that the proposed CNN model achieved an F1-score of 95% and outperformed the multilayer perceptron-based DNN model, whose F1-score is 92%.
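
The evaluation metrics named in the abstract above (precision, recall, F1-score, and accuracy) can all be derived from the four confusion-matrix counts of a binary classifier. The following is a minimal sketch of those standard definitions, not the paper's code; the function name `binary_metrics` and the example counts are hypothetical and chosen only for illustration.

```python
def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute precision, recall, F1, and accuracy from confusion-matrix counts.

    Each ratio falls back to 0.0 when its denominator is zero.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "accuracy": accuracy,
    }


# Hypothetical counts for a "changed" vs. "unchanged" code classifier.
metrics = binary_metrics(tp=90, fp=10, fn=10, tn=90)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because F1 ignores true negatives, it can differ noticeably from accuracy on imbalanced data, which is one reason to report both.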

Review of Land Cover Classification Potential in River Spaces Using Satellite Imagery and Deep Learning-Based Image Training Method (딥 러닝 기반 이미지 트레이닝을 활용한 하천 공간 내 피복 분류 가능성 검토)

  • Woochul, Kang;Eun-kyung, Jang
    • Ecology and Resilient Infrastructure
    • /
    • v.9 no.4
    • /
    • pp.218-227
    • /
    • 2022
  • This study applied deep learning-based image training to land cover classification in river spaces, which provides important data for efficient river management. To this end, land cover in RGB images of the target section was classified with a model trained on labeled data, using the category classification index of the major land cover map. In addition, land cover in the river spaces was classified by unsupervised and supervised classification of openly available Sentinel-2 satellite images, and the results were compared with those of the deep learning-based image classification. The deep learning-based approach gave more accurate predictions than unsupervised classification, and its results improved significantly for high-resolution images. The study showed that water areas and wetlands in river spaces can be classified and that, with additional research, the deep learning-based image training method for land cover classification could be used for river management.

Performance of Investment Strategy using Investor-specific Transaction Information and Machine Learning (투자자별 거래정보와 머신러닝을 활용한 투자전략의 성과)

  • Kim, Kyung Mock;Kim, Sun Woong;Choi, Heung Sik
    • Journal of Intelligence and Information Systems
    • /
    • v.27 no.1
    • /
    • pp.65-82
    • /
    • 2021
  • Stock market investors are generally divided into foreign investors, institutional investors, and individual investors. Compared with individual investors, professional investor groups such as foreign investors have an advantage in information and financial power, and foreign investors are therefore known to show good investment performance among market participants. The purpose of this study is to propose an investment strategy that combines investor-specific transaction information with machine learning, and to analyze the portfolio performance of the proposed model using actual stock price and investor-specific transaction data. The Korea Exchange provides securities firms with daily information on each investor group's purchase and sale volumes. We developed a data collection program in the C# programming language using an API provided by Daishin Securities Cybosplus, and collected daily opening prices, closing prices, and investor-specific net purchase data for 151 of the 200 KOSPI stocks from January 2, 2007 to July 31, 2017. The self-organizing map model is an artificial neural network that performs clustering by unsupervised learning; it was introduced by Teuvo Kohonen in 1984. Artificial neurons within a layer compete with one another, and all connections are non-recurrent, running from bottom to top. The network can be extended to multiple layers, although a single layer is commonly used. Linear functions serve as the activation functions of the artificial neurons, and the learning rules include the Instar rule as well as general competitive learning. The backpropagation model, in contrast, is an artificial neural network that performs classification by supervised learning. We grouped and transformed the investor-specific transaction volume data with the self-organizing map model and used the result to train the backpropagation model.
Based on the model's estimates for the verification data, the portfolios were rebalanced monthly. For performance analysis, a passive portfolio was designated, and the KOSPI 200 and KOSPI index returns were obtained as proxies for market returns. Performance was analyzed using the equally weighted portfolio return, compound interest rate, annual return, maximum drawdown (MDD), standard deviation, and Sharpe ratio. The buy-and-hold return of the top 10 stocks by market capitalization was designated as the benchmark, since buy and hold is the best strategy under the efficient market hypothesis. The prediction rate of the backpropagation model was very high on the training data, at 96.61%, and was also relatively high on the verification data, at 57.1%. The performance of the self-organizing map grouping can be judged from the results of the backpropagation model, because poor grouping by the self-organizing map would have led to poor learning by the backpropagation model. On this basis, the machine learning performance is judged to be better than in previous studies. Our portfolio doubled the return of the benchmark and outperformed the market returns of the KOSPI and KOSPI 200 indexes. The portfolio risk indicators, MDD and standard deviation, were also better than the benchmark's. The Sharpe ratio was higher than those of the benchmark and the stock market indexes. Through this, we presented a direction for portfolio composition programs that use machine learning and investor-specific transaction information, and showed that the approach can be used to develop programs for real stock investment. The reported return results from monthly portfolio composition and rebalancing of assets to equal proportions.
Better results are expected if, when forming the monthly portfolio, the system rebalances the suggested stocks continuously rather than selling and re-buying them. The strategy therefore appears applicable to real transactions.
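
Two of the performance measures named in this abstract, maximum drawdown and the Sharpe ratio, can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names, the zero risk-free rate, the square-root-of-12 annualization of monthly returns, and the portfolio values are all hypothetical.

```python
import math


def max_drawdown(values):
    """Largest peak-to-trough decline, expressed as a fraction of the peak."""
    peak = values[0]
    worst = 0.0
    for v in values:
        peak = max(peak, v)
        worst = max(worst, (peak - v) / peak)
    return worst


def sharpe_ratio(returns, risk_free=0.0, periods_per_year=12):
    """Annualized Sharpe ratio from per-period returns.

    Assumes `risk_free` is a per-period rate and uses the sample
    standard deviation (n - 1 in the denominator).
    """
    excess = [r - risk_free for r in returns]
    mean = sum(excess) / len(excess)
    variance = sum((x - mean) ** 2 for x in excess) / (len(excess) - 1)
    return mean / math.sqrt(variance) * math.sqrt(periods_per_year)


# Hypothetical month-end portfolio values, for illustration only.
values = [100.0, 110.0, 99.0, 120.0, 90.0, 108.0]
monthly_returns = [values[i] / values[i - 1] - 1 for i in range(1, len(values))]

mdd = max_drawdown(values)
sharpe = sharpe_ratio(monthly_returns)
print(f"MDD: {mdd:.2%}, Sharpe ratio: {sharpe:.2f}")
```

A lower maximum drawdown and a higher Sharpe ratio than the benchmark correspond to the kind of comparison the abstract reports.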