• Title/Summary/Keyword: Recall and Precision

Sentiment Analysis of Product Reviews to Identify Deceptive Rating Information in Social Media: A SentiDeceptive Approach

  • Marwat, M. Irfan;Khan, Javed Ali;Alshehri, Dr. Mohammad Dahman;Ali, Muhammad Asghar;Hizbullah;Ali, Haider;Assam, Muhammad
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.16 no.3
    • /
    • pp.830-860
    • /
    • 2022
  • [Introduction] Many companies are moving their businesses online as customers increasingly prefer to shop online. [Problem] Users share a vast amount of information about products, which makes it difficult for end-users to reach purchasing decisions. [Motivation] A mechanism is therefore needed to automatically analyze the opinions, thoughts, and feelings that end-users express about products on social media platforms, so that customers can use them when deciding whether to buy specific products. [Proposed Solution] We propose an automated SentiDeceptive approach that classifies end-user reviews into negative, positive, and neutral sentiments and identifies deceptive crowd-user rating information on social media platforms to support user decision-making. [Methodology] We first collected 11,781 end-user comments from the Amazon store and the Flipkart web application covering distinct products, such as watches, mobile phones, shoes, clothes, and perfumes. Next, we developed a coding guideline to serve as the basis for the comment annotation process. We then applied content analysis and the existing VADER library to annotate the end-user comments with the identified codes, producing a labelled data set used as input to the machine learning classifiers. Finally, we applied sentiment analysis to identify end-user opinions and overcome deceptive rating information: we preprocessed the input data to remove irrelevant content (stop words, special characters, etc.), balanced the data set with two standard resampling approaches (oversampling and undersampling), extracted TF-IDF and bag-of-words (BOW) features from the text, and trained and tested the machine learning algorithms using standard cross-validation approaches (KFold and Shuffle Split). [Results/Outcomes] To support the study, we also developed an automated tool that analyzes each piece of customer feedback and displays the collective customer sentiment about a specific product as a graph, which helps customers make decisions. In a nutshell, the proposed approach produces good results when identifying customer sentiments from online user feedback: it obtained an average precision of 94.01%, recall of 93.69%, and F-measure of 93.81% for classifying positive sentiments.
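
The resampling step mentioned in the methodology can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' implementation; the `resample` helper and the toy reviews are hypothetical, and the paper may have used a dedicated library or a different resampling variant.

```python
import random
from collections import Counter, defaultdict


def resample(samples, labels, mode="over", seed=0):
    """Balance classes by random oversampling or random undersampling."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_label[label].append(sample)

    sizes = [len(group) for group in by_label.values()]
    # Oversampling grows every class to the largest class size;
    # undersampling shrinks every class to the smallest class size.
    target = max(sizes) if mode == "over" else min(sizes)

    out_samples, out_labels = [], []
    for label, group in by_label.items():
        if mode == "over":
            # Keep all originals, then draw the missing items with replacement.
            extra = [rng.choice(group) for _ in range(target - len(group))]
            chosen = group + extra
        else:
            # Draw a subset without replacement.
            chosen = rng.sample(group, target)
        out_samples.extend(chosen)
        out_labels.extend([label] * len(chosen))
    return out_samples, out_labels


# Hypothetical toy reviews, not taken from the paper's data set.
reviews = ["great watch", "love it", "works well", "nice shoes", "bad fit", "ok"]
labels = ["positive", "positive", "positive", "positive", "negative", "neutral"]

_, over_labels = resample(reviews, labels, mode="over")
_, under_labels = resample(reviews, labels, mode="under")
print(Counter(over_labels))   # every class has 4 items
print(Counter(under_labels))  # every class has 1 item
```

Oversampling keeps all original comments but repeats minority-class items, while undersampling discards majority-class items; which one works better depends on the data set size.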

Development and Validation of MRI-Based Radiomics Models for Diagnosing Juvenile Myoclonic Epilepsy

  • Kyung Min Kim;Heewon Hwang;Beomseok Sohn;Kisung Park;Kyunghwa Han;Sung Soo Ahn;Wonwoo Lee;Min Kyung Chu;Kyoung Heo;Seung-Koo Lee
    • Korean Journal of Radiology
    • /
    • v.23 no.12
    • /
    • pp.1281-1289
    • /
    • 2022
  • Objective: Radiomic modeling using multiple regions of interest in MRI of the brain to diagnose juvenile myoclonic epilepsy (JME) has not yet been investigated. This study aimed to develop and validate radiomics prediction models to distinguish patients with JME from healthy controls (HCs), and to evaluate the feasibility of a radiomics approach using MRI for diagnosing JME. Materials and Methods: A total of 97 JME patients (25.6 ± 8.5 years; female, 45.5%) and 32 HCs (28.9 ± 11.4 years; female, 50.0%) were randomly split (7:3 ratio) into a training set (n = 90) and a test set (n = 39). Radiomic features were extracted from 22 regions of interest in the brain using T1-weighted MRI based on clinical evidence. Predictive models were trained on the radiomic features of the training set using seven modeling methods: a light gradient boosting machine, support vector classifier, random forest, logistic regression, extreme gradient boosting, gradient boosting machine, and decision tree. The performance of the models was validated and compared using the test set. The model with the highest area under the receiver operating characteristic curve (AUROC) was chosen, and important features in the model were identified. Results: The seven tested radiomics models (light gradient boosting machine, support vector classifier, random forest, logistic regression, extreme gradient boosting, gradient boosting machine, and decision tree) showed AUROC values of 0.817, 0.807, 0.783, 0.779, 0.767, 0.762, and 0.672, respectively. The light gradient boosting machine, which had the highest AUROC, albeit without statistically significant differences from the other models in pairwise comparisons, had accuracy, precision, recall, and F1 scores of 0.795, 0.818, 0.931, and 0.871, respectively. Radiomic features from regions including the putamen and ventral diencephalon were ranked as the most important for suggesting JME. Conclusion: Radiomic models using MRI were able to differentiate JME from HCs.
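
The four reported test-set scores can be related to one another through a confusion matrix. The sketch below assumes the 39 test cases comprise 29 JME patients and 10 controls and uses counts (TP = 27, FP = 6, FN = 2, TN = 4) inferred from the rounded metrics; these counts are not stated in the abstract and are only one set of values consistent with it.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)   # share of predicted positives that are correct
    recall = tp / (tp + fn)      # share of actual positives that are found
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return accuracy, precision, recall, f1


# Inferred counts (assumption), with JME as the positive class.
acc, prec, rec, f1 = classification_metrics(tp=27, fp=6, fn=2, tn=4)
print(f"accuracy={acc:.3f} precision={prec:.3f} recall={rec:.3f} f1={f1:.3f}")
```

Under these assumed counts, the high recall and lower accuracy reflect that most errors are controls predicted as JME, which is plausible when controls are the minority class.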

Comparison of Effective Soil Depth Classification Methods Using Topographic Information (지형정보를 이용한 유효토심 분류방법비교)

  • Byung-Soo Kim;Ju-Sung Choi;Ja-Kyung Lee;Na-Young Jung;Tae-Hyung Kim
    • Journal of the Korean Geosynthetics Society
    • /
    • v.22 no.2
    • /
    • pp.1-12
    • /
    • 2023
  • Research on the causes of landslides and the prediction of vulnerable areas is being conducted globally. This study aims to predict the effective soil depth, a critical element in analyzing and forecasting landslide disasters, using topographic information. Topographic data from various institutions were collected and assigned as attribute information to a 100 m × 100 m grid, which was then reduced through data grading. The study predicted effective soil depth for two cases: three depths (shallow, normal, deep) and five depths (very shallow, shallow, normal, deep, very deep). Three classification models, K-Nearest Neighbor, Random Forest, and Deep Artificial Neural Network, were used, and their performance was evaluated by calculating accuracy, precision, recall, and F1-score. Results showed that the performance was in the high-50% to low-70% range, with the accuracy of the three-class scheme about 5% higher than that of the five-class scheme. Although the grading criteria and the performance of the classification models presented in this study are still insufficient, classification models can be applied to predict the effective soil depth. This study suggests the possibility of predicting more reliable values than the current effective soil depth, which is assumed to be uniform over a large area.
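
For a multi-class problem such as the three-depth case, precision and recall are computed per class and then averaged. The sketch below uses macro averaging, which is an assumption since the abstract does not say how the scores were aggregated; the labels and predictions are hypothetical.

```python
def per_class_scores(y_true, y_pred, classes):
    """Compute (precision, recall, F1) for each class in a one-vs-rest manner."""
    scores = {}
    pairs = list(zip(y_true, y_pred))
    for c in classes:
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[c] = (precision, recall, f1)
    return scores


def macro_average(scores):
    """Unweighted mean of each metric over all classes."""
    n = len(scores)
    return tuple(sum(s[i] for s in scores.values()) / n for i in range(3))


# Hypothetical grid-cell labels, not data from the study.
classes = ["shallow", "normal", "deep"]
y_true = ["shallow", "shallow", "normal", "normal", "normal", "deep", "deep", "deep"]
y_pred = ["shallow", "normal", "normal", "normal", "deep", "deep", "deep", "shallow"]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
macro_p, macro_r, macro_f1 = macro_average(per_class_scores(y_true, y_pred, classes))
print(f"accuracy={accuracy:.3f} macro precision={macro_p:.3f} "
      f"macro recall={macro_r:.3f} macro F1={macro_f1:.3f}")
```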

Hate Speech Detection Using Modified Principal Component Analysis and Enhanced Convolution Neural Network on Twitter Dataset

  • Majed, Alowaidi
    • International Journal of Computer Science & Network Security
    • /
    • v.23 no.1
    • /
    • pp.112-119
    • /
    • 2023
  • Originally used for networking computers and communications, the Internet has evolved continuously since its inception. It is the backbone of much of the web, including social media. The concept of social networking, which started in the early 1990s, has grown along with the Internet. Social networking sites (SNSs) sprang up and have remained an important element of Internet usage, mainly because of the services they offer on the web. Twitter and Facebook have become the primary means by which most individuals keep in touch with others and carry on substantive conversations. These sites allow users to post photos and videos and support audio and video storage, and the content can be shared among users. Although attractive, these provisions have also led to problems for the sites, such as the posting of offensive material. Some SNS users, though not all, contribute to promoting hate through their words or speech, which is difficult to curtail once it has been uploaded. Hence, this article outlines a process for extracting user reviews from a Twitter corpus in order to identify instances of hate speech. Hate speech in the text is identified using MPCA (Modified Principal Component Analysis) and ECNN (Enhanced Convolutional Neural Network). With the use of natural language processing (NLP), a fully autonomous system for assessing syntax and meaning can be established. There is a strong emphasis on pre-processing, feature extraction, and classification. Normalization cleanses the text by removing extra spaces, punctuation, and stop words. The pre-processed text is then passed to feature extraction, where the MPCA algorithm is used: it takes a set of related features and pulls out the ones that tell us the most about the given dataset. The proposed classification method is then put forth as a means of detecting instances of hate speech or abusive language. It is argued that ECNN is superior to other methods for identifying hateful content online. It can take in massive amounts of data and quickly return accurate results, especially for larger datasets. As a result, the proposed MPCA+ECNN algorithm improves not only the F-measure values, but also the accuracy, precision, and recall.

Application of Deep Learning Method for Real-Time Traffic Analysis using UAV (UAV를 활용한 실시간 교통량 분석을 위한 딥러닝 기법의 적용)

  • Park, Honglyun;Byun, Sunghoon;Lee, Hansung
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography
    • /
    • v.38 no.4
    • /
    • pp.353-361
    • /
    • 2020
  • Rapid urbanization is causing various traffic problems, such as commuting-hour congestion and chronic traffic jams. Solving these problems requires quick and accurate estimation and analysis of traffic volume. ITS (Intelligent Transportation System) is a system that performs optimal traffic management by utilizing the latest ICT (Information and Communications Technology), and research has been conducted on analyzing traffic volume quickly and accurately through various techniques. In this study, we propose a deep learning-based vehicle detection method using UAV (Unmanned Aerial Vehicle) video for real-time traffic analysis with high accuracy. The UAV was used to capture the orthogonal videos needed for training and verification at intersections where various vehicles pass, and the model was trained to classify vehicles as sedan, truck, or bus. The experiment on the UAV dataset was carried out using YOLOv3 (You Only Look Once V3), a deep learning-based object detection technique, and achieved an overall object detection rate of 90.21%, a precision of 95.10%, and a recall of 85.79%.
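
Precision and recall are often combined into a single F1 score, their harmonic mean. The short sketch below computes it from the two reported values. The result comes to roughly 90.2%, close to the reported 90.21% detection rate, but the abstract does not define that rate, so treating it as an F1 score is an assumption.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract.
f1 = f1_score(0.9510, 0.8579)
print(f"F1 = {f1:.2%}")
```

The harmonic mean is pulled toward the lower of the two inputs, so the lower recall weighs on the combined score more than the high precision lifts it.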

Vision-based Low-cost Walking Spatial Recognition Algorithm for the Safety of Blind People (시각장애인 안전을 위한 영상 기반 저비용 보행 공간 인지 알고리즘)

  • Sunghyun Kang;Sehun Lee;Junho Ahn
    • Journal of Internet Computing and Services
    • /
    • v.24 no.6
    • /
    • pp.81-89
    • /
    • 2023
  • In modern society, blind people face difficulties in navigating common environments such as sidewalks, elevators, and crosswalks. Research has been conducted to alleviate these inconveniences for the visually impaired through the use of visual and audio aids. However, such research often encounters limitations when it comes to practical implementation due to the high cost of wearable devices, high-performance CCTV systems, and voice sensors. In this paper, we propose an artificial intelligence fusion algorithm that utilizes low-cost video sensors integrated into smartphones to help blind people safely navigate their surroundings during walking. The proposed algorithm combines motion capture and object detection algorithms to detect moving people and various obstacles encountered during walking. We employed the MediaPipe library for motion capture to model and detect surrounding pedestrians during motion. Additionally, we used object detection algorithms to model and detect various obstacles that can occur during walking on sidewalks. Through experimentation, we validated the performance of the artificial intelligence fusion algorithm, achieving accuracy of 0.92, precision of 0.91, recall of 0.99, and an F1 score of 0.95. This research can assist blind people in navigating through obstacles such as bollards, shared scooters, and vehicles encountered during walking, thereby enhancing their mobility and safety.

Performance Analysis of Trading Strategy using Gradient Boosting Machine Learning and Genetic Algorithm

  • Jang, Phil-Sik
    • Journal of the Korea Society of Computer and Information
    • /
    • v.27 no.11
    • /
    • pp.147-155
    • /
    • 2022
  • In this study, we developed a system to dynamically balance a daily stock portfolio and performed trading simulations using gradient boosting and genetic algorithms. We collected various stock market data from stocks listed on the KOSPI and KOSDAQ markets, including investor-specific transaction data. Subsequently, we indexed the data as a preprocessing step, and used feature engineering to modify and generate variables for training. First, we experimentally compared the performance of three popular gradient boosting algorithms in terms of accuracy, precision, recall, and F1-score, including XGBoost, LightGBM, and CatBoost. Based on the results, in a second experiment, we used a LightGBM model trained on the collected data along with genetic algorithms to predict and select stocks with a high daily probability of profit. We also conducted simulations of trading during the period of the testing data to analyze the performance of the proposed approach compared with the KOSPI and KOSDAQ indices in terms of the CAGR (Compound Annual Growth Rate), MDD (Maximum Draw Down), Sharpe ratio, and volatility. The results showed that the proposed strategies outperformed those employed by the Korean stock market in terms of all performance metrics. Moreover, our proposed LightGBM model with a genetic algorithm exhibited competitive performance in predicting stock price movements.
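
The portfolio metrics named above have standard textbook definitions, sketched below. The equity curve is hypothetical, and the zero risk-free rate and 252 trading periods per year are assumptions; the abstract does not state the conventions the study used.

```python
import math
import statistics


def cagr(start_value, end_value, years):
    """Compound annual growth rate over the given number of years."""
    return (end_value / start_value) ** (1 / years) - 1


def max_drawdown(values):
    """Largest peak-to-trough decline, as a fraction of the running peak."""
    peak = values[0]
    worst = 0.0
    for value in values:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst


def volatility(returns, periods_per_year=252):
    """Annualized standard deviation of per-period returns."""
    return statistics.stdev(returns) * math.sqrt(periods_per_year)


def sharpe_ratio(returns, risk_free=0.0, periods_per_year=252):
    """Annualized mean excess return divided by its standard deviation."""
    excess = [r - risk_free for r in returns]
    return statistics.mean(excess) / statistics.stdev(excess) * math.sqrt(periods_per_year)


# Hypothetical portfolio values, not results from the study.
equity = [100.0, 110.0, 99.0, 120.0, 108.0, 121.0]
returns = [later / earlier - 1 for earlier, later in zip(equity, equity[1:])]

print(f"CAGR over 2 years: {cagr(equity[0], equity[-1], 2):.2%}")
print(f"MDD: {max_drawdown(equity):.2%}")
print(f"Volatility: {volatility(returns):.2f}")
print(f"Sharpe ratio: {sharpe_ratio(returns):.2f}")
```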

Development of Dolphin Click Signal Classification Algorithm Based on Recurrent Neural Network for Marine Environment Monitoring (해양환경 모니터링을 위한 순환 신경망 기반의 돌고래 클릭 신호 분류 알고리즘 개발)

  • Seoje Jeong;Wookeen Chung;Sungryul Shin;Donghyeon Kim;Jeasoo Kim;Gihoon Byun;Dawoon Lee
    • Geophysics and Geophysical Exploration
    • /
    • v.26 no.3
    • /
    • pp.126-137
    • /
    • 2023
  • In this study, a recurrent neural network (RNN) was employed to classify dolphin click signals derived from ocean monitoring data. To improve the accuracy of click signal classification, the single time series data were transformed into fractional domains using the fractional Fourier transform to expand their features. The transformed data were used as input for three RNN models: long short-term memory (LSTM), gated recurrent unit (GRU), and bidirectional LSTM (BiLSTM), which were compared to determine the optimal network for signal classification. Because the fractional Fourier transform displays different characteristics depending on the chosen angle parameter, the optimal angle range for each RNN was determined first. Network performance was evaluated using accuracy, precision, recall, and F1-score. Numerical experiments demonstrated that all three networks performed well; however, the BiLSTM network outperformed LSTM and GRU in terms of learning results. Furthermore, the BiLSTM network produced fewer misclassifications than the other networks and was deemed the most practically applicable to field data.

Image Retrieval Using Multiresolution Color and Texture Features in Wavelet Transform Domain (웨이브릿 변환 영역의 칼라 및 질감 특징을 이용한 영상검색)

  • Chun Young-Deok;Sung Joong-Ki;Kim Nam-Chul
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.43 no.1 s.307
    • /
    • pp.55-66
    • /
    • 2006
  • We propose a progressive image retrieval method based on an efficient combination of multiresolution color and texture features in the wavelet transform domain. As a color feature, the color autocorrelogram of the hue and saturation components is chosen. As texture features, BDIP and BVLC moments of the value component are chosen. For the selected features, we obtain multiresolution feature vectors extracted from all decomposition levels in the wavelet domain. The multiresolution feature vectors of the color and texture features are efficiently combined by normalization depending on their dimensions and standard deviation vectors, respectively; the vector components of the features are efficiently quantized in consideration of their storage space; and the computational complexity of similarity computation is reduced by using a progressive retrieval strategy. Experimental results show that the proposed method yields on average 15% better performance in precision vs. recall and an average improvement of 0.2 in ANMRR over the methods using the color histogram, color autocorrelogram, SCD, CSD, wavelet moments, EHD, BDIP and BVLC moments, and the combination of color histogram and wavelet moments, respectively. In particular, the proposed method performs markedly better than the other methods on image databases containing images of various resolutions.
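
In retrieval evaluation, precision vs. recall is typically traced by cutting the ranked result list at increasing depths. The sketch below shows that computation for a single query; the image identifiers and relevance set are hypothetical, and the paper's exact evaluation protocol (including ANMRR) is not reproduced here.

```python
def precision_recall_at_k(ranked_ids, relevant_ids, k):
    """Precision and recall when only the top-k ranked results are returned."""
    top_k = ranked_ids[:k]
    hits = sum(1 for item in top_k if item in relevant_ids)
    return hits / k, hits / len(relevant_ids)


# Hypothetical ranked results for one query image and its ground-truth matches.
ranked = ["img7", "img2", "img9", "img4", "img1", "img5"]
relevant = {"img2", "img4", "img5", "img8"}

curve = [precision_recall_at_k(ranked, relevant, k) for k in (2, 4, 6)]
for k, (p, r) in zip((2, 4, 6), curve):
    print(f"k={k}: precision={p:.2f} recall={r:.2f}")
```

Recall can only grow as more results are returned, whereas precision depends on how many of the added results are relevant; a better ranking keeps precision high at each recall level.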

A Comparative Study of Reservoir Surface Area Detection Algorithm Using SAR Image (SAR 영상을 활용한 저수지 수표면적 탐지 알고리즘 비교 연구)

  • Jeong, Hagyu;Park, Jongsoo;Lee, Dalgeun;Lee, Junwoo
    • Korean Journal of Remote Sensing
    • /
    • v.38 no.6_3
    • /
    • pp.1777-1788
    • /
    • 2022
  • Reservoirs are a major source of water supply for domestic agriculture, and monitoring their water storage is important for the utilization and management of agricultural water resources. Remote sensing via satellite imagery can be an effective method for regular monitoring of widely distributed objects such as reservoirs. In this study, image classification and image segmentation algorithms are applied to Sentinel-1 Synthetic Aperture Radar (SAR) imagery for water body detection in 53 reservoirs in South Korea. Six algorithms are used: Neural Network (NN), Support Vector Machine (SVM), Random Forest (RF), Otsu, Watershed (WS), and Chan-Vese (CV), and the water body detection results are evaluated against in-situ images taken by drones. The correlations between the in-situ water surface area and the water surface area detected by each algorithm are NN 0.9941, SVM 0.9942, RF 0.9940, Otsu 0.9922, WS 0.9709, and CV 0.9736, and the larger the reservoir, the higher the linear correlation. WS showed low recall due to undetected water bodies, and NN, SVM, and RF showed low precision due to over-detection. For water body detection through SAR imagery, we found that aquatic plants and artificial structures can be sources of error that cause water bodies to go undetected.
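
The link between over-detection and low precision, and between missed water and low recall, can be seen on a pixel-wise comparison of binary masks. The sketch below uses tiny hypothetical masks (1 = water) and is not the study's evaluation code.

```python
def mask_precision_recall(truth, detected):
    """Pixel-wise precision and recall of a detected mask against a reference mask."""
    pairs = [(t, d) for t_row, d_row in zip(truth, detected)
             for t, d in zip(t_row, d_row)]
    tp = sum(1 for t, d in pairs if t and d)       # water correctly detected
    fp = sum(1 for t, d in pairs if not t and d)   # land labelled as water
    fn = sum(1 for t, d in pairs if t and not d)   # water that was missed
    return tp / (tp + fp), tp / (tp + fn)


# Hypothetical 3 x 4 masks: reference, an over-detecting result, an under-detecting result.
truth = [[0, 1, 1, 0],
         [0, 1, 1, 0],
         [0, 0, 0, 0]]
over_detected = [[1, 1, 1, 1],
                 [0, 1, 1, 0],
                 [0, 0, 0, 0]]
under_detected = [[0, 1, 0, 0],
                  [0, 1, 0, 0],
                  [0, 0, 0, 0]]

over_p, over_r = mask_precision_recall(truth, over_detected)
under_p, under_r = mask_precision_recall(truth, under_detected)
print(f"over-detection:  precision={over_p:.2f} recall={over_r:.2f}")
print(f"under-detection: precision={under_p:.2f} recall={under_r:.2f}")
```

In this toy case the over-detecting mask finds all water but lowers precision, while the under-detecting mask is fully precise but halves recall, mirroring the pattern the abstract reports for NN, SVM, and RF versus WS.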