• Title/Summary/Keyword: Automated Data Analysis

Sentiment Analysis of Product Reviews to Identify Deceptive Rating Information in Social Media: A SentiDeceptive Approach

  • Marwat, M. Irfan;Khan, Javed Ali;Alshehri, Dr. Mohammad Dahman;Ali, Muhammad Asghar;Hizbullah;Ali, Haider;Assam, Muhammad
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.3 / pp.830-860 / 2022
  • [Introduction] Nowadays, many companies are moving their businesses online because customers increasingly prefer to shop and purchase products online. [Problem] Users share a vast amount of information about products, which makes it difficult for end-users to reach purchase decisions. [Motivation] Therefore, a mechanism is needed to automatically analyze end-user opinions, thoughts, and feelings about products expressed on social media platforms, which can help customers make or change their purchase decisions. [Proposed Solution] For this purpose, we propose an automated SentiDeceptive approach that classifies end-user reviews into negative, positive, and neutral sentiments and identifies deceptive crowd-user rating information on social media platforms to support users' decision-making. [Methodology] We first collected 11,781 end-user comments from the Amazon store and the Flipkart web application covering different products, such as watches, mobile phones, shoes, clothes, and perfumes. Next, we developed a coding guideline used as the basis for the comment annotation process. We then applied a content analysis approach and the existing VADER library to annotate the end-user comments in the data set with the identified codes, producing a labelled data set used as input to the machine learning classifiers. Finally, we applied the sentiment analysis approach to identify end-user opinions and counter deceptive rating information on social media platforms: we preprocess the input data to remove irrelevant content (stop words, special characters, etc.), apply two standard resampling approaches (oversampling and under-sampling) to balance the data set, extract different features (TF-IDF and BOW) from the textual data, and then train and test the machine learning algorithms using standard cross-validation approaches (KFold and ShuffleSplit). [Results/Outcomes] Furthermore, to support our research study, we developed an automated tool that analyzes each customer's feedback and displays the collective sentiment of customers about a specific product in a graph, which helps customers make decisions. In a nutshell, the proposed sentiment approach produces good results when identifying customer sentiments from online user feedback, obtaining an average 94.01% precision, 93.69% recall, and 93.81% F-measure for classifying positive sentiments.
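
  • The pipeline described above (TF-IDF/BOW features, resampling, and cross-validated classifiers) can be approximated with standard libraries. The sketch below is a minimal illustration, not the authors' implementation; the toy reviews, the LogisticRegression baseline, and the use of the separate imbalanced-learn package for oversampling are assumptions.

    # Minimal sketch of a review-sentiment pipeline in the spirit of the one above:
    # TF-IDF features, random oversampling to balance classes, K-fold cross-validation.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import StratifiedKFold, cross_validate
    from imblearn.over_sampling import RandomOverSampler   # imbalanced-learn package

    reviews = [
        "great watch, totally worth it", "love these shoes, very comfortable",
        "excellent perfume, smells amazing", "fantastic phone, fast delivery",
        "poor quality shoes, fell apart", "terrible watch, stopped working",
        "awful clothes, bad stitching",
        "the phone is okay for the price", "average perfume, nothing special",
    ]
    labels = ["positive"] * 4 + ["negative"] * 3 + ["neutral"] * 2   # from annotation

    # TF-IDF features; stop words are removed during vectorization.
    X = TfidfVectorizer(stop_words="english").fit_transform(reviews)

    # Balance the class distribution by oversampling the minority classes.
    X_bal, y_bal = RandomOverSampler(random_state=0).fit_resample(X, labels)

    # Cross-validated precision / recall / F-measure for a baseline classifier.
    scores = cross_validate(
        LogisticRegression(max_iter=1000), X_bal, y_bal,
        cv=StratifiedKFold(n_splits=3, shuffle=True, random_state=0),
        scoring=["precision_macro", "recall_macro", "f1_macro"],
    )
    print({k: v.mean() for k, v in scores.items() if k.startswith("test_")})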

Transformer-based Language Recognition Technique for Big Data (빅데이터를 위한 트랜스포머 기반의 언어 인식 기법)

  • Hwang, Chi-Gon;Yoon, Chang-Pyo;Lee, Soo-Wook
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2022.10a / pp.267-268 / 2022
  • Recently, with the development of machine learning, various techniques have become available for big data analysis. Big data collected in practice lacks an automated refinement technique for identical or similar terms based on semantic analysis of the relationships between words. Big data is usually in the form of sentences, and morphological analysis or understanding of those sentences is required. Accordingly, NLP, a set of techniques for analyzing natural language, can capture the relationships between words and sentences. In this paper, we study the advantages and disadvantages of the Transformer and the Reformer, techniques that complement the shortcomings of the RNN, a time-series approach to big data.
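
  • As context for the Transformer/Reformer comparison, the sketch below implements plain scaled dot-product self-attention in NumPy; the full n-by-n attention matrix it materializes is the memory bottleneck that the Reformer's LSH attention is designed to avoid. This is an illustrative sketch, not code from the paper.

    # Scaled dot-product self-attention (the core Transformer operation).
    # Materializing the n x n score matrix is what costs O(n^2) memory;
    # the Reformer replaces it with locality-sensitive-hashing buckets.
    import numpy as np

    def self_attention(X, Wq, Wk, Wv):
        Q, K, V = X @ Wq, X @ Wk, X @ Wv
        scores = Q @ K.T / np.sqrt(K.shape[-1])          # (n, n) attention scores
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
        return weights @ V                               # context-mixed representations

    rng = np.random.default_rng(0)
    n, d = 8, 16                                         # sequence length, model width
    X = rng.normal(size=(n, d))
    Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
    print(self_attention(X, Wq, Wk, Wv).shape)           # -> (8, 16)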

Real time analysis of multichannel EEG signal (다중채널 EEG 신호의 실시간 해석에 관한 연구)

  • 조재희;장태규;양원영
    • Institute of Control, Robotics and Systems (ICROS): Conference Proceedings / 1992.10a / pp.829-833 / 1992
  • This paper presents the design of an automated EEG analyzing system. The design considerations, including processing speed, A/D conversion, filtering, and waveform detection, are reviewed together with a description of the associated EEG characteristics. The architecture of the currently implemented system consists of a microcontroller-based front-end signal processing unit and a host computer system. The data acquisition procedures are described along with a couple of illustrations of the acquired EEG/EOG signals.
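
  • As a rough illustration of the kind of front-end filtering such a system performs, the sketch below band-pass filters a synthetic multichannel EEG record with SciPy; the 0.5-40 Hz band and the 256 Hz sampling rate are assumptions, not values from the paper.

    # Band-pass filtering of a multichannel EEG buffer (illustrative only).
    import numpy as np
    from scipy.signal import butter, filtfilt

    fs = 256.0                                  # assumed sampling rate in Hz
    t = np.arange(0, 4, 1 / fs)                 # 4 seconds of data
    # Synthetic 8-channel record: 10 Hz alpha-like rhythm plus noise and slow drift.
    eeg = (np.sin(2 * np.pi * 10 * t) + 0.5 * np.random.randn(8, t.size)
           + 0.3 * np.sin(2 * np.pi * 0.1 * t))

    # 4th-order Butterworth band-pass, 0.5-40 Hz, applied channel by channel.
    b, a = butter(4, [0.5, 40.0], btype="bandpass", fs=fs)
    filtered = filtfilt(b, a, eeg, axis=-1)
    print(filtered.shape)                       # -> (8, 1024)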

Automated Creation of Road Network from Road Edges

  • Wang, P.T.;Doihara, T.
    • Proceedings of the KSRS Conference / 2003.11a / pp.1249-1251 / 2003
  • In this paper, a framework for creating a road network from road edges is proposed. The framework consists of two main modules: a road modeler and a network generator. The road modeler creates road polygons from the original road edges, and the network generator converts the road polygons into a road network with good connectivity at all intersections. A prototype system has been built, and experimental results are presented to demonstrate the effectiveness of the proposed framework.
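
  • A minimal sketch of the network-generation idea, assuming road centerlines are already available as line segments: Shapely's unary_union splits the segments at their intersections, and the resulting pieces are loaded into a NetworkX graph so that intersections become shared nodes. This illustrates the general technique, not the authors' road modeler.

    # Build a small road network graph with shared nodes at intersections.
    import networkx as nx
    from shapely.geometry import LineString
    from shapely.ops import unary_union

    centerlines = [
        LineString([(0, 0), (10, 0)]),          # east-west road
        LineString([(5, -5), (5, 5)]),          # north-south road crossing it
        LineString([(10, 0), (10, 8)]),         # road branching at an endpoint
    ]

    # unary_union nodes the lines: each piece ends at an intersection or endpoint.
    merged = unary_union(centerlines)
    segments = list(getattr(merged, "geoms", [merged]))

    G = nx.Graph()
    for seg in segments:
        start, end = seg.coords[0], seg.coords[-1]
        G.add_edge(start, end, length=seg.length)

    print(G.number_of_nodes(), G.number_of_edges())   # intersections are shared nodes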

The Quality Assurance and Accuracy Improvement of National Basemap Digital Mapping Database (국가기본도 수치지도제작 데이터베이스의 품질유지 및 정확도 향상)

  • Lee, Hyun-Jik
    • Journal of Korean Society for Geospatial Information Science / v.6 no.2 s.12 / pp.91-103 / 1998
  • The digital map data of the national basemap, which will serve as basic data for the information age, have been produced and are ready for distribution to the public. Generally, the quality of a digital map is affected by its data history, positional accuracy, attribute accuracy, logical consistency, and completeness, and this quality must be assured before the map can be used as basic data in various application fields. The purpose of this study is to develop an automated quality control program for digital maps through an analysis of the types of errors in digital mapping, and to determine a method for high-quality digital mapping. As a result of this study, an automated quality control program was developed to assure the logical consistency and completeness of digital maps, which serve as principal data in the GIS field, and a process for high-quality digital mapping was determined. The results of this study also contribute to quality assurance and accuracy improvement of the national basemap digital mapping database.
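
  • As a loose illustration of the kind of automated completeness and logical-consistency checks such a program performs, the sketch below validates a few map features represented as plain Python dictionaries; the attribute names and rules are hypothetical, not taken from the study.

    # Hypothetical completeness / logical-consistency checks for digital map features.
    REQUIRED_ATTRS = {"feature_id", "layer", "geometry"}

    def check_feature(feature):
        errors = []
        # Completeness: every feature must carry the required attributes.
        missing = REQUIRED_ATTRS - feature.keys()
        if missing:
            errors.append(f"missing attributes: {sorted(missing)}")
        # Logical consistency: polygon boundaries must close on themselves.
        geom = feature.get("geometry", [])
        if feature.get("layer") == "building" and geom and geom[0] != geom[-1]:
            errors.append("building polygon is not closed")
        return errors

    features = [
        {"feature_id": 1, "layer": "building", "geometry": [(0, 0), (1, 0), (1, 1), (0, 0)]},
        {"feature_id": 2, "layer": "building", "geometry": [(0, 0), (2, 0), (2, 2)]},  # unclosed
        {"feature_id": 3, "layer": "road"},                                            # no geometry
    ]
    for f in features:
        print(f["feature_id"], check_feature(f))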

Identifying a Shape of Input Data Structure for Automated Program Testing (자동화된 프로그램 시험을 위한 입력 자료구조의 모양 식별)

  • Chung, Insang
    • Journal of KIISE: Software and Applications / v.31 no.10 / pp.1304-1319 / 2004
  • We can significantly reduce the cost of program testing by automating the process of test data generation. Test data generation usually concerns identifying input values on which a selected path is executed. Although a great deal of research has been done so far, many issues remain to be addressed. One of them is the shape problem: the problem of figuring out the shape of the input data structure required to cause the traversal of a given path. In this paper, we introduce a new method for the shape problem. The method converts the selected path into static single assignment (SSA) form without pointer dereferences. This allows us to consider each statement in the selected path as a constraint involving an equality or inequality. We solve the constraints to obtain a solution represented in terms of the points-to relations for each input variable. Simple but illustrative examples are given to explain the proposed method.
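
  • To make the shape problem concrete, the sketch below builds the minimal points-to shape implied by the pointer dereferences a path would have to perform (for example, a linked list whose traversal needs head.next.next). It is a hypothetical illustration of representing a solution as points-to relations, not the SSA-based constraint solving described in the paper.

    # Materialize the minimal input shape implied by the dereferences on a path.
    # Each chain like "head.next.next" forces the corresponding nodes to exist.
    def required_shape(deref_chains):
        points_to = {}                       # node name -> {field: target node name}
        for chain in deref_chains:
            parts = chain.split(".")
            current = parts[0]
            for i, field in enumerate(parts[1:], start=1):
                target = ".".join(parts[: i + 1])
                points_to.setdefault(current, {})[field] = target
                current = target
        return points_to

    # A path that traverses the list twice: the input must have at least 3 nodes.
    shape = required_shape(["head.next", "head.next.next"])
    for node, fields in shape.items():
        for field, target in fields.items():
            print(f"{node} --{field}--> {target}")
    # head --next--> head.next
    # head.next --next--> head.next.next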

Measured Data based Inspection for Unintended Deflections in Automotive Outer Panels (측정 데이터 이용한 자동차 외판 미세굴곡 추적 사례 연구)

  • Chung, Yun Chan;Lee, Sang Heon;Chang, Dae Soon;Park, Sang Chul
    • Korean Journal of Computational Design and Engineering / v.18 no.2 / pp.113-119 / 2013
  • This paper proposes an approach to detect unintended deflections in an automotive outer panel. Conventionally, the detection of unintended deflections has been performed by experienced workers and requires a great deal of time and effort. The motivation of this work is to reduce such effort by providing an automated detection methodology. For the detection of unintended deflections, we make use of measured data from an optical scanner, which can be treated as Z-map data. The proposed approach consists of four major steps: 1) measured data acquisition for an automotive outer panel, 2) identification of shape features, 3) removal of shape features, and 4) detection of unintended deflections via curvature analysis.
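
  • A minimal sketch of step 4, assuming the panel is given as a Z-map height grid: mean curvature is estimated from finite-difference derivatives with NumPy and thresholded to flag candidate deflection regions. The synthetic grid, spacing, and threshold are assumptions; this is not the authors' implementation.

    # Flag candidate deflection regions on a Z-map via mean-curvature analysis.
    import numpy as np

    def mean_curvature(z, spacing=1.0):
        zy, zx = np.gradient(z, spacing)             # first derivatives
        zxy, zxx = np.gradient(zx, spacing)
        zyy, _ = np.gradient(zy, spacing)
        num = (1 + zx**2) * zyy - 2 * zx * zy * zxy + (1 + zy**2) * zxx
        return num / (2 * (1 + zx**2 + zy**2) ** 1.5)

    # Synthetic Z-map: a gently curved panel with a small local dent.
    x, y = np.meshgrid(np.linspace(-1, 1, 200), np.linspace(-1, 1, 200))
    panel = 0.05 * x**2
    dent = -0.002 * np.exp(-((x - 0.3) ** 2 + (y + 0.2) ** 2) / 0.005)
    z = panel + dent

    H = mean_curvature(z, spacing=2.0 / 199)
    deflection_mask = np.abs(H - np.median(H)) > 5 * np.std(H)  # assumed threshold
    print(deflection_mask.sum(), "grid points flagged")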

Spatial Prediction Based on the Bayesian Kriging with Box-Cox Transformation

  • Choi, Jung-Soon;Park, Man-Sik
    • Communications for Statistical Applications and Methods / v.16 no.5 / pp.851-858 / 2009
  • In recent decades, there has been much interest in climate variability because its changes have dramatic effects on humanity. In particular, precipitation data are measured over space, and their spatial association is complicated, so such a spatial dependency structure should be taken into account when analyzing the data. However, the data sets used in linear models for analyzing such data show severely skewed distributions. In this paper, we apply the Box-Cox transformation to approximate a normal distribution prior to the analysis and employ a Bayesian hierarchical framework to investigate the spatial patterns. The data set we consider is the monthly average precipitation for the third quarter of 2007, obtained from 347 automated monitoring stations in contiguous South Korea.
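
  • A minimal sketch of the Box-Cox step, assuming skewed, strictly positive precipitation values: SciPy estimates the transformation parameter by maximum likelihood, and the inverse transform maps results back to the original scale. The data below are synthetic, and the Bayesian kriging itself is not shown.

    # Box-Cox transformation of skewed precipitation-like data (illustrative only).
    import numpy as np
    from scipy import stats
    from scipy.special import inv_boxcox

    rng = np.random.default_rng(42)
    precip = rng.lognormal(mean=1.5, sigma=0.8, size=347)   # synthetic, right-skewed, > 0

    # Estimate lambda by maximum likelihood and transform toward normality.
    transformed, lam = stats.boxcox(precip)
    print(f"estimated lambda = {lam:.3f}")
    print(f"skewness before = {stats.skew(precip):.2f}, after = {stats.skew(transformed):.2f}")

    # After modeling on the transformed scale, map predictions back.
    back = inv_boxcox(transformed, lam)
    print(np.allclose(back, precip))                        # -> True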

BIM and Thermographic Sensing: Reflecting the As-is Building Condition in Energy Analysis

  • Ham, Youngjib;Golparvar-Fard, Mani
    • Journal of Construction Engineering and Project Management / v.5 no.4 / pp.16-22 / 2015
  • This paper presents an automated computer vision-based system to update BIM data by leveraging multi-modal visual data collected from existing buildings under inspection. Currently, visual inspections are conducted for building envelopes or mechanical systems, and auditors analyze energy-related contextual information to examine whether their performance is maintained as expected by the design. By translating 3D surface thermal profiles into energy performance metrics such as actual R-values at the point level, and by mapping such properties to the associated BIM elements using the XML Document Object Model (DOM), the proposed method narrows the energy performance modeling gap between the architectural information in the as-designed BIM and the as-is building condition, which improves the reliability of building energy analysis. Several case studies were conducted to experimentally evaluate the impact of the updated properties on BIM-based energy analysis for calculating energy loads. The experimental results on existing buildings show that (1) the point-level thermography-based thermal resistance measurements can be automatically matched with the associated BIM elements; and (2) their corresponding thermal properties are automatically updated in the gbXML schema. This paper provides practitioners with insight into the fundamentals of how multi-modal visual data can be used to improve the accuracy of building energy modeling for retrofit analysis. Open research challenges and lessons learned from real-world case studies are discussed in detail.
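
  • A rough sketch of the gbXML update step, assuming a measured R-value needs to be written into the material of a matched construction: Python's built-in xml.etree plays the role of the DOM here, and the element names and unit string follow the gbXML schema only loosely, so treat them as illustrative rather than as the authors' exact mapping.

    # Write a measured R-value into a (simplified) gbXML fragment.
    import xml.etree.ElementTree as ET

    GBXML = """<gbXML xmlns="http://www.gbxml.org/schema">
      <Construction id="exterior-wall-1">
        <LayerId layerIdRef="layer-1"/>
      </Construction>
      <Layer id="layer-1"><MaterialId materialIdRef="mat-1"/></Layer>
      <Material id="mat-1"><Name>Brick cavity wall</Name></Material>
    </gbXML>"""

    NS = {"gb": "http://www.gbxml.org/schema"}
    root = ET.fromstring(GBXML)

    # Measured, point-level thermal resistance matched to this material (assumed value).
    measured_r_value = 1.87   # m^2*K/W

    material = root.find(".//gb:Material[@id='mat-1']", NS)
    r_elem = ET.SubElement(material, "{http://www.gbxml.org/schema}R-value")
    r_elem.set("unit", "SquareMeterKPerW")
    r_elem.text = f"{measured_r_value:.2f}"

    print(ET.tostring(root, encoding="unicode"))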

Discovering Meaningful Trends in the Inaugural Addresses of North Korean Leader Via Text Mining (텍스트마이닝을 활용한 북한 지도자의 신년사 및 연설문 트렌드 연구)

  • Park, Chul-Soo
    • Journal of Information Technology Applications and Management / v.26 no.3 / pp.43-59 / 2019
  • The goal of this paper is to investigate changes in North Korea's domestic and foreign policies through automated text analysis of North Korean new year addresses, some of the most important and authoritative documents publicly announced by the North Korean government. Based on these data, we then analyze the status of text mining research, using text mining techniques to find its topics, methods, and trends. We also investigate the characteristics and analysis methods of the text mining techniques, as confirmed by the analysis of the data. We propose a procedure to find meaningful tendencies based on a combination of text mining, cluster analysis, and co-occurrence networks. To demonstrate the applicability and effectiveness of the proposed procedure, we analyzed the inaugural addresses of North Korea's Kim Jung Un from 2017 to 2019. The main results of this study show that trends in the North Korean national policy agenda can be discovered with clustering and visualization algorithms. We found that the uncovered semantic structures of the North Korean new year addresses closely follow major changes in the North Korean government's positions toward its own people as well as outside audiences such as the USA and South Korea.
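
  • The proposed combination of text mining, cluster analysis, and co-occurrence networks can be sketched with standard tools. The snippet below clusters a few placeholder documents with TF-IDF and k-means and builds a simple term co-occurrence matrix; the documents, the cluster count, and the vocabulary are placeholders, not the addresses analyzed in the paper.

    # Minimal text-mining sketch: TF-IDF + k-means clustering and term co-occurrence.
    import numpy as np
    from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
    from sklearn.cluster import KMeans

    docs = [
        "economic development and self reliance in the new year",
        "dialogue with the united states and denuclearization talks",
        "inter korean relations and peace on the peninsula",
        "science technology and economic construction drive",
    ]

    # Cluster the documents on TF-IDF features (k assumed to be 2 here).
    tfidf = TfidfVectorizer(stop_words="english")
    X = tfidf.fit_transform(docs)
    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
    print("cluster labels:", labels)

    # Term co-occurrence counts: C[i, j] = number of documents containing both terms.
    counts = CountVectorizer(stop_words="english", binary=True)
    B = counts.fit_transform(docs)
    cooccurrence = (B.T @ B).toarray()
    np.fill_diagonal(cooccurrence, 0)
    terms = counts.get_feature_names_out()
    i, j = np.unravel_index(cooccurrence.argmax(), cooccurrence.shape)
    print("most frequent co-occurring pair:", terms[i], "/", terms[j])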