• Title/Summary/Keyword: graph mining


Sentiment Analysis of Product Reviews to Identify Deceptive Rating Information in Social Media: A SentiDeceptive Approach

  • Marwat, M. Irfan;Khan, Javed Ali;Alshehri, Dr. Mohammad Dahman;Ali, Muhammad Asghar;Hizbullah;Ali, Haider;Assam, Muhammad
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.3 / pp.830-860 / 2022
  • [Introduction] Many companies are moving their businesses online as customers increasingly prefer to shop online. [Problem] Users share a vast amount of information about products, which makes it difficult for end-users to reach decisions. [Motivation] A mechanism is therefore needed to automatically analyze end-user opinions, thoughts, and feelings about products on social media platforms, which may help customers make or change their purchasing decisions. [Proposed Solution] To this end, we propose an automated SentiDeceptive approach, which classifies end-user reviews into negative, positive, and neutral sentiments and identifies deceptive crowd-user rating information on social media platforms to support user decision-making. [Methodology] We first collected 11,781 end-user comments from the Amazon store and the Flipkart web application covering distinct products, such as watches, mobile phones, shoes, clothes, and perfumes. Next, we developed a coding guideline as the basis for the comment annotation process. We then applied the content analysis approach and the existing VADER library to annotate the end-user comments in the data set with the identified codes, producing a labelled data set used as input to the machine learning classifiers. Finally, we applied the sentiment analysis approach to identify end-user opinions and overcome deceptive rating information on social media platforms: we preprocessed the input data to remove irrelevant data (stop words, special characters, etc.), employed two standard resampling approaches (oversampling and under-sampling) to balance the data set, extracted different features (TF-IDF and BOW) from the textual data, and then trained and tested the machine learning algorithms using standard cross-validation approaches (KFold and Shuffle Split). [Results/Outcomes] Furthermore, to support our research study, we developed an automated tool that analyzes each customer's feedback and displays the collective sentiment of customers about a specific product as a graph, which helps customers make decisions. In a nutshell, the proposed approach produces good results when identifying customer sentiments from online user feedback, i.e., it obtained an average of 94.01% precision, 93.69% recall, and 93.81% F-measure for classifying positive sentiments.
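The precision, recall, and F-measure reported for the positive class can be illustrated with a minimal sketch. The function name `per_class_scores` and the toy labels are assumptions made for illustration; they are not taken from the paper, whose averaged scores come from its own cross-validation runs.

```python
from collections import Counter

def per_class_scores(y_true, y_pred, label):
    """Return (precision, recall, f_measure) for one sentiment label."""
    pairs = Counter(zip(y_true, y_pred))
    tp = pairs[(label, label)]
    # Predicted as `label` but actually another class.
    fp = sum(n for (t, p), n in pairs.items() if p == label and t != label)
    # Actually `label` but predicted as another class.
    fn = sum(n for (t, p), n in pairs.items() if t == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_measure

# Hypothetical toy labels, not data from the study.
y_true = ["positive", "positive", "negative", "neutral", "positive", "negative"]
y_pred = ["positive", "neutral", "negative", "neutral", "positive", "positive"]

precision, recall, f_measure = per_class_scores(y_true, y_pred, "positive")
print(precision, recall, f_measure)
```

Scoring each class separately, as here, is what makes it possible to report figures "for classifying positive sentiments" in a three-class setting.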

Development of Selection Model of Interchange Influence Area in Seoul Belt Expressway Using Chi-square Automatic Interaction Detection (CHAID) (CHAID분석을 이용한 나들목 주변 지가의 공간분포 영향모형 개발 - 서울외곽순환고속도로를 중심으로 -)

  • Kim, Tae Ho;Park, Je Jin;Kim, Young Il;Rho, Jeong Hyun
    • KSCE Journal of Civil and Environmental Engineering Research / v.29 no.6D / pp.711-717 / 2009
  • This study develops a model for analyzing the relationship between a major node (an expressway interchange) and the land price formation of apartments along the Seoul Belt Expressway using CHAID analysis. The results show that, first, the regions along the Seoul Belt Expressway differ (outer side: Gyeonggi-do; inner side: Seoul), and although a graph would generally be expected to show a linear relationship between land price and traffic node, it does not; second, CHAID analysis shows two different spatial distributions split at the 2.6 km point on the outer side, but three different spatial distributions split at the 1.4 km and 3.8 km points on the inner side. In other words, traffic access does not necessarily guarantee a high housing price, since the graphs show that land price follows a composite spatial distribution. This implies that the residential environment (highway noise and regional discontinuity) and traffic accessibility interact to generate this phenomenon. Therefore, the highway IC land price model will be useful for calculating land prices in New Towns, which are constantly being built along the highway.
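The core idea behind a CHAID split, choosing the cut point whose groups differ most by a chi-square test, can be sketched as follows. This is a simplified illustration, not the study's procedure: it evaluates only binary cuts on distance and omits CHAID's category merging and significance adjustments. The names `chi_square` and `best_split`, the price classes, and all sample values are hypothetical.

```python
from collections import Counter

def chi_square(table):
    """Pearson chi-square statistic for a contingency table (list of rows)."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(row_tot)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n
            if expected > 0:
                stat += (observed - expected) ** 2 / expected
    return stat

def best_split(samples, cuts):
    """samples: (distance_km, price_class) pairs. Return (cut, chi2) with the largest chi2."""
    classes = sorted({c for _, c in samples})
    best = None
    for cut in cuts:
        near = Counter(c for d, c in samples if d <= cut)
        far = Counter(c for d, c in samples if d > cut)
        table = [[near[c] for c in classes], [far[c] for c in classes]]
        stat = chi_square(table)
        if best is None or stat > best[1]:
            best = (cut, stat)
    return best

# Made-up observations: distance from an interchange and a coarse price class.
samples = [(0.5, "high"), (1.0, "high"), (1.5, "high"), (2.0, "high"),
           (3.0, "low"), (3.5, "low"), (4.0, "low"), (4.5, "low")]
print(best_split(samples, cuts=[1.0, 2.6, 3.8]))
```

Applying the same search recursively within each resulting group is what yields several distance bands rather than a single threshold.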

The Perception Analysis of Autonomous Vehicles using Network Graph (네트워크 그래프를 활용한 자율주행차에 대한 인식 분석)

  • Hyo-gyeong Park;Yeon-hwi You;Sung-jung Yong;Seo-young Lee;Il-young Moon
    • Journal of Practical Engineering Education / v.15 no.1 / pp.97-105 / 2023
  • With the recent development of artificial intelligence technology, many technologies for user convenience are being developed. Among them, interest in autonomous vehicles is increasing day by day, and many automobile companies are currently aiming to commercialize them. To lay the foundation for the government to establish new and reasonable policies that support commercialization, we analyzed changes in and perceptions of public opinion through news article data. In this paper, 35,891 news articles mentioning terms similar to 'autonomous vehicles' over the past three years were collected and subjected to network analysis. The analysis yielded major keywords such as 'autonomous driving', 'AI', 'future', 'Hyundai Motor', 'autonomous driving vehicle', 'automobile', 'industrial', and 'electric vehicle'. In addition, the autonomous vehicle industry is developing into a faster and more diverse platform and service industry by converging with various industries, such as semiconductor companies and big tech companies as well as automobile companies, and attention is being paid to this convergence of industries. To continuously track changes in and perceptions of public opinion, continuous analysis of SNS data or technology trends is necessary.
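A keyword network of this kind is commonly built from co-occurrence counts. The sketch below assumes each article has already been reduced to a list of keywords; the helper names and the three toy documents are hypothetical, and the paper's actual extraction and network tooling are not described in the abstract.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_edges(docs):
    """Count the documents in which each unordered keyword pair appears together."""
    edges = Counter()
    for keywords in docs:
        for a, b in combinations(sorted(set(keywords)), 2):
            edges[(a, b)] += 1
    return edges

def weighted_degree(edges):
    """Sum the edge weights attached to each keyword (a simple centrality measure)."""
    degree = Counter()
    for (a, b), weight in edges.items():
        degree[a] += weight
        degree[b] += weight
    return degree

# Hypothetical keyword lists standing in for processed news articles.
docs = [
    ["autonomous driving", "AI", "Hyundai Motor"],
    ["autonomous driving", "AI", "electric vehicle"],
    ["AI", "future"],
]

edges = cooccurrence_edges(docs)
degree = weighted_degree(edges)
print(degree.most_common(3))
```

Keywords with the highest weighted degree are the natural candidates for the "major keywords" of a corpus.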

A study on detective story authors' style differentiation and style structure based on Text Mining (텍스트 마이닝 기법을 활용한 고전 추리 소설 작가 간 문체적 차이와 문체 구조에 대한 연구)

  • Moon, Seok Hyung;Kang, Juyoung
    • Journal of Intelligence and Information Systems / v.25 no.3 / pp.89-115 / 2019
  • This study uses data analysis to present the stylistic differences between Arthur Conan Doyle and Agatha Christie, two famous writers of classical mystery novels, and further presents an analytical methodology for the study of style based on text mining. We chose mystery novels because the unique devices found in classical mystery novels have strong stylistic characteristics, and we chose Arthur Conan Doyle and Agatha Christie, who are well known to general readers, so that people unfamiliar with this kind of research can relate to the subjects of analysis. The primary objective of this study is to identify how the differences appear within the text and to interpret the effects of these differences on the reader. Accordingly, in addition to events and characters, which are key elements of mystery novels, the writer's grammatical style of writing was included in the definition of style and analyzed. Two series and four books were selected for each writer, and the text was divided into sentences to obtain the data. After an emotional score was measured and assigned to each sentence, the emotions were visualized as a graph over the progression of pages, and the trend of event progression in the novel was identified under eight themes by applying topic modeling by page. By constructing co-occurrence matrices and performing network analysis, we were able to visualize changes in the relationships between characters as events progressed. In addition, all sentences were classified under a grammatical system of six types of writing style to identify differences between writers and between works. This enabled us to identify not only the general grammatical writing style of each author but also the stylistic characteristics inherent in their unconscious, and to interpret the effects of these characteristics on the reader.
This series of research processes can help to understand the context of an entire text based on a defined understanding of style, and furthermore helps integrate stylistic studies that were previously conducted individually. This prior understanding can also contribute to discovering and clarifying the existence of text in unstructured data, including online text. This could help enable more accurate recognition of emotions and delivery of commands on interactive artificial intelligence platforms that currently convert voice into natural language. With increasing attempts to analyze online texts, including new media, in many ways and to discover social phenomena and managerial value, this study is expected to contribute to more meaningful online text analysis and semantic interpretation through links to such studies. However, the fact that the analysis data consist of only two series and four books per author is a limitation, in that the analysis was not attempted on a sufficient quantity of data. Applying writing characteristics devised for Korean text to English text could also be a limitation. That the more diverse stylistic characteristics were limited to six, leaving less room for interpretation, is a further limitation. In addition, it is regrettable that the research analyzed classical mystery novels rather than text in common use today, and that a wider range of classical mystery writers was not compared. Subsequent research will attempt to increase the diversity of interpretations by taking into account a wider variety of grammatical systems and stylistic structures, and will also be applied to the analysis of currently common online text to assess its potential for interpretation. This is expected to enable the interpretation and definition of the specific structure of style and to open up a variety of uses.
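The sentence-level emotion scoring and page-by-page trend described in this abstract can be sketched roughly as follows. The lexicon, its weights, the fixed number of sentences per page, and the sample text are all hypothetical stand-ins; the abstract does not say which scoring resource or page definition the authors used.

```python
import re

# Hypothetical mini-lexicon; a real study would use an established resource.
LEXICON = {"murder": -1.0, "fear": -0.8, "dead": -0.9,
           "smile": 0.6, "relief": 0.7, "happy": 0.8}

def split_sentences(text):
    """Split on whitespace that follows sentence-ending punctuation."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def sentence_score(sentence):
    """Average lexicon score of the matched words, or 0.0 if none match."""
    words = re.findall(r"[a-z']+", sentence.lower())
    hits = [LEXICON[w] for w in words if w in LEXICON]
    return sum(hits) / len(hits) if hits else 0.0

def page_scores(text, sentences_per_page=2):
    """Mean sentence score per fixed-size 'page' of sentences."""
    scores = [sentence_score(s) for s in split_sentences(text)]
    pages = [scores[i:i + sentences_per_page]
             for i in range(0, len(scores), sentences_per_page)]
    return [sum(page) / len(page) for page in pages]

text = ("The murder filled the house with fear. Nobody spoke. "
        "The detective gave a smile. Relief followed at last.")
print(page_scores(text))
```

Plotting the returned list against the page index gives the kind of emotion curve the abstract refers to.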

The Changing Patterns of Demand-Supply and Role of Mineral Resources in Economic Growth during Industrialization of the Republic of Korea (한국공업화과정(韓國工業化過程)에서의 광물자원(鑛物資源)의 수급구조변화(需給構造變化)와 경제성장(經濟成長)에 있어서의 역할(役割))

  • Yun, Suckew
    • Economic and Environmental Geology / v.18 no.1 / pp.65-92 / 1985
  • A total of 12 mineral commodities significant to the domestic output, economy, and/or strategy of the Republic of Korea are chosen to examine the structural changes in production and demand-supply of these minerals during the last two decades of the country's industrialization. These include iron and manganese ores as the raw materials for iron and steel making; copper, zinc, and tungsten ores among other non-ferrous metallic minerals; limestone (cement), kaolin, talc, pyrophyllite, and graphite among other non-metallic minerals; and anthracite coal as the only domestic source of fossil energy. These are reviewed historically as time series based on statistical data, which are tabulated and graphed in terms of the domestic output, export, import, apparent demand-supply, its increasing rate, and self-sufficiency rate of each commodity. The increasing rates of demand-supply (IRDS) of some of the more important commodities are compared with those of Gross Domestic Product (GDP) and Economic Growth Rate (EGR) to evaluate how the IRDS contributed to the GDP and EGR. The major results are as follows. Among the 12 commodities, the domestic output of 8 commodities has grown with steady upward trends: ores of lead, zinc, and tungsten, limestone (cement), kaolin, talc, pyrophyllite, and anthracite coal. Two commodities, ores of iron and copper, continued with unchanging or slightly declining trends and varied fluctuations, in spite of their cardinal importance to the heavy industry and strategy of Korea. The remaining two, graphite and manganese ore, have gradually declined in domestic output; the former still has sufficient resource potential, but the latter does not and has virtually ceased domestic output.
Trade patterns for mineral commodities in the Republic of Korea have changed greatly during the last two decades, marked by a shift from mineral exporting to mineral importing, mainly because of increasing consumption of mineral raw materials for industrialization rather than because of a decreasing quantity of domestic mineral output. In terms of trade patterns, the 12 commodities in this study can be classified into the following four groups. The 1st group: ores of lead and tungsten have only been exported, without imports. The 2nd group: amorphous graphite and pyrophyllite have mainly been exported but partly imported. The 3rd group: kaolin, talc, and crystalline graphite have been exported and imported equally, but the quantity of imports has increased rapidly with time. The 4th group: ores of iron, manganese, and zinc have shifted from exports to imports during industrialization, particularly owing to the start of iron and steel making by the Pohang Iron and Steel Company in the mid-1970s and the establishment of the Onsan Zinc Refinery in the late 1970s. All 12 commodities under consideration were far above 100% in self-sufficiency rate before or in the early 1960s. Recently, however, most of them have declined to below 100%, except for limestone (cement) and pyrophyllite. It is particularly serious that the self-sufficiency rates of three important metallic minerals, iron, copper, and manganese ores, were 5.1%, 0.5%, and 0.01%, respectively, in 1982. The average self-sufficiency rate of all domestic minerals produced in 1982 was 14.4% (in value). The mining industry had an extremely high intermediate demand rate but a quite low intermediate input rate, indicating that mineral raw materials have exerted strong forward linkage effects upon other industries rather than backward linkage effects.
In comparing the curves of the increasing rates of demand-supply of several major minerals, namely iron ore, manganese ore, copper ore, limestone (cement), kaolin, and anthracite coal, with those of Gross Domestic Product and Economic Growth Rate drawn on each graph, it is clear that the demand-supply curves comprise around 6 to 7 cycles that roughly harmonize with those of the GDP and EGR curves, except for the anthracite coal curve, whose configuration seems to have resulted from the government's (artificial) mineral policy rather than from the free-market mechanism. The harmonic feature of these curves suggests that the increasing rates of demand-supply of major minerals have contributed significantly to the GDP and EGR. In addition, the wider amplitudes of the iron, manganese, and copper curves compared with those of the limestone (cement) and kaolin curves indicate that the contribution of the former, metallic commodities has been greater than that of the latter, non-metallic commodities.
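The indicators named in this abstract can be illustrated with a short sketch. It assumes the conventional definitions, which the abstract itself does not spell out: apparent demand equals domestic output plus imports minus exports, the self-sufficiency rate is output as a percentage of apparent demand, and the increasing rate is year-over-year percentage growth. All figures are invented for illustration.

```python
def apparent_demand(output, imports, exports):
    """Apparent demand-supply under the assumed conventional definition."""
    return output + imports - exports

def self_sufficiency_rate(output, imports, exports):
    """Domestic output as a percentage of apparent demand (assumed definition)."""
    return 100.0 * output / apparent_demand(output, imports, exports)

def increasing_rates(series):
    """Year-over-year percentage change of a time series."""
    return [100.0 * (cur - prev) / prev for prev, cur in zip(series, series[1:])]

# Invented figures in arbitrary tonnage units, not data from the paper.
print(self_sufficiency_rate(output=50, imports=950, exports=0))   # import-dependent
print(self_sufficiency_rate(output=300, imports=0, exports=100))  # net exporter
print(increasing_rates([100, 110, 121]))
```

Under these definitions a net exporter scores above 100%, which is consistent with the abstract's statement that all 12 commodities were far above 100% before the shift to importing.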
