• Title/Summary/Keyword: text analytics

A Study on Job Satisfaction/Retention Factors and Job Unsatisfaction/Turnover Factors by Industries using Job Reviews (직무 리뷰 분석을 통한 산업군별 직무만족/존속 요인 및 직무불만족/이직 요인에 관한 연구)

  • Lee, Jongseo;Kim, Sunggeun;Kang, Juyoung
    • Journal of Information Technology Services / v.16 no.1 / pp.1-26 / 2017
  • Keeping good, talented people is one of the most significant factors in a company's success. HR analytics is an important area for applying big data analysis techniques to human resources. It provides organizational insight that enables effective management of employees, allowing management to reach their business goals quickly and efficiently. Job satisfaction and employee turnover analysis are key to HR analytics. Job review web services have become popular. Because people exchange information about job satisfaction and turnover through these services, useful information for HR analytics accumulates on job review websites. In this paper, we identified factors of employee retention by analyzing a Job Satisfaction/Retention group, and factors of employee turnover by analyzing a Job Unsatisfaction/Turnover group. To do this, we first classified employees according to their self-reported job satisfaction or turnover. We collected and analyzed data from Jobplanet, a popular job review site. Through dominance analysis and LDA topic modeling, we found the major factors, topics, and keywords of the classified groups in the IT, service, and manufacturing domains. Our approach is a novel model for applying review analysis and text mining to the HR domain, and it will be practically helpful for setting new strategies that improve job satisfaction.

Analysis on the English Translation of The First Chosen Educational Ordinance, Manual of Education of Koreans (1913), and Manual of Education in Chosen 1920 (1920) Using Text Mining Analytics (텍스트 마이닝(Text mining) 기법을 활용한 『제1차조선교육령』과 『조선교육요람』(1913, 1920)의영어번역본 분석)

  • Jinyoung Tak;Eunjoo Kwak;Silo Chin;Minjoo Shon;Dongmie Kim
    • The Journal of the Convergence on Culture Technology / v.9 no.6 / pp.309-317 / 2023
  • The purpose of this paper is to investigate how Japan tried to dominate Chosen through educational policies by analyzing three official English texts published by the Japanese Government-General of Korea: the First Chosen Educational Ordinance declared in 1911, the Manual of Education of Koreans (1913), and the Manual of Education in Chosen 1920 (1920). To this end, the present study carried out a corpus-based diachronic analysis rather than a qualitative analysis. Using text analytics techniques such as Word Cloud and CONCOR, this paper derived the following results. First, the First Chosen Educational Ordinance (1911) includes overall educational regulations, curriculum, and operations of schools. Second, the Manual of Education of Koreans (1913) contains the educational medium and the contents of how to educate. Finally, it can be proposed that the Manual of Education in Chosen 1920 (1920) contains the specific implementation of education and the subject of education.

On the Development of Risk Factor Map for Accident Analysis using Textmining and Self-Organizing Map(SOM) Algorithms (재해분석을 위한 텍스트마이닝과 SOM 기반 위험요인지도 개발)

  • Kang, Sungsik;Suh, Yongyoon
    • Journal of the Korean Society of Safety / v.33 no.6 / pp.77-84 / 2018
  • Reports of industrial and occupational accidents have continuously accumulated in private and public institutes. In particular, the narrative text contained in disaster report documents, such as accident processes and risk factors, is becoming increasingly valuable for accident analysis. Despite the growing potential value of analyzing this text information, scientific and algorithmic text analytics for safety management has not yet been carried out. Thus, this study aims to develop data processing and visualization techniques that provide a systematic and structural view of the text information contained in a disaster report document so that safety managers can effectively analyze accident risk factors. To this end, a risk factor map using text mining and a self-organizing map is developed. Text mining is first used to extract risk keywords from disaster report documents, and then the Self-Organizing Map (SOM) algorithm is applied to visualize the risk factor map based on the similarity of disaster report documents. As a result, the valuable text information buried in a myriad of disaster report documents is expected to be analyzed, providing risk factors to safety managers.

Exploring the Sentiment Analysis of Electric Vehicles Social Media Data by Using Feature Selection Methods (속성선택방법을 이용한 전기자동차 소셜미디어 데이터의 감성분석 연구)

  • Costello, Francis Joseph;Lee, Kun Chang
    • Journal of Digital Convergence / v.18 no.2 / pp.249-259 / 2020
  • This study presents a recently obtained social media data set based on the case of electric vehicles (EVs) and implements sentiment analysis (SA) to gain insights. The study uses two methods to analyze the public's sentiment on EVs. First, we used an SA tool to extract the sentiment of comments. Next, we labeled the data with the sentiments obtained and classified them. While performing classification, we encountered the problem of dimensionality and explored the use of feature selection (FS) models to reduce the data set's dimensionality. We found that three FS models (Chi Squared, Information Gain, and ReliefF) showed the most promising results when used alongside logistic regression and support vector machine classification algorithms. The contributions of this paper lie in providing a real-world example of social media text analytics that can be adopted in many other areas of research and business. Moving forward, researchers can use the methodological approach in this paper to further refine and improve their own use cases in text analytics.
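
As a rough illustration of the Chi Squared feature selection named in the abstract above, the sketch below scores each term by how strongly its presence is associated with a binary sentiment label and keeps the top-ranked terms. It is not the authors' implementation: the function names `chi_squared` and `rank_terms` and the toy documents are hypothetical, and a real study would likely rely on an established machine learning library.

```python
def chi_squared(a: int, b: int, c: int, d: int) -> float:
    """Chi-squared statistic for a 2x2 term/label contingency table.

    a: positive docs containing the term   b: negative docs containing it
    c: positive docs without the term      d: negative docs without it
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    # A zero denominator means the term or the label never varies.
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0


def rank_terms(docs: list[tuple[set[str], bool]], top_k: int) -> list[str]:
    """Return the top_k terms by chi-squared score (ties broken alphabetically)."""
    vocab = set().union(*(tokens for tokens, _ in docs))
    scores = {}
    for term in vocab:
        a = sum(1 for tokens, label in docs if term in tokens and label)
        b = sum(1 for tokens, label in docs if term in tokens and not label)
        c = sum(1 for tokens, label in docs if term not in tokens and label)
        d = sum(1 for tokens, label in docs if term not in tokens and not label)
        scores[term] = chi_squared(a, b, c, d)
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_k]


# Hypothetical toy corpus: (tokens, is_positive)
docs = [
    ({"quiet", "car"}, True),
    ({"quiet", "range"}, True),
    ({"expensive", "car"}, False),
    ({"expensive", "range"}, False),
]
selected = rank_terms(docs, top_k=2)
```

Here `selected` keeps the two terms that separate the classes, while terms spread evenly across both classes score zero and are dropped, which is the dimensionality reduction the abstract refers to.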

Text Mining and Sentiment Analysis for Predicting Box Office Success

  • Kim, Yoosin;Kang, Mingon;Jeong, Seung Ryul
    • KSII Transactions on Internet and Information Systems (TIIS) / v.12 no.8 / pp.4090-4102 / 2018
  • With the emergence of online communication, text mining and sentiment analysis have frequently been applied to analyzing electronic word-of-mouth. This study aims to develop a domain-specific sentiment lexicon to predict box office success in the Korean film market and to validate the feasibility of the lexicon. Natural language processing, a machine learning algorithm, and a lexicon-based sentiment classification method are employed. To create a movie-domain sentiment lexicon, 233,631 reviews of 147 movies with popularity ratings were collected using an XML crawling package in R. We achieved 81.69% accuracy in sentiment classification with the Korean sentiment dictionary, which includes 706 negative words and 617 positive words. The results showed a stronger positive relationship between box office success and consumers' sentiment, as well as a significant positive effect in the linear regression of the prediction model. In addition, they reveal that emotion in user-generated content can be a more accurate clue for predicting business success.
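
The lexicon-based sentiment classification mentioned in the abstract above can be sketched as counting positive and negative dictionary hits per review. This is a minimal sketch under stated assumptions, not the authors' method: the tiny English word lists stand in for the paper's Korean dictionary (706 negative and 617 positive words), and `classify` and `accuracy` are hypothetical helper names.

```python
# Hypothetical toy lexicon; the paper's actual dictionary is Korean and much larger.
POSITIVE = {"great", "moving", "fun"}
NEGATIVE = {"boring", "awful", "slow"}


def classify(review: str) -> str:
    """Label a review by the sign of (positive hits - negative hits)."""
    tokens = review.lower().split()
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def accuracy(reviews: list[str], labels: list[str]) -> float:
    """Share of reviews whose predicted label matches the given label."""
    if not reviews:
        return 0.0
    hits = sum(classify(r) == y for r, y in zip(reviews, labels))
    return hits / len(reviews)


# Hypothetical labeled sample
reviews = ["great fun but slow", "boring and awful", "moving story"]
labels = ["positive", "negative", "positive"]
sample_accuracy = accuracy(reviews, labels)
```

Whitespace tokenization is used only to keep the sketch short; Korean reviews would need morphological analysis before dictionary lookup.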

Big Data Analytics of Construction Safety Incidents Using Text Mining (텍스트 마이닝을 활용한 건설안전사고 빅데이터 분석)

  • Jeong Uk Seo;Chie Hoon Song
    • Journal of the Korean Society of Industry Convergence / v.27 no.3 / pp.581-590 / 2024
  • This study aims to extract key topics through text mining of incident records (incident history, post-incident measures, preventive measures) from construction safety accident case data available on the public data portal. It also seeks to provide fundamental insights contributing to the establishment of manuals for disaster prevention by identifying correlations between these topics. After pre-processing the input data, we used the LDA-based topic modeling technique to derive the main topics. Consequently, we obtained five topics related to incident history, and four topics each related to post-incident measures and preventive measures. Although no dominant patterns emerged from the topic pattern analysis, the study holds significance as it provides quantitative information on the follow-up actions related to the incident history, thereby suggesting practical implications for the establishment of a preventive decision-making system through the linkage between accident history and subsequent measures for recurrence prevention.

Practical Text Mining for Trend Analysis: Ontology to visualization in Aerospace Technology

  • Kim, Yoosin;Ju, Yeonjin;Hong, SeongGwan;Jeong, Seung Ryul
    • KSII Transactions on Internet and Information Systems (TIIS) / v.11 no.8 / pp.4133-4145 / 2017
  • Advances in science and technology lead to a better life but also require greater investment. The government has therefore funded promising future technologies, and for several decades has devoted substantial resources to science and technology R&D projects. However, the performance of these public investments remains unclear in many ways, so the planning and evaluation of new investment should rest on data-driven decisions backed by fact-based evidence. In this regard, the government wanted to identify trends and issues in science and technology with evidence, and has accumulated a large database of research papers, patents, project reports, and R&D information. Today, the database supports activities such as policy planning, budget allocation, and investment evaluation for science and technology, but the information quality falls short of expectations because of the limitations of text mining in extracting information from unstructured data such as reports and papers. To solve this problem, this study proposes a practical text mining methodology for science and technology trend analysis, using aerospace technology as a case, and applies text mining methods such as ontology development, topic analysis, network analysis, and their visualization.

Development of Hybrid Recommender System Using Review Data Mining: Kindle Store Data Analysis Case (리뷰 데이터 마이닝을 이용한 하이브리드 추천시스템 개발: Amazon Kindle Store 데이터 분석사례)

  • Yihua Zhang;Qinglong Li;Ilyoung Choi;Jaekyeong Kim
    • Information Systems Review / v.23 no.1 / pp.155-172 / 2021
  • With the recent increase in online product purchases, recommender systems that recommend products based on users' preferences continue to be studied. A recommender system provides personalized product recommendation services to users. Collaborative Filtering (CF), which uses user ratings of products, is one of the most widely used recommendation algorithms. In CF, the item-based method uses the ratings a user left on purchased products to compute the similarity between purchased and unpurchased products. CF takes a long time to calculate the similarity between products, and even longer when using text-based big data such as Amazon store review data. This paper suggests a hybrid recommendation system that uses a 2-phase methodology and text data mining to calculate the similarity between products easily and quickly. To this end, we collected about 980,000 online consumer ratings and reviews from the online commerce store Amazon Kindle Store. Several experiments confirmed that the suggested hybrid recommendation system, which reflects users' ratings and review data, achieved similar recommendation time but higher accuracy than the CF-based benchmark recommender systems. Therefore, the suggested system is expected to increase user satisfaction and sales.
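
The item-based similarity step that the abstract above describes as the costly part of CF can be sketched with cosine similarity over per-item rating vectors. This illustrates the general technique only, not the paper's 2-phase hybrid method; the helper names `item_vectors` and `cosine` and the toy ratings are hypothetical.

```python
from math import sqrt


def item_vectors(ratings: dict[str, dict[str, float]]) -> dict[str, dict[str, float]]:
    """Turn user -> {item: rating} into item -> {user: rating}."""
    items: dict[str, dict[str, float]] = {}
    for user, user_ratings in ratings.items():
        for item, rating in user_ratings.items():
            items.setdefault(item, {})[user] = rating
    return items


def cosine(a: dict[str, float], b: dict[str, float]) -> float:
    """Cosine similarity of two sparse rating vectors (missing rating = 0)."""
    dot = sum(a[u] * b[u] for u in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical toy ratings
ratings = {
    "u1": {"A": 5, "B": 5},
    "u2": {"A": 1, "B": 1, "C": 4},
    "u3": {"C": 5, "D": 2},
}
vectors = item_vectors(ratings)
sim_ab = cosine(vectors["A"], vectors["B"])
sim_ac = cosine(vectors["A"], vectors["C"])
sim_ad = cosine(vectors["A"], vectors["D"])
```

Because every pair of items must be compared, the cost grows quadratically with the number of items, which is consistent with the abstract's remark that similarity calculation is slow on large data.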

The Effect of Text Consistency between the Review Title and Content on Review Helpfulness (온라인 리뷰의 제목과 내용의 일치성이 리뷰 유용성에 미치는 영향)

  • Li, Qinglong;Kim, Jaekyeong
    • Knowledge Management Research / v.23 no.3 / pp.193-212 / 2022
  • Many studies have proposed factors that affect review helpfulness. Previous studies have investigated the effect of quantitative factors (e.g., star ratings) and affective factors (e.g., sentiment scores) on review helpfulness. Online reviews contain titles and content, but existing studies focus on the review content. Investigating the factors that affect review helpfulness based on the review content alone, without considering the review title, is limited. Moreover, previous studies investigated the effects of review content and title on review helpfulness independently, which may ignore the potential impact of similarity between review titles and content. Drawing on the mere exposure effect theory, this study examined how text consistency between review titles and content affects review helpfulness. We also considered the role of information clearness, review length, and source reliability. The results show that text consistency between the review title and the content negatively affects review helpfulness. Furthermore, we found that information clearness and source reliability weaken the negative effect of text consistency on review helpfulness.

Analysis of the Bible Data using Big Data Analytics Tools R (빅데이터 분석도구 R을 활용한 성경 데이터의 분석)

  • Kim, YongSu;Ban, ChaeHoon
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2015.10a / pp.349-352 / 2015
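  • As big data has emerged as a key issue in the field of information and communication technology, interest in related technologies is increasing. R, a big data analysis tool, is a language and environment that enables statistics-based information analysis. In this paper, we use R to analyze Bible data. Through the analysis, we conduct a frequency survey of how text is distributed across the Old and New Testaments, the Pentateuch, and the four Gospels.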
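
The per-section frequency survey described in the abstract above amounts to counting words within each group of books. The paper uses R; the sketch below uses Python purely for illustration, and the function name `word_frequencies`, the section names, and the sample text are hypothetical. It also assumes English text, since Korean would need morphological analysis rather than a simple regular expression.

```python
import re
from collections import Counter


def word_frequencies(text: str, top_n: int = 3) -> list[tuple[str, int]]:
    """Return the top_n most frequent lowercase words in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common(top_n)


# Hypothetical sample: section name -> text of that section
sections = {
    "Pentateuch": "In the beginning God created the heaven and the earth",
}
top_words = {name: word_frequencies(text, top_n=1) for name, text in sections.items()}
```

Running the same function over each section's text gives comparable frequency tables for the Old and New Testaments, the Pentateuch, and the Gospels.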
