• Title/Summary/Keyword: natural language analysis (자연어분석)

Leveraging LLMs for Corporate Data Analysis: Employee Turnover Prediction with ChatGPT (대형 언어 모델을 활용한 기업데이터 분석: ChatGPT를 활용한 직원 이직 예측)

  • Sungmin Kim;Jee Yong Chung
    • Knowledge Management Research / v.25 no.2 / pp.19-47 / 2024
  • Organizational ability to analyze and utilize data plays an important role in knowledge management and decision-making. This study aims to investigate the potential application of large language models in corporate data analysis. Focusing on the field of human resources, the research examines the data analysis capabilities of these models. Using the widely studied IBM HR dataset, the study reproduces machine learning-based employee turnover prediction analyses from previous research through ChatGPT and compares its predictive performance. Unlike past research methods that required advanced programming skills, ChatGPT-based machine learning data analysis, conducted through the analyst's natural language requests, offers the advantages of being much easier and faster. Moreover, its prediction accuracy was found to be competitive compared to previous studies. This suggests that large language models could serve as effective and practical alternatives in the field of corporate data analysis, which has traditionally demanded advanced programming capabilities. Furthermore, this approach is expected to contribute to the popularization of data analysis and the spread of data-driven decision-making (DDDM). The prompts used during the data analysis process and the program code generated by ChatGPT are also included in the appendix for verification, providing a foundation for future data analysis research using large language models.

Development and Validation of the Letter-unit based Korean Sentimental Analysis Model Using Convolution Neural Network (회선 신경망을 활용한 자모 단위 한국형 감성 분석 모델 개발 및 검증)

  • Sung, Wonkyung;An, Jaeyoung;Lee, Choong C.
    • The Journal of Society for e-Business Studies / v.25 no.1 / pp.13-33 / 2020
  • This study proposes a Korean sentiment analysis algorithm that uses letter-unit (jamo) embedding and convolutional neural networks. Sentiment analysis is a natural language processing technique for analyzing subjective information expressed in text, such as a person's attitude, opinion, and propensity. Korean sentiment analysis research has been steadily increasing. However, it has lacked a general-purpose sentiment dictionary, and each field has instead built and used its own. The problem with this practice is that it does not fit the characteristics of the Korean language. In this study, we developed a sentiment analysis model that produces syllable vectors based on the onset, peak, and coda, omitting morphological analysis from the procedure. As a result, we were able to minimize the word-learning problem and the unregistered-word problem, and the model achieved an accuracy of 88%. The model is less influenced by the unstructured nature of the input data and allows polarity classification according to the context of the text. We hope the model makes Korean sentiment analysis easier for non-experts.
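
The onset/peak/coda decomposition mentioned in this abstract can be illustrated with standard Unicode arithmetic on precomposed Hangul syllables. The following is a minimal sketch only, not the authors' implementation: the function names `decompose_syllable` and `to_jamo` are hypothetical, and the paper's embedding and CNN stages are not shown.

```python
# Sketch: split precomposed Hangul syllables (U+AC00..U+D7A3) into
# onset, peak (vowel), and coda using the Unicode composition formula.
# Function names are illustrative assumptions, not taken from the paper.

ONSETS = list("ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ")          # 19 initial consonants
PEAKS = list("ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ")        # 21 vowels
CODAS = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")  # 28 finals, "" = none

HANGUL_BASE = 0xAC00
HANGUL_LAST = 0xD7A3


def decompose_syllable(ch):
    """Return (onset, peak, coda) for one precomposed Hangul syllable."""
    code = ord(ch)
    if not HANGUL_BASE <= code <= HANGUL_LAST:
        raise ValueError(f"not a precomposed Hangul syllable: {ch!r}")
    index = code - HANGUL_BASE
    onset = index // (len(PEAKS) * len(CODAS))            # 588 syllables per onset
    peak = (index % (len(PEAKS) * len(CODAS))) // len(CODAS)
    coda = index % len(CODAS)
    return ONSETS[onset], PEAKS[peak], CODAS[coda]


def to_jamo(text):
    """Flatten text into a jamo sequence; non-Hangul characters pass through."""
    out = []
    for ch in text:
        if HANGUL_BASE <= ord(ch) <= HANGUL_LAST:
            out.extend(j for j in decompose_syllable(ch) if j)
        else:
            out.append(ch)
    return out


print(decompose_syllable("한"))  # ('ㅎ', 'ㅏ', 'ㄴ')
```

Because every syllable maps to at most three symbols drawn from a small fixed alphabet, a sequence like this has no out-of-vocabulary tokens, which is consistent with the abstract's point about unregistered words.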

A Comparative Analysis of the Prediction Models for the Direction of Stock Price Using the Online Company Reviews (기업 리뷰 정보를 활용한 주가 방향 예측 모델 비교 분석)

  • Lim, Yongtaek;Lim, Heuiseok
    • Journal of the Korea Convergence Society / v.11 no.8 / pp.165-171 / 2020
  • Most stock price prediction research using text mining relies on news and SNS data. However, it is difficult to obtain honest and vivid information about companies from these sources. This paper addresses the problem of predicting the direction of stock prices by text mining online company reviews written by internal staff, which indicate employee satisfaction. A comparative analysis of prediction models showed that the model that adds internal employee reviews performs better than those that do not. This paper presents a convergence study that applies natural language processing to financial engineering. In the field of stock price prediction, it pursues a new methodology based on employee satisfaction. In practice, it is expected to provide useful information for forecasting stock price direction.

TRED : Twitter based Realtime Event-location Detector (트위터 기반의 실시간 이벤트 지역 탐지 시스템)

  • Yim, Junyeob;Hwang, Byung-Yeon
    • KIPS Transactions on Software and Data Engineering / v.4 no.8 / pp.301-308 / 2015
  • SNS is a web-based online platform service supporting the formation of relations between users. SNS users have usually used a desktop or laptop for this purpose so far. However, the number of SNS users is greatly increasing and their access to the web is improving with the spread of smart phones. They share their daily lives with other users through SNSs. We can detect events if we analyze the contents that are left by SNS users, where the individual acts as a sensor. Such analyses have already been attempted by many researchers. In particular, Twitter is used in related spheres in various ways, because it has structural characteristics suitable for detecting events. However, there is a limitation concerning the detection of events and their locations. Thus, we developed a system that can detect the location immediately based on the district mentioned in Twitter. We tested whether the system can function in real time and evaluated its ability to detect events that occurred in reality. We also tried to improve its detection efficiency by removing noise.
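
The idea of detecting an event location from districts mentioned in tweets can be sketched as a sliding-window counter. This is an illustrative assumption about one simple way to do it, not the TRED system itself: the class name `DistrictBurstDetector`, the substring matching, the window length, and the threshold are all hypothetical, and timestamps are assumed to arrive in non-decreasing order.

```python
from collections import Counter, deque


class DistrictBurstDetector:
    """Flag a district when it is mentioned often enough within a time window.

    Hypothetical sketch: matching is plain substring search and the
    parameters are placeholders, not values from the paper.
    """

    def __init__(self, districts, window_seconds=600, threshold=3):
        self.districts = list(districts)
        self.window_seconds = window_seconds
        self.threshold = threshold
        self.mentions = deque()   # (timestamp, district), oldest first
        self.counts = Counter()   # district -> mentions inside the window

    def add_tweet(self, timestamp, text):
        """Record one tweet; return districts at or above the threshold."""
        # Expire mentions that fell out of the window.
        cutoff = timestamp - self.window_seconds
        while self.mentions and self.mentions[0][0] < cutoff:
            _, old_district = self.mentions.popleft()
            self.counts[old_district] -= 1

        flagged = []
        for district in self.districts:
            if district in text:
                self.mentions.append((timestamp, district))
                self.counts[district] += 1
                if self.counts[district] >= self.threshold:
                    flagged.append(district)
        return flagged


detector = DistrictBurstDetector(["Gangnam", "Mapo"], window_seconds=60, threshold=2)
detector.add_tweet(0, "smoke near Gangnam station")
print(detector.add_tweet(10, "fire trucks in Gangnam"))  # ['Gangnam']
```

A real system would also need the noise removal the abstract mentions, for example filtering tweets that name a district without reporting anything happening there.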

Transformation-based Learning for Korean Comparative Sentence Classification (한국어 비교 문장 유형 분류를 위한 변환 기반 학습 기법)

  • Yang, Seon;Ko, Young-Joong
    • Journal of KIISE:Software and Applications / v.37 no.2 / pp.155-160 / 2010
  • This paper proposes a method for Korean comparative sentence classification, which is a part of comparison mining. Comparison mining, one area of text mining, analyzes comparative relations in large collections of text documents. Comparison mining requires a three-step process: 1) identifying comparative sentences in the text documents, 2) classifying those sentences into several classes, and 3) analyzing the comparative relations for each comparative class. This paper addresses the second task. We use transformation-based learning (TBL), a well-known learning method in natural language processing. In our experiment, we classify comparative sentences into seven classes using TBL and achieve an accuracy of 80.01%.

A Comparative Performance Analysis of Spark-Based Distributed Deep-Learning Frameworks (스파크 기반 딥 러닝 분산 프레임워크 성능 비교 분석)

  • Jang, Jaehee;Park, Jaehong;Kim, Hanjoo;Yoon, Sungroh
    • KIISE Transactions on Computing Practices / v.23 no.5 / pp.299-303 / 2017
  • By stacking hidden layers in artificial neural networks, deep learning delivers outstanding performance on high-level abstraction problems such as object/speech recognition and natural language processing. However, deep-learning users often struggle with the tremendous amounts of time and resources required to train deep neural networks. To alleviate this computational challenge, many approaches have been proposed in a variety of areas. In this work, two existing Apache Spark-based acceleration frameworks for deep learning (SparkNet and DeepSpark) are compared and analyzed in terms of training accuracy and time demands. In the authors' experiments with the CIFAR-10 and CIFAR-100 benchmark datasets, SparkNet showed more stable convergence behavior than DeepSpark, but in terms of training accuracy, DeepSpark delivered a higher classification accuracy of approximately 15%. In some cases, DeepSpark also outperformed the sequential implementation running on a single machine in terms of both accuracy and running time.

Traditional Knowledge analysis based on Native Biological Resources Database Construction of the National Park Area (국립공원 지역의 한국 자생생물자원 전통지식 DB구축을 통한 전통지식 현황 분석)

  • Bae, Se-Eun;Kim, Boyoung;Kim, Sung-Ha;Park, Jeong Hwan;Bae, EunKyung;Jang, Jin-Hwa;Lee, Sang-Hun;Park, Jae Won;Shin, Jinseop
    • The Journal of the Korea Contents Association / v.16 no.9 / pp.267-275 / 2016
  • Species that are continually used for clothing, food, shelter, and health are distributed in various places. The Convention on Biological Diversity was established to conserve these resources and enhance their value in many countries around the world. Each country is building databases to protect its native biological resources and establish sovereignty over them. This study analyzed the distribution of traditional knowledge about native biological resources and built a database using a standardized form for the collected data that we developed. The results show that most native biological resources were used for food and medical treatment.

Postprocessing of A Speech Recognition using the Morphological Analysis Technique (형태소 분석 기법을 이용한 음성 인식 후처리)

  • 박미성;김미진;김계성;김성규;이문희;최재혁;이상조
    • Journal of the Korean Institute of Telematics and Electronics C / v.36C no.4 / pp.65-77 / 1999
  • Two problems must be addressed to feed continuous speech recognition results into natural language processing. First, the unit of speech is not consistent with the spacing unit of text. Second, phonological alternation occurs within and between morphemes when text is pronounced. In this paper, we implement a postprocessing system for continuous speech recognition that first solves these two problems using an eo-jeol generator and a syllable recoverer, then morphologically analyzes the generated results, and finally corrects failed results through a corrector. We tested the system on two speech corpora: a primary school textbook corpus and an editorial corpus. The success rate was 93.72% for the former and 92.26% for the latter. The experiments verified that the system is stable regardless of the type of corpus.

Use Case Identification Method based on Goal oriented Requirements Engineering(GoRE) (Goal 지향 요구공학 기반의 유스케이스 식별 방법)

  • Park, Bokyung;Kim, R. Youngchul
    • KIPS Transactions on Software and Data Engineering / v.3 no.7 / pp.255-262 / 2014
  • Our previous research [1] proposed an object extraction and modeling method based on Fillmore's case grammar. That approach did not consider use case extraction. To address this, we adopt Fillmore's semantic method as a linguistic approach to requirements engineering, refining Fillmore's case grammar to extract and model use cases from customer requirements. The refined mechanism includes the definition of a structured procedure and the representation of visual notations for 'case' modeling. This paper also proposes a use case decision matrix, based on goal-oriented requirements engineering (GoRE), to identify the size of each extracted use case, which is related to its complexity, and to prioritize the use cases. We demonstrate the proposal with a bank ATM system.

GARDIAN: Rule Based Modeling Validation for Concurrent Object Modeling and Architectural Design mEThod(COMET) (GARDIAN: 실시간 내장형 소프트웨어 개발 방법론에서의 룰 기반의 모델링 평가 및 지원도구)

  • Kim, Sun-Tae;Kim, Jin-Tae;Park, Soo-Yong
    • Journal of KIISE:Software and Applications / v.34 no.8 / pp.721-730 / 2007
  • UML (Unified Modeling Language) is widely used to analyze and design target software. Developers also implement the target software based on the UML artifacts. However, because the guidelines for UML modeling are described in natural language, it is difficult to validate whether the artifacts conform to them. This paper discusses a rule-based model checker that checks whether models are designed according to the modeling methodology. We propose rules and a corresponding checker, named GARDIAN, for UML model validation. The checkers are designed for the COMET method for real-time embedded systems. We illustrate them using an Intelligent Robot system to validate our approach.