Search | Korea Science

Literature Review of AI Hallucination Research Since the Advent of ChatGPT: Focusing on Papers from arXiv (챗GPT 등장 이후 인공지능 환각 연구의 문헌 검토: 아카이브(arXiv)의 논문을 중심으로)

Park, Dae-Min;Lee, Han-Jong
- Informatization Policy / v.31 no.2 / pp.3-38 / 2024
Hallucination is a significant barrier to the use of large language models and multimodal models. In this study, we collected 654 computer science papers posted to arXiv between December 2022 and January 2024, following the advent of ChatGPT, that contain "hallucination" in the abstract. We conducted frequency analysis, knowledge network analysis, and a literature review to explore the latest trends in hallucination research. The results showed that research was active in the fields of "Computation and Language," "Artificial Intelligence," "Computer Vision and Pattern Recognition," and "Machine Learning." We then analyzed research trends in these four major fields by focusing on the main authors and dividing the work into data, hallucination detection, and hallucination mitigation. The main research trends included hallucination mitigation through supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF), inference enhancement via "chain of thought" (CoT), and growing interest in hallucination mitigation within multimodal AI. This study provides insights into the latest developments in hallucination research through a technology-oriented literature review, and is expected to help subsequent research in engineering as well as in the humanities and social sciences.
https://doi.org/10.22693/NIAIP.2024.31.2.003

Exploring Factors to Minimize Hallucination Phenomena in Generative AI - Focusing on Consumer Emotion and Experience Analysis - (생성형AI의 환각현상 최소화를 위한 요인 탐색 연구 - 소비자의 감성·경험 분석을 중심으로-)

Jinho Ahn;Wookwhan Jung
- Journal of Service Research and Studies / v.14 no.1 / pp.77-90 / 2024
This research aims to investigate methods of leveraging generative artificial intelligence in service sectors where consumer sentiment and experience are paramount, focusing on minimizing hallucination phenomena during usage and developing strategic services tailored to consumer sentiment and experiences. To this end, the study examined both mechanical approaches and user-generated prompts, experimenting with factors such as business item definition, provision of persona characteristics, examples and context-specific imperative verbs, and the specification of output formats and tone concepts. The research explores how generative AI can contribute to enhancing the accuracy of personalized content and user satisfaction. Moreover, these approaches play a crucial role in addressing issues related to hallucination phenomena that may arise when applying generative AI in real services, contributing to consumer service innovation through generative AI. The findings demonstrate the significant role generative AI can play in richly interpreting consumer sentiment and experiences, broadening the potential for application across various industry sectors and suggesting new directions for consumer sentiment and experience strategies beyond technological advancements. However, as this research is based on the relatively novel field of generative AI technology, there are many areas where it falls short. Future studies need to explore the generalizability of research factors and the conditional effects in more diverse industrial settings. Additionally, with the rapid advancement of AI technology, continuous research into new forms of hallucination symptoms and the development of new strategies to address them will be necessary.
https://doi.org/10.18807/jsrs.2024.14.1.077

Factors Influencing Seniors' Behavioral Intention of Generative AI Services (시니어의 생성형AI 서비스 이용의도에 영향을 미치는 요인)

Sung, Myoung-cheol;Dong, Hak-rim
- Journal of Venture Innovation / v.7 no.2 / pp.41-56 / 2024
Recently, generative AI services, including ChatGPT, have garnered significant attention. These services appeal not only to digital natives, such as Generation Z, but also to digital immigrants, including seniors. This study aimed to analyze the factors affecting seniors' behavioral intention to use generative AI services. A survey targeting seniors was conducted, resulting in 250 valid responses, and the data were analyzed using multiple regression analysis. The independent variables were performance expectancy, effort expectancy, social influence, requisite knowledge, and biophysical aging restrictions, drawn from MATOA (Model for the Adoption of Technology by Older Adults), a research model of technology acceptance by seniors, together with the AI hallucinations of generative AI services. The empirical results were as follows: performance expectancy and social influence had a significant positive impact on seniors' behavioral intention to use generative AI services. Additionally, requisite knowledge positively influenced this intention, while biophysical aging restrictions had a significant negative effect. However, effort expectancy and AI hallucinations did not show a significant influence. The variables were ranked by influence as follows: performance expectancy, social influence, requisite knowledge, and biophysical aging restrictions. Based on these results, academic and practical implications were presented.
https://doi.org/10.22788/7.2.3

Empirical Study on the Hallucination of Large Language Models Derived by the Sentence-Closing Ending (어체에 따른 초거대언어모델의 한국어 환각 현상 분석)

Hyeonseok Moon;Sugyeong Eo;Jaehyung Seo;Chanjun Park;Yuna Hur;Heuiseok Lim
- Annual Conference on Human and Language Technology / 2023.10a / pp.677-682 / 2023
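Large language models perform a target task without any model training, simply by adding training examples to the input. This approach is called in-context learning (ICL) and has become the de facto standard for using large language models. However, studies report that these models frequently run into usage limitations such as hallucination. In this study, we confirm that when large language models are used for Korean tasks, even a very simple change of sentence-closing ending produces very large performance variation. Through analysis, we find that the utility of large language models changes substantially depending on how the speech style of the training examples and that of the inference target are varied, and we analyze this effect. Based on these experimental results, we further propose that Korean datasets should be constructed so that the speech style is kept consistent.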

KFREB: Korean Fictional Retrieval-based Evaluation Benchmark for Generative Large Language Models (KFREB: 생성형 한국어 대규모 언어 모델의 검색 기반 생성 평가 데이터셋)

Jungseob Lee;Junyoung Son;Taemin Lee;Chanjun Park;Myunghoon Kang;Jeongbae Park;Heuiseok Lim
- Annual Conference on Human and Language Technology / 2023.10a / pp.9-13 / 2023
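This paper proposes KFREB (Korean Fictional Retrieval Evaluation Benchmark), a new Korean benchmark for evaluating the retrieval-based answer generation ability of large language models. KFREB evaluates retrieval-based answer generation using fictional information that the model has not seen during pretraining. It thereby aims to address the problem that answers reflecting facts seen during pretraining cannot properly measure a model's ability in an actual retrieval-based answering system. Considering real service cases of retrieval-based large language models, KFREB provides evaluation cases for long documents, gold documents containing two answers, a single gold document with or without keywords from similar distractor documents, and cross-reference multi-hop reasoning that requires cross-referencing between documents. These cases are expected to offer insight into selecting an appropriate large language model and using it in real services.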

Korean Commonsense Reasoning Evaluation for Large Language Models (거대언어모델을 위한 한국어 상식추론 기반 평가)

Jaehyung Seo;Chanjun Park;Hyeonseok Moon;Sugyeong Eo;Aram So;Heuiseok Lim
- Annual Conference on Human and Language Technology / 2023.10a / pp.162-167 / 2023
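This paper proposes a new evaluation method for large language models based on Korean commonsense reasoning. The proposed method is grounded in general Korean common sense and is intended to judge how well a large language model understands given information and generates output consistent with it. On Korean-CommonGEN, previously used to evaluate Korean commonsense reasoning, language models already perform at a high level, and large language models such as GPT-3 record performance above the human upper bound. The existing evaluation method therefore struggles to assess the advanced commonsense reasoning of large language models precisely. Moreover, it does not sufficiently account for social bias or hallucination when evaluating commonsense reasoning. Reflecting the problems that large language models raise, our evaluation method presents a new way of constructing a commonsense reasoning benchmark so that Korean natural language processing research can continue to advance in the coming era of large language models.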

An Exploratory Study on the Trustworthiness Analysis of Generative AI (생성형 AI의 신뢰도에 대한 탐색적 연구)

Soyon Kim;Ji Yeon Cho;Bong Gyou Lee
- Journal of Internet Computing and Services / v.25 no.1 / pp.79-90 / 2024
This study focused on user trust in ChatGPT, a generative AI technology, and explored the factors that affect usage status and the intention to continue using it, and whether the influence of trust varies depending on the purpose of use. To this end, a survey was conducted targeting people in their 20s and 30s, the age group that uses ChatGPT the most. The statistical analysis was performed with IBM SPSS 27 and SmartPLS 4.0. A structural equation model was formulated on the basis of Bhattacherjee's Expectation-Confirmation Model (ECM), employing path analysis and Multi-Group Analysis (MGA) for hypothesis validation. The main findings are as follows. First, ChatGPT is mainly used for specific needs or objectives rather than as a daily tool. The majority of users are aware of its hallucinations; however, this did not hinder its use. Second, the hypothesis testing indicated that the independent variables of expectation-confirmation, perceived usefulness, and user satisfaction all exert a positive influence on the dependent variable, continuance intention. Third, the influence of trust varied depending on the user's purpose in using ChatGPT. Trust was significant when ChatGPT was used for information retrieval but not when it was used for creative purposes. This study can inform efforts to resolve reliability problems in the process of introducing generative AI in society and companies, and to establish policies and derive improvement measures for its successful adoption.
https://doi.org/10.7472/jksii.2024.25.1.79

KoCheckGPT: Korean LLM written document detector (KoCheckGPT: 한국어 초거대언어모델 작성 글 판별기)

Myunghoon Kang;Jungseob Lee;Seungyoon Lee;Seongtae Hong;Jeongbae Park;Heuiseok Lim
- Annual Conference on Human and Language Technology / 2023.10a / pp.432-436 / 2023
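With the arrival of large language models (LLMs), a wide range of tasks can be handled by zero-shot inference regardless of domain, and LLMs are being applied across many industries. Notably, ChatGPT and GPT-4 are offered through commercial APIs, and this ease of access is attracting a broad range of users. However, ChatGPT and GPT-4 as currently offered through commercial APIs collect users' conversation history, which can cause corporate security problems, and hallucination in their generated output can reduce the reliability of corporate documents. In particular, because LLM-generated text is about as fluent as human writing, failing to identify LLM-written text in industrial settings can significantly constrain business activities. Yet no Korean LLM-written text detection service currently exists. This paper proposes KoCheckGPT, a detector for text written by Korean large language models. KoCheckGPT targets documents written in the formal written style and the itemized (bullet-point) style frequently used in industry, and combines document-level and sentence-level detection information to determine effectively whether a given document was written by an LLM. In comparison experiments with ZeroGPT, a multilingual LLM-written text detector, KoCheckGPT showed superior performance in detecting Korean LLM-written text.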

A Study on the Intelligent Document Processing Platform for Document Data Informatization (문서 데이터 정보화를 위한 지능형 문서처리 플랫폼에 관한 연구)

Hee-Do Heo;Dong-Koo Kang;Young-Soo Kim;Sam-Hyun Chun
- The Journal of the Institute of Internet, Broadcasting and Communication / v.24 no.1 / pp.89-95 / 2024
Nowadays, the competitiveness of a company depends on the ability of all its members to share and utilize the knowledge the organization has accumulated. As if to prove this, the world is now focusing on the ChatGPT service, which uses generative AI technology based on an LLM (Large Language Model). However, it is still difficult to apply the ChatGPT service to work because of its many hallucination problems. To solve this problem, sLLM (Lightweight Large Language Model) technology is being proposed as an alternative. Corporate data is essential for constructing an sLLM. Corporate data consists of the organization's ERP data and the office document knowledge data that the organization has preserved. ERP data can be used by connecting it directly to the sLLM, but office documents are stored in file format and must be converted to data format before they can be connected to the sLLM. In addition, there are too many technical limitations to utilizing office documents stored in file format as organizational knowledge information. This study proposes a method of storing office documents in DB format rather than file format, allowing companies to utilize already accumulated office documents as an organizational knowledge system and to provide office documents in data form to the company's sLLM. We aim to contribute to improving corporate competitiveness by combining this approach with AI technology.
https://doi.org/10.7236/JIIBC.2024.24.1.89

