• Title/Abstract/Keywords: Language prediction model

Brain-Operated Typewriter using the Language Prediction Model

  • Lee, Sae-Byeok;Lim, Heui-Seok
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 5, No. 10 / pp. 1770-1782 / 2011
  • A brain-computer interface (BCI) is a communication system that translates brain activity into commands for computers or other devices. In other words, BCIs create a new communication channel between the brain and an output device by bypassing conventional motor output pathways consisting of nerves and muscles. This is particularly useful for facilitating communication for people suffering from paralysis. Because of the low bit rate, however, translating brain activity into commands is slow, and entering characters with a BCI-based typewriter is especially time-consuming. In this paper, we propose a brain-operated typewriter that is accelerated by a language prediction model. The proposed system uses three strategies to improve entry speed: word completion, next-syllable prediction, and next-word prediction. In our demonstration, the language prediction model roughly doubled the entry speed of the BCI-based typewriter.
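Two of the strategies named in this abstract, word completion and next-word prediction, can be illustrated with a very small frequency-based model. The sketch below is only an illustration under stated assumptions: the class name `TinyPredictor`, the toy corpus, and the use of plain unigram and bigram counts are all hypothetical choices, not the method described in the paper.

```python
from collections import Counter, defaultdict
from typing import List


class TinyPredictor:
    """Toy word-completion and next-word predictor (illustrative only)."""

    def __init__(self, corpus: str) -> None:
        words = corpus.lower().split()
        # Unigram counts rank completion candidates by frequency.
        self.unigrams = Counter(words)
        # Bigram counts rank candidates for the word following `prev`.
        self.bigrams = defaultdict(Counter)
        for prev, nxt in zip(words, words[1:]):
            self.bigrams[prev][nxt] += 1

    def complete(self, prefix: str, k: int = 3) -> List[str]:
        """Return up to k known words starting with `prefix`, most frequent first."""
        matches = [(w, c) for w, c in self.unigrams.items() if w.startswith(prefix)]
        matches.sort(key=lambda wc: (-wc[1], wc[0]))  # frequency, then alphabetical
        return [w for w, _ in matches[:k]]

    def next_word(self, prev: str, k: int = 3) -> List[str]:
        """Return up to k words most often seen right after `prev`."""
        followers = self.bigrams.get(prev, Counter())
        ranked = sorted(followers.items(), key=lambda wc: (-wc[1], wc[0]))
        return [w for w, _ in ranked[:k]]


# Made-up corpus; a real system would be trained on a large text collection.
predictor = TinyPredictor("i want water . i want to rest . i want to sleep . i need water .")
print(predictor.complete("w"))      # ['want', 'water']
print(predictor.next_word("want"))  # ['to', 'water']
```

Each accepted suggestion replaces several character selections with one, which is the general reason prediction can raise entry speed on a low-bit-rate input channel.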

A BERT-Based Automatic Scoring Model of Korean Language Learners' Essay

  • Lee, Jung Hee;Park, Ji Su;Shon, Jin Gon
    • Journal of Information Processing Systems / Vol. 18, No. 2 / pp. 282-291 / 2022
  • This research applies a pre-trained bidirectional encoder representations from transformers (BERT) handwriting recognition model to predict foreign Korean-language learners' writing scores. A corpus of 586 answers to midterm and final exams written by foreign learners at the Intermediate 1 level was acquired and used for pre-training, resulting in consistent performance, even with small datasets. The test data were pre-processed and fine-tuned, and the results were calculated in the form of a score prediction. The difference between the prediction and actual score was then calculated. An accuracy of 95.8% was demonstrated, indicating that the prediction results were strong overall; hence, the tool is suitable for the automatic scoring of Korean written test answers, including grammatical errors, written by foreigners. These results are particularly meaningful in that the data included written language text produced by foreign learners, not native speakers.

자연어 처리 및 기계학습을 통한 동의보감 기반 한의변증진단 기술 개발 (Donguibogam-Based Pattern Diagnosis Using Natural Language Processing and Machine Learning)

  • 이승현;장동표;성강경
    • 대한한의학회지 / Vol. 41, No. 3 / pp. 1-8 / 2020
  • Objectives: This paper investigates Donguibogam-based pattern diagnosis by applying natural language processing and machine learning. Methods: A database was constructed by gathering symptoms and pattern diagnoses from the Donguibogam. The symptom sentences were tokenized into nouns, verbs, and adjectives with a natural language processing tool. To use the symptom sentences in machine learning, a Word2Vec model was built to convert words into numeric vectors. Using pairs of symptom vectors and pattern diagnoses, a pattern prediction model was trained with logistic regression. Results: The Word2Vec model's best performance was obtained by optimizing its primary parameters: the number of iterations, the vector dimension, and the window size. The resulting pattern diagnosis model achieved 75% accuracy (chance level 16.7%) in predicting Six-Qi pattern diagnoses. Conclusions: In this study, we developed a pattern diagnosis prediction model based on the symptoms and pattern diagnoses in the Donguibogam. Prediction accuracy could be increased by collecting more data through future expansion to other classics of oriental medicine.

Alzheimer's disease recognition from spontaneous speech using large language models

  • Jeong-Uk Bang;Seung-Hoon Han;Byung-Ok Kang
    • ETRI Journal / Vol. 46, No. 1 / pp. 96-105 / 2024
  • We propose a method to automatically predict Alzheimer's disease from speech data using the ChatGPT large language model. Alzheimer's disease patients often exhibit distinctive characteristics when describing images, such as difficulties in recalling words, grammar errors, repetitive language, and incoherent narratives. For prediction, we first employ a speech recognition system to transcribe participants' speech into text. We then gather opinions by inputting the transcribed text into ChatGPT together with a prompt designed to solicit fluency evaluations. Subsequently, we extract embeddings from the speech, text, and opinions using pretrained models. Finally, we use a classifier consisting of transformer blocks and linear layers to identify participants with this type of dementia. Experiments are conducted on the widely used ADReSSo dataset. The results yield a maximum accuracy of 87.3% when speech, text, and opinions are used in conjunction. This finding suggests the potential of leveraging evaluation feedback from language models to address challenges in Alzheimer's disease recognition.

사용자 적응을 통한 한국 수화 인식 시스템의 개선 (Improvement of Korean Sign Language Recognition System by User Adaptation)

  • 정성훈;박광현;변증남
    • 대한전기학회:학술대회논문집 / 대한전기학회 2007년도 심포지엄 논문집 정보 및 제어부문 / pp. 301-303 / 2007
  • This paper presents user adaptation methods that overcome the limitations of user-independent and user-dependent models in a Korean sign language recognition system. To adapt model parameters for unobserved states in hidden Markov models, we introduce new methods based on motion similarity and on prediction from adaptation history, achieving faster adaptation and higher recognition rates than previous methods.

다중목표 대화형 추천시스템을 위한 사전 학습된 언어모델들에 대한 성능 평가 (Performance Evaluation of Pre-trained Language Models in Multi-Goal Conversational Recommender Systems)

  • 김태호;장형준;김상욱
    • 스마트미디어저널 / Vol. 12, No. 6 / pp. 35-40 / 2023
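  • This study examines the various pre-trained language models used in multi-goal conversational recommender systems (MG-CRS), a type of conversational recommender system, and compares and analyzes the performance of each model. In particular, we investigate how the size of a language model affects the performance of an MG-CRS. For three kinds of language models, BERT, GPT2, and BART, we measure and compare the accuracy of 'type prediction' and 'topic prediction' on DuRecDial 2.0, a representative MG-CRS dataset. In the experiments, all models performed very well on type prediction, whereas on topic prediction performance differed between models and across model sizes. Based on these results, we suggest directions for improving the performance of multi-goal conversational recommender systems.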

한국어 음성인식 플랫폼의 설계 (Design of a Korean Speech Recognition Platform)

  • 권오욱;김회린;유창동;김봉완;이용주
    • 대한음성학회지:말소리 / No. 51 / pp. 151-165 / 2004
  • A Korean speech recognition platform is designed for educational and research purposes. It is based on an object-oriented architecture and can be easily modified so that researchers can readily evaluate the performance of a recognition algorithm of interest. This platform will save development time for many who are interested in speech recognition. The platform includes the following modules: noise reduction, end-point detection, mel-frequency cepstral coefficient (MFCC) and perceptual linear prediction (PLP)-based feature extraction, hidden Markov model (HMM)-based acoustic modeling, n-gram language modeling, n-best search, and Korean language processing. The decoder of the platform can handle both lexical search trees for large-vocabulary speech recognition and finite-state networks for small-to-medium-vocabulary speech recognition. It performs a word-dependent n-best search with a bigram language model in the first forward search stage, then extracts a word lattice and rescores each lattice path with a trigram language model in the second stage.
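The two-stage idea in this abstract, a first search pass followed by a second stage that re-evaluates candidates with a trigram language model, can be sketched on an n-best list. The code below is a simplified illustration, not the platform's decoder: it rescores a flat list of hypotheses rather than lattice paths, and the function names, the add-one smoothing, and the toy scores are all assumptions made for the example.

```python
import math
from collections import Counter
from typing import List, Tuple


def train_trigram(sentences: List[List[str]]):
    """Count trigrams and their two-word contexts over padded sentences."""
    tri, ctx, vocab = Counter(), Counter(), set()
    for words in sentences:
        toks = ["<s>", "<s>"] + words + ["</s>"]
        vocab.update(toks)
        for a, b, c in zip(toks, toks[1:], toks[2:]):
            tri[(a, b, c)] += 1
            ctx[(a, b)] += 1
    return tri, ctx, len(vocab)


def trigram_logprob(words: List[str], tri: Counter, ctx: Counter, v: int) -> float:
    """Log-probability of a sentence with add-one smoothing (an assumed choice)."""
    toks = ["<s>", "<s>"] + words + ["</s>"]
    return sum(
        math.log((tri[(a, b, c)] + 1) / (ctx[(a, b)] + v))
        for a, b, c in zip(toks, toks[1:], toks[2:])
    )


def rescore(nbest: List[Tuple[List[str], float]], tri: Counter, ctx: Counter,
            v: int, lm_weight: float = 1.0) -> List[str]:
    """Pick the hypothesis with the best first-pass score plus weighted trigram score."""
    best = max(
        nbest,
        key=lambda h: h[1] + lm_weight * trigram_logprob(h[0], tri, ctx, v),
    )
    return best[0]


# Toy training data and made-up first-pass log scores.
tri, ctx, v = train_trigram([["recognize", "speech"], ["recognize", "speech", "today"]])
nbest = [(["wreck", "a", "nice", "beach"], -10.0), (["recognize", "speech"], -10.5)]
print(rescore(nbest, tri, ctx, v))  # ['recognize', 'speech']
```

Here the first-pass score slightly prefers the wrong hypothesis, and the trigram score reverses that ranking, which is the purpose of a second rescoring stage.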

자연어 처리 기법을 활용한 충돌사고 원인 제공 비율 예측 모델 개발 (Collision Cause-Providing Ratio Prediction Model Using Natural Language Processing Analytics)

  • 윤익현;박혜인;이창희
    • 해양환경안전학회지 / Vol. 30, No. 1 / pp. 82-88 / 2024
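  • The modern maritime industry is advancing rapidly through technological development. One of the main technologies driving this progress is data processing, and within it, natural language processing (NLP) enables machines to understand and process human language. This study aimed to develop a model that uses NLP to analyze the written verdicts of the Korea Maritime Safety Tribunal, learns the cause-providing ratios of ship collision accidents that have already been adjudicated, and predicts the cause-providing ratio when a new written verdict is entered. The model calculates an accident's cause-providing ratio from the navigation rules applicable at the time of the accident and the weights of key keywords that affect the ratio. The study analyzes the accuracy of the resulting model and examines its applicability in practice; it is expected to contribute to preventing the recurrence of collision accidents and to resolving disputes among the parties to marine accidents.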

Predicting the Unemployment Rate Using Social Media Analysis

  • Ryu, Pum-Mo
    • Journal of Information Processing Systems / Vol. 14, No. 4 / pp. 904-915 / 2018
  • We demonstrate how social media content can be used to predict the unemployment rate, a real-world indicator. We present a novel method for predicting the unemployment rate using social media analysis based on natural language processing and statistical modeling. The system collects social media content, including news articles, blogs, and tweets written in Korean, and then extracts data for modeling using part-of-speech tagging and sentiment analysis techniques. The autoregressive integrated moving average with exogenous variables (ARIMAX) and autoregressive with exogenous variables (ARX) models for unemployment rate prediction are fit using the analyzed data. The proposed method quantifies the social moods expressed in social media content, whereas existing methods simply present social tendencies. Our model reduced the mean absolute percentage error by 27.9% compared with a Google Index-based model.
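The comparison reported at the end of this abstract is expressed in the mean absolute percentage error (MAPE) metric. As a worked illustration, the sketch below computes MAPE and one common reading of a percentage improvement between two models, the relative reduction in error. The function names and all numbers are made up for the example, and whether the paper computes its figure exactly this way is an assumption.

```python
from typing import Sequence


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent. Actual values must be non-zero."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def relative_reduction(baseline_error: float, model_error: float) -> float:
    """Percentage by which model_error is lower than baseline_error."""
    return 100.0 * (baseline_error - model_error) / baseline_error


# Hypothetical unemployment rates (percent) and two sets of forecasts.
actual = [4.0, 5.0]
baseline_forecast = [3.0, 6.0]   # stands in for a baseline model
proposed_forecast = [3.6, 5.5]   # stands in for a proposed model

baseline_mape = mape(actual, baseline_forecast)
proposed_mape = mape(actual, proposed_forecast)
print(round(baseline_mape, 2), round(proposed_mape, 2))
print(round(relative_reduction(baseline_mape, proposed_mape), 1))
```

Under this reading, a 27.9% improvement would mean the proposed model's MAPE is 27.9% lower than the baseline's MAPE, not 27.9 percentage points lower.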

대규모 언어 모델 기반 한국어 휴지 예측 연구 (A Study on Korean Pause Prediction based Large Language Model)

  • 나정호;이정;나승훈;정정범;최맹식;이충희
    • 한국정보과학회 언어공학연구회:학술대회논문집(한글 및 한국어 정보처리) / 한국정보과학회언어공학연구회 2023년도 제35회 한글 및 한국어 정보처리 학술대회 / pp. 14-18 / 2023
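  • This study analyzes how pauses are commonly realized in Korean speech-text data and, on that basis, selects a dataset and proposes a model for general, standardized Korean pause prediction. To this end, we collected speech-text datasets containing recorded utterances of speakers such as voice actors with professional vocal training, preprocessed them by labeling pauses with a phoneme aligner such as MFA, and built a training dataset by selecting the pauses that appeared in common across the utterances of multiple speakers. Using this dataset, we fine-tuned KULLM, a large language model (LLM), and evaluated the pause prediction performance of the proposed model.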
