통합 검색 | Korea Science

딥러닝 사전학습 언어모델 기술 동향 (Recent R&D Trends for Pretrained Language Model)

임준호;김현기;김영길
- 전자통신동향분석
- /
- 제35권3호
- /
- pp.9-19
- /
- 2020
Recently, a technique for applying a deep learning language model pretrained from a large corpus to fine-tuning for each application task has been widely used as a language processing technology. The pretrained language model shows higher performance and satisfactory generalization performance than existing methods. This paper introduces the major research trends related to deep learning pretrained language models in the field of language processing. We describe in detail the motivations, models, learning methods, and results of the BERT language model that had significant influence on subsequent studies. Subsequently, we introduce the results of language model studies after BERT, focusing on SpanBERT, RoBERTa, ALBERT, BART, and ELECTRA. Finally, we introduce the KorBERT pretrained language model, which shows satisfactory performance in Korean language. In addition, we introduce techniques on how to apply the pretrained language model to Korean (agglutinative) language, which consists of a combination of content and functional morphemes, unlike English (refractive) language whose endings change depending on the application.
https://doi.org/10.22648/ETRI.2020.J.350302 인용 PDF

정보검색 기법과 동적 보간 계수를 이용한 N-gram 언어모델의 적응 (N- gram Adaptation Using Information Retrieval and Dynamic Interpolation Coefficient)

최준기;오영환
- 대한음성학회지:말소리
- /
- 제56호
- /
- pp.207-223
- /
- 2005
The goal of language model adaptation is to improve the background language model with a relatively small adaptation corpus. This study presents a language model adaptation technique where additional text data for the adaptation do not exist. We propose the information retrieval (IR) technique with N-gram language modeling to collect the adaptation corpus from baseline text data. We also propose to use a dynamic language model interpolation coefficient to combine the background language model and the adapted language model. The interpolation coefficient is estimated from the word hypotheses obtained by segmenting the input speech data reserved for held-out validation data. This allows the final adapted model to improve the performance of the background model consistently The proposed approach reduces the word error rate by $13.6\%$ relative to baseline 4-gram for two-hour broadcast news speech recognition.
PDF

Integration of WFST Language Model in Pre-trained Korean E2E ASR Model

Junseok Oh;Eunsoo Cho;Ji-Hwan Kim
- KSII Transactions on Internet and Information Systems (TIIS)
- /
- 제18권6호
- /
- pp.1692-1705
- /
- 2024
In this paper, we present a method that integrates a Grammar Transducer as an external language model to enhance the accuracy of the pre-trained Korean End-to-end (E2E) Automatic Speech Recognition (ASR) model. The E2E ASR model utilizes the Connectionist Temporal Classification (CTC) loss function to derive hypothesis sentences from input audio. However, this method reveals a limitation inherent in the CTC approach, as it fails to capture language information from transcript data directly. To overcome this limitation, we propose a fusion approach that combines a clause-level n-gram language model, transformed into a Weighted Finite-State Transducer (WFST), with the E2E ASR model. This approach enhances the model's accuracy and allows for domain adaptation using just additional text data, avoiding the need for further intensive training of the extensive pre-trained ASR model. This is particularly advantageous for Korean, characterized as a low-resource language, which confronts a significant challenge due to limited resources of speech data and available ASR models. Initially, we validate the efficacy of training the n-gram model at the clause-level by contrasting its inference accuracy with that of the E2E ASR model when merged with language models trained on smaller lexical units. We then demonstrate that our approach achieves enhanced domain adaptation accuracy compared to Shallow Fusion, a previously devised method for merging an external language model with an E2E ASR model without necessitating additional training.
https://doi.org/10.3837/tiis.2024.06.015 인용 PDF HTML

Research on Methods to Increase Recognition Rate of Korean Sign Language using Deep Learning

So-Young Kwon;Yong-Hwan Lee
- Journal of Platform Technology
- /
- 제12권1호
- /
- pp.3-11
- /
- 2024
Deaf people who use sign language as their first language sometimes have difficulty communicating because they do not know spoken Korean. Deaf people are also members of society, so we must support to create a society where everyone can live together. In this paper, we present a method to increase the recognition rate of Korean sign language using a CNN model. When the original image was used as input to the CNN model, the accuracy was 0.96, and when the image corresponding to the skin area in the YCbCr color space was used as input, the accuracy was 0.72. It was confirmed that inserting the original image itself would lead to better results. In other studies, the accuracy of the combined Conv1d and LSTM model was 0.92, and the accuracy of the AlexNet model was 0.92. The CNN model proposed in this paper is 0.96 and is proven to be helpful in recognizing Korean sign language.
PDF

문장음성인식을 위한 VCCV 기반의 효율적인 언어모델 (Efficient Language Model based on VCCV unit for Sentence Speech Recognition)

박선희;노용완;홍광석
- 대한전기학회:학술대회논문집
- /
- 대한전기학회 2003년도 학술회의 논문집 정보 및 제어부문 B
- /
- pp.836-839
- /
- 2003
In this paper, we implement a language model by a bigram and evaluate proper smoothing technique for unit of low perplexity. Word, morpheme, clause units are widely used as a language processing unit of the language model. We propose VCCV units which have more small vocabulary than morpheme and clauses units. We compare the VCCV units with the clause and the morpheme units using the perplexity. The most common metric for evaluating a language model is the probability that the model assigns the derivative measures of perplexity. Smoothing used to estimate probabilities when there are insufficient data to estimate probabilities accurately. In this paper, we constructed the N-grams of the VCCV units with low perplexity and tested the language model using Katz, Witten-Bell, absolute, modified Kneser-Ney smoothing and so on. In the experiment results, the modified Kneser-Ney smoothing is tested proper smoothing technique for VCCV units.
PDF

Style-Specific Language Model Adaptation using TF*IDF Similarity for Korean Conversational Speech Recognition

Park, Young-Hee;Chung, Min-Hwa
- The Journal of the Acoustical Society of Korea
- /
- 제23권2E호
- /
- pp.51-55
- /
- 2004
In this paper, we propose a style-specific language model adaptation scheme using n-gram based tf*idf similarity for Korean spontaneous speech recognition. Korean spontaneous speech shows especially different style-specific characteristics such as filled pauses, word omission, and contraction, which are related to function words and depend on preceding or following words. To reflect these style-specific characteristics and overcome insufficient data for training language model, we estimate in-domain dependent n-gram model by relevance weighting of out-of-domain text data according to their n-. gram based tf*idf similarity, in which in-domain language model include disfluency model. Recognition results show that n-gram based tf*idf similarity weighting effectively reflects style difference.
PDF KSCI

Simple and effective neural coreference resolution for Korean language

Park, Cheoneum;Lim, Joonho;Ryu, Jihee;Kim, Hyunki;Lee, Changki
- ETRI Journal
- /
- 제43권6호
- /
- pp.1038-1048
- /
- 2021
We propose an end-to-end neural coreference resolution for the Korean language that uses an attention mechanism to point to the same entity. Because Korean is a head-final language, we focused on a method that uses a pointer network based on the head. The key idea is to consider all nouns in the document as candidates based on the head-final characteristics of the Korean language and learn distributions over the referenced entity positions for each noun. Given the recent success of applications using bidirectional encoder representation from transformer (BERT) in natural language-processing tasks, we employed BERT in the proposed model to create word representations based on contextual information. The experimental results indicated that the proposed model achieved state-of-the-art performance in Korean language coreference resolution.
https://doi.org/10.4218/etrij.2020-0282 인용 PDF KSCI

금융권에 적용 가능한 금융특화언어모델 구축방안에 관한 연구 (A Study on the Construction of Financial-Specific Language Model Applicable to the Financial Institutions)

배재권
- 한국산업정보학회논문지
- /
- 제29권3호
- /
- pp.79-87
- /
- 2024
최근 텍스트분류, 감성분석, 질의응답 등의 자연어 처리를 위해서 사전학습언어모델(Pre-trained Language Model, PLM)의 중요성은 날로 강조되고 있다. 한국어 PLM은 범용적인 도메인의 자연어 처리에서 높은 성능을 보이나 금융, 제조, 법률, 의료 등의 특화된 도메인에서는 성능이 미약하다. 본 연구는 금융도메인 뿐만 아니라 범용도메인에서도 우수한 성능을 보이는 금융특화 언어모델의 구축을 위해 언어모델의 학습과정과 미세조정 방법을 제안하는 것이 주요 목표이다. 금융도메인 특화언어모델을 구축하는 과정은 (1) 금융데이터 수집 및 전처리, (2) PLM 또는 파운데이션 모델 등 모델 아키텍처 선정, (3) 도메인 데이터 학습과 인스트럭션 튜닝, (4) 모델 검증 및 평가, (5) 모델 배포 및 활용 등으로 구성된다. 이를 통해 금융도메인의 특성을 살린 사전학습 데이터 구축방안과 효율적인 LLM 훈련방법인 적응학습과 인스트럭션 튜닝기법을 제안하였다.
https://doi.org/10.9723/jksiis.2024.29.3.079 인용 PDF

초등학교 교사를 위한 격려 언어 모형 개발 (Development of the Encouraging Language Model for Elementary School Teachers)

선영운;오익수
- 초등상담연구
- /
- 제10권1호
- /
- pp.39-56
- /
- 2011
본 연구의 목적은 격려에 관한 선행 연구들로부터 격려의 방법에 대한 내용을 종합 정리하여 격려 언어의 요소를 도출하고, 이를 이론적 근거로 하여 초등학교 교사를 위한 격려 언어모형을 개발하는 데 있다. 이를 위해 먼저 격려에 관한 선행 연구들로부터 격려의 방법에 대한 내용들을 중심으로 자료를 수집하였다. 그 다음 수집된 자료들의 내용을 분석하여 자료에 나타난 주요 개념에 따라 범주화하였다. 이를 통해 17개의 하위 범주를 도출하였다. 이후 하위 범주를 대상으로 다시 범주화를 시도하여 5개의 상위 범주를 도출하였다. 5개의 상위범주는 존재 자체로서 아동의 가치 인정, 아동의 자질 신뢰, 실수에 대한 합리적 사고, 아동의 행동에 대한 비평가적 피드백, 아동의 긍정적 감정 반영이다. 이 5개의 상위 범주를 격려 언어의 요소로 설정하고, 이를 근거로 하여 학급 상황별로 사용할 수 있는 격려 언어 모형을 예문 중심으로 개발하였다.
PDF

Annotation of a Non-native English Speech Database by Korean Speakers

Kim, Jong-Mi
- 음성과학
- /
- 제9권1호
- /
- pp.111-135
- /
- 2002
An annotation model of a non-native speech database has been devised, wherein English is the target language and Korean is the native language. The proposed annotation model features overt transcription of predictable linguistic information in native speech by the dictionary entry and several predefined types of error specification found in native language transfer. The proposed model is, in that sense, different from other previously explored annotation models in the literature, most of which are based on native speech. The validity of the newly proposed model is revealed in its consistent annotation of 1) salient linguistic features of English, 2) contrastive linguistic features of English and Korean, 3) actual errors reported in the literature, and 4) the newly collected data in this study. The annotation method in this model adopts the widely accepted conventions, Speech Assessment Methods Phonetic Alphabet (SAMPA) and the TOnes and Break Indices (ToBI). In the proposed annotation model, SAMPA is exclusively employed for segmental transcription and ToBI for prosodic transcription. The annotation of non-native speech is used to assess speaking ability for English as Foreign Language (EFL) learners.
PDF

검색결과 1,570건 처리시간 0.028초

이메일무단수집거부

이용약관

제 1 장 총칙

제 2 장 이용계약의 체결

제 3 장 계약 당사자의 의무

제 4 장 서비스의 이용

제 5 장 계약 해지 및 이용 제한

제 6 장 손해배상 및 기타사항

자세히 찾기

이미지 검색 (β)