• Title/Summary/Keyword: Language Model


Recent R&D Trends for Pretrained Language Model (딥러닝 사전학습 언어모델 기술 동향)

  • Lim, J.H.;Kim, H.K.;Kim, Y.K.
    • Electronics and Telecommunications Trends
    • /
    • v.35 no.3
    • /
    • pp.9-19
    • /
    • 2020
  • Recently, the technique of pretraining a deep learning language model on a large corpus and then fine-tuning it for each application task has come into wide use in language processing. Pretrained language models show higher performance and better generalization than existing methods. This paper introduces the major research trends related to deep learning pretrained language models in the field of language processing. We describe in detail the motivation, model, learning method, and results of the BERT language model, which had a significant influence on subsequent studies. We then introduce the results of language model studies after BERT, focusing on SpanBERT, RoBERTa, ALBERT, BART, and ELECTRA. Finally, we introduce the KorBERT pretrained language model, which shows satisfactory performance on Korean. In addition, we introduce techniques for applying pretrained language models to Korean, an agglutinative language whose words combine content and functional morphemes, unlike English, an inflectional language whose word endings change with usage.
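The pretrain-then-fine-tune workflow summarized above can be illustrated with a minimal sketch. The snippet is not KorBERT itself (which is distributed separately by ETRI); it assumes the Hugging Face transformers library, PyTorch, and the public bert-base-multilingual-cased checkpoint as stand-ins, with a toy two-example sentiment batch.

```python
# Minimal sketch of fine-tuning a pretrained language model on a downstream task.
# Assumptions: `transformers` and `torch` installed; `bert-base-multilingual-cased`
# stands in for a Korean model such as KorBERT; texts/labels are toy data.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-multilingual-cased", num_labels=2  # e.g., a binary sentiment task
)

texts = ["이 영화 정말 재미있어요", "서비스가 너무 별로였다"]   # toy labeled batch
labels = torch.tensor([1, 0])

batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

model.train()
outputs = model(**batch, labels=labels)   # cross-entropy loss on the classification head
outputs.loss.backward()                   # gradients flow into all pretrained weights
optimizer.step()
```

In practice this single step would sit inside a normal training loop over the task data; the point is only that the pretrained weights are reused and updated rather than trained from scratch.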

N-gram Adaptation Using Information Retrieval and Dynamic Interpolation Coefficient (정보검색 기법과 동적 보간 계수를 이용한 N-gram 언어모델의 적응)

  • Choi Joon Ki;Oh Yung-Hwan
    • MALSORI
    • /
    • no.56
    • /
    • pp.207-223
    • /
    • 2005
  • The goal of language model adaptation is to improve a background language model with a relatively small adaptation corpus. This study presents a language model adaptation technique for the case where no additional text data are available for adaptation. We propose using an information retrieval (IR) technique with N-gram language modeling to collect an adaptation corpus from the baseline text data. We also propose using a dynamic language model interpolation coefficient to combine the background language model and the adapted language model. The interpolation coefficient is estimated from the word hypotheses obtained by segmenting the input speech data reserved as held-out validation data. This allows the final adapted model to consistently improve on the background model. The proposed approach reduces the word error rate by 13.6% relative to the baseline 4-gram model on two hours of broadcast news speech recognition.
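A minimal sketch of the interpolation idea described in this abstract follows, assuming two pre-estimated N-gram probability functions; choosing the coefficient by maximizing held-out log-likelihood over a grid is a simplifying stand-in for the paper's dynamic estimation from recognizer word hypotheses.

```python
# Sketch: combine a background and an adapted n-gram LM with an interpolation
# coefficient lam tuned on held-out (word, history) pairs.
# p_bg and p_ad are assumed callables returning P(word | history).
import math

def interpolate(p_bg, p_ad, lam):
    """Return an interpolated probability function."""
    return lambda word, history: lam * p_ad(word, history) + (1.0 - lam) * p_bg(word, history)

def choose_coefficient(p_bg, p_ad, heldout, grid=None):
    """Pick lam maximizing held-out log-likelihood (simplified stand-in for the
    paper's dynamic estimation from word hypotheses)."""
    grid = grid or [i / 20 for i in range(21)]          # 0.0, 0.05, ..., 1.0
    best_lam, best_ll = 0.0, float("-inf")
    for lam in grid:
        p = interpolate(p_bg, p_ad, lam)
        ll = sum(math.log(max(p(w, h), 1e-12)) for w, h in heldout)
        if ll > best_ll:
            best_lam, best_ll = lam, ll
    return best_lam
```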

Dependency Structure Applied to Language Modeling for Information Retrieval

  • Lee, Chang-Ki;Lee, Gary Geun-Bae;Jang, Myung-Gil
    • ETRI Journal
    • /
    • v.28 no.3
    • /
    • pp.337-346
    • /
    • 2006
  • In this paper, we propose a new language model, namely, a dependency structure language model, for information retrieval to compensate for the weaknesses of unigram and bigram language models. The dependency structure language model is based on the first-order dependency model and the dependency parse tree generated by a linguistic parser, so long-distance dependencies can be naturally captured. We carried out extensive experiments to verify the proposed model; the dependency structure model outperforms recently proposed language models and the Okapi BM25 method, and the dependency structure is more effective than unigrams and bigrams in language modeling for information retrieval.
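As a rough, non-authoritative sketch of how dependency pairs can enter a retrieval language model, the snippet below mixes a Dirichlet-smoothed unigram score with probabilities of (head, modifier) pairs produced by some parser; the estimation details and back-off constants are illustrative assumptions, not the paper's exact formulation.

```python
# Rough sketch: query-likelihood scoring with unigram terms plus dependency pairs.
import math
from collections import Counter

def unigram_prob(term, doc_terms, coll_counts, coll_size, mu=2000):
    """Dirichlet-smoothed P(term | document)."""
    p_c = coll_counts.get(term, 0) / max(coll_size, 1)
    return (doc_terms[term] + mu * p_c) / (sum(doc_terms.values()) + mu)

def score(query_terms, query_deps, doc_tokens, doc_deps, coll_counts, coll_size, lam=0.7):
    """Log-score = unigram part + dependency-pair part.
    query_deps / doc_deps are (head, modifier) pairs from a dependency parser."""
    doc_terms, doc_pairs = Counter(doc_tokens), Counter(doc_deps)
    s = sum(math.log(unigram_prob(t, doc_terms, coll_counts, coll_size))
            for t in query_terms)
    for pair in query_deps:
        p_pair = (doc_pairs[pair] + 0.5) / (sum(doc_pairs.values()) + 1.0)
        s += math.log(lam * p_pair + (1.0 - lam))   # soft back-off when the pair is absent
    return s
```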

Examining Generalizability of Kang's (1999) Model of Structural Relationships between ESL Learning Strategy Use and Language Proficiency

  • Kang, Sung-Woo
    • English Language & Literature Teaching
    • /
    • v.7 no.2
    • /
    • pp.55-75
    • /
    • 2002
  • The present study examined whether Kang's (1999) model of the relationships between language learning strategy use and language proficiency for Asian students could be applied to a more heterogeneous group. In Kang's study, information on the language learning strategies of 957 foreign students learning English as a second language in American colleges was collected through a questionnaire, and the subjects' language proficiency was measured with the Institutional Testing Program TOEFL (Test of English as a Foreign Language). This study analyzed the same data without the limitation of cultural identity. Structural equation modeling was used to model the relationships between strategy use and language proficiency, and the resulting model was descriptively compared with Kang's (1999) model for Asian students. The overall pattern of the relationship paths varied very little across the two models, indicating that the generalizability of Kang's (1999) model may extend beyond the group originally examined.

Improved Statistical Language Model for Context-sensitive Spelling Error Candidates (문맥의존 철자오류 후보 생성을 위한 통계적 언어모형 개선)

  • Lee, Jung-Hun;Kim, Minho;Kwon, Hyuk-Chul
    • Journal of Korea Multimedia Society
    • /
    • v.20 no.2
    • /
    • pp.371-381
    • /
    • 2017
  • The performance of statistical context-sensitive spelling error correction depends on the quality and quantity of the data used for the statistical language model. In general, model quality grows with the amount of data; however, as the amount of data increases, processing becomes slower and storage requirements grow. We suggest an improved statistical language model to solve this problem and propose an effective spelling-error candidate generation method based on the new model. The proposed statistical model and the correction method based on it improve both the performance of spelling error correction and the processing speed.
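The abstract does not spell out the improved model itself, so the sketch below only illustrates the general mechanism of ranking context-sensitive correction candidates with bigram statistics; the add-k smoothing and count dictionaries are assumptions made for the sake of a runnable example, not the paper's method.

```python
# Illustrative sketch (not the paper's method): rank correction candidates for a
# target word by an add-k smoothed bigram score over its left and right context.
import math

def bigram_logprob(prev, word, bigram_counts, unigram_counts, vocab_size, k=0.5):
    """Add-k smoothed log P(word | prev)."""
    num = bigram_counts.get((prev, word), 0) + k
    den = unigram_counts.get(prev, 0) + k * vocab_size
    return math.log(num / den)

def rank_candidates(left, right, candidates, bigram_counts, unigram_counts, vocab_size):
    """Score each candidate c by log P(c | left) + log P(right | c)."""
    return sorted(
        candidates,
        key=lambda c: (bigram_logprob(left, c, bigram_counts, unigram_counts, vocab_size)
                       + bigram_logprob(c, right, bigram_counts, unigram_counts, vocab_size)),
        reverse=True,
    )
```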

Design of Teaching·Learning Model for Programming Language Education (프로그래밍 언어 교육을 위한 교수·학습 모델 설계)

  • Kang, Hwan Soo
    • Journal of Digital Contents Society
    • /
    • v.13 no.4
    • /
    • pp.517-524
    • /
    • 2012
  • This paper deals with the design of a teaching-learning model for programming language education. Courses related to programming languages are offered at universities across many academic majors. A variety of programming languages have been developed, and many integrated development environments have been built to help users write programs easily. Nevertheless, learning a programming language is still difficult for many novice learners, and teaching an introductory programming course is likewise difficult for many teachers. In this paper, we design a teaching-learning model for programming language education based on scholastic achievement and blended learning. The model was applied to a course offered in the second semester of 2011; according to the course evaluation results, it proved effective for novice learners.

Efficient Language Model based on VCCV unit for Sentence Speech Recognition (문장음성인식을 위한 VCCV 기반의 효율적인 언어모델)

  • Park, Seon-Hui;No, Yong-Wan;Hong, Gwang-Seok
    • Proceedings of the KIEE Conference
    • /
    • 2003.11c
    • /
    • pp.836-839
    • /
    • 2003
  • In this paper, we implement a bigram language model and evaluate suitable smoothing techniques for a unit with low perplexity. Words, morphemes, and clauses are widely used as processing units in language modeling. We propose VCCV units, which have a smaller vocabulary than morpheme or clause units, and compare them with clause and morpheme units in terms of perplexity. The most common metric for evaluating a language model is the probability the model assigns to test data, together with the derived measure of perplexity. Smoothing is used to estimate probabilities when there are insufficient data to estimate them accurately. We constructed N-grams of VCCV units with low perplexity and tested the language model using Katz, Witten-Bell, absolute, and modified Kneser-Ney smoothing. In the experiments, modified Kneser-Ney smoothing proved to be the appropriate smoothing technique for VCCV units.
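A hedged sketch of the kind of smoothing comparison this abstract describes, using NLTK's language-model package with toy syllable-like tokens in place of real VCCV units; note that NLTK's KneserNeyInterpolated implements interpolated Kneser-Ney rather than the modified variant, so this only approximates the paper's setup.

```python
# Sketch: compare smoothing methods by bigram perplexity on toy token sequences.
# Assumes NLTK is installed (pip install nltk).
from nltk.lm import KneserNeyInterpolated, WittenBellInterpolated
from nltk.lm.preprocessing import padded_everygram_pipeline, pad_both_ends
from nltk.util import ngrams

order = 2
train_sents = [["na", "neun", "hak", "gyo", "e"], ["hak", "gyo", "e", "gan", "da"]]
test_sent = ["na", "neun", "hak", "gyo"]

for name, cls in [("Witten-Bell", WittenBellInterpolated),
                  ("Kneser-Ney (interpolated)", KneserNeyInterpolated)]:
    train, vocab = padded_everygram_pipeline(order, train_sents)  # fresh generators per model
    lm = cls(order)
    lm.fit(train, vocab)
    test_bigrams = list(ngrams(pad_both_ends(test_sent, n=order), order))
    print(name, "perplexity:", lm.perplexity(test_bigrams))
```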

Towards a small language model powered chain-of-reasoning for open-domain question answering

  • Jihyeon Roh;Minho Kim;Kyoungman Bae
    • ETRI Journal
    • /
    • v.46 no.1
    • /
    • pp.11-21
    • /
    • 2024
  • We focus on open-domain question-answering tasks that involve a chain-of-reasoning, which are primarily implemented using large language models. With an emphasis on cost-effectiveness, we designed EffiChainQA, an architecture centered on the use of small language models. We employed a retrieval-based language model to address the limitations of large language models, such as the hallucination issue and the lack of updated knowledge. To enhance reasoning capabilities, we introduced a question decomposer that leverages a generative language model and serves as a key component in the chain-of-reasoning process. To generate training data for our question decomposer, we leveraged ChatGPT, which is known for its data augmentation ability. Comprehensive experiments were conducted using the HotpotQA dataset. Our method outperformed several established approaches, including the Chain-of-Thoughts approach, which is based on large language models. Moreover, our results are on par with those of state-of-the-art Retrieve-then-Read methods that utilize large language models.
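The decompose-retrieve-read loop described in this abstract can be pictured with a small sketch; the flan-t5-small checkpoint, the prompt wording, and the stub retriever below are placeholders chosen for the example, not the EffiChainQA components.

```python
# Illustrative sketch of a chain-of-reasoning QA loop driven by a small language model.
# The checkpoint, prompts, and retriever are placeholders, not the paper's modules.
from transformers import pipeline

decomposer = pipeline("text2text-generation", model="google/flan-t5-small")
reader = pipeline("text2text-generation", model="google/flan-t5-small")

def retrieve(query):
    """Stub retriever; a real system would query a passage index (e.g., BM25 or dense)."""
    return "Placeholder passage relevant to: " + query

def answer(question, max_hops=2):
    context = ""
    for _ in range(max_hops):
        sub_q = decomposer(f"Write the next sub-question needed to answer: {question}\n"
                           f"Known so far: {context}")[0]["generated_text"]
        passage = retrieve(sub_q)
        partial = reader(f"Passage: {passage}\nQuestion: {sub_q}\nAnswer:")[0]["generated_text"]
        context += f" {sub_q} -> {partial}."
    return reader(f"Question: {question}\nReasoning so far:{context}\nFinal answer:")[0]["generated_text"]

print(answer("Which city hosted the Olympics in the year the Berlin Wall fell?"))
```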

Development of the Encouraging Language Model for Elementary School Teachers (초등학교 교사를 위한 격려 언어 모형 개발)

  • Seon, Young-Woon;Oh, Ik-Soo
    • The Korean Journal of Elementary Counseling
    • /
    • v.10 no.1
    • /
    • pp.39-56
    • /
    • 2011
  • The purpose of this study is to derive the elements of encouraging language from the literature on encouragement and to develop an encouraging language model for elementary school teachers. To achieve this, the literature on methods of encouragement was first collected and then categorized according to the main concept each work contained. As a result, 5 categories and 17 subcategories were derived. The 5 categories were valuing a child as a human being, trusting a child, thinking rationally about a child's mistakes, giving non-evaluative feedback on a child's behavior, and reflecting a child's positive feelings. These 5 categories were established as the elements of encouraging language, and the encouraging language model was developed on the basis of them. The model consists of examples of encouraging language for the various classroom situations that elementary school teachers often confront, showing examples appropriate to each situation, and every example was constructed on the basis of the elements of encouraging language.

Subword Neural Language Generation with Unlikelihood Training

  • Iqbal, Salahuddin Muhammad;Kang, Dae-Ki
    • International Journal of Internet, Broadcasting and Communication
    • /
    • v.12 no.2
    • /
    • pp.45-50
    • /
    • 2020
  • A neural language model is commonly trained with a likelihood loss so that it can learn the sequences of human text. State-of-the-art results have been achieved in various language generation tasks, e.g., text summarization, dialogue response generation, and text generation, by utilizing the language model's next-token output probabilities. Monotonous and repetitive outputs are a well-known problem of such models, yet only a few solutions have been proposed to address it. Several decoding techniques have been proposed to suppress repetitive tokens. Unlikelihood training approaches this problem by penalizing the probabilities of candidate tokens that have already been seen in previous steps. While this method successfully produces less repetitive text, it has a large memory footprint because training requires a large vocabulary. We effectively reduce the memory footprint by encoding words as sequences of subword units. Finally, we report results competitive with token-level unlikelihood training in several automatic evaluations compared to the previous work.
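As a hedged illustration of the token-level unlikelihood objective this paper builds on, the PyTorch sketch below adds a penalty on previously generated target tokens to the usual MLE loss; the shapes, masking, and normalization are simplifying assumptions rather than the paper's exact training code.

```python
# Sketch: MLE loss plus a token-level unlikelihood penalty on previously seen tokens.
# logits: (batch, seq_len, vocab); targets: (batch, seq_len); pad_id marks padding.
import torch
import torch.nn.functional as F

def unlikelihood_loss(logits, targets, alpha=1.0, pad_id=0):
    log_probs = F.log_softmax(logits, dim=-1)
    probs = log_probs.exp()

    # Standard maximum-likelihood term.
    mle = F.nll_loss(log_probs.transpose(1, 2), targets, ignore_index=pad_id)

    # Unlikelihood term: -log(1 - p(c)) for each previously seen target token c.
    batch, seq_len, _ = probs.shape
    ul = torch.zeros((), device=logits.device)
    for t in range(1, seq_len):
        prev = targets[:, :t]                                    # candidates: earlier targets
        p_prev = probs[:, t, :].gather(1, prev)                  # p(candidate) at step t
        mask = (prev != pad_id) & (prev != targets[:, t:t + 1])  # skip padding and the true token
        ul = ul + (-torch.log(1.0 - p_prev + 1e-6) * mask).sum()
    ul = ul / max(batch * seq_len, 1)

    return mle + alpha * ul
```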