• Title/Abstract/Keywords: language model


Annotation of a Non-native English Speech Database by Korean Speakers

  • Kim, Jong-Mi
    • 음성과학
    • /
    • Vol. 9, No. 1
    • /
    • pp.111-135
    • /
    • 2002
  • An annotation model of a non-native speech database has been devised, wherein English is the target language and Korean is the native language. The proposed annotation model features overt transcription of predictable linguistic information in native speech by the dictionary entry and several predefined types of error specification found in native language transfer. In that sense, the proposed model differs from annotation models previously explored in the literature, most of which are based on native speech. The validity of the proposed model is shown by its consistent annotation of 1) salient linguistic features of English, 2) contrastive linguistic features of English and Korean, 3) actual errors reported in the literature, and 4) the data newly collected in this study. The annotation method adopts two widely accepted conventions, the Speech Assessment Methods Phonetic Alphabet (SAMPA) and the TOnes and Break Indices (ToBI). In the proposed model, SAMPA is used exclusively for segmental transcription and ToBI for prosodic transcription. The annotation of non-native speech is used to assess the speaking ability of English as a Foreign Language (EFL) learners.

Named entity recognition using transfer learning and small human- and meta-pseudo-labeled datasets

  • Kyoungman Bae;Joon-Ho Lim
    • ETRI Journal
    • /
    • Vol. 46, No. 1
    • /
    • pp.59-70
    • /
    • 2024
  • We introduce a high-performance named entity recognition (NER) model for written and spoken language. To overcome challenges related to labeled data scarcity and domain shifts, we use transfer learning to leverage our previously developed KorBERT as the base model. We also adopt a meta-pseudo-label method using a teacher/student framework with labeled and unlabeled data. Our model presents two modifications. First, the student model is updated with an average loss from both human- and pseudo-labeled data. Second, the influence of noisy pseudo-labeled data is mitigated by considering feedback scores and updating the teacher model only when below a threshold (0.0005). We achieve the target NER performance in the spoken language domain and improve that in the written language domain by proposing a straightforward rollback method that reverts to the best model based on scarce human-labeled data. Further improvement is achieved by adjusting the label vector weights in the named entity dictionary.
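The two modifications described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the feedback score is taken as a given number because the abstract does not define how it is computed, and a plain `<` comparison against the quoted threshold is an assumption.

```python
# Hypothetical sketch of the two update rules described in the abstract.
# Only the threshold value (0.0005) comes from the source; everything else
# (names, the exact comparison) is an assumption for illustration.

FEEDBACK_THRESHOLD = 0.0005  # value quoted in the abstract


def student_loss(human_loss: float, pseudo_loss: float) -> float:
    """Average the loss on human-labeled data with the loss on pseudo-labeled data."""
    return 0.5 * (human_loss + pseudo_loss)


def should_update_teacher(feedback_score: float,
                          threshold: float = FEEDBACK_THRESHOLD) -> bool:
    """Allow a teacher update only when the feedback score is below the threshold."""
    return feedback_score < threshold


def training_step(human_loss: float, pseudo_loss: float, feedback_score: float) -> dict:
    """Summarize one step: the student's combined loss and the teacher gate decision."""
    return {
        "student_loss": student_loss(human_loss, pseudo_loss),
        "update_teacher": should_update_teacher(feedback_score),
    }
```

In a real training loop the two losses would come from the student's forward passes on a human-labeled batch and a teacher-labeled batch, and the gate would decide whether the teacher's optimizer step is applied.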

Teaching-Learning Model for Programming Language Learning with Two-Step Feedback

  • Kwon, Boseob
    • 한국컴퓨터정보학회논문지
    • /
    • Vol. 22, No. 8
    • /
    • pp.101-106
    • /
    • 2017
  • In this paper, we propose a new teaching-learning model with two-step feedback for programming language learning, which is a basic prerequisite for programming. Programming education aims to improve problem-solving skills and thinking through the experience of solving problems by programming. To program, the learner must know how to work with the computer and what to do with it. This requires forming concrete ideas and describing them accurately in a programming language. Most recent studies have focused on the effects of programming learning and have not examined the effects of teaching the language itself. Therefore, this study presents a teaching-learning model for programming language education, applies it in the field, and compares the results with the existing instructional-teaching model.

Relationships between the Use of ESL Learning Strategies and English Language Proficiency of Asian Students

  • Kang, Sung-Woo
    • 영어어문교육
    • /
    • No. 5
    • /
    • pp.1-25
    • /
    • 1999
  • The objective of the present study was to model the relationships between language learning strategy use and language proficiency among Asian (Korean, Japanese, and Taiwanese) students studying English in the United States. The instruments were a language learning strategy questionnaire and the Institutional Testing Program Test of English as a Foreign Language (ITP TOEFL). Structural equation modeling was used to model the relationships between language learning strategies and language proficiency. The study found only weak relationships between the two. Only 13% and 15% of the variance of the listening and grammar/reading factors was explained by the language learning strategies. The metacognitive strategies appeared not to have direct relationships to the language skill factors, as was found in other studies (Purpura, 1996, 1997). The effects of the social and affective strategies were very small: in combination, they could account for only about 1% and 4% of the variance of the listening and grammar/reading factors.

Language Modeling Approaches to Information Retrieval

  • Banerjee, Protima;Han, Hyo-Il
    • Journal of Computing Science and Engineering
    • /
    • Vol. 3, No. 3
    • /
    • pp.143-164
    • /
    • 2009
  • This article surveys recent research in the area of language modeling (sometimes called statistical language modeling) approaches to information retrieval. Language modeling is a formal probabilistic retrieval framework with roots in speech recognition and natural language processing. The underlying assumption of language modeling is that human language generation is a random process; the goal is to model that process via a generative statistical model. In this article, we discuss current research in the application of language modeling to information retrieval, the role of semantics in the language modeling framework, cluster-based language models, use of language modeling for XML retrieval and future trends.
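As an illustration of the generative view described in this abstract, the sketch below ranks documents by the probability that a unigram language model of each document generates the query. Query likelihood with Jelinek-Mercer smoothing is one common formulation in this area; the abstract does not say which formulations the survey covers, so the choice of technique, the function names, and the mixing weight `lam` are all assumptions.

```python
import math
from collections import Counter
from typing import List, Sequence


def query_log_likelihood(query: Sequence[str], doc: Sequence[str],
                         collection_counts: Counter, collection_len: int,
                         lam: float = 0.5) -> float:
    """Return log P(query | doc) under a smoothed unigram document model.

    Each term probability mixes the document estimate with the collection
    estimate (Jelinek-Mercer smoothing); lam weights the collection model.
    """
    doc_counts = Counter(doc)
    doc_len = len(doc)
    score = 0.0
    for term in query:
        p_doc = doc_counts[term] / doc_len if doc_len else 0.0
        p_coll = collection_counts[term] / collection_len
        p = (1.0 - lam) * p_doc + lam * p_coll
        if p == 0.0:
            # The term occurs nowhere in the collection.
            return float("-inf")
        score += math.log(p)
    return score


def rank(query: Sequence[str], docs: List[Sequence[str]], lam: float = 0.5) -> List[int]:
    """Return document indices ordered from most to least likely to generate the query."""
    collection_counts = Counter(term for doc in docs for term in doc)
    collection_len = sum(len(doc) for doc in docs)
    scored = [
        (query_log_likelihood(query, doc, collection_counts, collection_len, lam), i)
        for i, doc in enumerate(docs)
    ]
    # Sort by descending score, breaking ties by document index.
    return [i for _, i in sorted(scored, key=lambda pair: (-pair[0], pair[1]))]
```

Smoothing matters here because an unsmoothed document model assigns zero probability to any query containing a term the document lacks, which would eliminate otherwise relevant documents.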

Design of GeoPhoto Contents Markup Language for u-GIS Contents

  • 박장유;남광우;진희채
    • 한국공간정보시스템학회 논문지
    • /
    • Vol. 11, No. 1
    • /
    • pp.35-42
    • /
    • 2009
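  • This paper proposes the GeoPhoto content markup language, which enables the creation of u-GIS content from spatial photographs. The proposal comprises a GeoPhoto content model and a markup language that support content for displaying and using personalized spatial photo imagery across different platforms. So that a variety of spatial photo content can be produced, the GeoPhoto content language builds on the GeoPhoto, CubicPhoto, and SequenceGeoPhoto types and supports the fused representation of widely used geographic, location, and photo information. It also supports defining a variety of GeoPhoto content models and the operations among these contents. As operations for representing GeoPhoto content, the paper proposes the Annotation, Enlargement, and Overlay operations.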

Development of a Korean chatbot system that enables emotional communication with users in real time

  • 백성대;이민호
    • 센서학회지
    • /
    • Vol. 30, No. 6
    • /
    • pp.429-435
    • /
    • 2021
  • In this study, the generation of emotional dialogue was investigated as part of developing a robot's natural language understanding and emotional dialogue processing. Unlike the English datasets that are the mainstay of natural language processing, Korean datasets have several shortcomings. Because Korean language resources are insufficient, a Korean dataset must be handled in detail, with particular attention to the unique characteristics of the language. The study is therefore based on a specific Korean dataset consisting of conversations on emotional topics. A model was then built that learns to extract continuous dialogue features from a pre-trained language model and generates sentences while maintaining the context of the dialogue. To validate the model, a chatbot system was implemented, and meaningful results were obtained from experiments with externally recruited subjects. The proposed model was influenced by the dataset, whose conversation topic was consultation, and facilitated free and emotional communication with users, as if they were consulting with a chatbot. The results were analyzed to identify and explain the advantages and disadvantages of the current model. Finally, the areas for future study needed to reach the ultimate research goal are discussed.

Burmese Sentiment Analysis Based on Transfer Learning

  • Mao, Cunli;Man, Zhibo;Yu, Zhengtao;Wu, Xia;Liang, Haoyuan
    • Journal of Information Processing Systems
    • /
    • Vol. 18, No. 4
    • /
    • pp.535-548
    • /
    • 2022
  • Using a resource-rich language to classify sentiments in a language with few resources is a popular subject of research in natural language processing. Burmese is a low-resource language. In light of the scarcity of labeled training data for sentiment classification in Burmese, we propose a transfer learning method for sentiment analysis that applies feature transfer from sentiments in English. This method generates a cross-language word-embedding representation of Burmese vocabulary to map Burmese text to the semantic space of English text. A model to classify sentiments in English is then pre-trained using a convolutional neural network and an attention mechanism, where the network shares the model for sentiment analysis of English. The parameters of the network layer are used to learn the cross-language features of the sentiments, which are then transferred to the model to classify sentiments in Burmese. Finally, the model is tuned using the labeled Burmese data. The experimental results show that the proposed method can significantly improve the classification of sentiments in Burmese compared with a model trained using only a Burmese corpus.

Performance Analysis Using a DNN-Based Sign Language Translation Model

  • 정민재;노승환;홍준기
    • 한국빅데이터학회지
    • /
    • Vol. 9, No. 1
    • /
    • pp.187-196
    • /
    • 2024
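  • In this study, we propose a DNN (Deep Neural Network)-based sign language translation model that drastically reduces training time by compressing sign language coordinates, and we compare accuracy and model training time with and without coordinate compression. When sign language was translated with the proposed model, accuracy decreased by about 5.9% after compressing the sign language video compared with before compression, whereas training time decreased by 56.57%, confirming a large gain in training time relative to the loss in translation accuracy.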

A Spatial Structural Query Language-G/SQL

  • Fang, Yu;Chu, Fang;Xinming, Tang
    • 대한원격탐사학회:학술대회논문집
    • /
    • 대한원격탐사학회 2002 Proceedings of International Symposium on Remote Sensing
    • /
    • pp.860-879
    • /
    • 2002
  • Traditionally, Geographical Information Systems can process spatial data only in a procedure-oriented way, and the data cannot be treated in an integrated manner. This limits the development of spatial data applications. A new and promising way to solve this problem is a spatial structural query language, which extends SQL and provides integrated access to spatial data. In this paper, the theory of spatial structural query languages is discussed, and a new geographical data model based on the concepts and data model in OGIS is introduced. Based on this model, we implemented a spatial structural query language, G/SQL. Drawing on the 9-Intersection Model, G/SQL provides a set of topological relational predicates and spatial functions for GIS application development. We have successfully developed a Web-based GIS system, WebGIS, using G/SQL. Experience shows that the spatial operators G/SQL offers are complete and easy to use. The BNF representation of the G/SQL syntax is included in this paper.

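The 9-Intersection Model mentioned in this abstract characterizes the topological relation between two objects by testing whether the interior, boundary, and exterior of one intersect the interior, boundary, and exterior of the other. The sketch below is a simplified illustration on closed 1D intervals rather than 2D geometries, and it is not G/SQL syntax: the function and predicate names are hypothetical, and the abstract does not list the predicates G/SQL actually provides.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # closed interval [lo, hi] on the real line, lo < hi


def nine_intersection(a: Interval, b: Interval) -> List[List[bool]]:
    """Return the 3x3 matrix of non-empty intersections between a and b.

    Rows are the interior, boundary, and exterior of a; columns are the
    interior, boundary, and exterior of b. For [lo, hi] the interior is the
    open interval (lo, hi), the boundary is {lo, hi}, and the exterior is
    everything outside [lo, hi].
    """
    (a1, a2), (b1, b2) = a, b

    def inside(x: float, lo: float, hi: float) -> bool:
        return lo < x < hi

    def outside(x: float, lo: float, hi: float) -> bool:
        return x < lo or x > hi

    return [
        [max(a1, b1) < min(a2, b2),
         inside(b1, a1, a2) or inside(b2, a1, a2),
         a1 < b1 or a2 > b2],
        [inside(a1, b1, b2) or inside(a2, b1, b2),
         a1 == b1 or a1 == b2 or a2 == b1 or a2 == b2,
         outside(a1, b1, b2) or outside(a2, b1, b2)],
        [b1 < a1 or b2 > a2,
         outside(b1, a1, a2) or outside(b2, a1, a2),
         True],  # two bounded intervals always share some exterior
    ]


def disjoint(a: Interval, b: Interval) -> bool:
    """True when neither interiors nor boundaries intersect."""
    m = nine_intersection(a, b)
    return not (m[0][0] or m[0][1] or m[1][0] or m[1][1])


def touches(a: Interval, b: Interval) -> bool:
    """True when the boundaries meet but the interiors do not."""
    m = nine_intersection(a, b)
    return not m[0][0] and m[1][1]


def equals(a: Interval, b: Interval) -> bool:
    """True when interiors and boundaries coincide and nothing crosses over."""
    return nine_intersection(a, b) == [
        [True, False, False],
        [False, True, False],
        [False, False, True],
    ]
```

Each named predicate is a pattern over the same matrix, which is why a query language can expose a family of topological predicates from one underlying model.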