• Title/Summary/Keyword: Language model

Annotation of a Non-native English Speech Database by Korean Speakers

  • Kim, Jong-Mi
    • Speech Sciences / v.9 no.1 / pp.111-135 / 2002
  • An annotation model of a non-native speech database has been devised, wherein English is the target language and Korean is the native language. The proposed annotation model features overt transcription of predictable linguistic information in native speech by the dictionary entry and several predefined types of error specification found in native language transfer. The proposed model is, in that sense, different from previously explored annotation models in the literature, most of which are based on native speech. The validity of the proposed model is shown by its consistent annotation of 1) salient linguistic features of English, 2) contrastive linguistic features of English and Korean, 3) actual errors reported in the literature, and 4) the newly collected data in this study. The annotation method in this model adopts two widely accepted conventions, the Speech Assessment Methods Phonetic Alphabet (SAMPA) and the TOnes and Break Indices (ToBI). In the proposed annotation model, SAMPA is employed exclusively for segmental transcription and ToBI for prosodic transcription. The annotation of non-native speech is used to assess the speaking ability of English as a Foreign Language (EFL) learners.

Named entity recognition using transfer learning and small human- and meta-pseudo-labeled datasets

  • Kyoungman Bae;Joon-Ho Lim
    • ETRI Journal / v.46 no.1 / pp.59-70 / 2024
  • We introduce a high-performance named entity recognition (NER) model for written and spoken language. To overcome challenges related to labeled data scarcity and domain shifts, we use transfer learning to leverage our previously developed KorBERT as the base model. We also adopt a meta-pseudo-label method using a teacher/student framework with labeled and unlabeled data. Our model presents two modifications. First, the student model is updated with an average loss from both human- and pseudo-labeled data. Second, the influence of noisy pseudo-labeled data is mitigated by considering feedback scores and updating the teacher model only when below a threshold (0.0005). We achieve the target NER performance in the spoken language domain and improve that in the written language domain by proposing a straightforward rollback method that reverts to the best model based on scarce human-labeled data. Further improvement is achieved by adjusting the label vector weights in the named entity dictionary.
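
The two modifications and the rollback step described in this abstract can be sketched as plain control logic. This is a minimal illustration, not the authors' implementation: the function and class names are hypothetical, the equal weighting of the two losses is an assumption about what "average loss" means, and the abstract does not say exactly which quantity is compared with the threshold, so the sketch assumes it is the feedback score.

```python
from dataclasses import dataclass
from typing import Any

FEEDBACK_THRESHOLD = 0.0005  # threshold value quoted in the abstract


def student_loss(human_loss: float, pseudo_loss: float) -> float:
    """Average the losses from human-labeled and pseudo-labeled batches.

    Assumption: a simple unweighted mean of the two loss values.
    """
    return (human_loss + pseudo_loss) / 2.0


def should_update_teacher(feedback_score: float,
                          threshold: float = FEEDBACK_THRESHOLD) -> bool:
    """Gate the teacher update on the feedback score.

    Assumption: the teacher is updated only when the feedback score
    is below the threshold, to limit the effect of noisy pseudo-labels.
    """
    return feedback_score < threshold


@dataclass
class BestCheckpoint:
    """Hypothetical rollback helper: remembers the best-scoring model state."""
    best_score: float = float("-inf")
    best_state: Any = None

    def observe(self, score: float, state: Any) -> Any:
        """Return the state to continue from.

        If `score` (e.g. accuracy on the scarce human-labeled data) improves
        on the best so far, keep `state`; otherwise roll back to the best one.
        """
        if score > self.best_score:
            self.best_score, self.best_state = score, state
            return state
        return self.best_state
```

In a real training loop, `state` would be a copy of the model parameters and `score` an evaluation metric on held-out human-labeled data; both are left abstract here.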

Teaching-Learning Model for Programming Language Learning with Two-Step Feedback

  • Kwon, Boseob
    • Journal of the Korea Society of Computer and Information / v.22 no.8 / pp.101-106 / 2017
  • In this paper, we propose a new teaching-learning model with two-step feedback for programming language learning, which is the basic preliminary learning for programming. Programming learning aims to improve problem-solving and thinking skills through the experience of solving problems by programming. To program, the learner must know how to work with the computer and what to do with it. To do this, concrete thinking should be established and described in an accurate programming language. Most recent studies have focused on the effects of programming learning and have not examined the effects of teaching the language itself. Therefore, in this study, a teaching-learning model for programming language education is presented and applied in the field, and the results are compared with those of the existing instructional-teaching model.

Relationships between the Use of ESL Learning Strategies and English Language Proficiency of Asian Students

  • Kang, Sung-Woo
    • English Language & Literature Teaching / no.5 / pp.1-25 / 1999
  • The objective of the present study was to model the relationships between language learning strategy use and language proficiency among Asian (Korean, Japanese, and Taiwanese) students studying English in the United States. The instruments were a language learning strategy questionnaire and the Institutional Testing Program Test of English as a Foreign Language (ITP TOEFL). Structural equation modeling was used to model the relationships between language learning strategies and language proficiency. The present study found only weak relationships between language learning strategies and language proficiency. Only 13% and 15% of the variance of the listening and grammar/reading factors were explained by the language learning strategies. The metacognitive strategies appeared not to have direct relationships to the language skill factors, as was found in other studies (Purpura, 1996, 1997). The effects of the social and affective strategies were very small. In combination, they could account for about 1% and 4% of the variance of the listening and grammar/reading factors.

Language Modeling Approaches to Information Retrieval

  • Banerjee, Protima;Han, Hyo-Il
    • Journal of Computing Science and Engineering / v.3 no.3 / pp.143-164 / 2009
  • This article surveys recent research in the area of language modeling (sometimes called statistical language modeling) approaches to information retrieval. Language modeling is a formal probabilistic retrieval framework with roots in speech recognition and natural language processing. The underlying assumption of language modeling is that human language generation is a random process; the goal is to model that process via a generative statistical model. In this article, we discuss current research in the application of language modeling to information retrieval, the role of semantics in the language modeling framework, cluster-based language models, use of language modeling for XML retrieval and future trends.
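
The generative view in this abstract is commonly illustrated with query-likelihood ranking: each document defines a unigram language model, and documents are ranked by the probability that their model generates the query. The abstract does not name a specific estimator, so the sketch below assumes Jelinek-Mercer smoothing with an illustrative weight `lam`; the function names and the toy documents in the usage note are hypothetical.

```python
import math
from collections import Counter


def query_log_likelihood(query, doc, collection_counts, collection_len, lam=0.5):
    """Return log P(query | doc) under a smoothed unigram language model.

    Jelinek-Mercer smoothing (an assumption of this sketch) mixes the
    document's term frequencies with collection-wide term frequencies.
    """
    doc_counts = Counter(doc)
    doc_len = len(doc)
    score = 0.0
    for term in query:
        p_doc = doc_counts[term] / doc_len if doc_len else 0.0
        p_coll = collection_counts[term] / collection_len
        p = (1 - lam) * p_doc + lam * p_coll
        if p == 0.0:
            # The term never occurs in the collection: probability is zero.
            return float("-inf")
        score += math.log(p)
    return score


def rank(query, docs, lam=0.5):
    """Return document indices sorted by decreasing query likelihood."""
    collection_counts = Counter(term for doc in docs for term in doc)
    collection_len = sum(collection_counts.values())
    scored = [
        (query_log_likelihood(query, doc, collection_counts, collection_len, lam), i)
        for i, doc in enumerate(docs)
    ]
    return [i for _, i in sorted(scored, key=lambda pair: pair[0], reverse=True)]
```

For example, calling `rank(["language", "retrieval"], docs)` on a list of tokenized documents returns document indices from most to least likely to have generated the query. Smoothing keeps a document from scoring zero merely because it lacks one query term.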

Design of GeoPhoto Contents Markup Language for u-GIS Contents (u-GIS 콘텐츠를 위한 GeoPhoto 콘텐츠 언어의 설계)

  • Park, Jang-Yoo;Nam, Kwang-Woo;Jin, Heui-Chae
    • Journal of Korea Spatial Information System Society / v.11 no.1 / pp.35-42 / 2009
  • This paper proposes a new GeoPhoto contents markup language that can create u-GIS contents from spatial photos. We design a GeoPhoto contents model and a markup language for contents that use spatial photo information. The GeoPhoto contents markup language represents the convergence of GIS information, location information, and photo information. To provide a variety of photo-related contents, the language consists of the GeoPhoto contents model and operations between GeoPhoto contents. The GeoPhoto contents model supports the GeoPhoto model, CubicPhoto model, Photo model, and SequenceGeoPhoto model. In addition, this paper proposes the Annotation, Enlargement, and Overlay operations for representing GeoPhoto contents. The GeoPhoto contents markup language has the advantage of supporting user-customized contents models for u-GIS.

Development of a Korean chatbot system that enables emotional communication with users in real time (사용자와 실시간으로 감성적 소통이 가능한 한국어 챗봇 시스템 개발)

  • Baek, Sungdae;Lee, Minho
    • Journal of Sensor Science and Technology / v.30 no.6 / pp.429-435 / 2021
  • In this study, the creation of emotional dialogue was investigated as part of developing a robot's natural language understanding and emotional dialogue processing. Unlike English-language datasets, which are the mainstay of natural language processing, Korean-language datasets have several shortcomings. Because Korean language resources are insufficient, the Korean dataset must be handled in detail, with particular attention to the unique characteristics of the language. Hence, the first step is to base this study on a specific Korean dataset consisting of conversations on emotional topics. Subsequently, a model was built that learns to extract continuous dialogue features from a pre-trained language model and generates sentences while maintaining the context of the dialogue. To validate the model, a chatbot system was implemented, and meaningful results were obtained by recruiting external subjects and conducting experiments. The proposed model was influenced by the dataset, whose conversation topic was consultation, and it facilitated free and emotional communication with users as if they were consulting with a chatbot. The results were analyzed to identify and explain the advantages and disadvantages of the current model. Finally, areas for future study needed to reach the ultimate research goal are discussed.

Burmese Sentiment Analysis Based on Transfer Learning

  • Mao, Cunli;Man, Zhibo;Yu, Zhengtao;Wu, Xia;Liang, Haoyuan
    • Journal of Information Processing Systems / v.18 no.4 / pp.535-548 / 2022
  • Using a rich resource language to classify sentiments in a language with few resources is a popular subject of research in natural language processing. Burmese is a low-resource language. In light of the scarcity of labeled training data for sentiment classification in Burmese, in this study, we propose a method of transfer learning for sentiment analysis of a language that uses the feature transfer technique on sentiments in English. This method generates a cross-language word-embedding representation of Burmese vocabulary to map Burmese text to the semantic space of English text. A model to classify sentiments in English is then pre-trained using a convolutional neural network and an attention mechanism, where the network shares the model for sentiment analysis of English. The parameters of the network layer are used to learn the cross-language features of the sentiments, which are then transferred to the model to classify sentiments in Burmese. Finally, the model was tuned using the labeled Burmese data. The results of the experiments show that the proposed method can significantly improve the classification of sentiments in Burmese compared to a model trained using only a Burmese corpus.

Performance Analysis Using a DNN-Based Sign Language Translation Model (DNN 기반 수어 번역 모델을 통한 성능 분석)

  • Min-Jae Jeong;Soong-Hwan Ro;Jun-Ki Hong
    • The Journal of Bigdata / v.9 no.1 / pp.187-196 / 2024
  • In this study, we propose a DNN (Deep Neural Network)-based sign language translation model that can significantly reduce training time by compressing sign language coordinates. We compared and analyzed the accuracy and training time of the model with and without sign language coordinate compression. The results of using the proposed model for sign language translation showed that while the accuracy decreased by approximately 5.9% after compressing the sign language video, the training time was reduced by 56.57%, indicating a substantial gain in training efficiency compared to the loss in translation accuracy.

A Spatial Structural Query Language-G/SQL

  • Fang, Yu;Chu, Fang;Xinming, Tang
    • Proceedings of the KSRS Conference / 2002.10a / pp.860-879 / 2002
  • Traditionally, Geographical Information Systems can only process spatial data in a procedure-oriented way, and the data cannot be treated integrally. This limits the development of spatial data applications. A new and promising way to solve this problem is a spatial structural query language, which extends SQL and provides integrated access to spatial data. In this paper, the theory of spatial structural query languages is discussed, and a new geographical data model based on the concepts and data model in OGIS is introduced. Based on this model, we implemented a spatial structural query language, G/SQL. Drawing on the 9-Intersection Model, G/SQL provides a set of topological relational predicates and spatial functions for GIS application development. We have successfully developed a Web-based GIS system, WebGIS, using G/SQL. Experience shows that the spatial operators G/SQL offers are complete and easy to use. The BNF representation of the G/SQL syntax is included in this paper.
