• Title/Summary/Keyword: multi-language


Character-Aware Neural Networks with Multi-Head Attention Mechanism for Multilingual Named Entity Recognition (Multi-Head Attention 방법을 적용한 문자 기반의 다국어 개체명 인식)

  • Cheon, Min-Ah;Kim, Chang-Hyun;Park, Ho-Min;Kim, Jae-Hoon
    • Annual Conference on Human and Language Technology / 2018.10a / pp.167-171 / 2018
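  • Named entity recognition (NER) is the task of extracting named entities, units that denote unique meanings such as person, location, and organization names, from a document and determining the category of each extracted entity. Recent NER research has proposed many deep-learning variants built on a combination of Bi-RNNs, which take into account the context before and after the input, and CRFs, which use transition probabilities between output labels. However, most studies use words or morphemes as the input unit and, to improve performance, require various additional information such as spacing information, named entity dictionary features, and part-of-speech distribution information. This paper proposes a multilingual NER method that uses character-based input information obtainable from a basic training corpus and a Bi-GRU/CRFs model augmented with Multi-Head Attention. When the proposed model was applied to Korean, Japanese, Chinese, and English, it performed well on Korean and Japanese (Korean $F_1$ 84.84%, Japanese $F_1$ 89.56%). It achieved an $F_1$ of 80.83% on English, while Chinese showed the lowest performance at an $F_1$ of 21.05%.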
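
The abstract above reports $F_1$ scores for character-level tagging but does not say how they were computed. The following is a minimal sketch, assuming a BIO tagging scheme over characters and exact-match, entity-level scoring; the function names, the toy sentence, and the tags are all hypothetical.

```python
def extract_entities(tags):
    """Collect (start, end, type) spans from a BIO tag sequence."""
    spans, start, kind = [], None, None
    for i, tag in enumerate(list(tags) + ["O"]):  # sentinel closes a trailing entity
        inside = tag.startswith("I-") and kind == tag[2:]
        if start is not None and not inside:
            spans.append((start, i, kind))  # current entity ends before position i
            start, kind = None, None
        if tag.startswith("B-"):
            start, kind = i, tag[2:]
    return set(spans)


def f1_score(gold_tags, pred_tags):
    """Entity-level F1: an entity counts only if span and type both match."""
    gold, pred = extract_entities(gold_tags), extract_entities(pred_tags)
    true_positives = len(gold & pred)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# One tag per character of the toy sentence "Kim in Seoul" (12 characters).
gold = ["B-PER", "I-PER", "I-PER", "O", "O", "O", "O",
        "B-LOC", "I-LOC", "I-LOC", "I-LOC", "I-LOC"]
# The hypothetical prediction cuts the location one character short.
pred = gold[:11] + ["O"]

score = f1_score(gold, pred)  # one of two entities matches exactly
```

Under exact matching, the truncated location counts as both a false positive and a false negative, so character-based models are penalized for boundary errors even when most of the span is correct.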


Phonological Activation in Multi-syllabic Word Recognition (다음절 단어재인에 있어서 음운적 활성화)

  • Lee, Chang-H.;Nam, Ki-Chun
    • Annual Conference on Human and Language Technology / 2004.10d / pp.225-228 / 2004
  • English has words with a silent letter in their letter strings (e.g., knowledge). Such words provide an opportunity to investigate the role of phonological information in multi-syllabic words by comparing them with words that have no silent letter in the corresponding position (e.g., available). In a lexical decision task, stimuli that omitted a silent letter (e.g., _nowledge) were processed faster than those that omitted a sounded letter (e.g., _vailable). This result provides seminal evidence of phonological recoding in multi-syllabic word recognition.


A Multi-Bible Application on an Android Platform Using a Word Tokenization and Recognition Algorithm (단어 구분 및 인식 알고리즘을 이용한 안드로이드 플랫폼 기반의 멀티 성경 애플리케이션)

  • Kang, Sung-Mo;Kang, Myeong-Su;Kim, Jong-Myon
    • IEMEK Journal of Embedded Systems and Applications / v.6 no.4 / pp.215-221 / 2011
  • Mobile phones, once used simply for calling and sending text messages, have recently evolved into application-oriented digital devices such as smartphones and tablet phones. The rapid spread of these devices, which offer advanced capabilities and run a variety of Java-based applications, calls for a wide range of digital multimedia content. There are more than 2.2 billion Christians around the world today. More than 300 million of them live in Asia, and all of them have and read the Bible. An application that translates the Bible from English into their own languages could be very helpful. For this reason, this paper proposes a multi-Bible application that supports various languages. To this end, we implemented an algorithm that recognizes sentences in the Bible word by word. The algorithm consists of three functions: tokenizing sentences in the Bible into words (word tokenization), recognizing words through touch events (word recognition), and translating the selected words into the desired language. Consequently, the proposed multi-Bible application supports efficient translation when the user touches words in a Bible sentence.
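
The three functions named in the abstract above (word tokenization, word recognition from a touch, and translation of the selected word) can be sketched roughly as follows. This is only an illustration, not the authors' Android implementation: it assumes the touch event has already been converted to a character offset in the displayed sentence, and the function names and the tiny English-to-Spanish dictionary are hypothetical.

```python
import re


def tokenize_with_spans(sentence):
    """Word tokenization: return (word, start, end) for each word in the sentence."""
    return [(m.group(), m.start(), m.end())
            for m in re.finditer(r"[A-Za-z']+", sentence)]


def word_at(tokens, char_offset):
    """Word recognition: return the word whose span contains the touched offset."""
    for word, start, end in tokens:
        if start <= char_offset < end:
            return word
    return None  # the touch landed on whitespace or punctuation


# Hypothetical toy dictionary; the abstract does not describe the translation source.
TOY_DICTIONARY = {"beginning": "principio", "god": "Dios"}


def translate(word, dictionary=TOY_DICTIONARY):
    """Translation: look up the selected word, ignoring case."""
    if word is None:
        return None
    return dictionary.get(word.lower())


verse = "In the beginning God created the heaven and the earth."
tokens = tokenize_with_spans(verse)
touched = word_at(tokens, 8)       # offset 8 falls inside "beginning"
translation = translate(touched)
```

Keeping the character span of every token is what makes the touch lookup a simple range test. A real application would first have to map screen coordinates to a character offset, which depends on the text layout and is omitted here.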

Understanding the Language Learner from the Imagined Communities Perspective: The Case of Korean Language Learners in the U.S. (상상공동체 관점을 통한 한국어 학습자 동기 이해)

  • Lee, Siwon;Cho, Haewon
    • Journal of Korean language education / v.28 no.4 / pp.367-402 / 2017
  • The current study seeks to understand the multi-faceted desires of language learners through the theoretical lens of imagined communities (Norton, 2001). In particular, the study focuses on learners of Korean, one of the less commonly taught languages in the U.S., which has received relatively little attention in previous literature on second language motivation. The study analyzed and compared the narratives told by eleven Korean language learners in a post-secondary language program, and identified four types of imagined communities: Communities of K-pop Culture, Communities of Professionals, Communities of Korean Family and Relatives, and Communities of ethnic Koreans. The study found that these imagined communities were not restricted to a specific region or ethnic group but encompassed various populations connected through the use of the Korean language. The study also found variability within what has been readily labelled as heritage motivation (or motivation related to heritage), as well as striking differences between heritage and non-heritage language learners in the scope of their imagination.

Multi-labeled Domain Detection Using CNN (CNN을 이용한 발화 주제 다중 분류)

  • Choi, Kyoungho;Kim, Kyungduk;Kim, Yonghe;Kang, Inho
    • 한국어정보학회:학술대회논문집 / 2017.10a / pp.56-59 / 2017
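  • Using a CNN (Convolutional Neural Network), we performed multi-label classification of utterance domains with a multi-labeling method and a cluster method, and evaluated performance by applying MSE (Mean Square Error), softmax cross-entropy, and sigmoid cross-entropy to each method. The network takes as input a sequence tokenized at the syllable level with part-of-speech information added to each token, together with named entity information obtained from the Naver DB. In the experiments, the best performance, an F1 of 0.9873, was obtained when the problem was recast with the cluster method, sigmoid was used as the activation function of the output layer, and the network was trained with a cross-entropy cost function.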
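
To make the loss functions compared in the abstract above concrete, here is a minimal sketch of sigmoid cross-entropy and softmax cross-entropy for a single multi-label example. It is not the authors' network: the logits, the targets, the 0.5 decision threshold, and the choice to normalize multi-hot targets for the softmax case are all assumptions made for illustration.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_cross_entropy(logits, targets):
    """Mean binary cross-entropy, treating every label as an independent yes/no decision."""
    total = 0.0
    for z, y in zip(logits, targets):
        p = sigmoid(z)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(logits)


def softmax_cross_entropy(logits, targets):
    """Cross-entropy against a softmax distribution.

    Assumption: multi-hot targets are normalized to sum to 1, so the
    active labels compete for a shared probability mass.
    """
    peak = max(logits)
    log_norm = peak + math.log(sum(math.exp(z - peak) for z in logits))
    mass = sum(targets)
    return -sum((y / mass) * (z - log_norm) for z, y in zip(logits, targets))


def predict_labels(logits, threshold=0.5):
    """Return the indices of all labels whose sigmoid probability reaches the threshold."""
    return [i for i, z in enumerate(logits) if sigmoid(z) >= threshold]


logits = [2.0, -1.0, 1.5]   # hypothetical scores for three domains
targets = [1, 0, 1]         # the utterance belongs to domains 0 and 2

predicted = predict_labels(logits)
loss_sigmoid = sigmoid_cross_entropy(logits, targets)
loss_softmax = softmax_cross_entropy(logits, targets)
```

With sigmoid outputs each domain is scored independently, so several domains can be predicted at once. Softmax forces the domains to share one distribution, which is one plausible reason a sigmoid output layer suits multi-label problems better.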


Query Processing for Multi-level Databases Using Horizontal Partitioning and Views (수평분할과 뷰를 이용한 다단계 데이터베이스에서의 질의 처리)

  • 나민영;최병갑
    • Proceedings of the Korea Institutes of Information Security and Cryptology Conference / 1995.11a / pp.79-88 / 1995
  • Most work so far has concentrated on developing data modeling techniques, such as multi-level relations, for data protection. These techniques, however, cannot be applied in practice because they require new queries or new architectures. In this paper, we propose a query processing technique for multi-level databases using horizontal partitioning and views, which requires no change to the database architecture or the query language.
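
The abstract above gives no schema, so the following is only a sketch of the general idea under stated assumptions: rows are split horizontally into one ordinary table per security level, and each clearance level queries a view that unions the partitions it may read. The table names, view names, two-level classification (U for unclassified, S for secret), and sample rows are all hypothetical. SQLite is used only because it ships with Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE emp_u (name TEXT, salary INTEGER);   -- unclassified partition
CREATE TABLE emp_s (name TEXT, salary INTEGER);   -- secret partition
INSERT INTO emp_u VALUES ('Kim', 3000);
INSERT INTO emp_s VALUES ('Lee', 9000);

-- A user cleared for U sees only the unclassified partition.
CREATE VIEW emp_for_u AS
    SELECT name, salary FROM emp_u;

-- A user cleared for S sees both partitions.
CREATE VIEW emp_for_s AS
    SELECT name, salary FROM emp_u
    UNION ALL
    SELECT name, salary FROM emp_s;
""")

# Fixed mapping from clearance level to view, so no user input reaches the SQL text.
VIEW_FOR_CLEARANCE = {"U": "emp_for_u", "S": "emp_for_s"}


def query_names(clearance):
    """Run an ordinary SELECT against the view that matches the user's clearance."""
    view = VIEW_FOR_CLEARANCE[clearance]
    rows = conn.execute(f"SELECT name FROM {view} ORDER BY name").fetchall()
    return [row[0] for row in rows]


names_for_u = query_names("U")
names_for_s = query_names("S")
```

Because the partitions are plain tables and the filtering lives in standard views, the queries stay in unmodified SQL, which matches the abstract's claim that neither the database architecture nor the query language needs to change.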


An Implementation of Single Stack Multi-threading for Small Embedded Systems

  • Kim, Yong-Seok
    • Journal of the Korea Society of Computer and Information / v.21 no.4 / pp.1-8 / 2016
  • In small embedded systems, including IoT devices, memory is very limited, so it is important to reduce the memory needed to run application programs. In multi-threaded applications, stacks may consume a large amount of memory because each thread has its own stack, sized for the worst case. This paper presents an implementation of single stack multi-threading, called SSThread (Single Stack Thread), which shares one stack among all threads to reduce stack memory. With SSThread, multi-threaded applications can be written in a normal C language environment, and there is no need to port a multi-threading operating system. SSThread consists of several library functions and various C macro definitions. Although it has some functional restrictions compared with operating systems that support full multi-threading, it is very useful for small embedded systems with tiny memory, and the programming environment for multi-threaded applications is simple to set up.

A Study on Languages and Socialities of Children in Multi-cultural Families Using Fine Arts (미술을 활용한 다문화 자녀의 언어와 사회성에 관한 연구)

  • Do, Kyung-Eun
    • Journal of Digital Convergence / v.11 no.12 / pp.793-801 / 2013
  • Our society is moving from a monocultural society of a homogeneous nation to a multi-cultural society, as many foreigners enter the country with the advent of globalization and with efforts to secure a labor force for economic growth. As a result, multi-cultural families whose members use different languages are appearing everywhere, but the children in these families have difficulty acquiring the Korean language and are socially maladjusted because of the bilingual environment. The goal of this study is to help enhance the language abilities and sociality of children in multi-cultural families through fine arts. The study analyzed the effects of fine arts using theoretical research materials and theses that document the real conditions of multi-cultural families, and proposed ways to improve the linguistic abilities and sociality of these children through the use of fine arts. The results are as follows. First, active use of bilingual instructors and artistic multimedia is educationally necessary to overcome language barriers. Second, various ways of using fine arts are needed to improve learning in other subjects. Third, artistic play and experiential activities need to be widely applied in education to enhance emotional control and sociality. Finally, integrated culture and art education is essential not only for creativity and sociality but also for the character needed in community life.

A Simple Syntax for Complex Semantics

  • Lee, Kiyong
    • Proceedings of the Korean Society for Language and Information Conference / 2002.02a / pp.2-27 / 2002
  • As part of a long-range project that aims at establishing database-theoretic semantics as a model of computational semantics, this presentation focuses on the development of a syntactic component for processing strings of words or sentences to construct semantic data structures. For design and modeling purposes, the present treatment will be restricted to the analysis of some problematic constructions of Korean involving semi-free word order, conjunction and temporal anchoring, and adnominal modification and antecedent binding. The present work relies heavily on Hausser's (1999, 2000) SLIM theory of language, which is based on surface compositionality, time-linearity, and two other conditions on natural language processing. Time-linear syntax for natural language has been shown to be conceptually simple and computationally efficient. The associated semantics is complex, however, because it must deal with situated language involving interactive multi-agents. Nevertheless, by processing input word strings in a time-linear mode, the syntax can incrementally construct the necessary semantic structures for relevant queries and valid inferences. The fragment of Korean syntax will be implemented in Malaga, a C-type implementation language that was enriched for both programming and debugging purposes and that was made particularly suitable for implementing Left-Associative Grammar. This presentation will show how the system of syntactic rules with constraining subrules processes Korean sentences in a step-by-step, time-linear manner to incrementally construct semantic data structures that mainly specify relations with their argument, temporal, and binding structures.


Comparative Analysis of Written Language and Colloquial Language for Information Communication of Multi-Modal Interface Environment (다중 인터페이스 환경에서의 문자언어와 음성언어의 차이에 관한 비교 연구)

  • Choi, In-Hwan;Lee, Kun-Pyo
    • Archives of design research / v.19 no.2 s.64 / pp.91-98 / 2006
  • Product convergence and complex application environments raise the need for multi-modal interfaces that let us interact with products through various human senses. Vision has traditionally been used far more than any other sense for gathering information, but in a future built on digital network technology, practical use of the other senses will be desirable for more convenient and rational use of information appliances. The auditory sense, whose potential for practical use alongside vision is higher than ever, will find broader and more varied uses in the future. Against this background, this study examined the characteristics of written and colloquial language and compared male and female reactions to each. To this end, literature on the diverse components of the language system was reviewed. Then the particular characteristics of the visual and auditory senses were reviewed, and an appropriate experiment was planned and carried out. The results of the experiment were examined with an objective analysis method. The main results of this study are as follows: first, the reaction time for written language is shorter than for colloquial language; second, there is a partial difference between male and female reactions to the two stimuli; third, there is no selection bias between the sense of sight and the sense of hearing. Building on this study, continued and more diverse research on the various senses is needed.
