• Title/Summary/Keyword: Universal language

Universal POS Tagset for Korean (Universal POS 태그셋의 한국어 적용)

  • Park, Hye-Jin;Oh, Tae-Hwan;Kim, Han-Saem
    • Annual Conference on Human and Language Technology / 2018.10a / pp.417-421 / 2018
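  • The Universal Dependencies (UD) project currently comprises 122 treebanks covering 71 languages and aims to identify morphological and syntactic properties that can be applied across languages in support of parallel language processing. This paper examines Universal POS (UPOS), the morphological tagset of UD, and proposes a method for automatically converting an existing Korean morphological tagset to UPOS. Because the UPOS scheme was built around inflectional languages such as English, applying it to Korean, an agglutinative language, requires a one-to-many mapping between individual UPOS tags and combinations of morphological annotation tags from the 21st Century Sejong Project.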
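
  The one-to-many mapping described in this abstract can be illustrated with a minimal Python sketch. The table below and the "first content morpheme decides" rule are illustrative assumptions, not the mapping proposed by the authors; `SEJONG_HEAD_TO_UPOS` and `eojeol_to_upos` are hypothetical names. The input is the sequence of Sejong tags for one eojeol (space-delimited word), and the output is a single UPOS tag.

```python
from typing import Dict, List

# Hypothetical, heavily simplified table (an assumption for illustration).
# Several Sejong tags lead to the same UPOS tag, so each UPOS tag
# corresponds to many Sejong tag combinations.
SEJONG_HEAD_TO_UPOS: Dict[str, str] = {
    "NNG": "NOUN",   # common noun
    "NNB": "NOUN",   # dependent noun
    "NNP": "PROPN",  # proper noun
    "NP": "PRON",    # pronoun
    "NR": "NUM",     # numeral
    "VV": "VERB",    # verb
    "VA": "ADJ",     # adjective
    "MAG": "ADV",    # general adverb
    "IC": "INTJ",    # interjection
}


def eojeol_to_upos(sejong_tags: List[str]) -> str:
    """Return one UPOS tag for an eojeol given its Sejong morpheme tags.

    Simplifying assumption: the first morpheme found in the table decides
    the tag; particles and endings (JKB, EP, EF, ...) are skipped.
    """
    for tag in sejong_tags:
        if tag in SEJONG_HEAD_TO_UPOS:
            return SEJONG_HEAD_TO_UPOS[tag]
    return "X"  # UPOS tag for "other"


if __name__ == "__main__":
    print(eojeol_to_upos(["NNG", "JKB"]))       # noun + adverbial case particle
    print(eojeol_to_upos(["VV", "EP", "EF"]))   # verb + endings
```

  A real conversion would need rules over whole tag combinations (for example, a noun followed by a derivational suffix), which is where the one-to-many problem becomes difficult.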

Applying Universal Dependency Relation Tagsets to Korean (Universal Dependency 관계 태그셋의 한국어 적용)

  • Lee, Chanyoung;Kim, Jinung;Kim, Han Saem
    • Annual Conference on Human and Language Technology / 2018.10a / pp.334-339 / 2018
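  • This paper discusses how to convert an existing phrase-structure-based syntactic annotation tagset into the Universal Dependency relation tagset. When the Universal Dependency relation tagset, which was developed for cross-linguistic use, is applied to Korean, it is essential to consult not only UPOS, the universal POS tagset, but also XPOS, which reflects the properties of the individual language. The study divides the problems that arise when mapping the Universal Dependency relation tagset onto the Korean syntactic annotation tagset into "problems in processing the raw corpus" and "problems caused by errors in the existing syntactically tagged corpus," and proposes solutions to each.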
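
  To show where UPOS, XPOS, and the dependency relation sit side by side, here is a minimal Python sketch that reads one sentence in the ten-column CoNLL-U format used by UD treebanks. The sample sentence, its tags, and the names `parse_conllu_sentence` and `SAMPLE` are invented for illustration and are not taken from the paper or from any released treebank.

```python
from typing import Dict, List

# The ten CoNLL-U columns, in order.
CONLLU_FIELDS = ["id", "form", "lemma", "upos", "xpos",
                 "feats", "head", "deprel", "deps", "misc"]


def parse_conllu_sentence(block: str) -> List[Dict[str, str]]:
    """Parse one CoNLL-U sentence block into a list of token dicts."""
    tokens = []
    for line in block.strip().splitlines():
        if not line or line.startswith("#"):  # skip blanks and comments
            continue
        cols = line.split("\t")
        if len(cols) != len(CONLLU_FIELDS):
            raise ValueError(f"expected 10 columns, got {len(cols)}: {line!r}")
        tokens.append(dict(zip(CONLLU_FIELDS, cols)))
    return tokens


# Invented example ("I go to school"); the annotations are assumptions.
SAMPLE = "\n".join([
    "# text = 나는 학교에 간다",
    "\t".join(["1", "나는", "나+는", "PRON", "NP+JX", "_", "3", "nsubj", "_", "_"]),
    "\t".join(["2", "학교에", "학교+에", "NOUN", "NNG+JKB", "_", "3", "obl", "_", "_"]),
    "\t".join(["3", "간다", "가+ㄴ다", "VERB", "VV+EF", "_", "0", "root", "_", "_"]),
])

if __name__ == "__main__":
    for tok in parse_conllu_sentence(SAMPLE):
        print(tok["form"], tok["upos"], tok["xpos"], tok["deprel"])
```

  In this invented sample the XPOS column keeps the case particle (JKB) that the UPOS column drops, which is the kind of language-specific information the abstract says must be consulted when assigning a relation.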

Manual Revision of Penn Korean Universal Dependency Treebank (Penn Korean Universal Dependency Treebank 데이터셋 구축)

  • Oh, Taehwan;Han, Jiyoon;Kim, Hansaem
    • Annual Conference on Human and Language Technology / 2021.10a / pp.61-65 / 2021
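  • This study analyzes errors in the Penn Korean Universal Dependency Treebank released in 2018 (hereafter PKT-UD v2018) and revises them to build a new dataset (hereafter PKT-UD v2020). PKT-UD v2018 was built by automatically converting the Penn Korean Treebank, which was constructed with phrase-structure analysis, to the UD (Universal Dependencies) scheme and then correcting the result. The study corrects the errors introduced during this automatic conversion and proposes a way to build a dataset that makes the fullest possible use of the UD scheme while reflecting the characteristics of Korean.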

A Study on UCCA for Korean Semantic Analysis (Universal conceptual cognitive annotation(UCCA) 주석 체계의 한국어 적용 연구)

  • Oh, Tae-Hwan;Han, Ji-Yoon;Choe, Hyon-Su;Park, Seok-Won;Kim, Han-Saem
    • Annual Conference on Human and Language Technology / 2019.10a / pp.353-356 / 2019
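  • This paper presents an approach to applying Universal Conceptual Cognitive Annotation (UCCA) to Korean. It first reviews the strengths and weaknesses of existing Korean semantic analysis schemes and then introduces the relative advantages of UCCA. UCCA is a meaning representation framework that aims to describe all languages consistently, and it provides a language-universal scheme for semantic analysis. The paper examines how to apply UCCA to Korean in a way that reflects the characteristics of Korean with respect to annotation units and grammatical elements.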

Study of Building Korean Universal Dependency Corpus focused on Syntactic Relations (한국어 Universal Dependency 말뭉치 구축 방안 연구: 구문 관계를 중심으로)

  • Won, Hye-Jin;Ryu, Pum-Mo
    • Annual Conference on Human and Language Technology / 2018.10a / pp.329-333 / 2018
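  • The Universal Dependency project conducts research to identify morphological patterns and syntactic relations that apply across languages, and a growing number of languages are joining it, building corpora and developing systems according to the UD guidelines. A Korean UD corpus has also been built and shared, but detailed guidelines for its construction are not provided. This paper lists the issues that need to be discussed when building a Korean syntactic parsing corpus based on UD and explains them with examples. We hope this study will open a discussion on applying the UD guidelines to Korean syntactic corpus construction and parser development.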

Linguistics in Postmodern Science Fiction: Delany's Babel 17 and Stephenson's Snow Crash

  • Kim, Il-Gu
    • English Language & Literature Teaching / v.12 no.2 / pp.41-59 / 2006
  • As a late partner to science fiction, experimental languages such as animal language, telepathic language, newly invented language, and alien language often appear as "unexpected and frightened situations" in SF. Like generative semanticists, some SF writers daringly delve into the sacred mystery of semantics in language, whereas others avoid the dream of a universal language by holding themselves to manageable data. Samuel Delany's description of the ideal telepathic universal language in Babel 17 reveals the human dream of being like God by showing us a new process of communication in a factual interplanetary environment. Like the mystery of alien language in SF, a baby's babbling reveals how language is both simple and complicated. Children's language shows us the changing process of a soul revealed through language use, so it is no wonder that many AI languages in SF borrow from children's language acquisition processes. In short, science fiction as a repository of tropes illuminates other literary language studies and other literary genres. Especially in terms of the futuristic study of linguistics, the relationship between science fiction and linguistics is much closer than we thought.

Quantifiers in Questions

  • Krifka, Manfred
    • Korean Journal of English Language and Linguistics / v.3 no.4 / pp.499-526 / 2003
  • This paper, based on Krifka (2001), is about the interpretation of quantifiers in questions. I have argued that quantification into question acts is possible for universal quantifiers, as these quantifiers are based on conjunction, an operation that is defined for speech acts. This explains the restriction to universal quantifiers, which are generalized conjunctions. I have developed a type system in which quantification into question acts can be described. I have argued that expressions that scope out of speech acts must be topic, which explains a number of additional observations. I have also discussed embedded questions, which, depending on the embedding verb, may allow for quantification into questions.

English Floating Quantifiers and Lexical specification of Quantifier Retrieval

  • Yoo, Eun-Jung
    • Language and Information / v.5 no.1 / pp.1-15 / 2001
  • Floating quantifiers (FQs) in English exhibit both universal and language-specific properties. This paper discusses how such syntactic and semantic characteristics can be explained in terms of a constraint-based, lexical approach to the floating quantifier construction within the framework of Head-Driven Phrase Structure Grammar (HPSG). Based on the assumption that FQs are base-generated VP modifiers, this paper proposes an account in which the semantic contribution of FQs consists of a "lexically retrieved" universal quantifier taking scope over the VP meaning.

The First Language Acquisition of Relative Clauses in Korean: Continuity of the Principles of Universal Grammar in First Language Acquisition (한국(韓國) 아동(兒童)의 관계절 습득 연구 - 보편문법(普遍文法) 언어원리(言語原理)의 지속적(持續的) 언어습득(言語習得) 이론(理論)을 중심으로 -)

  • Lee, Kwee Ock
    • Korean Journal of Child Studies / v.13 no.1 / pp.125-138 / 1992
  • The purpose of the present study was to examine the development of embedding through relative clause formation in the first language acquisition of Korean. Results are reported from a study of the spontaneous natural speech of 36 young Korean children, ranging from 16 to 45 months in age, acquiring Korean as their first language in Chinju, Korea. The results revealed a developmental order in the first language acquisition of Korean relative clause structures. Namely, a free or headless relative clause appears to be acquired first, before lexically headed restrictive relative constructions. This order is consistent with the one evidenced in English (and also Chinese) first language acquisition, where 'free' relatives appear to provide a developmentally early stage in the acquisition of restrictive relative clauses. The Korean data provided additional evidence for an intermediary stage with an overt complementizer as well as an overt lexical head. Implications of the results are discussed with regard to a continuous theory of universal grammar in first language acquisition.

Multilingual Automatic Translation Based on UNL: A Case Study for the Vietnamese Language

  • Thuyen, Phan Thi Le;Hung, Vo Trung
    • IEIE Transactions on Smart Processing and Computing / v.5 no.2 / pp.77-84 / 2016
  • In the field of natural language processing, Universal Networking Language (UNL) has been used by various researchers as an interlingual approach to automatic machine translation. The UNL system consists of two main components: EnConverter, which converts text from a source language to UNL, and DeConverter, which converts from UNL to a target language. Currently, many projects are researching how to apply UNL to different languages. In this paper, we introduce the UNL application tools and discuss how to reuse them to encode a Vietnamese sentence into UNL expressions and decode UNL expressions into a Vietnamese sentence. Testing was done on about 1,000 Vietnamese sentences, using a dictionary of 4,573 entries and 3,161 rules. In addition, we compare the proportion of sentences translated by a direct method (Google Translator) with the proportion translated by a UNL-based method.