• Title/Summary/Keyword: language processing

Search Result 2,686, Processing Time 0.026 seconds

Biaffine Dependency Parser for Korean (Biaffine 한국어 의존파서)

  • Shadikhodjaev, Uygun;Min, Tae Hong;Youn, Junyoung;Lee, Jae Sung
    • Annual Conference on Human and Language Technology
    • /
    • 2018.10a
    • /
    • pp.678-681
    • /
    • 2018
  • Dependency parsing is an important task in natural language processing whose results are used in many downstream tasks such as machine translation, information retrieval, relation extraction, question answering and many others. Most of the dependency parsing literature focuses on using end-to-end and sequence-to-sequence neural architectures as the core of the system. One such system, namely Biaffine dependency parser is explored in the current paper for effective dependency parsing of Korean language.

  • PDF

An Investigation of Robot Programming Language with the Capabilities of Sensory Information Processing (센서 정보 처리 기능을 갖는 로보트 프로그램밍 언어에 관한 조사)

  • Kim, Dae-Won;Ko, Myoun-Sam;Lee, Bum-Hee
    • Proceedings of the KIEE Conference
    • /
    • 1987.11a
    • /
    • pp.435-438
    • /
    • 1987
  • In this paper, among the robot programming languages that enable processing of sensory information, eight exemplary languages are chosen, and investigated in terms of their characteristics, why they are designed the way they are, and the kind of sensors each language can use and apply to. In addition, the characteristics of each language is compared with one another from the sensor point of view and the flow of each language is analyzed from the robot language classification point of view. Finally, We investigate the progress and the requirements of the sensor-based robot programming languages for further developments.

  • PDF

Development of a Traceability Analysis Method Based on Case Grammar for NPP Requirement Documents Written in Korean Language

  • Yoo Yeong Jae;Seong Poong Hyun;Kim Man Cheol
    • Nuclear Engineering and Technology
    • /
    • v.36 no.4
    • /
    • pp.295-303
    • /
    • 2004
  • Software inspection is widely believed to be an effective method for software verification and validation (V&V). However, software inspection is labor-intensive and, since it uses little technology, software inspection is viewed upon as unsuitable for a more technology-oriented development environment. Nevertheless, software inspection is gaining in popularity. KAIST Nuclear I&C and Information Engineering Laboratory (NICIEL) has developed software management and inspection support tools, collectively named "SIS-RT. "SIS-RT is designed to partially automate the software inspection processes. SIS-RT supports the analyses of traceability between a given set of specification documents. To make SIS-RT compatible for documents written in Korean, certain techniques in natural language processing have been studied [9]. Among the techniques considered, case grammar is most suitable for analyses of the Korean language [3]. In this paper, we propose a methodology that uses a case grammar approach to analyze the traceability between documents written in Korean. A discussion regarding some examples of such an analysis will follow.

Method for Detecting Errors of Korean-Chinese MT Using Parallel Corpus (병렬 코퍼스를 이용한 한중 기계번역 오류 탐지 방법)

  • Jin, Yun;Kim, Young-Kil
    • Annual Conference on Human and Language Technology
    • /
    • 2008.10a
    • /
    • pp.113-117
    • /
    • 2008
  • 본 논문에서는 패턴기반 자동번역시스템의 효율적인 번역 성능 향상을 위해 병렬 코퍼스(parallel corpus)를 이용한 오류 자동 탐지 방법을 제안하고자 한다. 번역시스템에 존재하는 대부분 오류는 크게 지식 오류와 엔진 오류로 나눌 수 있는데 통상 이런 오류는 이중 언어가 가능한 훈련된 언어학자가 대량의 자동번역 된 결과 문장을 읽음으로써 오류를 탐지하고 분석하여 번역 지식을 수정/확장하거나 또는 엔진을 개선하게 된다. 하지만, 이런 작업은 많은 시간과 노력을 필요로 하게 된다. 따라서 본 논문에서는 병렬 코퍼스 중의 목적 언어(Target Language) 문장 즉, 정답 문장과 자동번역 된 결과 문장을 다양한 방법으로 비교하면서 번역시스템에 존재하고 있는 지식 및 엔진 오류를 자동으로 탐지하는 방법을 제안한다. 제안한 방법은 한-중 자동번역시스템에 적용하여 그 정확률과 재현률을 측정하였으며, 자동적으로 오류를 탐지하여 추출 할 수 있음을 증명하였다.

  • PDF

Recent Progresses in the Linguistic Modeling of Biological Sequences Based on Formal Language Theory

  • Park, Hyun-Seok;Galbadrakh, Bulgan;Kim, Young-Mi
    • Genomics & Informatics
    • /
    • v.9 no.1
    • /
    • pp.5-11
    • /
    • 2011
  • Treating genomes just as languages raises the possibility of producing concise generalizations about information in biological sequences. Grammars used in this way would constitute a model of underlying biological processes or structures, and that grammars may, in fact, serve as an appropriate tool for theory formation. The increasing number of biological sequences that have been yielded further highlights a growing need for developing grammatical systems in bioinformatics. The intent of this review is therefore to list some bibliographic references regarding the recent progresses in the field of grammatical modeling of biological sequences. This review will also contain some sections to briefly introduce basic knowledge about formal language theory, such as the Chomsky hierarchy, for non-experts in computational linguistics, and to provide some helpful pointers to start a deeper investigation into this field.

Large Language Models: A Guide for Radiologists

  • Sunkyu Kim;Choong-kun Lee;Seung-seob Kim
    • Korean Journal of Radiology
    • /
    • v.25 no.2
    • /
    • pp.126-133
    • /
    • 2024
  • Large language models (LLMs) have revolutionized the global landscape of technology beyond natural language processing. Owing to their extensive pre-training on vast datasets, contemporary LLMs can handle tasks ranging from general functionalities to domain-specific areas, such as radiology, without additional fine-tuning. General-purpose chatbots based on LLMs can optimize the efficiency of radiologists in terms of their professional work and research endeavors. Importantly, these LLMs are on a trajectory of rapid evolution, wherein challenges such as "hallucination," high training cost, and efficiency issues are addressed, along with the inclusion of multimodal inputs. In this review, we aim to offer conceptual knowledge and actionable guidance to radiologists interested in utilizing LLMs through a succinct overview of the topic and a summary of radiology-specific aspects, from the beginning to potential future directions.

Survey on Financial Support in Chinese Language Promotion

  • Xiaowen Zhang;Lu Lu
    • Journal of Information Processing Systems
    • /
    • v.20 no.1
    • /
    • pp.67-75
    • /
    • 2024
  • In the promotion of Chinese language, the funding that Confucius Institutes can rely on only comes from Hanban. From 2009 to 2014, the number of new Confucius Institutes opened is much higher than before. With the increasing number of Confucius Institutes established in various countries, the funding for promoting Chinese language has limited its development. The development situation of Confucius Institutes in Australia is diversified with very rich experience. The market-oriented development of Confucius Institutes has also tried many times. The Confucius Institutes in the Lancang-Mekong region have less experience but they can learn from various experiences from Australia to provide better ideas and paths for the development of Confucius Institutes in this region and the promotion of Chinese. This paper uses the strength, weakness, opportunity, and threat (SWOT) model to analyze the market feasibility of financial support for the development of Confucius Institutes and makes certain suggestions for the promotion of Chinese language in the Lancang-Mekong region.

Framework for evaluating code generation ability of large language models

  • Sangyeop Yeo;Yu-Seung Ma;Sang Cheol Kim;Hyungkook Jun;Taeho Kim
    • ETRI Journal
    • /
    • v.46 no.1
    • /
    • pp.106-117
    • /
    • 2024
  • Large language models (LLMs) have revolutionized various applications in natural language processing and exhibited proficiency in generating programming code. We propose a framework for evaluating the code generation ability of LLMs and introduce a new metric, pass-ratio@n, which captures the granularity of accuracy according to the pass rate of test cases. The framework is intended to be fully automatic to handle the repetitive work involved in generating prompts, conducting inferences, and executing the generated codes. A preliminary evaluation focusing on the prompt detail, problem publication date, and difficulty level demonstrates the successful integration of our framework with the LeetCode coding platform and highlights the applicability of the pass-ratio@n metric.

Empathetic Dialogue Generation based on User Emotion Recognition: A Comparison between ChatGPT and SLM (사용자 감정 인식과 공감적 대화 생성: ChatGPT와 소형 언어 모델 비교)

  • Seunghun Heo;Jeongmin Lee;Minsoo Cho;Oh-Woog Kwon;Jinxia Huang
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2024.05a
    • /
    • pp.570-573
    • /
    • 2024
  • 본 연구는 대형 언어 모델 (LLM) 시대에 공감적 대화 생성을 위한 감정 인식의 필요성을 확인하고 소형 언어 모델 (SLM)을 통한 미세 조정 학습이 고비용 LLM, 특히 ChatGPT의 대안이 될 수 있는지를 탐구한다. 이를 위해 KoBERT 미세 조정 모델과 ChatGPT를 사용하여 사용자 감정을 인식하고, Polyglot-Ko 미세 조정 모델 및 ChatGPT를 활용하여 공감적 응답을 생성하는 비교 실험을 진행하였다. 실험 결과, KoBERT 기반의 감정 분류기는 ChatGPT의 zero-shot 접근 방식보다 뛰어난 성능을 보였으며, 정확한 감정 분류가 공감적 대화의 질을 개선하는 데 기여함을 확인하였다. 이는 공감적 대화 생성을 위해 감정 인식이 여전히 필요하며, SLM의 미세 조정이 고비용 LLM의 실용적 대체 수단이 될 수 있음을 시사한다.

  • PDF

A Preprocessor for English-to-Korean Machine Translation of Web Pages (웹용 영한 기계번역을 위한 문서 전처리기의 설계 및 구현)

  • An, Dong-Un;Ryu, Hong-Jin;Seo, Jin-Won;Lee, Young-Woo;Jeong, Sung-Jong;Yuh, Sang-Hwa;Kim, Tae-Wan;Park, Dong-In
    • Annual Conference on Human and Language Technology
    • /
    • 1997.10a
    • /
    • pp.249-254
    • /
    • 1997
  • 영어 웹 문서를 한국어로 기계번역을 하기 위해서는 HTML 태그를 번역 대상 문장과 분리하는 처리가 필요하다. HTML 태그를 단순히 제거하는 것이 아니라 대상 문장의 기계번역이 종료된 후에 같은 형태의 한국어 웹 문서로 복원하기 위한 방안이 마련 되어야 한다. 또한 문서 전처리기에서는 영어 형태소해석기의 성능을 높이기 위하여 번역 단위가 되는 문장의 인식 및 분리, 타이틀의 처리, 나열된 단어의 처리, 하이픈 처리, 고유명사 인식, 특수 문자 처리, 대소문자 정규화, 날짜 인식 등을 처리하여 문서의 정규화를 수행한다.

  • PDF