• Title/Summary/Keyword: Language Processing

KorPatELECTRA : A Pre-trained Language Model for Korean Patent Literature to improve performance in the field of natural language processing(Korean Patent ELECTRA)

  • Jang, Ji-Mo;Min, Jae-Ok;Noh, Han-Sung
    • Journal of the Korea Society of Computer and Information / v.27 no.2 / pp.15-23 / 2022
  • NLP (Natural Language Processing) on patents is challenging because of the linguistic specificity of patent literature, so there is an urgent need for research on a language model optimized for Korean patent literature. Recent NLP work has repeatedly sought to build pre-trained language models for specific domains to improve performance on tasks in those fields. Among such models, ELECTRA is a pre-trained language model from Google, introduced after BERT, that uses a new method called RTD (Replaced Token Detection) to increase training efficiency. This paper proposes KorPatELECTRA, pre-trained on a large amount of Korean patent literature data. Pre-training was tuned by preprocessing the training corpus according to the characteristics of patent literature and by applying a patent-specific vocabulary and tokenizer. To confirm its performance, KorPatELECTRA was tested on NER (Named Entity Recognition), MRC (Machine Reading Comprehension), and patent classification tasks using actual patent data, and it outperformed the general-purpose language models used for comparison on all three tasks.

Scalable Deep Linguistic Processing: Mind the Lexical Gap

  • Baldwin, Timothy
    • Proceedings of the Korean Society for Language and Information Conference / 2007.11a / pp.3-12 / 2007
  • Coverage has been a constant thorn in the side of deployed deep linguistic processing applications, largely because of the difficulty of constructing, maintaining and domain-tuning the complex lexicons that they rely on. This paper reviews various strands of research on deep lexical acquisition (DLA), i.e. the (semi-)automatic creation of linguistically rich language resources, particularly from the viewpoint of DLA for precision grammars.

Artificial Intelligence Applications in Library and Information Science (도서관·정보학에서의 인공지능의 응용에 관한 고찰)

  • Chung Young Mee
    • Journal of the Korean Society for Library and Information Science / v.14 / pp.67-92 / 1987
  • This paper reviews artificial intelligence applications in library and information science. In particular, natural language processing and expert systems are presented as the two major application areas. For natural language processing, natural language interface systems and question-answering systems are discussed in detail with specific examples. The second part of the paper describes online search intermediary systems, reference expert systems, and classification and cataloging expert systems as possible expert systems to be developed in libraries and information systems. In conclusion, implications of these artificial intelligence applications for librarians and information scientists are suggested.

Pattern Matching and Its Restrictions in Functional Languages (함수형 언어의 패턴 매칭 기능과 제약에 관한 연구)

  • Gwon, Gi-Hang;Ju, Ye-Chan;Sin, Hyeon-Sam
    • The Transactions of the Korea Information Processing Society / v.6 no.5 / pp.1291-1295 / 1999
  • Modern functional languages provide some form of pattern matching capability. However, these forms are ad hoc and vary from language to language, which makes the feature hard for users to understand. To overcome this problem, we present a systematic approach to adding pattern matching to functional languages. We extend a core functional language with pattern matching capability and illustrate the extended language with several examples. We also discuss how to extend the pattern matching capability to higher-order terms.

Safety of Large Language Model-Tool Integration (거대 언어 모델 (Large Language Model, LLM)과 도구 결합의 보안성 연구)

  • Juhee Kim;Byoungyoung Lee
    • Proceedings of the Korea Information Processing Society Conference / 2024.05a / pp.210-213 / 2024
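  • This study addresses the security problems of systems that combine a large language model (LLM) with tools. It analyzes security vulnerabilities such as prompt injection and proposes a prompt privilege separation technique to overcome them. This ensures the confidentiality and integrity of user data in LLM-tool integrated systems.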

Design of On-Line Natural Language Parser (온라인 방식의 자연언어 해석기 설계)

  • 우요섭;최병욱
    • Journal of the Korean Institute of Telematics and Electronics B / v.31B no.3 / pp.14-23 / 1994
  • A natural language processing system usually has the drawback of a relatively long processing time. An interactive system that keeps its user waiting for long cannot be called practical. This paper designs an on-line natural language parser whose processing proceeds concurrently with the input of the sentence. Since most of the morphological and syntactic-semantic analysis is already performed during keyboard input, the user gets a prompt response. Moreover, the Korean parser is implemented in a multitasking environment and compared with an off-line parser. The on-line parser can be considered efficient for real-time processing.

A Dictionary Constructing System based on a Web-based Object Model of Distributed Language Resources (웹 기반의 언어자원 객체화에 근거한 사전 개발 시스템)

  • 황도삼
    • Korean Journal of Cognitive Science / v.12 no.1_2 / pp.1-9 / 2001
  • In this paper, we present a web-based object model of language resources that are distributed across different places in various forms. Language resources organized as objects distributed over web sites can be easily used to build natural language processing applications. The model therefore supports effective maintenance of the overall language processing environment, since upgrading the language resources automatically upgrades the application systems that use them. We implemented a dictionary constructing system for the Korean language (YDK2000). This system can integrate various linguistic dictionaries and also allows high-quality application-specific dictionaries to be constructed by connecting them to natural language systems on the Internet.

A Natural Language Information Retrieval Model using Automatic Network and Two-level Document Ranking (자동 키워드망과 2단계 문서 순위 결정에 의한 자연어 정보검색 모델)

  • Kang, Hyun-Kyu;Park, Se-Young;Choi, Key-Sun
    • Annual Conference on Human and Language Technology / 1995.10a / pp.8-12 / 1995
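  • This paper discusses a model that, before presenting ranked documents to the user, ranks the documents retrieved in a first pass in two stages using an automatic keyword network. Natural language retrieval is first performed with an automatically built keyword index, and the ranking is then readjusted with the automatic keyword network; an evaluation of retrieval effectiveness showed an improvement of up to 10.9% over the first-pass results. In addition, several formulas for document rank adjustment were compared and analyzed, and a formula that reflects content-based retrieval was identified. The two-stage ranking method presented in this paper can be applied to list-based information retrieval as one way to improve retrieval effectiveness.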

A Study on Trend and Application of Internet Scripting Language (인터넷 스크립팅 언어의 동향 및 응용에 관한 연구)

  • Lee, Jong-Seop;Choe, Yeong-Geun
    • The Transactions of the Korea Information Processing Society / v.6 no.11S / pp.3209-3218 / 1999
  • Currently, in the Web (World Wide Web) environment, HTML (HyperText Markup Language) is used for information representation and exchange. However, HTML is limited in representing various kinds of information because of its restricted tag set. Combining HTML, which is used for static information representation in the Web environment, with a scripting language, which is usually used for multimedia information representation in a synchronized framework, can therefore be very useful. Consequently, we present the general trend of scripting languages in the Web environment and show the possibility of combining HTML and scripting languages to improve Web services.
