• Title/Summary/Keyword: syntactic

A Query Expansion Technique using Query Patterns in QA systems (QA 시스템에서 질의 패턴을 이용한 질의 확장 기법)

  • Kim, Hea-Jung;Bu, Ki-Dong
    • Journal of Korea Society of Industrial Information Systems / v.12 no.1 / pp.1-8 / 2007
  • When confronted with a query, question answering systems endeavor to extract the most exact answers possible by determining the answer type that fits with the key terms used in the query. However, the efficacy of such systems is limited by the fact that the terms used in a query may be in a syntactic form different to that of the same words in a document. In this paper, we present an efficient semantic query expansion methodology based on query patterns in a question category concept list comprised of terms that are semantically close to terms used in a query. The proposed system first constructs a concept list for each question type and then builds the concept list for each question category using a learning algorithm. The results of the present experiments suggest the promise of the proposed method.

SERI Test Suites '97 : Test Sentences for Korean Syntactic Analyser (SERI Test Suites '97 : 한국어 구문분석기 성능 평가용 문장 모음)

  • Sung, Won-Kyung;Jang, Myung-Gil;Park, Jae-Deuk;Ryu, Pum-Mo;Lee, Hyun-A;Park, Dong-In
    • Annual Conference on Human and Language Technology / 1997.10a / pp.320-326 / 1997
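  • Continued progress in natural language information processing has produced a variety of language-processing tools. However, for lack of objective performance evaluation criteria, these tools could only be evaluated against arbitrary criteria. As a result, evaluation results inevitably varied with the evaluator and the criteria the evaluator proposed, so the results themselves could not be persuasive. As part of an effort to solve this problem, this study introduces SERI Test Suites '97, a collection of sentences and associated annotation information created for the systematic and objective performance evaluation of Korean language-processing tools, in particular syntactic analysers.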

Post-processing for Korean OCR Using Cohesive Feature between Syllables and Syntactic Lexical Feature (한국어의 음절 결합 특성 및 통사적 어휘 특성을 이용한 문자인식 후처리 시스템)

  • Hwang, Young-Sook;Park, Bong-Rae;Rim, Hae-Chang
    • Annual Conference on Human and Language Technology / 1997.10a / pp.175-182 / 1997
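  • In research on post-processing for Korean character recognition, unregistered words and non-contextual errors remain problems that have not yet been solved well. This paper proposes a method that uses probabilistic syllable-combination information as the criterion for deciding whether a string is possible as a word, which solves the unregistered-word problem that can arise when only morphological analysis is used, and that selects the optimal candidate eojeol (word phrase) from among many candidates by using contextual combination information that takes into account word-final lexical items with syntactic functions. The proposed system consists of a module that combines the candidate syllables output by the recognizer with learned confusion syllables to generate one or more candidate eojeols, and a module that selects the optimal candidate eojeol using statistical language information. The experiments used syllable-combination information extracted from a raw corpus of size 10 million and eojeol-combination information extracted from a tagged corpus of size 170 thousand. Applied to actual recognition results, the method improved the recognition rate from 94.1% to 97.4% at the character level and from 87.6% to 96.6% at the eojeol level. The correction and miscorrection rates were 56% and 0.6% at the character level and 83.9% and 1.66% at the eojeol level. The system correctly recognized 87.5% of the unregistered words, which made up 3.4% of all test eojeols, and correctly fixed 91.6% of the non-contextual errors, which made up 20.3% of all errors.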

Text Steganography Based on Ci-poetry Generation Using Markov Chain Model

  • Luo, Yubo;Huang, Yongfeng;Li, Fufang;Chang, Chinchen
    • KSII Transactions on Internet and Information Systems (TIIS) / v.10 no.9 / pp.4568-4584 / 2016
  • Steganography based on text generation has become a hot research topic in recent years. However, current text-generation methods, which generate texts of normal style, have either semantic or syntactic flaws. Note that texts of a special genre, such as poems, have a much simpler language model, fewer grammar rules, and a lower demand for naturalness. Motivated by this observation, in this paper we propose a text steganography method that uses a Markov chain model to generate Ci-poetry, a classic Chinese poem style. Since all Ci poems have fixed tone patterns, the generation process selects proper words based on a chosen tone pattern. By learning from a given corpus, the Markov chain model obtains a state transfer matrix that simulates the language model of Ci-poetry. Beginning with an initial word, we hide the secret message when we use the state transfer matrix to choose each next word, iterating until the end of the whole Ci poem. Extensive experiments are conducted, and both machine and human evaluation results show that our method can generate Ci-poetry with higher naturalness than previous work and achieve a competitive embedding rate.
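
The idea of hiding bits in the choice of each next word can be illustrated with a minimal sketch. It is not the authors' implementation: the toy English corpus, the function names `build_transitions`, `embed`, and `extract`, and the rule of carrying floor(log2(n)) bits per step are all assumptions made for illustration, and the tone-pattern constraint described in the abstract is omitted.

```python
from collections import defaultdict


def build_transitions(corpus):
    """Map each word to the sorted list of words seen right after it."""
    successors = defaultdict(set)
    for line in corpus:
        words = line.split()
        for prev, nxt in zip(words, words[1:]):
            successors[prev].add(nxt)
    return {word: sorted(nexts) for word, nexts in successors.items()}


def embed(bits, start, transitions, length):
    """Generate up to `length` words, letting the secret bits pick each successor."""
    words, pos = [start], 0
    while len(words) < length:
        options = transitions.get(words[-1])
        if not options:
            break  # dead end: no known successor for the last word
        k = len(options).bit_length() - 1  # bits this step can carry
        chunk = bits[pos:pos + k].ljust(k, "0")  # pad with zeros if bits run out
        words.append(options[int(chunk, 2) if k else 0])
        pos += k
    return words


def extract(words, transitions):
    """Recover the hidden bits from the successor index chosen at each step."""
    bits = []
    for prev, nxt in zip(words, words[1:]):
        options = transitions[prev]
        k = len(options).bit_length() - 1
        if k:
            bits.append(format(options.index(nxt), f"0{k}b"))
    return "".join(bits)


# Hypothetical toy corpus standing in for a Ci-poetry corpus.
corpus = [
    "spring wind blows softly",
    "spring rain falls softly",
    "wind falls silent",
    "rain blows east",
]
transitions = build_transitions(corpus)
stego_words = embed("101", "spring", transitions, length=4)
recovered = extract(stego_words, transitions)
```

Sender and receiver must share the same transition table and candidate ordering. Because `embed` pads with zeros once the message is exhausted, the receiver also needs the message length to trim the recovered bits.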

Design and Implementation of a Digital Library for the Visually Handicapped (시각장애인을 위한 전자 도서관의 설계 및 구현)

  • 백현기;최숙영
    • Journal of Korea Society of Industrial Information Systems / v.9 no.2 / pp.51-58 / 2004
  • Today, with the rapid growth of the internet, many homepages have been developed and provide us with useful information. Since most of these homepages do not consider the visually handicapped, it is very difficult for them to access information on the internet. Although several homepages now provide services for the visually handicapped, these are insufficient to provide them with information. Therefore, this research aims to develop a digital library for the visually handicapped so that they can easily reach the digital library without restriction of time and space, search it for information, and use it. The digital library system supports an interface through which the visually handicapped can easily and conveniently use it, developed with a TTS-based speech syntactic technique.

The Structure of Polysemy: A study of multi-sense words based on WordNet

  • Lin, Jen-Yi;Yang, Chang-Hua;Tseng, Shu-Chuan;Huang, Chu-Ren
    • Proceedings of the Korean Society for Language and Information Conference / 2002.02a / pp.320-329 / 2002
  • The issues in polysemy with respect to the verbs in WordNet are discussed in this paper. The hypernymy/hyponymy structure of the multiple senses is observed when we try to build a bilingual network for Chinese and English. There are several types of polysemic patterns, and a co-hypernym may have the same word form as its subordinates. Fellbaum (2000) dubbed autotroponymy the case in which verbs linked by the manner relation share the same verb form. However, her syntactic criteria do not seem compatible with the hierarchies in WN. Either the criteria or the network should be reconsidered. For most verbs in WN 1.7, polysemous relations are unlikely to extend over 3 levels of the IS-A relation. Highly polysemous verbs are more complicated and may be involved in certain semantic structures. Semi-automatic sense grouping may be helpful for multilingual information retrieval.

The Loom-LAG for syntax analysis: Adding a language-independent level to LAG

  • Schulze, Markus
    • Proceedings of the Korean Society for Language and Information Conference / 2002.02a / pp.411-420 / 2002
  • The left-associative grammar model (LAG) has been applied successfully to the morphological and syntactic analysis of various European and Asian languages. The algebraic definition of the LAG is very well suited to natural language processing because it inherently obeys de Saussure's second law (de Saussure, 1913, p. 103) on the linear nature of language, which phrase-structure grammar (PSG) and categorial grammar (CG) do not. This paper describes the so-called Loom-LAGs (LLAG), a specialization of LAGs for the analysis of natural language. Whereas the only means of language-independent abstraction in ordinary LAG is the principle of possible continuations, LLAGs introduce a set of more detailed language-independent generalizations that form the so-called loom of a Loom-LAG. Every LLAG uses the very same loom and adds the language-specific information in the form of a declarative description of the language, much like an ancient mechanised Jacquard loom would take a program card providing the specific pattern for the cloth to be woven. The linguistic information is formulated declaratively in so-called syntax plans that describe the sequential structure of clauses and phrases. This approach introduces the explicit notion of phrases and sentence structure to LAG without violating de Saussure's second law and without leaving the ground of the original algebraic definition of LAG. LLAGs can in fact be shown to be just a notational variant of LAG, but one that is much better suited to the manual development of syntax grammars for the robust analysis of free texts.

Weld Seam Tracking Using a Visual Recognition Device in Robotic Arc Welding (로보트 아크용접에서 시각인식장치를 이용한 용접선의 추적)

  • 손영탁;김재선;조형석
    • Proceedings of the Korean Society of Precision Engineering Conference / 1993.10a / pp.550-555 / 1993
  • The aim of this paper is to present the development of a visual seam tracking system equipped with a visual range finder. The visual range finder, which consists of a CCD camera and a diode laser system with line-generating optics, was developed to recognize the types of weld joints and detect their location. In practical applications, however, images of the weld joints are often degraded by spatter, arc flares, surface specularity, and welding smoke. To overcome this problem, this paper proposes a syntactic approach, which belongs to a class of artificial intelligence techniques. In this approach, the type of weld joint is inferred from production rules, which are linguistic grammars consisting of a set of line and junction primitives of the laser stripe image projected on the weld joint. The production rules eliminate several noisy primitives and create new primitives by merging primitives. After the weld joint is recognized, arc welding starts and the location of the weld joint is repeatedly detected using spring model-based template matching, in which the template model is a by-product of the weld joint recognition process. To show the effectiveness of the proposed approach, a series of experiments (identification and robotic tracking) is conducted for four different types of weld joints.

An Example-Based English Learning Environment for Writing

  • Miyoshi, Yasuo;Ochi, Youji;Okamoto, Ryo;Yano, Yoneo
    • Proceedings of the Korea Intelligent Information System Society Conference / 2001.01a / pp.292-297 / 2001
  • In learning to write in a second/foreign language, a learner has to acquire not only lexical and syntactic knowledge but also the skill to choose suitable words for the content s/he is interested in. To support this, a learning system should infer the learner's intention and give example phrases that concern the content. However, a learner cannot always represent the content of his/her desired phrase as input to the system. Therefore, the system should be equipped with a diagnosis function for the learner's intention. Additionally, the system should be equipped with an analysis function that scores the similarity between the learner's intention and the phrases stored in the system on both the syntactic and idiomatic levels, in order to present appropriate example phrases to the learner. In this paper, we propose an architecture for an interactive support method for English writing learning that is based on an analogical search technique for sample phrases from corpora. Our system can show candidate variations/next phrases to write and, from corpora, a sentence analogous to the one the learner wants to express.

Advanced Faceted Classification Scheme and Semantic Similarity Measure for Reuse of Software Components (소프트웨어 부품의 재사용을 위한 개선된 패싯 분류 방법과 의미 유사도 측정)

  • Gang, Mun-Seol
    • The Transactions of the Korea Information Processing Society / v.3 no.4 / pp.855-865 / 1996
  • In this paper, we propose an automation of the classification process for reusable software components and a method for constructing a structured software components library. For efficient and automatic classification of software components, we decide the facets that represent the characteristics of a software component by acquiring semantic and syntactic information from natural-language descriptions of software components, and compose the software component identifier or automatically extract the terms that correspond to each facet. Then, to construct the structured software components library, we store each component near software components with similar characteristics, according to the semantic similarity of the classified software components. As a result of applying the proposed method, similar software components can be identified easily, the classification process becomes simple, and the software components are stored in the structured software components library.
