Chunking of Contiguous Nouns using Noun Semantic Classes (명사 의미 부류를 이용한 연속된 명사열의 구묶음)

  • Ahn, Kwang-Mo;Seo, Young-Hoon
    • The Journal of the Korea Contents Association / v.10 no.3 / pp.10-20 / 2010
  • This paper presents a chunking strategy for sequences of contiguous nouns using semantic classes. We call a sequence of contiguous nouns that can be treated like a single noun a compound noun phrase. To chunk compound noun phrases, we use noun pairs extracted from a syntactically tagged corpus together with their semantic class pairs. For reliability, these noun pairs and semantic classes are built from the syntactically tagged corpus and the detailed dictionary of the Sejong corpus. Compound noun phrases of arbitrary length can also be chunked with this information. We use 38,940 pairs of 'left noun - right noun', 65,629 pairs of 'left noun - semantic class of right noun', 46,094 pairs of 'semantic class of left noun - right noun', and 45,243 pairs of 'semantic class of left noun - semantic class of right noun'. The test data are 1,000 sentences not used in training, each containing contiguous nouns of length greater than 2, randomly selected from the Sejong morphologically tagged corpus. The experimental result is 86.89% precision, 80.48% recall, and an 83.56% F-measure.
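The reported F-measure can be checked against the reported precision and recall. The sketch below assumes the figure is the usual balanced F1 (the harmonic mean of precision and recall); the abstract does not say which variant was used, and the function name is illustrative.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F1: the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract, in percent.
print(round(f_measure(86.89, 80.48), 2))  # 83.56
```

Under that assumption the three reported figures are mutually consistent.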

Japanese-Korean Machine Translation System Using Connection Forms of Neighboring Words (인접 단어들의 접속정보를 이용한 일한 기계번역 시스템)

  • Kim, Jung-In
    • Journal of Korea Multimedia Society / v.7 no.7 / pp.998-1008 / 2004
  • There are many syntactic similarities between Japanese and Korean. Using these similarities, a Japanese-Korean translation system can be built without most syntactic and semantic analysis. To improve translation rates, we have been developing such a system for several years. However, the system still has problems, such as the translation of inflected words and the processing of words with multiple possible translations. In this paper, we propose a new method of Japanese-Korean translation that uses the relations between two neighboring words. To solve these problems, we investigated the connection rules and priority of auxiliary verbs, and we designed a translation table consisting of entry tables and connection-form tables. When a word has only one translation, it can be translated by direct matching using only the entry table; otherwise, the connection value is evaluated with the connection-form tables and the best translation word is selected.

Rule Construction for Determination of Thematic Roles by Using Large Corpora and Computational Dictionaries (대규모 말뭉치와 전산 언어 사전을 이용한 의미역 결정 규칙의 구축)

  • Kang, Sin-Jae;Park, Jung-Hye
    • The KIPS Transactions:PartB / v.10B no.2 / pp.219-228 / 2003
  • This paper presents an efficient method for constructing rules that determine thematic roles from syntactic relations in Korean language processing. Thematic role determination is a core part of semantic analysis and an important open issue in natural language processing. Writing such rules from general linguistic knowledge and experience alone is problematic: the result may vary with the subjective views of researchers, and it is impossible to construct rules that cover all cases. In contrast, our method is objective and efficient because it draws on large corpora, which contain practical usages of Korean, and on the case frames in the Sejong Electronic Lexicon of Korean, which is being developed by dozens of Korean linguistic researchers. To determine thematic roles more accurately, our system uses syntactic relations, semantic classes, morpheme information, and the position of double subjects. Using semantic classes in particular increases the applicability of the rules.

A Study on the Construction of the Automatic Summaries - on the basis of Straight News in the Web - (자동요약시스템 구축에 대한 연구 - 웹 상의 보도기사를 중심으로 -)

  • Lee, Tae-Young
    • Journal of the Korean Society for Information Management / v.23 no.4 s.62 / pp.41-67 / 2006
  • A writing frame and various rules based on discourse structure and knowledge-based methods were applied to construct an automatic Ext/sums (extracts & summaries) system for straight news on the web. The frame contains slots and facets, represented by the roles of paragraphs, sentences, and clauses in news, along with the rules that determine the slot type. Candidate summary sentences were rearranged through unification, separation, and synthesis while maintaining coherence of meaning. This rearrangement used rules derived from similarity measurement, syntactic information, discourse structure, and knowledge-based methods, together with context plots defined by the syntactic/semantic signatures of nouns and verbs and the categories of verb suffixes. An attempt was also made to insert critic sentences into the summary.

Relation Extraction based on Composite Kernel combining Pattern Similarity of Predicate-Argument Structure (술어-논항 구조의 패턴 유사도를 결합한 혼합 커널 기반관계 추출)

  • Jeong, Chang-Hoo;Choi, Sung-Pil;Choi, Yun-Soo;Song, Sa-Kwang;Chun, Hong-Woo
    • Journal of Internet Computing and Services / v.12 no.5 / pp.73-85 / 2011
  • A great deal of valuable textual information can be used to extract relations between named entities from the literature. This paper proposes a composite kernel approach that calculates similarities based on the following information: (1) phrase structure, via the convolution parse tree kernel, which has shown encouraging results; and (2) predicate-argument structure patterns. In other words, the approach handles syntactic structure as well as semantic structure in a mutually complementary manner. The proposed approach was evaluated on various types of test collections and performed better than a previous approach that uses only information from syntactic structures. It also performed better than the state-of-the-art approach.
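One common way to build a composite kernel is a weighted sum of the component kernels. The sketch below assumes that form; the abstract does not give the actual combination, and the Jaccard stand-ins, the `alpha` weight, and the field names `tree_fragments` and `pas_patterns` are all hypothetical. In particular, the Jaccard overlap is only a placeholder for the convolution parse tree kernel, not an implementation of it.

```python
from typing import Callable, Dict, FrozenSet

Instance = Dict[str, FrozenSet[str]]
Kernel = Callable[[Instance, Instance], float]


def jaccard(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Set-overlap similarity in [0, 1]; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def syntactic_kernel(x: Instance, y: Instance) -> float:
    # Placeholder for a parse tree kernel: overlap of tree fragments.
    return jaccard(x["tree_fragments"], y["tree_fragments"])


def pas_kernel(x: Instance, y: Instance) -> float:
    # Similarity of predicate-argument structure patterns.
    return jaccard(x["pas_patterns"], y["pas_patterns"])


def composite_kernel(k1: Kernel, k2: Kernel, alpha: float = 0.5) -> Kernel:
    """Return the convex combination alpha * k1 + (1 - alpha) * k2."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")

    def combined(x: Instance, y: Instance) -> float:
        return alpha * k1(x, y) + (1.0 - alpha) * k2(x, y)

    return combined


# Two toy relation instances (hypothetical features).
x = {"tree_fragments": frozenset({"NP-VP", "VP-NP", "NP-PP"}),
     "pas_patterns": frozenset({"bind(ARG1,ARG2)"})}
y = {"tree_fragments": frozenset({"NP-VP", "VP-NP"}),
     "pas_patterns": frozenset({"bind(ARG1,ARG2)", "activate(ARG1,ARG2)"})}

kernel = composite_kernel(syntactic_kernel, pas_kernel, alpha=0.5)
print(kernel(x, y))
```

Setting `alpha` to 1.0 recovers the syntax-only baseline, which makes the contribution of the predicate-argument component easy to isolate in such a setup.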

Korean '-e ci' Constructions: Anti-Causatives or Passives?

  • Song, Jina
    • Language and Information / v.20 no.1 / pp.51-71 / 2016
  • The status of the Korean morphological marker '-e ci' has been controversial: it has been analyzed as a passive marker, an anticausative marker, or a passive/anticausative marker. However, previous approaches that tried to classify '-e ci' constructions based on syntactic verb classes (i.e., intransitive or transitive) fell short of explaining the properties of the constructions. In this study, the '-e ci' constructions were distinguished based on agentivity, following Levin & Rappaport Hovav (1995) and Alexiadou et al. (2006). Moreover, how the verbal root meaning is associated with the passive/anticausative construction was investigated by means of Distributed Morphology (DM) (Embick 2010; Marantz 1997). I argued that the morphological marker '-e ci' is the instantiation of the absence of external arguments. With respect to the behavior of the Korean '-e ci' constructions with the semantics of each verbal root class, I found that the '-e ci' constructions can form passives with verbal roots that require external arguments, whereas anticausatives cannot be formed with roots that necessarily require agentive arguments. However, contrary to previous arguments that '-e ci' passives can only be formed with transitive verbs, it was discovered that non-agentive transitive roots do form anticausatives. Moreover, I argued that there are two types of anticausatives: zero and '-e ci' anticausatives. Since valency reduction is marked by non-active voice morphology, the zero anticausatives appear only with roots that do not require external arguments. The different '-e ci' constructions (passives, '-e ci', and zero anticausatives) are represented by distinct syntactic structures. I proposed that the morphological similarity between the passives and the '-e ci' anticausatives is due to the presence of VoiceP, which introduces the external arguments. Moreover, the lack of voice morphology in the zero anticausatives is explained by the absence of VoiceP.

A three-step sentence searching method for implementing a chatting system (채팅 시스템 구현을 위한 3단계 문장 검색 방법)

  • Jeon, Won-Pyo;Song, Yoeng-Kil;Kim, Hark-Soo
    • Journal of Advanced Marine Engineering and Technology / v.37 no.2 / pp.205-212 / 2013
  • Previous chatting systems have generally used methods based on lexical agreement between users' input sentences and target sentences in a database. However, these methods often suffer from well-known lexical disagreement problems. To resolve some of these problems, we propose a three-step sentence searching method in which each step is applied only when the previous step fails. The first step compares common keyword sequences between users' inputs and target sentences at the lexical level. The second step compares sentence types and semantic markers between users' inputs and target sentences at the semantic level. The last step matches users' inputs against predefined lexico-syntactic patterns. In the experiments, the proposed method showed better response precision and user satisfaction rates than simple keyword matching methods.
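The fall-through control flow of the three steps can be sketched as follows. Only the ordering (lexical, then semantic, then pattern matching, each tried when the previous one fails) comes from the abstract; the toy tables, the matching rules inside each step, and every name below are hypothetical simplifications.

```python
import re
from typing import Callable, List, Optional, Tuple

# Hypothetical toy resources standing in for the system's database.
TARGETS = {("what", "is", "your", "name"): "I am a chatbot."}
MARKERS = {"name": "NAME", "named": "NAME"}            # word -> semantic marker
SEMANTIC = {("question", "NAME"): "People call me Bot."}
PATTERNS = [(re.compile(r"\bcall (?:you|yourself)\b"), "You can call me Bot.")]


def tokenize(query: str) -> List[str]:
    return query.lower().strip("?!. ").split()


def lexical_step(query: str) -> Optional[str]:
    """Step 1: exact match on the keyword sequence."""
    return TARGETS.get(tuple(tokenize(query)))


def semantic_step(query: str) -> Optional[str]:
    """Step 2: match on (sentence type, semantic marker)."""
    sentence_type = "question" if query.strip().endswith("?") else "statement"
    for word in tokenize(query):
        marker = MARKERS.get(word)
        if marker is not None and (sentence_type, marker) in SEMANTIC:
            return SEMANTIC[(sentence_type, marker)]
    return None


def pattern_step(query: str) -> Optional[str]:
    """Step 3: match against predefined lexico-syntactic patterns."""
    for pattern, response in PATTERNS:
        if pattern.search(query.lower()):
            return response
    return None


STEPS: List[Tuple[str, Callable[[str], Optional[str]]]] = [
    ("lexical", lexical_step),
    ("semantic", semantic_step),
    ("pattern", pattern_step),
]


def search(query: str) -> Optional[Tuple[str, str]]:
    """Try each step in order; a later step runs only if the earlier ones fail."""
    for name, step in STEPS:
        response = step(query)
        if response is not None:
            return name, response
    return None


for q in ["What is your name?", "Are you named something?", "What should I call you"]:
    print(q, "->", search(q))
```

Keeping the steps in an ordered list makes it simple to report which level produced a response, which is useful when measuring how often each fallback is needed.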

Analysis and Prediction of Prosodic Phrase Boundary (운율구 경계현상 분석 및 텍스트에서의 운율구 추출)

  • Kim, Sang-Hun;Seong, Cheol-Jae;Lee, Jung-Chul
    • The Journal of the Acoustical Society of Korea / v.16 no.1 / pp.24-32 / 1997
  • This study aims, on the one hand, to describe the relationship between syntactic structure and prosodic phrasing and, on the other, to establish a suitable phrasing pattern for producing more natural synthetic speech. To obtain meaningful results, all word boundaries in the prosodic database were statistically analyzed and assigned a boundary type. The resulting 10 types of prosodic boundaries were classified into 3 types according to the strength of the break: zero, minor, and major. We found that durational information was the main cue for determining major prosodic boundaries. Using bigrams and trigrams of syntactic information, we predicted the major and minor boundary classes. With the bigram model, we obtained correct major break prediction rates of 4.60% and 38.2%, and insertion error rates of 22.8% and 8.4%, on the Test-I and Test-II text databases respectively. With the trigram model, we obtained correct major break prediction rates of 58.3% and 42.8%, and insertion error rates of 30.8% and 11.8%, on the Test-I and Test-II text databases respectively.

A Genotypical Analysis of Korean REMCs and Generation of Base Line Data for the Analysis and Evaluation for Future (REMCs) Designs Using Space Syntax

  • Ullah, Ubaid;Park, Jae Seung
    • Journal of The Korea Institute of Healthcare Architecture / v.22 no.1 / pp.17-28 / 2016
  • Purpose: The purpose of this paper is to analyze the spatial configurations of a sample of Korean regional emergency medical centers (REMCs) in order to explore their underlying genotypes and thus produce baseline data for the analysis and evaluation of future REMC designs using space syntax theory. Methods: Space syntax analysis was used as the major tool for the analysis and exploration of genotypes. The measures of integration (overall integration with and without the exterior, as well as the integration of individual clinical spaces for each center), base difference factor (DF), and space link ratio were calculated for a sample of seven Korean REMCs. Results: The results show a strikingly similar pattern of syntactic measures across the sample. The mean integration of the sample ranges from 0.82 to 0.99 with the exterior (treating the exterior space as the root) and from 0.81 to 1.01 without the exterior (considering only the connections among interior spaces, with no outside connection). The base difference factor (DF) of the sample varies from 0.60 to 0.81 with the exterior and from 0.59 to 0.82 without the exterior. Case number 1 was identified as a non-genotype, with a differing order of syntactic values. Although the genotype took different forms, layouts, and even sizes, these results cannot be explained by phenotypical comparisons. Implications: This study will contribute to the configurational analysis and evaluation of existing and future Korean REMC designs and to the practice of emergency healthcare delivery in Korea.
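Integration measures in space syntax are derived from how deep every other space lies from a given space in the plan's connectivity graph. As a rough sketch, the code below computes mean depth (MD) and relative asymmetry, RA = 2(MD - 1)/(k - 2), for a hypothetical five-space plan. The abstract does not state which integration formula or software the authors used, so the plan, the space names, and the choice to stop at RA are assumptions; the graph is assumed connected with at least three spaces.

```python
from collections import deque
from typing import Dict, List

Graph = Dict[str, List[str]]

# Hypothetical plan: each space lists the spaces it connects to directly.
PLAN: Graph = {
    "exterior": ["lobby"],
    "lobby": ["exterior", "triage", "waiting"],
    "triage": ["lobby", "resus"],
    "waiting": ["lobby"],
    "resus": ["triage"],
}


def mean_depth(graph: Graph, root: str) -> float:
    """Average shortest-path depth from root to every other space (BFS)."""
    depth = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in depth:
                depth[neighbour] = depth[node] + 1
                queue.append(neighbour)
    return sum(depth.values()) / (len(depth) - 1)


def relative_asymmetry(graph: Graph, root: str) -> float:
    """RA = 2 * (MD - 1) / (k - 2), where k is the number of spaces."""
    k = len(graph)
    return 2 * (mean_depth(graph, root) - 1) / (k - 2)


for space in PLAN:
    print(space, round(mean_depth(PLAN, space), 3),
          round(relative_asymmetry(PLAN, space), 3))
```

Lower RA means a space is shallower relative to the rest of the plan. Dropping the `exterior` node from the graph and recomputing gives the "without exterior" variant described in the abstract.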

A Comparative Analysis of the Word Depth Appearing in Representations Used in the Definitions of Mathematical Terms and Word Problem in Elementary School Mathematics Textbook (초등 수학 교과서의 수학 용어 정의 및 문장제에 사용된 표현의 문장 복잡성 비교 분석)

  • Kang, Yunji;Paik, Suckyoon
    • Journal of Elementary Mathematics Education in Korea / v.24 no.2 / pp.231-257 / 2020
  • Because the main mathematical concepts are presented and expressed in various ways through textbooks during teaching and learning, it is necessary to examine the representations used in elementary mathematics textbooks in order to find effective guidance. This study analyzed the sentences used in the definitions of mathematical terms and in the unit assessments of current elementary mathematics textbooks according to word depth (Yngve, 1960) from a syntactic perspective. The analysis showed that the sentences in the textbooks were generally concise, that word depth was relatively low, and that sentence structure and form differed depending on the individual characteristics of each term. In addition, the sentences in the lower-grade textbooks were more simply constructed, and the term-definition sentences were more complex than the unit-assessment sentences. Efforts should be made to help learners learn mathematical concepts, such as clarifying sentences in textbooks, presenting visual materials alongside them, and providing additional explanations suited to the level of individual learners.
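In the usual formulation of Yngve's measure, the branches under each node of a phrase structure tree are numbered from right to left starting at 0, and a word's depth is the sum of the branch numbers on the path from the root to that word. The sketch below applies that definition to a toy English tree; the nested-list tree encoding and the example sentence are assumptions, and the study's own parsing scheme for Korean textbook sentences is not described in the abstract.

```python
from typing import List, Tuple, Union

# A tree is either a word (str) or a list of subtrees.
Tree = Union[str, List["Tree"]]


def yngve_depths(tree: Tree, inherited: int = 0) -> List[Tuple[str, int]]:
    """Return (word, depth) pairs in sentence order.

    Each child adds its right-to-left index (rightmost = 0) to the depth
    inherited from its ancestors.
    """
    if isinstance(tree, str):
        return [(tree, inherited)]
    result: List[Tuple[str, int]] = []
    last = len(tree) - 1
    for index, child in enumerate(tree):
        result.extend(yngve_depths(child, inherited + (last - index)))
    return result


def mean_depth(tree: Tree) -> float:
    depths = [depth for _, depth in yngve_depths(tree)]
    return sum(depths) / len(depths)


# [S [NP the boy] [VP ate [NP an apple]]]
sentence: Tree = [["the", "boy"], ["ate", ["an", "apple"]]]
print(yngve_depths(sentence))
print(mean_depth(sentence))
```

Left-branching material raises the depth of the words it contains, so comparing mean or maximum depth across sentences gives one syntactic indicator of how complex a sentence is.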