• Title/Summary/Keyword: Korean parsing


A Scheduling Algorithm for Parsing of MPEG Video on the Heterogeneous Distributed Environment (이질적인 분산 환경에서의 MPEG비디오의 파싱을 위한 스케줄링 알고리즘)

  • Nam Yunyoung;Hwang Eenjun
    • Journal of KIISE:Computer Systems and Theory / v.31 no.12 / pp.673-681 / 2004
  • As digital video becomes more popular, demand is growing for efficient video browsing and retrieval. Supporting these operations requires effective video indexing, and one of its most fundamental steps is parsing a video stream into shots and scenes. In a traditional single-machine environment, parsing a video takes a long time because of the large amount of computation involved. Previous studies have widely used Round Robin scheduling, which allocates tasks to each slave for a time interval of one quantum, but this scheme is difficult to adapt to a heterogeneous environment. In this paper, we propose two parallel parsing algorithms for heterogeneous distributed computing environments: Size-Adaptive Round Robin and Dynamic Size-Adaptive Round Robin. We evaluate their performance through several experiments and present some of the results.
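The abstract does not spell out how Size-Adaptive Round Robin sizes its tasks, so the following is only a minimal sketch of one plausible reading: workers are still visited in round-robin order, but each receives a chunk scaled to its relative speed. The function name, the `speeds` mapping, and the `base_chunk` parameter are all hypothetical and not taken from the paper.

```python
from typing import Dict, List, Tuple


def size_adaptive_round_robin(
    total_units: int, speeds: Dict[str, float], base_chunk: int
) -> List[Tuple[str, int, int]]:
    """Assign consecutive [start, end) ranges of work units to workers.

    Assumption (not from the paper): the slowest worker gets `base_chunk`
    units per turn, and faster workers get proportionally larger chunks.
    """
    if total_units < 0 or base_chunk <= 0 or not speeds:
        raise ValueError("need total_units >= 0, base_chunk > 0, and at least one worker")
    slowest = min(speeds.values())
    if slowest <= 0:
        raise ValueError("worker speeds must be positive")

    # Per-worker chunk size, scaled by speed relative to the slowest worker.
    chunk = {w: max(1, round(base_chunk * s / slowest)) for w, s in speeds.items()}

    workers = list(speeds)
    schedule: List[Tuple[str, int, int]] = []
    start = 0
    turn = 0
    while start < total_units:
        worker = workers[turn % len(workers)]  # plain round-robin visiting order
        end = min(start + chunk[worker], total_units)
        schedule.append((worker, start, end))
        start = end
        turn += 1
    return schedule


if __name__ == "__main__":
    for worker, start, end in size_adaptive_round_robin(
        50, {"slow": 1.0, "fast": 2.0}, base_chunk=10
    ):
        print(worker, start, end)
```

A dynamic variant would presumably re-estimate the speeds while the job runs; that part is left out here because the abstract gives no detail about it.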

Processing Three Types of Korean Cleft Constructions in a Typed Feature Structure Grammar (유형화된 자질문법에서의 한국어 분열구문의 전산학적 처리)

  • Kim, Jong-Bok;Yang, Jae-Hyung
    • Korean Journal of Cognitive Science / v.20 no.1 / pp.1-28 / 2009
  • The expression KES, one of the most commonly used words in Korean, has a variety of uses, including the expression of English-like cleft constructions. Korean seems to employ at least three different types of cleft construction: predicational, identificational, and eventual. This paper provides a constraint-based analysis of these three types and implements it in the LKB (Linguistic Knowledge Building) system to check its feasibility. In particular, the paper shows how a typed feature structure grammar based on HPSG can provide a robust basis for parsing Korean cleft constructions.

CFG based Korean Parsing Using Sentence Patterns as Syntactic Constraint (구문 제약으로 문형을 사용하는 CFG기반의 한국어 파싱)

  • Park, In-Cheol
    • Journal of the Korea Academia-Industrial cooperation Society / v.9 no.4 / pp.958-963 / 2008
  • Korean has distinctive structural properties that are controlled by the semantic constraints of verbs. In addition, most Korean sentences are complex sentences consisting of a main clause and an embedded clause. It is therefore difficult to write an appropriate syntactic grammar or set of constraints for Korean, and Korean parsing produces various syntactic ambiguities. In this paper, we describe a CFG-based grammar that uses sentence patterns as syntactic constraints to resolve these ambiguities. We classified 44 sentence patterns, including complex sentences with subordinate clauses, and used them to reduce syntactic ambiguity. Sentence-pattern information alone cannot resolve every syntactic ambiguity, however, so we also used semantic markers with semantic constraints. Semantic markers can resolve ambiguities caused by auxiliary particles or comitative case particles.

High Speed Korean Dependency Analysis Using Cascaded Chunking (다단계 구단위화를 이용한 고속 한국어 의존구조 분석)

  • Oh, Jin-Young;Cha, Jeong-Won
    • Journal of the Korea Society for Simulation / v.19 no.1 / pp.103-111 / 2010
  • Syntactic analysis is an important step in natural language processing. However, existing Korean syntactic analyzers are difficult to use in practice because of their low performance and lack of robustness. We propose a new robust, high-speed, high-performance Korean syntactic analyzer based on CRFs. We treat parsing as a labeling problem and apply cascaded chunking: at each step, CRFs assign syntactic information to each Eojeol, using part-of-speech tags and Eojeol syntactic tags as features. Experimental results using 10-fold cross-validation show significant improvements in robustness, speed, and performance on long Korean sentences.
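To make the idea of treating parsing as repeated labeling more concrete, here is a rough sketch of a cascade loop. It is not the authors' system: the two-tag scheme ("D" for "depends on the unit to its right", "O" otherwise), the `cascaded_chunking` function, and the scripted `toy_labeler` that stands in for a trained CRF are all assumptions made for illustration.

```python
from typing import Callable, Dict, List

# A labeler maps the remaining units to one tag each: "D" (attach to the
# unit on the right) or "O" (leave for a later pass). In the paper this role
# is played by CRFs; here it is a hypothetical stand-in.
Labeler = Callable[[List[str]], List[str]]


def cascaded_chunking(eojeols: List[str], labeler: Labeler) -> Dict[int, int]:
    """Return {dependent_index: head_index} built by repeated labeling passes."""
    remaining = list(range(len(eojeols)))  # indices not yet attached to a head
    heads: Dict[int, int] = {}
    while len(remaining) > 1:
        tags = labeler([eojeols[i] for i in remaining])
        next_remaining: List[int] = []
        for pos, idx in enumerate(remaining):
            has_right_neighbor = pos < len(remaining) - 1
            if has_right_neighbor and tags[pos] == "D":
                heads[idx] = remaining[pos + 1]  # attach and drop from the cascade
            else:
                next_remaining.append(idx)
        if len(next_remaining) == len(remaining):
            break  # the labeler attached nothing; stop instead of looping forever
        remaining = next_remaining
    return heads


def toy_labeler(units: List[str]) -> List[str]:
    """Scripted stand-in: only the unit just left of the final one attaches."""
    return ["D" if i == len(units) - 2 else "O" for i in range(len(units))]


if __name__ == "__main__":
    sentence = ["나는", "학교에", "갔다"]  # "I went to school"
    print(cascaded_chunking(sentence, toy_labeler))
```

Each pass only makes local attach-or-wait decisions, which is what lets a sequence labeler replace a full chart parser. The quality of the result then depends entirely on the labeler.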

Range Detection of Wa/Kwa Parallel Noun Phrase by Alignment method (정렬기법을 활용한 와/과 병렬명사구 범위 결정)

  • Choe, Yong-Seok;Sin, Ji-Ae;Choe, Gi-Seon;Kim, Gi-Tae;Lee, Sang-Tae
    • Proceedings of the Korean Society for Emotion and Sensibility Conference / 2008.10a / pp.90-93 / 2008
  • In natural language, repeated constituents in an expression are commonly omitted, and the omitted constituents must be recovered when analyzing the meaning of the sentence. This paper addresses the recognition of parallel noun phrase boundaries by identifying omitted constituents. Recognizing parallel noun phrases can greatly reduce complexity in the sentence parsing phase. Moreover, in natural language information retrieval, recognizing nouns together with their modifiers can play an important role in building indexes. We propose an unsupervised probabilistic model that identifies parallel cores as well as the boundaries of parallel noun phrases conjoined by a conjunctive particle. It is based on the idea of swapping constituents, exploiting symmetry (two or more identical constituents are repeated) and reversibility (the order of constituents can be changed) in parallel structures. Semantic features of the modifiers around a parallel noun phrase are also used in the probabilistic swapping model. The model is language-independent; in this paper it is applied to parallel noun phrases in Korean. Experiments show that our probabilistic model outperforms a symmetry-based model and supervised machine-learning approaches.

Improving Parsing Efficiency Using Chunking in Chinese-Korean Machine Translation (중한번역에서 구 묶음을 이용한 파싱 효율 개선)

  • 양재형;심광섭
    • Journal of KIISE:Software and Applications / v.31 no.8 / pp.1083-1091 / 2004
  • This paper presents a chunking system used as a preprocessing module for the parser in a Chinese-to-Korean machine translation system. The parser benefits from the dependency information provided by the chunking module. The chunking system was implemented using transformation-based learning, and an effective interface was devised to convey the dependency information to the parser. The module was integrated into the machine translation system, and experiments were performed with corpora collected from Chinese websites. The experimental results show that introducing the chunking module noticeably improves the parser's performance.

Cascaded Parsing Korean Sentences Using Grammatical Relations (문법관계 정보를 이용한 단계적 한국어 구문 분석)

  • Lee, Song-Wook
    • The KIPS Transactions:PartB / v.15B no.1 / pp.69-72 / 2008
  • This study aims to identify dependency structures in Korean sentences using cascaded chunking. In the first stage of the cascade, we find NP chunks and predict grammatical relations (GRs) with Support Vector Machine (SVM) classifiers for all possible modifier-head pairs of chunks, using GR categories such as subject, object, complement, and adverbial. In the subsequent stages, we filter out incorrect modifier-head relations for the GR handled at each stage, using the SVM classifiers and characteristics of Korean such as the distance between relations, the no-crossing constraint, and case properties. In an experiment using a parsed, GR-tagged corpus to train the proposed parser, we achieved an overall accuracy of 85.7%.
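The no-crossing property mentioned above can be illustrated with a small filter over candidate modifier-head pairs. This is a sketch under stated assumptions, not the paper's procedure: the candidate scores stand in for SVM outputs, and the greedy keep-the-best-first strategy and both function names are hypothetical.

```python
from typing import List, Tuple

Arc = Tuple[int, int]  # (modifier position, head position) in the chunk sequence


def arcs_cross(a: Arc, b: Arc) -> bool:
    """Two arcs cross when exactly one endpoint of one lies strictly inside the other."""
    a_lo, a_hi = sorted(a)
    b_lo, b_hi = sorted(b)
    return a_lo < b_lo < a_hi < b_hi or b_lo < a_lo < b_hi < a_hi


def filter_no_crossing(candidates: List[Tuple[int, int, float]]) -> List[Arc]:
    """Greedily keep the highest-scoring arcs that do not cross an already kept arc.

    Each candidate is (modifier, head, score); the score is assumed to come
    from some classifier and is only used for ordering.
    """
    kept: List[Arc] = []
    for modifier, head, _score in sorted(candidates, key=lambda c: c[2], reverse=True):
        arc = (modifier, head)
        if not any(arcs_cross(arc, other) for other in kept):
            kept.append(arc)
    return kept


if __name__ == "__main__":
    candidates = [(0, 2, 0.9), (1, 3, 0.8), (1, 2, 0.7)]
    print(filter_no_crossing(candidates))
```

In this example the arc (1, 3) is dropped because it crosses the higher-scoring arc (0, 2), while (1, 2) nests inside (0, 2) and is kept. The filter checks crossing only; it does not enforce a single head per modifier or the other constraints named in the abstract.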

Working memory and sensitivity to prosody in spoken language processing (언어 처리에서 운율 제약 활용과 작업 기억의 관계)

  • Lee, Eun-Kyung
    • Korean Journal of Cognitive Science / v.23 no.2 / pp.249-267 / 2012
  • Individual differences in working memory predict qualitative differences in language processing. High-span comprehenders are better able to integrate probabilistic information such as plausibility and animacy, the use of which requires the computation of real-world knowledge in syntactic parsing (e.g., [1]). However, it is unclear whether similar individual differences exist in the use of informative prosodic cues. This study examines whether working memory modulates the use of prosodic boundary information in attachment ambiguity resolution. Prosodic boundaries were manipulated in globally ambiguous relative clause sentences. The results show that high-span listeners are more likely than low-span listeners to be sensitive to the distinction between different types of prosodic boundaries. The findings suggest that, like the use of high-level constraints, the use of low-level prosodic information is resource-demanding.
