• Title/Summary/Keyword: sequence to sequence labeling

Search Result 63, Processing Time 0.02 seconds

Korean Semantic Role Labeling Using Structured SVM (Structural SVM 기반의 한국어 의미역 결정)

  • Lee, Changki;Lim, Soojong;Kim, Hyunki
    • Journal of KIISE
    • /
    • v.42 no.2
    • /
    • pp.220-226
    • /
    • 2015
  • Semantic role labeling (SRL) systems determine the semantic role labels of the arguments of predicates in natural language text. An SRL system usually needs to perform four tasks in sequence: Predicate Identification (PI), Predicate Classification (PC), Argument Identification (AI), and Argument Classification (AC). In this paper, we use the Korean Propbank to develop our Korean semantic role labeling system. We describe our Korean semantic role labeling system that uses sequence labeling with structured Support Vector Machine (SVM). The results of our experiments on the Korean Propbank dataset reveal that our method obtains a 97.13% F1 score on Predicate Identification and Classification (PIC), and a 76.96% F1 score on Argument Identification and Classification (AIC).

Reference String Recognition based on Word Sequence Tagging and Post-processing: Evaluation with English and German Datasets

  • Kang, In-Su
    • Journal of the Korea Society of Computer and Information
    • /
    • v.23 no.5
    • /
    • pp.1-7
    • /
    • 2018
  • Reference string recognition is to extract individual reference strings from a reference section of an academic article, which consists of a sequence of reference lines. This task has been attacked by heuristic-based, clustering-based, classification-based approaches, exploiting lexical and layout characteristics of reference lines. Most classification-based methods have used sequence labeling to assign labels to either a sequence of tokens within reference lines, or a sequence of reference lines. Unlike the previous token-level sequence labeling approach, this study attempts to assign different labels to the beginning, intermediate and terminating tokens of a reference string. After that, post-processing is applied to identify reference strings by predicting their beginning and/or terminating tokens. Experimental evaluation using English and German reference string recognition datasets shows that the proposed method obtains above 94% in the macro-averaged F1.

The Sequence Labeling Approach for Text Alignment of Plagiarism Detection

  • Kong, Leilei;Han, Zhongyuan;Qi, Haoliang
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.13 no.9
    • /
    • pp.4814-4832
    • /
    • 2019
  • Plagiarism detection is increasingly exploiting text alignment. Text alignment involves extracting the plagiarism passages in a pair of the suspicious document and its source document. The heuristics have achieved excellent performance in text alignment. However, the further improvements of the heuristic methods mainly depends more on the experiences of experts, which makes the heuristics lack of the abilities for continuous improvements. To address this problem, machine learning maybe a proper way. Considering the position relations and the context of text segments pairs, we formalize the text alignment task as a problem of sequence labeling, improving the current methods at the model level. Especially, this paper proposes to use the probabilistic graphical model to tag the observed sequence of pairs of text segments. Hence we present the sequence labeling approach for text alignment in plagiarism detection based on Conditional Random Fields. The proposed approach is evaluated on the PAN@CLEF 2012 artificial high obfuscation plagiarism corpus and the simulated paraphrase plagiarism corpus, and compared with the methods achieved the best performance in PAN@CLEF 2012, 2013 and 2014. Experimental results demonstrate that the proposed approach significantly outperforms the state of the art methods.

A Review of the Opinion Target Extraction using Sequence Labeling Algorithms based on Features Combinations

  • Aziz, Noor Azeera Abdul;MohdAizainiMaarof, MohdAizainiMaarof;Zainal, Anazida;HazimAlkawaz, Mohammed
    • Journal of Internet Computing and Services
    • /
    • v.17 no.5
    • /
    • pp.111-119
    • /
    • 2016
  • In recent years, the opinion analysis is one of the key research fronts of any domain. Opinion target extraction is an essential process of opinion analysis. Target is usually referred to noun or noun phrase in an entity which is deliberated by the opinion holder. Extraction of opinion target facilitates the opinion analysis more precisely and in addition helps to identify the opinion polarity i.e. users can perceive opinion in detail of a target including all its features. One of the most commonly employed algorithms is a sequence labeling algorithm also called Conditional Random Fields. In present article, recent opinion target extraction approaches are reviewed based on sequence labeling algorithm and it features combinations by analyzing and comparing these approaches. The good selection of features combinations will in some way give a good or better accuracy result. Features combinations are an essential process that can be used to identify and remove unneeded, irrelevant and redundant attributes from data that do not contribute to the accuracy of a predictive model or may in fact decrease the accuracy of the model. Hence, in general this review eventually leads to the contribution for the opinion analysis approach and assist researcher for the opinion target extraction in particular.

Scale Invariant Auto-context for Object Segmentation and Labeling

  • Ji, Hongwei;He, Jiangping;Yang, Xin
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.8 no.8
    • /
    • pp.2881-2894
    • /
    • 2014
  • In complicated environment, context information plays an important role in image segmentation/labeling. The recently proposed auto-context algorithm is one of the effective context-based methods. However, the standard auto-context approach samples the context locations utilizing a fixed radius sequence, which is sensitive to large scale-change of objects. In this paper, we present a scale invariant auto-context (SIAC) algorithm which is an improved version of the auto-context algorithm. In order to achieve scale-invariance, we try to approximate the optimal scale for the image in an iterative way and adopt the corresponding optimal radius sequence for context location sampling, both in training and testing. In each iteration of the proposed SIAC algorithm, we use the current classification map to estimate the image scale, and the corresponding radius sequence is then used for choosing context locations. The algorithm iteratively updates the classification maps, as well as the image scales, until convergence. We demonstrate the SIAC algorithm on several image segmentation/labeling tasks. The results demonstrate improvement over the standard auto-context algorithm when large scale-change of objects exists.

Discriminative Training of Sequence Taggers via Local Feature Matching

  • Kim, Minyoung
    • International Journal of Fuzzy Logic and Intelligent Systems
    • /
    • v.14 no.3
    • /
    • pp.209-215
    • /
    • 2014
  • Sequence tagging is the task of predicting frame-wise labels for a given input sequence and has important applications to diverse domains. Conventional methods such as maximum likelihood (ML) learning matches global features in empirical and model distributions, rather than local features, which directly translates into frame-wise prediction errors. Recent probabilistic sequence models such as conditional random fields (CRFs) have achieved great success in a variety of situations. In this paper, we introduce a novel discriminative CRF learning algorithm to minimize local feature mismatches. Unlike overall data fitting originating from global feature matching in ML learning, our approach reduces the total error over all frames in a sequence. We also provide an efficient gradient-based learning method via gradient forward-backward recursion, which requires the same computational complexity as ML learning. For several real-world sequence tagging problems, we empirically demonstrate that the proposed learning algorithm achieves significantly more accurate prediction performance than standard estimators.

The Research of the 2-Edge Labeling Methods on Binomial Trees (이항트리에서 2-에지번호매김 방법에 대한 연구)

  • Kim, Yong Seok
    • KIPS Transactions on Computer and Communication Systems
    • /
    • v.4 no.2
    • /
    • pp.37-40
    • /
    • 2015
  • In this paper, we present linear, varied and mixed edge labeling methods using 2-edge labeling on binomial trees. As a result of this paper, we can design the variable topologies to enable optimal broadcasting with binomial tree as spanning tree, if we use these edge labels as the jump sequence of a sort of interconnection networks, circulant graph, with maximum connectivity and high reliability.

Korean Semantic Role Labeling using Input-feeding RNN Search Model with CopyNet (Input-feeding RNN Search 모델과 CopyNet을 이용한 한국어 의미역 결정)

  • Bae, Jangseong;Lee, Changki
    • 한국어정보학회:학술대회논문집
    • /
    • 2016.10a
    • /
    • pp.300-304
    • /
    • 2016
  • 본 논문에서는 한국어 의미역 결정을 순차열 분류 문제(Sequence Labeling Problem)가 아닌 순차열 변환 문제(Sequence-to-Sequence Learning)로 접근하였고, 구문 분석 단계와 자질 설계가 필요 없는 End-to-end 방식으로 연구를 진행하였다. 음절 단위의 RNN Search 모델을 사용하여 음절 단위로 입력된 문장을 의미역이 달린 어절들로 변환하였다. 또한 순차열 변환 문제의 성능을 높이기 위해 연구된 인풋-피딩(Input-feeding) 기술과 카피넷(CopyNet) 기술을 한국어 의미역 결정에 적용하였다. 실험 결과, Korean PropBank 데이터에서 79.42%의 레이블 단위 f1-score, 71.58%의 어절 단위 f1-score를 보였다.

  • PDF

Korean Semantic Role Labeling using Input-feeding RNN Search Model with CopyNet (Input-feeding RNN Search 모델과 CopyNet을 이용한 한국어 의미역 결정)

  • Bae, Jangseong;Lee, Changki
    • Annual Conference on Human and Language Technology
    • /
    • 2016.10a
    • /
    • pp.300-304
    • /
    • 2016
  • 본 논문에서는 한국어 의미역 결정을 순차열 분류 문제(Sequence Labeling Problem)가 아닌 순차열 변환 문제(Sequence-to-Sequence Learning)로 접근하였고, 구문 분석 단계와 자질 설계가 필요 없는 End-to-end 방식으로 연구를 진행하였다. 음절 단위의 RNN Search 모델을 사용하여 음절 단위로 입력된 문장을 의미역이 달린 어절들로 변환하였다. 또한 순차열 변환 문제의 성능을 높이기 위해 연구된 인풋-피딩(Input-feeding) 기술과 카피넷(CopyNet) 기술을 한국어 의미역 결정에 적용하였다. 실험 결과, Korean PropBank 데이터에서 79.42%의 레이블 단위 f1-score, 71.58%의 어절 단위 f1-score를 보였다.

  • PDF

Improved Perfusion Contrast and Reliability in MR Perfusion Images Using A Novel Arterial Spin Labeling

  • Jahng, Geon-Ho;Xioaping Zhu;Gerald Matson;Weiner, Michael-W;Norbert Schuff
    • Proceedings of the Korean Society of Medical Physics Conference
    • /
    • 2002.09a
    • /
    • pp.341-344
    • /
    • 2002
  • Neurodegenerative disorders, like Alzheimer's disease, are often accompanied by reduced brain perfusion (cerebral blood flow). Using the intrinsic magnetic properties of water, arterial spin labeling magnetic resonance imaging (ASLMRI) can map brain perfusion without injection of radioactive tracers or contrast agents. However, accuracy in measuring perfusion with ASL-MRI can be limited because of contributions to the signal from stationary spins and because of signal modulations due to transient magnetic field effects. The goal was to optimize ASL-MRI for perfusion measurements in the aging human brain, including brains with Alzheimer's disease. A new ASL-MRI sequence was designed and evaluated on phantom and humans. Image texture analysis was performed to test quantitatively improvements. Compared to other ASL-MRI methods, the newly designed sequence provided improved signal to noise ratio improved signal uniformity across slices, and thus, increased measurement reliability. This new ASL-MRI sequence should therefore provide improved measurements of regional changes of brain perfusion in normal aging and neurodegenerative disorders.

  • PDF