• Title/Summary/Keyword: phrase identification

Sensitivity to Phrase-initial Tone and Laryngeal Feature Identification of Foreign Learners of Korean

  • Lee, Hye-Sook
    • Phonetics and Speech Sciences / v.2 no.3 / pp.91-99 / 2010
  • This paper reports on an identification test in which KFL (Korean as a foreign language) learners identified the Korean three-way laryngeal contrast in phrase-initial position while the phrase-initial tone was systematically manipulated. Heritage learners have some sensitivity to phrase-initial tone and show a plain-aspirated alternation in their identification according to the phrase-initial tone, as native speakers do, whereas non-heritage students do not show such tone sensitivity. However, after weekly prosody training, second-year non-heritage students showed a significant improvement in their performance. This paper shows that the phrase-initial tone plays a critical role in distinguishing the laryngeal features of Korean obstruents, and suggests that prosody, including the tone-segment correlation, should be incorporated into the KFL curriculum.

Identification of Maximal-Length Noun Phrases Based on Expanded Chunks and Classified Punctuations in Chinese (확장청크와 세분화된 문장부호에 기반한 중국어 최장명사구 식별)

  • Bai, Xue-Mei;Li, Jin-Ji;Kim, Dong-Il;Lee, Jong-Hyeok
    • Journal of KIISE: Software and Applications / v.36 no.4 / pp.320-328 / 2009
  • In general, there are two types of noun phrases (NP): the Base Noun Phrase (BNP) and the Maximal-Length Noun Phrase (MNP). MNP identification can greatly reduce the complexity of full parsing, help analyze the general structure of complex sentences, and provide important clues for detecting main predicates in Chinese sentences. In this paper, we propose a two-phase hybrid approach for MNP identification which adopts salient features such as expanded chunks and classified punctuation to improve performance. Experimental results show a high performance of 89.66% in $F_1$-measure.
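
For readers unfamiliar with the metric, the following is a minimal sketch of how an $F_1$-measure over identified phrases is typically computed, assuming exact-match scoring of (start, end) word spans. The function name `f1_score` and the example spans are hypothetical; the abstract does not state the paper's exact scoring protocol.

```python
def f1_score(gold, predicted):
    """Harmonic mean of precision and recall over exactly matching spans."""
    gold, predicted = set(gold), set(predicted)
    correct = len(gold & predicted)
    if correct == 0:
        return 0.0
    precision = correct / len(predicted)  # share of predicted spans that are right
    recall = correct / len(gold)          # share of gold spans that were found
    return 2 * precision * recall / (precision + recall)


# Hypothetical (start, end) word-index spans, not taken from the paper.
gold_spans = [(0, 2), (4, 6), (8, 8)]
predicted_spans = [(0, 2), (4, 5), (8, 8)]
score = f1_score(gold_spans, predicted_spans)
```

Here two of three predicted spans match the gold spans exactly, so precision and recall are both 2/3.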

An Artificial Neural Network Based Phrase Network Construction Method for Structuring Facility Error Types (설비 오류 유형 구조화를 위한 인공신경망 기반 구절 네트워크 구축 방법)

  • Roh, Younghoon;Choi, Eunyoung;Choi, Yerim
    • Journal of Internet Computing and Services / v.19 no.6 / pp.21-29 / 2018
  • In the era of the Fourth Industrial Revolution, the concept of the smart factory is emerging. There are efforts to use data analysis to predict the occurrence of facility errors, which have negative effects on utilization and productivity. Such prediction requires data describing the situation and the type of each facility error, called the facility error log. However, in many manufacturing companies, the types of facility errors are not precisely defined and categorized. The worker who operates the facilities writes the type of facility error as unstructured text based on his or her empirical judgement, which makes the data impossible to analyze. Therefore, this paper proposes a framework for constructing a phrase network to support the identification and classification of facility error types by using facility error logs written by operators. Specifically, phrases indicating the error types are extracted from text data by using a dictionary that classifies terms by their usage. Then, a phrase network is constructed by calculating the similarity between the extracted phrases. The performance of the proposed method was evaluated using real-world facility error logs. It is expected that the proposed method will contribute to the accurate identification of error types and to the prediction of facility errors.
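
The network-construction step in this abstract can be illustrated with a minimal sketch. It assumes token-level Jaccard similarity and a fixed threshold as stand-ins: the abstract does not specify the similarity measure (the title indicates a neural-network-based one), so `jaccard`, `build_phrase_network`, the threshold value, and the example phrases are all hypothetical.

```python
from itertools import combinations


def jaccard(a: str, b: str) -> float:
    """Token-level Jaccard similarity (an assumed stand-in measure)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def build_phrase_network(phrases, threshold=0.3):
    """Map each phrase to the set of phrases whose similarity meets the threshold."""
    network = {p: set() for p in phrases}
    for p, q in combinations(phrases, 2):
        if jaccard(p, q) >= threshold:
            network[p].add(q)
            network[q].add(p)
    return network


# Hypothetical error-type phrases, not taken from the paper's data.
phrases = ["conveyor belt jam", "belt jam", "sensor signal loss"]
network = build_phrase_network(phrases)
```

In this toy input, "conveyor belt jam" and "belt jam" share two of three tokens and are linked, while "sensor signal loss" stays isolated, hinting at a separate error type.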

A Method for Clustering Noun Phrases into Coreferents for the Same Person in Novels Translated into Korean (한국어 번역 소설에서 인물명 명사구의 동일인물 공통참조 클러스터링 방법)

  • Park, Taekeun;Kim, Seung-Hoon
    • Journal of Korea Multimedia Society / v.20 no.3 / pp.533-542 / 2017
  • Novels include various character names, depending on the genre, the spatio-temporal background of the novel, and the nationality of the characters. Moreover, characters and their names are created by the author's pen and imagination. As a result, no proper noun dictionary can include all kinds of character names. In addition, novels translated into Korean contain character names consisting of two or more nouns (such as "Harry Potter"). In this paper, we propose a method to extract noun phrases for character names and to cluster the noun phrases into coreferents for the same character name. To extract noun phrases, we utilize the KKMA morpheme analyzer and the CPFoAN character identification tool. To cluster the noun phrases into coreferents, we construct a directed graph from the character names extracted by CPFoAN and the extracted noun phrases, and then create name sets for characters by traversing connected subgraphs in the directed graph. We conduct a survey with four novels translated into Korean to evaluate the proposed method. The results show that the proposed method will be useful for speaker identification as well as for constructing the social network of characters.
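
The clustering step described here (name sets obtained by traversing connected subgraphs of a directed graph) can be sketched as follows. This is only an illustration under assumptions: edge direction is ignored during traversal (weak connectivity), and the function `name_sets` and the example edges are hypothetical rather than the authors' implementation.

```python
from collections import defaultdict, deque


def name_sets(edges):
    """Group the nodes of a directed graph into weakly connected components."""
    adjacent = defaultdict(set)
    for src, dst in edges:
        adjacent[src].add(dst)
        adjacent[dst].add(src)  # ignore direction while traversing
    seen, groups = set(), []
    for start in sorted(adjacent):
        if start in seen:
            continue
        seen.add(start)
        queue, group = deque([start]), set()
        while queue:  # breadth-first traversal of one component
            node = queue.popleft()
            group.add(node)
            for nxt in adjacent[node] - seen:
                seen.add(nxt)
                queue.append(nxt)
        groups.append(group)
    return groups


# Hypothetical edges from an extracted noun phrase to a known character name.
edges = [
    ("Harry", "Harry Potter"),
    ("Potter", "Harry Potter"),
    ("Hermione", "Hermione Granger"),
]
groups = name_sets(edges)
```

Each resulting group collects the surface forms assumed to refer to one character.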

Comparison of Application Effect of Natural Language Processing Techniques for Information Retrieval (정보검색에서 자연어처리 응용효과 분석)

  • Xi, Su Mei;Cho, Young Im
    • Journal of Institute of Control, Robotics and Systems / v.18 no.11 / pp.1059-1064 / 2012
  • Natural language processing techniques have been applied to information retrieval, but the results are known to be unsatisfactory. To clarify the roles of some classical natural language processing techniques in information retrieval and to determine which performs better, we compared the effects of various techniques on retrieval precision. The experimental results show that basic natural language processing techniques, which have low computational cost and are simple to implement, help information retrieval slightly. More complex natural language processing techniques, which have high computational cost and low precision, do not improve retrieval precision and can even harm it. The role of natural language understanding may therefore be larger in question answering, automatic summarization, and information extraction.

Maximal Length Noun Phrase Identification Based on Punctuations and Expanded Chunk (문장부호 정보와 확장된 청크에 기반한 중국어 최장명사구 식별)

  • Bai, Xue-Mei;Jin, Mei-Xun;Li, Jin-Ji;Chung, You-Jin;Lee, Jong-Hyeok
    • Annual Conference on Human and Language Technology / 2005.10a / pp.112-119 / 2005
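  • Noun phrases are classified into base noun phrases and maximal-length noun phrases. Accurate identification of maximal-length noun phrases plays an important role in grasping the overall syntactic structure of a sentence and in finding its correct governing predicate. This paper proposes a maximal-length noun phrase identification method that uses an expanded chunk concept and punctuation information subdivided into five classes. The proposed method achieves a strong average F-measure of 88.63%, an improvement of 4.05% over the baseline model.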

Korean Noun Phrase Identification Using Maximum Entropy Method (최대 엔트로피 모델을 이용한 한국어 명사구 추출)

  • 강인호;전수영;김길창
    • Proceedings of the Korean Society for Cognitive Science Conference / 2000.06a / pp.127-132 / 2000
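  • This paper studies a method for extracting noun phrases, including their modifiers, by exploiting the syntactic properties of case particles. Unlike previous methods, which used contiguous morpheme sequences as context information to decide whether a candidate is a noun phrase, the proposed method uses the beginning and end of the noun phrase and the morphemes surrounding it, so that the modifying part and the head noun of the noun phrase serve as context information. The various kinds of context information are combined into a single probability distribution by the Maximum Entropy Principle. The proposed method first obtains noun phrase grammar rules, expressed as part-of-speech sequences, from a corpus tagged with syntactic trees. Using these rules, it extracts noun phrase candidates adjacent to case particles. Each extracted candidate is assigned a probability of being interpreted as a noun phrase, based on the probability distribution obtained from the training corpus. The noun phrase associated with each case particle is then extracted by selecting the candidate with the highest probability. In experiments with the proposed model, we were able to extract noun phrases containing an average of 4.5 phrases.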

Identification of Chinese Maximal Noun Phrase on Different Context Size Settings Using SVMs (SVMs을 이용한 중국어 최장 명사구 자동 식별)

  • 윤창호;이금희;정유진;김동일;이종혁
    • Proceedings of the Korean Information Science Society Conference / 2004.04b / pp.889-891 / 2004
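  • Chinese noun phrases can be classified into base noun phrases, minimal noun phrases, maximal noun phrases, and so on. If maximal noun phrases can be identified well, the complexity of syntactic parsing can be greatly reduced and parsing performance improved. Each word is tagged with one of four tags: a beginning tag (O), an ending tag (C), a single-word phrase tag (S), or a tag for all other words (N). This paper builds five SVM learning models based on different window sizes and, using a system combination method, achieves a precision of 85.17% in Chinese maximal noun phrase identification.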
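
The four-tag scheme and the window-based context in this abstract can be sketched as follows. This is a minimal illustration, assuming phrases are given as inclusive (start, end) word-index spans and that a window is padded at sentence edges. The helper names `tag_words` and `window_features`, the padding token, and the example sentence are hypothetical; the SVM training and the system combination step are not shown.

```python
def tag_words(num_words, spans):
    """Assign O/C/S/N tags given inclusive (start, end) word-index spans."""
    tags = ["N"] * num_words          # N: any word that is not a boundary
    for start, end in spans:
        if start == end:
            tags[start] = "S"         # S: single-word phrase
        else:
            tags[start] = "O"         # O: first word of the phrase
            tags[end] = "C"           # C: last word of the phrase
    return tags


def window_features(words, i, size):
    """Return the words within `size` positions of index i, padded at the edges."""
    padded = ["<PAD>"] * size + list(words) + ["<PAD>"] * size
    return padded[i : i + 2 * size + 1]


# Hypothetical five-word sentence with one three-word and one single-word phrase.
words = ["w0", "w1", "w2", "w3", "w4"]
tags = tag_words(len(words), [(0, 2), (4, 4)])
context = window_features(words, 0, 1)
```

Varying `size` yields the different context windows on which separate models could be trained.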

Utilizing Various Natural Language Processing Techniques for Biomedical Interaction Extraction

  • Park, Kyung-Mi;Cho, Han-Cheol;Rim, Hae-Chang
    • Journal of Information Processing Systems / v.7 no.3 / pp.459-472 / 2011
  • The vast body of biomedical literature is an important source for discovering biomedical interaction information. However, obtaining interaction information from it is complicated because most of the literature is not easily machine-readable. In this paper, we present a method for extracting biomedical interaction information, assuming that the biomedical Named Entities (NEs) are already identified. The proposed method labels all possible pairs of given biomedical NEs as INTERACTION or NO-INTERACTION by using a Maximum Entropy (ME) classifier. The features used for the classifier are obtained by applying various NLP techniques such as POS tagging, base phrase recognition, parsing, and predicate-argument recognition. In particular, specific verb predicates (activate, inhibit, diminish, etc.) and their biomedical NE arguments are very useful features for identifying interacting NE pairs. Based on this, we devised a two-step method: 1) an interaction verb extraction step to find biomedically salient verbs, and 2) an argument relation identification step to generate partial predicate-argument structures between extracted interaction verbs and their NE arguments. In the experiments, we analyzed how much each applied NLP technique improves the performance. Overall, the proposed method improves performance by more than 2% compared to the baseline method. The use of external contextual features, which are obtained from outside the NEs, is crucial for the performance improvement. We also compare the performance of the proposed method against co-occurrence-based and rule-based methods. The results demonstrate that the proposed method considerably improves the performance.
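
The pair-labeling setup in this abstract can be sketched as below. It assumes entities are given as token positions and uses a placeholder rule in place of the trained Maximum Entropy classifier. The verb list, the feature names, the functions `candidate_pairs` and `label_pair`, and the example sentence are all hypothetical.

```python
from itertools import combinations

# Assumed toy lexicon; the paper extracts its interaction verbs from data.
INTERACTION_VERBS = {"activate", "activates", "inhibit", "inhibits",
                     "diminish", "diminishes"}


def candidate_pairs(tokens, entity_positions):
    """Yield every unordered pair of entities with a simple feature dict."""
    for i, j in combinations(sorted(entity_positions), 2):
        between = tokens[i + 1 : j]
        verbs = [t for t in between if t.lower() in INTERACTION_VERBS]
        yield (tokens[i], tokens[j]), {"verbs_between": verbs, "distance": j - i}


def label_pair(features):
    """Placeholder decision rule standing in for the trained ME classifier."""
    return "INTERACTION" if features["verbs_between"] else "NO-INTERACTION"


# Hypothetical sentence with entities at token positions 0, 2, and 5.
tokens = "ProtA activates ProtB but not ProtC".split()
labels = {pair: label_pair(f) for pair, f in candidate_pairs(tokens, [0, 2, 5])}
```

The placeholder rule also marks (ProtA, ProtC) as INTERACTION because a verb lies between them, which illustrates why richer features such as predicate-argument structure and a trained classifier are needed.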

Revision of the modified Blood Stasis Questionnaire II (어혈 진단 설문지 개정 연구)

  • Jang, Soobin;Kang, Byoung-Kab;Ko, Mi Mi;Kim, Pyung-Wha;Jung, Jeeyoun
    • Journal of Society of Preventive Korean Medicine / v.24 no.2 / pp.95-102 / 2020
  • Objectives: The objective of this study was to revise the modified Blood Stasis Questionnaire II. Methods: This revision focused on refining the Korean wording of the Blood Stasis Questionnaire II, which consists of 30 questions. Seven external experts and five researchers from the Korean Institute of Oriental Medicine reviewed the questionnaire and its protocol; the addition or deletion of questions and changes to the scoring method were not addressed in this revision. Results: Of the thirty questions, four were corrected to use more appropriate expressions. For eight questions, explanations in Korean or Chinese were added. Thirteen questions written as phrases were changed to sentence form to unify the whole questionnaire. Conclusions: This study introduces the revised version of the modified Blood Stasis Questionnaire II. It is expected that clinical demand for this questionnaire will increase and that it will be used actively in blood stasis research.