• Title/Summary/Keyword: 어절 정보 (eojeol information)


The effects of Korean logical ending connective affix on text comprehension and recall (연결어미가 글 이해와 기억에 미치는 효과)

  • Nam, Ki-Chun; Kim, Hyun-Jeong; Park, Chang-Su; Whang, Yu-Mi; Kim, Young-Tae; Sim, Hyun-Sup
    • Annual Conference on Human and Language Technology / 2004.10d / pp.251-258 / 2004
  • This study investigated the effect of connective endings (연결어미) on text comprehension and recall, and how that effect relates to reading ability. Connective endings expressing causal and additive relations were used. If connective endings help establish local coherence between two adjacent sentences, then sentences should be read faster and text content recalled better when the endings are present. If reading ability is also influenced by the ability to use connective endings appropriately, an interaction between the presence of an ending and reading ability was predicted. Experiment 1 examined the effect of causal connective endings on sentence reading time and sentence recall. The results showed that a causal connective ending facilitated reading of the following sentence, and this facilitation was consistent regardless of reading ability. The presence of an ending also aided sentence recall, again regardless of whether reading ability was high or low. Experiment 2 examined the effect of additive connective endings on reading time and recall, and found a pattern of effects similar to that of the causal endings. Together, Experiments 1 and 2 suggest that causal and additive connective endings positively influence the formation of coherence between adjacent sentences, and that this effect on reading is constant across reading-ability levels.


Improving Recall for Context-Sensitive Spelling Correction Rules using Conditional Probability Model with Dynamic Window Sizes (동적 윈도우를 갖는 조건부확률 모델을 이용한 한국어 문맥의존 철자오류 교정 규칙의 재현율 향상)

  • Choi, Hyunsoo; Kwon, Hyukchul; Yoon, Aesun
    • Journal of KIISE / v.42 no.5 / pp.629-636 / 2015
  • The types of errors corrected by a Korean spelling and grammar checker can be classified into isolated-term spelling errors and context-sensitive spelling errors (CSSE). CSSEs are difficult to detect and to correct, since they are correct words when examined alone; they can be corrected only by considering the semantic and syntactic relations to their context. CSSEs, which are frequently made even by expert writers, significantly affect the reliability of spelling and grammar checkers. An existing Korean spelling and grammar checker developed by P University (KSGC 4.5) adopts hand-made correction rules for correcting CSSEs. KSGC 4.5 is designed to obtain very high precision, which results in an extremely low recall. The overall goal of our previous works was to improve the recall without considerably lowering the precision, by generalizing CSSE correction rules that mainly depend on linguistic knowledge. A variety of rule-based methods have been proposed in those works, with the best achieving an average precision of 95.19% and a recall of 37.56%. This study thus proposes a statistics-based method using a conditional probability model with dynamic window sizes in order to further improve the recall. The proposed method obtained an average precision of 97.23% and a recall of 50.50%.
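The dynamic-window idea can be illustrated with a toy model (the scoring scheme, data structures, and counts below are invented for illustration, not the authors' implementation): each member of a confusion set is scored against the sentence context, preferring the widest window that has co-occurrence evidence and backing off to narrower windows.

```python
from collections import defaultdict

def build_cooccurrence(corpus, confusion_set, max_window=3):
    """Count how often each confusion-set word appears with each context
    word at distances 1..max_window (a toy co-occurrence model)."""
    counts = defaultdict(int)   # (word, context_word, distance) -> count
    totals = defaultdict(int)   # (word, distance) -> count
    for sent in corpus:
        for i, w in enumerate(sent):
            if w not in confusion_set:
                continue
            for d in range(1, max_window + 1):
                for j in (i - d, i + d):
                    if 0 <= j < len(sent):
                        counts[(w, sent[j], d)] += 1
                        totals[(w, d)] += 1
    return counts, totals

def score(word, context, counts, totals, max_window=3):
    """Approximate P(context | word) with the widest window that has
    evidence; back off dynamically to smaller windows when it has none."""
    for d in range(max_window, 0, -1):        # dynamic window: widest first
        if totals[(word, d)] == 0:
            continue
        hits = sum(counts[(word, c, d)] for c in context)
        if hits:
            return hits / totals[(word, d)]
    return 0.0

def correct(sent, i, confusion_set, counts, totals):
    """Pick the confusion-set member best supported by the context of
    position i."""
    context = [w for j, w in enumerate(sent) if j != i]
    return max(confusion_set, key=lambda w: score(w, context, counts, totals))
```

With a tiny training corpus in which "lose" co-occurs with "game", the sketch corrects the context-sensitive error in "did we loose the game".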

The Method of Using the Automatic Word Clustering System for the Evaluation of Verbal Lexical-Semantic Network (동사 어휘의미망 평가를 위한 단어클러스터링 시스템의 활용 방안)

  • Kim, Hae-Gyung; Yoon, Ae-Sun
    • Journal of the Korean Society for Library and Information Science / v.40 no.3 / pp.175-190 / 2006
  • In recent years there has been much interest in lexical semantic networks. However, it seems very difficult to evaluate their effectiveness and correctness, and to invent methods for applying them to various problem domains. In order to offer fundamental ideas about how to evaluate and utilize lexical semantic networks, we developed two automatic word clustering systems, called system A and system B respectively. 68,455,856 words were used to train both systems. We compared the clustering results of system A to those of system B, which is extended by the lexical-semantic network: its feature vectors are reconstructed using the elements of the lexical-semantic network of 3,656 '-ha' verbs. The target data is the multilingual wordnet 'CoroNet'. When we compared the accuracy of the two systems, system B showed an accuracy of 46.6%, better than system A's 45.3%.

Research on the Utilization of Recurrent Neural Networks for Automatic Generation of Korean Definitional Sentences of Technical Terms (기술 용어에 대한 한국어 정의 문장 자동 생성을 위한 순환 신경망 모델 활용 연구)

  • Choi, Garam; Kim, Han-Gook; Kim, Kwang-Hoon; Kim, You-eil; Choi, Sung-Pil
    • Journal of the Korean Society for Library and Information Science / v.51 no.4 / pp.99-120 / 2017
  • Toward a semiautomatic support system that allows researchers to efficiently analyze technical trends in ever-growing industries and markets, this paper introduces a pair of Korean sentence generation models that can automatically generate definitional statements as well as descriptions of technical terms and concepts. The proposed models are based on LSTM (Long Short-Term Memory), a deep learning model capable of effectively labeling textual sequences by taking into account the contextual relations of the items in a sequence. Our models take technical terms as inputs and can generate a broad range of heterogeneous textual descriptions that explain the concepts behind the terms. In experiments using large-scale training collections, we confirmed that more accurate and reasonable sentences are generated by the CHAR-CNN-LSTM model, a word-based LSTM that exploits character embeddings built with convolutional neural networks (CNN). The results of this study can serve as a basis for an extension model that generates a set of sentences covering the same subject, and ultimately for an artificial intelligence model that automatically creates technical literature.
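The character-embedding component of such a model can be sketched in isolation (a minimal pure-Python illustration with arbitrary random weights; the character inventory, dimensions, and padding scheme are assumptions, not the paper's setup, and in practice these parameters are learned jointly with the word-level LSTM):

```python
import random

random.seed(0)

# Toy character inventory and randomly initialized parameters.
CHARS = "abcdefghijklmnopqrstuvwxyz"
CHAR_DIM, N_FILTERS, WIDTH = 4, 8, 3
emb = {c: [random.uniform(-1, 1) for _ in range(CHAR_DIM)] for c in CHARS}
filters = [[[random.uniform(-1, 1) for _ in range(CHAR_DIM)]
            for _ in range(WIDTH)] for _ in range(N_FILTERS)]

def char_cnn_word_vector(word):
    """Character-level CNN: embed each character, slide each filter over
    the character sequence, then max-pool over positions. A word of any
    length yields a fixed-size N_FILTERS-dimensional vector."""
    x = [emb[c] for c in word]
    while len(x) < WIDTH:                       # zero-pad short words
        x.append([0.0] * CHAR_DIM)
    vec = []
    for f in filters:
        acts = [sum(f[k][d] * x[t + k][d]
                    for k in range(WIDTH) for d in range(CHAR_DIM))
                for t in range(len(x) - WIDTH + 1)]
        vec.append(max(acts))                   # max-pooling over time
    return vec
```

Because the output size is fixed regardless of word length, such vectors can feed a word-level LSTM even for rare or unseen technical terms.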

A Model for evaluating the efficiency of inputting Hangul on a telephone keyboard (전화기 자판의 한글 입력 효율성 평가 모형)

  • Koo, Min-Mo; Lee, Mahn-Young
    • The KIPS Transactions: Part D / v.8D no.3 / pp.295-304 / 2001
  • The standards of a telephone Hangul keyboard should be decided in terms of objective factors: the number of strokes and the fingers' moving distance. Most designers will agree on these factors because they can be calculated in an objective manner. We therefore developed a model that can evaluate the efficiency of inputting Hangul on a telephone keyboard in terms of these two factors. Compared with other models, its major features are as follows: (1) the model calculates the number of strokes rather than a typing time; (2) the co-occurrence frequencies counted on the KOREA-1 Corpus are used directly; (3) a total set of 67 consonants and vowels is used; and (4) the model can evaluate keyboards that use a syllabic function key (the complete key, the null key, or the final-consonant key) as well as keyboards that adopt no syllabic function key. However, there are many other factors for judging the efficiency of inputting Hangul on a telephone keyboard. A more accurate evaluation of a telephone Hangul keyboard must consider experimental data as well as logical data.
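The first factor, a frequency-weighted stroke count, can be expressed as a small model (the stroke and frequency tables below are invented placeholders, not values taken from the KOREA-1 Corpus or any real keypad layout):

```python
# Hypothetical stroke table for one keypad layout: jamo -> key presses.
strokes = {"ㄱ": 1, "ㄲ": 2, "ㅋ": 3, "ㅏ": 1, "ㅑ": 2, "ㄴ": 1, "ㅗ": 1}

# Hypothetical corpus frequencies for the same jamo.
freq = {"ㄱ": 500, "ㄲ": 20, "ㅋ": 60, "ㅏ": 700, "ㅑ": 80, "ㄴ": 450, "ㅗ": 400}

def mean_strokes_per_jamo(strokes, freq):
    """Frequency-weighted average key presses per jamo for a layout
    (lower is better). Comparing this value across layouts is the
    stroke-count half of the evaluation; finger travel would be the other."""
    total_presses = sum(strokes[j] * freq[j] for j in freq)
    return total_presses / sum(freq.values())
```

Running the model over each candidate layout's stroke table, with the same corpus frequencies, ranks the layouts by expected input effort.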


The Status and the Value of a New Text, Chunghyangjeon(정향전) that Professor Park Sunho Possesses (새 자료 <정향전>의 자료적 특성과 가치)

  • Jang, Si Gwang
    • The Study of the Eastern Classic / no.41 / pp.211-247 / 2010
  • The purpose of this article is to consider the status and the value of a different edition of Chunghyangjeon in Professor Park Sunho's possession, which contains the text in both Chinese and Korean. Compared with Chullidaebon, it includes deletions, contractions, and additions. Chullidaebon has many mistakes, while Parkbon in Chinese shows those mistakes corrected. By this standard, Parkbon in Chinese is closer to Mansongbon and Donambon than to Chullidaebon, and closer to Mansongbon than to Donambon. In Parkbon in Chinese, the hypocrisy of Yangnyeongdaegun is weakened while his negative personality is magnified; his psychological description is weakened and Chunghyang's beauty is magnified. Parkbon in Korean shows deletion, contraction, addition, and change. It shows no consciousness of lineage and deletes the hypocrisy and negative personality of Yangnyeongdaegun. Phrases that are difficult for readers who have not mastered Chinese literature or Chinese poetry are also deleted or contracted. Professor Park Sunho's copy is more significant than other novels circulated at the time because it is bound together with a Korean copy. Compared with other editions of Chunghyangjeon and with other classical novels in circulation, such a Korean-including edition is very peculiar: it shows that the Chinese-literate tried to share the novel with those who could read Korean but not Chinese characters.

Exploiting Chunking for Dependency Parsing in Korean (한국어에서 의존 구문분석을 위한 구묶음의 활용)

  • Namgoong, Young; Kim, Jae-Hoon
    • KIPS Transactions on Software and Data Engineering / v.11 no.7 / pp.291-298 / 2022
  • In this paper, we present a method for dependency parsing with chunking in Korean. Dependency parsing is the task of determining the governor of every word in a sentence. Korean parsers usually determine the syntactic governor, and the syntactic structure must then be transformed into a semantic structure for further processing such as semantic analysis. Deciding between the syntactic and the semantic governor is a notorious problem. For example, the syntactic governor of the word "먹고 (eat)" in the sentence "밥을 먹고 싶다 (would like to eat)" is "싶다 (would like to)", an auxiliary verb that cannot be a semantic governor. To mitigate this problem somewhat, we propose Korean dependency parsing after chunking, a process of segmenting a sentence into constituents. A constituent is a word or a group of words that functions as a single unit within a dependency structure, called a chunk in this paper. Compared to traditional dependency parsing, the proposed method has two advantages: (1) the number of input units in parsing is reduced, so parsing can be faster; (2) the effectiveness of parsing can be improved by considering the relation between the head words of two chunks. Through experiments on the Sejong dependency corpus, we show that the UAS and LAS of the proposed method are 86.48% and 84.56%, respectively, and that the number of input units is reduced by about 22%p.
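The reported UAS and LAS metrics can be computed as follows (a standard sketch; representing each word as a `(head_index, label)` tuple is an assumption for illustration, not the authors' data format):

```python
def attachment_scores(gold, pred):
    """UAS (unlabeled attachment score): fraction of words whose predicted
    head matches the gold head. LAS (labeled attachment score): both the
    head and the dependency label must match."""
    assert len(gold) == len(pred)
    uas_hits = sum(g[0] == p[0] for g, p in zip(gold, pred))
    las_hits = sum(g == p for g, p in zip(gold, pred))
    n = len(gold)
    return uas_hits / n, las_hits / n
```

Since LAS adds a label-match condition on top of the head-match condition, LAS can never exceed UAS, which is why the paper's 84.56% LAS sits below its 86.48% UAS.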

The perceptual span during reading Korean sentences (우리글 읽기에서 지각 폭 연구)

  • Choi, So-Young; Koh, Sung-Yrong
    • Korean Journal of Cognitive Science / v.20 no.4 / pp.573-601 / 2009
  • The present study investigated the perceptual span during the reading of Korean, using the moving-window display-change technique introduced by McConkie and Rayner (1975). Eight window sizes were used in Experiment 1: 3, 5, 7, 9, 11, 13, and 15 characters, and the full line. Reading rate, number of fixations, saccade distance, and fixation duration were compared between each window-size condition and the full-line condition. The reading rate was no higher in the full-line condition than in the 15-character condition, but was higher than in the other conditions. The number of fixations was no larger in the full-line condition than in the 15-character condition, tended to be larger than in the 13-character condition, and was larger than in the remaining conditions. The pattern for saccade distance measured in characters matched that of the reading rate, while saccade distance measured in pixels matched that of the number of fixations. Similarly, fixation duration did not differ between the full-line condition and the 15-, 13-, and 11-character conditions; it tended to be shorter in the 9-character condition, and was shorter in the 7-, 5-, and 3-character conditions than in the full-line condition. Experiment 2, which addressed the asymmetry of the perceptual span, used six window sizes: 0, 1, 2, 3, and 4 characters, and the full line. Only the 0-character condition differed from the other conditions in reading rate, number of fixations, and fixation duration. Taken together, these eye-movement measures indicate that the perceptual span of Korean readers extends about 6-7 characters to the right of fixation and 1 character to the left.
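The gaze-contingent display manipulation behind these experiments can be sketched in a few lines (a simplified illustration: keeping spaces visible and masking with a fixed character are assumptions here, and real implementations update the display on every eye-tracker sample):

```python
def moving_window(text, fixation, window):
    """Mask every character outside the window around the current
    fixation, as in the moving-window paradigm. `window` is a pair
    (chars_left, chars_right) giving the window's asymmetric extent."""
    left, right = window
    out = []
    for i, ch in enumerate(text):
        if fixation - left <= i <= fixation + right or ch == " ":
            out.append(ch)
        else:
            out.append("x")
    return "".join(out)
```

Shrinking the window until reading measures degrade, relative to the full-line display, is what localizes the span (here, roughly 1 character left and 6-7 characters right of fixation for Korean readers).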
