• Title/Summary/Keyword: word spacing

Automatic Error Correction System for Erroneous SMS Strings (SMS 변형된 문자열의 자동 오류 교정 시스템)

  • Kang, Seung-Shik;Chang, Du-Seong
    • Journal of KIISE:Software and Applications / v.35 no.6 / pp.386-391 / 2008
  • Spoken-word errors that violate grammatical or writing rules occur frequently in communication environments such as mobile phones and messengers. These unexpected errors cause problems for language processing systems in many applications, such as speech recognition and text-to-speech translation. In this paper, we propose and implement an automatic correction system for ill-formed words and word spacing errors in SMS sentences, which have been the major causes of poor accuracy. We experimented with three methods of constructing the word correction dictionary and evaluated their results: (1) manual construction of error words from the vocabulary list of ill-formed communication languages, (2) automatic construction of an error dictionary from a manually constructed corpus, and (3) a context-dependent method of automatic construction of an error dictionary.
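
The dictionary-based correction described in this abstract can be illustrated with a minimal sketch. It is not the authors' system: the dictionary entries, the names `ERROR_DICT`, `CONTEXT_DICT`, and `correct`, and the use of English text-speak instead of Korean SMS forms are all assumptions made for illustration, and word spacing correction is not covered.

```python
# Hypothetical error dictionary: ill-formed token -> corrected form.
ERROR_DICT = {"c": "see", "u": "you", "2": "to", "2moro": "tomorrow"}

# Hypothetical context-dependent entries:
# (previous corrected token, ill-formed token) -> corrected form.
CONTEXT_DICT = {("me", "2"): "too"}


def correct(sentence: str) -> str:
    """Replace ill-formed tokens, preferring a context-dependent entry."""
    out = []
    for token in sentence.split():
        prev = out[-1] if out else ""
        # Look up the context-dependent entry first, then the plain entry,
        # and fall back to the token itself when neither matches.
        fixed = CONTEXT_DICT.get((prev, token)) or ERROR_DICT.get(token, token)
        out.append(fixed)
    return " ".join(out)
```

In this sketch the context-dependent table plays the role of method (3): the same ill-formed token can map to different corrections depending on the preceding word, while the plain table corresponds to a context-free dictionary as in methods (1) and (2).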

Recognizing Unknown Words and Correcting Spelling errors as Preprocessing for Korean Information Processing System (한국어 정보처리 시스템의 전처리를 위한 미등록어 추정 및 철자 오류의 자동 교정)

  • Park, Bong-Rae;Rim, Hae-Chang
    • The Transactions of the Korea Information Processing Society / v.5 no.10 / pp.2591-2599 / 1998
  • In this paper, we propose a method of recognizing unknown words and correcting spelling errors (including spacing errors) to increase the performance of Korean information processing systems. Unknown words are recognized through comparative analysis of two or more morphologically similar eojeols (spacing units in Korean) that include the same unknown word candidates. Spacing errors and spelling errors are corrected by using lexicalized rules which are automatically extracted from a very large raw corpus. The extraction of the lexicalized rules is based on morphological and contextual similarities between error eojeols and their correction eojeols, which are confirmed to be used in the corpus. The experimental results show that our system can recognize unknown words with an accuracy of 98.9%, and can correct spacing errors and spelling errors with accuracies of 98.1% and 97.1%, respectively.

Analysis on Sentence Error Types of Mathematical Problem Posing of Pre-Service Elementary Teachers (초등학교 예비교사들의 수학적 '문제 만들기'에 나타나는 문장의 오류 유형 분석)

  • Huh, Nan;Shin, Hocheol
    • Journal of the Korean School Mathematics Society / v.16 no.4 / pp.797-820 / 2013
  • This study analyzes the error patterns in mathematical problem-posing sentences written by 100 pre-service elementary teachers and discusses solutions. The results show that the problem-posing sentences exhibit five error patterns: phonological, word, sentence, meaning, and notation error patterns. These divide into fourteen specific error patterns, as follows. 1) Phonological error patterns consist of the 'ㄹ' addition error pattern and the abbreviated word error pattern. 2) Word error patterns are divided into the inappropriate word usage error pattern and the inadequate abbreviation error pattern, which are formalized in detail as four subgroups: case marker, word ending, inappropriate word usage, and inadequate abbreviation of an article or word. 3) Sentence error patterns take four forms: reference, ellipsis of a sentence component, word order, and incomplete sentence error patterns. 4) Meaning error patterns comprise logical contradiction and ambiguous meaning. 5) Notation error patterns comprise four patterns: spacing, punctuation, Hangul orthography, and spelling rules for foreign words in Korean. Solutions for these error patterns are also discussed. First, the differences between spoken and written language must be recognized. Second, spoken expressions must be avoided in written contexts. Third, classes should focus on learning the basic sentence patterns. Fourth, learners should develop a logically grounded understanding of word meanings. Finally, the Korean spelling system has to be learned. In addition to these suggestions, a new understanding of writing education for college students is necessary.

Performance Analysis of Space-Time Codes in Realistic Propagation Environments: A Moment Generating Function-Based Approach

  • Lamahewa Tharaka A.;Simon Marvin K.;Kennedy Rodney A.;Abhayapala Thushara D.
    • Journal of Communications and Networks / v.7 no.4 / pp.450-461 / 2005
  • In this paper, we derive analytical expressions for the exact pairwise error probability (PEP) of a space-time coded system operating over spatially correlated fast (constant over the duration of a symbol) and slow (constant over the length of a code word) fading channels using a moment-generating function-based approach. We discuss two analytical techniques that can be used to evaluate the exact PEPs (and therefore approximate the average bit error probability (BEP)) in closed form. These analytical expressions are more realistic than previously published PEP expressions as they fully account for antenna spacing, antenna geometries (uniform linear array, uniform grid array, uniform circular array, etc.) and scattering models (uniform, Gaussian, Laplacian, von Mises, etc.). Inclusion of spatial information in these expressions provides valuable insights into the physical factors determining the performance of a space-time code. Using these new PEP expressions, we investigate the effect of antenna spacing, antenna geometries and azimuth power distribution parameters (angle of arrival/departure and angular spread) on the performance of a four-state QPSK space-time trellis code proposed by Tarokh et al. for two transmit antennas.
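
For orientation, the general idea behind a moment-generating function (MGF) approach can be sketched as follows. This is the standard textbook route via Craig's representation of the Gaussian Q-function, not the paper's specific derivation; the symbol $\Gamma$ (a non-negative, channel-dependent scaled squared distance between the two codewords) and its MGF $M_{\Gamma}$ are notation assumed here.

```latex
% Conditional PEP given the channel, using Craig's formula for Q(x), x >= 0
P(\mathbf{X} \to \hat{\mathbf{X}} \mid \Gamma) = Q\big(\sqrt{\Gamma}\big)
  = \frac{1}{\pi} \int_{0}^{\pi/2} \exp\!\left(-\frac{\Gamma}{2\sin^{2}\theta}\right) d\theta

% Averaging over the fading moves the expectation inside the integral
P(\mathbf{X} \to \hat{\mathbf{X}})
  = \frac{1}{\pi} \int_{0}^{\pi/2} M_{\Gamma}\!\left(-\frac{1}{2\sin^{2}\theta}\right) d\theta,
\qquad M_{\Gamma}(s) = \mathbb{E}\big[e^{s\Gamma}\big]
```

Under this formulation, spatial correlation from antenna spacing, geometry, and scattering would enter only through $M_{\Gamma}$, which is why an MGF-based expression can account for them without changing the outer integral.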

Comments Classification System using Topic Signature (Topic Signature를 이용한 댓글 분류 시스템)

  • Bae, Min-Young;Cha, Jeong-Won
    • Journal of KIISE:Software and Applications / v.35 no.12 / pp.774-779 / 2008
  • In this work, we describe a comment classification system using topic signatures. Topic signatures are widely used for feature selection in document classification and summarization. Comments are short and contain many word spacing errors and special characters. We first convert comments into 7-grams and treat each 7-gram as a sentence. We then convert each 7-gram into 3-grams and treat each 3-gram as a word. We select key features using topic signatures and classify new inputs with the Naive Bayesian method. The experimental results show that the proposed method outperforms previous methods.
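
The pipeline in this abstract (7-grams as sentences, 3-grams as words, then Naive Bayes) can be sketched roughly as below. Several points are assumptions rather than details from the abstract: the n-grams are taken over characters, spaces are stripped first, smoothing is add-one, and the topic-signature feature selection step is omitted entirely. The names `char_ngrams`, `features`, and `NaiveBayes` are hypothetical.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(text, n):
    """All contiguous character n-grams; the whole text if it is not longer than n."""
    if len(text) <= n:
        return [text]
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def features(comment):
    """Strip spaces, cut into 7-grams ("sentences"), then into 3-grams ("words")."""
    compact = "".join(comment.split())  # assumption: ignore unreliable spacing
    feats = []
    for window in char_ngrams(compact, 7):
        feats.extend(char_ngrams(window, 3))
    return feats


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing over 3-gram features."""

    def fit(self, comments, labels):
        self.label_counts = Counter(labels)
        self.feat_counts = defaultdict(Counter)
        for comment, label in zip(comments, labels):
            self.feat_counts[label].update(features(comment))
        self.vocab = {f for c in self.feat_counts.values() for f in c}
        return self

    def predict(self, comment):
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.label_counts.items():
            counts = self.feat_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n / total)  # log prior
            for f in features(comment):
                score += math.log((counts[f] + 1) / denom)  # smoothed likelihood
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

Working on character n-grams with spaces removed is one way to make the features insensitive to word spacing errors, which matches the motivation stated in the abstract; whether the original system does exactly this is not stated.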

Analysis of Passing Word Line Induced Leakage of BCAT Structure in DRAM (BCAT구조 DRAM의 패싱 워드 라인 유도 누설전류 분석)

  • Su Yeon Kim;Dong Yeong Kim;Je Won Park;Shin Wook Kim;Chae Hyuk Lim;So won Kim;Hyeona Seo;Ju Won Kim;Hye Rin Lee;Jeong Hyeon Yun;Young-Woo Lee;Hyoung-Jin Joe;Myoung Jin Lee
    • Journal of IKEEE / v.27 no.4 / pp.644-649 / 2023
  • As the cell spacing decreases during the scaling process of DRAM (Dynamic Random Access Memory), the reduction in STI (Shallow Trench Isolation) thickness leads to an increase in sub-threshold leakage due to the passing word line effect. The increase in sub-threshold leakage current caused by the voltage applied to adjacent passing word lines affects the data retention time and increases the number of refresh operations, thereby contributing to higher power consumption in DRAM. In this paper, we identify the causes of the passing word line effect through TCAD simulation. As a result, we confirm the DRAM operational conditions under which the passing word line effect occurs, and observe that this effect alters the proportion of the total leakage current attributable to different causes. Through this, we recognize the necessity to consider not only leakage currents due to GIDL (Gate Induced Drain Leakage) but also sub-threshold leakage currents, providing guidance for improving DRAM structure.

The Sensitivity Analysis for Customer Feedback on Social Media (소셜 미디어 상 고객피드백을 위한 감성분석)

  • Song, Eun-Jee
    • Journal of the Korea Institute of Information and Communication Engineering / v.19 no.4 / pp.780-786 / 2015
  • Social media, such as social network services, contain many spontaneous customer opinions, so companies now collect and analyze customer feedback with systems that analyze Big Data on social media in order to operate their businesses efficiently. However, it is difficult to analyze data collected from online sites accurately with existing morpheme analyzers because the data contain spacing errors and spelling errors. In addition, many online sentences are short and do not contain enough meanings to select from, so established meaning selection methods, such as mutual information and the chi-square statistic, cannot perform emotional classification. To solve these problems, this paper suggests a module that revises the meanings by using initial consonants/vowels and a phase pattern dictionary, together with a meaning selection method that uses the priority of word classes in a sentence. On the basis of the word classes extracted by the morpheme analyzer, these new mechanisms separate and analyze predicates and substantives, build a properties database subordinate to the relevant word class, and extract positive/negative emotions by using the accumulated properties database.

Development of POS Tagging System Independent to Word Spacing (띄어쓰기 비종속 품사 태깅 시스템 개발)

  • Lee, Kyung-Il;Ahn, Tae-Sung
    • Annual Conference on Human and Language Technology / 2003.10d / pp.69-72 / 2003
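  • This paper proposes an improved statistical model for analyzing morphemes and tagging parts of speech from an input Korean string, and presents the development and performance evaluation of a word-spacing-independent morphological analysis and tagging system based on it. The proposed statistical POS tagging system computes the spacing probability of each syllable in the input string to generate pseudo-eojeols, generates a list of morpheme candidates for each pseudo-eojeol regardless of the user's spacing, and uses both the inter-eojeol and the intra-eojeol connection probabilities when computing the connection probability of adjacent candidate morphemes, thereby determining the optimal morpheme list. In particular, the ratio in which the inter-eojeol and intra-eojeol connection probabilities are combined is adjusted automatically according to the syllable spacing probability and whether the user inserted a space. As a result, the system achieved POS tagging performance of about 90% on average even for sentences written with extreme over-spacing or with no spacing at all.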
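
The automatically adjusted combination of inter-eojeol and intra-eojeol connection probabilities can be pictured with a small sketch. The abstract does not give the actual formula, so the linear interpolation below, the `trust` parameter, and the function names `boundary_weight` and `connection_prob` are all assumptions for illustration only.

```python
def boundary_weight(p_space: float, user_spaced: bool, trust: float = 0.5) -> float:
    """Estimated weight for 'an eojeol boundary lies between these morphemes'.

    p_space     : model's spacing probability at this syllable position (0..1)
    user_spaced : whether the user actually typed a space here
    trust       : hypothetical weight given to the user's spacing (0..1)
    """
    user_signal = 1.0 if user_spaced else 0.0
    return trust * user_signal + (1.0 - trust) * p_space


def connection_prob(p_inter: float, p_intra: float, p_space: float,
                    user_spaced: bool, trust: float = 0.5) -> float:
    """Blend inter-eojeol and intra-eojeol connection probabilities."""
    w = boundary_weight(p_space, user_spaced, trust)
    return w * p_inter + (1.0 - w) * p_intra
```

With this kind of blending, a position where both the model and the user indicate a space relies on the inter-eojeol probability, a position where neither does relies on the intra-eojeol probability, and disagreements fall in between, which is one plausible reading of how the system stays robust to over-spaced or unspaced input.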

A Design and Implementation of Hangul Spelling and Word-spacing Checker using Connectivity Information (접속정보를 이용한 한글 철자 및 띄어쓰기 검사기의 설계 및 구현)

  • Kang, J.W.;Song, C.H.;Kim, Y.B.;Choi, K.S.;Kwon, Y.R.;Kim, G.C.
    • Annual Conference on Human and Language Technology / 1989.10a / pp.3-9 / 1989
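  • This paper designs and implements a system that checks Hangul spelling and word spacing in batch mode for Hangul text in a UNIX™ environment. Using a shortest-match method based on connectivity information, the system performs morphological analysis on each eojeol to detect spelling and word spacing errors in the input file.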

A Study on Korean Translation of the Pathway of Lung Meridian in Miraculous Pivot·Meridian Vessel (영추·경맥편 수태음폐경 유주의 한글번역에 대한 고찰)

  • Jung, Hyejin;Lim, Sabina
    • Korean Journal of Acupuncture / v.33 no.3 / pp.114-120 / 2016
  • Objectives: This study aims to establish a basic rule for the Korean translation of the pathway of the lung meridian in Miraculous Pivot Meridian vessel. Based on this rule, we tried to produce a standard translation of the pathway of the lung meridian in Miraculous Pivot Meridian vessel. Methods: Books needed for this study were collected by searching the Kyunghee University Library (http://khis.khu.ac.kr). Keywords included "Miraculous Pivot of Huangdi's Internal Classic". We also included the book that is generally used as a textbook in colleges of Korean medicine. Results: In five Chinese books, word spacing was used differently in four phrases. Six Korean-translated books had different translations in three phrases. We suggested a standard Korean translation of the pathway of the lung meridian in Miraculous Pivot Meridian vessel. Conclusions: The results of this study are expected not only to be published in the Korean Journal of Acupuncture but also to prompt further study of Korean translation by experts in this field.