Search | Korea Science

Korean Word Segmentation and Compound-noun Decomposition Using Markov Chain and Syllable N-gram (마코프 체인 및 음절 N-그램을 이용한 한국어 띄어쓰기 및 복합명사 분리)

권오욱
- The Journal of the Acoustical Society of Korea / v.21 no.3 / pp.274-284 / 2002
Word segmentation errors occurring in text preprocessing often insert incorrect words into the recognition vocabulary and cause poor language models for Korean large-vocabulary continuous speech recognition. We propose an automatic word segmentation algorithm using Markov chains and syllable-based n-gram language models in order to correct word segmentation errors in text corpora. We assume that a sentence is generated from a Markov chain. Spaces and non-space characters are generated on self-transitions and other transitions of the Markov chain, respectively. The word segmentation of the sentence is then obtained by finding the maximum likelihood path using syllable n-gram scores. In experimental results, the algorithm showed 91.58% word accuracy and 96.69% syllable accuracy for word segmentation of 254-sentence newspaper columns without any spaces. The algorithm improved the word accuracy from 91.00% to 96.27% for word segmentation correction at line breaks and yielded a decomposition accuracy of 96.22% for compound-noun decomposition.
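
The maximum-likelihood path search described in this abstract can be illustrated with a much-simplified sketch. It is not the paper's model: it assumes a syllable bigram (rather than a general n-gram) in which the space is treated as an ordinary symbol, add-one smoothing, and a toy training corpus. The names `train_bigram` and `segment` are hypothetical. Because precomposed Hangul syllables are single Unicode characters, iterating over a Python string yields one syllable at a time.

```python
import math
from collections import Counter

SPACE = " "

def train_bigram(sentences):
    """Return a smoothed log-probability function over symbol bigrams.
    Symbols are single characters (syllables), with the space as a symbol."""
    unigrams, bigrams = Counter(), Counter()
    for s in sentences:
        symbols = ["<s>"] + list(s) + ["</s>"]
        unigrams.update(symbols[:-1])            # count every symbol used as a context
        bigrams.update(zip(symbols, symbols[1:]))
    vocab_size = len(set(unigrams) | {"</s>"})

    def logp(prev, cur):
        # Add-one smoothing (an assumption; the paper does not specify this here).
        return math.log((bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab_size))

    return logp

def segment(text, logp):
    """Insert spaces into unspaced `text` by a Viterbi search.
    The state is the last emitted symbol: the current character or a space."""
    best = {"<s>": (0.0, "")}                    # state -> (log score, output so far)
    for i, ch in enumerate(text):
        nxt = {}
        for prev, (score, out) in best.items():
            s = score + logp(prev, ch)
            # Option 1: emit the character with no space after it.
            if ch not in nxt or s > nxt[ch][0]:
                nxt[ch] = (s, out + ch)
            # Option 2: emit the character followed by a space (never at the end).
            if i < len(text) - 1:
                s2 = s + logp(ch, SPACE)
                if SPACE not in nxt or s2 > nxt[SPACE][0]:
                    nxt[SPACE] = (s2, out + ch + SPACE)
        best = nxt
    # Close the sentence and pick the highest-scoring path.
    return max((score + logp(prev, "</s>"), out) for prev, (score, out) in best.items())[1]

# Toy usage with Latin letters standing in for syllables.
logp = train_bigram(["ab cd", "ab cd", "cd ab", "ab ab cd"])
print(segment("abcd", logp))
```

Only two states survive at each position, so the search is linear in sentence length; a higher-order n-gram would need the state to carry more history.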

Unequal Bit-Error-Probability of Convolutional Codes and its Application (길쌈부호의 부등 오류 특성 및 그 응용)

Lee, Soo-In;Lee, Sang-Gon;Moon, Sang-Jae
- Proceedings of the KIEE Conference / 1988.07a / pp.194-197 / 1988
The unequal bit-error-probability of rate r=b/n binary convolutional codes is analyzed. The error protection afforded each digit of the b-tuple information word can differ from that afforded the other digits. This unequal-protection property can be applied to transmitting sampled data in a PCM system.

A Study on the Synchronous Signal Detection and Error Correction in Radio Data System (RDS 수신 시스템에서 동기식 신호복원과 에러정정에 관한 연구)

김기근;류흥균
- Journal of the Korean Institute of Telematics and Electronics A / v.29A no.8 / pp.1-9 / 1992
The Radio Data System (RDS) is a next-generation digital broadcasting system that multiplexes digital data into the FM stereo signal in the VHF/FM band and provides important and convenient service features. Radio data are composed of groups, each divided into 4 blocks consisting of an information word and a check word. In this paper, a radio data receiver is developed that recovers and processes radio data to provide these services. We confirm that a 7 dB SNR is required for a demodulation BER of 10^-5. The decoding process of the shortened cyclic decoder has been simulated by computer. Also, a time-compression method (by a factor of 16) has been adopted for post-processing of the RDS features. Through error probability calculation, simulation, and experimentation, the developed receiver system is shown to satisfy the EBU system specification, and it is implemented with general logic gates and analog circuits.

Digital enhancement of pronunciation assessment: Automated speech recognition and human raters

Miran Kim
- Phonetics and Speech Sciences / v.15 no.2 / pp.13-20 / 2023
This study explores the potential of automated speech recognition (ASR) in assessing English learners' pronunciation. We employed ASR technology, acknowledged for its impartiality and consistent results, to analyze speech audio files, including synthesized speech, both native-like English and Korean-accented English, and speech recordings from a native English speaker. Through this analysis, we establish baseline values for the word error rate (WER). These were then compared with those obtained for human raters in perception experiments that assessed the speech productions of 30 first-year college students before and after taking a pronunciation course. Our sub-group analyses revealed positive training effects for Whisper, an ASR tool, and human raters, and identified distinct human rater strategies in different assessment aspects, such as proficiency, intelligibility, accuracy, and comprehensibility, that were not observed in ASR. Despite such challenges as recognizing accented speech traits, our findings suggest that digital tools such as ASR can streamline the pronunciation assessment process. With ongoing advancements in ASR technology, its potential as not only an assessment aid but also a self-directed learning tool for pronunciation feedback merits further exploration.
https://doi.org/10.13064/KSSS.2023.15.2.013
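
The word error rate (WER) used as the baseline metric in this abstract is conventionally the word-level edit distance between a reference transcript and the recognizer output, divided by the number of reference words. A minimal sketch under that standard definition follows; the function name `word_error_rate` and the whitespace tokenization are assumptions, and the study's own scoring pipeline may differ.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]                                # deleting all i reference words
        for j, h in enumerate(hyp, 1):
            cur.append(min(
                prev[j] + 1,                     # deletion of a reference word
                cur[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + (r != h),          # match or substitution
            ))
        prev = cur
    return prev[-1] / len(ref)

# Usage: one dropped word out of six reference words.
print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.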

Optimally Weighted Cepstral Distance Measure for Speech Recognition (음성 인식을 위한 최적 가중 켑스트랄 거리 측정 방법)

김원구
- Proceedings of the Acoustical Society of Korea Conference / 1994.06c / pp.133-137 / 1994
In this paper, a method for designing an optimal weight function for the weighted cepstral distance measure is proposed. A conventional weight function, or cepstral lifter, is obtained experimentally depending on the spectral components to be emphasized. The proposed method minimizes the error between word reference patterns and the training data. To compare the proposed optimal weight function with conventional functions, speech recognition systems based on Dynamic Time Warping and Hidden Markov Models were constructed to conduct speaker-independent isolated word recognition experiments. Results show that the proposed method gives better performance than conventional weight functions.

Generalised Non Error-Accumulative Quantisation Algorithm with feedback loop

Koh, Kyoung-Chul;Choi, Byoung-Wook
- Proceedings of the Institute of Control, Robotics and Systems Conference (제어로봇시스템학회:학술대회논문집) / 2004.08a / pp.1269-1274 / 2004
This paper presents a new quantisation algorithm which has a closed-loop form and guarantees the boundedness of the accumulative error. This algorithm is particularly useful for mobile robot navigation, which is usually implemented on embedded systems. If wheel commands of the mobile robot are given as velocity or positional increments at every control instant and quantised due to the finite word length of the controller's CPU, the quantisation error accumulates and causes a large position error. Such an error-accumulative characteristic is fatal for non wheeled mobile robots or autonomous vehicles with non-holonomic constraints. To solve this problem, we propose a non-error-accumulative quantisation algorithm with a closed-loop form. We also show that it can be extended to a generalised form corresponding to n-th order accumulation. The boundedness of the accumulative quantisation error is investigated by a series of computer simulations. The proposed method is particularly effective for precise navigation control of autonomous mobile robots.

A Quantization Algorithm without Accumulative Error

Koh, Kyoung-Chul;Cho, Hyun-Suck
- Proceedings of the Institute of Control, Robotics and Systems Conference (제어로봇시스템학회:학술대회논문집) / 1999.10a / pp.313-316 / 1999
In this paper, a quantization algorithm that prevents accumulative error is presented. In digital control systems, quantization cannot be avoided because of the finite word length of digital computers. The error due to quantization of the computed values may be tolerable when they are used directly. When accumulated values are used, however, the error between the sum of the original values and that of the quantized values grows as the number of summed values increases. Such an increasing accumulative error is critical for the control of precise NC machines, robots, and autonomous vehicles. To solve this problem, a quantization algorithm without accumulative error is presented. The algorithm is based on a feedback loop that prevents the accumulation of the quantization error. The error boundedness of the proposed algorithm is proven, and a computer simulation is performed to show the validity of the algorithm.
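
The feedback idea in this abstract can be sketched with a generic error-feedback quantizer: the residual left after quantizing one sample is added to the next sample before it is quantized. This is an illustration of the general technique, not the authors' algorithm; the function names and the round-to-nearest quantizer are assumptions.

```python
def quantize_open_loop(values, step):
    """Quantize each value independently; errors can add up without bound."""
    return [step * round(v / step) for v in values]

def quantize_with_feedback(values, step):
    """Quantize each value after adding the residual carried from earlier samples,
    so the difference between the running sums stays within about step / 2."""
    carried_error = 0.0
    outputs = []
    for v in values:
        target = v + carried_error           # include error left over from earlier samples
        q = step * round(target / step)      # round to the nearest quantization level
        carried_error = target - q           # residual fed back into the next sample
        outputs.append(q)
    return outputs

# Usage: ten increments of 0.3 quantized to whole units.
values = [0.3] * 10
print(sum(quantize_open_loop(values, 1.0)), sum(quantize_with_feedback(values, 1.0)))
```

In exact arithmetic the difference between the sum of the inputs and the sum of the outputs equals the carried residual, which round-to-nearest keeps within half a step; the open-loop version instead loses 0.3 per sample here.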

Error-Driven Learning of Chinese Word Segmentation

Hockenmaier, Julia;Brew, Chris
- Proceedings of the Korean Society for Language and Information Conference / 1998.02a / pp.218-229 / 1998

Generalization of error decision rules in a grammar checker using Korean WordNet, KorLex (명사 어휘의미망을 활용한 문법 검사기의 문맥 오류 결정 규칙 일반화)

So, Gil-Ja;Lee, Seung-Hee;Kwon, Hyuk-Chul
- The KIPS Transactions:PartB / v.18B no.6 / pp.405-414 / 2011
Korean grammar checkers typically detect context-dependent errors by employing heuristic rules that are manually formulated by a language expert. These rules are appended each time a new error pattern is detected. However, such grammar checkers are not consistent. To resolve this shortcoming, we propose a new method for generalizing error decision rules to detect these errors. For this purpose, we use an existing thesaurus, KorLex, which is the Korean version of Princeton WordNet. KorLex has hierarchical word senses for nouns, but does not contain any information about the relationships between cases in a sentence. Through the Tree Cut Model and the MDL (minimum description length) model based on information theory, we extract noun classes from KorLex and generalize error decision rules from these noun classes. To verify the accuracy of the new method experimentally, we extracted nouns used as objects of four commonly confused predicates from a large corpus, and subsequently extracted noun classes from these nouns. We found that the number of error decision rules generalized from these noun classes decreased to about 64.8%. In conclusion, the precision of our grammar checker exceeds that of conventional ones by 6.2%.
https://doi.org/10.3745/KIPSTB.2011.18B.6.405

The Effect of Word Frequency on Noun Definitions (단어빈도가 명사정의하기에 미치는 효과)

Lee, Chan-Jong
- The Journal of the Acoustical Society of Korea / v.27 no.6 / pp.303-308 / 2008
The purpose of the present study is to investigate whether word frequency has a significant influence on noun definitions in Korean. The experimental group comprised 80 students from elementary school, middle school, high school, and university. They rated familiarity and wrote definitions for nouns. Noun definitions were analyzed with semantic categories such as "use/purpose," "description," "association/relation," "partial explanation," "explanation," "error," "partial explanation-attribute," "partial explanation-specific class," "partial explanation-nonspecific class," "explanation-specific class," and "explanation-nonspecific class." As a result, participants showed familiarity with high-frequency nouns. "EXPL" categories that use class terms or critical attributes were used more frequently in definitions of high-frequency nouns than in those of low-frequency nouns. Use of these categories increased with age, and errors decreased with age. Word frequency had a significant influence on noun definitions.
https://doi.org/10.7776/ASK.2008.27.6.303
