• Title/Summary/Keyword: 음절 단위 처리

Search Result 95, Processing Time 0.019 seconds

Automatic Music Transcription System Using SIDE (SIDE를 이용한 자동 음악 채보 시스템)

  • Hyoung, A-Young;Lee, Joon-Whoan
    • The KIPS Transactions:PartB
    • /
    • v.16B no.2
    • /
    • pp.141-150
    • /
    • 2009
  • This paper proposes a system that can automatically write singing voices to music notes. First, the system uses Stabilized Diffusion Equation(SIDE) to divide the song to a series of syllabic parts based on pitch detection. By the song segmentation, our method can recognize the sound length of each fragment through clustering based on genetic algorithm. Moreover, this study introduces a concept called 'Relative Interval' so as to recognize interval based on pitch of singer. And it also adopted measure extraction algorithm using pause data to implement the higher precision of song transcription. By the experiments using 16 nursery songs, it is shown that the measure recognition rate is 91.5% and DMOS score reaches 3.82. These findings demonstrate effectiveness of system performance.

The Unsupervised Learning-based Language Modeling of Word Comprehension in Korean

  • Kim, Euhee
    • Journal of the Korea Society of Computer and Information
    • /
    • v.24 no.11
    • /
    • pp.41-49
    • /
    • 2019
  • We are to build an unsupervised machine learning-based language model which can estimate the amount of information that are in need to process words consisting of subword-level morphemes and syllables. We are then to investigate whether the reading times of words reflecting their morphemic and syllabic structures are predicted by an information-theoretic measure such as surprisal. Specifically, the proposed Morfessor-based unsupervised machine learning model is first to be trained on the large dataset of sentences on Sejong Corpus and is then to be applied to estimate the information-theoretic measure on each word in the test data of Korean words. The reading times of the words in the test data are to be recruited from Korean Lexicon Project (KLP) Database. A comparison between the information-theoretic measures of the words in point and the corresponding reading times by using a linear mixed effect model reveals a reliable correlation between surprisal and reading time. We conclude that surprisal is positively related to the processing effort (i.e. reading time), confirming the surprisal hypothesis.

The exploration of the effects of word frequency and word length on Korean word recognition (한국어 단어재인에 있어서 빈도와 길이 효과 탐색)

  • Lee, Changhwan;Lee, Yoonhyoung;Kim, Tae Hoon
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.17 no.1
    • /
    • pp.54-61
    • /
    • 2016
  • Because a word is the basic unit of language processing, studies of the word recognition processing and the variables that contribute to word recognition processing are very important. Word frequency and word length are recognized as important factors on word recognition. This study examined the effects of those two variables on the Korean word recognition processing. In Experiment 1, two types of Hangul words, pure Hangul words and Hangul words with Hanja counterparts, were used to explore the frequency effects. A frequency effect was not observed for Hangul words with Hanja counterparts. In Experiment 2, the word length was manipulated to determine if the word length effect appears in Hangul words. Contrary to the expectation, one syllable words were processed more slowly than two syllable words. The possible explanations for these results and future research directions are discussed.

Design and Performance Evaluation of an Indexing Method for Partial String Searches (문자열 부분검색을 위한 색인기법의 설계 및 성능평가)

  • Gang, Seung-Heon;Yu, Jae-Su
    • The Transactions of the Korea Information Processing Society
    • /
    • v.6 no.6
    • /
    • pp.1458-1467
    • /
    • 1999
  • Existing index structures such as extendable hashing and B+-tree do not support partial string searches perfectly. The inverted file method and the signature file method that are used in the web retrieval engine also have problems that they do not provide partial string searches and suffer from serious retrieval performance degradation respectively. In this paper, we propose an efficient index method that supports partial string searches and achieves good retrieval performance. The proposed index method is based on the Inverted file structure. It constructs the index file with patterns that result from dividing terms by two syllables to support partial string searches. We analyze the characteristics of our proposed method through simulation experiments using wide range of parameter values. We analyze the derive analytic performance evaluation models of the existing inverted file method, signature file method and the proposed index method in terms of retrieval time and storage overhead. We show through performance comparison based on analytic models that the proposed method significantly improves retrieval performance over the existing method.

  • PDF

Development and Validation of the Letter-unit based Korean Sentimental Analysis Model Using Convolution Neural Network (회선 신경망을 활용한 자모 단위 한국형 감성 분석 모델 개발 및 검증)

  • Sung, Wonkyung;An, Jaeyoung;Lee, Choong C.
    • The Journal of Society for e-Business Studies
    • /
    • v.25 no.1
    • /
    • pp.13-33
    • /
    • 2020
  • This study proposes a Korean sentimental analysis algorithm that utilizes a letter-unit embedding and convolutional neural networks. Sentimental analysis is a natural language processing technique for subjective data analysis, such as a person's attitude, opinion, and propensity, as shown in the text. Recently, Korean sentimental analysis research has been steadily increased. However, it has failed to use a general-purpose sentimental dictionary and has built-up and used its own sentimental dictionary in each field. The problem with this phenomenon is that it does not conform to the characteristics of Korean. In this study, we have developed a model for analyzing emotions by producing syllable vectors based on the onset, peak, and coda, excluding morphology analysis during the emotional analysis procedure. As a result, we were able to minimize the problem of word learning and the problem of unregistered words, and the accuracy of the model was 88%. The model is less influenced by the unstructured nature of the input data and allows for polarized classification according to the context of the text. We hope that through this developed model will be easier for non-experts who wish to perform Korean sentimental analysis.