• Title/Summary/Keyword: spoken corpus analysis


On the Merger of Korean Mid Front Vowels: Phonetic and Phonological Evidence

  • Eychenne, Julien;Jang, Tae-Yeoub
    • Phonetics and Speech Sciences / v.7 no.2 / pp.119-129 / 2015
  • This paper investigates the status of the merger between the mid front unrounded vowels ㅔ[e] and ㅐ[ɛ] in contemporary Korean. Our analysis is based on a balanced corpus of production and perception data from young subjects from three dialectal areas (Seoul, Daegu and Gwangju). Except for expected gender differences, the production data display no difference in the realization of these vowels, in any of the dialects. The perception data, while mostly in line with the production results, show that Seoul females tend to better discriminate the two vowels in terms of perceived height: vowels with a lower F1 are more likely to be categorized as ㅔ by this group. We then investigate the possible causes of this merger: based on an empirical study of transcribed spoken Korean, we show that the pair of vowels ㅔ/ㅐ has a very low functional load. We argue that this factor, together with the phonetic similarity of the two vowels, may have been responsible for the observed merger.
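
The abstract does not say which measure of functional load the authors used. One common formulation is the relative loss of word-level entropy when two phonemes are merged, and the sketch below illustrates that formulation only. The toy lexicon (게/개, 네/내 and one filler word) and all of its counts are invented for illustration and are not taken from the paper's corpus.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy (in bits) of a frequency distribution."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values() if c > 0)


def functional_load(word_counts, a, b):
    """Relative drop in word-level entropy when phoneme b is merged into a."""
    merged = Counter()
    for word, count in word_counts.items():
        merged[tuple(a if p == b else p for p in word)] += count
    h = entropy(word_counts)
    return (h - entropy(merged)) / h if h > 0 else 0.0


# Hypothetical toy lexicon: words are tuples of phoneme symbols, counts are made up.
toy = Counter({
    ("k", "e"): 3,        # 게 'crab'
    ("k", "ɛ"): 5,        # 개 'dog'
    ("n", "e"): 4,        # 네 'your'
    ("n", "ɛ"): 6,        # 내 'my'
    ("m", "u", "l"): 7,   # 물 'water'
})

print(functional_load(toy, "e", "ɛ"))
```

A value near zero means that merging the two vowels collapses few frequent word distinctions, which is the sense in which a contrast carries a low functional load.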

PROSODY IN SPEECH TECHNOLOGY - National project and some of our related works -

  • Hirose Keikichi
    • Proceedings of the Acoustical Society of Korea Conference / spring / pp.15-18 / 2002
  • Prosodic features of speech are known to play an important role in the transmission of linguistic information in human conversation. Their role in the transmission of para- and non-linguistic information is even greater. Despite this importance, engineering research has focused mainly on segmental features rather than prosodic ones. To promote research on prosody, the research project 'Prosody and Speech Processing' is now under way. The paper first gives a rough sketch of the project. It then introduces several prosody-related studies in progress in our laboratory: corpus-based fundamental frequency contour generation, speech rate control for dialogue-like speech synthesis, analysis of prosodic features of emotional speech, reply speech generation in spoken dialogue systems, and language modeling with prosodic boundaries.


GMM based Nonlinear Transformation Methods for Voice Conversion

  • Vu, Hoang-Gia;Bae, Jae-Hyun;Oh, Yung-Hwan
    • Proceedings of the KSPS conference / 2005.11a / pp.67-70 / 2005
  • Voice conversion (VC) is a technique for modifying the speech signal of a source speaker so that it sounds as if it were spoken by a target speaker. Most previous VC approaches used a GMM-based linear transformation function to convert the source spectral envelope to the target spectral envelope. In this paper, we propose several nonlinear GMM-based transformation functions in an attempt to deal with the over-smoothing effect of linear transformation. To obtain high-quality modifications of speech signals, our VC system is implemented using the Harmonic plus Noise Model (HNM) analysis/synthesis framework. Experimental results are reported on the English corpus MOCHA-TIMIT.
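
For orientation, the GMM-based linear transformation mentioned in the abstract is commonly written in the voice conversion literature in the joint-density form below. The abstract does not give the formula, so the notation is an assumption: $x$ is a source spectral vector, $M$ the number of mixture components, $\alpha_i$ the mixture weights, and $\mu_i^{x}, \mu_i^{y}, \Sigma_i^{xx}, \Sigma_i^{yx}$ the component means and covariance blocks for source ($x$) and target ($y$) features.

```latex
F(x) = \sum_{i=1}^{M} p(i \mid x)
       \left[ \mu_i^{y} + \Sigma_i^{yx} \left( \Sigma_i^{xx} \right)^{-1} \left( x - \mu_i^{x} \right) \right],
\qquad
p(i \mid x) = \frac{\alpha_i \, \mathcal{N}\!\left(x;\, \mu_i^{x}, \Sigma_i^{xx}\right)}
                   {\sum_{j=1}^{M} \alpha_j \, \mathcal{N}\!\left(x;\, \mu_j^{x}, \Sigma_j^{xx}\right)}
```

Because the output is a posterior-weighted average of per-component linear maps, converted spectra tend to be averaged out, which is one usual explanation of the over-smoothing effect.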


Automatic Error Correction System for Erroneous SMS Strings (SMS 변형된 문자열의 자동 오류 교정 시스템)

  • Kang, Seung-Shik;Chang, Du-Seong
    • Journal of KIISE:Software and Applications / v.35 no.6 / pp.386-391 / 2008
  • Spoken-word errors that violate grammatical or writing rules occur frequently in communication environments such as mobile phones and messengers. These unexpected errors cause problems for language processing systems in many applications, such as speech recognition and text-to-speech translation. In this paper, we propose and implement a system that automatically corrects ill-formed words and word spacing errors in SMS sentences, which have been the major causes of poor accuracy. We experimented with three methods of constructing the word correction dictionary and evaluated their results: (1) manual construction of error words from the vocabulary list of ill-formed communication languages, (2) automatic construction of an error dictionary from a manually constructed corpus, and (3) a context-dependent method of automatically constructing an error dictionary.
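
Method (2) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: it assumes the manually constructed corpus is available as aligned (ill-formed word, corrected word) pairs, uses hypothetical English SMS tokens in place of the Korean data, and ignores word spacing errors and the context-dependent variant (3). The function names `build_error_dictionary` and `correct` are made up for this example.

```python
from collections import Counter, defaultdict


def build_error_dictionary(aligned_pairs):
    """Map each ill-formed word to its most frequently observed correction."""
    votes = defaultdict(Counter)
    for wrong, right in aligned_pairs:
        if wrong != right:  # skip words that were already well-formed
            votes[wrong][right] += 1
    return {wrong: counts.most_common(1)[0][0] for wrong, counts in votes.items()}


def correct(sentence, error_dict):
    """Replace every whitespace-separated token found in the error dictionary."""
    return " ".join(error_dict.get(token, token) for token in sentence.split())


# Hypothetical aligned corpus (ill-formed token, manual correction).
pairs = [("u", "you"), ("u", "you"), ("u", "your"), ("gr8", "great"), ("see", "see")]
error_dict = build_error_dictionary(pairs)
print(correct("see u l8r gr8", error_dict))
```

Tokens that never appear in the aligned corpus, such as `l8r` here, pass through unchanged, which shows why dictionary coverage matters for this kind of approach.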

Distal Demonstrative Hitlo in Taiwanese Southern Min

  • Zhao, Yi-jing
    • Proceedings of the Korean Society for Language and Information Conference / 2007.11a / pp.522-530 / 2007
  • This article investigates the use of the distal demonstrative Hitlo in Taiwanese Southern Min (TSM) from a discourse-pragmatic perspective. The analysis is based on a 5-hour corpus of spoken data, including daily conversations, radio interviews, TV drama series, and some random examples. A total of 172 tokens of Hitlo are identified in the data. They can be divided into six categories according to their functions: first, exophoric usage, in which Hitlo refers non-linguistically to an object that can be identified in the immediate situation; second, endophoric usage, in which it refers to an element textually; third, a referent-introducing function, in which it introduces a new but identifiable referent into the conversation (the referent usually has topical importance); fourth, a hedging expression, in which it serves as a marker of imprecision; fifth, a condition-introducing marker, in which it signals an upcoming conditional sentence; and finally, a pause filler, in which it helps speakers manage speech turns or indicates their mental states. In addition, an interactive function that Hitlo is found to serve will be discussed. Moreover, a grammaticalization process involving semantic bleaching, which Hitlo is probably undergoing, is described in general terms. Finally, a filled demonstrative principle will be proposed, stating that the use of demonstratives as filled pauses may be a universal phenomenon.


Part-of-speech Tagging for Hindi Corpus in Poor Resource Scenario

  • Modi, Deepa;Nain, Neeta;Nehra, Maninder
    • Journal of Multimedia Information System / v.5 no.3 / pp.147-154 / 2018
  • Natural language processing (NLP) is an emerging research area that studies how machines can be used to perceive and alter text written in natural languages. Different tasks can be performed on natural languages by analyzing them through annotation tasks such as parsing, chunking, part-of-speech tagging and lexical analysis. These annotation tasks depend on the morphological structure of the particular natural language. The focus of this work is part-of-speech (POS) tagging for the Hindi language. Part-of-speech tagging, also known as grammatical tagging, is the process of assigning a grammatical category to each word of a given text. These categories can be noun, verb, time, date, number and so on. Hindi is the most widely used and official language of India. It is also among the top five most spoken languages of the world. For English and other languages, a diverse range of POS taggers is available, but these taggers cannot be applied to Hindi, because Hindi is one of the most morphologically rich languages and its morphological structure differs significantly from that of those languages. Thus, in this work, a POS tagger system is presented for the Hindi language. The paper presents a hybrid approach to Hindi POS tagging that combines probability-based and rule-based approaches. For tagging known words, a unigram probability model is used, whereas for tagging unknown words, various lexical and contextual features are used. Various finite state automata are constructed to demonstrate the different rules, and regular expressions are then used to implement these rules. A tagset containing 29 standard part-of-speech tags is also prepared for this task. The tagset also includes two unique tags, a date tag and a time tag, which support all possible formats. Regular expressions are used to implement all pattern-based tags such as time, date, number and special symbols. The aim of the presented approach is to increase the correctness of automatic Hindi POS tagging while limiting the need for a large human-made corpus. The hybrid approach uses a probability-based model to increase automatic tagging and a rule-based model to limit the need for an already trained corpus. The approach is based on a very small labeled training set (around 9,000 words) and yields a best precision of 96.54% and an average precision of 95.08%. It also yields a best accuracy of 91.39% and an average accuracy of 88.15%.
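
The overall shape of such a hybrid tagger can be sketched as below. This is an illustration under stated assumptions, not the authors' system: the tag names, the three regular expressions, the romanized toy training data, and the single default tag standing in for the paper's lexical and contextual rules for unknown words are all hypothetical.

```python
import re
from collections import Counter, defaultdict

# Pattern-based tags, tried in order; each must match the whole token.
PATTERN_TAGS = [
    ("DATE", re.compile(r"\d{1,2}[/-]\d{1,2}[/-]\d{2,4}")),
    ("TIME", re.compile(r"\d{1,2}:\d{2}(:\d{2})?")),
    ("NUM", re.compile(r"\d+(\.\d+)?")),
]


def train_unigram(tagged_tokens):
    """Learn the most frequent tag for every word in a small tagged corpus."""
    counts = defaultdict(Counter)
    for word, tag in tagged_tokens:
        counts[word][tag] += 1
    return {word: c.most_common(1)[0][0] for word, c in counts.items()}


def tag_tokens(tokens, unigram, fallback="NN"):
    """Apply regex rules first, then the unigram model, then a default tag."""
    tagged = []
    for token in tokens:
        for name, pattern in PATTERN_TAGS:
            if pattern.fullmatch(token):
                tagged.append((token, name))
                break
        else:
            # The fallback is a placeholder for unknown-word rules.
            tagged.append((token, unigram.get(token, fallback)))
    return tagged


# Hypothetical romanized training data.
train = [("ram", "NNP"), ("ghar", "NN"), ("jata", "VB"),
         ("jata", "VB"), ("jata", "NN"), ("hai", "AUX")]
model = train_unigram(train)
print(tag_tokens(["ram", "12/05/2018", "9:30", "42", "jata", "hai", "xyz"], model))
```

Running the patterns before the unigram lookup means dates, times and numbers need no training examples at all, which fits the stated goal of limiting the size of the labeled corpus.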

CosmoScriBe 2.0 : The development of Korean transcription tools (CosmoScriBe 2.0: 한국어 전사 도구의 개발)

  • Kwak, Sun-Dong;Chang, Moon-Soo
    • Journal of the Korean Institute of Intelligent Systems / v.24 no.3 / pp.323-329 / 2014
  • In spoken language research, a transcription process is needed to convert voice data into text. A transcription tool, a program that supports transcription, presents various information such as the content and time of each utterance and speaker information. For this reason, inexperienced computer users have trouble becoming familiar with such programs. Moreover, since few transcription tools have been developed domestically in Korea, existing tools are usually not well suited to the Korean environment. In this paper, we propose a transcription tool that supports Korean transcription and provides an easy-to-use interface for novices. A transcription support function is also provided to minimize mistakes that might occur during transcription, and a system structure is provided for data reliability. The usability of the proposed tool is evaluated according to transcription experience. The evaluation results show that the transcription process has become faster and the transcription support function more convenient.