• Title/Summary/Keyword: Sound Segmentation


Research about auto-segmentation via SVM (SVM을 이용한 자동 음소분할에 관한 연구)

  • 권호민;한학용;김창근;허강인
    • Proceedings of the IEEK Conference / 2003.07e / pp.2220-2223 / 2003
  • In this paper we use Support Vector Machines (SVMs), recently proposed as a learning method and classed with artificial neural networks, to divide continuous speech into phonemes (initial, medial, and final sounds) and then perform continuous speech recognition on the result. The decision boundary of each phoneme is determined by an algorithm that selects the most frequent class within a short interval. Recognition is performed with a Continuous Hidden Markov Model (CHMM), and the results are compared with those for phonemes segmented by visual inspection. The experiments confirm that the proposed SVM method is more effective than Gaussian Mixture Models (GMMs) for initial sounds.
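
The boundary rule in the abstract above (the most frequent class within a short interval) can be read as a sliding majority vote over per-frame labels. The following is a minimal Python sketch under that assumption, not the authors' implementation: the function names, the window length, and the labels I/M/F (initial, medial, final) are hypothetical, and the per-frame labels are taken as given from any frame classifier.

```python
from collections import Counter


def smooth_labels(frame_labels, window=3):
    """Replace each frame label with the most frequent label in a
    window centered on that frame (the window is truncated at the edges)."""
    half = window // 2
    smoothed = []
    for i in range(len(frame_labels)):
        lo = max(0, i - half)
        hi = min(len(frame_labels), i + half + 1)
        # most_common(1) returns [(label, count)] for the top label
        smoothed.append(Counter(frame_labels[lo:hi]).most_common(1)[0][0])
    return smoothed


def boundaries(labels):
    """Return the frame indices where the label differs from the previous frame."""
    return [i for i in range(1, len(labels)) if labels[i] != labels[i - 1]]


# Hypothetical per-frame classifier output with one spurious "M" at index 3.
raw = ["I", "I", "I", "M", "I", "I", "M", "M", "M", "M", "F", "F", "F", "F"]
smoothed = smooth_labels(raw, window=3)
segment_starts = boundaries(smoothed)
```

In this toy input the isolated "M" frame is voted away, so only the two genuine transitions remain as segment boundaries. A longer window removes longer spurious runs at the cost of coarser boundary placement.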


Phoneme segmentation and Recognition using Support Vector Machines (Support Vector Machines에 의한 음소 분할 및 인식)

  • Lee, Gwang-Seok;Kim, Deok-Hyun
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2010.05a / pp.981-984 / 2010
  • In this paper, we use Support Vector Machines (SVMs), a learning method classed with artificial neural networks, to segment continuous speech into phonemes (initial, medial, and final sounds) and then perform continuous speech recognition on the result. The decision boundary of each phoneme is determined by an algorithm that selects the most frequent class within a short interval. Speech recognition is performed with a Continuous Hidden Markov Model (CHMM), and the results are compared with those for phonemes segmented by visual inspection. The simulation results confirm that the proposed SVM method is more effective than Gaussian Mixture Models (GMMs) for initial sounds.


A Study of Correlation Between Phonological Awareness and Word Identification Ability of Hearing Impaired Children (청각장애 아동의 음운인식 능력과 단어확인 능력의 상관연구)

  • Kim, Yu-Kyung;Kim, Mun-Jung;Ahn, Jong-Bok;Seok, Dong-Il
    • Speech Sciences / v.13 no.3 / pp.155-167 / 2006
  • Hearing-impaired children have poor underlying perceptual knowledge of the sound system and show delayed development of the segmental organization of that system. The purpose of this study was to investigate the relationship between phonological awareness and word identification ability in hearing-impaired children. Fourteen children with moderately severe hearing loss participated. All tasks were administered individually. The phonological awareness tests consisted of syllable blending, syllable segmentation, syllable deletion, body-coda discrimination, phoneme blending, phoneme segmentation, and phoneme deletion. Word identification was examined with the closed-set Monosyllabic Words (12 items) and lists 1 and 2 of the open-set Monosyllabic Words in EARS-K. The results were as follows. First, at the task level, closed-set word identification showed a high positive correlation with coda discrimination, phoneme blending, and phoneme deletion, while open-set word identification showed a high positive correlation with phoneme blending, phoneme deletion, and phoneme segmentation. Second, at the level of phonological awareness, closed-set word identification showed a high positive correlation with body-coda awareness and phoneme awareness, while open-set word identification showed a high positive correlation only with phoneme awareness.


A Study on Consonant/Vowel/Unvoiced Consonant Phonetic Value Segmentation and Recognition of Korean Isolated Word Speech (한국어 고립 단어 음성의 자음/모음/유성자음 음가 분할 및 인식에 관한 연구)

  • Lee, Jun-Hwan;Lee, Sang-Beom
    • The Transactions of the Korea Information Processing Society / v.7 no.6 / pp.1964-1972 / 2000
  • Acoustically, Korean speech is realized as phonetic values that differ from the underlying phonemes because of properties peculiar to the language. Therefore, an extended recognition system for understanding Korean should be built on a study of a Korean rule-based system before that system can be used for post-processing in Korean recognition. In this paper, a text-based Korean rule-based system featuring the sound-change rules peculiar to Korean is constructed. Based on the text-based phonetic value results of that system, preliminary phonetic value segmentation border points with non-uniform blocks are extracted from Korean isolated word speech. By merging and recognizing the non-uniform blocks between the extracted border points, the feasibility of recognizing Korean speech in the form of phonetic values is investigated.


Korean Speech Segmentation and Recognition by Frame Classification via GMM (GMM을 이용한 프레임 단위 분류에 의한 우리말 음성의 분할과 인식)

  • 권호민;한학용;고시영;허강인
    • Proceedings of the Korea Institute of Convergence Signal Processing / 2003.06a / pp.18-21 / 2003
  • Dividing continuous speech into short intervals of identical phoneme quality has generally been considered a difficult problem. In this paper we use a Gaussian Mixture Model (GMM), a probability density model, to divide speech into phonemes (initial, medial, and final sounds), and then perform continuous speech recognition on the result. The decision boundary of each phoneme is determined by an algorithm that selects the most frequent class within a short interval. Recognition is performed with a Continuous Hidden Markov Model (CHMM), and the results are compared with those for phonemes segmented by visual inspection. The experimental results confirm that the presented method performs relatively well for automatic segmentation of Korean speech.


Desktop program production

  • Enami, Kazumasa;Fukui, Kazuo;Yagi, Nobuyuki
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 1996.06b / pp.77-81 / 1996
  • To meet the need for effective program production in the multimedia era, we are studying a Desk Top Program Production (DTPP) system. With DTPP, users can easily produce multimedia programs that include video, sound, and ancillary data, and can freely handle video images by synthesizing video components retrieved from a video database. This paper describes the new program production system, DTPP, and its key technologies: cooperative program production over a multimedia network, indexing and use of image attribute information, and image segmentation and spatio-temporal editing.


Hardware Design of Enhanced Real-Time Sound Direction Estimation System (향상된 실시간 음원방향 인지 시스템의 하드웨어 설계)

  • Kim, Tae-Wan;Kim, Dong-Hoon;Chung, Yun-Mo
    • The Journal of the Acoustical Society of Korea / v.30 no.3 / pp.115-122 / 2011
  • In this paper, we present a method for accurately estimating the direction of a sound source in real time, based on the time delay of arrival obtained by generalized cross correlation with four cross-type microphones. Existing systems generally have two disadvantages: they are hard to embed because data must be acquired from the microphone input for signal processing, and real-time processing on DSP processors becomes difficult as the number of channels used for direction estimation increases. To cope with these disadvantages, this paper proposes a hardware design for enhanced real-time processing using microphone array signal processing. Accurate direction estimation and a shorter design time are achieved through an efficient hardware design that uses spatial segmentation methods and verification techniques. Finally, we develop a system for embedded use built on a sound codec and an FPGA chip. According to the experimental results, the system processes data much faster in real time than either PC-based systems or DSP-based implementations.
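
The time-delay step named in the abstract above can be illustrated in software for a single microphone pair. This is a minimal NumPy sketch, not the paper's hardware design: the abstract says only "generalized cross correlation", so the PHAT weighting used here is an assumption, and the function name, the sampling rate, and the synthetic test signal are hypothetical.

```python
import numpy as np


def gcc_phat_delay(sig, ref, fs):
    """Estimate how many seconds `sig` lags `ref` using generalized
    cross correlation with PHAT weighting (assumed weighting)."""
    n = len(sig) + len(ref)                 # zero-pad to avoid circular wrap-around
    sig_f = np.fft.rfft(sig, n=n)
    ref_f = np.fft.rfft(ref, n=n)
    cross = sig_f * np.conj(ref_f)          # cross-power spectrum
    cross /= np.abs(cross) + 1e-12          # PHAT: keep phase, discard magnitude
    cc = np.fft.irfft(cross, n=n)           # correlation over circular lags
    max_shift = n // 2
    # Reorder so the array runs from lag -max_shift to lag +max_shift.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    shift = int(np.argmax(cc)) - max_shift  # lag of the correlation peak, in samples
    return shift / fs


# Synthetic check: `sig` is `ref` delayed by 5 samples.
fs = 16000
rng = np.random.default_rng(0)
ref = rng.standard_normal(1024)
sig = np.concatenate((np.zeros(5), ref[:-5]))
delay_samples = round(gcc_phat_delay(sig, ref, fs) * fs)
```

For one pair with spacing d and speed of sound c, a far-field arrival angle follows from arcsin(c·τ/d), where τ is the estimated delay. How the paper combines the delays from its four microphones is not stated in the abstract.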

Phonological Awareness in Hearing Impaired Children (청각장애아동의 음운인식능력에 대한 연구)

  • Park, Sang-Hee;Seok, Dong-Il;Jeong, Ok-Ran
    • Speech Sciences / v.9 no.2 / pp.193-202 / 2002
  • The purpose of this study is to examine the phonological awareness of hearing-impaired children. A number of studies indicate that hearing-impaired children have articulation disorders due to impaired auditory feedback. However, children who are able to distinguish certain phonemes sometimes still misarticulate those phonemes. Phonological awareness refers to recognizing the speech-sound units and their forms in spoken language (Hong, 2001). The subjects were four hearing-impaired children (three with cochlear implants and one with a hearing aid). Phonological awareness was evaluated with the test battery developed by Paik et al. (2001). The subtests consisted of rhyme matching, onset matching I and II, and word initial segmentation and matching I and II. If a child asked for an item to be repeated, it was repeated up to four times. Each item was worth 1 point. The results were compared with those of Paik et al. (2001). In rhyme matching, subject 1 showed superior ability, subjects 2 and 3 fair ability, and subject 4 inferior ability. In onset matching I, all subjects except subject 3 showed inferior ability. Interestingly, subject 1 showed the lowest onset matching I score. In word initial segmentation and matching I, subjects 1 and 4 showed inferior ability and subjects 2 and 3 showed fair ability. In onset matching II, subject 2 achieved a perfect score of 10 even though she showed a very low score. In word initial segmentation and matching II, only subjects 2 and 3 showed appropriate levels of the skill. The results show that the phonological awareness of hearing-impaired children differs from that of normal children.


Retrieval of Player Event in Golf Videos Using Spoken Content Analysis (음성정보 내용분석을 통한 골프 동영상에서의 선수별 이벤트 구간 검색)

  • Kim, Hyoung-Gook
    • The Journal of the Acoustical Society of Korea / v.28 no.7 / pp.674-679 / 2009
  • This paper proposes a method of player event retrieval that combines two functions: detection of the player name in speech information and detection of sound events in audio information from golf videos. The system consists of an indexing module and a retrieval module. At indexing time, audio segmentation and noise reduction are applied to the audio stream demultiplexed from the golf videos. The noise-reduced speech is then fed into a speech recognizer, which outputs spoken descriptors. The player name and sound event are indexed by the spoken descriptors. At search time, the text query is converted into phoneme sequences. The lists for each query term are retrieved through a description matcher to identify full and partial phrase hits. For player name retrieval, this paper compares the results of word-based, phoneme-based, and hybrid approaches.

Ortho-phonic Alphabet Creation by the Musical Theory and its Segmental Algorithm (악리론으로 본 정음창제와 정음소 분절 알고리즘)

  • Chin, Yong-Ohk;Ahn, Cheong-Keung
    • Speech Sciences / v.8 no.2 / pp.49-59 / 2001
  • Phoneme segmentation is a very difficult problem in speech processing because the segmentation algorithm has to be found among many kinds of allophone and coarticulation trees. As a result, systems for speech recognition and voice retrieval have a complex structure. To address this, we discuss the possibility of a new segmentation algorithm based on the tripartition method of subtracting or adding one third (삼분손익) used for the twelve-pitch temperament (12 율려), first proposed by Prof. T. S. Han. It is close to both Eastern and Western musical theory. He has also suggested three consonant and three vowel phonemes in Hunminjungum (훈민정음), invented by King Sejong in the 15th century. In this paper, we propose naming it the ortho-phonic phoneme (OPP/정음소), which carries the meaning of 'absoluteness and independence'. OPP is also applicable to other languages, for example via the IPA. Finally, we find that this algorithm applies consistently across languages and is very useful for building voice recognition and retrieval systems.
