• Title/Summary/Keywords: Speech sound

Search results: 628 items (processing time: 0.028 s)

기능적 조음장애아동과 일반아동의 어중자음 연쇄조건에서 나타나는 어중종성 오류 특성 비교 (Comparison of error characteristics of final consonant at word-medial position between children with functional articulation disorder and normal children)

  • 이란;이은주 / 말소리와 음성과학 / Vol. 7, No. 2 / pp. 19-28 / 2015
  • This study investigated the characteristics of final consonant errors at word-medial position in children with functional articulation disorder. Data were collected from 11 children with functional articulation disorder and 11 normal children, ages 4 to 5. Speech samples were collected from a naming test using 75 words covering every possible bi-consonant combination at the word-medial position. The results were as follows. First, the percentage of correct word-medial final consonants was lower for children with functional articulation disorder than for normal children. Second, there were significant differences between the two groups in omission, substitution, and assimilation errors. Children with functional articulation disorder showed a high frequency of omission and regressive assimilation errors, with alveolarization the most frequent regressive assimilation error, whereas normal children showed a high frequency of regressive assimilation errors, with bilabialization the most frequent. Finally, when errors were analyzed according to the articulation manner, articulation place, and phonation type of the word-medial initial consonant, both groups showed a high error rate in the stop-stop condition. The error rate of the word-medial final consonant was high when the word-medial initial consonant was an alveolar or alveopalatal sound. Furthermore, more errors occurred when the initial sound was a fortis or aspirated sound than when it was a lenis sound. These results provide practical error characteristics of final consonants at word-medial position in children with speech sound disorder.
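
For illustration, the headline measure above (percentage of correct word-medial final consonants) and a coarse error tally can be computed from transcribed target/production pairs. The `(target, produced)` pair representation, with `None` marking an omitted consonant, is a hypothetical simplification; the study's actual coding also distinguishes assimilation errors, which require word context.

```python
from collections import Counter

def score_medial_finals(pairs):
    """Percent correct word-medial final consonants plus an error-type tally.

    pairs: list of (target, produced) consonants; produced=None means omission.
    """
    counts = Counter()
    for target, produced in pairs:
        if produced is None:
            counts["omission"] += 1          # consonant deleted entirely
        elif produced == target:
            counts["correct"] += 1
        else:
            counts["substitution"] += 1      # a different consonant was produced
    pct_correct = 100.0 * counts["correct"] / len(pairs)
    return pct_correct, counts
```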

음성 분리를 위한 스펙트로그램의 마루와 골을 이용한 시간-주파수 공간에서 소리 분할 기법 (A Method of Sound Segmentation in Time-Frequency Domain Using Peaks and Valleys in Spectrogram for Speech Separation)

  • 임성길;이현수 / 한국음향학회지 / Vol. 27, No. 8 / pp. 418-426 / 2008
  • This paper proposes a frequency-channel segmentation algorithm that uses the peaks and valleys of the spectrogram. The frequency-channel segmentation problem is to group together the frequency channels that contain speech originating from the same sound source. The proposed algorithm is based on a smoothed spectrum of the input signal: peaks and valleys in the smoothed spectrum are used to determine the centers and the boundaries of segments, respectively. To evaluate the usefulness of the segmentation results before the grouping stage, in which each segment is assigned to a single sound, the proposed method is compared with segmentation based on an ideal mask. The proposed method was tested on speech mixed with narrowband noise, wideband noise, and another speech signal.
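
The core idea (valleys as segment boundaries, peaks as segment centers in a smoothed spectrum) can be sketched for a single spectral frame. The moving-average smoothing window and the peak-picking rules below are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def segment_spectrum(mag, smooth=5):
    """Split one magnitude-spectrum frame into segments bounded by valleys.

    Returns a list of (start, end) channel ranges, each containing a peak.
    """
    # smooth the spectrum with a short moving average
    kernel = np.ones(smooth) / smooth
    s = np.convolve(mag, kernel, mode="same")
    # local maxima mark segment centers, local minima mark boundaries
    peaks = [i for i in range(1, len(s) - 1) if s[i - 1] < s[i] >= s[i + 1]]
    valleys = [i for i in range(1, len(s) - 1) if s[i - 1] > s[i] <= s[i + 1]]
    bounds = [0] + valleys + [len(s) - 1]
    segments = []
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        if any(lo < p < hi for p in peaks):   # keep ranges that contain a peak
            segments.append((lo, hi))
    return segments
```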

피치 알고리즘 수정 및 소음에의 적용 (Modification of Pitch Algorithm and Its Application to Noise)

  • Shin, Sung-Hwan; Ih, Jeong-Guon / 한국소음진동공학회 Conference Proceedings / Abstracts of the 2002 Fall Conference / pp. 354.1-354 / 2002
  • Pitch is a perception related to frequency, one of the psychological attributes of tones, and, together with loudness and timbre, an important factor in determining sound quality. While pitch has been actively studied in the areas of speech recognition and speech separation, studies on the analysis and improvement of product sound quality are not yet sufficient. (omitted)


Hardware Implementation for Real-Time Speech Processing with Multiple Microphones

  • Seok, Cheong-Gyu; Choi, Jong-Suk; Kim, Mun-Sang; Park, Gwi-Tea / 제어로봇시스템학회 Conference Proceedings / ICCAS 2005 / pp. 215-220 / 2005
  • Nowadays, various speech processing systems are being introduced in the field of robotics. However, real-time processing and high performance are required to properly implement a speech processing system for autonomous robots. Achieving these goals requires advanced hardware techniques as well as intelligent software algorithms. For example, we need nonlinear amplifier boards whose compression ratio (CR) can be adjusted via computer programming. The necessity of noise reduction, double-buffering on an EPLD (erasable programmable logic device), simultaneous multi-channel AD conversion, and distant sound localization is also explained in this paper. These ideas can be used to improve distant and omni-directional speech recognition. This speech processing system, based on an embedded Linux system, is to be mounted on the new home service robot being developed at KIST (Korea Institute of Science and Technology).


Sinusoidal Model을 이용한 Cochannel상에서의 음성분리에 관한 연구 (A Study on Speech Separation in Cochannel using Sinusoidal Model)

  • 박현규;신중인;박상희 / 대한전기학회 Conference Proceedings / Proceedings of the 1997 Fall Conference (Headquarters) / pp. 597-599 / 1997
  • Cochannel speaker separation is employed when speech from two talkers has been summed into one signal and it is desirable to recover one or both of the speech signals from the composite signal. Cochannel speech occurs in many common situations, such as when two AM signals containing speech are transmitted on the same frequency or when two people speak simultaneously (e.g., when talking on the telephone). In this paper, a method for separating speech in such situations is proposed. In particular, only the voiced sounds among the speech states are separated. The similarity between the original and separated signals is then verified by their cross-correlation.


음성압축을 위한 전처리기법의 비교 분석에 관한 연구 (A Study on a Analysis and Comparison of Preprocessing Technique for the Speech Compression)

  • 장경아;민소연;배명진 / 음성과학 / Vol. 10, No. 4 / pp. 125-136 / 2003
  • Speech coding techniques have been studied not only to reduce complexity and bit rate but also to improve sound quality. The CELP-type vocoder, used in several standards, provides good sound quality even at low bit rates. In this paper, unlike conventional vocoders, the input speech is preprocessed to reduce the bit rate. Several kinds of parameters can be used for this preprocessing, so this paper compares them to find the parameter most appropriate for the vocoder. Because the parameters are used to synthesize the speech rather than in the encoding or decoding itself, we propose a simple algorithm that does not affect the processing or computation time. The parameters used in the preprocessing step are speaking rate, duration, and the PSOLA technique.
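
Of the preprocessing parameters mentioned, a change of speaking rate or duration can be illustrated with a naive overlap-add (OLA) time stretch. This is a simplification of PSOLA, which additionally places analysis windows at pitch marks to preserve pitch; the frame and hop sizes here are arbitrary choices.

```python
import numpy as np

def ola_time_stretch(x, rate, frame=1024, hop=256):
    """Naive overlap-add time stretch.

    rate > 1 shortens the signal (faster speech), rate < 1 lengthens it.
    Analysis hop = hop * rate, synthesis hop = hop.
    """
    win = np.hanning(frame)
    a_hop = int(hop * rate)
    n_frames = max(1, (len(x) - frame) // a_hop + 1)
    out = np.zeros(frame + (n_frames - 1) * hop)
    norm = np.zeros_like(out)
    for i in range(n_frames):
        seg = x[i * a_hop : i * a_hop + frame]
        if len(seg) < frame:
            break
        out[i * hop : i * hop + frame] += seg * win     # overlap-add windowed frames
        norm[i * hop : i * hop + frame] += win          # track window overlap for gain
    norm[norm < 1e-8] = 1.0
    return out / norm
```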


Investigation of the Speech Intelligibility of Classrooms Depending on the Sound Source Location

  • Kim Jeong Tai; Haan Chan-Hoon / The Journal of the Acoustical Society of Korea / Vol. 24, No. 4E / pp. 139-143 / 2005
  • The present study aims to investigate the effect of speaker location on speech intelligibility in a classroom. To do this, acoustic measurements were undertaken in a classroom with three different sound source locations: the center of the front wall (FC), both sides of the front wall (FS), and the center of the ceiling (CC). SPL, RT, $D_{50}$, and RASTI were measured at 9 measurement points with the same sound power level, using an MLS signal as the source. Subjective listening tests were also carried out using Korean-language listening materials recorded in an anechoic chamber. The recorded syllables were replayed and re-recorded in the classroom with the source at each of the three locations, and listening tests were undertaken with 20 respondents, who were asked to write down the syllables they heard. The results show that higher speech intelligibility ($D_{50}$ of $47\%$, RASTI of 0.56) was obtained when the sound source was located at the FS, and that high intelligibility was obtained in the areas near the walls.
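
$D_{50}$ (Deutlichkeit, "definition") is the ratio of the energy arriving within the first 50 ms after the direct sound to the total energy of the room impulse response. A minimal sketch, assuming a measured impulse response `ir` sampled at `fs`:

```python
import numpy as np

def d50(ir, fs):
    """D50: early-to-total energy ratio of a room impulse response.

    D50 = energy in the first 50 ms after the direct sound / total energy.
    """
    start = int(np.argmax(np.abs(ir)))           # arrival of the direct sound
    early = ir[start : start + int(0.05 * fs)]   # first 50 ms
    total = ir[start:]
    return float(np.sum(early**2) / np.sum(total**2))
```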

An Electropalatographic Study of English l, r and the Korean Liquid Sound ㄹ

  • Ahn, Soo-Woong / 음성과학 / Vol. 8, No. 2 / pp. 93-106 / 2001
  • The pronunciation of English l and r has been a consistent problem in learning English in Korea as well as in Japan. This problem arises from the fact that Korean and Japanese each have only one liquid sound, and substituting the Korean liquid for English l and r is a common error. The pronunciation of the dark l causes a further problem in pronouncing the English l sound. To examine the relationship between English l, r, and the Korean liquid sound, an electropalatographic (EPG) experiment was conducted. The findings were: (1) there was no tongue contact on either the alveolar ridge or the palate during the articulation of the dark l; (2) the Korean liquid sound differed in its tongue contact points from both English l and r. The English clear l consistently touched the alveolar ridge in all forty tokens, whereas the Korean liquid in intervocalic and word-final position touched mainly the alveopalatal area, and the English r touched exclusively the velar area. The Korean intervocalic /l/ was similar to the English flap in the EPG and spectrographic data, and there was evidence that the word-final Korean /l/ is a lateral.


CPSP 문턱값 설정을 통한 음원도달 방향 추정 성능 개선 (Performance Improvement of Sound Direction of Arrival Estimation by Applying Threshold to CPSP)

  • 전성일;배건성 / 말소리와 음성과학 / Vol. 3, No. 3 / pp. 109-114 / 2011
  • To estimate the direction of arrival of a sound with a pair of microphones, a method based on Time Difference of Arrival (TDOA) estimation using the Cross Power Spectrum Phase (CPSP) function is widely used due to its simplicity and good performance. In this paper, we investigate the maximum CPSP values under various SNRs and adverse environments, and propose a novel method to improve the performance of sound direction-of-arrival estimation. The proposed method applies a threshold to the CPSP values, which increases the reliability of the estimated direction. Through computer simulations at various SNRs, we validate the effectiveness of the proposed method: when the threshold was set to 0.1, a success rate of more than 90% was achieved for source directions of $10^{\circ}$, $40^{\circ}$, and $70^{\circ}$, even with reverberation times of 0.1 s.
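
The CPSP function is the phase-only (GCC-PHAT-style) cross-correlation of the two microphone signals, and the proposed thresholding discards frames whose CPSP peak is too low. A minimal sketch of this idea; the sign convention and the 0.1 default threshold here are illustrative, not the paper's exact implementation.

```python
import numpy as np

def cpsp_tdoa(x1, x2, fs, threshold=0.1):
    """TDOA estimate from the CPSP function with a reliability threshold.

    Returns the delay (in seconds) of x2 relative to x1, or None when the
    CPSP peak is below the threshold, i.e. the estimate is judged unreliable.
    """
    n = len(x1) + len(x2)
    X1 = np.fft.rfft(x1, n)
    X2 = np.fft.rfft(x2, n)
    cross = X2 * np.conj(X1)
    # whiten the cross spectrum: keep only the phase (CPSP / PHAT weighting)
    cpsp = np.fft.irfft(cross / (np.abs(cross) + 1e-12), n)
    # rearrange circular lags into the range -(len(x2)-1) .. len(x1)-1
    cpsp = np.concatenate((cpsp[-(len(x2) - 1):], cpsp[: len(x1)]))
    peak = float(np.max(cpsp))
    if peak < threshold:
        return None                                  # discard unreliable frame
    lag = int(np.argmax(cpsp)) - (len(x2) - 1)
    return lag / fs
```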


음성 연령에 대한 음향학적 분석: 동음을 중심으로 (Acoustic Analysis of the Aging Voice: Baby Voice)

  • 김지채;한지연;정옥란 / 대한음성학회 Conference Proceedings / Proceedings of the 2006 Fall Conference / pp. 127-130 / 2006
  • The purpose of this study is to examine the difference in acoustic features between Young Voices and Aged Voices that actually come from the same age group. Twelve female subjects in their thirties participated and recorded a sustained vowel /a/, connected speech, and a reading passage. Their voices were divided into Younger Voices and Aged Voices, that is, voices that sound like a younger person and voices that sound their age or older. Praat 4.4.22 was used to record the speech and to analyze acoustic features such as F0, SFF, jitter, shimmer, HNR, and pitch range. Six female listeners guessed each subject's age and judged whether she sounded younger than or like her actual age. An independent t-test was used to find significant differences between the two groups' acoustic features, and the results show a significant difference in F0 and SFF. Together with previous studies, this tells us that the group with younger- or baby-like voices has acoustic features similar to those of actually young people.
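
Jitter and shimmer, two of the features analyzed, are relative cycle-to-cycle perturbations of period and amplitude. A minimal sketch from pre-extracted per-cycle values; Praat defines several variants, and this is only the "local" form:

```python
import numpy as np

def jitter_shimmer(periods, amplitudes):
    """Local jitter and shimmer from per-cycle periods (s) and peak amplitudes.

    jitter  = mean |T[i] - T[i-1]| / mean T   (relative period perturbation)
    shimmer = mean |A[i] - A[i-1]| / mean A   (relative amplitude perturbation)
    """
    T = np.asarray(periods, float)
    A = np.asarray(amplitudes, float)
    jitter = np.mean(np.abs(np.diff(T))) / np.mean(T)
    shimmer = np.mean(np.abs(np.diff(A))) / np.mean(A)
    return jitter, shimmer
```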
