• Title/Summary/Keyword: speech sound

Search Result 625

A literature review on diagnostic markers and subtype classification of children with speech sound disorders (원인을 모르는 말소리장애의 하위유형 분류 및 진단 표지에 관한 문헌 고찰)

  • Yi, Roo-Dah; Kim, Soo-Jin
    • Phonetics and Speech Sciences, v.14 no.2, pp.87-99, 2022
  • A review of the indicators used in Korean research is needed to develop a diagnostic marker system for Korean children with speech sound disorders (SSD). This literature review examined research conducted to reveal the characteristics of children with SSD of unknown origin in Korea. Researchers in Korea have used diverse variables as indicators to identify the characteristics of children with SSD, including indicators related to the external characteristics of speech sounds as well as comorbid features beyond those external aspects. Attention has so far been concentrated on a few specific indicators. This implies that some indicators may still require closer study from various angles because of their influence, while others may deserve more attention because of the limited number of studies addressing them. This article argues that more research is necessary to comprehensively describe the unique characteristics of children with SSD of unknown origin, suggests a direction for future research on diagnostic markers and subtype classification of SSD, and proposes potential diagnostic markers and a set of assessments for the subtype classification of SSD.

A Study Of The Meaningful Speech Sound Block Classification Based On The Discrete Wavelet Transform (Discrete Wavelet Transform을 이용한 음성 추출에 관한 연구)

  • Baek, Han-Wook; Chung, Chin-Hyun
    • Proceedings of the KIEE Conference, 1999.07g, pp.2905-2907, 1999
  • Classification of meaningful speech sound blocks provides very important information for speech recognition. The classification technique described here is based on the DWT (discrete wavelet transform), which provides a faster algorithm and a useful, compact solution for the pre-processing stage of speech recognition. The algorithm is applied to unvoiced/voiced classification and to denoising.

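The abstract names the pipeline but not its parameters. Below is a minimal Python sketch of DWT-based voiced/unvoiced frame classification, assuming a Daubechies 'db4' wavelet, a 3-level decomposition, and an energy-ratio threshold of 0.7; none of these values come from the paper.

```python
# Minimal sketch of DWT-based voiced/unvoiced classification. The wavelet,
# decomposition level, and 0.7 threshold are illustrative assumptions.
import numpy as np
import pywt  # PyWavelets

def classify_frame(frame, wavelet="db4", level=3, voiced_ratio=0.7):
    """Label one speech frame from its DWT energy split.

    Voiced frames concentrate energy in the low-frequency approximation
    band; unvoiced (fricative-like) frames push energy into the detail
    bands.
    """
    coeffs = pywt.wavedec(frame, wavelet, level=level)
    approx = np.sum(coeffs[0] ** 2)                    # low-band energy
    details = sum(np.sum(c ** 2) for c in coeffs[1:])  # high-band energy
    total = approx + details
    if total < 1e-12:
        return "silence"
    return "voiced" if approx / total > voiced_ratio else "unvoiced"
```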

Acoustic Analysis of Speech Disorder Associated with Motor Aphasia - A Case Report -

  • Ko, Myung-Hwan; Kim, Hyun-Ki; Kim, Yun-Hee
    • Speech Sciences, v.7 no.1, pp.97-107, 2000
  • Motor aphasia is a condition frequently caused by insult to the left middle cerebral artery and is usually accompanied by a large lesion involving Broca's area and the adjacent motor and premotor areas. A patient with motor aphasia therefore commonly shows articulatory disturbances due to failure of the motor programming of speech sounds. Objective assessment and treatment of phonologic programming is one of the important aspects of speech therapy for aphasic patients. We analyzed the speech disorders accompanying motor aphasia in a 45-year-old man using a computerized sound spectrograph, Visi-Pitch®, and the Multi-Dimensional Voice Program®. We concluded that a computerized speech analysis system is a useful tool for visualizing and quantitatively analyzing the severity and progression of dysarthria and the effect of speech therapy.


Two Simultaneous Speakers Localization using harmonic structure (하모닉 구조를 이용한 두 명의 동시 발화 화자의 위치 추정)

  • Kim, Hyun-Kyung; Lim, Sung-Kil; Lee, Hyon-Soo
    • Proceedings of the KSPS conference, 2005.11a, pp.121-124, 2005
  • In this paper, we propose a sound localization algorithm for two simultaneous speakers. Because speech is a wide-band signal, there are many frequency sub-bands in which the two speech sounds are mixed. In some sub-bands, however, one speech sound is more dominant than the other. In such sub-bands, the dominant speech sound suffers little interference from the other speech or from noise. In speech, the overtones of the fundamental frequency have large amplitudes; this is called the 'harmonic structure' of speech. Sub-bands belonging to the harmonic structure are more likely to be dominant. The proposed localization algorithm is therefore based on the harmonic structure of each speaker. First, the sub-bands belonging to the harmonic structure of each speech signal are selected; the two speakers are then localized using the selected sub-bands. Simulation results show that localization using the selected sub-bands is more efficient and precise than localization using all sub-bands.

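A minimal Python sketch of the harmonic sub-band selection idea, assuming each speaker's fundamental frequency has already been estimated and that the inter-microphone delay is read from a PHAT-weighted cross-spectrum restricted to that speaker's harmonics; the harmonic search width, harmonic count, and this particular delay estimator are illustrative assumptions, and the paper's exact band grouping may differ.

```python
# Sketch of per-speaker localization from harmonic sub-bands. f0 per speaker
# is assumed known; width_hz and n_harmonics are illustrative assumptions.
import numpy as np

def harmonic_bins(f0, n_harmonics, fft_size, fs, width_hz=20.0):
    """Return FFT bin indices within width_hz of each harmonic of f0."""
    freqs = np.fft.rfftfreq(fft_size, 1.0 / fs)
    bins = []
    for k in range(1, n_harmonics + 1):
        bins.extend(np.where(np.abs(freqs - k * f0) < width_hz)[0])
    return np.unique(bins)

def localize_speaker(x_left, x_right, f0, fs, n_harmonics=10):
    """Estimate one speaker's TDOA using only that speaker's harmonics.

    The phase of the cross-spectrum over the selected bins yields the delay
    between the two microphones (a GCC-PHAT-style estimate restricted to
    the dominant sub-bands of one speaker).
    """
    n = len(x_left)
    X_l, X_r = np.fft.rfft(x_left), np.fft.rfft(x_right)
    cross = X_l * np.conj(X_r)
    sel = harmonic_bins(f0, n_harmonics, n, fs)
    mask = np.zeros_like(cross)
    mask[sel] = cross[sel] / (np.abs(cross[sel]) + 1e-12)  # PHAT weighting
    corr = np.fft.irfft(mask, n)
    lag = int(np.argmax(np.abs(corr)))
    if lag > n // 2:                                       # wrap negative lags
        lag -= n
    return lag / fs  # TDOA in seconds; the sign gives left/right direction
```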

A DSP Implementation of Subband Sound Localization System

  • Park, Kyusik
    • The Journal of the Acoustical Society of Korea, v.20 no.4E, pp.52-60, 2001
  • This paper describes a real-time implementation of a subband sound localization system on a floating-point DSP, the TI TMS320C31. The system determines the two-dimensional location of an active speaker in a closed-room environment with real noise present. It consists of a two-microphone array connected to the TI DSP hosted by a PC. The implemented sound localization algorithm is Subband CPSP, an improved version of the traditional CPSP (Cross-Power Spectrum Phase) method. The algorithm first splits the input speech signal into an arbitrary number of subbands using subband filter banks and calculates the CPSP in each subband. It then averages the CPSP results over the subbands and computes a source location estimate. The proposed algorithm has an advantage over CPSP in that it minimizes the overall estimation error in source location by confining band-dominant noise to its own subband, making a robust real-time sound localization system possible. For real-time operation, the input speech is captured by the two microphones and digitized by the DSP at a sampling rate of 8192 Hz, 16 bits/sample. The source location is then estimated once per second to satisfy real-time computational constraints. The performance of the proposed system is confirmed by several real-time tests with speech at distances of 1 m, 2 m, and 3 m and various source locations, showing over 5% improvement in source location estimation accuracy.

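A minimal sketch of the Subband CPSP delay estimate between the two microphone signals, assuming a uniform split of the spectrum into bands and a per-band peak normalization so that a band dominated by noise cannot outweigh the rest; the paper's filter bank design and its mapping from delay to a 2-D position are not reproduced here.

```python
# Sketch of Subband CPSP between microphone signals x1, x2. The uniform band
# split and per-band peak normalization are assumptions made for this sketch.
import numpy as np

def subband_cpsp_delay(x1, x2, fs, n_bands=8):
    """Estimate the inter-microphone delay (seconds) via Subband CPSP."""
    n = len(x1)
    X1, X2 = np.fft.rfft(x1), np.fft.rfft(x2)
    cpsp = X1 * np.conj(X2)
    cpsp /= np.abs(cpsp) + 1e-12               # CPSP/PHAT normalization
    edges = np.linspace(0, len(cpsp), n_bands + 1, dtype=int)
    corr = np.zeros(n)
    for lo, hi in zip(edges[:-1], edges[1:]):
        band = np.zeros_like(cpsp)
        band[lo:hi] = cpsp[lo:hi]
        band_corr = np.fft.irfft(band, n)      # correlation from one band
        peak = np.max(np.abs(band_corr))
        corr += band_corr / (peak + 1e-12)     # equal weight per band
    lag = int(np.argmax(corr))
    if lag > n // 2:                           # wrap negative lags
        lag -= n
    return lag / fs
```

Normalizing each band's correlation before averaging is what keeps a noise-dominated band from swamping the estimate; summing the raw band correlations would simply reconstruct the full-band CPSP.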

The Phoneme Synthesis of Korean CV Mono-Syllables (한국어 CV단음절의 음소합성)

  • 안점영; 김명기
    • The Journal of Korean Institute of Communications and Information Sciences, v.11 no.2, pp.93-100, 1986
  • We analyzed Korean CV mono-syllables consisting of concatenations of the consonants /k, t, p, g/, their fortis and rough (aspirated) counterparts, and the vowels /a, e, o, u, i/ using the PARCOR technique, and then synthesized the speech by phoneme synthesis, controlling the analyzed data. In the analysis, consonant duration decreases in the order rough sound, lenis, fortis, and consonant gain decreases in the same order. The pitch period increases progressively in vowels following the rough sound, the fortis, and the lenis, in that order. We synthesized the lenis and the fortis by controlling the duration and gain of the rough sound, and the vowels following the fortis and the rough sound by controlling the pitch period and duration of the vowels following the lenis. The resulting synthesized speech quality is good, and we confirmed that it is possible to formulate rules for phoneme synthesis of Korean speech.

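The PARCOR technique referred to above extracts reflection coefficients from each analysis frame. A minimal Python sketch via the Levinson-Durbin recursion follows; the analysis order of 12 is an illustrative assumption, not the paper's setting.

```python
# Sketch of PARCOR (reflection-coefficient) analysis of one windowed speech
# frame via Levinson-Durbin. The order of 12 is an illustrative assumption.
import numpy as np

def parcor(frame, order=12):
    """Return PARCOR coefficients k[1..order] for one speech frame."""
    # Autocorrelation lags r[0..order]
    r = np.correlate(frame, frame, mode="full")[len(frame) - 1:][:order + 1]
    a = np.zeros(order + 1)        # predictor coefficients (a[0] unused)
    k = np.zeros(order)            # reflection (PARCOR) coefficients
    err = r[0]                     # prediction error energy
    for i in range(1, order + 1):
        acc = r[i] - np.dot(a[1:i], r[i - 1:0:-1])
        k[i - 1] = acc / err
        a_new = a.copy()
        a_new[i] = k[i - 1]
        a_new[1:i] = a[1:i] - k[i - 1] * a[i - 1:0:-1]
        a = a_new
        err *= (1.0 - k[i - 1] ** 2)   # error shrinks at each order
    return k
```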

Fillers in the Hong Kong Corpus of Spoken English (HKCSE)

  • Seto, Andy
    • Asia Pacific Journal of Corpus Research, v.2 no.1, pp.13-22, 2021
  • The present study employed an analytical framework characterised by a synthesis of quantitative and qualitative analyses, using specially designed computer software, SpeechActConc, to examine speech acts in business communication. The naturally occurring data from the audio recordings and prosodic transcriptions of the business sub-corpora of the HKCSE (prosodic) were manually annotated with a speech act taxonomy to find the frequency of fillers, the patterns in which fillers co-occur with other speech acts, and the linguistic realisations of fillers. The discoursal function of fillers, to sustain the discourse or hold the floor, has diverse linguistic realisations, ranging from a sound (e.g. 'uhuh') or a word (e.g. 'well') to sounds (e.g. 'um er') and words, namely phrases ('sort of') and clauses (e.g. 'you know'). Some are even combinations of sound(s) and word(s) (e.g. 'and um', 'yes er um', 'sort of erm'). Among the top five frequent linguistic realisations of fillers, 'er' and 'um' are the most common, found in all six genres with relatively high percentages of occurrence. The remaining more frequent realisations consist of a clause ('you know'), a word ('yeah'), and a sound ('erm'). These common forms are syntactically simpler than the less frequent realisations found in the genres. The patterns in which fillers co-occur with other speech acts are diverse; the speech acts that most commonly co-occur with fillers include informing and answering. The findings show that fillers are not only used frequently by speakers in spontaneous conversation but are also mostly realised as sounds or other non-lexical forms.
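The frequency and co-occurrence tallies the study reports are straightforward to reproduce once annotations exist. A toy Python sketch follows, assuming each annotated turn is a list of (speech act, text) pairs; SpeechActConc's actual data format is not described in the abstract.

```python
# Toy tally of filler realisations and co-occurring speech acts. The
# (act, text) turn structure is an assumption for illustration only.
from collections import Counter

turns = [
    [("filler", "er"), ("informing", "the shipment left monday")],
    [("filler", "you know"), ("answering", "yes it did")],
    [("filler", "um"), ("informing", "we checked the invoice")],
]

# Frequency of each linguistic realisation of a filler
realisations = Counter(text for turn in turns
                       for act, text in turn if act == "filler")

# Speech acts that co-occur with a filler in the same turn
co_occurrence = Counter()
for turn in turns:
    acts = {act for act, _ in turn}
    if "filler" in acts:
        co_occurrence.update(acts - {"filler"})

print(realisations.most_common())   # [('er', 1), ('you know', 1), ('um', 1)]
print(co_occurrence.most_common())  # [('informing', 2), ('answering', 1)]
```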

Design and Implementation of Korean Text-to-Speech System (다이폰을 이용한 한국어 문자-음성 변환 시스템의 설계 및 구현)

  • 정준구
    • Proceedings of the Acoustical Society of Korea Conference, 1994.06c, pp.91-94, 1994
  • This paper is a study of the design and implementation of a Korean text-to-speech system. The parametric synthesis method is chosen for speech synthesis, and the PARCOR coefficient, obtained from LPC analysis, is used as the acoustic parameter. We use the diphone as the synthesis unit because it preserves the basic naturalness of human speech. The diphone database consists of 1,228 PCM files. The LPC synthesis method has the defect that the clarity of the synthesized speech declines when synthesizing unvoiced sounds. In this paper, we improve the clarity of the synthesized speech by using the residual signal as the excitation signal for unvoiced sounds. In addition, to improve naturalness, we control the prosody of the synthesized speech by controlling its energy and pitch patterns. The synthesis system is implemented on a PC/486 and uses a 70 Hz-4.5 kHz band-pass filter for speech input/output, an amplifier, and a TMS320C30 DSP board.

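A minimal Python sketch of the residual-excitation scheme the abstract describes: voiced frames are driven by a pitch-pulse train and unvoiced frames by the stored LPC residual, both through the same all-pole LPC filter. Frame length, gain handling, and all names here are illustrative assumptions.

```python
# Sketch of one frame of residual-excited LPC synthesis. Parameter names
# and the 160-sample frame are assumptions; 'residual' is assumed to hold
# at least n samples of the stored analysis residual.
import numpy as np
from scipy.signal import lfilter

def synthesize_frame(lpc_coeffs, gain, voiced, pitch_period, residual, n=160):
    if voiced:
        excitation = np.zeros(n)
        excitation[::pitch_period] = 1.0  # impulse train at the pitch period
    else:
        excitation = residual[:n]         # stored residual restores clarity
    # All-pole synthesis filter 1/A(z), with A(z) = 1 - sum_k a_k z^-k
    a = np.concatenate(([1.0], -np.asarray(lpc_coeffs)))
    return gain * lfilter([1.0], a, excitation)
```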

Comparison of Sound Pressure Level and Speech Intelligibility of Emergency Broadcasting System at T-junction Corridor Space (T자형 복도 공간의 비상 방송용 확성기 배치별 음압 레벨과 음성 명료도 비교)

  • Jeong, Jeong-Ho; Lee, Sung-Chan
    • Fire Science and Engineering, v.33 no.1, pp.105-112, 2019
  • In this study, an architectural acoustics simulation was conducted to examine how clearly and uniformly emergency broadcasting sound is transmitted in a T-junction corridor space. The sound absorption performance of the corridor and the location and spacing of the emergency broadcasting loudspeakers were varied, and the resulting distributions of sound pressure level and of the sound transmission indices (STI, RASTI) were compared. The simulation showed that, for clear voice transmission, the emergency broadcasting loudspeakers should be installed approximately 10 m from the center of the T-junction corridor connection. Narrowing the 25 m installation interval required by the NFSC also shows that a clearer and sufficiently loud emergency broadcast sound can be delivered evenly.
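A toy Python check of the loudspeaker-spacing effect using only the free-field point-source relation SPL(r) = Lw - 20 log10(r) - 11 dB; the paper's simulation additionally models room absorption, reflections, and STI/RASTI, and the source power level Lw below is an assumption.

```python
# Toy direct-field SPL coverage along a corridor for a row of loudspeakers.
# The 25 m and 10 m spacings mirror the abstract; Lw = 95 dB is assumed.
import numpy as np

def spl_along_corridor(speaker_positions_m, listener_x_m, lw_db=95.0,
                       ear_offset_m=1.5):
    """Energy-sum the direct field of all loudspeakers at each position."""
    spl_total = np.zeros_like(listener_x_m)
    for x_s in speaker_positions_m:
        r = np.hypot(listener_x_m - x_s, ear_offset_m)  # distance to speaker
        spl = lw_db - 20.0 * np.log10(r) - 11.0         # free-field SPL
        spl_total += 10.0 ** (spl / 10.0)               # add as energy
    return 10.0 * np.log10(spl_total)

x = np.linspace(0.0, 50.0, 501)
wide = spl_along_corridor(np.arange(0.0, 50.1, 25.0), x)    # 25 m interval
narrow = spl_along_corridor(np.arange(0.0, 50.1, 10.0), x)  # narrowed
print(f"25 m: min {wide.min():.1f} dB, spread {wide.max() - wide.min():.1f} dB")
print(f"10 m: min {narrow.min():.1f} dB, spread {narrow.max() - narrow.min():.1f} dB")
```

Even this crude model shows the narrower spacing raising the minimum level and flattening the spread, which is the direction of the paper's finding; the published result rests on the full room-acoustic simulation.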

An acoustic and perceptual investigation of the vowel length contrast in Korean

  • Lee, Goun; Shin, Dong-Jin
    • Phonetics and Speech Sciences, v.8 no.1, pp.37-44, 2016
  • The goal of the current study is to investigate how a sound change is reflected in production and in perception, and what effect lexical frequency has on the loss of a sound contrast. Specifically, the study examined whether vowel length contrasts are retained in Korean speakers' productions and whether Korean listeners can distinguish vowel length minimal pairs in perception. Two production experiments and two perception experiments investigated this. For the production tests, twelve Korean native speakers in their 20s and 40s completed a read-aloud task and a map task. Regardless of age group, all speakers produced vowel length contrasts with a small but significant difference in the read-aloud test. Interestingly, the difference between long and short vowels disappeared in the map task, indicating that speech mode affects the production of vowel length contrasts. For the perception tests, thirty-three Korean listeners completed a discrimination test and a forced-choice identification test. The results showed that Korean listeners retain a perceptual sensitivity that distinguishes the lexical meanings of vowel length minimal pairs. We also found that identification accuracy was affected by word frequency, with higher accuracy for high- and mid-frequency words than for low-frequency words. Taken together, the current study demonstrates that speech mode (read-aloud vs. spontaneous) affects the production of a sound undergoing language change, and that word frequency affects the sound change in speech perception.