• Title/Abstract/Keyword: Speech sound

Search results: 628

Production of English final stops by Korean speakers

  • Kim, Jungyeon
    • Phonetics and Speech Sciences / v.10 no.4 / pp.11-17 / 2018
  • This study reports on a production experiment designed to investigate how Korean-speaking learners of English produce English forms ending in stops. In a repetition experiment, Korean participants listened to English nonce words ending in a stop and repeated what they heard; English speakers were recruited for the same task as a control group. Transcriptions of the Korean productions by native English speakers showed vowel insertion in only 3% of productions, even though the noise intervals after the closure of final stops were significantly longer for Korean speakers than for English speakers. This finding is inconsistent with loanword data, in which 49% of words show vowel insertion. It is also incompatible with the perceptual similarity approach, which predicts that because Korean speakers accurately perceive an English final stop as a final consonant, they will insert a vowel to make the English sound more similar to the Korean sound.

Interior surface treatment guidelines for classrooms according to the acoustical performance criteria (학교 교실의 음환경 기준에 따른 실내마감 방안)

  • Ryu, Da-Jung;Park, Chan-Jae;Haan, Chan-Hoon
    • The Journal of the Acoustical Society of Korea / v.35 no.2 / pp.92-101 / 2016
  • Many studies have shown that the acoustical conditions of a classroom play an important role in students' learning and academic achievement. However, there are very few guidelines or design proposals for creating an appropriate acoustic environment when classrooms are built or renovated. The present study suggests various design proposals that satisfy classroom acoustic standards, based on theoretical calculation and acoustic field experiments. First, the minimum area of sound absorption required to satisfy the acoustic standard for domestic middle and high schools was calculated. Room acoustic measurements were also carried out to investigate the acoustic performance of an existing classroom while changing the interior finishing materials on the ceiling and rear walls. The results revealed that the reverberation-time standard of 0.8 s or below can be met even without the ceiling sound absorption that is common practice in Korea. In particular, it was found that if part of the ceiling is treated as reflective, with a 2:1 ratio of sound absorption to reflection, nearly the same acoustic parameters ($C_{50}$, $D_{50}$, RASTI (Rapid Speech Transmission Index)) and higher sound levels can be obtained compared with full sound absorption on the ceiling.
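The minimum-absorption calculation described above can be illustrated with Sabine's reverberation formula, $T_{60} = 0.161V/A$. The sketch below is a rough illustration under that assumption; the classroom dimensions are hypothetical placeholders, not values from the paper.

```python
# Minimal sketch: equivalent absorption area required to meet a
# reverberation-time target, via Sabine's formula T60 = 0.161 * V / A.
# The classroom dimensions below are hypothetical, not taken from the paper.

def required_absorption_area(volume_m3: float, rt_target_s: float) -> float:
    """Equivalent absorption area A (m^2) needed to reach the target T60."""
    return 0.161 * volume_m3 / rt_target_s

volume = 9.0 * 7.5 * 2.7  # hypothetical 9 m x 7.5 m x 2.7 m classroom (~182 m^3)
print(f"{required_absorption_area(volume, rt_target_s=0.8):.1f} m^2 of absorption needed")
```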

Spatial Speaker Localization for a Humanoid Robot Using TDOA-based Feature Matrix (도착시간지연 특성행렬을 이용한 휴머노이드 로봇의 공간 화자 위치측정)

  • Kim, Jin-Sung;Kim, Ui-Hyun;Kim, Do-Ik;You, Bum-Jae
    • The Journal of Korea Robotics Society / v.3 no.3 / pp.237-244 / 2008
  • Research on human-robot interaction has been attracting increasing attention, and within this field speech signal processing is of particular interest. In this paper, we report a speaker localization system with six microphones for MAHRU, a humanoid robot from KIST, and propose a time delay of arrival (TDOA)-based feature matrix together with a localization algorithm based on the minimum sum of absolute errors (MSAE). The TDOA-based feature matrix is a simple database matrix computed from pairs of microphones installed on the robot. Using this matrix and the MSAE-based algorithm, the proposed method localizes a sound source without having to solve approximate nonlinear equations. To verify the performance of our speaker localization system, we present experimental results for speech sources in all directions within a 5 m distance, with height divided into three levels.
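As a rough sketch of the feature-matrix idea, assuming a precomputed table of expected TDOAs for a set of candidate directions (all numbers below are hypothetical, not the robot's actual microphone geometry), the observed TDOA vector is matched against each row and the row with the minimum sum of absolute errors wins:

```python
import numpy as np

# Minimal sketch of TDOA-based localization via minimum sum of absolute
# errors (MSAE). Rows of the feature matrix hold the expected TDOAs (in
# samples) for each candidate direction; columns are microphone pairs.
# All values are hypothetical placeholders.

candidate_dirs = [0, 60, 120, 180, 240, 300]   # azimuth in degrees
feature_matrix = np.array([
    [ 0,  8,  8,  0, -8, -8],
    [-8,  0,  8,  8,  0, -8],
    [-8, -8,  0,  8,  8,  0],
    [ 0, -8, -8,  0,  8,  8],
    [ 8,  0, -8, -8,  0,  8],
    [ 8,  8,  0, -8, -8,  0],
])

def localize(observed_tdoa: np.ndarray) -> int:
    """Return the candidate azimuth whose expected-TDOA row has the
    minimum sum of absolute errors against the observed TDOA vector."""
    errors = np.abs(feature_matrix - observed_tdoa).sum(axis=1)
    return candidate_dirs[int(np.argmin(errors))]

observed = np.array([-7, -9, 1, 8, 7, 0])  # noisy observation near the 120-degree row
print(localize(observed))                  # -> 120
```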


The Electropalatographic Evidence of the Korean Flap: An Intervocalic Korean Liquid Sound

  • Ahn, Soo-Woong
    • Speech Sciences / v.9 no.3 / pp.155-168 / 2002
  • The intervocalic Korean liquid has been described as a flap in studies of the Korean language, but there has been very little experimental data corroborating this. An electropalatographic (EPG) experiment was conducted to test it. The subjects were one Korean speaker and one native English speaker, each fitted with a pseudopalate, and the EPG experiment was carried out at the UCLA phonetics laboratory. Spectrographic evidence for the flaps, both the English t-flap and the Korean liquid flap, was also sought. The English and Korean flaps were placed between mid/low back vowels so that the vowels themselves would not affect palatal contact of the tongue. The results confirmed that the Korean liquid is realized as a flap in intervocalic position, with many properties similar to the English flap in both the EPG and spectrographic data. The Korean initial liquid in borrowed words such as 'rotary' and 'radio' was also a flap, but the Korean liquid in word-final and geminate positions was a lateral, as in the words 'dol' (stone), 'dollo' (with stone), 'nal' (day), and 'nallara' (carry). The intuitive account of the Korean liquid as a flap was thus supported by the EPG and spectrographic data.


A study on application of the statistic model about an utterance of the speaker (화자의 발음에 대한 통계적 모델의 적용에 관한 연구)

  • Kim, Dae-Sik;Bae, Myong-Jin;Yoon, Jae-Gang
    • Proceedings of the KIEE Conference / 1988.07a / pp.25-28 / 1988
  • Speech, which plays an important mediating role in human conversation, expresses a speaker's emotions and thoughts, so a speaker can be verified and identified through the individual properties of his or her voice. This study examines the pitch distribution obtained by visually counting the number of samples in each pitch period of the speaker's speech waveform. We propose an algorithm that judges the speaker's emotional state, personality, regional origin, age, sex, and so on, according to the degree of deviation.
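The deviation-based idea can be sketched roughly as follows; the pitch-period values and the decision threshold are synthetic placeholders, not the paper's data or criteria.

```python
import statistics

# Minimal sketch: summarize a speaker's pitch-period distribution and use
# its degree of deviation as a crude feature. The pitch periods (in samples)
# are synthetic placeholders, not data from the paper.

pitch_periods = [80, 82, 79, 85, 81, 90, 78, 83, 84, 80]

mean_period = statistics.mean(pitch_periods)
deviation = statistics.stdev(pitch_periods)
print(f"mean pitch period: {mean_period:.1f} samples, deviation: {deviation:.2f}")

# Hypothetical decision rule on the deviation degree (threshold is a placeholder)
label = "expressive/variable speaker" if deviation > 3.0 else "monotone speaker"
print(label)
```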


A Study on Speech Support for the Blind (시각 장애자를 위한 음성 지원에 관한 연구)

  • Jang, S.H.;Ham, K.K.;Choi, S.H.;Min, H.K.;Huh, W.
    • Proceedings of the KOSOMBE Conference / v.1993 no.05 / pp.113-115 / 1993
  • In this paper, we propose a speech support system for the blind based on a personal computer. The system consists of a hardware part and a software part: the hardware part comprises a personal computer and a sound card, while the software part comprises a sound driver, a character table, and a sound output algorithm. The system can recognize characters input from the keyboard as well as character strings produced by programs.
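A minimal sketch of the character-table idea described above (the table entries and the playback stand-in are hypothetical, not the system's actual design):

```python
# Minimal sketch of the character-to-sound pipeline: characters typed at the
# keyboard are looked up in a character table and mapped to sound output.
# The table entries and the playback function are hypothetical placeholders.

character_table = {"a": "a.wav", "b": "b.wav", "1": "one.wav"}

def speak(text: str) -> None:
    for ch in text.lower():
        clip = character_table.get(ch)
        if clip is not None:
            print(f"play {clip}")  # stand-in for actual sound-card playback

speak("ab1")
```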


Implementation of Sound Source Localization Based on Audio-visual Information for Humanoid Robots (휴모노이드 로봇을 위한 시청각 정보 기반 음원 정위 시스템 구현)

  • Park, Jeong-Ok;Na, Seung-You;Kim, Jin-Young
    • Speech Sciences / v.11 no.4 / pp.29-42 / 2004
  • This paper presents an implementation of real-time speaker localization using audio-visual information. Four channels of microphone signals are processed to detect vertical as well as horizontal speaker positions. First, short-time average magnitude difference function (AMDF) signals are used to determine whether the microphone signals contain a human voice. The orientation and distance of the sound source are then obtained from the interaural time differences. Finally, visual information from a camera provides finer tuning of the angle to the speaker. Experimental results show that the real-time localization system reaches 99.6% accuracy, compared with 88.8% when only audio information is used.
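The AMDF-based voicing check can be sketched as follows; the frame length, pitch-lag range, and valley-depth threshold are illustrative assumptions, not the system's actual parameters. Voiced speech produces a deep valley in the AMDF at the pitch lag, which separates voice from non-voice frames.

```python
import numpy as np

# Minimal sketch of a short-time AMDF voicing check. Frame size, lag range,
# and the valley-depth threshold are illustrative assumptions.

def amdf(frame: np.ndarray, lags: range) -> np.ndarray:
    """AMDF(k) = mean |x[n] - x[n+k]| over the frame, for each lag k."""
    n = len(frame)
    return np.array([np.abs(frame[: n - k] - frame[k:]).mean() for k in lags])

def looks_voiced(frame: np.ndarray, fs: int = 16000) -> bool:
    """Voiced frames show a deep AMDF valley at the pitch lag (~50-400 Hz)."""
    lags = range(fs // 400, fs // 50)   # candidate pitch periods in samples
    d = amdf(frame, lags)
    return d.min() < 0.3 * d.mean()     # deep valley => likely voiced

fs = 16000
t = np.arange(int(0.032 * fs)) / fs                   # one 32 ms frame
voiced = np.sin(2 * np.pi * 150 * t)                  # synthetic 150 Hz "voice"
noise = np.random.default_rng(0).normal(size=t.size)  # non-voice frame
print(looks_voiced(voiced), looks_voiced(noise))      # -> True False
```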


Inference Ability Based Emotion Recognition From Speech (추론 능력에 기반한 음성으로부터의 감성 인식)

  • Park, Chang-Hyun;Sim, Kwee-Bo
    • Proceedings of the KIEE Conference / 2004.05a / pp.123-125 / 2004
  • Recently, interest in user-friendly machines has been growing, and emotion is one of the most important factors in making a machine feel familiar to people. A machine can use sound or images to express or recognize emotion; this paper deals with recognizing emotion from sound. The most important emotional component of sound is tone, and the inferential ability of the brain also takes part in emotion recognition. This paper empirically identifies the emotional components of speech and reports experiments on emotion recognition. It also proposes a recognition method that uses these emotional components together with transition probabilities.
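A speculative sketch of combining frame-level emotional components with transition probabilities is given below, rendered here as Viterbi decoding over a transition matrix; the emotions, scores, and probabilities are all invented for illustration and are not the paper's model.

```python
import numpy as np

# Speculative sketch: frame-level emotion scores smoothed by a Markov
# transition matrix (Viterbi decoding). All values are hypothetical.

emotions = ["neutral", "happy", "angry"]
trans = np.array([[0.8, 0.1, 0.1],          # rows: from-state, cols: to-state
                  [0.1, 0.8, 0.1],
                  [0.1, 0.1, 0.8]])
frame_scores = np.array([[0.6, 0.3, 0.1],   # per-frame emotion likelihoods
                         [0.4, 0.5, 0.1],
                         [0.2, 0.7, 0.1]])

def viterbi(scores: np.ndarray, trans: np.ndarray) -> list:
    n_frames, n_states = scores.shape
    best = np.log(scores[0])
    back = np.zeros((n_frames, n_states), dtype=int)
    for t in range(1, n_frames):
        cand = best[:, None] + np.log(trans)      # prev-state x next-state
        back[t] = np.argmax(cand, axis=0)
        best = cand[back[t], range(n_states)] + np.log(scores[t])
    path = [int(np.argmax(best))]
    for t in range(n_frames - 1, 0, -1):          # trace back the best path
        path.append(int(back[t, path[-1]]))
    return [emotions[i] for i in reversed(path)]

print(viterbi(frame_scores, trans))
```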


The Extraction of Nasal Sound by Using G-peak in Continued Speech (연속음 분류인식에서 G-peak를 이용한 비음의 분류)

  • Bae, Myung Jin;Chung, Ik Joo;ANN, Souguil
    • Journal of the Korean Institute of Telematics and Electronics / v.24 no.2 / pp.274-279 / 1987
  • In this paper, we describe a new algorithm for extracting nasal sounds in continuous speech. We obtain pitch using the Area Comparison Method and extract nasal sounds by comparing the area of the G-peak with the area of the side peak within one pitch interval. This method speeds up the process, so real-time processing is possible on a general-purpose microprocessor.
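A rough sketch of the area-comparison idea, under the assumption that the G-peak is the dominant peak within a pitch interval and the side peak is the largest remaining one (this reading, the window width, and the decision ratio are illustrative assumptions, not the paper's definitions):

```python
import numpy as np

# Rough sketch: within one pitch interval, compare the area under the
# dominant peak (taken here as the "G-peak") with the area under the
# largest remaining side peak. Window width and ratio are assumptions.

def classify_period(period: np.ndarray, half_width: int = 4,
                    ratio: float = 0.5) -> str:
    mag = np.abs(period)
    g = int(np.argmax(mag))                                 # G-peak position
    g_area = mag[max(0, g - half_width): g + half_width + 1].sum()
    rest = mag.copy()
    rest[max(0, g - half_width): g + half_width + 1] = 0.0  # mask the G-peak
    s = int(np.argmax(rest))                                # side-peak position
    s_area = mag[max(0, s - half_width): s + half_width + 1].sum()
    # Hypothetical rule: a relatively large side peak suggests a nasal
    return "nasal-like" if s_area / g_area > ratio else "non-nasal"

# One synthetic pitch period: a sharp main pulse plus a weaker secondary peak
n = np.arange(100)
period = np.exp(-n / 3.0) + 0.4 * np.exp(-((n - 50) ** 2) / 20.0)
print(classify_period(period))
```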


Implementation of a Single-chip Speech Recognizer Using the TMS320C2000 DSPs (TMS320C2000계열 DSP를 이용한 단일칩 음성인식기 구현)

  • Chung, Ik-Joo
    • Speech Sciences / v.14 no.4 / pp.157-167 / 2007
  • In this paper, we implemented a single-chip speech recognizer using the TMS320C2000 DSPs. For this implementation, we developed a very small speaker-dependent recognition engine based on dynamic time warping (DTW), which is especially suited to embedded systems where resources are severely limited. We carried out several optimizations, including speed optimization by programming time-critical functions in assembly language, code-size optimization, and effective memory allocation. On the TMS320F2801 DSP, which has 12 KB of SRAM and 32 KB of flash ROM, the recognizer can recognize 10 commands. On the TMS320F2808 DSP, which has 36 KB of SRAM and 128 KB of flash ROM, it can additionally output the speech sound corresponding to the recognition result. The response sounds, captured while the user trains the commands, are encoded with ADPCM and stored in flash ROM. The single-chip recognizer needs few parts other than the DSP itself and an op amp for amplifying the microphone output and anti-aliasing, so it can play a role similar to that of dedicated speech recognition chips.
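The DTW core of such a recognizer can be sketched as follows; this is a generic textbook DTW over feature sequences, not the authors' memory-optimized fixed-point DSP implementation, and the templates are placeholders.

```python
import numpy as np

# Minimal sketch of dynamic time warping (DTW) between two feature
# sequences, the matching core of a small speaker-dependent recognizer.

def dtw_distance(a: np.ndarray, b: np.ndarray) -> float:
    """DTW distance between sequences a (n x d) and b (m x d)."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])   # local frame distance
            cost[i, j] = d + min(cost[i - 1, j],      # insertion
                                 cost[i, j - 1],      # deletion
                                 cost[i - 1, j - 1])  # match
    return float(cost[n, m])

# Recognition = nearest template by DTW distance (templates are placeholders)
templates = {"on": np.sin(np.linspace(0, 3, 40))[:, None],
             "off": np.cos(np.linspace(0, 3, 50))[:, None]}
query = np.sin(np.linspace(0, 3, 45))[:, None]
print(min(templates, key=lambda w: dtw_distance(templates[w], query)))  # -> on
```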
