• Title/Summary/Keyword: Listener


A Study on the Phonetic Parameters Used on the Voice Imitation (모방의 대상이 되는 음성적 특성에 관한 연구)

  • Park Jihye;Shin Jiyoung;Kang Sunmee
    • Proceedings of the KSPS conference / 2003.05a / pp.187-190 / 2003
  • This paper investigates the phonetic parameters used in voice imitation. First, fundamental frequency is imitated effectively. Distinctive prosodic patterns are used repeatedly. Speaking rate is exploited mainly when the target speaker has an unusual speaking rate. Formant frequencies are also imitated in various ways. In sum, voice imitation relies on the distinctive characteristics that listeners perceive.


New Speech Enhancement Method using Psychoacoustic Criteria (심리 음향 기준을 이용한 새로운 음질 개선 방법)

  • 김대경;박장식;손경식
    • Journal of Korea Multimedia Society / v.4 no.1 / pp.56-66 / 2001
  • A spectral subtraction algorithm using a criterion based on human perception has recently been developed. Speech processed with Virag's algorithm sounds more pleasant to a human listener than speech obtained by classical methods. However, Virag's algorithm requires a robust voice activity detector (VAD). In the extended spectral subtraction (ESS) algorithm, which works without a VAD, the residual noise becomes more noticeable as the SNR decreases. In this paper we propose a new speech enhancement method that combines a Wiener filter with spectral subtraction based on the noise masking characteristics of the human auditory system. No VAD is needed because the Wiener filter allows the noise estimate to be updated successively, even during speech activity. Adjusting the subtraction parameter according to the masking threshold makes the residual noise inaudible. The proposed method is compared with conventional spectral subtraction algorithms. The system is evaluated objectively and subjectively with several noise types having different time-frequency distributions. Objective measures, speech spectrograms, and subjective listening tests confirm that speech enhanced with the proposed algorithm is more pleasant to a human listener.

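The two building blocks named in the abstract above, spectral subtraction and the Wiener filter, can each be expressed as a per-bin gain applied to the noisy spectrum. The following is a minimal sketch under stated assumptions, not the authors' method: the over-subtraction factor `alpha`, the spectral floor `beta`, and both function names are hypothetical, and the masking-threshold adaptation described in the abstract is not modelled.

```python
from typing import List


def spectral_subtraction_gain(noisy_power: List[float], noise_power: List[float],
                              alpha: float = 2.0, beta: float = 0.01) -> List[float]:
    """Per-bin amplitude gain for power spectral subtraction.

    alpha (over-subtraction) and beta (spectral floor) are illustrative values.
    """
    gains = []
    for y, n in zip(noisy_power, noise_power):
        if y <= 0.0:
            gains.append(0.0)  # an empty bin stays empty
            continue
        # Subtract the scaled noise estimate, but never go below the floor.
        clean = max(y - alpha * n, beta * y)
        gains.append((clean / y) ** 0.5)
    return gains


def wiener_gain(noisy_power: List[float], noise_power: List[float]) -> List[float]:
    """Per-bin Wiener gain S / (S + N), with S estimated as max(Y - N, 0)."""
    gains = []
    for y, n in zip(noisy_power, noise_power):
        speech = max(y - n, 0.0)
        denom = speech + n
        gains.append(speech / denom if denom > 0.0 else 0.0)
    return gains


# Toy frame: three frequency bins with a flat noise estimate.
noisy = [4.0, 1.0, 0.0]
noise = [1.0, 1.0, 1.0]
ss = spectral_subtraction_gain(noisy, noise)
wf = wiener_gain(noisy, noise)
```

In this toy frame the first bin is dominated by speech and keeps most of its amplitude under both gains, while the second bin, where the noisy power equals the noise estimate, is pushed down to the floor by spectral subtraction and to zero by the Wiener gain.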

A Tracking of Head Movement for Stereophonic 3-D Sound (스테레오 입체음향을 위한 머리 움직임 추정)

  • Kim Hyun-Tae;Lee Kwang-Eui;Park Jang-Sik
    • Journal of Korea Multimedia Society / v.8 no.11 / pp.1421-1431 / 2005
  • There are two approaches to 3-D sound reproduction: a surround system, such as the 3.1-channel method, and a binaural system using two channels. The binaural system exploits the human ability to localize sound with two ears. In general, crosstalk between the channels of a two-channel loudspeaker system must be cancelled to produce natural 3-D sound, and this requires tracking the listener's head movement. In this paper, we propose a new algorithm that accurately tracks the head movement of a listener. The proposed algorithm is based on face and eye detection. Face detection uses image intensity, and the positions of the eyes are detected by mathematical morphology. When the listener's head moves, the length of the borderline between the face area and the eyes may change, and we use this information to track head movement. Computer simulation results show that head movement is effectively estimated within a +10 margin of error using the proposed algorithm.


Speaker age estimation and acoustic characteristics: According to pitch and speech rate (화자 연령 지각과 음성적 특성: 음높이와 발화 속도를 중심으로)

  • Seo, YoonJeong;Shin, Jiyoung
    • Phonetics and Speech Sciences / v.11 no.4 / pp.9-18 / 2019
  • This study investigated the correlation between a speaker's chronological age (CA) and perceived age (PA), and examined the effect of pitch and speech rate as acoustic cues in judging age, using perceptual testing and acoustic analysis. Three perception tasks were conducted to measure how accurately 80 Korean listeners estimated age when presented with different types of speech. In all tasks, participants listened to speech samples and gave a numerical estimate of the speaker's age. Korean listeners were able to gauge a speaker's age fairly precisely: CA and mean PA were positively correlated in all three tasks. The amount and type of information in the voice samples affected the accuracy of listeners' judgements. Moreover, the results revealed that listeners use acoustic information such as pitch and speech rate to estimate a speaker's age.

Improvement of 3D Sound Using Psychoacoustic Characteristics (인간의 청각 특성을 이용한 입체음향의 방향감 개선)

  • Koo, Kyo-Sik;Cha, Hyung-Tai
    • The Journal of the Acoustical Society of Korea / v.30 no.5 / pp.255-264 / 2011
  • The head-related transfer function (HRTF) describes acoustic transmission from a point in 3D space to the listener's ear. In other words, it contains the information that lets a human perceive the location of a sound source. An HRTF can therefore be used to create a virtual 3D sound even though no source actually exists at that location. However, because a non-individualized HRTF does not match each listener, front-back confusion can degrade the three-dimensional effect. In this paper, we propose a new algorithm that reduces confusion in sound image localization using the characteristics of human hearing. The frequency spectrum and global masking threshold of 3D sounds generated with the HRTF are used to calculate the psychoacoustic differences between directions, and the perceptible cues in each critical band are boosted to create effective 3D sound. As a result, the improved 3D sound performs much better than conventional methods.

Politeness Strategy in German Communication: Focusing on Politeness according to Familiarity (독일어 커뮤니케이션에서의 공손 전략: 친근감 여부에 따른 공손을 중심으로)

  • Moon, Yoon-Deok
    • The Journal of the Korea Contents Association / v.20 no.3 / pp.635-644 / 2020
  • This paper examines the types and functions of politeness in German communication and how politeness strategies can be realized. 'Politeness' is not a grammatical term in German, but it appears in many places in the grammar. The criteria for politeness are not organized solely by the rules of the language system, and their boundaries are ambiguous because non-linguistic factors affect communication. Politeness is an important strategic element as well as a social value. Polite expression first appears at the grammatical level of the invariant, in the form of address chosen according to the familiarity between the conversational parties, in verb mood, and in modal particles. Modal particles that signal familiarity are considered a positive politeness strategy that limits the listener's speech by weakening or avoiding face-threatening acts. Modal verbs are classified as polite expressions that do not impose a psychological burden, because they do not force a direct request on the listener. The results of this study are therefore expected to provide a rationale for empirical research on politeness in German communication.

Analytical Approach of Fast Inter-Domain Handover Scheme in Proxy Mobile IPv6 Networks with Multicasting Support (멀티캐스팅을 지원하는 프록시 모바일IPv6 네트워크에서 빠른 도메인간 핸드오버 기법의 분석적 접근법)

  • Yoo, Se-Won;Jeong, Jong-Pil
    • The KIPS Transactions:PartC / v.19C no.2 / pp.153-166 / 2012
  • As the number of mobile nodes (MNs) grows, multicast will become an important form of communication in mobile networks, and multicast service must be delivered without interruption. In this paper, we review the current status of PMIPv6 (Proxy Mobile IPv6) multicast listener support being standardized in the IETF and point out the limitations of the current approach. We then propose a fast multicast handover procedure for inter-domain PMIPv6 networks with network-based mobility management. The proposed procedure optimizes multicast management by using the context of the MNs. We evaluate it against the base procedure using analytical models and confirm that it reduces service interruption time and total network overhead during handovers.

A Preliminary Report on Perceptual Resolutions of Korean Consonant Cluster Simplification and Their Possible Change over Time

  • Cho, Tae-Hong
    • Phonetics and Speech Sciences / v.2 no.4 / pp.83-92 / 2010
  • The present study examined how listeners of Seoul Korean would recover deleted phonemes in consonant cluster simplification. In a phoneme monitoring experiment, listeners had to monitor for C2 (/k/ or /p/) in C1C2C3 when C2 was deleted (C1 was preserved) or preserved (C1 was deleted). The target consonant (C2) was either /k/ or /p/ (e.g., ilk-təlato vs. palp-təlato), and there were two listener groups, one tested in 2002 and the other in 2009. Several points emerged from the results. First, listeners were able to detect deleted phonemes as accurately and rapidly as preserved phonemes, showing that the physical presence of the acoustic information did not improve the listeners' performance. This suggests that listeners must have relied on language-specific phonological knowledge about consonant cluster simplification rather than on low-level acoustic-phonetic information. Second, the listener groups (participants in 2002 vs. 2009) differed in processing /p/ versus /k/: listeners in 2009 failed to detect /p/ more frequently than those in 2002, suggesting that the way the consonant cluster sequence is produced and perceived has changed over time. This result was interpreted as coming from statistical patterns of speech production in contemporary Seoul Korean as reported in a recent study by Cho & Kim (2009): /p/ is deleted far more often than /p/ is preserved, which is likely reflected in the way listeners process simplified variants. Finally, listeners processed /k/ more efficiently than /p/, especially when the target was physically present (in C-preserved condition), indicating that listeners benefited more from the presence of /k/ than of /p/. This was interpreted as supporting the view that velars are perceptually more robust than labials, which constrains the shaping of phonological patterns in the language. These results were then discussed in terms of their implications for theories of spoken word recognition.


An Indoor Positioning Algorithm Based on 3 Points Near Field Angle-of-Arrival Estimation without Side Information (청취자 거리정보가 필요 없는 도달각 기반 실내 위치 추정기법)

  • Kim, Yeong-Moon;Yoo, Seung-Soo;Kim, Sun-Yong
    • The Journal of Korean Institute of Communications and Information Sciences / v.35 no.11C / pp.957-964 / 2010
  • In this paper, we propose an indoor positioning algorithm based on 3-point near-field angle-of-arrival estimation without side information. The conventional angle-of-arrival based positioning scheme requires the distance between the listener and the center of two points, which is obtained by range estimation based on received signal strength. However, received signal strength is affected by the structure of the room, the placement of furniture, and the characteristics of the signal, and these effects cause large errors in the angle estimate. The proposed scheme, based on near-field angle-of-arrival estimation, can estimate the position of the listener without prior distance information, using only time-difference-of-arrival information from microphones at 3 points. The performance of the proposed scheme is shown by the cumulative distribution function of the root mean squared error.
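
To illustrate why time-difference-of-arrival (TDOA) information from three microphones can fix a near-field position without a separate range estimate, the sketch below locates a source by brute-force grid search. It is not the algorithm of the paper above, which derives a near-field angle-of-arrival estimate. The microphone layout, the speed of sound, the search area, and the function names are all assumptions made for illustration.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s at room temperature (assumed)


def tdoas(source, mics, c=SPEED_OF_SOUND):
    """Arrival-time differences (seconds) of mics[1:] relative to mics[0]."""
    d0 = math.dist(source, mics[0])
    return [(math.dist(source, m) - d0) / c for m in mics[1:]]


def locate(measured, mics, x_range=(-3.0, 3.0), y_range=(0.1, 3.0), step=0.05):
    """Return the grid point whose predicted TDOAs best match the measured ones.

    The search is limited to y > 0 because a collinear array cannot tell
    a source in front of it from its mirror image behind it.
    """
    nx = int(round((x_range[1] - x_range[0]) / step)) + 1
    ny = int(round((y_range[1] - y_range[0]) / step)) + 1
    best, best_err = None, math.inf
    for i in range(nx):
        x = x_range[0] + i * step
        for j in range(ny):
            y = y_range[0] + j * step
            predicted = tdoas((x, y), mics)
            err = sum((p - m) ** 2 for p, m in zip(predicted, measured))
            if err < best_err:
                best, best_err = (x, y), err
    return best


# Hypothetical setup: three collinear microphones 20 cm apart, noise-free TDOAs.
MICS = [(-0.2, 0.0), (0.0, 0.0), (0.2, 0.0)]
estimate = locate(tdoas((1.0, 1.5), MICS), MICS)
```

With noise-free TDOAs the search recovers the grid point at the true source position. With measurement noise, the range component of the estimate degrades much faster than the angle component, because the array aperture is small relative to the source distance.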

Improvement of Head Related Transfer Function to Create Realistic 3D Sound (현실감있는 입체음향 생성을 위한 머리전달함수의 개선)

  • Koo, Kyo-Sik;Cha, Hyung-Tai
    • Journal of the Korean Institute of Intelligent Systems / v.18 no.3 / pp.381-386 / 2008
  • Virtual 3D audio methods that create 3D sound effects are widely researched for multimedia devices using two speakers or headphones. The most typical method of creating 3D effects uses the head-related transfer function (HRTF), which contains information about how sound travels from a source to the ears of the listener. However, because a non-individualized HRTF does not match each listener, the cone of confusion between front and back directions can weaken the 3D effect. In this paper, we propose a new method that uses psychoacoustic theory to create realistic 3D audio. To improve the 3D sound, we calculate the excitation energy of each symmetric HRTF and extract the energy ratio in each Bark band. Informal listening tests show that the proposed method improves front-back sound localization considerably compared with conventional methods.
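
The per-band energy ratio mentioned in the abstract above can be sketched as follows. This is only a rough illustration under stated assumptions: plain squared magnitude stands in for the excitation energy, bins are grouped by the integer part of Traunmüller's Bark approximation, and the function names and toy spectra are hypothetical rather than taken from the paper.

```python
import math
from typing import Dict, List


def hz_to_bark(freq_hz: float) -> float:
    """Traunmüller's approximation of the Bark scale."""
    return 26.81 * freq_hz / (1960.0 + freq_hz) - 0.53


def bark_band_energy(freqs: List[float], magnitudes: List[float]) -> Dict[int, float]:
    """Sum squared magnitudes of the bins that fall into each Bark band."""
    energy: Dict[int, float] = {}
    for f, m in zip(freqs, magnitudes):
        band = max(0, math.floor(hz_to_bark(f)))
        energy[band] = energy.get(band, 0.0) + m * m
    return energy


def front_back_ratio(freqs: List[float], front: List[float],
                     back: List[float]) -> Dict[int, float]:
    """Energy ratio front/back per Bark band, skipping bands with no back energy."""
    e_front = bark_band_energy(freqs, front)
    e_back = bark_band_energy(freqs, back)
    return {b: e_front[b] / e_back[b] for b in e_front if e_back.get(b, 0.0) > 0.0}


# Toy magnitude responses for a front direction and its mirrored back direction.
freqs = [100.0, 110.0, 1000.0, 1050.0]
ratios = front_back_ratio(freqs, [2.0, 2.0, 1.0, 1.0], [1.0, 1.0, 1.0, 1.0])
```

Bands whose ratio differs most from 1 are the ones in which the front and back responses are most distinct, and they would be the natural candidates for emphasis when trying to reduce front-back confusion.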