• Title/Summary/Keyword: Perceptual Weighting

Search Result 26, Processing Time 0.025 seconds

Executive function and Korean children's stop production

  • Eun Jong Kong;Hyunjung Lee;Jeffrey J. Holliday
    • Phonetics and Speech Sciences
    • /
    • v.15 no.3
    • /
    • pp.45-52
    • /
    • 2023
  • Previous studies have established a role for cognitive differences in explaining variability in speech processing across individuals. In the case of perceptual cue weighting in the context of a sound change, studies have produced conflicting results regarding the relationship between executive function and the use of redundant cues. The current study aimed to explore this relationship in acoustic cue weighting during speech production. Forty-one Korean-speaking children read a list of stop-initial words and completed two tests that assess executive function, i.e., Dimensional Change Card Sorting (DCCS) and digit n-back. Voice onset time (VOT) and fundamental frequency (F0) were measured in each word, and analyses were carried out to determine the extent to which children's executive function predicted their use of both informative and less informative cues to the three pairs comprising the Korean three-way stop laryngeal contrast. No evidence was found for a relationship between cognitive ability and acoustic cue weighting in production, which is at odds with previous, albeit conflicting, results for speech perception. While this result may be due to the lack of task demands in the production task used here, it nevertheless expands the empirical ground upon which future work in this area may proceed.

On a Performance Improvement of Speaker Recognition by using the Auditory Characteristics of Speech (음성의 청각특성을 이용한 화자식별시스템의 성능향상에 관한 연구)

  • 이윤주;오세영배재옥배명진
    • Proceedings of the IEEK Conference
    • /
    • 1998.10a
    • /
    • pp.1223-1226
    • /
    • 1998
  • The pre-emephasis filter as the conventional method emphasizes all components of high frequency that reflects the speaker characteristics. However this filter don't show the auditory characteristics of speaker's speech. In order to emphasize the perceptual characteristics, we propose the speaker recognition system that uses the perceptual weighting as the preprocessor because the Auditory characteristic of human is sensitive to the formant peaks. This filter has the characteristcs that both deemphasizes the low-formants and emphasizes the high formants. As a result of the proposed method, we improve the total recognition rate 1.7% better than the conventional method.

  • PDF

Perceptual Quality Improvement of KLT based Entropy-Constrained Quantizer using a SAW Filter (SAW 필터를 이용한 KLT 기반 Entropy-Constrained Quantizer 성능 향상)

  • Lim, Dong-Seok;Kim, Moo Young
    • Proceedings of the Korean Society of Broadcast Engineers Conference
    • /
    • 2013.06a
    • /
    • pp.1-2
    • /
    • 2013
  • KLT-AECQ 는 지각적인 성능 향상을 위하여 formant weighting 필터를 사용한다.Code Excited Linear Prediction(CELP) 코더는 사람의 음성신호를 압축하는 대표적인 방식이다. CELP 의 Rate-Distortion 성능을 향상 시키기 위해서 Karhunen-Loeve-Transform (KLT) 기반의 Classified Vector Quantization (KLT-CVQ) 방식이 제안되었으며, 이는 KLT 기반의 Adaptive Entropy-Constrained Quantization (KLT-AECQ) 방식으로 확장되었다. 기존의 KLT-AECQ 에서는 지각적인 성능 향상을 위하여 formant weighting 필터를 사용한다. 본 논문에서는 이 필터 대신에 Spectral Amplitude Warping (SAW) 필터를 적용함으로써, KLT-AECQ 코더의 지각적인 성능을 향상하였다.

  • PDF

L2 Proficiency Effect on the Acoustic Cue-Weighting Pattern by Korean L2 Learners of English: Production and Perception of English Stops

  • Kong, Eun Jong;Yoon, In Hee
    • Phonetics and Speech Sciences
    • /
    • v.5 no.4
    • /
    • pp.81-90
    • /
    • 2013
  • This study explored how Korean L2 learners of English utilize multiple acoustic cues (VOT and F0) in perceiving and producing the English alveolar stop with a voicing contrast. Thirty-four 18-year-old high-school students participated in the study. Their English proficiency level was classified as either 'high' (HEP) or 'low' (LEP) according to high-school English level standardization. Thirty different synthesized syllables were presented in audio stimuli by combining a 6-step VOTs and a 5-step F0s. The listeners judged how close the audio stimulus was to /t/ or /d/ in L2 using a visual analogue scale. The L2 /d/ and /t/ productions collected from the 22 learners (12 HEP, 10 LEP) were acoustically analyzed by measuring VOT and F0 at the vowel onset. Results showed that LEP listeners attended to the F0 in the stimuli more sensitively than HEP listeners, suggesting that HEP listeners could inhibit less important acoustic dimensions better than LEP listeners in their L2 perception. The L2 production patterns also exhibited a group-difference between HEP and LEP in that HEP speakers utilized their VOT dimension (primary cue in L2) more effectively than LEP speakers. Taken together, the study showed that the relative cue-weighting strategies in L2 perception and production are closely related to the learner's L2 proficiency level in that more proficient learners had a better control of inhibiting and enhancing the relevant acoustic parameters.

Efficient Harmonic-CELP Based Low Bit Rate Speech Coder (효율적인 하모닉-CELP 구조를 갖는 저 전송률 음성 부호화기)

  • 최용수;김경민;윤대희
    • The Journal of the Acoustical Society of Korea
    • /
    • v.20 no.5
    • /
    • pp.35-47
    • /
    • 2001
  • This paper describes an efficient harmonic-CELP speech coder by taking advantages of harmonic and CELP coders into account. According to frame voicing decision, the proposed harmonic-CELP coder adopts the RP-VSELP coder as a fast CELP in case of an unvoiced frame, or an improved harmonic coder in case of a voiced frame. The proposed coder has main features as follows: simple pitch detection, fast harmonic estimation, variable dimension harmonic vector quantization, perceptual weighting reflecting frequency resolution, fast harmonic synthesis, naturalness control using band voicing, and multi-mode. These features make the proposed coder require very low complexity, compared with HVXC coder To demonstrate the performance of the proposed coder, a 2.4 kbps coder has been implemented and compared with reference coders. From results of informal listening tests, the proposed coder showed good quality while requiring low delay and complexity.

  • PDF

Individual differences in categorical perception: L1 English learners' L2 perception of Korean stops

  • Kong, Eun Jong
    • Phonetics and Speech Sciences
    • /
    • v.11 no.4
    • /
    • pp.63-70
    • /
    • 2019
  • This study investigated individual variability of L2 learners' categorical judgments of L2 stops by exploring English learners' perceptual processing of two acoustic cues (voice onset time [VOT] and f0) and working memory capacity as sources of variation. As prior research has reported that English speakers' greater use of the redundant cue f0 was responsible for gradient processing of native stops, we examined whether the same processing characteristics would be observed in L2 learners' perception of Korean stops (/t/-/th/). 22 English learners of L2 Korean with a range of L2 proficiency participated in a visual analogue scaling task and demonstrated variable manners of judging the L2 Korean stops: Some were more gradient than others in performing the task. Correlation analysis revealed that L2 learners' categorical responses were modestly related to individuals' utilizations of a primary cue for the stop contrast (VOT for L1 English stops and f0 for L2 Korean stops), and were also related to better working memory capacity. Together, the current experimental evidence demonstrates adult L2 learners' top-down processing of stop consonants where linguistic and cognitive resources are devoted to a process of determining abstract phonemic identity.

The Comparative Analysis of Visual Perceptual Function and Impulse on Players Chagi in Taekwondo Events (태권도 종목별 선수들의 차기에 대한 시지각기능 및 충격량 비교 분석)

  • Lee, Young-Rim;Ha, Chul-Soo
    • Korean Journal of Applied Biomechanics
    • /
    • v.20 no.2
    • /
    • pp.205-212
    • /
    • 2010
  • The purpose of this study was to compare the efficiency of visual perception and impulse according to the three types of Taekwondo players to be able to supply an efficient training method, for this a total of 12 representative Taekwondo players of the Korean National team, 4 poomsae players, 4 kyokpa players and 4 kyorugi players weighting between 68 to 74 kg, and the results from the motion analysis system, eye tracker and Electronic hogu are as follows. For the visual perceptual function, the total body reaction time was slowest for the kyokpa group, and for the visible reaction and vision fixation time was longest of the poomsae group, while the performance movement was fastest for the kyorugi group. As for description of the two kicking motions dollyo chagi and dolgae chagi the longer visual fixation helps the accuracy of the kick. In conclusion, as there was a difference between the groups, this information could help to train the visual perception of players according to what event they are participating in.

A Speech Enhancement Algorithm based on Human Psychoacoustic Property (심리음향 특성을 이용한 음성 향상 알고리즘)

  • Jeon, Yu-Yong;Lee, Sang-Min
    • The Transactions of The Korean Institute of Electrical Engineers
    • /
    • v.59 no.6
    • /
    • pp.1120-1125
    • /
    • 2010
  • In the speech system, for example hearing aid as well as speech communication, speech quality is degraded by environmental noise. In this study, to enhance the speech quality which is degraded by environmental speech, we proposed an algorithm to reduce the noise and reinforce the speech. The minima controlled recursive averaging (MCRA) algorithm is used to estimate the noise spectrum and spectral weighting factor is used to reduce the noise. And partial masking effect which is one of the human hearing properties is introduced to reinforce the speech. Then we compared the waveform, spectrogram, Perceptual Evaluation of Speech Quality (PESQ) and segmental Signal to Noise Ratio (segSNR) between original speech, noisy speech, noise reduced speech and enhanced speech by proposed method. As a result, enhanced speech by proposed method is reinforced in high frequency which is degraded by noise, and PESQ, segSNR is enhanced. It means that the speech quality is enhanced.

On a Study of the Improvement of Speaker Recognition with Perceptual Weighting Filter (인지 가중 필터를 이용한 화자 인식의 성능 향상에 관한 연구)

  • 배재옥
    • Proceedings of the Acoustical Society of Korea Conference
    • /
    • 1998.06e
    • /
    • pp.428-431
    • /
    • 1998
  • 화자 인식의 방법에서 사용되고 있는 특징 파라미터들은 음성 인식에서 사용되고 있는 특징 파라미터를 그대로 사용하고 있다. 따라서, 이를 화자 인식에 적용할 때 화자의 특성을 효과적으로 반영할 수 있어야 한다. 일반적인 화자의 특징이 고주파수 위주로 분포되어 있기 때문에 전체 스펙트럼의 고주파 영역을 강조시킬 수 있고, 또한 인간의 청각특성이 공진 주파수에 기반하여 이루어진다는 사실에 기반을 두어서 공진 주파수 위주로 강조시키는 인지 가중 필터를 인식단의 전처리로 사용하는 방법에 관한 것이다. 본 논문을 실험한 결과 전체 인식율에서는 기존의 방법보다 3.89%까지 인식율의 향상을 얻을 수 있었다. 또한 사칭자 수리율은 2.5%의 저하를 얻을 수 있었다.

  • PDF

Weighted DCT-IF for Image up Scaling

  • Lee, Jae-Yung;Yoon, Sung-Jun;Kim, Jae-Gon;Han, Jong-Ki
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.13 no.2
    • /
    • pp.790-809
    • /
    • 2019
  • The design of an efficient scaler to enhance the edge data is one of the most important issues in video signal applications, because the perceptual quality of the processed image is sensitively affected by the degradation of edge data. Various conventional scaling schemes have been proposed to enhance the edge data. In this paper, we propose an efficient scaling algorithm for this purpose. The proposed method is based on the discrete cosine transform-based interpolation filter (DCT-IF) because it outperforms other scaling algorithms in various configurations. The proposed DCT-IF incorporates weighting parameters that are optimized for training data. Simulation results show that the quality of the resized image produced by the proposed DCT-IF is much higher than that of those produced by the conventional schemes, although the proposed DCT-IF is more complex than other conventional scaling algorithms.