Search | Korea Science

Perception of Japanese word-initial stops by native listeners (모어청자에 의한 일본어 어두 폐쇄음의 지각)

Byun, Hi-Gyung
- Phonetics and Speech Sciences / v.13 no.3 / pp.53-64 / 2021
It is known that the voicing contrast for Japanese word-initial stops is primarily realized as differences in the voice onset time (VOT). However, recent studies have reported that voiced stops are more often produced with a positive VOT than with a negative VOT among the younger generation nationwide. It is also known that post-stop F0 is associated with the stop contrast, but the degree of F0 use differs from region to region. This study explores whether the difference in post-stop F0 functions as a perceptual cue to the stop contrast along with VOT. Fifty-five college students who are native listeners from four different regions participated in two or three perception tests. The results show that VOT is a primary cue to the voiced-voiceless distinction of word-initial stops, but that the effect of post-stop F0 on the stop contrast is marginal. The post-stop F0 is involved in perception only when VOT is ambiguous, such that a sound with high F0 is more often perceived as a voiceless stop, but not vice versa. The results of this study indicate that the acoustic parameters associated with the stop contrast are not the same in production and perception, and suggest that other factors such as context, which is not an acoustic characteristic, may also be involved in the stop contrast.
https://doi.org/10.13064/KSSS.2021.13.3.053

Voice Analysis of Countertenors (카운터테너의 음성학적 분석)

정성민;김문정;윤선옥;신혜정;박수경;신유리;권영경
- Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.12 no.1 / pp.39-45 / 2001
Background and Objectives: A post-pubescent male classical singer has a lower vocal register than a female classical singer. Countertenors, who can reach a higher vocal register like that of female classical singers by using falsetto voice and head resonance, have recently become active. The general purpose of this study is to analyze the voices of countertenors and to determine how they differ from those of other classical singers. Materials and Methods: Four countertenors in Korea were examined using videostroboscopy, and their voices were analyzed using aerodynamic, acoustic, and voice range profile methods. Results and Conclusion: Countertenors could produce elevated fundamental frequency, voice intensity, and mean air flow rate using large pulmonary capacity and head-voiced falsetto. This suggests that the greater energy in the countertenor voice is due to more efficient conversion of air flow into acoustic energy. However, they showed unstable amplitude perturbation in each vocal cycle. The results indicate that the countertenor voice is the acoustic product of a laryngeal mechanism different from that of other classical registers, and that it can be recognized as one of the registers of male classical singers.

Adaptive Noise Reduction of Speech Using Wavelet Transform (웨이브렛 변환을 이용한 음성의 적응 잡음 제거)

Lee, Chang-Ki;Kim, Dae-Ik
- The Journal of the Korea institute of electronic communication sciences / v.4 no.3 / pp.190-196 / 2009
A new time-adapted threshold is proposed that uses the standard deviations of the wavelet coefficients obtained by applying the wavelet transform frame by frame. The time-adapted threshold is set to the sum of the standard deviation of the level 3 approximation coefficients and the weighted standard deviation of the level 1 detail coefficients. Level 3 approximation coefficients represent low-frequency voiced sound, and level 1 detail coefficients represent high-frequency unvoiced sound. After noise is reduced by soft thresholding with the proposed time-adapted threshold, residual noise still remains in silent intervals. To reduce this residual noise, an algorithm for detecting silent intervals is proposed. Simulation results show that the SNR and MSE of the proposed algorithm are better than those of the wavelet transform and the wavelet packet transform.

A Study on Speaker Identification by Difference Sum and Correlation Coefficients of Narrow-band Spectrum (좁은대역 스펙트럼의 차이값과 상관계수에 의한 화자확인 연구)

Yang, Byung-Gon;Kang, Sun-Mee
- Speech Sciences / v.9 no.3 / pp.3-16 / 2002
We examined some problems in speaker identification procedures: transformation of acoustic parameters into auditory scales, invalid measurement values, and comparability of spectral energy values across the frequency range. To resolve those problems, we analyzed the acoustic spectral energy of three Korean numbers produced by ten female students from narrow-band spectrograms at 19 proportional time points of each voiced segment. Then, cells of the first five spectral matrices were averaged to form a matrix model for each speaker. The correlation coefficient and the sum of the absolute amplitude differences were obtained for each pair of spectral models of the ten subjects. Some individual matrix models were also compared with those of the same subject or of another subject with a similar spectral model. Results showed that for numbers '2' and '9' subjects could not be clearly distinguished from one another, but for number '4' there was some possibility of setting threshold values for speaker identification using the correlation coefficients and the sum of absolute differences. Further studies would be desirable on various combinations of the range of long-term average spectra and the degree of signal pre-emphasis.
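The two comparison measures named in this abstract can be sketched as follows. This is only an illustration under simple assumptions: each spectral matrix is a list of rows of amplitude values with identical shape, and the function names are hypothetical rather than taken from the paper.

```python
import math


def average_model(matrices):
    """Average several equally shaped matrices cell by cell into one model."""
    n = len(matrices)
    rows, cols = len(matrices[0]), len(matrices[0][0])
    return [[sum(m[r][c] for m in matrices) / n for c in range(cols)]
            for r in range(rows)]


def flatten(matrix):
    """Turn a matrix (list of rows) into one flat list of cells."""
    return [value for row in matrix for value in row]


def abs_difference_sum(model_a, model_b):
    """Sum of absolute amplitude differences over all cells."""
    return sum(abs(a - b) for a, b in zip(flatten(model_a), flatten(model_b)))


def correlation(model_a, model_b):
    """Pearson correlation coefficient over all cells of the two models."""
    xs, ys = flatten(model_a), flatten(model_b)
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)
```

A high correlation together with a small difference sum would suggest the same speaker; where to place the thresholds is exactly the open question the abstract reports.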

An Experimental Study of Co-relation between English Pronunciation and Listening Comprehension of Korean College Students in Chungnam and Gyungnam Provinces (충남.경남지역 대학생들의 영어발음과 청해능력의 상관관계에 대한 실험적 연구)

Park, Hee-Suk;Kim, Jung-Sook
- Speech Sciences / v.11 no.3 / pp.55-68 / 2004
The purpose of this experimental study is to investigate the relationship between English pronunciation and listening comprehension of English diphthongs and low vowels among Korean college students from the Chungnam and Gyungnam provinces. Of the 22 test sentences for listening comprehension, 15 were recorded by native speakers and seven were edited from Springboard by Oxford University Press. For the listening comprehension test, 90 subjects from two groups, Chungnam dialect speakers and Gyungnam dialect speakers, were selected. They listened to the 22 sentences played from an audio cassette tape and completed a cloze exercise. The results show that college students from Gyungnam province had better listening comprehension of words containing front low vowels followed by voiced sounds than those from Chungnam province. The same result was found when the back low vowel occurred in an open syllable: Gyungnam students showed better listening comprehension of words containing back low vowels than Chungnam students. Because Hee-Suk Park & Jung-Sook Kim (2003) showed that Gyungnam students pronounced the English low vowels longer than Chungnam students, we conclude that there is a positive relation between English pronunciation and listening comprehension, especially among Gyungnam students. For words containing English diphthongs, however, we found almost no relation between English pronunciation and listening comprehension.

A Study of English Consonants Identified by College Students (대학생들의 영어자음 인지 연구)

Yang, Byung-Gon
- Speech Sciences / v.12 no.3 / pp.139-151 / 2005
Previous studies have shown that Korean students have difficulty identifying some English consonants that are not in the Korean sound inventory. The aim of this study was to examine how accurately 130 college students identified English consonants, in order to find out which consonants were difficult for them to perceive. The subjects' task was to identify one member of each minimal pair played in a quiet laboratory classroom. The 100 minimal pairs consisted of syllables with various onsets or codas: stops, fricatives, affricates, liquids, and nasals. Results were as follows. First, the average score of the English major group was significantly higher than that of the non-English major group. Second, the rank order of minimal pairs sorted by accuracy rate was similarly distributed in the two groups. Third, the accuracy rate systematically decreased as each score range decreased. Fourth, the students perceived liquids more accurately than the stop-fricative contrast. Fifth, accuracy was higher in onset position than in coda position. Finally, the students still had problems telling voiced consonants from voiceless ones, especially in coda position. It would be desirable to extend the present research to middle or high school students to fundamentally resolve those listening problems.

Subband Based Spectrum Subtraction Algorithm (서브밴드에 기반한 스펙트럼 차감 알고리즘)

Choi, Jae-Seung
- The Journal of the Korea institute of electronic communication sciences / v.8 no.4 / pp.555-560 / 2013
This paper first proposes a classification algorithm that labels each frame as voiced, unvoiced, or silence using distance measure, logarithmic power, and root mean square methods, and then a spectrum subtraction algorithm based on a subband filter. The proposed algorithm subtracts the spectra of white noise and street noise from the noisy signal at each frame, based on the subband filter. The proposed spectrum subtraction algorithm is evaluated using the speech and noise data of the Aurora-2 database. Measurements of the signal-to-noise ratio (SNR) confirm that the proposed algorithm is effective for speech contaminated by noise. In the experiments, the output SNR improved by approximately 2.1 dB and 1.91 dB for white noise and street noise, respectively.
https://doi.org/10.13067/JKIECS.2013.8.4.555
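As a rough illustration of spectrum subtraction over subbands, the sketch below works on magnitude spectra that are assumed to be already computed for one frame. It is a generic textbook-style simplification, not the paper's subband filter: the equal-width bands, the zero floor, and all function names are assumptions made for the example.

```python
import math


def subband_average(magnitudes, band_size):
    """Replace each bin with the mean of its subband (equal-width bands)."""
    smoothed = []
    for start in range(0, len(magnitudes), band_size):
        band = magnitudes[start:start + band_size]
        smoothed.extend([sum(band) / len(band)] * len(band))
    return smoothed


def subtract_noise(noisy_mag, noise_mag, floor=0.0):
    """Subtract the noise estimate bin by bin, clamping results at `floor`."""
    return [max(n - d, floor) for n, d in zip(noisy_mag, noise_mag)]


def snr_db(clean, estimate):
    """SNR in dB of `estimate` relative to the `clean` reference signal."""
    signal_power = sum(c * c for c in clean)
    error_power = sum((c - e) ** 2 for c, e in zip(clean, estimate))
    return 10 * math.log10(signal_power / error_power)
```

In use, the noise estimate would typically come from frames classified as silence, be smoothed with `subband_average`, and then be passed to `subtract_noise` for every frame; `snr_db` assumes the estimate differs from the reference, since identical inputs give zero error power.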

Adaptive Noise Reduction of Speech using Wavelet Transform (웨이브렛 변환을 이용한 음성의 적응 잡음 제거)

Im Hyung-kyu;Kim Cheol-su
- Journal of the Korea Computer Industry Society / v.6 no.2 / pp.271-278 / 2005
This paper proposes a new time-adapted threshold that uses the standard deviations of the wavelet coefficients obtained by applying the wavelet transform frame by frame. The time-adapted threshold is set to the sum of the standard deviation of the level 3 approximation coefficients and the weighted standard deviation of the level 1 detail coefficients. Level 3 approximation coefficients represent low-frequency voiced sound, and level 1 detail coefficients represent high-frequency unvoiced sound. After noise is reduced by soft thresholding with the proposed time-adapted threshold, residual noise still remains in silent intervals. To reduce this residual noise, an algorithm for detecting silent intervals is proposed. Simulation results demonstrate that the proposed algorithm improves SNR and MSE performance more than the wavelet transform and the wavelet packet transform do.

Performance Improvement of Packet Loss Concealment Algorithm in G.711 Using Speech Characteristics (음성 특성을 이용한 G.711 패킷 손실 은닉 알고리즘의 성능개선)

Han Seung-Ho;Kim Jin-Sul;Lee Hyun-Woo;Ryu Won;Hahn Min-Soo
- MALSORI / no.57 / pp.175-189 / 2006
Because packet loss degrades speech quality, VoIP speech coders include a PLC (Packet Loss Concealment) mechanism. G.711, which is a mandatory VoIP speech coder, also has a PLC algorithm based on pitch period replication. However, it is not robust to burst losses. We therefore propose two methods to improve the performance of the original PLC algorithm in G.711. One adaptively uses the voiced/unvoiced information of the good frames adjacent to the current lost frame. The other is based on adaptive gain control according to the energy variation across frames. We evaluate the proposed PLC algorithm by measuring PESQ values under different simulated random and burst packet loss conditions. The experiments show that the proposed PLC algorithm outperforms the PLC employed in ITU-T Recommendation G.711.
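Pitch period replication, the baseline idea this abstract builds on, can be sketched in a few lines. This is a simplified illustration, not the G.711 concealment procedure and not the authors' improved method: the plain autocorrelation search, the default lag range, and the function names are assumptions, and the smoothing and attenuation a real concealer needs are left out.

```python
def estimate_pitch_period(history, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the largest autocorrelation."""
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        score = sum(history[i] * history[i - lag]
                    for i in range(lag, len(history)))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def conceal_lost_frame(history, frame_len, min_lag=40, max_lag=120):
    """Fill a lost frame by repeating the last pitch period of `history`.

    `history` holds the most recent good samples and must be longer than
    max_lag. The lag range is an assumed search window in samples.
    """
    period = estimate_pitch_period(history, min_lag, max_lag)
    last_period = history[-period:]
    return [last_period[i % period] for i in range(frame_len)]
```

Repeating the same period over a long burst of lost frames makes the output sound increasingly artificial, which is one reason a scheme like this is not robust to burst losses.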

Speaker Identification Based on Incremental Learning Neural Network

Heo, Kwang-Seung;Sim, Kwee-Bo
- International Journal of Fuzzy Logic and Intelligent Systems / v.5 no.1 / pp.76-82 / 2005
A speech signal carries various speaker-specific features, which are extracted by speech signal processing and used by a speaker identification system to identify the speaker. In this paper, we propose a speaker identification system that uses incremental learning based on a neural network. The speech signal recorded through a microphone is divided into frames of 1024 samples. Energy is used to separate the speech signal into voiced and unvoiced segments. The extracted 12th-order LPC cepstrum coefficients are used as input data for the neural network, which identifies the speakers. The neural network is an MLP consisting of 12 input nodes, 8 hidden nodes, and 4 output nodes. The number of output nodes corresponds to the number of identified speakers, and the first output node is activated for the first speaker. Incremental learning begins when a new speaker is identified. It is a learning algorithm in which the already learned weights are retained and only the new weights created by adding a new speaker are trained, which overcomes a weakness of neural networks. The neural network repeats the learning whenever a new speaker is entered, and its architecture is extended with the number of speakers. Therefore, this system can learn without a restriction on the number of speakers.
https://doi.org/10.5391/IJFIS.2005.5.1.076
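The structural side of the incremental scheme in this abstract, growing the output layer while leaving learned weights untouched, can be sketched as below. The 12-8-4 layout follows the abstract; the sigmoid activation, the random initialization range, and the function names are assumptions for illustration, and training of the new weights is omitted.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """Create n_out weight rows, each with n_in weights plus a trailing bias."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
            for _ in range(n_out)]


def forward_layer(weights, inputs):
    """Sigmoid layer; the last entry of each row is the bias."""
    return [1 / (1 + math.exp(-(row[-1] + sum(w * x for w, x in zip(row, inputs)))))
            for row in weights]


def identify(hidden, output, features):
    """Return the index of the most strongly activated output node."""
    scores = forward_layer(output, forward_layer(hidden, features))
    return scores.index(max(scores))


def add_speaker(output, n_hidden, rng):
    """Append one output node for a new speaker; existing rows stay as they are."""
    output.append([rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)])


rng = random.Random(0)
hidden = make_layer(12, 8, rng)   # 12 cepstrum inputs -> 8 hidden nodes
output = make_layer(8, 4, rng)    # 8 hidden nodes -> 4 speakers
```

After `add_speaker(output, 8, rng)` the network has five output nodes, and only the appended row would need training, which is the property the abstract attributes to incremental learning.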

