• Title/Summary/Keyword: 스펙트로그램 분석

Search Result 66, Processing Time 0.02 seconds

A Comparison Study on the Speech Signal Parameters for Chinese Leaners' Korean Pronunciation Errors - Focused on Korean /ㄹ/ Sound (중국인 학습자의 한국어 발음 오류에 대한 음성 신호 파라미터들의 비교 연구 - 한국어의 /ㄹ/ 발음을 중심으로)

  • Lee, Kang-Hee;You, Kwang-Bock;Lim, Ha-Young
    • Asia-pacific Journal of Multimedia Services Convergent with Art, Humanities, and Sociology
    • /
    • v.7 no.6
    • /
    • pp.239-246
    • /
    • 2017
  • This paper compares the speech signal parameters between Korean and Chinese for Korean pronunciation /ㄹ/, which is caused many errors by Chinese leaners. Allophones of /ㄹ/ in Korean is divided into lateral group and tap group. It has been investigated the reasons for these errors by studying the similarity and the differences between Korean /ㄹ/ pronunciation and its corresponding Chinese pronunciation. In this paper, for the purpose of comparison the speech signal parameters such as energy, waveform in time domain, spectrogram in frequency domain, pitch based on ACF, Formant frequencies are used. From the phonological perspective the speech signal parameters such as signal energy, a waveform in the time domain, a spectrogram in the frequency domain, the pitch (F0) based on autocorrelation function (ACF), Formant frequencies (f1, f2, f3, and f4) are measured and compared. The data, which are composed of the group of Korean words by through a philological investigation, are used and simulated in this paper. According to the simulation results of the energy and spectrogram, there are meaningful differences between Korean native speakers and Chinese leaners for Korean /ㄹ/ pronunciation. The simulation results also show some differences even other parameters. It could be expected that Chinese learners are able to reduce the errors considerably by exploiting the parameters used in this paper.

A Study on Fuzziness Parameter Selection in Fuzzy Vector Quantization for High Quality Speech Synthesis (고음질의 음성합성을 위한 퍼지벡터양자화의 퍼지니스 파라메타선정에 관한 연구)

  • 이진이
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.8 no.2
    • /
    • pp.60-69
    • /
    • 1998
  • This paper proposes a speech synthesis method using Fuzzy VQ, and then study how to make choice of fuzziness value which optimizes (controls) the performance of FVQ in order to obtain the synthesized speech which is closer to the original speech. When FVQ is used to synthesize a speech, analysis stage generates membership function values which represents the degree to which an input speech pattern matches each speech patterns in codebook, and synthesis stage reproduces a synthesized speech, using membership function values which is obtained in analysis stage, fuzziness value, and fuzzy-c-means operation. By comparsion of the performance of the FVQ and VQ synthesizer with simmulation, we show that, although the FVQ codebook size is half of a VQ codebook size, the performance of FVQ is almost equal to that of VQ. This results imply that, when Fuzzy VQ is used to obtain the same performance with that of VQ in speech synthesis, we can reduce by half of memory size at a codebook storage. And then we have found that, for the optimized FVQ with maximum SQNR in synthesized speech, the fuzziness value should be small when the variance of analysis frame is relatively large, while fuzziness value should be large, when it is small. As a results of comparsion of the speeches synthesized by VQ and FVQ in their spectrogram of frequency domain, we have found that spectrum bands(formant frequency and pitch frequency) of FVQ synthesized speech are closer to the original speech than those using VQ.

  • PDF

Influence of standard Korean and Gyeongsang regional dialect on the pronunciation of English vowels (표준어와 경상 지역 방언의 한국어 모음 발음에 따른 영어 모음 발음의 영향에 대한 연구)

  • Jang, Soo-Yeon
    • Phonetics and Speech Sciences
    • /
    • v.13 no.4
    • /
    • pp.1-7
    • /
    • 2021
  • This study aims to enhance English pronunciation education for Korean students by examining the impact of standard Korean and Gyeongsang regional dialect on the articulation of English vowels. Data were obtained through the Korean-Spoken English Corpus (K-SEC). Seven Korean words and ten English mono-syllabic words were uttered by adult, male speakers of standard Korean and Gyeongsang regional dialect, in particular, speakers with little to no experience living abroad were selected. Formant frequencies of the recorded corpus data were measured using spectrograms, provided by the speech analysis program, Praat. The recorded data were analyzed using the articulatory graph for formants. The results show that in comparison with speakers using standard Korean, those using the Gyeongsang regional dialect articulated both Korean and English vowels in the back. Moreover, the contrast between standard Korean and Gyeongsang regional dialect in the pronunciation of Korean vowels (/으/, /어/) affected how the corresponding English vowels (/ə/, /ʊ/) were articulated. Regardless of the use of regional dialect, a general feature of vowel pronunciation among Korean people is that they show more narrow articulatory movements, compared with that of native English speakers. Korean people generally experience difficulties with discriminating tense and lax vowels, whereas native English speakers have clear distinctions in vowel articulation.

Design and Implementation of BNN-based Gait Pattern Analysis System Using IMU Sensor (관성 측정 센서를 활용한 이진 신경망 기반 걸음걸이 패턴 분석 시스템 설계 및 구현)

  • Na, Jinho;Ji, Gisan;Jung, Yunho
    • Journal of Advanced Navigation Technology
    • /
    • v.26 no.5
    • /
    • pp.365-372
    • /
    • 2022
  • Compared to sensors mainly used in human activity recognition (HAR) systems, inertial measurement unit (IMU) sensors are small and light, so can achieve lightweight system at low cost. Therefore, in this paper, we propose a binary neural network (BNN) based gait pattern analysis system using IMU sensor, and present the design and implementation results of an FPGA-based accelerator for computational acceleration. Six signals for gait are measured through IMU sensor, and a spectrogram is extracted using a short-time Fourier transform. In order to have a lightweight system with high accuracy, a BNN-based structure was used for gait pattern classification. It is designed as a hardware accelerator structure using FPGA for computation acceleration of binary neural network. The proposed gait pattern analysis system was implemented using 24,158 logics, 14,669 registers, and 13.687 KB of block memory, and it was confirmed that the operation was completed within 1.5 ms at the maximum operating frequency of 62.35 MHz and real-time operation was possible.

Effective PPG Signal Processing Method for Detecting Emotional Stimulus (감성 자극 판단을 위한 효과적인 PPG 신호 처리 방법)

  • Oh, Dong-Gi;Min, Byung-Seok;Kwon, Sung-Oh;Kim, Hyun-Joong
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.37 no.5C
    • /
    • pp.393-402
    • /
    • 2012
  • In this study, we propose a signal processing algorithm to measure the arousal level of a human subject using a PPG(Photoplethysmography) sensor. From the measured PPG signals, the arousal level is determined by PPI(Pulse to Pulse Interval) and discrete-time signal processing. We ran psychophysical experiments displaying visual stimuli on TV display while measuring PPG signal from a finger, where the nature landscape scenes were used for restorative effect, and the urban environments were used to stimulate the stress. However, the measured PPG signals may include noise due to subject movement and measurement error, which results in incorrect detections. In this paper, to mitigate the noise impact on stimulus detection, we propose a detecting algorithm using digital signal processing methods and statistics of measured signals. A filter is adopted to remove a high frequency noise and adaptively designed taking into account the statistics of the measured PPG signals. Moreover we employ a hysteresis method to reduce the distortion of PPI in decision of emotional. Via experiment, we show that the proposed scheme reduces signal noise and improves stimulus detection.

Acoustic features of diphthongs produced by children with speech sound disorders (말소리장애 아동이 산출한 이중모음의 음향학적 특성)

  • Cho, Yoon Soo;Pyo, Hwa Young;Han, Jin Soon;Lee, Eun Ju
    • Phonetics and Speech Sciences
    • /
    • v.13 no.1
    • /
    • pp.65-72
    • /
    • 2021
  • The aim of this study is to prepare basic data that can be used for evaluation and intervention by investigating the characteristics of diphthongs produced by children with speech sound disorders. To confirm this, two groups of 10 children each, with and without speech sound disorders were asked to imitate the meaningless two-syllable 'diphthongs + da'. The slope of F1 and F2, amount of change of formant, and duration of glide were analyzed by Praat (version 6.1.16). As a result, the difference between the two groups was found in the slope of F1 of /ju/. Children with speech sound disorders had smaller changes in formants and shorter duration time values compared to normal children, and there were statistically significant differences. The amount of change in formant in the glide was found in F1 of /ju, jɛ/, F2 of /jɑ, jɛ/, and there were significant differences in the duration of glide in /ju, jɛ/. The results of this study showed that the range of articulation of diphthongs in children with speech sound disorders is relatively smaller than that of normal children, thus the time it takes to articulate was reduced. These results suggest that the range of articulation and acoustic analysis should be further investigated for evaluation and intervention regarding diphthongs of children with speech sound disorders.