• Title/Summary/Keyword: voice parameter

Noise Canceler Based on Deep Learning Using Discrete Wavelet Transform (이산 Wavelet 변환을 이용한 딥러닝 기반 잡음제거기)

  • Haeng-Woo Lee
    • The Journal of the Korea institute of electronic communication sciences / v.18 no.6 / pp.1103-1108 / 2023
  • In this paper, we propose a new algorithm for attenuating background noise in acoustic signals. The algorithm improves noise attenuation by applying a fully connected neural network (FNN) deep learning model, instead of the conventional adaptive filter, after a wavelet transform. The input signal is wavelet-transformed over each short-time period, and a 1024-1024-512-neuron FNN then removes the noise from the single noisy input audio signal. The transform maps the time-domain voice signal into the time-frequency domain so that the noise characteristics are well expressed, and the model predicts voice in a noisy environment through supervised learning in which the transform parameters of the clean voice signal serve as targets for the transform parameters of the noisy input. To verify the performance of the proposed noise reduction system, a simulation program was written with the TensorFlow and Keras libraries and a simulation was performed. In the experiment, the proposed deep learning algorithm improved the mean square error (MSE) by 30% compared with the conventional adaptive filter and by 20% compared with the short-time Fourier transform (STFT).
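    The per-frame wavelet step described in this abstract can be illustrated with a one-level Haar transform. This is only a minimal sketch: the abstract does not name the wavelet family, the frame length, or the decomposition depth, so the Haar filter and the function names `haar_dwt` and `haar_idwt` are assumptions made for illustration. In a pipeline like the one described, the coefficients of a noisy frame would be the network input and the coefficients of the clean frame the training target.

    ```python
    import math

    def haar_dwt(frame):
        """One-level Haar DWT of an even-length frame.

        Returns (approximation, detail) coefficient lists. The Haar
        wavelet is an assumption; the abstract does not name one.
        """
        if len(frame) % 2:
            raise ValueError("frame length must be even")
        s = 1.0 / math.sqrt(2.0)
        approx = [(frame[i] + frame[i + 1]) * s for i in range(0, len(frame), 2)]
        detail = [(frame[i] - frame[i + 1]) * s for i in range(0, len(frame), 2)]
        return approx, detail

    def haar_idwt(approx, detail):
        """Inverse of haar_dwt: rebuild the time-domain frame."""
        s = 1.0 / math.sqrt(2.0)
        frame = []
        for a, d in zip(approx, detail):
            frame.extend([(a + d) * s, (a - d) * s])
        return frame

    # Usage: transform a short frame and reconstruct it.
    frame = [4.0, 2.0, 5.0, 5.0]
    approx, detail = haar_dwt(frame)
    restored = haar_idwt(approx, detail)
    ```

    Because the Haar transform is orthonormal, the signal energy is preserved and the inverse transform recovers the frame, so a denoised coefficient vector can be mapped back to audio.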

Usefulness of Cepstral Peak Prominence (CPP) in Unilateral Vocal Fold Paralysis Dysphonia Evaluation (일측성 성대마비 환자 평가에서 Cepstral Peak Prominence의 유용성)

  • Lee, Chang-Yoon;Jeong, Hee Seok;Son, Hee Young
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.28 no.2 / pp.84-88 / 2017
  • Background and Objectives: The purpose of this study was to compare the usefulness of cepstral peak prominence (CPP) with the parameters of the Multi-Dimensional Voice Program (MDVP) in evaluating patients with unilateral vocal fold paralysis and subjective voice impairment. Materials and Methods: From July 2014 to August 2016, 37 patients who had been diagnosed with unilateral vocal fold paralysis and had received two or more voice tests before and after the diagnosis were evaluated for maximum phonation time (MPT), MDVP, and CPP. Voice tests were performed with the vowel /a/ and with paragraph reading. Results: The cepstral measures CPP-a (CPP with vowel /a/) and CPP-s (CPP with paragraph reading) were statistically negatively correlated with G, R, B, and A before voice therapy. Jitter, Shimmer, and NHR of MDVP were positively correlated with G, R, and B, and were significantly correlated with the cepstral indices. G, B, and A showed a statistically significant negative correlation with CPP-a and CPP-s, with somewhat higher correlation coefficients between 0.5 and 0.78. Among the MDVP indices, on the other hand, only Jitter showed a positive correlation with G and B, with a coefficient of 0.4. Conclusion: CPP can be an important tool for evaluating speech in unilateral vocal cord paralysis when speech energy changes or the cycle is not constant during speech.

A Simulator for Integrated Voice/Data Packet Communication Networks (음성과 데이터가 집적된 패킷통신망을 위한 시뮬레이터 개발)

  • Park, Soon;Un, Chong-Kwan
    • The Journal of Korean Institute of Communications and Information Sciences / v.11 no.2 / pp.108-121 / 1986
  • This paper describes the development of a simulator for performance estimation and parameter optimization of an integrated voice/data packet communication network. The simulator can model an integrated voice/data network that handles packet voice terminals as well as data terminals and hosts operating under standard CCITT protocols. Of the three discrete event simulation approaches presently known, the process interaction method was chosen, because it yields a simulator that corresponds most closely to the real system. The simulator was implemented in the PL/I and GPSS simulation languages, resulting in a software package of about 4,000 lines. To reduce the computer run time of the simulator, we used a method of reducing conditional events based on a GPSS LINK block. We describe various aspects of the simulation model developed. We then investigate the performance of a 7-node network using the simulator and present the results. To validate the simulator, we construct a simulation model for a simple voice/data multiplexer and compare the simulation results with those of an analytical model.

Reliability of OperaVOX™ against Multi-Dimensional Voice Program to Assess Voice Quality before and after Laryngeal Microsurgery in Patient with Vocal Polyp (성대 용종 환자의 후두미세수술 전후 음성 평가에서 OperaVOX™와 Multi-Dimensional Voice Program 간의 신뢰도 연구)

  • Kim, Sun Woo;Kim, So Yean;Cho, Jae Kyung;Jin, Sung Min;Lee, Sang Hyuk
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.31 no.2 / pp.71-77 / 2020
  • Background and Objectives: OperaVOX™ (Oxford Wave Research Ltd.) is a portable voice analysis software package designed for use with iOS devices. As a relatively cheap, portable, and easily accessible form of acoustic analysis, OperaVOX™ may be more clinically useful than laboratory-based software in many situations. The aim of this study was to evaluate the agreement between OperaVOX™ and the Multi-Dimensional Voice Program (MDVP; Computerized Speech Lab) in assessing voice quality before and after laryngeal microsurgery in patients with vocal polyps. Materials and Method: Twenty patients who had undergone laryngeal microsurgery for vocal polyps were enrolled in this study. Preoperative and postoperative voices were assessed by acoustic analysis using MDVP and OperaVOX™. A five-second recording of the vowel /a/ was used to measure fundamental frequency (F0), jitter, shimmer, and noise-to-harmonic ratio (NHR). Results: Several acoustic parameters of MDVP and OperaVOX™ related to short-term variability showed significant improvement. In MDVP, the preoperative values of F0, jitter, shimmer, and NHR were 155.75 Hz (male: 125.37 Hz, female: 183.37 Hz), 2.20%, 6.28%, and 0.16, and the postoperative values were 164.34 Hz (male: 129.42 Hz, female: 199.26 Hz), 2.15%, 5.18%, and 0.14. In OperaVOX™, the preoperative values were 168.26 Hz (male: 135.16 Hz, female: 201.37 Hz), 2.27%, 6.95%, and 0.26, and the postoperative values were 162.72 Hz (male: 128.267 Hz, female: 197.18 Hz), 1.71%, 5.36%, and 0.20. Intersoftware agreement for F0, jitter, and shimmer was high, as measured by the intraclass correlation coefficient. Conclusion: Our results showed that the short-term variability of acoustic parameters in both MDVP and OperaVOX™ was useful for the objective assessment of voice quality in patients who received laryngeal microsurgery. OperaVOX™ is comparable to MDVP and has high intersoftware reliability with MDVP in measuring F0, jitter, and shimmer.

A Study on Voice Recognition Pattern matching level for Vehicle ECU control (자동차 ECU제어를 위한 음성인식 패턴매칭레벨에 관한 연구)

  • Ahn, Jong-Young;Kim, Young-Sub;Kim, Su-Hoon;Hur, Kang-In
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.10 no.1 / pp.75-80 / 2010
  • Noise handling is very important for voice recognition in the vehicle environment, and it has been studied through both hardware and software approaches. The hardware approach is noise filter circuit design, basically using a low-pass filter, and it has shown good results. On the software side, algorithms such as noise cancelers and neural networks (NN) have been developed. In this paper, we analyze the pattern matching level of classified parameters for voice recognition in a car noise environment using DTW (Dynamic Time Warping), an algorithm applicable to time-series pattern recognition.
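    The DTW matching mentioned in this abstract can be sketched as follows. This is a textbook dynamic-programming version under stated assumptions, not the authors' implementation: the sequences are plain scalar lists, the local cost is the absolute difference, and the name `dtw_distance` is hypothetical. A recognizer would normally compare sequences of feature vectors and pick the template with the smallest distance.

    ```python
    def dtw_distance(a, b):
        """Dynamic Time Warping distance between two scalar sequences.

        cost[i][j] holds the minimum accumulated cost of aligning the
        first i samples of a with the first j samples of b.
        """
        inf = float("inf")
        n, m = len(a), len(b)
        cost = [[inf] * (m + 1) for _ in range(n + 1)]
        cost[0][0] = 0.0
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                local = abs(a[i - 1] - b[j - 1])  # assumed local cost
                cost[i][j] = local + min(
                    cost[i - 1][j],      # a advances, b repeats
                    cost[i][j - 1],      # b advances, a repeats
                    cost[i - 1][j - 1],  # both advance
                )
        return cost[n][m]

    # Usage: a time-stretched copy of a template aligns at zero cost.
    template = [1.0, 2.0, 3.0]
    stretched = [1.0, 2.0, 2.0, 3.0]
    distance = dtw_distance(template, stretched)
    ```

    The warping is what makes the measure tolerant of speaking-rate differences: the repeated sample in `stretched` is absorbed by the alignment instead of being counted as a mismatch.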

The Stability and Variability based on Vowels in Voice Quality Analysis (음질 분석에 있어서 모음에 따른 안정성과 변이성)

  • Choi, Seong Hee;Choi, Chul-Hee
    • Phonetics and Speech Sciences / v.7 no.1 / pp.79-86 / 2015
  • This study explored the vowel effect on acoustic perturbation measures in voice quality analysis. The perturbation parameters (%jitter, %shimmer) and the noise parameter (SNR) were measured for 7 Korean vowels (/a/, /ɛ/, /i/, /o/, /u/, /ɯ/, /ʌ/) using CSpeech in 50 normal young Korean adults (24 males and 26 females). A significant vowel effect was found only in %shimmer; in particular, the low-back vowel /a/ differed significantly from the other vowels in %shimmer. The least perturbation and the least noise were exhibited on the high-back vowels /ɯ/ and /o/, respectively. With respect to tongue height, low vowels showed significantly higher %shimmer than high vowels. In addition, back vowels (tongue advancement) and rounded vowels (lip rounding) showed significantly less perturbation and noise. The least within-individual variability of perturbation and noise across three repeated measures was found on the vowel /i/. However, there was no significant difference among the 3 token measures in a single session for any vowel tested except /o/. Consequently, the vowel /a/, commonly used in acoustic perturbation measures, exhibited higher perturbation and noise, whereas the high-back vowel /u/ showed higher stability and less variability. These results suggest that the Korean high-back vowel /u/ may be more appropriate and reliable for acoustic perturbation measures.
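    For orientation, %jitter and %shimmer are commonly defined as the mean absolute difference between consecutive cycles (of period for jitter, of peak amplitude for shimmer) divided by the mean value, times 100. The sketch below follows that common local definition; whether CSpeech uses exactly this formula is not stated in the abstract, and the function name `percent_perturbation` and the sample values are hypothetical.

    ```python
    def percent_perturbation(values):
        """Local perturbation in percent for a cycle-by-cycle series.

        Pass glottal periods to get %jitter or cycle peak amplitudes
        to get %shimmer (common local definition, assumed here).
        """
        if len(values) < 2:
            raise ValueError("need at least two cycles")
        diffs = [abs(values[i + 1] - values[i]) for i in range(len(values) - 1)]
        mean_abs_diff = sum(diffs) / len(diffs)
        mean_value = sum(values) / len(values)
        return 100.0 * mean_abs_diff / mean_value

    # Usage with made-up cycle data (seconds and linear amplitude).
    periods = [0.010, 0.011, 0.010, 0.011]
    amplitudes = [0.80, 0.78, 0.82, 0.79]
    jitter_pct = percent_perturbation(periods)
    shimmer_pct = percent_perturbation(amplitudes)
    ```

    Both measures depend on reliable cycle detection, which is one reason the choice of vowel can affect their stability.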

Voice Conversion Using Linear Multivariate Regression Model and LP-PSOLA Synthesis Method (선형다변회귀모델과 LP-PSOLA 합성방식을 이용한 음성변환)

  • 권홍석;배건성
    • The Journal of the Acoustical Society of Korea / v.20 no.3 / pp.15-23 / 2001
  • This paper presents a voice conversion technique that modifies the utterance of a source speaker so that it sounds as if it were spoken by a target speaker. Feature parameter conversion methods that transform the vocal tract and prosodic characteristics between the source and target speakers are described. The vocal tract characteristics are transformed by modifying the LPC cepstral coefficients using Linear Multivariate Regression (LMR). The prosodic transformation changes the average pitch period between speakers and is applied to the residual signal using the LP-PSOLA scheme. Experimental results show that speech transformed by LMR and the LP-PSOLA synthesis method retains many characteristics of the target speaker.

Development of Combined Permanent Magnet Type Microspeakers Used for Mobile Phones (이동통신 단말기용 통합 영구 자석 형태의 마이크로스피커 개발)

  • Hwang, Sang-Moon;Lee, Hong-Joo;Kwon, Joong-Hak;Hwang, Gun-Yong;Yang, Yong-Chang
    • Transactions of the Korean Society for Noise and Vibration Engineering / v.16 no.2 s.107 / pp.183-189 / 2006
  • In mobile phones of the multimedia era, microspeakers with high-quality sound are essential parts for reproducing human voice in speakerphones and MP3 players. In this paper, two types of microspeakers, the outer permanent magnet (PM) type and the combined PM type, are analyzed using electromagnetic and mechanical analysis and their coupling. For performance comparison, the voice coil diameter is chosen as a design parameter that changes the excitation position and magnet volume for both types. For the combined PM type, the sound pressure level (SPL) is improved compared with the outer PM type because of the increased PM volume. Also, with the decreased voice coil diameter of the combined PM type, the 1st resonant mode of the diaphragm is excited more efficiently due to concentrated excitation, resulting in a lower and broader frequency range. Therefore, combined PM type microspeakers are more advantageous as the high-performance microspeakers that are essential in the multimedia era.

A Study of MAC Architecture Dynamic cope with channel status (채널상황에 동적 대응하는 MAC의 구조에 대한 연구)

  • Kim, Sook-Young;Kim, Young-Sung;Suk, Jung-Bong
    • Proceedings of the Korean Information Science Society Conference / 2005.07a / pp.463-465 / 2005
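  • This paper studies a scheme for improving the QoS of voice traffic, a form of real-time data, in a wireless LAN environment based on the EDCA mode of the 802.11e MAC. The scheme dynamically observes the channel state, predicts the network condition, and decides whether to take part in channel contention; when contention is heavy, it can give voice traffic a larger, differentiated weight, which contributes to better overall voice traffic performance. When the slot utilization (SU) is computed to obtain the probability of transmission (PT), applying the existing DCC algorithm unchanged to 802.11e applies the same algorithm to all four access categories (ACs), which makes it difficult to support differentiated QoS, the core of 802.11e. Whereas the existing DCC algorithm derives PT from the retry count alone, the proposed scheme follows the 802.11e structure and computes the transmission probability (PT) separately for each of the four ACs. In addition, the retry count is taken into account so that packets with many retries obtain a higher PT value, and when the maximum allowed retry count is reached, the PT value falls in the range of the next higher AC, which has the effect of upgrading the AC. A lower bound and an upper bound are set for each AC, and their relationship to the 802.11e maximum retry count parameter is defined so that a lower AC does not encroach on the range of a higher AC. Furthermore, PT is computed from the accumulated average SU instead of the current SU value, and performance improvements are confirmed in utilization, latency, packet loss, and other overall measures.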

Development of Combined Permanent Magnet Type Microspeakers Used for Mobile Phones (이동통신 단말기용 통합 영구 자석 형태의 마이크로스피커 개발)

  • Lee, Hong-Joo;Hwang, Sang-Moon;Kwon, Joong-Hak;Hwang, Gun-Yong;Yang, Yong-Chang
    • Proceedings of the Korean Society for Noise and Vibration Engineering Conference / 2005.11a / pp.497-502 / 2005
  • In mobile phones of the multimedia era, microspeakers with high-quality sound are essential parts for reproducing human voice in speakerphones and MP3 players. In this paper, two types of microspeakers, the outer permanent magnet (PM) type and the combined PM type, are analyzed using electromagnetic, mechanical, and acoustical analysis and their coupling. For performance comparison, the voice coil diameter is chosen as a design parameter that changes the excitation position and magnet volume for both types. For the combined PM type, the sound pressure level (SPL) is improved compared with the outer PM type because of the increased PM volume. Also, with the decreased voice coil diameter of the combined PM type, the 1st resonant mode of the diaphragm is excited more efficiently due to concentrated excitation, resulting in a lower and broader frequency range. Therefore, combined PM type microspeakers are more advantageous as the high-performance microspeakers that are essential in the multimedia era.