• Title/Summary/Keyword: 음질평가 (sound quality evaluation)

Enhanced Adjustment Strategy of Masking Threshold for Speech Signals in Low Bit-Rate Audio Coding (저전송률 오디오 부호화에서 음성 신호의 성능 개선을 위한 마스킹 임계값 적응기법 향상)

  • Lee, Chang-Heon;Kang, Hong-Goo
    • The Journal of the Acoustical Society of Korea
    • /
    • v.29 no.1
    • /
    • pp.62-68
    • /
    • 2010
  • This paper proposes a new masking threshold adjustment strategy to improve the performance of low bit-rate audio coding for speech signals. After the formant regions are determined, the masking threshold is adjusted using the ratio of each sub-band's energy to the average energy of its formant. More quantization noise is allowed in bands with relatively large energy, while less distortion is allowed in spectral valley regions by allocating more bits to them, which reflects the concept of perceptual weighting widely used in speech coding. Objective speech quality measures verified that the proposed method improves quality for speech input signals compared to the conventional method.
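
The abstract above does not give the adjustment formula, so the following is only a minimal sketch of the general idea: scale each sub-band's masking threshold by the ratio of its energy to the mean energy of its formant region. The function name, the `gamma` exponent, and the representation of formant regions as index ranges are all assumptions made for illustration, not the authors' method.

```python
def adjust_thresholds(thresholds, energies, regions, gamma=0.5):
    """Scale masking thresholds by (band energy / mean region energy) ** gamma.

    thresholds, energies: per-sub-band values (linear power, same length).
    regions: list of (start, end) sub-band index ranges, end exclusive,
             standing in for detected formant regions (hypothetical input).
    gamma: hypothetical exponent controlling how strongly the ratio acts.
    """
    adjusted = list(thresholds)
    for start, end in regions:
        region_energies = energies[start:end]
        mean_energy = sum(region_energies) / len(region_energies)
        if mean_energy <= 0.0:
            continue  # nothing to normalize against in a silent region
        for band in range(start, end):
            ratio = energies[band] / mean_energy
            # ratio > 1 (spectral peak): threshold rises, more noise allowed.
            # ratio < 1 (spectral valley): threshold drops, more bits needed.
            adjusted[band] = thresholds[band] * ratio ** gamma
    return adjusted
```

Bands outside every region keep their original threshold, which keeps the sketch conservative wherever no formant was detected.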

Inter-rater Reliability and Training Effect of the Differential Diagnosis of Speech and Language Disorder for Stroke Patients (뇌졸중 환자의 말, 언어장애 선별에 대한 검사자간 신뢰도 및 훈련효과)

  • Kim, Jung-Wan
    • The Journal of the Korea Contents Association
    • /
    • v.11 no.9
    • /
    • pp.407-413
    • /
    • 2011
  • Distinguishing aphasia in stroke patients and observing its subtle linguistic characteristics requires instruments that provide reliable assessment results, and examiners must be fully familiar with how to use them. This study examined 46 stroke patients for aphasia and assessed the reliability of their diagnoses across examiners from different medical fields. It also compared the reliability before training with that after training. To this end, the 46 stroke patients were assessed for aphasia and for the degree of their speech disorder by 3 groups, each of which consisted of 12 professionals (3 SLPs, 3 neurologists, and 3 nurses). As a result, the speech intelligibility tasks and the voice quality of /ah-/ prolongation received a rating of 'acceptable', and the other sub-tests were rated 'good-excellent' by the experts from different areas of medical expertise. For the tasks rated 'acceptable', the subjects were video-trained for 3 weeks and the differences before and after training were compared. After training, the differences among the examiners' ratings in the speech intelligibility tasks decreased significantly and the accuracy of their voice quality ratings increased significantly. Regarding the correlation between the accuracy of the sub-test ratings and the amount of clinical experience, speech therapists rated the picture description task and the speech intelligibility task more accurately as their experience accumulated, while doctors and nurses rated the picture description tasks more accurately with greater clinical experience. The results of this study suggest that assessing the neurologic-communicative disorders of stroke patients requires ongoing training and experience, especially for speech disorders. It was also found that rating reliability in this case could be improved by training.

Changes in Acoustic Parameters According to Intensity Increase in Voice Assessment (음성질환자의 음성검사 시 강도 증가에 따른 음향학적 지표의 변화)

  • Nam, Do-Hyun;Rheem, Sung-Sue;Yun, Bo-Ram;Cho, Sun-A;Choi, Hong-Shik
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics
    • /
    • v.22 no.2
    • /
    • pp.143-150
    • /
    • 2011
  • Background and Objectives: Acoustic analysis is widely used clinically as a tool for voice assessment before and after surgery or voice treatment. In clinical practice, however, acoustic parameters vary according to how the assessment is performed. With patients with voice disorders as subjects, we therefore investigate how an increase in intensity influences acoustic parameters and how to reduce the variation caused by the assessment method. Materials and Methods: At the voice clinic of the department of otorhinolaryngology in Gangnam Severance Hospital, 30 female patients (mean age 40.6 years) and 23 male patients (mean age 40.1 years) with voice disorders were assessed with the Dr Speech vocal-assessment program. For each acoustic parameter, we statistically tested the significance of the difference between the "Ah" vowel produced with a normal voice and the "Ah" vowel produced with a loud voice. Results: The acoustic parameters that differed significantly with increased intensity were Jitter, SD F0, and NNE for females, and Jitter, SD F0, HNR, SNR, and NNE for males. Voice quality estimates differed significantly with increased intensity for female hoarse voice, female breathy voice, and male breathy voice. Conclusion: Acoustic analysis, which is generally used for voice assessment before and after surgery or voice treatment, showed a tendency for acoustic parameters to improve as intensity increased, except in cases where the voice disease was severe. Thus, to raise the reliability of voice assessment, the range of intensity needs to be specified, which should be a topic for future research.
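
As a rough illustration of two of the parameters named above, the sketch below computes jitter and the standard deviation of F0 from a list of glottal cycle lengths. It assumes the common "local jitter" definition (mean absolute difference between consecutive periods, relative to the mean period); the exact definitions used by the Dr Speech program are not stated in the abstract and may differ.

```python
import statistics


def local_jitter_percent(periods):
    """Local jitter in percent from consecutive pitch periods (seconds)."""
    if len(periods) < 2:
        raise ValueError("need at least two pitch periods")
    # Cycle-to-cycle variation: absolute difference of neighbouring periods.
    diffs = [abs(b - a) for a, b in zip(periods[:-1], periods[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_period = sum(periods) / len(periods)
    return 100.0 * mean_diff / mean_period


def f0_standard_deviation(periods):
    """Sample standard deviation of F0 (Hz), with F0 taken as 1 / period."""
    if len(periods) < 2:
        raise ValueError("need at least two pitch periods")
    return statistics.stdev(1.0 / p for p in periods)
```

Running both functions on the normal-voice and loud-voice recordings of the same patient would give the paired values whose difference the study tests statistically.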

A Pre-Selection of Candidate Units Using Accentual Characteristic In a Unit Selection Based Japanese TTS System (일본어 악센트 특징을 이용한 합성단위 선택 기반 일본어 TTS의 후보 합성단위의 사전선택 방법)

  • Na, Deok-Su;Min, So-Yeon;Lee, Kwang-Hyoung;Lee, Jong-Seok;Bae, Myung-Jin
    • The Journal of the Acoustical Society of Korea
    • /
    • v.26 no.4
    • /
    • pp.159-165
    • /
    • 2007
  • In this paper, we propose a new pre-selection of candidate units suited to a unit selection based Japanese TTS system. General pre-selection methods calculate a context-dependent cost within the IP (intonation phrase). Unlike other languages, however, Japanese has an accent represented as the height of a relative pitch, and several words form a single accentual phrase (AP). Also, the prosody in Japanese changes in accentual phrase units. By reflecting such prosodic change in pre-selection, the quality of synthesized speech can be improved. Furthermore, calculating the context-dependent cost within the accentual phrase makes synthesis faster than calculating it within the intonation phrase. The proposed method defines the AP, analyzes the AP in context, and performs pre-selection using accentual phrase matching, which calculates the CCL (connected context length) of the candidates for each phoneme to be synthesized in each accentual phrase. The baseline system used in the proposed method is VoiceText, a synthesizer from Voiceware. Evaluations covered perceptual errors (intonation error, concatenation mismatch error) and synthesis time. Experimental results showed that the proposed method improved the quality of synthesized speech and shortened the synthesis time.

Evaluation of a signal segregation by FDBM (FDBM의 음원분리 성능평가)

  • Lee, Chai-Bong
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.8 no.12
    • /
    • pp.1793-1802
    • /
    • 2013
  • Various approaches for sound source segregation have been proposed. Among them, the frequency domain binaural model (FDBM) has the advantages of low computational load and effective howling cancellation. A binaural hearing assistance system based on FDBM has been proposed, which can enhance a desired signal based on directivity information. Although FDBM has been evaluated in terms of signal-to-noise ratio (SNR) and coherence function, the evaluation results do not always agree with human impressions. These evaluation methods provide physical measures and do not take human perception into account. Since a binaural hearing assistance system is one of the major applications, the quality of the segregated sound should be kept at a sufficient level. In this paper, the signal segregation performance of FDBM is evaluated by three objective methods, namely SNR, coherence, and Perceptual Evaluation of Speech Quality (PESQ), to discuss the characteristics of FDBM in sound source segregation. The simulation results show that FDBM improves the quality of the left and right channel signals to an equivalent level. The results also suggest that PESQ may provide a more useful measure than SNR and coherence of the segregation performance of FDBM. The PESQ results show the effects of the segregation parameters and indicate appropriate parameters under the conditions.

The Estimation of Subjective Evaluations for Impact Sound and Analysis of the Effects for Parts of a Car (자동차 임팩트 사운드에 대한 주관적 평가 및 차량 개발에 응용)

  • Na, Eun-Woo;Park, Sang-Won;Kim, Ho-Wok;Lee, Sang-Kwon;Lee, Kyung-Hoi;Shin, Young-Gon;Bae, Byung-Kook
    • Proceedings of the Korean Society for Noise and Vibration Engineering Conference
    • /
    • 2009.10a
    • /
    • pp.137-142
    • /
    • 2009
  • Impact noise is induced in a car when it is driven on a harsh road or over bumps. This noise occurs at a very high sound level and affects passengers. Although it is impossible to remove such noise completely, it is necessary to research ways of improving the sound quality of impact noise. A new sound metric for impact sound was presented in previous work and verified by comparing mean subjective ratings with several sound metrics. In this paper, more objective attributes are considered, namely those expressing the level and modulation of sound. Three sound metrics are employed to obtain impact sound indexes for each course by multiple linear regression. The indexes are verified by considering the correlation between the values estimated from the multiple linear regressions and the mean subjective ratings given by evaluators. The subjective ratings are also estimated from the indexes for the case in which some parts of the suspension system are changed. The estimated ratings represent more reasonable or acceptable ratings. Thus, such indexes can be used to modify parts of the suspension system with good sound quality in mind.
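
The regression step described above can be sketched as an ordinary least-squares fit of mean subjective ratings against a few sound metrics. This is a minimal illustration using the normal equations in plain Python; the function names are hypothetical, and the abstract does not say which metrics or which fitting procedure the authors actually used.

```python
def fit_linear_index(metrics, ratings):
    """Least-squares fit of ratings ~ b0 + b1*m1 + b2*m2 + ...

    metrics: list of per-sound metric tuples (for example three metrics each).
    ratings: mean subjective rating for each sound.
    Returns the coefficients [b0, b1, b2, ...].
    """
    rows = [[1.0] + [float(m) for m in row] for row in metrics]
    k = len(rows[0])
    # Augmented normal equations [X^T X | X^T y].
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * y for r, y in zip(rows, ratings))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("metrics are collinear; the fit is not unique")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]


def predict_rating(coeffs, metric_row):
    """Estimated subjective rating for one sound from its metrics."""
    return coeffs[0] + sum(c * m for c, m in zip(coeffs[1:], metric_row))
```

Once fitted, `predict_rating` can be applied to metrics measured after a design change, which mirrors how the abstract estimates ratings for modified suspension parts.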

Design of the Noise Suppressor Using Wavelet Transform (웨이블릿 변환을 이용한 잡음제거기 설계)

  • 원호진;김종학;이인성
    • The Journal of the Acoustical Society of Korea
    • /
    • v.20 no.7
    • /
    • pp.37-46
    • /
    • 2001
  • This paper proposes a new noise suppression method using wavelet transform analysis. The noise suppressor using the wavelet transform is more effective in babble noise than one using the short-time Fourier transform. We designed a new channel structure based on spectral subtraction of wavelet transform coefficients and used a wavelet mask pattern with higher time resolution at high frequencies. It showed good adaptation capability for babble noise, which is non-stationary. To evaluate the performance of the proposed noise canceller, informal subjective listening tests (MOS tests) were performed in background noise environments of mobile communication (car noise, street noise, babble noise). In the informal listening tests, the proposed noise suppression algorithm showed an improvement of about 0.2 MOS over the noise suppression algorithm of EVRC. The noise reduction achieved by the proposed method was shown in spectrograms of the speech signal.

Real-time Implementation of a GSM-EFR Speech Coder on a 16 Bit Fixed-point DSP (16 비트 고정 소수점 DSP를 이용한 GSM-EFR 음성 부호화기의 실시간 구현)

  • 최민석;변경진;김경수
    • The Journal of the Acoustical Society of Korea
    • /
    • v.19 no.7
    • /
    • pp.42-47
    • /
    • 2000
  • This paper describes a real-time implementation of a GSM-EFR (Global System for Mobile communications Enhanced Full Rate) speech coder using the OakDSP core, a 16-bit fixed-point Digital Signal Processor (DSP) by DSP Group, Inc. The real-time implementation required about 24 MIPS for computation, and 7.06K words and 12.19K words for code and data memory, respectively. The implemented GSM-EFR speech coder passes all of the test vectors provided by ETSI (European Telecommunications Standards Institute), and perceptual speech quality measurement using the MNB algorithm shows that the quality of the GSM-EFR speech coder is similar to that of 32 kbps ADPCM. The real-time GSM-EFR speech coder, which is the highest bit-rate mode of the GSM-AMR speech coder, will be used as the basic structure of the GSM-AMR speech coder embedded in the MODEM ASIC of an IMT2000 asynchronous-mode mobile station.

The Study of Comparison between RPE-LTP and VSELP Speech Coder (RPE-LTP와 VSELP 음성부호화기의 비교에 관한 연구)

  • 박대덕;김화준;심재훈;유재희;정하봉;서정하
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.19 no.9
    • /
    • pp.1838-1847
    • /
    • 1994
  • Until recently, North America, Europe, Japan, and other regions have decided on standards for digital mobile communication speech coding and competitively developed more detailed techniques, but we have not yet determined one. In this paper, we compared the RPE-LTP speech coding algorithm, the standard in Europe, with the VSELP speech coding algorithm, the standard in North America, with respect to source coding. We carried out a comprehensive verification and comparison of each speech coder and discussed improvement plans. Next, we compared the number of computations, which seriously affects real-time processing. Moreover, we implemented the algorithm of each speech coder and performed simulations with Korean speech data. Finally, we compared the performance of each speech coder using segmental SNR and a 5-point MOS. The computation count showed that the VSELP speech encoder required the largest number of multiplications. With 26 speech data items, the segmental SNR of VSELP was higher than that of RPE-LTP. In the 5-point MOS test, the basic speech quality of VSELP was equivalent to or better than that of RPE-LTP.
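
Segmental SNR, one of the two measures used in the comparison above, averages per-frame SNR values in dB instead of computing a single SNR over the whole utterance. The sketch below is a minimal version under stated assumptions: a 160-sample frame (20 ms at 8 kHz) and clamping of each frame to the range -10 dB to 35 dB are common conventions, not values taken from the paper.

```python
import math


def segmental_snr(reference, decoded, frame_len=160,
                  floor_db=-10.0, ceil_db=35.0):
    """Mean per-frame SNR in dB between a reference and a decoded signal.

    frame_len, floor_db and ceil_db are assumed defaults, not the paper's.
    """
    n = min(len(reference), len(decoded))
    frame_snrs = []
    for start in range(0, n - frame_len + 1, frame_len):
        ref = reference[start:start + frame_len]
        dec = decoded[start:start + frame_len]
        signal = sum(x * x for x in ref)
        noise = sum((x - y) ** 2 for x, y in zip(ref, dec))
        if signal == 0.0:
            continue  # skip silent frames, which would give log10(0)
        if noise == 0.0:
            snr_db = ceil_db  # perfect reconstruction of this frame
        else:
            snr_db = 10.0 * math.log10(signal / noise)
        # Clamp so a few extreme frames do not dominate the average.
        frame_snrs.append(max(floor_db, min(ceil_db, snr_db)))
    if not frame_snrs:
        raise ValueError("no usable frames")
    return sum(frame_snrs) / len(frame_snrs)
```

Because each frame contributes equally, quiet passages weigh as much as loud ones, which is why segmental SNR tends to track perceived coder quality better than a global SNR.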

Evaluation of Sound Quality for Ergonomic Design of Movable Parts in a Refrigerator (냉장고 동작부품의 소음특성 분석을 통한 감성품질 개선)

  • Kang, Seong Yeop;So, Sae Rom;Kim, Gun Ou;Kim, Ji Hoon;Park, Sang Hu
    • Journal of the Korean Society of Manufacturing Process Engineers
    • /
    • v.17 no.6
    • /
    • pp.7-15
    • /
    • 2018
  • We propose a method for evaluating sound quality quantitatively in order to develop high-level home appliances (HA). A refrigerator generally has diverse movable parts such as sliders, drawers, and folding shelves. Engineering treatment to control noise quality is therefore considered one of the key technologies for a higher-level refrigerator. Among the movable parts, we selected as an example the folding shelf, which is commonly installed inside a home refrigerator to make space more convenient. However, its noise level is known to be very high compared to other movable parts during folding or unfolding. To evaluate and compare the noise quality, we tested eighteen different models and suggested an impact sound quality index (ISQI) based on subjective evaluation data obtained experimentally from thirty-two evaluators. The ISQI was formulated using three sound quality elements (noise peak, rise time, impact duration) to determine psycho-acoustic properties. Through this work, we developed an evaluation process and an ISQI whose usefulness was verified by comparing the evaluators' personal perceptions with the values predicted by the ISQI. The two showed good agreement, so we believe that the proposed method and ISQI can be used to control the noise quality of HA effectively.