• Title/Summary/Keyword: perceptual distortion

A Study on Objective Speech Quality Measure under CDMA Telephone Networks Environment (CDMA 통신망에서의 객관적 음질 평가 척도에 관한 연구)

  • 김광수;김민정;석수영;정호열;정현열
    • Journal of the Institute of Convergence Signal Processing / v.2 no.4 / pp.53-58 / 2001
  • In this paper, to develop an objective speech quality measure for CDMA telephone network environments, recently developed measures are first investigated. These measures, however, perform poorly in CDMA telephone networks. To solve this problem, a new objective speech quality measure that adopts a noise masking threshold is proposed and studied. For better performance, a scaled noise masking threshold calculated for speech signals is employed instead of the conventional one for tone signals. To verify the effectiveness of the proposed method, performance comparison experiments are carried out on CDMA telephone network speech databases; the results show that the proposed method performs better than existing measures.

A Study on the Expression by Anamorphose Phenomenon (아나모르포즈(anamorphose)지각현상에 의한 공간 표현 연구)

  • Lee, Jeong-Yoon;Kim, Kai-Chun
    • Korean Institute of Interior Design Journal / v.23 no.4 / pp.63-71 / 2014
  • Anamorphosis is highly favored today, as the pursuit of unconventional styles grows and the transformation and distortion of images become freely available. This research aims to understand the effect of such distorted images on spatial design and the close connection between anamorphosis and visual perception, and to identify the new perceptual phenomena it creates and the methods of expressing them. Four methods of expression were defined by studying anamorphosis based on Niceron's definition, examining artworks such as paintings and photographs, and conducting case studies of example spaces from visual perception experiments. The expression of anamorphosis through visual perception is broadly categorized as directional, dimensional, flatness, and optical. The analysis of 10 case projects suggests that the experimental spaces offer the joy of finding and interpreting metaphorical forms and meanings produced by these four categories. They also artificially reveal the boundaries between real and virtual space in 2-dimensional or 3-dimensional spaces, and form hyper-boundaries, new experiences, and an internal mechanism that is vague and chaotic. This research therefore concludes that anamorphosis, a distorted perspective, is not merely a means of overcoming perspectival errors but is well suited to the current era and will extend its potential and value in spatial design.

Isolated-Word Speech Recognition in Telephone Environment Using Perceptual Auditory Characteristic (인지적 청각 특성을 이용한 고립 단어 전화 음성 인식)

  • Choi, Hyung-Ki;Park, Ki-Young;Kim, Chong-Kyo
    • Journal of the Institute of Electronics Engineers of Korea TE / v.39 no.2 / pp.60-65 / 2002
  • In this paper, we propose the GFCC (gammatone filter frequency cepstrum coefficient) parameter, which is based on auditory characteristics, to achieve a better speech recognition rate. Speech recognition experiments are performed on isolated words acquired from the telephone network. To compare the GFCC parameter with other parameters, recognition experiments are also carried out using MFCC and LPCC parameters. In addition, each parameter is tested with and without CMS (cepstral mean subtraction), which compensates for channel distortion in the telephone network. The experimental results show that the recognition rate using the GFCC parameter is better than that of the other parameters.

Front-End Processing for Speech Recognition in the Telephone Network (전화망에서의 음성인식을 위한 전처리 연구)

  • Jun, Won-Suk;Shin, Won-Ho;Yang, Tae-Young;Kim, Weon-Goo;Youn, Dae-Hee
    • The Journal of the Acoustical Society of Korea / v.16 no.4 / pp.57-63 / 1997
  • In this paper, we study an efficient feature vector extraction method and front-end processing to improve the performance of a speech recognition system, using the KT (Korea Telecommunication) database collected through various telephone channels. First, we compare the recognition performance of feature vectors known to be robust to noise and environmental variation, and verify the performance enhancement obtained with weighted cepstral distance measures. The experimental results show that the recognition rate increases with both PLP (Perceptual Linear Prediction) and MFCC (Mel Frequency Cepstral Coefficient) in comparison with the LPC cepstrum used in the KT recognition system. Among cepstral distance measures, weighted functions such as RPS (Root Power Sums) and BPL (Band-Pass Lifter) improve recognition. Applying the spectral subtraction method decreases the recognition rate because of the distortion it introduces. However, RASTA (RelAtive SpecTrAl) processing, CMS (Cepstral Mean Subtraction), and SBR (Signal Bias Removal) enhance recognition performance. In particular, the CMS method is simple yet yields a large improvement in recognition. Finally, the performance of modified methods for real-time implementation of CMS is compared, and an improved method is suggested to prevent performance degradation.
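The CMS step mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the textbook utterance-level form of cepstral mean subtraction (subtract the per-coefficient mean over all frames), not the real-time variants the paper compares. The function name and the `(frames, coefficients)` array layout are illustrative assumptions.

```python
import numpy as np


def cepstral_mean_subtraction(cepstra):
    """Subtract the per-coefficient mean over all frames of one utterance.

    `cepstra` is assumed to have shape (n_frames, n_coefficients).
    """
    cepstra = np.asarray(cepstra, dtype=float)
    return cepstra - cepstra.mean(axis=0, keepdims=True)


# A fixed linear channel appears as a constant additive offset in the
# cepstral domain, so CMS removes it (synthetic data, illustration only).
rng = np.random.default_rng(0)
clean = rng.normal(size=(50, 13))     # stand-in for clean cepstral frames
channel = rng.normal(size=13)         # stand-in for a constant channel offset
observed = clean + channel

same = np.allclose(cepstral_mean_subtraction(clean),
                   cepstral_mean_subtraction(observed))
print("channel offset removed:", same)
```

Because the mean is taken over the whole utterance, this form needs the complete signal before it can normalize any frame, which is why real-time use calls for modified schemes such as those the paper evaluates.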

An Objective Estimation for Simulating of Asymmetrical Auditory Filter of the Hearing Impaired According to Hearing Loss Degree (난청인의 난청 정도에 따른 비대칭 청각 필터 구현의 객관적 평가)

  • Joo, S.I.;Jeon, Y.Y.;Song, Y.R.;Lee, S.M.
    • Journal of rehabilitation welfare engineering & assistive technology / v.3 no.1 / pp.27-34 / 2009
  • The shape of hearing loss varies from person to person among the hearing impaired, so the existing method based on symmetrical auditory filters over frequency bands could not properly simulate their varied hearing loss shapes. The shapes of auditory filters are asymmetrical and differ with each center frequency and each input level. The auditory filter of a hearing-impaired person differs from that of a normal-hearing person, so speech passed through it has a different quality value. In this study, an asymmetrical auditory filter was simulated, and tests were then performed to estimate the filter's performance objectively. To evaluate the simulated auditory filter, the experiments applied perceptual evaluation of speech quality (PESQ) and log likelihood ratio (LLR) to speech passed through the filter. In the tests, the objective speech quality and distortion of the processed speech were evaluated using the PESQ and LLR values. When hearing loss was processed, the PESQ and LLR values differed greatly between the symmetrical and asymmetrical auditory filters. This means that a difference in auditory filter shape may affect speech quality. In particular, when hearing loss was present, an auditory filter whose asymmetrical shape changes with each center frequency affected the speech quality perceived by the hearing impaired.

Attention-induced expansion in visual space (주의에 의한 시각 공간 확장)

  • 유명현;박정선;정찬섭
    • Korean Journal of Cognitive Science / v.10 no.3 / pp.51-66 / 1999
  • Selective attention induces perceptual distortions, ranging from repulsion of objects located near the attended area (Suzuki & Cavanagh, 1997) to magnification of the unattended objects (Tsal & Shalev, 1996). Two hypothetical mechanisms have been postulated: a shift of receptive fields' positions away from the locus of attention (receptive-field-recruitment hypothesis) or the enlargement of perceived space around the attended location (space-enlargement hypothesis). The present study distinguished between these hypotheses by investigating the spatial and temporal properties of attention-induced distortions. Perceptual judgements on vernier alignment, line tilt, and line length were used to measure attention-induced changes in perception. Attention was induced exogenously (by blinking a specific set of dots around the test stimuli) or endogenously (by instructing the subject to selectively attend to the dots). After attention was induced, the test stimuli were briefly flashed. A staircase method was used to measure the attentional effect. A vertical line was perceived as repelled from the locus of attention, and a line segment appeared longer when attention was given to its vicinity. The effects decreased as either the distance from the locus of attention or the time between the onset of attention and the stimulus presentation increased. The results imply that the space-enlargement hypothesis provides a better explanation for the attention-induced changes in perception than the receptive-field-recruitment hypothesis.

Reversible Watermarking based on Predicted Error Histogram for Medical Imagery (의료 영상을 위한 추정오차 히스토그램 기반 가역 워터마킹 알고리즘)

  • Oh, Gi-Tae;Jang, Han-Byul;Do, Um-Ji;Lee, Hae-Yeoun
    • KIPS Transactions on Software and Data Engineering / v.4 no.5 / pp.231-240 / 2015
  • Medical imagery requires privacy protection while preserving the quality of the original content, and reversible watermarking is a solution for this purpose. Previous research has focused on general imagery and achieved high capacity and high quality. However, those methods introduce distortion over the entire image and hence are not applicable to medical imagery, which requires the quality of the objects to be preserved. In this paper, we propose a novel reversible watermarking algorithm for medical imagery that preserves the quality of the objects and achieves high capacity. First, the object and background regions are segmented, and then predicted error histogram-based reversible watermarking is applied to each region. To embed the watermark efficiently with small distortion in the object region, the embedding level is set low in the object region and high in the background region. In experiments, the proposed algorithm is compared with the previous predicted error histogram-based algorithm in terms of embedding capacity and perceptual quality. The results show that the proposed algorithm outperforms the previous algorithm.
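The general idea behind prediction-error histogram shifting can be illustrated with a deliberately simplified sketch. It is not the paper's algorithm: it works on a single 1-D row of pixels, uses the left neighbour as an assumed predictor, embeds only in the zero-error bin, and ignores overflow at the top of the pixel range. The region segmentation and per-region embedding levels described in the abstract are not modelled. The function names `embed` and `extract` are hypothetical.

```python
def embed(pixels, bits):
    """Hide bits in a 1-D pixel row by shifting the prediction-error histogram.

    Assumptions: predictor = original left neighbour, payload goes into the
    zero-error bin, and no pixel is close enough to the maximum to overflow.
    Returns the marked row and the number of bits actually embedded.
    """
    marked = [pixels[0]]                    # first pixel is left untouched
    used = 0
    for i in range(1, len(pixels)):
        error = pixels[i] - pixels[i - 1]   # prediction error
        if error == 0 and used < len(bits):
            error += bits[used]             # zero bin carries one payload bit
            used += 1
        elif error > 0:
            error += 1                      # shift to free the bin next to zero
        marked.append(pixels[i - 1] + error)
    return marked, used


def extract(marked, n_bits):
    """Recover the payload and restore the original row exactly."""
    restored = [marked[0]]
    bits = []
    for i in range(1, len(marked)):
        error = marked[i] - restored[i - 1]  # predictor uses restored pixels
        if error in (0, 1):
            if len(bits) < n_bits:
                bits.append(error)
            error = 0
        elif error > 1:
            error -= 1                       # undo the histogram shift
        restored.append(restored[i - 1] + error)
    return restored, bits


row = [100, 100, 101, 100, 100, 99, 99, 103]   # made-up pixel values
marked_row, n_embedded = embed(row, [1, 0, 1])
restored_row, payload = extract(marked_row, n_embedded)
print(restored_row == row, payload)
```

Each pixel changes by at most one gray level in this sketch, and the decoder rebuilds the original row sequentially because it can recompute every prediction from pixels it has already restored.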

A Study on Development of a Hearing Impairment Simulator considering Frequency Selectivity and Asymmetrical Auditory Filter of the Hearing Impaired (난청인의 주파수 선택도와 비대칭적 청각 필터를 고려한 난청 시뮬레이터 개발에 관한 연구)

  • Joo, Sang-Ick;Kang, Hyun-Deok;Song, Young-Rok;Lee, Sang-Min
    • The Transactions of The Korean Institute of Electrical Engineers / v.59 no.4 / pp.831-840 / 2010
  • In this paper, we propose a hearing impairment simulator that considers the reduced frequency selectivity and asymmetrical auditory filter of the hearing impaired, and we verify through experiments that reduced frequency selectivity and the asymmetrical auditory filter affect speech perception. The reduced frequency selectivity is implemented by spectral smearing using LPC (linear prediction coding). The shapes of auditory filters are asymmetrical and differ with each center frequency. The auditory filter of a hearing-impaired person differs from that of a normal-hearing person, so speech passed through it has a different quality value. The experiments consist of subjective and objective tests. The subjective experiments comprise four kinds of tests: a pure tone test, an SRT (speech reception threshold) test, a WRS (word recognition score) test without spectral smearing, and a WRS test with spectral smearing. The simulator was tested on 9 subjects with normal hearing. The amount of spectral smearing was controlled by the LPC order. The asymmetrical auditory filter of the proposed simulator was simulated, and tests were then performed to estimate the filter's performance objectively. The objective experiment applied PESQ (perceptual evaluation of speech quality) and LLR (log likelihood ratio) to speech passed through the auditory filter to evaluate the simulated filter's performance. The objective speech quality and distortion of the processed speech were evaluated using the PESQ and LLR values. When hearing loss is processed, the PESQ and LLR values differ greatly depending on the asymmetrical auditory filter in the hearing impairment simulator.

Improved CycleGAN for underwater ship engine audio translation (수중 선박엔진 음향 변환을 위한 향상된 CycleGAN 알고리즘)

  • Ashraf, Hina;Jeong, Yoon-Sang;Lee, Chong Hyun
    • The Journal of the Acoustical Society of Korea / v.39 no.4 / pp.292-302 / 2020
  • Machine learning algorithms have made immense contributions in various fields, including sonar and radar applications. The recently developed Cycle-Consistency Generative Adversarial Network (CycleGAN), a variant of GAN, has been successfully used for unpaired image-to-image translation. We present a modified CycleGAN for translation of underwater ship engine sounds with high perceptual quality. The proposed network is composed of an improved generator model trained to translate underwater audio from one vessel type to another, an improved discriminator to identify the data as real or fake, and a modified cycle-consistency loss function. Quantitative and qualitative analyses of the proposed CycleGAN are performed on the publicly available underwater dataset ShipsEar by evaluating Mel-cepstral distortion, pitch contour matching, nearest neighbor comparison, and mean opinion score, and comparing them with existing algorithms. The analysis results demonstrate the effectiveness of the proposed network.
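One of the metrics named in the abstract above, Mel-cepstral distortion, is commonly computed as sketched below. This is the widely used definition in dB, averaged over frames, and not necessarily the exact variant used in the paper. The sketch assumes the two sequences are already time-aligned and excludes the 0th (energy) coefficient, both of which are conventions assumed here.

```python
import numpy as np


def mel_cepstral_distortion(reference, converted):
    """Mean Mel-cepstral distortion in dB between two aligned sequences.

    Both inputs are assumed to have shape (n_frames, n_coefficients), with
    column 0 holding the energy term, which is excluded by convention.
    """
    reference = np.asarray(reference, dtype=float)
    converted = np.asarray(converted, dtype=float)
    if reference.shape != converted.shape:
        raise ValueError("sequences must be time-aligned and equally sized")
    diff = reference[:, 1:] - converted[:, 1:]
    per_frame = (10.0 / np.log(10.0)) * np.sqrt(2.0 * np.sum(diff ** 2, axis=1))
    return float(per_frame.mean())


# Synthetic example: identical sequences give zero distortion, and the
# score grows as the converted sequence drifts away from the reference.
rng = np.random.default_rng(0)
ref = rng.normal(size=(40, 25))
print(mel_cepstral_distortion(ref, ref))
print(mel_cepstral_distortion(ref, ref + 0.1))
```

Lower values indicate that the converted audio is spectrally closer to the reference. In practice the frames are often aligned first (for example with dynamic time warping), which this sketch leaves out.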

An Objective Speech Quality Measure using Masking Effect under Digital Mobile Telephone Network Environment (디지털 이동통신망 환경 하에서 마스킹 효과를 이용한 객관적 음질 평가 척도)

  • 김광수;김민정;석수영;정호열;정현일
    • Journal of Korea Multimedia Society / v.5 no.4 / pp.405-414 / 2002
  • In this paper, we propose a new objective speech quality measure that uses a noise masking threshold for speech quality assessment in mobile telephone network environments, and verify the effectiveness of the proposed method through experiments. To this end, well-known objective speech quality measures such as BSD and PSQM are first evaluated in digital mobile telephone network environments. However, these conventional methods do not perform as well in mobile network environments as the results reported in the literature. To be a more effective objective speech quality measure in mobile telephone environments, the proposed method employs the human psychoacoustic masking effect. The DMOS, instead of the MOS, is used as the subjective speech quality measure for performance evaluation. The performance comparison is carried out with speech data collected from digital mobile telephone environments. As a result, the proposed measure shows on average 4% higher performance, in terms of correlation, than existing objective speech quality measures such as BSD and PSQM.
