• Title/Summary/Keyword: Perceptual model

Content Adaptive Watermarking Using a Stochastic Visual Model Based on Multiwavelet Transform

  • Kwon, Ki-Ryong;Kang, Kyun-Ho;Kwon, Seong-Geun;Moon, Kwang-Seok;Lee, Joon-Jae
    • Proceedings of the IEEK Conference / 2002.07c / pp.1511-1514 / 2002
  • This paper presents content-adaptive image watermark embedding using a stochastic visual model based on the multiwavelet transform. To embed the watermark, the original image is decomposed into four levels using a discrete multiwavelet transform, and the watermark is then embedded within the JND (just noticeable difference) of each subband of the image. The perceptual model is applied with a stochastic approach to watermark embedding, based on computing an NVF (noise visibility function) that reflects local image properties. With the content-adaptive algorithm, the perceptual model embeds the watermark more strongly in texture and edge regions, as permitted by the JND. Because the watermark has noise-like properties, the method uses a stationary generalized Gaussian model. Simulation results show that the proposed embedding method achieves excellent invisibility and robustness.

  • PDF
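
The NVF-driven embedding described in this abstract can be illustrated with a minimal sketch. It makes several assumptions not taken from the paper: it works on raw pixels rather than multiwavelet subbands, it swaps in the simpler local-variance (Gaussian) form of the NVF instead of the stationary generalized Gaussian form, and the names `nvf`, `embed`, `s_texture`, `s_flat`, and the constant `d=75.0` are illustrative choices.

```python
def local_variance(img, radius=1):
    """Local variance per pixel over a (2*radius+1)^2 window, clipped at borders."""
    h, w = len(img), len(img[0])
    var = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [img[j][i]
                    for j in range(max(0, y - radius), min(h, y + radius + 1))
                    for i in range(max(0, x - radius), min(w, x + radius + 1))]
            mean = sum(vals) / len(vals)
            var[y][x] = sum((v - mean) ** 2 for v in vals) / len(vals)
    return var


def nvf(img, d=75.0):
    """Noise visibility function: near 1 in flat regions, near 0 at edges/texture.

    Assumption: simplified form NVF = 1 / (1 + theta * local_variance),
    with theta = d / max(local_variance). d is a hypothetical tuning constant.
    """
    var = local_variance(img)
    vmax = max(max(row) for row in var)
    theta = d / vmax if vmax > 0 else 0.0
    return [[1.0 / (1.0 + theta * v) for v in row] for row in var]


def embed(img, wm, s_texture=8.0, s_flat=1.0):
    """Add watermark wm with strength s_texture where NVF is low, s_flat where it is high."""
    n = nvf(img)
    return [[img[y][x] + ((1.0 - n[y][x]) * s_texture + n[y][x] * s_flat) * wm[y][x]
             for x in range(len(img[0]))]
            for y in range(len(img))]
```

In this sketch a pixel next to a strong edge receives close to the full `s_texture` strength, while a pixel in a flat area receives only `s_flat`, which is the content-adaptive behavior the abstract describes.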

A study on sound source segregation of frequency domain binaural model with reflection (반사음이 존재하는 양귀 모델의 음원분리에 관한 연구)

  • Lee, Chai-Bong
    • Journal of the Institute of Convergence Signal Processing / v.15 no.3 / pp.91-96 / 2014
  • Among methods for sound source localization and separation, the Frequency Domain Binaural Model (FDBM) offers low computational cost and high separation performance. It localizes and separates sound sources by computing the Interaural Phase Difference (IPD) and Interaural Level Difference (ILD) in the frequency domain. In practical environments, however, reflections become a problem. To reduce their effect, a method is presented that simulates the localization of the direct sound, detects the first-arriving sound, determines its direction, and then separates the source. Simulation results show that the estimated direction lies within 10% of the sound source direction and that, in the presence of reflections, source separation improves, with higher coherence and PESQ (Perceptual Evaluation of Speech Quality) scores and lower directional damping than the existing FDBM. Without reflections, the degree of separation was low.
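
The two binaural cues named in this abstract can be computed per frequency bin as in the sketch below. This is only an illustration of IPD and ILD themselves, not of the FDBM or its reflection handling; the function names and the sign convention (IPD as the phase of left times the conjugate of right) are assumptions.

```python
import cmath
import math


def dft_bin(x, k):
    """Single DFT coefficient X[k] of a real sequence x (direct summation)."""
    n = len(x)
    return sum(v * cmath.exp(-2j * math.pi * k * i / n) for i, v in enumerate(x))


def ipd_ild(left, right, k):
    """Return (IPD in radians, ILD in dB) at DFT bin k.

    Assumed convention: IPD = phase(L * conj(R)), ILD = 20*log10(|L| / |R|).
    """
    spec_l = dft_bin(left, k)
    spec_r = dft_bin(right, k)
    ipd = cmath.phase(spec_l * spec_r.conjugate())
    ild = 20.0 * math.log10(abs(spec_l) / abs(spec_r))
    return ipd, ild
```

For example, if the right-ear signal is the left-ear tone attenuated by half and lagging by 0.5 rad, the sketch reports an IPD of about 0.5 rad and an ILD of about 6 dB at the tone's bin.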

Performance comparison evaluation of speech enhancement using various loss functions (다양한 손실 함수를 이용한 음성 향상 성능 비교 평가)

  • Hwang, Seo-Rim;Byun, Joon;Park, Young-Cheol
    • The Journal of the Acoustical Society of Korea / v.40 no.2 / pp.176-182 / 2021
  • This paper evaluates and compares the performance of Deep Neural Network (DNN)-based speech enhancement models trained with various loss functions. As the baseline model, we used a complex-valued network that can take the phase information of speech into account. We consider two basic loss functions, the Mean Squared Error (MSE) and the Scale-Invariant Source-to-Noise Ratio (SI-SNR), and two perceptually motivated loss functions, the Perceptual Metric for Speech Quality Evaluation (PMSQE) and the Log Mel Spectra (LMS). Performance was compared through objective evaluation and listening tests on outputs obtained with various combinations of these loss functions. The results show that combining a perceptually motivated loss function with MSE or SI-SNR improves overall performance, and that the perceptually motivated loss functions performed better in the listening test even when their objective scores were lower.
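
The two basic losses and the idea of combining one of them with a perceptual term can be sketched as follows. This is a plain-Python illustration under stated assumptions: the additive weighting in `combined_loss`, the `weight` parameter, and the `perceptual_loss` callable are hypothetical, and neither PMSQE nor LMS is implemented here.

```python
import math


def mse(est, ref):
    """Mean squared error between estimated and reference waveforms."""
    return sum((e - r) ** 2 for e, r in zip(est, ref)) / len(ref)


def si_snr(est, ref, eps=1e-12):
    """Scale-invariant SNR in dB, computed on zero-mean signals."""
    mean_e = sum(est) / len(est)
    mean_r = sum(ref) / len(ref)
    e = [v - mean_e for v in est]
    r = [v - mean_r for v in ref]
    # Project the estimate onto the reference to get the target component.
    scale = sum(a * b for a, b in zip(e, r)) / (sum(b * b for b in r) + eps)
    target = [scale * b for b in r]
    noise = [a - t for a, t in zip(e, target)]
    num = sum(t * t for t in target)
    den = sum(n * n for n in noise)
    return 10.0 * math.log10((num + eps) / (den + eps))


def combined_loss(est, ref, perceptual_loss, weight=0.5, base="mse"):
    """Base loss plus a weighted perceptual term (assumed additive combination).

    perceptual_loss is any callable (est, ref) -> float supplied by the caller.
    SI-SNR is negated because a higher SI-SNR is better.
    """
    base_value = mse(est, ref) if base == "mse" else -si_snr(est, ref)
    return base_value + weight * perceptual_loss(est, ref)
```

Rescaling the estimate leaves `si_snr` unchanged but changes `mse`, which is one reason the two base losses can behave differently when paired with the same perceptual term.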

HDR image display combines weighted least square filtering with color appearance model

  • Piao, Meixian;Lee, Kyungjun;Jeong, Jechang
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2016.06a / pp.260-263 / 2016
  • High dynamic range (HDR) imaging has recently become a hot topic in computer graphics. We present a progressive tone mapping algorithm based on a weighted least squares (WLS) optimization framework. Our approach combines WLS filtering with iCAM06 to show perceptually more natural HDR images on conventional displays while avoiding visual halo artifacts. We decompose the HDR image into a base layer and a detail layer. The base layer, which carries the large-scale variations, is obtained by WLS filtering and is then processed with the iCAM06 model. The base layer is then compressed adaptively according to the human visual system. Contrast is reduced only in the base layer, so detail is preserved. The results show a perceptually more natural color appearance and preserved fine detail, without common artifacts.

  • PDF
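
The base/detail idea in this abstract (compress only the base layer, keep the detail layer) can be sketched in one dimension. The sketch rests on loud assumptions: a plain moving average stands in for WLS filtering, a single global `compression` factor stands in for the iCAM06-based adaptive compression, and all names are illustrative.

```python
import math


def smooth(values, radius=2):
    """Moving-average smoother (a stand-in for WLS filtering, not WLS itself)."""
    out = []
    for i in range(len(values)):
        window = values[max(0, i - radius): i + radius + 1]
        out.append(sum(window) / len(window))
    return out


def tone_map(luminance, compression=0.5, radius=2):
    """Compress the base layer in the log domain and add the detail layer back.

    luminance must contain positive values. compression < 1 reduces
    large-scale contrast; the detail layer is left untouched.
    """
    log_lum = [math.log(v) for v in luminance]
    base = smooth(log_lum, radius)
    detail = [l - b for l, b in zip(log_lum, base)]
    return [math.exp(compression * b + d) for b, d in zip(base, detail)]
```

With `compression=1.0` the input is reproduced, and smaller values shrink only the smoothed component, which is why fine detail survives. A moving average is not edge-preserving, so unlike WLS filtering this stand-in would produce halos near strong edges.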

WLSD: A Perceptual Stimulus Model Based Shape Descriptor

  • Li, Jiatong;Zhao, Baojun;Tang, Linbo;Deng, Chenwei;Han, Lu;Wu, Jinghui
    • KSII Transactions on Internet and Information Systems (TIIS) / v.8 no.12 / pp.4513-4532 / 2014
  • Motivated by Weber's Law, this paper proposes an efficient and robust shape descriptor based on the perceptual stimulus model, called the Weber's Law Shape Descriptor (WLSD). It is based on the theory that human perception of a pattern depends not only on the change in stimulus intensity but also on the original stimulus intensity. Invariance to scale and rotation is an intrinsic property of WLSD. As a global shape descriptor, WLSD has far lower computational complexity while being as discriminative as state-of-the-art shape descriptors. Experimental results demonstrate the strong capability of the proposed method in shape retrieval.
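
The principle behind the descriptor, that a change is perceived relative to the original intensity, is the Weber fraction. The sketch below shows only that ratio; it is not the WLSD construction, and the function name is an illustrative choice.

```python
def weber_fraction(base_intensity, changed_intensity):
    """Relative change |delta I| / I of a stimulus with respect to its original intensity."""
    if base_intensity <= 0:
        raise ValueError("base_intensity must be positive")
    return abs(changed_intensity - base_intensity) / base_intensity
```

The same absolute change of 10 units gives a ratio ten times smaller on a base of 1000 than on a base of 100, which is the dependence on original intensity that the abstract refers to.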

A Study for the Development of Korean Voice Assessment Model for the Patients with Voice Disorders: A Qualitative Study (음성장애 진단 및 평가에 관한 질적 연구: 진단 및 평가 모형 정립을 위한 기초연구)

  • Pyo, Hwa-Young;Sim, Hyun-Sub
    • Speech Sciences / v.14 no.2 / pp.7-22 / 2007
  • The purpose of this study was to develop a Korean assessment model for patients with voice disorders. Interviews were conducted with four voice therapists, and the results were analyzed using a qualitative, constant-comparative design. Three main themes emerged from the qualitative analysis, from which 10 subthemes were derived. The three main themes were 1) considerations regarding the disordered voice, 2) the status quo of instrumental and perceptual evaluation, and 3) suggestions for other voice therapists. The 10 subthemes can be summarized as follows: 1) patient-centered judgment, 2) increased reliability of instrumental and perceptual evaluation, and 3) voice therapists' active participation in the assessment procedure for voice disorders.

  • PDF

A Perceptual Rate Control Algorithm with S-JND Model for HEVC Encoder (S-JND 모델을 사용한 주관적인 율 제어 알고리즘 기반의 HEVC 부호화 방법)

  • Kim, JaeRyun;Ahn, Yong-Jo;Lim, Woong;Sim, Donggyu
    • Journal of Broadcast Engineering / v.21 no.6 / pp.929-943 / 2016
  • This paper proposes a rate control algorithm based on the S-JND (Saliency-Just Noticeable Difference) model that takes perceptual visual quality into account. The algorithm uses the S-JND model to reflect human visual sensitivity and human visual attention simultaneously, capturing characteristics of the human visual system. When allocating bits at the CTU (Coding Tree Unit) level, the bit allocation model calculates the S-JND threshold of each CTU in a picture. Each CTU's threshold is then used to allocate an appropriate number of bits adaptively, so the proposed bit allocation model can improve perceptual visual quality. For performance evaluation, the algorithm was implemented on HM 16.9 and tested on Class B and Class C sequences under the CTC (Common Test Condition) RA (Random Access), Low-delay B, and Low-delay P configurations. Experimental results show that, on average, the proposed method reduces the bit rate by 2.3% and improves BD-PSNR by 0.07 dB and bit-rate accuracy by 0.06%. In a subjective evaluation based on DSCQS (Double Stimulus Continuous Quality Scale), the proposed method improved MOS by 0.03 over the conventional method.
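
CTU-level bit allocation driven by a per-CTU threshold can be sketched as a weighted split of the picture budget. The abstract does not give the allocation formula, so the rule below is an assumption: CTUs with a higher threshold (distortion is harder to notice) receive proportionally fewer bits. The function name and the inverse-proportional weighting are hypothetical, and no S-JND computation is shown.

```python
def allocate_bits(jnd_thresholds, picture_budget):
    """Split picture_budget across CTUs with weights 1 / threshold (assumed rule).

    jnd_thresholds: positive per-CTU thresholds; picture_budget: bits for the picture.
    """
    weights = [1.0 / t for t in jnd_thresholds]
    total = sum(weights)
    return [picture_budget * w / total for w in weights]
```

For thresholds of 1, 2, and 4, the split is in the ratio 4:2:1, and the allocations always sum to the picture budget.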

Glottal Weighted Cepstrum for Robust Speech Recognition (잡음에 강한 음성 인식을 위한 성문 가중 켑스트럼에 관한 연구)

  • 전선도;강철호
    • The Journal of the Acoustical Society of Korea / v.18 no.5 / pp.78-82 / 1999
  • This paper studies the weighted cepstrum, which is widely used for robust speech recognition. In particular, we propose a weighting function with an asymmetric glottal pulse shape, applied to the weighted cepstrum extracted by PLP (Perceptual Linear Predictive) analysis based on an auditory model. We also analyze this glottal weighted cepstrum in terms of the relationship between the glottal pulse of a glottal model and the cepstrum, obtaining speech features that reflect both the glottal model and the auditory model. The proposed method is tested by isolated-word recognition rate in car noise and street environments, and the glottal weighted cepstrum is compared with the weighted cepstra extracted by LP (Linear Prediction) and by PLP. Computer simulation results show that the recognition rate of the proposed glottal weighted cepstrum is higher than those of the other weighted cepstra.

  • PDF
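
Cepstral weighting (liftering) multiplies each cepstral coefficient by a weight that depends on its index. The sketch below uses the common symmetric raised-sine lifter as a stand-in; it is not the asymmetric glottal-pulse weighting proposed in the paper, whose exact shape the abstract does not give. Function names are illustrative.

```python
import math


def sinusoidal_lifter(num_coeffs, lifter_len=None):
    """Raised-sine weights w(n) = 1 + (L/2) * sin(pi * n / L) for n = 1..num_coeffs."""
    length = lifter_len or num_coeffs
    return [1.0 + (length / 2.0) * math.sin(math.pi * n / length)
            for n in range(1, num_coeffs + 1)]


def weight_cepstrum(cepstrum, weights):
    """Element-wise product of cepstral coefficients c(1..N) and their weights."""
    return [c * w for c, w in zip(cepstrum, weights)]
```

This lifter emphasizes mid-order coefficients and de-emphasizes the lowest and highest ones; an asymmetric weight would shift that emphasis toward one side.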

Modern Cause and Effect Model by Factors of Root Cause for Accident Prevention in Small to Medium Sized Enterprises

  • Kang, Youngsig;Yang, Sunghwan;Patterson, Patrick
    • Safety and Health at Work / v.12 no.4 / pp.505-510 / 2021
  • Background: Root cause factors can lead to commonly occurring accidents such as falls, slips, and jammed injuries. An important means of reducing the frequency of occupational accidents in small- to medium-sized enterprises (SMSEs) in South Korea is to analyze the intensity of root cause factors for accident prevention within cause and effect models such as decision models, epidemiological models, system models, human factors models, LCU (life change unit) models, and the domino theory. Intensity analysis is especially important for robot systems and smart technologies under Industry 4.0, where the complexity of accident factors makes it necessary for minimizing occupational and fatal accidents. Methods: We developed a modern cause and effect model that includes root cause factors identified through statistical testing, in order to minimize commonly occurring and fatal accidents in South Korean SMSEs, and systematically proposed educational policies for accident prevention. Results: Among the root cause factors, the consciousness factors, such as unconsciousness, disregard, ignorance, recklessness, and misjudgment, had strong relationships with occupational accidents in South Korean SMSEs. Conclusion: We conclude that the educational policies necessary for minimizing these consciousness factors include continuous training procedures followed by periodic hands-on experience, along with perceptual and cognitive education related to occupational health and safety.