• Title/Summary/Keyword: perceptual models

Robust Image Watermarking via Perceptual Structural Regularity-based JND Model

  • Wang, Chunxing;Xu, Meiling;Wan, Wenbo;Wang, Jian;Meng, Lili;Li, Jing;Sun, Jiande
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.2 / pp.1080-1099 / 2019
  • Incorporating a just noticeable difference (JND) model into a quantization-based watermarking scheme yields a better tradeoff between robustness and invisibility. The JND model is usually used to describe the perceptual characteristics of the human visual system (HVS). According to research in cognitive science, the HVS can adaptively extract the structural features of an image. However, the existing JND models used in watermarking schemes do not consider these structural features. Therefore, a novel JND model is proposed that includes three components: the contrast sensitivity function, luminance adaptation, and contrast masking (CM). In this model, the CM effect is modeled by analyzing direction features and texture complexity, which agrees with human visual perception characteristics; by employing a new method to measure edge intensity, the model also matches well with the spread transform dither modulation (STDM) watermarking framework. Compared with other existing JND models, the proposed JND model based on structural regularity is more efficient and applicable in the STDM watermarking scheme. Experimental results show that the proposed scheme performs better than other watermarking schemes based on existing JND models.
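For readers unfamiliar with STDM, the following is a minimal sketch of the generic technique, not the paper's scheme: the host vector is projected onto a direction, the projection is quantized onto one of two interleaved lattices (one per bit), and the change is added back along that direction. The function names, the fixed `step`, and the `dither` parameter are illustrative assumptions; in a JND-based scheme the step size would presumably be adapted by the perceptual model rather than held constant.

```python
import math


def _quantize(value, step, bit, dither=0.0):
    # Dithered uniform quantizer: the lattice for bit 1 is shifted by
    # step / 2 relative to the lattice for bit 0.
    offset = dither + bit * step / 2.0
    return step * round((value - offset) / step) + offset


def _project(vector, direction):
    # Return the unit direction and the scalar projection of vector onto it.
    norm = math.sqrt(sum(d * d for d in direction))
    unit = [d / norm for d in direction]
    return unit, sum(v * u for v, u in zip(vector, unit))


def stdm_embed(host, direction, bit, step, dither=0.0):
    # Move the projection onto the lattice for `bit`, changing the host
    # only along the projection direction.
    unit, proj = _project(host, direction)
    delta = _quantize(proj, step, bit, dither) - proj
    return [h + delta * u for h, u in zip(host, unit)]


def stdm_detect(received, direction, step, dither=0.0):
    # Decode by choosing the bit whose lattice is closest to the projection.
    _, proj = _project(received, direction)
    errors = [abs(proj - _quantize(proj, step, b, dither)) for b in (0, 1)]
    return 0 if errors[0] <= errors[1] else 1
```

In this sketch a bit is decoded correctly as long as the distortion shifts the projection by less than `step / 4`, which is why a larger step buys robustness at the cost of visibility.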

The Assessment on the Sound Quality of Reduced Frequency Selectivity of Hearing Impaired People (난청인의 주파수 선택도 둔화현상이 음질에 미치는 영향 평가)

  • An, Hong-Sub;Park, Gyu-Seok;Jeon, Yu-Yong;Song, Young-Rok;Lee, Sang-Min
    • The Transactions of The Korean Institute of Electrical Engineers / v.60 no.6 / pp.1196-1203 / 2011
  • Reduced frequency selectivity is a typical phenomenon of sensorineural hearing loss. In this paper, we compared two methods for modeling the reduced frequency selectivity of hearing-impaired people. The two models were built using the linear predictive coding (LPC) algorithm and a bandwidth control algorithm based on the equivalent rectangular bandwidth (ERB) of the auditory filter, respectively. To compare the effectiveness of the two models, we compared PESQ (perceptual evaluation of speech quality) and LLR (log likelihood ratio) results using 36 two-syllable Korean words. To verify the effect under noisy conditions, we mixed white noise and babble noise into the speech words at 0 dB and -3 dB SNR. The results confirm that the PESQ score of the bandwidth control algorithm is higher than that of the LPC algorithm, whereas the LLR score of the LPC algorithm is lower than that of the bandwidth control algorithm. This means that both the non-linearity and the widened auditory filter characteristics caused by reduced frequency selectivity could be better reflected in the bandwidth control algorithm than in the LPC algorithm.
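As a rough illustration of what an ERB-based bandwidth control might start from, the sketch below computes the ERB of a normal-hearing auditory filter with the widely used Glasberg and Moore approximation and then widens it. The `broadening` factor and both function names are assumptions made here for illustration; the abstract does not state which ERB formula or widening rule the authors used.

```python
def erb_hz(center_hz):
    # Glasberg and Moore approximation of the equivalent rectangular
    # bandwidth (in Hz) of the auditory filter centered at center_hz.
    return 24.7 * (4.37 * center_hz / 1000.0 + 1.0)


def broadened_erb_hz(center_hz, broadening=2.0):
    # Hypothetical model of reduced frequency selectivity: the filter is
    # widened by a constant factor (the factor is an assumption, not a
    # value taken from the paper).
    return broadening * erb_hz(center_hz)
```

Under this approximation the filter at 1 kHz is roughly 133 Hz wide, so a broadening factor of 2 would smear spectral detail over about 265 Hz around that frequency.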

Nurses' Role Models, Perceptions Toward Occupation, Self-Actualization Value and the Phases of Socialization Process (임상간호원의 사회화과정단계에 있어서의 역할모델, 직업에 대한 지각향성 및 자아실현성간의 관계)

  • 한윤복;강윤숙
    • Journal of Korean Academy of Nursing / v.17 no.1 / pp.24-32 / 1987
  • This study was designed to investigate changes in nurses' role models, perceptions toward occupation, and self-actualization value across the phases of the socialization process. Two hundred and sixty-nine nurses working in clinical settings were randomly selected from 15 general hospitals dispersed over Seoul and Kyungki province. Data were gathered from October 1985 to March 1986 using the standardized Perceptual Orientation Test, the Self-actualization Test, and questionnaires on role models and phases of the socialization process developed by the investigators. The data were analysed by ANOVA and Pearson's correlation coefficient. The results were as follows: 1. The average time required to shift between phases of the socialization process was: phase Ⅰ, role adjustment, an average of 10 months of employment; phase Ⅱ, interpersonal adjustment, 12 months; and phase Ⅲ, role conflict, 15 months. Conflict resolution, phase Ⅳ, began to take place at 18 months of employment and shifted to phase Ⅴ, internalization and self-actualization, at 25 months of employment. 2. Throughout the 5 consecutive phases, the immediate superior nurse was by far the most frequent role model. The number of respondents with a head nurse role model increased at phase Ⅱ, phase Ⅲ, and phase Ⅳ. Respondents with a school model in phase Ⅰ tended to transfer to a work model at phase Ⅱ. 3. Perceptions toward occupation were not significantly influenced by the phases of the socialization process. 4. The self-actualization value score was not significantly influenced by the phases of the socialization process. 5. In regard to perceptions toward occupation, the nursing director model group showed a significantly lower score in phase Ⅰ (p<.01). 6. The comparison of self-actualization value between the 5 phases revealed a significant difference in phase Ⅰ, in particular among respondents with a school model (p<.05). To conclude: 1. Phase Ⅲ of the socialization process is the period of role conflict, which occurs at 15 months of employment, and conflict resolution, phase Ⅳ, begins at 18 months of employment on average in clinical settings. 2. The immediate superior nurse and the head nurse are important role models for nurses all through their socialization process.

Adaptive Digital Watermarking using Stochastic Image Modeling Based on Wavelet Transform Domain (웨이브릿 변환 영역에서 스토케스틱 영상 모델을 이용한 적응 디지털 워터마킹)

  • 김현천;권기룡;김종진
    • Journal of Korea Multimedia Society / v.6 no.3 / pp.508-517 / 2003
  • This paper presents a perceptual model with a stochastic multiresolution characteristic that can be applied to watermark embedding in the biorthogonal wavelet domain. Using the perceptual model, the adaptive watermarking algorithm embeds the watermark more strongly in texture and edge regions by means of the SSQ. The watermark embedding is based on the computation of a noise visibility function (NVF) that reflects local image properties. The method uses non-stationary Gaussian and stationary Generalized Gaussian (GG) models because the watermark has noise-like properties. Embedding under the stationary GG model uses the shape parameter and variance of each subband region in the multiresolution decomposition. To estimate the shape parameter, we use a moment matching method. The non-stationary Gaussian model uses the local mean and variance of each subband. Simulation results showed excellent invisibility and robustness. Robustness against distortion was tested with the Stirmark 3.1 benchmark.
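To make the role of the NVF concrete, here is a minimal sketch of one commonly cited form for the non-stationary Gaussian case, NVF = 1 / (1 + θσ²), where σ² is the local variance and θ scales by the largest local variance in the image. This is an assumption about the general technique, not the paper's exact formulation: the constant `d`, the window `radius`, the function names, and the use of a plain 2-D list instead of wavelet subbands are all illustrative choices.

```python
def local_variance(img, r, c, radius=1):
    # Variance of the pixels in a (2*radius+1)^2 window, clipped at borders.
    rows, cols = len(img), len(img[0])
    window = [img[i][j]
              for i in range(max(0, r - radius), min(rows, r + radius + 1))
              for j in range(max(0, c - radius), min(cols, c + radius + 1))]
    mean = sum(window) / len(window)
    return sum((v - mean) ** 2 for v in window) / len(window)


def nvf_map(img, d=75.0, radius=1):
    # NVF is 1 in flat regions (noise is visible, embed weakly) and
    # approaches 0 in textured or edge regions (embed more strongly).
    var = [[local_variance(img, r, c, radius) for c in range(len(img[0]))]
           for r in range(len(img))]
    vmax = max(max(row) for row in var)
    theta = d / vmax if vmax > 0 else 0.0  # assumed scaling: d / max variance
    return [[1.0 / (1.0 + theta * v) for v in row] for row in var]
```

A typical use would weight the watermark by `1 - NVF` at each position, so that flat regions receive little of it while edges and textures receive most.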

A Study on Speech Recognition in a Running Automobile (주행중인 자동차 환경에서의 음성인식 연구)

  • 양진우;김순협
    • The Journal of the Acoustical Society of Korea / v.19 no.5 / pp.3-8 / 2000
  • In this paper, we studied the design and implementation of a robust speech recognition system for a noisy car environment. The reference pattern used in the system is DMS (Dynamic Multi-Section). Two separate acoustic models are proposed, one for speech in a car moving below 80 km/h and one for a car moving over 80 km/h; the system selects between them automatically depending on the noise environment. PLP (Perceptual Linear Predictive) features of order 13 are used as the feature vector, and OSDP (One-Stage Dynamic Programming) is used for decoding. The system also provides a function for editing the phone book for voice dialing. The system yields a recognition rate of 89.75% for male speakers in SI (speaker independent) mode in a car running on a cement expressway at over 80 km/h with a vocabulary of 33 words. The system also yields a recognition rate of 92.29% for male speakers in SI mode in a car running on a paved expressway at over 80 km/h.

Performance comparison evaluation of real and complex networks for deep neural network-based speech enhancement in the frequency domain (주파수 영역 심층 신경망 기반 음성 향상을 위한 실수 네트워크와 복소 네트워크 성능 비교 평가)

  • Hwang, Seo-Rim;Park, Sung Wook;Park, Youngcheol
    • The Journal of the Acoustical Society of Korea / v.41 no.1 / pp.30-37 / 2022
  • This paper compares and evaluates model performance from two perspectives, the learning target and the network structure, for training Deep Neural Network (DNN)-based speech enhancement models in the frequency domain. Spectrum mapping and Time-Frequency (T-F) masking techniques were used as learning targets, and a real network and a complex network were used as network structures. The performance of the speech enhancement models was evaluated with two objective metrics, Perceptual Evaluation of Speech Quality (PESQ) and Short-Time Objective Intelligibility (STOI), as a function of the scale of the dataset. Test results show that the appropriate size of the training data differs depending on the type of network and the type of dataset. In addition, they show that, in some cases, using a real network may be a more realistic solution when the total number of parameters is considered, because the real network shows relatively higher performance than the complex network depending on the size of the data and the learning target.

Complex nested U-Net-based speech enhancement model using a dual-branch decoder (이중 분기 디코더를 사용하는 복소 중첩 U-Net 기반 음성 향상 모델)

  • Seorim Hwang;Sung Wook Park;Youngcheol Park
    • The Journal of the Acoustical Society of Korea / v.43 no.2 / pp.253-259 / 2024
  • This paper proposes a new speech enhancement model based on a complex nested U-Net with a dual-branch decoder. The proposed model uses a complex nested U-Net to simultaneously estimate the magnitude and phase components of the speech signal, and its decoder has a dual-branch structure in which one branch performs spectral mapping and the other performs time-frequency masking. Compared to a single-branch decoder structure, the dual-branch decoder structure allows noise to be removed effectively while minimizing the loss of speech information. The experiment was conducted on the VoiceBank + DEMAND database, which is commonly used for training speech enhancement models, and the model was evaluated with various objective evaluation metrics. In the experiment, the complex nested U-Net-based speech enhancement model using a dual-branch decoder increased the Perceptual Evaluation of Speech Quality (PESQ) score by about 0.13 compared to the baseline and showed higher objective evaluation scores than recently proposed speech enhancement models.

Estimation of grain size data from the hydraulic conductivity (투수계수로부터 입도분포 자료의 추정)

  • Nkomozepi, Temba;Chung, Sang-Ok
    • Current Research on Agriculture and Life Sciences / v.29 / pp.29-35 / 2011
  • The relationship between hydrologic processes and scale is one of the more complex issues in surface water hydrology. Disturbances that change vegetation and/or soil properties have been known to subsequently alter the landscape. The primary objective of this study was to estimate the grain size of soils with different properties from the hydraulic conductivity using pedotransfer functions. The double ring infiltrometer method was used to measure the vertical hydraulic conductivity of three soils under different soil surface treatments. Seven selected pedotransfer functions were used to estimate percentile diameters, and the reduction in infiltration caused by compaction was misconstrued as being caused by changes in percentile diameter. Results showed that compaction on the sandy loam footpaths reduced the hydraulic conductivity by about 50%. The study showed that perceptual models of infiltration processes and appreciation of scale problems in modeling are far more sophisticated than normally presented in texts. Hydraulic measurement methods are still relevant and will provide significant information on the grain size of soils.

A Study on the Design of MDCT/IMDCT for MPEG Audio (MPEG Audio을 위한 MDCT/IMDCT의 설계에 관한 연구)

  • 김정태;방기천;이강현
    • Proceedings of the IEEK Conference / 1999.06a / pp.530-533 / 1999
  • During the last decade, high quality digital audio has essentially replaced analog audio. During this period, digital audio has been applied in many areas of the information industry, and these applications have created a demand for high quality digital audio. Audio compression exploits the properties of the human auditory system: perceptual audio coding uses a psychoacoustic model so that components beyond the limits of human perception are not coded. The discussion concentrates on architectures and applications of those techniques which utilize psychoacoustic models to efficiently exploit the masking characteristics of the human receiver. In this paper, an MDCT/IMDCT designed to the current MPEG standard is implemented on an FPGA.
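As a software reference for what an MDCT/IMDCT pair computes, the sketch below implements the textbook direct-form transforms with a sine window and overlap-add. It is not the FPGA design from the paper and does not follow the MPEG block sizes or window shapes; the function names, the O(N²) direct evaluation, and the requirement that the signal length be a multiple of `n_coef` are simplifying assumptions.

```python
import math


def sine_window(n_coef):
    # Sine window of length 2*n_coef; satisfies w[n]^2 + w[n+N]^2 = 1.
    return [math.sin(math.pi * (n + 0.5) / (2 * n_coef))
            for n in range(2 * n_coef)]


def mdct(block):
    # 2N windowed input samples -> N coefficients (direct-form evaluation).
    n_coef = len(block) // 2
    return [sum(block[n] * math.cos(math.pi / n_coef
                                    * (n + 0.5 + n_coef / 2) * (k + 0.5))
                for n in range(2 * n_coef))
            for k in range(n_coef)]


def imdct(coefs):
    # N coefficients -> 2N time-aliased samples; the 2/N scaling matches
    # a window applied at both analysis and synthesis.
    n_coef = len(coefs)
    return [2.0 / n_coef * sum(coefs[k] * math.cos(math.pi / n_coef
                                                   * (n + 0.5 + n_coef / 2)
                                                   * (k + 0.5))
                               for k in range(n_coef))
            for n in range(2 * n_coef)]


def roundtrip(signal, n_coef):
    # Analysis and synthesis with 50% overlap; the aliasing of adjacent
    # blocks cancels in the overlap-add. len(signal) must be a multiple
    # of n_coef, and n_coef is assumed even.
    win = sine_window(n_coef)
    padded = [0.0] * n_coef + list(signal) + [0.0] * n_coef
    out = [0.0] * len(padded)
    for start in range(0, len(padded) - 2 * n_coef + 1, n_coef):
        block = [padded[start + n] * win[n] for n in range(2 * n_coef)]
        rec = imdct(mdct(block))
        for n in range(2 * n_coef):
            out[start + n] += rec[n] * win[n]
    return out[n_coef:n_coef + len(signal)]
```

Each block of 2N samples yields only N coefficients, so the transform is critically sampled despite the 50% overlap; a hardware design would normally replace the direct sums with a fast algorithm.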

The acoustic realization of the Korean sibilant fricative contrast in Seoul and Daegu

  • Holliday, Jeffrey J.
    • Phonetics and Speech Sciences / v.4 no.1 / pp.67-74 / 2012
  • The neutralization of /$s^h$/ and /$s^*$/ in Gyeongsang dialects is a culturally salient stereotype that has received relatively little attention in the phonetic literature. The current study is a more extensive acoustic comparison of the sibilant fricative productions of Seoul and Gyeongsang dialect speakers. The data presented here suggest that, at least for young Seoul and Daegu speakers, there are few inter-dialectal differences in sibilant fricative production. These conclusions are supported by the output of mixed effects logistic regression models that used aspiration duration, spectral mean of the frication noise, and H1-H2 of the following vowel to predict fricative type in each dialect. The clearest dialect difference was that Daegu speakers' /$s^h$/ and /$s^*$/ productions had overall shorter aspiration durations than those of Seoul speakers, suggesting the opposite of the traditional "/$s^*$/ produced as [$s^h$]" stereotype of Gyeongsang dialects. Further work is needed to investigate whether /$s^h$/-/$s^*$/ neutralization in Daegu is perceptual rather than acoustic in nature.