• Title/Summary/Keyword: perceptual models

Search Results: 60

A Perception-based Color Correction Method for Multi-view Images

  • Shao, Feng;Jiang, Gangyi;Yu, Mei;Peng, Zongju
    • KSII Transactions on Internet and Information Systems (TIIS) / v.5 no.2 / pp.390-407 / 2011
  • Three-dimensional (3D) video technologies are becoming increasingly popular because they can provide users with high-quality, immersive experiences. However, color inconsistency between camera views is an urgent problem in multi-view imaging. In this paper, a perception-based color correction method for multi-view images is proposed, in which human visual sensitivity (VS) and visual attention (VA) models are incorporated into the correction process. First, the VS property is used to reduce computational complexity by skipping visually insensitive regions. Second, the VA property is used to improve the perceptual quality of local VA regions by performing VA-dependent color correction. Experimental results show that, compared with other color correction methods, the proposed method greatly improves the perceptual quality of local VA regions, reduces computational complexity, and achieves higher coding performance.
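The abstract only outlines the VS/VA-gated correction. As a rough, illustrative sketch (not the authors' algorithm), the Python fragment below shows how a visual-sensitivity mask could skip insensitive pixels while a visual-attention map weights a simple statistics-based color transfer from a reference view to a target view; the function name, the Reinhard-style transfer, and the hand-supplied VS/VA maps are all assumptions.

```python
# Minimal sketch, assuming precomputed VS/VA maps: correct a target view toward a
# reference view with a per-channel mean/std transfer, applied only where the
# "visual sensitivity" mask is True and blended by the "visual attention" weight.
import numpy as np

def correct_view(target, reference, vs_mask, va_map):
    """target, reference: HxWx3 float arrays in [0,1];
    vs_mask: HxW bool (True = visually sensitive region, processed);
    va_map: HxW float in [0,1] (higher = more attended, stronger correction)."""
    corrected = target.copy()
    for c in range(3):
        t = target[..., c][vs_mask]          # only sensitive pixels contribute stats
        r = reference[..., c][vs_mask]
        if t.size == 0:
            continue
        # statistics-transfer placeholder for the actual color mapping
        mapped = (target[..., c] - t.mean()) / (t.std() + 1e-6) * r.std() + r.mean()
        # VA-dependent blending: attended regions receive more correction
        corrected[..., c] = np.where(
            vs_mask,
            (1 - va_map) * target[..., c] + va_map * mapped,
            target[..., c],
        )
    return np.clip(corrected, 0.0, 1.0)
```

In practice the VS mask and VA map would come from perceptual models (e.g. sensitivity-threshold and saliency estimators) rather than being supplied by hand.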

Chasing ideas in phonetics

  • Ladefoged, Peter
    • Speech Sciences / v.5 no.2 / pp.7-16 / 1999
  • Starting as a poet, I learned about the sounds of words with David Abercrombie. Then, remembering my background in physics, I moved to studying acoustic phonetics and speech synthesis. From there I learned about psychology and how to test perceptual theories. A meeting with a physiologist led to work on the use of the respiratory muscles in speech. Later I landed in Africa, teaching English phonetics and learning about African languages. When I went to UCLA to set up a lab, I was able to find bright students who helped make computer models of the vocal tract and taught me linguistic theory. And I was able to continue wandering around the world, describing the sounds of a wide range of languages.

Human Laughter Generation using Hybrid Generative Models

  • Mansouri, Nadia;Lachiri, Zied
    • KSII Transactions on Internet and Information Systems (TIIS) / v.15 no.5 / pp.1590-1609 / 2021
  • Laughter is one of the most important nonverbal sounds that humans generate; it is a means of expressing emotion. The acoustic and contextual features of this specific sound differ from those of speech, and many difficulties arise during the modeling process. In this work, we propose an audio laughter generation system based on unsupervised generative models: the autoencoder (AE) and its variants. The procedure combines three main sub-processes: (1) analysis, which consists of extracting the log-magnitude spectrogram from the laughter database; (2) training of the generative models; and (3) the synthesis stage, which relies on an intermediate mechanism, the vocoder. To improve synthesis quality, we suggest three hybrid models (LSTM-VAE, GRU-VAE and CNN-VAE) that combine the representation learning capacity of the variational autoencoder (VAE) with the temporal modeling ability of recurrent networks (LSTM, GRU) and the CNN's ability to learn invariant features. To assess the performance of the proposed audio laughter generation process, an objective evaluation (RMSE) and a perceptual audio quality test (listening test) were conducted. According to these evaluation metrics, the GRU-VAE outperforms the other VAE models.
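For readers unfamiliar with the hybrids named above, here is a minimal GRU-VAE sketch in PyTorch operating on log-magnitude spectrogram frames; the layer sizes, the unweighted KL term, and the omission of the vocoder stage are assumptions for illustration, not the authors' configuration.

```python
# Minimal GRU-VAE sketch (assumed architecture, not the paper's code).
# Input: batches of log-magnitude spectrograms shaped (batch, frames, n_bins).
import torch
import torch.nn as nn

class GRUVAE(nn.Module):
    def __init__(self, n_bins=513, hidden=256, latent=64):
        super().__init__()
        self.enc_rnn = nn.GRU(n_bins, hidden, batch_first=True)
        self.to_mu = nn.Linear(hidden, latent)
        self.to_logvar = nn.Linear(hidden, latent)
        self.from_z = nn.Linear(latent, hidden)
        self.dec_rnn = nn.GRU(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_bins)

    def forward(self, x):
        _, h = self.enc_rnn(x)                                   # h: (1, batch, hidden)
        mu, logvar = self.to_mu(h[-1]), self.to_logvar(h[-1])
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
        dec_in = self.from_z(z).unsqueeze(1).repeat(1, x.size(1), 1)
        y, _ = self.dec_rnn(dec_in)
        return self.out(y), mu, logvar                           # reconstructed frames

def vae_loss(x_hat, x, mu, logvar):
    recon = nn.functional.mse_loss(x_hat, x, reduction="mean")
    kld = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kld
```

Generated spectrogram frames would still need a vocoder (as the abstract notes) to be converted back into an audio waveform.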

Experimental Study on Subjective Sound Quality Evaluation of Vehicle Noises (승용차소음의 주관적 음질평가 실험연구)

  • Choe, Byongho
    • Transactions of the Korean Society for Noise and Vibration Engineering / v.14 no.12 / pp.1223-1232 / 2004
  • This study is directed toward determining the number and characteristics of the psychologically meaningful perceptual dimensions required for assessing the sound quality of vehicle noises, and toward identifying the acoustical and/or psychoacoustical bases underlying preference and similarity judgments. To analyze the paired-comparison data produced by subjective ratings, we used nonmetric multidimensional scaling (MDS). The perceptual dimension based on preference ratings explained 76.3% of the variance in terms of maximum dB(A) and sharpness (acum). The correlation between objective and subjective positions of the stimuli was $R^2$=0.97 (F(1,13)=195.45, p < .01), corrected $R^2$=0.93. The lower the intensity of a stimulus, the more its subjective position was over-estimated relative to the objective one, and vice versa. The perceptual dimensions based on similarity judgments accounted for 47.8% and 23.5% of the variance, corresponding to maximum dB(A) and sharpness (acum), respectively. The correlation between objective and subjective positions of the stimuli was $R^2$=0.94 (F(1,13)=92.38, p < .01), corrected $R^2$=0.87. Here, the higher the intensity of a stimulus, the more its subjective position was over-estimated relative to the objective one, and vice versa; in other words, the louder two stimuli being compared are, the more likely they are to be judged similar. It remains to be clarified which psychophysical models best describe the relationship between preference ratings and psychological distances.
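As an illustration of the analysis technique named above (nonmetric MDS over paired-comparison data), the short sketch below recovers a two-dimensional perceptual space from a toy dissimilarity matrix and correlates the first recovered dimension with placeholder acoustic metrics; the data, dimensionality, and metrics are assumptions, not the paper's materials.

```python
# Sketch only: nonmetric MDS on a toy paired-comparison dissimilarity matrix,
# then correlation of the recovered dimension with candidate acoustic metrics
# such as maximum dB(A) and sharpness (acum).
import numpy as np
from sklearn.manifold import MDS

rng = np.random.default_rng(0)
n = 15                                                  # number of stimuli (assumed)
d = rng.random((n, n)); d = (d + d.T) / 2               # symmetric toy dissimilarities
np.fill_diagonal(d, 0.0)

mds = MDS(n_components=2, metric=False, dissimilarity="precomputed", random_state=0)
coords = mds.fit_transform(d)                           # one row of coordinates per stimulus

max_dba = rng.random(n)                                 # placeholder acoustic metrics
sharpness = rng.random(n)
for name, metric in [("max dB(A)", max_dba), ("sharpness", sharpness)]:
    r = np.corrcoef(metric, coords[:, 0])[0, 1]
    print(f"dimension 1 vs {name}: r = {r:.2f}")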

User Perception on Character Clone of Crowds based on Perceptual Organization (군중에서의 캐릭터 복제에 관한 지각체제화 기반 사용자 인지)

  • Byun, Hae-Won;Park, Yoon-Young
    • Journal of KIISE:Computing Practices and Letters / v.15 no.11 / pp.819-830 / 2009
  • When simulating large crowds, it is inevitable that the models and motions of many characters will be cloned. McDonnell et al. analyzed users' perception to find cloned characters and established that appearance clones are far easier to detect than motion clones. In this paper, we extend McDonnell et al.'s research [1], focusing on multiple clones and appearance variety in a real-time game environment. Introducing perceptual organization, we create appearance variety among crowd clones by using game items and texture modulation. Other factors that influence the ability to detect clones are also examined, such as moving direction and the distance between character clones. Our results provide novel insights and useful thresholds that will assist in creating more realistic crowds in game environments.

The effect of TV media on adolescent body image (TV 미디어가 청소년의 신체이미지에 미치는 영향)

  • 김재숙;이미숙
    • Journal of the Korean Society of Clothing and Textiles / v.25 no.5 / pp.957-968 / 2001
  • The purpose of this study was to extend social comparison theory in an attempt to examine the effect of TV media on adolescent body images. The research was a survey, and the subjects were 895 male and female adolescents in Taejon, Korea. The measuring instruments were two sets of stimuli of male and female body silhouettes and a self-administered questionnaire. Results were as follows: 1) The subjects' TV viewing time was 3-4 hours per day, and their involvement with TV media was moderate. 2) The results on perceptual body image showed that adolescents favored a thin body type as the ideal body and tended to perceive their bodies as larger than their actual size. 3) The results on attitudinal body image showed three factors: "appearance evaluation", "appearance orientation", and "fitness orientation". 4) TV media had significant effects on perceptual and attitudinal body images. It is concluded that these results support social comparison theory, which holds that people compare themselves to others to satisfy their needs for self-evaluation and judgments of their own personal worth, since TV media strongly influence adolescents by presenting social comparison models for body image.

Recognition of Restricted Continuous Korean Speech Using Perceptual Model (인지 모델을 이용한 제한된 한국어 연속음 인식)

  • Kim, Seon-Il;Hong, Ki-Won;Lee, Haing-Sei
    • The Journal of the Acoustical Society of Korea / v.14 no.3 / pp.61-70 / 1995
  • In this paper, the PLP cepstrum, which is close to human perceptual characteristics, was extracted over a spread time area to capture temporal features. Phonemes were recognized by an artificial neural network, whose learning resembles that of humans, and the phoneme strings were matched by Markov models, which are well suited to sequences. Phoneme recognition for continuous Korean speech was performed using speech blocks in which unequal numbers of speech frames were grouped. We parameterized the blocks using 7th-order PLPs, PTP, zero-crossing rate and energy, which the neural network used as inputs. For recognition, we used 100 utterances consisting of 10 Korean sentences, each pronounced five times by two male speakers; a maximum phoneme recognition rate of 94.4% was obtained. Sentences were then recognized using Markov models generated from the phoneme strings obtained in the earlier stage; this sentence recognition was carried out on 200 utterances, with the two speakers producing each sentence 10 times, and a sentence recognition rate of 92.5% was obtained.
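The two-stage idea in this abstract (a block-level phoneme classifier followed by Markov-model sequence matching) can be sketched as follows; the Viterbi decoder below over per-block phoneme log-scores is an illustration under assumed inputs, not the paper's implementation.

```python
# Sketch of the sequence-matching stage: given per-block phoneme log-scores
# (e.g. log-posteriors from a neural network) and a phoneme transition model,
# find the most likely phoneme string with Viterbi decoding.
import numpy as np

def viterbi(log_emissions, log_trans, log_prior):
    """log_emissions: (T, P) per-block phoneme log-scores;
    log_trans: (P, P) phoneme transition log-probabilities; log_prior: (P,)."""
    T, P = log_emissions.shape
    dp = log_prior + log_emissions[0]           # best score ending in each phoneme
    back = np.zeros((T, P), dtype=int)          # backpointers for path recovery
    for t in range(1, T):
        scores = dp[:, None] + log_trans        # (previous phoneme, next phoneme)
        back[t] = scores.argmax(axis=0)
        dp = scores.max(axis=0) + log_emissions[t]
    path = [int(dp.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t][path[-1]]))
    return path[::-1]                           # phoneme indices, one per block
```

Sentence recognition would then compare the decoded phoneme string against Markov models trained for each candidate sentence.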

A Comparative Study of Second Language Acquisition Models: Focusing on Vowel Acquisition by Chinese Learners of Korean (중국인 학습자의 한국어 모음 습득에 대한 제2언어 습득 모델 비교 연구)

  • Kim, Jooyeon
    • Phonetics and Speech Sciences / v.6 no.4 / pp.27-36 / 2014
  • This study provides a longitudinal examination of Chinese learners' acquisition of Korean vowels. Specifically, I examined the Korean monophthongs /i, e, ɨ, ʌ, a, u, o/ produced by Chinese learners at 1 month and at 12 months of learning, and tried to verify empirically how their native language shapes the acquisition of Korean vowels, comparing the predictions of the Perceptual Assimilation Model (henceforth PAM) of Best (Best, 1993; 1994; Best & Tyler, 2007) and the Speech Learning Model (henceforth SLM) of Flege (Flege, 1987; Bohn & Flege, 1992; Flege, 1995). Most of the present results are explained similarly by the PAM and the SLM; the only discrepancy between the two models is found in the 'similar' category of sounds between the learners' native language and the target language. Specifically, the acquisition pattern of /u/ and /o/ in Korean is well accounted for by the PAM, but not by the SLM. The SLM does not explain why the Chinese learners had difficulty acquiring the Korean vowel /u/, because according to the SLM the vowel /u/ in Chinese (the native language) is matched either to the vowel /u/ or to the vowel /o/ in Korean (the target language); that is, there is only a one-to-one matching relationship between the native language and the target language. In contrast, the Chinese learners' difficulty with the Korean vowel /u/ is well accounted for by the PAM, in that the Chinese vowel /u/ is matched to the vowel pair /o, u/ in Korean rather than to a single vowel /o/ or /u/.

Image Quality Assessment by Combining Masking Texture and Perceptual Color Difference Model

  • Tang, Zhisen;Zheng, Yuanlin;Wang, Wei;Liao, Kaiyang
    • KSII Transactions on Internet and Information Systems (TIIS) / v.14 no.7 / pp.2938-2956 / 2020
  • Objective image quality assessment (IQA) models have been developed using effective features to imitate the characteristics of the human visual system (HVS). The HVS is extremely sensitive to color degradation and complex texture changes. In this paper, we first show that many existing full-reference IQA (FR-IQA) methods can hardly measure image quality under contrast and masking-texture changes. To solve this problem, taking the texture masking effect into account, we propose a novel FR-IQA method called the Texture and Color Quality Index (TCQI). The proposed method considers both the texture masking effect and the color visual perceptual threshold, adopting three kinds of features to reflect masking texture, color difference and structural information. Furthermore, random forest (RF) regression is used to address the drawbacks of existing pooling strategies; compared with traditional learning-based tools (support vector regression and neural networks), RF achieves better prediction performance. Experiments conducted on five large-scale databases demonstrate that our approach is highly consistent with subjective perception, outperforms twelve state-of-the-art IQA models in prediction accuracy, and maintains moderate computational complexity. Cross-database validation further confirms the robustness of our approach.
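As a sketch of the pooling step described above (random-forest regression over per-image quality features), the fragment below trains a RandomForestRegressor on toy feature vectors against toy subjective scores; the feature names, data, and train/test split are assumptions, not the TCQI code.

```python
# Sketch only: pool per-image quality features (masking-texture, color-difference,
# structural scores) into a single quality prediction with a random forest,
# trained against subjective mean opinion scores (MOS).
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.random((200, 3))        # columns: texture-masking, color-difference, structure
mos = X @ np.array([0.5, 0.3, 0.2]) + 0.05 * rng.standard_normal(200)  # toy targets

rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X[:150], mos[:150])
pred = rf.predict(X[150:])
print("correlation with held-out MOS:", np.corrcoef(pred, mos[150:])[0, 1])
```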

Eye-Tracking and Protocol Analyses of Expert and Novice Situation Awareness in Air Traffic Control (항공관제 상황인식에서 전문가와 초보자의 시선추적 및 프로토콜 분석)

  • Hyun, Suk-Hoon;Lee, Kyung-Soo;Kim, Kyeong-Tae;Sohn, Young-Woo
    • Journal of the Ergonomics Society of Korea / v.26 no.4 / pp.17-24 / 2007
  • Analyses of eye-tracking and think-aloud protocol data were performed to examine novice-expert differences in the perceptual and cognitive aspects of air traffic controllers' situation awareness. In Experiment 1, three groups of field air traffic controllers (experts, intermediates, novices) were asked to perceive situations whose complexity was manipulated. In Experiment 2, protocol analysis of the previous situation awareness tasks was performed to extract different task models and strategy models as a function of expertise; a delayed-recall task and interviews about air traffic control plans for the recalled situations were also conducted. Results showed that expert controllers concentrate on only a few critical features and have their own strategies for reducing mental workload.