• Title/Summary/Keyword: Perceptual region

3D Visual Attention Model and its Application to No-reference Stereoscopic Video Quality Assessment (3차원 시각 주의 모델과 이를 이용한 무참조 스테레오스코픽 비디오 화질 측정 방법)

  • Kim, Donghyun;Sohn, Kwanghoon
    • Journal of the Institute of Electronics and Information Engineers / v.51 no.4 / pp.110-122 / 2014
  • As multimedia technologies develop, three-dimensional (3D) technologies are attracting increasing attention from researchers. In particular, video quality assessment (VQA) has become a critical issue in stereoscopic image/video processing applications. Although the human visual system (HVS) could play an important role in measuring stereoscopic video quality, existing VQA methods have done little to model the HVS for stereoscopic video. We address this by proposing a 3D visual attention (3DVA) model that simulates the HVS for stereoscopic video by combining multiple perceptual stimuli such as depth, motion, color, intensity, and orientation contrast. We use this 3DVA model for pooling over significant regions of very poor video quality, and we propose a no-reference (NR) stereoscopic VQA (SVQA) method. We validated the proposed SVQA method using subjective test scores from our results and those reported by others. Our approach yields high correlation with the measured mean opinion score (MOS) as well as consistent performance in asymmetric coding conditions. Additionally, the 3DVA model is used to extract region-of-interest (ROI) information. Subjective evaluations of the extracted ROI indicate that 3DVA-based ROI extraction outperforms the other compared extraction methods, which use spatial and/or temporal terms.
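
The abstract above describes combining several perceptual cues into one attention map and then pooling quality over salient regions. The following is only a minimal sketch of that general idea, not the authors' model: it assumes a linear, equally weighted combination of per-block feature values and a saliency-weighted mean for pooling. The function names `combine_saliency` and `saliency_weighted_pool`, the block-level representation, and the weights are all hypothetical.

```python
def combine_saliency(feature_maps, weights=None):
    """Linearly combine per-block feature values into one saliency value per block.

    feature_maps: dict mapping a cue name (e.g. "depth", "motion") to a list of
    per-block values. Equal weights are an assumption, not the paper's choice.
    """
    names = sorted(feature_maps)
    if weights is None:
        weights = {name: 1.0 / len(names) for name in names}
    num_blocks = len(feature_maps[names[0]])
    return [
        sum(weights[name] * feature_maps[name][i] for name in names)
        for i in range(num_blocks)
    ]


def saliency_weighted_pool(local_scores, saliency):
    """Pool per-block quality scores, weighting each block by its saliency."""
    total = sum(saliency)
    if total == 0:
        # No salient block: fall back to a plain average.
        return sum(local_scores) / len(local_scores)
    return sum(q * s for q, s in zip(local_scores, saliency)) / total


if __name__ == "__main__":
    cues = {"depth": [1.0, 0.0], "motion": [1.0, 0.0]}
    saliency = combine_saliency(cues)
    print(saliency_weighted_pool([2.0, 4.0], saliency))
```

With this weighting, blocks that attract no attention contribute nothing to the pooled score, which is the intuition behind attention-based pooling.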

A CELP Coder using the Band-Divided Long Term Prediction (대역 분할 장구간 예측을 이용한 CELP 부호화기)

  • Choi, Young-Soo;Kang, Hong-Goo;Lim, Myoung-Seob;Ahn, Dong-Soon;Youn, Dae-Hee
    • The Journal of the Acoustical Society of Korea / v.14 no.4 / pp.38-45 / 1995
  • This paper proposes a way to improve the performance of long-term prediction by adopting the Multi-band Excitation (MBE) method in addition to the Code-Excited Linear Prediction (CELP) method at low bit rates below 4.8 kbps. In the proposed method, multiband long-term prediction is performed on the periodic components that still remain after the long-term prediction of the conventional CELP method. The whole frequency region is divided into subbands whose size is equal to the spacing between the harmonics of the fundamental frequency, and the periodic multiband excitation signals are represented as the sum of sine waves approximately as large as the spectrum of the excitation signals, so that the actual characteristics of the excitation signals can be better taken into account. To evaluate the performance of the proposed method, a computer simulation was performed at 4.8 kbps. The 4.8 kbps DoD CELP and the 4.4 kbps IMBE were chosen as the reference vocoders for the speech quality measure. The perceptual speech quality measure showed that the proposed method performs better than the 4.8 kbps DoD CELP vocoder and similarly to the 4.4 kbps IMBE vocoder.
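
The band division described above can be illustrated with a short sketch. It assumes that each subband is centred on a harmonic of the fundamental frequency and that the periodic excitation is a zero-phase sum of harmonic sine waves; the abstract states only the band width, so the centring, the zero phases, and the names `harmonic_subbands` and `synthesize_periodic_excitation` are illustrative assumptions rather than the paper's design.

```python
import math


def harmonic_subbands(f0_hz, bandwidth_hz):
    """Split the band up to bandwidth_hz into subbands of width f0_hz.

    Each subband is centred on a harmonic k * f0_hz (an assumption); the last
    band is clipped at bandwidth_hz. Returns (k, low_hz, high_hz) tuples.
    """
    if f0_hz <= 0 or bandwidth_hz <= 0:
        raise ValueError("f0_hz and bandwidth_hz must be positive")
    bands = []
    k = 1
    while (k - 0.5) * f0_hz < bandwidth_hz:
        low = (k - 0.5) * f0_hz
        high = min((k + 0.5) * f0_hz, bandwidth_hz)
        bands.append((k, low, high))
        k += 1
    return bands


def synthesize_periodic_excitation(amplitudes, f0_hz, sample_rate_hz, num_samples):
    """Sum one sine wave per harmonic; amplitudes[k-1] scales harmonic k."""
    return [
        sum(
            a * math.sin(2 * math.pi * (k + 1) * f0_hz * n / sample_rate_hz)
            for k, a in enumerate(amplitudes)
        )
        for n in range(num_samples)
    ]


if __name__ == "__main__":
    bands = harmonic_subbands(100.0, 4000.0)
    print(len(bands), bands[0], bands[-1])
```

For a 100 Hz fundamental and a 4 kHz band, this yields one subband per harmonic, so the number of bands varies with the pitch of the speaker.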

Baleen Whale Sound Synthesis using a Modified Spectral Modeling (수정된 스펙트럴 모델링을 이용한 수염고래 소리 합성)

  • Jun, Hee-Sung;Dhar, Pranab K.;Kim, Cheol-Hong;Kim, Jong-Myon
    • The KIPS Transactions: Part B / v.17B no.1 / pp.69-78 / 2010
  • Spectral modeling synthesis (SMS) has been used as a powerful tool for musical sound modeling. This technique treats a sound as a combination of a deterministic component and a stochastic component. The deterministic component is represented by a series of sinusoids described by amplitude, frequency, and phase functions, and the stochastic component is represented by a series of magnitude spectrum envelopes that function as a time-varying filter excited by white noise. These representations make it possible for a synthesized sound to attain all the perceptual characteristics of the original sound. However, when the conventional SMS is applied to complex sounds such as whale sounds, considerable phase variations sometimes occur in the deterministic component when the partial frequencies in successive frames differ. This is because the conventional SMS uses the calculated phase to synthesize the deterministic component of the sound. As a result, it does not provide good matching between the original and synthesized spectra in the higher frequency region. To overcome this problem, we propose a modified SMS that provides good spectrum matching between the original and synthesized sound by calculating the complex residual spectrum in the frequency domain and using the original phase information to synthesize the deterministic component of the sound. Analysis and simulation results for synthesizing whale sounds suggest that the proposed method is comparable to the conventional SMS in both the time and frequency domains, while providing better spectrum matching.
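
The deterministic-plus-stochastic decomposition described above can be sketched for a single frame as follows. This is a simplified illustration, not the authors' modified SMS: the stochastic part here is white Gaussian noise scaled by a single gain, standing in for the time-varying spectral-envelope filter, and the function name `synthesize_frame` and its parameters are hypothetical.

```python
import math
import random


def synthesize_frame(partials, noise_gain, sample_rate_hz, num_samples, seed=0):
    """Synthesize one frame as a sum of sinusoids plus scaled white noise.

    partials: list of (amplitude, frequency_hz, phase_rad) tuples describing
    the deterministic component. noise_gain is a scalar stand-in for the
    spectral envelope that shapes the stochastic component in real SMS.
    """
    rng = random.Random(seed)  # seeded so the sketch is repeatable
    frame = []
    for n in range(num_samples):
        t = n / sample_rate_hz
        deterministic = sum(
            a * math.sin(2 * math.pi * f * t + p) for a, f, p in partials
        )
        stochastic = noise_gain * rng.gauss(0.0, 1.0)
        frame.append(deterministic + stochastic)
    return frame


if __name__ == "__main__":
    # One 1 kHz partial with a quarter-cycle phase offset and a little noise.
    frame = synthesize_frame([(1.0, 1000.0, math.pi / 2)], 0.01, 8000.0, 8)
    print(frame)
```

Because the phase of each partial is an explicit input, the same routine can be driven either by phases computed during analysis or by phases taken from the original signal, which is the distinction the abstract draws between the conventional and the modified method.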

Neural Substrates of Picture Encoding: An fMRI Study (그림의 부호화 과정과 신경기제 : fMRI 연구)

  • 강은주;김희정;김성일;나동규;이경민;나덕렬;이정모
    • Korean Journal of Cognitive Science / v.13 no.1 / pp.23-40 / 2002
  • This study examined the brain regions involved in picture encoding in normal adults using fMRI. In Scan 1, picture encoding was studied during a semantic categorization task in comparison with word encoding. In Scan 2, task-type effects were studied during both a picture naming task and a semantic categorization task with pictures. Subjects were asked to respond either by pressing a mouse button (Scan 1) or subvocally (naming or saying yes/no) (Scan 2). Regardless of stimulus type, left prefrontal, bilateral occipital, and parietal activations were observed during semantic processing in comparison with a fixation baseline. Processing of word stimuli relative to pictures resulted in activations in left prefrontal and parieto-temporal regions, while processing of picture stimuli relative to words resulted in activations in bilateral extrastriate visual cortices and parahippocampal regions. Despite identical task demands, stimulus-specific information processing was involved and was mediated by different neural substrates: word encoding was associated with more semantic/lexical processing than picture encoding, and picture encoding with more perceptual and novelty-related processing than word encoding. Activations of the dorsal part of the inferior prefrontal region, i.e., Broca's area, were found during both the picture naming and the semantic tasks performed subvocally. In particular, the picture naming task produced greater bilateral occipital activations than the semantic categorization task, indicating a possibility that more extensive and higher-level visual processing was involved in retrieving the names referred to by picture stimuli.

Image Enhancement and Clinical Evaluation in Digital Chest Radiography (디지털 방사선 흉부영상의 영상개선과 임상평가)

  • Kim, Sung-Hyun;Suh, Tae-Suk;Choe, Bo-Young;Lee, Hyoung-Koo
    • Progress in Medical Physics / v.19 no.3 / pp.143-149 / 2008
  • The aim of this study is to propose a method for enhancing digital chest radiographs and to clinically evaluate the quality of the resulting images. A nonlinear iterative filter was developed to reduce quantum noise while preserving edges. The dynamic range was adjusted, and adaptive image enhancement was performed based on the properties of each anatomic region and the degree of compatibility with neighboring pixels. The lung fields were enhanced appropriately to visualize vascular tissue, bronchi, and lung tissue effectively, together with the desired mediastinum enhancement. Clinical evaluation was performed by three radiologists with at least 8 years of experience. Eleven anatomic regions in PA views and nine in lateral views were carefully observed in 100 radiographs each, according to the ITU (International Telecommunication Union) Recommendation 500 protocol. The mean score was 3.4, between good and adequate, indicating that the image quality is sufficient for clinical use. In this study, image enhancement was carried out with the image display device and the human perceptual system in mind, to prevent the loss of useful anatomic information. Continued study of image enhancement is needed to increase diagnostic accuracy in digital radiography.

Harmony Arrangements using B-Spline Tension Curves (B-스플라인 텐션 곡선을 이용한 음악 편곡)

  • Yoo, Min-Joon;Lee, In-Kwon;Kwon, Dae-Hyun
    • Journal of the HCI Society of Korea / v.1 no.1 / pp.1-8 / 2006
  • We propose a graphical representation of the tension flow in tonal music using a piecewise parametric curve, which is a function of time illustrating the changing degree of tension in a corresponding chord progression. The tension curve can be edited with conventional curve editing techniques to reharmonize the original music, reflecting the user's wish to control the tension of the music. We introduce three different methods for measuring the tension of a chord with respect to a specific key, which can be used to represent the tension of the chord numerically. A tension curve is then constructed by interpolating the series of numerical tension values. We show that the tension curve editing method can be used effectively in several interesting applications: enhancing or weakening the overall feeling of tension in a whole song, locally controlling tension in a specific region of the music, progressively transitioning the tension flow from a source to a target chord progression, and naturally connecting two songs while maintaining the smoothness of the tension flow. Our work shows the possibility of controlling a perceptual factor (tension) in music using numerical methods. Most of the computations used in this paper are inexpensive, so they can be performed in real time. We think that an interesting application of our method is interactive modification of the tension in background music according to the user's emotion or the current scenario in interactive environments such as games.
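
How a smooth curve can be built from a series of numerical tension values may be sketched with a uniform cubic B-spline. This sketch assumes the per-chord tension values are used directly as control points, so the curve smooths the values rather than passing through them exactly; a true interpolating curve would first need to solve for the control points. The function names and the sample tension values are hypothetical, and the paper's three tension measures are not reproduced here.

```python
def bspline_point(p0, p1, p2, p3, u):
    """Evaluate one uniform cubic B-spline segment at u in [0, 1]."""
    b0 = (1 - u) ** 3 / 6
    b1 = (3 * u**3 - 6 * u**2 + 4) / 6
    b2 = (-3 * u**3 + 3 * u**2 + 3 * u + 1) / 6
    b3 = u**3 / 6
    return b0 * p0 + b1 * p1 + b2 * p2 + b3 * p3


def tension_curve(tensions, samples_per_segment=4):
    """Sample a smooth tension curve from per-chord tension values.

    Each run of four consecutive values controls one curve segment, so
    editing a single value changes the curve only locally.
    """
    if len(tensions) < 4:
        raise ValueError("need at least four tension values")
    curve = []
    for i in range(len(tensions) - 3):
        for s in range(samples_per_segment):
            u = s / samples_per_segment
            curve.append(bspline_point(*tensions[i:i + 4], u))
    return curve


if __name__ == "__main__":
    # Hypothetical tension values for a five-chord progression.
    print(tension_curve([1.0, 3.0, 2.0, 5.0, 1.0]))
```

The local support of the B-spline basis is what makes the "local control of tension in a specific region" mentioned in the abstract natural: raising one control value lifts only the nearby part of the curve.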

Speech Evaluation Tasks Related to Subthalamic Nucleus Deep Brain Stimulation in Idiopathic Parkinson's Disease: A Review (특발성 파킨슨병의 시상밑부핵 심부뇌자극술 관련 말 평가 과제에 대한 문헌연구)

  • Kim, Sun Woo;Kim, Hyang Hee
    • 재활복지 / v.18 no.4 / pp.237-255 / 2014
  • Idiopathic Parkinson's disease (IPD) is a neurodegenerative disease caused by the loss of dopamine cells in the substantia nigra, a region of the midbrain. Its major symptoms are muscular rigidity, bradykinesia, resting tremor, and postural instability. An estimated 70-90% of patients with IPD also have hypokinetic dysarthria. Subthalamic nucleus deep brain stimulation (STN-DBS) has been reported to be successful in relieving the core motor symptoms of IPD in the advanced stages of the disease. However, data on the effects of STN-DBS on speech performance are inconsistent. A MEDLINE literature search was conducted to retrieve articles published from 1987 to 2012. The results were narrowed to studies of speech performance under STN-DBS based on perceptual, acoustic, and/or aerodynamic analyses. The 32 publications that dealt with speech performance after STN-DBS indicated improvement (42%), deterioration (29%), mixed results (26%), or no change (3%). The most frequently used method was acoustic analysis using a vowel prolongation task and the Unified Parkinson's Disease Rating Scale (UPDRS). To verify the effect of STN-DBS, speech evaluation should cover all speech components, such as articulation, resonance, phonation, respiration, and prosody, using a contextual speech task.