• Title/Summary/Keyword: audio level


Salience of Envelope Interaural Time Difference of High Frequency as Spatial Feature (공간감 인자로서의 고주파 대역 포락선 양이 시간차의 유효성)

  • Seo, Jeong-Hun;Chon, Sang-Bae;Sung, Koeng-Mo
    • The Journal of the Acoustical Society of Korea / v.29 no.6 / pp.381-387 / 2010
  • Both timbral and spatial features are important in assessing multichannel audio coding systems. Choi et al. proposed a prediction model that extends ITU-R Rec. BS.1387-1 to multichannel audio coding systems by using spatial features such as ITDDist (Interaural Time Difference Distortion), ILDDist (Interaural Level Difference Distortion), and IACCDist (InterAural Cross-correlation Coefficient Distortion). In that model, ITDDists were computed only for low-frequency bands (below 1500 Hz) and ILDDists only for high-frequency bands (above 2500 Hz), in line with classical duplex theory. In the high-frequency range, however, information in the temporal envelope is also important for spatial perception, especially for sound localization. This paper introduces a new model that computes the ITD distortions of temporal envelopes in high-frequency components in order to investigate quantitatively the role of such ITDs in spatial perception. The computed ITD distortions of the temporal envelopes in high-frequency components were highly correlated with the perceived sound quality of multichannel audio.
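
As a rough illustration of what an envelope ITD is, the sketch below estimates it for one high-frequency band by cross-correlating the Hilbert envelopes of the left-ear and right-ear signals. It is not the model from the paper: the function names, the 1 ms search range, the circular shift, and the absence of an auditory filterbank are all simplifying assumptions made here. A distortion measure in the spirit of ITDDist could then compare this estimate between a reference signal and a coded signal.

```python
import numpy as np

def hilbert_envelope(x):
    """Magnitude of the analytic signal, built with an FFT (no SciPy needed)."""
    n = len(x)
    spectrum = np.fft.fft(x)
    weights = np.zeros(n)
    weights[0] = 1.0                      # keep DC once
    if n % 2 == 0:
        weights[n // 2] = 1.0             # keep Nyquist once
        weights[1:n // 2] = 2.0           # double positive frequencies
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(spectrum * weights))

def envelope_itd(left, right, fs, max_lag_s=1e-3):
    """Lag in seconds that maximises the envelope cross-correlation.

    Positive values mean the right-ear envelope lags the left-ear envelope.
    Assumption: signals are treated as periodic (circular shift via np.roll).
    """
    env_l = hilbert_envelope(left)
    env_r = hilbert_envelope(right)
    env_l = env_l - env_l.mean()
    env_r = env_r - env_r.mean()
    max_lag = int(round(max_lag_s * fs))
    lags = np.arange(-max_lag, max_lag + 1)
    scores = [np.dot(env_l, np.roll(env_r, -lag)) for lag in lags]
    return lags[int(np.argmax(scores))] / fs

# Hypothetical test signal: 4 kHz carrier with a 100 Hz amplitude modulation,
# right channel delayed by 12 samples (0.25 ms at 48 kHz).
fs = 48000
t = np.arange(4800) / fs
left = (1.0 + np.cos(2 * np.pi * 100 * t)) * np.sin(2 * np.pi * 4000 * t)
right = np.roll(left, 12)
itd = envelope_itd(left, right, fs)
print(f"estimated envelope ITD: {itd * 1e3:.3f} ms")
```

The carrier at 4 kHz lies in the range where, according to the abstract, the earlier model computed only level differences, so the delay is recovered here from the envelope rather than from the fine structure.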

Enhancement of SBR for Speech Signal Using Adaptive Noise Floor Level (가변 잡음 레벨을 이용한 음성신호에 대한 SBR 성능 항상 기술)

  • Lee, Se-Won;Oh, Seoung-Jun;Ahn, Chang-Beom;Lee, Tae-Jin;Kang, Kyoung-Ok;Park, Ho-Chong
    • The Journal of the Acoustical Society of Korea / v.28 no.2 / pp.148-154 / 2009
  • In audio coding, SBR technology synthesizes the high bands using patched time-frequency information from the low bands together with correction parameters. Because SBR transmits only the correction parameters for the high bands, it provides low-rate coding of the high bands and is used as a core module of MPEG-4 HE-AAC. SBR was originally designed for audio signals, and its performance tends to degrade for speech signals; the major cause is an excessive noise floor in the high bands resulting from incorrect tonality computation. To solve this problem, this paper proposes a new method that determines the noise floor level adaptively according to the speech characteristics. The proposed method maintains compatibility with standard SBR, and a subjective performance evaluation shows that it improves SBR performance relative to standard SBR, especially for male speech signals.

A Study of the spatial perception by audio-visual information (시각과 청각에 의한 공간적 지각에 관한 연구)

  • Lee, Chai-Bong;Kang, Dae-Gee
    • Journal of the Institute of Convergence Signal Processing / v.11 no.2 / pp.132-136 / 2010
  • A psychophysical experiment was performed to investigate how audio-visual spatial disparity affects perceptual space in peripheral vision. In the experiment, participants were exposed to a visual stimulus and a sound stimulus that arrived simultaneously from different directions. The visual stimulus was implemented with 7 white LEDs located at an equal distance and at 7 different angles of $-70^{\circ}$, $-40^{\circ}$, $-20^{\circ}$, $0^{\circ}$, $20^{\circ}$, $40^{\circ}$, and $70^{\circ}$, measured from the frontal direction. The auditory stimuli were presented by loudspeakers placed in 9 different directions, equally spaced by $5^{\circ}$ from $-20^{\circ}$ to $20^{\circ}$. Each participant then rated the spatial disparity between the visual and auditory stimuli on a 5-level response scale, in which a higher level indicates a larger gap. When the visual stimulus is applied from the right, the results show that the response level gets higher for a larger angle between the visual and auditory stimuli. A similar tendency was also observed for the visual stimulus at $0^{\circ}$. On the other hand, when the visual stimulus is applied from the left, the response level gets lower for a larger angle.

Content-based Music Information Retrieval using Pitch Histogram (Pitch 히스토그램을 이용한 내용기반 음악 정보 검색)

  • 박만수;박철의;김회린;강경옥
    • Journal of Broadcast Engineering / v.9 no.1 / pp.2-7 / 2004
  • In this paper, we propose a content-based music information retrieval technique using several MPEG-7 low-level descriptors. In particular, pitch information and timbral features can be applied to music genre classification, music retrieval, or QBH (Query By Humming) because they can model the stochastic pattern or timbral information of a music signal. In this work, we restrict the music domain to the O.S.T. of a movie or soap opera so that the technique can be applied to a broadcasting system. That is, when background music catches the user's ear, the user can retrieve information about the unknown music using only an audio clip of a few seconds extracted from the video content. We propose an audio feature set built from MPEG-7 descriptors and a distance function based on vector distance or ratio computation. We observed that the feature set built from pitch information is superior to the timbral spectral feature set, and that IFCR (Intra-Feature Component Ratio) is better than ED (Euclidean Distance) as a vector distance function. k-NN is used as the classifier to evaluate music recognition.
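
The retrieval pipeline described in this abstract (a pitch-based feature, a vector distance, and a k-NN classifier) can be sketched roughly as follows. This is an illustration under assumptions, not the authors' method: the 12-bin pitch-class histogram, the A4 = 440 Hz reference, the function names, and the toy pitch tracks are all hypothetical, and only the Euclidean distance (ED) baseline is shown because the abstract does not define how IFCR is computed.

```python
import numpy as np
from collections import Counter

def pitch_histogram(pitches_hz):
    """Normalised 12-bin pitch-class histogram (assumed feature layout).

    `pitches_hz` holds one pitch estimate per frame; values <= 0 are
    treated as unvoiced frames and ignored.
    """
    pitches_hz = np.asarray(pitches_hz, dtype=float)
    voiced = pitches_hz[pitches_hz > 0]
    midi = 69 + 12 * np.log2(voiced / 440.0)      # A4 = 440 Hz -> MIDI 69
    classes = np.round(midi).astype(int) % 12
    hist = np.bincount(classes, minlength=12).astype(float)
    total = hist.sum()
    return hist / total if total > 0 else hist

def knn_identify(query_hist, reference_hists, labels, k=1):
    """Label of the majority among the k nearest references (Euclidean distance)."""
    dists = np.linalg.norm(np.asarray(reference_hists) - query_hist, axis=1)
    nearest = np.argsort(dists)[:k]
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical per-frame pitch tracks for two reference tracks and one query clip.
references = {
    "track_a": [440.0, 494.0, 523.25, 440.0],
    "track_b": [262.0, 330.0, 392.0, 262.0],
}
labels = list(references)
reference_hists = [pitch_histogram(p) for p in references.values()]
query = pitch_histogram([440.0, 0.0, 494.0, 880.0])
print(knn_identify(query, reference_hists, labels))
```

Folding pitches into 12 classes makes the feature insensitive to octave errors in the pitch tracker, at the cost of discarding register information; a finer histogram would be a reasonable alternative.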

Auditory Model Design for Objective Audio Quality Measurement

  • Dongil Seo;Park, Se-Hyoung;Ryu, Seung-wan;Jaeho Shin
    • Proceedings of the IEEK Conference / 2002.07c / pp.1717-1720 / 2002
  • This paper concerns objective quality measurement schemes that incorporate properties of the human auditory system. The basilar membrane (BM) acts as a spectrum analyzer, spatially decomposing the signal into frequency components. Each filterbank is an implementation of the ERB gammachirp function. The filterbank uses level-dependent asymmetric compensation filters. To validate the auditory model, we calculate the CPD. The quality measurement is obtained from the result.
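
For readers unfamiliar with ERB-based filterbanks, the sketch below lays out filter centre frequencies evenly on the ERB-rate scale using the widely used Glasberg and Moore approximation. Whether the paper uses this exact formula is an assumption, and the function names, frequency range, and filter count are hypothetical; the gammachirp filters themselves and their level-dependent compensation are not implemented here.

```python
import numpy as np

def erb_bandwidth_hz(f_hz):
    """Equivalent rectangular bandwidth in Hz (Glasberg and Moore approximation)."""
    return 24.7 * (4.37 * np.asarray(f_hz, dtype=float) / 1000.0 + 1.0)

def hz_to_erb_rate(f_hz):
    """Frequency in Hz mapped to the ERB-rate scale."""
    return 21.4 * np.log10(4.37 * np.asarray(f_hz, dtype=float) / 1000.0 + 1.0)

def erb_rate_to_hz(erb_rate):
    """Inverse of hz_to_erb_rate."""
    return (10.0 ** (np.asarray(erb_rate, dtype=float) / 21.4) - 1.0) * 1000.0 / 4.37

def erb_spaced_centres(f_low_hz, f_high_hz, n_filters):
    """Centre frequencies spaced uniformly on the ERB-rate scale."""
    rates = np.linspace(hz_to_erb_rate(f_low_hz), hz_to_erb_rate(f_high_hz), n_filters)
    return erb_rate_to_hz(rates)

# Hypothetical layout: 32 filters between 100 Hz and 8 kHz.
centres = erb_spaced_centres(100.0, 8000.0, 32)
for fc in centres[:3]:
    print(f"fc = {fc:8.1f} Hz, ERB = {float(erb_bandwidth_hz(fc)):6.1f} Hz")
```

With this spacing, filters are dense at low frequencies and sparse at high frequencies, which mirrors the frequency resolution of the basilar membrane described in the abstract.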


The Content Based Analysis According to the Composition of the Feature Parameters for the Auditory Data (오디오 데이터의 특징 파라메터 구성에 따른 내용기반 분석)

  • 한학용;허강인;김수훈
    • The Journal of the Acoustical Society of Korea / v.21 no.2 / pp.182-189 / 2002
  • In this paper, we study content-based analysis and classification according to the composition of the feature parameter pool for auditory signals, with the aim of implementing an auditory indexing and searching system. Auditory data is classified into various primitive auditory types. We describe the analysis and feature extraction methods for the feature parameters available for auditory data classification. We then compose the feature parameter pool in units of indexing groups, and compare and analyze the auditory data in terms of the inclusion level and the indexing criterion for the audio categories. Based on this result, we compose a classification procedure and simulate auditory data classification.

Development of Auto Presentation System of Toolbook Using Object Auto Transition on Multimedia Authoring Tool (멀티미디어를 기반으로 하는 저작도구 툴북에서 객체 자동 변환을 이용한 자동 프리젠테이션 시스템 개발)

  • Yang, Ok-Yul;Jeong, Yeong-Sik;Lee, Yong-Ju
    • The Transactions of the Korea Information Processing Society / v.4 no.5 / pp.1182-1195 / 1997
  • When we present information, we can use application programs built with multimedia-based authoring tools. In particular, many programmers have proposed ways to reduce integration time, speed up programming, and make the tools easier to use. However, multimedia-based authoring tools do not cover all programming methodologies and do not supply special functions on user request. We therefore have to implement such functions effectively through high-level programming languages. In this paper, we propose using small application programs through linking methods, which reduces the overhead of memory loading. In authoring tools, we can use MCI (Media Control Interface) call functions to play back audio files. We developed ATS (Auto Transition System), which provides several functions: closing MCI-called audio files, getting object status, and page-to-page transition. We show that an optimal presentation configuration is obtained by the ATS algorithm.


A Human Sensibility Ergonomic Establishment of Customer-Satisfying Strategy for a Multimedia Telecommunication System (멀티미디어 통신시스템을 대상으로한 사용자 만족 전략의 감성공학적 수립)

  • Park, Min-Yong;Park, Hui-Seok
    • Journal of the Ergonomics Society of Korea / v.17 no.1 / pp.23-36 / 1998
  • The primary objective of this research was to establish and quantify the relationship between the physical degradation factors of a multimedia telecommunications (teleconferencing) system and subjective human perception. The research was performed in two stages. In the first stage, a field survey of real users and pilot experiments were carried out to determine customers' major complaints and the corresponding system degradation factors. A prototype teleconferencing simulator was developed in two separate sound-treated chambers equipped with audio/video equipment running under custom-developed software. In the second stage, simulation experiments using the semantic differential methodology were performed with 26 paid participants (14 college students and 12 housewives). The results indicated that audio/video synchronization and the frame rate were the main system factors for both subject groups, but the pattern of the factors' influence differed between the groups, implying that the system configuration should accommodate the characteristics of the end users. In addition, a single quality index developed for system preference was found to be highly correlated with user satisfaction. The results provide fundamental data on human subjective perception of multimedia telecommunications quality and can further help establish quality standards to enhance the service level.


A Study on the Music Retrieval System using MPEG-7 Audio Low-Level Descriptors (MPEG-7 오디오 하위 서술자를 이용한 음악 검색 방법에 관한 연구)

  • Park Mansoo;Park Chuleui;Kim Hoi-Rin;Kang Kyeongok
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2003.11a / pp.215-218 / 2003
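  • This paper proposes a music retrieval algorithm based on audio features built from the audio descriptors defined in MPEG-7. Timbral features in particular make it easier to distinguish timbre, so they can be used not only for music retrieval but also for music genre classification and query by humming. If this line of research yields a feature vector that represents the characteristic properties of an audio signal, that vector could later serve as the audio feature in retrieval algorithms for multimodal systems. So that the method can be applied to broadcasting systems, the search scope is restricted to the O.S.T. album of a specific piece of content. That is, music can be retrieved from within the O.S.T. album of the whole content using only a partial audio clip arbitrarily selected by the user. We propose a way of combining MPEG-7 audio descriptors to form the audio feature vector and pursue performance improvement through distance or ratio computation. We also pursue performance improvement by varying how the templates of the reference music are constructed. In a performance evaluation using k-NN as the classifier, the IFCR (Intra-Feature Component Ratio) method, which uses ratios of timbral spectral features, outperformed the Euclidean distance method.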
