Search | Korea Science

Objective measurement of spatial auditory quality for multi channel audio codecs (멀티채널 오디오 압축 코덱 음질의 객관적인 측정방법)

Choi, In-Yong;Chon, Sang-Bae;Sung, Koeng-Mo
- Proceedings of the IEEK Conference / 2005.11a / pp.431-434 / 2005
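This paper concerns a system and parameters for objectively evaluating the sound quality of multichannel audio compression codecs. It describes a preprocessing stage that produces ear input signals from a multichannel audio signal and a procedure that derives the inter-aural level difference distortion from those ear input signals, and it shows that this distortion correlates consistently with listening test results. According to this study, the sound quality of a multichannel audio codec can be assessed by objective measurement alone, without subjective evaluation by selected listeners and the accompanying statistical processing. Codec developers can therefore evaluate the quality of their own codecs easily, without the burden of time and cost.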
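The inter-aural level difference (ILD) distortion described in this abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula, so everything here is an assumption: the function names are hypothetical, the ILD is computed per frame over the full band (the paper may well work in auditory frequency bands), and the distortion is taken as the mean absolute ILD difference between the reference and the coded ear signals.

```python
import math

def frame_energy(x, start, size):
    """Sum of squares over one frame, with a small floor to avoid log(0)."""
    return sum(s * s for s in x[start:start + size]) + 1e-12

def ild_db(left, right, frame_size):
    """Per-frame inter-aural level difference in dB (left relative to right)."""
    n = min(len(left), len(right))
    return [
        10.0 * math.log10(frame_energy(left, i, frame_size) /
                          frame_energy(right, i, frame_size))
        for i in range(0, n - frame_size + 1, frame_size)
    ]

def ild_distortion(ref_lr, test_lr, frame_size=256):
    """Mean absolute ILD difference (dB) between reference and coded ear signals.

    ref_lr and test_lr are (left, right) pairs of sample lists. This is an
    assumed definition for illustration, not the one from the paper.
    """
    ref = ild_db(ref_lr[0], ref_lr[1], frame_size)
    test = ild_db(test_lr[0], test_lr[1], frame_size)
    m = min(len(ref), len(test))
    if m == 0:
        raise ValueError("signals are shorter than one frame")
    return sum(abs(a - b) for a, b in zip(ref, test)) / m

# Toy ear signals: the "codec" attenuates the right ear slightly more.
tone = [math.sin(2 * math.pi * 440 * n / 48000) for n in range(4096)]
reference = (tone, [0.5 * s for s in tone])
coded = (tone, [0.4 * s for s in tone])
print(ild_distortion(reference, reference))
print(ild_distortion(reference, coded))
```

In this toy case the coded signal changes only the right-ear gain, so the distortion equals the gain change expressed in dB.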

Unified coding scheme of speech and music (음악 및 음성 신호의 융합 압축 기술)

O, Eun-Mi
- Broadcasting and Media Magazine / v.16 no.4 / pp.59-71 / 2011
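Although audio coding and speech coding rest on different technical foundations, the convergence of the mobile multimedia device market means that the signals to be compressed are increasingly mixed, and the two fields are converging on similar target bit rates and sound quality. Today different compression technologies are applied within the same device, but for multimedia devices that serve speech and music together, handling both with a single compression scheme has become a prominent issue. In particular, given the spread of smartphones and music content portal services, a unified coding technology that compresses both speech and music signals efficiently appears increasingly necessary. This article introduces the background and standardization status of Unified Speech and Audio Coding (USAC), the most recent work of the MPEG audio group. Below 64 kbps, USAC delivers better sound quality than AMR-WB+ and HE-AAC v2, the technically best-performing codecs at those rates, and it guarantees equivalent quality at higher bit rates. We examine the switching structure of USAC that contributes to this quality, together with its main technically improved modules: parametric stereo, high-frequency coding, and the entropy coding scheme. Because it compresses diverse audio signals efficiently, USAC seems likely to be used in scenarios such as digital radio, mobile TV, and audiobooks. USAC also performs well in the presence of background noise or background music, so it can be useful for user-generated content such as YouTube videos and podcasts.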

A New Robust Acoustic Crosstalk Cancellation Method with Sum and Difference Filter in 3D Audio System (3차원 오디오 시스템에서 합과 차 여파기를 이용한 새로운 광대억 간섭신호 제거 방법)

김래훈;임준석;성굉모
- The Journal of the Acoustical Society of Korea / v.20 no.4 / pp.17-21 / 2001
There are several methods for enhancing the "sweet spot" in loudspeaker-based 3D audio systems, but most of them are inherently limited to a narrow frequency band. In this paper, we introduce a more robust 3D sound reproduction system with a far wider robust bandwidth. The system applies a sum and difference filter to the conventional three-loudspeaker configuration.

Robust Layered Watermarking of Digital Audio for Possible Timing Changes (시간축 변형을 고려한 디지털 오디오의 계층적 워터마크)

정사라;홍진우
- The Journal of the Acoustical Society of Korea / v.21 no.8 / pp.719-726 / 2002
In this paper, we present a layered watermarking technique for digital audio data that can detect timing changes and adapt its detection complexity. The proposed scheme uses echo hiding as the first layer, which enables the detector to estimate a linear speed change. A spread spectrum watermark is then inserted in the second layer, which carries additional information such as copyright data. The second layer uses two kinds of sequences, one for synchronization and the other for data. The results of each layer are used to estimate the timing change in the next layer. The detector can select its detection range, from the first layer alone up to the second pre-layer or the second main layer, according to the required system specification. Experimental results show that the proposed watermarking technique is robust to several processing attacks, including timing changes.
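The first-layer idea, echo hiding, can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the delays, the echo gain `alpha`, and all function names are hypothetical, and detection uses a plain autocorrelation peak instead of the cepstrum analysis that echo-hiding detectors commonly use.

```python
import random

def embed_echo_bit(signal, bit, delay0=50, delay1=100, alpha=0.5):
    """Add one echo whose delay encodes a bit (delay0 for 0, delay1 for 1)."""
    d = delay1 if bit else delay0
    return [s + (alpha * signal[n - d] if n >= d else 0.0)
            for n, s in enumerate(signal)]

def autocorr(x, lag):
    """Unnormalized autocorrelation of x at a single lag."""
    return sum(x[n] * x[n - lag] for n in range(lag, len(x)))

def detect_echo_bit(marked, delay0=50, delay1=100):
    """Decide the bit by comparing autocorrelation at the two candidate delays."""
    return 1 if autocorr(marked, delay1) > autocorr(marked, delay0) else 0

def estimate_echo_delay(marked, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the largest autocorrelation."""
    return max(range(min_lag, max_lag + 1), key=lambda lag: autocorr(marked, lag))

# Toy host signal: white noise, so the echo is the only strong correlation.
rng = random.Random(0)
host = [rng.uniform(-1.0, 1.0) for _ in range(4096)]

marked = embed_echo_bit(host, 1)
print(detect_echo_bit(marked))
print(estimate_echo_delay(marked, 20, 200))
```

If the marked audio were played back at a different linear speed, the measured echo delay would be expected to scale by the same factor, which is one plausible way a detector could estimate the speed change before moving on to the second layer.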

Design of the 5-band Digital Audio Graphic Equalizer adopted Automatic Gain Controller (자동 이득 제어기를 적용한 5-밴드 디지털 오디오 그래픽 이퀄라이저 설계)

김태형;김환용
- Journal of the Korea Computer Industry Society / v.3 no.1 / pp.27-34 / 2002
There is much interest in information communications owing to the rapid development of networks and information technology (IT). Analog signals are converted into digital signals for information communications. However, it is very difficult to completely remove the distortion introduced when analog signals such as voices and images are converted into digital signals. Existing audio graphic equalizers require very complex processes to calculate the gain and coefficients of the higher-order filters needed to generate natural sound and to satisfy each listener's preferences. Because of this complexity, embedding an existing high-quality digital audio equalizer in a system is uneconomical and very difficult. This paper discusses the design of a new digital audio graphic equalizer (DAGEQ) that can improve system performance and audio quality and can be embedded in the system. The new DAGEQ is designed so that the gain is controlled automatically. The automatic control of coefficients and gain enables real-time processing and improves audio quality.

Improving Fidelity of Synthesized Voices Generated by Using GANs (GAN으로 합성한 음성의 충실도 향상)

Back, Moon-Ki;Yoon, Seung-Won;Lee, Sang-Baek;Lee, Kyu-Chul
- KIPS Transactions on Software and Data Engineering / v.10 no.1 / pp.9-18 / 2021
Although Generative Adversarial Networks (GANs) have gained great popularity in computer vision and related fields, methods for generating audio signals independently have yet to be presented. Unlike images, an audio signal is a sampled signal consisting of discrete samples, so it is not easy to learn such signals using CNN architectures, which are widely used in image generation tasks. To overcome this difficulty, GAN researchers proposed a strategy of applying time-frequency representations of audio to existing image-generating GANs. Following this strategy, we propose an improved method for increasing the fidelity of synthesized audio signals generated by GANs. Our method is demonstrated on a public speech dataset and evaluated by Fréchet Inception Distance (FID). With our method the FID was 10.504, compared with 11.973 for the existing state-of-the-art method (lower FID indicates better fidelity).
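As a rough illustration of what the FID score measures, the Fréchet distance between two Gaussians reduces, for scalar features, to the squared difference of the means plus the squared difference of the standard deviations. The sketch below implements only that one-dimensional special case; the actual FID works on multi-dimensional Inception network features with full covariance matrices, and the function name and sample values here are hypothetical.

```python
import statistics

def frechet_distance_1d(real, fake):
    """Frechet distance between 1-D Gaussians fitted to two samples.

    Scalar special case of the FID formula: (mu_r - mu_f)^2 + (sd_r - sd_f)^2.
    Population standard deviation is used here as a simplifying choice.
    """
    mu_r, mu_f = statistics.fmean(real), statistics.fmean(fake)
    sd_r, sd_f = statistics.pstdev(real), statistics.pstdev(fake)
    return (mu_r - mu_f) ** 2 + (sd_r - sd_f) ** 2

# Hypothetical scalar features from real and generated audio.
real_features = [0.0, 2.0]
close_fake = [0.0, 2.0]
shifted_fake = [1.0, 3.0]
print(frechet_distance_1d(real_features, close_fake))
print(frechet_distance_1d(real_features, shifted_fake))
```

A generator whose feature statistics match the real data scores zero, and the score grows as the mean or spread of the generated features drifts away, which is why lower values indicate better fidelity.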
https://doi.org/10.3745/KTSDE.2021.10.1.9

A Signaling Processor IC for Land Mobile Radio System (육상 이동 라디오 시스템용 호처리기 IC)

전형근;김종문;송호준
- The Journal of Korean Institute of Communications and Information Sciences / v.24 no.10A / pp.1588-1596 / 1999
This paper describes a signaling processor IC for land mobile radio systems. The IC generates CTSS tone or DCS code signals for signaling between land mobile radio systems and decodes them to open the audio path. The CTSS tone or DCS code signals occupy the subaudio band and are transmitted together with the voice signal. The audio and subaudio paths consist of switched-capacitor filters. The IC has been implemented in a 0.6-$\mu\textrm{m}$ 2-poly 3-metal CMOS process. The chip size is 3 mm $\times$ 4.3 mm, and the total current is about 3.4 mA at 3.3 V.

Audio Signal Processing and System Design for improved intelligibility in Conference Room (회의실의 명료성(STI) 향상을 위한 오디오신호 처리 및 시스템 설계)

Kang, Cheolyong;Lee, Seokjoo;Jo, Kwangyeon;Lee, Seonhee
- The Journal of the Institute of Internet, Broadcasting and Communication / v.17 no.2 / pp.225-232 / 2017
Digital transmission technology for audio signals has developed recently, and audio network equipment based on it has been introduced. As a result, audio network technology and equipment are actively applied to the design and construction of audio systems. A meeting room is a place where many participants exchange opinions and communicate with each other. This paper presents an example that uses an audio network, in addition to electro-acoustic devices such as microphones and loudspeakers, to improve the intelligibility of a conference room.
https://doi.org/10.7236/JIIBC.2017.17.2.225

DNN-based Audio Compression Model Optimization Utilizing Entropy Model (엔트로피 모델을 활용한 심층 신경망 기반 오디오 압축 모델 최적화)

Lim, Hyungseob;Kang, Hong-Goo;Jang, Inseon
- Proceedings of the Korean Society of Broadcast Engineers Conference / 2022.06a / pp.54-57 / 2022
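This paper proposes an entropy-model-based quantization scheme for improving the bit-rate efficiency of a deep-neural-network-based progressive multi-layer audio codec. Research on audio compression and reconstruction systems that use deep neural networks to replace commercial audio codecs built on traditional signal processing theory has recently been active. However, these systems have not yet reached the performance of existing commercial codecs, and for end-to-end audio compression models in particular, improving the quantization structure of the encoder is essential for obtaining high quality from a small amount of information. In this work, we replace the vector quantizer of the progressive multi-layer audio codec, one of the previously proposed end-to-end audio compression models, with an entropy-model-based quantizer. We show that the bit rate can be adjusted in various ways by exploiting the rate-distortion trade-off, which validates the introduction of the entropy-model-based quantizer.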
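The rate-distortion trade-off mentioned in this abstract can be sketched with a toy scalar quantizer. This is an assumption-laden illustration, not the paper's model: the learned entropy model is replaced by an empirical symbol histogram, the quantizer is plain uniform rounding, and the names `quantize`, `empirical_entropy_bits`, `rate_distortion`, and the weight `lam` are hypothetical.

```python
import math
from collections import Counter

def quantize(values, step):
    """Uniform scalar quantization: map each value to an integer index."""
    return [round(v / step) for v in values]

def empirical_entropy_bits(symbols):
    """Average bits per symbol under the empirical symbol distribution."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum(c / total * math.log2(c / total) for c in counts.values())

def rate_distortion(values, step, lam):
    """Return (rate in bits/symbol, MSE distortion, combined loss D + lam * R)."""
    indices = quantize(values, step)
    recon = [k * step for k in indices]
    rate = empirical_entropy_bits(indices)
    dist = sum((v - r) ** 2 for v, r in zip(values, recon)) / len(values)
    return rate, dist, dist + lam * rate

# Hypothetical latent values standing in for an encoder output.
latents = [0.1 * i for i in range(-20, 21)]
for step in (0.1, 0.5, 1.0):
    rate, dist, loss = rate_distortion(latents, step, lam=0.01)
    print(step, round(rate, 3), round(dist, 4), round(loss, 4))
```

A coarser step lowers the estimated rate and raises the distortion, so changing the step (or the weight `lam` during training) is one way to move along the rate-distortion curve.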

A Study on the Signal Processing for Content-Based Audio Genre Classification (내용기반 오디오 장르 분류를 위한 신호 처리 연구)

윤원중;이강규;박규식
- Journal of the Institute of Electronics Engineers of Korea SP / v.41 no.6 / pp.271-278 / 2004
In this paper, we propose a content-based audio genre classification algorithm that automatically classifies query audio into five genres (Classic, Hiphop, Jazz, Rock, and Speech) using a digital signal processing approach. The 20-second query audio file is segmented into 23 ms frames with a non-overlapping Hamming window, and a 54-dimensional feature vector, including Spectral Centroid, Rolloff, Flux, LPC, and MFCC, is extracted from each query audio. For classification, k-NN, Gaussian, and GMM classifiers are used. To choose the optimum features from the 54-dimensional feature vector, the SFS (Sequential Forward Selection) method is applied to obtain a 10-dimensional optimum feature set, which is then used for genre classification. The experimental results verify the superior performance of the proposed method, which provides a success rate near 90% for genre classification, a 10% to 20% improvement over previous methods. For an actual user environment, the feature vector is extracted from a random interval of the query audio, and the method shows an overall 80% success rate except in extreme cases at the beginning and ending portions of the query audio file.
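Sequential Forward Selection, as named in this abstract, greedily grows a feature subset by adding whichever remaining feature most improves a score. The sketch below is a generic version under assumptions: the scoring criterion (leave-one-out 1-NN accuracy), the toy data, and the function names are hypothetical and are not taken from the paper, which selects 10 of 54 features.

```python
def loo_1nn_accuracy(X, y, features):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier
    restricted to the given feature indices (squared Euclidean distance)."""
    correct = 0
    for i, xi in enumerate(X):
        nearest = min(
            (j for j in range(len(X)) if j != i),
            key=lambda j: sum((xi[f] - X[j][f]) ** 2 for f in features),
        )
        correct += (y[nearest] == y[i])
    return correct / len(X)

def sequential_forward_selection(n_features, k, score):
    """Greedily pick k feature indices, adding the best-scoring one each round."""
    selected = []
    remaining = list(range(n_features))
    while len(selected) < k and remaining:
        best = max(remaining, key=lambda f: score(selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy data: only feature 1 separates the two classes cleanly.
X = [
    [0.9, 0.0, 0.3], [0.1, 0.1, 0.7], [0.5, 0.2, 0.1],
    [0.3, 1.0, 0.6], [0.8, 1.1, 0.2], [0.2, 0.9, 0.9],
]
y = [0, 0, 0, 1, 1, 1]

chosen = sequential_forward_selection(3, 2, lambda fs: loo_1nn_accuracy(X, y, fs))
print(chosen)
```

Because the search is greedy, it evaluates far fewer subsets than an exhaustive search, at the cost of possibly missing feature combinations that only work well together.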

