• Title/Summary/Keyword: Audio signal analysis

Emotion Recognition of Low Resource (Sindhi) Language Using Machine Learning

  • Ahmed, Tanveer;Memon, Sajjad Ali;Hussain, Saqib;Tanwani, Amer;Sadat, Ahmed
    • International Journal of Computer Science & Network Security
    • /
    • v.21 no.8
    • /
    • pp.369-376
    • /
    • 2021
  • Emotion recognition is one of the most active research areas in affective computing and signal processing. This paper proposes emotion recognition for a low-resource language, Sindhi. The work is distinctive in that it examines emotions in a language for which no publicly accessible dataset currently exists. It contributes a dataset named MAVDESS (Mehran Audio-Visual Database of Emotional Speech in Sindhi) to the academic community. Sindhi is a significant language spoken mainly in Pakistan, yet hardly any generic data for such languages is available for machine learning. The emotions in MAVDESS are analyzed and annotated using line features such as pitch, volume, and base, together with toolkits such as OpenSmile and Scikit-Learn and classification schemes such as LR, SVC, DT, and KNN, all implemented in Python to train a model. The dataset will be accessible via https://doi.org/10.5281/zenodo.5213073.

Joint Channel Coding Based on Principal Component Analysis

  • Hyun, Dong-Il;Lee, Dong-Geum;Park, Young-Cheol;Youn, Dae-Hee;Seo, Jeong-Il
    • ETRI Journal
    • /
    • v.32 no.5
    • /
    • pp.831-834
    • /
    • 2010
  • This paper proposes a new joint channel coding algorithm based on principal component analysis. A conventional joint channel coder using passive downmixing suffers a reduction of both the primary-to-ambient energy ratio (PAR) of the downmix signal and the panning gain ratio of the primary source. The proposed system preserves the PAR of the downmix signal by using active downmixing, which reflects spatial characteristics. The proposed system also improves the accuracy of the panning gain ratio estimation. Computer simulations and subjective listening tests verify the performance of the proposed system.

Effects Analysis of DRAM for Digital Signal Processor Performance (디지털 신호처리 프로세서의 성능에 대한 DRAM의 영향 분석)

  • Lee, Jongbok
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.18 no.3
    • /
    • pp.177-183
    • /
    • 2018
  • Digital signal processing systems are now used extensively in image processing, audio processing, filtering, equalization, and other applications. In addition, DRAM, which has a great influence on the performance of a digital signal processor, has grown in importance, and research on DRAM is being actively conducted in industry and academia. It is therefore important to have a more accurate DRAM model in order to obtain reliable results when evaluating the performance of a digital signal processor through simulation. In this paper, we develop a digital signal processor simulator capable of interworking with a DRAM simulator. With this simulator, we analyze the influence of a DRAM model that operates correctly on a cycle-by-cycle basis on the performance of the digital signal processor, using the UTDSP digital signal benchmark.

On-Line Audio Genre Classification using Spectrogram and Deep Neural Network (스펙트로그램과 심층 신경망을 이용한 온라인 오디오 장르 분류)

  • Yun, Ho-Won;Shin, Seong-Hyeon;Jang, Woo-Jin;Park, Hochong
    • Journal of Broadcast Engineering
    • /
    • v.21 no.6
    • /
    • pp.977-985
    • /
    • 2016
  • In this paper, we propose a new method for on-line genre classification using a spectrogram and a deep neural network. For on-line processing, the proposed method takes a 1-second segment of the audio signal as input and classifies it into one of three genres: speech, music, or effect. To keep the processing general, it uses the spectrogram as the feature vector instead of MFCCs, which have been widely used for audio analysis. We measure the genre classification performance on real TV audio signals and confirm that the proposed method outperforms the conventional method for all genres. In particular, it reduces the rate of classification errors between music and effect, which occur often in the conventional method.
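
The spectrogram feature mentioned in this abstract can be illustrated with a minimal sketch. The frame length, hop size, Hann window, and test tone below are assumptions chosen for illustration, not the settings used in the paper, and the naive DFT stands in for the FFT a practical system would use.

```python
import cmath
import math


def spectrogram(samples, frame_len=64, hop=32):
    """Return a list of magnitude spectra, one per Hann-windowed frame."""
    # Hann window reduces spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    n_bins = frame_len // 2 + 1  # keep non-negative frequencies only
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_len)]
        spectrum = []
        for k in range(n_bins):
            # Naive DFT of the windowed frame at frequency bin k.
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            spectrum.append(abs(acc))
        frames.append(spectrum)
    return frames


# Hypothetical input: a 1 kHz tone sampled at 8 kHz (256 samples).
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * n / sample_rate) for n in range(256)]
spec = spectrogram(tone)

# Bin spacing is sample_rate / frame_len = 125 Hz, so 1 kHz falls on bin 8.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
```

Each row of `spec` is one time frame. A classifier of the kind described would take a stack of such rows covering one second of audio as its input vector.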

MPEG-H 3D Audio Decoder Structure and Complexity Analysis (MPEG-H 3D 오디오 표준 복호화기 구조 및 연산량 분석)

  • Moon, Hyeongi;Park, Young-cheol;Lee, Yong Ju;Whang, Young-soo
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.42 no.2
    • /
    • pp.432-443
    • /
    • 2017
  • The primary goal of the MPEG-H 3D Audio standard is to provide immersive audio environments for high-resolution broadcasting services such as UHDTV. The standard incorporates a wide range of technologies, including encoding/decoding for multi-channel, object-based, and scene-based signals, rendering for 3D audio in various playback environments, and post-processing. The reference software decoder of the standard combines several modules and can operate in various modes. Because each module is an independent executable file and the modules are executed sequentially, real-time decoding is impossible. In this paper, we build DLL libraries of the core decoder, format converter, object renderer, and binaural renderer of the standard and integrate them to enable frame-based decoding. In addition, by measuring the computational complexity of each mode of the MPEG-H 3D Audio decoder, this paper provides a reference for selecting the appropriate decoding mode for various hardware platforms. The measurements show that the low-complexity profiles included in the Korean broadcasting standard have a computational complexity of 2.8 to 12.4 times that of the QMF synthesis operation when rendering to channel signals, and 4.1 to 15.3 times that of the QMF synthesis operation when rendering to binaural signals.

Measurement Procedure and Analysis of Terrestrial DTV Field Test in Taejeon (대전지역의 지상파 DTV 현장 측정 및 결과 분석)

  • 김종호;조진호;이형수;박재홍
    • The Journal of Korean Institute of Electromagnetic Engineering and Science
    • /
    • v.11 no.5
    • /
    • pp.830-838
    • /
    • 2000
  • This paper presents the measurement procedure and an analysis of terrestrial DTV field test results over the Taejeon city area. Thirty-three points were selected as measuring points. Signal power, noise power, segment error rate, RMS delay spread, and equalizer performance were measured. The video and audio quality of DTV was good at over half of the test sites. The equalizer could correct signal ghosts and improve S/N by up to 13.7 dB. Based on this test, the test procedure for DTV will be established.

A Source Separation Algorithm for Stereo Panning Sources (스테레오 패닝 음원을 위한 음원 분리 알고리즘)

  • Baek, Yong-Hyun;Park, Young-Cheol
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology
    • /
    • v.4 no.2
    • /
    • pp.77-82
    • /
    • 2011
  • In this paper, we investigate source separation algorithms for stereo audio mixed using the amplitude panning method. These source separation algorithms can be used in various applications such as up-mixing, speech enhancement, and high-quality sound source separation. The methods in this paper estimate the panning angles of individual signals by applying principal component analysis to time-frequency tiles of the input signal, and independently extract each signal through directional filtering. The performance of the methods was evaluated through computer simulations.
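
As a rough illustration of the panning-angle estimation described in this abstract, the sketch below applies PCA to one tile of left/right samples using the closed-form principal direction of a 2x2 covariance matrix. The panning law (left = cos θ · s, right = sin θ · s), the Gaussian directional weight, and all function names are assumptions made for this example. The paper's actual filter design is not given in the abstract.

```python
import math


def panning_angle(left, right):
    """Estimate the panning angle (radians) of one tile via 2x2 PCA."""
    # Unnormalized covariance entries of the (left, right) pair, zero mean assumed.
    # abs() and conjugate() let the same code accept real or complex tile values.
    c_ll = sum(abs(l) ** 2 for l in left)
    c_rr = sum(abs(r) ** 2 for r in right)
    c_lr = sum((l * r.conjugate()).real for l, r in zip(left, right))
    # Closed-form direction of the principal eigenvector of a symmetric 2x2 matrix.
    return 0.5 * math.atan2(2.0 * c_lr, c_ll - c_rr)


def directional_mask(angle, target, width=math.radians(10)):
    """Hypothetical Gaussian weight: near 1 when the tile angle matches the target."""
    return math.exp(-0.5 * ((angle - target) / width) ** 2)


# Hypothetical tile: one source amplitude-panned at 30 degrees.
theta = math.radians(30)
source = [math.sin(2 * math.pi * 3 * n / 64) for n in range(64)]
left = [math.cos(theta) * s for s in source]
right = [math.sin(theta) * s for s in source]

estimate = panning_angle(left, right)
keep = directional_mask(estimate, math.radians(30))    # tile belongs to this source
reject = directional_mask(estimate, math.radians(70))  # tile does not belong here
```

In a full separator, the weight would be computed for every time-frequency tile and multiplied into the mixture before resynthesis. Here a single tile is enough to show that the estimate recovers the panning angle.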

A Study on Bundang-line Urban Transit Operation Mode and Operation Algorithm Analysis of an ATC System (분당선 도시철도 운전모드와 ATC 시스템 동작알고리즘에 관한 연구)

  • Kim Jong-ki;Lee Key-soe
    • Proceedings of the KSR Conference
    • /
    • 2004.06a
    • /
    • pp.1247-1252
    • /
    • 2004
  • The ATC (Automatic Train Control) system employed on the Bundang urban transit line operates in accordance with automatic blocking equipment. Using AF (Audio Frequency) track circuits installed in each block section, the block signal is controlled automatically and the safety of train operation is supported. In this paper, we investigate the operation modes of the Bundang urban transit line and analyze the operation algorithm of the ATC on-board system.

Performance Improvement of Speech Enhancement Using Independent Component Analysis and Perceptual Filtering (독립 성분 분석과 지각 필터를 이용한 음질 개선)

  • Koo, Kyo-Sik;Cha, Hyung-Tai
    • The Journal of the Acoustical Society of Korea
    • /
    • v.29 no.4
    • /
    • pp.270-277
    • /
    • 2010
  • In this paper, we propose an algorithm that improves the tone quality of noisy audio signals using an ICA (Independent Component Analysis) algorithm and perceptual filters. Many algorithms have been proposed to eliminate noise from audio signals, such as the spectral subtraction method and the perceptual filter. The perceptual filter uses a noise estimate acquired from silent ranges of the input signal. In this case, the improvement in tone quality decreases if the noise energy changes within a signal frame because of environmental variation. The proposed method instead estimates the noise as it changes at each frame using the ICA algorithm, and applies the estimated noise to the perceptual filter. To show the performance of the proposed algorithm, several tests are performed on various input signals. The tests confirm an enhancement of tone quality in terms of segmental SNR (SSNR), noise-to-mask ratio (NMR), and the Degradation Category Rating (DCR) test.
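
Of the metrics listed in this abstract, segmental SNR is simple enough to sketch. The version below averages per-frame SNR values in dB. The frame length and the clamping range of -10 dB to 35 dB are common conventions assumed here, not values taken from the paper.

```python
import math


def segmental_snr(clean, enhanced, frame_len=160, lo=-10.0, hi=35.0):
    """Mean per-frame SNR in dB, with each frame clamped to [lo, hi]."""
    values = []
    for start in range(0, len(clean) - frame_len + 1, frame_len):
        s = clean[start:start + frame_len]
        e = enhanced[start:start + frame_len]
        signal = sum(x * x for x in s)
        noise = sum((x - y) ** 2 for x, y in zip(s, e))
        if signal == 0.0:
            continue  # skip silent frames, which carry no SNR information
        snr = hi if noise == 0.0 else 10.0 * math.log10(signal / noise)
        values.append(min(max(snr, lo), hi))
    return sum(values) / len(values) if values else float("nan")


# Hypothetical signals: the "enhanced" output is the clean tone scaled by 0.9,
# so the error is 10% of the signal amplitude in every frame.
clean = [math.sin(2 * math.pi * 5 * n / 160) for n in range(320)]
enhanced = [0.9 * x for x in clean]
ssnr_db = segmental_snr(clean, enhanced)
```

An amplitude error of 10% corresponds to an energy ratio of 100, so every frame in this example sits at about 20 dB.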

Subjective Listening Test based on Frontal Loudspeaker Array Reproduction System (전방 스피커 어레이 재생 방식 기반 음향 재현 성능 평가)

  • Yoo, Jae-hyoun;Jang, Daeyoung;Lee, Taejin
    • Journal of Broadcast Engineering
    • /
    • v.20 no.5
    • /
    • pp.667-675
    • /
    • 2015
  • As interest in high-definition, high-quality broadcasting grows, demand for high-quality sound is increasing along with demand for video quality. One factor contributing to high-quality audio is the expansion of reproduction channels, as in 10.2-channel and 22.2-channel systems, but installing that many loudspeakers is a practical problem. One solution is a frontal loudspeaker array reproduction technique that creates virtual surround sound. In this paper, we introduce a theoretical analysis of Wave Field Synthesis (WFS) as used for loudspeaker-array-based sound reproduction, and present the results of a subjective listening test that checks the reproduction performance of a system based on this technique. The results show that the WFS-based frontal loudspeaker array reproduction method can provide sufficient performance compared with the conventional discrete 5.1-channel reproduction method.