• Title/Summary/Keyword: Audio Signal

A Study about the Users's Preferred Playing Speeds on Categorized Video Content using WSOLA method (WSOLA를 이용한 동영상 미세배속 재생 서비스에 대한 콘텐츠별 배속 선호도 분석 연구)

  • Kim, I-Gil
    • Journal of Digital Contents Society / v.16 no.2 / pp.291-298 / 2015
  • In a fast-paced information technology environment, consumption of video content is changing from one-way television viewing to VOD (Video on Demand) playing anywhere, anytime, on any device. This video-watching trend gives additional importance to videos with fine speed control, in addition to the strengths of the digital video signal. Currently, many video players provide a fine-speed-control function that can speed up the video to skip a boring part or slow it down to focus on an exciting scene. The audio information is just as important as the visual information for understanding the content of a speed-controlled video. Thus, a number of algorithms for fine-speed-control video playing have been proposed in the audio-processing area to solve the pitch distortion problem. In this study, WSOLA (Waveform-Similarity-Based Overlap-Add), a well-known technique for prosodic modification of speech signals, is applied to analyze users' needs for fine-speed-control video playing. By surveying users' preferred speeds on categorized video content and analyzing the results, this paper proposes that a range of fine speed adjustments is needed to accommodate users' preferred ways of consuming video.
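The abstract does not spell out the WSOLA variant used, so the following is only a minimal sketch of the waveform-similarity overlap-add idea, not the paper's implementation. The function name `wsola_like_stretch`, the frame, overlap, and search sizes, the unnormalized correlation measure, and the linear crossfade are all assumptions chosen for illustration. It changes playback speed without resampling, so the pitch of each copied segment is left untouched.

```python
import math


def wsola_like_stretch(x, rate, frame=400, overlap=100, search=50):
    """Time-scale a list of samples; rate > 1 plays faster, rate < 1 slower.

    Simplified sketch: for every output frame, search near the nominal
    input position for the segment whose start best matches the current
    output tail, then crossfade it in.
    """
    hop_out = frame - overlap               # fixed synthesis hop
    hop_in = int(round(hop_out * rate))     # analysis hop sets the speed
    out = list(x[:frame])
    pos = hop_in
    while pos + search + frame <= len(x):
        tail = out[-overlap:]

        def similarity(k):
            # Unnormalized cross-correlation between the output tail and
            # the candidate segment start (an illustrative choice).
            cand = x[pos + k:pos + k + overlap]
            return sum(a * b for a, b in zip(tail, cand))

        best_k = max(range(search + 1), key=similarity)
        seg = x[pos + best_k:pos + best_k + frame]
        # Linear crossfade over the overlapping region.
        for i in range(overlap):
            w = i / overlap
            out[i - overlap] = (1.0 - w) * out[i - overlap] + w * seg[i]
        out.extend(seg[overlap:])
        pos += hop_in
    return out


# One second of a 220 Hz tone at an assumed 8 kHz sampling rate.
tone = [math.sin(2 * math.pi * 220 * n / 8000) for n in range(8000)]
fast = wsola_like_stretch(tone, 2.0)   # roughly half as long
slow = wsola_like_stretch(tone, 0.5)   # roughly twice as long
```

Fine speed steps such as 1.1x or 1.3x correspond to passing those values as `rate`; the search window is what keeps the overlapping waveforms aligned so that the crossfade does not introduce audible discontinuities.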

Audio Stream Delivery Using AMR(Adaptive Multi-Rate) Coder with Forward Error Correction in the Internet (인터넷 환경에서 FEC 기능이 추가된 AMR음성 부호화기를 이용한 오디오 스트림 전송)

  • 김은중;이인성
    • The Journal of Korean Institute of Communications and Information Sciences / v.26 no.12A / pp.2027-2035 / 2001
  • In this paper, we present an audio stream delivery scheme for the Internet using the AMR (Adaptive Multi-Rate) coder, which was adopted by ETSI and 3GPP as a standard vocoder for next-generation IMT-2000 services. The scheme combines a sender-based technique (FEC) with a receiver-based reconstruction technique. With the media-specific FEC scheme, repair data is added to the main data stream so that the contents of lost packets can be recovered, which greatly increases the chance of recovering lost packets. Because the AMR codec is based on the code-excited linear predictive (CELP) coding model, we use a frame erasure concealment method for CELP-based coders. The proposed scheme is evaluated with the ITU-T G.729 (CS-ACELP) coder and AMR at 12.2 kbit/s using SNR (Signal-to-Noise Ratio) and MOS (Mean Opinion Score) tests. At 10% packet loss, the proposed scheme scores 1.1 higher in MOS and 5.61 dB higher in SNR than AMR at 12.2 kbit/s, and it maintains communicable speech quality at frame erasure rates up to 20%.
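As a rough illustration of media-specific FEC, the sketch below piggybacks a redundant copy of the previous frame onto each packet so that a single lost packet can be rebuilt from its successor. The packet layout, the names `packetize` and `receive`, and the use of an identical copy as repair data are assumptions; the abstract does not state how the paper encodes its repair data.

```python
def packetize(frames):
    """Attach a redundant copy of the previous frame to each packet."""
    packets = []
    previous = None
    for seq, frame in enumerate(frames):
        packets.append({"seq": seq, "primary": frame, "redundant": previous})
        previous = frame
    return packets


def receive(packets, n_frames):
    """Rebuild the frame sequence; None marks a frame that stayed lost."""
    frames = [None] * n_frames
    for packet in packets:
        if packet is None:          # packet dropped by the network
            continue
        seq = packet["seq"]
        frames[seq] = packet["primary"]
        # Use the piggybacked copy only if the earlier frame is missing.
        if packet["redundant"] is not None and frames[seq - 1] is None:
            frames[seq - 1] = packet["redundant"]
    return frames


sent = packetize(["f0", "f1", "f2", "f3"])
sent[1] = None                      # simulate a single packet loss
recovered = receive(sent, 4)        # all four frames are available
```

Two consecutive losses defeat this one-frame redundancy, which is where receiver-side frame erasure concealment of the kind the abstract mentions would take over.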

Using a H/W ADL-based Compiler for Fixed-point Audio Codec Optimization thru Application Specific Instructions (응용프로그램에 특화된 명령어를 통한 고정 소수점 오디오 코덱 최적화를 위한 ADL 기반 컴파일러 사용)

  • Ahn Min-Wook;Paek Yun-Heung;Cho Jeong-Hun
    • The KIPS Transactions:PartA / v.13A no.4 s.101 / pp.275-288 / 2006
  • Rapid design space exploration is crucial for customizing an embedded system design to exploit application behavior. As time-to-market becomes a key design concern, the approach based on an application-specific instruction-set processor (ASIP) is being considered more seriously as an alternative design methodology. In this approach, the instruction set architecture (ISA) of a target processor is frequently modified to best fit the application with regard to code size and speed. The two goals of this paper are to introduce our new retargetable compiler and to show how it has been used in ASIP-based design space exploration for a popular digital signal processing (DSP) application. The newly developed retargetable compiler not only provides the functionality of previous retargetable compilers but also visualizes the features of the application program and profiles it, which helps architecture designers and application programmers insert new application-specific instructions into the target architecture to increase performance. Given an initial RISC-style ISA for the target processor, we characterized the application code and incrementally updated the ISA with more application-specific instructions to give the compiler a better chance to optimize the assembly code for the application. Using 6 audio-codec-specific instructions obtained with the retargetable compiler, we achieve a 32% performance increase and a 20% program size reduction. Our experimental results provide some evidence that a highly retargetable compiler is essential for rapidly prototyping a new ASIP for a specific application.

A study on the application of residual vector quantization for vector quantized-variational autoencoder-based foley sound generation model (벡터 양자화 변분 오토인코더 기반의 폴리 음향 생성 모델을 위한 잔여 벡터 양자화 적용 연구)

  • Seokjin Lee
    • The Journal of the Acoustical Society of Korea / v.43 no.2 / pp.243-252 / 2024
  • Among the Foley sound generation models that have recently begun to be studied, a sound generation technique that combines the Vector Quantized-Variational AutoEncoder (VQ-VAE) structure with a generation model such as Pixelsnail is one of the important research subjects. Meanwhile, in the field of deep learning-based acoustic signal compression, residual vector quantization is reported to be more suitable than the conventional VQ-VAE structure. This paper therefore studies whether residual vector quantization can be applied effectively to Foley sound generation. To this end, the paper applies the residual vector quantization technique to the conventional VQ-VAE-based Foley sound generation model and, in particular, derives a model that is compatible with existing models such as Pixelsnail and does not increase computational resource consumption. To evaluate the model, an experiment was conducted using DCASE2023 Task7 data. The results show that the proposed model improves the Fréchet audio distance by about 0.3. The improvement was limited, however, which is believed to be due to the reduced time-frequency resolution required to avoid increasing computational resource consumption.
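Residual vector quantization itself can be sketched in a few lines: each stage quantizes what the previous stage left over, and decoding sums the selected codewords. The toy codebooks, the vector size, and the function names below are assumptions for illustration and are unrelated to the model evaluated in the paper.

```python
def nearest(codebook, vector):
    """Index of the codeword with the smallest squared distance."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((c - v) ** 2 for c, v in zip(codebook[i], vector)),
    )


def rvq_encode(vector, codebooks):
    """Quantize in stages; each stage encodes the previous residual."""
    residual = list(vector)
    indices = []
    for codebook in codebooks:
        i = nearest(codebook, residual)
        indices.append(i)
        residual = [r - c for r, c in zip(residual, codebook[i])]
    return indices


def rvq_decode(indices, codebooks):
    """Sum the selected codeword from every stage."""
    out = [0.0] * len(codebooks[0][0])
    for i, codebook in zip(indices, codebooks):
        out = [o + c for o, c in zip(out, codebook[i])]
    return out


# Hypothetical two-stage codebooks: a coarse stage and a fine stage.
coarse = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
fine = [[0.0, 0.0], [0.25, -0.25], [-0.25, 0.25]]
codes = rvq_encode([1.2, 0.8], [coarse, fine])   # [1, 1]
approx = rvq_decode(codes, [coarse, fine])       # [1.25, 0.75]
```

With a single stage the result would be `[1.0, 1.0]`; the second stage reduces the error by spending one more index per vector, which is the trade-off between code length and fidelity that residual quantization exposes.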

Design and Implementation of Fire distress Detection and Rescue user Terminal (소방조난 탐지구조 단말장치 설계 및 제작)

  • Kim, Kun-Joong;Na, Sang-Guen;Kim, Young-Wan
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2012.05a / pp.557-559 / 2012
  • A fire distress detection and rescue user terminal, which helps rescue survivors by using direction finding of the distress location and sensing techniques, was designed and implemented. The user terminal provides a rescue function in harsh environments where communication facilities are unavailable. The terminal has a portable configuration consisting of an RF board with a radio frequency of 2.45 GHz and an internal antenna, and a control board. The internal antenna, which has $60^{\circ}$ or $120^{\circ}$ directivity, detects the rescue signal from the survivor using triangulation. Rescue operations are managed by allotting user IDs, and a bidirectional audio channel is available on a radio frequency of 5.8 GHz.

Speech Packet Transmission Using the AMR-WB Coder with FEC (FEC기능을 추가한 AMR-WB 음성 부호화기를 이용한 음성 패킷 전송)

  • 황정준;이인성
    • Journal of the Institute of Electronics Engineers of Korea TC / v.40 no.11 / pp.63-71 / 2003
  • This paper proposes a packet loss recovery method for real-time communication over the Internet. To reduce the effects of packet loss, Forward Error Correction (FEC), which adds redundant information to voice packets, can be used. We use the Adaptive Multi-Rate Wideband (AMR-WB) codec, which was recently selected by the Third Generation Partnership Project (3GPP) for GSM and the third-generation mobile communication WCDMA system and has also been standardized by ITU-T for wideband speech services. The major cause of speech quality degradation in IP networks is packet loss. We therefore recover a single lost packet using the FEC method and conceal consecutive losses. The proposed scheme is evaluated in the Gilbert Internet channel model. High audio quality is maintained at packet loss rates up to 30%.
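The abstract does not give the parameters of its Gilbert channel, so the sketch below assumes the common two-state form: a good state in which packets arrive and a bad state in which they are lost, with hypothetical transition probabilities `p` (good to bad) and `q` (bad to good). It is meant only to show why such a model produces bursty losses.

```python
import random


def gilbert_losses(n, p, q, seed=0):
    """Simulate n packets; return a list where True means the packet is lost.

    p: probability of moving from the good state to the bad state.
    q: probability of moving from the bad state back to the good state.
    """
    rng = random.Random(seed)
    lost = False                    # start in the good state
    pattern = []
    for _ in range(n):
        if lost:
            lost = rng.random() >= q    # stay lost with probability 1 - q
        else:
            lost = rng.random() < p
        pattern.append(lost)
    return pattern


# Long-run loss rate is p / (p + q); the mean burst length is 1 / q.
pattern = gilbert_losses(10000, p=0.1, q=0.4)
loss_rate = sum(pattern) / len(pattern)
```

Because losses cluster into bursts, a scheme that recovers only one lost packet per burst needs a separate concealment step for longer bursts, which matches the split between FEC recovery and concealment described in the abstract.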

Sound event detection based on multi-channel multi-scale neural networks for home monitoring system used by the hard-of-hearing (청각 장애인용 홈 모니터링 시스템을 위한 다채널 다중 스케일 신경망 기반의 사운드 이벤트 검출)

  • Lee, Gi Yong;Kim, Hyoung-Gook
    • The Journal of the Acoustical Society of Korea / v.39 no.6 / pp.600-605 / 2020
  • In this paper, we propose a sound event detection method using multi-channel multi-scale neural networks for a sound-sensing home monitoring system for the hearing impaired. In the proposed system, two channels with high signal quality are selected from several wireless microphone sensors in the home. Three features extracted from the sensor signals (time difference of arrival, pitch range, and the outputs obtained by applying a multi-scale convolutional neural network to the log mel spectrogram) are fed to a classifier based on a bidirectional gated recurrent neural network to further improve sound event detection performance. The detected sound event is converted into text along with the sensor position of the selected channel and presented to the hearing-impaired user. The experimental results show that the sound event detection method of the proposed system outperforms the existing method and can effectively deliver sound information to the hearing impaired.

Implementation of NTSC TV Transmitter Module (NTSC TV Transmitter Module의 구현)

  • Kim Kwang-Tae;Sim Myoung-Su
    • Journal of the Institute of Electronics Engineers of Korea SC / v.43 no.2 s.308 / pp.28-32 / 2006
  • In this paper, an NTSC TV transmitter module is designed and built that makes it possible to play motion pictures not only on a TV but also on a portable TV. The module receives video and audio signals from a mobile device and modulates them onto NTSC TV channel 4. Because the signals are transmitted wirelessly, it offers the convenience of watching motion pictures from a mobile device on a TV without any cable. However, it has the drawbacks of a long antenna and sensitivity to noise. In the future, if problems such as antenna size, signal distortion, and noise can be solved through continued research on the radio-frequency part, it will be possible to play mobile motion pictures on more media, such as a camcorder or DVD player.

Design of QPSK Ultrasonic Transceiver For Underwater Communication (수중 통신을 위한 QPSK 초음파 송수신기의 설계)

  • Cho Nai-Hyun;Kim Duk-Yung;Kim Yong-Deuk;Chung Yun-Mo
    • Journal of the Institute of Electronics Engineers of Korea SC / v.43 no.3 s.309 / pp.51-59 / 2006
  • In this paper, we propose an ultrasonic transceiver system based on a QPSK modulation technique for underwater communication. The transmitter sends a still image at a level of 187 dB re $1\,{\mu}\mathrm{Pa/V}$ at 1 m through a power amplifier by driving an ultrasonic sensor. The receiver performs digital conversion at a 100 kHz sampling frequency, followed by demodulation and decoding of the image sent from the transmitter through the underwater channel. We show that the processed image at the receiver is almost the same as the original one. The maximum detection distance of the proposed system is approximately 1.17 km. To cope with transmission loss, this paper proposes, implements, and analyzes important parameters of the sensors and circuits used in the system. Most underwater communication work has focused on the transmission of audio signals, whereas this paper presents an efficient underwater communication system for still image transmission.
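For readers unfamiliar with QPSK, the baseband mapping can be sketched as follows: each pair of bits selects one of four unit-energy complex symbols, and the receiver decides each bit from the sign of the in-phase or quadrature component. The specific bit-to-quadrant mapping and the function names are assumptions; carrier modulation, the ultrasonic channel, and the paper's decoding chain are not modeled.

```python
import math


def qpsk_modulate(bits):
    """Map bit pairs to unit-energy symbols (bit 0 -> +, bit 1 -> -)."""
    if len(bits) % 2:
        raise ValueError("QPSK needs an even number of bits")
    scale = 1 / math.sqrt(2)
    return [
        complex(1 - 2 * bits[i], 1 - 2 * bits[i + 1]) * scale
        for i in range(0, len(bits), 2)
    ]


def qpsk_demodulate(symbols):
    """Decide each bit from the sign of the real or imaginary part."""
    bits = []
    for s in symbols:
        bits.append(1 if s.real < 0 else 0)
        bits.append(1 if s.imag < 0 else 0)
    return bits


payload = [0, 0, 0, 1, 1, 1, 1, 0]
symbols = qpsk_modulate(payload)
# A small fixed disturbance stands in for channel noise.
noisy = [s + complex(0.1, -0.2) for s in symbols]
decoded = qpsk_demodulate(noisy)    # equals payload
```

Adjacent quadrants differ in exactly one bit under this mapping, so a symbol pushed across a single decision boundary causes only one bit error.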

Concealment of Propagation Delay using Synchronized overlap-add Algorithm in Internet Phone (인터넷 폰에서 Synchronized overlap-add 알고리즘을 이용한 전송지연 보상 기법)

  • Nam, Jae-Hyun;Lee, Jung-Tae
    • Journal of KIISE:Information Networking / v.28 no.4 / pp.540-549 / 2001
  • Internet telephony service is much cheaper than POTS and makes it much easier to introduce value-added services, but it is difficult to guarantee the QoS of telephone service. The existing Internet typically offers only 'best effort' service, which makes no commitment about delay, packet loss, or jitter. This paper compensates for speech quality degraded by packet loss or delay using the SOLA algorithm in an Internet phone. SOLA is a popular technique for time-scale modification of speech and audio signals. In the proposed algorithm, the receiver expands the received packet within a reasonable threshold and thereby compensates for the QoS of the speech. Simulation shows that the algorithm conceals packet loss considerably and also improves speech quality.
