• Title/Summary/Keyword: Audio signal


Implementation of Low Complexity FFT, ADC and DAC Blocks of an OFDM Transmitter Receiver Using Verilog

  • Joshi, Alok;Gupta, Dewansh Aditya;Jaipuriyar, Pravriti
    • Journal of Information Processing Systems / v.15 no.3 / pp.670-681 / 2019
  • Orthogonal frequency division multiplexing (OFDM) encodes data on multiple carriers instead of the traditional single carrier. This improves spectral efficiency (better use of bandwidth) and lessens the effects of fading and intersymbol interference (ISI). In 1995, digital audio broadcast (DAB) became the first standard to adopt OFDM. In 1997, it was adopted for digital video broadcast (DVB), and it is now used in the WiMAX and LTE standards. In this project, a Verilog design is employed to implement an OFDM transmitter (DAC block) and receiver (FFT and ADC block). OFDM generally uses the IFFT for modulation and the FFT for demodulation. In this paper, a 16-point decimation-in-frequency (DIF) FFT using the radix-2 algorithm and the direct summation method are analyzed. The ADC and DAC, which convert the signal from analog to digital and vice versa in OFDM, are also analyzed. All designs are simulated in Verilog on the ModelSim simulator. The output of the FFT block after Verilog simulation is also verified against MATLAB.
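
To illustrate the two methods this abstract compares, here is a minimal Python sketch of direct summation and a radix-2 DIF FFT. It is not the authors' Verilog design: the function names `dft_direct` and `fft_dif_radix2` are hypothetical, and the recursive floating-point form is an assumption chosen for readability rather than hardware efficiency.

```python
import cmath


def dft_direct(x):
    """Direct summation: X[k] = sum over n of x[n] * exp(-2j*pi*k*n/N). Cost is O(N^2)."""
    n_pts = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / n_pts) for n in range(n_pts))
        for k in range(n_pts)
    ]


def fft_dif_radix2(x):
    """Recursive radix-2 decimation-in-frequency FFT. len(x) must be a power of two."""
    n_pts = len(x)
    if n_pts < 1 or n_pts & (n_pts - 1):
        raise ValueError("input length must be a power of two")
    if n_pts == 1:
        return list(x)
    half = n_pts // 2
    # Butterfly stage: the sums feed the even-indexed outputs,
    # the twiddled differences feed the odd-indexed outputs.
    top = [x[n] + x[n + half] for n in range(half)]
    bottom = [
        (x[n] - x[n + half]) * cmath.exp(-2j * cmath.pi * n / n_pts)
        for n in range(half)
    ]
    out = [0j] * n_pts
    out[0::2] = fft_dif_radix2(top)
    out[1::2] = fft_dif_radix2(bottom)
    return out
```

For a 16-sample input, both functions should agree to within floating-point rounding. The direct form evaluates 16 × 16 complex products, while the radix-2 form uses four butterfly stages.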

A Study on Unmanned Image Tracking System based on Smart Phone (스마트폰 기반의 무인 영상 추적 시스템 연구)

  • Ahn, Byeong-tae
    • Journal of Convergence for Information Technology / v.9 no.3 / pp.30-35 / 2019
  • Unattended recording systems based on smartphone image tracking are developing rapidly. Among existing products, systems that automatically track the subject and rotate toward it using an infrared signal are very expensive for general users. This paper therefore proposes a mobile unattended recording system that enables automatic recording by anyone who uses a smartphone. The system consists of a commercial mobile camera, a servomotor that moves the camera from side to side, a microcontroller that controls the motor, and a commercial wireless Bluetooth earset for the video's audio input. In this paper, we designed a system that enables unattended recording through image tracking using a smartphone.

A completely non-contact recognition system for bridge unit influence line using portable cameras and computer vision

  • Dong, Chuan-Zhi;Bas, Selcuk;Catbas, F. Necati
    • Smart Structures and Systems / v.24 no.5 / pp.617-630 / 2019
  • Currently, most vision-based structural identification research focuses either on structural input (vehicle location) estimation or on structural output (structural displacement and strain responses) estimation. Global-level structural condition assessment based only on vision-based structural output cannot give a normalized response that is independent of vehicle type and/or load configuration. Combining the vision-based structural input with the structural output from non-contact sensors overcomes this disadvantage while reducing cost, time, and labor, including cable wiring work. In conventional traffic monitoring, traffic closure is sometimes essential for bridge structures, which may cause other severe problems such as traffic jams and accidents. In this study, a completely non-contact structural identification system is proposed; the system mainly targets the identification of the bridge unit influence line (UIL) under operational traffic. Both the structural input (vehicle location information) and output (displacement responses) are obtained using only cameras and computer vision techniques. Multiple cameras are synchronized by audio signal pattern recognition. The proposed system is verified with a laboratory experiment on a scaled bridge model under a small moving truck load and a field application on a campus footbridge under a moving golf cart load. The UILs are successfully identified in both bridge cases. Pedestrian loads are also estimated with the extracted UIL, and the predicted pedestrian weights are observed to be in acceptable ranges.
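
The abstract does not say how the audio signal pattern recognition works. As one plausible illustration only, the sketch below estimates the offset between two cameras' audio tracks by brute-force cross-correlation, a common swapped-in technique and not necessarily the authors' method. The names `best_lag` and `lag_to_seconds` and the sample rate are hypothetical.

```python
def best_lag(ref, other, max_lag):
    """Return the lag in samples by which `other` trails `ref`.

    Scores every lag in [-max_lag, max_lag] with a plain cross-correlation
    sum and keeps the highest-scoring one (assumed method, for illustration).
    """
    best, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, r in enumerate(ref):
            j = i + lag
            if 0 <= j < len(other):
                score += r * other[j]
        if score > best_score:
            best, best_score = lag, score
    return best


def lag_to_seconds(lag, sample_rate_hz):
    """Convert a lag in samples to seconds for a given audio sample rate."""
    return lag / sample_rate_hz
```

A positive result means the second recording started later relative to the shared sound. Shifting its video frames by `lag_to_seconds(lag, sample_rate_hz)` would then align the two cameras, under the assumption that each camera's audio and video share a clock.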

Infant cry recognition using a deep transfer learning method (딥 트랜스퍼 러닝 기반의 아기 울음소리 식별)

  • Bo, Zhao;Lee, Jonguk;Atif, Othmane;Park, Daihee;Chung, Yongwha
    • Proceedings of the Korea Information Processing Society Conference / 2020.11a / pp.971-974 / 2020
  • Infants express their physical and emotional needs to the outside world mainly through crying. However, most parents find it challenging to understand the reason behind their babies' cries. Failure to correctly understand the cause of a baby's cry and take appropriate action can affect the cognitive and motor development of newborns undergoing rapid brain development. In this paper, we propose an infant cry recognition system based on deep transfer learning to help parents identify crying babies' needs the same way a specialist would. The proposed system transforms the waveform of the cry signal into a log-mel spectrogram, then uses the VGGish model pre-trained on AudioSet to extract a 128-dimensional feature vector from the spectrogram. Finally, a softmax function is used to classify the extracted feature vector and recognize the corresponding type of cry. The experimental results show that our method performs well, exceeding 0.96 in precision, recall, and F1-score.
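
The final classification step described above can be sketched as a linear layer followed by a softmax over a 128-dimensional feature vector. This is a minimal illustration, not the authors' trained model: the weights, biases, and cry labels are hypothetical placeholders, and the VGGish feature extraction is assumed to have been done already.

```python
import math


def softmax(scores):
    """Numerically stable softmax: subtract the max score before exponentiating."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(embedding, weights, biases, labels):
    """Score each class with a linear layer, then pick the most probable label.

    `weights` holds one row per class, each as long as `embedding`
    (128 values for a VGGish-style feature vector). All values here are
    assumed inputs, not parameters taken from the paper.
    """
    scores = [
        sum(w * x for w, x in zip(row, embedding)) + b
        for row, b in zip(weights, biases)
    ]
    probs = softmax(scores)
    best = max(range(len(labels)), key=probs.__getitem__)
    return labels[best], probs
```

In practice the weights would be learned from labeled cry recordings. The sketch only shows how class probabilities follow from the extracted feature vector.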

On the Principles and Applications of Wave Field Synthesis (WFS의 원리와 활용에 관하여)

  • Yoo, Jae-Hyoun;Shim, Hwan;Chung, Hyun-Joo;Sung, Koeng-Mo;Kang, Kyeong-Ok
    • The Journal of the Acoustical Society of Korea / v.28 no.8 / pp.688-696 / 2009
  • There are many studies on Wave Field Synthesis (WFS), which provides better presence and spaciousness than conventional discrete multichannel audio reproduction methods. However, it has several problems, such as the need for a listener-enclosing loudspeaker array and pre-authorized object-based source signals, so it is not widely used outside large-scale listening rooms. This paper presents a method that brings the merits of WFS to small listening rooms such as a living room.

Multimodal audiovisual speech recognition architecture using a three-feature multi-fusion method for noise-robust systems

  • Sanghun Jeon;Jieun Lee;Dohyeon Yeo;Yong-Ju Lee;SeungJun Kim
    • ETRI Journal / v.46 no.1 / pp.22-34 / 2024
  • Exposure to varied noisy environments impairs the recognition performance of artificial intelligence-based speech recognition technologies. Degraded-performance services can be utilized as limited systems that assure good performance in certain environments, but impair the general quality of speech recognition services. This study introduces an audiovisual speech recognition (AVSR) model robust to various noise settings, mimicking human dialogue recognition elements. The model converts word embeddings and log-Mel spectrograms into feature vectors for audio recognition. A dense spatial-temporal convolutional neural network model extracts features from log-Mel spectrograms, transformed for visual-based recognition. This approach exhibits improved aural and visual recognition capabilities. We assess the signal-to-noise ratio in nine synthesized noise environments, with the proposed model exhibiting lower average error rates. The error rate for the AVSR model using a three-feature multi-fusion method is 1.711%, compared to the general 3.939% rate. This model is applicable in noise-affected environments owing to its enhanced stability and recognition rate.

Development of ATSC3.0 based UHDTV Broadcasting System providing Ultra-high-quality Service that supports HDR/WCG Video and 3D Audio, and a Fixed UHD/Mobile HD Service (HDR/WCG 비디오와 3D 오디오를 지원하는 초고품질 방송서비스와 고정 UHD/이동 HD 방송 서비스를 제공하는 ATSC 3.0 기반 UHDTV 방송 시스템 개발)

  • Ki, Myungseok;Seok, Jinwuk;Beack, Seungkwon;Jang, Daeyoung;Lee, Taejin;Kim, Hui Yong;Oh, Hyeju;Lim, Bo-mi;Bae, Byungjun;Kim, Heung Mook;Choi, Jin Soo
    • Journal of Broadcast Engineering / v.22 no.6 / pp.829-849 / 2017
  • Driven by large-screen TV displays, the convergence of broadcasting and broadband, and advances in signal compression and transmission technology, terrestrial digital broadcasting has evolved into UHD broadcasting capable of delivering fixed UHD and mobile HD simultaneously. The Korean standard for terrestrial UHDTV broadcasting is based on ATSC 3.0, the broadcasting standard of North America. As its new AV codecs, the standard adopts the HEVC video codec, which compresses more efficiently than AVC, and the MPEG-H 3D audio codec for realistic audio. DASH and MMT are adopted as transmission formats instead of MPEG-2 TS to support broadband as well as broadcasting networks, and ROUTE multiplexing technology is applied to provide 4K UHD and mobile HD services simultaneously. In this paper, we propose an audio/video encoder required to provide a high-quality video service with HDR/WCG support, a stereo sound service supporting 10.2 channels/4 objects, and simultaneous fixed UHD and mobile HD broadcasting based on ATSC 3.0. We also implemented an ATSC 3.0 LDM system for the ROUTE/DASH packager, multiplexing system, and physical-layer transmission/reception, and verified its serviceability by applying it to a real-time broadcast environment.

Utility Estimation of the Application of Auditory-Visual-Tactile Sense Feedback in Respiratory Gated Radiation Therapy (호흡동조방사선치료 시 Real Time Monitor와 Ventilator의 유용성 평가)

  • Jo, Jung Hun;Kim, Byeong Jin;Roh, Shi Won;Lee, Hyeon Chan;Jang, Hyeong Jun;Kim, Hoi Nam;Song, Jae Hun;Kim, Young Jae
    • The Journal of Korean Society for Radiation Therapy / v.25 no.1 / pp.33-40 / 2013
  • Purpose: The purpose of this study was to evaluate whether breathing guided by auditory-visual-tactile feedback can optimize gated treatment delivery time and maintain stable respiration. Materials and Methods: The experimenters' respiration was measured with the ANZAI 4D system. Using a real-time monitor, we obtained natural breathing, monitor-induced breathing, monitor- and ventilator-induced breathing, and breath-hold signals during 10 minutes of beam-on time. To check stability, the respiratory signals in each group were compared by mean, standard deviation, variation value, and beam time. Results: The stability of each type of respiration was assessed from the change in deviation over the course of each respiratory recording. Analysis of the respiratory signals showed that, for all experimenters, breathing guided by both the real-time monitor and the ventilator was the most stable and took the shortest time. Conclusion: This study compared respiratory gated radiation therapy with and without auditory-visual-tactile feedback. It showed that the delivery time of respiratory gated radiation therapy could be significantly improved by applying video feedback combined with audio-tactile assistance. The technique proved feasible for limiting tumor motion during treatment delivery to a defined value for all patients while maintaining accuracy, and proved applicable in a conventional clinical schedule.


Sound System Design and Characteristic Analysis based on Power Line Communication (전력선통신 기반 음향 시스템 설계 및 특성 분석)

  • Kim, Kwan-Kyu;Yeom, Keong-Tae;Kim, Kwan-Woong;Kim, Yong-Kab
    • The Journal of the Korea Contents Association / v.8 no.6 / pp.1-7 / 2008
  • This paper addresses the problems of existing sound systems: difficult system organization, additional installation cost, and an unfriendly interior. To solve them, we devised and studied a new sound system based on power line communication (PLC). A transmitter and a receiver were designed using the PLC chip INT5500CS. The sound system consisted of a CD player, whose sound signals are sent from the transmitter, and a speaker connected to the receiver. To analyze the system's characteristics, a USBPre external sound card and Smaart Live 5, a PC-based sound measurement program, were added. In our experiment, the measured signal level was 2 to 3 dB lower than the reference signal, latency was 16.69 ms, and coherence was poor in the high-frequency band. On the other hand, measurements of pink noise, a 1 kHz frequency, phase, and magnitude showed that the system transmits and receives over 90% of signals in good condition. In view of these results, the system designed by our team has excellent performance and resolves the defects of existing audio signal transmission systems.

Manipulation of the Compressed Video for Multimedia Networking : A Bit rate Shaping of the Compressed Video (멀티미디어 네트워킹을 위한 압축 신호상에서 동영상 처리 : 압축 동영상 비트율 변환)

  • 황대환;조규섭;황수용
    • The Journal of Korean Institute of Communications and Information Sciences / v.26 no.11A / pp.1908-1924 / 2001
  • Interoperability and interworking across network and media environments with different technological backgrounds are very important for broadening access to services and increasing their competitiveness. The ITU-T and advanced countries are planning for the provision of a GII that enables users to access advanced global communication services supporting multimedia communication applications and embracing all modes of information. In this paper, we focus on the heterogeneity of end-user applications in multimedia networking. This heterogeneity has several technical aspects, such as different medium access methods and heterogeneous coding algorithms for audio-visual data. Among these, we address a bit rate shaping algorithm for compressed moving video. Previous video manipulation has been done in the uncompressed signal domain; that is, the compressed video must first be converted to a linear PCM signal. This requires decoding, manipulating, and then re-encoding the video into a compressed signal. This traditional approach has several critical weaknesses: implementation complexity, degradation of image quality, and large processing delay. The bit rate shaping algorithm proposed in this paper manipulates moving video entirely in the compressed domain to overcome these deficiencies. With this algorithm, we achieved efficient video bit rate shaping, and software simulation results show that the method has a significant advantage over pixel-oriented algorithms.
