Search | Korea Science

Restoration of damaged speech files using deep neural networks (심층 신경망을 활용한 손상된 음성파일 복원 자동화)

Heo, Hee-Soo;So, Byung-Min;Yang, IL-Ho;Yoon, Sung-Hyun;Yu, Ha-Jin
- The Journal of the Acoustical Society of Korea / v.36 no.2 / pp.136-143 / 2017
In this paper, we propose a method for restoring damaged audio files using a deep neural network. It differs from conventional restoration based on file carving. The purpose of our method is to infer lost information that cannot be restored by existing techniques such as file carving. We have devised methods to automate tasks that are essential for restoration but ill-suited to humans. The study shows that damaged files which conventional file carving could not restore can be restored by using deep neural networks for tasks such as speech/non-speech decision and speech encoder recognition.
https://doi.org/10.7776/ASK.2017.36.2.136

A Study on Auditory Data Visualization Design for Multimedia Contents (멀티미디어 컨텐츠를 위한 청각데이터의 시각화 디자인에 관한 연구)

Hong, Sung-Dae;Park, Jin-Wan
- Archives of design research / v.18 no.1 s.59 / pp.195-204 / 2005
With the evolution of digital technology, trends in design (art), media, and science are moving toward personalization and customization. Because of technical and economic limitations, existing mass media have broadcast to the general public, and artworks have likewise communicated one-sidedly with spectators in the gallery or on stage. Nowadays, however, spectators can participate directly, and different products can be made to suit the tastes of the individuals who demand media or art. The essence of the technology that makes this possible is 'interactive technology'. The goal of this research is to identify the true nature of interactive design in multimedia contents and to chart a course for interactive communication design research. We approach the problem in two stages. First, we studied the concept of multimedia contents from the perspective of the information revolution. Next, we chose 'visuals reacting to audio' as our research topic and, as graphic designers, created an audio-visual artwork. This research shows that appropriate interactive design can promote 'communication' in a broad sense.

Video Data Compression using the MPEG-2 Video Algorithm (MPEG-2 비디오 알고리즘을 이용한 비디오 데이터 압축)

남재열;이영선;이현주;김재곤;이상미;안치득
- The Journal of Korean Institute of Communications and Information Sciences / v.18 no.8 / pp.1069-1082 / 1993
The International Organization for Standardization (ISO) has undertaken an effort to develop a standard for video and associated audio on digital storage media. This effort is known by the name of the expert group that started it: MPEG (Moving Picture Experts Group), which is currently part of ISO-IEC/JTC1/SC2/WG11. The promise of MPEG-2 is that a video signal and its associated audio can be compressed to a bit rate of about 10 Mbits/s with acceptable quality. In this paper, the implementation of a video compression simulator based on MPEG-2 Video Test Model 2 (TM2) is described and analyzed according to the simulation results. The implemented simulator is also applied to code HDTV sequences at several bit rates. Some computer simulation results using the MPEG and HDTV test sequences are given. In addition, some techniques that can improve the coding efficiency of the implemented video compression simulator are suggested.

Additive Data Insertion into MP3 Bitstream Using linbits Characteristics (Linbits 특성을 이용하여 MP3 비트스트림에 부가적인 정보를 삽입하는 방법에 관한 연구)

김도형;양승진;정재호
- The Journal of the Acoustical Society of Korea / v.22 no.7 / pp.612-621 / 2003
As the use of MP3 audio compression has increased, demand for inserting additional data, such as copyright or music content information, has grown, and related research has progressed actively. When additional data is inserted into an MP3 bitstream, the modification of the bitstream structure should cause neither distortion of the music quality nor a change in file size. To satisfy these conditions, we inserted additional data into the bitstream by modifying some bits of the linbits among the quantized integer coefficients having big values, taking into account the characteristics of linbits and their distributions. In a subjective sound quality evaluation using the MOS test, we confirmed that a quality of MOS 4.6 can be achieved at a data insertion rate of 60 bytes/sec. With the proposed method, additional data such as copyright information or information about the media itself can be inserted effectively, so that various applications such as audio database management can be realized.

MPEG Audio New Standard: USAC Technology (MPEG 오디오 최신 표준: USAC 기술)

Lee, Tae-Jin;Kang, Kyeong-Ok;Kim, Whan-Woo
- Journal of Broadcast Engineering / v.16 no.5 / pp.693-704 / 2011
As mobile devices become multi-functional and converge into a single platform, there is a strong need for a codec that can provide consistent quality for both speech and music content. MPEG-D USAC standardization activities started at the 82nd MPEG meeting with a CfP, and the Study on DIS was approved at the 96th MPEG meeting. MPEG-D USAC is a convergence of AMR-WB+ and HE-AAC V2 technologies. Specifically, USAC uses three core codecs (AAC, ACELP, and TCX) for low frequency regions, SBR for high frequency regions, MPEG Surround for stereo information, and window transition technology for smooth transitions between the core coders. USAC can provide consistent sound quality for both speech and music content and can be applied to applications such as multimedia download to mobile devices, digital radio, mobile TV, and audio books.
https://doi.org/10.5909/JEB.2011.16.5.693

Energy and Statistical Filtering for a Robust Audio Fingerprinting System (강인한 오디오 핑거프린팅 시스템을 위한 에너지와 통계적 필터링)

Jeong, Byeong-Jun;Kim, Dae-Jin
- The Journal of the Korea Contents Association / v.12 no.5 / pp.1-9 / 2012
The popularity of digital music and smartphones has led to the development of noise-robust, real-time audio fingerprinting systems in various forms. In particular, Multiple Hashing (MLH), one such fingerprint algorithm, is robust to noise and has an elaborate structure. In this paper, we propose a filter engine based on MLH to achieve better performance. In this approach, we compose an energy-intensive filter to improve the accuracy of Q/R from the music database and a statistical filter to remove continuity and redundancy. The energy-intensive filter uses the Discrete Cosine Transform (DCT)'s property of gathering energy into the low-order bits, and the statistical filter uses the correlation among the information of the retrieved fingerprints. Experimental results show the superiority of the proposed algorithm, which combines energy and statistical filtering, in noisy environments. The proposed filter engine is found to be more robust to noise than Philips Robust Hash (PRH) and more compact than MLH.
https://doi.org/10.5392/JKCA.2012.12.05.001

Design of User Access Authentication and Authorization System for VoIP Service (사용자 접근권한 인증을 이용한 안전한 VoIP 시스템 설계)

Yang, Ho-Kyung;Kim, Jin-Mook;Ryou, Hwang-Bin;Park, Choon-Sik
- Convergence Security Journal / v.8 no.4 / pp.41-49 / 2008
VoIP is a service that converts an analogue audio signal into a digital signal, packetizes it, and transfers the audio information to users; compared with the existing voice call service, it has the advantages of lower price and better extensibility. However, compared with the existing PSTN (Public Switched Telephone Network), the VoIP system structure has poor call quality and is vulnerable in terms of security. To address these problems, TLS was introduced to enhance security. In practical systems, however, it causes QoS problems, so a VoIP security system is needed that satisfies QoS and security requirements at the same time. This paper proposes a user authentication VoIP system that adds an AA server at the session setup step of the existing VoIP system and provides differentiated service according to each user's access rights. The proposed system was found to provide quicker QoS than the TLS-added system at a similar level of security. It can also provide a variety of additional services to different users.

Class-D Digital Audio Amplifier Using 1-bit 4th-order Delta-Sigma Modulation (1-비트 4차 델타-시그마 변조기법을 이용한 D급 디지털 오디오 증폭기)

Kang, Kyoung-Sik;Choi, Young-Kil;Roh, Hyung-Dong;Nam, Hyun-Seok;Roh, Jeong-Gin
- Journal of the Institute of Electronics Engineers of Korea SD / v.45 no.3 / pp.44-53 / 2008
In this paper, we present the design of a delta-sigma modulation-based class-D amplifier for driving headphones in portable audio applications. The presented class-D amplifier generates PWM (pulse width modulation) signals using a high-performance single-bit fourth-order delta-sigma modulator. To achieve a high SNR (signal-to-noise ratio) and ensure system stability, the locations of the modulator loop filter poles and zeros are optimized and thoroughly simulated. The test chip is fabricated in a standard $0.18\,\mu\mathrm{m}$ CMOS process, and its active area is $1.6\,\mathrm{mm}^2$. It operates over a signal bandwidth of 20 Hz to 20 kHz. The measured THD+N (total harmonic distortion plus noise) at the $32\,\Omega$ load terminal is less than 0.03% with a 3 V power supply.

Haptic Media Broadcasting (촉각방송)

Cha, Jong-Eun;Kim, Yeong-Mi;Seo, Yong-Won;Ryu, Je-Ha
- Broadcasting and Media Magazine / v.11 no.4 / pp.118-131 / 2006
With the rapid development of ultra-fast communication and digital multimedia, realistic broadcasting technology that can stimulate the five human senses, beyond conventional audio-visual service, is emerging as a new-generation broadcasting technology. In this paper, we introduce a haptic broadcasting system, together with the related core system and component techniques, by which viewers can 'touch and feel' objects in an audio-visual scene. The system includes haptic media acquisition and creation and contents authoring. In haptic broadcasting, the haptic media can be 3-D geometry, dynamic properties, haptic surface properties, movement, and tactile information, which enable active touch and manipulation as well as passive movement following and tactile effects. The proposed system can provide active haptic exploration and manipulation of a 3-D mesh, active haptic exploration of depth video, passive kinesthetic interaction, and passive tactile interaction as potential haptic interaction scenarios. Home shopping, a movie with tactile effects, and conducting education scenarios were produced to show the feasibility of the proposed system.

Implementation of SMIL Editor for Multimedia Broadcasting (멀티미디어 방송을 위한 SMIL 편집 시스템 구현)

장대영;김창수;정회경
- Journal of the Korea Institute of Information and Communication Engineering / v.8 no.3 / pp.622-629 / 2004
Recently, as digital broadcasting and the internet have spread around the world, information can be used easily with fewer restrictions of time and space. In line with this trend, interest in ways of representing multimedia data has increased rapidly, and users demand services built on integrated documents that contain not only simple text and images but also time-varying audio-visual data. To solve the problems of multimedia object representation and synchronization, W3C presented an international standard, SMIL, in 1998. With SMIL, various multimedia elements can be integrated into a multimedia document with a proper layout in space and time. Using such SMIL documents, new internet radio broadcasting services can be created that deliver not only audio data but also text, images, and video. In this paper, we describe a SMIL document editor that lets ordinary users represent time-varying multimedia data with spatial layout and synchronization in time and space.

