• Title/Summary/Keyword: audio application

The Implementation of Personal Audio Recorder Service based on Embedded Linux (임베디드 리눅스 기반의 개인 오디오 레코더 서비스 구현)

  • Kim, Do-Hyung;Lee, Kyung-Hee;Lee, Cheol-Hoon
    • The KIPS Transactions:PartD / v.15D no.2 / pp.257-262 / 2008
  • This paper describes the implementation of an application service based on embedded Linux: the Personal Audio Recorder (PAR), which uses the WiBro network for data communications and the CDMA network for voice communications. When the PAR client starts voice recording on a dual-mode terminal, the CDMA voice data of the caller and callee is transmitted over the WiBro network to a storage server located on the Internet. The PAR server then stores the voice data on the storage server according to the call number and call time. When storage space on the terminal runs short, PAR allows the user to store voice data. PAR can also search a catalog of the data stored on the server and play back specific content.
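
The abstract above says the server files recordings by call number and call time. A minimal sketch of one way such a layout could look, assuming a directory per call number and a timestamped file name; the function names `recording_path` and `build_catalog`, the `recordings` root, and the `.pcm` extension are all hypothetical and not taken from the paper:

```python
from collections import defaultdict
from datetime import datetime
from pathlib import PurePosixPath


def recording_path(call_number: str, call_time: datetime,
                   root: str = "recordings") -> PurePosixPath:
    """Build a storage path keyed by call number, then by call time (assumed layout)."""
    # Keep only the digits so "010-1234-5678" and "01012345678" map to the same folder.
    digits = "".join(ch for ch in call_number if ch.isdigit())
    if not digits:
        raise ValueError("call number contains no digits")
    stamp = call_time.strftime("%Y%m%dT%H%M%S")
    return PurePosixPath(root) / digits / f"{stamp}.pcm"


def build_catalog(paths):
    """Group stored recordings by call number, with file names sorted by time."""
    catalog = defaultdict(list)
    for path in paths:
        catalog[path.parent.name].append(path.name)
    return {number: sorted(names) for number, names in catalog.items()}


if __name__ == "__main__":
    p = recording_path("010-1234-5678", datetime(2008, 2, 1, 9, 30, 0))
    print(p)
    print(build_catalog([p]))
```

Because the timestamp format sorts lexicographically in time order, a plain sort of the file names is enough to list one number's calls chronologically.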

Implementation of speech interface for windows 95 (Windows95 환경에서의 음성 인터페이스 구현)

  • 한영원;배건성
    • Journal of the Korean Institute of Telematics and Electronics S / v.34S no.5 / pp.86-93 / 1997
  • With the recent development of speech recognition technology and multimedia computer systems, more potential applications of voice will become a reality. In this paper, we implement a speech interface in the Windows 95 environment for practical use of multimedia computers with voice. The speech interface is made up of three modules: a speech input and detection module, a speech recognition module, and an application module. The speech input and detection module handles the low-level audio service of the Win32 API to input speech data in real time. The recognition module processes the incoming speech data and then recognizes the spoken command. The DTW pattern matching method is used for speech recognition. The application module executes the voice command properly on the PC. Each module of the speech interface is designed and examined in the Windows 95 environment. The implemented speech interface and experimental results are explained and discussed.
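
The abstract above names DTW (dynamic time warping) pattern matching as the recognition method. A minimal sketch of textbook DTW over one-dimensional feature sequences, assuming an absolute-difference local cost and whole-word templates; the function names and the toy templates are illustrative and are not the authors' implementation:

```python
from math import inf


def dtw_distance(a, b):
    """Textbook dynamic time warping distance between two 1-D sequences."""
    n, m = len(a), len(b)
    # cost[i][j] is the minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Allowed steps: advance in a only, in b only, or in both.
            cost[i][j] = local + min(cost[i - 1][j],
                                     cost[i][j - 1],
                                     cost[i - 1][j - 1])
    return cost[n][m]


def recognize(utterance, templates):
    """Return the label of the stored template closest to the utterance."""
    return min(templates, key=lambda label: dtw_distance(utterance, templates[label]))


if __name__ == "__main__":
    # Hypothetical command templates; real systems compare frames of spectral features.
    templates = {"open": [1, 2, 2, 3, 2], "close": [3, 1, 0]}
    print(recognize([1, 2, 3, 2], templates))
```

The warping lets a slowly spoken command match a shorter template, which is why the stretched sequence `[1, 2, 2, 3]` aligns with `[1, 2, 3]` at zero cost.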

Application of Block On-Line Blind Source Separation to Acoustic Echo Cancellation

  • Ngoc, Duong Q.K.;Park, Chul;Nam, Seung-Hyon
    • The Journal of the Acoustical Society of Korea / v.27 no.1E / pp.17-24 / 2008
  • Blind speech separation (BSS) is well known as a powerful technique for speech enhancement in many real-world environments. In this paper, we propose a new application of BSS: acoustic echo cancellation (AEC) in a car environment. For this purpose, we develop a block-online BSS algorithm that provides more robust separation than a batch version in changing environments with moving speakers. Simulation results using real-world recordings show that the block-online BSS algorithm is very robust to speaker movement. When BSS is combined with AEC, simulation results using real audio recordings in a car confirm the expectation that BSS improves double-talk detection and echo suppression.

The Study of the Performance Improvement of UDP Packet Loss affected by TCP Flows (TCP Flows의 영향하에서 UDP 패킷손실을 줄이는 방법에 관한 연구)

  • 조기영;문호림;김서균;남지승
    • Proceedings of the IEEK Conference / 1999.11a / pp.1061-1064 / 1999
  • UDP has commonly been used for real-time applications such as video and audio. UDP minimizes transmission delay by omitting the connection setup process, flow control, and retransmission. In general, more than 80 percent of WAN resources are occupied by Transmission Control Protocol (TCP) traffic. In contrast to UDP's simplicity, TCP adopts its own flow control. In this paper, I report new methods to minimize UDP packet loss in real-time applications, taking TCP flow control into account. Better performance of real-time applications can be obtained when they reduce the packet size and use a FIFO buffer scheduling method while competing with TCP for bandwidth and buffering.

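Design and Implementation of an Interactive Data Broadcasting Server for Terrestrial DMB Broadcasting (지상파 DMB 방송을 위한 양방향 데이터 방송 서버 설계 및 구현)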

  • Kim, Gwang-Yong;Lee, Gwang-Sun;Yang, Gyu-Tae;Ham, Yeong-Gwon;An, Chung-Hyeon
    • Proceedings of the IEEK Conference / 2006.06a / pp.429-430 / 2006
  • In this paper, we describe the architecture of an interactive data broadcasting server that can transmit the various data-service contents of Terrestrial DMB (digital multimedia broadcasting). In the Terrestrial DMB broadcasting environment, users can enjoy PADS (program associated data service) or PIDS (program independent data service) executed on various T-DMB terminals, as well as the basic video and audio services. This server provides functions such as data contents management, data channel management, service information configuration, and return channel connection. In particular, this system provides a method to create and transfer the application signaling information for T-DMB middleware applications based on the Java language.

ConWis: Assistive Software for People with Hearing and Speaking Disorders

  • Kodirov, Khasanboy;Kodirov, Khusanboy;Lee, Young-Hee
    • Proceedings of the Korea Information Processing Society Conference / 2019.10a / pp.678-679 / 2019
  • In this paper, we developed a medical computer application for both children and adults with disabilities to give them the chance to communicate easily with others. Although many mobile healthcare apps are available nowadays, we believe that users should also have a range of healthcare programs developed for computers to choose from. That is why we developed ConWis. This application helps a person with hearing loss or a voice, speech, or language disorder to communicate easily with others. Through this software, hearing and understanding what is being said, as well as expressing thoughts, become easier. To use this software, the patient inputs a sentence, which is converted to audio speech using built-in male or female voices. In addition, it can convert voice received by the microphone into text and display it on the screen.

A Study on the Application of Motion Graphics Animation in Opening Titles of Noir Dramas

  • LinLin Huang;Xinyi Shan;Jeanhun Chung
    • International Journal of Advanced Culture Technology / v.12 no.3 / pp.278-283 / 2024
  • As the introductory content of a television series, the opening titles are crucial for helping the audience quickly grasp the tone of the narrative. With the continuous integration of the television production industry and digital computer technology, motion graphics, with its distinctive dynamic graphic design, offers new avenues for title sequence creation. This paper examines the application of motion graphics in the title sequences of noir genre television series, analyzing aspects such as visual style, content presentation, and narrative expression. By comparing early static text title sequences with motion graphics ones, this paper reveals the advantages of motion graphics in designing opening titles for noir genre television series and examines how it enhances visual impact and improves audience experience. This study not only enriches the creative techniques for title sequence design, but also provides insights for future creations.

Design and Analysis of a New Video Conference System Supporting the NAT of Firewall (방화벽 NAT를 지원하는 새로운 다자간 화상회의 시스템의 설계 및 분석)

  • Jung, Yong-Deug;Kim, Gil-Choon;Jeon, Moon-Seog
    • The Journal of Society for e-Business Studies / v.9 no.4 / pp.137-155 / 2004
  • Video-conference systems are being used in web-based application services in various fields due to the widespread use of the Internet and the progress of computer technologies. Such a system must use public IP addresses for sharing files and the whiteboard, and it is difficult to manage users on the internal network behind a firewall and users with non-public IP addresses. In this paper, we propose an Application Level Gateway that transforms non-public IP addresses into public IP addresses. This mechanism serves users on the internal network behind a firewall and non-public IP address users over the Internet. We also propose a Control Daemon that manages video and audio media dynamically according to network bandwidth. This mechanism can start and terminate a video conference and manage the process of the video conference.
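
The abstract above describes a gateway that maps non-public IP addresses to public ones. A minimal sketch of that general idea as an address and port mapping table, assuming the RFC 1918 private ranges and a single public address; the class `AddressTranslator`, its methods, and the starting port are hypothetical and do not reproduce the paper's Application Level Gateway:

```python
import ipaddress

# RFC 1918 private IPv4 ranges, treated here as the "non-public" addresses.
PRIVATE_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_non_public(ip: str) -> bool:
    address = ipaddress.ip_address(ip)
    return any(address in network for network in PRIVATE_NETWORKS)


class AddressTranslator:
    """Maps private (ip, port) endpoints to ports on one public address."""

    def __init__(self, public_ip: str, first_port: int = 40000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.outbound = {}  # private endpoint -> public endpoint
        self.inbound = {}   # public endpoint -> private endpoint

    def translate(self, ip: str, port: int):
        """Return the endpoint that peers on the Internet should see."""
        if not is_non_public(ip):
            return ip, port  # already public, pass through unchanged
        key = (ip, port)
        if key not in self.outbound:
            public = (self.public_ip, self.next_port)
            self.next_port += 1
            self.outbound[key] = public
            self.inbound[public] = key
        return self.outbound[key]

    def reverse(self, ip: str, port: int):
        """Look up the private endpoint behind a public one, or None."""
        return self.inbound.get((ip, port))


if __name__ == "__main__":
    gateway = AddressTranslator("203.0.113.5")  # documentation-range address
    print(gateway.translate("192.168.0.10", 5004))
    print(gateway.reverse("203.0.113.5", 40000))
```

Keeping both directions of the mapping lets return traffic from a conference peer be routed back to the right internal user.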

Development of a Mobile Application for Disease Prediction Using Speech Data of Korean Patients with Dysarthria (한국인 구음장애 환자의 발화 데이터 기반 질병 예측을 위한 모바일 애플리케이션 개발)

  • Changjin Ha;Taesik Go
    • Journal of Biomedical Engineering Research / v.45 no.1 / pp.1-9 / 2024
  • Communication with others plays an important role in human social interaction and information exchange in modern society. However, some individuals have difficulty communicating due to dysarthria. Therefore, it is necessary to develop effective diagnostic techniques for early treatment of dysarthria. In the present study, we propose a mobile device-based methodology that enables automatic classification of dysarthria type. A lightweight CNN model was trained using an open audio dataset of Korean patients with dysarthria. The trained CNN model can classify dysarthria into the related disease subtype with 78.8%~96.6% accuracy. In addition, a user-friendly mobile application was developed based on the trained CNN model. Users can easily record their voices according to the selected inspection type (e.g., word, sentence, paragraph, and semi-free speech) and evaluate the recorded voice data through their mobile device and the developed mobile application. The proposed technique would be helpful for personal management of dysarthria and for decision making in the clinic.

A New Wideband Speech/Audio Coder Interoperable with ITU-T G.729/G.729E (ITU-T G.729/G.729E와 호환성을 갖는 광대역 음성/오디오 부호화기)

  • Kim, Kyung-Tae;Lee, Min-Ki;Youn, Dae-Hee
    • Journal of the Institute of Electronics Engineers of Korea SP / v.45 no.2 / pp.81-89 / 2008
  • Wideband speech, characterized by a bandwidth of about 7 kHz (50-7000 Hz), provides a substantial quality improvement in terms of naturalness and intelligibility. Although higher data rates are required, it has extended its application to audio and video conferencing, high-quality multimedia communications over mobile links or packet-switched transmissions, and digital AM broadcasting. In this paper, we present a new bandwidth-scalable coder for wideband speech and audio signals. The proposed coder splits the 8 kHz signal bandwidth into two narrow bands, and different coding schemes are applied to each band. The lower-band signal is coded using the ITU-T G.729/G.729E coder, and the higher-band signal is compressed using a new algorithm based on the gammatone filter bank with an invertible auditory model. Due to the split-band architecture and completely independent coding schemes for each band, the output speech of the decoder can be selected to be narrowband or wideband according to the channel condition. Subjective tests showed that, for wideband speech and audio signals, the proposed coder at 14.2/18 kbit/s produces superior quality to ITU-T G.722.1 at 24 kbit/s, with a shorter algorithmic delay.