• Title/Summary/Keyword: Voice broadcast

An asymmetric WDM-EPON structure for the convergence of broadcast and communication (방송통신 통합을 위한 비대칭 WDM-EPON 구조에 관한 연구)

  • Hur Jung;Koo Bon-Jeong;Park Youngil
    • Journal of Broadcast Engineering / v.10 no.2 / pp.182-189 / 2005
  • In this paper, an asymmetric WDM-EPON transmission scheme is proposed for a high-speed access network system, which is required to implement the convergence of broadcast and communication. WDM is used for downstream transmission from the OLT to the access nodes, satisfying the wide bandwidth requirement of broadcasting and various multimedia services. An EPON scheme, which is cheaper than WDM, is applied to upstream transmission, where less bandwidth is required. A physical-layer transmission test was performed successfully and the results are provided. If ONUs are to be used in a home gateway, their protocol should suit the traffic pattern: voice is sensitive to time delay while data is not. A new dynamic bandwidth assignment protocol for PON systems, which can cope with the various types of data in an access network, is proposed and its performance is analysed. A maximum cycle time is specified to achieve the QoS of delay-sensitive signals, and a minimum window is specified to prevent the downstream control signals from growing excessively. Simulation shows that the proposed EPON protocol performs better than previous ones.
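
The two constraints named in the abstract above (a maximum cycle time and a minimum window) can be illustrated with a minimal sketch. This is not the protocol proposed in the paper: the function name `allocate_windows`, the proportional scaling rule, and the absence of guard times are all assumptions made for illustration.

```python
def allocate_windows(requests, max_cycle, min_window):
    """Return one upstream window per ONU, in the same time units as the inputs.

    Hypothetical rule: every ONU gets at least `min_window`, and the sum of
    all windows never exceeds `max_cycle`.
    """
    n = len(requests)
    if n * min_window > max_cycle:
        raise ValueError("minimum windows alone exceed the maximum cycle time")

    # Every ONU gets at least the minimum window, even when it requests less.
    wanted = [max(r, min_window) for r in requests]
    if sum(wanted) <= max_cycle:
        return wanted

    # Overloaded cycle: shrink only the part above the minimum, proportionally.
    spare = max_cycle - n * min_window
    excess = [w - min_window for w in wanted]
    scale = spare / sum(excess)
    return [min_window + e * scale for e in excess]
```

Capping the cycle bounds the waiting time of delay-sensitive traffic such as voice, while the minimum window keeps lightly loaded ONUs from being polled with very short grants.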

Recognition of Overlapped Sound and Influence Analysis Based on Wideband Spectrogram and Deep Neural Networks (광역 스펙트로그램과 심층신경망에 기반한 중첩된 소리의 인식과 영향 분석)

  • Kim, Young Eon;Park, Gooman
    • Journal of Broadcast Engineering / v.23 no.3 / pp.421-430 / 2018
  • Many voice recognition systems use methods such as MFCC and HMM to recognize the human voice. These methods are designed to analyze only a targeted sound, which normally occurs between a human and a device. However, their recognition capability is limited when the input is a group of sounds spread over a wider frequency range, such as dog barking and indoor sounds. The frequency content of overlapped sound spans a wide range, up to 20 kHz, which is higher than that of a voice. This paper proposes a new recognition method that covers a wider frequency range by combining the Wideband Sound Spectrogram (WSS) with the Keras Sequential Model (KSM) based on a DNN. The WSS is adopted to analyze and verify diverse sounds over a wide frequency range, as it is designed both to extract features and to classify. The KSM performs pattern recognition on the features extracted from the WSS to improve sound recognition quality. The experiment verified that the proposed WSS and KSM classified the targeted sound well in noisy environments containing overlapped sounds such as dog barking and indoor sounds. Furthermore, the paper presents a stage-by-stage analysis and comparison of how each factor influences recognition and its characteristics at various noise levels.

Voice Activity Detection using Motion and Variation of Intensity in The Mouth Region (입술 영역의 움직임과 밝기 변화를 이용한 음성구간 검출 알고리즘 개발)

  • Kim, Gi-Bak;Ryu, Je-Woong;Cho, Nam-Ik
    • Journal of Broadcast Engineering / v.17 no.3 / pp.519-528 / 2012
  • Voice activity detection (VAD) is generally conducted by extracting features from the acoustic signal and applying a decision rule. The performance of such VAD algorithms, driven by the input acoustic signal, depends heavily on the acoustic noise. When video signals are available as well, the performance of VAD can be enhanced by using visual information, which is not affected by acoustic noise. Previous visual VAD algorithms usually use a single visual feature to detect lip activity, such as active appearance models, optical flow, or intensity variation. Based on an analysis of the weakness of each feature, we propose combining an intensity change measure and the optical flow in the mouth region, which compensate for each other's weaknesses. To minimize computational complexity, we develop simple measures that avoid statistical estimation or modeling. Specifically, the optical flow is the averaged motion vector of some grid regions, and the intensity variation is detected by simple thresholding. To extract the mouth region, we propose a simple algorithm that first detects the two eyes and then uses the intensity profile to detect the center of the mouth. Experiments show that the proposed combination of two simple measures achieves higher detection rates at a given false positive rate than methods that use a single feature.

A Development of Automatic Safety Navigation Support Service Providing System for Medium and Small Ships based on Speech Synthesis (중소형 선박을 위한 음성합성 기반 자동 안전항해 지원 서비스 제공 시스템 개발)

  • Hwang, Hun-Gyu;Kim, Bae-Sung;Woo, Yum-Tae
    • Journal of the Korea Institute of Information and Communication Engineering / v.25 no.4 / pp.595-602 / 2021
  • Marine accidents are mostly caused by medium and small ships, and are continuously increasing. In this paper, we propose an architecture for a speech synthesis based automatic safety navigation support service providing system for small ships, which are less equipped with onboard systems than larger vessels. The main purpose of the system is to prevent marine accidents by providing synthesized voice safety messages to nearby ships. The safety navigation support service operates by connecting GPS and AIS to synthesize voice safety messages, which are automatically broadcast over VHF. To this end, we developed the components of the system: a data processing module, a staged risk analysis module, a voice synthesis safety message generation module, and a VHF broadcasting equipment control module. In addition, we conducted laboratory-level and sea-trial demonstration tests using the developed system, which verified the usefulness of the proposed service.

Statistical Model-Based Voice Activity Detection Using Spatial Cues for Dual-Channel Noisy Speech Recognition (이중채널 잡음음성인식을 위한 공간정보를 이용한 통계모델 기반 음성구간 검출)

  • Shin, Min-Hwa;Park, Ji-Hun;Kim, Hong-Kook
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2010.07a / pp.150-151 / 2010
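  • This paper proposes a statistical model-based voice activity detection method for dual-channel speech recognition in noisy environments. The proposed method uses spatial cues obtained from the multichannel input signal to build speech presence and speech absence probability models, and performs voice activity detection with them. The spatial cues are the inter-channel time difference and the inter-channel level difference between the two channels, and the speech presence and absence probabilities are represented by probability models based on Gaussian kernel densities. Speech segments are then detected by estimating, for each time frame, the ratio between the speech presence probability and the speech absence probability. To evaluate the proposed method, speech recognition performance is measured using only the detected segments as input. In the experiments, the proposed spatial-cue statistical model-based method achieved relative error rate reductions of 15.6% and 15.4% compared with a statistical model-based method using frequency energy and a method based on frequency spectral density, respectively.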
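
As a rough illustration of the decision rule described in the abstract above, the sketch below scores each frame's spatial cue vector (for example, a time difference and a level difference) against two Gaussian kernel density estimates and thresholds the log-likelihood ratio. The function names, the shared isotropic bandwidth, the zero threshold, and the direction of the ratio (speech over non-speech) are assumptions, not details taken from the paper.

```python
import math

def kde_log_density(point, samples, bandwidth):
    """Log of an isotropic Gaussian kernel density estimate at `point`."""
    d = len(point)
    log_norm = -0.5 * d * math.log(2 * math.pi) - d * math.log(bandwidth)
    logs = [
        log_norm - 0.5 * sum(((p - s) / bandwidth) ** 2 for p, s in zip(point, sample))
        for sample in samples
    ]
    # Log-mean-exp keeps the average of very small kernel values stable.
    peak = max(logs)
    return peak + math.log(sum(math.exp(v - peak) for v in logs) / len(logs))

def detect_speech(frames, speech_cues, noise_cues, bandwidth=1.0, threshold=0.0):
    """Flag each frame as speech when its log-likelihood ratio exceeds `threshold`."""
    flags = []
    for cue in frames:
        llr = (kde_log_density(cue, speech_cues, bandwidth)
               - kde_log_density(cue, noise_cues, bandwidth))
        flags.append(llr > threshold)
    return flags
```

Here `speech_cues` and `noise_cues` stand for labelled training cue vectors; in practice the bandwidth and threshold would be tuned on held-out data.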

Implementation of an Efficient Voice Transmission System in Bluetooth Network Environments (블루투스 네트워크 환경에서의 효율적인 음성전송 시스템 구현)

  • Kim, Myung-Jong;Park, Ji-Hun;Kim, Hong-Kook
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2008.02a / pp.125-128 / 2008
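  • With the commercialization of IPTV, interactive services based on information exchange between the user and the TV are being offered, and speech recognition in particular is emerging as one of the key technologies for realizing such services. Performing speech recognition on a TV requires a short-range wireless link that can efficiently transmit the user's voice to the TV within a confined space such as a home. In particular, it must be implementable in a low-power system such as a remote controller. A voice transmission system with optimal performance under these constraints is therefore required. In this paper, we implement in real time the voice transmission system needed for speech recognition in a Bluetooth environment. G.711 is used as the base codec for efficient voice transmission, and the G.711 packet loss concealment algorithm is applied to the system to reduce the degradation of speech quality caused by packet loss during transmission. Specifically, to run the G.711 packet loss concealment algorithm, the RTP protocol is applied at the application layer of the Bluetooth protocol stack to check whether packets have been lost, and when a loss occurs the concealment algorithm reduces the degradation of speech quality. Evaluation of the implemented system showed that applying the G.711 packet loss concealment algorithm yielded a 14.7% improvement in speech quality under 2~10% packet loss.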
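
The loss check mentioned in the abstract above relies on RTP sequence numbers, which are 16-bit counters that wrap around. A minimal sketch of how a receiver might list the missing sequence numbers to hand to a concealment routine is shown below; the function name `find_lost` and the choice to ignore duplicates and late packets are assumptions, and the concealment step itself is not shown.

```python
def find_lost(received_seqs):
    """Return the RTP sequence numbers missing from a stream of received ones.

    Sequence numbers are compared modulo 2**16. Gaps of 0x8000 or more are
    treated as duplicates or late packets and ignored (a simplification).
    """
    lost = []
    prev = None
    for seq in received_seqs:
        if prev is None:
            prev = seq
            continue
        gap = (seq - prev) & 0xFFFF
        if 0 < gap < 0x8000:
            # Every number strictly between prev and seq was never received.
            lost.extend((prev + k) & 0xFFFF for k in range(1, gap))
            prev = seq
    return lost
```

For example, receiving 65535 followed by 1 reports sequence number 0 as lost, because the counter wrapped. A packet that arrives late, after a gap was already reported, is simply skipped in this sketch.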

Voice Activity Detection Using Ellipse Fitting of the Oral Cavity Region (구강 영역에 대한 타원 근사법을 이용한 음성 구간 검출법)

  • Ryu, Jewoong;Choo, Sung Kwon;Kim, Gibak;Cho, Namik
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2012.07a / pp.271-274 / 2012
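  • Voice activity detection, which is widely used in speech signal processing, mainly determines whether speech is present in an acoustic signal by analyzing that signal. However, acoustic methods have the drawback that their performance is determined by speech or non-speech noise and by the surrounding acoustic environment. To make voice activity detection robust to changes in the acoustic environment, methods based on visual information have recently been studied. Existing methods detect speech segments either by using lip models to estimate changes in lip shape or by using changes in the number of pixels belonging to the oral cavity region. These methods either require complex computation to estimate the lip shape or are somewhat less accurate because they use only the oral cavity pixel count without estimating the lip shape. In this paper, to estimate changes in lip shape, we propose a method that approximates the shape of the visible oral cavity region by ellipse fitting and detects speech segments from changes in the area and height of the ellipse. Comparative experiments confirmed that the proposed method outperforms a method that uses only changes in the oral cavity pixel count.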

A Study on subtitle synchronization calibration to enhance hearing-impaired persons' viewing convenience of e-sports contents or game streamer contents (청각장애인의 이스포츠 중계방송 및 게임 스트리머 콘텐츠 시청 편의성 증대를 위한 자막 동기화 보정 연구)

  • Shin, Dong-Hwan;Kim, Jeong-Soo;Kim, Chang-Won
    • Journal of Korea Game Society / v.19 no.1 / pp.73-84 / 2019
  • This study suggests ways to improve the quality of the subtitle service provided for deaf viewers of e-sports broadcast content and game streamer content. Subtitles for broadcast content are generally typed live by stenographers, so a delay of 3 to 5 seconds relative to the original content is inevitable. The study therefore proposes an automatic synchronization calibration system that uses speech recognition technology. In addition, an experiment applying the system to content was conducted, and the final result confirmed that the synchronization error of the subtitle data could be reduced to less than 1 second.
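
The correction step implied by the abstract above can be sketched as follows, assuming that speech recognition has already produced utterance times for phrases matched to stenographer subtitle times. The median-based delay estimate and the names `estimate_delay` and `shift_cues` are illustrative assumptions, not the paper's method.

```python
from statistics import median

def estimate_delay(subtitle_times, speech_times):
    """Median lag in seconds of subtitle cues behind their matched utterances."""
    return median(s - a for s, a in zip(subtitle_times, speech_times))

def shift_cues(cues, delay):
    """Move (start, end, text) cues earlier by `delay` seconds, clamped at 0."""
    return [(max(0.0, start - delay), max(0.0, end - delay), text)
            for start, end, text in cues]
```

Using the median rather than the mean keeps a single badly matched phrase from skewing the estimated delay.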

A security method for Gatekeeper based on Digital Authentication by H.235

  • Hwang Seon Cheol;Han Seung Soo;Lee Jun Young;Choi Jun Rim
    • Proceedings of the IEEK Conference / 2004.08c / pp.759-763 / 2004
  • While the demand for VoIP (Voice over IP) encourages commercial trials of VoIP services, many problems remain, such as user authentication, blocking of illegal users, and eavesdropping. In this paper, a management algorithm for the registration of VoIP terminals is explained, and security methods for tolling and a data encryption module are designed and built. The modular structure has the advantage that a secured gatekeeper can be developed without modifying the entire gatekeeper. To secure an ordinary gatekeeper based on the H.323 standard, user authentication and data encryption technologies are developed based on the H.235 standard and simply placed on top of the plain H.323 stacks. The data structures for secured communications are implemented according to the ASN.1 structures defined by H.235.

Metadata Design and Verification Test Bed System for Augmented Broadcasting (증강방송 메타데이터 설계 및 검증용 테스트 베드 시스템 구현)

  • Choi, Bumsuk;Kim, Suncheol;Jeong, Youngho;Hong, Jinwoo;Lee, Wondon
    • Journal of Broadcast Engineering / v.19 no.5 / pp.736-745 / 2014
  • In this paper, we introduce augmented broadcasting service scenarios that combine augmented reality (AR) services with the broadcasting environment. Because the broadcasting environment differs from the mobile service environment, there are many restrictions on developing full AR services on TV. However, TV has the strong benefits of a large screen, high-quality content, advanced user interfaces for motion and voice, and smart TV applications, all of which improve the chances of success for an augmented broadcasting service. This paper proposes a metadata structure containing the augmentation region, time, augmented content, and registration information for natural composition. We also implemented a test bed system comprising an authoring server, a broadcasting server, and a user terminal for verifying the metadata in a broadcasting system.