• Title/Summary/Keyword: Audio Generation

A Study on Delivery Integration of UHD, Mobile HD, Digital Radio based on ATSC 3.0 (ATSC 3.0 기반 UHD, 이동HD, 디지털라디오 통합전송 연구)

  • Seo, Chang Ho;Im, Yoon Hyeock;Jeon, Sung Ho;Seo, Jae Hyun;Choi, Seong Jhin
    • Journal of Broadcast Engineering / v.24 no.4 / pp.643-659 / 2019
  • In this paper, next-generation broadcasting technologies and services suited to the domestic broadcasting environment were verified in order to build and activate terrestrial UHD broadcasting in Korea. Broadcasters, research institutes, and others are currently running experiments on ATSC 3.0-based mobile HD broadcasting with various parameters, but experiments on integrated transmission that also includes audio services have not yet been carried out. In this work, we first examined, in theory and by experiment, the maximum number of ATSC 3.0-based UHD services, mobile HD services, and audio services that can be carried within one channel (6 MHz). Second, we derived parameters for the integrated transmission of the three services (UHD broadcasting, mobile HD, and audio broadcasting) in one channel. Finally, we verified the technical feasibility through field tests in which the signal was received while moving through the service area.
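
As a rough illustration of the kind of service budgeting the abstract refers to, the sketch below divides an assumed usable channel payload among the three service types. The capacity and per-service bitrates are placeholder assumptions, not the parameters derived in the paper.

```python
# Back-of-the-envelope budgeting for one 6 MHz ATSC 3.0 channel.
# All numbers are illustrative assumptions, not values from the paper.
def max_services(capacity_mbps, per_service_mbps):
    """How many services of a given bitrate fit in the remaining capacity."""
    return int(capacity_mbps // per_service_mbps)

capacity = 25.0                                            # assumed usable payload [Mbit/s]
rates = {"UHD": 17.0, "mobile HD": 2.5, "audio": 0.25}     # assumed per-service rates [Mbit/s]

remaining = capacity - rates["UHD"]                        # capacity left after one UHD service
n_mobile = max_services(remaining, rates["mobile HD"])
n_audio = max_services(remaining - n_mobile * rates["mobile HD"], rates["audio"])
print(n_mobile, "mobile HD and", n_audio, "audio services alongside one UHD service")
```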

Korean Emotional Speech and Facial Expression Database for Emotional Audio-Visual Speech Generation (대화 영상 생성을 위한 한국어 감정음성 및 얼굴 표정 데이터베이스)

  • Baek, Ji-Young;Kim, Sera;Lee, Seok-Pil
    • Journal of Internet Computing and Services / v.23 no.2 / pp.71-77 / 2022
  • In this paper, a database is collected for extending a speech synthesis model into one that synthesizes speech according to emotion and generates facial expressions. The database is divided into male and female data and consists of emotional speech and facial expressions. Two professional actors of different genders speak sentences in Korean. The sentences are divided into four emotions: happiness, sadness, anger, and neutrality, and each actor performs about 3,300 sentences per emotion. The 26,468 sentences collected by filming do not overlap and contain expressions matching the corresponding emotion. Since building a high-quality database is important for the performance of future research, the database is assessed on emotional category, intensity, and genuineness. To examine accuracy according to data modality, the database is divided into audio-video data, audio data, and video data.
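
For orientation, a minimal sketch of how such a corpus might be indexed follows; the record fields are hypothetical (the published database may be organized differently), and the arithmetic only checks the totals quoted above.

```python
# Hypothetical record layout for the corpus described above; field names are
# illustrative, not taken from the published database.
from dataclasses import dataclass

EMOTIONS = ("happiness", "sadness", "anger", "neutrality")

@dataclass
class Utterance:
    actor: str        # "male" or "female"
    emotion: str      # one of EMOTIONS
    sentence_id: int
    modality: str     # "audio-video", "audio", or "video"

# Sanity check of the stated size: 2 actors x 4 emotions x ~3,300 sentences
approx_total = 2 * len(EMOTIONS) * 3300
print(approx_total)   # 26,400 -- close to the reported 26,468 utterances
```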

User-created multi-view video generation with portable camera in mobile environment (모바일 환경의 이동형 카메라를 이용한 사용자 저작 다시점 동영상의 제안)

  • Sung, Bo Kyung;Park, Jun Hyoung;Yeo, Ji Hye;Ko, Il Ju
    • Journal of Korea Society of Digital Industry and Information Management / v.8 no.1 / pp.157-170 / 2012
  • Recently, the production and consumption of user-created video have increased sharply. Among such videos, recordings of the same subject captured in a limited space from multiple viewpoints are appearing, driven mainly by the popularization of portable cameras and the mobile web environment. Multi-view has been studied in visual representation techniques concerned with point of view, and its definition has recently been extended and applied to various kinds of content authoring. Turning user-created videos into multi-view content can be seen as a proposal for a new user experience of video consumption. In this paper, we show that user-created videos can be turned into multi-view video content by analyzing existing multi-view content, even though their attributes differ. To understand the definition and attributes of multi-view, we classify and analyze existing multi-view content. To solve the time-axis alignment problem that arises in multi-view processing, we propose an audio matching method consisting of feature extraction and comparison: MFCC, the most widely used feature, is used for extraction, and the features are compared in an n-by-n (pairwise) manner. The result is multi-view video content in which the aligned user-created videos can be consumed according to the user's viewpoint selection.
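
A minimal sketch of the audio-matching idea described above (MFCC features plus pairwise comparison to estimate the time offset between two clips of the same event). It assumes the librosa library; the paper does not specify an implementation, and the comparison here is a simple brute-force distance search rather than the authors' exact procedure.

```python
import numpy as np
import librosa

def mfcc_features(path, sr=22050, n_mfcc=20, hop_length=512):
    """Load audio and return MFCC frames as an (n_frames, n_mfcc) array."""
    y, sr = librosa.load(path, sr=sr)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc, hop_length=hop_length)
    return mfcc.T

def estimate_offset(mfcc_a, mfcc_b, max_lag=2000, min_overlap=50):
    """Slide B against A and keep the lag with the smallest mean frame distance."""
    best_lag, best_cost = 0, np.inf
    for lag in range(-max_lag, max_lag + 1):
        a = mfcc_a[lag:] if lag >= 0 else mfcc_a
        b = mfcc_b if lag >= 0 else mfcc_b[-lag:]
        n = min(len(a), len(b))
        if n < min_overlap:
            continue
        cost = np.mean(np.linalg.norm(a[:n] - b[:n], axis=1))
        if cost < best_cost:
            best_cost, best_lag = cost, lag
    return best_lag   # offset in frames; seconds = best_lag * hop_length / sr
```

The estimated lag, multiplied by hop_length / sr, gives the shift in seconds needed to place both clips on a common time axis.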

A Quality Improvement of MP3-Coded Audios Using Bandwidth Extension (대역 확장을 통한 MP3 오디오의 음질 향상)

  • Heo, So-Young;Kim, Rin-Chul
    • Journal of Broadcast Engineering
    • /
    • v.13 no.5
    • /
    • pp.744-751
    • /
    • 2008
  • In this paper, we investigate methods to enhance the perceptual quality of MP3-coded audio. Building on the high-frequency reconstruction method by Liu, the proposed method adaptively determines the starting point of high-frequency reconstruction, and we also present an improved linear estimation method. For generating the high-frequency components, we compare two approaches: replication of low-frequency components and insertion of additive white Gaussian noise. Subjective tests show that the proposed method improves the perceptual quality of MP3-coded audio.
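
To make the two generation strategies concrete, here is a toy frequency-domain sketch of both (replicating the band just below the cutoff versus inserting white Gaussian noise). It is a simplified assumption of the general technique, not the adaptive, frame-based method proposed in the paper.

```python
import numpy as np

def extend_bandwidth(x, sr, cutoff_hz, mode="replicate", noise_db=-30.0):
    """Crudely regenerate content above cutoff_hz, either by copying the band
    just below the cutoff or by inserting white Gaussian noise.  Illustrative
    only; real codecs work frame by frame with spectral envelope shaping."""
    X = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / sr)
    cut = np.searchsorted(freqs, cutoff_hz)
    hi_len = len(X) - cut
    if mode == "replicate":
        src = X[max(cut - hi_len, 0):cut]        # band just below the cutoff
        X[cut:cut + len(src)] = src * 0.3        # copy upward with crude attenuation
    else:                                        # "noise"
        amp = 10 ** (noise_db / 20.0) * np.max(np.abs(X))
        X[cut:] = amp * (np.random.randn(hi_len) + 1j * np.random.randn(hi_len))
    return np.fft.irfft(X, n=len(x))
```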

Video Generation Algorithm for Remote Lecture Recording Tools (원격 강의용 콘텐츠 제작 도구를 위한 동영상 생성 알고리즘)

  • Kwon, Oh-Sung
    • Journal of The Korean Association of Information Education / v.22 no.5 / pp.605-611 / 2018
  • Online lectures are becoming more common in Korea due to MOOC services and the expansion of national policy. In universities in particular, traditional lectures are shifting toward a new blended remote style. We propose and implement a remote content authoring tool with an audio synchronization function that achieves more with fewer resources. To realize the proposed algorithm, we design an interactive interface for assigning multiple cutting intervals and converting an input video into a new output video. Experiments confirm that the algorithm works properly with an average CPU share of 9.3% and 87 MB of RAM usage (2.60 GHz CPU, 820x600 capture area).
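
The paper does not name its video toolchain; as an assumed, concrete analogue of the cutting-and-reassembling step, the sketch below keeps a list of intervals from an input recording and joins them with the ffmpeg command-line tool.

```python
# Hypothetical sketch of cutting multiple intervals and rejoining them with ffmpeg.
import os
import subprocess
import tempfile

def cut_and_join(src, intervals, dst):
    """intervals: list of (start_sec, end_sec) to keep, in playback order."""
    parts = []
    with tempfile.TemporaryDirectory() as tmp:
        for i, (start, end) in enumerate(intervals):
            part = os.path.join(tmp, f"part{i}.mp4")
            subprocess.run(
                ["ffmpeg", "-y", "-ss", str(start), "-i", src,
                 "-t", str(end - start), "-c", "copy", part],
                check=True)
            parts.append(part)
        listing = os.path.join(tmp, "parts.txt")
        with open(listing, "w") as f:
            f.writelines(f"file '{p}'\n" for p in parts)
        # concat demuxer re-joins the kept segments without re-encoding
        subprocess.run(
            ["ffmpeg", "-y", "-f", "concat", "-safe", "0",
             "-i", listing, "-c", "copy", dst],
            check=True)
```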

Helmet Tracking Techniques Using Phase Difference between Acoustic Beating Envelope which Wave Length is Longer than Audio Frequency (고주파 맥놀이 신호의 포락선 위상차를 이용한 음향식 헬멧자세추정 기법)

  • Choi, Kyong-Sik;Kim, Sang-Seok;Park, Chan-Heum;Yang, Jun-Ho
    • Journal of the Korea Institute of Military Science and Technology
    • /
    • v.16 no.1
    • /
    • pp.27-33
    • /
    • 2013
  • A Helmet Mounted Display (HMD) offers great advantages for presenting navigation and mission symbology in the pilot's forward field of view and has therefore drawn considerable attention as the display of next-generation aircraft. The essential technology for processing the Line of Sight-Forward (LOS-F) data in real time is accurate estimation of helmet attitude and position. In this paper, we study an acoustic helmet tracking technique. Because mechanical acoustic noise can interfere with a Helmet Tracking System (HTS) and unwanted acoustic noise is unavoidable when acoustic techniques are used, this approach has not previously been adopted. To overcome this problem, we propose using an acoustic signal whose effective wavelength is longer than that of the audio band: specifically, the envelope of a beating signal composed of two closely spaced high frequencies.
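
As a toy numerical illustration of the core idea (a beat between two closely spaced high-frequency tones whose envelope has a much longer wavelength), the sketch below synthesizes the beat and recovers the envelope phase with a Hilbert transform. The frequencies, sample rate, and delay are arbitrary assumptions, not the paper's hardware parameters.

```python
import numpy as np
from scipy.signal import hilbert

fs = 200_000                       # sample rate [Hz]
f1, f2 = 40_000, 40_500            # two close tones -> 500 Hz beat envelope
t = np.arange(0, 0.05, 1 / fs)     # 50 ms window

def envelope_phase(delay_s):
    """Envelope phase [rad] of the beat received with an extra propagation delay."""
    x = np.cos(2 * np.pi * f1 * (t - delay_s)) + np.cos(2 * np.pi * f2 * (t - delay_s))
    env = np.abs(hilbert(x))                     # beat envelope at |f2 - f1|
    analytic = hilbert(env - env.mean())         # analytic signal of the envelope
    return np.angle(analytic[len(t) // 2])       # phase at the window centre

# The envelope-phase difference between two receivers encodes the extra path
# length against the long beat wavelength (~343 m/s / 500 Hz ~ 0.7 m in air).
dphi = envelope_phase(1e-4) - envelope_phase(0.0)
print(dphi)
```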

Deep Learning based Singing Voice Synthesis Modeling (딥러닝 기반 가창 음성합성(Singing Voice Synthesis) 모델링)

  • Kim, Minae;Kim, Somin;Park, Jihyun;Heo, Gabin;Choi, Yunjeong
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2022.10a / pp.127-130 / 2022
  • This paper studies singing voice synthesis modeling with a focus on the generator loss function, analyzing the issues that arise when BEGAN, a deep learning algorithm optimized for image generation, is applied to the audio domain, and conducting experiments to obtain the best quality. We focus on the problem that the L1 loss used in BEGAN-based models weakens the meaning of the hyperparameter gamma (γ), which was defined to control the diversity and quality of generated audio samples. Experiments show that the proposed method, together with tuning to find the optimal values, can contribute to improving the quality of the synthesized singing.
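
For context, the sketch below writes out the standard BEGAN objective, showing where the L1 reconstruction terms and the balance hyperparameter gamma (γ) enter. It is the generic formulation, assumed here in PyTorch for illustration; it is not the authors' modified generator loss.

```python
# Generic BEGAN losses; D is an autoencoder-style discriminator, G a generator.
import torch

def began_step(D, G, x_real, z, k, gamma=0.7, lambda_k=0.001):
    """Compute one BEGAN update's losses and the next balance term k (optimizers omitted)."""
    x_fake = G(z)
    # L1 reconstruction errors of the autoencoding discriminator
    L_real = torch.mean(torch.abs(D(x_real) - x_real))
    L_fake = torch.mean(torch.abs(D(x_fake) - x_fake))

    loss_D = L_real - k * L_fake          # discriminator objective
    loss_G = L_fake                       # generator objective

    # k keeps the ratio L_fake / L_real near gamma (trade-off between diversity and quality)
    k_next = float(torch.clamp(k + lambda_k * (gamma * L_real - L_fake).detach(), 0.0, 1.0))
    return loss_D, loss_G, k_next
```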

The Analysis of the Effects of Hanliu Phenomenon on the Chinese Young Generation's Fashion Style (한류(韓流) 현상에 중국 신세대 패션에 미친 영향 분석)

  • 김재은;박길순
    • Journal of the Korean Society of Clothing and Textiles / v.28 no.1 / pp.154-164 / 2004
  • The purpose of this thesis is to review the Hanliu phenomenon, a social and cultural phenomenon in China around A.D. 2000, from the perspective of culture-diffusion theory, and to analyze its effect on the fashion style of the young generation in China. In this thesis, the Hanliu phenomenon means the enthusiasm of Asian people for Korean mass culture, such as Korean dramas, pop songs, and fashion, from the late 1990s. This research adopts two methods for analyzing the Hanliu phenomenon: a qualitative one and a quantitative one. As the qualitative method, we analyzed the Hanliu phenomenon using documentary and audio-visual materials. As the quantitative method, we surveyed about 100 university students in Beijing about how they perceive Korean culture and fashion. The Hanliu phenomenon has led to the popularity of Korean products and Korean culture in general. It also affected the Chinese young generation so much that Korean fashion became popular among them. Its effects on the fashion styles of Chinese youths can be summarized in three points. First, the fashions of Korean entertainers, such as the H.O.T hair style and hip-hop fashion, are widely imitated. Second, the preference for Korean fashion products has increased, and the number of stores dealing with Korean fashion products has grown. Finally, Korean culture and products have been actively imitated in China along with the increased popularity of Korean fashion products.

Automatic measurement of voluntary reaction time after audio-visual stimulation and generation of synchronization signals for the analysis of evoked EEG (시청각자극 후의 피험자의 자의적 반응시간의 자동계측과 유발뇌파분석을 위한 동기신호의 생성)

  • 김철승;엄광문;손진훈
    • Science of Emotion and Sensibility / v.6 no.4 / pp.15-23 / 2003
  • Recently, there have been many attempts to develop brain-computer interfaces (BCI) based on EEG (electroencephalogram). Measurement and analysis of EEG evoked by a particular stimulus are important for designing the brain wave patterns and interface of a BCI. The purpose of this study is to develop a general-purpose system that measures a subject's reaction time after audio-visual stimulation and can work together with other biosignal measurement systems. The system is divided into four modules: stimulation signal generation, reaction time measurement, evoked potential measurement, and synchronization. The stimulation signal generation module was implemented in Flash. Reaction time (the interval between the answer request and the subject's response) was measured with a self-made microcontroller system. EEG was measured with ready-made hardware and software without any modification. Synchronization of all modules was achieved by placing black-and-white marker signals on the stimulation screen, synchronized with the problem presentation and the answer request, and sensing them with photodetectors. The proposed approach allows a purpose-specific system to be built simply by adding small modules (reaction time measurement and synchronization) to a ready-made stimulation and EEG system, and is therefore expected to accelerate research requiring the measurement of evoked responses and reaction times.
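
The timing logic itself is simple; the sketch below is a host-side analogue of it in Python (the paper implements this on a dedicated microcontroller). read_photodetector() and read_button() are hypothetical stubs standing in for the hardware inputs.

```python
import time

THRESHOLD = 0.5   # photodetector level marking the white "answer request" patch

def measure_reaction_time(read_photodetector, read_button, timeout_s=5.0):
    """Return the reaction time in seconds, or None if the subject did not respond."""
    # Wait for the answer-request marker to appear on screen
    while read_photodetector() < THRESHOLD:
        pass
    t_request = time.perf_counter()
    # Wait for the subject's key press (or give up after timeout_s)
    while not read_button():
        if time.perf_counter() - t_request > timeout_s:
            return None
    return time.perf_counter() - t_request
```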

A Study for Depth-map Generation using Vanishing Point (소실점을 이용한 Depth-map 생성에 관한 연구)

  • Kim, Jong-Chan;Ban, Kyeong-Jin;Kim, Chee-Yong
    • Journal of Korea Multimedia Society / v.14 no.2 / pp.329-338 / 2011
  • Recent augmented reality demands more realistic multimedia data that mixes various media. Technology that combines existing media data with various media such as audio and video dominates the media industry. In particular, there is a growing need for augmented reality, 3-dimensional content, and real-time interaction systems as communication methods and visualization tools on the Internet. Existing services cannot generate the depth values needed to recover 3-dimensional spatial structure, which gives solidity to existing content, so effective depth-map generation from 2-dimensional video requires further research. Complementing the shortcomings of existing depth-map generation methods for 2-dimensional video, this paper proposes an enhanced depth-map generation method that defines the depth direction with respect to the location of the vanishing point in the video, which none of the existing algorithms defines.
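
As a minimal illustration of the underlying idea (depth ordered by position relative to the vanishing point), the sketch below assigns each pixel a depth from its distance to a given vanishing point. This is an assumed toy version, not the enhanced method proposed in the paper.

```python
import numpy as np

def depth_from_vanishing_point(height, width, vp_xy):
    """Pixels near the vanishing point are treated as far (depth ~ 1),
    pixels far from it as near (depth ~ 0)."""
    vx, vy = vp_xy
    ys, xs = np.mgrid[0:height, 0:width]
    dist = np.hypot(xs - vx, ys - vy)
    return 1.0 - dist / dist.max()        # normalised depth map in [0, 1]

depth = depth_from_vanishing_point(480, 640, vp_xy=(320, 180))
```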