Search | Korea Science

Stereo-video Synchronization for 3D Video Transmission (3차원 비디오 전송을 위한 스테레오비디오 동기화 방법)

Lee, Dong-Jin;Lee, Seon-Oh;Sim, Dong-Gyu;Lee, Hyuk-Joon
- The Journal of Korean Institute of Communications and Information Sciences / v.34 no.4B / pp.349-359 / 2009
In this paper, we propose a stereo-video transmission method that reduces delay and maximizes the 3D effect. Conventional multimedia synchronization algorithms were designed to achieve minimum delay and to synchronize multiple video and audio streams; however, they are not effective for 3D video transmission. We propose a synchronization algorithm that considers the minimum error of time difference between streams for the 3D effect. This minimum error was derived from a 3D subjective quality test. We compute the display time of the delivered videos within the allowed time difference, and the videos are displayed according to that display time. To evaluate the performance of the proposed algorithm, we implemented a real-time video communication system and conducted a subjective quality test with it. We found that the video quality displayed by the proposed algorithm ranks 'good' and 'excellent' on the DMOS (Differential Mean Opinion Score) scale, based on the MOS (Mean Opinion Score) test.

Performance Improvement of Perceptual Filter Using Noise Energy Control (잡음 에너지 제어를 통한 지각 필터 성능 개선)

Seo Joung-Kook;Cha Hyung-Tai
- The Journal of the Acoustical Society of Korea / v.24 no.1 / pp.43-51 / 2005
In this paper, we propose an algorithm that improves the tone quality of a noisy audio signal by enhancing the performance of a perceptual filter through noise energy control. Most algorithms proposed by other researchers apply a filter using the noise energy acquired from a silent range. In this case, the improvement in tone quality decreases if the noise energy changes with the magnitude or environment variation within a signal frame. The proposed method, by contrast, provides a means to find a good noise estimate through energy control of the estimated noise obtained from a silent range. It also enhances tone quality in the low-frequency band, unlike other methods. To show the performance of the proposed algorithm, various input signals with different signal-to-noise ratios (SNR) of 5dB, 10dB, 15dB and 20dB were used. With the proposed algorithm, we confirmed the enhancement of tone quality in terms of segmental SNR (SSNR), noise-to-mask ratio (NMR) and the mean opinion score (MOS) test.
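The abstract above reports results in segmental SNR (SSNR) without defining it. As a rough illustration only, the sketch below uses the commonly cited definition: the per-frame SNR in dB, averaged over frames. The function name `segmental_snr`, the frame length of 256 samples, and the clamping range of -10 dB to 35 dB are assumptions chosen for this example and are not taken from the paper.

```python
import math


def segmental_snr(clean, enhanced, frame_len=256, floor_db=-10.0, ceil_db=35.0):
    """Average per-frame SNR in dB between a clean reference and a processed signal.

    frame_len, floor_db and ceil_db are illustrative defaults (assumptions),
    not values reported in the abstract.
    """
    if len(clean) != len(enhanced):
        raise ValueError("signals must have the same length")
    frame_snrs = []
    # Step through non-overlapping frames; a trailing partial frame is ignored.
    for start in range(0, len(clean) - frame_len + 1, frame_len):
        ref = clean[start:start + frame_len]
        out = enhanced[start:start + frame_len]
        signal_energy = sum(x * x for x in ref)
        noise_energy = sum((x - y) ** 2 for x, y in zip(ref, out))
        if signal_energy == 0.0:
            continue  # skip frames where the reference is silent
        if noise_energy == 0.0:
            snr_db = ceil_db  # identical frame: treat as the upper clamp
        else:
            snr_db = 10.0 * math.log10(signal_energy / noise_energy)
        # Clamp so that a few extreme frames do not dominate the average.
        frame_snrs.append(min(max(snr_db, floor_db), ceil_db))
    if not frame_snrs:
        raise ValueError("no usable frames")
    return sum(frame_snrs) / len(frame_snrs)
```

Calling `segmental_snr(clean, enhanced)` on two equal-length sample lists returns a single value in dB; comparing that value before and after filtering is one way such an improvement could be quantified.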

A Study on Comparison of Pronunciation Accuracy of Soprano Singers

Song, Uk-Jin;Park, Hyungwoo;Bae, Myung-Jin
- International journal of advanced smart convergence / v.6 no.2 / pp.59-64 / 2017
Female vocalists' voices are classified into three types according to vocal range: soprano, mezzo-soprano, and contralto. Among them, the soprano has the highest vocal range. Since the voice is generated through the human vocal tract based on the voice generation model, it is greatly influenced by the vocal tract. The structure of the vocal organs differs from person to person, and the formant characteristics of vocalization differ accordingly. A formant characteristic is one in which a specific frequency band appears distinctly due to resonance occurring in each part of the vocal tract during vocalization. Formant characteristics include individual characteristics arising in the throat, jaw, lips, and teeth, as well as the phonological properties of phonemes. The first formant is attributed to the throat, the second to the jaw, and the third and fourth to resonance in the lips and teeth. Pronunciation is thus influenced not only by phonological information but also by the jaw, lips and teeth. When the mouth opening is small or the jaw is stiff during pronunciation, pronunciation becomes unclear. Therefore, the higher the accuracy of the pronunciation characteristics, the more clearly the formant characteristics appear in the spectrogram. However, many soprano singers cannot open their mouths fully because their jaws, lips, teeth, and facial muscles are rigid in order to maintain high tones when singing, which makes the pronunciation unclear and thus the formant characteristics unclear. In this paper, in order to confirm the accuracy of the pronunciation characteristics of soprano singers, we selected Korean soprano singers A, B, C, D, and E as the experimental group, analyzed their spectrograms, and conducted a MOS test for pronunciation recognition. As a result, soprano singer B showed clear formants from F1 to F5, and the MOS test showed the highest recognition rate at 4.6 points. For soprano singers A, C, and D, formants F1 to F3 appeared, but it was difficult to find formants above 2kHz. Finally, for soprano singer E it was difficult to find formants overall, and the MOS test showed the lowest recognition rate at 2.1 points. Therefore, we confirmed that soprano singer B, who exhibits the most distinct formant characteristics in the spectrogram, has the best pronunciation accuracy.
https://doi.org/10.7236/IJASC.2017.6.2.59

Adaptive Enhancement Algorithm of Perceptual Filter Using Variable Threshold (가변 임계값을 이용한 지각 필터의 적응적인 음질 개선 알고리즘)

차형태
- The Journal of the Acoustical Society of Korea / v.23 no.6 / pp.446-453 / 2004
In this paper, a new adaptive perceptual filter using a variable threshold is proposed to enhance audio signals degraded by additive nonstationary noise. The adaptive perceptual filter updates the variable threshold each time according to the power of the signal and the effect of noise variation, so the noisy audio signal is enhanced by a method that controls residual noise effectively. The proposed algorithm uses a perceptual filter that transforms a time-domain signal into the frequency domain and calculates an intensity energy and an excitation energy in the Bark domain. In this method, the stage at which the filter response is updated is decided by the threshold. The proposed algorithm using the variable threshold effectively controls residual noise using the energy difference of audio signals degraded by additive nonstationary noise. The proposed method is tested with noisy audio signals degraded by nonstationary noise at various signal-to-noise ratios (SNR). We carry out NMR and MOS tests at input SNRs of 15dB, 20dB, 25dB and 30dB. Approximate improvements of 17.4dB, 15.3dB, 12.8dB and 9.8dB in NMR and enhancements of 2.9, 2.5, 2.3 and 1.7 in the MOS test are achieved for these input signals, respectively.

UA Tree-based Reduction of Speech DB in a Large Corpus-based Korean TTS (대용량 한국어 TTS의 결정트리기반 음성 DB 감축 방안)

Lee, Jung-Chul
- Journal of the Korea Society of Computer and Information / v.15 no.7 / pp.91-98 / 2010
Large corpus-based concatenative Text-to-Speech (TTS) systems can generate natural synthetic speech without additional signal processing. Because improving the naturalness, personality, speaking style, and emotion of synthetic speech requires a larger speech DB, it is necessary to prune redundant speech segments in a large speech segment DB. In this paper, we propose a new method to construct a segmental speech DB for the Korean TTS system based on a clustering algorithm that downsizes the segmental speech DB. For the performance test, synthetic speech was generated using the Korean TTS system, which consists of a language processing module, prosody processing module, segment selection module, speech concatenation module, and segmental speech DB. A MOS test was carried out with a set of synthetic speech generated with 4 different segmental speech DBs, which we constructed by combining the CM1 (or CM2) tree clustering method with the full DB (or reduced DB). Experimental results show that the proposed method can reduce the size of the speech DB by 23% and still obtain a high MOS in the perception test. Therefore, the proposed method can be applied to build a small-sized TTS.
https://doi.org/10.9708/jksci.2010.15.7.091

A Study on Objective Quality Assessment for Synthesized speech by Rule (규칙합성음의 객관적 품질평가에 관한 연구)

홍진우;김순협
- Journal of the Korean Institute of Telematics and Electronics B / v.30B no.10 / pp.42-49 / 1993
In this paper, we evaluate the quality of speech synthesized by rule using the LPC CD as an objective measure, and then compare the test result with the subjective one. The speech used for the test consists of 108 words, selected by a word construction method using Korean attributes and frequency distribution and synthesized by a demi-syllable rule. By evaluating the quality of speech synthesized by rule objectively, we have tried to resolve problems that arise with subjective measures, such as long evaluation time, expansion of the test scale, and variability of the analysis results. We have also shown the validity of the objective test using the LPC CD by comparing intelligibility, which is the index for subjective quality evaluation of speech synthesized by rule, with MOS. From these results, we can provide a guide for quality assessment that would be useful in the R&D of synthesis methods and of commercial products using synthesized speech.

The Study for Investigation of the sufficient vertical profile with reducing loading effect for silicon deep trench etching (Vertical Profile Silicon Deep Trench Etch와 Loading effect의 최소화에 대한 연구)

Kim, Sang-Yong;Jeong, Woo-Yang;Yi, Keun-Man;Kim, Chang-Il
- Proceedings of the Korean Institute of Electrical and Electronic Material Engineers Conference / 2009.06a / pp.118-119 / 2009
This paper presents the feature profile evolution of silicon deep trench etching, which is crucial for commercial wafer process applications. The silicon deep trenches were etched with a process recipe based on SF6 gas and HBr gas. The optimized silicon deep trench process resulted in vertical profiles (87°~90°) with a loading effect of < 1%. The process recipes were developed for silicon deep trench etching applications. This scheme provides vertical profiles, and no notching of the top corner was observed. In this study, the SF6-gas-based silicon deep trench etch process performed much more strongly than expected on the basis of the HBr gas trench process, as investigated by scanning electron microscope (SEM). Based on the test results, it is concluded that the silicon deep trench etching shows a sufficient profile for a practical MOSFET silicon deep trench technology process.

Time-Dependent Dielectric Breakdown Characteristics of Thin $SiO_2$ Films and Their Correlation to Defects in the Oxide (얇은 산화막의 TDDB 특성과 막내의 결함과의 상관성)

Sung, Yung-Kwon;Choi, Jong-Ill;Kim, Sang-Yung;Han, Sung-Jin
- Proceedings of the KIEE Conference / 1988.11a / pp.147-150 / 1988
Since the integration level of VLSI circuits progresses very quickly, a highly reliable thin $SiO_2$ film is required to fabricate a small-geometry MOS device. In the present study we have attempted to eliminate the failure-causing defects that develop in thin oxide films during the oxidation step by performing a long-time preoxidation and postoxidation annealing. The TDDB test and the copper decoration method were used to calculate the oxide defect density of the MOS device. The dielectric reliability of high-quality thin oxides has been studied by using the time-zero-dielectric-breakdown (ramp-voltage-stressed I-V) and time-dependent-dielectric-breakdown (constant-stressed I-V) tests. Failure times against temperature and electric field are examined, and acceleration factors are obtained for each parameter. Based on the data obtained, the breakdown wearout limitation for thin oxide films is estimated.

A Short-term and Long-term Usability Testing of the Speech Synthesizer for the People with Visual Impairments (시각장애인용 음성합성기에 대한 장/단기 사용성 평가)

Lee, H.Y.;Hong, K.H.
- Journal of rehabilitation welfare engineering & assistive technology / v.9 no.1 / pp.53-60 / 2015
We conducted long-term and short-term usability testing on the built-in speech synthesizer of a screen reader for people with visual impairments. A total of 20 persons with visual impairments participated in the short-term usability testing, and 10 of them participated in the long-term usability testing. Naturalness and clarity of the synthetic speech were evaluated by MOS scores, preference among various synthetic voices was examined through a preference test, and the users' satisfaction level and other requirements for the synthetic speech were evaluated through open feedback. We also examined naturalness, clarity, preference, and user requirements for the synthetic speech through the long-term usability testing. We then compare and contrast the long-term and short-term usability testing results.
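Several entries in this listing report MOS results without showing how a MOS is obtained. As a minimal sketch, a MOS is the arithmetic mean of listeners' opinion ratings. The example below assumes the usual five-point scale (1 = bad, 5 = excellent) and adds a normal-approximation 95% confidence half-width; the function name `mean_opinion_score` and the confidence-interval choice are assumptions for illustration and are not taken from any of the listed papers.

```python
import math
from statistics import mean, stdev


def mean_opinion_score(ratings):
    """Return (MOS, 95% confidence half-width) for ratings on a 1-5 scale.

    The 1-5 scale and the 1.96 normal-approximation factor are assumptions
    made for this sketch.
    """
    if len(ratings) < 2:
        raise ValueError("need at least two ratings")
    if any(r not in (1, 2, 3, 4, 5) for r in ratings):
        raise ValueError("ratings must be integers from 1 to 5")
    mos = mean(ratings)
    # Standard error of the mean, scaled to an approximate 95% interval.
    half_width = 1.96 * stdev(ratings) / math.sqrt(len(ratings))
    return mos, half_width
```

With small listener panels, such as the 10 to 20 participants mentioned above, the half-width can be large, so reporting it alongside the MOS helps when comparing systems.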

Automatic Music-Story Video Generation Using Music Files and Photos in Automobile Multimedia System (자동차 멀티미디어 시스템에서의 사진과 음악을 이용한 음악스토리 비디오 자동생성 기술)

Kim, Hyoung-Gook
- The Journal of The Korea Institute of Intelligent Transport Systems / v.9 no.5 / pp.80-86 / 2010
This paper presents an automated music story video generation technique as one of the entertainment features of a vehicle multimedia system. The automated music story video generation system connects the user's mobile phone to the vehicle's multimedia system and automatically creates stories to accompany music using photos stored on the phone. Users watch the generated music story video while they listen to the music according to mood. The performance of the automated music story video generation is measured by the accuracies of music classification, photo classification, and text-keyword extraction, and by the results of a user MOS test.
