Search | Korea Science

A Study on the Korean Continuous Speech Recognition using Phonetic Decision Tree-based State Splitting (음소결정트리 상태분할을 이용한 한국어 연속음성인식에 관한 연구)

오세진;황철준;김범국;정호열;정현열
- Proceedings of the Korea Institute of Convergence Signal Processing / 2001.06a / pp.277-280 / 2001
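As a basic study toward improving the performance of continuous speech recognition systems, this paper examines a method for building context-dependent acoustic models using phonetic decision tree-based state splitting and Korean phonetic knowledge, and describes its application to Korean continuous speech recognition. The phonetic decision tree-based state splitting algorithm splits states with the SSS (Successive State Splitting) algorithm along a binary tree, guided at each node by a set of phoneme questions built from Korean phonetic knowledge. The structure obtained by connecting the resulting states into a network is called an HM-Net (Hidden Markov Network), and it represents a context-dependent acoustic model. To verify the effectiveness of the resulting models, continuous speech recognition experiments were carried out on our laboratory's flight reservation sentences (YNU200). In these experiments, the speaker-independent continuous speech recognition rate of the context-dependent acoustic models improved over the conventional single HMM models by 9.9% on average for 1-pass and 4.1% for 2-pass decoding. These results confirm that phonetic decision tree-based state splitting and Korean phonetic knowledge are effective for building context-dependent acoustic models.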
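The question-driven split at a tree node can be illustrated with a small sketch. It covers only the step of choosing which phoneme question to split on, and it assumes one-dimensional features, a single Gaussian per node, and a log-likelihood gain criterion; the question names, phoneme sets, and data are hypothetical and are not taken from the paper.

```python
import math

def gaussian_loglik(values):
    """Log-likelihood of values under their own maximum-likelihood Gaussian (variance floored)."""
    n = len(values)
    mu = sum(values) / n
    var = max(sum((v - mu) ** 2 for v in values) / n, 1e-6)
    return -0.5 * n * (math.log(2 * math.pi * var) + 1)

def best_question(samples, questions):
    """samples: list of (left_context_phoneme, feature); questions: name -> set of phonemes.
    Returns (name, gain) for the yes/no split with the largest log-likelihood gain."""
    parent = gaussian_loglik([x for _, x in samples])
    best = (None, 0.0)
    for name, phones in questions.items():
        yes = [x for p, x in samples if p in phones]
        no = [x for p, x in samples if p not in phones]
        if not yes or not no:
            continue  # a question that does not divide the data cannot split the state
        gain = gaussian_loglik(yes) + gaussian_loglik(no) - parent
        if gain > best[1]:
            best = (name, gain)
    return best

# Hypothetical toy data: features observed after nasal vs. stop left contexts.
samples = [("m", 1.0), ("n", 1.2), ("m", 0.9), ("p", 5.0), ("t", 5.1), ("p", 4.9)]
questions = {"L_Nasal": {"m", "n", "ng"}, "L_Labial": {"m", "p", "b"}}
chosen, gain = best_question(samples, questions)
```

With this toy data the nasal question separates the two clusters far better than the labial one, so it is the one selected.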

A Study on Extracting Ideas from Documents and Webpages in the Field of Idea Mining (아이디어 마이닝 분야에서 문헌과 웹페이지의 아이디어 발췌에 대한 연구)

Lee, Tae-Young
- Journal of the Korean Society for Information Management / v.29 no.1 / pp.25-43 / 2012
Ideas and quasi-ideas useful for human creation were drawn from documents and webpages using extraction methods from idea mining, opinion mining, and topic signal mining. The extraction methods comprised (1) decisive cue phrases, (2) cue figures and sounds, (3) contextual signals, and (4) discourse segmentation. They were tested on idea samples such as thoughts, plans, opinions, writings, figures, sounds, and formulas. When the efficiency of the four methods was judged by the F-measure, which combines recall and precision, methods (1), (3), and (4) received largely positive evaluations. In particular, the decisive cue phrase method was effective for finding ideas, and the contextual signal method was effective for detecting quasi-ideas.
https://doi.org/10.3743/KOSIM.2012.29.1.025
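For readers unfamiliar with the F-measure mentioned in the abstract, the following minimal sketch computes it from raw counts. The counts are hypothetical, and the abstract does not say which weighting the author used; `beta=1` (equal weight on precision and recall) is an assumption.

```python
def f_measure(true_positives, false_positives, false_negatives, beta=1.0):
    """F-beta score from raw counts; beta=1 weights precision and recall equally."""
    if true_positives == 0:
        return 0.0
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Hypothetical counts: 8 correct extractions, 2 spurious, 4 missed.
score = f_measure(8, 2, 4)
```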

A Study on Applied to Optimal Diagnostic Device in Portal Vein Visualization: Focused on MRI and CT (간문맥 묘출을 위한 최적의 영상진단 장치에 관한 연구: MRI, CT 중심으로)

Goo, Eun-Hoe
- Journal of the Korean Society of Radiology / v.13 no.2 / pp.217-225 / 2019
The purpose of this study was to quantify the signal-to-noise ratio (SNR) and contrast-to-noise ratio (CNR) of the portal vein on CT and 3.0T MRI and to identify the optimal imaging device. Data for twenty patients who underwent CT and 3.0T MRI between February 2018 and April 2018 were randomly selected from the picture archiving and communication system. SNR and CNR were evaluated by measuring the mean and standard deviation of regions of interest in four regions of the portal vein (the main portal vein, the right vein, the left vein, and the middle vein). On CT, SNR was 9.18±0.72 in the right portal vein, 9.41±0.84 in the left, 9.54±0.59 in the middle, and 9.55±0.75 in the main portal vein; on 3.0T MRI it was 22.29±2.03 in the right, 25.89±3.19 in the left, 24.39±2.87 in the middle, and 26.64±2.30 in the main portal vein (p<0.05). CNR on CT was 3.79±0.68 in the right portal vein, 3.74±0.65 in the left, 3.71±0.39 in the middle, and 3.79±0.68 in the main portal vein; on 3.0T MRI it was 9.49±0.65 in the right, 11.00±1.90 in the left, 12.70±1.75 in the middle, and 10.01±0.98 in the main portal vein, higher than on CT (p<0.05). In conclusion, SNR and CNR were higher on 3.0T MRI than on CT in all four portal vein regions. Therefore, 3.0T MRI, which uses non-ionizing radiation, was the superior imaging device compared with CT.
https://doi.org/10.7742/jksr.2019.13.2.217
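The abstract says SNR and CNR were derived from the mean and standard deviation of regions of interest but does not give the formulas. The sketch below uses one common convention as an assumption: SNR is the ROI mean over the ROI standard deviation, and CNR is the absolute mean difference between two ROIs over the reference ROI's standard deviation. The pixel values are hypothetical.

```python
from statistics import mean, stdev

def snr(roi_pixels):
    """SNR as ROI mean over ROI sample standard deviation (an assumed convention)."""
    return mean(roi_pixels) / stdev(roi_pixels)

def cnr(roi_pixels, reference_pixels):
    """CNR as absolute mean difference over the reference ROI's standard deviation."""
    return abs(mean(roi_pixels) - mean(reference_pixels)) / stdev(reference_pixels)

# Hypothetical pixel intensities, not data from the study.
portal_vein = [98, 100, 102, 100]
liver_tissue = [48, 50, 52, 50]
vein_snr = snr(portal_vein)
vein_cnr = cnr(portal_vein, liver_tissue)
```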

A Study-on Context-Dependent Acoustic Models to Improve the Performance of the Korea Speech Recognition (한국어 음성인식 성능향상을 위한 문맥의존 음향모델에 관한 연구)

황철준;오세진;김범국;정호열;정현열
- Journal of the Institute of Convergence Signal Processing / v.2 no.4 / pp.9-15 / 2001
In this paper, we investigate context-dependent acoustic models to improve the performance of Korean speech recognition. The approach uses Korean phonological rules and decision trees. With the Successive State Splitting (SSS) algorithm, a Hidden Markov Network (HM-Net), which is an efficient representation of phoneme-context-dependent HMMs, can be generated automatically. SSS is a powerful technique for designing the topologies of tied-state HMMs, but it does not adequately handle contexts that are absent from the training data, and it has some problems in its contextual-domain procedure. We therefore adopt a state-clustering variant of SSS, called Phonetic Decision Tree-based SSS (PDT-SSS), which includes context splits based on Korean phonological rules. This method combines the advantages of decision tree clustering and SSS, and can generate highly accurate HM-Nets that can express any context. To verify the effectiveness of the adopted method, experiments were carried out using the KLE 452-word database and the YNU 200-sentence database. Through Korean phoneme, word, and sentence recognition experiments, we show that the new state-clustering algorithm yields better phoneme, word, and continuous speech recognition accuracy than conventional HMMs.
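Why a decision tree can "express any context" may be easier to see in code. The sketch below is an illustration under assumptions, not the authors' implementation: every phoneme context, including one never seen in training, answers the yes/no questions on the way down and therefore always reaches some tied state. The phoneme sets and state numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    question: Optional[tuple] = None   # (position, phoneme_set); None marks a leaf
    yes: Optional["Node"] = None
    no: Optional["Node"] = None
    state_id: Optional[int] = None     # tied state stored at a leaf

def tied_state(node, left, right):
    """Walk the tree for a (left, right) phoneme context and return the leaf's state."""
    while node.question is not None:
        position, phones = node.question
        phone = left if position == "left" else right
        node = node.yes if phone in phones else node.no
    return node.state_id

# Hypothetical question sets and tree shape.
NASALS = frozenset({"m", "n", "ng"})
VOWELS = frozenset({"a", "i", "u"})
tree = Node(
    question=("left", NASALS),
    yes=Node(state_id=0),
    no=Node(question=("right", VOWELS), yes=Node(state_id=1), no=Node(state_id=2)),
)

# A context such as ("ng", "a") still resolves even if it never occurred in training.
unseen_state = tied_state(tree, "ng", "a")
```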

An algorithm of the Non-uniform synthesis unit selection for concatenative speech synthesis system (연결형 합성시스템을 위한 문맥종속 단위 기반의 비정형 합성단위 추출 알고리즘)

김영일
- Proceedings of the Acoustical Society of Korea Conference / 1998.06e / pp.273.2-277 / 1998
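In this paper, for phoneme-based non-uniform concatenative synthesis, we design a model that predicts the boundary strength between neighboring phonemes and a phoneme-level longest-match search algorithm for retrieving synthesis units, so that formant discontinuities at concatenation points are minimized. To minimize the signal distortion that arises where synthesis units are joined, we build models for consonants that become voiced in the “_C_” environment, vowels that become devoiced in the “_V_” environment, and formant frequency differences between voiced sounds, so that points where the articulatory strength between phonemes is weak are set as synthesis unit boundaries. Once the unit boundaries are determined, candidates are selected from the corpus using only the context information of the given sentence. To measure the connectivity between the selected candidates, we consider the phonetic characteristics and formant transition characteristics of the phonemes before and after each synthesis boundary. The experiments were based on 200 K-ToBI-labeled sentences: one sentence was chosen from the corpus as the target pattern, and the optimal synthesis unit sequence was extracted by computing the unit cost between the target pattern and the candidates and the concatenation cost between candidates. We report on this context-dependent unit-based synthesis unit selection algorithm and on the experimental results.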
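The final step described in the abstract, combining a unit cost against the target with a concatenation cost between candidates, is commonly solved with a Viterbi-style dynamic program. The sketch below shows that general technique under assumptions; the cost functions, the single formant value per unit, and all numbers are hypothetical and do not come from the paper.

```python
def select_units(candidates, target_cost, join_cost):
    """candidates[i] lists the candidate units for target position i.
    Returns (total_cost, sequence) minimizing target plus join costs."""
    # best holds, per candidate at the current position, (cheapest cost, path so far).
    best = [(target_cost(0, u), [u]) for u in candidates[0]]
    for i in range(1, len(candidates)):
        new_best = []
        for u in candidates[i]:
            cost, path = min(
                ((c + join_cost(p[-1], u), p) for c, p in best),
                key=lambda item: item[0],
            )
            new_best.append((cost + target_cost(i, u), path + [u]))
        best = new_best
    return min(best, key=lambda item: item[0])

# Hypothetical units: (label, formant frequency in Hz).
targets = [500, 700, 600]
candidates = [
    [("a1", 480), ("a2", 520)],
    [("b1", 900), ("b2", 690)],
    [("c1", 610), ("c2", 400)],
]
total, sequence = select_units(
    candidates,
    target_cost=lambda i, u: abs(u[1] - targets[i]),   # distance to the target pattern
    join_cost=lambda prev, u: 0.5 * abs(prev[1] - u[1]),  # formant jump at the join
)
labels = [label for label, _ in sequence]
```

Because each position keeps only the cheapest path into every candidate, the search cost grows linearly with sentence length rather than exponentially.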

Implementation of Sentence Construction using Lexical Information (어휘 정보를 이용한 문장완성의 구현)

황인정;이은실;민홍기
- Proceedings of the Korea Institute of Convergence Signal Processing / 2003.06a / pp.10-13 / 2003
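This study constructs colloquial sentences using lexical information. The purpose of colloquial sentence construction is use in communication aids for people who have difficulty with everyday verbal communication. A communication aid is a system that composes the sentence the user wants and outputs it as speech. To construct sentences, lexical information was therefore adapted to fit the concept of a communication aid and incorporated into the system. Vocabulary was extracted and classified by domain, and a thesaurus and a subcategorization dictionary were built for each word. Detailed lexical information is important data for sentence construction, for reuse, and for detecting contextually awkward sentences.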

Multi-band multi-scale DenseNet with dilated convolution for background music separation (배경음악 분리를 위한 확장된 합성곱을 이용한 멀티 밴드 멀티 스케일 DenseNet)

Heo, Woon-Haeng;Kim, Hyemi;Kwon, Oh-Wook
- The Journal of the Acoustical Society of Korea / v.38 no.6 / pp.697-702 / 2019
We propose a multi-band multi-scale DenseNet with dilated convolution that separates background music signals from broadcast content. Dilated convolution can learn the multi-scale context information present in the spectrogram. In computer simulation experiments, the proposed architecture is shown to improve the Signal to Distortion Ratio (SDR) by 0.15 dB and 0.27 dB in 0 dB and -10 dB Signal to Noise Ratio (SNR) environments, respectively.
https://doi.org/10.7776/ASK.2019.38.6.697
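How dilation widens the context seen by a fixed-size kernel can be shown with a one-dimensional sketch. This is a simplified illustration only: the paper applies dilated convolution to spectrograms inside a DenseNet, whereas the function below is a plain, unpadded 1-D operation with a hypothetical name and toy inputs.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """Unpadded 1-D convolution (cross-correlation form) with `dilation` steps between taps."""
    span = (len(kernel) - 1) * dilation + 1   # input samples covered by one output sample
    return [
        sum(k * signal[start + j * dilation] for j, k in enumerate(kernel))
        for start in range(len(signal) - span + 1)
    ]

signal = list(range(10))
dense = dilated_conv1d(signal, [1, 1, 1], dilation=1)   # each output covers 3 samples
wide = dilated_conv1d(signal, [1, 1, 1], dilation=2)    # same 3 weights cover 5 samples
```

The same three weights cover a span of five input samples when the dilation is 2, which is how stacked dilated layers reach wider context without adding parameters.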

An Analysis on Phone-Like Units for Korean Continuous Speech Recognition in Noisy Environments (잡음환경하의 연속 음성인식을 위한 유사음소단위 분석)

Shen Guang-Hu;Lim Soo-Ho;Seo Jun-Bae;Kim Joo-Gon;Jung Ho-Youl;Chung Hyun-Yeol
- Proceedings of the Acoustical Society of Korea Conference / autumn / pp.123-126 / 2004
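This paper reports a basic study on building efficient context-dependent acoustic models for noisy environments, comparing and evaluating continuous speech recognition performance as a function of the number of phone-like units. Previous work [1,2] showed that, for continuous speech recognition, context-dependent models using 39 phone-like units that take allophones into account outperform those using 48 phone-like units. Building on that result, this study carries out the groundwork for constructing context-dependent acoustic models that remain efficient in noisy environments. To cover a range of noise conditions, White, Pink, and LAB noise were added to the speech at signal-to-noise ratios (SNR) of 5 dB, 10 dB, and 15 dB, and continuous speech recognition experiments were run for each number of phone-like units. In the clean condition, the 39-unit set improved the word and sentence recognition rates over the 48-unit set by about 7% and 17%, respectively; across the noisy conditions, the average improvements were 17% and 28%. These results confirm that the 39 phone-like unit set is better suited to Korean continuous speech recognition and remains effective in noisy environments.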
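Adding noise at a fixed SNR, as in the 5 dB, 10 dB, and 15 dB conditions above, amounts to scaling the noise so that the speech-to-noise power ratio hits the target. The sketch below is a generic illustration with synthetic signals; the function names and the test tone are assumptions, not the authors' procedure or data.

```python
import math
import random

def power(samples):
    """Mean squared amplitude of a signal."""
    return sum(x * x for x in samples) / len(samples)

def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so that speech power / noise power equals snr_db, then add it."""
    gain = math.sqrt(power(speech) / (power(noise) * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]

random.seed(0)
speech = [math.sin(2 * math.pi * 5 * t / 200) for t in range(200)]   # stand-in for speech
white = [random.gauss(0.0, 1.0) for _ in range(200)]                 # white noise
noisy = mix_at_snr(speech, white, snr_db=10)

# Check the achieved SNR from the noise actually added.
residual = [y - s for y, s in zip(noisy, speech)]
achieved_db = 10 * math.log10(power(speech) / power(residual))
```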

DNN based Speech Detection for the Media Audio (미디어 오디오에서의 DNN 기반 음성 검출)

Jang, Inseon;Ahn, ChungHyun;Seo, Jeongil;Jang, Younseon
- Journal of Broadcast Engineering / v.22 no.5 / pp.632-642 / 2017
In this paper, we propose a DNN-based speech detection system that uses the acoustic characteristics and context information of media audio. Speech detection, which discriminates between the speech and non-speech contained in media audio, is a necessary preprocessing step for effective speech processing. However, because media audio signals include many types of sound sources, it has been difficult to achieve high performance with conventional signal processing techniques. The proposed method improves speech detection performance by separating the harmonic and percussive components of the media audio and constructing a DNN input vector that reflects its acoustic characteristics and context information. To verify the performance of the proposed system, a data set for speech detection was built from more than 20 hours of drama, and a publicly available 8-hour Hollywood movie data set was also acquired and used in the experiments. Cross-validation on the two data sets shows that the proposed system performs better than the conventional method.
https://doi.org/10.5909/JBE.2017.22.5.632
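One common way to give a DNN "context information" is to concatenate each feature frame with its neighbors before feeding it to the network. The sketch below shows that general idea only; the abstract does not specify how the authors built their input vector, and the function name, window size, and feature values are hypothetical.

```python
def splice_frames(frames, context=2):
    """Concatenate each frame with `context` neighbours on both sides (edge frames repeat)."""
    last = len(frames) - 1
    spliced = []
    for t in range(len(frames)):
        window = []
        for offset in range(-context, context + 1):
            # Clamp the index so the first and last frames are reused at the edges.
            window.extend(frames[min(max(t + offset, 0), last)])
        spliced.append(window)
    return spliced

# Three hypothetical 2-dimensional feature frames.
frames = [[0.0, 0.1], [1.0, 1.1], [2.0, 2.1]]
inputs = splice_frames(frames, context=1)
```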

A Scheduler for Multimedia Data and Evaluation Method (멀티미디어 데이터를 위한 스케쥴러 및 평가법 설계)

유명련;김현철
- Journal of the Institute of Convergence Signal Processing / v.3 no.2 / pp.1-7 / 2002
Since multimedia data such as video and audio must be displayed within a certain time constraint, their computation and manipulation have to be handled under limited conditions. Traditional real-time scheduling algorithms cannot be applied directly, because they are not suited to multimedia scheduling applications that support many clients at the same time. The Rate Regulating Proportional Share Scheduling Algorithm is a scheduling algorithm that takes the time constraints of multimedia data into account. It uses a rate regulator that prevents a task from receiving more resources than its share in a given period. However, this algorithm loses fairness and does not show graceful degradation of performance under overload. This paper proposes a modified algorithm, the Modified Proportional Share Scheduling Algorithm, which considers characteristics of multimedia data such as continuity and time dependency. The proposed scheduling algorithm shows graceful degradation of performance under overload and reduces the number of context switches. Furthermore, a new evaluation method is proposed that can evaluate the flexibility of a scheduling algorithm.
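The rate regulator described above, which caps each task at its proportional share within a period, can be sketched as follows. This is a minimal illustration of that single sentence under assumptions, not the published algorithm: the integer tick budget, the task names, and the weights are all hypothetical.

```python
def schedule_period(weights, demand, period_ticks):
    """Grant CPU ticks for one period.
    weights and demand map task name -> number. A task never receives more than
    its proportional share of the period, which is the regulator's role."""
    total_weight = sum(weights.values())
    granted = {}
    for task, weight in weights.items():
        share = period_ticks * weight // total_weight   # integer tick budget this period
        granted[task] = min(demand[task], share)        # cap the request at the share
    return granted

# Hypothetical tasks: video asks for more than its share, audio for less.
weights = {"video": 3, "audio": 1}
demand = {"video": 10, "audio": 1}
granted = schedule_period(weights, demand, period_ticks=8)
```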

