Search | Korea Science

LM Clustering based Dynamic LM Interpolation for ASR N-best Rescoring (언어모델 군집화와 동적 언어모델 보간을 통한 음성인식 성능 향상)

Chung, Euisok;Jeon, Hyung-Bae;Jung, Ho-Young;Park, Jeon-Gue
- Annual Conference on Human and Language Technology / 2015.10a / pp.240-245 / 2015
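General-domain speech recognition requires large language models because of n-gram sparsity. A large language model can be implemented as a distributed model, and recognition performance can be improved by dynamically interpolating language models according to the user input. This paper attempts a new approach to dynamic language model interpolation. Topic-specific language models are built through text clustering, where each topic corresponds to a domain of user input. The paper presents an approach that computes the interpolation weights of the topic-specific language models in real time for each user input. In addition, to reduce the cost of computing the interpolation weights, it applies language model clustering to relieve the computational burden of interpolating large language models. In experiments, dynamic language model interpolation based on topic-specific language models and language model clustering achieved a 6.89% reduction in speech recognition errors. Language model clustering also improved execution time by 17.6% at the cost of a 0.09% drop in recognition accuracy.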
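The abstract above does not say how the interpolation weights are computed. As a minimal sketch only, assuming the common expectation-maximization (EM) estimate for a linear mixture of component language models, the weights for a user input could be fitted as follows. The function names and the `token_probs` layout are hypothetical and are not taken from the paper.

```python
def estimate_interpolation_weights(token_probs, iterations=50):
    """Fit mixture weights for K component language models with EM.

    token_probs[t][k] is the probability that component model k assigns
    to token t of the user input (assumed to be strictly positive).
    """
    k = len(token_probs[0])
    weights = [1.0 / k] * k  # start from a uniform mixture
    for _ in range(iterations):
        totals = [0.0] * k
        for probs in token_probs:
            mix = sum(w * p for w, p in zip(weights, probs))
            # E-step: share of this token explained by each component
            for i in range(k):
                totals[i] += weights[i] * probs[i] / mix
        # M-step: new weight is the average responsibility
        weights = [t / len(token_probs) for t in totals]
    return weights


def interpolate(weights, probs):
    """Linearly interpolated probability of one token."""
    return sum(w * p for w, p in zip(weights, probs))


# Toy input: the first topic model fits every token better than the second.
toy_probs = [[0.5, 0.1], [0.4, 0.05], [0.6, 0.2]]
toy_weights = estimate_interpolation_weights(toy_probs)
```

Because each token's responsibilities sum to one, the fitted weights always sum to one, and the component that explains the input better receives the larger weight.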

N-gram Adaptation using Information Retrieval and Dynamic Interpolation Coefficient (정보검색 기법과 동적 보간 계수를 이용한 N-gram 적응)

Choi, Joon-Ki;Oh, Yung-Hwan
- Proceedings of the KSPS conference / 2005.11a / pp.107-112 / 2005
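Language model adaptation for continuous speech recognition merges a baseline language model with an adapted language model built from an adaptation corpus that contains only domain-specific information. To construct the adaptation corpus using only the corpus already held by the recognition system, without any additional data, this paper uses a language-model-based information retrieval technique. To merge the adapted language model built from the retrieved adaptation corpus with the baseline language model, the paper proposes dividing the input speech into segments and finding the optimal dynamic interpolation coefficient for each segment. The proposed adaptation-corpus construction method and dynamic interpolation coefficients improved Korean broadcast news recognition performance by 3.6% absolute over the baseline language model and by 13.6% relative over conventional static interpolation coefficients estimated from validation data.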

Statistical Space-Time Metamodels Based on Multiple Responses Approach for Time-Variant Dynamic Response of Structures (구조물의 시간-변화 동적응답에 대한 다중응답접근법 기반 통계적 공간-시간 메타모델)

Lee, Jin-Min;Lee, Tae-Hee
- Transactions of the Korean Society of Mechanical Engineers A / v.34 no.8 / pp.989-996 / 2010
Statistical regression and/or interpolation models have been used in structural engineering for data analysis and response prediction based on the results of physical experiments and/or computer simulations. These models have been employed during the last decade to develop a variety of design methodologies. However, they handle only responses with respect to space variables, such as the size and shape of structures, and cannot handle time-variant dynamic responses, i.e., responses that vary with time. In this research, statistical space-time metamodels based on a multiple-response approach are proposed that can handle responses with respect to both space variables and a time variable. Regression and interpolation models, namely the response surface model (RSM) and the kriging model, were developed for handling time-variant dynamic responses in structural engineering. We evaluate the accuracy of the responses predicted by the two statistical space-time metamodels by comparing them with the responses obtained from physical experiments and/or computer simulations.
https://doi.org/10.3795/KSME-A.2010.34.8.989

Reference Points Selection for Interpolation in Digital Elevation Model (수치표고모델의 보간기준점 선정에 관한 연구)

최병길;김욱남;진세일
- Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.21 no.2 / pp.131-136 / 2003
The method used to select reference points for interpolation is very important in a digital elevation model (DEM). However, no accurate standard has been defined so far, so users select the reference points at their own discretion. This paper studies the accurate selection of reference points for DEM interpolation. It analyzes a selection method based on the number of points and a selection method based on the average distance calculated from irregularly distributed points. The results show that the kriging method applying the average distance is more efficient for constructing a DEM.

HRTF Interpolation Using a Spherical Head Model (원형 머리 모델을 이용한 머리 전달 함수의 보간)

Lee, Ki-Seung;Lee, Seok-Pil
- The Journal of the Acoustical Society of Korea / v.27 no.7 / pp.333-341 / 2008
In this paper, a new interpolation model for the head-related transfer function (HRTF) is proposed. In this method, we assume that the impulse response of the HRTF for each azimuth angle is given by linear interpolation of the time-delayed impulse responses of neighboring HRTFs. The time delay of the HRTF for each azimuth angle is given by the sum of two terms: the sound wave propagation time from the ears to the sound source, which can be estimated from the azimuth angle, the physical shape of the underlying head, and the distance between the head and the sound source, and a refinement time that yields the minimum mean square error. Moreover, in the proposed model, the interpolation intervals are not fixed but variable; they are determined by minimizing the total number of HRTFs while ensuring that the synthesized signals have no perceptual difference from the original signals in terms of sound location. To validate the usefulness of the proposed interpolation model, it was applied to several HRTFs obtained from one dummy head and three human heads. We used HRTFs with a 5-degree azimuth resolution at 0-degree elevation (the horizontal plane). The experimental results showed that using only 30-40% of the original HRTFs was sufficient to produce signals that have no audible differences from the original ones in terms of sound location.
https://doi.org/10.7776/ASK.2008.27.7.333

Image Interpolation Using Linear Modeling for the Absolute Values of Wavelet Coefficients Across Scale (스케일간 웨이블릿 계수 절대치의 선형 모델링을 이용한 영상 보간)

Kim Sang-Soo;Eom Il-Kyu;Kim Yoo-Shin
- Journal of the Institute of Electronics Engineers of Korea SP / v.42 no.6 / pp.19-26 / 2005
Image interpolation in the wavelet domain usually takes advantage of probabilistic models for the intrascale statistics and the interscale dependency. In this paper, we adopt a linear model for the absolute values of the wavelet coefficients of the interpolated image across scale to estimate the variances of the extrapolated bands. The proposed algorithm uses randomly generated wavelet coefficients based on the estimated parameters of the probabilistic model. Random number generation according to the estimated probabilistic model may induce 'salt and pepper' noise in the subbands. We reduce the noise power by Wiener filtering. We observe that the proposed method generates a histogram of the subband coefficients similar to that of the original image. Experimental results show that our method outperforms the previous wavelet-domain interpolation method as well as the conventional bicubic method.

Comparison of Seismic Data Interpolation Performance using U-Net and cWGAN (U-Net과 cWGAN을 이용한 탄성파 탐사 자료 보간 성능 평가)

Yu, Jiyun;Yoon, Daeung
- Geophysics and Geophysical Exploration / v.25 no.3 / pp.140-161 / 2022
Seismic data with regularly or irregularly missing traces are often obtained because of environmental and economic constraints on acquisition. Accordingly, seismic data interpolation is an essential step in seismic data processing. Recently, research on machine learning-based seismic data interpolation has been flourishing. In particular, the convolutional neural network (CNN) and the generative adversarial network (GAN), which are widely used for super-resolution problems in image processing, are also used for seismic data interpolation. In this study, U-Net, a CNN-based algorithm, and the conditional Wasserstein GAN (cWGAN), a GAN-based algorithm, were used as seismic data interpolation methods. The results and performance of the methods were evaluated thoroughly to find an optimal interpolation method that reconstructs missing seismic data with high accuracy. The process for model training and performance evaluation was divided into two cases (Cases I and II). In Case I, we trained the model using only regularly sampled data with 50% missing traces. We evaluated the model by applying it to six different test datasets, which combined regular and irregular sampling with different sampling ratios. In Case II, six different models were generated using training datasets sampled in the same way as the six test datasets. The models were applied to the same test datasets used in Case I to compare the results. We found that cWGAN showed better prediction performance than U-Net, with higher PSNR and SSIM. However, cWGAN introduced additional noise into the prediction results; thus, an ensemble technique was applied to remove the noise and improve the accuracy. The cWGAN ensemble model successfully removed the noise and showed improved PSNR and SSIM compared with the individual models.
https://doi.org/10.7582/GGE.2022.25.3.140
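The abstract above reports PSNR as one of its two quality metrics and mentions an ensemble, without giving details of either. As a rough sketch, assuming the standard PSNR definition and, purely for illustration, an ensemble formed by averaging predictions sample by sample, the two steps could look like this. The function names and the averaging scheme are assumptions and are not described in the abstract.

```python
import math


def psnr(reference, estimate, data_range):
    """Peak signal-to-noise ratio in dB between two equal-length sequences."""
    if len(reference) != len(estimate):
        raise ValueError("sequences must have the same length")
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0.0:
        return math.inf  # identical signals
    return 10.0 * math.log10(data_range ** 2 / mse)


def ensemble_mean(predictions):
    """Average several predictions sample by sample (assumed ensemble scheme)."""
    return [sum(values) / len(values) for values in zip(*predictions)]


# Toy trace: two noisy predictions whose errors partly cancel when averaged.
truth = [0.0, 1.0, 0.0, 1.0]
pred_a = [0.1, 1.0, 0.0, 0.9]
pred_b = [-0.1, 1.0, 0.0, 1.1]
combined = ensemble_mean([pred_a, pred_b])
```

In this toy case the averaged trace scores a higher PSNR than either individual prediction, which is the kind of effect an averaging ensemble is meant to produce when the noise in its members is not strongly correlated.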

Denoising Self-Attention Network for Mixed-type Data Imputation (혼합형 데이터 보간을 위한 디노이징 셀프 어텐션 네트워크)

Lee, Do-Hoon;Kim, Han-Joon;Chun, Joonghoon
- The Journal of the Korea Contents Association / v.21 no.11 / pp.135-144 / 2021
Recently, data-driven decision-making has become a key technology leading the data industry, and the machine learning that supports it requires high-quality training datasets. However, real-world data contain missing values for various reasons, which degrades the performance of prediction models learned from the poor training data. Therefore, to build high-performance models from real-world datasets, many studies on automatically imputing missing values in the initial training data have been conducted. Many conventional machine learning-based imputation techniques are time-consuming and cumbersome because they apply only to numeric columns or require an individual predictive model for each column. This paper therefore proposes a new data imputation technique called the 'Denoising Self-Attention Network (DSAN)', which can be applied to mixed-type datasets containing both numerical and categorical columns. DSAN learns robust feature representation vectors by combining self-attention and denoising techniques, and it can automatically impute multiple missing variables in parallel through multi-task learning. To verify the validity of the proposed technique, data imputation experiments were performed after arbitrarily generating missing values in several mixed-type training datasets. We then show the validity of the proposed technique by comparing the performance of binary classification models trained on the imputed data, together with the errors between the original and imputed values.
https://doi.org/10.5392/JKCA.2021.21.11.135

A Study on Methods of Speaker Adaptation for Speech Recognition (음성인식을 위한 화자적응화 기법에 관한 연구)

이종연
- Proceedings of the Acoustical Society of Korea Conference / 1998.06e / pp.309.2-314 / 1998
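This study investigates speaker adaptation techniques for speech recognition. First, it examines an interpolation-based adaptation method that extends the adaptation effect to syllable categories not included in the adaptation data. With MAPE (maximum a posteriori estimation), which can adapt using only a standard model and a small amount of speech data, the change in the mean vectors of the adapted models is interpolated onto the models not covered by the adaptation utterances. Second, after syllable-unit models are built, the data of the target speaker are automatically segmented into syllable units using concatenated training and the Viterbi algorithm and then adapted with MAPE. Experiments were conducted on each method.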

Mapping of Environmental Data Using Spatial Interpolation Methods (공간보간기법을 이용한 환경자료의 지도화)

Cho, Hong-Lae;Jeong, Jong-Chul
- 한국공간정보시스템학회:학술대회논문집 / 2007.06a / pp.273-279 / 2007
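Most data used in environmental studies have a value at every location in space, but because it is impossible to collect data at every point, the required data are gathered at a few representative sites and then extended to unobserved areas. Spatial interpolation methods are used to predict values at unobserved points from observed data. This paper applies spatial interpolation methods including the local trend surface model, IDW, RBF, and kriging to the annual mean concentration of fine dust (PM10) in Seoul and examines their accuracy. Accuracy was assessed using the range of predicted values, RMSE, and mean error, and the analysis showed that kriging and RBF had high prediction accuracy.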
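Of the methods listed in the abstract above, IDW (inverse distance weighting) is the simplest to illustrate. The following is a minimal sketch, assuming planar coordinates and a power parameter of 2, together with the RMSE measure the abstract uses for evaluation. The station coordinates and values are invented for illustration and are not data from the paper.

```python
import math


def idw(stations, query, power=2.0):
    """Inverse distance weighted estimate at `query` = (x, y).

    `stations` is a list of (x, y, value) observations.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in stations:
        distance = math.hypot(query[0] - x, query[1] - y)
        if distance == 0.0:
            return value  # query coincides with an observation
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


def rmse(predicted, observed):
    """Root mean square error between predictions and observations."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical monitoring stations: (x, y, concentration).
stations = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
midpoint_estimate = idw(stations, (1.0, 0.0))  # equidistant, so the plain average
```

A point equidistant from all stations receives the unweighted mean of their values, and a point that coincides with a station returns that station's value exactly.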

