• Title/Abstract/Keywords: unseen model


Unseen Model Prediction using an Optimal Decision Tree

  • 김성탁;김회린
    • 대한음성학회지:말소리 / No. 45 / pp.117-126 / 2003
  • Decision tree-based state tying has become, in recent years, the most popular approach for clustering the states of context-dependent hidden Markov models in speech recognition. The aims of state tying are to reduce the number of free parameters and to predict the state probability distributions of unseen models. However, the size of the decision tree is very important for word-independent recognition. In this paper, we construct an optimized decision tree based on the average of the feature vectors in the state pool and the number of seen models. We observed that the proposed optimal decision tree is effective in predicting the state probability distributions of unseen models.


Topic Expansion based on Infinite Vocabulary Online LDA Topic Model using Semantic Correlation Information

  • 곽창욱;김선중;박성배;김권양
    • 정보과학회 컴퓨팅의 실제 논문지 / Vol. 22, No. 9 / pp.461-466 / 2016
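  • Topic expansion is a method that incrementally expands topics by incorporating additional external data in order to improve the quality of the learned topics. In existing online-learning topic models, when external data are used for expansion, new words are not reflected in the previously trained model. This paper studies a topic model expansion method that incorporates external data using the infinite vocabulary online LDA topic model. During expansion training, topics are expanded by reflecting the similarity between the previously formed topics and the words of the added external data. In the experiments, the proposed method was compared with existing topic expansion models. Because the proposed method reflects words from related external documents in the topic model, the topics could include information that the script topics did not cover. It also outperformed the comparison models in the coherence evaluation.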

Zero-shot voice conversion with HuBERT

  • Hyelee Chung;Hosung Nam
    • 말소리와 음성과학 / Vol. 15, No. 3 / pp.69-74 / 2023
  • This study introduces an innovative model for zero-shot voice conversion that utilizes the capabilities of HuBERT. Zero-shot voice conversion models can transform the speech of one speaker to mimic that of another, even when the model has not been exposed to the target speaker's voice during the training phase. Comprising five main components (HuBERT, feature encoder, flow, speaker encoder, and vocoder), the model offers remarkable performance across a range of scenarios. Notably, it excels in the challenging unseen-to-unseen voice-conversion tasks. The effectiveness of the model was assessed based on the mean opinion scores and similarity scores, reflecting high voice quality and similarity to the target speakers. This model demonstrates considerable promise for a range of real-world applications demanding high-quality voice conversion. This study sets a precedent in the exploration of HuBERT-based models for voice conversion, and presents new directions for future research in this domain. Despite its complexities, the robust performance of this model underscores the viability of HuBERT in advancing voice conversion technology, making it a significant contributor to the field.

VELOCITY ANALYSIS OF M13 BY MAXIMUM LIKELIHOOD METHOD

  • Oh, K.S.;Lin, D. N. C.
    • 천문학회지 / Vol. 25, No. 1 / pp.1-9 / 1992
  • We present a new approach to the analysis of velocity data of globular clusters. The maximum likelihood method is applied to obtain model parameters such as the central potential, the anisotropy radius, and the total mass fraction in each mass class. This method avoids the problems of the conventional chi-square binning method. We utilize three velocity components, one from line-of-sight radial velocity and two from proper motion data. In our simplified scheme we adopt a 3-mass-component model with unseen high-mass stars, intermediate visible stars, and low-mass dark remnants. Likelihood values are obtained for 124 stars in M13 for various model parameters. Our preferred model has a central potential of $W_o=7$ and an anisotropy radius of 7 core radii. It suggests a non-negligible amount of unseen high-mass stars and a considerable amount of dark remnants in M13.


Efficient context dependent process modeling using state tying and decision tree-based method

  • 안찬식;오상엽
    • 한국멀티미디어학회논문지 / Vol. 13, No. 3 / pp.369-377 / 2010
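  • In vocabulary recognition systems that use hidden Markov models (HMMs), models that do not appear during training lower the recognition rate, and whenever the target vocabulary is changed or extended, the database must be collected and the training process run again to regenerate the models, which incurs additional time and cost. This paper proposes an efficient context-dependent process modeling method that uses a decision tree method and a model sharing method. The proposed method applies model sharing to the generated models to reduce model regeneration and provides robust and accurate context-dependent acoustic modeling. It also reduces the number of models and provides context-dependent similar-phoneme models for models that do not appear during training, which resolves the unseen-model problem and secures trainability. In vocabulary-dependent and vocabulary-independent recognition experiments on six speech databases, the proposed method achieved 98.01% in the vocabulary-dependent experiment and 97.38% in the vocabulary-independent experiment.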
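
As a rough illustration of how a decision tree can supply a model for a context that never appeared in training, the sketch below walks a tiny phonetic decision tree for one HMM state of the phone "a". The tree, the phone classes, the tied-state names, and the helper `find_tied_state` are all hypothetical and chosen for illustration; the abstract does not specify the questions or the tree used in the paper.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class Node:
    # A leaf carries the id of a tied state; an internal node carries a question.
    tied_state: Optional[str] = None
    side: str = "left"                  # which context phone the question inspects
    phones: Optional[Set[str]] = None   # question: "is that context phone in this set?"
    yes: Optional["Node"] = None
    no: Optional["Node"] = None


def find_tied_state(node: Node, left: str, right: str) -> str:
    """Answer the context questions from the root down to a leaf."""
    while node.tied_state is None:
        context = left if node.side == "left" else right
        node = node.yes if context in node.phones else node.no
    return node.tied_state


# Hypothetical phone classes and tree (assumptions, not taken from the paper).
NASALS = {"m", "n", "ng"}
VOWELS = {"a", "e", "i", "o", "u"}

tree = Node(
    side="left", phones=NASALS,
    yes=Node(tied_state="a_s2_after_nasal"),
    no=Node(
        side="right", phones=VOWELS,
        yes=Node(tied_state="a_s2_before_vowel"),
        no=Node(tied_state="a_s2_other"),
    ),
)

# The triphone ng-a+k may never occur in the training data, yet it still
# reaches a leaf, so it can share the distribution stored for that leaf.
print(find_tied_state(tree, left="ng", right="k"))
```

Because every possible context reaches some leaf, an unseen context-dependent model borrows the parameters of acoustically similar seen models instead of requiring retraining.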

Inverse Document Frequency-Based Word Embedding of Unseen Words for Question Answering Systems

  • 이우인;송광호;심규석
    • 정보과학회 논문지 / Vol. 43, No. 8 / pp.902-909 / 2016
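  • A question answering system finds the answer to a user's question; whereas a conventional search engine only returns links to documents related to the user's query, a question answering system returns the final answer to the question. Research needed for open-domain question answering systems, which handle diverse questions without being restricted to a particular field, has recently attracted great interest in various academic fields such as natural language processing, artificial intelligence, and data mining. However, although words that were absent from the training data can play a decisive role in distinguishing the correct answer from similar incorrect answers, existing studies replace all such unseen words with a single token. This paper proposes a method that computes vectors for these unknown words from context information, and a model that processes the context information more efficiently by using inverse document frequency weights. Extensive experiments confirm that the training speed and accuracy of the question answering model improve over previous work.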
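
One simple way to picture the idea of building a vector for an unseen word from its context with inverse document frequency (IDF) weights is an IDF-weighted average of the known context word vectors. The sketch below is only that: the abstract does not describe the paper's actual model, and the toy corpus, the two-dimensional embeddings, and the function names are assumptions made for illustration.

```python
import math
from collections import Counter


def inverse_document_frequency(documents):
    """Return idf(w) = log(N / df(w)) for every word in the corpus."""
    n_docs = len(documents)
    doc_freq = Counter(word for doc in documents for word in set(doc))
    return {word: math.log(n_docs / df) for word, df in doc_freq.items()}


def embed_unseen_word(context, embeddings, idf):
    """IDF-weighted average of the embeddings of known context words."""
    dim = len(next(iter(embeddings.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for word in context:
        weight = idf.get(word, 0.0)
        if word not in embeddings or weight <= 0.0:
            continue  # skip unknown words and words that carry no IDF weight
        for i, value in enumerate(embeddings[word]):
            total[i] += weight * value
        weight_sum += weight
    if weight_sum == 0.0:
        return total  # no usable context: fall back to the zero vector
    return [value / weight_sum for value in total]


# Toy corpus and embeddings (hypothetical values).
documents = [
    ["the", "capital", "of", "france"],
    ["the", "river", "of", "ice"],
    ["the", "capital", "city"],
]
embeddings = {"the": [1.0, 1.0], "capital": [1.0, 0.0], "city": [0.0, 1.0]}

idf = inverse_document_frequency(documents)
# Context words surrounding an unseen word in a question.
vector = embed_unseen_word(["the", "capital", "city"], embeddings, idf)
print(vector)
```

In this toy corpus "the" appears in every document, so its IDF is zero and it contributes nothing, while the rarer "city" pulls the resulting vector more strongly than "capital".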

A Feature-Based Malicious Executable Detection Approach Using Transfer Learning

  • Zhang, Yue;Yang, Hyun-Ho;Gao, Ning
    • 인터넷정보학회논문지 / Vol. 21, No. 5 / pp.57-65 / 2020
  • At present, virus recognition systems usually use a signature approach to detect malicious executable files, but these methods often fail to detect new and invisible malware. Some methods try to use more general features to detect malware and achieve some success. Machine learning-based approaches, which depend on features extracted from malicious code, are also applied to detect malware. However, differences between the feature distributions of the training and testing datasets impact the effectiveness of the detection models, and generating labeled datasets takes a significant amount of time, which degrades the performance of the learning method. In this paper, we use transfer learning to detect new and previously unseen malware. We first extract the features of Portable Executable (PE) files, then combine a transfer learning training model with the KNN approach to detect new and unseen malware. We also evaluate the detection performance of the classifier in terms of precision, recall, F1, and so on. The experimental results demonstrate that the proposed method achieves high detection rates and can be expected to perform equally well in real-world environments.

Improvement of roll force precalculation accuracy in cold mill using a corrective neural network

  • 이종영;조형석;조성준;조용중;윤성철
    • 제어로봇시스템학회:학술대회논문집 / Proceedings of the 1996 Korea Automatic Control Conference (domestic session), 제어로봇시스템학회; Pohang University of Science and Technology, Pohang; 24-26 Oct. 1996 / pp.1083-1086 / 1996
  • The cold rolling mill process in a steel works uses stands of rolls to flatten a strip to a desired thickness. In this process, precalculation determines the mill settings before a strip actually enters the mill and is done by an outdated mathematical model. A corrective neural network model is proposed to improve the accuracy of the roll force prediction. Additional variables fed to the network include the chemical composition of the coil, its coiling temperature, and the aggregated amount of processed strips of each roll. The network was trained using standard backpropagation on 4,944 process data records collected from the no. 1 cold rolling mill from March 1995 through December 1995, and was then tested on 1,586 unseen records from January 1996 through April 1996. The combined model reduced the prediction error by 32.8% on average.


Text-Independent Speaker Verification Using Variational Gaussian Mixture Model

  • Moattar, Mohammad Hossein;Homayounpour, Mohammad Mehdi
    • ETRI Journal / Vol. 33, No. 6 / pp.914-923 / 2011
  • This paper concerns robust and reliable speaker model training for text-independent speaker verification. The baseline speaker modeling approach is the Gaussian mixture model (GMM). In text-independent speaker verification, the amount of speech data may be different for speakers. However, we still wish the modeling approach to perform equally well for all speakers. Besides, the modeling technique must be least vulnerable against unseen data. A traditional approach for GMM training is expectation maximization (EM) method, which is known for its overfitting problem and its weakness in handling insufficient training data. To tackle these problems, variational approximation is proposed. Variational approaches are known to be robust against overtraining and data insufficiency. We evaluated the proposed approach on two different databases, namely KING and TFarsdat. The experiments show that the proposed approach improves the performance on TFarsdat and KING databases by 0.56% and 4.81%, respectively. Also, the experiments show that the variationally optimized GMM is more robust against noise and the verification error rate in noisy environments for TFarsdat dataset decreases by 1.52%.

Learning Deep Representation by Increasing ConvNets Depth for Few Shot Learning

  • Fabian, H.S. Tan;Kang, Dae-Ki
    • International journal of advanced smart convergence / Vol. 8, No. 4 / pp.75-81 / 2019
  • Though recent advances in deep learning methods have provided satisfactory results in large-data domains, they yield poor performance on few-shot classification tasks. Training a model with strong performance, i.e., a deep convolutional neural network, depends heavily on a huge dataset, and the number of labeled classes in the dataset can be extremely large. The cost of human annotation and the scarcity of data among the classes have drastically limited the capability of current image classification models. In contrast, humans are excellent at learning or recognizing new unseen classes from merely a small set of labeled examples. Few-shot learning aims to train a classification model with limited labeled samples to recognize new classes that have never been seen during the training process. In this paper, we increase the backbone depth of the embedding network in order to learn the intra-class variation. By increasing the network depth of the embedding module, we are able to achieve competitive performance due to the minimized intra-class variation.