• Title/Summary/Keyword: text input

N-gram Adaptation Using Information Retrieval and Dynamic Interpolation Coefficient (정보검색 기법과 동적 보간 계수를 이용한 N-gram 언어모델의 적응)

  • Choi Joon Ki;Oh Yung-Hwan
    • MALSORI / no.56 / pp.207-223 / 2005
  • The goal of language model adaptation is to improve the background language model with a relatively small adaptation corpus. This study presents a language model adaptation technique for the case where additional text data for the adaptation do not exist. We propose using an information retrieval (IR) technique with N-gram language modeling to collect the adaptation corpus from baseline text data. We also propose using a dynamic language model interpolation coefficient to combine the background language model and the adapted language model. The interpolation coefficient is estimated from the word hypotheses obtained by segmenting the input speech data reserved for held-out validation data. This allows the final adapted model to improve the performance of the background model consistently. The proposed approach reduces the word error rate by $13.6\%$ relative to the baseline 4-gram for two-hour broadcast news speech recognition.
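
The combination step described in this abstract can be sketched as a linear interpolation of two probabilities. This is a minimal illustration, not the authors' implementation: the abstract does not give the estimation formula, so the EM-style update below is an assumed, commonly used choice, and the names `interpolate` and `estimate_lambda` are hypothetical.

```python
def interpolate(p_background, p_adapted, lam):
    """Linearly interpolate two language model probabilities for one word."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must be in [0, 1]")
    return lam * p_adapted + (1.0 - lam) * p_background


def estimate_lambda(pairs, iterations=20, lam=0.5):
    """Estimate lam from held-out (p_background, p_adapted) pairs.

    Assumption: an EM-style update is used; every probability is > 0.
    """
    for _ in range(iterations):
        # Posterior probability that each word came from the adapted model.
        posteriors = [
            lam * p_a / (lam * p_a + (1.0 - lam) * p_b) for p_b, p_a in pairs
        ]
        lam = sum(posteriors) / len(posteriors)
    return lam
```

When the adapted model assigns higher probability to the held-out words than the background model does, the estimated coefficient moves toward 1, so the adapted model receives more weight.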

Performance comparison of Text-Independent Speaker Recognizer Using VQ and GMM (VQ와 GMM을 이용한 문맥독립 화자인식기의 성능 비교)

  • Kim, Seong-Jong;Chung, Hoon;Chung, Ik-Joo
    • Speech Sciences / v.7 no.2 / pp.235-244 / 2000
  • This paper focuses on implementing a text-independent speaker recognizer using the VQ and GMM algorithms and on studying the characteristics of speaker recognizers that adopt these two algorithms. Because it was difficult to ascertain theoretically the effect the two algorithms have on the speaker recognizer, we performed recognition experiments with various parameters. The experiments showed that the GMM algorithm had better recognition performance than the VQ algorithm, as follows. The GMM showed better performance with small training data, and its recognition rate changed only slightly as the kind of feature vectors and the length of input data varied. On the whole, the GMM showed better recognition performance than the VQ.

Ternary Decomposition and Dictionary Extension for Khmer Word Segmentation

  • Sung, Thaileang;Hwang, Insoo
    • Journal of Information Technology Applications and Management / v.23 no.2 / pp.11-28 / 2016
  • In this paper, we propose a dictionary extension and a ternary decomposition technique to improve the effectiveness of Khmer word segmentation. Most word segmentation approaches depend on a dictionary. However, the dictionary being used is not fully reliable and cannot cover all the words of the Khmer language. This causes the problem of unknown, or out-of-vocabulary, words. Our approach is to extend the original dictionary with new words to make it more reliable. In addition, we use ternary decomposition for the segmentation process. In this research, we also introduce the invisible space of Khmer Unicode (U+200B) in order to segment our training corpus. With our segmentation algorithm, based on ternary decomposition and the invisible space, we can extract new words from our training text and then add the new words to the dictionary. We used the extended wordlist and a segmentation algorithm that does not rely on the invisible space to test an unannotated text. Our results remarkably outperformed other approaches. We achieved precision, recall, and F-measure rates of 88.8%, 91.8%, and 90.6%, respectively.
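
The dictionary extension step in this abstract (splitting annotated text on the invisible space and collecting unseen words) can be sketched as follows. This is an illustrative sketch only: it does not implement the ternary decomposition, and the function name `extract_new_words` and the sample tokens are hypothetical.

```python
ZWSP = "\u200b"  # zero-width space, the "invisible space" named in the abstract


def extract_new_words(annotated_text, dictionary):
    """Return tokens absent from the dictionary, in order of first appearance.

    Assumption: word boundaries in annotated_text are marked with ZWSP.
    """
    seen = set(dictionary)
    new_words = []
    for token in annotated_text.split(ZWSP):
        token = token.strip()
        if token and token not in seen:
            seen.add(token)
            new_words.append(token)
    return new_words


# Placeholder Latin tokens stand in for Khmer words.
sample = "ab" + ZWSP + "cd" + ZWSP + "ab" + ZWSP + "ef"
print(extract_new_words(sample, {"ab"}))
```

The returned words would then be appended to the wordlist used to segment unannotated text.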

Text-driven Speech Animation with Emotion Control

  • Chae, Wonseok;Kim, Yejin
    • KSII Transactions on Internet and Information Systems (TIIS) / v.14 no.8 / pp.3473-3487 / 2020
  • In this paper, we present a new approach to creating speech animation with emotional expressions using a small set of example models. To generate realistic facial animation, two example models called key visemes and expressions are used for lip-synchronization and facial expressions, respectively. The key visemes represent lip shapes of phonemes such as vowels and consonants while the key expressions represent basic emotions of a face. Our approach utilizes a text-to-speech (TTS) system to create a phonetic transcript for the speech animation. Based on a phonetic transcript, a sequence of speech animation is synthesized by interpolating the corresponding sequence of key visemes. Using an input parameter vector, the key expressions are blended by a method of scattered data interpolation. During the synthesizing process, an importance-based scheme is introduced to combine both lip-synchronization and facial expressions into one animation sequence in real time (over 120Hz). The proposed approach can be applied to diverse types of digital content and applications that use facial animation with high accuracy (over 90%) in speech recognition.

Analysis and Interpretation of Intonation Contours of Slovene

  • Ales Dobnikar
    • Proceedings of the KSPS conference / 1996.10a / pp.542-547 / 1996
  • Prosodic characteristics of natural speech, especially intonation, in many cases represent specific feelings of the speaker at the time of the utterance, with relatively wide variation in speaking style over the same text. We analyzed a collected speech corpus recorded with ten Slovene speakers. The observed intonation contours were interpreted for the purpose of modeling the intonation contour in the synthesis process. Based on the results of this analysis, we devised a scheme for modeling the intonation contour for different types of intonation units. The intonation scheme uses a superpositional approach, which defines the intonation contour as the sum of global (intonation unit) and local (accented syllables or syntactic boundaries) components. A near-natural intonation contour was obtained by rules, using only the text of the utterance as input.

Deriving TrueType Features for Letter Recognition in Word Images (워드이미지로부터 영문인식을 위한 트루타입 특성 추출)

  • SeongAh CHIN
    • Journal of the Korea Society for Simulation / v.11 no.3 / pp.35-48 / 2002
  • In the work presented here, we describe a method to extract TrueType features for supporting letter recognition. Although various document processing techniques exist, few methods can recognize a letter together with its TrueType features without OCR, which speeds up processing for image text retrieval. By reviewing the mechanism that generates digital fonts and the origin of TrueType, we observe that each TrueType glyph is drawn from its contour in the glyph table. Hence, we can derive the segment with density for a letter in a specific TrueType font, defined by the number of occurrences over a segment width. Certain occurrence counts appear frequently because the segment width is fixed. We perform letter recognition by comparing the TrueType feature library of a letter with the features from input word images. Experiments carried out to assess the robustness of the proposed method show acceptable results.

Rapid and Brief Communication: GPU implementation of neural networks

  • Oh, Kyoung-Su;Jung, Kee-Chul
    • HCI Society of Korea: Conference Proceedings (한국HCI학회:학술대회논문집) / 2007.02c / pp.322-325 / 2007
  • A graphics processing unit (GPU) is used to speed up an artificial neural network. It implements the matrix multiplication of a neural network to improve the time performance of a text detection system. Preliminary results show a 20-fold performance improvement using an ATI RADEON 9700 PRO board. The parallelism of the GPU is fully utilized by accumulating many input feature vectors and weight vectors, then converting the many inner-product operations into one matrix operation. Further research areas include benchmarking the performance with various hardware and GPU-aware learning algorithms. (c) 2004 Pattern Recognition Society. Published by Elsevier Ltd. All rights reserved.
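
The batching idea in this abstract, turning many inner products into one matrix operation, can be illustrated on the CPU in plain Python. This sketch shows only the algebra, not the GPU speedup, and the function names `inner` and `batched_layer` are hypothetical.

```python
def inner(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def batched_layer(inputs, weights):
    """Compute every input-by-weight inner product as one matrix product.

    inputs:  N feature vectors of length D (rows of X)
    weights: M weight vectors of length D (rows of W)
    returns: N x M matrix whose entry [i][j] is inner(inputs[i], weights[j]),
             i.e. the product X * W^T
    """
    return [[inner(x, w) for w in weights] for x in inputs]


X = [[1, 2], [3, 4]]
W = [[1, 0], [0, 1], [1, 1]]
print(batched_layer(X, W))  # [[1, 2, 3], [3, 4, 7]]
```

Stacking the accumulated vectors this way lets a single matrix routine replace N times M separate inner-product calls, which is the form that parallel hardware handles well.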

The possibility and prospect for developing Sijo Munhwa information system (시조문화 정보시스템 개발의 가능성과 전망)

  • 한창훈
    • Sijohaknonchong / v.19 no.1 / pp.37-62 / 2003
  • This treatise discusses the possibility and prospects for developing a Sijo Munhwa information system. Its contents are summarized as follows. 1. We must gather and correctly input materials, including original texts, as linguistic data to develop the Sijo Munhwa information system. 2. We must consider TEI (Text Encoding Initiative) and a thesaurus when we process the database. 3. Item 2 forms the groundwork for building a Topic Map. 4. It is very important to link the Sijo Munhwa information system, as art material, with visual or auditory images in three dimensions.

Analyzing Customer Experience in Hotel Services Using Topic Modeling

  • Nguyen, Van-Ho;Ho, Thanh
    • Journal of Information Processing Systems / v.17 no.3 / pp.586-598 / 2021
  • Nowadays, users' reviews and feedback on e-commerce sites stored in text create a huge source of information for analyzing customers' experience with goods and services provided by a business. In other words, collecting and analyzing this information is necessary to better understand customer needs. In this study, we first collected a corpus with 99,322 customers' comments and opinions in English. From this corpus we chose the best number of topics (K) using Perplexity and Coherence Score measurements as the input parameters for the model. Finally, we conducted an experiment using the latent Dirichlet allocation (LDA) topic model with K coefficients to explore the topic. The model results found hidden topics and keyword sets with high probability that are interesting to users. The application of empirical results from the model will support decision-making to help businesses improve products and services as well as business management and development in the field of hotel services.

Toward Sentiment Analysis Based on Deep Learning with Keyword Detection in a Financial Report (재무 보고서의 키워드 검출 기반 딥러닝 감성분석 기법)

  • Jo, Dongsik;Kim, Daewhan;Shin, Yoojin
    • Journal of the Korea Institute of Information and Communication Engineering / v.24 no.5 / pp.670-673 / 2020
  • Recent advances in artificial intelligence have allowed for easier sentiment analysis (e.g., positive or negative forecast) of documents such as financial reports. In this paper, we investigate a method that applies text mining techniques using deep learning to extract keywords from a financial report, and propose an accounting model for the effects of sentiment values in financial information. For sentiment analysis with keyword detection in the financial report, we suggest an input layer of extracted keywords, hidden layers with learned weights, and an output layer that gives sentiment scores. Our approach can help potential investors form more effective strategies by using sentiment values as a professional guideline.