http://dx.doi.org/10.5762/KAIS.2020.21.7.699

A Study on Quantitative Evaluation Method for STT Engine Accuracy based on Korean Characteristics  

Min, So-Yeon (Dept. of Information and Communication Engineering, Seoil University)
Lee, Kwang-Hyong (Dept. of Computer Software, Seoil University)
Lee, Dong-Seon (Dept. of Computer Science, Soongsil University)
Ryu, Dong-Yeop (NEXTG)
Publication Information
Journal of the Korea Academia-Industrial cooperation Society / v.21, no.7, 2020, pp. 699-707
Abstract
With advances in deep learning, speech processing technology is being applied in areas such as STT (Speech To Text), TTS (Text To Speech), chatbots, and intelligent personal assistants. STT in particular converts spoken language into text, so it can be applied to a wide range of IT services, and many private enterprises and public institutions have recently begun trying to adopt it. However, unlike general IT solutions, which can be evaluated quantitatively, the standards and methods for evaluating STT engine accuracy are ambiguous and do not consider the characteristics of the Korean language, which makes it difficult to apply a quantitative evaluation standard. This study aims to provide a guide for evaluating STT engine conversion performance based on the characteristics of the Korean language, so that engine manufacturers can perform STT conversion that reflects those characteristics and the market can evaluate engines more accurately. In the experiment, the proposed approach yielded a 35% more accurate evaluation than existing methods.
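The abstract does not spell out the evaluation formula, so the following is only a minimal sketch of why the unit of comparison matters for Korean, not the method proposed in the paper. It assumes a plain edit-distance error rate and compares two granularities: whole Hangul syllable blocks versus their constituent jamo, obtained with Unicode NFD normalization. The function names `edit_distance` and `error_rate` and the example sentences are hypothetical.

```python
import unicodedata


def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (insert, delete, substitute = 1)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(reference, hypothesis, unit="syllable"):
    """Edit distance divided by reference length, at syllable or jamo granularity.

    Assumption for illustration: 'jamo' splits each Hangul syllable block into
    its leading consonant, vowel, and optional final consonant via NFD.
    """
    form = "NFD" if unit == "jamo" else "NFC"
    ref = unicodedata.normalize(form, reference)
    hyp = unicodedata.normalize(form, hypothesis)
    if not ref:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)


# Hypothetical reference transcript and STT output that differ in a single
# final consonant (갑 vs 감).
reference = "학교에 갑니다"
hypothesis = "학교에 감니다"

syllable_rate = error_rate(reference, hypothesis, unit="syllable")
jamo_rate = error_rate(reference, hypothesis, unit="jamo")
print(f"syllable-level error rate: {syllable_rate:.4f}")
print(f"jamo-level error rate:     {jamo_rate:.4f}")
```

In this example one wrong final consonant counts as one error out of 7 characters at the syllable level, but one error out of 15 jamo at the finer level, so the same recognition result receives noticeably different scores depending on the unit chosen.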
Keywords
Speech To Text; Text To Speech; Evaluation; Measure; Korean Characteristics;