Search | Korea Science

Oh, Sang-Yeob
- Journal of Digital Convergence / v.10 no.6 / pp.219-224 / 2012
Vocabulary recognition systems on mobile communication terminals have low recognition rates because phoneme features are extracted from inaccurate vocabulary. As a result, they fail to recognize phonemes and confuse similar phonemes. To solve this problem, this paper proposes a system model based on a two-step process. First, each input phoneme is represented by a number that measures the distance between phonemes through a phoneme likelihood process. Next, the result is recognized through a reliability measure. This process minimizes the phoneme misrecognition errors caused by inaccurate vocabulary and performs error correction on vocabulary found to be erroneous, using phoneme likelihood and reliability. In a system performance comparison, recognition improved by 2.7% over methods using error pattern learning and semantic patterns.
https://doi.org/10.14400/JDPM.2012.10.6.219

Oh, SangYeob
- Journal of Convergence for Information Technology / v.10 no.8 / pp.35-39 / 2020
Although DNNs produce smaller errors than conventional speech recognition systems, they are difficult to train in parallel, computationally expensive, and require large amounts of data. To address this problem efficiently, this paper generates phoneme units and estimates the GMM parameters from the parameters of each phoneme model. It also suggests ways to improve performance through clustering for a specific vocabulary so that these models can be applied effectively. To this end, vocabulary models were built from three types of word speech databases, and features were extracted after noise processing with Wiener filters for the speech recognition experiments. The proposed method achieved a 97.9% recognition rate. Further study is needed to address the overfitting problem.
https://doi.org/10.22156/CS4SMB.2020.10.08.035

Ueda, Yuichi;Sakata, Tadashi
- Proceedings of the Korean Society of Broadcast Engineers Conference / 2009.01a / pp.441-445 / 2009
We have developed a real-time software tool that extracts a speech feature vector whose time sequences consist of three groups of vector components: phonetic/acoustic features such as formant frequencies, phonemic features obtained as neural network outputs, and distance measures for Japanese phonemes. Because the phoneme distances for the five Japanese vowels can express vowel articulation, we have designed a switch, a volume control, and a color representation that are operated by pronouncing vowel sounds. As examples of these vowel interfaces, we have developed speech training tools for aurally or vocally handicapped children that display an image character or a rolling color ball and control a cursor's movement. In this paper, we introduce the functions and principles of these systems.