• Title/Summary/Keyword: speech database (음성데이터베이스)

A Study on Implementation of Sound Recording and Player of Smartphone for Mobile Learning (모바일 학습을 위한 스마트폰의 사운드 레코딩과 플레이어 구현에 관한 연구)

  • Seo, Jung-Hee;Park, Hung-Bog
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.8 no.6
    • /
    • pp.847-854
    • /
    • 2013
  • This paper implements a smartphone application for sound recording and playback in mobile learning. Because smartphones are ubiquitous and can be used anytime and anywhere, and because they combine an audio player and a microphone, the proposed recording and playback application can be developed easily and cost-effectively without additional infrastructure. The paper also explains a technique for processing music lyrics data, built on SQLite, the DBMS bundled with the Android platform. As long as the application is installed and the phone holds sound source files, learners can record their own voices over the sound. We therefore expect learners to be able to engage in mobile learning without additional infrastructure.

Evaluation of Word Recognition System For Mobile Telephone (이동전화를 위한 단어 인식기의 성능평가)

  • Kim Min-Jung;Hwang Cheol-Jun;Chung Ho-Youl;Chung Hyun-Yeol
    • Proceedings of the Acoustical Society of Korea Conference
    • /
    • spring
    • /
    • pp.92-95
    • /
    • 1999
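  • As a basic experiment toward a voice-operated mobile telephone, this paper evaluates a recognizer by recording word data commonly used on mobile telephones and running word recognition experiments on it. The word database was built from the utterances of 600 speakers: 360 speakers from Seoul (180 male, 180 female) and 240 speakers from the Gyeongsang region (120 male, 120 female). The vocabulary consisted of 55 words, covering key functions and control words commonly used on mobile telephones as well as digits, and each speaker uttered each word three times. The data were collected in an office environment with some noise, at a sampling rate of 8 kHz. The basic recognition unit was a set of 48 phoneme-like units (PLUs), and the feature parameters were the mel-cepstrum as static features and regression coefficients as dynamic features. The OPDP (One Pass Dynamic Programming) algorithm was used for recognition. Models were trained both per region and regardless of region, by adapting existing 16 kHz initial models to the data collected at 8 kHz. The region-specific models and the region-independent model were each tested on evaluation data per region and regardless of region. The experiments yielded a relatively high recognition rate of over 90%, confirming the effectiveness of the recognition system.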

Convergence Characteristics of Ant Colony Optimization with Selective Evaluation in Feature Selection (특징 선택에서 선택적 평가를 사용하는 개미 군집 최적화의 수렴 특성)

  • Lee, Jin-Seon;Oh, Il-Seok
    • The Journal of the Korea Contents Association
    • /
    • v.11 no.10
    • /
    • pp.41-48
    • /
    • 2011
  • In feature selection, a selective evaluation scheme for Ant Colony Optimization (ACO) has recently been proposed, which reduces the computational load by excluding unnecessary or less promising candidate solutions from the actual evaluation. Its superiority was supported by experimental results. However, that experiment does not appear statistically sufficient, since it used only one dataset. The aim of this paper is to analyze the convergence characteristics of the selective evaluation scheme and to make the conclusion more convincing. We chose three datasets from the UCI repository, covering the handwriting, medical, and speech domains, whose feature set sizes range from 256 to 617. For each of them, we executed 12 independent runs in order to obtain statistically stable data. Each run was given 72 hours so that long-term convergence could be observed. Based on an analysis of the experimental data, we describe a reason for the superiority and where the scheme can be applied.

Classification of Underwater Transient Signals Using MFCC Feature Vector (MFCC 특징 벡터를 이용한 수중 천이 신호 식별)

  • Lim, Tae-Gyun;Hwang, Chan-Sik;Lee, Hyeong-Uk;Bae, Keun-Sung
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.32 no.8C
    • /
    • pp.675-680
    • /
    • 2007
  • This paper presents a new method for classifying underwater transient signals, which employs frame-based decisions with Mel Frequency Cepstral Coefficients (MFCC). An MFCC feature vector is extracted frame by frame from an input signal that has been detected as a transient signal, and Euclidean distances are calculated between each frame's vector and all MFCC feature vectors in the reference database. Each frame of the detected input signal is then mapped to the class with the minimum Euclidean distance in the reference database. Finally, the input signal is assigned to the class with the maximum mapping rate in the reference database. Experimental results demonstrate that the proposed method is very promising for the classification of underwater transient signals.
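
The frame-based decision described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes MFCC vectors have already been extracted, and the function name `classify_by_frame_voting`, the dictionary layout of the reference database, and the toy two-dimensional vectors are all hypothetical.

```python
import math
from collections import Counter


def classify_by_frame_voting(frames, reference_db):
    """Assign a signal to the class that wins the most per-frame votes.

    frames: non-empty list of feature vectors, one per frame (assumed
        to be precomputed MFCC vectors).
    reference_db: dict mapping a class label to a list of reference
        feature vectors (hypothetical layout).
    Returns (label, mapping_rate), where mapping_rate is the fraction
    of frames mapped to the winning class.
    """
    if not frames:
        raise ValueError("at least one frame is required")

    votes = Counter()
    for frame in frames:
        best_label, best_dist = None, math.inf
        # Nearest reference vector over all classes (Euclidean distance).
        for label, refs in reference_db.items():
            for ref in refs:
                d = math.dist(frame, ref)
                if d < best_dist:
                    best_label, best_dist = label, d
        votes[best_label] += 1

    label, count = votes.most_common(1)[0]
    return label, count / len(frames)


# Toy data standing in for real MFCC vectors (illustrative only).
reference_db = {
    "class_a": [[0.0, 0.0], [0.1, 0.2]],
    "class_b": [[1.0, 1.0], [0.9, 1.1]],
}
frames = [[0.05, 0.1], [0.2, 0.1], [0.95, 1.0]]
label, rate = classify_by_frame_voting(frames, reference_db)
```

In the toy data two of the three frames lie nearest to `class_a` references, so the signal is assigned to `class_a` with a mapping rate of 2/3.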

Communication Aid System For Dementia Patients (치매환자를 위한 대화 보조 시스템)

  • Sung-Ill Kim;Byoung-Chul Kim
    • Journal of Biomedical Engineering Research
    • /
    • v.23 no.6
    • /
    • pp.459-465
    • /
    • 2002
  • The goal of the present research is to improve the quality of life of both elderly patients with dementia and their caregivers. For this purpose, we developed a communication aid system that consists of three modules: a speech recognition engine, a graphical agent, and a database classified by nursing schedule. The system was evaluated in the actual environment of a nursing facility by introducing it to an elderly male patient with dementia. A comparison study was then carried out with and without the system. Occupational therapists evaluated the subject's reaction to the system by photographing his behavior. The evaluation results revealed that the proposed system was more responsive in catering to the needs of the subject than professional caregivers. Moreover, the frequency with which the subject was prompted to speak increased after the system was introduced.

A Study on Traffic Information Service and Collection by the Use of DSRC Technology (DSRC통신 기반 교통정보 제공 및 수집에 관한 연구)

  • Yang, Won-Mo;Bang, Jeong-Hyeon;Kim, Gyu-Ok
    • Proceedings of the KOR-KST Conference
    • /
    • 2007.05a
    • /
    • pp.399-408
    • /
    • 2007
  • Dedicated Short Range Communications (DSRC) is a block of spectrum in the 5.8 GHz band and a useful technology for ITS services. Japan operates ETCS and VICS using DSRC technology, and DSRC is used in the ETCS standard in Korea. DSRC has many kinds of uses in ITS. This study addresses traffic information service and collection with DSRC. The traffic management server provides traffic information to moving vehicles through RSE (Road Side Equipment). The OBU (Onboard Unit) in the vehicle sends the information to a PDA (Personal Digital Assistant). The client software presents the information to the driver as text, pictograms, and sound, and returns the PDA hardware ID to the OBU. The server generates section traffic information from the PDA hardware ID information.

An Implementation of an Android Mobile System for Extracting and Retrieving Texts from Images (이미지 내 텍스트 추출 및 검색을 위한 안드로이드 모바일 시스템 구현)

  • Go, Eun-Bi;Ha, Yu-Jin;Choi, Soo-Ryum;Lee, Ki-Hoon;Park, Young-Ho
    • Journal of Digital Contents Society
    • /
    • v.12 no.1
    • /
    • pp.57-67
    • /
    • 2011
  • Interest in mobile search has recently been increasing with the growing spread of smartphones. However, the keypad, which is poorly suited to the mobile environment, is the only input medium for mobile search. Voice has emerged as an alternative input medium, but it also has weaknesses. In this paper, we therefore propose a mobile application called Orthros for searching the Internet using images as input. Orthros extracts text from images and then submits the text to public search engines as a keyword. Orthros can also repeat searches with the extracted text by storing the result URLs in internal databases. As an experiment, we analyze the properties of recognizable images and present the implementation method in detail.

Development of Acquisition System for Biological Signals using Raspberry Pi (라즈베리 파이를 이용한 생체신호 수집시스템 개발)

  • Yoo, Seunghoon;Kim, Sitae;Kim, Dongsoo;Lee, Younggun
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.25 no.12
    • /
    • pp.1935-1941
    • /
    • 2021
  • Developing algorithms with deep learning, which has recently been applied to various fields, requires rich, high-quality training data. In this paper, we propose an acquisition system for biological signals that simultaneously collects bio-signal data such as optical video, thermal video, and voice, which are mainly used in developing deep learning algorithms and are useful for deriving information, and transmits them to a server. To increase the portability of the collector, it was built on a Raspberry Pi, and the collected data is transmitted to the server over the wireless Internet. To enable simultaneous data collection from multiple collectors, a login ID was assigned to each subject and reflected in the database to facilitate data management. By presenting an example of biological data collection for fatigue measurement, we demonstrate the applicability of the proposed acquisition system.

Front-End Processing for Speech Recognition in the Telephone Network (전화망에서의 음성인식을 위한 전처리 연구)

  • Jun, Won-Suk;Shin, Won-Ho;Yang, Tae-Young;Kim, Weon-Goo;Youn, Dae-Hee
    • The Journal of the Acoustical Society of Korea
    • /
    • v.16 no.4
    • /
    • pp.57-63
    • /
    • 1997
  • In this paper, we study efficient feature vector extraction and front-end processing methods to improve the performance of a speech recognition system, using the KT (Korea Telecommunication) database collected through various telephone channels. First, we compare the recognition performance of feature vectors known to be robust to noise and environmental variation, and verify the performance enhancement obtained with weighted cepstral distance measures. The experimental results show that the recognition rate increases when either PLP (Perceptual Linear Prediction) or MFCC (Mel Frequency Cepstral Coefficient) features are used, in comparison with the LPC cepstrum used in the KT recognition system. Among cepstral distance measures, weighted functions such as RPS (Root Power Sums) and BPL (Band-Pass Lifter) improve recognition. Applying the spectral subtraction method decreases the recognition rate because of the effect of distortion. However, RASTA (RelAtive SpecTrAl) processing, CMS (Cepstral Mean Subtraction), and SBR (Signal Bias Removal) enhance recognition performance. In particular, the CMS method is simple but yields a large improvement in recognition. Finally, the performance of modified methods for real-time implementation of CMS is compared, and an improved method is suggested to prevent performance degradation.
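
Cepstral mean subtraction, which this abstract singles out as simple but effective, can be illustrated with a minimal sketch. It shows only the basic batch form (subtract the per-coefficient mean over an utterance), not the real-time variants the paper compares; the function name `cepstral_mean_subtraction` and the toy values are assumptions made for illustration.

```python
def cepstral_mean_subtraction(frames):
    """Subtract the per-coefficient mean over all frames of an utterance.

    frames: list of cepstral vectors (lists of floats), all the same
        length. A constant offset in the cepstral domain, such as the
        one a fixed telephone channel adds, is removed by this step.
    """
    if not frames:
        return []
    n = len(frames)
    dim = len(frames[0])
    # Mean of each cepstral coefficient across the utterance.
    mean = [sum(f[i] for f in frames) / n for i in range(dim)]
    return [[f[i] - mean[i] for i in range(dim)] for f in frames]


# Toy cepstra (illustrative values, not real speech features).
clean = [[1.0, 2.0], [3.0, 6.0]]
# The same frames with a constant per-coefficient channel offset.
with_channel = [[1.5, 1.0], [3.5, 5.0]]

normalized_clean = cepstral_mean_subtraction(clean)
normalized_channel = cepstral_mean_subtraction(with_channel)
```

Both inputs normalize to the same vectors, which is the reason the technique helps across differing telephone channels: a stationary channel appears as an additive constant in the cepstral domain and cancels with the mean.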

A Phoneme-based Approximate String Searching System for Restricted Korean Character Input Environments (제한된 한글 입력환경을 위한 음소기반 근사 문자열 검색 시스템)

  • Yoon, Tai-Jin;Cho, Hwan-Gue;Chung, Woo-Keun
    • Journal of KIISE:Software and Applications
    • /
    • v.37 no.10
    • /
    • pp.788-801
    • /
    • 2010
  • Mobile devices are advancing remarkably, so research on mobile input devices is becoming increasingly important. There are many input methods, such as keypads, QWERTY keypads, touch, and speech recognizers, but they are not as convenient as typical keyboard-based desktop input devices, so input strings usually contain many typing errors. These input errors cause little trouble in communication between people, but they are a critical problem when searching a database such as a dictionary or an address book, because correct results cannot be obtained. In particular, Hangeul has more than 10,000 different characters, because each Hangeul character is a combination of consonants and vowels, so the frequency of errors is higher than in English. The suffix tree is generally the most widely used data structure for handling query errors, but it is not sufficient for the variety of errors that occur. In this paper, we propose a fast approximate Korean word searching system that tolerates a variety of typing errors. The system includes several algorithms for applying general approximate string searching to Hangeul. We also present a profanity filter built with the proposed system, which filters more than 90% of coined profanities.
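
The idea of comparing Hangeul strings at the phoneme level can be sketched as follows. This is only an illustration under stated assumptions, not the paper's algorithm: it uses the standard Unicode arithmetic for decomposing precomposed Hangul syllables into conjoining jamo and then applies a plain Levenshtein distance; the function names `decompose` and `jamo_distance` are hypothetical.

```python
HANGUL_BASE = 0xAC00   # first precomposed syllable
HANGUL_LAST = 0xD7A3   # last precomposed syllable


def decompose(text):
    """Split precomposed Hangul syllables into conjoining jamo.

    Uses the standard Unicode decomposition: each syllable encodes a
    leading consonant, a vowel, and an optional trailing consonant.
    Non-Hangul characters are passed through unchanged.
    """
    jamo = []
    for ch in text:
        code = ord(ch)
        if HANGUL_BASE <= code <= HANGUL_LAST:
            index = code - HANGUL_BASE
            lead, rest = divmod(index, 588)    # 588 = 21 vowels * 28 tails
            vowel, tail = divmod(rest, 28)
            jamo.append(chr(0x1100 + lead))
            jamo.append(chr(0x1161 + vowel))
            if tail:                           # tail 0 means no final consonant
                jamo.append(chr(0x11A7 + tail))
        else:
            jamo.append(ch)
    return jamo


def jamo_distance(a, b):
    """Levenshtein distance between two strings, measured in jamo."""
    s, t = decompose(a), decompose(b)
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        cur = [i]
        for j, ct in enumerate(t, start=1):
            cost = 0 if cs == ct else 1
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + cost))  # substitution
        prev = cur
    return prev[-1]


# A one-vowel typo costs one edit out of six jamo.
typo_cost = jamo_distance("한글", "한굴")
```

Measured per syllable, the same typo would cost one edit out of only two characters, so working in jamo gives a finer-grained similarity score for ranking candidate matches.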