• Title/Summary/Keyword: Voice Feature

Accelerometer-based Gesture Recognition for Robot Interface (로봇 인터페이스 활용을 위한 가속도 센서 기반 제스처 인식)

  • Jang, Min-Su;Cho, Yong-Suk;Kim, Jae-Hong;Sohn, Joo-Chan
    • Journal of Intelligence and Information Systems, v.17 no.1, pp.53-69, 2011
  • Vision- and voice-based technologies are commonly used for human-robot interaction. However, it is widely recognized that the performance of vision- and voice-based interaction systems deteriorates by a large margin in real-world situations because of environmental and user variance. Users need to be very cooperative to obtain reasonable performance, which significantly limits the usability of these technologies. As a result, touch screens are still the major medium of human-robot interaction in real-world applications. To improve the usability of robots for various services, alternative interaction technologies should be developed to compensate for the weaknesses of vision- and voice-based technologies. In this paper, we propose an accelerometer-based gesture interface as one such alternative, because accelerometers are effective at detecting movements of the human body, while their performance is not limited by environmental conditions such as lighting or a camera's field of view. Moreover, accelerometers are now widely available in many mobile devices. We tackle the problem of classifying acceleration signal patterns of the 26 letters of the English alphabet, an essential repertoire for realizing robot-based education services. Recognizing handwriting patterns of 26 English letters with accelerometers is a very difficult task because of the large number of pattern classes and the complexity of each pattern. The most difficult comparable problem addressed so far was recognizing acceleration signal patterns of 10 handwritten digits. Most previous studies dealt with sets of 8 to 10 simple and easily distinguishable gestures that are useful for controlling home appliances, computer applications, robots, etc. Good features are essential for successful pattern recognition.
To improve discriminative power on complex English letter patterns, we extracted 'motion trajectories' from the input acceleration signal and used them as the main feature. Investigative experiments showed that trajectory-based classifiers performed 3% to 5% better than those using raw features, e.g. the acceleration signal itself or statistical figures. To minimize distortion of the trajectories, we applied a simple but effective set of smoothing filters and band-pass filters. It is well known that acceleration patterns for the same gesture differ greatly among performers. To tackle this problem, online incremental learning is applied so that the system adapts to each user's distinctive motion properties. Our system is based on instance-based learning (IBL), where each training sample is memorized as a reference pattern. Brute-force incremental learning in IBL continuously accumulates reference patterns, which is a problem because it not only slows down classification but also degrades recall. Regarding the latter phenomenon, we observed that as the number of reference patterns grows, some reference patterns contribute more to false positive classifications. Thus, we devised an algorithm that optimizes the reference pattern set based on the positive and negative contribution of each reference pattern. The algorithm runs periodically to remove reference patterns that have a very low positive contribution or a high negative contribution. Experiments were performed on 6,500 gesture patterns collected from 50 adults aged 30 to 50. Each participant performed each letter 5 times using a Nintendo® Wii™ remote. The acceleration signal was sampled at 100 Hz on 3 axes. The mean recall rate over all letters was 95.48%. Some letters had a much lower recall rate and a very high pairwise confusion rate. The major confusion pairs are D (88%) and P (74%), I (81%) and U (75%), and N (88%) and W (100%).
Though W was recalled perfectly, it contributed heavily to the false positive classification of N. Compared with major previous results from VTT (96% for 8 control gestures), CMU (97% for 10 control gestures), and Samsung Electronics (97% for 10 digits and a control gesture), the performance of our system is superior with respect to the number of pattern classes and the complexity of the patterns. Using our gesture interaction system, we conducted 2 case studies of robot-based edutainment services. The services were implemented on various robot platforms and mobile devices, including the iPhone™. The participating children showed improved concentration and responded actively to the service with our gesture interface. To evaluate the effectiveness of the gesture interface, the children took a test after experiencing an English teaching service. Those who played with the gesture interface-based robot content scored 10% higher than those who received conventional teaching. We conclude that the accelerometer-based gesture interface is a promising technology for advancing real-world robot-based services and content by complementing today's conventional interfaces, e.g. touch screen, vision, and voice.
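
The periodic optimization of the reference pattern set described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' algorithm: the abstract does not say how contributions are measured, so the 1-nearest-neighbor rule, the Euclidean distance, the `max_negative` and `min_age` thresholds, and every class and method name here are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Reference:
    features: tuple
    label: str
    born: int          # learning step at which this pattern was stored
    positive: int = 0  # times it was the nearest match and the label agreed
    negative: int = 0  # times it was the nearest match and the label differed


class PrunedInstanceLearner:
    """Instance-based learner that memorizes samples and prunes harmful ones."""

    def __init__(self, max_negative=2, min_age=20):
        self.refs = []
        self.step = 0
        self.max_negative = max_negative  # assumed threshold, not from the paper
        self.min_age = min_age            # assumed threshold, not from the paper

    def nearest(self, features):
        if not self.refs:
            return None
        return min(self.refs, key=lambda r: math.dist(r.features, features))

    def classify(self, features):
        ref = self.nearest(features)
        return None if ref is None else ref.label

    def learn(self, features, label):
        # Credit or blame the reference that would have decided this sample,
        # then memorize the sample itself (incremental learning).
        ref = self.nearest(features)
        if ref is not None:
            if ref.label == label:
                ref.positive += 1
            else:
                ref.negative += 1
        self.step += 1
        self.refs.append(Reference(tuple(features), label, self.step))

    def prune(self):
        # Drop references with a high negative contribution, or with no
        # positive contribution after being stored for min_age steps.
        def keep(r):
            too_harmful = r.negative >= self.max_negative
            idle = r.positive == 0 and self.step - r.born >= self.min_age
            return not (too_harmful or idle)

        self.refs = [r for r in self.refs if keep(r)]
```

Calling `prune()` every few hundred `learn()` calls would keep the reference set from growing without bound, which is the motivation the abstract gives for the optimization step.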

The effect of oral health behavior of the visually impaired on DMFT index (시각장애인의 구강보건행태가 DMFT지수에 미치는 영향)

  • Lee, Jong-Hwa;Lee, Seung-Hee;Yun, Hyun-Kyung
    • Journal of Korean society of Dental Hygiene, v.17 no.3, pp.331-342, 2017
  • Objectives: This study aimed to support oral disease prevention and related management planning for visually impaired people by identifying how general characteristics and oral health behavior influence the decayed, missing, and filled permanent teeth (DMFT) index of visually impaired people in Korea (estimated at 229,678 persons), using 2014 data from the 6th Korea National Health and Nutrition Examination Survey (KNHANES) of the Korea Centers for Disease Control and Prevention. Methods: Visually impaired people over the age of 30 who completed the health survey and dental examination in KNHANES VI-2 were selected as study subjects. The subjects represented an estimated 229,678 persons. General linear models (GLM) and analysis of covariance were used to identify the relation of general characteristics and oral health behavior to the DMFT index. SPSS 21, which supports complex sampling designs, was used, and the significance level was 0.05. Results: The DMFT index was 8.58. The R-squared of the DMFT index with respect to the general characteristics of the subjects was 0.839, with gender, age, residence, education level, individual income, disability rating, type of health insurance, marital status, and basic living support recipient status affecting the index. The R-squared of the DMFT index with respect to oral health behavior was 0.728, with oral examination, dental treatment, smoking, and toothbrushing after lunch affecting the index. Conclusions: This study identified the actual oral health status of visually impaired people in Korea.
Therefore, the government should introduce personalized oral health education programs to maintain the oral health of visually impaired people, and should develop accessible programs that use braille and voice devices to improve their oral health behavior. These findings can serve as a reference for establishing such policy plans.

Expansion and Transition of Tasan's Allegoric Poetry (다산(茶山) 우화시(寓話詩)의 확장(擴張)과 전이(轉移) -<오즉어행>과 <리노행>을 중심(中心)으로-)

  • Lee, Kyung-ah
    • Journal of Korean Classical Literature and Education, no.15, pp.329-353, 2008
  • Tasan Jeong Yak-yong was a great scholar who synthesized Sil-hak [實學, Practical Science of Korea], a social reformer, and a poet of the Joseon Dynasty. He expressed the contradictions and conflicts of his day in intellectual language and re-examined the basic ideology of Joseon society. He also theorized the people's dissatisfaction with the period and its system in the form of religion. Tasan's life can be divided into two periods. The first covers ages 16 to 39, during the reign of Jeong-jo (1777-1800). The second falls in the reign of Sun-jo (1801-1834). In this period he was exiled to Gang-jin for 17 years. After his banishment, he lived quietly in his hometown for the rest of his life. His allegoric poetry was written in this second period. The distinguishing feature of allegoric poetry is strong satire. An allegory can be likened to the 'king's ear' that the barber saw, or to the barber's voice divulging the king's secret among the bamboos. Alternatively, it can be likened to the sound 'the king's ear is a donkey's ear' in the bamboos. This sound divulges the truth of the donkey's ear. It does not travel directly to an audience, but travels through the wind in the bamboos. The narration exists only as the story of a barber who cannot bear to keep silent about the king's secret. Allegoric expression thus carries both an exposure of the truth and a critical motive. Tasan's allegoric poetry rests on his love for the people. It also reveals the depth of his thought, formed by an enormous amount of reading and self-reflection. Moreover, it shows his warm heart and the sharp insight with which he captured living creatures as allegoric material. Most allegoric poetry satirizes the reality of its day under the pretext of the outward features of animals and plants. Tasan's poetry, however, is different. He first grasped the serious problems of his contemporary reality, and then chose allegoric media to express them accurately.
Because he grasped the particular features of living things through minute observation, he could expose the controversial points of reality. His sharp insight was not limited to allegoric media. He was sensitive to his era and to the currents of his society. This makes his allegoric poetry important material for understanding the condition of the people in the Joseon Dynasty. Tasan's allegoric poetry was inherited by Baek Seok [白石, 1912-1995] in the form of regular juvenile literature. Baek Seok's juvenile stories are the result of the expansion and transition of Tasan's allegoric poetry. In the past, allegoric poetry was the barber's shout denouncing social irregularities and contradictions, and the sound of the bamboos carrying the moaning of the people. Now allegoric poetry creates new emotion that leads us to reflect on ourselves and our surroundings. These changes arise from the special feature of allegoric poetry as a form that reflects our everyday lives.

$CO_2$ Laser Treatment of Adult-onset Laryngeal Papillomatosis ($CO_2$ 레이저를 이용한 성인 후두유두종의 치료)

  • Oh, Jang-Keun;Yoon, Jun-Sik;Lee, Sang-Joon;Chung, Phil-Sang
    • Korean Journal of Bronchoesophagology, v.13 no.2, pp.45-49, 2007
  • Background and Objective: Laryngeal papillomatosis (LP) is the most common benign neoplasm of the larynx, but it tends to recur, which makes eradication difficult. Meticulous $CO_2$ laser excision has been the most effective treatment to date. This article analyzes the clinical features and therapeutic results of 42 LP patients who underwent $CO_2$ laser excision. Methods: Forty-two patients with recurrent LP were treated with the $CO_2$ laser, and their medical records were reviewed retrospectively. Demographics, chief complaints at onset, initial distribution of papillomas, number of operations performed on each patient, and current results were evaluated. Results: Men in their twenties and forties predominated among the patients. The most common site was the anterior one-third (69%) of the glottic area (86%). LP recurred in 17 cases (40%), and in 4 cases the lesion extended beyond the original margin. Patients underwent surgery $1.62 \pm 0.87$ times, and $2.53 \pm 0.72$ times in recurrent cases. The mean time to relapse was 6 months (range, 1 month to 8 years). Anterior laryngeal web occurred in 2 cases (4.8%), and 1 case was combined with squamous cell carcinoma. Conclusion: Meticulously performed $CO_2$ laser excision can achieve significant voice and airway improvement and clinical cures. The $CO_2$ laser applied through microdirect laryngoscopy allows more precise and bloodless removal of papillomas.

Spectrum Feature Analysis of Crying Sounds of Infant Cold and Pneumonia (소아감기와 소아폐렴간의 울음소리 스펙트럼 특징 분석)

  • Kim, Bong-Hyun;Lee, Se-Hwan;Cho, Dong-Uk
    • The KIPS Transactions:PartB, v.15B no.4, pp.301-306, 2008
  • Recently, various health care methods for infants have been suggested as society enters an era of low birth rates. In this context, we propose an early diagnosis method for common infant respiratory diseases, specifically infant cold and infant pneumonia. First, the crying sounds of infants, crying being their only means of expression, in the infant cold group and the infant pneumonia group are compared and examined to find how they differ from those of the healthy infant group. For this purpose, the link between the infected organs and the articulatory organs is investigated. The resulting waveforms and frequency bandwidths of each group are then compared and analyzed, using the spectrum of the voice components, to diagnose infant cold and pneumonia. Finally, the effectiveness of the method is verified through experiments.

A Real-Time Embedded Speech Recognition System

  • Nam, Sang-Yep;Lee, Chun-Woo;Lee, Sang-Won;Park, In-Jung
    • Proceedings of the IEEK Conference, 2002.07a, pp.690-693, 2002
  • With the growth of the communication business, the embedded market is developing rapidly at home and abroad. Embedded systems can be used in various ways, such as in wired and wireless communication equipment or information products. There are many ongoing efforts to apply speech recognition to embedded systems, for instance PDA, PCS, CDMA-2000, or IMT-2000 devices. This study implements a speech recognition engine and database with minimal memory for use in a real-time embedded system. The speech recognizer was adapted to the embedded system as follows. First, the DC component is removed from the input voice, high frequencies are compensated by pre-emphasis with a coefficient of 0.97, and the signal is divided into overlapping frames of 256 samples each. Linear predictive coefficients are obtained from these frames with the Levinson-Durbin algorithm, and feature vectors are then obtained through a cepstrum transformation. For HMM training, we used the Baum-Welch re-estimation algorithm to train each word, and the recognition result is obtained by evaluating the likelihood of each word model. The speech data consist of 40 command words and 10 digits for embedded-system menu control, spoken by 15 male and 15 female speakers. Because ARM CPUs are adopted in many embedded systems, the speech recognition engine was ported to an ARM core evaluation board. After several tests with 5 proposed recognition parameter sets, recognition tests were run with parameter sets 1 and 3, which gave good recognition rates for commands and digits. The recognition engine achieved a recognition rate of 95%, the speech command recognizer 96%, and the digit recognizer 94%.
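
The front-end steps named in this abstract (DC removal, pre-emphasis with coefficient 0.97, overlapping 256-sample frames, Levinson-Durbin, cepstrum conversion) can be sketched as below. This is only an illustrative sketch: the abstract gives the pre-emphasis coefficient and frame size, while the frame shift of 128 samples, the LPC order of 10, the 12 cepstral coefficients, the absence of a window function, and all function names are assumptions chosen for the example.

```python
def pre_emphasize(x, coeff=0.97):
    """High-frequency compensation: y[n] = x[n] - coeff * x[n-1]."""
    return [x[0]] + [x[n] - coeff * x[n - 1] for n in range(1, len(x))]


def split_frames(x, size=256, shift=128):
    """Overlapping frames of `size` samples; `shift` is an assumed value."""
    return [x[s:s + size] for s in range(0, len(x) - size + 1, shift)]


def autocorrelation(frame, order):
    return [sum(frame[n] * frame[n - lag] for n in range(lag, len(frame)))
            for lag in range(order + 1)]


def levinson_durbin(r, order):
    """Solve for predictor coefficients a[1..order] from autocorrelation r."""
    a = [0.0] * (order + 1)
    error = r[0]
    for i in range(1, order + 1):
        if error <= 1e-12:  # silent or degenerate frame: stop early
            break
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / error
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        error *= 1.0 - k * k
    return a[1:], error


def lpc_to_cepstrum(a, n_ceps):
    """Standard recursion from predictor coefficients to LPC cepstrum."""
    p = len(a)
    c = [0.0] * (n_ceps + 1)
    for n in range(1, n_ceps + 1):
        acc = a[n - 1] if n <= p else 0.0
        for k in range(max(1, n - p), n):
            acc += (k / n) * c[k] * a[n - k - 1]
        c[n] = acc
    return c[1:]


def lpc_cepstra(signal, order=10, n_ceps=12):
    """One cepstral feature vector per frame of the input signal."""
    mean = sum(signal) / len(signal)                # DC component
    x = pre_emphasize([s - mean for s in signal])
    features = []
    for frame in split_frames(x):
        r = autocorrelation(frame, order)
        a, _ = levinson_durbin(r, order)
        features.append(lpc_to_cepstrum(a, n_ceps))
    return features
```

The resulting per-frame vectors are the kind of input an HMM trained with Baum-Welch would consume; the HMM itself is outside the scope of this sketch.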

Analysis of Mitochondrial DNA in Patients with Essential Tremor (본태성 수전증 환자의 미토콘드리아 DNA 분석)

  • Lee, Uhn;Yoo, Young Mi;Yoo, Chan Jong
    • Journal of Korean Neurosurgical Society, v.29 no.2, pp.188-195, 2000
  • Objective: Essential tremor (ET) is the most common movement disorder; however, there has been little agreement in the neurologic literature regarding diagnostic criteria for ET. Familial ET is an autosomal dominant disorder presenting as an isolated postural tremor. The main feature of ET is postural tremor of the arms with later involvement of the head, voice, or legs. Previous studies reported that susceptibility to ET is inherited in an autosomal dominant manner. These results suggest that ET might be associated with defects of mitochondrial or nuclear DNA. Recent studies have focused on the molecular genetic detection of movement disorders such as essential tremor and restless legs syndrome. The authors analyzed mitochondrial DNA (mtDNA) from the blood cells of a positive control (PC) and ET patients by long and accurate polymerase chain reaction (LA PCR). Materials & Methods: Blood samples were collected from the PC and 9 ET patients. Total DNA was extracted twice with phenol followed by chloroform:isoamyl alcohol. For the analysis of mtDNA, LA PCR was performed with mitochondria-specific primers. Results: With this technique, extensive deletions were detected within several regions of mtDNA in ET patients, except in the D-loop and CO I regions. Conclusion: The authors believe that ET is a genetic disorder involving deficiency of mitochondrial DNA multicomplexes, and that mitochondrial dysfunction could be one of the major causative factors of ET. Mitochondrial dysfunction may play an important role in the pathogenesis and possible progression of the disease among familial groups of ET patients.

Hyperparameter Search for Facies Classification with Bayesian Optimization (베이지안 최적화를 이용한 암상 분류 모델의 하이퍼 파라미터 탐색)

  • Choi, Yonguk;Yoon, Daeung;Choi, Junhwan;Byun, Joongmoo
    • Geophysics and Geophysical Exploration, v.23 no.3, pp.157-167, 2020
  • With recent advances in computer hardware and the contribution of open-source libraries that ease access to artificial intelligence technology, the use of machine learning (ML) and deep learning (DL) in various fields of exploration geophysics has increased. In addition, ML researchers have developed complex algorithms to improve inference accuracy in tasks such as image, video, voice, and natural language processing, and they are now expanding their interest into automated machine learning (AutoML). AutoML can be divided into three areas: feature engineering, architecture search, and hyperparameter search. Among them, this paper focuses on hyperparameter search with Bayesian optimization and applies it to the problem of facies classification using seismic data and well logs. The effectiveness of the Bayesian optimization technique is demonstrated on Vincent field data by comparison with the results of random search.

EMG Pattern Classification using Soft Computing Techniques and Its Application to the Control of a Rehabilitation Robotic Arm (소프트 컴퓨팅 기법을 이용한 근전도 신호의 패턴 분류와 재활 로봇 팔 제어에의 응용)

  • Han, Jeong-Su;Kim, Jong-Seong;Song, Won-Gyeong;Bang, Won-Cheol;Lee, Hui-Yeong;Byeon, Jeung-Nam
    • Journal of the Institute of Electronics Engineers of Korea SC, v.37 no.6, pp.50-63, 2000
  • In this paper, a new EMG pattern classification method based on soft computing techniques is proposed to help the disabled and the elderly operate rehabilitation robotic arm systems. First, it is shown that EMG is more useful than existing input devices such as voice, a laser pointer, and a keypad in terms of naturalness, extensibility, and applicability. Then, a new procedure is proposed for selecting the minimal feature set. To classify the pre-defined motions, a fuzzy pattern classifier and fuzzy min-max neural networks (FMMNN) are designed using the selected features. As a result, the motions are recognized with success rates of 83 percent and 90 percent using fuzzy pattern classification and FMMNN, respectively.
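
As a rough illustration of how a fuzzy min-max classifier assigns a feature vector to a motion class, the sketch below computes hyperbox membership in the commonly cited formulation by Simpson. The abstract does not describe the network's details, so the membership function, the sensitivity `gamma`, the assumption that features are scaled to [0, 1], and the motion labels used in the usage note are all hypothetical; training (hyperbox expansion and contraction) is omitted.

```python
def hyperbox_membership(point, v, w, gamma=4.0):
    """Degree to which `point` belongs to the hyperbox with min corner `v`
    and max corner `w`. Returns 1.0 inside the box and decays outside it
    at a rate set by `gamma` (an assumed sensitivity value)."""
    n = len(point)
    total = 0.0
    for a, lo, hi in zip(point, v, w):
        # Penalty for exceeding the max corner in this dimension.
        total += max(0.0, 1.0 - max(0.0, gamma * min(1.0, a - hi)))
        # Penalty for falling below the min corner in this dimension.
        total += max(0.0, 1.0 - max(0.0, gamma * min(1.0, lo - a)))
    return total / (2 * n)


def classify(point, hyperboxes, gamma=4.0):
    """Return the label of the hyperbox with the highest membership.
    Each hyperbox is a (min_corner, max_corner, label) tuple."""
    best = max(hyperboxes,
               key=lambda box: hyperbox_membership(point, box[0], box[1], gamma))
    return best[2]
```

For example, with two hypothetical boxes `((0.2, 0.2), (0.4, 0.4), "grasp")` and `((0.6, 0.6), (0.8, 0.8), "release")`, the point `(0.5, 0.3)` lies outside both but is closer to the first, so `classify` returns `"grasp"`.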

Korean Digit Speech Recognition Dialing System using Filter Bank (필터뱅크를 이용한 한국어 숫자음 인식 다이얼링 시스템)

  • 박기영;최형기;김종교
    • Journal of the Institute of Electronics Engineers of Korea TE, v.37 no.5, pp.62-70, 2000
  • In this study, speech recognition of Korean digits is performed using a filter bank, with recognizers implemented using discrete HMM and DTW. Spectral analysis reveals speech signal features that are mainly due to the shape of the vocal tract. Spectral features of speech are generally obtained as the outputs of filter banks, which integrate the spectrum over defined frequency ranges. A set of 8 band-pass filters is generally used since it simulates human ear processing. The defined frequency ranges are 320-330, 450-460, 640-650, 840-850, 900-1000, 1100-1200, 2000-2100, and 3900-4000 Hz, and the signal is sampled at 8 kHz. The frame width is 20 ms and the frame period is 10 ms. In the experiments, the recognition rate of DTW was better than that of HMM for Korean digit speech. In the hardware realization of the voice dialing system, the recognition accuracy for Korean digit speech using the filter bank was 93.3% with 24 band-pass filters (BPFs), 89.1% with 16 BPFs, and 88.9% with 8 BPFs.
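
The framing parameters and the DTW matching mentioned in this abstract can be illustrated with a short sketch. It assumes that each frame is represented by a vector of filter-bank outputs and that recognition picks the stored template with the smallest DTW distance; the Euclidean frame distance, the unconstrained step pattern, and the function names are assumptions, since the abstract does not specify them.

```python
import math


def frame_sizes(sample_rate_hz=8000, width_ms=20, period_ms=10):
    """Frame length and frame shift in samples for the stated settings."""
    return (sample_rate_hz * width_ms // 1000,
            sample_rate_hz * period_ms // 1000)


def dtw_distance(seq_a, seq_b):
    """Dynamic time warping distance between two sequences of feature
    vectors (for example, one vector of filter-bank outputs per frame)."""
    inf = float("inf")
    n, m = len(seq_a), len(seq_b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(seq_a[i - 1], seq_b[j - 1])
            # Extend the cheapest of the three allowed predecessor paths.
            cost[i][j] = d + min(cost[i - 1][j],
                                 cost[i][j - 1],
                                 cost[i - 1][j - 1])
    return cost[n][m]


def recognize(utterance, templates):
    """Return the label of the template closest to `utterance` under DTW.
    `templates` maps a label to a stored feature sequence."""
    return min(templates, key=lambda label: dtw_distance(utterance, templates[label]))
```

With an 8 kHz sampling rate, a 20 ms frame width corresponds to 160 samples and a 10 ms period to a shift of 80 samples, which is what `frame_sizes()` returns by default.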
