• Title/Summary/Keyword: speech database

Search Result 331

A Study on Spatio-temporal Features for Korean Vowel Lipreading (한국어 모음 입술독해를 위한 시공간적 특징에 관한 연구)

  • 오현화;김인철;김동수;진성일
    • The Journal of the Acoustical Society of Korea / v.21 no.1 / pp.19-26 / 2002
  • This paper defines visemes, the basic visual units of speech, and investigates various visual features of the lips for effective Korean lipreading. First, we analyze the visual characteristics of Korean vowels using a database of lip image sequences obtained from multiple speakers, and from this define seven Korean vowel visemes. Various spatio-temporal lip features are extracted from feature points located on both the inner and outer lip contours of the image sequences, and their classification performance is evaluated with a hidden Markov model based classifier. The experimental results on Korean viseme recognition show that a feature vector containing information from both the inner and outer lip contours can be applied effectively to lipreading, and that the direction and magnitude of the movement of a lip feature point over time are quite useful for Korean lipreading.
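
The direction and magnitude of a lip feature point's movement over time, mentioned in the abstract above, can be illustrated with a minimal sketch. The abstract does not give the paper's exact feature definitions, so the function name `motion_features`, the `(T, 2)` track layout, and the use of frame-to-frame differences are assumptions made for illustration only.

```python
import numpy as np

def motion_features(track):
    """Per-frame motion of one lip feature point (hypothetical helper).

    track: array-like of shape (T, 2) holding the (x, y) position of the
    point in each of T consecutive frames.
    Returns (magnitude, direction), each of length T - 1.
    """
    track = np.asarray(track, dtype=float)
    delta = np.diff(track, axis=0)                    # displacement between frames
    magnitude = np.hypot(delta[:, 0], delta[:, 1])    # Euclidean length of each step
    direction = np.arctan2(delta[:, 1], delta[:, 0])  # angle in radians
    return magnitude, direction

# Toy track: the point moves by (3, 4), then by (0, 2).
mag, ang = motion_features([(0, 0), (3, 4), (3, 6)])
```

Sequences of such per-frame values, one per tracked contour point, are the kind of observation vectors that a hidden Markov model classifier could consume.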

A Study on Satirical Expression of Animal Cartoon & Animated Cartoon (동물 만화영상의 풍자적 표현 연구)

  • Lee, Hwa-Ja
    • Cartoon and Animation Studies / s.9 / pp.266-282 / 2005
  • Cartoons and animated cartoons consist of visual and linguistic attributes and are closely connected with humor and satirical content. Expressions that use animals as subject matter communicate satire strongly and easily. This article studies and analyzes techniques of satirical expression using animals in cartoon and animated cartoon media. As historical background, it briefly surveys the field from the primitive cave paintings of the prehistoric age to the various contemporary cartoon and animated cartoon character industries. It also arranges the types of literary expression that serve as visual expression techniques, such as metaphorical expressions, emblematic expressions, figures of speech and so forth. This attempt aims to present a basic method of analysis that connects cartoon and animated cartoon media with humanistic classification and builds a database of existing data. The accumulated data are expected to indicate how cartoons produce meaning.

Development of medical/electrical convergence software for classification between normal and pathological voices (장애 음성 판별을 위한 의료/전자 융복합 소프트웨어 개발)

  • Moon, Ji-Hye;Lee, JiYeoun
    • Journal of Digital Convergence / v.13 no.12 / pp.187-192 / 2015
  • Software that analyzes speech disorders would be widely applicable across converged fields. This paper implements a user-friendly program based on CART (classification and regression trees) analysis that distinguishes normal from pathological voices using a combination of acoustic and HOS (higher-order statistics) parameters, representing a convergence of medical information and signal processing. The acoustic parameters are Jitter(%) and Shimmer(%). The proposed HOS parameters are the means and variances of skewness (MOS and VOS) and kurtosis (MOK and VOK). The database consists of 53 normal and 173 pathological voices distributed by Kay Elemetrics. When the acoustic and proposed parameters are used together to generate the decision tree, the average accuracy is 83.11%. Finally, we developed a program with a more user-friendly interface and framework.
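
As a rough illustration of the HOS parameters named above, the sketch below computes skewness and kurtosis per frame and then takes their means and variances (MOS, VOS, MOK, VOK). The frame length, the non-overlapping framing, and the use of population moments with non-excess kurtosis are assumptions; the abstract does not state which conventions the authors used.

```python
import numpy as np

def hos_parameters(signal, frame_len=400):
    """Mean/variance of frame-wise skewness and kurtosis (illustrative only).

    signal: 1-D array of voice samples.
    frame_len: samples per non-overlapping frame (assumed value).
    Assumes no frame is constant, so the per-frame standard deviation is nonzero.
    """
    signal = np.asarray(signal, dtype=float)
    n_frames = len(signal) // frame_len
    frames = signal[: n_frames * frame_len].reshape(n_frames, frame_len)
    centered = frames - frames.mean(axis=1, keepdims=True)
    z = centered / centered.std(axis=1, keepdims=True)  # standardized samples
    skew = (z ** 3).mean(axis=1)   # third standardized moment per frame
    kurt = (z ** 4).mean(axis=1)   # fourth standardized moment (non-excess)
    return {"MOS": skew.mean(), "VOS": skew.var(),
            "MOK": kurt.mean(), "VOK": kurt.var()}

# Toy input: a signal alternating between -1 and 1.
params = hos_parameters(np.tile([-1.0, 1.0], 100), frame_len=50)
```

The four values, together with Jitter(%) and Shimmer(%), would form the feature vector passed to a decision tree learner.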

Utilizing Korean Ending Boundary Tones for Accurately Recognizing Emotions in Utterances (발화 내 감정의 정밀한 인식을 위한 한국어 문미억양의 활용)

  • Jang In-Chang;Lee Tae-Seung;Park Mikyoung;Kim Tae-Soo;Jang Dong-Sik
    • The Journal of Korean Institute of Communications and Information Sciences / v.30 no.6C / pp.505-511 / 2005
  • Autonomous machines that interact with humans should be able to perceive states of emotion and attitude through implicit messages in order to obtain voluntary cooperation from their clients. Voice is the easiest and most natural way to exchange human messages. Automatic systems capable of understanding states of emotion and attitude have used features based on the pitch and energy of uttered sentences. The performance of existing emotion recognition systems can be further improved with the support of the linguistic knowledge that a specific tonal section of a sentence is related to the states of emotion and attitude. In this paper, we attempt to improve the emotion recognition rate by incorporating such linguistic knowledge about Korean ending boundary tones into an automatic system implemented using pitch-related features and multilayer perceptrons. An experiment on a Korean emotional speech database confirms an improvement of 4%.
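
One way to picture pitch-related features restricted to the end of an utterance is the sketch below. It is not the authors' feature set: treating the last fraction of voiced frames as the "ending boundary tone" region, the `tail_fraction` value, and the choice of slope, mean, and range as features are all assumptions made for illustration.

```python
import numpy as np

def ending_tone_features(f0, tail_fraction=0.2):
    """Pitch features over the final voiced part of an utterance (hypothetical).

    f0: per-frame fundamental frequency in Hz, with 0 marking unvoiced frames.
    tail_fraction: share of voiced frames treated as the ending region (assumed).
    Returns (slope in Hz per frame, mean Hz, range Hz).
    """
    f0 = np.asarray(f0, dtype=float)
    voiced = f0[f0 > 0]                                   # drop unvoiced frames
    n_tail = max(2, int(round(len(voiced) * tail_fraction)))
    tail = voiced[-n_tail:]
    slope = np.polyfit(np.arange(len(tail)), tail, 1)[0]  # linear trend of the tail
    return slope, tail.mean(), tail.max() - tail.min()

# Toy contour: two unvoiced frames, then pitch rising by 10 Hz per frame.
contour = [0, 0, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190]
slope, mean_f0, f0_range = ending_tone_features(contour, tail_fraction=0.5)
```

Features like these could be appended to whole-sentence pitch statistics before training a multilayer perceptron.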

Decision Tree Learning Algorithms for Learning Model Classification in the Vocabulary Recognition System (어휘 인식 시스템에서 학습 모델 분류를 위한 결정 트리 학습 알고리즘)

  • Oh, Sang-Yeob
    • Journal of Digital Convergence / v.11 no.9 / pp.153-158 / 2013
  • If the target learning model is not recognized, or is not clearly classified within its category, vocabulary recognition performance is reduced. In addition, when the form of a classified learning model changes or a new learning model is added, the structure of the recognition decision tree must also be changed, which is a structural problem. To solve these problems, a decision tree learning algorithm for learning model classification is proposed. A database configured to sufficiently reflect phonological phenomena was used to secure training data, and decision tree learning was used as the method for classifying learning models. In experiments in an indoor environment, the system achieved recognition performance of 98.3% for environment-dependent recognition and 98.4% for vocabulary-independent recognition.

What has Korea told in the WTO? : An analysis on the Ministerial Conference Statements (WTO에서 한국은 무슨 말을 해왔나?: 각료회의 대표발언문 분석을 중심으로)

  • Jeong-meen Suh
    • Korea Trade Review / v.48 no.1 / pp.29-53 / 2023
  • This study analyzes the statements made by representatives of member countries at the WTO Ministerial Conference (MC), the highest decision-making body of the WTO, to examine the position and attitude that Korea has shown at the WTO over the last 27 years. After constructing a text dataset by extracting about 1,800 statement documents made by member countries from the WTO document database, text mining techniques are applied to identify the characteristics of Korea's statements compared with those of other member countries. Formal characteristics, such as the number of remarks and the length of speeches, are used to measure basic attitudes such as the continuity and level of Korea's interest in the WTO. In terms of substantive characteristics, the topics in Korea's statements are categorized with an LDA topic model, and Korea's keywords for each session are analyzed through comparison with the statements of other member countries.

A Study on Spoken Digits Analysis and Recognition (숫자음 분석과 인식에 관한 연구)

  • 김득수;황철준
    • Journal of Korea Society of Industrial Information Systems / v.6 no.3 / pp.107-114 / 2001
  • This paper describes connected digit recognition that takes the acoustic features of Korean into account. The recognition rate for connected digits is usually lower than for word recognition. Therefore, speech feature parameters and acoustic features are employed to build a robust digit model, and we confirmed the effect of considering acoustic features through recognition experiments. We used the KLE 4 connected digit database and 19 continuous-distribution HMMs as PLUs (phoneme-like units) defined by phonetic rules. For the recognition experiments, we tested two cases. In the first case, we used the usual method, Mel-cepstrum and regressive coefficients, to construct the phoneme models. In the second case, we used expanded feature parameters and acoustic features to construct the phoneme models. In both cases, we employed OPDP (One Pass Dynamic Programming) and FSA (Finite State Automata) for the recognition tests. When applying FSN for recognition, we applied various acoustic features. As a result, we obtained a recognition rate of 55.4% for Mel-cepstrum and 67.4% for Mel-cepstrum with regressive coefficients. We also obtained a recognition rate of 74.3% for the expanded feature parameters and 75.4% when applying acoustic features. Since applying acoustic features gave better results than the former method, we conclude that the suggested method is effective for connected digit recognition in Korean.
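
The regressive coefficients paired with the Mel-cepstrum above are commonly computed as a least-squares slope over neighbouring frames (often called delta coefficients). The sketch below uses that common formulation; the window size and the edge padding are assumptions, and the abstract does not say which variant the authors used.

```python
import numpy as np

def delta_coefficients(cepstra, window=2):
    """Regression (delta) coefficients over time, common textbook form.

    cepstra: array of shape (T, D), one cepstral vector per frame.
    window: number of frames on each side used in the regression (assumed).
    Frames beyond the edges are replaced by copies of the first/last frame.
    """
    cepstra = np.asarray(cepstra, dtype=float)
    n_frames = cepstra.shape[0]
    padded = np.pad(cepstra, ((window, window), (0, 0)), mode="edge")
    denom = 2 * sum(n * n for n in range(1, window + 1))
    delta = np.zeros_like(cepstra)
    for n in range(1, window + 1):
        ahead = padded[window + n : window + n + n_frames]    # c[t + n]
        behind = padded[window - n : window - n + n_frames]   # c[t - n]
        delta += n * (ahead - behind)
    return delta / denom

# Toy input: one coefficient that rises by 1 per frame.
cepstra = np.arange(10, dtype=float).reshape(10, 1)
delta = delta_coefficients(cepstra)
features = np.hstack([cepstra, delta])  # static + dynamic features per frame
```

Appending the delta values to the static cepstra gives each frame some information about local spectral change.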

A Phoneme-based Approximate String Searching System for Restricted Korean Character Input Environments (제한된 한글 입력환경을 위한 음소기반 근사 문자열 검색 시스템)

  • Yoon, Tai-Jin;Cho, Hwan-Gue;Chung, Woo-Keun
    • Journal of KIISE:Software and Applications / v.37 no.10 / pp.788-801 / 2010
  • Mobile devices are advancing remarkably, so research on mobile input devices is becoming increasingly important. There are many input methods, such as keypads, QWERTY keypads, touch screens, and speech recognizers, but they are not as convenient as typical keyboard-based desktop input devices, so input strings usually contain many typing errors. These input errors cause little trouble in communication between people, but they are a critical problem when searching a database such as a dictionary or address book, because correct results cannot be obtained. In particular, Hangeul has more than 10,000 different characters because each Hangeul character is formed by combining consonants and vowels, so the error frequency is higher than in English. The suffix tree is generally the most widely used data structure for handling query errors, but it is not sufficient for the variety of errors. In this paper, we propose a fast approximate Korean word searching system that tolerates a variety of typing errors. The system includes several algorithms for applying general approximate string searching to Hangeul. We also present a profanity filter built on the proposed system, which filters more than 90% of coined profanities.
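
The idea of phoneme-based matching can be sketched as follows: each precomposed Hangeul syllable is split into its consonant and vowel letters using the standard Unicode arithmetic, and an ordinary edit distance is then computed over the letters. This is a generic illustration, not the paper's algorithms; the function names `to_jamo`, `edit_distance`, and `jamo_distance` are hypothetical.

```python
HANGUL_BASE = 0xAC00   # code point of the first precomposed syllable
N_SYLLABLES = 11172    # 19 initials * 21 vowels * 28 finals
CHOSEONG = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"
JUNGSEONG = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"
JONGSEONG = ["", "ㄱ", "ㄲ", "ㄳ", "ㄴ", "ㄵ", "ㄶ", "ㄷ", "ㄹ", "ㄺ", "ㄻ", "ㄼ",
             "ㄽ", "ㄾ", "ㄿ", "ㅀ", "ㅁ", "ㅂ", "ㅄ", "ㅅ", "ㅆ", "ㅇ", "ㅈ", "ㅊ",
             "ㅋ", "ㅌ", "ㅍ", "ㅎ"]

def to_jamo(text):
    """Split precomposed Hangeul syllables into letters; keep other characters."""
    out = []
    for ch in text:
        idx = ord(ch) - HANGUL_BASE
        if 0 <= idx < N_SYLLABLES:
            out.append(CHOSEONG[idx // 588])           # initial consonant
            out.append(JUNGSEONG[(idx % 588) // 28])   # vowel
            final = JONGSEONG[idx % 28]                # optional final consonant
            if final:
                out.append(final)
        else:
            out.append(ch)
    return out

def edit_distance(a, b):
    """Levenshtein distance between two sequences (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (x != y)))  # substitution or match
        prev = cur
    return prev[-1]

def jamo_distance(s, t):
    """Edit distance measured in letters rather than whole syllables."""
    return edit_distance(to_jamo(s), to_jamo(t))
```

Measuring distance at the letter level means a single mistyped vowel counts as one error in a six-letter word, which gives finer-grained similarity scores than comparing whole syllables.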

Design and Implementation of a news Archive System using Shot Types (샷의 타입을 이용한 뉴스 아카이브 시스템의 설계 및 구현)

  • Han, Keun-Ju;Nang, Jong-Ho;Ha, Myung-Hwan;Jung, Byung-Hee;Kim, Kyeong-Soo
    • Journal of KIISE:Computing Practices and Letters / v.7 no.5 / pp.416-428 / 2001
  • To build a news archive system, the news video stream must first be segmented into individual articles, and their contents must be abstracted effectively. This abstraction helps users understand the contents of an article without playing the whole video stream. This paper proposes a new article boundary detection scheme for news video streams, together with a new news article abstraction scheme that uses the shot types of the news video data. The shots in the news video are classified into anchor person shots, interview shots, speech shots, reporting shots, graphic shots, and others. Since a news article starts with an anchor shot that has a special screen structure and lasts relatively longer than other shots, the proposed scheme detects the article boundary by computing the length of each shot and checking its screen structure. For effective abstraction of the article video, the proposed scheme primarily uses the graphic image located at the top right of the anchor shot frames, since it is an abstraction of the article made by the news producer according to its contents and therefore contains a lot of meaningful information. The key frames of the other shots, except interview and reporting shots, are also used to abstract the contents of the articles. In experiments, the precision and recall of the proposed article boundary detection scheme were 92% and 96%, respectively. This paper also presents the design and implementation of a prototype news archive system on the WWW that consists of an indexing tool, an authoring tool, a database for news metadata, and a browsing tool.
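
The boundary rule and its evaluation can be sketched roughly as below. The sketch assumes that the screen-structure check has already been reduced to a shot-type label and that a fixed duration threshold separates article-opening anchor shots from shorter ones; the threshold value and both function names are hypothetical, not taken from the paper.

```python
def detect_boundaries(shots, min_anchor_sec=10.0):
    """Indices of shots assumed to open a new article.

    shots: list of (shot_type, duration_sec) pairs in broadcast order.
    min_anchor_sec: assumed minimum length of an article-opening anchor shot.
    """
    return [i for i, (kind, duration) in enumerate(shots)
            if kind == "anchor" and duration >= min_anchor_sec]

def precision_recall(detected, truth):
    """Precision and recall of detected boundary indices against ground truth."""
    detected, truth = set(detected), set(truth)
    hits = len(detected & truth)
    precision = hits / len(detected) if detected else 0.0
    recall = hits / len(truth) if truth else 0.0
    return precision, recall

# Toy example with made-up shots and made-up ground truth.
shots = [("anchor", 15.0), ("reporting", 8.0), ("interview", 12.0),
         ("anchor", 4.0), ("anchor", 20.0), ("graphic", 5.0)]
found = detect_boundaries(shots)
precision, recall = precision_recall(found, truth=[0, 3, 4])
```

In this toy case the short anchor shot at index 3 is missed, which lowers recall while precision stays perfect, showing how the duration threshold trades one measure against the other.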

A Research Review of High-technology AAC Intervention for Individuals with Disabilities (장애인을 위한 하이-테크놀로지 보완·대체의사소통체계 실험 연구 동향 분석)

  • Song, Jaeok;Jeon, Byung-un
    • 재활복지 / v.20 no.4 / pp.203-228 / 2016
  • The purpose of this study was to identify recent trends in high-tech AAC intervention studies for individuals with disabilities. Electronic database searches were completed to identify studies published between 2009 and 2016, and 46 studies were identified for inclusion in this review. The studies were classified by participants, research design, intervention setting, independent variables, dependent variables, communication skills by high-tech device, and type of high-tech AAC device. Across these studies, intervention was provided to a total of 126 participants. Most participants were aged 6-11, and the most common diagnosis was autistic spectrum disorder. The most common study designs were the multiple probe design and the multiple treatment design. The majority of studies implemented interventions in a special education school (classroom) setting. The majority of studies implemented interventions to compare the effects of high-tech and low-tech AAC device interventions. The majority of targeted behavioral outcomes were communication skills. The tablet PC was the device most frequently used for intervention in both domestic and foreign studies. The most common software applications were 'My talky' in domestic studies and 'Proloquo2Go' in foreign studies. The synthesis of evidence describing the views of users and providers and the implementation of high-tech AAC devices can provide valuable data to inform intervention studies and functional outcome measures. Suggestions for future research are discussed.