• Title/Summary/Keyword: voice extract

Indexing and Retrieval of Human Individuals on Video Data Using Face and Speaker Recognition

  • Y.Sugiyama;N.Ishikawa;M.Nishida;Y.Ariki
    • Proceedings of the Korean Society of Broadcast Engineers Conference
    • /
    • 1998.06b
    • /
    • pp.122-127
    • /
    • 1998
  • In this paper, we focus on retrieving human individuals recorded in a video database. Our purpose is to index persons by their faces or voices and to retrieve the time sections in which they appear in the video data. The database system can track and extract the face or voice of a certain person and construct a model of that individual in a self-organizing mode. If the person appears again at a different time, the system can mark the associated frames as belonging to the same person. In this way, the same person can be retrieved even if the system does not know his or her exact name. For face and speaker modeling, a subspace method is employed to improve the indexing accuracy.
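
The retrieval step described above (frames marked per person, then looked up as time sections) can be illustrated with a minimal Python sketch. It assumes the recognizer has already produced one label per frame; the function name `time_sections`, the label values, and the use of `None` for frames with no recognized person are illustrative assumptions, not details from the paper.

```python
from itertools import groupby


def time_sections(frame_labels, fps):
    """Collapse per-frame person labels into {person: [(start_s, end_s), ...]}.

    frame_labels: one label per video frame, or None when nobody is recognized.
    fps: frames per second, used to convert frame indices to seconds.
    """
    index = {}
    frame = 0
    for label, run in groupby(frame_labels):
        length = len(list(run))
        if label is not None:
            # A run of identical labels becomes one contiguous time section.
            index.setdefault(label, []).append((frame / fps, (frame + length) / fps))
        frame += length
    return index


if __name__ == "__main__":
    labels = ["A", "A", None, "B", "B", "B", "A"]
    print(time_sections(labels, fps=1))
```

Because the index is keyed by an arbitrary label, a person can be retrieved by the system-assigned mark alone, without a known name.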

Korean isolated word recognizer using new time alignment method of speech signal (새로운 시간축 정규화 방법을 이용한 한국어 고립단어 인식기)

  • Nam, Myeong-U;Park, Gyu-Hong;No, Seung-Yong
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.38 no.5
    • /
    • pp.567-575
    • /
    • 2001
  • This paper suggests a new method for obtaining fixed-size parameters from voice signals of different lengths. The efficiency of a speech recognizer is determined by how it compares the similarity (the distance between patterns) of the parameters extracted from the voice signal. However, the variation of the voice signal and differences in speech speed make it difficult to extract fixed-size parameters from the signal. The method suggested in this paper normalizes the parameters to a fixed size by applying the two-dimensional DCT (Discrete Cosine Transform) after representing the parameters as a spectrogram. To prove the validity of the suggested method, parameters extracted from a 32-channel auditory filter bank (which estimates auditory nerve firing probabilities) are processed by the two-dimensional DCT and used as the input of a neural network. For comparison, we used one of the conventional methods that solve the time alignment problem. The results show more efficient performance and faster recognition speed than the conventional method in both speaker-dependent and speaker-independent isolated word recognition.
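
The idea of reducing spectrograms of different lengths to one fixed size with a two-dimensional DCT can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes an orthonormal DCT-II and assumes that the fixed size is obtained by keeping only the low-order coefficients. The names `dct_matrix` and `fixed_size_dct2` and the 8 x 8 output size are hypothetical.

```python
import math


def dct_matrix(n_out, n_in):
    """Return the first n_out rows of the orthonormal DCT-II basis for length n_in."""
    rows = []
    for k in range(n_out):
        scale = math.sqrt(1.0 / n_in) if k == 0 else math.sqrt(2.0 / n_in)
        rows.append([scale * math.cos(math.pi * (2 * n + 1) * k / (2 * n_in))
                     for n in range(n_in)])
    return rows


def fixed_size_dct2(spectrogram, keep_freq, keep_time):
    """Map a (channels x frames) spectrogram to keep_freq x keep_time coefficients.

    Assumes keep_freq <= number of channels and keep_time <= number of frames.
    """
    n_ch = len(spectrogram)
    n_fr = len(spectrogram[0])
    c_f = dct_matrix(keep_freq, n_ch)
    c_t = dct_matrix(keep_time, n_fr)
    # DCT along the channel axis: tmp is keep_freq x n_fr.
    tmp = [[sum(c_f[i][c] * spectrogram[c][t] for c in range(n_ch))
            for t in range(n_fr)] for i in range(keep_freq)]
    # DCT along the time axis: result is keep_freq x keep_time.
    return [[sum(tmp[i][t] * c_t[j][t] for t in range(n_fr))
             for j in range(keep_time)] for i in range(keep_freq)]


if __name__ == "__main__":
    short_word = [[1.0] * 40 for _ in range(32)]  # 32 channels, 40 frames
    long_word = [[1.0] * 75 for _ in range(32)]   # 32 channels, 75 frames
    for spec in (short_word, long_word):
        out = fixed_size_dct2(spec, 8, 8)
        print(len(out), len(out[0]))
```

Both inputs yield the same output shape, so they could feed a network with a fixed input layer. With this orthonormal scaling the coefficient magnitudes still depend on the number of frames, so a real system would likely need an additional amplitude normalization.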

Voice Recognition Performance Improvement using the Convergence of Voice signal Feature and Silence Feature Normalization in Cepstrum Feature Distribution (음성 신호 특징과 셉스트럽 특징 분포에서 묵음 특징 정규화를 융합한 음성 인식 성능 향상)

  • Hwang, Jae-Cheon
    • Journal of the Korea Convergence Society
    • /
    • v.8 no.5
    • /
    • pp.13-17
    • /
    • 2017
  • In existing methods of extracting speech features from a speech signal, recognition rates suffer because speech is misclassified when the threshold value is not clear. This article proposes a modeling method for improving speech recognition performance that combines speech feature extraction with silence features normalized for non-speech. The proposed method minimizes the effect of noise, and the speech recognition model combines the speech signal feature extraction for each speech frame with the silence feature normalization. In addition, the method creates the original speech signal with an energy spectrum similar to entropy, so the speech is less affected by noise. The signal-to-noise ratio is improved by the silence feature normalization. We fixed the standard value for classifying speech and non-speech in the cepstrum. In a performance analysis comparing the presented method with CHMM and HMM, the recognition rate improved by 2.7%p in the speaker-dependent case and by 0.7%p in the speaker-independent case.

Design And Implementation of a Speech Recognition Interview Model based-on Opinion Mining Algorithm (오피니언 마이닝 알고리즘 기반 음성인식 인터뷰 모델의 설계 및 구현)

  • Kim, Kyu-Ho;Kim, Hee-Min;Lee, Ki-Young;Lim, Myung-Jae;Kim, Jeong-Lae
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.12 no.1
    • /
    • pp.225-230
    • /
    • 2012
  • Opinion mining applies existing data mining technology to blog posts uploaded to the web and to product comments in order to extract the author's opinion; it does not judge the subject of a text, only the emotion expressed about that subject. In this paper, we propose judging emotions by combining published opinion mining algorithms for non-voice data with text obtained through a speech recognition API. The system links an open ranking (sunwihwa) algorithm with the Google Voice Recognition API, determines polarity through an improved algorithm design, and on this basis implements a speech recognition interview model.

A Study on the Gender and Age Classification of Speech Data Using CNN (CNN을 이용한 음성 데이터 성별 및 연령 분류 기술 연구)

  • Park, Dae-Seo;Bang, Joon-Il;Kim, Hwa-Jong;Ko, Young-Jun
    • The Journal of Korean Institute of Information Technology
    • /
    • v.16 no.11
    • /
    • pp.11-21
    • /
    • 2018
  • Research is being carried out to categorize voices using deep learning technology. This study examines neural-network-based sound classification studies and suggests an improved neural network for voice classification. Related studies addressed urban data classification, but they showed poor performance with shallow neural networks. Therefore, in this paper we first preprocess the voice data and extract feature values. Next, we classify the voices by feeding the feature values into the previous sound classification network and into the proposed neural network. Finally, we compare and evaluate the classification performance of the two networks. The neural network in this paper is made deeper and wider so that it learns better. The results showed a performance of 84.8 percent for the neural network of the related studies and 91.4 percent for the proposed neural network, so the proposed network scored about 6 percentage points higher.

Research on Construction of the Korean Speech Corpus in Patient with Velopharyngeal Insufficiency (구개인두부전증 환자의 한국어 음성 코퍼스 구축 방안 연구)

  • Lee, Ji-Eun;Kim, Wook-Eun;Kim, Kwang Hyun;Sung, Myung-Whun;Kwon, Tack-Kyun
    • Korean Journal of Otorhinolaryngology-Head and Neck Surgery
    • /
    • v.55 no.8
    • /
    • pp.498-507
    • /
    • 2012
  • Background and Objectives: We aimed to develop a Korean version of the velopharyngeal insufficiency (VPI) speech corpus system. Subjects and Method: After developing a 3-channel simultaneous speech recording device capable of recording nasal/oral and normal compound speech separately, voice data were collected from VPI patients older than 10 years with or without a history of operation or prior speech therapy. These were compared with a control group in which VPI was simulated by inserting a French-3 Nelaton tube through both nostrils and the nasopharynx and pulling the soft palate anteriorly to varying degrees. The study involved three transcribers: a speech therapist transcribed the voice files into text, a second transcriber graded speech intelligibility and severity, and the third tagged the types and onset times of misarticulation. The database was composed of three main tables covering (1) the speaker's demographics, (2) the condition of the recording system, and (3) the transcripts. All of these were interfaced with the Praat voice analysis program, which enables the user to extract the exact transcribed phrases for analysis. Results: In the simulated VPI group, the higher the severity of VPI, the higher the nasalance score. In addition, we could verify the vocal energy that characterizes hypernasality and compensation in the nasal/oral and compound sounds spoken by VPI patients, as opposed to that which characterizes the normal control group. Conclusion: With the Korean version of the VPI speech corpus system, patients' common difficulties and speech tendencies in articulation can be objectively evaluated. By comparing these data with those of the normal voice, the mispronunciation and dysarticulation of patients with VPI can be corrected.

A Study on Infant Respiratory Diseases Diagnosis using Frequency Bandwidth Analysis of Crying Waveform (울음소리의 주파수 대역폭 분석을 이용한 소아호흡기 질환 진단에 관한 연구)

  • Kim, Bong-Hyun;Cho, Dong-Uk
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.33 no.12B
    • /
    • pp.1123-1130
    • /
    • 2008
  • Diagnosing diseases in infants is inconvenient because infants lack the ability to express their condition and must be brought in person for help, while concern about infant health is growing amid changes in the birth, marriage, and divorce rates. In this paper, we therefore develop a system, intended as a basis for use at home, that extracts and analyzes voice analysis components from infant crying sounds and compares normal infants with infants who have a disease. In particular, this paper proposes a methodology for developing a system that extracts features of the crying sound for infant respiratory diseases, namely the common cold, pneumonia, and asthma, which are among the diseases infants contract most easily. Because respiratory diseases affect all the voice organs, we experiment with an analysis method based on the bandwidth, a phonetic analysis component, comparing normal infants with infants who have a respiratory disease. Through this method, we obtained the result that the frequency bandwidth of infants suffering from respiratory diseases is narrower than that of normal infants.

Twitter Crawling System

  • Ganiev, Saydiolim;Nasridinov, Aziz;Byun, Jeong-Yong
    • Journal of Multimedia Information System
    • /
    • v.2 no.3
    • /
    • pp.287-294
    • /
    • 2015
  • We are living in an epoch of information in which the Internet touches all aspects of our lives. It provides plenty of services, each of which benefits people in different ways. Electronic mail (E-mail), File Transfer Protocol (FTP), voice/video communication, and search engines are clear examples of Internet services. Among them, Social Network Services (SNS) have continuously gained popularity over the past years. The most popular SNSs, such as Facebook, Weibo, and Twitter, generate millions of data items every minute. Twitter is an SNS that allows its users to post short instant messages. In 2012, its 100 million users posted 340 million tweets per day [1]. Such a large amount of data often contains a lot of noisy data, which can be defined as uninteresting and unclassifiable data. However, researchers can take advantage of this huge volume of information to analyze and extract meaningful and interesting features. SNS data, including tweets, are collected by crawlers, and Twitter crawlers have recently emerged as a useful tool for collecting Twitter data. In this project, we develop a Twitter Crawler system that enables us to extract Twitter data. We implemented our system in Java along with MySQL. We use Twitter4J, a Java library for communicating with the Twitter API. The application first connects to the Twitter API, then retrieves tweets, and stores them in a database. We also develop crawling strategies to extract tweets efficiently in terms of time and amount.
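
The connect, retrieve, and store flow described above can be sketched in a few lines. The paper's system is written in Java with Twitter4J and MySQL; the Python sketch below is only an illustration under different assumptions. It uses SQLite from the standard library as a stand-in for MySQL, and the retrieval step is a hypothetical `fetch_page` callable supplied by the caller, so no real Twitter API call is shown. The table layout and field names are also assumptions.

```python
import sqlite3


def store_tweets(conn, tweets):
    """Insert tweets into the database and return how many were new."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tweets (id INTEGER PRIMARY KEY, user TEXT, text TEXT)"
    )
    before = conn.execute("SELECT COUNT(*) FROM tweets").fetchone()[0]
    # INSERT OR IGNORE skips tweets whose id is already stored.
    conn.executemany(
        "INSERT OR IGNORE INTO tweets (id, user, text) VALUES (?, ?, ?)",
        [(t["id"], t["user"], t["text"]) for t in tweets],
    )
    conn.commit()
    after = conn.execute("SELECT COUNT(*) FROM tweets").fetchone()[0]
    return after - before


def crawl(fetch_page, conn, max_pages):
    """Retrieve pages of tweets and store them until the pages or the budget run out.

    fetch_page(cursor) is a hypothetical callable that returns (tweets, next_cursor),
    with next_cursor set to None when there is nothing more to fetch.
    """
    total = 0
    cursor = None
    for _ in range(max_pages):
        tweets, cursor = fetch_page(cursor)
        total += store_tweets(conn, tweets)
        if cursor is None:
            break
    return total


if __name__ == "__main__":
    # Fake two-page source standing in for the real API; tweet 2 appears twice.
    pages = {
        None: ([{"id": 1, "user": "a", "text": "hello"},
                {"id": 2, "user": "b", "text": "world"}], "p2"),
        "p2": ([{"id": 2, "user": "b", "text": "world"},
                {"id": 3, "user": "c", "text": "again"}], None),
    }
    db = sqlite3.connect(":memory:")
    print(crawl(lambda cur: pages[cur], db, max_pages=5))
```

Keying the table on the tweet id makes repeated crawls idempotent, which is one simple way to bound the amount of stored data when pages overlap.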

Effects of Kudzu Leaf Extracts on Stress Reduction in Rats with Damaged Larynxes (후두 손상 유발시킨 랫드에 칡잎추출물을 투여하여 스트레스 경감효과에 미치는 영향)

  • Lee, Tae-Jong;Yea, Chun-Jung
    • Journal of Environmental Health Sciences
    • /
    • v.38 no.5
    • /
    • pp.431-437
    • /
    • 2012
  • Objectives: This study aims to investigate the effects of voice disorders on changes in stress among people with damaged larynxes. To accomplish this, physiological changes and reductions in stress were investigated in Sprague-Dawley rats whose larynxes had been damaged, after the animals were administered kudzu leaf extracts with sedative effects. Methods: In the experiment, a total of 24 rats were divided into four groups of six rats: the normal group, the control group, experimental group 1, and experimental group 2. After orally administering a predetermined amount of the extract to the subjects at a specific time (once per day over five weeks), changes in physiological functions, internal organ weight, cortisol, estrogen, and progesterone were examined, and an immunological test was conducted on the subjects' brain tissues. Results: Statistical significance was seen in the experimental group as opposed to the control group, and the results were similar to those of the normal group. Conclusions: In consideration of these results, voice disorders are deemed to have severe effects on stress, and the administration of kudzu leaf extracts is deemed to reduce that stress.

Development of An Ergonomic Product Development Process Reflecting Quantified Customer Preference (정량화된 고객 선호도를 체계적으로 반영하기 위한 인간공학적 제품 개발 프로세스)

  • Im, YoungJae;Jung, Eui S.;Park, SungJoon
    • Journal of Korean Institute of Industrial Engineers
    • /
    • v.34 no.1
    • /
    • pp.66-78
    • /
    • 2008
  • In the past, manufacturers used to determine the quality of products, but today's market is becoming more customer-driven. As a result, demands from customers are becoming more diverse and complicated, and most companies are obligated to meet their needs. As one of the efforts to achieve customer satisfaction, companies are now emphasizing activities to find out what customers specifically want and to extract the voice of customer (VOC). This study attempts to develop an ergonomic product development process as a method to reflect the VOC as fully as possible. To meet this goal, ergonomic design guidelines, which can be classified according to users' human characteristics, are recommended. Numerous design guidelines already exist in the ergonomics literature. However, it is not realistically feasible to review all of those guidelines, and some of them even conflict with each other. Therefore, this paper suggests a product development process that prioritizes the human characteristics reflecting customer needs and applies the design guidelines that meet the most important ones. Finally, the validity of the product development process is demonstrated through a mobile phone development case.