• Title/Abstract/Keywords: Art of speech

Search results: 86 items (processing time: 0.024 s)

The Effect of the Number of Clusters on Speech Recognition with Clustering by ART2/LBG

  • Lee, Chang-Young
    • 말소리와 음성과학 (Phonetics and Speech Sciences) / Vol. 1, No. 2 / pp. 3-8 / 2009
  • In an effort to improve speech recognition, we investigated the effect of the number of clusters. In the usual LBG clustering, the number of codebook clusters doubles on each bifurcation and hence cannot be chosen arbitrarily in a natural way. To control the number of clusters, we combined adaptive resonance theory (ART2) with LBG and performed the clustering in two stages. The codebook thus formed was used in subsequent fuzzy vector quantization (FVQ) and HMM processing for speech recognition tests. Compared to conventional LBG, our method was shown to reduce the best recognition error rate by 0 to 0.9%, depending on the vocabulary size. The results also showed that the optimal number of clusters would lie between 400 and 800 in the limits of small- and large-vocabulary isolated-word recognition, respectively.
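The doubling constraint mentioned in this abstract can be illustrated with a minimal sketch of plain LBG splitting. This is not the authors' ART2/LBG combination; the function names (`lbg`, `lloyd`, `nearest`), the additive perturbation `eps`, and the synthetic 2-D data are all assumptions made for illustration.

```python
import random

def nearest(codebook, x):
    """Index of the codeword closest to x (squared Euclidean distance)."""
    return min(range(len(codebook)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(codebook[i], x)))

def lloyd(data, codebook, iters=10):
    """Refine codewords by alternating assignment and centroid update."""
    for _ in range(iters):
        cells = [[] for _ in codebook]
        for x in data:
            cells[nearest(codebook, x)].append(x)
        # An empty cell keeps its previous codeword.
        codebook = [
            [sum(col) / len(cell) for col in zip(*cell)] if cell else cw
            for cell, cw in zip(cells, codebook)
        ]
    return codebook

def lbg(data, splits, eps=0.01):
    """Plain LBG: start from the global centroid and split every codeword
    into two perturbed copies, so the codebook size doubles on each split."""
    codebook = [[sum(col) / len(data) for col in zip(*data)]]
    sizes = [len(codebook)]
    for _ in range(splits):
        codebook = [[c + s * eps for c in cw]
                    for cw in codebook for s in (+1, -1)]
        codebook = lloyd(data, codebook)
        sizes.append(len(codebook))
    return codebook, sizes

random.seed(0)
data = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(200)]
codebook, sizes = lbg(data, splits=3)
print(sizes)  # codebook size after each split
```

Because every codeword is split in two, the reachable sizes in this sketch are powers of two only, which is the limitation the abstract says the two-stage method is meant to remove.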

A Study on Design and Implementation of Speech Recognition System Using ART2 Algorithm

  • Kim, Joeng Hoon;Kim, Dong Han;Jang, Won Il;Lee, Sang Bae
    • International Journal of Fuzzy Logic and Intelligent Systems / Vol. 4, No. 2 / pp. 149-154 / 2004
  • In this research, we chose speech recognition as the sole means of controlling an electric wheelchair system and used DTW (Dynamic Time Warping), which is speaker-dependent and has a relatively high recognition rate among speech recognition methods. However, real-time operation requires a small memory footprint and fast processing. Thus, we introduced VQ (Vector Quantization), which is widely used as a compression algorithm in speaker-independent recognition, to secure fast recognition and small memory use. However, we found that the recognition rate decreased after applying VQ. To improve the recognition rate, we applied the ART2 (Adaptive Resonance Theory 2) algorithm as a post-processing step and obtained about a 5% improvement in recognition rate. To use ART2, an error range has to be applied. When the difference obtained by subtracting the first distance from the second distance, among the distances computed by DTW, is 20 or more, the error range is applied. With ART2 applied in this way, we obtained fast processing and a high recognition rate. Moreover, since the system is a moving object, it should be implemented as an embedded system. Thus, we selected the TMS320C32 chip, which can perform a large number of calculations relatively fast, to implement the embedded system. Considering that the stored data is speech, we used 128 kbyte of RAM and 64 kbyte of ROM to hold the large amount of data. For speech input, we used a 16-bit stereo audio codec, which provides relatively accurate data through its high resolution.
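As a rough illustration of the DTW matching step and of the gap between the best and second-best distances that this abstract refers to, here is a minimal sketch. It assumes 1-D feature sequences, an absolute-difference local cost, and hypothetical template labels; the paper's actual features, its distance scale (on which the threshold of 20 is defined), and its ART2 post-processing are not reproduced.

```python
def dtw_distance(a, b):
    """Classic DTW between two 1-D sequences with absolute-difference cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Allowed steps: insertion, deletion, match.
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]

def rank_templates(query, templates):
    """Return (label, distance) pairs sorted by DTW distance, best first."""
    scored = ((label, dtw_distance(query, t)) for label, t in templates.items())
    return sorted(scored, key=lambda pair: pair[1])

# Hypothetical reference templates and a time-stretched query.
templates = {"go": [1, 2, 3, 2, 1], "stop": [5, 5, 4, 4, 5]}
ranked = rank_templates([1, 2, 2, 3, 2, 1], templates)

# Gap between the second-best and best distances (second minus first).
margin = ranked[1][1] - ranked[0][1]
print(ranked[0][0], margin)
```

How the margin is compared against the error range, and what ART2 does with the result, is left open here because the abstract does not specify it.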

The Effect of Comprehensive Art Therapy on Physical Performance and Activities of Daily Living in Children with Cerebral Palsy

  • Baek, Suejung;Lee, Myeungsu;Yang, Chungyong;Yang, Jisu;Kang, Eunyeong;Chong, Bokhee
    • 대한통합의학회지 (Journal of the Korean Society of Integrative Medicine) / Vol. 7, No. 3 / pp. 51-59 / 2019
  • Purpose: To evaluate the effect of comprehensive art therapy on physical function and activities of daily living in children with cerebral palsy (CP). Methods: Ten ambulant children with diplegic (n=8) or hemiplegic (n=2) CP participated in this study. All were randomly assigned to either the art therapy group (n=5) or the control group (n=5). Both groups received physical therapy based on neurodevelopmental techniques for 20 minutes a day, 1 day a week, for a period of 12 weeks. Children in the art therapy group received additional comprehensive art therapy for 70 minutes once a week for 3 months. The following tests were performed before and after treatment: Motricity Index (MI) for strength, Trunk Control Test (TCT) for trunk ability, Gross Motor Function Measure (GMFM) and Gross Motor Function Classification System (GMFCS) for gross motor function, Denver Developmental Screening Test-II (DDST-II) for developmental milestones, Functional Independence Measure of Children (WeeFIM) for abilities to complete daily activities, and Leg and Hand Ability Test (LHAT) for limb function. Results: The upper extremity and whole extremity strengths of MI, the self-care and total scores of WeeFIM, and the leg and arm functions of LHAT improved significantly only in the art therapy group after the art therapy (p<.05). After treatment, the upper extremity and whole extremity strengths of MI and the leg function of LHAT were also significantly improved compared to the control group (p<.05). Conclusion: This study revealed that comprehensive art therapy along with physiotherapy was effective in increasing upper extremity strength and leg ability in children with CP. This suggests that comprehensive art therapy may be a useful adjunctive therapy for children with CP.

Performance Comparison of Multiple-Model Speech Recognizer with Multi-Style Training Method Under Noisy Environments

  • 윤장혁;정용주
    • The Journal of the Acoustical Society of Korea / Vol. 29, No. 2E / pp. 100-106 / 2010
  • The multiple-model speech recognizer has been shown to be quite successful in noisy speech recognition. However, its performance has usually been tested using general speech front-ends that do not incorporate any noise-adaptive algorithms. To evaluate accurately the effectiveness of the multiple-model framework in noisy speech recognition, we used state-of-the-art front-ends and compared its performance with the well-known multi-style training method. In addition, we improved the multiple-model speech recognizer by employing N-best reference HMMs for interpolation and using multiple SNR levels for training each reference HMM.

Enhancement of Excitation in Low-bit-rate Speech Coders

  • 이미숙;김홍국;최승호;김도영
    • 대한전자공학회 (The Institute of Electronics Engineers of Korea): Conference Proceedings / Proceedings of the 2003 Signal Processing Society Fall Conference / pp. 57-60 / 2003
  • In this paper, we propose a new excitation enhancement technique to improve the speech quality of low-bit-rate speech coders. The proposed technique is based on a harmonic model and is employed only in the decoding process of speech coders, without any additional bits. We develop the procedure for harmonic model parameter estimation and harmonic generation, and apply the technique to a current state-of-the-art low-bit-rate speech coder, ITU-T G.729 Annex D. Its performance is measured using the ITU-T P.862 PESQ score and compared to those of the phase dispersion filter and the long-term postfilter applied to the decoded excitation. It is shown that the proposed excitation enhancement technique can improve the quality of decoded speech and provide better quality for male speech than the other techniques.

Excitation Enhancement Based on a Selective-Band Harmonic Model for Low-Bit-Rate Code-Excited Linear Prediction Coders

  • 이미숙;김홍국;최승호;김도영
    • 음성과학 (Speech Sciences) / Vol. 11, No. 2 / pp. 259-269 / 2004
  • In this paper, we propose a new excitation enhancement technique to improve the speech quality of low bit-rate code-excited linear prediction (CELP) coders. The proposed technique is based on a harmonic model and it is employed only in the decoding process of speech coders without any additional bits. We develop the procedure of harmonic model parameter estimation and harmonic generation, and apply this technique to a current state-of-the-art low bit rate speech coder, ITU-T G.729 Annex D. Also, its performance is measured by using the ITU-T P.862 PESQ score and compared to those of the phase dispersion filter and the long-term postfilter applied to the decoded excitation. It is shown that the proposed excitation enhancement technique can improve the quality of decoded speech and provide better quality for male speech than other techniques.

State of the Art for Refractory Cough: Multidisciplinary Approach

  • Anne E. Vertigan
    • Tuberculosis and Respiratory Diseases / Vol. 86, No. 4 / pp. 264-271 / 2023
  • Chronic cough is a common problem that can be refractory to medical treatment. Nonpharmaceutical management of chronic cough has an important role in well-selected patients. This review article outlines the history of chronic cough management, current approaches to speech pathology management of the condition, and new modalities of nonpharmaceutical treatment. There is a need for further research into nonpharmaceutical options through well-described randomised controlled trials.

KMSAV: Korean multi-speaker spontaneous audiovisual dataset

  • Kiyoung Park;Changhan Oh;Sunghee Dong
    • ETRI Journal / Vol. 46, No. 1 / pp. 71-81 / 2024
  • Recent advances in deep learning for speech and visual recognition have accelerated the development of multimodal speech recognition, yielding many innovative results. We introduce a Korean audiovisual speech recognition corpus. This dataset comprises approximately 150 h of manually transcribed and annotated audiovisual data, supplemented with an additional 2000 h of untranscribed videos collected from YouTube under the Creative Commons License. The dataset is intended to be freely accessible for unrestricted research purposes. Along with the corpus, we propose an open-source framework for automatic speech recognition (ASR) and audiovisual speech recognition (AVSR). We validate the effectiveness of the corpus with evaluations using state-of-the-art ASR and AVSR techniques, capitalizing on both pretrained models and fine-tuning processes. After fine-tuning, ASR and AVSR achieve character error rates of 11.1% and 18.9%, respectively. This error difference highlights the need for improvement in AVSR techniques. We expect that our corpus will be an instrumental resource to support improvements in AVSR.
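For readers unfamiliar with the character error rate (CER) quoted in this abstract, the following minimal sketch computes it in the usual way, as the character-level edit distance divided by the reference length. The function names and example strings are assumptions for illustration; the paper's own scoring pipeline and any text normalization it applies are not known from the abstract.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution or match
        prev = cur
    return prev[-1]

def cer(ref, hyp):
    """Character error rate: edit distance normalized by reference length."""
    return edit_distance(ref, hyp) / len(ref)

# One substituted character out of four reference characters.
print(cer("abcd", "abxd"))
```

Because insertions count as errors, CER computed this way can exceed 1.0 when the hypothesis is much longer than the reference.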

Codebook design for subspace distribution clustering hidden Markov model

  • 조영규;육동석
    • 대한음성학회 (The Korean Society of Phonetic Sciences and Speech Technology): Conference Proceedings / Proceedings of the 2005 Spring Conference / pp. 87-90 / 2005
  • Today's state-of-the-art speech recognition systems typically use continuous-distribution hidden Markov models with mixtures of Gaussian distributions. To obtain higher recognition accuracy, the hidden Markov models typically require a huge number of Gaussian distributions. Such speech recognition systems require too much memory to run and are too slow for large applications. Many approaches have been proposed for the design of compact acoustic models. One of these is the subspace distribution clustering hidden Markov model, which can represent the original full-space distributions as combinations of a small number of subspace distribution codebooks. Therefore, how to build the codebook is an important issue in this approach. In this paper, we report some experimental results on various quantization methods for building more accurate models.

A Study on Art's Public Features and Social Intervention by Keith Haring

  • 김지영
    • 미술이론과 현장 (The Journal of Art Theory & Practice) / No. 8 / pp. 59-87 / 2009
  • This thesis began as an attempt to show that the 1980s American artist Keith Haring (1958-1990) carried out social intervention through criticism, resistance, and participation in his works, and thereby pursued public value. Haring, famous for graffiti, left popular and familiar cartoon-style pictures on street walls, billboards, posters, and so on. His popular and playful works have been explained as his unique characteristics, but Haring's way of creating in the field has more value than can be grasped as an artist's personal characteristics alone. Haring's works became everyday art by joining with people's lives, and they function as a place for social speech. I therefore think that Haring's works possess the characteristics of 'the public sphere'. 'The public sphere' is a sphere that is independent of and free from the government and partisan economic forces, and so is not tied to vested interests; it is a sphere of rational argumentation without 'disguise' or 'fabrication', in which the general public can participate and which the public can inspect. The public sphere lies between the sphere of public authority, such as the nation and the market, and the private sphere of free individuals; it is mutually connected with both and works as the space in which public opinion is formed. Private individuals communicate through this public sphere and perform the role of direct and indirect checks, balance, and social criticism at a distance from power. Openness, which should include the voices not only of the leading powers but also of the socially weak, such as citizens, women, homosexuals, and minority races, and of alienated classes, is an index of publicness. The public sphere does not work through speech and mass media alone. Many artists, including Haring, speak out and act through art at the center of society, and create another public sphere through art.
I understood the participatory and practical character of Haring's work as a phenomenon and current within a part of the art world that includes Haring. This current, which started in the 1960s, is an in-depth effort to connect more closely with life, to communicate with people, and to improve the problems of life. It has pursued public value in a way different from that of the nation or public power. Artists have intervened in society in strategic and active ways in order to raise marginalized values and suppressed rights as public agenda, and have labored to have the values of variety and difference accepted in society. This aspect of social intervention is a notable feature found in Haring's works and process. Haring's works carry art-historical meanings and are expressed in a familiar, plastic language, so they were able to communicate with various classes. He also secured varied audiences in the field and on the street. This communicative and public approach greatly raised the possibility of his works functioning as a public sphere. On this basis, Haring presented critical and resistant speech toward society through his works. He asserted his position and the legitimacy of his gender identity as a sexual minority. This work extended beyond his own rights into a movement for alienated classes and the socially weak. His speech and messages on wall paintings, posters, T-shirts, subway billboards, and so on worked as a spectacle and pressed for concern with social issues and a shift in consciousness. He also tried to protect and care for people harmed by HIV and drugs and to realize social justice through the protection of the socially weak. Haring's works, planned to meet as many people as possible, performed the role of intervening in society through criticism, resistance, speech, and participation, and of monitoring and checking social issues.
These things considered, Haring's works show his consciousness of the public attributes of art and clearly include the pursuit of public value. We can also find the meaning of his work in the fact that art functions as a public sphere and shows the possibility of discussing and practicing public issues.
