• Title/Abstract/Keywords: Source speaker

104 search results (processing time 0.025 s)

DSP를 이용한 능동소음 제어시스템의 개발 (Development of Active Noise Control System using DSP)

  • Kim, H.S.;Shin, J.;Oh, J.E.
    • Journal of the Korean Society for Precision Engineering (한국정밀공학회지) / Vol. 11, No. 1 / pp. 108-113 / 1994
  • Active noise control outperforms conventional passive noise control in the low-frequency range (about 50 to 400 Hz). To make active noise control feasible, a controller is needed that can run the control algorithm in real time. In this study, therefore, a real-time controller is developed using the TMS320C25, a high-speed digital signal processor. Unlike conventional DSP boards of the fully add-on type, the developed controller can easily interface with another computer system through serial communication, which simplifies program development. Furthermore, it is designed to be readily detached and used as a standalone control device. Active noise control of a duct system is implemented with the filtered-x LMS algorithm to evaluate the performance of the developed controller.
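
The filtered-x LMS update named in the abstract above can be sketched as follows. This is a minimal simulation sketch, not the authors' TMS320C25 implementation: the primary and secondary path coefficients, the tap count, the step size `mu`, and the tonal reference signal are all illustrative assumptions, and the secondary-path model is assumed to match the true path exactly.

```python
import math

def fir(coeffs, history):
    """Output of an FIR filter; history[0] is the newest sample."""
    return sum(c * h for c, h in zip(coeffs, history))

def fxlms(x, primary, secondary, secondary_est, taps=8, mu=0.02):
    """Run filtered-x LMS on reference x; return the residual error sequence."""
    w = [0.0] * taps                      # adaptive controller weights
    x_hist = [0.0] * max(taps, len(primary), len(secondary_est))
    y_hist = [0.0] * len(secondary)       # controller output history
    fx_hist = [0.0] * taps                # filtered-reference history
    errors = []
    for sample in x:
        x_hist = [sample] + x_hist[:-1]
        d = fir(primary, x_hist)          # noise reaching the error microphone
        y = fir(w, x_hist)                # anti-noise before the secondary path
        y_hist = [y] + y_hist[:-1]
        e = d - fir(secondary, y_hist)    # residual after the anti-noise arrives
        fx = fir(secondary_est, x_hist)   # reference filtered by the path model
        fx_hist = [fx] + fx_hist[:-1]
        # LMS step using the filtered reference instead of the raw reference
        w = [wi + mu * e * fi for wi, fi in zip(w, fx_hist)]
        errors.append(e)
    return errors

# Hypothetical duct: a low-frequency tone, short FIR paths (assumed values).
x = [math.sin(2 * math.pi * 0.05 * n) for n in range(4000)]
errors = fxlms(x,
               primary=[0.0, 0.0, 0.8, 0.4],
               secondary=[0.0, 0.6, 0.3],
               secondary_est=[0.0, 0.6, 0.3])
early = sum(e * e for e in errors[:200]) / 200
late = sum(e * e for e in errors[-200:]) / 200
print(f"mean squared error: first 200 samples {early:.3e}, last 200 samples {late:.3e}")
```

Filtering the reference through the secondary-path model before the weight update is what distinguishes filtered-x LMS from plain LMS; it compensates for the delay between the controller output and the error microphone.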

음원 파라미터 모델과 인공신경망을 이용한 음성장애 검출 (Screening of Voice Disorder using Source Parameter Model and Artificial Neural Network)

  • 파벨시틸;조철우;미샤파벨
    • Speech Sciences (음성과학) / Vol. 15, No. 2 / pp. 89-97 / 2008
  • There are a number of clinical conditions that directly or indirectly affect the physical properties of the vocal folds and thereby the pressure waveforms of elicited sounds. If the relationships between the clinical conditions and the voice quality are sufficiently reliable, it should be possible to detect these diseases or disorders. The focus of this paper is to determine the set of features and their values that would characterize the state of the speaker's vocal folds. To the extent that these features can capture the anatomical, physiological, and neurological aspects of the speaker, they can potentially be used to mediate an unobtrusive approach to diagnosis. We will show a new approach to this problem supported with results obtained from two disordered voice corpora.

KMSAV: Korean multi-speaker spontaneous audiovisual dataset

  • Kiyoung Park;Changhan Oh;Sunghee Dong
    • ETRI Journal / Vol. 46, No. 1 / pp. 71-81 / 2024
  • Recent advances in deep learning for speech and visual recognition have accelerated the development of multimodal speech recognition, yielding many innovative results. We introduce a Korean audiovisual speech recognition corpus. This dataset comprises approximately 150 h of manually transcribed and annotated audiovisual data, supplemented with an additional 2000 h of untranscribed videos collected from YouTube under the Creative Commons License. The dataset is intended to be freely accessible for unrestricted research purposes. Along with the corpus, we propose an open-source framework for automatic speech recognition (ASR) and audiovisual speech recognition (AVSR). We validate the effectiveness of the corpus with evaluations using state-of-the-art ASR and AVSR techniques, capitalizing on both pretrained models and fine-tuning processes. After fine-tuning, ASR and AVSR achieve character error rates of 11.1% and 18.9%, respectively. This error difference highlights the need for improvement in AVSR techniques. We expect that our corpus will be an instrumental resource to support improvements in AVSR.
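
The character error rate quoted in the abstract above is conventionally the character-level edit distance between a reference transcript and a recognizer hypothesis, divided by the reference length. The sketch below shows that common definition only; the function names are hypothetical, and the abstract does not state how the authors normalize text (whitespace, punctuation) before scoring, so no normalization is applied here.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two strings, counted in characters."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (r != h)))  # substitution or match
        prev = curr
    return prev[-1]

def character_error_rate(ref, hyp):
    """Edit distance normalized by the reference length."""
    if not ref:
        raise ValueError("reference transcript must not be empty")
    return edit_distance(ref, hyp) / len(ref)

# One substituted character out of five reference characters.
print(character_error_rate("abcde", "abXde"))
```

Because insertions count as errors, this measure can exceed 1.0 when the hypothesis is much longer than the reference.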

Investigation of the Speech Intelligibility of Classrooms Depending on the Sound Source Location

  • Kim Jeong Tai;Haan Chan-Hoon
    • The Journal of the Acoustical Society of Korea / Vol. 24, No. 4E / pp. 139-143 / 2005
  • The present study investigates the effects of speaker location on speech intelligibility in a classroom. To this end, acoustic measurements were made in a classroom with three different sound source locations: the center of the front wall (FC), both sides of the front wall (FS), and the center of the ceiling (CC). SPL, RT, $D_{50}$, and RASTI were measured at 9 measurement points with the same sound power level at the source, and MLS was used as the source signal. In addition, subjective listening tests were carried out using Korean-language listening materials recorded in an anechoic chamber. The recorded syllables were replayed and re-recorded in the classroom with the same sound source at the three locations, and listening tests were administered to 20 respondents, who were asked to write down the syllables they heard in the classroom recordings. The results show that higher speech intelligibility ($D_{50}$ of $47\%$, RASTI of 0.56) was obtained when the sound source was located at the FS. The results also show that high speech intelligibility was obtained in the areas near the walls.

서라운드시스템을 위한 가상 음상정위 알고리즘 (Virtual Sound Localization algorithm for Surround Sound Systems)

  • 이신렬;한기영;이승래;성굉모
    • Acoustical Society of Korea: Conference Proceedings (한국음향학회:학술대회논문집) / Proceedings of the Acoustical Society of Korea 2004 Spring Conference, Vol. 23, No. 1 / pp. 81-84 / 2004
  • In this paper, we propose a virtual sound localization algorithm that improves sound localization accuracy and sound color preservation for two-channel and multi-channel surround speaker layouts. Under conventional CPP laws, the sound direction differs from the panning angle and the sound color differs from that of a real sound source, especially when the speakers are spread out widely. To overcome this drawback, we design a virtual sound localization algorithm using directional psychoacoustic criteria (DPC) and a sound color compensator (SCC). The analysis results show that, with the proposed system, the sound direction matches the panning angle in the audible frequency range and the sound color deviates less from a real sound source than with the conventional CPP law. In addition, its performance is verified by means of subjective tests using a real sound source.

A DSP Implementation of Subband Sound Localization System

  • Park, Kyusik
    • The Journal of the Acoustical Society of Korea / Vol. 20, No. 4E / pp. 52-60 / 2001
  • This paper describes a real-time implementation of a subband sound localization system on a floating-point DSP, the TI TMS320C31. The system determines the two-dimensional location of an active speaker in a closed room with real noise present. The system consists of a two-microphone array connected to a TI DSP hosted by a PC. The implemented sound localization algorithm is Subband CPSP, an improved version of the traditional CPSP (Cross-Power Spectrum Phase) method. The algorithm first splits the input speech signal into an arbitrary number of subbands using subband filter banks and calculates the CPSP in each subband. It then averages the CPSP results over the subbands and computes a source location estimate. The proposed algorithm has an advantage over CPSP in that it minimizes the overall estimation error in source location by confining band-specific dominant noise to its own subband. As a result, it makes it possible to set up a robust real-time sound localization system. For real-time simulation, the input speech is captured using two microphones and digitized by the DSP at a sampling rate of 8192 Hz with 16 bits per sample. The source location is then estimated once per second to satisfy real-time computational constraints. The performance of the proposed system is confirmed by several real-time simulations with speech at distances of 1 m, 2 m, and 3 m and various speech source locations, and it shows an accuracy improvement of over 5% in source location estimation.
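
The CPSP step described in the abstract above can be illustrated with a full-band sketch: the cross-power spectrum of the two microphone signals is normalized to unit magnitude so that only phase remains, and the peak of its inverse transform gives the inter-microphone delay. This is not the paper's subband method or its DSP code; the subband split and averaging are omitted, the function names are hypothetical, and a naive DFT is used only to keep the example dependency-free.

```python
import cmath
import random

def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform (adequate for short frames)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out

def cpsp_delay(x1, x2, eps=1e-12):
    """Return the lag in samples by which x2 trails x1, from the CPSP peak."""
    n = len(x1)
    X1, X2 = dft(x1), dft(x2)
    # Cross-power spectrum with the magnitude divided out: phase only.
    cross = [a.conjugate() * b for a, b in zip(X1, X2)]
    phase_only = [c / (abs(c) + eps) for c in cross]
    corr = [v.real for v in dft(phase_only, inverse=True)]
    peak = max(range(n), key=lambda i: corr[i])
    # Indices above n/2 represent negative lags (circular correlation).
    return peak if peak <= n // 2 else peak - n

# Synthetic test frame: the second microphone receives the same burst 5 samples later.
rng = random.Random(0)
burst = [rng.uniform(-1.0, 1.0) for _ in range(32)]
x1 = burst + [0.0] * 32
x2 = [0.0] * 5 + burst + [0.0] * 27
delay = cpsp_delay(x1, x2)
print("estimated delay in samples:", delay)
```

With a known microphone spacing and sampling rate, a delay of this kind converts to a direction-of-arrival estimate; a subband variant would compute `phase_only` per band and average the resulting correlations before picking the peak.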

휴머노이드 로봇을 위한 원거리 음성 인터페이스 기술 연구 (Distant-talking of Speech Interface for Humanoid Robots)

  • 이협우;육동석
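    • Korean Society of Phonetic Sciences and Speech Technology: Conference Proceedings (대한음성학회:학술대회논문집) / Proceedings of the 2007 Joint Conference with the Korean Association of Speech Sciences (대한음성학회 2007년도 한국음성과학회 공동학술대회 발표논문집) / pp. 39-40 / 2007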
  • For efficient interaction between humans and robots, the speech interface is a core problem, especially in noisy and reverberant conditions. This paper analyzes the main issues of a spoken language interface for humanoid robots, such as sound source localization, voice activity detection, and speaker recognition.

KHST 차량 벽면의 투과손실값 예측 (Transmission Loss Prediction of KHST's Wall)

  • Kim, Kwanju;Taejung Yoon
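    • Korean Society for Noise and Vibration Engineering: Conference Proceedings (한국소음진동공학회:학술대회논문집) / 2002 Autumn Conference Abstracts (한국소음진동공학회 2002년도 추계학술대회논문초록집) / pp. 317.2-317 / 2002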
  • The transmission loss of a KHST passenger vehicle was calculated using measured acoustic data. To verify the transmission loss results for the KHST case, a similar experiment was carried out under laboratory conditions, and its results were compared with those obtained by the geometric acoustic method. The computational results show good agreement with the transmission loss magnitude from the experiments. This paper also discusses points to consider in obtaining more accurate transmission loss values, i.e., how to ensure a reverberant field condition and how to select the source speaker's location.

Using Corpora for the Study of Word-Formation: A Case Study in English Negative Prefixation

  • Kwon, Heok-Seung
    • Korean Journal of English Language and Linguistics (한국영어학회지:영어학) / Vol. 1, No. 3 / pp. 369-386 / 2001
  • This paper will show that traditional approaches to the derivation of different negative words have been of an essentially hypothetical nature, based on either linguists' intuitions or rather scant evidence, and that native-speaker dictionary entries show meaning potentials (rather than meanings) which are in fact linguistic and cognitive prototypes. The purpose of this paper is to demonstrate that using a large corpus of natural language can provide better answers to questions about word-formation (i.e., with particular reference to negative prefixation) than any other source of information.

모음에 따른 화자의 음원특성 비교 (Comparison of Speaker's Source Characteristics in Different Vowel Characteristics)

  • 이후동;강선미;장문수;박한상
    • Korean Society of Laryngology, Phoniatrics and Logopedics: Conference Proceedings (대한음성언어의학회:학술대회논문집) / 19th Conference, 2003 (대한음성언어의학회 2003년도 제19회 학술대회) / pp. 240-240 / 2003
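  • This paper looks to phonation type for a speaker recognition parameter that, unlike existing parameters, reflects the speaker's unique characteristics. In general, a speaker's source characteristics determine the phonation type. Parameters that characterize phonation type include the open quotient and the spectral tilt, and the spectral tilt can be measured acoustically. However, existing measurement methods have failed to exclude, in whole or in part, the effects of the fundamental frequency, which differs from person to person, and of the vowel. (abridged)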
