• Title/Summary/Keyword: speech source


Mobile Robot with Artificial Olfactory Function

  • Kim, Jeong-Do;Byun, Hyung-Gi;Hong, Chul-Ho
    • Transactions on Control, Automation and Systems Engineering
    • /
    • v.3 no.4
    • /
    • pp.223-228
    • /
    • 2001
  • We have developed an intelligent mobile robot with an artificial olfactory function to recognize odours and to track odour source locations. The mobile robot is also equipped with a speech recognition and synthesis engine and is controlled via wireless communication. An artificial olfactory system based on an array of 7 gas sensors has been installed in the mobile robot for odour recognition, and 11 gas sensors are also located on the bottom of the robot to track odour sources. 3 optical sensors are included in the robot, which is driven by 2 DC motors, for collision avoidance while moving toward an odour source. Through the experimental trials, it is confirmed that the intelligent mobile robot is capable not only of odour recognition using an artificial neural network algorithm, but also of tracking the odour source using a step-by-step approach method. The preliminary results are promising: the developed intelligent mobile robot is applicable to service robot systems for environmental monitoring, localization of odour sources, odour tracking in hazardous areas, etc.
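The step-by-step approach to odour-source tracking can be sketched as a greedy climb on sensor readings: at each step the robot samples one step away in several directions and moves toward the strongest reading. The Gaussian concentration field, step size, and eight candidate directions below are illustrative assumptions, not the robot's actual sensor layout:

```python
import numpy as np

source = np.array([5.0, 3.0])                # odour source position (assumed)

def concentration(p):
    """Toy Gaussian plume: reading decays with distance from the source."""
    return np.exp(-np.sum((p - source) ** 2) / 10.0)

pos = np.array([0.0, 0.0])                   # robot start position
step = 0.2
angles = np.linspace(0, 2 * np.pi, 8, endpoint=False)
dirs = [np.array([np.cos(a), np.sin(a)]) for a in angles]

for _ in range(200):
    # read the field one step away in each direction; move toward the max
    readings = [concentration(pos + step * d) for d in dirs]
    pos = pos + step * dirs[int(np.argmax(readings))]
    if np.linalg.norm(pos - source) < step:
        break

print(np.round(pos, 2))
```

Because the toy field decreases monotonically with distance, the greedy rule converges to the source; a real plume would be turbulent and need the neural-network recognition stage the abstract describes.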


Blind Audio Source Separation Based On High Exploration Particle Swarm Optimization

  • KHALFA, Ali;AMARDJIA, Nourredine;KENANE, Elhadi;CHIKOUCHE, Djamel;ATTIA, Abdelouahab
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.13 no.5
    • /
    • pp.2574-2587
    • /
    • 2019
  • Blind Source Separation (BSS) is a technique used to separate supposedly independent source signals from a given set of observations. In this paper, the High Exploration Particle Swarm Optimization (HEPSO) algorithm, an enhancement of the Particle Swarm Optimization (PSO) algorithm, is used to separate a set of source signals. Compared to the PSO algorithm, the HEPSO algorithm relies on two additional operators: the first is based on the multi-crossover mechanism of the genetic algorithm, while the second relies on the bee colony mechanism. The two operators are employed to update the velocity and the position of the particles, respectively, and are thus used to find the optimal separating matrix. The proposed method enhances the overall efficiency of standard PSO in terms of exploration and performance. Based on many tests performed on speech and music signals supplied by the BSS demo, the experimental results confirm the robustness and accuracy of the introduced BSS technique.
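A minimal sketch of PSO searching for a separating matrix, assuming a 2×2 instantaneous mixture and using output decorrelation as a simplified surrogate for full statistical independence (HEPSO would add the multi-crossover and bee-colony operators on top of this basic loop):

```python
import numpy as np

rng = np.random.default_rng(0)

# Two synthetic stand-in sources (square wave and sinusoid) and a 2x2 mixture
t = np.linspace(0, 1, 2000)
s = np.vstack([np.sign(np.sin(2 * np.pi * 5 * t)), np.sin(2 * np.pi * 13 * t)])
A = np.array([[1.0, 0.6], [0.5, 1.0]])   # unknown mixing matrix
x = A @ s                                 # observed mixtures

def fitness(w_flat):
    """Cost of a candidate separating matrix: absolute correlation between
    the two separated outputs (truly independent sources are uncorrelated).
    Rows are normalized so the trivial W = 0 solution is excluded."""
    W = w_flat.reshape(2, 2)
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    if np.any(norms < 1e-8):
        return 1.0
    y = (W / norms) @ x
    y = y - y.mean(axis=1, keepdims=True)
    denom = np.sqrt((y[0] ** 2).sum() * (y[1] ** 2).sum()) + 1e-12
    return abs((y[0] * y[1]).sum()) / denom

# --- standard PSO velocity/position updates ---
n_particles, n_iter, dim = 30, 200, 4
pos = rng.uniform(-2, 2, (n_particles, dim))
vel = np.zeros((n_particles, dim))
pbest = pos.copy()
pbest_val = np.array([fitness(p) for p in pos])
gbest = pbest[pbest_val.argmin()].copy()

for _ in range(n_iter):
    r1, r2 = rng.random((2, n_particles, dim))
    vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
    pos = pos + vel
    vals = np.array([fitness(p) for p in pos])
    better = vals < pbest_val
    pbest[better], pbest_val[better] = pos[better], vals[better]
    gbest = pbest[pbest_val.argmin()].copy()

print(f"residual output correlation: {fitness(gbest):.4f}")
```

Decorrelation alone is weaker than the independence criteria used in real BSS objectives, but it keeps the sketch short while showing where the separating matrix enters the particle fitness.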

Robust Multi-channel Wiener Filter for Suppressing Noise in Microphone Array Signal (마이크로폰 어레이 신호의 잡음 제거를 위한 강인한 다채널 위너 필터)

  • Jung, Junyoung;Kim, Gibak
    • Journal of Broadcast Engineering
    • /
    • v.23 no.4
    • /
    • pp.519-525
    • /
    • 2018
  • This paper deals with noise suppression of multi-channel data captured by a microphone array using a multi-channel Wiener filter. The multi-channel Wiener filter does not rely on information about the direction of the target speech and can be partitioned into an MVDR (Minimum Variance Distortionless Response) spatial filter and a single-channel spectral filter. The acoustic transfer function between the single speech source and the microphones can be estimated by subspace decomposition of the multi-channel Wiener filter. Errors in the estimation of the correlation matrices propagate into the estimate of the acoustic transfer function, which in turn results in speech distortion in the MVDR filter. To alleviate this speech distortion, diagonal loading is applied. In the experiments, a database recorded with seven microphones was used, and the MFCC distance was measured to demonstrate the effectiveness of the diagonal loading.
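A small sketch of the diagonal-loading idea in an MVDR beamformer: adding δI to the estimated covariance before inversion keeps the distortionless constraint while shrinking the weight norm, so covariance estimation errors are amplified less. The steering vector, snapshot count, and loading level are illustrative assumptions, not the paper's setup:

```python
import numpy as np

rng = np.random.default_rng(1)
M = 7   # seven microphones, matching the database used in the paper
# Assumed steering vector: half-wavelength line array, source at 20 degrees
d = np.exp(-1j * np.pi * np.arange(M) * np.sin(np.deg2rad(20)))

# Covariance estimated from only 50 snapshots -> it carries estimation errors
snap = (rng.standard_normal((M, 50)) + 1j * rng.standard_normal((M, 50))) / np.sqrt(2)
R = snap @ snap.conj().T / 50

def mvdr_weights(R, d, loading=0.0):
    """MVDR weights w = R^-1 d / (d^H R^-1 d); `loading` adds delta*I
    (diagonal loading) so the inverse is robust to covariance errors."""
    Rl = R + loading * np.eye(len(d))
    Rinv_d = np.linalg.solve(Rl, d)
    return Rinv_d / (d.conj() @ Rinv_d)

w_plain = mvdr_weights(R, d)
w_loaded = mvdr_weights(R, d, loading=0.1 * np.real(np.trace(R)) / M)

# Both satisfy the distortionless constraint w^H d = 1; the loaded weights
# have a smaller norm, i.e. they amplify estimation errors less.
print(abs(w_plain.conj() @ d), np.linalg.norm(w_plain), np.linalg.norm(w_loaded))
```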

Design of Channel Coding Combined with 2.4kbps EHSX Coder (2.4kbps EHSX 음성부호화기와 결합된 채널코딩 방법)

  • Lee, Chang-Hwan;Kim, Young-Joon;Lee, In-Sung
    • The Journal of the Korea Contents Association
    • /
    • v.10 no.9
    • /
    • pp.88-96
    • /
    • 2010
  • We propose an efficient channel coding method combined with a 2.4 kbps speech coder. The channel code rate is 1/2, and the rate-1/2 convolutional code is obtained by puncturing a rate-1/3 convolutional code. The punctured convolutional code is used for variable rate allocation: the puncturing pattern is chosen according to the importance of the source encoder's output data, where importance is determined by evaluating the bit error sensitivity of the speech parameter bits. The performance of the proposed coder is analyzed and simulated over Rayleigh fading and AWGN channels. Experimental results with the 2.4 kbps EHSX coder show that the variable-rate channel coding method is superior to the non-variable method in terms of subjective speech quality.
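The puncturing idea can be sketched as follows: a rate-1/3 convolutional encoder produces three coded bits per input bit, and a repeating deletion pattern drops some of them to reach rate 1/2. The generator polynomials and puncturing pattern here are illustrative assumptions, not the ones paired with the EHSX coder:

```python
def conv_encode_r13(bits, gens=(0o7, 0o5, 0o3), k=3):
    """Rate-1/3 convolutional encoder with constraint length k:
    each input bit yields three coded bits, one per generator polynomial."""
    state = 0
    out = []
    for b in bits:
        state = ((state << 1) | b) & ((1 << k) - 1)
        for g in gens:
            out.append(bin(state & g).count("1") % 2)   # parity of tapped bits
    return out

def puncture(coded, pattern=(1, 1, 0, 1, 0, 1)):
    """Delete coded bits where the repeating pattern is 0; keeping 4 of
    every 6 bits turns the rate-1/3 code into a rate-1/2 code."""
    return [c for i, c in enumerate(coded) if pattern[i % len(pattern)]]

info = [1, 0, 1, 1, 0, 0, 1, 0]      # 8 information bits
coded = conv_encode_r13(info)         # 24 coded bits at rate 1/3
sent = puncture(coded)                # 16 transmitted bits at rate 1/2
print(len(info), len(coded), len(sent))
```

Varying the pattern per frame is what allows more protection for perceptually sensitive speech parameter bits and less for robust ones, as the abstract describes.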

An Adaptive Microphone Array with Linear Phase Response (선형 위상 특성을 갖는 적응 마이크로폰 어레이)

  • Kang, Hong-Gu;Youn, Dae-Hui;Cha, Il-Hwan
    • The Journal of the Acoustical Society of Korea
    • /
    • v.11 no.3
    • /
    • pp.53-60
    • /
    • 1992
  • Many adaptive beamforming methods have been studied for interference cancellation and speech signal enhancement in teleconferencing and auditoriums. The main concern of adaptive beamforming for speech signal processing differs from that of radar, sonar, and seismic signal processing, because the desired output signal must sound natural to the human ear. Considering that the phase of speech is largely imperceptible to the human ear, Sondhi proposed a nonlinear constrained optimization technique whose constraint was placed on the magnitude transfer function from the source to the output. In real environments, however, the phase response of the speech signal does affect the human auditory system, so it is desirable to design a linear-phase system. In this paper, a linear-phase beamformer is proposed, along with a sample-by-sample processing algorithm for real-time operation. Simulation results show that the proposed algorithm yields more consistent beam patterns and deeper nulls in the noise direction than Sondhi's method.
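The linear-phase property the paper relies on can be checked numerically: a symmetric FIR filter has exactly linear phase, i.e. constant group delay of (N-1)/2 samples, so constraining the beamformer filters to be symmetric leaves the speech phase undistorted. The taps below are arbitrary illustrative values, not the paper's beamformer coefficients:

```python
import numpy as np

h = np.array([0.1, 0.3, 0.5, 0.3, 0.1])          # symmetric taps, N = 5
w = np.linspace(0.1, np.pi - 0.1, 50)             # frequency grid (rad/sample)
# Frequency response H(w) = sum_n h[n] exp(-j w n)
H = np.array([np.sum(h * np.exp(-1j * wi * np.arange(len(h)))) for wi in w])
phase = np.unwrap(np.angle(H))
group_delay = -np.diff(phase) / np.diff(w)        # -d(phase)/dw
print(group_delay[:3])   # constant at (5 - 1) / 2 = 2 samples
```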


Multi-Modal Biometrics System for Ubiquitous Sensor Network Environment (유비쿼터스 센서 네트워크 환경을 위한 다중 생체인식 시스템)

  • Noh, Jin-Soo;Rhee, Kang-Hyeon
    • Journal of the Institute of Electronics Engineers of Korea CI
    • /
    • v.44 no.4 s.316
    • /
    • pp.36-44
    • /
    • 2007
  • In this paper, we implement a speech and face recognition system that supports various ubiquitous sensor network application services, such as switch control and authentication, using wireless audio and image interfaces. The proposed system consists of hardware with audio and image sensors and software comprising a speech recognition algorithm using a psychoacoustic model and a face recognition algorithm using PCA (Principal Components Analysis), together with LDPC (Low Density Parity Check) coding. The speech and face recognition systems run on a host PC to use the sensor energy effectively. To improve the accuracy of speech and face recognition, we implement an FEC (Forward Error Correction) system. We also optimized the simulation coefficients and the test environment to effectively remove wireless channel noise and correct wireless channel errors. As a result, when the distance between the audio sensor and the voice source is less than 1.5 m, the FAR and FRR are 0.126% and 7.5%, respectively. With the face recognition algorithm limited to two attempts, the GAR and FAR are 98.5% and 0.036%, respectively.
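A minimal sketch of PCA-based matching in the spirit of the eigenface approach mentioned above: images are projected onto the top principal components of the training set and compared by distance in that subspace. The 8×8 "face" data here is synthetic and purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
faces = rng.standard_normal((20, 64))    # 20 synthetic flattened "images"

mean = faces.mean(axis=0)
# eigenfaces = top right singular vectors of the centred data matrix
_, _, vt = np.linalg.svd(faces - mean, full_matrices=False)
components = vt[:5]                      # keep 5 eigenfaces

def project(img):
    """Coordinates of an image in the 5-dimensional eigenface space."""
    return components @ (img - mean)

# A slightly noisy copy of training image 7 should match image 7:
probe = faces[7] + 0.01 * rng.standard_normal(64)
dists = [np.linalg.norm(project(probe) - project(f)) for f in faces]
print(int(np.argmin(dists)))
```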

A Study on the Relationship between the Mencius and the Maeumron of Donguisusebowon - focusing on Mencius Chapter 3 - (『동의수세보원(東醫壽世保元)』의 마음론과 『맹자(孟子)』의 상관성 고찰 - 제3권 「공손축장구상(公孫丑章句上)」을 중심으로 -)

  • Lim, Byeong-Hak;Choi, Gu-Won;Yun, Su-Jeong
    • Journal of Sasang Constitutional Medicine
    • /
    • v.31 no.1
    • /
    • pp.13-25
    • /
    • 2019
  • Objectives: Lee Je-Ma's Sasang philosophy, discussed in Donguisusebowon, comprises Maeumron (the theory of mind) and Qi philosophy. Sasang philosophy has its direct origin in the Mencius, which discusses the Confucian theory of mind. Therefore, the relationship between the Maeumron of Donguisusebowon and Mencius Chapter 3 is examined. Methods: Materials and references were collected through a literature survey of Lee Je-Ma's books, such as Donguisusebowon and Gyeokchigo, and of Confucian texts including the Mencius. Results and Conclusions: Hoyeonjiqi in the 'Sadanron' of Donguisusebowon encompasses both the Qi of metaphysical personality and physiological Qi, and can be seen as the Qi that fuses body and mind together. Benevolence, righteousness, propriety, and wisdom are directly linked to the personal mind of Sasangin. Unlike Mencius's four clues of virtue, the Sadan is discussed in terms of the large and small organs of the four constitutions, namely the lungs, spleen, liver, and kidneys. Next, regarding the relationship between the Taesim of 'Whoakchungron' and the greed of the Mencius, when the relationship between 'I understand language' and Taesim is considered, deceptive speech corresponds to the dogmatism of Soeumin, licentious speech to the indulgence of Taeumin, crooked speech to the laziness of Soyangin, and evasive speech to the selfishness of Taeyangin. Likewise, in the relationship between delight, gladness without real cause, idleness, and arrogance and Taesim, delight corresponds to the dogmatism of Soeumin, gladness without real cause to the selfishness of Taeyangin, idleness to the laziness of Soyangin, and arrogance to the indulgence of Taeumin. Finally, the four types of people in 'Gwangjeseol' coincide with the Mencius, and 'loving benevolence and enjoying goodness' and 'envying benevolence and being jealous of talent' in the 'Gwangjeseol' can be traced directly to the Mencius.

Effect of Speech Degradation and Listening Effort in Reverberating and Noisy Environments Given N400 Responses

  • Kyong, Jeong-Sug;Kwak, Chanbeom;Han, Woojae;Suh, Myung-Whan;Kim, Jinsook
    • Korean Journal of Audiology
    • /
    • v.24 no.3
    • /
    • pp.119-126
    • /
    • 2020
  • Background and Objectives: In distracting listening conditions, individuals need to pay extra attention to selectively listen to the target sounds. To investigate the amount of listening effort required in reverberating and noisy backgrounds, a semantic mismatch was examined. Subjects and Methods: Electroencephalography was performed in 18 voluntary healthy participants using a 64-channel system to obtain N400 latencies. They were asked to listen to sounds and view letters in a 2 (reverberation) × 2 (noise) paradigm (i.e., Q-0 ms, Q-2000 ms, 3 dB-0 ms, and 3 dB-2000 ms). With auditory-visual pairings, the participants were required to answer whether the auditory primes and letter targets did or did not match. Results: Q-0 ms revealed the shortest N400 latency, whereas the latency was significantly increased at 3 dB-2000 ms. Further, Q-2000 ms showed an approximately 47 ms delayed latency compared to 3 dB-0 ms. Interestingly, the presence of reverberation significantly increased N400 latencies. Under the distracting conditions, both noise and reverberation involved stronger frontal activation. Conclusions: The current distracting listening conditions could interrupt semantic mismatch processing in the brain. The presence of reverberation, specifically a 2000 ms delay, necessitates additional mental effort, as evidenced by the delayed N400 latency and the involvement of frontal sources in this study.


Speech source estimation using AMDF (AMDF를 이용한 화자위치 추정)

  • 송도훈
    • Proceedings of the Acoustical Society of Korea Conference
    • /
    • 1998.06e
    • /
    • pp.193-196
    • /
    • 1998
  • In this study, to automatically control a camera in a remote videoconferencing system, the speaker's speech signal is picked up by an array of four microphones and the speaker's location is estimated from those signals. The AMDF algorithm is used to reduce the computational load of the TDE (Time Delay Estimation) computed from the microphone signals. For each microphone output signal, the time delay is obtained with the AMDF algorithm and the DOA (Direction of Arrival) is computed; the speaker's position in space is then estimated through spatial geometric calculation. The accuracy of speaker localization using the AMDF algorithm was examined in numerical simulations using the vowel /a/ as a test signal, and with an announcer's utterances recorded in an ordinary lecture room with reflections.
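The AMDF-based time-delay estimation can be sketched as follows, assuming a synthetic test tone and an integer sample delay between two microphones (the paper's four-microphone array and the spatial geometry step are not reproduced here):

```python
import numpy as np

fs = 8000                                    # assumed sample rate (Hz)
t = np.arange(0, 0.1, 1 / fs)
src = np.sin(2 * np.pi * 200 * t) * np.hanning(len(t))   # stand-in for speech

true_delay = 12                              # inter-microphone delay, samples
mic1 = src
mic2 = np.concatenate([np.zeros(true_delay), src[:-true_delay]])

def amdf_delay(x, y, max_lag):
    """AMDF time-delay estimate: the lag minimizing mean |x[n] - y[n+lag]|.
    Unlike cross-correlation, the AMDF uses only subtractions and absolute
    values, which is what reduces the computational load of the TDE step."""
    n = len(x) - max_lag
    amdf = [np.mean(np.abs(x[:n] - y[lag:lag + n])) for lag in range(max_lag + 1)]
    return int(np.argmin(amdf))

print(amdf_delay(mic1, mic2, max_lag=40))
```

Converting such pairwise delays to a DOA, and intersecting the DOAs from several microphone pairs, yields the speaker position described in the abstract.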
