• Title/Summary/Keyword: sound information

Search Result 1,716, Processing Time 0.031 seconds

The research on the MEMS device improvement which is necessary for the noise environment in the speech recognition rate improvement (잡음 환경에서 음성 인식률 향상에 필요한 MEMS 장치 개발에 관한 연구)

  • Yang, Ki-Woong;Lee, Hyung-keun
    • Journal of the Korea Institute of Information and Communication Engineering / v.22 no.12 / pp.1659-1666 / 2018
  • When the input is a mixture of voice and other sound, the noise lowers the speech recognition rate. To overcome the limits of software processing, this work improves the recognition rate by improving the MEMS device, which is the hardware component. The MEMS microphone is a voice input device that is implemented and used in various forms. Conventional MEMS microphones generally perform well, but in special environments such as noisy ones, processing performance deteriorates because voice and other sound are mixed. To overcome these problems, we developed a newly designed MEMS device that can detect voice characteristics at the initial input device.

Context-Awareness Cat Behavior Captioning System (반려묘의 상황인지형 행동 캡셔닝 시스템)

  • Chae, Heechan;Choi, Yoona;Lee, Jonguk;Park, Daihee;Chung, Yongwha
    • Journal of Korea Multimedia Society / v.24 no.1 / pp.21-29 / 2021
  • With the recent increase in the number of households raising pets, various engineering studies on pets have been underway. The ultimate goal of this study is to automatically generate context-aware captions that express the implicit intentions behind a cat's behavior and sound, by embedding the already mature pet behavior detection technology as a basic building block in video captioning research. As a pilot project toward this goal, this paper proposes a high-level captioning system that uses optical-flow, RGB, and sound information from cat videos. The proposed system uses video datasets collected in an actual rearing environment to extract feature vectors from the video and sound, and then uses a hierarchical LSTM encoder and decoder to identify the cat's behavior and its implicit intentions and to learn to generate context-sensitive captions. The performance of the proposed system was verified experimentally using video data collected in an environment where cats are actually raised.

Efficient Sound Processing and Synthesis in VR Environment Using Curl Vector of Obstacle Object (장애물 객체의 회전 벡터를 이용한 VR 환경에서의 효율적인 음향 처리 및 합성)

  • Park, Seong-A;Park, Soyeon;Kim, Jong-Hyun
    • Proceedings of the Korean Society of Computer Information Conference / 2022.01a / pp.369-372 / 2022
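  • This paper proposes a method for efficient sound processing and synthesis in a VR environment using the curl vector of an obstacle object. In the real world, when a sound meets an obstacle, it spreads and propagates according to the obstacle's shape. To reproduce this behavior in a virtual reality environment, the method approximates the propagation of sound from two inputs: the position of the obstacle object and the position of the sound source. To compute the rotation of sound that appears near corners, the rotation is extracted from the curl vector of the obstacle, and the obstacle shape is convolved to model how sound propagates outward. In addition, the distance between the obstacle and the sound vector and the distance between the sound source and the sound vector are computed to attenuate the sound's magnitude. Finally, an external vector, a vector field that spreads around the obstacle, is synthesized to set the direction of the vectors spreading outward from the obstacle. Sound produced with the proposed method forms a sound vector field that spreads according to the distance to the obstacle and its shape; the attenuation pattern changes with the sound position, and the flow is adjusted according to the obstacle's shape. The experiments show that the method closely reproduces the changes and patterns that real-world sound exhibits depending on the shape of an obstacle.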

Efficient and Secure Sound-Based Hybrid Authentication Factor with High Usability

  • Mohinder Singh B;Jaisankar N.
    • KSII Transactions on Internet and Information Systems (TIIS) / v.17 no.10 / pp.2844-2861 / 2023
  • "Internet" is among the most widely used words today. Over the years, people have become more dependent on the internet because it makes their work easier. It has become part of everyone's life as a means of communication in almost every area, including financial transactions, education, and personal health. A large amount of data is being digitized and placed online. Many researchers have proposed authentication factors, biometric and/or non-biometric, as the first line of defense to secure online data. Among these factors, passwords and passphrases are used by many users around the world. However, their usability is low, and passwords are susceptible to brute-force and dictionary attacks. This paper proposes generating a novel passcode from a hybrid authentication factor: sound. The strength of the proposed passcode against brute-force and dictionary attacks is evaluated using the Shannon entropy and passcode (or password) entropy formulae, and its usability is also evaluated. The entropy value of the proposed passcode is 658.2, which is higher than that of other authentication factors: 13.2 for a 6-digit PIN, 101.4 for a password with passphrase combined with keystroke dynamics, 193 for fingerprint, and 30 for voice biometrics. The proposed passcode compares favorably with other authentication factors in both strength and usability.
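
The two entropy measures named in the abstract can be sketched as follows. This is a minimal illustration of the textbook formulae only: the function names are hypothetical, and the sketch does not attempt to reproduce the figures reported in the abstract, which come from the paper's own evaluation.

```python
import math
from collections import Counter


def password_entropy_bits(length, alphabet_size):
    """Textbook password entropy: E = L * log2(R) for L symbols drawn from R choices."""
    return length * math.log2(alphabet_size)


def shannon_entropy_bits(text):
    """Shannon entropy per symbol, H = -sum(p * log2(p)), from observed symbol frequencies."""
    counts = Counter(text)
    total = len(text)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    # Hypothetical example: an 8-character password over lowercase letters.
    print(password_entropy_bits(8, 26))
    # Two equally frequent symbols carry one bit per symbol.
    print(shannon_entropy_bits("aabb"))
```

Password entropy grows with both length and alphabet size, while Shannon entropy depends only on how evenly the symbols of a given string are distributed.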

Sound's Direction Detection and Speech Recognition System for Humanoid Active Audition

  • Kim, Hyun-Don;Choi, Jong-Suk;Lee, Chang-Hoon;Park, Gwi-Tea;Kim, Mun-Sang
    • 제어로봇시스템학회:학술대회논문집 / 2003.10a / pp.633-638 / 2003
  • In this paper, we propose a humanoid active audition system that detects the direction of sound and performs speech recognition using just three microphones. Compared with previous research, this system has a simpler algorithm, fewer microphones, and a better amplifier, and it shows better performance. To verify the system's performance, we installed the proposed active audition system on the home service robot Hombot II, developed at KIST (Korea Institute of Science and Technology), and confirmed excellent performance through experimental results.
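
The abstract does not describe the detection algorithm itself. As a rough illustration of how direction can be estimated from microphone signals, the sketch below uses a common generic approach, which is an assumption and not the authors' method: find the time delay between one microphone pair by cross-correlation, then convert it to an arrival angle under a far-field model. The function names, the 343 m/s speed of sound, and the microphone spacing are all illustrative.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def estimate_delay_samples(ref, sig, max_lag):
    """Return the lag (in samples) at which sig best matches ref.

    A positive lag means sig arrives later than ref.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, value in enumerate(ref):
            j = i + lag
            if 0 <= j < len(sig):
                score += value * sig[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def arrival_angle_deg(delay_samples, sample_rate, mic_distance):
    """Far-field angle from broadside for one microphone pair: asin(c * tau / d)."""
    ratio = SPEED_OF_SOUND * delay_samples / sample_rate / mic_distance
    ratio = max(-1.0, min(1.0, ratio))  # clamp against rounding and noise
    return math.degrees(math.asin(ratio))


if __name__ == "__main__":
    ref = [0, 0, 1, 0, 0, 0, 0, 0]
    sig = [0, 0, 0, 0, 1, 0, 0, 0]  # same click, two samples later
    lag = estimate_delay_samples(ref, sig, max_lag=3)
    print(lag, arrival_angle_deg(lag, sample_rate=16000, mic_distance=0.1))
```

A single pair only resolves the angle relative to its own axis; combining pairs, as a three-microphone array allows, is one way to remove that ambiguity.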

Improved methods for measuring early reflections from Five-channel room impulse response using newly introduced Peak-Detecting algorithm

  • Kim Lae-Hoon;Doo Sejin;Oh Yangki;Lee Heewon;Sung Koeng-Mo
    • Proceedings of the Acoustical Society of Korea Conference / spring / pp.439-442 / 2000
  • When measuring the acoustical properties of a room with a multiple-microphone system, it is important to determine the exact time delays of the early reflections from the impulse response pairs. However, it is often very difficult to identify the early reflections in their natural shape, because the waveform may be deformed by the characteristics of the sound source (loudspeaker), the microphone, and the reflecting wall, and by the overlapping of multiple waveforms. To obtain more accurate and sufficient early reflections, this paper proposes a new five-channel sound receiving system and introduces a peak-detecting algorithm. The system has microphones mounted at the origin and at the four vertices of a regular tetrahedron. The newly introduced peak-detecting algorithm can show the exact peak position in each channel despite deformation due to reflecting walls, loudspeaker, and microphone.
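
The abstract does not specify how the peak-detecting algorithm works. The sketch below is only a naive baseline, a thresholded local-maximum search on the magnitude of one channel's impulse response, to illustrate what "peak position" and "time delay" mean here. The function names and the threshold are hypothetical, and this is not the authors' algorithm.

```python
def find_peaks(response, threshold):
    """Return indices of local maxima of |response| that reach the threshold."""
    peaks = []
    for i in range(1, len(response) - 1):
        magnitude = abs(response[i])
        if (magnitude >= threshold
                and magnitude > abs(response[i - 1])
                and magnitude >= abs(response[i + 1])):
            peaks.append(i)
    return peaks


def peak_delays_seconds(response, threshold, sample_rate):
    """Convert peak sample indices to arrival times in seconds."""
    return [i / sample_rate for i in find_peaks(response, threshold)]


if __name__ == "__main__":
    # Toy impulse response: direct sound at sample 2, one inverted reflection at sample 5.
    channel = [0.0, 0.1, 1.0, 0.2, 0.0, -0.6, 0.1, 0.0]
    print(find_peaks(channel, threshold=0.5))
    print(peak_delays_seconds(channel, threshold=0.5, sample_rate=48000))
```

A baseline like this breaks down exactly where the abstract says the difficulty lies, namely when reflections overlap or are deformed by the loudspeaker, microphone, or wall.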

Realtime Stereo Sound Image Expansion System Using Hass Effect& Phase shifting (선착효과 및 위상처리를 이용한 실시간 스테레오 음상 확장 시스템 구현)

  • 이종철;이상훈
    • Proceedings of the IEEK Conference / 1998.10a / pp.1227-1230 / 1998
  • Phase control methods are generally used to expand the sound image in AV systems. However, these methods are effective only for signals below 1 kHz, and the listener must be located at the front center of the speaker system. In this paper, we implement a real-time processing system in which the phase shifting method is dominant at low frequencies and the precedence effect is dominant at high frequencies. Two sound cards are used to process the audio signal in real time as 16-bit stereo at a 44.1 kHz sampling frequency, and an analog circuit is designed to perform the phase shifting. Experiments confirm the usefulness of the proposed stereo system.
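
The precedence (Haas) effect is usually exploited by delaying one channel by a short interval so that the sound appears to come from the earlier channel. The sketch below shows only that delay step at the 44.1 kHz rate mentioned in the abstract. The 10 ms value and the function names are assumptions for illustration; the abstract does not give the delay used or how the frequency bands are split.

```python
def delay_in_samples(delay_ms, sample_rate=44100):
    """Convert a delay in milliseconds to a whole number of samples."""
    return round(delay_ms * sample_rate / 1000)


def apply_delay(channel, num_samples):
    """Delay a channel by num_samples, padding with silence and keeping the length."""
    if num_samples <= 0:
        return list(channel)
    return ([0.0] * num_samples + list(channel))[:len(channel)]


if __name__ == "__main__":
    n = delay_in_samples(10)  # hypothetical 10 ms inter-channel delay
    left = [1.0, 0.5, 0.25, 0.0] + [0.0] * n
    right = apply_delay(left, n)  # the right channel lags the left
    print(n, right[n:n + 3])
```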

Measurement of Reflection Coefficient of Sound Absorbent Material with Respect to Angle of Incidence and Its Associated Errors (입사각에 따른 흡음재의 반사 계수 측정 방법론 및 오차에 대한 고찰)

  • 이수열;김상렬;김양한
    • Journal of KSNVE / v.4 no.3 / pp.295-305 / 1994
  • The reflection coefficient of a material at oblique incidence is measured in a free field. The sound pressure distributions are measured at discrete points on two measurement lines and then decomposed into plane wave components using the spatial Fourier transform. The incident and reflected plane wave components are obtained from a set of "decomposition equations" based on plane wave propagation theory. Numerical simulations and experiments have been performed to see the effect of the finite size of the measurement area. To reduce this effect, a window function has been proposed, and its effect on the measurement of sound absorbing material properties has been studied as well. The reflection coefficient obtained by this method is compared with those obtained from other methods: the 2-microphone method in a duct, and an empirical equation that determines the characteristic impedance ρc and propagation constant k of a material from flow resistance information.

MDS-based Localization Reflecting Depth, Temperature, and Salinity of Ocean in Underwater Acoustic Sensor Networks(UWASNs) (수중 센서 네트워크에서 수심, 수온, 염도를 고려한 환경에서 MDS를 이용한 위치인식 연구)

  • Jung, Hui-Sok;Kim, Eun-Chan;Yang, Yeon-Mo
    • IEMEK Journal of Embedded Systems and Applications / v.7 no.4 / pp.187-191 / 2012
  • Interest in underwater acoustic sensor networks (UWASNs) has grown sharply in recent years, driven by marine resource exploration and climate change monitoring. To collect information from sensor nodes deployed randomly underwater, Multi-Dimensional Scaling (MDS) based localization methods have recently been introduced; these assume that the speed of sound underwater is constant. However, underwater sound speed varies with environmental factors such as depth, temperature, and salinity. In this paper, we propose a method that takes into account the environmental factors that influence underwater sound speed, and we introduce an experimental setup that can track these factors.
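
To show how much the constant-speed assumption can matter, the sketch below evaluates one widely cited empirical approximation, Medwin's formula, for sound speed as a function of temperature, salinity, and depth. The abstract does not say which model the paper uses, so the choice of formula, the function names, and the 1500 m/s constant are assumptions for illustration.

```python
def sound_speed_medwin(temp_c, salinity_ppt, depth_m):
    """Medwin's approximation of sound speed in seawater, in m/s.

    Intended roughly for 0-35 degrees C, salinity 0-45 ppt, depth 0-1000 m.
    """
    t = temp_c
    return (1449.2 + 4.6 * t - 0.055 * t ** 2 + 0.00029 * t ** 3
            + (1.34 - 0.010 * t) * (salinity_ppt - 35.0)
            + 0.016 * depth_m)


def range_from_travel_time(travel_time_s, speed_m_s):
    """One-way distance implied by an acoustic travel time."""
    return travel_time_s * speed_m_s


if __name__ == "__main__":
    c = sound_speed_medwin(temp_c=10.0, salinity_ppt=35.0, depth_m=100.0)
    constant = 1500.0  # a typical fixed value, assumed here for comparison
    # Difference in estimated range for a 1 s one-way travel time.
    print(c, range_from_travel_time(1.0, constant) - range_from_travel_time(1.0, c))
```

Because MDS localization starts from pairwise distance estimates, a per-link speed error of this kind feeds directly into the distance matrix.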

Multi-Pulse Amplitude and Location Estimation by Maximum-Likelihood Estimation in MPE-LPC Speech Synthesis (MPE-LPC음성합성에서 Maximum- Likelihood Estimation에 의한 Multi-Pulse의 크기와 위치 추정)

  • 이기용;최홍섭;안수길
    • Journal of the Korean Institute of Telematics and Electronics / v.26 no.9 / pp.1436-1443 / 1989
  • In this paper, we propose a maximum-likelihood estimation (MLE) method to obtain the locations and amplitudes of the pulses in MPE (multi-pulse excitation)-LPC speech synthesis, which uses multi-pulses as the excitation source. The MLE method computes the values that maximize the likelihood function with respect to the unknown parameters (amplitude and position of the pulses) for the observed data sequence. Thus, in the case of overlapped pulses, the method is equivalent to Ozawa's cross-correlation method, resulting in the same amount of computation and the same sound quality as the cross-correlation method. Computer simulation shows that the multi-pulses obtained by the MLE method are (1) pseudo-periodic in pitch for voiced sound, (2) random for unvoiced sound, and (3) change from random to periodic in the interval where the original speech signal changes from unvoiced to voiced. The short-time power spectra of the original speech and of the speech synthesized with multi-pulse excitation are quite similar to each other at the formants.
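
For orientation, a cross-correlation style search for a single pulse can be sketched as below: for each candidate position, correlate the target signal with the synthesis filter's impulse response, then keep the position and least-squares amplitude that remove the most error energy. This is a generic single-pulse illustration with hypothetical names, without perceptual weighting, and it is not the paper's MLE derivation.

```python
def best_single_pulse(target, impulse_response):
    """Return (position, amplitude) of the one pulse that best fits target.

    For a pulse at position m, the least-squares amplitude is cross / energy,
    and the error energy removed is cross**2 / energy.
    """
    n = len(target)
    best_position, best_amplitude, best_gain = 0, 0.0, -1.0
    for m in range(n):
        cross = 0.0   # correlation of target with the shifted impulse response
        energy = 0.0  # energy of the shifted, frame-truncated impulse response
        for k, h in enumerate(impulse_response):
            if m + k >= n:
                break
            cross += target[m + k] * h
            energy += h * h
        if energy == 0.0:
            continue
        gain = cross * cross / energy
        if gain > best_gain:
            best_position, best_amplitude, best_gain = m, cross / energy, gain
    return best_position, best_amplitude


if __name__ == "__main__":
    h = [1.0, 0.5]                             # toy synthesis-filter impulse response
    target = [0.0, 0.0, 0.0, 2.0, 1.0, 0.0]    # a pulse of amplitude 2 at position 3
    print(best_single_pulse(target, h))
```

Multi-pulse schemes typically repeat a search like this, subtracting each found pulse's contribution from the target before looking for the next one.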
