Robust Speech Recognition Algorithm of Voice Activated Powered Wheelchair for Severely Disabled Person (중증 장애우용 음성구동 휠체어를 위한 강인한 음성인식 알고리즘)
- The Journal of the Acoustical Society of Korea / v.26 no.6 / pp.250-258 / 2007
Current speech recognition technology has achieved high performance with the development of hardware devices; however, it is insufficient for some applications where high reliability is required, such as voice control of powered wheelchairs for disabled persons. A system that aims to operate a powered wheelchair safely by voice in a real environment must reject non-voice inputs such as the user's coughing, breathing, and spark-like mechanical noise, and must recognize speech commands affected by disability, which have a specific pronunciation speed and frequency. In this paper, we propose a non-voice rejection method that performs voice/non-voice classification using both YIN-based fundamental frequency (F0) extraction and reliability in preprocessing. We adopted a multi-template dictionary and acoustic-model-based speaker adaptation to cope with the pronunciation variation of inarticulately uttered speech. In recognition tests conducted with data collected in a real environment, the proposed YIN-based fundamental frequency extraction showed a recall-precision rate of 95.1%, better than the 62% of the cepstrum-based method. A recognition test of the new system with the multi-template dictionary and MAP adaptation also showed a much higher accuracy of 99.5%, compared with 78.6% for the baseline system.
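The voice/non-voice decision described in this abstract can be illustrated with a minimal sketch of the core YIN steps (difference function, cumulative mean normalized difference, absolute threshold). This is not the authors' implementation: the function name `yin_f0`, the search range of 60 to 400 Hz, and the threshold of 0.15 are assumptions chosen for illustration, and the returned aperiodicity value is only one possible reading of the "reliability" measure mentioned above.

```python
import numpy as np

def yin_f0(frame, sr, fmin=60.0, fmax=400.0, threshold=0.15):
    """Return (f0_hz, aperiodicity) for one frame.

    f0_hz is None when no lag falls below the threshold, which this sketch
    treats as a non-voice frame. All default values are illustrative.
    """
    frame = np.asarray(frame, dtype=float)
    tau_min = int(sr / fmax)          # shortest lag searched
    tau_max = int(sr / fmin)          # longest lag searched
    w = len(frame) - tau_max          # integration window length
    if w <= 0:
        raise ValueError("frame too short for the chosen fmin")

    # Step 1: difference function d(tau) = sum_j (x[j] - x[j + tau])^2
    d = np.zeros(tau_max + 1)
    for tau in range(1, tau_max + 1):
        diff = frame[:w] - frame[tau:tau + w]
        d[tau] = np.dot(diff, diff)

    # Step 2: cumulative mean normalized difference, d'(0) = 1 by definition
    cmnd = np.ones(tau_max + 1)
    running = np.cumsum(d[1:])
    cmnd[1:] = d[1:] * np.arange(1, tau_max + 1) / np.maximum(running, 1e-12)

    # Step 3: first lag below the threshold, then walk down to the local minimum
    tau = tau_min
    while tau <= tau_max:
        if cmnd[tau] < threshold:
            while tau + 1 <= tau_max and cmnd[tau + 1] < cmnd[tau]:
                tau += 1
            return sr / tau, float(cmnd[tau])
        tau += 1

    # No periodic lag found: report the frame as non-voice
    return None, float(np.min(cmnd[tau_min:]))
```

A frame must contain more than `sr / fmin` samples. Under these assumptions a strongly periodic frame yields an F0 estimate with a small aperiodicity value, while broadband noise such as a mechanical spark returns `None` and could be rejected before recognition.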
In this paper, we propose a 3-D geometric localization method for a near-field broadband source in shallow-water environments. According to waveguide invariant theory, the slope of the interference pattern seen in a sensor spectrogram is directly proportional to the range of the source. The ratio of the source ranges at two sensors is estimated by matching the two interference patterns in their spectrograms. This ratio is then applied to the Apollonius circle, the locus of source positions whose range ratio to two sensors is constant. Two Apollonius circles from three sensors produce an intersection point that gives the horizontal range and azimuth angle of the source. This intersection point is independent of source depth. The source depth can therefore be estimated using the 3-D hyperboloid equation, on which the range difference from two sensors is constant. To evaluate the performance of the proposed localization algorithm, a simulation is performed using an acoustic propagation program and the localization error is analyzed. In the simulation results, the estimation errors for range and depth are within 50 m and 15 m, respectively.
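The horizontal localization step can be sketched geometrically. For sensors at A and B and a range ratio k = |P-A| / |P-B| with k not equal to 1, the Apollonius circle has center (A - k^2 B) / (1 - k^2) and radius k |A-B| / |1 - k^2|. The code below is a minimal 2-D sketch under the assumption that the range ratios are already known; the function names and the sensor layout in the usage note are hypothetical, and the waveguide-invariant ratio estimation and the depth step are not reproduced.

```python
import numpy as np

def apollonius_circle(a, b, k):
    """Circle of points P with |P - a| / |P - b| = k; returns (center, radius)."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    if np.isclose(k, 1.0):
        raise ValueError("k = 1 gives the perpendicular bisector, not a circle")
    center = (a - k**2 * b) / (1.0 - k**2)
    radius = k * np.linalg.norm(a - b) / abs(1.0 - k**2)
    return center, radius

def circle_intersections(c1, r1, c2, r2):
    """Intersection points of two circles in the plane (empty list if none)."""
    d = np.linalg.norm(c2 - c1)
    if d == 0 or d > r1 + r2 or d < abs(r1 - r2):
        return []
    a = (r1**2 - r2**2 + d**2) / (2 * d)      # distance from c1 to the chord
    h = np.sqrt(max(r1**2 - a**2, 0.0))       # half-length of the chord
    u = (c2 - c1) / d                         # unit vector from c1 to c2
    p = c1 + a * u                            # foot of the chord on the center line
    n = np.array([-u[1], u[0]])               # unit normal to the center line
    return [p + h * n, p - h * n]

def locate_horizontal(sensors, k12, k13):
    """Candidate source positions from ratios r1/r2 and r1/r3 at three sensors."""
    s1, s2, s3 = (np.asarray(s, float) for s in sensors)
    c_a, r_a = apollonius_circle(s1, s2, k12)
    c_b, r_b = apollonius_circle(s1, s3, k13)
    return circle_intersections(c_a, r_a, c_b, r_b)
```

With hypothetical sensors at (0, 0), (1000, 0) and (0, 1000) m and a source at (300, 700) m, the true position appears among the returned candidates. Two circles generally intersect in two points, so a practical system would need extra information (for example a third ratio) to pick one.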
Sonobuoys are disposable devices that utilize sound waves for information gathering, detecting engine noises, and capturing various acoustic characteristics. They play a crucial role in accurately detecting underwater targets, making them effective detection systems in anti-submarine warfare. Existing sonobuoy deployment methods in multistatic systems often rely on fixed patterns or heuristic-based rules, and are inefficient in terms of the number of sonobuoys deployed and operational time because of the unpredictable mobility of underwater targets. Thus, this paper proposes an optimal sonobuoy placement strategy for Unmanned Aerial Vehicles (UAVs) to overcome the limitations of conventional sonobuoy deployment methods. The proposed approach utilizes reinforcement learning in a simulation-based experimental environment that considers the movements of the underwater targets. The Unity ML-Agents framework is employed, and the Proximal Policy Optimization (PPO) algorithm is utilized for UAV learning in a virtual operational environment with real-time interactions. The reward function is designed to consider the number of sonobuoys deployed and the cost associated with sound sources and receivers, enabling effective learning. Compared with conventional sonobuoy deployment methods in the same experimental environment, the proposed reinforcement learning-based deployment strategy demonstrates superior performance in terms of detection success rate, deployed sonobuoy count, and operational time.
Among the Foley sound generation models that have recently begun to be studied, sound generation using the Vector Quantized-Variational AutoEncoder (VQ-VAE) structure together with a generation model such as Pixelsnail is one of the important research subjects. On the other hand, in the field of deep learning-based acoustic signal compression, residual vector quantization is reported to be more suitable than the conventional VQ-VAE structure. Therefore, in this paper, we study whether residual vector quantization can be effectively applied to Foley sound generation. To tackle the problem, this paper applies the residual vector quantization technique to the conventional VQ-VAE-based Foley sound generation model and, in particular, derives a model that is compatible with existing models such as Pixelsnail and does not increase computational resource consumption. To evaluate the model, an experiment was conducted using DCASE2023 Task7 data. The results show that the proposed model improves the Fréchet audio distance by about 0.3. Unfortunately, the performance enhancement was limited, which is believed to be due to the reduction in time-frequency resolution made in order not to increase computational resource consumption.
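Residual vector quantization, as referred to in this abstract, quantizes a vector in stages: each stage picks the nearest codeword to whatever the previous stages left unexplained. The following is a minimal NumPy sketch of that idea only. The function names and the tiny hand-made codebooks are hypothetical, and nothing here reflects the paper's trained model or its integration with VQ-VAE or Pixelsnail.

```python
import numpy as np

def rvq_encode(x, codebooks):
    """Encode x as one codeword index per stage, quantizing the running residual."""
    residual = np.asarray(x, dtype=float).copy()
    indices = []
    for cb in codebooks:
        dists = np.sum((cb - residual) ** 2, axis=1)  # squared distance to each codeword
        idx = int(np.argmin(dists))
        indices.append(idx)
        residual = residual - cb[idx]                 # pass what is left to the next stage
    return indices

def rvq_decode(indices, codebooks):
    """Reconstruct the vector as the sum of the selected codewords."""
    return sum(cb[i] for cb, i in zip(codebooks, indices))

# Illustrative two-stage codebooks: a coarse grid followed by a finer correction grid
coarse = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
fine = 0.25 * np.array([[0, 0], [1, 0], [0, 1], [1, 1],
                        [-1, 0], [0, -1], [-1, -1], [-1, 1]], dtype=float)
codebooks = [coarse, fine]

x = np.array([0.9, 0.2])
codes = rvq_encode(x, codebooks)
x_hat = rvq_decode(codes, codebooks)
err_one_stage = np.linalg.norm(x - coarse[codes[0]])
err_two_stage = np.linalg.norm(x - x_hat)
```

In this toy example the second stage lowers the reconstruction error relative to the coarse stage alone, which is the property that makes stacked small codebooks attractive compared with a single large one.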
The present study investigates the level difference of floor impact noise of composite floor structures using EVA resilient materials. To this end, four types of resilient materials (Flat type, Deck type, Cavity type and Mount type) were designed by combining PET, PP sheet and EVA mounts. In total, 9 different samples were made for acoustic measurements, which were carried out twice with a bang machine and an impact ball as the heavy-weight floor impact noise sources. All floor impact noise measurements were undertaken at an authentication institution. As a result, for the Flat and Cavity types, heavy-weight floor impact noise was reduced by an additional 2 dB to 5 dB when PET was added, while floor impact noise larger than 50 dB was obtained when a single resilient material was used. In particular, the highest performance was obtained for the Mount type, with 1st grade for light-weight floor impact noise and 2nd grade for heavy-weight floor impact noise. This is attributed to the low-density PET sound absorption material that fills the space around the EVA mounts. These results are also considered to be due to impact absorption by both the EVA mounts and the air cavity between the EVA mounts and the PP sheet. It was also found that at least 36 EVA mounts per 1 m² of resilient panel give further reduction of heavy-weight floor impact noise.
An understanding of the influence of emotional context on memory retrieval is crucial to our comprehensive understanding of human cognition. While previous research focused primarily on visual stimuli to address this relationship, this study ventures into the realm of speech-based emotional contexts. Building on previous findings, we examine the effects of arousal and the valence of verbal contexts on memory, with particular focus on mitigating the serial position effect. In Study 1, we investigated how the arousal level of verbal context in the middle of a word list affects memory retention. Our results demonstrated detriment to the memory of later parts of the word list when exposed to low-arousal contexts. In Study 2, we controlled for arousal levels and examined the impact of valence on memory. We found that negative verbal contexts impair the memory of the word when presented together. Our findings suggest that speech-based emotional contexts do not facilitate verbal memory processing. In particular, negative emotional contexts were found to reinforce the serial position effect. Negative emotional contexts tend to disrupt task performance and fail to elicit memory-enhancing effects, especially when both the context and memory stimulus are verbal. These insights offer a valuable contribution to our understanding of the nuances of auditorily delivered emotional context in verbal memory processes.
The importance of active sonar systems is growing because underwater targets are becoming quieter and ambient noise is rising with increased maritime traffic. However, the low signal-to-noise ratio of the echo signal, caused by multipath propagation, various clutter, ambient noise and reverberation, makes it difficult to identify underwater targets using active sonar. Attempts have been made to apply data-based methods such as machine learning or deep learning to improve the performance of underwater target recognition systems, but it is difficult to collect enough data for training due to the nature of sonar datasets. Methods based on mathematical modeling have mainly been used to compensate for insufficient active sonar data, but they have limitations in accurately simulating complex underwater phenomena. Therefore, in this paper, we propose a sonar signal synthesis method based on a deep neural network. To apply the neural network model to sonar signal synthesis, the proposed method adapts to sonar signals the attention-based encoder and decoder, which are the main modules of the Tacotron model widely used in speech synthesis. Training the proposed model on a dataset collected by placing a simulated target in an actual marine environment makes it possible to synthesize a signal more similar to the actual signal. To verify the performance of the proposed method, a Perceptual Evaluation of Audio Quality test was conducted, and the score difference from the actual signal was within -2.3 in a total of four different environments. These results indicate that the active sonar signal generated by the proposed method approximates the actual signal.
The wall shear stress in the vicinity of end-to-end anastomoses under steady flow conditions was measured using a flush-mounted hot-film anemometer (FMHFA) probe. The experimental measurements were in good agreement with numerical results except in flow with low Reynolds numbers. The wall shear stress increased proximal to the anastomosis in flow from the Penrose tubing (simulating an artery) to the PTFE graft. In flow from the PTFE graft to the Penrose tubing, low wall shear stress was observed distal to the anastomosis. Abnormal distributions of wall shear stress in the vicinity of the anastomosis, resulting from the compliance mismatch between the graft and the host artery, might be an important factor in the formation of anastomotic neointimal fibrous hyperplasia (ANFH) and in graft failure. The present study suggests a correlation between regions of low wall shear stress and the development of ANFH in end-to-end anastomoses.
Air pressure decay (APD) rate and ultrafiltration rate (UFR) tests were performed on new and saline-rinsed dialyzers as well as those reused in patients several times. C-DAK 4000 (Cordis Dow) and CF IS-11 (Baxter Travenol) reused dialyzers obtained from the dialysis clinic were used in the present study. The new dialyzers exhibited a relatively flat APD, whereas saline-rinsed and reused dialyzers showed a considerable amount of decay. C-DAK dialyzers had a larger APD(11.70