• Title/Summary/Keyword: speech enhancement

Search Results: 340

Research on Chinese Microblog Sentiment Classification Based on TextCNN-BiLSTM Model

  • Haiqin Tang;Ruirui Zhang
    • Journal of Information Processing Systems
    • /
    • v.19 no.6
    • /
    • pp.842-857
    • /
    • 2023
  • Currently, most sentiment classification models on microblogging platforms analyze sentence parts of speech and emoticons without fully comprehending users' emotional inclinations or grasping finer nuances. This study proposes a hybrid sentiment analysis model. Given the distinct nature of microblog comments, the model employs a combined stop-word list and word2vec for word vectorization. To mitigate local information loss, a TextCNN without pooling layers is employed for local feature extraction, while a BiLSTM extracts contextual features; microblog comment sentiments are then categorized by a classification layer. Given the binary classification task at the output layer and the numerous hidden layers within the BiLSTM, the Tanh activation function is adopted in this model. Experimental findings demonstrate that the enhanced TextCNN-BiLSTM model attains a precision of 94.75%, an improvement of 1.21%, 1.25%, and 1.25% in precision, recall, and F1 over the standalone TextCNN model, and of 0.78%, 0.9%, and 0.9% over the standalone BiLSTM model.
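
As a rough illustration of the architecture this abstract describes, a minimal PyTorch sketch follows: a TextCNN branch without pooling layers feeds local features into a BiLSTM, with Tanh activations and a binary classification layer. The layer sizes, kernel width, and the exact way the two branches are chained are assumptions, not values taken from the paper.

```python
# Illustrative sketch only; hyperparameters and the branch combination are assumed.
import torch
import torch.nn as nn

class TextCNNBiLSTM(nn.Module):
    def __init__(self, vocab_size, embed_dim=300, num_filters=128,
                 kernel_size=3, hidden_dim=128, num_classes=2):
        super().__init__()
        # Embedding layer; in the paper the vectors come from word2vec
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        # Convolution with "same" padding: local features per position, no pooling layer
        self.conv = nn.Conv1d(embed_dim, num_filters, kernel_size, padding=kernel_size // 2)
        # BiLSTM consumes the local feature sequence and models context
        self.bilstm = nn.LSTM(num_filters, hidden_dim, batch_first=True, bidirectional=True)
        self.tanh = nn.Tanh()                      # activation used in the paper
        self.fc = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, token_ids):                  # token_ids: (batch, seq_len)
        x = self.embedding(token_ids)              # (batch, seq_len, embed_dim)
        local = self.tanh(self.conv(x.transpose(1, 2)))    # (batch, filters, seq_len)
        seq, _ = self.bilstm(local.transpose(1, 2))        # (batch, seq_len, 2*hidden)
        context = self.tanh(seq[:, -1, :])         # last step summarizes the sequence
        return self.fc(context)                    # logits for the two sentiment classes
```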

Efficient Implementation of SVM-Based Speech/Music Classifier by Utilizing Temporal Locality (시간적 근접성 향상을 통한 효율적인 SVM 기반 음성/음악 분류기의 구현 방법)

  • Lim, Chung-Soo;Chang, Joon-Hyuk
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.49 no.2
    • /
    • pp.149-156
    • /
    • 2012
  • Support vector machines (SVMs) are well known for their pattern recognition capability, but proper care should be taken to alleviate their inherent implementation cost, which results from high computational intensity and memory requirements, especially in embedded systems where only limited resources are available. Since the memory requirement, determined by the dimensionality and the number of support vectors, is generally too high for a cache in embedded systems to accommodate, frequent accesses to main memory occur whenever the cache cannot provide the requested data to the processor. These frequent main-memory accesses degrade overall performance and increase energy consumption, because a memory access typically takes longer and consumes more energy than a cache or register access. In this paper, we propose a technique that reduces the number of main-memory accesses by optimizing the data access pattern of the SVM-based classifier so that the temporal locality of the accesses increases, fully utilizing data already loaded onto the processor chip. With experiments, we confirm the enhancement made by the proposed technique in terms of the number of memory accesses, overall execution time, and energy consumption.
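
The core of such an optimization is a data-access pattern with higher temporal locality: data already loaded on-chip is reused before new data is fetched. The sketch below illustrates the general idea with a cache-blocked evaluation of an RBF-kernel SVM over many input frames; the blocking scheme and parameters are generic assumptions, not the paper's exact memory layout.

```python
# A minimal sketch of cache-blocked SVM evaluation (temporal-locality idea only).
import numpy as np

def svm_decision_blocked(frames, support_vectors, alphas, bias, gamma, block=64):
    """Evaluate an RBF-kernel SVM on many input frames.

    Support vectors are visited in blocks so that each block, once loaded,
    is reused against every frame before the next block is fetched,
    instead of streaming all support vectors anew for every frame.
    """
    scores = np.full(len(frames), bias, dtype=np.float64)
    for start in range(0, len(support_vectors), block):
        sv_blk = support_vectors[start:start + block]      # stays cache-resident
        a_blk = alphas[start:start + block]
        # squared Euclidean distances between every frame and the block of SVs
        d2 = ((frames[:, None, :] - sv_blk[None, :, :]) ** 2).sum(axis=2)
        scores += (a_blk * np.exp(-gamma * d2)).sum(axis=1)
    return scores  # sign(scores) gives the speech/music decision
```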

Implementation of Adaptive Feedback Cancellation Algorithm for Multichannel Digital Hearing Aid (다채널 디지털 보청기에 적용 가능한 Adaptive Feedback Cancellation 알고리즘 구현)

  • Jeon, Shin-Hyuk;Ji, You-Na;Park, Young-Cheol
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology
    • /
    • v.10 no.1
    • /
    • pp.102-110
    • /
    • 2017
  • In this paper, we implement a real-time adaptive feedback cancellation (AFC) algorithm that can be applied to multichannel digital hearing aids. Multichannel digital hearing aids typically use an FFT-filterbank-based Wide Dynamic Range Compression (WDRC) algorithm to compensate for hearing loss. The implemented real-time acoustic feedback cancellation algorithm shares the same FFT filterbank with the WDRC in one integrated structure, which is beneficial in terms of the computational load that affects hearing aid battery life. In addition, when the AFC fails to operate properly due to nonlinear input and output, a reduction gain is applied to improve robustness in practical environments. The implemented algorithm can be further improved by adding various signal processing algorithms such as speech enhancement.
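
A conventional way to realize AFC is an adaptive filter that estimates the feedback path from the receiver back to the microphone and subtracts the predicted feedback. The sketch below shows a simplified time-domain NLMS version with a crude reduction-gain guard; the paper's implementation instead runs inside the FFT filterbank shared with the WDRC, so this is only an illustrative assumption.

```python
# Simplified time-domain AFC sketch; the divergence test is an assumption.
import numpy as np

def afc_nlms(mic, receiver, taps=64, mu=0.01, eps=1e-8, reduction_gain=0.5):
    """Cancel acoustic feedback of the receiver signal picked up by the microphone."""
    w = np.zeros(taps)                         # estimated feedback-path impulse response
    buf = np.zeros(taps)                       # most recent receiver (loudspeaker) samples
    env = eps                                  # running envelope of the microphone input
    out = np.zeros(len(mic))
    for n in range(len(mic)):
        buf = np.roll(buf, 1)
        buf[0] = receiver[n]
        y_hat = w @ buf                        # predicted feedback component
        e = mic[n] - y_hat                     # feedback-cancelled output sample
        w += mu * e * buf / (buf @ buf + eps)  # NLMS adaptation of the feedback path
        env = 0.99 * env + 0.01 * abs(mic[n])
        # crude robustness guard: attenuate when the error grows abnormally large
        out[n] = e if abs(e) < 4.0 * env else reduction_gain * e
    return out
```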

Vocal separation method using weighted β-order minimum mean square error estimation based on kernel back-fitting (커널 백피팅 알고리즘 기반의 가중 β-지수승 최소평균제곱오차 추정방식을 적용한 보컬음 분리 기법)

  • Cho, Hye-Seung;Kim, Hyoung-Gook
    • The Journal of the Acoustical Society of Korea
    • /
    • v.35 no.1
    • /
    • pp.49-54
    • /
    • 2016
  • In this paper, we propose a vocal separation method using weighted β-order minimum mean square error estimation (WbE) based on a kernel back-fitting algorithm. In speech enhancement, it is well known that WbE outperforms existing Bayesian estimators such as the minimum mean square error (MMSE) estimator of the short-time spectral amplitude (STSA) and the MMSE estimator of the logarithm of the STSA (LSA), in terms of both objective and subjective measures. In the proposed method, WbE is applied to a basic iterative kernel back-fitting algorithm to improve vocal separation from a monaural music signal. The experimental results show that the proposed method achieves better separation performance than other existing methods.
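
For orientation, the sketch below shows a bare-bones iterative kernel back-fitting loop on a magnitude spectrogram, with a simple Wiener-type gain standing in for the paper's WbE estimator; the STFT settings, median-filter kernels, and iteration count are all assumptions.

```python
# Compact kernel back-fitting sketch; the gain step approximates where WbE would apply.
import numpy as np
from scipy.ndimage import median_filter
from scipy.signal import stft, istft

def separate_vocals(x, fs, n_iter=3, eps=1e-10):
    f, t, X = stft(x, fs, nperseg=2048)
    mag2 = np.abs(X) ** 2
    voc, acc = mag2 / 2.0, mag2 / 2.0           # initial source power estimates
    for _ in range(n_iter):
        # kernel step: assumed kernels, vocals smoothed across frequency,
        # accompaniment smoothed across time
        voc = median_filter(voc, size=(17, 1))
        acc = median_filter(acc, size=(1, 17))
        # back-fitting step: re-estimate each source with a spectral gain on the mixture
        # (the paper applies the WbE gain here instead of this Wiener gain)
        g_voc = voc / (voc + acc + eps)
        voc = g_voc * mag2
        acc = (1.0 - g_voc) * mag2
    _, vocals = istft(g_voc * X, fs, nperseg=2048)
    return vocals
```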

A Clinical Study of Treatment with Scalp Acupuncture for Learning Disorders (학습장애 아동의 두침 병행 치료 효과에 대한 임상적 연구)

  • Lee, Yu-Jin;Yoo, Song-Wun;Lee, Su-Bin;Ko, In-Sung;Park, Se-Jin
    • Journal of Oriental Neuropsychiatry
    • /
    • v.24 no.2
    • /
    • pp.145-154
    • /
    • 2013
  • Objectives: The purpose of this study is to examine the effects of treatment with scalp acupuncture for children with learning disorders. Methods: We evaluated the Korea Standard Progressive Matrices test (K-SPM) on 24 children with learning disorders who visited a Korean medical center neuropsychiatry outpatient clinic from July 2012 to January 2013. Scalp acupuncture, cognitive enhancement therapy, and speech-language therapy were applied. All children were treated twice a week for 4 months, and we compared K-SPM scores before treatment and after 30 treatment sessions. Results: 1) After treatment, K-SPM scores increased significantly (p<0.05), and the number of children in grade 5 (<5%) decreased from 14 to 6. 2) Comparing K-SPM scores between the group with a medical history and the group without, the scores in both groups increased significantly (p<0.05). 3) We also divided the children into two age groups, under the age of 13 and over the age of 13, and compared K-SPM scores. Although the scores in both groups increased, only the scores of the younger group (under the age of 13) increased significantly (p<0.05). Conclusions: Treatment with scalp acupuncture was shown to be an effective intervention for improving K-SPM scores in children with learning disorders.

An Enhancement of Learning Speed of the Error-Backpropagation Algorithm (오류 역전도 알고리즘의 학습속도 향상기법)

  • Shim, Bum-Sik;Jung, Eui-Yong;Yoon, Chung-Hwa;Kang, Kyung-Sik
    • The Transactions of the Korea Information Processing Society
    • /
    • v.4 no.7
    • /
    • pp.1759-1769
    • /
    • 1997
  • The error backpropagation (EBP) algorithm for multi-layered neural networks is widely used in areas such as associative memory, speech recognition, pattern recognition, and robotics. Nevertheless, many researchers have continued to publish improvements over the original EBP algorithm, mainly because EBP becomes exceedingly slow when the number of neurons and the size of the training set are large. In this study, we developed new learning-speed acceleration methods using a variable learning rate, a variable momentum rate, and a variable slope for the sigmoid function. During learning, these parameters are adjusted continuously according to the total error of the network, and the methods have been shown to significantly reduce learning time relative to the original EBP. To show the efficiency of the proposed methods, we first used binary data produced by a random number generator and observed large improvements in terms of epochs. We also applied the methods to the binary-valued Monk's data, 4-, 5-, 6-, and 7-bit parity checkers, and the real-valued Iris data, which are well-known benchmark training sets in machine learning.
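
The abstract's central idea is that the learning rate, momentum, and sigmoid slope are not fixed but are re-tuned from the network's total error during training. A schematic sketch of such an error-driven schedule is shown below; the specific factors and bounds are illustrative assumptions, not the paper's rules.

```python
# Schematic parameter schedule; factors and limits are assumed, not from the paper.
def adapt_training_parameters(total_error, prev_error, lr, momentum, slope):
    """Adjust learning rate, momentum, and sigmoid slope from the network's total error.

    The sigmoid with variable slope is f(x) = 1 / (1 + exp(-slope * x)).
    """
    if total_error < prev_error:                 # error decreasing: speed up
        lr *= 1.05
        momentum = min(momentum * 1.02, 0.95)
        slope = min(slope * 1.01, 2.0)           # steeper sigmoid sharpens activations
    else:                                        # error increased: back off
        lr *= 0.7
        momentum *= 0.5
        slope = max(slope * 0.99, 0.5)
    return lr, momentum, slope
```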


Noise Statistics Estimation Using Target-to-Noise Contribution Ratio for Parameterized Multichannel Wiener Filter (변수내장형 다채널 위너필터를 위한 목적신호대잡음 기여비를 이용한 잡음추정기법)

  • Hong, Jungpyo
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.26 no.12
    • /
    • pp.1926-1933
    • /
    • 2022
  • The parameterized multichannel Wiener filter (PMWF) is a linear filter that can control the trade-off between residual noise and signal distortion through an embedded parameter. To apply the PMWF to noisy inputs, accurate noise estimation is important, and multichannel minima-controlled recursive averaging (MMCRA) is widely used. However, the accuracy of MMCRA noise estimation decreases when a directional interference is present in the array inputs, and the performance of the PMWF is consequently degraded. Therefore, in this paper we propose a noise power spectral density (PSD) estimation method for the PMWF. The proposed method is a consecutive process of eigenvalue decomposition of the noisy input PSD, estimation of the target-component contribution using directional information, and exponential weighting for improved estimation of the target contribution. For evaluation, four objective measures were compared with MMCRA, and we verified that the PMWF with the proposed noise estimation method improves performance in environments where directional interferences exist.
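
For context, the PMWF weights for one frequency bin are commonly written as w = (Phi_n^{-1} Phi_x u_ref) / (beta + tr(Phi_n^{-1} Phi_x)), where Phi_n and Phi_x are the noise and target PSD matrices, u_ref selects the reference microphone, and beta is the embedded trade-off parameter. The sketch below computes these weights and adds a generic recursive noise-PSD update; that update is a stand-in, not the proposed target-to-noise contribution ratio method.

```python
# PMWF weight computation for one bin; the noise-PSD update below is a generic stand-in.
import numpy as np

def pmwf_weights(phi_y, phi_n, beta=1.0, ref_mic=0):
    """PMWF weights from the noisy-input PSD phi_y and noise PSD phi_n (M x M, one bin)."""
    phi_x = phi_y - phi_n                        # target (speech) PSD estimate
    num = np.linalg.solve(phi_n, phi_x)          # phi_n^{-1} phi_x
    lam = np.trace(num).real                     # generalized SNR term
    return num[:, ref_mic] / (beta + lam)        # beta trades distortion vs. residual noise

def update_noise_psd(phi_n, y, speech_absent_prob, alpha=0.95):
    """Recursively update the noise PSD, weighting the update by P(speech absent)."""
    a = alpha + (1.0 - alpha) * (1.0 - speech_absent_prob)
    return a * phi_n + (1.0 - a) * np.outer(y, y.conj())

# The enhanced reference-channel output per bin is then: s_hat = w.conj() @ y
```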

A Study on the Development Plan to Increase Supplement of Voice over Internet Protocol (인터넷전화의 보급 확산을 위한 발전방안에 관한 연구)

  • Park, Jae-Yong
    • Management & Information Systems Review
    • /
    • v.28 no.3
    • /
    • pp.191-210
    • /
    • 2009
  • The Internet was first designed only for sending data, but over time it evolved into a broadband multimedia web capable of transmitting sound, video, high-capacity data, and more, driven by user demand and rapidly changing communication technology. Domestically, Saerom C&T launched a free VoIP service in January 2000, but with limited calling modes (PC to PC), the absence of a revenue model, and poor speech quality, it soon hit its growth limit. This research studied VoIP in light of technological enhancement in high-speed Internet access. According to IDC, the size of the domestic Internet telephony market was 80,800 million in 2008, accounting for 12.5% of the whole voice-communication market. VoIP can maximize its profit by connecting wired and wireless networks, and it has a chance of becoming a firm-concentrated monopoly market by converging with IPTV. Considering that MVNO revitalization is still insignificant in Korea, regulatory organizations will play a significant role in balancing profits between large and small businesses. Further research should be done to give VoIP a secure footing to prosper and become popularized.


A Correlational Study on Activities of Daily Living, Self-efficacy, Stroke Specific Quality of Life and Need for Self-help Management Programs for Patients with Hemiplegia at Home (재가 뇌졸중환자의 일상생활활동, 자기효능감, 삶의 질, 자조관리프로그램요구도와의 관계에 관한 연구)

  • Kim Keum-Soon
    • Journal of Korean Academy of Fundamentals of Nursing
    • /
    • v.8 no.1
    • /
    • pp.81-94
    • /
    • 2001
  • The purpose of this study was to identify levels of activities of daily living (ADL), self-efficacy, stroke-specific quality of life, and need for a self-help management program among patients with hemiplegia at home. Data were collected from June to November 2000, and the subjects were 88 poststroke patients who lived in Seoul and Kyunggi-do. The questionnaire consisted of 5 scales: activities of daily living, self-efficacy, stroke-specific quality of life, and need for a self-help management program. Data were analyzed using frequencies, percentages, paired t-tests, and Pearson's correlation coefficients with the SAS (version 6.12) program. The results are as follows: 1) Most subjects were partially independent in ADL, but they needed assistance with dressing, bathing, meal preparation, and housekeeping. 2) The mean self-efficacy score was 54.89 (range: 1 to 80), and individual differences were large. 3) Satisfaction on the stroke-specific quality of life scale totaled 65.8%, a comparatively low value, especially for social role (51.4%), family functioning (58.3%), and mood (62.2%). 4) The highest needs for self-help management programs were for physical therapy, stress management, and range-of-motion exercise, and the lowest were for elimination management and training, family counseling, and speech therapy. 5) Among the demographic variables, sex showed significant differences on the dependent variables: females had higher scores than males for IADL, self-efficacy, stroke-specific quality of life, and need for self-help management. 6) Age had a high negative correlation with ADL, self-efficacy, and stroke-specific quality of life, and was also correlated with need for self-help management. In conclusion, there was a high correlation among ADL, self-efficacy, and quality of life in poststroke patients at home. Patients with a stroke also had a strong need for self-help management programs, especially physical therapy and stress management. Therefore, rehabilitation programs based on self-efficacy enhancement need to be developed in order to promote independent living for patients with hemiplegia.


Deep Learning Architectures and Applications (딥러닝의 모형과 응용사례)

  • Ahn, SungMahn
    • Journal of Intelligence and Information Systems
    • /
    • v.22 no.2
    • /
    • pp.127-142
    • /
    • 2016
  • A deep learning model is a kind of neural network that allows multiple hidden layers. There are various deep learning architectures, such as convolutional neural networks, deep belief networks, and recurrent neural networks. These have been applied to fields like computer vision, automatic speech recognition, natural language processing, audio recognition, and bioinformatics, where they have been shown to produce state-of-the-art results on various tasks. Among these architectures, convolutional neural networks and recurrent neural networks are classified as supervised learning models, and in recent years these supervised models have gained more popularity than unsupervised models such as deep belief networks, because supervised models have produced successful applications in the fields mentioned above. Deep learning models can be trained with the backpropagation algorithm. Backpropagation is an abbreviation for "backward propagation of errors" and is a common method of training artificial neural networks, used in conjunction with an optimization method such as gradient descent. The method calculates the gradient of an error function with respect to all the weights in the network; the gradient is fed to the optimization method, which in turn uses it to update the weights in an attempt to minimize the error function. Convolutional neural networks use a special architecture that is particularly well adapted to classifying images. This architecture makes convolutional networks fast to train, which in turn helps us train deep, multi-layer networks that are very good at classifying images. These days, deep convolutional networks are used in most neural networks for image recognition. Convolutional neural networks use three basic ideas: local receptive fields, shared weights, and pooling. By local receptive fields, we mean that each neuron in the first (or any) hidden layer is connected to a small region of the input (or previous layer's) neurons. Shared weights mean that the same weights and bias are used for each of the local receptive fields, so all the neurons in a hidden layer detect exactly the same feature, just at different locations in the input image. In addition to the convolutional layers just described, convolutional neural networks also contain pooling layers, usually placed immediately after convolutional layers; pooling layers simplify the information in the output of the convolutional layer. Recent convolutional network architectures have 10 to 20 hidden layers and billions of connections between units. Training deep networks took weeks several years ago, but thanks to progress in GPUs and algorithmic enhancements, training time has been reduced to several hours. Neural networks with time-varying behavior are known as recurrent neural networks, or RNNs. A recurrent neural network is a class of artificial neural network in which connections between units form a directed cycle. This creates an internal state of the network, which allows it to exhibit dynamic temporal behavior. Unlike feedforward neural networks, RNNs can use their internal memory to process arbitrary sequences of inputs. Early RNN models turned out to be very difficult to train, harder even than deep feedforward networks. The reason is the unstable-gradient problem, i.e., vanishing and exploding gradients: the gradient can get smaller and smaller as it is propagated back through the layers, which makes learning in the early layers extremely slow. The problem actually gets worse in RNNs, since gradients are propagated backward not only through layers but also through time; if the network runs for a long time, the gradient can become extremely unstable and hard to learn from. It has become possible to incorporate an idea known as long short-term memory units (LSTMs) into RNNs; LSTMs make it much easier to get good results when training RNNs, and many recent papers make use of LSTMs or related ideas.
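
To make the three convolutional ideas and the recurrent/LSTM discussion concrete, here is a toy PyTorch illustration; the layer sizes and input shape are arbitrary choices, not tied to any model discussed in the text.

```python
# Toy illustration: local receptive fields + shared weights (Conv2d), pooling, and an LSTM.
import torch
import torch.nn as nn

class SmallConvNet(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        # 5x5 local receptive field; one shared weight set per feature map
        self.conv = nn.Conv2d(1, 8, kernel_size=5)
        self.pool = nn.MaxPool2d(2)              # pooling condenses the conv output
        self.fc = nn.Linear(8 * 12 * 12, num_classes)

    def forward(self, images):                   # images: (batch, 1, 28, 28)
        h = self.pool(torch.relu(self.conv(images)))
        return self.fc(h.flatten(1))

# An LSTM keeps an internal state and gates the gradient flow over time,
# which eases the vanishing-gradient problem of plain RNNs.
lstm = nn.LSTM(input_size=16, hidden_size=32, batch_first=True)
outputs, (h_n, c_n) = lstm(torch.randn(4, 50, 16))   # (batch, time, features)
```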