Search | Korea Science

Recurrent Neural Network with Backpropagation Through Time Learning Algorithm for Arabic Phoneme Recognition

Ismail, Saliza;Ahmad, Abdul Manan
- Institute of Control, Robotics and Systems: Conference Proceedings
- /
- 2004.08a
- /
- pp.1033-1036
- /
- 2004
Speech recognition and understanding have been studied for many years. In this paper, we propose a new type of recurrent neural network architecture for speech recognition, in which each output unit is connected to itself and is also fully connected to the other output units and all hidden units [1]. We also propose a learning algorithm for this architecture, Backpropagation Through Time (BPTT), which is well suited to it. The aim of the study is to observe the differences among the letters of the Arabic alphabet, from "alif" to "ya". The purpose of this research is to improve people's knowledge and understanding of the Arabic alphabet and Arabic words by using a Recurrent Neural Network (RNN) with the BPTT learning algorithm. Speech from 4 speakers (a mixture of male and female), recorded in a quiet environment, is used for training. Neural networks are well known for their ability to classify nonlinear problems. Much research has applied neural networks to speech recognition [2], including for Arabic. The Arabic language poses a number of challenges for speech recognition [3]. Even though positive results have been obtained from continued study, research on minimizing the error rate still attracts considerable attention. This research uses a Recurrent Neural Network, one type of neural network technique, to observe the differences among the letters "alif" to "ya".
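As a rough illustration of the BPTT learning algorithm named in the abstract, the sketch below unrolls a small recurrent network over a sequence and accumulates the gradients backwards in time. It is not the authors' architecture: it assumes a plain hidden-to-hidden recurrence with tanh units, a linear output layer, and a squared-error loss, whereas the abstract describes recurrent connections among the output units. The names `rnn_forward`, `bptt`, `Wxh`, `Whh`, and `Why` are hypothetical.

```python
import numpy as np

def rnn_forward(params, xs):
    """Run a simple recurrent network over the input sequence xs.

    Assumption: Elman-style recurrence on the hidden layer only.
    Returns the hidden states (hs[0] is the initial zero state) and outputs.
    """
    Wxh, Whh, Why = params
    h = np.zeros(Whh.shape[0])
    hs, ys = [h], []
    for x in xs:
        h = np.tanh(Wxh @ x + Whh @ h)  # new hidden state
        hs.append(h)
        ys.append(Why @ h)              # linear output layer
    return hs, ys

def bptt(params, xs, targets):
    """Return the squared-error loss and its gradients via BPTT."""
    Wxh, Whh, Why = params
    hs, ys = rnn_forward(params, xs)
    loss = 0.5 * sum(np.sum((y - d) ** 2) for y, d in zip(ys, targets))

    gWxh, gWhh, gWhy = (np.zeros_like(W) for W in params)
    dh_next = np.zeros(Whh.shape[0])    # gradient arriving from step t + 1
    for t in reversed(range(len(xs))):
        dy = ys[t] - targets[t]         # dLoss/dy_t
        gWhy += np.outer(dy, hs[t + 1])
        dh = Why.T @ dy + dh_next       # output path plus future steps
        dz = (1.0 - hs[t + 1] ** 2) * dh  # back through tanh
        gWxh += np.outer(dz, xs[t])
        gWhh += np.outer(dz, hs[t])     # hs[t] is the previous hidden state
        dh_next = Whh.T @ dz
    return loss, (gWxh, gWhh, gWhy)
```

A gradient-descent step would then subtract a small multiple of each gradient from the matching weight matrix; the learning rate and the input features (for example, per-frame speech features) are left open here.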

A New Vocoder based on AMR 7.4Kbit/s Mode for Speaker Dependent System

Min, Byung-Jae;Park, Dong-Chul
- The Journal of Korean Institute of Communications and Information Sciences
- /
- v.33 no.9C
- /
- pp.691-696
- /
- 2008
A new Code Excited Linear Predictive (CELP) vocoder based on the Adaptive Multi Rate (AMR) 7.4 kbit/s mode is proposed in this paper. The proposed vocoder achieves a better compression rate in a Speaker Dependent Coding System (SDSC) environment and can be used efficiently in systems such as OGM (Outgoing Message) and TTS (Text To Speech), which need only one person's speech. To enhance the compression rate of the coder, a new Line Spectral Pairs (LSP) codebook is designed using the Centroid Neural Network (CNN) algorithm. Compared with the original (traditional) AMR 7.4 kbit/s coder, the new coder shows a 27% higher compression rate while preserving synthesized speech quality in terms of Mean Opinion Score (MOS).

Neural-network-based Fault Detection and Diagnosis Method Using EIV (Errors-in-Variables)

Han, Hyung-Seob;Cho, Sang-Jin;Chong, Ui-Pil
- Transactions of the Korean Society for Noise and Vibration Engineering
- /
- v.21 no.11
- /
- pp.1020-1028
- /
- 2011
Because rotating machines play an important role in industrial applications such as the aeronautical, naval, and automotive industries, many researchers have developed condition monitoring and fault diagnosis systems based on artificial neural networks. Since feeding raw signals to a neural network without preprocessing can degrade fault classification performance, it is very important to extract significant features from the captured signals and to supply the diagnosis system with features suited to the kind of signal obtained. Therefore, this paper proposes a neural-network-based fault diagnosis system that uses AR coefficients, obtained by LPC (linear predictive coding) and EIV (errors-in-variables) analysis, as feature vectors. We extracted feature vectors from sound, vibration, and current signals of faulty machines and evaluated their suitability from the classification results and training error rates while changing the AR order and adding noise. From the experimental results, we conclude that, compared with LPC, feature vectors from EIV analysis give classification rates stably above 90% for AR orders below 10 and under added noise.
https://doi.org/10.5050/KSNVE.2011.21.11.1020

A Linear Time Algorithm for Constructing a Sharable-Bandwidth Tree in Public-shared Network

Chong, Kyun-Rak
- Journal of the Korea Society of Computer and Information
- /
- v.17 no.6
- /
- pp.93-100
- /
- 2012
In this paper, we propose a linear time algorithm for the minimum sharable-bandwidth tree construction problem. A public-shared network is a user-generated infrastructure on which a user can access the Internet and transfer data from any place via access points with sharable bandwidth. Recently, the idea of building an SVC video streaming delivery system on a public-shared network has been proposed. To send a video stream from the stream server to clients on a public-shared network, a tree structure is constructed. The problem of constructing a tree that serves the video streaming requests with the minimum amount of sharable bandwidth has been shown to be NP-hard. Previously published algorithms for this problem either frequently fail to find solutions or are less efficient. The experimental results show that our algorithm performs very well both in the success rate of finding solutions and in the quality of the solutions.
https://doi.org/10.9708/jksci.2012.17.6.093

Speaker-dependent Speech Recognition Algorithm for Male and Female Classification

Choi, Jae-Seung
- Journal of the Korea Institute of Information and Communication Engineering
- /
- v.17 no.4
- /
- pp.775-780
- /
- 2013
This paper proposes a speaker-dependent speech recognition algorithm that uses a neural network to classify speakers as male or female in white noise and car noise. The neural network is trained to recognize the speaker's gender from LPC (Linear Predictive Coding) cepstrum coefficients. In the experimental results, after training a total of six neural networks, the maximal improvement in the total speech recognition rate is 96% for white noise and 88% for car noise. Finally, the proposed algorithm is compared with a conventional speech recognition algorithm in a noisy background environment.
https://doi.org/10.6109/jkiice.2013.17.4.775

Computer Vision Based Measurement, Error Analysis and Calibration

Hwang, H.;Lee, C.H.
- Journal of Biosystems Engineering
- /
- v.17 no.1
- /
- pp.65-78
- /
- 1992
When a computer vision system is used for measurement, geometric distortion of the input image usually restricts the location and size of the measuring window. A geometrically distorted image, caused by the image sensing and processing hardware, degrades the accuracy of visual measurement and prevents arbitrary selection of the measuring scope. Image calibration is therefore necessary to improve measuring accuracy. A calibration process usually consists of four steps: measurement, modeling, parameter estimation, and compensation. In this paper, an efficient error calibration technique for a geometrically distorted input image was developed using a neural network. After calibrating a unit pixel, the distorted image was compensated by training a CMLAN (Cerebellar Model Linear Associator Network) without modeling the behavior of any system element. The input/output training pairs for the network were obtained by processing the image of a specially devised sample pattern. The generalization property of the network successfully compensates the distortion errors of untrained, arbitrary pixel points in the image space. The error convergence of the trained network with respect to the network control parameters is also presented. The image compensated by the network was then post-processed using a simple DDA (Digital Differential Analyzer) to avoid pixel disconnectivity. The compensation effect was verified using geometric primitives of known size. A method for directly extracting real-scaled geometric quantities of an object from its 8-directional chain code was also devised and coded. Because the developed calibration algorithm requires no knowledge of system element models or parameter estimation, it can be applied simply to any image processing system. Furthermore, it efficiently enhances measurement accuracy and allows arbitrary sizing and positioning of the measuring window. The applied and developed algorithms were coded as a menu-driven program using MS-C language Ver. 6.0, PC VISION PLUS library functions, and VGA graphic functions.
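The DDA post-processing step mentioned in the abstract can be pictured with a minimal sketch of a Digital Differential Analyzer that fills in the pixels between two points so that consecutive pixels stay 8-connected. How the paper applied the DDA to the compensated image is not stated in the abstract; the function name `dda_line` and the use of integer pixel coordinates are assumptions made for illustration.

```python
import math

def dda_line(x0, y0, x1, y1):
    """Return the integer pixels on the segment (x0, y0) -> (x1, y1).

    Hypothetical helper: steps along the longer axis one pixel at a time,
    so neighbouring output pixels differ by at most 1 in x and in y.
    """
    dx, dy = x1 - x0, y1 - y0
    steps = max(abs(dx), abs(dy))
    if steps == 0:
        return [(x0, y0)]
    x_inc, y_inc = dx / steps, dy / steps  # per-step increments, |inc| <= 1
    pixels = []
    x, y = float(x0), float(y0)
    for _ in range(steps + 1):
        # Round half up to the nearest pixel centre
        pixels.append((math.floor(x + 0.5), math.floor(y + 0.5)))
        x += x_inc
        y += y_inc
    return pixels
```

For example, `dda_line(0, 0, 3, 3)` walks the diagonal one pixel at a time, so two compensated pixels that ended up several pixels apart can be joined without gaps.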

Interference Alignment in Two-way Relay Channel with Compute-and-Forward

Jiang, Xue;Zheng, Baoyu
- KSII Transactions on Internet and Information Systems (TIIS)
- /
- v.10 no.2
- /
- pp.593-607
- /
- 2016
This paper analyzes interference alignment in the two-way relay network with compute-and-forward, in both single-relay and multiple-relay networks. The advantage of compute-and-forward over other relaying strategies is that it relays only linear combinations of the useful signals and removes the noise. The algorithm proposed in this paper adopts the maximum SINR criterion to derive the pre-coding matrix. The experimental results show that interference alignment in the two-way relay channel performs better with compute-and-forward than with amplify-and-forward, and that the total sum rate in two-way multiple-relay networks is larger than that in two-way single-relay networks.
https://doi.org/10.3837/tiis.2016.02.009

Real-time implementation and performance evaluation of speech classifiers in speech analysis-synthesis

Kumar, Sandeep
- ETRI Journal
- /
- v.43 no.1
- /
- pp.82-94
- /
- 2021
In this work, six voiced/unvoiced speech classifiers based on the autocorrelation function (ACF), average magnitude difference function (AMDF), cepstrum, weighted ACF (WACF), zero crossing rate and energy of the signal (ZCR-E), and neural networks (NNs) have been simulated and implemented in real time using the TMS320C6713 DSP starter kit. These speech classifiers have been integrated into a linear-predictive-coding-based speech analysis-synthesis system and their performance has been compared in terms of the percentage of the voiced/unvoiced classification accuracy, speech quality, and computation time. The results of the percentage of the voiced/unvoiced classification accuracy and speech quality show that the NN-based speech classifier performs better than the ACF-, AMDF-, cepstrum-, WACF- and ZCR-E-based speech classifiers for both clean and noisy environments. The computation time results show that the AMDF-based speech classifier is computationally simple, and thus its computation time is less than that of other speech classifiers, while that of the NN-based speech classifier is greater compared with other classifiers.
https://doi.org/10.4218/etrij.2019-0364
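To make one of the six classifiers concrete, here is a minimal sketch of a voiced/unvoiced decision based on zero crossing rate and energy (ZCR-E). It follows the common rule of thumb that voiced speech has low ZCR and high energy; the thresholds `zcr_max` and `energy_min` and the function names are hypothetical and are not taken from the paper.

```python
import numpy as np

def zcr_energy(frame):
    """Return (zero crossing rate, mean energy) of one speech frame."""
    frame = np.asarray(frame, dtype=float)
    signs = np.sign(frame)
    signs[signs == 0] = 1.0                  # treat exact zeros as positive
    zcr = np.mean(signs[1:] != signs[:-1])   # fraction of sign changes
    energy = np.mean(frame ** 2)
    return zcr, energy

def is_voiced(frame, zcr_max=0.25, energy_min=1e-3):
    """Label a frame voiced when ZCR is low and energy is high.

    Assumption: both thresholds are illustrative placeholders that would
    need tuning for a given sampling rate and signal scaling.
    """
    zcr, energy = zcr_energy(frame)
    return bool(zcr < zcr_max and energy > energy_min)
```

In an analysis-synthesis system of the kind described, such a per-frame decision would select between a periodic and a noise-like excitation for the synthesis filter.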

Comparison of Characteristic Vector of Speech for Gender Recognition of Male and Female

Jeong, Byeong-Goo;Choi, Jae-Seung
- Journal of the Korea Institute of Information and Communication Engineering
- /
- v.16 no.7
- /
- pp.1370-1376
- /
- 2012
This paper proposes a gender recognition algorithm that classifies a speaker as male or female. Characteristic vectors for male and female speakers are analyzed, and recognition experiments for the proposed neural-network-based gender recognition are performed using these vectors. The input characteristic vectors of the proposed neural network are: 10 LPC (Linear Predictive Coding) cepstrum coefficients; 12 LPC cepstrum coefficients; 12 FFT (Fast Fourier Transform) cepstrum coefficients and 1 RMS (Root Mean Square) value; and 12 LPC cepstrum coefficients and 8 FFT spectrum values. In particular, a 20-20-2 network trained on 12 LPC cepstrum coefficients and 8 FFT spectrum values is used in this experiment. From the experimental results, the average recognition rates obtained by the gender recognition algorithm are 99.8% for male speakers and 96.5% for female speakers.
https://doi.org/10.6109/jkiice.2012.16.7.1370
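LPC cepstrum coefficients such as those used as inputs here are commonly derived from the LPC predictor coefficients with a standard recursion. The sketch below shows that recursion under an assumed sign convention, A(z) = 1 + a1 z^-1 + ... + ap z^-p, with the cepstrum taken from 1/A(z); the function name `lpc_to_cepstrum` is hypothetical and the paper's exact feature pipeline is not given in the abstract.

```python
import numpy as np

def lpc_to_cepstrum(a, n_ceps):
    """Convert LPC coefficients to cepstrum coefficients c[1..n_ceps].

    Assumption: a = [1, a1, ..., ap] for A(z) = 1 + sum_k a_k z^-k, and the
    cepstrum is that of the all-pole model 1 / A(z) (gain term c[0] omitted).
    """
    p = len(a) - 1
    c = np.zeros(n_ceps + 1)
    for n in range(1, n_ceps + 1):
        # Sum over k with 1 <= k < n and n - k <= p
        acc = sum(k * c[k] * a[n - k] for k in range(max(1, n - p), n))
        a_n = a[n] if n <= p else 0.0
        c[n] = -a_n - acc / n
    return c[1:]
```

With a 12th-order predictor, `lpc_to_cepstrum(a, 12)` would yield 12 coefficients per frame, matching the feature count quoted in the abstract.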

Classification of Whale Sounds using LPC and Neural Networks

An, Woo-Jin;Lee, Eung-Jae;Kim, Nam-Gyu;Chong, Ui-Pil
- Journal of the Institute of Convergence Signal Processing
- /
- v.18 no.2
- /
- pp.43-48
- /
- 2017
Underwater transient signals are complex, time-varying, nonlinear, and of short duration, so they are very hard to model with reference patterns. In this paper, we divide each signal into overlapping short frames of constant length. 20th-order LPC (Linear Predictive Coding) coefficients are extracted from the original signals using the Durbin algorithm and fed to a neural network. A neural network with two hidden layers was trained on 65% of the signals and tested on the remaining 35%. The whale types classified are the Blue whale, Dulsae whale, Gray whale, Humpback whale, Minke whale, and Northern Right whale. Finally, we obtained a classification rate of more than 83% on the test signals.
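The feature extraction described above (overlapping constant-length frames, then LPC coefficients via the Durbin algorithm) can be sketched as follows. This is a minimal illustration using the autocorrelation method and the Levinson-Durbin recursion; the frame length, hop size, and function names are assumptions, and windowing is omitted for brevity.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split x into frames of constant length; hop < frame_len gives overlap."""
    n_frames = 1 + (len(x) - frame_len) // hop
    return np.stack([x[i * hop : i * hop + frame_len] for i in range(n_frames)])

def lpc_durbin(frame, order):
    """Return (a, err) for one frame using the Levinson-Durbin recursion.

    Assumed convention: a[0] = 1 and the prediction error is
    e[n] = x[n] + sum_j a[j] * x[n - j]; err is the final error energy.
    """
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    # Autocorrelation values r[0..order]
    r = np.array([np.dot(frame[: n - k], frame[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0:                  # silent or degenerate frame
            break
        acc = r[i] + np.dot(a[1:i], r[i - 1 : 0 : -1])
        k = -acc / err                # reflection coefficient
        a_prev = a.copy()
        for j in range(1, i):
            a[j] = a_prev[j] + k * a_prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a, err
```

Calling `lpc_durbin(frame, 20)` on each row of `frame_signal(x, frame_len, hop)` would give one 20th-order coefficient vector per frame, which could then serve as a neural network input.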

