• Title/Summary/Keyword: multi layer perceptron


A Study on the Spoken Korean-Digit Recognition Using the Neural Network (神經網을 利用한 韓國語 數字音 認識에 관한 硏究)

  • Park, Hyun-Hwa;Gahang, Hae Dong;Bae, Keun Sung
    • The Journal of the Acoustical Society of Korea / v.11 no.3 / pp.5-13 / 1992
  • Taking advantage of the fact that each Korean digit is a monosyllabic word, we propose a spoken Korean-digit recognition scheme using a multi-layer perceptron. Each spoken digit is divided into three segments (initial sound, medial vowel, and final consonant) based on the voice start and end points and a peak point in the middle of the vowel. Feature vectors such as cepstrum, reflection coefficients, $\Delta$cepstrum, and $\Delta$energy are extracted from each segment. As an input vector to the neural network, cepstrum gives a higher recognition rate than reflection coefficients. Regression coefficients of the cepstrum affected the recognition rate less than we expected, which we believe is because the features were extracted from selected stationary segments of the input speech signal. With 150 cepstral coefficients obtained from each spoken digit, we achieved a correct recognition rate of 97.8%.


A Fingerprint Classification Method Based on the Combination of Gray Level Co-Occurrence Matrix and Wavelet Features (명암도 동시발생 행렬과 웨이블릿 특징 조합에 기반한 지문 분류 방법)

  • Kang, Seung-Ho
    • Journal of Korea Multimedia Society / v.16 no.7 / pp.870-878 / 2013
  • In this paper, we propose a novel fingerprint classification method to enhance the accuracy and efficiency of fingerprint identification systems, one kind of biometric system. According to previous research, fingerprints can be categorized into several classes based on their patterns of ridges and valleys. Once a fingerprint database is organized by these patterns, classification can accelerate fingerprint recognition, because it reduces the search space to the fingerprints of the same category before matching. First, we suggest a method to extract the region of interest (ROI), which contains the real fingerprint information, from the image. We then propose a feature extraction method that combines gray level co-occurrence matrix (GLCM) and wavelet features. Finally, using a multi-layer perceptron and a support vector machine, we compare the performance of the proposed method with that of an existing method that uses only GLCM features.
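
To make the GLCM part of this abstract concrete, the following is a minimal sketch of how a co-occurrence matrix and two statistics commonly derived from it can be computed. The abstract does not say which offsets, gray-level counts, or GLCM statistics the authors used, so the function names, the single horizontal offset, and the choice of contrast and angular second moment are all assumptions made for illustration.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalized gray level co-occurrence matrix for one pixel offset (dx, dy).

    image is a list of rows holding integer gray levels in range(levels).
    The offset and the non-symmetric counting are illustrative assumptions.
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            # Count the pair only when the neighbour lies inside the image.
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
                total += 1
    return [[n / total for n in row] for row in counts]


def glcm_statistics(p):
    """Contrast and angular second moment (ASM) of a normalized GLCM."""
    contrast = 0.0
    asm = 0.0
    for i, row in enumerate(p):
        for j, value in enumerate(row):
            contrast += (i - j) ** 2 * value
            asm += value * value
    return {"contrast": contrast, "asm": asm}


# Small 4-level test image, used only to exercise the functions.
sample = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
matrix = glcm(sample, levels=4)
stats = glcm_statistics(matrix)
```

In a pipeline like the one described, such statistics would be concatenated with wavelet features before being passed to the classifier; that combination step is not shown here.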

Development of Interactive Content Services through an Intelligent IoT Mirror System (지능형 IoT 미러 시스템을 활용한 인터랙티브 콘텐츠 서비스 구현)

  • Jung, Wonseok;Seo, Jeongwook
    • Journal of Advanced Navigation Technology / v.22 no.5 / pp.472-477 / 2018
  • In this paper, we develop interactive content services for preventing depression in users through an intelligent Internet of Things (IoT) mirror system. For these services, an IoT mirror device measures attention and meditation data from an EEG headset and, through a webcam, facial expression data classified as "sad", "angry", "disgust", "neutral", "happy", or "surprise" by a multi-layer perceptron algorithm. It then sends the measured data to a oneM2M-compliant IoT server. Based on the data collected in the IoT server, a machine learning model is built to classify three levels of depression (RED, YELLOW, and GREEN) assigned by a proposed merge labeling method. Experimental results verified that the k-nearest neighbor (k-NN) model could achieve about 93% accuracy. In addition, according to the classified level, a social network service agent sent a corresponding alert message to family, friends, and social workers. Thus, we were able to provide an interactive content service between users and caregivers.

Three Stage Neural Networks for Direction of Arrival Estimation (도래각 추정을 위한 3단계 인공신경망 알고리듬)

  • Park, Sun-bae;Yoo, Do-sik
    • Journal of Advanced Navigation Technology / v.24 no.1 / pp.47-52 / 2020
  • Direction of arrival (DoA) estimation is a technique for estimating the directions of targets by analyzing signals generated by or reflected from the targets, and it is used in various fields. Artificial neural networks (ANNs) are a class of machine learning models that mimic the neural networks of living organisms, and they perform well in pattern recognition. Although researchers have used ANNs to estimate DoAs, existing approaches are limited in dealing with variations in the signal-to-noise ratio (SNR) of the target signals. In this paper, we propose a three-stage ANN algorithm for DoA estimation. Through a 'noise reduction process', the proposed algorithm minimizes the performance loss incurred when a model trained in a single SNR environment is applied to various environments. Furthermore, by employing a DoA shift process, the scheme reduces the difficulty of learning and maintains efficiency in estimation. We compare the performance of the proposed algorithm with the Cramér-Rao bound (CRB) and with existing subspace-based algorithms, and show that the proposed scheme outperforms the other schemes in some severe environments, such as low-SNR environments or situations in which targets are located very close to each other.

Human Touching Behavior Recognition based on Neural Network in the Touch Detector using Force Sensors (힘 센서를 이용한 접촉감지부에서 신경망기반 인간의 접촉행동 인식)

  • Ryu, Joung-Woo;Park, Cheon-Shu;Sohn, Joo-Chan
    • Journal of KIISE:Software and Applications / v.34 no.10 / pp.910-917 / 2007
  • Of the possible interactions between humans and robots, touch is an important means of providing human beings with emotional relief. However, most previous studies have focused on interactions based on voice and images. In this paper, a method of recognizing human touching behaviors is proposed for developing a robot that can interact naturally with humans through touch. The recognition process is divided into pre-processing and recognition phases. In the pre-processing phase, recognizable characteristics are calculated from the data generated by the touch detector, which was fabricated using force sensors of the FSR (force sensing resistor) type. The recognition phase classifies human touching behaviors using a multi-layer perceptron, a neural network model. Experimental data were generated by six men performing three types of touching behavior: 'hitting', 'stroking', and 'tickling'. When a recognizer was built for each user and evaluated by cross-validation, the average recognition rate was 82.9%, whereas a single recognizer for all users achieved an average recognition rate of 74.5%.

A STUDY ON THE IMPLEMENTATION OF ARTIFICIAL NEURAL NET MODELS WITH FEATURE SET INPUT FOR RECOGNITION OF KOREAN PLOSIVE CONSONANTS (한국어 파열음 인식을 위한 피쳐 셉 입력 인공 신경망 모델에 관한 연구)

  • Kim, Ki-Seok;Kim, In-Bum;Hwang, Hee-Yeung
    • Proceedings of the KIEE Conference / 1990.07a / pp.535-538 / 1990
  • The main problem in speech recognition is the enormous variability in acoustic signals due to complex but predictable contextual effects. In plosive consonants especially, it is very difficult to find invariant cues because of various contextual effects, yet humans use these contextual effects as helpful information in plosive consonant recognition. In this paper, we experimented with three artificial neural net models for the recognition of plosive consonants. Model I used a "Multi-layer Perceptron", Model II used a variation of the "Self-organizing Feature Map Model", and Model III used the "Interactive and Competitive Model" to examine contextual effects. The recognition experiment was performed on nine Korean plosive consonants, using VCV speech chains to study contextual effects. The speech chains consist of the Korean plosive consonants /g, d, b, K, T, P, k, t, p/ (/ㄱ, ㄷ, ㅂ, ㄲ, ㄸ, ㅃ, ㅋ, ㅌ, ㅍ/) and eight Korean monophthongs. The inputs to the neural net models were cues extracted from the acoustic signals: several temporal cues (duration of the silence, transition, and voice onset time (VOT)), the extent of the VC formant transitions, the presence of voicing energy during closure, burst intensity, the presence of aspiration, the amount of low-frequency energy present at voicing onset, and the extent of the CV formant transitions. Model I showed a recognition rate of about 55-67%, Model II about 60%, and Model III about 67%.


Feature Vector Decision Method of Various Fault Signals for Neural-network-based Fault Diagnosis System (신경회로망 기반 고장 진단 시스템을 위한 고장 신호별 특징 벡터 결정 방법)

  • Han, Hyung-Seob;Cho, Sang-Jin;Chong, Ui-Pil
    • Transactions of the Korean Society for Noise and Vibration Engineering / v.20 no.11 / pp.1009-1017 / 2010
  • As rotating machines play an important role in industrial applications such as the aeronautical, naval, and automotive industries, many researchers have developed condition monitoring and fault diagnosis systems by applying techniques such as signal processing and pattern recognition. Recently, fault diagnosis systems using artificial neural networks have been proposed. For effective fault diagnosis, this paper uses a multi-layer perceptron (MLP) network, which is widely used in pattern classification. Because feeding raw signals without preprocessing into a neural network can degrade fault classification performance, it is very important to extract significant features from the captured signals and to apply features suited to the kind of signal obtained. Therefore, this paper proposes a method for deciding the proper feature vectors for each fault signal in a neural-network-based fault diagnosis system. We applied LPC coefficients, the maximum magnitude of each spectral section of the FFT, and the RMS (root mean square) and variance of wavelet coefficients as feature vectors, and selected appropriate feature vectors by comparing the fault diagnosis error ratios for sound, vibration, and current fault signals. In the experiments, LPC coefficients and the maximum magnitudes of each spectral section achieved 100% diagnosis ratios for each fault, and the method using wavelet coefficients was robust to noise.
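
Two of the feature types named in this abstract can be illustrated with a short sketch: the maximum spectral magnitude per frequency section, and the RMS and variance of a coefficient sequence. The abstract gives no section count, frame length, or wavelet family, so the equal-width sections, the direct DFT used in place of an FFT routine, and all function names below are assumptions. The wavelet transform itself is not shown; `rms` and `variance` simply accept any list of coefficients.

```python
import cmath
import math


def rms(values):
    """Root mean square of a sequence of coefficients."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def variance(values):
    """Population variance of a sequence of coefficients."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def section_maxima(signal, n_sections):
    """Largest DFT magnitude in each of n_sections equal-width bands.

    Only bins below the Nyquist frequency are used. If the number of bins
    is not divisible by n_sections, the trailing bins are ignored.
    A direct O(n^2) DFT keeps the sketch dependency-free.
    """
    n = len(signal)
    half = n // 2
    magnitudes = [
        abs(sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)))
        for k in range(half)
    ]
    width = half // n_sections
    return [max(magnitudes[s * width:(s + 1) * width]) for s in range(n_sections)]


# Synthetic test tone: 5 cycles over a 64-sample frame.
tone = [math.sin(2 * math.pi * 5 * t / 64) for t in range(64)]
features = section_maxima(tone, n_sections=4) + [rms(tone), variance(tone)]
```

For the synthetic tone, all spectral energy falls in the first section, which is the kind of per-band signature such a feature vector is meant to expose to the classifier.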

Neural-network-based Driver Drowsiness Detection System Using Linear Predictive Coding Coefficients and Electroencephalographic Changes (선형예측계수와 뇌파의 변화를 이용한 신경회로망 기반 운전자의 졸음 감지 시스템)

  • Chong, Ui-Pil;Han, Hyung-Seob
    • Journal of the Institute of Convergence Signal Processing / v.13 no.3 / pp.136-141 / 2012
  • One of the main causes of serious road accidents is driving while drowsy. For this reason, drowsiness detection and warning systems for drivers have recently become a very important issue. Monitoring physiological signals makes it possible to detect features of driver drowsiness and fatigue; one effective approach is to measure electroencephalogram (EEG) and electrooculogram (EOG) signals. The aim of this study is to extract drowsiness-related features from a set of EEG signals and to classify the features into three states: alertness, drowsiness, and sleepiness. This paper proposes a neural-network-based drowsiness detection system using Linear Predictive Coding (LPC) coefficients as feature vectors and a Multi-Layer Perceptron (MLP) as the classifier. Samples of EEG data from each predefined state were used to train the MLP program with the proposed feature extraction algorithms. The trained MLP program was tested on unclassified EEG data, and its output was then checked against manual classification. The classification rate of the proposed system is over 96.5% even with a very small number of samples (250 ms, 64 samples). Therefore, it can be applied to real driving incidents, which can occur in a split second.
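
As an illustration of the LPC feature extraction mentioned here, the sketch below estimates LPC coefficients for one short frame with the autocorrelation method and the Levinson-Durbin recursion. The abstract does not state the LPC order or the estimation method, so both are assumptions, as are the function names; the 64-sample frame length is the only figure taken from the abstract.

```python
def autocorrelation(frame, max_lag):
    """Autocorrelation r[0..max_lag] of one frame."""
    n = len(frame)
    return [
        sum(frame[t] * frame[t + lag] for t in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def lpc(frame, order):
    """LPC coefficients a[1..order] with frame[t] ~ sum_k a[k] * frame[t - k].

    Uses the Levinson-Durbin recursion on the autocorrelation sequence.
    Returns (coefficients, residual_energy). The frame must not be all zeros.
    """
    r = autocorrelation(frame, order)
    a = [0.0] * (order + 1)
    error = r[0]
    for i in range(1, order + 1):
        # Reflection coefficient for this order.
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / error
        updated = a[:]
        updated[i] = k
        for j in range(1, i):
            updated[j] = a[j] - k * a[i - j]
        a = updated
        error *= 1.0 - k * k
    return a[1:], error


# Synthetic 64-sample frame with a known one-step dependence (decay 0.9).
frame = [0.9 ** t for t in range(64)]
coefficients, residual = lpc(frame, order=2)
```

The resulting coefficient list is what would be passed to the MLP as a feature vector; on the synthetic frame the first coefficient comes out close to 0.9 and the second close to zero, as expected for a first-order decay.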

Performance comparison of wake-up-word detection on mobile devices using various convolutional neural networks (다양한 합성곱 신경망 방식을 이용한 모바일 기기를 위한 시작 단어 검출의 성능 비교)

  • Kim, Sanghong;Lee, Bowon
    • The Journal of the Acoustical Society of Korea / v.39 no.5 / pp.454-460 / 2020
  • Artificial intelligence assistants that provide speech recognition operate through cloud-based voice recognition with high accuracy. In cloud-based speech recognition, Wake-Up-Word (WUW) detection plays an important role in activating devices on standby. In this paper, we compare the performance of Convolutional Neural Network (CNN)-based WUW detection models for mobile devices on Google's speech commands dataset, using spectrogram and mel-frequency cepstral coefficient features as inputs. The models compared are a multi-layer perceptron, a general convolutional neural network, VGG16, VGG19, ResNet50, ResNet101, ResNet152, and MobileNet. We also propose a network that reduces the model size to 1/25 while maintaining the performance of MobileNet.

Automatic Change Detection of MODIS NDVI using Artificial Neural Networks (신경망을 이용한 MODIS NDVI의 자동화 변화탐지 기법)

  • Jung, Myung-Hee
    • Journal of the Institute of Electronics Engineers of Korea CI / v.49 no.2 / pp.83-89 / 2012
  • Natural vegetation cover, a very important earth resource, has been significantly altered by humans. Because this has had a significant effect on global climate, various studies on vegetation environments, including forests, have been performed, and the results are used in policy decision making. Remotely sensed data can be used to detect, identify, and map vegetation cover change based on the analysis of spectral characteristics and are thus widely used for monitoring vegetation resources. Among the various vegetation indices extracted from the spectral responses of remotely sensed data, NDVI is the most popular; it provides a measure of how much photosynthetically active vegetation is present in the scene. In this study, for change detection in vegetation cover, a Multi-layer Perceptron Network (MLPN), a nonparametric approach, was designed and applied to MODIS/Aqua vegetation indices 16-day L3 global 250m SIN Grid (v005) (MYD13Q1) data. The feature vector for change detection is constructed from the direct NDVI difference at a pixel as well as the differences in a subset of the NDVI series data. The research covered 5 years (2006-2010) over the Korean Peninsula.
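
The NDVI-difference feature vector described in this abstract might look roughly like the sketch below. NDVI itself is the standard ratio (NIR - Red) / (NIR + Red); everything else is an assumption, including the function names, the zero-denominator handling, and in particular the choice of which neighbouring composites form the "subset of the NDVI series", which the abstract does not specify.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index from NIR and red reflectance."""
    total = nir + red
    if total == 0:
        # Assumption: report 0.0 for a pixel with no signal instead of failing.
        return 0.0
    return (nir - red) / total


def change_features(before, after, t, offsets=(-1, 1)):
    """Direct NDVI difference at composite t plus differences at nearby composites.

    before and after are NDVI series for the same pixel in two periods.
    offsets is a hypothetical subset of neighbouring composites; t must be
    an interior index so that every t + offset stays inside both series.
    """
    direct = after[t] - before[t]
    neighbours = [after[t + k] - before[t + k] for k in offsets]
    return [direct] + neighbours


# Made-up NDVI series for one pixel in two periods.
earlier = [0.2, 0.5, 0.8, 0.6]
later = [0.2, 0.3, 0.4, 0.5]
vector = change_features(earlier, later, t=2)
single_value = ndvi(0.5, 0.1)
```

A vector like this, computed per pixel, would be the input to the change-detection network; the values above are invented purely to exercise the functions.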