• Title/Summary/Keyword: perceptron learning

Search Result 349, Processing Time 0.031 seconds

A study on the connected-digit recognition using MLP-VQ and Weighted DHMM (MLP-VQ와 가중 DHMM을 이용한 연결 숫자음 인식에 관한 연구)

  • Chung, Kwang-Woo;Hong, Kwang-Seok
    • Journal of the Korean Institute of Telematics and Electronics S
    • /
    • v.35S no.8
    • /
    • pp.96-105
    • /
    • 1998
  • The aim of this paper is to propose the method of WDHMM(Weighted DHMM), using the MLP-VQ for the improvement of speaker-independent connect-digit recognition system. MLP neural-network output distribution shows a probability distribution that presents the degree of similarity between each pattern by the non-linear mapping among the input patterns and learning patterns. MLP-VQ is proposed in this paper. It generates codewords by using the output node index which can reach the highest level within MLP neural-network output distribution. Different from the old VQ, the true characteristics of this new MLP-VQ lie in that the degree of similarity between present input patterns and each learned class pattern could be reflected for the recognition model. WDHMM is also proposed. It can use the MLP neural-network output distribution as the way of weighing the symbol generation probability of DHMMs. This newly-suggested method could shorten the time of HMM parameter estimation and recognition. The reason is that it is not necessary to regard symbol generation probability as multi-dimensional normal distribution, as opposed to the old SCHMM. This could also improve the recognition ability by 14.7% higher than DHMM, owing to the increase of small caculation amount. Because it can reflect phone class relations to the recognition model. The result of my research shows that speaker-independent connected-digit recognition, using MLP-VQ and WDHMM, is 84.22%.

  • PDF

The Implementable Functions of the CoreNet of a Multi-Valued Single Neuron Network (단층 코어넷 다단입력 인공신경망회로의 함수에 관한 구현가능 연구)

  • Park, Jong Joon
    • Journal of IKEEE
    • /
    • v.18 no.4
    • /
    • pp.593-602
    • /
    • 2014
  • One of the purposes of an artificial neural netowrk(ANNet) is to implement the largest number of functions as possible with the smallest number of nodes and layers. This paper presents a CoreNet which has a multi-leveled input value and a multi-leveled output value with a 2-layered ANNet, which is the basic structure of an ANNet. I have suggested an equation for calculating the capacity of the CoreNet, which has a p-leveled input and a q-leveled output, as $a_{p,q}={\frac{1}{2}}p(p-1)q^2-{\frac{1}{2}}(p-2)(3p-1)q+(p-1)(p-2)$. I've applied this CoreNet into the simulation model 1(5)-1(6), which has 5 levels of an input and 6 levels of an output with no hidden layers. The simulation result of this model gives, the maximum 219 convergences for the number of implementable functions using the cot(${\sqrt{x}}$) input leveling method. I have also shown that, the 27 functions are implementable by the calculation of weight values(w, ${\theta}$) with the multi-threshold lines in the weight space, which are diverged in the simulation results. Therefore the 246 functions are implementable in the 1(5)-1(6) model, and this coincides with the value from the above eqution $a_{5,6}(=246)$. I also show the implementable function numbering method in the weight space.

The Capacity of Multi-Valued Single Layer CoreNet(Neural Network) and Precalculation of its Weight Values (단층 코어넷 다단입력 인공신경망회로의 처리용량과 사전 무게값 계산에 관한 연구)

  • Park, Jong-Joon
    • Journal of IKEEE
    • /
    • v.15 no.4
    • /
    • pp.354-362
    • /
    • 2011
  • One of the unsolved problems in Artificial Neural Networks is related to the capacity of a neural network. This paper presents a CoreNet which has a multi-leveled input and a multi-leveled output as a 2-layered artificial neural network. I have suggested an equation for calculating the capacity of the CoreNet, which has a p-leveled input and a q-leveled output, as $a_{p,q}=\frac{1}{2}p(p-1)q^2-\frac{1}{2}(p-2)(3p-1)q+(p-1)(p-2)$. With an odd value of p and an even value of q, (p-1)(p-2)(q-2)/2 needs to be subtracted further from the above equation. The simulation model 1(3)-1(6) has 3 levels of an input and 6 levels of an output with no hidden layer. The simulation result of this model gives, out of 216 possible functions, 80 convergences for the number of implementable function using the cot(x) input leveling method. I have also shown that, from the simulation result, the two diverged functions become implementable by precalculating the weight values. The simulation result and the precalculation of the weight values give the same result as the above equation in the total number of implementable functions.

White striping degree assessment using computer vision system and consumer acceptance test

  • Kato, Talita;Mastelini, Saulo Martiello;Campos, Gabriel Fillipe Centini;Barbon, Ana Paula Ayub da Costa;Prudencio, Sandra Helena;Shimokomaki, Massami;Soares, Adriana Lourenco;Barbon, Sylvio Jr.
    • Asian-Australasian Journal of Animal Sciences
    • /
    • v.32 no.7
    • /
    • pp.1015-1026
    • /
    • 2019
  • Objective: The objective of this study was to evaluate three different degrees of white striping (WS) addressing their automatic assessment and customer acceptance. The WS classification was performed based on a computer vision system (CVS), exploring different machine learning (ML) algorithms and the most important image features. Moreover, it was verified by consumer acceptance and purchase intent. Methods: The samples for image analysis were classified by trained specialists, according to severity degrees regarding visual and firmness aspects. Samples were obtained with a digital camera, and 25 features were extracted from these images. ML algorithms were applied aiming to induce a model capable of classifying the samples into three severity degrees. In addition, two sensory analyses were performed: 75 samples properly grilled were used for the first sensory test, and 9 photos for the second. All tests were performed using a 10-cm hybrid hedonic scale (acceptance test) and a 5-point scale (purchase intention). Results: The information gain metric ranked 13 attributes. However, just one type of image feature was not enough to describe the phenomenon. The classification models support vector machine, fuzzy-W, and random forest showed the best results with similar general accuracy (86.4%). The worst performance was obtained by multilayer perceptron (70.9%) with the high error rate in normal (NORM) sample predictions. The sensory analysis of acceptance verified that WS myopathy negatively affects the texture of the broiler breast fillets when grilled and the appearance attribute of the raw samples, which influenced the purchase intention scores of raw samples. Conclusion: The proposed system has proved to be adequate (fast and accurate) for the classification of WS samples. The sensory analysis of acceptance showed that WS myopathy negatively affects the tenderness of the broiler breast fillets when grilled, while the appearance attribute of the raw samples eventually influenced purchase intentions.

Development of prediction model identifying high-risk older persons in need of long-term care (장기요양 필요 발생의 고위험 대상자 발굴을 위한 예측모형 개발)

  • Song, Mi Kyung;Park, Yeongwoo;Han, Eun-Jeong
    • The Korean Journal of Applied Statistics
    • /
    • v.35 no.4
    • /
    • pp.457-468
    • /
    • 2022
  • In aged society, it is important to prevent older people from being disability needing long-term care. The purpose of this study is to develop a prediction model to discover high-risk groups who are likely to be beneficiaries of Long-Term Care Insurance. This study is a retrospective study using database of National Health Insurance Service (NHIS) collected in the past of the study subjects. The study subjects are 7,724,101, the population over 65 years of age registered for medical insurance. To develop the prediction model, we used logistic regression, decision tree, random forest, and multi-layer perceptron neural network. Finally, random forest was selected as the prediction model based on the performances of models obtained through internal and external validation. Random forest could predict about 90% of the older people in need of long-term care using DB without any information from the assessment of eligibility for long-term care. The findings might be useful in evidencebased health management for prevention services and can contribute to preemptively discovering those who need preventive services in older people.

A modified U-net for crack segmentation by Self-Attention-Self-Adaption neuron and random elastic deformation

  • Zhao, Jin;Hu, Fangqiao;Qiao, Weidong;Zhai, Weida;Xu, Yang;Bao, Yuequan;Li, Hui
    • Smart Structures and Systems
    • /
    • v.29 no.1
    • /
    • pp.1-16
    • /
    • 2022
  • Despite recent breakthroughs in deep learning and computer vision fields, the pixel-wise identification of tiny objects in high-resolution images with complex disturbances remains challenging. This study proposes a modified U-net for tiny crack segmentation in real-world steel-box-girder bridges. The modified U-net adopts the common U-net framework and a novel Self-Attention-Self-Adaption (SASA) neuron as the fundamental computing element. The Self-Attention module applies softmax and gate operations to obtain the attention vector. It enables the neuron to focus on the most significant receptive fields when processing large-scale feature maps. The Self-Adaption module consists of a multiplayer perceptron subnet and achieves deeper feature extraction inside a single neuron. For data augmentation, a grid-based crack random elastic deformation (CRED) algorithm is designed to enrich the diversities and irregular shapes of distributed cracks. Grid-based uniform control nodes are first set on both input images and binary labels, random offsets are then employed on these control nodes, and bilinear interpolation is performed for the rest pixels. The proposed SASA neuron and CRED algorithm are simultaneously deployed to train the modified U-net. 200 raw images with a high resolution of 4928 × 3264 are collected, 160 for training and the rest 40 for the test. 512 × 512 patches are generated from the original images by a sliding window with an overlap of 256 as inputs. Results show that the average IoU between the recognized and ground-truth cracks reaches 0.409, which is 29.8% higher than the regular U-net. A five-fold cross-validation study is performed to verify that the proposed method is robust to different training and test images. Ablation experiments further demonstrate the effectiveness of the proposed SASA neuron and CRED algorithm. Promotions of the average IoU individually utilizing the SASA and CRED module add up to the final promotion of the full model, indicating that the SASA and CRED modules contribute to the different stages of model and data in the training process.

Development of Graph based Deep Learning methods for Enhancing the Semantic Integrity of Spaces in BIM Models (BIM 모델 내 공간의 시멘틱 무결성 검증을 위한 그래프 기반 딥러닝 모델 구축에 관한 연구)

  • Lee, Wonbok;Kim, Sihyun;Yu, Youngsu;Koo, Bonsang
    • Korean Journal of Construction Engineering and Management
    • /
    • v.23 no.3
    • /
    • pp.45-55
    • /
    • 2022
  • BIM models allow building spaces to be instantiated and recognized as unique objects independently of model elements. These instantiated spaces provide the required semantics that can be leveraged for building code checking, energy analysis, and evacuation route analysis. However, theses spaces or rooms need to be designated manually, which in practice, lead to errors and omissions. Thus, most BIM models today does not guarantee the semantic integrity of space designations, limiting their potential applicability. Recent studies have explored ways to automate space allocation in BIM models using artificial intelligence algorithms, but they are limited in their scope and relatively low classification accuracy. This study explored the use of Graph Convolutional Networks, an algorithm exclusively tailored for graph data structures. The goal was to utilize not only geometry information but also the semantic relational data between spaces and elements in the BIM model. Results of the study confirmed that the accuracy was improved by about 8% compared to algorithms that only used geometric distinctions of the individual spaces.

Study on Image Use for Plant Disease Classification (작물의 병충해 분류를 위한 이미지 활용 방법 연구)

  • Jeong, Seong-Ho;Han, Jeong-Eun;Jeong, Seong-Kyun;Bong, Jae-Hwan
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.17 no.2
    • /
    • pp.343-350
    • /
    • 2022
  • It is worth verifying the effectiveness of data integration between data with different features. This study investigated whether the data integration affects the accuracy of deep neural network (DNN), and which integration method shows the best improvement. This study used two different public datasets. One public dataset was taken in an actual farm in India. And another was taken in a laboratory environment in Korea. Leaf images were selected from two different public datasets to have five classes which includes normal and four different types of plant diseases. DNN used pre-trained VGG16 as a feature extractor and multi-layer perceptron as a classifier. Data were integrated into three different ways to be used for the training process. DNN was trained in a supervised manner via the integrated data. The trained DNN was evaluated by using a test dataset taken in an actual farm. DNN shows the best accuracy for the test dataset when DNN was first trained by images taken in the laboratory environment and then trained by images taken in the actual farm. The results show that data integration between plant images taken in a different environment helps improve the performance of deep neural networks. And the results also confirmed that independent use of plant images taken in different environments during the training process is more effective in improving the performance of DNN.

The Audience Behavior-based Emotion Prediction Model for Personalized Service (고객 맞춤형 서비스를 위한 관객 행동 기반 감정예측모형)

  • Ryoo, Eun Chung;Ahn, Hyunchul;Kim, Jae Kyeong
    • Journal of Intelligence and Information Systems
    • /
    • v.19 no.2
    • /
    • pp.73-85
    • /
    • 2013
  • Nowadays, in today's information society, the importance of the knowledge service using the information to creative value is getting higher day by day. In addition, depending on the development of IT technology, it is ease to collect and use information. Also, many companies actively use customer information to marketing in a variety of industries. Into the 21st century, companies have been actively using the culture arts to manage corporate image and marketing closely linked to their commercial interests. But, it is difficult that companies attract or maintain consumer's interest through their technology. For that reason, it is trend to perform cultural activities for tool of differentiation over many firms. Many firms used the customer's experience to new marketing strategy in order to effectively respond to competitive market. Accordingly, it is emerging rapidly that the necessity of personalized service to provide a new experience for people based on the personal profile information that contains the characteristics of the individual. Like this, personalized service using customer's individual profile information such as language, symbols, behavior, and emotions is very important today. Through this, we will be able to judge interaction between people and content and to maximize customer's experience and satisfaction. There are various relative works provide customer-centered service. Specially, emotion recognition research is emerging recently. Existing researches experienced emotion recognition using mostly bio-signal. Most of researches are voice and face studies that have great emotional changes. However, there are several difficulties to predict people's emotion caused by limitation of equipment and service environments. So, in this paper, we develop emotion prediction model based on vision-based interface to overcome existing limitations. Emotion recognition research based on people's gesture and posture has been processed by several researchers. This paper developed a model that recognizes people's emotional states through body gesture and posture using difference image method. And we found optimization validation model for four kinds of emotions' prediction. A proposed model purposed to automatically determine and predict 4 human emotions (Sadness, Surprise, Joy, and Disgust). To build up the model, event booth was installed in the KOCCA's lobby and we provided some proper stimulative movie to collect their body gesture and posture as the change of emotions. And then, we extracted body movements using difference image method. And we revised people data to build proposed model through neural network. The proposed model for emotion prediction used 3 type time-frame sets (20 frames, 30 frames, and 40 frames). And then, we adopted the model which has best performance compared with other models.' Before build three kinds of models, the entire 97 data set were divided into three data sets of learning, test, and validation set. The proposed model for emotion prediction was constructed using artificial neural network. In this paper, we used the back-propagation algorithm as a learning method, and set learning rate to 10%, momentum rate to 10%. The sigmoid function was used as the transform function. And we designed a three-layer perceptron neural network with one hidden layer and four output nodes. Based on the test data set, the learning for this research model was stopped when it reaches 50000 after reaching the minimum error in order to explore the point of learning. We finally processed each model's accuracy and found best model to predict each emotions. The result showed prediction accuracy 100% from sadness, and 96% from joy prediction in 20 frames set model. And 88% from surprise, and 98% from disgust in 30 frames set model. The findings of our research are expected to be useful to provide effective algorithm for personalized service in various industries such as advertisement, exhibition, performance, etc.