• Title/Summary/Keyword: LeNet-5

Search Result 39, Processing Time 0.027 seconds

Streamlined GoogLeNet Algorithm Based on CNN for Korean Character Recognition (한글 인식을 위한 CNN 기반의 간소화된 GoogLeNet 알고리즘 연구)

  • Kim, Yeon-gyu;Cha, Eui-young
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.20 no.9
    • /
    • pp.1657-1665
    • /
    • 2016
  • Various fields are being researched through Deep Learning using CNN(Convolutional Neural Network) and these researches show excellent performance in the image recognition. In this paper, we provide streamlined GoogLeNet of CNN architecture that is capable of learning a large-scale Korean character database. The experimental data used in this paper is PHD08 that is the large-scale of Korean character database. PHD08 has 2,187 samples for each character and there are 2,350 Korean characters that make total 5,139,450 sample data. As a training result, streamlined GoogLeNet showed over 99% of test accuracy at PHD08. Also, we made additional Korean character data that have fonts that are not in the PHD08 in order to ensure objectivity and we compared the performance of classification between streamlined GoogLeNet and other OCR programs. While other OCR programs showed a classification success rate of 66.95% to 83.16%, streamlined GoogLeNet showed 89.14% of the classification success rate that is higher than other OCR program's rate.

Improving Test Accuracy on the MNIST Dataset using a Simple CNN with Batch Normalization

  • Seungbin Lee;Jungsoo Rhee
    • Journal of the Korea Society of Computer and Information
    • /
    • v.29 no.9
    • /
    • pp.1-7
    • /
    • 2024
  • In this paper, we proposes a Convolutional Neural Networks(CNN) equipped with Batch Normalization(BN) for handwritten digit recognition training the MNIST dataset. Aiming to surpass the performance of LeNet-5 by LeCun et al., a 6-layer neural network was designed. The proposed model processes 28×28 pixel images through convolution, Max Pooling, and Fully connected layers, with the batch normalization to improve learning stability and performance. The experiment utilized 60,000 training images and 10,000 test images, applying the Momentum optimization algorithm. The model configuration used 30 filters with a 5×5 filter size, padding 0, stride 1, and ReLU as activation function. The training process was set with a mini-batch size of 100, 20 epochs in total, and a learning rate of 0.1. As a result, the proposed model achieved a test accuracy of 99.22%, surpassing LeNet-5's 99.05%, and recorded an F1-score of 0.9919, demonstrating the model's performance. Moreover, the 6-layer model proposed in this paper emphasizes model efficiency with a simpler structure compared to LeCun et al.'s LeNet-5 (7-layer model) and the model proposed by Ji, Chun and Kim (10-layer model). The results of this study show potential for application in real industrial applications such as AI vision inspection systems. It is expected to be effectively applied in smart factories, particularly in determining the defective status of parts.

Variations of AlexNet and GoogLeNet to Improve Korean Character Recognition Performance

  • Lee, Sang-Geol;Sung, Yunsick;Kim, Yeon-Gyu;Cha, Eui-Young
    • Journal of Information Processing Systems
    • /
    • v.14 no.1
    • /
    • pp.205-217
    • /
    • 2018
  • Deep learning using convolutional neural networks (CNNs) is being studied in various fields of image recognition and these studies show excellent performance. In this paper, we compare the performance of CNN architectures, KCR-AlexNet and KCR-GoogLeNet. The experimental data used in this paper is obtained from PHD08, a large-scale Korean character database. It has 2,187 samples of each Korean character with 2,350 Korean character classes for a total of 5,139,450 data samples. In the training results, KCR-AlexNet showed an accuracy of over 98% for the top-1 test and KCR-GoogLeNet showed an accuracy of over 99% for the top-1 test after the final training iteration. We made an additional Korean character dataset with fonts that were not in PHD08 to compare the classification success rate with commercial optical character recognition (OCR) programs and ensure the objectivity of the experiment. While the commercial OCR programs showed 66.95% to 83.16% classification success rates, KCR-AlexNet and KCR-GoogLeNet showed average classification success rates of 90.12% and 89.14%, respectively, which are higher than the commercial OCR programs' rates. Considering the time factor, KCR-AlexNet was faster than KCR-GoogLeNet when they were trained using PHD08; otherwise, KCR-GoogLeNet had a faster classification speed.

Apply Locally Weight Parameter Elimination for CNN Model Compression (지역적 가중치 파라미터 제거를 적용한 CNN 모델 압축)

  • Lim, Su-chang;Kim, Do-yeon
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.22 no.9
    • /
    • pp.1165-1171
    • /
    • 2018
  • CNN requires a large amount of computation and memory in the process of extracting the feature of the object. Also, It is trained from the network that the user has configured, and because the structure of the network is fixed, it can not be modified during training and it is also difficult to use it in a mobile device with low computing power. To solve these problems, we apply a pruning method to the pre-trained weight file to reduce computation and memory requirements. This method consists of three steps. First, all the weights of the pre-trained network file are retrieved for each layer. Second, take an absolute value for the weight of each layer and obtain the average. After setting the average to a threshold, remove the weight below the threshold. Finally, the network file applied the pruning method is re-trained. We experimented with LeNet-5 and AlexNet, achieved 31x on LeNet-5 and 12x on AlexNet.

Architectures of Convolutional Neural Networks for the Prediction of Protein Secondary Structures (단백질 이차 구조 예측을 위한 합성곱 신경망의 구조)

  • Chi, Sang-Mun
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.22 no.5
    • /
    • pp.728-733
    • /
    • 2018
  • Deep learning has been actively studied for predicting protein secondary structure based only on the sequence information of the amino acids constituting the protein. In this paper, we compared the performances of the convolutional neural networks of various structures to predict the protein secondary structure. To investigate the optimal depth of the layer of neural network for the prediction of protein secondary structure, the performance according to the number of layers was investigated. We also applied the structure of GoogLeNet and ResNet which constitute building blocks of many image classification methods. These methods extract various features from input data, and smooth the gradient transmission in the learning process even using the deep layer. These architectures of convolutional neural networks were modified to suit the characteristics of protein data to improve performance.

A Deep Learning-based Hand Gesture Recognition Robust to External Environments (외부 환경에 강인한 딥러닝 기반 손 제스처 인식)

  • Oh, Dong-Han;Lee, Byeong-Hee;Kim, Tae-Young
    • The Journal of Korean Institute of Next Generation Computing
    • /
    • v.14 no.5
    • /
    • pp.31-39
    • /
    • 2018
  • Recently, there has been active studies to provide a user-friendly interface in a virtual reality environment by recognizing user hand gestures based on deep learning. However, most studies use separate sensors to obtain hand information or go through pre-process for efficient learning. It also fails to take into account changes in the external environment, such as changes in lighting or some of its hands being obscured. This paper proposes a hand gesture recognition method based on deep learning that is strong in external environments without the need for pre-process of RGB images obtained from general webcam. In this paper we improve the VGGNet and the GoogLeNet structures and compared the performance of each structure. The VGGNet and the GoogLeNet structures presented in this paper showed a recognition rate of 93.88% and 93.75%, respectively, based on data containing dim, partially obscured, or partially out-of-sight hand images. In terms of memory and speed, the GoogLeNet used about 3 times less memory than the VGGNet, and its processing speed was 10 times better. The results of this paper can be processed in real-time and used as a hand gesture interface in various areas such as games, education, and medical services in a virtual reality environment.

Data Preprocessing Method for Lightweight Automotive Intrusion Detection System (차량용 경량화 침입 탐지 시스템을 위한 데이터 전처리 기법)

  • Sangmin Park;Hyungchul Im;Seongsoo Lee
    • Journal of IKEEE
    • /
    • v.27 no.4
    • /
    • pp.531-536
    • /
    • 2023
  • This paper proposes a sliding window method with frame feature insertion for immediate attack detection on in-vehicle networks. This method guarantees real-time attack detection by labeling based on the attack status of the current frame. Experiments show that the proposed method improves detection performance by giving more weight to the current frame in CNN computation. The proposed model was designed based on a lightweight LeNet-5 architecture and it achieves 100% detection for DoS attacks. Additionally, by comparing the complexity with conventional models, the proposed model has been proven to be more suitable for resource-constrained devices like ECUs.

Accurate Human Localization for Automatic Labelling of Human from Fisheye Images

  • Than, Van Pha;Nguyen, Thanh Binh;Chung, Sun-Tae
    • Journal of Korea Multimedia Society
    • /
    • v.20 no.5
    • /
    • pp.769-781
    • /
    • 2017
  • Deep learning networks like Convolutional Neural Networks (CNNs) show successful performances in many computer vision applications such as image classification, object detection, and so on. For implementation of deep learning networks in embedded system with limited processing power and memory, deep learning network may need to be simplified. However, simplified deep learning network cannot learn every possible scene. One realistic strategy for embedded deep learning network is to construct a simplified deep learning network model optimized for the scene images of the installation place. Then, automatic training will be necessitated for commercialization. In this paper, as an intermediate step toward automatic training under fisheye camera environments, we study more precise human localization in fisheye images, and propose an accurate human localization method, Automatic Ground-Truth Labelling Method (AGTLM). AGTLM first localizes candidate human object bounding boxes by utilizing GoogLeNet-LSTM approach, and after reassurance process by GoogLeNet-based CNN network, finally refines them more correctly and precisely(tightly) by applying saliency object detection technique. The performance improvement of the proposed human localization method, AGTLM with respect to accuracy and tightness is shown through several experiments.

Mushroom Image Recognition using Convolutional Neural Network and Transfer Learning (컨볼루션 신경망과 전이 학습을 이용한 버섯 영상 인식)

  • Kang, Euncheol;Han, Yeongtae;Oh, Il-Seok
    • KIISE Transactions on Computing Practices
    • /
    • v.24 no.1
    • /
    • pp.53-57
    • /
    • 2018
  • A poisoning accident is often caused by a situation in which people eat poisonous mushrooms because they cannot distinguish between edible mushrooms and poisonous mushrooms. In this paper, we propose an automatic mushroom recognition system by using the convolutional neural network. We collected 1478 mushroom images of 38 species using image crawling, and used the dataset for learning the convolutional neural network. A comparison experiment using AlexNet, VGGNet, and GoogLeNet was performed using the collected datasets, and a comparison experiment using a class number expansion and a fine-tuning technique for transfer learning were performed. As a result of our experiment, we achieve 82.63% top-1 accuracy and 96.84% top-5 accuracy on test set of our dataset.

Performance Comparison of Commercial and Customized CNN for Detection in Nodular Lung Cancer (결절성 폐암 검출을 위한 상용 및 맞춤형 CNN의 성능 비교)

  • Park, Sung-Wook;Kim, Seunghyun;Lim, Su-Chang;Kim, Do-Yeon
    • Journal of Korea Multimedia Society
    • /
    • v.23 no.6
    • /
    • pp.729-737
    • /
    • 2020
  • Screening with low-dose spiral computed tomography (LDCT) has been shown to reduce lung cancer mortality by about 20% when compared to standard chest radiography. One of the problems arising from screening programs is that large amounts of CT image data must be interpreted by radiologists. To solve this problem, automated detection of pulmonary nodules is necessary; however, this is a challenging task because of the high number of false positive results. Here we demonstrate detection of pulmonary nodules using six off-the-shelf convolutional neural network (CNN) models after modification of the input/output layers and end-to-end training based on publicly databases for comparative evaluation. We used the well-known CNN models, LeNet-5, VGG-16, GoogLeNet Inception V3, ResNet-152, DensNet-201, and NASNet. Most of the CNN models provided superior results to those of obtained using customized CNN models. It is more desirable to modify the proven off-the-shelf network model than to customize the network model to detect the pulmonary nodules.