• Title/Summary/Keyword: AlexNet

Search Result 71

Optimizing CNN Structure to Improve Accuracy of Artwork Artist Classification

  • Ji-Seon Park;So-Yeon Kim;Yeo-Chan Yoon;Soo Kyun Kim
    • Journal of the Korea Society of Computer and Information
    • /
    • v.28 no.9
    • /
    • pp.9-15
    • /
    • 2023
  • The Metaverse is a rapidly advancing new technology. This study investigates it from the perspective of computer vision as well as from a general perspective. It presents a thorough analysis of computer-vision-related Metaverse topics, covering history, methods, architecture, benefits, and drawbacks. The future of the Metaverse and the steps needed to adapt to this technology are described. The concepts of Mixed Reality (MR), Augmented Reality (AR), Extended Reality (XR), and Virtual Reality (VR) are briefly discussed, as are the role of computer vision, its applications, its advantages and disadvantages, and future research areas.

Transfer Learning using Multiple ConvNet Layers Activation Features with Principal Component Analysis for Image Classification (전이학습 기반 다중 컨볼류션 신경망 레이어의 활성화 특징과 주성분 분석을 이용한 이미지 분류 방법)

  • Byambajav, Batkhuu;Alikhanov, Jumabek;Fang, Yang;Ko, Seunghyun;Jo, Geun Sik
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.1
    • /
    • pp.205-225
    • /
    • 2018
  • A Convolutional Neural Network (ConvNet) is a class of powerful deep neural networks that can analyze and learn hierarchies of visual features. The first such network, the Neocognitron, was introduced in the 1980s. At that time, neural networks were not widely used in either industry or academia because large-scale datasets were scarce and computational power was low. A few decades later, in 2012, Krizhevsky achieved a breakthrough in the ILSVRC-12 visual recognition competition using a Convolutional Neural Network, which revived interest in neural networks. The success of ConvNets rests on two main factors. The first is the emergence of advanced hardware (GPUs) capable of sufficient parallel computation. The second is the availability of large-scale datasets such as ImageNet (ILSVRC) for training. Unfortunately, many new domains are bottlenecked by these factors. For most domains, gathering a large-scale dataset to train a ConvNet is difficult and requires considerable effort. Moreover, even with a large-scale dataset, training a ConvNet from scratch is time-consuming and requires expensive resources. Both obstacles can be addressed with transfer learning, a method for transferring knowledge from a source domain to a new domain. There are two major transfer learning scenarios: using the ConvNet as a fixed feature extractor, and fine-tuning the ConvNet on a new dataset. In the first case, a pre-trained ConvNet (for example, one trained on ImageNet) computes feed-forward activations for each image, and activation features are extracted from specific layers. In the second case, the ConvNet classifier is replaced and retrained on the new dataset, and the weights of the pre-trained network are then fine-tuned with backpropagation. In this paper, we focus only on using multiple ConvNet layers as a fixed feature extractor.
However, applying the high-dimensional features extracted directly from multiple ConvNet layers remains a challenging problem. We observe that features extracted from different ConvNet layers capture different characteristics of the image, which means a better representation could be obtained by finding the optimal combination of layers. Based on that observation, we propose to employ multiple ConvNet layer representations for transfer learning instead of a single-layer representation. Our pipeline has three steps. First, images from the target task are fed forward through a pre-trained AlexNet, and the activation features from its three fully connected layers are extracted. Second, the activation features of the three layers are concatenated to obtain a multi-layer representation that carries more information about the image. When the three fully connected layer features are concatenated, the resulting image representation has 9192 (4096+4096+1000) dimensions. However, features extracted from multiple layers of the same ConvNet are redundant and noisy. Thus, in the third step, we use Principal Component Analysis (PCA) to select salient features before the training phase. With salient features, the classifier can classify images more accurately, and the performance of transfer learning improves. To evaluate the proposed method, experiments are conducted on three standard datasets (Caltech-256, VOC07, and SUN397), comparing multiple-layer representations against single-layer representations, with PCA used for feature selection and dimension reduction. Our experiments demonstrate the importance of feature selection for multiple ConvNet layer representations.
Moreover, our proposed approach achieved 75.6% accuracy compared to 73.9% for the FC7 layer on the Caltech-256 dataset, 73.1% compared to 69.2% for the FC8 layer on the VOC07 dataset, and 52.2% compared to 48.7% for the FC7 layer on the SUN397 dataset. We also showed that our proposed approach outperforms existing work, with accuracy improvements of 2.8%, 2.1%, and 3.1% on the Caltech-256, VOC07, and SUN397 datasets, respectively.
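The concatenate-then-PCA steps described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: random arrays stand in for the AlexNet activations, the layer names `fc6`, `fc7`, `fc8`, the helper names, and the choice of 16 components are all assumptions, and PCA is done with a plain SVD in NumPy.

```python
import numpy as np

def concat_layer_features(fc6, fc7, fc8):
    """Concatenate per-image activations from three layers along the feature axis."""
    return np.concatenate([fc6, fc7, fc8], axis=1)

def pca_fit(X, n_components):
    """Fit PCA via SVD on mean-centred data; returns the mean and principal axes."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def pca_transform(X, mean, components):
    """Project data onto the retained principal axes."""
    return (X - mean) @ components.T

# Placeholder activations (assumption): real values would come from a
# pre-trained AlexNet's three fully connected layers.
rng = np.random.default_rng(0)
n_images = 32
fc6 = rng.standard_normal((n_images, 4096))
fc7 = rng.standard_normal((n_images, 4096))
fc8 = rng.standard_normal((n_images, 1000))

features = concat_layer_features(fc6, fc7, fc8)          # 4096 + 4096 + 1000 columns
mean, components = pca_fit(features, n_components=16)    # 16 is an arbitrary choice
reduced = pca_transform(features, mean, components)
print(features.shape, reduced.shape)
```

The reduced matrix would then be passed to whatever classifier is trained on the target dataset; the abstract does not say which classifier or how many components were kept.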

Comparison of CNN Structures for Detection of Surface Defects (표면 결함 검출을 위한 CNN 구조의 비교)

  • Choi, Hakyoung;Seo, Kisung
    • The Transactions of The Korean Institute of Electrical Engineers
    • /
    • v.66 no.7
    • /
    • pp.1100-1104
    • /
    • 2017
  • Detector-based approaches show limited performance in inspecting defects such as shallow fine cracks and defects that are indistinguishable from the background. Deep learning is widely used for object recognition, and its application to defect detection has gradually been attempted. Deep learning requires a huge amount of training data, but data acquisition can be limited in some industrial applications. This work investigates the possibility of applying CNN, one of the deep learning approaches, to surface defect inspection for industrial parts where detection is difficult and training data are insufficient. VOV is adopted for pre-processing and to obtain a reasonable number of ROIs for data augmentation. A CNN is then applied for classification. Three CNN networks (AlexNet, VGGNet, and a modified VGGNet) are compared in defect detection experiments.

Thermal transport study in actinide oxides with point defects

  • Resnick, Alex;Mitchell, Katherine;Park, Jungkyu;Farfan, Eduardo B.;Yee, Tien
    • Nuclear Engineering and Technology
    • /
    • v.51 no.5
    • /
    • pp.1398-1405
    • /
    • 2019
  • We use molecular dynamics simulation to explore thermal transport in oxide nuclear fuels with point defects. The effect of vacancy and substitutional defects on the thermal conductivity of plutonium dioxide and uranium dioxide is investigated. The thermal conductivities of these fuels are reduced significantly by the presence of a small amount of vacancy defects; 0.1% oxygen vacancies reduce the thermal conductivity of plutonium dioxide by more than 10%. The absence of larger atoms has a more detrimental impact on the thermal conductivity of actinide oxides. In uranium dioxide, for example, 0.1% uranium vacancies decrease the thermal conductivity by 24.6%, while the same concentration of oxygen vacancies decreases it by 19.4%. However, uranium substitution has a minimal effect on the thermal conductivity; 1.0% uranium substitution decreases the thermal conductivity of plutonium dioxide by only 1.5%.

Lane departure detection method using driving lane recognition based on deep learning (딥러닝 기반 주행 차로 인식 기법을 활용한 차선 변경 검출 기술)

  • Lee, Kyung-Min;Song, Hyok;Kim, Je Woo;Choi, Byeongho;Lin, Chi-Ho
    • Proceedings of the Korean Society of Broadcast Engineers Conference
    • /
    • 2018.06a
    • /
    • pp.332-333
    • /
    • 2018
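  • In this paper, we propose a lane-change detection technique that uses deep-learning-based driving lane recognition. The proposed method uses three kinds of image data (driving lane, left and right lanes, and vehicles), divided into training, validation, and test sets. A modified AlexNet model was developed to recognize the driving lane and lane changes. In experiments, the recognition rates were 69.45% for the driving lane, 66.9% for the left and right lanes, and 76.4% for vehicles, which is better than existing pattern recognition methods.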


A Study on the Classification of Surface Defect Based on Deep Convolution Network and Transfer-learning (신경망과 전이학습 기반 표면 결함 분류에 관한 연구)

  • Kim, Sung Joo;Kim, Gyung Bum
    • Journal of the Semiconductor & Display Technology
    • /
    • v.20 no.1
    • /
    • pp.64-69
    • /
    • 2021
  • This paper studies a method for improving defect classification performance on low-contrast, non-uniform, and featureless steel plate surfaces, based on a deep convolutional neural network and a transfer-learning neural network. Because steel plate surface images are low-contrast, non-uniform, and featureless, defect regions are hard to discriminate from defect-free regions. These characteristics make it difficult to extract features from surface defect images. A classifier based on a deep convolutional neural network is constructed to extract features automatically so that images with these characteristics can be classified effectively. In the experiments, the AlexNet-based transfer-learning classifier achieved a classification performance of 99.43% with less than 160 seconds of training time. The proposed classification system performed well on low-contrast, non-uniform, and featureless surface images.

Research on Methods to Increase Recognition Rate of Korean Sign Language using Deep Learning

  • So-Young Kwon;Yong-Hwan Lee
    • Journal of Platform Technology
    • /
    • v.12 no.1
    • /
    • pp.3-11
    • /
    • 2024
  • Deaf people who use sign language as their first language sometimes have difficulty communicating because they do not know spoken Korean. Deaf people are also members of society, so we must help create a society where everyone can live together. In this paper, we present a method to increase the recognition rate of Korean sign language using a CNN model. When the original image was used as input to the CNN model, the accuracy was 0.96; when the image corresponding to the skin area in the YCbCr color space was used as input, the accuracy was 0.72. This confirmed that feeding in the original image itself leads to better results. In other studies, the accuracy of a combined Conv1d and LSTM model was 0.92, and the accuracy of the AlexNet model was 0.92. The CNN model proposed in this paper achieves an accuracy of 0.96 and is shown to be helpful in recognizing Korean sign language.
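The skin-area input mentioned in this abstract could be produced along the following lines. This is a sketch under stated assumptions, not the paper's procedure: the conversion uses the standard full-range BT.601 (JPEG) coefficients, while the Cb and Cr thresholds are commonly quoted illustrative values that the abstract does not specify, and the function names are hypothetical.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert 8-bit RGB to full-range YCbCr (ITU-R BT.601 / JPEG coefficients)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr

# Assumed skin thresholds (not taken from the paper).
CB_RANGE = (77, 127)
CR_RANGE = (133, 173)

def is_skin(r, g, b):
    """Return True when the pixel's chroma falls inside the assumed skin ranges."""
    _, cb, cr = rgb_to_ycbcr(r, g, b)
    return CB_RANGE[0] <= cb <= CB_RANGE[1] and CR_RANGE[0] <= cr <= CR_RANGE[1]

def skin_only(image):
    """Keep skin-coloured pixels and set every other pixel to black.

    `image` is a list of rows, each row a list of (r, g, b) tuples.
    """
    return [[px if is_skin(*px) else (0, 0, 0) for px in row] for row in image]

# Tiny 2x2 example: two skin-like tones, pure blue, and white.
image = [[(200, 150, 120), (0, 0, 255)],
         [(255, 255, 255), (210, 160, 130)]]
masked = skin_only(image)
print(masked)
```

Masking of this kind discards everything outside the thresholded chroma range, which may be one reason the original image performed better in the reported experiment; the abstract itself gives no explanation.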


Use of deep learning in nano image processing through the CNN model

  • Xing, Lumin;Liu, Wenjian;Liu, Xiaoliang;Li, Xin;Wang, Han
    • Advances in nano research
    • /
    • v.12 no.2
    • /
    • pp.185-195
    • /
    • 2022
  • Deep learning is a field of artificial intelligence (AI) used for computer-aided diagnosis (CAD) and image processing in scientific research. Reading image slices involves numerous mechanical, repetitive tasks, takes time, and is subject to geographical limits, and interpreting image information is hard because of its strong subjectivity, which raises the misdiagnosis error rate. Because lung cancer has the highest mortality rate, a biopsy is needed to determine its class for further treatment. Deep learning has recently provided strong tools for diagnosing lung cancer and planning therapeutic regimens. However, identifying the pathological class of lung cancer from CT images at an early stage is difficult because powerful AI models and public training datasets are lacking. A Convolutional Neural Network (CNN) is proposed for recognizing pathological CT images. 472 patients who underwent staging FDG-PET/CT within 2 months prior to surgery or biopsy were selected. The CNN achieved accuracies of 87%, 69%, and 69% on the training, validation, and test sets, respectively, for T1-T2 versus T3-T4 lung cancer classification. CNN (or deep learning) could therefore improve the CT image dataset, indicating that applying classifiers is adequate to achieve better accuracy in distinguishing pathological CT images, outperforming several deep learning models such as ResNet-34, AlexNet, and DenseNet, with or without Softmax weights.

Line-Segment Feature Analysis Algorithm for Handwritten-Digits Data Reduction (필기체 숫자 데이터 차원 감소를 위한 선분 특징 분석 알고리즘)

  • Kim, Chang-Min;Lee, Woo-Beom
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.10 no.4
    • /
    • pp.125-132
    • /
    • 2021
  • As the layers of an artificial neural network deepen and the dimension of the input data increases, learning and recognition in the neural network (NN) require a large number of arithmetic operations at high speed. Thus, this study proposes a data dimensionality reduction method to reduce the dimension of the NN's input data. The proposed Line-segment Feature Analysis (LFA) algorithm applies a gradient-based edge detection algorithm using median filters to analyze the line-segment features of the objects in an image. For the extracted edge image, the eigenvalues corresponding to eight kinds of line segments are calculated using 3×3 or 5×5 detection filters whose coefficient values include [0, 1, 2, 4, 8, 16, 32, 64, 128]. Two one-dimensional 256-sized data are produced by accumulating the identical response values of the eigenvalues calculated with each detection filter, and the two data elements are added up. Two LFA256 data are merged to produce the 512-sized LFA512 data. To evaluate the proposed LFA algorithm for reducing data dimension in handwritten digit recognition, a comparative experiment was conducted using the PCA technique and the AlexNet model; LFA256 and LFA512 showed recognition performance of 98.7% and 99%, respectively.
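One way to see how a 3×3 filter with power-of-two coefficients can yield a 256-sized vector is the loose sketch below. It is not the authors' LFA algorithm: the arrangement of the weights, the binary edge input, and the function name are assumptions made purely for illustration. With a zero centre and the eight neighbours weighted 1 to 128, every filter response on a binary edge image is an integer from 0 to 255, so counting responses gives a 256-bin histogram.

```python
# Hypothetical 3x3 mask: centre weight 0, neighbours weighted by powers of two.
WEIGHTS = [
    [1, 2, 4],
    [128, 0, 8],
    [64, 32, 16],
]

def neighbourhood_histogram(edges):
    """Accumulate a 256-bin histogram of weighted 3x3 responses.

    `edges` is a 2-D list of 0/1 values (a binary edge image). Border pixels
    are skipped so that every evaluated pixel has a full neighbourhood.
    """
    height, width = len(edges), len(edges[0])
    hist = [0] * 256
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            code = sum(
                WEIGHTS[dy][dx] * edges[y + dy - 1][x + dx - 1]
                for dy in range(3)
                for dx in range(3)
            )
            hist[code] += 1
    return hist

# 5x5 edge image containing a single horizontal line in the middle row.
edges = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
hist = neighbourhood_histogram(edges)
print(len(hist), sum(hist))
```

Each distinct local line pattern maps to its own code, so the histogram summarises which line-segment shapes occur in the image regardless of the image's original size.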

A comparison of deep-learning models to the forecast of the daily solar flare occurrence using various solar images

  • Shin, Seulki;Moon, Yong-Jae;Chu, Hyoungseok
    • The Bulletin of The Korean Astronomical Society
    • /
    • v.42 no.2
    • /
    • pp.61.1-61.1
    • /
    • 2017
  • As deep-learning methods have succeeded in various fields, they have high potential for space weather forecasting. The convolutional neural network, one of the deep learning methods, is specialized in image recognition. In this study, we apply the AlexNet architecture, the winner of the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) 2012, to the forecast of daily solar flare occurrence using the MatConvNet software for MATLAB. Our input images are SOHO/MDI, EIT 195 Å, and 304 Å from January 1996 to December 2010, and the outputs are yes or no for flare occurrence. We also consider other inputs consisting of the last two images and their difference image. We select the training dataset from Jan 1996 to Dec 2000 and from Jan 2003 to Dec 2008. The testing dataset is chosen from Jan 2001 to Dec 2002 and from Jan 2009 to Dec 2010 in order to account for the solar cycle effect. Within the training dataset, we randomly select one fifth of the data as a validation dataset to avoid over-fitting. Our model successfully forecasts flare occurrence with a probability of detection (POD) of about 0.90 for common flares (C-, M-, and X-class). While the POD for major flares (M- and X-class) is 0.96, the false alarm rate (FAR) is also relatively high (0.60). We also present several statistical parameters such as the critical success index (CSI) and true skill statistics (TSS). None of the statistical parameters depends strongly on the number of input data sets. Our model can be applied immediately to an automatic forecasting service when image data are available.
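The verification scores named in this abstract are conventionally derived from a 2×2 contingency table of yes/no forecasts. The sketch below uses the standard textbook definitions and assumes FAR means the false alarm ratio (false alarms divided by all "yes" forecasts), which the abstract does not spell out. The counts are made-up illustrative numbers, not results from the study.

```python
def forecast_scores(hits, misses, false_alarms, correct_negatives):
    """Compute common yes/no forecast verification scores from a contingency table."""
    pod = hits / (hits + misses)                           # probability of detection
    far = false_alarms / (hits + false_alarms)             # false alarm ratio (assumed meaning of FAR)
    csi = hits / (hits + misses + false_alarms)            # critical success index
    pofd = false_alarms / (false_alarms + correct_negatives)  # probability of false detection
    tss = pod - pofd                                       # true skill statistic
    return {"POD": pod, "FAR": far, "CSI": csi, "TSS": tss}

# Hypothetical counts for illustration only.
scores = forecast_scores(hits=90, misses=10, false_alarms=60, correct_negatives=140)
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

A high POD paired with a high FAR, as reported for major flares, means the model catches most events but also issues many "yes" forecasts that do not verify; CSI and TSS penalise that combination in different ways.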
