• Title/Summary/Keyword: Convolutional Neural Network Model (합성곱 신경망 모델)


New Hybrid Approach of CNN and RNN based on Encoder and Decoder (인코더와 디코더에 기반한 합성곱 신경망과 순환 신경망의 새로운 하이브리드 접근법)

  • Jongwoo Woo;Gunwoo Kim;Keunho Choi
    • Information Systems Review
    • /
    • v.25 no.1
    • /
    • pp.129-143
    • /
    • 2023
  • In the era of big data, the field of artificial intelligence is showing remarkable growth, and image classification by deep learning in particular has become an important area. Various studies have been actively conducted to further improve the performance of CNNs, which are widely used in image classification; a representative approach is the Convolutional Recurrent Neural Network (CRNN) algorithm. The CRNN algorithm combines a CNN for image classification with an RNN for recognizing time-series elements. However, since the inputs to the RNN part of a CRNN are the flattened values obtained by applying convolution and pooling to the image, pixel values at the same position in the image appear in a different order, which makes it difficult for the RNN to properly learn the spatial arrangement intended in the image. Therefore, this study aims to improve image classification performance by proposing a novel hybrid method of CNN and RNN that applies the concepts of encoder and decoder. The effectiveness of the new hybrid method was verified through various experiments. This study has academic implications in that it broadens the applicability of the encoder and decoder concepts, and the proposed method has advantages in model training time and infrastructure cost because it does not significantly increase complexity compared to conventional hybrid methods. In addition, this study has practical implications in that it presents the possibility of improving the quality of services in various fields that require accurate image classification.
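The sketch below illustrates the general CNN-plus-RNN hybrid idea discussed in this abstract: a convolutional encoder produces a feature map whose columns are read as a sequence by a recurrent stage before classification. It is a minimal PyTorch illustration with arbitrary layer sizes, not the authors' architecture.

```python
# Minimal sketch (not the authors' exact architecture): a CNN "encoder"
# produces a feature map whose columns are fed as a sequence to a GRU
# before classification, illustrating the CNN+RNN hybrid idea.
import torch
import torch.nn as nn

class CnnRnnHybrid(nn.Module):
    def __init__(self, num_classes=10, rnn_hidden=128):
        super().__init__()
        # CNN encoder: reduces a 1x32x32 image to a 64x8x8 feature map
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        # RNN stage: reads the feature map column by column as a sequence
        self.rnn = nn.GRU(input_size=64 * 8, hidden_size=rnn_hidden,
                          batch_first=True)
        self.classifier = nn.Linear(rnn_hidden, num_classes)

    def forward(self, x):                      # x: (B, 1, 32, 32)
        f = self.encoder(x)                    # (B, 64, 8, 8)
        seq = f.permute(0, 3, 1, 2)            # (B, W=8, C=64, H=8)
        seq = seq.flatten(2)                   # (B, 8, 512): sequence of columns
        _, h = self.rnn(seq)                   # h: (1, B, rnn_hidden)
        return self.classifier(h.squeeze(0))   # (B, num_classes)

model = CnnRnnHybrid()
logits = model(torch.randn(4, 1, 32, 32))
print(logits.shape)  # torch.Size([4, 10])
```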

Deep learning based Person Re-identification with RGB-D sensors

  • Kim, Min;Park, Dong-Hyun
    • Journal of the Korea Society of Computer and Information
    • /
    • v.26 no.3
    • /
    • pp.35-42
    • /
    • 2021
  • In this paper, we propose a deep learning-based person re-identification method using a three-dimensional RGB-Depth Xtion2 camera that considers joint coordinates and dynamic features (velocity, acceleration). The main idea of the proposed identification methodology is to easily extract gait data such as joint coordinates and dynamic features with an RGB-D camera and to automatically identify gait patterns through a self-designed one-dimensional convolutional neural network classifier (1D-ConvNet). Accuracy was measured with the F1 score, and the influence of the dynamic features was measured by comparing against a classifier model (JC) that did not consider them. As a result, our proposed classifier model that considers the dynamic features (JCSpeed) showed an F1 score about 8% higher than JC.
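As a rough illustration of the kind of classifier described above, the following is a minimal 1D-ConvNet over gait sequences; the channel layout (25 joints × 3 coordinates) and the number of identities are assumptions for illustration, not values from the paper.

```python
# A minimal sketch (not the authors' exact 1D-ConvNet): a 1D convolutional
# classifier over gait sequences whose channels hold joint coordinates plus
# hypothetical velocity/acceleration features.
import torch
import torch.nn as nn

class Gait1DConvNet(nn.Module):
    def __init__(self, in_channels=75, num_persons=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(in_channels, 64, kernel_size=5, padding=2), nn.ReLU(),
            nn.MaxPool1d(2),
            nn.Conv1d(64, 128, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),             # pool over the time axis
        )
        self.classifier = nn.Linear(128, num_persons)

    def forward(self, x):                        # x: (B, channels, time)
        return self.classifier(self.features(x).squeeze(-1))

# e.g. 25 joints x 3 coordinates = 75 channels, 60 frames per walking sequence
model = Gait1DConvNet(in_channels=75, num_persons=10)
print(model(torch.randn(8, 75, 60)).shape)       # torch.Size([8, 10])
```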

A Study on Improvement of Dynamic Object Detection using Dense Grid Model and Anchor Model (고밀도 그리드 모델과 앵커모델을 이용한 동적 객체검지 향상에 관한 연구)

  • Yun, Borin;Lee, Sun Woo;Choi, Ho Kyung;Lee, Sangmin;Kwon, Jang Woo
    • The Journal of The Korea Institute of Intelligent Transport Systems
    • /
    • v.17 no.3
    • /
    • pp.98-110
    • /
    • 2018
  • In this paper, we propose both a Dense grid model and an Anchor model to improve the recognition rate of dynamic objects. Two experiments are conducted to study the performance of the two proposed CNN models (Dense grid model and Anchor model) for detecting dynamic objects. In the first experiment, the YOLO-v2 network is adjusted and then fine-tuned on the KITTI dataset. The Dense grid model and Anchor model are then compared with YOLO-v2. In the evaluation, the two models outperform YOLO-v2 by 6.26% to 10.99% on car detection at different difficulty levels. In the second experiment, the models are further trained on a new dataset, where the two models outperform YOLO-v2 by up to 22.40% on car detection at different difficulty levels.
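Both proposed models build on grid-and-anchor detection as in YOLO-v2. The snippet below is a generic sketch of anchor-box generation over a detection grid; the grid size, stride, anchor sizes, and aspect ratios are illustrative values, not those used in the paper.

```python
# A minimal sketch (hypothetical values, not the paper's models): generating
# anchor boxes on a detection grid, the mechanism an anchor-based detector uses.
import numpy as np

def make_anchors(grid_h, grid_w, stride, sizes, ratios):
    """Return (grid_h*grid_w*len(sizes)*len(ratios), 4) boxes as (x1, y1, x2, y2)."""
    boxes = []
    for gy in range(grid_h):
        for gx in range(grid_w):
            cx, cy = (gx + 0.5) * stride, (gy + 0.5) * stride   # cell centre
            for s in sizes:
                for r in ratios:
                    w, h = s * np.sqrt(r), s / np.sqrt(r)
                    boxes.append([cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2])
    return np.array(boxes, dtype=np.float32)

# 13x13 grid with a 32-pixel stride (YOLO-v2-like), 3 sizes x 3 aspect ratios
anchors = make_anchors(13, 13, 32, sizes=(64, 128, 256), ratios=(0.5, 1.0, 2.0))
print(anchors.shape)  # (1521, 4)
```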

Business Application of Convolutional Neural Networks for Apparel Classification Using Runway Image (합성곱 신경망의 비지니스 응용: 런웨이 이미지를 사용한 의류 분류를 중심으로)

  • Seo, Yian;Shin, Kyung-shik
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.3
    • /
    • pp.1-19
    • /
    • 2018
  • A large amount of data is now available for research and business sectors to extract knowledge from. This data can take the form of unstructured data such as audio, text, and images and can be analyzed by deep learning methodology. Deep learning is now widely used for various estimation, classification, and prediction problems. In particular, the fashion business adopts deep learning techniques for apparel recognition, apparel search and retrieval engines, and automatic product recommendation. The core model of these applications is image classification using Convolutional Neural Networks (CNN). A CNN is made up of neurons that learn parameters such as weights as inputs pass through the network toward the outputs. The CNN layer structure is well suited for image classification, as it is composed of convolutional layers for generating feature maps, pooling layers for reducing the dimensionality of the feature maps, and fully connected layers for classifying the extracted features. However, most classification models have been trained on online product images, which are taken under controlled conditions, such as images of the apparel itself or of professional models wearing it. Such images may not be effective for training the classification model when one wants to classify street fashion images or walking images, which are taken in uncontrolled situations and involve people's movement and unexpected poses. Therefore, we propose to train the model with a runway apparel image dataset that captures mobility. This allows the classification model to be trained with far more variable data and enhances its adaptation to diverse query images. To achieve both convergence and generalization of the model, we apply transfer learning to our training network. As transfer learning in CNNs is composed of pre-training and fine-tuning stages, we divide training into two steps. First, we pre-train our architecture on a large-scale dataset, the ImageNet dataset, which consists of 1.2 million images in 1,000 categories including animals, plants, activities, materials, instruments, scenes, and foods. We use GoogLeNet as our main architecture, as it achieved great accuracy with efficiency in the ImageNet Large Scale Visual Recognition Challenge (ILSVRC). Second, we fine-tune the network with our own runway image dataset. Since we could not find any previously published runway image dataset, we collected one from Google Image Search, attaining 2,426 images of 32 major fashion brands including Anna Molinari, Balenciaga, Balmain, Brioni, Burberry, Celine, Chanel, Chloe, Christian Dior, Cividini, Dolce and Gabbana, Emilio Pucci, Ermenegildo, Fendi, Giuliana Teso, Gucci, Issey Miyake, Kenzo, Leonard, Louis Vuitton, Marc Jacobs, Marni, Max Mara, Missoni, Moschino, Ralph Lauren, Roberto Cavalli, Sonia Rykiel, Stella McCartney, Valentino, Versace, and Yves Saint Laurent. We perform 10-fold experiments to account for the random generation of training data, and our proposed model achieved an accuracy of 67.2% on the final test. Our research offers several advantages over previous related studies: to the best of our knowledge, no previous study has trained a network for apparel image classification on a runway image dataset. We suggest the idea of training the model with images capturing all possible postures, which we denote as mobility, by using our own runway apparel image dataset. Moreover, by applying transfer learning and using the checkpoint and parameters provided by TensorFlow-Slim, we could reduce the time spent training the classification model to about 6 minutes per experiment. This model can be used in many business applications where the query image can be a runway image, product image, or street fashion image. Specifically, runway query images can be used in a mobile application service during fashion week to facilitate brand search, street-style query images can be classified during fashion editorial tasks to label the brand or style, and website query images can be processed by an e-commerce multi-complex service providing item information or recommending similar items.
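The pre-train/fine-tune split described in the abstract can be sketched as follows. The paper used GoogLeNet with TensorFlow-Slim checkpoints; this PyTorch analogue with torchvision's GoogLeNet weights only illustrates the same two-stage idea, and the 32-brand head and hyperparameters are chosen arbitrarily.

```python
# A minimal PyTorch sketch of the pre-train / fine-tune split described above
# (the paper itself used TensorFlow-Slim; this is an illustrative analogue).
import torch
import torch.nn as nn
from torchvision import models

# 1) "Pre-training": load GoogLeNet weights already learned on ImageNet.
model = models.googlenet(weights=models.GoogLeNet_Weights.IMAGENET1K_V1)

# 2) Fine-tuning: replace the 1000-class head with a 32-brand classifier
#    and train only the new layer (or unfreeze more layers as needed).
for p in model.parameters():
    p.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, 32)   # 32 fashion brands

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# One illustrative training step on a dummy batch of 224x224 runway crops.
images, labels = torch.randn(8, 3, 224, 224), torch.randint(0, 32, (8,))
model.train()
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
print(float(loss))
```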

Electric Power Demand Prediction Using Deep Learning Model with Temperature Data (기온 데이터를 반영한 전력수요 예측 딥러닝 모델)

  • Yoon, Hyoup-Sang;Jeong, Seok-Bong
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.11 no.7
    • /
    • pp.307-314
    • /
    • 2022
  • Recently, research using deep learning-based models has been actively conducted to replace statistics-based time series forecasting techniques for predicting electric power demand. An analysis of this research shows that the performance of LSTM-based prediction models is acceptable but not sufficient for long-term, region-wide power demand prediction. In this paper, we propose a WaveNet deep learning model that predicts electric power demand 24 hours ahead using temperature data, aiming for prediction accuracy better than the 2% MAPE that statistics-based time series forecasting techniques can provide. First, we illustrate the dilated causal one-dimensional convolutional neural network architecture of WaveNet and the preprocessing of the electric power demand and temperature input data. Second, we present the training process and walk-forward validation with the modified WaveNet. The performance comparison shows that the prediction model with temperature data achieves a MAPE of 1.33%, which is better than the MAPE (2.33%) of the same model without temperature data.
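A dilated causal convolution only looks at past time steps, and stacking layers with dilations 1, 2, 4, 8 grows the receptive field exponentially. The sketch below shows that mechanism on a two-channel (demand, temperature) series; it is a toy illustration, not the paper's modified WaveNet.

```python
# A minimal sketch (not the paper's full WaveNet): a stack of dilated causal
# 1D convolutions over a demand-plus-temperature time series.
import torch
import torch.nn as nn

class CausalConv1d(nn.Module):
    """1D convolution that is causal: the output at t only sees inputs <= t."""
    def __init__(self, in_ch, out_ch, kernel_size, dilation):
        super().__init__()
        self.pad = (kernel_size - 1) * dilation
        self.conv = nn.Conv1d(in_ch, out_ch, kernel_size, dilation=dilation)

    def forward(self, x):
        x = nn.functional.pad(x, (self.pad, 0))    # pad on the left only
        return self.conv(x)

class TinyWaveNet(nn.Module):
    def __init__(self, in_ch=2, hidden=32, dilations=(1, 2, 4, 8)):
        super().__init__()
        layers, ch = [], in_ch
        for d in dilations:                        # receptive field grows with dilation
            layers += [CausalConv1d(ch, hidden, kernel_size=2, dilation=d), nn.ReLU()]
            ch = hidden
        self.net = nn.Sequential(*layers)
        self.head = nn.Conv1d(hidden, 1, kernel_size=1)   # demand forecast per step

    def forward(self, x):                          # x: (B, 2, T) = demand, temperature
        return self.head(self.net(x))              # (B, 1, T)

model = TinyWaveNet()
print(model(torch.randn(4, 2, 168)).shape)         # torch.Size([4, 1, 168])
```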

Deep Learning-based Pes Planus Classification Model Using Transfer Learning

  • Kim, Yeonho;Kim, Namgyu
    • Journal of the Korea Society of Computer and Information
    • /
    • v.26 no.4
    • /
    • pp.21-28
    • /
    • 2021
  • This study proposes a deep learning-based flat foot (pes planus) classification methodology using transfer learning. We used transfer learning with a pre-trained VGG16 model and a data augmentation technique to generate a model with high predictive accuracy from a total of 176 images consisting of 88 flat feet and 88 normal feet. To evaluate the performance of the proposed model, we conducted an experiment comparing the prediction accuracy of a basic CNN-based model and the prediction model derived through the proposed methodology. For the basic CNN model, the training accuracy was 77.27%, the validation accuracy was 61.36%, and the test accuracy was 59.09%. For our proposed model, the training accuracy was 94.32%, the validation accuracy was 86.36%, and the test accuracy was 84.09%, indicating that the accuracy of our model was significantly higher than that of the basic CNN model.
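A minimal sketch of the VGG16 transfer-learning setup with simple augmentation follows; the specific augmentation operations and the frozen-layer choice are assumptions for illustration, not the authors' exact pipeline.

```python
# A minimal sketch (not the authors' exact pipeline): VGG16 transfer learning
# with simple augmentation, mirroring the small-data setting described above.
import torch
import torch.nn as nn
from torchvision import models, transforms

# Augmentation pipeline applied to each training foot image (illustrative choices).
train_tf = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.RandomRotation(10),
    transforms.ColorJitter(brightness=0.2),
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

# Pre-trained VGG16 with the classifier head replaced for 2 classes
# (flat foot vs. normal); convolutional weights are frozen.
model = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)
for p in model.features.parameters():
    p.requires_grad = False
model.classifier[6] = nn.Linear(4096, 2)

print(model(torch.rand(1, 3, 224, 224)).shape)   # torch.Size([1, 2])
```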

An Embedding Similarity-based Deep Learning Model for Detecting Displacement in Cultural Asset Images (목조 문화재 영상에서의 크랙을 감지하기 위한 임베딩 유사도 기반 딥러닝 모델)

  • Kang, Jaeyong;Kim, Inki;Lim, Hyunseok;Gwak, Jeonghwan
    • Proceedings of the Korean Society of Computer Information Conference
    • /
    • 2021.07a
    • /
    • pp.133-135
    • /
    • 2021
  • In this paper, we propose an embedding similarity-based model for detecting regions where cracks, one type of displacement, occur in images of wooden cultural assets. First, training images consisting only of normal (displacement-free) samples are passed through a pre-trained convolutional neural network to extract embedding vectors. The parameters of the distribution of the normal class are then estimated from these embedding vectors. An embedding vector is likewise computed for each test image used at inference time. The distance between the test image's embedding vector and the previously estimated Gaussian distribution representing the normal class is then computed to generate an anomaly map, from which the regions containing displacement are finally detected. As the dataset, we used images of wooden cultural assets collected by visiting cultural heritage sites near Chungju, divided into normal and abnormal classes. Experimental results confirm that the proposed embedding similarity-based model detects the displacement regions where cracks occur in wooden cultural assets well. These results show that the proposed method is well suited to detecting displacement regions associated with cracks in wooden cultural assets.
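The embedding-similarity scoring described above can be illustrated with a small numerical sketch: fit a Gaussian (mean and covariance) to embeddings of normal samples, then score test embeddings by Mahalanobis distance. The random 64-dimensional embeddings here stand in for the CNN features used in the paper.

```python
# A minimal sketch of the embedding-similarity idea (illustrative only):
# fit a Gaussian to embeddings of normal patches, then score test patches
# by Mahalanobis distance to flag likely crack regions.
import numpy as np

rng = np.random.default_rng(0)
normal_emb = rng.normal(size=(500, 64))             # embeddings of normal patches
mu = normal_emb.mean(axis=0)
cov = np.cov(normal_emb, rowvar=False) + 1e-6 * np.eye(64)
cov_inv = np.linalg.inv(cov)

def anomaly_score(emb):
    """Mahalanobis distance of a patch embedding to the 'normal' Gaussian."""
    d = emb - mu
    return float(np.sqrt(d @ cov_inv @ d))

test_patch = rng.normal(size=64) + 3.0               # shifted => likely anomalous
print(anomaly_score(rng.normal(size=64)), anomaly_score(test_patch))
```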


Performance Prediction Model of Solid Oxide Fuel Cell Stack Using Deep Neural Network Technique (심층 신경망 기법을 이용한 고체 산화물 연료전지 스택의 성능 예측 모델)

  • LEE, JAEYOON;PINEDA, ISRAEL TORRES;GIAP, VAN-TIEN;LEE, DONGKEUN;KIM, YOUNG SANG;AHN, KOOK YOUNG;LEE, YOUNG DUK
    • Transactions of the Korean hydrogen and new energy society
    • /
    • v.31 no.5
    • /
    • pp.436-443
    • /
    • 2020
  • The performance prediction model of a solid oxide fuel cell stack has been developed using a deep neural network, one of the machine learning methods. Machine learning has received much interest in various fields, including energy system modeling. Using machine learning can save the time and cost required to develop an energy system model compared to the conventional approach, which combines mathematical modeling with experimental validation. The results reveal that the mean average percent error, root mean square error, and coefficient of determination (R2) are at most 1.7515, 0.1342, and 0.8597, respectively. To improve the predictability of the model, pre-processing is effective, and interpolative application of the machine learning model is more accurate than extrapolative cases.
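As an illustration of using a deep neural network as a surrogate model for stack performance, the sketch below regresses a single output (e.g., stack voltage) from a handful of operating-condition inputs; the feature set, network size, and data are hypothetical.

```python
# A minimal sketch (hypothetical features, not the authors' model): a small
# fully connected network regressing stack performance from operating conditions.
import torch
import torch.nn as nn

# Hypothetical inputs: current density, fuel flow, air flow, temperature, pressure.
model = nn.Sequential(
    nn.Linear(5, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, 1),                  # predicted stack voltage (illustrative)
)
x, y = torch.randn(32, 5), torch.randn(32, 1)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss = nn.MSELoss()(model(x), y)      # one illustrative training step
loss.backward()
optimizer.step()
print(float(loss))
```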

A Deep Learning Model for Judging Presence or Absence of Lesions in the Chest X-ray Images (흉부 디지털 영상의 병변 유무 판단을 위한 딥러닝 모델)

  • Lee, Jong-Keun;Kim, Seon-Jin;Kwak, Nae-Joung;Kim, Dong-Woo;Ahn, Jae-Hyeong
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.24 no.2
    • /
    • pp.212-218
    • /
    • 2020
  • There are dozens of different types of lesions that can be diagnosed from chest X-ray images, including Atelectasis, Cardiomegaly, Mass, Pneumothorax, and Effusion. A computed tomography (CT) examination is generally necessary to determine the exact diagnosis and the location and size of thoracic lesions; however, CT has disadvantages such as high cost and considerable radiation exposure. Therefore, in this paper, we propose a deep learning algorithm for judging the presence or absence of lesions in chest X-ray images as a primary screening tool for the diagnosis of thoracic lesions. The proposed algorithm was designed by comparing various configurations to optimize the judgment of lesion presence from chest X-rays. As a result, the lesion-presence detection rate of the proposed algorithm is about 1% better than that of the existing algorithm.

Detection of Number and Character Area of License Plate Using Deep Learning and Semantic Image Segmentation (딥러닝과 의미론적 영상분할을 이용한 자동차 번호판의 숫자 및 문자영역 검출)

  • Lee, Jeong-Hwan
    • Journal of the Korea Convergence Society
    • /
    • v.12 no.1
    • /
    • pp.29-35
    • /
    • 2021
  • License plate recognition plays a key role in intelligent transportation systems, so efficiently detecting the number and character areas is a very important step. In this paper, we propose a method to effectively detect license plate number areas by applying deep learning and a semantic image segmentation algorithm. The proposed method detects number and character areas directly from the license plate without preprocessing such as pixel projection. The license plate images were acquired from a fixed camera installed on the road and cover various real situations, taking both weather and lighting changes into account. The input images were normalized to reduce color variation, and the deep learning networks used in the experiments were VGG16, VGG19, ResNet18, and ResNet50. To examine the performance of the proposed method, we experimented with 500 license plate images: 300 were used for training and 200 for testing. As a result of computer simulation, the best performance was obtained with ResNet50, with an accuracy of 95.77%.
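As an illustration of segmenting plate regions with a ResNet50 backbone, the sketch below uses torchvision's FCN-ResNet50; the three-class layout (background, digit region, character region) is an assumption for illustration, not the paper's exact configuration.

```python
# A minimal sketch (not the paper's exact networks): semantic segmentation of
# number/character pixels with a ResNet50-backbone FCN from torchvision.
import torch
from torchvision.models.segmentation import fcn_resnet50

# 3 classes as an assumption: background, digit region, character region.
model = fcn_resnet50(weights=None, num_classes=3)
model.eval()

with torch.no_grad():
    plate = torch.rand(1, 3, 128, 256)        # normalized plate image (dummy)
    out = model(plate)["out"]                 # (1, 3, 128, 256) per-pixel logits
    mask = out.argmax(dim=1)                  # (1, 128, 256) predicted class map
print(mask.shape, mask.unique())
```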