• Title/Summary/Keyword: VGGNet

Search Result 43, Processing Time 0.03 seconds

Performance Analysis of Detecting buried pipelines in GPR images using Faster R-CNN (Faster R-CNN을 활용한 GPR 영상에서의 지하배관 위치추적 성능분석)

  • Ko, Hyoung-Yong;Kim, Nam-gi
    • Journal of Convergence for Information Technology
    • /
    • v.9 no.5
    • /
    • pp.21-26
    • /
    • 2019
  • Various pipes are buried in the city as needed, such as water pipes, gas pipes and hydrogen pipes. As the time passes, buried pipes becomes aged due to crack, etc. these pipes has the risk of accidents such as explosion and leakage. To prevent the risks, many pipes are repaired or replaced, but the location of the pipes can also be changed. Failure to identify the location of the altered pipe may cause an accident by touching the pipe. In this paper, we propose a method to detect buried pipes by gathering the GPR images by using GPR and Learning with Faster R-CNN. Then experiments was carried out by raw data sets and data sets augmentation applied to increase the amount of images.

Cycle-accurate NPU Simulator and Performance Evaluation According to Data Access Strategies (Cycle-accurate NPU 시뮬레이터 및 데이터 접근 방식에 따른 NPU 성능평가)

  • Kwon, Guyun;Park, Sangwoo;Suh, Taeweon
    • IEMEK Journal of Embedded Systems and Applications
    • /
    • v.17 no.4
    • /
    • pp.217-228
    • /
    • 2022
  • Currently, there are increasing demands for applying deep neural networks (DNNs) in the embedded domain such as classification and object detection. The DNN processing in embedded domain often requires custom hardware such as NPU for acceleration due to the constraints in power, performance, and area. Processing DNN models requires a large amount of data, and its seamless transfer to NPU is crucial for performance. In this paper, we developed a cycle-accurate NPU simulator to evaluate diverse NPU microarchitectures. In addition, we propose a novel technique for reducing the number of memory accesses when processing convolutional layers in convolutional neural networks (CNNs) on the NPU. The main idea is to reuse data with memory interleaving, which recycles the overlapping data between previous and current input windows. Data memory interleaving makes it possible to quickly read consecutive data in unaligned locations. We implemented the proposed technique to the cycle-accurate NPU simulator and measured the performance with LeNet-5, VGGNet-16, and ResNet-50. The experiment shows up to 2.08x speedup in processing one convolutional layer, compared to the baseline.

Lightweight Deep Learning Model of Optical Character Recognition for Laundry Management (세탁물 관리를 위한 문자인식 딥러닝 모델 경량화)

  • Im, Seung-Jin;Lee, Sang-Hyeop;Park, Jang-Sik
    • Journal of the Korean Society of Industry Convergence
    • /
    • v.25 no.6_3
    • /
    • pp.1285-1291
    • /
    • 2022
  • In this paper, we propose a low-cost, low-power embedded environment-based deep learning lightweight model for input images to recognize laundry management codes. Laundry franchise companies mainly use barcode recognition-based systems to record laundry consignee information and laundry information for laundry collection management. Conventional laundry collection management systems using barcodes require barcode printing costs, and due to barcode damage and contamination, it is necessary to improve the cost of reprinting the barcode book in its entirety of 1 billion won annually. It is also difficult to do. Recognition performance is improved by applying the VGG model with 7 layers, which is a reduced-transformation of the VGGNet model for number recognition. As a result of the numerical recognition experiment of service parts drawings, the proposed method obtained a significantly improved result over the conventional method with an F1-Score of 0.95.

Functionality-based Processing-In-Memory Accelerator for Deep Neural Networks (딥뉴럴네트워크를 위한 기능성 기반의 핌 가속기)

  • Kim, Min-Jae;Kim, Shin-Dug
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2020.11a
    • /
    • pp.8-11
    • /
    • 2020
  • 4 차 산업혁명 시대의 도래와 함께 AI, ICT 기술의 융합이 진행됨에 따라, 유저 레벨의 디바이스에서도 AI 서비스의 요청이 실현되었다. 이미지 처리와 관련된 AI 서비스는 피사체 판별, 불량품 검사, 자율주행 등에 이용되고 있으며, 특히 Deep Convolutional Neural Network (DCNN)은 이미지의 특색을 파악하는 데 뛰어난 성능을 보여준다. 하지만, 이미지의 크기가 커지고, 신경망이 깊어짐에 따라 연산 처리에 있어 낮은 데이터 지역성과 빈번한 메모리 참조를 야기했다. 이에 따라, 기존의 계층적 시스템 구조는 DCNN 을 scalable 하고 빠르게 처리하는 데 한계를 보인다. 본 연구에서는 DCNN 의 scalable 하고 빠른 처리를 위해 3 차원 메모리 구조의 Processing-In-Memory (PIM) 가속기를 제안한다. 이를 위해 기존 3 차원 메모리인 Hybrid Memory Cube (HMC)에 하드웨어 및 소프트웨어 모듈을 추가로 구성하였다. 구체적으로, Processing Element (PE)간 데이터를 공유할 수 있는 공유 캐시 및 소프트웨어 스택, 파이프라인화된 곱셈기 및 듀얼 프리페치 버퍼를 구성하였다. 이를 유명 DCNN 알고리즘 LeNet, AlexNet, ZFNet, VGGNet, GoogleNet, RestNet 에 대해 성능 평가를 진행한 결과 기존 HMC 대비 40.3%의 속도 향상을 29.4%의 대역폭 향상을 보였다.

Performance Enhancement and Evaluation of a Deep Learning Framework on Embedded Systems using Unified Memory (통합메모리를 이용한 임베디드 환경에서의 딥러닝 프레임워크 성능 개선과 평가)

  • Lee, Minhak;Kang, Woochul
    • KIISE Transactions on Computing Practices
    • /
    • v.23 no.7
    • /
    • pp.417-423
    • /
    • 2017
  • Recently, many embedded devices that have the computing capability required for deep learning have become available; hence, many new applications using these devices are emerging. However, these embedded devices have an architecture different from that of PCs and high-performance servers. In this paper, we propose a method that improves the performance of deep-learning framework by considering the architecture of an embedded device that shares memory between the CPU and the GPU. The proposed method is implemented in Caffe, an open-source deep-learning framework, and is evaluated on an NVIDIA Jetson TK1 embedded device. In the experiment, we investigate the image recognition performance of several state-of-the-art deep-learning networks, including AlexNet, VGGNet, and GoogLeNet. Our results show that the proposed method can achieve significant performance gain. For instance, in AlexNet, we could reduce image recognition latency by about 33% and energy consumption by about 50%.

A Study on Deep Learning Binary Classification of Prostate Pathological Images Using Multiple Image Enhancement Techniques (다양한 이미지 향상 기법을 사용한 전립선 병리영상 딥러닝 이진 분류 연구)

  • Park, Hyeon-Gyun;Bhattacharjee, Subrata;Deekshitha, Prakash;Kim, Cho-Hee;Choi, Heung-Kook
    • Journal of Korea Multimedia Society
    • /
    • v.23 no.4
    • /
    • pp.539-548
    • /
    • 2020
  • Deep learning technology is currently being used and applied in many different fields. Convolution neural network (CNN) is a method of artificial neural networks in deep learning, which is commonly used for analyzing different types of images through classification. In the conventional classification of histopathology images of prostate carcinomas, the rating of cancer is classified by human subjective observation. However, this approach has produced to some misdiagnosing of cancer grading. To solve this problem, CNN based classification method is proposed in this paper, to train the histological images and classify the prostate cancer grading into two classes of the benign and malignant. The CNN architecture used in this paper is based on the VGG models, which is specialized for image classification. However, color normalization was performed based on the contrast enhancement technique, and the normalized images were used for CNN training, to compare the classification results of both original and normalized images. In all cases, accuracy was over 90%, accuracy of the original was 96%, accuracy of other cases was higher, and loss was the lowest with 9%.

A Fully Convolutional Network Model for Classifying Liver Fibrosis Stages from Ultrasound B-mode Images (초음파 B-모드 영상에서 FCN(fully convolutional network) 모델을 이용한 간 섬유화 단계 분류 알고리즘)

  • Kang, Sung Ho;You, Sun Kyoung;Lee, Jeong Eun;Ahn, Chi Young
    • Journal of Biomedical Engineering Research
    • /
    • v.41 no.1
    • /
    • pp.48-54
    • /
    • 2020
  • In this paper, we deal with a liver fibrosis classification problem using ultrasound B-mode images. Commonly representative methods for classifying the stages of liver fibrosis include liver biopsy and diagnosis based on ultrasound images. The overall liver shape and the smoothness and roughness of speckle pattern represented in ultrasound images are used for determining the fibrosis stages. Although the ultrasound image based classification is used frequently as an alternative or complementary method of the invasive biopsy, it also has the limitations that liver fibrosis stage decision depends on the image quality and the doctor's experience. With the rapid development of deep learning algorithms, several studies using deep learning methods have been carried out for automated liver fibrosis classification and showed superior performance of high accuracy. The performance of those deep learning methods depends closely on the amount of datasets. We propose an enhanced U-net architecture to maximize the classification accuracy with limited small amount of image datasets. U-net is well known as a neural network for fast and precise segmentation of medical images. We design it newly for the purpose of classifying liver fibrosis stages. In order to assess the performance of the proposed architecture, numerical experiments are conducted on a total of 118 ultrasound B-mode images acquired from 78 patients with liver fibrosis symptoms of F0~F4 stages. The experimental results support that the performance of the proposed architecture is much better compared to the transfer learning using the pre-trained model of VGGNet.

Towards Low Complexity Model for Audio Event Detection

  • Saleem, Muhammad;Shah, Syed Muhammad Shehram;Saba, Erum;Pirzada, Nasrullah;Ahmed, Masood
    • International Journal of Computer Science & Network Security
    • /
    • v.22 no.9
    • /
    • pp.175-182
    • /
    • 2022
  • In our daily life, we come across different types of information, for example in the format of multimedia and text. We all need different types of information for our common routines as watching/reading the news, listening to the radio, and watching different types of videos. However, sometimes we could run into problems when a certain type of information is required. For example, someone is listening to the radio and wants to listen to jazz, and unfortunately, all the radio channels play pop music mixed with advertisements. The listener gets stuck with pop music and gives up searching for jazz. So, the above example can be solved with an automatic audio classification system. Deep Learning (DL) models could make human life easy by using audio classifications, but it is expensive and difficult to deploy such models at edge devices like nano BLE sense raspberry pi, because these models require huge computational power like graphics processing unit (G.P.U), to solve the problem, we proposed DL model. In our proposed work, we had gone for a low complexity model for Audio Event Detection (AED), we extracted Mel-spectrograms of dimension 128×431×1 from audio signals and applied normalization. A total of 3 data augmentation methods were applied as follows: frequency masking, time masking, and mixup. In addition, we designed Convolutional Neural Network (CNN) with spatial dropout, batch normalization, and separable 2D inspired by VGGnet [1]. In addition, we reduced the model size by using model quantization of float16 to the trained model. Experiments were conducted on the updated dataset provided by the Detection and Classification of Acoustic Events and Scenes (DCASE) 2020 challenge. We confirm that our model achieved a val_loss of 0.33 and an accuracy of 90.34% within the 132.50KB model size.

Estimation of Heading Date of Paddy Rice from Slanted View Images Using Deep Learning Classification Model

  • Hyeokjin Bak;Hoyoung Ban;SeongryulChang;Dongwon Gwon;Jae-Kyeong Baek;Jeong-Il Cho;Wan-Gyu Sang
    • Proceedings of the Korean Society of Crop Science Conference
    • /
    • 2022.10a
    • /
    • pp.80-80
    • /
    • 2022
  • Estimation of heading date of paddy rice is laborious and time consuming. Therefore, automatic estimation of heading date of paddy rice is highly essential. In this experiment, deep learning classification models were used to classify two difference categories of rice (vegetative and reproductive stage) based on the panicle initiation of paddy field. Specifically, the dataset includes 444 slanted view images belonging to two categories and was then expanded to include 1,497 images via IMGAUG data augmentation technique. We adopt two transfer learning strategies: (First, used transferring model weights already trained on ImageNet to six classification network models: VGGNet, ResNet, DenseNet, InceptionV3, Xception and MobileNet, Second, fine-tuned some layers of the network according to our dataset). After training the CNN model, we used several evaluation metrics commonly used for classification tasks, including Accuracy, Precision, Recall, and F1-score. In addition, GradCAM was used to generate visual explanations for each image patch. Experimental results showed that the InceptionV3 is the best performing model in terms of the accuracy, average recall, precision, and F1-score. The fine-tuned InceptionV3 model achieved an overall classification accuracy of 0.95 with a high F1-score of 0.95. Our CNN model also represented the change of rice heading date under different date of transplanting. This study demonstrated that image based deep learning model can reliably be used as an automatic monitoring system to detect the heading date of rice crops using CCTV camera.

  • PDF

Systems for Pill Recognition and Medication Management using Deep Learning (딥러닝을 활용한 알약인식 및 복용관리 시스템)

  • Kang-Hee Kim;So-Hyeon Kim;Da-Ham Jung;Bo-Kyung Lee
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.24 no.1
    • /
    • pp.9-16
    • /
    • 2024
  • It is difficult to know the efficacy of pills if the pill bag or wrapper is lost after purchasing the pill. Many people do not classify the use of commercial pills when storing them after purchasing and taking them, so the inaccessibility of information on the side effects of pills leads to misuse of pills. Even with existing applications that search and provide information about pills, users have to select the details of the pills themselves. In this paper, we develope a pill recognition application by building a model that learns the formulation and colour of 22,000 photos of pills provided by a Pharmaceutical Information Institution to solve the above situation. We also develope a pill medication management function.