• Title/Summary/Keyword: Deep Learning based System


An Overloaded Vehicle Identifying System based on Object Detection Model (객체 인식 모델을 활용한 적재 불량 화물차 탐지 시스템)

  • Jung, Woojin;Park, Jinuk;Park, Yongju
    • Journal of the Korea Institute of Information and Communication Engineering, v.26 no.12, pp.1794-1799, 2022
  • Recently, the increasing number of overloaded vehicles on the road has posed a risk to traffic safety, such as falling objects, road damage, and chain collisions caused by abnormal weight distribution, and can cause great damage once an accident occurs. Therefore, we propose an object detection-based AI model to identify the overloaded vehicles that cause such social problems. In addition, we present a simple yet effective method to construct an object detection model for large-scale vehicle images. In particular, we utilize the large-scale vehicle image sets provided by the open AI-Hub, which include overloaded vehicles. We inspected specific features such as vehicle sizes and image source types, and pre-processed these images to train a deep learning-based object detection model. We also propose an integrated system for tracking the detected vehicles. Finally, we demonstrated that the detection performance for overloaded vehicles improved by about 23% compared to a model trained on raw data.

Deep Learning-based Professional Image Interpretation Using Expertise Transplant (전문성 이식을 통한 딥러닝 기반 전문 이미지 해석 방법론)

  • Kim, Taejin;Kim, Namgyu
    • Journal of Intelligence and Information Systems, v.26 no.2, pp.79-104, 2020
  • Recently, as deep learning has attracted attention, it is being considered as a method for solving problems in various fields. In particular, deep learning is known to perform well when applied to unstructured data such as text, sound, and images, and many studies have proven its effectiveness. Owing to the remarkable development of text and image deep learning technology, interest in image captioning technology and its applications is rapidly increasing. Image captioning is a technique that automatically generates relevant captions for a given image by handling image comprehension and text generation simultaneously. Despite its high entry barrier, in that analysts must be able to process both image and text data, image captioning has established itself as one of the key fields in AI research owing to its wide applicability. In addition, many studies have been conducted to improve the performance of image captioning in various respects. Recent studies attempt to create advanced captions that not only describe an image accurately but also convey the information contained in the image in a more sophisticated way. Despite these efforts, it is difficult to find research that interprets images from the perspective of domain experts in each field rather than from the perspective of the general public. Even for the same image, the parts of interest may differ according to the professional field of the viewer. Moreover, the way of interpreting and expressing the image also differs according to the level of expertise. The public tends to recognize an image from a holistic and general perspective, that is, by identifying the image's constituent objects and their relationships.
In contrast, domain experts tend to recognize an image by focusing on the specific elements necessary to interpret it based on their expertise. This implies that the meaningful parts of an image differ depending on the viewer's perspective, even for the same image. Image captioning therefore needs to reflect this phenomenon. In this study, we propose a method to generate captions specialized for each domain by utilizing the expertise of experts in the corresponding domain. Specifically, after pre-training on a large amount of general data, the expertise of the field is transplanted through transfer learning with a small amount of expertise data. However, a naive application of transfer learning using expertise data may cause another type of problem. Simultaneous learning with captions of various characteristics may cause a so-called 'inter-observation interference' problem, which makes it difficult to learn each characteristic point of view in isolation. When learning with a vast amount of data, most of this interference is self-purified and has little impact on the learning results. In contrast, in fine-tuning, where learning is performed on a small amount of data, the impact of such interference can be relatively large. To solve this problem, we propose a novel 'Character-Independent Transfer-learning' that performs transfer learning independently for each character. To confirm the feasibility of the proposed methodology, we performed experiments utilizing the results of pre-training on the MSCOCO dataset, which comprises 120,000 images and about 600,000 general captions. Additionally, following the advice of an art therapist, about 300 'image / expertise caption' pairs were created, and these data were used for the expertise transplantation experiments.
The experiments confirmed that captions generated with the proposed methodology reflect the perspective of the implanted expertise, whereas captions generated through learning on general data contain much content irrelevant to expert interpretation. In this paper, we propose a novel approach to specialized image interpretation. To achieve this goal, we present a method that uses transfer learning to generate captions specialized for a specific domain. In the future, by applying the proposed methodology to expertise transplantation in various fields, we expect that much research will be actively conducted to solve the problem of insufficient expertise data and to improve the performance of image captioning.

Object-aware Depth Estimation for Developing Collision Avoidance System (객체 영역에 특화된 뎁스 추정 기반의 충돌방지 기술개발)

  • Gyutae Hwang;Jimin Song;Sang Jun Lee
    • IEMEK Journal of Embedded Systems and Applications, v.19 no.2, pp.91-99, 2024
  • A collision avoidance system is important for improving the robustness and functional safety of autonomous vehicles. This paper proposes an object-level distance estimation method for developing a collision avoidance system and applies it to golf carts used in country club environments. To improve detection accuracy, we continually trained an object detection model on pseudo labels generated by a pre-trained detector. Moreover, we propose an object-aware depth estimation (OADE) method, which trains a depth model focusing on object regions. In the OADE algorithm, we generated dense depth information for object regions by utilizing detection results and sparse LiDAR points; this is referred to as object-aware LiDAR projection (OALP). Using the OALP maps, a depth estimation model was trained by backpropagating more gradients of the loss on object regions. Experiments were conducted on our custom dataset, which was collected over a travel distance of 22 km on 54 holes in three country clubs under various weather conditions. Precision and recall improved from 70.5% and 49.1% to 95.3% and 92.1%, respectively, after continual learning with pseudo labels. Moreover, the OADE algorithm reduced the absolute relative error from 4.76% to 4.27% when estimating distances to obstacles.
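
The abstract above reports precision, recall, and absolute relative error without defining them. The following is a minimal sketch of how such metrics are commonly computed, assuming the standard definitions (precision = TP / (TP + FP), recall = TP / (TP + FN), and absolute relative error as the mean of |predicted − true| / true). The function names are hypothetical and are not taken from the paper.

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall) from detection counts.

    tp, fp, fn are the numbers of true positives, false positives,
    and false negatives (assumed standard detection definitions).
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def abs_rel_error(predicted, ground_truth):
    """Mean of |pred - gt| / gt over samples with a positive true distance."""
    pairs = [(p, g) for p, g in zip(predicted, ground_truth) if g > 0]
    if not pairs:
        raise ValueError("no valid ground-truth distances")
    return sum(abs(p - g) / g for p, g in pairs) / len(pairs)


# Illustrative values only, not data from the paper.
p, r = precision_recall(tp=8, fp=2, fn=2)
err = abs_rel_error([10.5, 19.0], [10.0, 20.0])
print(f"precision={p:.2f} recall={r:.2f} abs_rel={err:.3f}")
```

Multiplying the absolute relative error by 100 gives a percentage comparable in form to the figures quoted in the abstract.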

Automatic Object Extraction from Electronic Documents Using Deep Neural Network (심층 신경망을 활용한 전자문서 내 객체의 자동 추출 방법 연구)

  • Jang, Heejin;Chae, Yeonghun;Lee, Sangwon;Jo, Jinyong
    • KIPS Transactions on Software and Data Engineering, v.7 no.11, pp.411-418, 2018
  • With the proliferation of artificial intelligence technology, it is becoming important to obtain, store, and utilize scientific data in research and science sectors. A number of methods for extracting meaningful objects such as graphs and tables from research articles have been proposed to eventually obtain scientific data. Existing extraction methods using heuristic approaches are hardly applicable to electronic documents having heterogeneous manuscript formats because they are designed to work properly for some targeted manuscripts. This paper proposes a prototype of an object extraction system which exploits a recent deep-learning technology so as to overcome the inflexibility of the heuristic approaches. We implemented our trained model, based on the Faster R-CNN algorithm, using the Google TensorFlow Object Detection API and also composed an annotated data set from 100 research articles for training and evaluation. Finally, a performance evaluation shows that the proposed system outperforms a comparator adopting heuristic approaches by 5.2%.

Pavement Crack Detection and Segmentation Based on Deep Neural Network

  • Nguyen, Huy Toan;Yu, Gwang Hyun;Na, Seung You;Kim, Jin Young;Seo, Kyung Sik
    • The Journal of Korean Institute of Information Technology, v.17 no.9, pp.99-112, 2019
  • Cracks on pavement surfaces are critical signs and symptoms of the degradation of pavement structures. Image-based pavement crack detection is a challenging problem due to intensity inhomogeneity, topological complexity, low contrast, and noisy textured backgrounds. In this paper, we address the problem of pixel-level pavement crack detection and segmentation based on a Deep Neural Network (DNN) using gray-scale images. We propose a novel DNN architecture that contains a modified U-Net network and a high-level features network. An important contribution of this work is the combination of these networks through a fusion layer. To the best of our knowledge, this is the first paper introducing this combination for the pavement crack segmentation and detection problem. The performance of crack detection and segmentation is enhanced dramatically by our novel architecture. We implement and thoroughly evaluate our proposed system on two open data sets: the Crack Forest Dataset (CFD) and the AigleRN dataset. Experimental results demonstrate that our system outperforms eight state-of-the-art methods on the same data sets.

No-Reference Image Quality Assessment based on Quality Awareness Feature and Multi-task Training

  • Lai, Lijing;Chu, Jun;Leng, Lu
    • Journal of Multimedia Information System, v.9 no.2, pp.75-86, 2022
  • Existing image quality assessment (IQA) datasets have a small number of samples, and some methods based on transfer learning or data augmentation cannot make good use of image quality-related features. A No-Reference (NR)-IQA method based on multi-task training and quality awareness is proposed. First, single or multiple distortion types and levels are imposed on the original image, and different strategies are used to augment datasets with different types of distortion. Following the idea of weak supervision, we use Full-Reference (FR)-IQA methods to obtain pseudo-score labels for the generated images. Then, we combine the classification information of the distortion type and level with the image quality score. The ResNet50 network is trained in the pre-training stage on the augmented dataset to obtain more quality-aware pre-trained weights. Finally, fine-tuning is performed on the target IQA dataset using the quality-aware weights to predict the final quality score. Experiments on synthetic and authentic distortion datasets (LIVE, CSIQ, TID2013, LIVEC, KonIQ-10K) show that the proposed method utilizes image quality-related features better than a method using only single-task training. The extracted quality-aware features improve the accuracy of the model.

CNN-based System for Image Processing (이미지 처리를 위한 CNN 기반 시스템)

  • Song, Hyunok;Kim, Hankil;Shin, Hyunsuk;Lee, Seokwoo;Jung, Hoekyung
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference, 2018.10a, pp.311-312, 2018
  • This paper proposes an image processing system based on a convolutional neural network (CNN). Image classification was performed using the composite neural network model, and images were classified with an accuracy of 84% or more. The proposed system is implemented to operate on various platforms. When the system is used for image classification, efficiency improves because its accuracy is higher than that of the existing model.


Development of Python-based Annotation Tool Program for Constructing Object Recognition Deep-Learning Model (물체인식 딥러닝 모델 구성을 위한 파이썬 기반의 Annotation 툴 개발)

  • Lim, Song-Won;Park, Goo-man
    • Journal of Broadcast Engineering, v.25 no.3, pp.386-398, 2020
  • We developed an integrated annotation program that performs the data labeling process for deep learning models in object recognition. The program uses Python's basic GUI library and includes crawler functions that allow data collection in real time. RetinaNet was used to implement an automatic annotation function. In addition, labels were generated in the different data formats used by Pascal VOC, YOLO, and RetinaNet. In an experiment with the proposed method, a domestic vehicle image dataset was built and applied to RetinaNet and YOLO as the training and test set. The proposed system classified the vehicle model with an accuracy of about 94%.
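
The abstract above mentions generating labels in several formats. As an illustration of what such a conversion involves, here is a minimal sketch that converts a Pascal VOC style box (pixel corner coordinates `xmin, ymin, xmax, ymax`) into a YOLO style label line (class id followed by center and size, normalized by image width and height). The function names are hypothetical, and this is not the tool's actual code.

```python
def voc_to_yolo(xmin, ymin, xmax, ymax, img_w, img_h):
    """Convert pixel corner coordinates to normalized (x_center, y_center, w, h)."""
    x_center = (xmin + xmax) / 2.0 / img_w
    y_center = (ymin + ymax) / 2.0 / img_h
    width = (xmax - xmin) / img_w
    height = (ymax - ymin) / img_h
    return x_center, y_center, width, height


def yolo_line(class_id, box, img_w, img_h):
    """Format one YOLO label line: '<class> <xc> <yc> <w> <h>'."""
    xc, yc, w, h = voc_to_yolo(*box, img_w, img_h)
    return f"{class_id} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}"


# Illustrative box in a 1000x800 image (made-up values).
print(yolo_line(3, (100, 200, 300, 400), 1000, 800))
```

The class id here is an integer index; mapping vehicle model names to indices is left to the caller.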

Lane Detection System using CNN (CNN을 사용한 차선검출 시스템)

  • Kim, Jihun;Lee, Daesik;Lee, Minho
    • IEMEK Journal of Embedded Systems and Applications, v.11 no.3, pp.163-171, 2016
  • Lane detection is a widely researched topic. Although simple road detection is easily achieved by previous methods, lane detection becomes very difficult in complex cases involving noisy edges. To address this, we use a convolutional neural network (CNN) for image enhancement. CNN is a deep learning method that has been applied very successfully to object detection and recognition. In this paper, we introduce a robust lane detection method based on a CNN combined with the random sample consensus (RANSAC) algorithm. First, we calculate edges in an image using a hat-shaped kernel; then we detect lanes using the CNN combined with RANSAC. In the training process of the CNN, the input data consist of edge images, and the target data are images that have real white lanes on an otherwise black background. The CNN structure consists of 8 layers: 3 convolutional layers, 2 subsampling layers, and a multi-layer perceptron (MLP) of 3 fully-connected layers. Convolutional and subsampling layers are hierarchically arranged to form a deep structure. Our proposed lane detection algorithm successfully eliminates noise lines and was found to perform better than other formal line detection algorithms such as RANSAC.
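
The abstract above does not give details of the RANSAC stage. The following is a minimal sketch of generic RANSAC line fitting on 2D points, assuming edge pixels are supplied as `(x, y)` tuples. It illustrates the general technique only, not the paper's CNN-combined pipeline, and the function name and parameters (`n_iters`, `threshold`, `seed`) are hypothetical.

```python
import random


def ransac_line(points, n_iters=200, threshold=1.0, seed=0):
    """Fit a line a*x + b*y + c = 0 (with a^2 + b^2 = 1) by RANSAC.

    Returns ((a, b, c), inliers), where inliers are the points whose
    perpendicular distance to the line is at most `threshold`.
    """
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(n_iters):
        # Sample two points and build the line through them.
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        a, b = y2 - y1, x1 - x2
        norm = (a * a + b * b) ** 0.5
        if norm == 0:
            continue  # the two samples coincide; no line is defined
        a, b = a / norm, b / norm
        c = -(a * x1 + b * y1)
        # Count points close enough to this candidate line.
        inliers = [(x, y) for x, y in points if abs(a * x + b * y + c) <= threshold]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = (a, b, c), inliers
    return best_model, best_inliers


# Synthetic example: 20 points on y = 2x + 1 plus three outliers.
pts = [(x, 2 * x + 1) for x in range(20)] + [(3, 40), (10, -5), (15, 60)]
model, inliers = ransac_line(pts, threshold=0.5)
print(len(inliers), "inliers; slope ~", -model[0] / model[1])
```

Because each candidate line comes from only two sampled points, isolated noisy edges rarely gather enough inliers to win.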

Deep Learning Based Gray Image Generation from 3D LiDAR Reflection Intensity (딥러닝 기반 3차원 라이다의 반사율 세기 신호를 이용한 흑백 영상 생성 기법)

  • Kim, Hyun-Koo;Yoo, Kook-Yeol;Park, Ju H.;Jung, Ho-Youl
    • IEMEK Journal of Embedded Systems and Applications, v.14 no.1, pp.1-9, 2019
  • In this paper, we propose a method of generating a 2D gray image from LiDAR 3D reflection intensity. The proposed method uses a Fully Convolutional Network (FCN) to generate the gray image from the 2D reflection intensity projected from the LiDAR 3D intensity. Both the encoder and the decoder of the FCN are configured symmetrically with several convolution blocks. Each convolution block consists of a convolution layer with a $3 \times 3$ filter, a batch normalization layer, and an activation function. The performance of the proposed architecture is empirically evaluated by varying the depth of the convolution blocks. The well-known KITTI data set, covering various scenarios, is used for training and performance evaluation. The simulation results show that the proposed method yields improvements of 8.56 dB in peak signal-to-noise ratio and 0.33 in structural similarity index measure compared with conventional interpolation methods such as inverse distance weighted and nearest neighbor. The proposed method could be used as an assistance tool in night-time driving systems for autonomous vehicles.
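
The abstract above reports an improvement in peak signal-to-noise ratio (PSNR). Here is a minimal sketch of how PSNR is commonly computed, assuming 8-bit gray images flattened to lists of pixel values and the standard definition $\mathrm{PSNR} = 10 \log_{10}(\mathrm{MAX}^2 / \mathrm{MSE})$. The function name is hypothetical, and the paper's exact evaluation setup is not specified in the abstract.

```python
import math


def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(reference) != len(test) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Illustrative pixel values only (not data from the paper).
print(round(psnr([10, 20, 30, 40], [11, 21, 31, 41]), 2))
```

A gain of 8.56 dB corresponds to reducing the mean squared error by a factor of about 7.2, since $10^{0.856} \approx 7.2$.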