• Title/Summary/Keyword: End-to-end learning

Comparison of Fine-Tuned Convolutional Neural Networks for Clipart Style Classification

  • Lee, Seungbin;Kim, Hyungon;Seok, Hyekyoung;Nang, Jongho
    • International Journal of Internet, Broadcasting and Communication / v.9 no.4 / pp.1-7 / 2017
  • Clipart is artificial visual content created with tools such as Illustrator to highlight information. The style of a clipart image plays a critical role in determining how it looks. However, previous studies on clipart have focused only on object recognition [16], segmentation, and retrieval of clipart images using hand-crafted image features. Recently, some studies have proposed CNN-based clipart classification by style similarity, but they used different CNN models and different benchmark datasets, so their performance is very hard to compare. This paper presents an experimental analysis of clipart classification based on style similarity with two well-known CNN models (Inception Resnet V2 [13] and VGG-16 [14]) and transfer learning on the same benchmark dataset (Microsoft Style Dataset 3.6K). From this experiment, we find that the accuracy of Inception Resnet V2 is better than that of VGG-16 for clipart style classification because of its depth and its parallel convolution maps of various sizes. We also find that end-to-end training can improve accuracy by more than 20% in both CNN models.

Weather Classification and Image Restoration Algorithm Attentive to Weather Conditions in Autonomous Vehicles (자율주행 상황에서의 날씨 조건에 집중한 날씨 분류 및 영상 화질 개선 알고리듬)

  • Kim, Jaihoon;Lee, Chunghwan;Kim, Sangmin;Jeong, Jechang
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2020.11a / pp.60-63 / 2020
  • With the advent of deep learning, many attempts have been made in computer vision to replace conventional algorithms with deep learning models. Among these, image classification, object detection, and image restoration have received a lot of attention from researchers. However, most contributions address only one of these fields. We propose a new paradigm of model structure: an end-to-end model that classifies the noise in an image and restores the image accordingly, which enhances universality and efficiency. Our proposed model is a 'One-For-All' model that classifies the weather condition in an image and returns a clean image accordingly. By separating weather conditions, the restoration model becomes more compact as well as more effective in reducing raindrops, snowflakes, or haze, which degrade image quality.

A Review on Detection of COVID-19 Cases from Medical Images Using Machine Learning-Based Approach

  • Noof Al-dieef;Shabana Habib
    • International Journal of Computer Science & Network Security / v.24 no.3 / pp.59-70 / 2024
  • Background: The COVID-19 pandemic (a form of coronavirus) developed at the end of 2019 and spread rapidly to almost every corner of the world. It had infected around 25,334,339 people worldwide by September 1, 2020 [1]. It has been spreading ever since; the peak specific to every country has been rising and falling, and the pandemic does not seem to be over yet. Currently, conventional RT-PCR testing is required to detect COVID-19, but the alternative method for data archiving purposes is certainly another choice for public departments to make. Researchers are trying to use medical images such as X-ray and Computed Tomography (CT) to easily diagnose the virus with the aid of Artificial Intelligence (AI)-based software. Method: This review paper investigates newly emerging machine-learning methods used to detect COVID-19 from X-ray images instead of other tests performed by medical experts. Computer vision enables us to develop an automated model with the clinical ability to detect the disease early. We have explored researchers' focus on the modalities, the image datasets used by the machine learning methods, and the output metrics used to test the research in this field. Finally, the paper concludes by referring to the key problems posed by identifying COVID-19 using machine learning and to future work. Result: This review's findings can help the public and private sectors utilize X-ray images and deploy resources before the pandemic reaches its peaks, giving the healthcare system cushion time to bear the impact of the unfavorable circumstances the pandemic is sure to cause.

Single Document Extractive Summarization Based on Deep Neural Networks Using Linguistic Analysis Features (언어 분석 자질을 활용한 인공신경망 기반의 단일 문서 추출 요약)

  • Lee, Gyoung Ho;Lee, Kong Joo
    • KIPS Transactions on Software and Data Engineering / v.8 no.8 / pp.343-348 / 2019
  • In recent years, extractive summarization systems based on end-to-end deep learning models have become popular. These systems do not require human-crafted features and adopt data-driven approaches. However, previous related studies have shown that linguistic analysis features such as parts of speech, named entities, and word frequencies are useful for extracting important sentences from a document to generate a summary. In this paper, we propose an extractive summarization system based on deep neural networks that uses conventional linguistic analysis features. To demonstrate the usefulness of these features, we compare models with and without them. The experimental results show that the model with the linguistic analysis features improves the ROUGE-2 F1 score by 0.5 points compared to the model without them.
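
The ROUGE-2 F1 score reported in this abstract measures bigram overlap between a generated summary and a reference summary. The following is a minimal sketch of that metric, assuming lowercased whitespace tokenization and a single reference; the function names `bigrams` and `rouge2_f1` are illustrative, and the paper's own evaluation tooling may differ (for example in stemming or tokenization).

```python
from collections import Counter


def bigrams(tokens):
    """Return a multiset (Counter) of adjacent token pairs."""
    return Counter(zip(tokens, tokens[1:]))


def rouge2_f1(candidate, reference):
    """ROUGE-2 F1 between two whitespace-tokenized strings."""
    cand = bigrams(candidate.lower().split())
    ref = bigrams(reference.lower().split())
    # Counter intersection keeps the minimum count per bigram (clipped matches).
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


score = rouge2_f1("the cat sat on the mat", "the cat lay on the mat")
print(round(score, 3))
```

Here three of the five bigrams match on each side, so precision and recall are both 0.6.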

Fact and plan on specialist training for social security (사회안전관리에 대한 전문인력 양성실태와 발전방안)

  • Kong, Bae-Wan;Kim, Chang-Ho
    • Korean Security Journal / no.5 / pp.5-18 / 2002
  • Private security has been one of the fastest growing parts of the law enforcement industry, confronted with mutual coincidence or complementarity. Therefore, the primary factor in straightening it up should be bringing up people, because in the end they are the ones who organize private security within society. In addition, it is suggested that further study of technical learning and its practice should be arranged. Because the education for agents undertaking social security is comprehensive in space and limited in time, arranging its content and curriculum may be difficult. Although this article leaves much to be desired, it analyzes and observes whether greater emphasis is placed on supplying ample human resources, through the institutional education system, to meet the increased demand for social security in the private law enforcement industry. Scientific advancement is expected in the majors related to private security, on the premise that continuous studies should be carried out and that colleges should establish their social role as specialized institutes.

Object Feature Tracking Algorithm based on Siame-FPN (Siame-FPN기반 객체 특징 추적 알고리즘)

  • Kim, Jong-Chan;Lim, Su-Chang
    • Journal of Korea Multimedia Society / v.25 no.2 / pp.247-256 / 2022
  • Visual tracking of selected target objects is a fundamental and challenging problem in computer vision. Object tracking localizes the region of the target object with a bounding box in the video. We propose a Siam-FPN-based custom fully convolutional neural network (CNN) that solves visual tracking problems by regressing the target area in an end-to-end manner. A feature map connection structure is applied to preserve the flow of feature information, so that information is preserved and emphasized across the network. To regress the object region and classify the object, a region proposal network is connected to the Siamese network. The performance of the tracking algorithm was evaluated on the OTB-100 dataset, using the Success Plot and Precision Plot as evaluation metrics. The experiments achieved 0.621 on the Success Plot and 0.838 on the Precision Plot.
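
The Success Plot and Precision Plot figures above can be illustrated with a short sketch. It assumes the usual OTB conventions, which the abstract does not spell out: boxes are `(x, y, w, h)` tuples, success is summarized as the mean success rate over IoU thresholds from 0 to 1, and precision is the fraction of frames whose center error is within 20 pixels. All function names are illustrative.

```python
import math


def iou(a, b):
    """Intersection over union of two (x, y, w, h) boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    iw = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    ih = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def center_error(a, b):
    """Euclidean distance between box centers, in pixels."""
    return math.hypot((a[0] + a[2] / 2) - (b[0] + b[2] / 2),
                      (a[1] + a[3] / 2) - (b[1] + b[3] / 2))


def success_rate(preds, gts, threshold):
    """Fraction of frames whose IoU exceeds the threshold."""
    return sum(iou(p, g) > threshold for p, g in zip(preds, gts)) / len(gts)


def precision_rate(preds, gts, threshold=20.0):
    """Fraction of frames whose center error is within `threshold` pixels."""
    return sum(center_error(p, g) <= threshold
               for p, g in zip(preds, gts)) / len(gts)


def success_auc(preds, gts, steps=21):
    """Mean success rate over evenly spaced IoU thresholds in [0, 1]."""
    thresholds = [i / (steps - 1) for i in range(steps)]
    return sum(success_rate(preds, gts, t) for t in thresholds) / steps


# Two frames: a perfect prediction and one shifted 5 pixels to the left.
preds = [(0, 0, 10, 10), (0, 0, 10, 10)]
gts = [(0, 0, 10, 10), (5, 0, 10, 10)]
print(success_auc(preds, gts), precision_rate(preds, gts))
```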

High accuracy map matching method using monocular cameras and low-end GPS-IMU systems (단안 카메라와 저정밀 GPS-IMU 신호를 융합한 맵매칭 방법)

  • Kim, Yong-Gyun;Koo, Hyung-Il;Kang, Seok-Won;Kim, Joon-Won;Kim, Jae-Gwan
    • Journal of the Korea Academia-Industrial cooperation Society / v.19 no.4 / pp.34-40 / 2018
  • This paper presents a new method to estimate the pose of a moving object accurately using a monocular camera and a low-end GPS+IMU sensor system. For this goal, we adopted a deep neural network for the semantic segmentation of input images and compared the results with a semantic map of a neighborhood. In this map matching, we use weight tables to deal with label inconsistency effectively. Signals from a low-end GPS+IMU sensor system are used to limit search spaces and minimize the proposed function. For the evaluation, we added noise to the signals from a high-end GPS-IMU system. The results show that the pose can be recovered from the noisy signals. We also show that the proposed method is effective in handling non-open-sky situations.

Semantic Role Labeling using Biaffine Average Attention Model (Biaffine Average Attention 모델을 이용한 의미역 결정)

  • Nam, Chung-Hyeon;Jang, Kyung-Sik
    • Journal of the Korea Institute of Information and Communication Engineering / v.26 no.5 / pp.662-667 / 2022
  • The semantic role labeling (SRL) task is to extract a predicate and its arguments, such as agent, patient, place, and time. Previous SRL studies proposed a pipeline method that extracts linguistic features of a sentence, but in this method errors at each extraction step in the pipeline affect semantic role labeling performance. Therefore, methods using end-to-end neural network models have recently been proposed. In this paper, we propose a neural network model using the Biaffine Average Attention model for the SRL task. The proposed model has a structure that can focus on the information of the entire sentence regardless of the distance between the predicate and its arguments, in place of the LSTM models proposed in previous studies, which use surrounding information to predict a specific token. For evaluation, we used F1 scores to compare the proposed model with two BERT-based models proposed in existing studies, and found that the proposed model, at 76.21%, outperformed the comparison models.
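
As a rough illustration of why a biaffine scorer is independent of token distance, the sketch below scores one predicate vector against one argument vector using the generic biaffine form (a bilinear term plus a linear term over the concatenated vectors plus a bias). This is the generic form only, not the paper's Biaffine Average Attention model; the function name, the parameter names `U`, `w`, `b`, and the toy values are all assumptions.

```python
def biaffine_score(h_pred, h_arg, U, w, b):
    """Biaffine score: h_pred^T U h_arg + w . [h_pred; h_arg] + b."""
    # Bilinear term: every predicate dimension interacts with every argument dimension.
    bilinear = sum(h_pred[i] * U[i][j] * h_arg[j]
                   for i in range(len(h_pred))
                   for j in range(len(h_arg)))
    # Linear term over the concatenated predicate and argument vectors.
    linear = sum(wi * xi for wi, xi in zip(w, h_pred + h_arg))
    return bilinear + linear + b


# Toy 2-dimensional representations and parameters (hypothetical values).
h_pred = [1.0, 2.0]
h_arg = [3.0, 4.0]
U = [[1.0, 0.0], [0.0, 1.0]]
w = [1.0, 1.0, 1.0, 1.0]
print(biaffine_score(h_pred, h_arg, U, w, b=0.5))
```

The score depends only on the two vectors, so a predicate and an argument far apart in the sentence are scored the same way as adjacent ones. A labeler would typically keep one set of parameters per role label and pick the highest-scoring label.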

Urinary Stones Segmentation Model and AI Web Application Development in Abdominal CT Images Through Machine Learning (기계학습을 통한 복부 CT영상에서 요로결석 분할 모델 및 AI 웹 애플리케이션 개발)

  • Lee, Chung-Sub;Lim, Dong-Wook;Noh, Si-Hyeong;Kim, Tae-Hoon;Park, Sung-Bin;Yoon, Kwon-Ha;Jeong, Chang-Won
    • KIPS Transactions on Computer and Communication Systems / v.10 no.11 / pp.305-310 / 2021
  • Artificial intelligence technology in the medical field initially focused on analysis and algorithm development, but it is gradually shifting to web application development for service as a product. This paper describes a urinary stone segmentation model for abdominal CT images and an artificial intelligence web application based on it. The model was developed using U-Net, an end-to-end, fully convolutional network-based model proposed for image segmentation in the medical imaging field. The web service was developed on the AWS cloud using Flask, a Python-based micro web framework. Finally, the result predicted by the urolithiasis segmentation model through model serving is shown as the output of the AI web application service. We expect that our proposed AI web application service will be utilized for screening tests.

Image classification and captioning model considering a CAM-based disagreement loss

  • Yoon, Yeo Chan;Park, So Young;Park, Soo Myoung;Lim, Heuiseok
    • ETRI Journal / v.42 no.1 / pp.67-77 / 2020
  • Image captioning has received significant interest in recent years, and notable results have been achieved. Most previous approaches have focused on generating visual descriptions from images, whereas a few approaches have exploited visual descriptions for image classification. This study demonstrates that a good performance can be achieved for both description generation and image classification through an end-to-end joint learning approach with a loss function, which encourages each task to reach a consensus. When given images and visual descriptions, the proposed model learns a multimodal intermediate embedding, which can represent both the textual and visual characteristics of an object. The performance can be improved for both tasks by sharing the multimodal embedding. Through a novel loss function based on class activation mapping, which localizes the discriminative image region of a model, we achieve a higher score when the captioning and classification model reaches a consensus on the key parts of the object. Using the proposed model, we established a substantially improved performance for each task on the UCSD Birds and Oxford Flowers datasets.
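
Class activation mapping, on which the loss in this abstract is built, can be sketched as a weighted sum of the final convolutional feature maps using one class's classifier weights. The sketch below shows only that standard CAM computation on plain nested lists; it does not implement the paper's disagreement loss, whose exact form the abstract does not give, and the function name and toy values are assumptions.

```python
def class_activation_map(feature_maps, class_weights):
    """Weighted sum of K feature maps (each H x W) using one class's weights."""
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    cam = [[0.0] * width for _ in range(height)]
    for fmap, weight in zip(feature_maps, class_weights):
        for y in range(height):
            for x in range(width):
                cam[y][x] += weight * fmap[y][x]
    return cam


# Two toy 2x2 feature maps and the weights linking them to one class.
feature_maps = [
    [[1.0, 0.0], [0.0, 0.0]],
    [[0.0, 0.0], [0.0, 2.0]],
]
class_weights = [0.5, 2.0]
print(class_activation_map(feature_maps, class_weights))
```

Large values in the resulting map mark the image regions that contribute most to the class score, which is what lets a loss compare the regions two tasks attend to.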