• Title/Summary/Keyword: Fully Convolutional Network

Search Results: 118

Transfer Learning using Multiple ConvNet Layers Activation Features with Principal Component Analysis for Image Classification (전이학습 기반 다중 컨볼류션 신경망 레이어의 활성화 특징과 주성분 분석을 이용한 이미지 분류 방법)

  • Byambajav, Batkhuu;Alikhanov, Jumabek;Fang, Yang;Ko, Seunghyun;Jo, Geun Sik
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.1
    • /
    • pp.205-225
    • /
    • 2018
  • Convolutional Neural Network (ConvNet) is a class of powerful deep neural networks that can analyze and learn hierarchies of visual features. The first such network, the Neocognitron, was introduced in the 1980s. At that time, neural networks were not broadly used in either industry or academia because of the shortage of large-scale datasets and low computational power. A few decades later, however, in 2012, Krizhevsky made a breakthrough in the ILSVRC-12 visual recognition competition using a Convolutional Neural Network, and that breakthrough revived interest in neural networks. The success of Convolutional Neural Networks rests on two main factors. The first is the emergence of advanced hardware (GPUs) for sufficient parallel computation. The second is the availability of large-scale datasets such as the ImageNet (ILSVRC) dataset for training. Unfortunately, many new domains are bottlenecked by these factors. For most domains, gathering a large-scale dataset to train a ConvNet is difficult and requires a lot of effort. Moreover, even with a large-scale dataset, training a ConvNet from scratch requires expensive resources and is time-consuming. These two obstacles can be overcome with transfer learning, a method for transferring knowledge from a source domain to a new domain. There are two major transfer learning cases. The first uses the ConvNet as a fixed feature extractor; the second fine-tunes the ConvNet on a new dataset. In the first case, a pre-trained ConvNet (e.g., trained on ImageNet) computes feed-forward activations of the image, and activation features are extracted from specific layers. In the second case, the ConvNet classifier is replaced and retrained on the new dataset, and the weights of the pre-trained network are then fine-tuned with backpropagation. In this paper, we focus on using multiple ConvNet layers as a fixed feature extractor only. However, applying high-dimensional features extracted directly from multiple ConvNet layers is still a challenging problem. We observe that features extracted from multiple ConvNet layers capture different characteristics of the image, which means a better representation could be obtained by finding the optimal combination of multiple ConvNet layers. Based on that observation, we propose to employ multiple ConvNet layer representations for transfer learning instead of a single ConvNet layer representation. Overall, our pipeline has three steps. First, an image from the target task is fed forward through a pre-trained AlexNet, and the activation features of its three fully connected layers are extracted. Second, the activation features of the three layers are concatenated to obtain the multiple ConvNet layer representation, which carries more information about the image; the concatenated representation has 9,192 (4,096 + 4,096 + 1,000) dimensions. However, features extracted from multiple layers of the same ConvNet are redundant and noisy. Thus, in a third step, we use Principal Component Analysis (PCA) to select salient features before the training phase. With salient features, the classifier can classify images more accurately, and the performance of transfer learning is improved.
To evaluate the proposed method, experiments were conducted on three standard datasets (Caltech-256, VOC07, and SUN397) to compare multiple ConvNet layer representations against single ConvNet layer representations, using PCA for feature selection and dimension reduction. Our experiments demonstrated the importance of feature selection for the multiple ConvNet layer representation. Moreover, our proposed approach achieved 75.6% accuracy compared to the 73.9% of the FC7 layer on Caltech-256, 73.1% compared to the 69.2% of the FC8 layer on VOC07, and 52.2% compared to the 48.7% of the FC7 layer on SUN397. We also showed that our proposed approach achieved superior performance, with accuracy improvements of 2.8%, 2.1%, and 3.1% on Caltech-256, VOC07, and SUN397 respectively, compared to existing work.
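
A minimal sketch of the three-step pipeline described in this abstract, assuming torchvision's pre-trained AlexNet; the number of PCA components (512) and the linear SVM classifier are illustrative assumptions, not the paper's exact choices.

```python
import torch
import torchvision.models as models
from sklearn.decomposition import PCA
from sklearn.svm import LinearSVC

alexnet = models.alexnet(weights=models.AlexNet_Weights.IMAGENET1K_V1).eval()

def multi_layer_features(x):
    """Steps 1-2: concatenate FC6, FC7, FC8 activations into a 9192-dim vector.

    x: a preprocessed (N, 3, 224, 224) image batch.
    """
    with torch.no_grad():
        h = torch.flatten(alexnet.avgpool(alexnet.features(x)), 1)
        c = alexnet.classifier
        fc6 = c[2](c[1](c[0](h)))    # 4096-dim, after ReLU
        fc7 = c[5](c[4](c[3](fc6)))  # 4096-dim, after ReLU
        fc8 = c[6](fc7)              # 1000-dim
    return torch.cat([fc6, fc7, fc8], dim=1)

# Step 3: PCA selects salient features before training a classifier.
# X: (num_images, 9192) features stacked over a dataset; y: labels.
# X_pca = PCA(n_components=512).fit_transform(X)   # 512 is an assumption
# clf = LinearSVC().fit(X_pca, y)
```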

High-Capacity Robust Image Steganography via Adversarial Network

  • Chen, Beijing;Wang, Jiaxin;Chen, Yingyue;Jin, Zilong;Shim, Hiuk Jae;Shi, Yun-Qing
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.14 no.1
    • /
    • pp.366-381
    • /
    • 2020
  • Steganography has been successfully employed in various applications, e.g., copyright control of materials, smart identity cards, video error correction during transmission, etc. Deep learning-based steganography models can hide information adaptively through network learning, and they are drawing much more attention. However, the capacity, security, and robustness of existing deep learning-based steganography models are still not fully satisfactory. In this paper, three models are proposed for different cases: a basic model, a secure model, and a secure and robust model. In the basic model, high-capacity hiding and extraction of secret information are realized through an encoding network and a decoding network, respectively. The high-capacity steganography is implemented by hiding a secret image in a carrier image of the same resolution with the help of concat operations, InceptionBlocks, and convolutional layers. Moreover, the secret image is hidden only in the B channel of the carrier image to resolve the problem of color distortion. In the secure model, to enhance the security of the basic model, a steganalysis network is added to the basic model to form an adversarial network. In the secure and robust model, an attack network is inserted into the secure model to further improve its robustness. The experimental results demonstrate that the proposed secure model and secure and robust model perform better overall than some existing high-capacity deep learning-based steganography models. The secure model performs best in invisibility and security; the secure and robust model is the most robust against the attacks considered.
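
A minimal PyTorch sketch of the basic model's core idea, hiding the secret image only in the carrier's B channel; the layer sizes are illustrative assumptions, and the paper's InceptionBlocks are replaced by plain convolutions for brevity.

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self):
        super().__init__()
        # input: carrier B channel (1) concatenated with the secret image (3)
        self.net = nn.Sequential(
            nn.Conv2d(4, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 3, padding=1), nn.Sigmoid())  # new B channel

    def forward(self, carrier, secret):
        b = carrier[:, 2:3]  # B channel of an RGB carrier
        stego_b = self.net(torch.cat([b, secret], dim=1))
        # R and G stay untouched, which limits color distortion
        return torch.cat([carrier[:, :2], stego_b], dim=1)

class Decoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1), nn.Sigmoid())  # recovered secret

    def forward(self, stego):
        return self.net(stego[:, 2:3])  # read back from the B channel only

carrier, secret = torch.rand(1, 3, 256, 256), torch.rand(1, 3, 256, 256)
recovered = Decoder()(Encoder()(carrier, secret))
```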

A Deep Learning-based Streetscapes Safety Score Prediction Model using Environmental Context from Big Data (빅데이터로부터 추출된 주변 환경 컨텍스트를 반영한 딥러닝 기반 거리 안전도 점수 예측 모델)

  • Lee, Gi-In;Kang, Hang-Bong
    • Journal of Korea Multimedia Society
    • /
    • v.20 no.8
    • /
    • pp.1282-1290
    • /
    • 2017
  • Since mitigating the fear of crime significantly enhances consumption in a city, studies focusing on urban safety analysis have received much attention as a means of revitalizing the local economy. In addition, with the development of computer vision and machine learning technologies, efficient and automated analysis methods have been developed. Previous studies have used global features to predict the safety of cities, yet this approach has limited ability to accurately predict abstract information such as safety assessments. We therefore use a Convolutional Context Neural Network (CCNN) that considers "context" as a decision criterion to accurately predict the safety of cities. The CCNN model is constructed by combining a stacked autoencoder with a fully connected network to find the context, which is then used in the CNN model to predict the score. We analyzed the RMSE and correlation of SVR, AlexNet, and Sharing models to compare their performance with that of the CCNN model. Our results indicate that our model achieves much better RMSE and Pearson/Spearman correlation coefficients.
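
A rough sketch of the CCNN idea as described: a stacked autoencoder plus a fully connected head infers a scene context, and the context conditions a CNN that regresses the safety score. All dimensions, the number of contexts, and the wiring are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class ContextEncoder(nn.Module):
    """Encoder half of a stacked autoencoder plus an FC context classifier."""
    def __init__(self, in_dim=4096, ctx_dim=64, n_contexts=8):
        super().__init__()
        self.encode = nn.Sequential(
            nn.Linear(in_dim, 1024), nn.ReLU(),
            nn.Linear(1024, 256), nn.ReLU(),
            nn.Linear(256, ctx_dim))
        self.fc_context = nn.Linear(ctx_dim, n_contexts)

    def forward(self, global_feat):
        return self.fc_context(self.encode(global_feat)).softmax(dim=1)

class SafetyCNN(nn.Module):
    """CNN that predicts a safety score conditioned on the context."""
    def __init__(self, n_contexts=8):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 16, 5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.head = nn.Linear(32 + n_contexts, 1)

    def forward(self, image, context_probs):
        return self.head(torch.cat([self.backbone(image), context_probs], dim=1))

ctx = ContextEncoder()(torch.rand(1, 4096))
score = SafetyCNN()(torch.rand(1, 3, 224, 224), ctx)
```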

Multi-scale face detector using anchor free method

  • Lee, Dong-Ryeol;Kim, Yoon
    • Journal of the Korea Society of Computer and Information
    • /
    • v.25 no.7
    • /
    • pp.47-55
    • /
    • 2020
  • In this paper, we propose a one-stage multi-scale face detector based on a Fully Convolutional Network using an anchor-free method. Recently, almost all state-of-the-art face detectors predict the locations of faces with anchor-based methods that rely on pre-defined anchor boxes. However, these detectors need hyper-parameters and additional computation for the anchors during training. The key idea of the proposed method is to eliminate these hyper-parameters and this additional computation with an anchor-free method. To do this, we apply two ideas. First, by eliminating the pre-defined set of anchor boxes, we avoid the additional computation and hyper-parameters related to anchor boxes. Second, our detector predicts the locations of faces using multiple feature maps to reduce the foreground/background imbalance. The performance of the proposed method is quantitatively evaluated and analyzed; experimental results on the FDDB dataset demonstrate its effectiveness.
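
A minimal sketch of an anchor-free detection head in the spirit of the two ideas above: every location on each multi-scale feature map directly predicts a face score plus the distances to the four box sides, so no anchor boxes or anchor hyper-parameters are needed. The class name, channel count, and three-level pyramid are assumptions for illustration.

```python
import torch
import torch.nn as nn

class AnchorFreeHead(nn.Module):
    def __init__(self, in_channels=256):
        super().__init__()
        self.cls = nn.Conv2d(in_channels, 1, 3, padding=1)  # face vs. background
        self.reg = nn.Conv2d(in_channels, 4, 3, padding=1)  # left/top/right/bottom

    def forward(self, feature_maps):
        # one prediction per spatial location on every pyramid level;
        # distances are kept non-negative with ReLU
        return [(self.cls(f), self.reg(f).relu()) for f in feature_maps]

# e.g. three feature maps at strides 8, 16, 32 for multi-scale faces
feats = [torch.rand(1, 256, 64, 64),
         torch.rand(1, 256, 32, 32),
         torch.rand(1, 256, 16, 16)]
outputs = AnchorFreeHead()(feats)
```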

Robust Coronary Artery Segmentation in 2D X-ray Images using Local Patch-based Re-connection Methods (지역적 패치기반 보정기법을 활용한 2D X-ray 영상에서의 강인한 관상동맥 재연결 기법)

  • Han, Kyunghoon;Jeon, Byunghwan;Kim, Sekeun;Jang, Yeonggul;Jung, Sunghee;Shim, Hackjoon;Chang, Hyukjae
    • Journal of Broadcast Engineering
    • /
    • v.24 no.4
    • /
    • pp.592-601
    • /
    • 2019
  • For coronary procedures, X-ray angiogram images are useful for diagnosis and for assisting the procedure. It is challenging to accurately segment a coronary artery in 2D X-ray images using only a single segmentation model, owing to the complex three-dimensional structure of the coronary artery and, especially, the phenomenon of vessels appearing broken in the middle or at the end of the artery. To solve these problems, an initial segmentation is performed using an existing single model, candidate regions for refined correction are estimated from the initial segmentation, and a local patch-based correction is performed in those candidate regions. With this method, not only are broken coronary arteries re-connected, but the very thin distal part of the artery is also correctly recovered. Further, performance can be much improved by combining the proposed correction method with any existing coronary artery segmentation method. In this paper, U-Net, a fully convolutional network, was chosen as the segmentation method, and the proposed correction method was combined with it to demonstrate a significant improvement in performance on X-ray images from several patients.
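
A hypothetical sketch of the local patch-based correction step: endpoints of the initial vessel mask are treated as candidate break regions, a patch around each is re-segmented, and the result is merged back. The skeleton-endpoint test, the 64-pixel patch size, and the placeholder `local_model` are all assumptions, not the paper's exact procedure.

```python
import numpy as np
from scipy import ndimage
from skimage.morphology import skeletonize

def find_endpoints(mask):
    """A skeleton pixel with exactly one skeleton neighbor is an endpoint."""
    skel = skeletonize(mask > 0)
    neighbors = ndimage.convolve(skel.astype(int), np.ones((3, 3)), mode="constant")
    return np.argwhere(skel & (neighbors == 2))  # itself + exactly 1 neighbor

def reconnect(image, mask, local_model, patch=64):
    """Re-segment a patch around each endpoint and OR it into the mask.

    local_model: placeholder for any patch-level segmenter returning a
    boolean mask of the same shape as its input crop.
    """
    corrected = mask.copy()
    for y, x in find_endpoints(mask):
        y0, x0 = max(0, y - patch // 2), max(0, x - patch // 2)
        crop = image[y0:y0 + patch, x0:x0 + patch]
        corrected[y0:y0 + patch, x0:x0 + patch] |= local_model(crop)
    return corrected
```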

Development of Deep Learning Based Ensemble Land Cover Segmentation Algorithm Using Drone Aerial Images (드론 항공영상을 이용한 딥러닝 기반 앙상블 토지 피복 분할 알고리즘 개발)

  • Hae-Gwang Park;Seung-Ki Baek;Seung Hyun Jeong
    • Korean Journal of Remote Sensing
    • /
    • v.40 no.1
    • /
    • pp.71-80
    • /
    • 2024
  • In this study, we propose an ensemble learning technique to enhance semantic segmentation performance on images captured by Unmanned Aerial Vehicles (UAVs). With the increasing use of UAVs in fields such as urban planning, techniques that apply deep learning segmentation methods to land cover segmentation have been actively developed. The study suggests a method that utilizes prominent segmentation models, namely U-Net, DeepLabV3, and Fully Convolutional Network (FCN), to improve segmentation prediction performance. The proposed approach integrates the training loss, validation accuracy, and class scores of the three segmentation models to enhance overall prediction performance. The method was applied and evaluated on a land cover segmentation problem involving seven classes: buildings, roads, parking lots, fields, trees, empty spaces, and areas with unspecified labels, using images captured by UAVs. The performance of the ensemble model was evaluated by mean Intersection over Union (mIoU), and a comparison of the proposed ensemble model with the three existing segmentation methods showed improved mIoU performance. Consequently, the study confirms that the proposed technique can enhance the performance of semantic segmentation models.
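
A minimal sketch of the ensemble idea: per-pixel class probabilities from U-Net, DeepLabV3, and FCN are combined with weights derived from each model's validation accuracy. The normalized-accuracy weighting shown here is an illustrative assumption, not necessarily the paper's exact integration rule.

```python
import torch

def ensemble_predict(models, val_accuracies, image):
    """models: nets returning (N, 7, H, W) logits for the seven land cover classes.

    val_accuracies: one validation accuracy per model, used as fusion weights.
    """
    weights = torch.tensor(val_accuracies)
    weights = weights / weights.sum()  # normalize weights to sum to 1
    probs = sum(w * m(image).softmax(dim=1) for w, m in zip(weights, models))
    return probs.argmax(dim=1)         # (N, H, W) per-pixel class map
```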

Deep Learning Similarity-based 1:1 Matching Method for Real Product Image and Drawing Image

  • Han, Gi-Tae
    • Journal of the Korea Society of Computer and Information
    • /
    • v.27 no.12
    • /
    • pp.59-68
    • /
    • 2022
  • This paper presents a method for 1:1 verification that compares the similarity between a given real product image and a drawing image. The proposed method combines two existing CNN-based deep learning models to construct a Siamese network. The feature vector of each image is extracted through the FC (Fully Connected) layer of its network, and the similarity is computed; during training, the target similarity is set to 1 if the real product image and the drawing image (front view, left and right side views, top view, etc.) show the same product, and to 0 if they show different products. The test (inference) model is a deep learning model that takes the real product image and the drawing image as a query pair and determines whether the pair is the same product. In the proposed model, if the similarity between the real product image and the drawing image is greater than or equal to a threshold (0.5), the pair is determined to be the same product; if it is below the threshold, it is determined to be a different product. The proposed model showed an accuracy of about 71.8% for queries where the drawing matches the real product (positive:positive) and about 83.1% for queries against a different product (positive:negative). In future work, we plan to improve the matching accuracy between real product images and drawing images by optimizing the parameters of the proposed model and adding processes such as data purification.
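
A hypothetical sketch of the Siamese comparison: two CNN branches produce FC feature vectors, a similarity in [0, 1] is computed, and the 0.5 threshold decides same/different product. The ResNet-18 backbones and the cosine-similarity form are assumptions; the paper only says two existing CNN models are combined.

```python
import torch
import torch.nn as nn
import torchvision.models as models

class SiameseMatcher(nn.Module):
    def __init__(self):
        super().__init__()
        self.photo_net = models.resnet18(weights=None)    # real product branch
        self.drawing_net = models.resnet18(weights=None)  # drawing branch
        self.photo_net.fc = nn.Identity()    # expose the 512-dim FC features
        self.drawing_net.fc = nn.Identity()

    def forward(self, photo, drawing):
        a = self.photo_net(photo)
        b = self.drawing_net(drawing)
        # cosine similarity mapped from [-1, 1] to [0, 1]
        return (nn.functional.cosine_similarity(a, b) + 1) / 2

matcher = SiameseMatcher().eval()
sim = matcher(torch.rand(1, 3, 224, 224), torch.rand(1, 3, 224, 224))
same_product = sim.item() >= 0.5  # threshold from the paper
```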

A Comparative Study on Deep Learning Topology for Event Extraction from Biomedical Literature (생의학 분야 학술 문헌에서의 이벤트 추출을 위한 심층 학습 모델 구조 비교 분석 연구)

  • Kim, Seon-Wu;Yu, Seok Jong;Lee, Min-Ho;Choi, Sung-Pil
    • Journal of the Korean Society for Library and Information Science
    • /
    • v.51 no.4
    • /
    • pp.77-97
    • /
    • 2017
  • A recent sharp increase in the biomedical literature makes it hard for researchers to grasp current research trends and to conduct creative studies building on previous results. To alleviate the difficulty of keeping up with the latest scholarly trends, numerous attempts have been made to develop specialized analytic services that provide direct, intuitive, and formalized scholarly information using text mining technologies such as information extraction and event detection. This paper introduces and evaluates a total of 8 Convolutional Neural Network (CNN) models for extracting biomedical events from academic abstracts, applying various feature utilization approaches, and conducts a comparative performance evaluation of the proposed models. As a result of the comparison, we confirmed that the Entity-Type-Fully-Connected model, one of the models introduced in the paper, showed the most promising performance in the event classification task (72.09% F-score), while it achieved a relatively low but comparable result (21.81%) in the entire event extraction process, due to the class imbalance of the training collections and the low performance of the event identification model.
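
A rough sketch of one plausible reading of the Entity-Type-Fully-Connected idea: a CNN encodes the sentence, and an entity-type embedding is concatenated at the fully connected layer before event classification. All sizes, vocabularies, and the exact wiring are assumptions for illustration, not the paper's specification.

```python
import torch
import torch.nn as nn

class EntityTypeFC(nn.Module):
    def __init__(self, vocab=10000, emb=100, n_entity_types=10, n_events=9):
        super().__init__()
        self.word_emb = nn.Embedding(vocab, emb)
        self.conv = nn.Conv1d(emb, 128, kernel_size=3, padding=1)
        self.type_emb = nn.Embedding(n_entity_types, 32)
        self.fc = nn.Linear(128 + 32, n_events)

    def forward(self, token_ids, entity_type_id):
        x = self.word_emb(token_ids).transpose(1, 2)  # (N, emb, seq_len)
        x = self.conv(x).relu().max(dim=2).values     # max-over-time pooling
        t = self.type_emb(entity_type_id)             # entity-type feature
        return self.fc(torch.cat([x, t], dim=1))      # event class logits

logits = EntityTypeFC()(torch.randint(0, 10000, (2, 40)), torch.tensor([1, 4]))
```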

Malaria Cell Image Recognition Based On VGG19 Using Transfer Learning (전이 학습을 이용한 VGG19 기반 말라리아셀 이미지 인식)

  • Peng, Xiangshen;Kim, Kangchul
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.17 no.3
    • /
    • pp.483-490
    • /
    • 2022
  • Malaria is a parasitic disease that is prevalent all over the world. The usual method for recognizing malaria cells is examination of thick and thin blood smears, but this method requires a lot of manual work, so its efficiency and accuracy are very low; moreover, the lack of pathologists in impoverished countries has led to high malaria mortality rates. In this paper, a malaria cell image recognition model using transfer learning is proposed, consisting of a feature extractor, a residual structure, and fully connected layers. The pre-trained parameters of the VGG-19 model are imported into the proposed model, the parameters of some convolutional layers are frozen, and fine-tuning is used to fit the model to the data. We also implement another malaria cell recognition model without the residual structure to compare with the proposed model. The simulation results show that the model with the residual structure performs better than the one without it, and that the proposed model achieves the best accuracy, 97.33%, compared to other recent papers.
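
A minimal sketch of the transfer learning setup described above: VGG-19 is loaded with ImageNet weights, early convolutional layers are frozen, and a new head is fine-tuned for the two-class malaria task. The number of frozen layers and the replacement head are assumptions, and the paper's residual structure is omitted for brevity.

```python
import torch
import torch.nn as nn
import torchvision.models as models

vgg = models.vgg19(weights=models.VGG19_Weights.IMAGENET1K_V1)

# Freeze the early convolutional blocks; only later layers are fine-tuned.
for param in vgg.features[:20].parameters():
    param.requires_grad = False

# Replace the final classifier layer: infected vs. uninfected cells.
vgg.classifier[6] = nn.Linear(4096, 2)

# Fine-tune only the parameters that remain trainable.
optimizer = torch.optim.Adam(
    (p for p in vgg.parameters() if p.requires_grad), lr=1e-4)
```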

AANet: Adjacency auxiliary network for salient object detection

  • Li, Xialu;Cui, Ziguan;Gan, Zongliang;Tang, Guijin;Liu, Feng
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.15 no.10
    • /
    • pp.3729-3749
    • /
    • 2021
  • At present, deep convolutional network-based salient object detection (SOD) has achieved impressive performance. However, it is still challenging to make full use of the multi-scale information in the extracted features and to choose an appropriate feature fusion method for processing the feature maps. In this paper, we propose a new adjacency auxiliary network (AANet) based on multi-scale feature fusion for SOD. First, we design a parallel connection feature enhancement module (PFEM) for each layer of feature extraction, which improves feature density by connecting different dilated convolution branches in parallel and adds a channel attention flow to fully extract the contextual information of the features. Then, adjacent-layer features with similar degrees of abstraction but different characteristic properties are fused through the adjacency auxiliary module (AAM) to eliminate ambiguity and noise in the features. In addition, in order to refine the features effectively and obtain more accurate object boundaries, we design an adjacency decoder (AAM_D) based on the adjacency auxiliary module (AAM), which concatenates the features of adjacent layers, extracts their spatial attention, and then combines them with the output of the AAM. The AAM_D outputs, which carry both semantic information and spatial detail, are used as saliency prediction maps for joint multi-level supervision. Experimental results on six benchmark SOD datasets demonstrate that the proposed method outperforms similar previous methods.
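
A hypothetical sketch of the PFEM idea only: parallel dilated convolution branches densify the features, and a squeeze-and-excitation style channel attention reweights the fused result. The dilation rates, channel counts, and attention form are illustrative assumptions, not the paper's exact module.

```python
import torch
import torch.nn as nn

class PFEMSketch(nn.Module):
    def __init__(self, channels=64):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv2d(channels, channels, 3, padding=d, dilation=d)
            for d in (1, 2, 4)])                 # parallel dilated branches
        self.fuse = nn.Conv2d(3 * channels, channels, 1)
        self.attn = nn.Sequential(               # channel attention flow
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // 4, 1), nn.ReLU(),
            nn.Conv2d(channels // 4, channels, 1), nn.Sigmoid())

    def forward(self, x):
        fused = self.fuse(torch.cat([b(x).relu() for b in self.branches], dim=1))
        return fused * self.attn(fused)           # reweight fused channels

out = PFEMSketch()(torch.rand(1, 64, 32, 32))
```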