• Title/Summary/Keyword: Image Semantic Segmentation


A Proposal of Deep Learning Based Semantic Segmentation to Improve Performance of Building Information Models Classification (Semantic Segmentation 기반 딥러닝을 활용한 건축 Building Information Modeling 부재 분류성능 개선 방안)

  • Lee, Ko-Eun; Yu, Young-Su; Ha, Dae-Mok; Koo, Bon-Sang; Lee, Kwan-Hoon
    • Journal of KIBIM / v.11 no.3 / pp.22-33 / 2021
  • In order to maximize the use of BIM, all data related to individual elements in the model must be correctly assigned, and it is essential to check whether each element corresponds to the proper IFC entity classification. However, because the BIM modeling process involves a large number of participants, it is difficult to achieve complete integrity. To solve this problem, studies on semantic integrity verification are being conducted that examine whether elements in a BIM model are correctly classified or IFC-mapped by applying artificial intelligence algorithms to 2D images of each element. Existing studies had a limitation in that they could not correctly classify some elements even when the geometrical differences in the images were clear. This was because the region of the image to be learned was not clearly defined, so the geometrical characteristics were not properly reflected in the learning process. In this study, CRF-RNN-based semantic segmentation was applied to delineate the element region within each image more clearly, and the segmented images were then fed to the MVCNN algorithm to improve classification performance. When semantic segmentation was applied in the MVCNN learning process to 889 images covering eight BIM element types, the classification accuracy was 0.92, an improvement of 0.06 over the conventional MVCNN.
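
To illustrate the multi-view classification stage described above, the following is a minimal PyTorch sketch of an MVCNN-style network with max-pooling across views. The ResNet-18 backbone, the eight-class output, and the assumption that segmentation-masked view images are prepared beforehand are illustrative choices, not the authors' implementation.

```python
# Minimal multi-view CNN (MVCNN) sketch: per-view features are extracted with
# a shared backbone, max-pooled across views, then classified. Masking each
# view with a segmentation mask (as the paper proposes) would be a
# preprocessing step applied before the tensors reach this module.
import torch
import torch.nn as nn
from torchvision import models


class MVCNN(nn.Module):
    def __init__(self, num_classes: int = 8):
        super().__init__()
        backbone = models.resnet18(weights=None)           # shared per-view encoder
        self.features = nn.Sequential(*list(backbone.children())[:-1])
        self.classifier = nn.Linear(backbone.fc.in_features, num_classes)

    def forward(self, views: torch.Tensor) -> torch.Tensor:
        # views: (batch, num_views, 3, H, W)
        b, v, c, h, w = views.shape
        feats = self.features(views.view(b * v, c, h, w))   # (b*v, 512, 1, 1)
        feats = feats.view(b, v, -1)                        # (b, v, 512)
        pooled, _ = feats.max(dim=1)                        # view pooling
        return self.classifier(pooled)


logits = MVCNN(num_classes=8)(torch.randn(2, 12, 3, 224, 224))
print(logits.shape)  # torch.Size([2, 8])
```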

A Study on Attention Mechanism in DeepLabv3+ for Deep Learning-based Semantic Segmentation (딥러닝 기반의 Semantic Segmentation을 위한 DeepLabv3+에서 강조 기법에 관한 연구)

  • Shin, SeokYong; Lee, SangHun; Han, HyunHo
    • Journal of the Korea Convergence Society / v.12 no.10 / pp.55-61 / 2021
  • In this paper, we propose a DeepLabv3+-based encoder-decoder model that utilizes an attention mechanism for precise semantic segmentation. DeepLabv3+ is a deep learning-based semantic segmentation method mainly used in applications such as autonomous vehicles and infrared image analysis. The conventional DeepLabv3+ makes little use of the encoder's intermediate feature maps in the decoder, which causes information loss during the restoration process and reduces segmentation accuracy. The proposed method therefore first minimizes the restoration loss by additionally using one intermediate feature map, and fuses the feature maps hierarchically, starting from the smallest, in order to utilize it effectively. Finally, an attention mechanism is applied to the decoder to maximize its ability to fuse the intermediate feature maps. We evaluated the proposed method on the Cityscapes dataset, which is commonly used in street-scene segmentation research. Experimental results show that the proposed method improves segmentation results compared to the conventional DeepLabv3+ and can be used in applications that require high accuracy.
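
As a rough illustration of gating an encoder skip feature with channel attention before decoder fusion, here is a small PyTorch sketch; the SE-style attention block and the tensor sizes are assumptions for illustration and do not reproduce the paper's exact design.

```python
# SE-style channel attention applied to an encoder skip feature before it is
# concatenated with the upsampled decoder feature.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ChannelAttention(nn.Module):
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x):
        w = self.fc(x.mean(dim=(2, 3)))          # global average pool -> channel weights
        return x * w[:, :, None, None]           # reweight channels


def fuse(decoder_feat, skip_feat, attn: ChannelAttention):
    # Upsample the decoder feature to the skip resolution, gate the skip with
    # channel attention, then concatenate for the next decoder convolution.
    decoder_feat = F.interpolate(decoder_feat, size=skip_feat.shape[2:],
                                 mode="bilinear", align_corners=False)
    return torch.cat([decoder_feat, attn(skip_feat)], dim=1)


attn = ChannelAttention(64)
out = fuse(torch.randn(1, 256, 32, 32), torch.randn(1, 64, 128, 128), attn)
print(out.shape)  # torch.Size([1, 320, 128, 128])
```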

Semantic Segmentation of Drone Images Based on Combined Segmentation Network Using Multiple Open Datasets (개방형 다중 데이터셋을 활용한 Combined Segmentation Network 기반 드론 영상의 의미론적 분할)

  • Ahram Song
    • Korean Journal of Remote Sensing / v.39 no.5_3 / pp.967-978 / 2023
  • This study proposed and validated a combined segmentation network (CSN) designed to train effectively on multiple drone image datasets and enhance the accuracy of semantic segmentation. CSN shares the entire encoding domain to accommodate the diversity of three drone datasets, while the decoding domains are trained independently. During training, the segmentation accuracy of CSN was lower than that of U-Net and the pyramid scene parsing network (PSPNet) on single datasets because it considers the loss values of all datasets simultaneously. However, when applied to domestic autonomous-drone images, CSN classified pixels into appropriate classes without additional training and outperformed PSPNet. These results suggest that CSN can serve as a valuable tool for training effectively on diverse drone image datasets and improving object recognition accuracy in new regions.
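
The shared-encoder/per-dataset-decoder idea can be sketched as follows in PyTorch; the tiny convolutional encoder, the class counts, and the simple summed loss are illustrative assumptions rather than the CSN architecture itself.

```python
# One shared encoder, one decoder head per dataset, and a loss summed over all
# datasets in each step so the encoder sees gradients from every domain.
import torch
import torch.nn as nn
import torch.nn.functional as F


class CombinedSegNet(nn.Module):
    def __init__(self, classes_per_dataset):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(inplace=True))
        self.decoders = nn.ModuleList(
            [nn.Conv2d(64, c, 1) for c in classes_per_dataset])

    def forward(self, x, dataset_idx: int):
        return self.decoders[dataset_idx](self.encoder(x))


model = CombinedSegNet(classes_per_dataset=[5, 8, 12])
batches = [(torch.randn(2, 3, 64, 64), torch.randint(0, c, (2, 64, 64)))
           for c in (5, 8, 12)]
# One training step: losses from all three datasets are accumulated together.
loss = sum(F.cross_entropy(model(x, i), y) for i, (x, y) in enumerate(batches))
loss.backward()
print(loss.item())
```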

Skin Lesion Segmentation with Codec Structure Based Upper and Lower Layer Feature Fusion Mechanism

  • Yang, Cheng; Lu, GuanMing
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.1 / pp.60-79 / 2022
  • U-Net-based segmentation models have attained remarkable performance in numerous medical image segmentation tasks such as skin lesion segmentation. Nevertheless, as the network deepens, the resolution gradually decreases and the loss of spatial information increases. The fusion of adjacent layers is not enough to make up for the lost spatial information, which leads to errors at the segmentation boundary and lowers segmentation accuracy. To tackle this issue, we propose a new deep learning-based segmentation model. In the decoding stage, the feature channels of each decoding unit are concatenated with all the feature channels of the corresponding upper coding unit; this integrates spatial and semantic information to preserve segmentation quality. The robustness and generalization of the model are further promoted by combining an atrous spatial pyramid pooling (ASPP) module and a channel attention module (CAM). Extensive experiments on the public ISIC2016 and ISIC2017 datasets show that our model performs well and outperforms the compared segmentation models for skin lesion segmentation.
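
Since the abstract mentions an atrous spatial pyramid pooling (ASPP) module, here is a minimal, generic ASPP sketch in PyTorch; the dilation rates and channel counts are common defaults and not necessarily those used in the paper.

```python
# ASPP: parallel dilated convolutions at several rates plus image-level
# pooling, concatenated and projected back to a single feature map.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ASPP(nn.Module):
    def __init__(self, in_ch, out_ch, rates=(1, 6, 12, 18)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv2d(in_ch, out_ch, 3 if r > 1 else 1,
                      padding=r if r > 1 else 0, dilation=r)
            for r in rates])
        self.image_pool = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Conv2d(in_ch, out_ch, 1))
        self.project = nn.Conv2d(out_ch * (len(rates) + 1), out_ch, 1)

    def forward(self, x):
        h, w = x.shape[2:]
        feats = [branch(x) for branch in self.branches]
        pooled = F.interpolate(self.image_pool(x), size=(h, w),
                               mode="bilinear", align_corners=False)
        return self.project(torch.cat(feats + [pooled], dim=1))


out = ASPP(256, 64)(torch.randn(1, 256, 32, 32))
print(out.shape)  # torch.Size([1, 64, 32, 32])
```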

Semantic Segmentation using Convolutional Neural Network with Conditional Random Field (조건부 랜덤 필드와 컨볼루션 신경망을 이용한 의미론적인 객체 분할 방법)

  • Lim, Su-Chang; Kim, Do-Yeon
    • The Journal of the Korea Institute of Electronic Communication Sciences / v.12 no.3 / pp.451-456 / 2017
  • Semantic segmentation, one of the most fundamental yet challenging problems in computer vision, classifies each pixel of an image into a specific object category and assigns it a label. MRF and CRF have previously been studied as effective methods for improving the accuracy of pixel-level labeling. In this paper, we propose a semantic segmentation method that combines a CNN, a deep learning model that has recently drawn much attention, with a CRF, a probabilistic graphical model. For training and performance verification, the Pascal VOC 2012 image database was used, and testing was performed on images not used for training. The results show better segmentation performance than existing semantic segmentation algorithms.
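
A common way to realize the CNN + CRF combination is to refine the CNN's softmax output with a fully connected CRF; the sketch below assumes the third-party pydensecrf package and uses illustrative pairwise parameters, not the paper's exact settings.

```python
# Dense-CRF refinement of CNN softmax outputs (assumes pydensecrf is installed).
import numpy as np
import pydensecrf.densecrf as dcrf
from pydensecrf.utils import unary_from_softmax


def crf_refine(image: np.ndarray, probs: np.ndarray, iters: int = 5):
    """image: (H, W, 3) uint8; probs: (C, H, W) softmax output from the CNN."""
    c, h, w = probs.shape
    d = dcrf.DenseCRF2D(w, h, c)
    d.setUnaryEnergy(unary_from_softmax(probs))          # -log(prob) unaries
    d.addPairwiseGaussian(sxy=3, compat=3)               # smoothness term
    d.addPairwiseBilateral(sxy=80, srgb=13,              # appearance term
                           rgbim=np.ascontiguousarray(image), compat=10)
    q = d.inference(iters)                               # mean-field inference
    return np.argmax(q, axis=0).reshape(h, w)            # refined label map
```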

Efficient Deep Neural Network Architecture based on Semantic Segmentation for Paved Road Detection (효율적인 비정형 도로영역 인식을 위한 Semantic segmentation 기반 심층 신경망 구조)

  • Park, Sejin; Han, Jeong Hoon; Moon, Young Shik
    • Journal of the Korea Institute of Information and Communication Engineering / v.24 no.11 / pp.1437-1444 / 2020
  • With the development of computer vision systems, many advances have been made in the fields of surveillance, biometrics, medical imaging, and autonomous driving. In autonomous driving in particular, deep learning-based object detection techniques are widely used, and paved road detection is an especially crucial problem. Unlike the objects handled by ROI detection algorithms in general object detection, the paved road region in an image is irregular in shape, so ROI-based object recognition architectures are not applicable. In this paper, we propose a deep neural network architecture for atypical paved road detection based on a semantic segmentation network. In addition, we introduce a multi-scale semantic segmentation network, an architecture specialized for paved road detection. We demonstrate that the proposed method significantly improves detection performance.
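
One simple way to realize multi-scale semantic segmentation at inference time is to average predictions over rescaled inputs, as in the PyTorch sketch below; the scale set is an assumption and the paper's multi-scale network design may differ.

```python
# Multi-scale inference: run the same segmentation network on rescaled copies
# of the input, resize the softmax outputs back, and average them.
import torch
import torch.nn.functional as F


@torch.no_grad()
def multi_scale_predict(model, image, scales=(0.5, 1.0, 1.5)):
    # image: (1, 3, H, W); returns averaged class probabilities at full size.
    h, w = image.shape[2:]
    avg = 0.0
    for s in scales:
        scaled = F.interpolate(image, scale_factor=s, mode="bilinear",
                               align_corners=False)
        probs = F.interpolate(model(scaled).softmax(dim=1), size=(h, w),
                              mode="bilinear", align_corners=False)
        avg = avg + probs / len(scales)
    return avg
```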

The Optimal GSD and Image Size for Deep Learning Semantic Segmentation Training of Drone Images of Winter Vegetables (드론 영상으로부터 월동 작물 분류를 위한 의미론적 분할 딥러닝 모델 학습 최적 공간 해상도와 영상 크기 선정)

  • Chung, Dongki; Lee, Impyeong
    • Korean Journal of Remote Sensing / v.37 no.6_1 / pp.1573-1587 / 2021
  • A drone image is an ultra-high-resolution image whose spatial resolution is several to tens of times finer than that of a satellite or aerial image. Drone-based remote sensing therefore differs from traditional remote sensing in the level of objects to be extracted from the image and in the amount of data to be processed. In addition, the optimal scale and size of the training data differ depending on the characteristics of the deep learning model applied. However, most studies do not consider the size of the objects to be found in the image or the spatial resolution that reflects the scale, and in many cases the data specification used in previous models is applied as is. In this study, the effects of the spatial resolution and image size of drone images on the accuracy and training time of a semantic segmentation deep learning model for six winter vegetables were quantitatively analyzed through experiments. The results show that the average segmentation accuracy for the six winter vegetables increases as the spatial resolution increases, but the rate of increase and the convergence range differ by crop, and accuracy and training time vary greatly with image size at the same resolution. In particular, the optimal resolution and image size differed for each crop. These results can serve as a reference for improving the efficiency of drone image acquisition and training data production when developing a winter vegetable segmentation model using drone images.
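
Preparing training chips at a chosen ground sample distance (GSD) can be sketched as below with Pillow; the file name, GSD values, and tile size are hypothetical, and the study's own preprocessing pipeline is not described at this level of detail.

```python
# Resample an orthomosaic from its native GSD to a target GSD, then cut it
# into fixed-size tiles for training.
from PIL import Image


def make_tiles(path, native_gsd_cm, target_gsd_cm, tile_px=512):
    img = Image.open(path)
    scale = native_gsd_cm / target_gsd_cm          # <1 downsamples, >1 upsamples
    img = img.resize((round(img.width * scale), round(img.height * scale)),
                     Image.Resampling.BILINEAR)
    tiles = []
    for top in range(0, img.height - tile_px + 1, tile_px):
        for left in range(0, img.width - tile_px + 1, tile_px):
            tiles.append(img.crop((left, top, left + tile_px, top + tile_px)))
    return tiles


# e.g. resample a 2 cm/px orthomosaic to 8 cm/px and cut 512x512 chips
# (hypothetical file name and GSD values):
# chips = make_tiles("orthomosaic.tif", native_gsd_cm=2, target_gsd_cm=8)
```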

Efficient Fast Motion Estimation algorithm and Image Segmentation For Low-bit-rate Video Coding (저 전송율 비디오 부호화를 위한 효율적인 고속 움직임추정 알고리즘과 영상 분할기법)

  • 이병석; 한수영; 이동규; 이두수
    • Proceedings of the IEEK Conference / 2001.06d / pp.211-214 / 2001
  • This paper presents an efficient fast motion estimation algorithm and an image segmentation method for low bit-rate video coding. First, using region-split information, the algorithm segments the image into homogeneous regions and semantically meaningful regions such as the face. Then, within these regions, the motion vector is found using adaptive search window adjustment. In addition, with this segment-based fast motion estimation, blocking artifacts are reduced by coding the regions of interest (face or arms) in the input image more intensively. Simulation results show improvements in coding performance and image quality.
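
For reference, a plain full-search block-matching routine is sketched below in Python/NumPy; the SAD criterion and the fixed search range are generic choices, and the paper's adaptive search-window adjustment per segmented region is not reproduced here.

```python
# Full-search block matching: for each block of the current frame, find the
# displacement in the previous frame that minimizes the sum of absolute
# differences (SAD) within a search window.
import numpy as np


def block_motion(prev, curr, block=16, search=7):
    """prev, curr: 2-D grayscale frames; returns (rows, cols, 2) motion field."""
    h, w = curr.shape
    mv = np.zeros((h // block, w // block, 2), dtype=int)
    for by in range(h // block):
        for bx in range(w // block):
            y, x = by * block, bx * block
            cur_blk = curr[y:y + block, x:x + block].astype(np.int32)
            best, best_dyx = np.inf, (0, 0)
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    yy, xx = y + dy, x + dx
                    if yy < 0 or xx < 0 or yy + block > h or xx + block > w:
                        continue
                    ref = prev[yy:yy + block, xx:xx + block].astype(np.int32)
                    sad = np.abs(cur_blk - ref).sum()
                    if sad < best:
                        best, best_dyx = sad, (dy, dx)
            mv[by, bx] = best_dyx
    return mv
```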


Atrous Residual U-Net for Semantic Segmentation in Street Scenes based on Deep Learning (딥러닝 기반 거리 영상의 Semantic Segmentation을 위한 Atrous Residual U-Net)

  • Shin, SeokYong; Lee, SangHun; Han, HyunHo
    • Journal of Convergence for Information Technology / v.11 no.10 / pp.45-52 / 2021
  • In this paper, we propose an Atrous Residual U-Net (AR-UNet) to improve the segmentation accuracy of U-Net-based semantic segmentation. U-Net is mainly used in fields such as medical image analysis, autonomous vehicles, and remote sensing. The conventional U-Net extracts insufficient features because of the small number of convolution layers in the encoder. The extracted features are essential for classifying object categories, and when they are insufficient, segmentation accuracy drops. To address this problem, we propose the AR-UNet, which uses residual learning and ASPP in the encoder. Residual learning improves feature extraction ability and is effective in preventing the feature loss and vanishing gradient problems caused by stacked convolutions. In addition, ASPP enables additional feature extraction without reducing the resolution of the feature map. Experiments on the Cityscapes dataset verified the effectiveness of the AR-UNet: it produced improved segmentation results compared to the conventional U-Net and can thus contribute to many applications where accuracy is important.
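
The residual learning used in the AR-UNet encoder can be illustrated with a generic residual convolution block in PyTorch; the channel sizes and the 1x1 projection shortcut are illustrative assumptions, not the paper's exact block.

```python
# Residual encoder block: two convolutions with a shortcut connection so
# features and gradients propagate past the stacked convolutions.
import torch
import torch.nn as nn


class ResidualBlock(nn.Module):
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch))
        # 1x1 projection keeps the shortcut compatible when channels change
        self.skip = (nn.Conv2d(in_ch, out_ch, 1)
                     if in_ch != out_ch else nn.Identity())
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(self.body(x) + self.skip(x))


out = ResidualBlock(64, 128)(torch.randn(1, 64, 64, 64))
print(out.shape)  # torch.Size([1, 128, 64, 64])
```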

A hierarchical semantic segmentation framework for computer vision-based bridge damage detection

  • Jingxiao Liu; Yujie Wei; Bingqing Chen; Hae Young Noh
    • Smart Structures and Systems / v.31 no.4 / pp.325-334 / 2023
  • Computer vision-based damage detection enables non-contact, efficient, and low-cost bridge health monitoring, reducing the need for labor-intensive manual inspection or large numbers of on-site sensing instruments. By leveraging recent semantic segmentation approaches, we can detect regions of critical structural components and identify damage at the pixel level in images. However, existing methods perform poorly when detecting small and thin damage (e.g., cracks), and the problem is exacerbated by imbalanced samples. To this end, we incorporate domain knowledge and introduce a hierarchical semantic segmentation framework that imposes a hierarchical semantic relationship between component categories and damage types. For instance, certain types of concrete cracks are only present on bridge columns, so the non-column region can be masked out when detecting such damage. In this way, the damage detection model focuses on extracting features from relevant structural components and avoids extracting features from irrelevant regions. We also utilize multi-scale augmentation to preserve the contextual information of each image without losing the ability to handle small and/or thin damage. In addition, our framework employs importance sampling, in which images with rare components are sampled more often, to address sample imbalance. We evaluated our framework on a public synthetic dataset consisting of 2,000 railway bridges. Our framework achieves a mean intersection over union (IoU) of 0.836 for structural component segmentation and 0.483 for damage segmentation, improvements of 5% and 18%, respectively, over the best-performing baseline model.
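
The hierarchical component-damage relationship can be illustrated by masking damage probabilities with the predicted parent component, as in the sketch below; the class indices and the single parent-child pair are hypothetical, and the framework's actual implementation may differ.

```python
# Keep damage probabilities only where the component prediction belongs to the
# parent component of that damage type.
import torch


def mask_damage_by_component(damage_probs, component_pred, parents):
    """damage_probs: (B, D, H, W); component_pred: (B, H, W) argmax labels;
    parents: dict mapping damage class index -> parent component class index."""
    masked = damage_probs.clone()
    for dmg_cls, comp_cls in parents.items():
        masked[:, dmg_cls] *= (component_pred == comp_cls).float()
    return masked


# e.g. damage class 2 ("column crack", hypothetical) only occurs on component
# class 1 ("column"), so its probability is zeroed outside predicted columns.
probs = torch.rand(1, 4, 64, 64)
comp = torch.randint(0, 3, (1, 64, 64))
out = mask_damage_by_component(probs, comp, parents={2: 1})
print(out.shape)  # torch.Size([1, 4, 64, 64])
```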