• Title/Summary/Keyword: DeepU-Net

Layer Segmentation of Retinal OCT Images using Deep Convolutional Encoder-Decoder Network (딥 컨볼루셔널 인코더-디코더 네트워크를 이용한 망막 OCT 영상의 층 분할)

  • Kwon, Oh-Heum;Song, Min-Gyu;Song, Ha-Joo;Kwon, Ki-Ryong
    • Journal of Korea Multimedia Society / v.22 no.11 / pp.1269-1279 / 2019
  • In medical image analysis, segmentation is considered a vital process because it partitions an image into coherent parts and extracts objects of interest from the image. In this paper, we consider automatic segmentation of retinal OCT images to find six layer boundaries using convolutional neural networks. Segmenting retinal images by layer boundaries is very important in diagnosing and predicting the progress of eye diseases including diabetic retinopathy, glaucoma, and AMD (age-related macular degeneration). We applied well-known CNN architectures for general image segmentation, namely Segnet, U-net, and CNN-S, to this problem. We also proposed a shortest path-based algorithm for finding the layer boundaries from the outputs of Segnet and U-net. We analysed their performance on a public OCT image data set. The experimental results show that Segnet combined with the proposed shortest path-based boundary finding algorithm outperforms the other two networks.
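The abstract does not describe the shortest path-based boundary finder in detail, so the following is only a generic sketch of the idea: a dynamic-programming search for the cheapest left-to-right path through a cost map. The function name `trace_boundary`, the `max_step` limit, and the assumption that the cost is something like one minus a network's boundary probability are all illustrative and not taken from the paper.

```python
import numpy as np

def trace_boundary(cost, max_step=1):
    """Return one row index per column along the cheapest left-to-right path.

    cost: 2D array (rows x cols); low values mark likely boundary pixels.
    max_step: hypothetical limit on the row change between adjacent columns.
    """
    rows, cols = cost.shape
    total = np.full((rows, cols), np.inf)   # best accumulated cost per cell
    back = np.zeros((rows, cols), dtype=int)  # predecessor row per cell
    total[:, 0] = cost[:, 0]
    for c in range(1, cols):
        for r in range(rows):
            lo, hi = max(0, r - max_step), min(rows, r + max_step + 1)
            prev = int(np.argmin(total[lo:hi, c - 1])) + lo
            total[r, c] = total[prev, c - 1] + cost[r, c]
            back[r, c] = prev
    # Backtrack from the cheapest cell in the last column.
    path = np.empty(cols, dtype=int)
    path[-1] = int(np.argmin(total[:, -1]))
    for c in range(cols - 1, 0, -1):
        path[c - 1] = back[path[c], c]
    return path
```

Restricting the row change per column keeps the traced boundary continuous, which is one plausible reason to post-process per-pixel network outputs with a path search rather than taking a per-column argmax.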

Improving the Vehicle Damage Detection Model using YOLOv4 (YOLOv4를 이용한 차량파손 검출 모델 개선)

  • Jeon, Jong Won;Lee, Hyo Seop;Hahn, Hee Il
    • Journal of IKEEE / v.25 no.4 / pp.750-755 / 2021
  • This paper proposes techniques for detecting the damage status of each part of a vehicle using YOLOv4. The proposed algorithm learns the vehicle parts and their damage through YOLOv4, extracts the coordinates of the detected bounding boxes, and applies an algorithm that determines the relationship between each damage and the vehicle parts to derive the damage status of each part. For an objective performance comparison, a technique using VGGNet, a technique using image segmentation with the U-Net model, and the Weproove.AI deep learning model are also included. The performance of the proposed algorithm is compared and evaluated against these, and a method to improve the detection model is proposed.
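How the paper relates damage boxes to part boxes is not specified in the abstract. One simple possibility, sketched below purely as an assumption, is to assign each damage box to the part box that contains the largest share of its area. The names `overlap_ratio`, `damage_by_part`, and the `min_overlap` threshold are hypothetical; boxes are `(x1, y1, x2, y2)` tuples.

```python
def overlap_ratio(damage, part):
    """Fraction of the damage box area that lies inside the part box."""
    x1, y1 = max(damage[0], part[0]), max(damage[1], part[1])
    x2, y2 = min(damage[2], part[2]), min(damage[3], part[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = (damage[2] - damage[0]) * (damage[3] - damage[1])
    return inter / area if area > 0 else 0.0

def damage_by_part(parts, damages, min_overlap=0.5):
    """Map each part name to the damage labels assigned to it.

    parts:   list of (part_name, box) detections
    damages: list of (damage_label, box) detections
    min_overlap: assumed threshold below which a damage is left unassigned
    """
    status = {name: [] for name, _ in parts}
    for label, dbox in damages:
        best_name, best = None, 0.0
        for name, pbox in parts:
            ratio = overlap_ratio(dbox, pbox)
            if ratio > best:
                best_name, best = name, ratio
        if best_name is not None and best >= min_overlap:
            status[best_name].append(label)
    return status
```

For example, a dent box that straddles a door and a bumper would be attributed to whichever part covers more of it, and a damage box overlapping no part would be dropped.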

Lightweight high-precision pedestrian tracking algorithm in complex occlusion scenarios

  • Qiang Gao;Zhicheng He;Xu Jia;Yinghong Xie;Xiaowei Han
    • KSII Transactions on Internet and Information Systems (TIIS) / v.17 no.3 / pp.840-860 / 2023
  • To address severe occlusion and slow tracking speed in pedestrian tracking and recognition in complex scenes, a target tracking method based on an improved YOLO v5 combined with Deep SORT is proposed. The method merges the ECA-Net attention mechanism into the Neck of the YOLO v5 network, uses the CIoU loss function and CIoU-based non-maximum suppression, and connects a Deep SORT model that uses ShuffleNet V2 as the appearance feature extraction network, in order to achieve lightweight, fast tracking and to improve tracking under occlusion. Extensive experiments show that the improved YOLO v5 increases the average precision by 1.3% compared with other algorithms. The improved tracking model reaches a MOTA of 54.3% on the MOT17 pedestrian tracking data, its tracking accuracy is 3.7% higher than that of related algorithms, and it improves the frame rate by nearly 5 FPS.
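For readers unfamiliar with CIoU, the sketch below computes the standard Complete-IoU score for two axis-aligned boxes: IoU minus a normalized centre-distance penalty and an aspect-ratio penalty. It is a plain-Python illustration of the published CIoU definition, not the paper's implementation; the function name `ciou` and the `eps` stabilizer are illustrative. The corresponding loss is `1 - ciou(...)`.

```python
import math

def ciou(box_a, box_b, eps=1e-9):
    """Complete IoU of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Plain IoU.
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    wa, ha = ax2 - ax1, ay2 - ay1
    wb, hb = bx2 - bx1, by2 - by1
    union = wa * ha + wb * hb - inter
    iou = inter / (union + eps)
    # Squared distance between box centres.
    rho2 = ((ax1 + ax2) - (bx1 + bx2)) ** 2 / 4 + ((ay1 + ay2) - (by1 + by2)) ** 2 / 4
    # Squared diagonal of the smallest box enclosing both.
    cw = max(ax2, bx2) - min(ax1, bx1)
    ch = max(ay2, by2) - min(ay1, by1)
    c2 = cw ** 2 + ch ** 2 + eps
    # Aspect-ratio consistency term and its trade-off weight.
    v = (4 / math.pi ** 2) * (math.atan(wb / (hb + eps)) - math.atan(wa / (ha + eps))) ** 2
    alpha = v / (1 - iou + v + eps)
    return iou - rho2 / c2 - alpha * v
```

Unlike plain IoU, the score stays informative for non-overlapping boxes because the centre-distance term still varies, which is the usual motivation for using it both as a regression loss and as the overlap measure in non-maximum suppression.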

Land Cover Classifier Using Coordinate Hash Encoder (좌표 해시 인코더를 활용한 토지피복 분류 모델)

  • Yongsun Yoon;Dongjae Kwon
    • Korean Journal of Remote Sensing / v.39 no.6_3 / pp.1771-1777 / 2023
  • With the advancement of deep learning, many semantic segmentation-based methods for land cover classification have been proposed. However, existing deep learning-based models use only image information and cannot guarantee spatiotemporal consistency. In this study, we propose a land cover classification model that uses geographical coordinates. First, coordinate features are extracted through the Coordinate Hash Encoder, which extends the Multi-resolution Hash Encoder, an implicit neural representation technique, to the longitude-latitude coordinate system. Next, we propose an architecture that combines the extracted coordinate features with different levels of the U-net decoder. Experimental results show that the proposed method improves the mean intersection over union by about 32% and improves the spatiotemporal consistency.

Classification of Industrial Parks and Quarries Using U-Net from KOMPSAT-3/3A Imagery (KOMPSAT-3/3A 영상으로부터 U-Net을 이용한 산업단지와 채석장 분류)

  • Che-Won Park;Hyung-Sup Jung;Won-Jin Lee;Kwang-Jae Lee;Kwan-Young Oh;Jae-Young Chang;Moung-Jin Lee
    • Korean Journal of Remote Sensing / v.39 no.6_3 / pp.1679-1692 / 2023
  • South Korea emits a large amount of pollutants as a result of population growth and industrial development and is also severely affected by transboundary air pollution due to its geographical location. As pollutants from both domestic and foreign sources contribute to air pollution in Korea, the location of air pollutant emission sources is crucial for understanding the movement and distribution of pollutants in the atmosphere and for establishing national-level air pollution management and response strategies. Against this background, this study aims to effectively acquire spatial information on domestic and international air pollutant emission sources, which is essential for analyzing air pollution status, by utilizing high-resolution optical satellite images and deep learning-based image segmentation models. In particular, industrial parks and quarries, which have been evaluated as contributing significantly to transboundary air pollution, were selected as the main research subjects, and images of these areas from KOMPSAT-3 and KOMPSAT-3A (Korea Multi-Purpose Satellite 3 and 3A) were collected, preprocessed, and converted into input and label data for model training. Training the U-Net model on this data achieved an overall accuracy of 0.8484 and a mean Intersection over Union (mIoU) of 0.6490, and the predicted maps extracted object boundaries more accurately than the label data, which were created from coarse annotations.
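The two metrics reported above, overall accuracy and mIoU, can both be derived from a confusion matrix. The sketch below shows the standard definitions with NumPy; the function names are illustrative, and skipping classes that never occur in either the labels or the predictions is an assumption about how such classes are handled.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, num_classes):
    """Rows are true classes, columns are predicted classes."""
    idx = y_true.ravel() * num_classes + y_pred.ravel()
    return np.bincount(idx, minlength=num_classes ** 2).reshape(num_classes, num_classes)

def overall_accuracy(cm):
    """Share of pixels whose predicted class equals the true class."""
    return float(np.trace(cm) / cm.sum())

def mean_iou(cm):
    """Mean over classes of TP / (TP + FP + FN)."""
    tp = np.diag(cm)
    denom = cm.sum(axis=0) + cm.sum(axis=1) - tp
    valid = denom > 0  # assumption: ignore classes absent from both maps
    return float(np.mean(tp[valid] / denom[valid]))
```

Overall accuracy is dominated by large classes, while mIoU weights every class equally, which is why a model can score 0.85 on one and markedly lower on the other.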

Detection of Wildfire Burned Areas in California Using Deep Learning and Landsat 8 Images (딥러닝과 Landsat 8 영상을 이용한 캘리포니아 산불 피해지 탐지)

  • Youngmin Seo;Youjeong Youn;Seoyeon Kim;Jonggu Kang;Yemin Jeong;Soyeon Choi;Yungyo Im;Yangwon Lee
    • Korean Journal of Remote Sensing / v.39 no.6_1 / pp.1413-1425 / 2023
  • The increasing frequency of wildfires due to climate change is causing extreme loss of life and property. They cause loss of vegetation and affect ecosystem changes depending on their intensity and occurrence. Ecosystem changes, in turn, affect wildfire occurrence, causing secondary damage. Thus, accurate estimation of the areas affected by wildfires is fundamental. Satellite remote sensing is used for forest fire detection because it can rapidly acquire topographic and meteorological information about the affected area after forest fires. In addition, deep learning algorithms such as convolutional neural networks (CNN) and transformer models show high performance for more accurate monitoring of fire-burnt regions. To date, the application of deep learning models has been limited, and there is a scarcity of reports providing quantitative performance evaluations for practical field utilization. Hence, this study emphasizes a comparative analysis, exploring performance enhancements achieved through both model selection and data design. This study examined deep learning models for detecting wildfire-damaged areas using Landsat 8 satellite images in California. Also, we conducted a comprehensive comparison and analysis of the detection performance of multiple models, such as U-Net and High-Resolution Network-Object Contextual Representation (HRNet-OCR). Wildfire-related spectral indices such as normalized difference vegetation index (NDVI) and normalized burn ratio (NBR) were used as input channels for the deep learning models to reflect the degree of vegetation cover and surface moisture content. As a result, the mean intersection over union (mIoU) was 0.831 for U-Net and 0.848 for HRNet-OCR, showing high segmentation performance. The inclusion of spectral indices alongside the base wavelength bands resulted in increased metric values for all combinations, affirming that the augmentation of input data with spectral indices contributes to the refinement of pixels. This study can be applied to other satellite images to build a recovery strategy for fire-burnt areas.
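As an illustration of adding spectral indices as input channels, the sketch below computes NDVI and NBR from their standard definitions, NDVI = (NIR - Red) / (NIR + Red) and NBR = (NIR - SWIR2) / (NIR + SWIR2), and stacks them with the base bands. On Landsat 8 OLI these correspond to bands 4 (Red), 5 (NIR), and 7 (SWIR2). Which base bands the study actually fed to its models is not stated in the abstract, so the channel layout, function names, and `eps` guard here are assumptions.

```python
import numpy as np

def normalized_difference(a, b, eps=1e-6):
    """(a - b) / (a + b), with a small eps to avoid division by zero."""
    a = a.astype(np.float32)
    b = b.astype(np.float32)
    return (a - b) / (a + b + eps)

def stack_with_indices(red, nir, swir2):
    """Return a (5, H, W) array: red, nir, swir2, NDVI, NBR."""
    ndvi = normalized_difference(nir, red)
    nbr = normalized_difference(nir, swir2)
    bands = [red.astype(np.float32), nir.astype(np.float32), swir2.astype(np.float32)]
    return np.stack(bands + [ndvi, nbr], axis=0)
```

Healthy vegetation reflects strongly in NIR and weakly in red and SWIR, so both indices drop sharply over burned surfaces, which is the usual rationale for supplying them explicitly.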

Automatic assessment of post-earthquake buildings based on multi-task deep learning with auxiliary tasks

  • Zhihang Li;Huamei Zhu;Mengqi Huang;Pengxuan Ji;Hongyu Huang;Qianbing Zhang
    • Smart Structures and Systems / v.31 no.4 / pp.383-392 / 2023
  • Post-earthquake building condition assessment is crucial for subsequent rescue and remediation and can be automated by emerging computer vision and deep learning technologies. This study is based on an entry for the 2nd International Competition of Structural Health Monitoring (IC-SHM 2021). The task package includes five image segmentation objectives: defects (crack/spall/rebar exposure), structural component, and damage state. The structural component and damage state tasks are identified as the priorities because they can inform actionable decisions. A multi-task Convolutional Neural Network (CNN) is proposed to conduct these two major tasks simultaneously. The remaining three sub-tasks (spall/crack/rebar exposure) were incorporated as auxiliary tasks. By synchronously learning defect information (spall/crack/rebar exposure), the multi-task CNN model outperforms its single-task counterparts in recognizing structural components and estimating damage states. In particular, pixel-level damage state estimation sees an mIoU (mean intersection over union) improvement from 0.5855 to 0.6374. For the defect detection tasks, rebar exposure is omitted due to the extremely biased sample distribution. The segmentation of cracks and spalls is automated by a single-task U-Net, with extra effort to resample the provided data. The segmentation of small objects (spall and crack) benefits from the resampling method, with a substantial IoU increment of nearly 10%.

A Fully Convolutional Network Model for Classifying Liver Fibrosis Stages from Ultrasound B-mode Images (초음파 B-모드 영상에서 FCN(fully convolutional network) 모델을 이용한 간 섬유화 단계 분류 알고리즘)

  • Kang, Sung Ho;You, Sun Kyoung;Lee, Jeong Eun;Ahn, Chi Young
    • Journal of Biomedical Engineering Research / v.41 no.1 / pp.48-54 / 2020
  • In this paper, we deal with a liver fibrosis classification problem using ultrasound B-mode images. Common methods for classifying the stages of liver fibrosis include liver biopsy and diagnosis based on ultrasound images. The overall liver shape and the smoothness or roughness of the speckle pattern in ultrasound images are used to determine the fibrosis stage. Although ultrasound image-based classification is frequently used as an alternative or complement to invasive biopsy, it has the limitation that the fibrosis stage decision depends on the image quality and the doctor's experience. With the rapid development of deep learning algorithms, several studies have applied deep learning methods to automated liver fibrosis classification and have shown high accuracy. The performance of those deep learning methods depends closely on the amount of data. We propose an enhanced U-net architecture to maximize the classification accuracy with a limited amount of image data. U-net is well known as a neural network for fast and precise segmentation of medical images. We redesign it for the purpose of classifying liver fibrosis stages. To assess the performance of the proposed architecture, numerical experiments are conducted on a total of 118 ultrasound B-mode images acquired from 78 patients with liver fibrosis of stages F0~F4. The experimental results indicate that the proposed architecture performs much better than transfer learning using the pre-trained VGGNet model.

Development of Deep Learning Based Ensemble Land Cover Segmentation Algorithm Using Drone Aerial Images (드론 항공영상을 이용한 딥러닝 기반 앙상블 토지 피복 분할 알고리즘 개발)

  • Hae-Gwang Park;Seung-Ki Baek;Seung Hyun Jeong
    • Korean Journal of Remote Sensing / v.40 no.1 / pp.71-80 / 2024
  • This study proposes an ensemble learning technique to enhance the semantic segmentation performance on images captured by Unmanned Aerial Vehicles (UAVs). With the increasing use of UAVs in fields such as urban planning, techniques that apply deep learning segmentation methods to land cover segmentation have been actively developed. The study suggests a method that utilizes prominent segmentation models, namely U-Net, DeepLabV3, and Fully Convolutional Network (FCN), to improve segmentation prediction performance. The proposed approach integrates the training loss, validation accuracy, and class score of the three segmentation models to enhance overall prediction performance. The method was applied and evaluated on a land cover segmentation problem involving seven classes: buildings, roads, parking lots, fields, trees, empty spaces, and areas with unspecified labels, using images captured by UAVs. The performance of the ensemble model was evaluated by mean Intersection over Union (mIoU), and a comparison of the proposed ensemble model with the three existing segmentation methods showed that mIoU was improved. Consequently, the study confirms that the proposed technique can enhance the performance of semantic segmentation models.
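The abstract does not say how training loss, validation accuracy, and class scores are integrated. The sketch below shows one hypothetical scheme for illustration only: each model gets a weight proportional to its validation accuracy divided by its training loss, and the per-class score maps are averaged with those weights before taking the arg max. The weighting formula and the name `ensemble_predict` are assumptions, not the paper's method.

```python
import numpy as np

def ensemble_predict(scores, val_acc, train_loss, eps=1e-8):
    """Fuse class-score maps from several segmentation models.

    scores:     array-like of shape (M, C, H, W), one score map per model
    val_acc:    length-M validation accuracies
    train_loss: length-M final training losses
    Returns the (H, W) label map and the normalized model weights.
    """
    scores = np.asarray(scores, dtype=np.float64)
    # Hypothetical weighting: reward high accuracy, penalize high loss.
    weights = np.asarray(val_acc, dtype=np.float64) / (np.asarray(train_loss, dtype=np.float64) + eps)
    weights = weights / weights.sum()
    fused = np.tensordot(weights, scores, axes=1)  # weighted sum over models -> (C, H, W)
    return fused.argmax(axis=0), weights
```

With equal weights this reduces to plain score averaging; the weights only change the outcome at pixels where the models disagree.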

The Performance Improvement of U-Net Model for Landcover Semantic Segmentation through Data Augmentation (데이터 확장을 통한 토지피복분류 U-Net 모델의 성능 개선)

  • Baek, Won-Kyung;Lee, Moung-Jin;Jung, Hyung-Sup
    • Korean Journal of Remote Sensing / v.38 no.6_2 / pp.1663-1676 / 2022
  • Recently, a number of deep learning-based land cover segmentation studies have been introduced. Some studies noted that land cover segmentation performance deteriorates when training data are insufficient. In this study, we verified the improvement of land cover segmentation performance through data augmentation. U-Net was used as the segmentation model, and a 2020 satellite-derived land cover dataset was used as the study data. The pixel accuracies were 0.905 and 0.923 for U-Net trained on the original and augmented data, respectively. The mean F1 scores of those models were 0.720 and 0.775, respectively, indicating better performance with data augmentation. In addition, the F1 scores for the building, road, paddy field, upland field, forest, and unclassified area classes were 0.770, 0.568, 0.433, 0.455, 0.964, and 0.830 for the U-Net trained on the original data. Data augmentation proved effective in that the F1 score of every class improved, to 0.838, 0.660, 0.791, 0.530, 0.969, and 0.860, respectively. Although we applied data augmentation without considering class balance, a comparison of the two models shows that data augmentation can mitigate the biased segmentation performance caused by data imbalance. This study is expected to help demonstrate the importance and effectiveness of data augmentation in various image processing fields.
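The abstract does not list which augmentation operations were used. As a minimal sketch of the general idea for segmentation data, the function below applies the same random 90-degree rotation and horizontal flip to an image and its label map so that the two stay aligned. The choice of operations and the name `augment_pair` are assumptions.

```python
import numpy as np

def augment_pair(image, label, rng):
    """Apply one random rotation/flip to an (H, W, C) image and its (H, W) label."""
    k = int(rng.integers(0, 4))  # number of 90-degree rotations
    image = np.rot90(image, k, axes=(0, 1))
    label = np.rot90(label, k, axes=(0, 1))
    if rng.random() < 0.5:       # horizontal flip half of the time
        image = image[:, ::-1]
        label = label[:, ::-1]
    return np.ascontiguousarray(image), np.ascontiguousarray(label)
```

A typical use would be `rng = np.random.default_rng(0)` followed by `augment_pair(image, label, rng)` for each training sample. Geometric transforms must be applied identically to the label, otherwise the pixel-wise supervision no longer matches the image.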