• Title/Summary/Keyword: DeepU-Net

Atrous Residual U-Net for Semantic Segmentation in Street Scenes based on Deep Learning (딥러닝 기반 거리 영상의 Semantic Segmentation을 위한 Atrous Residual U-Net)

  • Shin, SeokYong;Lee, SangHun;Han, HyunHo
    • Journal of Convergence for Information Technology / v.11 no.10 / pp.45-52 / 2021
  • In this paper, we propose an Atrous Residual U-Net (AR-UNet) to improve the accuracy of U-Net-based semantic segmentation. U-Net is mainly used in fields such as medical image analysis, autonomous vehicles, and remote sensing. The conventional U-Net extracts too few features because its encoder has only a small number of convolution layers. The extracted features are essential for classifying object categories, and when they are insufficient, segmentation accuracy drops. To address this problem, AR-UNet uses residual learning and atrous spatial pyramid pooling (ASPP) in the encoder. Residual learning improves feature extraction and is effective in preventing the feature loss and vanishing gradients caused by consecutive convolutions. In addition, ASPP enables additional feature extraction without reducing the resolution of the feature map. Experiments on the Cityscapes dataset verified the effectiveness of AR-UNet: it produced better segmentation results than the conventional U-Net. AR-UNet can thus contribute to the many applications where accuracy is important.
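The claim that atrous convolution adds context without reducing resolution can be illustrated with a minimal 1-D sketch. This is not the AR-UNet implementation: the function names, the zero padding, and the odd kernel size are assumptions made for illustration only.

```python
def atrous_conv1d(signal, kernel, rate):
    """Dilated (atrous) 1-D cross-correlation with zero padding.

    Assumes an odd kernel length. The output has the same length as the
    input, so resolution is preserved whatever the dilation rate.
    """
    k = len(kernel)
    half = (k // 2) * rate  # zero padding added on each side
    padded = [0.0] * half + list(signal) + [0.0] * half
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j in range(k):
            # Kernel taps are spaced `rate` samples apart.
            acc += kernel[j] * padded[i + j * rate]
        out.append(acc)
    return out


def receptive_field(kernel_size, rate):
    """Number of input samples spanned by one dilated kernel."""
    return (kernel_size - 1) * rate + 1


signal = [1, 2, 3, 4, 5]
kernel = [1, 0, -1]
# ASPP-style idea: run the same input through several rates in parallel.
branches = {rate: atrous_conv1d(signal, kernel, rate) for rate in (1, 2)}
# Residual-style idea: add the input back to a branch output.
residual = [x + y for x, y in zip(signal, branches[1])]
```

With a 3-tap kernel, raising the rate from 1 to 2 widens the receptive field from 3 to 5 samples while every branch keeps the input length.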

Real-world noisy image denoising using deep residual U-Net structure (깊은 잔차 U-Net 구조를 이용한 실제 카메라 잡음 영상 디노이징)

  • Jang, Yeongil;Cho, Nam Ik
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2019.11a / pp.119-121 / 2019
  • Denoisers based on deep neural networks trained on the additive white Gaussian noise (AWGN) model perform very well when the noise to be removed is AWGN, but their performance degrades considerably when they are applied to real camera noise. This paper proposes a network that combines residual blocks with a U-Net-structured deep neural network and outperforms existing algorithms on real camera images. On the Darmstadt Noise Dataset, the proposed method improves both PSNR and SSIM over CBDNet.
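For readers unfamiliar with the PSNR metric reported above, a minimal sketch of its standard definition follows. The function name, the flattened pixel lists, and the 8-bit peak value of 255 are assumptions for illustration; this is not the authors' evaluation code.

```python
import math


def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized,
    flattened pixel sequences (assumes an 8-bit peak by default)."""
    if len(reference) != len(test):
        raise ValueError("images must have the same number of pixels")
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


clean = [100, 100, 100, 100]
noisy = [110, 90, 110, 90]  # every pixel is off by 10, so MSE = 100
score = psnr(clean, noisy)
```

A higher PSNR means the denoised image is closer to the reference in the mean-squared-error sense.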

Multi-level Skip Connection for Nested U-Net-based Speech Enhancement (중첩 U-Net 기반 음성 향상을 위한 다중 레벨 Skip Connection)

  • Seorim, Hwang;Joon, Byun;Junyeong, Heo;Jaebin, Cha;Youngcheol, Park
    • Journal of Broadcast Engineering / v.27 no.6 / pp.840-847 / 2022
  • In deep neural network (DNN)-based speech enhancement, the use of global and local input speech information is closely related to model performance. Recently, a nested U-Net structure that uses multi-scale processing to exploit global and local input information has been proposed. This nested U-Net was also applied to speech enhancement and showed outstanding performance. However, a single skip connection used in nested U-Nets must be modified for the nested structure. In this paper, we propose a multi-level skip connection (MLS) to optimize the performance of the nested U-Net-based speech enhancement algorithm. As a result, the proposed MLS showed excellent performance improvement in various objective evaluation metrics compared to the standard skip connection, which means that the MLS can optimize the performance of the nested U-Net-based speech enhancement algorithm. In addition, the final proposed model showed superior performance compared to other DNN-based speech enhancement models.

Evaluation of U-Net Based Learning Models according to Equalization Algorithm in Thyroid Ultrasound Imaging (갑상선 초음파 영상의 평활화 알고리즘에 따른 U-Net 기반 학습 모델 평가)

  • Moo-Jin Jeong;Joo-Young Oh;Hoon-Hee Park;Joo-Young Lee
    • Journal of radiological science and technology / v.47 no.1 / pp.29-37 / 2024
  • This study aims to evaluate how the performance of U-Net-based learning models varies with the histogram equalization algorithm. The subjects were 17 radiology students of this college; after acquiring ultrasound image data, 1,727 data sets with the region of interest set on the thyroid were used. The training set consisted of 1,383 images, the validation set of 172, and the test set of 172. The equalization algorithms were Histogram Equalization (HE) and Contrast Limited Adaptive Histogram Equalization (CLAHE), and CLAHE was further divided by clip limit into CLAHE8-1, CLAHE8-2, and CLAHE8-3. The deep learning models were trained after resizing, histogram equalization, Z-score normalization, and data augmentation. In the experiment, the Attention U-Net showed its highest performance, 0.8355, with CLAHE8-2, and the U-Net and BSU-Net showed their highest performance, 0.8303 and 0.8277, with CLAHE8-3. In terms of mIoU, the Attention U-Net reached 0.7175 with CLAHE8-2, and the U-Net and BSU-Net reached 0.7098 and 0.7060 with CLAHE8-3. This study attempted to confirm the effect of histogram equalization of ultrasound images on the U-Net, Attention U-Net, and BSU-Net models. Increasing the clip limit can be expected to improve the match between the ROI and the predicted mask by sharpening boundaries: it improves the contrast of the thyroid area during model training and consequently improves performance.
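As a rough illustration of the equalization step, the sketch below implements plain global histogram equalization (HE) on a flattened 8-bit grayscale image. It does not reproduce the study's CLAHE8-1 to CLAHE8-3 settings: CLAHE additionally works on tiles and clips the histogram at a limit, and both are omitted here. The function name and the rounding rule are assumptions.

```python
def equalize_histogram(pixels, levels=256):
    """Global histogram equalization of integer gray values in [0, levels)."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    # Cumulative distribution of gray levels.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)

    total = len(pixels)
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:
        return list(pixels)  # constant image: nothing to stretch

    # Map each gray level so the occupied range spreads over [0, levels - 1].
    lut = [max(0, round((c - cdf_min) * (levels - 1) / (total - cdf_min)))
           for c in cdf]
    return [lut[p] for p in pixels]


low_contrast = [10, 20, 30, 40]
stretched = equalize_histogram(low_contrast)
```

The four closely spaced input levels are spread across the full 0 to 255 range, which is the contrast gain that equalization is meant to provide.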

A Comparative Performance Analysis of Segmentation Models for Lumbar Key-points Extraction (요추 특징점 추출을 위한 영역 분할 모델의 성능 비교 분석)

  • Seunghee Yoo;Minho Choi;Jun-Su Jang
    • Journal of Biomedical Engineering Research / v.44 no.5 / pp.354-361 / 2023
  • Most spinal diseases are diagnosed based on the subjective judgment of a specialist, so numerous studies have sought objectivity by automating the diagnosis process with deep learning. In this paper, we propose a method that combines segmentation and feature extraction, two techniques frequently used for diagnosing spinal diseases. Four models, U-Net, U-Net++, DeepLabv3+, and M-Net, were trained and compared using 1000 X-ray images, and key-points were derived using the Douglas-Peucker algorithm. For evaluation, the Dice Similarity Coefficient (DSC), Intersection over Union (IoU), precision, recall, and area under the precision-recall curve were used, and U-Net++ showed the best performance in all metrics, with an average DSC of 0.9724. For the average Euclidean distance between estimated key-points and ground truth, U-Net was the best, followed by U-Net++. However, the difference in average distance was about 0.1 pixels, which is not significant. The results suggest that key-points can be extracted from segmentation results and used to diagnose various spinal diseases, including spondylolisthesis, accurately and with consistent criteria.
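The Douglas-Peucker step named above reduces a contour to a few corner-like points. A minimal sketch of the textbook recursive form is given below. The tolerance `epsilon`, the function names, and the toy contour are assumptions; the paper's actual parameters and post-processing are not stated in the abstract.

```python
import math


def point_line_distance(p, a, b):
    """Perpendicular distance from point p to the line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)  # a and b coincide
    return abs(dy * (px - ax) - dx * (py - ay)) / math.hypot(dx, dy)


def douglas_peucker(points, epsilon):
    """Simplify a polyline, keeping points farther than epsilon from the chord."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    max_dist, index = 0.0, 0
    for i in range(1, len(points) - 1):
        d = point_line_distance(points[i], first, last)
        if d > max_dist:
            max_dist, index = d, i
    if max_dist > epsilon:
        # Keep the farthest point and simplify both halves recursively.
        left = douglas_peucker(points[:index + 1], epsilon)
        right = douglas_peucker(points[index:], epsilon)
        return left[:-1] + right  # drop the duplicated split point
    return [first, last]


contour = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]  # an L-shaped path
corners = douglas_peucker(contour, epsilon=0.5)
```

On the L-shaped path only the two end points and the corner survive, which is the kind of reduction that turns a segmentation boundary into candidate key-points.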

Comparative Study of Deep Learning Model for Semantic Segmentation of Water System in SAR Images of KOMPSAT-5 (아리랑 5호 위성 영상에서 수계의 의미론적 분할을 위한 딥러닝 모델의 비교 연구)

  • Kim, Min-Ji;Kim, Seung Kyu;Lee, DoHoon;Gahm, Jin Kyu
    • Journal of Korea Multimedia Society / v.25 no.2 / pp.206-214 / 2022
  • One way to measure the extent of damage from floods and droughts is to identify changes in the extent of water systems, and satellite images are used to grasp these changes at a glance. KOMPSAT-5 uses Synthetic Aperture Radar (SAR) to capture images regardless of weather conditions such as clouds and rain. In this paper, various deep learning models are applied to semantic segmentation of the water system in these SAR images, and their performance is compared. The models used are U-Net, V-Net, U2-Net, UNet 3+, PSPNet, Deeplab-V3, Deeplab-V3+ and PAN. In addition, performance was compared when the data was augmented by applying elastic deformation to the existing SAR image dataset. As a result, without data augmentation, U-Net was the best with IoU of 97.25% and pixel accuracy of 98.53%. In case of data augmentation, Deeplab-V3 showed IoU of 95.15% and V-Net showed the best pixel accuracy of 96.86%.
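The two metrics reported above can be sketched for a binary water/non-water mask as follows. The function names and the flattened 0/1 mask representation are assumptions for illustration, not the evaluation code used in the paper.

```python
def iou(pred, target):
    """Intersection over union of two flattened binary masks."""
    inter = sum(1 for p, t in zip(pred, target) if p and t)
    union = sum(1 for p, t in zip(pred, target) if p or t)
    return inter / union if union else 1.0  # both masks empty: perfect match


def pixel_accuracy(pred, target):
    """Fraction of pixels whose predicted label equals the ground truth."""
    correct = sum(1 for p, t in zip(pred, target) if p == t)
    return correct / len(target)


predicted = [1, 1, 0, 0]
truth = [1, 0, 0, 0]
water_iou = iou(predicted, truth)           # 1 shared pixel out of 2 in the union
accuracy = pixel_accuracy(predicted, truth)  # 3 of 4 pixels agree
```

Pixel accuracy counts correctly labeled background as well, so it is typically higher than IoU on the same prediction, as in the figures quoted in the abstract.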

Tumor Segmentation in Multimodal Brain MRI Using Deep Learning Approaches

  • Al Shehri, Waleed;Jannah, Najlaa
    • International Journal of Computer Science & Network Security / v.22 no.8 / pp.343-351 / 2022
  • A brain tumor forms when some tissue becomes old or damaged but does not die when it must, preventing new tissue from being born. Manually finding such masses in the brain by analyzing MRI images is challenging and time-consuming for experts. In this study, our main objective is to detect the tumorous part of the brain, allowing rapid diagnosis so that the primary disease can be treated promptly. Using image processing techniques and deep learning prediction algorithms, we build a system capable of finding a tumor in brain MRI images automatically and accurately. Our tumor segmentation applies U-Net deep learning segmentation to the standard MICCAI BRATS 2018 dataset, which contains MRI images of different modalities. The proposed approach was evaluated and achieved Dice Coefficients of 0.9795, 0.9855, 0.9793, and 0.9950 across several test datasets. These results show that the proposed system achieves excellent segmentation of tumors in MRIs using deep learning techniques such as the U-Net algorithm.
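The Dice Coefficient quoted above follows the standard definition 2|A∩B| / (|A| + |B|). A minimal sketch for flattened binary masks is shown below; the function name and the toy masks are assumptions, and the values are unrelated to the BRATS results.

```python
def dice_coefficient(pred, target):
    """Dice coefficient of two flattened binary masks."""
    inter = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    return 2.0 * inter / total if total else 1.0  # both empty: perfect match


predicted = [1, 1, 0, 0]
truth = [1, 0, 0, 0]
score = dice_coefficient(predicted, truth)  # 2 * 1 / (2 + 1)
```

A value of 1.0 means the predicted tumor mask and the ground truth overlap exactly.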

Automated Lung Segmentation on Chest Computed Tomography Images with Extensive Lung Parenchymal Abnormalities Using a Deep Neural Network

  • Seung-Jin Yoo;Soon Ho Yoon;Jong Hyuk Lee;Ki Hwan Kim;Hyoung In Choi;Sang Joon Park;Jin Mo Goo
    • Korean Journal of Radiology / v.22 no.3 / pp.476-488 / 2021
  • Objective: We aimed to develop a deep neural network for segmenting lung parenchyma with extensive pathological conditions on non-contrast chest computed tomography (CT) images. Materials and Methods: Thin-section non-contrast chest CT images from 203 patients (115 males, 88 females; age range, 31-89 years) between January 2017 and May 2017 were included in the study, of which 150 cases had extensive lung parenchymal disease involving more than 40% of the parenchymal area. Parenchymal diseases included interstitial lung disease (ILD), emphysema, nontuberculous mycobacterial lung disease, tuberculous destroyed lung, pneumonia, lung cancer, and other diseases. Five experienced radiologists manually drew the margin of the lungs, slice by slice, on CT images. The dataset used to develop the network consisted of 157 cases for training, 20 cases for development, and 26 cases for internal validation. Two-dimensional (2D) U-Net and three-dimensional (3D) U-Net models were used for the task. The network was trained to segment the lung parenchyma as a whole and segment the right and left lung separately. The University Hospitals of Geneva ILD dataset, which contained high-resolution CT images of ILD, was used for external validation. Results: The Dice similarity coefficients for internal validation were 99.6 ± 0.3% (2D U-Net whole lung model), 99.5 ± 0.3% (2D U-Net separate lung model), 99.4 ± 0.5% (3D U-Net whole lung model), and 99.4 ± 0.5% (3D U-Net separate lung model). The Dice similarity coefficients for the external validation dataset were 98.4 ± 1.0% (2D U-Net whole lung model) and 98.4 ± 1.0% (2D U-Net separate lung model). In 31 cases, where the extent of ILD was larger than 75% of the lung parenchymal area, the Dice similarity coefficients were 97.9 ± 1.3% (2D U-Net whole lung model) and 98.0 ± 1.2% (2D U-Net separate lung model). 
Conclusion: The deep neural network achieved excellent performance in automatically delineating the boundaries of lung parenchyma with extensive pathological conditions on non-contrast chest CT images.

Development of Marine Debris Monitoring Methods Using Satellite and Drone Images (위성 및 드론 영상을 이용한 해안쓰레기 모니터링 기법 개발)

  • Kim, Heung-Min;Bak, Suho;Han, Jeong-ik;Ye, Geon Hui;Jang, Seon Woong
    • Korean Journal of Remote Sensing / v.38 no.6_1 / pp.1109-1124 / 2022
  • This study proposes a marine debris monitoring method using satellite and drone multispectral images. A multi-layer perceptron (MLP) model was applied to detect marine debris in Sentinel-2 satellite images. For detection in drone multispectral images, the deep learning models U-Net, DeepLabv3+ (ResNet50), and DeepLabv3+ (Inceptionv3) were evaluated and compared. Marine debris detection using satellite images gave an F1-Score of 0.97. Detection using drone multispectral images was performed on vegetative debris and plastics. DeepLabv3+ (Inceptionv3) gave the highest model accuracy, with a mean intersection over union (mIoU) of 0.68. Vegetative debris showed an F1-Score of 0.93 and IoU of 0.86, while plastics showed low performance with an F1-Score of 0.5 and IoU of 0.33. However, the F1-Score of the spectral index applied to generate plastic mask images was 0.81, which was higher than the plastics detection performance of DeepLabv3+ (Inceptionv3), confirming that plastics monitoring using the spectral index is possible. The marine debris monitoring technique proposed in this study can be used to establish a plan for marine debris collection and treatment as well as to provide quantitative data on marine debris generation.

Design of Speech Enhancement U-Net for Embedded Computing (임베디드 연산을 위한 잡음에서 음성추출 U-Net 설계)

  • Kim, Hyun-Don
    • IEMEK Journal of Embedded Systems and Applications / v.15 no.5 / pp.227-234 / 2020
  • In this paper, we propose wav-U-Net to improve speech enhancement in heavily noisy environments; it implements three principal techniques. First, as input data, we use 128 modified Mel-scale filter banks instead of 512 frequency bins, which reduces the computational burden. The Mel scale aims to mimic the non-linear human perception of sound by being more discriminative at lower frequencies and less discriminative at higher frequencies. Because our proposed network focuses on speech signals, the Mel scale is a suitable feature considering both performance and computing power. Second, we add a simple ResNet as pre-processing that helps the proposed network make estimated speech signals clear and suppress high-frequency noise. Finally, the proposed U-Net model shows significant performance regardless of the kind of noise. In particular, despite using a single channel, we confirmed that it deals well with non-stationary noise whose frequency properties change dynamically, and that it can estimate speech signals from noisy speech even in extremely noisy environments where the noise is much louder than the speech (SNR below 0 dB). The performance of our proposed wav-U-Net improved by about 200% on SDR and 460% on NSDR compared to the conventional Jansson's wav-U-Net. It was also confirmed that the processing time of our wav-U-Net with 128 modified Mel-scale filter banks was about 2.7 times faster than the common wav-U-Net with 512 frequency bins as input values.
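The property that the Mel scale is finer at low frequencies can be seen in a short sketch. It uses the common 2595 · log10(1 + f/700) conversion, which is an assumption: the abstract speaks of "modified" Mel-scale filter banks without giving the formula. The 0 to 8000 Hz range and the function names are likewise illustrative.

```python
import math


def hz_to_mel(hz):
    """Common Mel conversion (assumed; the paper's variant is unspecified)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(n_bands, f_min, f_max):
    """Center frequencies in Hz of n_bands filters equally spaced in Mel."""
    m_min, m_max = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (m_max - m_min) / (n_bands + 1)
    return [mel_to_hz(m_min + step * (i + 1)) for i in range(n_bands)]


centers = mel_center_frequencies(128, 0.0, 8000.0)
low_gap = centers[1] - centers[0]      # spacing between the lowest bands
high_gap = centers[-1] - centers[-2]   # spacing between the highest bands
```

Because the bands are equally spaced in Mel, their spacing in Hz widens toward high frequencies, so 128 bands can summarize a spectrum that would otherwise need many more linear frequency bins.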