• Title/Abstract/Keywords: Over-Segmentation

Search Result 349, Processing Time 0.025 seconds

Impacts of label quality on performance of steel fatigue crack recognition using deep learning-based image segmentation

  • Hsu, Shun-Hsiang;Chang, Ting-Wei;Chang, Chia-Ming
    • Smart Structures and Systems
    • /
    • v.29 no.1
    • /
    • pp.207-220
    • /
    • 2022
  • Structural health monitoring (SHM) plays a vital role in the maintenance and operation of structures. In recent years, autonomous inspection has received considerable attention because conventional monitoring methods are inefficient and expensive. To develop autonomous inspection, a crack identification approach is needed to locate defects. Therefore, this study exploits two deep learning-based segmentation models, DeepLabv3+ and Mask R-CNN, for crack segmentation because these two models can outperform other similar models on public datasets. Additionally, the impacts of label quality on model performance are explored to obtain an empirical guideline for preparing image datasets. The influences of image cropping and label refining are also investigated, and different strategies are applied to the dataset, resulting in six altered datasets. In experiments with these datasets, the highest mean Intersection-over-Union (mIoU), 75%, is achieved by Mask R-CNN. The rise in the percentage of annotations brought by image cropping improves model performance, while label refining has opposite effects on the two models. Because label refining results in fewer erroneous crack annotations, it enhances the performance of DeepLabv3+. In contrast, the performance of Mask R-CNN decreases because fragmented annotations may cause a single instance to be mistaken for multiple instances. To sum up, both DeepLabv3+ and Mask R-CNN are capable of crack identification, and an empirical guideline on data preparation is presented to improve identification success via image cropping and label refining.
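
For readers unfamiliar with the mIoU figure quoted in this abstract, the following is a minimal sketch of how mean Intersection-over-Union is commonly computed from flattened label maps. The function name `mean_iou` and the toy labels are illustrative assumptions, not the authors' evaluation code.

```python
def mean_iou(y_true, y_pred, num_classes):
    """Average the per-class IoU over classes that appear in labels or predictions."""
    ious = []
    for c in range(num_classes):
        pairs = list(zip(y_true, y_pred))
        inter = sum(1 for t, p in pairs if t == c and p == c)
        union = sum(1 for t, p in pairs if t == c or p == c)
        if union > 0:  # skip classes absent from both maps
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0

# Hypothetical flattened label maps: 0 = background, 1 = crack.
truth = [0, 0, 1, 1, 1, 0]
pred = [0, 1, 1, 1, 0, 0]
score = mean_iou(truth, pred, num_classes=2)
print(score)
```

Because every class contributes equally to the average, a thin class such as a crack can pull mIoU down even when most pixels are labeled correctly.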

A dual path encoder-decoder network for placental vessel segmentation in fetoscopic surgery

  • Yunbo Rao;Tian Tan;Shaoning Zeng;Zhanglin Chen;Jihong Sun
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.18 no.1
    • /
    • pp.15-29
    • /
    • 2024
  • A fetoscope is an optical endoscope, which is often applied in fetoscopic laser photocoagulation to treat twin-to-twin transfusion syndrome. In an operation, the clinician needs to observe the abnormal placental vessels through the endoscope, so as to guide the operation. However, low-quality imaging and narrow field of view of the fetoscope increase the difficulty of the operation. Introducing an accurate placental vessel segmentation of fetoscopic images can assist the fetoscopic laser photocoagulation and help identify the abnormal vessels. This study proposes a method to solve the above problems. A novel encoder-decoder network with a dual-path structure is proposed to segment the placental vessels in fetoscopic images. In particular, we introduce a channel attention mechanism and a continuous convolution structure to obtain multi-scale features with their weights. Moreover, a switching connection is inserted between the corresponding blocks of the two paths to strengthen their relationship. According to the results of a set of blood vessel segmentation experiments conducted on a public fetoscopic image dataset, our method has achieved higher scores than the current mainstream segmentation methods, raising the dice similarity coefficient, intersection over union, and pixel accuracy by 5.80%, 8.39% and 0.62%, respectively.
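
The three scores named in this abstract (Dice similarity coefficient, intersection over union, pixel accuracy) can be sketched for a binary mask as follows. The helper `binary_scores` and the toy masks are assumptions for illustration only; the sketch assumes at least one foreground pixel in the ground truth or the prediction.

```python
def binary_scores(truth, pred):
    """Return (dice, iou, pixel_accuracy) for flattened 0/1 masks."""
    pairs = list(zip(truth, pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    dice = 2 * tp / (2 * tp + fp + fn)
    iou = tp / (tp + fp + fn)
    pixel_accuracy = (tp + tn) / len(pairs)
    return dice, iou, pixel_accuracy

# Hypothetical masks: 1 = vessel pixel, 0 = background.
truth = [1, 1, 1, 0, 0, 0, 0, 0]
pred = [1, 1, 0, 1, 0, 0, 0, 0]
dice, iou, pa = binary_scores(truth, pred)
print(dice, iou, pa)
```

Pixel accuracy counts true negatives, so when the foreground is thin it is dominated by background pixels and tends to move less than Dice or IoU.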

An improved fuzzy c-means method based on multivariate skew-normal distribution for brain MR image segmentation

  • Guiyuan Zhu;Shengyang Liao;Tianming Zhan;Yunjie Chen
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.18 no.8
    • /
    • pp.2082-2102
    • /
    • 2024
  • Accurate segmentation of magnetic resonance (MR) images is crucial for providing doctors with effective quantitative information for diagnosis. However, the presence of weak boundaries, intensity inhomogeneity, and noise in the images poses challenges for segmentation models to achieve optimal results. While deep learning models can offer relatively accurate results, the scarcity of labeled medical imaging data increases the risk of overfitting. To tackle this issue, this paper proposes a novel fuzzy c-means (FCM) model that integrates a deep learning approach. To address the limited accuracy of traditional FCM models, which employ Euclidean distance as a distance measure, we introduce a measurement function based on the skewed normal distribution. This function enables us to capture more precise information about the distribution of the image. Additionally, we construct a regularization term based on the Kullback-Leibler (KL) divergence of high-confidence deep learning results. This regularization term helps enhance the final segmentation accuracy of the model. Moreover, we incorporate orthogonal basis functions to estimate the bias field and integrate it into the improved FCM method. This integration allows our method to simultaneously segment the image and estimate the bias field. The experimental results on both simulated and real brain MR images demonstrate the robustness of our method, highlighting its superiority over other advanced segmentation algorithms.
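
As background for this abstract, here is a minimal sketch of the conventional fuzzy c-means baseline with a Euclidean (absolute-difference) distance on 1-D intensities. It does not implement the paper's skew-normal measure, KL regularization, or bias-field estimation; the function name `fcm_1d`, the fuzzifier `m = 2`, and the toy intensities are assumptions.

```python
def fcm_1d(values, centers, m=2.0, iters=50, eps=1e-9):
    """Conventional FCM on scalar intensities; returns (centers, memberships)."""
    centers = list(centers)
    memberships = []
    for _ in range(iters):
        # Membership update: u_ik is proportional to d_ik^(-2/(m-1)).
        memberships = []
        for x in values:
            inv = [(abs(x - c) + eps) ** (-2.0 / (m - 1.0)) for c in centers]
            total = sum(inv)
            memberships.append([w / total for w in inv])
        # Center update: mean of the data weighted by u_ik^m.
        for k in range(len(centers)):
            num = sum((u[k] ** m) * x for u, x in zip(memberships, values))
            den = sum(u[k] ** m for u in memberships)
            centers[k] = num / den
    return centers, memberships

# Hypothetical intensities forming two groups, with rough initial centers.
values = [0.0, 0.1, 0.2, 0.9, 1.0, 1.1]
centers, memberships = fcm_1d(values, centers=[0.0, 1.0])
print(centers)
```

Each pixel receives a soft membership in every cluster, which is what lets FCM variants add terms such as a bias field or a regularizer to the same objective.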

A Blind Segmentation Algorithm for Speaker Verification System (화자확인 시스템을 위한 분절 알고리즘)

  • 김지운;김유진;민홍기;정재호
    • The Journal of the Acoustical Society of Korea
    • /
    • v.19 no.3
    • /
    • pp.45-50
    • /
    • 2000
  • This paper proposes a delta energy method based on Parameter Filtering (PF), a speech segmentation algorithm for a text-dependent speaker verification system over telephone lines. Our parametric filter bank adopts a variable bandwidth along with a fixed center frequency. Compared with other methods, the proposed method proves very robust to channel noise and background noise. Using this method, we segment an utterance into consecutive subword units and build a model for each subword unit. In terms of equal error rate (EER), the speaker verification system based on the whole-word model achieves 6.1%, whereas the system based on the subword model achieves 4.0%, an improvement of about 2% in EER.
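
The EER figures above compare two verification systems at the operating point where false acceptance and false rejection are equal. A minimal sketch of estimating EER from scores follows; the function `equal_error_rate` and the score lists are hypothetical and are not taken from the paper.

```python
def equal_error_rate(genuine, impostor):
    """Sweep thresholds and return (eer, threshold) where FAR and FRR are closest."""
    best = None
    for thr in sorted(set(genuine) | set(impostor)):
        far = sum(s >= thr for s in impostor) / len(impostor)  # false acceptance rate
        frr = sum(s < thr for s in genuine) / len(genuine)     # false rejection rate
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, thr)
    return best[1], best[2]

# Hypothetical verification scores (higher means "more likely the claimed speaker").
genuine = [0.9, 0.8, 0.7, 0.6, 0.4]
impostor = [0.5, 0.3, 0.2, 0.1, 0.05]
eer, threshold = equal_error_rate(genuine, impostor)
print(eer, threshold)
```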


An Intelligent Video Image Segmentation System using Watershed Algorithm (워터쉐드 알고리즘을 이용한 지능형 비디오 영상 분할 시스템)

  • Yang, Hwang-Kyu
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.5 no.3
    • /
    • pp.309-314
    • /
    • 2010
  • In this paper, an intelligent security camera (ISC) system operating over the Internet is proposed. Among ISC methods, watershed-based methods achieve good segmentation accuracy. However, the traditional watershed transform suffers from over-segmentation caused by small local minima in the gradient image that serves as its input. Face candidate regions are then detected using a skin-color model. In the last step, each candidate location is checked for a face using an SVM, with wavelet transform coefficients extracted from the candidate regions. The approach is therefore likely to be applicable to real-world problems such as object tracking, surveillance, and human-computer interface applications.
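
The over-segmentation problem mentioned in this abstract can be illustrated in one dimension: every local minimum of the gradient seeds its own catchment basin, so small ripples multiply the number of regions. The sketch below counts minima before and after a moving average. The smoothing is only a stand-in chosen for illustration, since the abstract does not say how the paper suppresses spurious minima, and the gradient profile is made up.

```python
def count_local_minima(signal):
    """Count strict interior local minima; each would seed one watershed basin."""
    return sum(
        1
        for i in range(1, len(signal) - 1)
        if signal[i] < signal[i - 1] and signal[i] < signal[i + 1]
    )

def moving_average(signal, radius=1):
    """Smooth with a centered window that is truncated at the borders."""
    out = []
    for i in range(len(signal)):
        lo, hi = max(0, i - radius), min(len(signal), i + radius + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out

# Hypothetical gradient profile: two real basins plus small ripples.
gradient = [9, 7, 8, 5, 6, 2, 3, 7, 9, 6, 7, 3, 4, 1, 2, 6, 9]
raw_basins = count_local_minima(gradient)
smoothed_basins = count_local_minima(moving_average(gradient))
print(raw_basins, smoothed_basins)
```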

Segmentation-Based Residual Image Coding Using Classified Vector Quantizer (분할기반 잉여신호의 CVQ 영상 부호화)

  • 김남철;김종우;홍원학;석민수
    • Journal of the Korean Institute of Telematics and Electronics B
    • /
    • v.30B no.1
    • /
    • pp.63-71
    • /
    • 1993
  • An efficient RVQ image coding method is proposed using segmentation-based coding and CVQ techniques. In the proposed method, the residual image, which is the difference between the original image and the synthesized image obtained from segmentation-based coding, is first divided into $\times$4 subblocks. The subblocks are then individually coded in the spatial domain using a simple CVQ. Experimental results show that the proposed method yields better reconstructed image quality than basic VQ and SMVQ in both PSNR and subjective tests.


Efficient Image Segmentation Using Morphological Watershed Algorithm (형태학적 워터쉐드 알고리즘을 이용한 효율적인 영상분할)

  • Kim, Young-Woo;Lim, Jae-Young;Lee, Won-Yeol;Kim, Se-Yun;Lim, Dong-Hoon
    • The Korean Journal of Applied Statistics
    • /
    • v.22 no.4
    • /
    • pp.709-721
    • /
    • 2009
  • This paper discusses efficient image segmentation using a morphological watershed algorithm that is robust to noise. Morphological image segmentation consists of four steps: image simplification, computation of the gradient image, the watershed algorithm, and region merging. Conventional watershed segmentation has a serious weakness: it over-segments images. In this paper we present a morphological edge detection method for detecting edges under noisy conditions, apply our watershed algorithm to the resulting gradient images, and merge regions using the Kolmogorov-Smirnov test to eliminate irrelevant regions in the segmented images. Experimental results are analyzed both qualitatively, through visual inspection, and quantitatively, using percentage error and the computation time needed to segment images. The proposed algorithm efficiently improves segmentation accuracy and significantly reduces computation time.
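
The region-merging step described above relies on the Kolmogorov-Smirnov test. A minimal sketch of a two-sample KS statistic with a merge decision is given below; the abstract does not state the paper's exact merging rule, so the function names, the large-sample critical value, the significance level, and the sample intensities are all assumptions.

```python
import math
from bisect import bisect_right

def ks_statistic(a, b):
    """Largest gap between the empirical CDFs of two samples."""
    a, b = sorted(a), sorted(b)
    d = 0.0
    for x in sorted(set(a) | set(b)):
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d

def should_merge(a, b, alpha=0.05):
    """Merge when the KS test does not reject 'same distribution' (large-sample approximation)."""
    n, m = len(a), len(b)
    c_alpha = math.sqrt(-0.5 * math.log(alpha / 2))
    critical = c_alpha * math.sqrt((n + m) / (n * m))
    return ks_statistic(a, b) <= critical

# Hypothetical pixel intensities from three neighboring watershed regions.
region_a = [10, 11, 12, 13, 14]
region_b = [11, 12, 13, 14, 15]
region_c = [50, 51, 52, 53, 54]
print(should_merge(region_a, region_b), should_merge(region_a, region_c))
```

With only five samples per region the asymptotic critical value is rough; real regions contain far more pixels, where the approximation is more reasonable.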

Real-time semantic segmentation of gastric intestinal metaplasia using a deep learning approach

  • Vitchaya Siripoppohn;Rapat Pittayanon;Kasenee Tiankanon;Natee Faknak;Anapat Sanpavat;Naruemon Klaikaew;Peerapon Vateekul;Rungsun Rerknimitr
    • Clinical Endoscopy
    • /
    • v.55 no.3
    • /
    • pp.390-400
    • /
    • 2022
  • Background/Aims: Previous artificial intelligence (AI) models attempting to segment gastric intestinal metaplasia (GIM) areas have failed to be deployed in real-time endoscopy due to their slow inference speeds. Here, we propose a new GIM segmentation AI model with inference speeds faster than 25 frames per second that maintains a high level of accuracy. Methods: Investigators from Chulalongkorn University obtained 802 histologically proven GIM images for AI model training. Four strategies were proposed to improve the model accuracy. First, transfer learning from public colon datasets was employed. Second, an image preprocessing technique, contrast-limited adaptive histogram equalization, was employed to produce clearer GIM areas. Third, data augmentation was applied for a more robust model. Lastly, the bilateral segmentation network model was applied to segment GIM areas in real time. The results were analyzed using different validity values. Results: In the internal test, our AI model achieved an inference speed of 31.53 frames per second. For GIM detection, sensitivity, specificity, positive predictive value, negative predictive value, and accuracy were 93%, 80%, 82%, 92%, and 87%, respectively, and the mean intersection over union for GIM segmentation was 57%. Conclusions: The bilateral segmentation network combined with transfer learning, contrast-limited adaptive histogram equalization, and data augmentation can provide high sensitivity and good accuracy for GIM detection and segmentation.
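
The detection figures in this abstract are standard confusion-matrix ratios. The sketch below shows how they relate to one another; the function name `detection_metrics` and the counts are hypothetical and are not the study's data.

```python
def detection_metrics(tp, fp, tn, fn):
    """Standard ratios derived from the four confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # share of positives that are detected
        "specificity": tn / (tn + fp),  # share of negatives that are cleared
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }

# Hypothetical counts for an image-level detection test.
metrics = detection_metrics(tp=45, fp=10, tn=40, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```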

Effect of Learning Data on the Semantic Segmentation of Railroad Tunnel Using Deep Learning (딥러닝을 활용한 철도 터널 객체 분할에 학습 데이터가 미치는 영향)

  • Ryu, Young-Moo;Kim, Byung-Kyu;Park, Jeongjun
    • Journal of the Korean Geotechnical Society
    • /
    • v.37 no.11
    • /
    • pp.107-118
    • /
    • 2021
  • Scan-to-BIM enables precise modeling by measuring structures with Light Detection And Ranging (LiDAR) and building a 3D BIM (Building Information Modeling) model from the measurements, but it consumes considerable manpower, time, and cost. To overcome these limitations, studies are being conducted on semantic segmentation of 3D point cloud data using deep learning algorithms, but studies on how segmentation results change depending on the training data are insufficient. In this study, a parametric study was conducted to determine how the size and track type of the railroad tunnels that make up the training data affect deep learning-based semantic segmentation of railroad tunnels. The parametric study showed that the more similar the sizes of the tunnels used for training and testing, the higher the segmentation accuracy, and that training on a double-track tunnel gave better results than training on a single-track tunnel. In addition, when the training data was composed of two or more tunnels, overall accuracy (OA) and mean intersection over union (MIoU) increased by 10% to 50%, confirming that varied compositions of training data can contribute to efficient learning.

A New Method for Segmenting Speech Signal by Frame Averaging Algorithm

  • Byambajav D.;Kang Chul-Ho
    • The Journal of the Acoustical Society of Korea
    • /
    • v.24 no.4E
    • /
    • pp.128-131
    • /
    • 2005
  • A new algorithm for speech signal segmentation is proposed. The algorithm finds successive similar frames belonging to a segment and represents the segment by an average spectrum. The speech signal is a slowly time-varying signal in the sense that, when examined over a sufficiently short period of time (between 10 and 100 ms), its characteristics are fairly stationary. The approach is based on finding these fairly stationary periods. The advantages of the algorithm are accurate decisions on segment borders and simple computation. Automatic segmentation using frame averaging coincides with the manually verified segmentation of the CMU ARCTIC corpus for as much as $82.20\%$ of segment boundaries within a 16 ms range, and for more than $90\%$ of segment boundaries within a 32 ms range. It can also be combined with many types of automatic segmentation (HMM-based, acoustic cue-based, feature-based, etc.).
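
The idea of grouping successive similar frames and representing each group by its average spectrum can be sketched as below. The abstract does not give the similarity measure or the decision rule, so the Euclidean distance, the fixed `threshold`, the function name `segment_frames`, and the toy spectra are all assumptions.

```python
import math

def segment_frames(frames, threshold):
    """Greedy segmentation: a frame far from the running mean spectrum starts a new segment."""
    segments = []
    start, mean, count = 0, list(frames[0]), 1
    for i in range(1, len(frames)):
        if math.dist(frames[i], mean) > threshold:
            segments.append((start, i - 1, mean))
            start, mean, count = i, list(frames[i]), 1
        else:
            count += 1
            # Incremental update of the average spectrum of the current segment.
            mean = [m + (x - m) / count for m, x in zip(mean, frames[i])]
    segments.append((start, len(frames) - 1, mean))
    return segments

# Hypothetical two-bin spectra for five frames.
frames = [[1.0, 0.0], [1.1, 0.0], [0.9, 0.0], [0.0, 1.0], [0.0, 1.2]]
segments = segment_frames(frames, threshold=0.5)
for first, last, spectrum in segments:
    print(first, last, spectrum)
```

Each returned tuple holds the first frame index, the last frame index, and the average spectrum, so segment boundaries fall between consecutive tuples.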