• Title/Summary/Keyword: Multi-class Segmentation


Adaptive Multi-class Segmentation Model of Aggregate Image Based on Improved Sparrow Search Algorithm

  • Mengfei Wang;Weixing Wang;Sheng Feng;Limin Li
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.17 no.2
    • /
    • pp.391-411
    • /
    • 2023
  • Aggregates play a skeleton and supporting role in the construction field, so high-precision measurement and high-efficiency analysis of aggregates are frequently employed to evaluate project quality. Aiming at the imbalance between operation time and segmentation accuracy in multi-class segmentation algorithms for aggregate images, a Chaotic Sparrow Search Algorithm (CSSA) is put forward. In this algorithm, a chaotic map is combined with a sinusoidal dynamic weight and an elite mutation strategy, proposed for the first time to improve the SSA's optimization accuracy and stability without reducing its speed. The CSSA is used to optimize a popular multi-class segmentation algorithm, Multiple Entropy Thresholding (MET). Taking three METs as objective functions, namely Kapur entropy, minimum cross entropy and Renyi entropy, the CSSA quickly and automatically finds the extreme value of each function and obtains the corresponding thresholds. The resulting adaptive multi-class image segmentation model is called CSSA-MET. To evaluate it comprehensively, a new parameter I that combines segmentation accuracy and processing speed is constructed. The results show that the CSSA outperforms the other seven optimization methods, that the quality of aggregate images segmented by CSSA-MET is rated well, and that speed and accuracy are balanced. In particular, the highest I value is obtained when the CSSA is applied to optimize the Renyi entropy, indicating that this combination is the most suitable for segmenting aggregate images.
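
The objective being optimized here is worth making concrete. The sketch below is a minimal NumPy illustration of Kapur entropy for a two-threshold split of a 256-bin grayscale histogram, with a coarse exhaustive search standing in for the CSSA optimizer; it is not the authors' implementation, and the histogram is synthetic.

```python
import numpy as np

def kapur_entropy(hist, thresholds):
    """Sum of Shannon entropies of the histogram classes induced by `thresholds`."""
    p = hist / hist.sum()                      # normalised gray-level probabilities
    edges = [0] + sorted(thresholds) + [256]   # class boundaries over the 256 bins
    total = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        w = p[lo:hi].sum()                     # probability mass of this class
        if w <= 0:
            continue
        q = p[lo:hi] / w
        q = q[q > 0]
        total += -(q * np.log(q)).sum()        # entropy contribution of this class
    return total

# Toy usage: coarse exhaustive search for the best two thresholds on a synthetic
# histogram (a metaheuristic such as the paper's CSSA would replace this loop).
hist = np.random.randint(1, 100, size=256).astype(float)
candidates = [(t1, t2) for t1 in range(8, 248, 8) for t2 in range(t1 + 8, 256, 8)]
best = max(candidates, key=lambda ts: kapur_entropy(hist, list(ts)))
print("best thresholds:", best)
```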

Character Segmentation and Recognition Algorithm for Steel Manufacturing Process Automation (슬라브 제품 정보 인식을 위한 문자 분리 및 문자 인식 알고리즘 개발)

  • Choi, Sung-Hoo;Yun, Jong-Pil;Park, Young-Su;Park, Jee-Hoon;Koo, Keun-Hwi;Kim, Sang-Woo
    • Proceedings of the KIEE Conference
    • /
    • 2007.04a
    • /
    • pp.389-391
    • /
    • 2007
  • This paper describes a printed character segmentation and recognition system for slabs in the steel manufacturing process. To increase the recognition rate, it is important to improve the success rate of character segmentation. Since the front surface of a slab is not uniform and its temperature is very high, the marked characters are not only damaged but also very noisy. In addition, since most marked characters are very thick and the space between characters is only about 10 to 15 mm, there are many touching characters. Therefore, appropriate character image preprocessing and segmentation algorithms are needed. In this paper we propose a multi-local thresholding method for restoring damaged characters and a modified touching-character segmentation algorithm for the marked characters. Finally, an effective multi-class SVM is used to recognize the segmented characters.
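
Local thresholding of unevenly lit, hot slab surfaces can be illustrated with a simple block-wise Otsu binarization. The sketch below is a generic OpenCV illustration of the idea, not the paper's multi-local thresholding method; the file name in the usage comment is hypothetical.

```python
import cv2
import numpy as np

def blockwise_otsu(gray, block=64):
    """Binarise an unevenly lit grayscale image by applying Otsu's threshold per block."""
    out = np.zeros_like(gray)
    h, w = gray.shape
    for y in range(0, h, block):
        for x in range(0, w, block):
            roi = gray[y:y + block, x:x + block]
            _, binary = cv2.threshold(roi, 0, 255,
                                      cv2.THRESH_BINARY + cv2.THRESH_OTSU)
            out[y:y + block, x:x + block] = binary
    return out

# Usage (hypothetical file name):
# gray = cv2.imread("slab.png", cv2.IMREAD_GRAYSCALE)
# mask = blockwise_otsu(gray)
```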


DA-Res2Net: a novel Densely connected residual Attention network for image semantic segmentation

  • Zhao, Xiaopin;Liu, Weibin;Xing, Weiwei;Wei, Xiang
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.14 no.11
    • /
    • pp.4426-4442
    • /
    • 2020
  • Since scene segmentation is becoming a hot topic in the fields of autonomous driving and medical image analysis, researchers are actively trying new methods to improve segmentation accuracy. At present, the main issues in image semantic segmentation are intra-class inconsistency and inter-class indistinction. From our analysis, the lack of global information and of macroscopic discrimination of objects are the two main reasons. In this paper, we propose a Densely connected residual Attention network (DA-Res2Net), which consists of a dense residual network and a channel attention guidance module, to deal with these problems and improve the accuracy of image segmentation. Specifically, in order to equip the extracted features with stronger multi-scale characteristics, a densely connected residual network is proposed as the feature extractor. Furthermore, to improve the representativeness of each channel feature, we design a Channel-Attention-Guide module that makes the model focus on high-level semantic features and low-level location features simultaneously. Experimental results show that the method performs well on various datasets. Compared to other state-of-the-art methods, the proposed method reaches a mean IoU of 83.2% on PASCAL VOC 2012 and 79.7% on the Cityscapes dataset.
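
The channel reweighting performed by the Channel-Attention-Guide module can be approximated with a squeeze-and-excitation style block. The PyTorch sketch below is a minimal stand-in under that assumption, not the authors' exact module.

```python
import torch
import torch.nn as nn

class ChannelAttentionGuide(nn.Module):
    """Reweights feature channels using globally pooled statistics (SE-style sketch)."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):                          # x: (N, C, H, W)
        w = x.mean(dim=(2, 3))                     # global average pooling -> (N, C)
        w = self.fc(w).unsqueeze(-1).unsqueeze(-1) # per-channel weights in [0, 1]
        return x * w                               # channel-wise reweighting

# feats = torch.randn(2, 64, 32, 32); out = ChannelAttentionGuide(64)(feats)
```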

Fully Automatic Liver Segmentation Based on the Morphological Property of a CT Image (CT 영상의 모포러지컬 특성에 기반한 완전 자동 간 분할)

  • 서경식;박종안;박승진
    • Progress in Medical Physics
    • /
    • v.15 no.2
    • /
    • pp.70-76
    • /
    • 2004
  • The most important step for the early detection of liver cancer and for determining its characteristics and location is good segmentation of the liver region from the other abdominal organs. This paper proposes a fully automatic liver segmentation algorithm based on the morphological characteristics of the abdomen as an easy and efficient method. Multi-modal thresholding is performed as pre-processing and the spine is segmented to establish the morphological coordinates of the abdomen. The liver region is then extracted using a C-class maximum a posteriori (MAP) decision and morphological filtering. To evaluate the automatically segmented liver region, the area error rate (AER) and the correlation coefficients of rotational binary region projection matching (RBRPM) are used. Experimental results showed that the automatic liver segmentation obtained by the proposed algorithm was strongly similar to manual liver segmentation.
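
The role of the morphological filtering step can be illustrated with a short scikit-image sketch that cleans a thresholded CT slice and keeps its largest connected region. This is a generic stand-in, not the paper's C-class MAP decision; `ct_slice` and the threshold in the usage comment are hypothetical.

```python
import numpy as np
from skimage import measure, morphology

def largest_region_after_morphology(binary, open_radius=3):
    """Clean a thresholded CT slice and keep only its largest connected region."""
    cleaned = morphology.binary_opening(binary, morphology.disk(open_radius))
    labels = measure.label(cleaned)
    if labels.max() == 0:                          # nothing survived the opening
        return cleaned
    sizes = np.bincount(labels.ravel())[1:]        # region sizes, background excluded
    return labels == (np.argmax(sizes) + 1)        # mask of the largest region

# Usage (hypothetical inputs):
# liver_mask = largest_region_after_morphology(ct_slice > map_threshold)
```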


Survey on Deep Learning-based Panoptic Segmentation Methods (딥 러닝 기반의 팬옵틱 분할 기법 분석)

  • Kwon, Jung Eun;Cho, Sung In
    • IEMEK Journal of Embedded Systems and Applications
    • /
    • v.16 no.5
    • /
    • pp.209-214
    • /
    • 2021
  • Panoptic segmentation, which is now widely used in computer vision applications such as medical image analysis and autonomous driving, helps a model understand an image holistically. It identifies each pixel by assigning it a class ID and an instance ID. Specifically, it can distinguish 'things' from 'stuff' and provide pixel-wise results for both semantic prediction and object detection. As a result, it can solve the semantic segmentation and instance segmentation tasks through a single unified model, producing two different contexts for the two tasks. The semantic segmentation task focuses on how to obtain multi-scale features from a large receptive field without losing low-level features. The instance segmentation task, on the other hand, focuses on how to separate 'things' from 'stuff' and how to produce representations of the detected objects. With the advances of both segmentation techniques, several panoptic segmentation models have been proposed. Many researchers try to resolve the discrepancy between the results of the two segmentation branches that can arise on object boundaries. In this survey paper, we introduce the concept of panoptic segmentation, categorize existing models into two representative approaches, top-down and bottom-up, explain how each operates, and analyze the performance of various methods with experimental results.
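
The unified output that the survey describes can be illustrated by merging a semantic label map and an instance label map into a single panoptic map. The sketch below is a simplified illustration using a common class_id * 1000 + instance_id encoding, not the merge procedure of any particular model; the class IDs in the toy example are hypothetical.

```python
import numpy as np

def merge_panoptic(semantic, instance, stuff_ids):
    """Combine per-pixel class IDs and instance IDs into one panoptic label map.

    'Stuff' pixels keep instance ID 0; 'thing' pixels keep their instance ID.
    Labels are encoded as class_id * 1000 + instance_id.
    """
    panoptic = semantic.astype(np.int64) * 1000
    thing_mask = ~np.isin(semantic, list(stuff_ids))
    panoptic[thing_mask] += instance[thing_mask]
    return panoptic

# Toy usage with hypothetical IDs: class 0 = road ('stuff'), class 1 = car ('thing').
semantic = np.array([[0, 0, 1], [1, 1, 1]])
instance = np.array([[0, 0, 1], [1, 2, 2]])
print(merge_panoptic(semantic, instance, stuff_ids={0}))
```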

A Study on Market Segmentation through Clothes Image Preferences and Benefit (PartII) (선호 의복이미지와 편익에 의한 시장세분화에 관한 연구 (제2보))

  • 이숙희;임숙자
    • Journal of the Korean Society of Clothing and Textiles
    • /
    • v.27 no.3_4
    • /
    • pp.322-332
    • /
    • 2003
  • The purpose of this study was to segment the consumer market for women's street clothes based on the benefits sought. The sample consisted of 1,106 middle-class women in their 30s and 40s living in Gwangju. Consumers were classified into three groups by the benefits they sought: a practical benefit seeking group (36.7%), a multi-benefit seeking group (32.6%) and a symbolic/aesthetic benefit seeking group (30.7%). ANOVA and chi-square tests revealed differences among the groups in benefits sought, use of information sources, purchasing behavior variables and demographic variables. A comparison of the two market segmentations showed that benefit segmentation was more useful than segmentation by clothes image preference, but there were differences in psychological and demographic variables within the same benefit segments. Therefore, a hybrid segmentation approach using both clothes image preferences and benefits sought is necessary.

Image Segmentation for Fire Prediction using Deep Learning (딥러닝을 이용한 화재 발생 예측 이미지 분할)

  • TaeHoon, Kim;JongJin, Park
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.23 no.1
    • /
    • pp.65-70
    • /
    • 2023
  • In this paper, we use a deep learning model to detect and segment flame and smoke from fires in real time. To this end, the well-known U-Net is used to separate flame and smoke into multiple classes. As a result of training with the proposed technique, the loss and accuracy values are very good at 0.0486 and 0.97996, respectively. The IoU value used in object detection is also very good at 0.849. When predicting fire images that were not used for training, the trained model detects and segments flame and smoke well, and smoke colors are well distinguished. The proposed method can be used to build a fire prediction and detection system.
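
The reported IoU can be made concrete with a per-class intersection-over-union computation. The sketch below is a generic illustration, not the authors' evaluation code; the flame/smoke label IDs in the toy example are hypothetical.

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union between two integer label maps."""
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, target == c).sum()
        union = np.logical_or(pred == c, target == c).sum()
        if union > 0:                      # skip classes absent from both maps
            ious.append(inter / union)
    return float(np.mean(ious))

# Toy usage with hypothetical labels: 0 = background, 1 = flame, 2 = smoke.
pred   = np.array([[0, 1, 1], [2, 2, 0]])
target = np.array([[0, 1, 2], [2, 2, 0]])
print(mean_iou(pred, target, num_classes=3))
```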

Face and Iris Detection Algorithm based on SURF and circular Hough Transform (서프 및 하프변환 기반 운전자 동공 검출기법)

  • Artem, Lenskiy;Lee, Jong-Soo
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.47 no.5
    • /
    • pp.175-182
    • /
    • 2010
  • The paper presents a novel algorithm for face and iris detection, with an application to driver iris monitoring. The proposed algorithm consists of the following major steps: skin-color segmentation, facial feature segmentation, and iris positioning. For skin segmentation we apply a multi-layer perceptron to approximate the statistical probability of certain skin colors and filter out those with low probabilities. The next step segments the face region into the following categories: eye, mouth, eyebrow, and remaining facial regions. For this purpose we propose a novel segmentation technique based on the estimation of facial class probability density functions (PDFs). Each facial class PDF is estimated from salient features extracted from the corresponding facial image region. Pixels are then classified according to the highest probability among the four estimated PDFs. The final step applies the circular Hough transform to the detected eye regions to extract the position and radius of the iris. We tested our system on two data sets. The first was obtained from the Web and contains faces under different illuminations. The second was collected by us and contains images from video sequences recorded by a CCD camera while a driver was driving a car. The experimental results are presented, showing high detection rates.
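
The final iris-positioning step can be illustrated with OpenCV's circular Hough transform. The sketch below is a generic illustration of that step only; the Hough parameters and the file name in the usage comment are placeholders, not the paper's tuned settings.

```python
import cv2
import numpy as np

def detect_iris(eye_gray):
    """Return the centre and radius of the strongest circle (iris candidate) in an eye region."""
    blurred = cv2.medianBlur(eye_gray, 5)          # suppress noise before Hough voting
    circles = cv2.HoughCircles(blurred, cv2.HOUGH_GRADIENT, dp=1,
                               minDist=eye_gray.shape[0] // 2,
                               param1=100, param2=30,
                               minRadius=5, maxRadius=eye_gray.shape[0] // 2)
    if circles is None:
        return None
    x, y, r = np.round(circles[0, 0]).astype(int)  # strongest candidate
    return (x, y), r

# Usage (hypothetical file name):
# eye = cv2.imread("eye_region.png", cv2.IMREAD_GRAYSCALE)
# iris = detect_iris(eye)
```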

Fully Automatic Heart Segmentation Model Analysis Using Residual Multi-Dilated Recurrent Convolutional U-Net (Residual Multi-Dilated Recurrent Convolutional U-Net을 이용한 전자동 심장 분할 모델 분석)

  • Lim, Sang Heon;Lee, Myung Suk
    • KIPS Transactions on Computer and Communication Systems
    • /
    • v.9 no.2
    • /
    • pp.37-44
    • /
    • 2020
  • In this paper, we propose a fully automatic multi-class whole-heart segmentation algorithm using deep learning. The proposed method is based on a U-Net architecture that consists of recurrent convolutional blocks and residual multi-dilated convolutional blocks. Evaluation was performed by comparing the automated analysis results on the test dataset with manual assessment. We obtained an average DSC of 96.88%, precision of 95.60%, and recall of 97.00% on CT images. We were able to observe and analyze the results after visualizing the segmented images with a three-dimensional volume rendering method. Our experimental results show that the proposed method effectively segments the various heart structures. We expect that our method can help doctors and radiologists with image reading and clinical decisions.
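
The reported DSC is the Dice similarity coefficient, which can be computed per class as in the short sketch below; this is a generic illustration, not the authors' evaluation script, and the label IDs in the toy example are hypothetical.

```python
import numpy as np

def dice_coefficient(pred, target, class_id):
    """Dice similarity coefficient for one class of two integer label maps."""
    p = (pred == class_id)
    t = (target == class_id)
    denom = p.sum() + t.sum()
    if denom == 0:
        return 1.0                                 # class absent from both maps
    return 2.0 * np.logical_and(p, t).sum() / denom

# Toy usage with hypothetical labels: 0 = background, 1 = myocardium.
pred   = np.array([[0, 1, 1], [1, 0, 0]])
target = np.array([[0, 1, 1], [0, 0, 0]])
print(dice_coefficient(pred, target, class_id=1))
```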

Effective Multi-Modal Feature Fusion for 3D Semantic Segmentation with Multi-View Images (멀티-뷰 영상들을 활용하는 3차원 의미적 분할을 위한 효과적인 멀티-모달 특징 융합)

  • Hye-Lim Bae;Incheol Kim
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.12 no.12
    • /
    • pp.505-518
    • /
    • 2023
  • 3D point cloud semantic segmentation is a computer vision task that divides a point cloud into different objects and regions by predicting the class label of each point. Existing 3D semantic segmentation models are limited in performing sufficient fusion of multi-modal features while preserving the characteristics of both the 2D visual features extracted from RGB images and the 3D geometric features extracted from the point cloud. In this paper, we therefore propose MMCA-Net, a novel 3D semantic segmentation model using 2D-3D multi-modal features. The proposed model effectively fuses the two heterogeneous feature types, 2D visual and 3D geometric, by using an intermediate fusion strategy and a multi-modal cross-attention-based fusion operation. The proposed model also extracts context-rich 3D geometric features from an input point cloud of irregularly distributed points by adopting PTv2 as the 3D geometric encoder. We conducted both quantitative and qualitative experiments on the benchmark dataset ScanNetv2 to analyze the performance of the proposed model. In terms of mIoU, the proposed model showed a 9.2% improvement over the PTv2 model, which uses only 3D geometric features, and a 12.12% improvement over the MVPNet model, which uses 2D-3D multi-modal features. These results demonstrate the effectiveness and usefulness of the proposed model.
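
The multi-modal cross-attention fusion can be sketched as a block in which 3D point features form the queries and 2D image features form the keys and values. The PyTorch sketch below is a simplified illustration of that idea, not MMCA-Net itself; the dimensions in the usage comment are arbitrary.

```python
import torch
import torch.nn as nn

class CrossModalAttentionFusion(nn.Module):
    """3D point features (queries) attend to 2D image features (keys/values)."""
    def __init__(self, dim_3d, dim_2d, dim, heads=4):
        super().__init__()
        self.q = nn.Linear(dim_3d, dim)
        self.kv = nn.Linear(dim_2d, dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.out = nn.Linear(dim_3d + dim, dim_3d)

    def forward(self, feats_3d, feats_2d):         # (B, N, dim_3d), (B, M, dim_2d)
        q = self.q(feats_3d)
        kv = self.kv(feats_2d)
        fused, _ = self.attn(q, kv, kv)            # 2D context gathered per 3D point
        return self.out(torch.cat([feats_3d, fused], dim=-1))

# points = torch.randn(2, 1024, 96); pixels = torch.randn(2, 4096, 128)
# out = CrossModalAttentionFusion(96, 128, dim=128)(points, pixels)
```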