• Title/Summary/Keyword: Combined segmentation network

Search Result 17, Processing Time 0.02 seconds

Semantic Segmentation of Drone Images Based on Combined Segmentation Network Using Multiple Open Datasets (개방형 다중 데이터셋을 활용한 Combined Segmentation Network 기반 드론 영상의 의미론적 분할)

  • Ahram Song
    • Korean Journal of Remote Sensing
    • /
    • v.39 no.5_3
    • /
    • pp.967-978
    • /
    • 2023
  • This study proposed and validated a combined segmentation network (CSN) designed to effectively train on multiple drone image datasets and enhance the accuracy of semantic segmentation. CSN shares the entire encoding domain to accommodate the diversity of three drone datasets, while the decoding domains are trained independently. During training, the segmentation accuracy of CSN was lower compared to U-Net and the pyramid scene parsing network (PSPNet) on single datasets because it considers loss values for all dataset simultaneously. However, when applied to domestic autonomous drone images, CSN demonstrated the ability to classify pixels into appropriate classes without requiring additional training, outperforming PSPNet. This research suggests that CSN can serve as a valuable tool for effectively training on diverse drone image datasets and improving object recognition accuracy in new regions.

Semantic Segmentation of Heterogeneous Unmanned Aerial Vehicle Datasets Using Combined Segmentation Network

  • Ahram, Song
    • Korean Journal of Remote Sensing
    • /
    • v.39 no.1
    • /
    • pp.87-97
    • /
    • 2023
  • Unmanned aerial vehicles (UAVs) can capture high-resolution imagery from a variety of viewing angles and altitudes; they are generally limited to collecting images of small scenes from larger regions. To improve the utility of UAV-appropriated datasetsfor use with deep learning applications, multiple datasets created from variousregions under different conditions are needed. To demonstrate a powerful new method for integrating heterogeneous UAV datasets, this paper applies a combined segmentation network (CSN) to share UAVid and semantic drone dataset encoding blocks to learn their general features, whereas its decoding blocks are trained separately on each dataset. Experimental results show that our CSN improves the accuracy of specific classes (e.g., cars), which currently comprise a low ratio in both datasets. From this result, it is expected that the range of UAV dataset utilization will increase.

Semantic Image Segmentation Combining Image-level and Pixel-level Classification (영상수준과 픽셀수준 분류를 결합한 영상 의미분할)

  • Kim, Seon Kuk;Lee, Chil Woo
    • Journal of Korea Multimedia Society
    • /
    • v.21 no.12
    • /
    • pp.1425-1430
    • /
    • 2018
  • In this paper, we propose a CNN based deep learning algorithm for semantic segmentation of images. In order to improve the accuracy of semantic segmentation, we combined pixel level object classification and image level object classification. The image level object classification is used to accurately detect the characteristics of an image, and the pixel level object classification is used to indicate which object area is included in each pixel. The proposed network structure consists of three parts in total. A part for extracting the features of the image, a part for outputting the final result in the resolution size of the original image, and a part for performing the image level object classification. Loss functions exist for image level and pixel level classification, respectively. Image-level object classification uses KL-Divergence and pixel level object classification uses cross-entropy. In addition, it combines the layer of the resolution of the network extracting the features and the network of the resolution to secure the position information of the lost feature and the information of the boundary of the object due to the pooling operation.

Layer Segmentation of Retinal OCT Images using Deep Convolutional Encoder-Decoder Network (딥 컨볼루셔널 인코더-디코더 네트워크를 이용한 망막 OCT 영상의 층 분할)

  • Kwon, Oh-Heum;Song, Min-Gyu;Song, Ha-Joo;Kwon, Ki-Ryong
    • Journal of Korea Multimedia Society
    • /
    • v.22 no.11
    • /
    • pp.1269-1279
    • /
    • 2019
  • In medical image analysis, segmentation is considered as a vital process since it partitions an image into coherent parts and extracts interesting objects from the image. In this paper, we consider automatic segmentations of OCT retinal images to find six layer boundaries using convolutional neural networks. Segmenting retinal images by layer boundaries is very important in diagnosing and predicting progress of eye diseases including diabetic retinopathy, glaucoma, and AMD (age-related macular degeneration). We applied well-known CNN architecture for general image segmentation, called Segnet, U-net, and CNN-S into this problem. We also proposed a shortest path-based algorithm for finding the layer boundaries from the outputs of Segnet and U-net. We analysed their performance on public OCT image data set. The experimental results show that the Segnet combined with the proposed shortest path-based boundary finding algorithm outperforms other two networks.

Recognition of Passport MRZ Information Using Combined Neural Networks (결합 신경망을 이용한 여권 MRZ 정보 인식)

  • Kim, Jinho
    • Journal of Korea Society of Digital Industry and Information Management
    • /
    • v.15 no.4
    • /
    • pp.149-157
    • /
    • 2019
  • In case of reading passport using a smart phone in contrast with a dedicated passport reading system, MRZ(Machine Readable Zone) character recognition can be hard when the character strokes were broken, touched or blurred according to the lighting condition, and the position and size of MRZ character lines were varied due to the camera distance and angle. In this paper, the effective recognition algorithm of the passport MRZ information using a combined neural network recognizer of CNN(Convolutional Neural Network) and ANN( Artificial Neural Network), is proposed under the various sized and skewed passport images. The MRZ line detection using connected component analysis algorithm and the skew correction using perspective transform algorithm are also designed in order to achieve effective character segmentation results. Each of the MRZ field recognition results is verified by using five check digits for deciding whether retrying the recognition process of passport MRZ information or not. After we implement the proposed recognition algorithm of passport MRZ information, the excellent recognition performance of the passport MRZ information was obtained in the experimental results for PC off-line mode and smart phone on-line mode.

Robust Coronary Artery Segmentation in 2D X-ray Images using Local Patch-based Re-connection Methods (지역적 패치기반 보정기법을 활용한 2D X-ray 영상에서의 강인한 관상동맥 재연결 기법)

  • Han, Kyunghoon;Jeon, Byunghwan;Kim, Sekeun;Jang, Yeonggul;Jung, Sunghee;Shim, Hackjoon;Chang, Hyukjae
    • Journal of Broadcast Engineering
    • /
    • v.24 no.4
    • /
    • pp.592-601
    • /
    • 2019
  • For coronary procedures, X-ray angiogram images are useful for diagnosing and assisting procedures. It is challenging to accurately segment a coronary artery using only a single segmentation model in 2D X-ray images due to a complex structure of three-dimensional coronary artery, especially from phenomenon of vessels being broken in the middle or end of coronary artery. In order to solve these problems, the initial segmentation is performed using an existing single model, and the candidate regions for the sophisticate correction is estimated based on the initial segment, and the local patch-based correction is performed in the candidate regions. Through this research, not only the broken coronary arteries are re-connected, but also the distal part of coronary artery that is very thin is additionally correctly found. Further, the performance can be much improved by combining the proposed correction method with any existing coronary artery segmentation method. In this paper, the U-net, a fully convolutional network was chosen as a segmentation method and the proposed correction method was combined with U-net to demonstrate a significant improvement in performance through X-ray images from several patients.

Masked Face Recognition via a Combined SIFT and DLBP Features Trained in CNN Model

  • Aljarallah, Nahla Fahad;Uliyan, Diaa Mohammed
    • International Journal of Computer Science & Network Security
    • /
    • v.22 no.6
    • /
    • pp.319-331
    • /
    • 2022
  • The latest global COVID-19 pandemic has made the use of facial masks an important aspect of our lives. People are advised to cover their faces in public spaces to discourage illness from spreading. Using these face masks posed a significant concern about the exactness of the face identification method used to search and unlock telephones at the school/office. Many companies have already built the requisite data in-house to incorporate such a scheme, using face recognition as an authentication. Unfortunately, veiled faces hinder the detection and acknowledgment of these facial identity schemes and seek to invalidate the internal data collection. Biometric systems that use the face as authentication cause problems with detection or recognition (face or persons). In this research, a novel model has been developed to detect and recognize faces and persons for authentication using scale invariant features (SIFT) for the whole segmented face with an efficient local binary texture features (DLBP) in region of eyes in the masked face. The Fuzzy C means is utilized to segment the image. These mixed features are trained significantly in a convolution neural network (CNN) model. The main advantage of this model is that can detect and recognizing faces by assigning weights to the selected features aimed to grant or provoke permissions with high accuracy.

Improvement of Mask-RCNN Performance Using Deep-Learning-Based Arbitrary-Scale Super-Resolution Module (딥러닝 기반 임의적 스케일 초해상도 모듈을 이용한 Mask-RCNN 성능 향상)

  • Ahn, Young-Pill;Park, Hyun-Jun
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.26 no.3
    • /
    • pp.381-388
    • /
    • 2022
  • In instance segmentation, Mask-RCNN is mostly used as a base model. Increasing the performance of Mask-RCNN is meaningful because it affects the performance of the derived model. Mask-RCNN has a transform module for unifying size of input images. In this paper, to improve the Mask-RCNN, we apply deep-learning-based ASSR to the resizing part in the transform module and inject calculated scale information into the model using IM(Integration Module). The proposed IM improves instance segmentation performance by 2.5 AP higher than Mask-RCNN in the COCO dataset, and in the periment for optimizing the IM location, the best performance was shown when it was located in the 'Top' before FPN and backbone were combined. Therefore, the proposed method can improve the performance of models using Mask-RCNN as a base model.

Consumer Associative Network Analysis on Device and Service Convergence

  • Han, Sangman;Lee, Janghyuk;Park, Sun-Young;Jo, Woonghyeon
    • Asia Marketing Journal
    • /
    • v.15 no.3
    • /
    • pp.1-14
    • /
    • 2013
  • Our research brings managerial insights for developing new digital convergence of devices and services. To explain the phenomenon of device and service convergence, we combine two different approaches from separate research fields: a perceptual mapping technique generally used for segmentation in marketing and associative network analysis mobilized to understanding network structure of core and peripheral as well as the information mediating role of nodes in network science. By combining these two approaches, we provide an in-depth analysis of the associations among devices and services by assessing the centrality of device and service nodes in an associative network. This is done by examining the connections between these services and devices as well as investigating the role of mediation in the combined device-service associative network. Our results based on bi-partite network analysis of survey responses from 250 Internet Protocol (IP) television viewers show which device and which service will play the major role in future device and service convergence as well as which characteristics and functionalities have to be incorporated into future convergence. Among the devices, the mobile handset with the betweenness centrality of 0.26 appears to be the device that would lead future device convergence. Among the services, wireless broadband with the betweenness centrality of 0.276 appears to be the service on which future service convergence needs to be developed. This result is quite unexpected, since wireless broadband has a lower penetration rate than other services, such as fixed broadband and cable TV. In addition, we indicate the possibility of converging devices, such as personal digital assistant (PDA) and mobile handset, and services, such as IPTV and mobile Internet, into wireless broadband services in the future.

  • PDF

MPEG Video Segmentation using Two-stage Neural Networks and Hierarchical Frame Search (2단계 신경망과 계층적 프레임 탐색 방법을 이용한 MPEG 비디오 분할)

  • Kim, Joo-Min;Choi, Yeong-Woo;Chung, Ku-Sik
    • Journal of KIISE:Software and Applications
    • /
    • v.29 no.1_2
    • /
    • pp.114-125
    • /
    • 2002
  • In this paper, we are proposing a hierarchical segmentation method that first segments the video data into units of shots by detecting cut and dissolve, and then decides types of camera operations or object movements in each shot. In our previous work[1], each picture group is divided into one of the three detailed categories, Shot(in case of scene change), Move(in case of camera operation or object movement) and Static(in case of almost no change between images), by analysing DC(Direct Current) component of I(Intra) frame. In this process, we have designed two-stage hierarchical neural network with inputs of various multiple features combined. Then, the system detects the accurate shot position, types of camera operations or object movements by searching P(Predicted), B(Bi-directional) frames of the current picture group selectively and hierarchically. Also, the statistical distributions of macro block types in P or B frames are used for the accurate detection of cut position, and another neural network with inputs of macro block types and motion vectors method can reduce the processing time by using only DC coefficients of I frames without decoding and by searching P, B frames selectively and hierarchically. The proposed method classified the picture groups in the accuracy of 93.9-100.0% and the cuts in the accuracy of 96.1-100.0% with three different together is used to detect dissolve, types of camera operations and object movements. The proposed types of video data. Also, it classified the types of camera movements or object movements in the accuracy of 90.13% and 89.28% with two different types of video data.