• Title/Abstract/Keywords: Multi-level CNN

Search results: 21 items (processing time: 0.019 s)

Road Damage Detection and Classification based on Multi-level Feature Pyramids

  • Yin, Junru;Qu, Jiantao;Huang, Wei;Chen, Qiqiang
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 15, No. 2 / pp. 786-799 / 2021
  • Road damage detection is important for road maintenance. With the development of deep learning, a growing number of road damage detection methods have been proposed, such as Fast R-CNN, Faster R-CNN, Mask R-CNN and RetinaNet. However, because shallow and deep features cannot be extracted at the same time, existing methods do not perform well in detecting objects with few samples. In addition, these methods cannot obtain highly accurate bounding boxes. This paper presents a Multi-level Feature Pyramids method based on M2Det. Because the feature layer has a multi-scale, multi-level architecture, features that carry more information and are more distinctive can be extracted. Moreover, an attention mechanism is used to improve the accuracy of the local bounding boxes in the dataset. Experimental results show that the proposed method outperforms current state-of-the-art methods.

Implementation and Verification of Multi-level Convolutional Neural Network Algorithm for Identifying Unauthorized Image Files in the Military

  • 김영수
    • Journal of Korea Multimedia Society / Vol. 21, No. 8 / pp. 858-863 / 2018
  • In this paper, we propose and implement a multi-level convolutional neural network (CNN) algorithm to identify sexually explicit and lewd content in image files, and verify its effectiveness using unauthorized image files collected in an actual military setting. The proposed algorithm increases accuracy by applying CNNs in stages to minimize classification errors between similar categories. The experimental data consist of 20,005 images from the field, categorized into 6 authorized categories and 11 unauthorized categories. Experimental results show an overall detection rate of 99.51% for the image files. In particular, the proposed algorithm reduces the identification error rate between similar categories by 64.87% compared with a general CNN algorithm.

Convolutional Network with Densely Backward Attention for Facial Expression Recognition

  • 서현석;이승룡
    • Proceedings of the Korea Information Processing Society Conference / Korea Information Processing Society 2019 Fall Conference / pp. 958-961 / 2019
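  • With the advent of convolutional neural networks (CNNs), facial expression recognition research has advanced considerably. However, existing CNN approaches suffer from an attention-embedding problem in which pretrained models do not incorporate multi-level semantic context. Human facial emotions are observed through the movement and combination of various muscles, and the combined features produced by the deep layers of a CNN lose semantic information, such as class discrimination, through many subsampling steps, which makes it difficult to build a proper trained model through transfer learning. This paper therefore proposes a Densely Backward Attention (DBA) CNN method that achieves high recognition performance by integrating channel-wise attention over the multi-level features of the backbone network and incorporating semantic information. The proposed technique uses cross-channel semantic information from the high-level features to recalibrate the fine-grained semantic information in the low-level features. It then performs an additional step that integrates the multi-level data so that the description of important facial expressions is clearly included. In experiments, the proposed approach achieved an accuracy of 79.37%, demonstrating its effectiveness.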

Integration of Multi-scale CAM and Attention for Weakly Supervised Defects Localization on Surface Defective Apple

  • Nguyen Bui Ngoc Han;Ju Hwan Lee;Jin Young Kim
    • Smart Media Journal / Vol. 12, No. 9 / pp. 45-59 / 2023
  • Weakly supervised object localization (WSOL) is the task of localizing an object in an image using only image-level labels. Previous studies have followed the conventional class activation mapping (CAM) pipeline. However, we show that the current CAM approach has shortcomings that prevent the original CAM from capturing complete defect features. This work uses a convolutional neural network (CNN) pretrained on image-level labels to generate class activation maps in a multi-scale manner to highlight discriminative regions. Additionally, a pretrained vision transformer (ViT) is used to produce multi-head attention maps as an auxiliary detector. By integrating the CNN-based CAMs and attention maps, our approach localizes defective regions without requiring bounding box or pixel-level supervision during training. We evaluate our approach on a dataset of apple images with only image-level labels of defect categories. Experiments demonstrate that the proposed method performs comparably to several object detection models and holds promise for improving localization.

Identification of Steganographic Methods Using a Hierarchical CNN Structure

  • 강상훈;박한훈;박종일;김산해
    • Journal of the Institute of Convergence Signal Processing / Vol. 20, No. 4 / pp. 205-211 / 2019
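Steganalysis is a technique for detecting and recovering data hidden by steganography. Steganalysis methods find hidden data by analyzing the visual and statistical changes that occur when data are embedded. To recover the hidden data, one must know which steganographic method was used to hide it. This paper therefore proposes a hierarchical CNN structure that identifies the steganographic method applied to an input image through multi-level classification. To this end, four base CNNs were each trained as binary classifiers that decide either whether a steganographic method has been applied to the input image or which of two different steganographic methods has been applied, and the trained CNNs were then connected hierarchically. Experimental results confirm that the proposed hierarchical CNN structure can identify four different steganographic methods, LSB (Least Significant Bit Substitution), PVD (Pixel Value Difference), WOW (Wavelet Obtained Weights), and UNIWARD (Universal Wavelet Relative Distortion), with 79% accuracy.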
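
The hierarchical connection of binary classifiers described in this abstract can be illustrated with a minimal Python sketch. The tree layout here is an assumption, since the abstract does not state how the four CNNs are arranged: one classifier separates cover from stego images, one separates the wavelet-based methods (WOW, UNIWARD) from the others, and two leaf classifiers decide within each pair. The function name `identify_method` and the classifier parameter names are hypothetical, and each classifier is treated as an opaque callable standing in for a trained CNN.

```python
from typing import Any, Callable

# Each binary classifier stands in for one trained CNN (hypothetical interface):
# it takes an image and returns True or False.
BinaryClassifier = Callable[[Any], bool]


def identify_method(
    image: Any,
    is_stego: BinaryClassifier,
    is_wavelet_based: BinaryClassifier,
    is_pvd: BinaryClassifier,
    is_uniward: BinaryClassifier,
) -> str:
    """Walk an assumed decision tree of four binary classifiers."""
    # Level 1: has any steganographic method been applied at all?
    if not is_stego(image):
        return "cover"
    # Level 2: assumed grouping into wavelet-based methods and the rest.
    if is_wavelet_based(image):
        # Level 3a: choose between the two wavelet-based methods.
        return "UNIWARD" if is_uniward(image) else "WOW"
    # Level 3b: choose between the two remaining methods.
    return "PVD" if is_pvd(image) else "LSB"
```

Only three of the four classifiers run for any stego image, so an error at an upper level cannot be corrected further down; the reported 79% accuracy would reflect the whole chain rather than any single classifier.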

CNN-based Fast Split Mode Decision Algorithm for Versatile Video Coding (VVC) Inter Prediction

  • Yeo, Woon-Ha;Kim, Byung-Gyu
    • Journal of Multimedia Information System / Vol. 8, No. 3 / pp. 147-158 / 2021
  • Versatile Video Coding (VVC) is the latest video coding standard developed by the Joint Video Experts Team (JVET). VVC adopts the quadtree plus multi-type tree (QT+MTT) structure for coding unit (CU) partitioning, and its computational complexity is considerably high because of the brute-force search used for recursive rate-distortion (RD) optimization. In this paper, we aim to reduce the time complexity of the inter-picture prediction mode, since inter prediction accounts for a large portion of the total encoding time. The problem can be defined as classifying the split mode of each CU. To classify the split mode effectively, a novel convolutional neural network (CNN) architecture called multi-level tree CNN (MLT-CNN) is introduced. To boost classification performance, we use additional information, including inter-picture information, while training the CNN. The overall algorithm, including the MLT-CNN inference process, is implemented on VVC Test Model (VTM) 11.0. CUs of size 128×128 can be the inputs of the CNN. The sequences are encoded in the random access (RA) configuration with five QP values {22, 27, 32, 37, 42}. The experimental results show that the proposed algorithm reduces the computational complexity by 11.53% on average and by up to 26.14%, with an average increase of 1.01% in Bjøntegaard delta bit rate (BDBR). In particular, the proposed method performs better on the class A and class B sequences, reducing encoding time by 9.81% to 26.14% with a BDBR increase of 0.95% to 3.28%.

Potential Anomaly Separation and Archeological Site Localization Using Genetically Trained Multi-level Cellular Neural Networks

  • Bilgili, Erdem;Goknar, I. Cem;Albora, Ali Muhittin;Ucan, Osman Nuri
    • ETRI Journal / Vol. 27, No. 3 / pp. 294-303 / 2005
  • In this paper, a supervised algorithm for the evaluation of geophysical sites using a multi-level cellular neural network (ML-CNN) is introduced, developed, and applied to real data. ML-CNN is a stochastic image processing technique based on template optimization using neighborhood relationships of the pixels. The separation/enhancement and border detection performance of the proposed method is evaluated on several real applications. A genetic algorithm is used to optimize the CNN templates. The first application concerns the separation of potential field data of the Dumluca chromite region, which is one of the rich reserves of Turkey; in this context, the classical approach to the gravity anomaly separation method is one of the main problems in geophysics. The other application is the border detection of archeological ruins of the Hittite Empire in Turkey. The Hittite civilization sites located in the Sivas-Altinyayla region of Turkey are among the most important archeological sites in history, one reason among others being that written documentation was first produced by this civilization.

Automatic assessment of post-earthquake buildings based on multi-task deep learning with auxiliary tasks

  • Zhihang Li;Huamei Zhu;Mengqi Huang;Pengxuan Ji;Hongyu Huang;Qianbing Zhang
    • Smart Structures and Systems / Vol. 31, No. 4 / pp. 383-392 / 2023
  • Post-earthquake building condition assessment is crucial for subsequent rescue and remediation and can be automated by emerging computer vision and deep learning technologies. This study is based on an entry to the 2nd International Competition of Structural Health Monitoring (IC-SHM 2021). The task package includes five image segmentation objectives: defects (crack/spall/rebar exposure), structural component, and damage state. The structural component and damage state tasks are identified as the priority because they can form actionable decisions. A multi-task Convolutional Neural Network (CNN) is proposed to conduct these two major tasks simultaneously. The remaining three sub-tasks (spall/crack/rebar exposure) are incorporated as auxiliary tasks. By learning defect information (spall/crack/rebar exposure) at the same time, the multi-task CNN model outperforms its single-task counterparts in recognizing structural components and estimating damage states. In particular, pixel-level damage state estimation sees an mIoU (mean intersection over union) improvement from 0.5855 to 0.6374. For the defect detection tasks, rebar exposure is omitted because of its extremely biased sample distribution. The segmentation of crack and spall is automated by a single-task U-Net, with extra effort to resample the provided data. The segmentation of small objects (spall and crack) benefits from the resampling method, with a substantial IoU increase of nearly 10%.
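
For readers unfamiliar with the mIoU metric quoted in this abstract, the following is a minimal Python sketch of one common way to compute it for pixel-level labels. It assumes flattened label lists and averages only over classes that appear in the prediction or the ground truth; the competition's exact averaging convention is not given in the abstract, and the function name `mean_iou` is hypothetical.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Mean intersection over union across classes for flat per-pixel labels."""
    ious = []
    for c in range(num_classes):
        # Pixels where both prediction and ground truth are class c.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels where either prediction or ground truth is class c.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        # Skip classes absent from both (assumed convention).
        if union > 0:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0
```

Because each class contributes equally regardless of its pixel count, small classes such as cracks weigh as much as large ones, which is one reason resampling rare classes can move this metric noticeably.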

Activity Object Detection Based on Improved Faster R-CNN

  • Zhang, Ning;Feng, Yiran;Lee, Eung-Joo
    • Journal of Korea Multimedia Society / Vol. 24, No. 3 / pp. 416-422 / 2021
  • Large intra-class variation in human activity, high inter-class similarity, and viewpoint and occlusion problems make manual feature extraction difficult and keep the detection rate of human behavior low. To address these problems, this paper proposes an improved detection algorithm based on Faster R-CNN. It performs multi-object recognition and localization with a two-stage detection network and replaces the original feature extraction module with DenseNet, which can fuse multi-level feature information, increase network depth, and avoid vanishing gradients. In addition, the proposal merging strategy is improved with Soft-NMS, in which an attenuation function replaces the conventional NMS algorithm, thereby avoiding missed detections of adjacent or overlapping objects and improving detection accuracy when multiple objects are present. In experiments, the improved Faster R-CNN method achieves a target detection result of 84.7%, an improvement over the other methods compared, which indicates that the method has advantages for target recognition.
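
The Soft-NMS step mentioned in this abstract can be sketched in plain Python as follows. The abstract does not specify the attenuation function the authors designed, so this sketch assumes the Gaussian decay from the general Soft-NMS formulation; the parameter names `sigma` and `score_threshold` and their default values are illustrative assumptions, not values from the paper.

```python
import math
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def soft_nms(
    boxes: Sequence[Box],
    scores: Sequence[float],
    sigma: float = 0.5,             # assumed decay width
    score_threshold: float = 0.001  # assumed cutoff for discarding boxes
) -> List[Tuple[Box, float]]:
    """Return (box, score) pairs in selection order using Gaussian Soft-NMS."""
    candidates = list(zip(boxes, scores))
    kept = []
    while candidates:
        # Select the highest-scoring remaining box.
        best = max(range(len(candidates)), key=lambda i: candidates[i][1])
        best_box, best_score = candidates.pop(best)
        kept.append((best_box, best_score))
        # Decay the scores of overlapping boxes instead of deleting them.
        rescored = []
        for box, score in candidates:
            decayed = score * math.exp(-(iou(best_box, box) ** 2) / sigma)
            if decayed >= score_threshold:
                rescored.append((box, decayed))
        candidates = rescored
    return kept
```

Unlike conventional NMS, which removes every box whose overlap with the selected box exceeds a fixed threshold, this version only lowers the score of an overlapping box, so an adjacent object with a heavily overlapping box can still survive with reduced confidence.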

Weak-lensing Mass Reconstruction of Galaxy Clusters with Convolutional Neural Network

  • Hong, Sungwook E.;Park, Sangnam;Jee, M. James;Bak, Dongsu;Cha, Sangjun
    • The Bulletin of The Korean Astronomical Society / Vol. 45, No. 1 / pp. 49.4-50 / 2020
  • We introduce a novel method for reconstructing the projected matter distributions of galaxy clusters from weak-lensing (WL) data based on a convolutional neural network (CNN). We control the noise level of the galaxy shear catalog so that it mimics the typical properties of existing Subaru/Suprime-Cam WL observations of galaxy clusters. We find that our mass reconstruction, based on a multi-layered CNN whose architecture alternates convolution and trans-convolution filters, significantly outperforms traditional mass reconstruction methods.
