• Title/Abstract/Keywords: Spatial attention mechanism

40 search results (processing time: 0.019 s)

Boundary and Reverse Attention Module for Lung Nodule Segmentation in CT Images

  • 황경연;지예원;윤학영;이상준
    • 대한임베디드공학회논문지 / Vol. 17, No. 5 / pp. 265-272 / 2022
  • As the risk of lung cancer has increased, early-stage detection and treatment of cancers have received a lot of attention. Among various medical imaging approaches, computed tomography (CT) has been widely used to examine the size and growth rate of lung nodules. However, manual examination is time-consuming and causes physical and mental fatigue for medical professionals. Recently, many computer-aided diagnostic methods have been proposed to reduce their workload. In recent studies, encoder-decoder architectures have shown reliable performance in medical image segmentation, and they have been adopted to predict lesion candidates. However, localizing nodules in lung CT images is challenging because nodules are extremely small and have unstructured shapes. To solve these problems, we add atrous spatial pyramid pooling (ASPP) to a general U-Net baseline model to minimize the loss of information and extract rich representations from various receptive fields. Moreover, we propose a mixed attention mechanism that combines reverse attention, boundary attention, and the convolutional block attention module (CBAM) to improve the segmentation accuracy for small nodules of various shapes. The performance of the proposed model is compared with several previous attention mechanisms on the LIDC-IDRI dataset, and experimental results demonstrate that the combined reverse, boundary, and CBAM attention (RB-CBAM) is effective in segmenting small nodules.
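
The ASPP idea mentioned in this abstract, sampling the same input at several dilation rates so that each branch sees a different receptive field, can be illustrated with a minimal one-dimensional sketch. This is not the authors' implementation: the function names, the shared all-ones kernel, and the rates (1, 2, 4) are assumptions chosen for illustration, and a real ASPP block uses learned 2D kernels per branch and fuses the branches afterwards.

```python
def receptive_field(kernel_size, rate):
    """Number of input samples spanned by one atrous (dilated) kernel."""
    return (kernel_size - 1) * rate + 1


def atrous_conv1d(signal, kernel, rate):
    """Same-size 1D atrous convolution with zero padding.

    Assumes an odd-length kernel; taps are spaced `rate` samples apart.
    """
    half = (len(kernel) - 1) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + (k - half) * rate
            if 0 <= j < len(signal):  # positions outside the signal count as zero
                acc += w * signal[j]
        out.append(acc)
    return out


def aspp_1d(signal, kernel, rates):
    """Apply the same kernel at several rates and keep every branch."""
    return {rate: atrous_conv1d(signal, kernel, rate) for rate in rates}


# A unit impulse shows where each branch "looks".
signal = [0.0] * 4 + [1.0] + [0.0] * 4
branches = aspp_1d(signal, kernel=[1.0, 1.0, 1.0], rates=(1, 2, 4))
for rate, response in branches.items():
    print(rate, receptive_field(3, rate), response)
```

With a three-tap kernel, the branches span 3, 5, and 9 input samples respectively, which is how a pyramid of rates gathers context at several scales without downsampling.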

Main Cause of the Interference between Visual Search and Spatial Working Memory Task

  • 안지원;김민식
    • 인지과학 / Vol. 16, No. 3 / pp. 155-174 / 2005
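  • Recent studies have reported that when a visual search task is performed concurrently with a spatial working memory task, both the efficiency of visual search and the accuracy of the working memory task decrease (Oh & Kim, 2004; Woodman & Luck, 2004). These results can be interpreted as showing that the two tasks require the same cognitive resource, which may be spatial attention (the spatial attention load hypothesis), spatial working memory (the spatial working memory load hypothesis), or both. To identify the mechanism underlying the mutual interference between visual search and spatial working memory, we conducted two experiments that varied the locations to be maintained in working memory and the locations of the visual search stimuli, which require spatial attention. Experiment 1 reconfirmed previous findings by showing that interference between the two tasks occurs even when the stimuli for the spatial working memory task are presented in the surrounding region in which search stimuli can appear. Experiment 2 compared visual search and working memory performance between a condition in which the memory stimuli and search stimuli were all presented in the same quadrant and a condition in which they were not. Visual search efficiency was significantly lower in the different-location condition than in both the search-only condition and the same-location condition. Accuracy on the spatial working memory task was also lower in the different-location condition than in the other conditions. These results suggest that the mutual interference between spatial memory and visual search found in previous studies is due to an overload of spatial attention rather than an overload of working memory.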


Dual Attention Based Image Pyramid Network for Object Detection

  • Dong, Xiang;Li, Feng;Bai, Huihui;Zhao, Yao
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 15, No. 12 / pp. 4439-4455 / 2021
  • Compared with two-stage object detection algorithms, one-stage algorithms provide a better trade-off between real-time performance and accuracy. However, these methods treat the intermediate features equally and thus lack the flexibility to emphasize information that is meaningful for classification and localization. In addition, they ignore the interaction of contextual information from different scales, which is important for detecting medium and small objects. To tackle these problems, we propose an image pyramid network based on a dual attention mechanism (DAIPNet), which builds an image pyramid to enrich the spatial information while emphasizing multi-scale informative features based on dual attention mechanisms for one-stage object detection. Our framework uses a pre-trained backbone as the standard detection network, and the designed image pyramid network (IPN) serves as an auxiliary network that provides complementary information. Here, the dual attention mechanism is composed of the adaptive feature fusion module (AFFM) and the progressive attention fusion module (PAFM). AFFM is designed to automatically pay attention to the feature maps with different importance from the backbone and auxiliary network, while PAFM is used to adaptively learn the channel attentive information in the context transfer process. Furthermore, in the IPN, we build an image pyramid to extract scale-wise features from downsampled images of different scales, where the features are further fused at different states to enrich scale-wise information and learn more comprehensive feature representations. Experimental results are reported on the MS COCO dataset. With a 300 × 300 input, our proposed detector achieves 32.6% mAP on MS COCO test-dev, which is superior to state-of-the-art methods.

Region of Interest Detection Based on Visual Attention and Threshold Segmentation in High Spatial Resolution Remote Sensing Images

  • Zhang, Libao;Li, Hao
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 7, No. 8 / pp. 1843-1859 / 2013
  • The continuous increase in the spatial resolution of remote sensing images poses a great challenge to image analysis and processing. Traditional prior knowledge-based region detection and target recognition algorithms for processing high resolution remote sensing images generally employ a global searching solution, which results in prohibitive computational complexity. In this paper, a more efficient region of interest (ROI) detection algorithm based on visual attention and threshold segmentation (VA-TS) is proposed, wherein a visual attention mechanism is used to avoid applying image segmentation and feature detection to the entire image. The input image is subsampled to decrease the amount of data, and the discrete moment transform (DMT) feature is extracted to provide a finer description of the edges. The feature maps are combined with weights according to the number of "strong points" and "salient points". A threshold segmentation strategy is employed to obtain more accurate ROI shape information with very low computational complexity. Experimental statistics show that the proposed algorithm is computationally efficient and provides more visually accurate detection results. The calculation time is only about 0.7% of that of the traditional Itti model.
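
The final step described in this abstract, turning a saliency map into an ROI mask by thresholding, can be sketched as follows. The abstract does not say which thresholding rule VA-TS uses, so Otsu's method is substituted here purely as an assumption, and the small 8-bit saliency map is made-up illustrative data.

```python
def otsu_threshold(values, levels=256):
    """Return the gray level t that maximizes between-class variance.

    Otsu's rule is a stand-in here; the abstract does not name its rule.
    """
    hist = [0] * levels
    for v in values:
        hist[v] += 1
    total = len(values)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels at or below t
    sum0 = 0  # sum of their gray levels
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue  # no background pixels yet
        if w1 == 0:
            break     # no foreground pixels left
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        between = w0 * w1 * (mu0 - mu1) ** 2
        if between > best_var:
            best_t, best_var = t, between
    return best_t


# Hypothetical saliency map: a bright salient region on a dark background.
saliency = [
    [10, 12, 11, 200],
    [9, 210, 205, 198],
    [11, 10, 12, 13],
]
flat = [v for row in saliency for v in row]
t = otsu_threshold(flat)
roi_mask = [[1 if v > t else 0 for v in row] for row in saliency]
print(t, roi_mask)
```

Pixels brighter than the threshold form the ROI. Because the threshold is computed from a histogram in a single pass, this step stays cheap, which is consistent with the low computational cost the abstract emphasizes.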

A New Residual Attention Network based on Attention Models for Human Action Recognition in Video

  • Kim, Jee-Hyun;Cho, Young-Im
    • 한국컴퓨터정보학회논문지 / Vol. 25, No. 1 / pp. 55-61 / 2020
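  • Owing to advances in deep learning and improvements in computing power, video-based research has recently attracted considerable attention. The biggest difference between video data and image data is that video data contains a large amount of temporal and spatial information. Because of the large amount of data contained in video, action recognition is one of the important research topics in computer vision, yet recognizing human actions in dynamic environments such as video is a very complex and challenging task. Building on various studies of humans, artificial intelligence research has found that a human-like attention mechanism is an efficient recognition model. This efficient model is well suited to processing image information and complex continuous video information. Against this background, this paper first focuses on human actions and then introduces the attention mechanism into video action recognition in order to recognize human actions in video efficiently. Specifically, we propose a new 3D residual attention network that uses a convolutional neural network based on two attention mechanisms to identify human actions in video. In the evaluation, the proposed model achieved an accuracy of up to about 90.7%.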

Skin Lesion Segmentation with Codec Structure Based Upper and Lower Layer Feature Fusion Mechanism

  • Yang, Cheng;Lu, GuanMing
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 16, No. 1 / pp. 60-79 / 2022
  • Segmentation models based on the U-Net architecture have attained remarkable performance in numerous medical image segmentation tasks such as skin lesion segmentation. Nevertheless, as the network deepens, the resolution gradually decreases and the loss of spatial information increases. Fusing only adjacent layers is not enough to make up for the lost spatial information, which leads to errors at the segmentation boundary and lowers segmentation accuracy. To tackle this issue, we propose a new deep learning-based segmentation model. In the decoding stage, the feature channels of each decoding unit are concatenated with all the feature channels of the upper coding unit. This ensures segmentation quality by integrating spatial and semantic information, and combining the atrous spatial pyramid pooling (ASPP) module with the channel attention module (CAM) improves the robustness and generalization of our model. Extensive experiments on the ISIC2016 and ISIC2017 datasets show that our model performs well and outperforms the compared segmentation models for skin lesion segmentation.

Boundary-Aware Dual Attention Guided Liver Segment Segmentation Model

  • Jia, Xibin;Qian, Chen;Yang, Zhenghan;Xu, Hui;Han, Xianjun;Ren, Hao;Wu, Xinru;Ma, Boyang;Yang, Dawei;Min, Hong
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 16, No. 1 / pp. 16-37 / 2022
  • Accurate liver segment segmentation based on radiological images is indispensable for the preoperative analysis of liver tumor resection surgery. However, most existing segmentation methods cannot be applied directly to this task because the clinical segmentation criterion relies on tiny and slender vessels, which makes exact edge prediction challenging. To address this problem, we propose a novel deep learning-based segmentation model called the Boundary-Aware Dual Attention Liver Segment Segmentation Model (BADA). This model improves the segmentation accuracy of liver segments by enhancing the edges, including the vessels that serve as segment boundaries. In our model, we propose dual gated attention, which consists of a spatial attention module and a semantic attention module. The spatial attention module increases the weights of key edge regions by attending to salient intensity changes, while the semantic attention module amplifies the contribution of filters that extract more discriminative feature information by weighting the significant convolution channels. In addition, we build a dataset of liver segments comprising 59 clinical cases of dynamic contrast-enhanced magnetic resonance imaging (MRI) at the portal venous phase, annotated by several professional radiologists. Compared with several state-of-the-art methods and baseline segmentation methods, our model achieves the best results on this clinical liver segment segmentation dataset, where Mean Dice, Mean Sensitivity, and Mean Positive Predictive Value reach 89.01%, 87.71%, and 90.67%, respectively.
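
The three metrics reported in this abstract have standard definitions in terms of true positives (TP), false positives (FP), and false negatives (FN): Dice = 2TP / (2TP + FP + FN), sensitivity = TP / (TP + FN), and positive predictive value = TP / (TP + FP). The sketch below computes them for one flattened binary mask. The function name and the toy masks are illustrative assumptions, and the "Mean" variants in the abstract are presumably averages of these values over segments or cases.

```python
def segmentation_scores(pred, truth):
    """Dice, sensitivity, and positive predictive value for binary masks.

    `pred` and `truth` are equal-length sequences of 0/1 labels.
    Returns None for a metric whose denominator is zero.
    """
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)

    def ratio(num, den):
        return num / den if den else None

    return {
        "dice": ratio(2 * tp, 2 * tp + fp + fn),
        "sensitivity": ratio(tp, tp + fn),
        "ppv": ratio(tp, tp + fp),
    }


# Toy example: 3 true positives, 1 false positive, 1 false negative.
pred = [1, 1, 1, 0, 0, 0, 1, 0]
truth = [1, 1, 0, 0, 0, 1, 1, 0]
scores = segmentation_scores(pred, truth)
print(scores)
```

Sensitivity penalizes missed foreground while positive predictive value penalizes spurious foreground, so reporting both alongside Dice shows whether a model errs toward under-segmentation or over-segmentation.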

Real Scene Text Image Super-Resolution Based on Multi-Scale and Attention Fusion

  • Xinhua Lu;Haihai Wei;Li Ma;Qingji Xue;Yonghui Fu
    • Journal of Information Processing Systems / Vol. 19, No. 4 / pp. 427-438 / 2023
  • Many studies have indicated that single image super-resolution (SISR) models trained on synthetic datasets are difficult to apply to real scene text image super-resolution (STISR) because of its more complex degradation. The most recent dataset for realistic STISR is TextZoom, but current methods trained on this dataset have not considered the effect of multi-scale features of text images. In this paper, a multi-scale and attention fusion model for realistic STISR is proposed. A multi-scale learning mechanism is introduced to acquire sophisticated feature representations of text images. Spatial and channel attention are introduced to capture the local information and inter-channel interaction information of text images. Finally, this paper designs a multi-scale residual attention module by fusing multi-scale learning and attention mechanisms. Experiments on TextZoom demonstrate that the proposed model increases the average recognition accuracy of the scene text recognizer ASTER by 1.2% compared with the text super-resolution network.

DATCN: Deep Attention fused Temporal Convolution Network for the prediction of monitoring indicators in the tunnel

  • Bowen, Du;Zhixin, Zhang;Junchen, Ye;Xuyan, Tan;Wentao, Li;Weizhong, Chen
    • Smart Structures and Systems / Vol. 30, No. 6 / pp. 601-612 / 2022
  • Predicting structural mechanical behavior is vitally important for perceiving abnormal conditions early and avoiding disasters. In underground engineering especially, complex geological conditions make structures more prone to disasters. To address problems in previous studies, such as incomplete consideration of influencing factors and the ability to predict only continuous performance, this paper proposes the deep attention fused temporal convolution network (DATCN) to predict the spatial mechanical behavior of a structure. The model integrates both temporal and spatial effects and realizes cross-time prediction. The temporal convolution network (TCN) and the self-attention mechanism are employed to learn the temporal correlation of each monitoring point and the spatial correlation among different points, respectively. The predictions obtained from DATCN are then compared with those obtained from several classical baselines, including SVR, LR, MLP, and RNNs. The parameters involved in DATCN are also discussed to optimize its prediction ability. The results demonstrate that the proposed DATCN model outperforms the state-of-the-art baselines. The prediction accuracy of the DATCN model after 24 hours reaches 90 percent. In addition, the performance in the last 14 hours plays a dominant role in predicting the short-term behavior of the structure. As a case study, the proposed model is applied to an underwater shield tunnel to predict the spatial stress variation of concrete segments.
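
The abstract says self-attention is used to learn the spatial correlation among monitoring points. A minimal sketch of scaled dot-product self-attention over a handful of points is given below. It is not the DATCN implementation: the learned query, key, and value projections are omitted (each point's feature vector is used directly), and the three two-dimensional feature vectors are hypothetical values.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(points):
    """Scaled dot-product self-attention with queries = keys = values = points.

    Returns (weights, mixed): weights[i][j] is how much point i attends to
    point j, and mixed[i] is the attention-weighted average of all points.
    """
    dim = len(points[0])
    scale = math.sqrt(dim)
    weights = []
    for query in points:
        scores = [sum(a * b for a, b in zip(query, key)) / scale for key in points]
        weights.append(softmax(scores))
    mixed = [
        [sum(w * value[c] for w, value in zip(row, points)) for c in range(dim)]
        for row in weights
    ]
    return weights, mixed


# Hypothetical features for three monitoring points; the first two are similar.
points = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
weights, mixed = self_attention(points)
for row in weights:
    print([round(w, 3) for w in row])
```

Each row of the weight matrix sums to one, and points with similar features receive larger mutual weights, which is the sense in which attention can capture correlation among spatially distributed sensors.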

Local Climate Mediates Spatial and Temporal Variation in Carabid Beetle Communities on Hyangnobong, Korea

  • Park, Yong Hwan;Jang, Tae Woong;Jeong, Jong Cheol;Chae, Hee Mun;Kim, Jong Kuk
    • Journal of Forest and Environmental Science / Vol. 33, No. 3 / pp. 161-171 / 2017
  • Global environmental changes can dramatically alter floral and faunal composition, and elucidating the underlying mechanism is important for predicting the outcomes. Studies on global climate change have traditionally focused on statistical summaries over relatively wide spatial and temporal scales, and less attention has been paid to variability in microclimates across spatial and temporal scales. Microclimate is a suite of climatic conditions measured in local areas near the earth's surface. Environmental variables at the microclimatic scale can be critical for the ecology of the organisms inhabiting those areas. Here we examine the effect of spatial and temporal changes in microclimates on spatial and temporal changes in carabid beetle communities on Hyangnobong, Korea. We found that climatic variables and the patterns of annual change in carabid beetle communities differed among sites even within a single mountain system. Our results indicate the importance of temporal surveys of communities at local scales, which are expected to reveal an additional fraction of variation in communities and underlying processes that has been overlooked in studies of global community patterns and changes.