• Title/Summary/Keyword: Dual attention mechanism

Dual Attention Based Image Pyramid Network for Object Detection

  • Dong, Xiang;Li, Feng;Bai, Huihui;Zhao, Yao
    • KSII Transactions on Internet and Information Systems (TIIS) / v.15 no.12 / pp.4439-4455 / 2021
  • Compared with two-stage object detection algorithms, one-stage algorithms provide a better trade-off between real-time performance and accuracy. However, these methods treat the intermediate features equally and therefore lack the flexibility to emphasize meaningful information for classification and localization. In addition, they ignore the interaction of contextual information from different scales, which is important for detecting medium and small objects. To tackle these problems, we propose an image pyramid network based on a dual attention mechanism (DAIPNet), which builds an image pyramid to enrich the spatial information while emphasizing multi-scale informative features based on dual attention mechanisms for one-stage object detection. Our framework utilizes a pre-trained backbone as the standard detection network, where the designed image pyramid network (IPN) is used as an auxiliary network to provide complementary information. Here, the dual attention mechanism is composed of the adaptive feature fusion module (AFFM) and the progressive attention fusion module (PAFM). AFFM is designed to automatically pay attention to the feature maps with different importance from the backbone and auxiliary network, while PAFM is utilized to adaptively learn the channel attentive information in the context transfer process. Furthermore, in the IPN, we build an image pyramid to extract scale-wise features from downsampled images of different scales, where the features are further fused at different states to enrich scale-wise information and learn more comprehensive feature representations. Experimental results are reported on the MS COCO dataset. Our proposed detector with a 300 × 300 input achieves superior performance of 32.6% mAP on MS COCO test-dev compared with state-of-the-art methods.

Tobacco Retail License Recognition Based on Dual Attention Mechanism

  • Shan, Yuxiang;Ren, Qin;Wang, Cheng;Wang, Xiuhui
    • Journal of Information Processing Systems / v.18 no.4 / pp.480-488 / 2022
  • Images of tobacco retail licenses have complex unstructured characteristics, which poses an urgent technical problem for the robotic process automation of tobacco marketing. In this paper, a novel recognition approach using a dual attention mechanism is presented to realize automatic recognition and information extraction from such images. First, we utilized a DenseNet network to extract the license information from the input tobacco retail license data. Second, bi-directional long short-term memory was used for coding and decoding using a continuous decoder integrating dual attention to realize the recognition and information extraction of tobacco retail license images without segmentation. Finally, several performance experiments were conducted using a large-scale dataset of tobacco retail licenses. The experimental results show that the proposed approach achieves a correction accuracy of 98.36% on the ZY-LQ dataset, outperforming most existing methods.

Forecasting Crop Yield Using Encoder-Decoder Model with Attention (Attention 기반 Encoder-Decoder 모델을 활용한작물의 생산량 예측)

  • Kang, Sooram;Cho, Kyungchul;Na, MyungHwan
    • Journal of Korean Society for Quality Management / v.49 no.4 / pp.569-579 / 2021
  • Purpose: This study performs a time series analysis to predict the crop yield applicable to each farm, using environmental variables measured at smart farms cultivating tomatoes. In addition, it aims to confirm the influence of the environmental variables using a deep learning model that can be explained to some extent. Methods: A time series analysis was performed to predict production using environmental variables measured at 75 smart farms cultivating tomatoes in two periods. An LSTM-based encoder-decoder model was used for cases of several farms with similar length. In particular, a Dual Attention Mechanism was applied to use the environmental variables as exogenous variables and to confirm their influence. Results: As a result of the analysis, the Dual Attention LSTM with a window size of 12 weeks showed the best predictive power. The weights extracted from the prediction model verified that the environmental variables have similar effects on prediction, and that earlier time points have a greater effect than time points close to the prediction point. Conclusion: As an explainable model that compensates for the shortcomings of general deep learning models, the approach is expected to be applicable to various crops.
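The 12-week window mentioned in this abstract can be illustrated with a minimal sliding-window sketch. This is not the authors' preprocessing: the function name `make_windows`, the weekly sampling, and the choice to predict the yield of the week immediately after each window are all assumptions made for illustration.

```python
def make_windows(features, targets, window=12):
    """Build (input window, target) pairs for an encoder-decoder forecaster.

    features[t] is the list of environmental variables at week t and
    targets[t] is the yield at week t. Both layouts are hypothetical.
    """
    if len(features) != len(targets):
        raise ValueError("features and targets must have the same length")
    samples = []
    for end in range(window, len(features)):
        # Inputs: the `window` weeks before week `end`; label: yield at week `end`.
        samples.append((features[end - window:end], targets[end]))
    return samples


# Toy data: 15 weeks, one environmental variable per week.
weekly_features = [[float(t)] for t in range(15)]
weekly_yield = [float(t) for t in range(15)]
samples = make_windows(weekly_features, weekly_yield, window=12)
```

With 15 weeks of data and a 12-week window, the sketch yields three training samples; changing `window` is how different window sizes would be compared.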

Audio and Video Bimodal Emotion Recognition in Social Networks Based on Improved AlexNet Network and Attention Mechanism

  • Liu, Min;Tang, Jun
    • Journal of Information Processing Systems / v.17 no.4 / pp.754-771 / 2021
  • In the task of continuous dimension emotion recognition, the parts that highlight the emotional expression are not the same in each mode, and the influences of different modes on the emotional state are also different. Therefore, this paper studies the fusion of the two most important modes in emotion recognition (voice and visual expression) and proposes a dual-modal emotion recognition method that combines the attention mechanism with an improved AlexNet network. After simple preprocessing of the audio signal and the video signal, the first step is to use prior knowledge to extract audio features. Then, facial expression features are extracted by the improved AlexNet network. Finally, the multimodal attention mechanism is used to fuse the facial expression features and audio features, and the improved loss function is used to address the missing-modality problem, so as to improve the robustness of the model and the performance of emotion recognition. The experimental results show that the concordance correlation coefficients of the proposed model in the two dimensions of arousal and valence were 0.729 and 0.718, respectively, which are superior to those of several comparative algorithms.
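The concordance correlation coefficient reported above is commonly computed with Lin's formula, 2·cov(x, y) / (var(x) + var(y) + (mean(x) − mean(y))²). The sketch below assumes that standard definition with population (1/N) statistics; the abstract does not state which variant the authors used, and the function name is hypothetical.

```python
from statistics import fmean


def concordance_correlation(pred, gold):
    """Lin's concordance correlation coefficient for two equal-length sequences."""
    if len(pred) != len(gold) or len(pred) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_p, mean_g = fmean(pred), fmean(gold)
    # Population (1/N) variances and covariance.
    var_p = fmean((p - mean_p) ** 2 for p in pred)
    var_g = fmean((g - mean_g) ** 2 for g in gold)
    cov = fmean((p - mean_p) * (g - mean_g) for p, g in zip(pred, gold))
    denom = var_p + var_g + (mean_p - mean_g) ** 2
    if denom == 0:
        raise ValueError("undefined when both sequences are the same constant")
    return 2 * cov / denom


perfect = concordance_correlation([1, 2, 3], [1, 2, 3])   # identical sequences
shifted = concordance_correlation([2, 3, 4], [1, 2, 3])   # same shape, offset by 1
```

Unlike Pearson correlation, the coefficient penalizes a constant offset: the shifted example is perfectly correlated but scores 4/7 rather than 1.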

Boundary-Aware Dual Attention Guided Liver Segment Segmentation Model

  • Jia, Xibin;Qian, Chen;Yang, Zhenghan;Xu, Hui;Han, Xianjun;Ren, Hao;Wu, Xinru;Ma, Boyang;Yang, Dawei;Min, Hong
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.1 / pp.16-37 / 2022
  • Accurate liver segment segmentation based on radiological images is indispensable for the preoperative analysis of liver tumor resection surgery. However, most existing segmentation methods cannot be used directly for this task because of the challenge of exact edge prediction, with some tiny and slender vessels serving as the clinical segmentation criterion. To address this problem, we propose a novel deep learning based segmentation model, called the Boundary-Aware Dual Attention Liver Segment Segmentation Model (BADA). This model can improve the segmentation accuracy of liver segments by enhancing the edges, including the vessels serving as segment boundaries. In our model, a dual gated attention is proposed, which is composed of a spatial attention module and a semantic attention module. The spatial attention module enhances the weights of key edge regions by attending to salient intensity changes, while the semantic attention module amplifies the contribution of filters that can extract more discriminative feature information by weighting the significant convolution channels. In addition, we build a dataset of liver segments comprising 59 clinical cases with dynamic contrast-enhanced MRI (Magnetic Resonance Imaging) of the portal vein stage, annotated by several professional radiologists. Compared with several state-of-the-art methods and baseline segmentation methods, we achieve the best results on this clinical liver segment segmentation dataset, where Mean Dice, Mean Sensitivity and Mean Positive Predicted Value reach 89.01%, 87.71% and 90.67%, respectively.
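The three metrics named above follow from true-positive, false-positive and false-negative counts. The sketch below uses the standard definitions for a single binary mask; how the authors average over liver segments or cases to obtain the "Mean" values is not stated in the abstract, and returning 1.0 for an empty denominator is an assumed convention.

```python
def segmentation_scores(pred, truth):
    """Dice, sensitivity and positive predictive value for flat 0/1 masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same length")
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)
    # Assumed convention: a metric with an empty denominator counts as 1.0.
    dice = 2 * tp / (2 * tp + fp + fn) if (tp + fp + fn) else 1.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 1.0
    ppv = tp / (tp + fp) if (tp + fp) else 1.0
    return {"dice": dice, "sensitivity": sensitivity, "ppv": ppv}


# Toy masks: 2 true positives, 1 false positive, 1 false negative.
scores = segmentation_scores([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
```

Sensitivity falls when the model misses foreground voxels, while positive predictive value falls when it over-segments; Dice balances the two.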

Two-Dimensional Attention-Based LSTM Model for Stock Index Prediction

  • Yu, Yeonguk;Kim, Yoon-Joong
    • Journal of Information Processing Systems / v.15 no.5 / pp.1231-1242 / 2019
  • This paper presents a two-dimensional attention-based long short-term memory (2D-ALSTM) model for stock index prediction, incorporating input attention and temporal attention mechanisms for weighting important stocks and important time steps, respectively. The proposed model is designed to overcome the long-term dependency, stock selection, and stock volatility delay problems that negatively affect existing models. The 2D-ALSTM model is validated in a comparative experiment involving two attention-based models, multi-input LSTM (MI-LSTM) and dual-stage attention-based recurrent neural network (DARNN), with real stock data being used for training and evaluation. The model achieves superior performance compared to MI-LSTM and DARNN for stock index prediction on a KOSPI100 dataset.
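The idea of weighting along two axes, inputs (stocks) and time steps, can be sketched with two softmax passes. This is a simplified illustration, not the 2D-ALSTM architecture: the LSTM layers are omitted, and the raw scores, which a real model would learn, are passed in as hypothetical arguments.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of raw scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def two_axis_attention(series, input_scores, temporal_scores):
    """Weight series[t][n] over inputs n, then pool over time steps t.

    input_scores[t][n] and temporal_scores[t] are hypothetical raw scores;
    in a trained model they would come from learned scoring networks.
    """
    weighted_steps = []
    for row, scores in zip(series, input_scores):
        alpha = softmax(scores)  # input attention: weights over inputs at step t
        weighted_steps.append([a * x for a, x in zip(alpha, row)])
    beta = softmax(temporal_scores)  # temporal attention: weights over steps
    n_inputs = len(series[0])
    return [
        sum(b * step[n] for b, step in zip(beta, weighted_steps))
        for n in range(n_inputs)
    ]


# Equal scores give uniform weights on both axes.
context = two_axis_attention(
    [[2.0, 4.0], [6.0, 8.0]],
    [[0.0, 0.0], [0.0, 0.0]],
    [0.0, 0.0],
)
```

Raising one stock's score shifts weight toward it at that step, and raising one step's temporal score makes that step dominate the pooled context.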

A Domain-independent Dual-image based Robust Reversible Watermarking

  • Guo, Xuejing;Fang, Yixiang;Wang, Junxiang;Zeng, Wenchao;Zhao, Yi;Zhang, Tianzhu;Shi, Yun-Qing
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.12 / pp.4024-4041 / 2022
  • Robust reversible watermarking has attracted widespread attention in the field of information hiding in recent years. It should not only be robust against attacks in transmission but also remain reversible under distortion-free transmission. To the best of our knowledge, the most recent robust reversible watermarking methods adopt a single image as the carrier, which might lead to low efficiency in terms of carrier utilization. To address this issue, a novel dual-image robust reversible watermarking framework is proposed in this paper to effectively utilize the correlation between both carriers (namely dual images) and thus improve the efficiency of carrier utilization. In the dual-image robust reversible watermarking framework, a two-layer robust watermarking mechanism is designed to further improve the algorithm performance, i.e., embedding capacity and robustness. In addition, an optimization model is built to determine the parameters. Finally, the proposed framework is applied in different domains (namely domain-independent), i.e., the Slantlet Transform and Singular Value Decomposition domain and Zernike moments, respectively, to demonstrate its effectiveness and generality. Experimental results demonstrate the superiority of the proposed dual-image robust reversible watermarking framework.

Comparison and Analysis of the Attention Mechanism for Stock Prediction (주가 예측을 위한 어텐션 메커니즘의 비교분석)

  • Yu, Yeonguk;Cheon, Yongsang;Cho, Min-Hee;Kim, Yoon-Joong
    • Proceedings of the Korea Information Processing Society Conference / 2019.10a / pp.844-847 / 2019
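  • Stock price prediction attracts considerable attention because of its commercial appeal, but it is a difficult task owing to the uncertainty and volatility of stock prices. Recent studies have applied attention mechanisms to stock price prediction models and achieved good performance by mitigating the performance degradation caused by using many input factors. In this study, we vary the attention mechanism of one such model, the Dual-Stage Attention-Based Recurrent Neural Network (DARNN), to determine which attention mechanism is best suited to stock price prediction. Prediction experiments on the KOSPI100 index confirmed that the attention mechanism using the location score function achieved the best performance, about 10% better than DARNN with its original score function, showing that the score function has an important influence on the model.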

The Effect of Personalized Product Recommendation Service of Online Fashion Shopping Mall on Service Use Behaviors through Cognitive Attitude and Emotional Attachment (온라인 패션쇼핑몰의 개인 상품 추천서비스가 인지적 태도와 감정적 애착을 통해 서비스 사용행동에 미치는 영향)

  • Choi, Mi Young
    • Fashion & Textile Research Journal / v.23 no.5 / pp.586-597 / 2021
  • Personalized product recommendation service is receiving attention as a new marketing strategy while supporting consumer information search and purchasing decisions. This study attempted to verify the effect of self-reference on service use behavior through the dual path of cognitive attitude and emotional attachment. Using convenience sampling, an online survey was conducted with 324 women who were in their 20s and 30s. After collecting and compiling the survey data, the reliability and validity of variables constituting the conceptual research model were verified through confirmatory factor analysis using AMOS 22.0. Next, the significance of sequentially mediated pathways was verified using Process 3.5 Model 80. The results showed that self-referencing not only significantly affects service use intention by simply mediating cognitive attitudes but also sequentially mediates cognitive attitudes and additional information search. Furthermore, self-referencing was significant as an indirect path to service use intention by mediating additional information search. However, in the path mediated by emotional attachment, self-referencing was considered as a simple mediated path leading to service usage intention. These results indicate a dual path in the psychological mechanism, through cognitive and emotional evaluation, that prompts consumer behavioral responses to the personalized product information provided in the shopping process.

A dual path encoder-decoder network for placental vessel segmentation in fetoscopic surgery

  • Yunbo Rao;Tian Tan;Shaoning Zeng;Zhanglin Chen;Jihong Sun
    • KSII Transactions on Internet and Information Systems (TIIS) / v.18 no.1 / pp.15-29 / 2024
  • A fetoscope is an optical endoscope, which is often applied in fetoscopic laser photocoagulation to treat twin-to-twin transfusion syndrome. In an operation, the clinician needs to observe the abnormal placental vessels through the endoscope, so as to guide the operation. However, low-quality imaging and narrow field of view of the fetoscope increase the difficulty of the operation. Introducing an accurate placental vessel segmentation of fetoscopic images can assist the fetoscopic laser photocoagulation and help identify the abnormal vessels. This study proposes a method to solve the above problems. A novel encoder-decoder network with a dual-path structure is proposed to segment the placental vessels in fetoscopic images. In particular, we introduce a channel attention mechanism and a continuous convolution structure to obtain multi-scale features with their weights. Moreover, a switching connection is inserted between the corresponding blocks of the two paths to strengthen their relationship. According to the results of a set of blood vessel segmentation experiments conducted on a public fetoscopic image dataset, our method has achieved higher scores than the current mainstream segmentation methods, raising the dice similarity coefficient, intersection over union, and pixel accuracy by 5.80%, 8.39% and 0.62%, respectively.
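For reference, the three metrics named in this abstract have the following standard definitions, with P the predicted vessel mask, G the ground-truth mask, and TP, TN, FP, FN the pixel-level confusion counts. These are the usual textbook forms and are assumed here; the abstract does not spell out the exact variants used.

```latex
\[
\mathrm{Dice}(P,G)=\frac{2\,|P\cap G|}{|P|+|G|},\qquad
\mathrm{IoU}(P,G)=\frac{|P\cap G|}{|P\cup G|},\qquad
\mathrm{PA}=\frac{\mathrm{TP}+\mathrm{TN}}{\mathrm{TP}+\mathrm{TN}+\mathrm{FP}+\mathrm{FN}}
\]
% Since |P| + |G| = |P \cup G| + |P \cap G|, Dice and IoU determine each other:
\[
\mathrm{Dice}=\frac{2\,\mathrm{IoU}}{1+\mathrm{IoU}}
\]
```

Because Dice is a monotonic function of IoU, the two always rank a pair of masks the same way, whereas pixel accuracy also counts true-negative background pixels and so tends to change less on images where vessels occupy a small area.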