Search | Korea Science

Further Optimize MobileNetV2 with Channel-wise Squeeze and Excitation (채널간 압축과 해제를 통한 MobileNetV2 최적화)

Park, Jinho;Kim, Wonjun
- Proceedings of the Korean Society of Broadcast Engineers Conference / fall / pp.154-156 / 2021
Depth-wise separable convolution is well known as a powerful and effective alternative to standard convolution in environments with limited computing resources.[1] MobileNetV2 introduces the inverted residual block. Depth-wise separable convolution loses the opportunity to combine data across channels into new features, and the inverted residual block overcomes this loss by placing point-wise convolutions (1×1 convolutions) at both ends of the depth-wise separable convolution.[1] However, the cost of a 1×1 convolution depends on the number of channels, so as the network grows deeper it becomes a bottleneck to building an efficient, lightweight network. This paper addresses the bottleneck by partially replacing the 1×1 convolutions with a channel-wise squeeze and excitation (CSE) block.
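The channel dependence described in this abstract can be illustrated with a rough weight count. The sketch below is not taken from the paper: the function names are hypothetical, biases are ignored, and a 3×3 depth-wise kernel is assumed. It only shows that, inside a depth-wise separable convolution, the 1×1 point-wise part accounts for a growing share of the weights as the channel count increases.

```python
# Illustrative weight counts only (biases ignored).
# All function names are hypothetical and not from the paper.

def standard_conv_params(c_in, c_out, k=3):
    """Weights in a standard k x k convolution."""
    return k * k * c_in * c_out


def depthwise_params(c_in, k=3):
    """Weights in a depth-wise k x k convolution (one filter per channel)."""
    return k * k * c_in


def pointwise_params(c_in, c_out):
    """Weights in a 1x1 (point-wise) convolution."""
    return c_in * c_out


def depthwise_separable_params(c_in, c_out, k=3):
    """Depth-wise convolution followed by a point-wise convolution."""
    return depthwise_params(c_in, k) + pointwise_params(c_in, c_out)


def pointwise_share(c_in, c_out, k=3):
    """Fraction of the separable convolution's weights in the 1x1 part."""
    return pointwise_params(c_in, c_out) / depthwise_separable_params(c_in, c_out, k)


if __name__ == "__main__":
    for c in (32, 128, 512):
        print(
            c,
            standard_conv_params(c, c),
            depthwise_separable_params(c, c),
            round(pointwise_share(c, c), 3),
        )
```

The depth-wise term grows linearly with the channel count while the point-wise term grows with the product of input and output channels, which is why wider layers are dominated by the 1×1 convolution.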

FGW-FER: Lightweight Facial Expression Recognition with Attention

Huy-Hoang Dinh;Hong-Quan Do;Trung-Tung Doan;Cuong Le;Ngo Xuan Bach;Tu Minh Phuong;Viet-Vu Vu
- KSII Transactions on Internet and Information Systems (TIIS) / v.17 no.9 / pp.2505-2528 / 2023
The field of facial expression recognition (FER) has been actively researched to improve human-computer interaction. In recent years, deep learning techniques have gained popularity for addressing FER, with numerous studies proposing end-to-end frameworks that stack or widen significant convolutional neural network layers. While this has led to improved performance, it has also resulted in larger model sizes and longer inference times. To overcome this challenge, our work introduces a novel lightweight model architecture. The architecture incorporates three key factors: Depth-wise Separable Convolution, Residual Block, and Attention Modules. By doing so, we aim to strike a balance between model size, inference speed, and accuracy in FER tasks. Through extensive experimentation on popular benchmark FER datasets, our proposed method has demonstrated promising results. Notably, it stands out due to its substantial reduction in parameter count and faster inference time, while maintaining accuracy levels comparable to other lightweight models discussed in the existing literature.
https://doi.org/10.3837/tiis.2023.09.011

High-Speed Transformer for Panoptic Segmentation

Baek, Jong-Hyeon;Kim, Dae-Hyun;Lee, Hee-Kyung;Choo, Hyon-Gon;Koh, Yeong Jun
- Journal of Broadcast Engineering / v.27 no.7 / pp.1011-1020 / 2022
Recent high-performance panoptic segmentation models are based on transformer architectures. However, transformer-based panoptic segmentation methods are generally slower than convolution-based methods, since the attention mechanism in the transformer has quadratic complexity with respect to image resolution. The sine and cosine computation for positional embedding in the transformer is also a bottleneck for computation time. To address these problems, we adopt three modules to speed up inference of transformer-based panoptic segmentation. First, we perform channel-level reduction using depth-wise separable convolution for the inputs of the transformer decoder. Second, we replace sine- and cosine-based positional encoding with convolution operations, called conv-embedding. Third, we apply separable self-attention to the transformer encoder to reduce the complexity from quadratic to linear in the number of image pixels. As a result, the proposed model achieves 44% higher frames per second than the baseline on the ADE20K panoptic validation dataset when all three modules are used.
https://doi.org/10.5909/JBE.2022.27.7.1011
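The quadratic-versus-linear contrast mentioned in this abstract can be made concrete by counting attention scores. The sketch below is an assumption-laden illustration, not the paper's method: it assumes standard self-attention computes one score per pair of pixels (tokens), and that a separable variant computes a single context score per token. The abstract does not specify the exact design, and the function names are hypothetical.

```python
# Illustrative score counts only; not the paper's implementation.
# Assumption: one token per pixel of an h x w feature map.

def pairwise_attention_scores(h, w):
    """Standard self-attention: one score per (query, key) pair, so N * N."""
    n = h * w
    return n * n


def separable_attention_scores(h, w):
    """Assumed separable variant: one context score per token, so N."""
    return h * w


def growth_when_doubling(count_fn, h, w):
    """How many times the score count grows when height and width both double."""
    return count_fn(2 * h, 2 * w) / count_fn(h, w)


if __name__ == "__main__":
    print(growth_when_doubling(pairwise_attention_scores, 32, 32))
    print(growth_when_doubling(separable_attention_scores, 32, 32))
```

Under these assumptions, doubling both spatial dimensions quadruples the number of tokens, so the pairwise count grows by the square of that factor while the per-token count grows by the factor itself.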

Modulation Recognition of MIMO Systems Based on Dimensional Interactive Lightweight Network

Aer, Sileng;Zhang, Xiaolin;Wang, Zhenduo;Wang, Kailin
- KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.10 / pp.3458-3478 / 2022
Automatic modulation recognition is the core algorithm in the field of modulation classification in communication systems. Our investigations show that deep learning (DL) based modulation recognition techniques have achieved effective progress for multiple-input multiple-output (MIMO) systems. However, network complexity is always an additional burden for high-accuracy classifications, which makes it impractical. Therefore, in this paper, we propose a low-complexity dimensional interactive lightweight network (DilNet) for MIMO systems. Specifically, the signals received by different antennas are cooperatively input into the network, and the network calculation amount is reduced through the depth-wise separable convolution. A two-dimensional interactive attention (TDIA) module is designed to extract interactive information of different dimensions, and improve the effectiveness of the cooperation features. In addition, the TDIA module ensures low complexity through compressing the convolution dimension, and the computational burden after inserting TDIA is also acceptable. Finally, the network is trained with a penalized statistical entropy loss function. Simulation results show that compared to existing modulation recognition methods, the proposed DilNet dramatically reduces the model complexity. The dimensional interactive lightweight network trained by penalized statistical entropy also performs better for recognition accuracy in MIMO systems.
https://doi.org/10.3837/tiis.2022.10.014
