• Title/Abstract/Keywords: Lightweight convolutional neural network

32 search results (processing time: 0.022 s)

FGW-FER: Lightweight Facial Expression Recognition with Attention

  • Huy-Hoang Dinh;Hong-Quan Do;Trung-Tung Doan;Cuong Le;Ngo Xuan Bach;Tu Minh Phuong;Viet-Vu Vu
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 17, No. 9
    • /
    • pp.2505-2528
    • /
    • 2023
  • The field of facial expression recognition (FER) has been actively researched to improve human-computer interaction. In recent years, deep learning techniques have gained popularity for addressing FER, with numerous studies proposing end-to-end frameworks that stack or widen significant convolutional neural network layers. While this has led to improved performance, it has also resulted in larger model sizes and longer inference times. To overcome this challenge, our work introduces a novel lightweight model architecture. The architecture incorporates three key factors: Depth-wise Separable Convolution, Residual Block, and Attention Modules. By doing so, we aim to strike a balance between model size, inference speed, and accuracy in FER tasks. Through extensive experimentation on popular benchmark FER datasets, our proposed method has demonstrated promising results. Notably, it stands out due to its substantial reduction in parameter count and faster inference time, while maintaining accuracy levels comparable to other lightweight models discussed in the existing literature.
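
The parameter savings that depth-wise separable convolution offers can be checked with simple arithmetic. The sketch below is a generic illustration, not the FGW-FER architecture: the function names and the layer sizes (3x3 kernel, 64 input channels, 128 output channels) are assumptions chosen for the example, and biases are ignored.

```python
def standard_conv_params(kernel: int, c_in: int, c_out: int) -> int:
    """Weights in a standard kernel x kernel convolution (bias omitted)."""
    return kernel * kernel * c_in * c_out


def depthwise_separable_params(kernel: int, c_in: int, c_out: int) -> int:
    """Weights in a depthwise conv followed by a 1x1 pointwise conv (bias omitted)."""
    depthwise = kernel * kernel * c_in   # one kernel x kernel filter per input channel
    pointwise = c_in * c_out             # 1x1 convolution that mixes channels
    return depthwise + pointwise


# Illustrative layer sizes, not taken from the paper.
standard = standard_conv_params(3, 64, 128)
separable = depthwise_separable_params(3, 64, 128)
reduction = standard / separable
print(standard, separable, round(reduction, 2))
```

For these sizes the separable layer needs roughly one eighth of the weights, which is why it is a common building block in lightweight models.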

Lightweight CNN-based Expression Recognition on Humanoid Robot

  • Zhao, Guangzhe;Yang, Hanting;Tao, Yong;Zhang, Lei;Zhao, Chunxiao
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 14, No. 3
    • /
    • pp.1188-1203
    • /
    • 2020
  • Human facial expressions carry a great deal of information that can be used to detect complex conditions such as pain and fatigue. Since deep learning became the mainstream approach, traditional feature extraction methods no longer hold an advantage. However, in pursuit of higher accuracy, researchers keep stacking more neural network layers, which weakens the real-time performance of the model. This paper therefore proposes an expression recognition framework based on densely concatenated convolutional neural networks that balances accuracy and latency, and applies it to humanoid robots. Feature reuse and parameter compression in the framework improve the learning ability of the model and greatly reduce the number of parameters. Experiments show that the proposed model reduces the parameter count by a factor of tens at the cost of a small loss in accuracy.

A new lightweight network based on MobileNetV3

  • Zhao, Liquan;Wang, Leilei
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 16, No. 1
    • /
    • pp.1-15
    • /
    • 2022
  • MobileNetV3 is specially designed for mobile devices with limited memory and computing power. To reduce the number of network parameters and improve inference speed, a new lightweight network based on MobileNetV3 is proposed. First, to reduce the computation of residual blocks, a partial residual structure is designed by dividing the input feature maps into two parts. This partial residual structure replaces the residual block in MobileNetV3. Second, a dual-path feature extraction structure is designed to further reduce the computation of MobileNetV3. Different convolution kernel sizes are used in the two paths to extract feature maps of different sizes. In addition, a transition layer is designed for fusing features to reduce the influence of the new structure on accuracy. The CIFAR-100 and ImageNet datasets are used to test the performance of the proposed partial residual structure. A ResNet built on the proposed partial residual structure has fewer parameters and FLOPs than the original ResNet. The performance of the improved MobileNetV3 is tested on the CIFAR-10, CIFAR-100, and ImageNet image classification datasets. Compared with MobileNetV3, GhostNet, and MobileNetV2, the improved MobileNetV3 has fewer parameters and FLOPs. The improved MobileNetV3 is also tested on a CPU and a Raspberry Pi, where it is faster than the other networks.

Multi-Task FaceBoxes: A Lightweight Face Detector Based on Channel Attention and Context Information

  • Qi, Shuaihui;Yang, Jungang;Song, Xiaofeng;Jiang, Chen
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 14, No. 10
    • /
    • pp.4080-4097
    • /
    • 2020
  • In recent years, the convolutional neural network (CNN) has become the primary method for face detection. Its shortcomings, however, are obvious: expensive computation, heavy models, and so on. These make CNNs difficult to use on mobile devices with limited computing and storage capabilities. The design of lightweight CNNs for face detection is therefore becoming increasingly important as smartphones and the mobile Internet grow in popularity. Based on the CPU real-time face detector FaceBoxes, we propose a multi-task lightweight face detector that has low computing cost and higher detection precision. First, to improve detection capability, squeeze and excitation modules are used to extract attention between channels. Then, the textual and semantic information are extracted by shallow and deep networks, respectively, to obtain rich features. Finally, a landmark detection module is used to improve detection performance for small faces and to provide landmark data for face alignment. Experiments on the AFW, FDDB, PASCAL, and WIDER FACE datasets show that our algorithm achieves a significant improvement in mean average precision. In particular, on the WIDER FACE hard validation set, our algorithm outperforms the mean average precision of FaceBoxes by 7.2%. For VGA-resolution images, our algorithm runs at up to 23 FPS on a CPU device.
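
The channel attention mentioned in this abstract can be illustrated with a generic squeeze-and-excitation step: pool each channel to a single number, pass the result through a small bottleneck, and use the resulting gates to rescale the channels. The sketch below uses plain nested lists and hand-picked weights; the function name, the tensor shape, and the weights are all assumptions for illustration, and it does not reproduce the detector described in the paper.

```python
import math


def squeeze_excite(feature_map, w_reduce, w_expand):
    """Rescale each channel of feature_map (channels x height x width nested lists).

    w_reduce has shape hidden x channels and w_expand has shape channels x hidden.
    Biases are omitted to keep the sketch short.
    """
    # Squeeze: global average pooling gives one number per channel.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    # Excitation: bottleneck layer with ReLU, then expansion with a sigmoid.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w_reduce]
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w_expand
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [
        [[value * gate for value in row] for row in channel]
        for channel, gate in zip(feature_map, gates)
    ]
    return scaled, gates


# Hypothetical 2-channel, 2x2 input and hand-picked weights.
x = [[[1.0, 1.0], [1.0, 1.0]], [[2.0, 2.0], [2.0, 2.0]]]
w_reduce = [[0.5, 0.5]]      # 2 channels -> 1 hidden unit
w_expand = [[1.0], [-1.0]]   # 1 hidden unit -> 2 channel gates
y, gates = squeeze_excite(x, w_reduce, w_expand)
print(gates)
```

In a trained network the two weight matrices are learned, so the gates emphasize channels that are informative for the current input.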

Data Preprocessing Method for Lightweight Automotive Intrusion Detection System

  • 박상민;임형철;이성수
    • Journal of IKEEE (전기전자학회논문지)
    • /
    • Vol. 27, No. 4
    • /
    • pp.531-536
    • /
    • 2023
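  • This paper proposes a sliding window technique with frame feature insertion for immediate attack detection in in-vehicle networks. Because the method labels each window according to whether the current frame is an attack, it can guarantee real-time attack detection. Experiments also confirmed that the method can improve performance by giving weight to the current frame in the CNN computation. The proposed model is designed on a lightweight LeNet-5 architecture and achieved 100% in DoS attack detection performance. In addition, a comparison of complexity with models from previous studies confirmed that the proposed model is better suited to resource-constrained devices such as ECUs.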
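
The labeling rule in this abstract, where a window takes the label of its current (newest) frame, can be sketched as follows. This is a minimal illustration under assumptions: the function name, the string frame placeholders, and the 0/1 label encoding are hypothetical, and the frame feature insertion step is not reproduced because the abstract does not describe it in detail.

```python
def sliding_windows(frames, labels, window_size):
    """Return (window, label) pairs; each label is that of the newest frame."""
    samples = []
    for end in range(window_size, len(frames) + 1):
        window = frames[end - window_size:end]
        # Label by the current frame only, so a decision is available
        # as soon as that frame arrives.
        samples.append((window, labels[end - 1]))
    return samples


# Hypothetical frame identifiers and labels (0 = normal, 1 = attack).
frames = ["f0", "f1", "f2", "f3", "f4"]
labels = [0, 0, 1, 0, 1]
samples = sliding_windows(frames, labels, window_size=3)
for window, label in samples:
    print(window, label)
```

Labeling by the newest frame avoids waiting for later frames, which matches the real-time goal stated in the abstract.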

Lightweight multiple scale-patch dehazing network for real-world hazy image

  • Wang, Juan;Ding, Chang;Wu, Minghu;Liu, Yuanyuan;Chen, Guanhai
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 15, No. 12
    • /
    • pp.4420-4438
    • /
    • 2021
  • Image dehazing is an ill-posed problem that is far from solved. Traditional image dehazing methods often yield mediocre results at substandard processing speed, while modern deep learning methods perform best only on certain datasets. Their haze removal is therefore unsatisfactory in general, meaning that generalization performance fails to meet requirements. At the same time, because of their limited processing speed, most dehazing algorithms cannot be used in industry. To alleviate these problems, this paper proposes a lightweight, fast dehazing network based on a multiple scale-patch framework (MSP). First, the multi-scale structure is employed as the backbone network and the multi-patch structure as the supplementary network. Because dehazing through a single network causes problems such as loss of object details and color in some image areas, the multi-patch structure is employed in MSP as an information supplement. In the image processing module of the algorithm, the image is split into upper and lower parts that are processed separately. Second, MSP produces a clear dehazing effect and significant robustness on real-world homogeneous and nonhomogeneous hazy images and across different datasets. Compared with existing dehazing methods, MSP demonstrates fast inference and the feasibility of real-time processing. The overall size and the number of parameters of the entire dehazing model are 20.75M and 6.8M, respectively, and the processing time for a single image is 0.026 s. Experiments on NTIRE 2018 and NTIRE 2020 demonstrate that MSP achieves superior performance among state-of-the-art methods on PSNR, SSIM, LPIPS, and individual subjective evaluation.

A Lightweight Pedestrian Intrusion Detection and Warning Method for Intelligent Traffic Security

  • Yan, Xinyun;He, Zhengran;Huang, Youxiang;Xu, Xiaohu;Wang, Jie;Zhou, Xiaofeng;Wang, Chishe;Lu, Zhiyi
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 16, No. 12
    • /
    • pp.3904-3922
    • /
    • 2022
  • Pedestrian detection has been a research hotspot with a wide range of applications in computer vision in recent years. However, current pedestrian detection methods suffer from insufficient detection accuracy and from large models that are unsuitable for large-scale deployment. To address these problems, this paper proposes a lightweight pedestrian detection and early warning method based on you only look once (Yolov5), which uses the advantages of the Yolov5s model to achieve accurate and fast pedestrian recognition. In addition, the paper optimizes the loss function of the batch normalization (BN) layer. After sparsification, pruning, and fine-tuning, the model is compact enough to be deployed on edge devices with low computing power. Finally, according to the experimental data presented in the paper, when trained on a road pedestrian dataset that we collected and processed independently, the Yolov5s model has certain advantages in precision and other indicators over the traditional single shot multiBox detector (SSD) and fast region-convolutional neural network (Fast R-CNN) models. After pruning and lightweighting, the size of the trained model is greatly reduced without a significant reduction in accuracy: the final precision reaches 87%, while the model size is reduced to 7,723 KB.
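
One common way to combine BN-layer sparsification with pruning is to rank channels by the magnitude of their BN scale factors and drop the smallest ones. The abstract does not give the exact procedure, so the sketch below is an assumption in that style: the function name, the global threshold rule, and the scale factor values are all hypothetical.

```python
def select_channels(bn_gammas, prune_ratio):
    """Return per-layer keep masks from a global threshold on |gamma|.

    bn_gammas is a list of layers, each a list of BN scale factors.
    Assumes training already pushed unimportant scale factors toward zero.
    """
    all_gammas = sorted(abs(g) for layer in bn_gammas for g in layer)
    cut = int(len(all_gammas) * prune_ratio)  # number of channels to drop
    threshold = all_gammas[cut] if cut < len(all_gammas) else float("inf")
    return [[abs(g) >= threshold for g in layer] for layer in bn_gammas]


# Hypothetical scale factors for two BN layers after sparsity training.
gammas = [[0.9, 0.01, 0.5], [0.02, 0.7, 0.03, 0.8]]
masks = select_channels(gammas, prune_ratio=0.4)
kept = sum(sum(mask) for mask in masks)
print(masks, kept)
```

A practical implementation would also guard against removing every channel in a layer and would fine-tune the pruned model afterwards, as the abstract describes.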

Design and Implementation of a Lightweight On-Device AI-Based Real-time Fault Diagnosis System using Continual Learning

  • 김영준;김태완;김수현;이성재;김태현
    • IEMEK Journal of Embedded Systems and Applications (대한임베디드공학회논문지)
    • /
    • Vol. 19, No. 3
    • /
    • pp.151-158
    • /
    • 2024
  • Although on-device artificial intelligence (AI) has gained attention for diagnosing machine faults in real time, most previous studies did not consider the model retraining and redeployment processes that must be performed in real-world industrial environments. Our study addresses this challenge by proposing an on-device AI-based real-time machine fault diagnosis system that utilizes continual learning. The proposed system includes a lightweight convolutional neural network (CNN) model, a continual learning algorithm, and a real-time monitoring service. First, we developed a lightweight 1D CNN model to reduce the cost of model deployment and enable real-time inference on a target edge device with limited computing resources. We then compared the performance of five continual learning algorithms on three public bearing fault datasets and selected the most effective algorithm for our system. Finally, we implemented a real-time monitoring service using an open-source data visualization framework. In the comparison of continual learning algorithms, the replay-based algorithms outperformed the regularization-based algorithms, and the experience replay (ER) algorithm had the best diagnostic accuracy. We further tuned the number and length of the data samples stored in the memory buffer of the ER algorithm to maximize its performance, and confirmed that the ER algorithm performs better when a longer data length is used. Consequently, the proposed system showed an accuracy of 98.7% while storing only 16.5% of the previous data in the memory buffer. Our lightweight CNN model was also able to diagnose the fault type of one data sample within 3.76 ms on a Raspberry Pi 4B device.
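
Experience replay keeps a small memory of earlier samples and mixes them into each new training batch so that the model does not forget earlier data. The sketch below shows one common way to maintain such a memory, using reservoir sampling; the class name, the capacity, and the sample format are assumptions, and the abstract does not state how the authors fill their buffer.

```python
import random


class ReplayBuffer:
    """Fixed-size memory filled by reservoir sampling (an assumed policy)."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.samples = []
        self.seen = 0
        self.rng = random.Random(seed)

    def add(self, sample):
        """Store a sample so every sample seen so far is equally likely to be kept."""
        self.seen += 1
        if len(self.samples) < self.capacity:
            self.samples.append(sample)
        else:
            slot = self.rng.randrange(self.seen)
            if slot < self.capacity:
                self.samples[slot] = sample

    def mixed_batch(self, new_batch, replay_size):
        """Return the new batch followed by stored samples drawn at random."""
        k = min(replay_size, len(self.samples))
        return list(new_batch) + self.rng.sample(self.samples, k)


# Hypothetical stream: 20 samples from an earlier task, then a new batch.
buffer = ReplayBuffer(capacity=4)
for i in range(20):
    buffer.add(("task_a", i))
batch = buffer.mixed_batch([("task_b", 0), ("task_b", 1)], replay_size=2)
print(batch)
```

The capacity plays the role of the memory budget discussed in the abstract: a larger buffer retains more of the earlier data at the cost of more storage on the device.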

Lightening of Human Pose Estimation Algorithm Using MobileViT and Transfer Learning

  • Kunwoo Kim;Jonghyun Hong;Jonghyuk Park
    • Journal of the Korea Society of Computer and Information (한국컴퓨터정보학회논문지)
    • /
    • Vol. 28, No. 9
    • /
    • pp.17-25
    • /
    • 2023
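  • This paper proposes a model that performs human pose estimation using a MobileViT-based model that has fewer parameters and faster estimation. The base model demonstrates lightweight performance through a structure that combines the characteristics of convolutional neural networks and the Vision Transformer. The Transformer, the main mechanism in this study, has grown in influence as Transformer-based models have outperformed convolutional neural network models in computer vision as well. The same holds for human pose estimation: a fitting example is that the Vision Transformer-based ViTPose holds the best performance on human pose estimation benchmarks such as COCO, OCHuman, and MPII. However, because the Vision Transformer has a heavy model structure with many parameters and relatively high computational demands, it imposes a high training cost on users. The base model therefore overcomes the Vision Transformer's lack of inductive bias, which demands heavy computation, through local representations obtained from a convolutional neural network structure. Finally, on the validation set provided by the MS COCO human pose estimation benchmark, the proposed model required 3.28 GFLOPs and 9.72 million parameters, one-fifth and one-ninth of ViTPose respectively, and achieved a mean average precision of 69.4, showing relatively strong performance.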

Ensemble Knowledge Distillation for Classification of 14 Thorax Diseases using Chest X-ray Images

  • 호티키우칸;전영훈;곽정환
    • Proceedings of the Korean Society of Computer Information Conference (한국컴퓨터정보학회:학술대회논문집)
    • /
    • Proceedings of the Korea Society of Computer and Information 64th Summer Conference, 2021, Vol. 29, No. 2
    • /
    • pp.313-315
    • /
    • 2021
  • Timely and accurate diagnosis of lung diseases using chest X-ray images has gained much attention from the computer vision and medical imaging communities. Although previous studies have demonstrated the capability of deep convolutional neural networks by achieving competitive binary classification results, their models appear unreliable at distinguishing multiple disease groups across a large number of X-ray images. In this paper, we aim to build an advanced approach, called Ensemble Knowledge Distillation (EKD), that significantly boosts classification accuracy compared with traditional KD methods by distilling knowledge from a cumbersome teacher model into an ensemble of lightweight student models with parallel branches trained with ground truth labels. Learning features at different branches of the student models could thus enable the network to learn diverse patterns and improve the quality of the final predictions through an ensemble learning solution. Although experiments on the well-established ChestX-ray14 dataset showed that traditional KD improves classification over the base transfer learning approach, EKD is expected to further enhance classification accuracy and model generalization, especially given the imbalanced dataset and the interdependency of the 14 weakly annotated thorax diseases.
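
The distillation idea in this abstract can be sketched with the standard knowledge distillation loss: a weighted sum of cross-entropy on the ground truth label and a temperature-softened KL term that pulls the student toward the teacher. The code below is a generic single-label sketch, not the EKD method itself. The function names, the temperature, the weighting `alpha`, the logit values, and the choice to average branch logits are all assumptions; a multi-label task such as ChestX-ray14 would more likely use per-class sigmoid outputs.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label, temperature=4.0, alpha=0.5):
    """alpha * cross-entropy(label) + (1 - alpha) * T^2 * KL(teacher || student)."""
    hard = -math.log(softmax(student_logits)[label])
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    soft = sum(t * math.log(t / s) for t, s in zip(p_teacher, p_student) if t > 0)
    return alpha * hard + (1 - alpha) * temperature ** 2 * soft


def ensemble_logits(branch_logits):
    """Average the logits of parallel student branches (an assumed fusion rule)."""
    return [sum(column) / len(branch_logits) for column in zip(*branch_logits)]


# Hypothetical logits for three classes.
teacher = [3.0, 0.0, -2.0]
branches = [[2.0, 0.5, -1.0], [1.0, 1.5, -0.5]]
student = ensemble_logits(branches)
loss = distillation_loss(student, teacher, label=0)
print(student, loss)
```

The soft term vanishes when the student reproduces the teacher's logits exactly, so minimizing the loss pushes the lightweight students toward the behavior of the larger model while the hard term keeps them anchored to the labels.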
