• Title/Summary/Keyword: Deep Fusion Model

Speech emotion recognition based on genetic algorithm-decision tree fusion of deep and acoustic features

  • Sun, Linhui;Li, Qiu;Fu, Sheng;Li, Pingan
    • ETRI Journal / v.44 no.3 / pp.462-475 / 2022
  • Although researchers have proposed numerous techniques for speech emotion recognition, its performance remains unsatisfactory in many application scenarios. In this study, we propose a speech emotion recognition model based on a genetic algorithm (GA)-decision tree (DT) fusion of deep and acoustic features. To express speech emotional information more comprehensively, frame-level deep and acoustic features are first extracted from a speech signal. Next, five statistical variables of these features are calculated to obtain utterance-level features. The Fisher feature selection criterion is employed to select high-performance features and remove redundant information. In the feature fusion stage, the GA is used to adaptively search for the best feature fusion weight. Finally, using the fused feature, the proposed speech emotion recognition model based on a DT support vector machine model is realized. Experimental results on the Berlin speech emotion database and the Chinese emotion speech database indicate that the proposed model outperforms an average weight fusion method.
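
The fusion-weight search described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the single scalar weight `w`, the weighted-concatenation rule in `fuse`, the GA operators, and the stand-in `toy_fitness` are all assumptions made for illustration. In practice the fitness would be the recognition accuracy obtained with the fused features.

```python
import random

def fuse(deep, acoustic, w):
    """Weighted concatenation (assumed rule): w scales deep, (1 - w) scales acoustic."""
    return [w * x for x in deep] + [(1.0 - w) * x for x in acoustic]

def ga_search_weight(fitness, pop_size=20, generations=30, mutation_std=0.05, seed=0):
    """Tiny real-coded genetic algorithm over one fusion weight in [0, 1]."""
    rng = random.Random(seed)
    population = [rng.random() for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the fitter half unchanged (elitism).
        parents = sorted(population, key=fitness, reverse=True)[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = 0.5 * (a + b)                   # arithmetic crossover
            child += rng.gauss(0.0, mutation_std)   # Gaussian mutation
            children.append(min(1.0, max(0.0, child)))  # clamp to [0, 1]
        population = parents + children
    return max(population, key=fitness)

def toy_fitness(w):
    """Hypothetical stand-in for validation accuracy; peaks at w = 0.7."""
    return -(w - 0.7) ** 2

best_w = ga_search_weight(toy_fitness)
fused = fuse([1.0, 2.0], [3.0, 4.0], best_w)
```

An average weight fusion baseline corresponds to fixing `w = 0.5` instead of searching for it.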

Traffic Flow Prediction with Spatio-Temporal Information Fusion using Graph Neural Networks

  • Huijuan Ding;Giseop Noh
    • International journal of advanced smart convergence / v.12 no.4 / pp.88-97 / 2023
  • Traffic flow prediction is of great significance in urban planning and traffic management. As the complexity of urban traffic increases, existing prediction methods still face challenges, especially in fusing spatiotemporal information and capturing long-term dependencies. This study uses a graph neural network fusion model to address the spatiotemporal information fusion problem in traffic flow prediction. We propose a new deep learning model, Spatio-Temporal Information Fusion using Graph Neural Networks (STFGNN), which alternates GCN, TCN, and LSTM modules to carry out spatiotemporal information fusion. GCN and multi-core TCN capture the spatial and temporal dependencies of traffic flow, respectively, and LSTM connects multiple fusion modules. In an experimental evaluation on real traffic flow data, STFGNN performed better than other models.

Text Classification Method Using Deep Learning Model Fusion and Its Application

  • Shin, Seong-Yoon;Cho, Gwang-Hyun;Cho, Seung-Pyo;Lee, Hyun-Chang
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference / 2022.10a / pp.409-410 / 2022
  • This paper proposes a fusion model based on Long Short-Term Memory (LSTM) networks and CNN deep learning methods, applies it to multi-category news datasets, and achieves good results. Experiments show that the deep learning fusion model greatly improves the precision and accuracy of text sentiment classification. This method is expected to become an important way to optimize models and improve their performance.

High-performance of Deep learning Colorization With Wavelet fusion (웨이블릿 퓨전에 의한 딥러닝 색상화의 성능 향상)

  • Kim, Young-Back;Choi, Hyun;Cho, Joong-Hwee
    • IEMEK Journal of Embedded Systems and Applications / v.13 no.6 / pp.313-319 / 2018
  • We propose a post-processing algorithm to improve the quality of RGB images generated by deep learning-based colorization of gray-scale infrared camera images. Wavelet fusion generates a new luminance component by combining the luminance component of the RGB image from the deep learning model with the luminance component of the infrared camera image. Applying the proposed algorithm to RGB images generated by two deep learning models, SegNet and DCGAN, increases PSNR for all experimental images. For the SegNet model, the average PSNR improves by 1.3906 dB at level 1 of the Haar wavelet method. For the DCGAN model, the average PSNR improves by 0.0759 dB at level 5 of the Daubechies wavelet method. The post-processing is also confirmed to emphasize edge components and improve visibility.
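
To put the reported PSNR gains in context, the sketch below computes PSNR with the standard definition, 10·log10(MAX² / MSE), and converts a gain in dB into the implied ratio of mean squared errors. The function names, the flat-list image representation, and the sample pixel values are illustrative assumptions, not taken from the paper.

```python
import math

def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(reference) != len(test) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)

def mse_ratio_for_gain(gain_db):
    """MSE after divided by MSE before, for a given PSNR gain in dB."""
    return 10.0 ** (-gain_db / 10.0)

# Hypothetical 4-pixel example: every pixel is off by 10, so MSE = 100.
value = psnr([100, 150, 200, 250], [110, 140, 210, 240])

# A 1.3906 dB gain corresponds to the MSE shrinking to roughly 73% of its
# previous value.
ratio = mse_ratio_for_gain(1.3906)
```

Because PSNR is logarithmic, the 0.0759 dB gain reported for DCGAN corresponds to a much smaller change in MSE than the 1.3906 dB gain reported for SegNet.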

Predicting Session Conversion on E-commerce: A Deep Learning-based Multimodal Fusion Approach

  • Minsu Kim;Woosik Shin;SeongBeom Kim;Hee-Woong Kim
    • Asia pacific journal of information systems / v.33 no.3 / pp.737-767 / 2023
  • With the availability of big customer data and advances in machine learning techniques, the prediction of customer behavior at the session level has attracted considerable attention from marketing practitioners and scholars. This study aims to predict customer purchase conversion at the session level by employing customer profile, transaction, and clickstream data. For this purpose, we develop a multimodal deep learning fusion model with dynamic and static features (i.e., DS-fusion). Specifically, we use page views within the focal visit as dynamic features and recency, frequency, monetary value, and clumpiness (RFMC) as static features to comprehensively capture the customer characteristics behind buying behaviors. Our model combines these features with deep learning architectures for conversion prediction. We validate the proposed model using real-world e-commerce data. The experimental results reveal that our model outperforms unimodal classifiers that use each feature alone, as well as classical machine learning models that use both dynamic and static features, including random forest and logistic regression. In this regard, this study sheds light on the promise of machine learning approaches that exploit the complementarity of different modalities in predicting customer behaviors.
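
As a rough illustration of the static features mentioned above, the sketch below derives recency, frequency, and monetary value from a customer's purchase history. The `Purchase` record, the day-index time scale, and the choice of total spend as the monetary value are assumptions for illustration. Clumpiness is deliberately left out because the abstract does not give its definition.

```python
from dataclasses import dataclass

@dataclass
class Purchase:
    day: int       # hypothetical day index of the purchase
    amount: float  # hypothetical purchase amount

def rfm_features(purchases, observation_day):
    """Recency, frequency, and monetary value as of observation_day."""
    past = [p for p in purchases if p.day <= observation_day]
    if not past:
        return {"recency": None, "frequency": 0, "monetary": 0.0}
    return {
        "recency": observation_day - max(p.day for p in past),  # days since last purchase
        "frequency": len(past),                                 # number of purchases so far
        "monetary": sum(p.amount for p in past),                # total spend so far
    }

history = [Purchase(1, 20.0), Purchase(10, 30.0), Purchase(40, 5.0)]
features = rfm_features(history, observation_day=30)
```

Only purchases on or before the observation day are counted, so features built for a session never include information from after that session.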

Infrared and Visible Image Fusion Based on NSCT and Deep Learning

  • Feng, Xin
    • Journal of Information Processing Systems / v.14 no.6 / pp.1405-1419 / 2018
  • An image fusion method based on depth model segmentation is proposed to overcome the noise interference and artifacts that arise in infrared and visible image fusion. First, a deep Boltzmann machine is used to learn priors of the infrared and visible target and background contours, and a depth segmentation model of the contours is constructed. The Split Bregman iterative algorithm is employed to obtain the optimal energy segmentation of the infrared and visible image contours. Then, the nonsubsampled contourlet transform (NSCT) is applied to decompose the source images, and the corresponding rules are used to integrate the coefficients in light of the segmented background contour. Finally, the inverse NSCT is used to reconstruct the fused image. MATLAB simulation results indicate that the proposed algorithm effectively fuses both target and background contours, with high contrast and noise suppression in subjective evaluation as well as strong results on objective quantitative indicators.

Text Classification by Deep Learning Fusion (딥러닝 융합에 의한 텍스트 분류)

  • Shin, Kwang-Seong;Ham, Seo-Hyun;Shin, Seong-Yoon
    • Proceedings of the Korean Society of Computer Information Conference / 2019.07a / pp.385-386 / 2019
  • This paper proposes a fusion model based on Long Short-Term Memory (LSTM) networks and CNN deep learning methods, applies it to multi-category news datasets, and achieves good results. Experiments show that the deep learning fusion model greatly improves the precision and accuracy of text sentiment classification.

Multi-focus Image Fusion using Fully Convolutional Two-stream Network for Visual Sensors

  • Xu, Kaiping;Qin, Zheng;Wang, Guolong;Zhang, Huidi;Huang, Kai;Ye, Shuxiong
    • KSII Transactions on Internet and Information Systems (TIIS) / v.12 no.5 / pp.2253-2272 / 2018
  • We propose a deep learning method for multi-focus image fusion. Unlike most existing pixel-level fusion methods, whether in the spatial domain or in a transform domain, our method directly learns an end-to-end fully convolutional two-stream network. The framework maps a pair of differently focused images to a clean version through a chain of convolutional layers, a fusion layer, and deconvolutional layers. Our deep fusion model is efficient and robust, yet demonstrates state-of-the-art fusion quality. We explore different parameter settings to achieve trade-offs between performance and speed. Moreover, the experimental results on our training dataset show that our network achieves good performance in both subjective visual perception and objective assessment metrics.

DL-ML Fusion Hybrid Model for Malicious Web Site URL Detection Based on URL Lexical Features (악성 URL 탐지를 위한 URL Lexical Feature 기반의 DL-ML Fusion Hybrid 모델)

  • Dae-yeob Kim
    • Journal of the Korea Institute of Information Security & Cryptology / v.33 no.6 / pp.881-891 / 2023
  • Recently, various studies on malicious URL detection using artificial intelligence have been conducted, and most have shown strong detection performance. However, classical machine learning not only requires a feature analysis process, but the detection performance of the trained model also depends on the data analyst's ability. In this paper, we propose a DL-ML Fusion Hybrid Model for malicious web site URL detection based on URL lexical features. The proposed model combines the automatic feature extraction layer of deep learning with classical machine learning to address the feature engineering issue. For the experiment, 60,000 malicious and normal URLs were collected, and the results showed a maximum performance improvement of 23.98%p. In addition, automating feature engineering made it possible to train the model efficiently.
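
The abstract does not say how raw URLs are presented to the deep learning feature extraction layer. One common option, shown here purely as an assumption, is character-level integer encoding with padding to a fixed length. The vocabulary, the reserved ids, and `max_len` are hypothetical choices.

```python
import string

PAD, UNK = 0, 1  # reserved ids: padding and out-of-vocabulary characters
# Hypothetical vocabulary: every printable ASCII character gets an id >= 2.
VOCAB = {ch: i + 2 for i, ch in enumerate(string.printable)}

def encode_url(url, max_len=200):
    """Map a URL to a fixed-length list of integer character ids."""
    ids = [VOCAB.get(ch, UNK) for ch in url[:max_len]]  # truncate long URLs
    return ids + [PAD] * (max_len - len(ids))           # right-pad short URLs

encoded = encode_url("http://example.com/login", max_len=32)
```

A sequence encoded this way can be fed to an embedding layer, and the learned representation can then be passed to a classical classifier, which is the general shape of the hybrid the abstract describes.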

Skin Lesion Segmentation with Codec Structure Based Upper and Lower Layer Feature Fusion Mechanism

  • Yang, Cheng;Lu, GuanMing
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.1 / pp.60-79 / 2022
  • U-Net-based segmentation models have attained remarkable performance in numerous medical image segmentation tasks such as skin lesion segmentation. Nevertheless, as the network deepens, the resolution gradually decreases and the loss of spatial information increases. The fusion of adjacent layers is not enough to make up for the lost spatial information, which results in segmentation boundary errors and reduces segmentation accuracy. To tackle this issue, we propose a new deep learning-based segmentation model. In the decoding stage, the feature channels of each decoding unit are concatenated with all the feature channels of the upper coding unit. This integrates spatial and semantic information to preserve segmentation quality, and combining the atrous spatial pyramid pooling (ASPP) module with the channel attention module (CAM) improves the robustness and generalization of our model. Extensive experiments on the ISIC2016 and ISIC2017 datasets show that our model performs well and outperforms the compared segmentation models for skin lesion segmentation.