• Title/Abstract/Keyword: Deep Fusion Model

Speech emotion recognition based on genetic algorithm-decision tree fusion of deep and acoustic features

  • Sun, Linhui; Li, Qiu; Fu, Sheng; Li, Pingan
    • ETRI Journal / Vol. 44, No. 3 / pp.462-475 / 2022
  • Although researchers have proposed numerous techniques for speech emotion recognition, its performance remains unsatisfactory in many application scenarios. In this study, we propose a speech emotion recognition model based on a genetic algorithm (GA)-decision tree (DT) fusion of deep and acoustic features. To express speech emotional information more comprehensively, frame-level deep and acoustic features are first extracted from the speech signal. Next, five kinds of statistical variables of these features are computed to obtain utterance-level features. The Fisher feature selection criterion is employed to select high-performance features and remove redundant information. In the feature fusion stage, the GA is used to adaptively search for the best feature fusion weight. Finally, using the fused features, the proposed speech emotion recognition model based on a DT support vector machine is realized. Experimental results on the Berlin speech emotion database and a Chinese emotional speech database indicate that the proposed model outperforms an average-weight fusion method.
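
A minimal sketch of the GA-searched fusion-weight step described above, assuming utterance-level deep features X_deep and acoustic features X_ac (already Fisher-selected) and labels y; the population size, mutation scale, and use of a single scalar weight are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

def fuse(X_deep, X_ac, w):
    """Weighted concatenation of the two feature streams."""
    return np.hstack([w * X_deep, (1.0 - w) * X_ac])

def ga_search_weight(X_deep, X_ac, y, pop=20, gens=15, seed=0):
    """Evolve a scalar fusion weight that maximizes cross-validated SVM accuracy."""
    rng = np.random.default_rng(seed)
    score = lambda w: cross_val_score(SVC(), fuse(X_deep, X_ac, w), y, cv=3).mean()
    weights = rng.uniform(0.0, 1.0, pop)                    # initial population of weights
    for _ in range(gens):
        fitness = np.array([score(w) for w in weights])
        parents = weights[np.argsort(fitness)[-pop // 2:]]  # selection: keep the best half
        children = rng.choice(parents, pop - parents.size)  # crossover: resample parents
        children += rng.normal(0.0, 0.05, children.shape)   # mutation: small Gaussian step
        weights = np.clip(np.concatenate([parents, children]), 0.0, 1.0)
    return max(weights, key=score)                          # best weight in final population
```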

Traffic Flow Prediction with Spatio-Temporal Information Fusion using Graph Neural Networks

  • Huijuan Ding; Giseop Noh
    • International journal of advanced smart convergence / Vol. 12, No. 4 / pp.88-97 / 2023
  • Traffic flow prediction is of great significance in urban planning and traffic management. As the complexity of urban traffic increases, existing prediction methods still face challenges, especially in fusing spatio-temporal information and capturing long-term dependencies. This study addresses the spatio-temporal information fusion problem in traffic flow prediction with a new deep learning model, Spatio-Temporal Information Fusion using Graph Neural Networks (STFGNN). GCN, TCN, and LSTM modules are used alternately to fuse spatio-temporal information: the GCN and the multi-core TCN capture the spatial and temporal dependencies of traffic flow, respectively, while the LSTM connects multiple fusion modules. In an experimental evaluation on real traffic flow data, STFGNN outperformed the compared models.
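
As a rough PyTorch sketch of how one such fusion block might alternate the three modules (a GCN step for spatial dependencies, a temporal convolution, and a per-node LSTM fusing the two views), consider the following; the layer sizes, the normalized adjacency a_hat, and the additive combination are assumptions rather than the published STFGNN design.

```python
import torch
import torch.nn as nn

class FusionBlock(nn.Module):
    """One GCN + TCN + LSTM spatio-temporal fusion block (illustrative sizes)."""
    def __init__(self, feat, hidden):
        super().__init__()
        self.gcn = nn.Linear(feat, hidden)      # graph convolution: a_hat @ x @ W
        self.tcn = nn.Conv1d(hidden, hidden, kernel_size=3, padding=1)
        self.lstm = nn.LSTM(hidden, hidden, batch_first=True)

    def forward(self, x, a_hat):
        # x: (batch, time, nodes, feat); a_hat: normalized adjacency, (nodes, nodes)
        spatial = torch.relu(torch.einsum('ij,btjf->btif', a_hat, self.gcn(x)))
        b, t, n, h = spatial.shape
        temporal = self.tcn(spatial.permute(0, 2, 3, 1).reshape(b * n, h, t))
        temporal = temporal.reshape(b, n, h, t).permute(0, 3, 1, 2)
        seq = (spatial + temporal).permute(0, 2, 1, 3).reshape(b * n, t, h)
        fused, _ = self.lstm(seq)               # per-node temporal fusion of both views
        return fused.reshape(b, n, t, h).permute(0, 2, 1, 3)

# e.g., 12 past steps over 207 sensors with 8 features each:
# out = FusionBlock(8, 16)(torch.randn(2, 12, 207, 8), torch.eye(207))  # (2, 12, 207, 16)
```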

Text Classification Method Using Deep Learning Model Fusion and Its Application

  • 신성윤; 조광현; 조승표; 이현창
    • 한국정보통신학회:학술대회논문집 / 한국정보통신학회 2022년도 추계학술대회 / pp.409-410 / 2022
  • This paper proposes a fusion model based on Long Short-Term Memory (LSTM) networks and CNN deep learning techniques, applies it to a multi-category news dataset, and obtains good results. Experiments show that the deep learning-based fusion model greatly improves the precision and accuracy of text sentiment classification. This approach should prove an important way to optimize the model and improve its performance.
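
A minimal PyTorch sketch of an LSTM+CNN fusion classifier of the kind this abstract (and the later "Text Classification by Deep Learning Fusion" entry) describes; the vocabulary size, layer dimensions, and max-pooling choice are assumptions for illustration.

```python
import torch
import torch.nn as nn

class CnnLstmFusion(nn.Module):
    """Fuses CNN n-gram features with the LSTM's final hidden state."""
    def __init__(self, vocab=20000, emb=128, hidden=64, n_classes=10):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.conv = nn.Conv1d(emb, hidden, kernel_size=3, padding=1)  # local n-gram features
        self.lstm = nn.LSTM(emb, hidden, batch_first=True)            # long-range context
        self.fc = nn.Linear(hidden * 2, n_classes)

    def forward(self, tokens):                   # tokens: (batch, seq_len), int64 ids
        e = self.embed(tokens)
        c = torch.relu(self.conv(e.transpose(1, 2))).max(dim=2).values  # global max-pool
        _, (h, _) = self.lstm(e)
        return self.fc(torch.cat([c, h[-1]], dim=1))  # fused logits over news categories
```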

High-performance of Deep Learning Colorization with Wavelet Fusion

  • 김영백; 최현; 조중휘
    • 대한임베디드공학회논문지 / Vol. 13, No. 6 / pp.313-319 / 2018
  • We propose a post-processing algorithm to improve the quality of the RGB image generated by deep learning-based colorization of the gray-scale image from an infrared camera. Wavelet fusion is used to generate a new luminance component from the luminance component of the RGB image produced by the deep learning model and the luminance component of the infrared image. Applying the proposed algorithm to RGB images generated by two deep learning models, SegNet and DCGAN, increases PSNR for all experimental images. For the SegNet model, the average PSNR improves by 1.3906 dB at level 1 of the Haar wavelet method. For the DCGAN model, PSNR improves by 0.0759 dB on average at level 5 of the Daubechies wavelet method. The post-processing also emphasizes edge components and improves visibility.
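
The fusion step can be sketched with PyWavelets as follows, where y_dl is the luminance of the colorized RGB image and y_ir the infrared luminance; averaging the approximation band and taking the max-abs detail coefficients are common fusion rules assumed here, not necessarily the paper's exact rules.

```python
import numpy as np
import pywt

def wavelet_fuse_luminance(y_dl, y_ir, wavelet='haar', level=1):
    """Fuse two same-sized luminance channels in the wavelet domain."""
    ca = pywt.wavedec2(y_dl, wavelet, level=level)
    cb = pywt.wavedec2(y_ir, wavelet, level=level)
    fused = [(ca[0] + cb[0]) / 2.0]                       # average the approximation band
    for da, db in zip(ca[1:], cb[1:]):                    # per-level (cH, cV, cD) details
        fused.append(tuple(np.where(np.abs(a) >= np.abs(b), a, b)  # keep stronger edges
                           for a, b in zip(da, db)))
    return pywt.waverec2(fused, wavelet)
```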

Predicting Session Conversion on E-commerce: A Deep Learning-based Multimodal Fusion Approach

  • Minsu Kim; Woosik Shin; SeongBeom Kim; Hee-Woong Kim
    • Asia pacific journal of information systems / Vol. 33, No. 3 / pp.737-767 / 2023
  • With the availability of big customer data and advances in machine learning techniques, the prediction of customer behavior at the session level has attracted considerable attention from marketing practitioners and scholars. This study aims to predict customer purchase conversion at the session level by employing customer profile, transaction, and clickstream data. For this purpose, we develop a multimodal deep learning fusion model with dynamic and static features (i.e., DS-fusion). Specifically, we use page views within the focal visit as dynamic features and recency, frequency, monetary value, and clumpiness (RFMC) as static features to comprehensively capture customer characteristics relevant to buying behavior. Our deep learning model combines these features for conversion prediction. We validate the proposed model using real-world e-commerce data. The experimental results reveal that our model outperforms unimodal classifiers built on each feature set as well as classical machine learning models with dynamic and static features, including random forest and logistic regression. In this regard, this study sheds light on the promise of combining complementary modalities in machine learning approaches to predicting customer behavior.
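
A hedged PyTorch sketch of the two-branch fusion idea: an LSTM branch for the dynamic clickstream sequence and an MLP branch for the static RFMC profile, fused by concatenation; all dimensions and the fusion-by-concatenation choice are assumptions, not the published DS-fusion architecture.

```python
import torch
import torch.nn as nn

class DSFusion(nn.Module):
    """Dynamic (sequence) + static (profile) branches fused for conversion prediction."""
    def __init__(self, dyn_feat=8, static_feat=4, hidden=32):
        super().__init__()
        self.dyn = nn.LSTM(dyn_feat, hidden, batch_first=True)   # page-view sequence branch
        self.static = nn.Sequential(nn.Linear(static_feat, hidden), nn.ReLU())  # RFMC branch
        self.head = nn.Sequential(nn.Linear(hidden * 2, hidden), nn.ReLU(),
                                  nn.Linear(hidden, 1))

    def forward(self, clicks, rfmc):     # clicks: (B, T, dyn_feat); rfmc: (B, static_feat)
        _, (h, _) = self.dyn(clicks)
        z = torch.cat([h[-1], self.static(rfmc)], dim=1)         # fuse the two modalities
        return torch.sigmoid(self.head(z)).squeeze(1)            # P(session converts)
```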

Infrared and Visible Image Fusion Based on NSCT and Deep Learning

  • Feng, Xin
    • Journal of Information Processing Systems / Vol. 14, No. 6 / pp.1405-1419 / 2018
  • An image fusion method is proposed on the basis of deep-model segmentation to overcome the noise interference and artifacts caused by infrared and visible image fusion. First, a deep Boltzmann machine performs prior learning of infrared and visible target and background contours, and a depth segmentation model of the contours is constructed; the Split Bregman iterative algorithm is employed to obtain the optimal energy segmentation of infrared and visible image contours. Then, the nonsubsampled contourlet transform (NSCT) is used to decompose the source images, and corresponding rules are used to integrate the coefficients according to the segmented background contour. Finally, the inverse NSCT reconstructs the fused image. MATLAB simulation results indicate that the proposed algorithm effectively fuses both target and background contours, with high contrast and good noise suppression in subjective evaluation as well as strong objective quantitative indicators.
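
NSCT has no standard Python implementation, so the sketch below substitutes a stationary wavelet transform to illustrate the same decompose / fuse-by-rule / reconstruct pipeline; the region mask standing in for the contour segmentation, the fusion rules, and the requirement that image sides be divisible by 2^level are all assumptions.

```python
import numpy as np
import pywt

def fuse_ir_visible(ir, vis, mask, wavelet='db2', level=2):
    """Region-guided multiscale fusion; mask is 1 on segmented target regions."""
    c_ir = pywt.swt2(ir, wavelet, level=level)
    c_vis = pywt.swt2(vis, wavelet, level=level)
    fused = []
    for (a_ir, d_ir), (a_vis, d_vis) in zip(c_ir, c_vis):
        approx = np.where(mask > 0, a_ir, (a_ir + a_vis) / 2)       # keep IR targets, blend elsewhere
        details = tuple(np.where(np.abs(di) >= np.abs(dv), di, dv)  # max-abs detail rule
                        for di, dv in zip(d_ir, d_vis))
        fused.append((approx, details))
    return pywt.iswt2(fused, wavelet)
```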

Text Classification by Deep Learning Fusion

  • 신광성; 함서현; 신성윤
    • 한국컴퓨터정보학회:학술대회논문집 / 한국컴퓨터정보학회 2019년도 제60차 하계학술대회논문집 27권2호 / pp.385-386 / 2019
  • This paper proposes a fusion model based on Long Short-Term Memory (LSTM) networks and CNN deep learning methods, applies it to multi-category news datasets, and achieves good results. Experiments show that the deep learning-based fusion model greatly improves the precision and accuracy of text sentiment classification.

Multi-focus Image Fusion using Fully Convolutional Two-stream Network for Visual Sensors

  • Xu, Kaiping; Qin, Zheng; Wang, Guolong; Zhang, Huidi; Huang, Kai; Ye, Shuxiong
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 12, No. 5 / pp.2253-2272 / 2018
  • We propose a deep learning method for multi-focus image fusion. Unlike most existing pixel-level fusion methods, whether in the spatial or the transform domain, our method directly learns an end-to-end fully convolutional two-stream network. The framework maps a pair of differently focused images to a clean version through a chain of convolutional layers, a fusion layer, and deconvolutional layers. Our deep fusion model is efficient and robust, yet demonstrates state-of-the-art fusion quality. We explore different parameter settings to achieve trade-offs between performance and speed. Moreover, experimental results on our training dataset show that our network achieves good performance on both subjective visual perception and objective assessment metrics.
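
A compact PyTorch sketch of the encoder / fusion-layer / decoder shape the abstract outlines; the depths, channel counts, and 1x1 fusion layer are illustrative assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

class TwoStreamFusionNet(nn.Module):
    """Two conv encoder streams, a fusion layer, and a deconv decoder."""
    def __init__(self, ch=32):
        super().__init__()
        def stream():                               # one downsampling encoder stream
            return nn.Sequential(nn.Conv2d(1, ch, 3, stride=2, padding=1), nn.ReLU(),
                                 nn.Conv2d(ch, ch, 3, stride=2, padding=1), nn.ReLU())
        self.enc_a, self.enc_b = stream(), stream()
        self.fuse = nn.Conv2d(ch * 2, ch, 1)        # 1x1 fusion over both streams
        self.dec = nn.Sequential(
            nn.ConvTranspose2d(ch, ch, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(ch, 1, 4, stride=2, padding=1))

    def forward(self, img_a, img_b):                # two differently focused gray images
        z = torch.cat([self.enc_a(img_a), self.enc_b(img_b)], dim=1)
        return self.dec(torch.relu(self.fuse(z)))   # estimated all-in-focus image
```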

DL-ML Fusion Hybrid Model for Malicious Web Site URL Detection Based on URL Lexical Features

  • 김대엽
    • 정보보호학회논문지 / Vol. 33, No. 6 / pp.881-891 / 2023
  • Recently, various studies have applied artificial intelligence to malicious URL detection, and most report high detection performance. With classical machine learning, however, there is the additional cost of analyzing and selecting features, and detection performance depends on the skill of the data analyst. To resolve these issues, this paper proposes the DL-ML Fusion Hybrid model, in which part of a deep learning model that automatically extracts URL lexical features is combined with a classical machine learning model. Trained on a total of 60,000 malicious and benign URLs collected by the authors, the proposed model improved detection performance by up to 23.98 percentage points while enabling efficient machine learning through automated feature engineering.
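
A hedged sketch of the DL-ML hybrid idea: a small character-level network learns lexical URL features end-to-end, and its penultimate activations are then handed to a classical classifier; the byte-level encoding, 200-character limit, and random forest back end are illustrative assumptions.

```python
import torch
import torch.nn as nn
from sklearn.ensemble import RandomForestClassifier

MAX_LEN = 200

def encode(url):                                   # byte-level ids, zero-padded
    ids = list(url.encode('utf-8', 'ignore')[:MAX_LEN])
    return ids + [0] * (MAX_LEN - len(ids))

class UrlFeatureNet(nn.Module):
    """Character-level extractor of lexical URL features."""
    def __init__(self, emb=16, feat=64):
        super().__init__()
        self.embed = nn.Embedding(256, emb)
        self.conv = nn.Conv1d(emb, feat, kernel_size=5, padding=2)
        self.head = nn.Linear(feat, 2)             # used for end-to-end training only

    def features(self, x):                         # x: (batch, MAX_LEN), int64
        return torch.relu(self.conv(self.embed(x).transpose(1, 2))).max(dim=2).values

    def forward(self, x):
        return self.head(self.features(x))

# After training UrlFeatureNet on labeled URLs, reuse it as the feature extractor:
#   feats = net.features(torch.tensor([encode(u) for u in urls])).detach().numpy()
#   clf = RandomForestClassifier().fit(feats, labels)
```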

Skin Lesion Segmentation with Codec Structure Based Upper and Lower Layer Feature Fusion Mechanism

  • Yang, Cheng; Lu, GuanMing
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 16, No. 1 / pp.60-79 / 2022
  • U-Net architecture-based segmentation models have attained remarkable performance in numerous medical image segmentation tasks, such as skin lesion segmentation. Nevertheless, as the network deepens, the resolution gradually decreases and the loss of spatial information increases; the fusion of adjacent layers is not enough to make up for the lost spatial information, resulting in segmentation-boundary errors that reduce accuracy. To tackle this issue, we propose a new deep learning-based segmentation model. In the decoding stage, the feature channels of each decoding unit are concatenated with all the feature channels of the corresponding upper coding unit, integrating spatial and semantic information to preserve segmentation quality; the robustness and generalization of the model are further promoted by combining an atrous spatial pyramid pooling (ASPP) module and a channel attention module (CAM). Extensive experiments on the common ISIC2016 and ISIC2017 datasets show that our model performs well and outperforms the compared segmentation models for skin lesion segmentation.
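
As a brief illustration of the ASPP module the abstract mentions, here is a PyTorch sketch; the dilation rates and channel widths are common defaults assumed for illustration, not taken from the paper.

```python
import torch
import torch.nn as nn

class ASPP(nn.Module):
    """Parallel dilated convolutions capture context at multiple receptive fields."""
    def __init__(self, in_ch, out_ch, rates=(1, 6, 12, 18)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(in_ch, out_ch, 3, padding=r, dilation=r) for r in rates)
        self.project = nn.Conv2d(out_ch * len(rates), out_ch, 1)  # merge the parallel views

    def forward(self, x):
        return self.project(torch.cat([torch.relu(b(x)) for b in self.branches], dim=1))
```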