• Title/Summary/Keyword: Multi-Feature Fusion

Search Result 86, Processing Time 0.021 seconds

Generating Radiology Reports via Multi-feature Optimization Transformer

  • Rui Wang;Rong Hua
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.17 no.10
    • /
    • pp.2768-2787
    • /
    • 2023
  • As an important research direction of the application of computer science in the medical field, the automatic generation technology of radiology report has attracted wide attention in the academic community. Because the proportion of normal regions in radiology images is much larger than that of abnormal regions, words describing diseases are often masked by other words, resulting in significant feature loss during the calculation process, which affects the quality of generated reports. In addition, the huge difference between visual features and semantic features causes traditional multi-modal fusion method to fail to generate long narrative structures consisting of multiple sentences, which are required for medical reports. To address these challenges, we propose a multi-feature optimization Transformer (MFOT) for generating radiology reports. In detail, a multi-dimensional mapping attention (MDMA) module is designed to encode the visual grid features from different dimensions to reduce the loss of primary features in the encoding process; a feature pre-fusion (FP) module is constructed to enhance the interaction ability between multi-modal features, so as to generate a reasonably structured radiology report; a detail enhanced attention (DEA) module is proposed to enhance the extraction and utilization of key features and reduce the loss of key features. In conclusion, we evaluate the performance of our proposed model against prevailing mainstream models by utilizing widely-recognized radiology report datasets, namely IU X-Ray and MIMIC-CXR. The experimental outcomes demonstrate that our model achieves SOTA performance on both datasets, compared with the base model, the average improvement of six key indicators is 19.9% and 18.0% respectively. These findings substantiate the efficacy of our model in the domain of automated radiology report generation.

Emotion Recognition and Expression System of User using Multi-Modal Sensor Fusion Algorithm (다중 센서 융합 알고리즘을 이용한 사용자의 감정 인식 및 표현 시스템)

  • Yeom, Hong-Gi;Joo, Jong-Tae;Sim, Kwee-Bo
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.18 no.1
    • /
    • pp.20-26
    • /
    • 2008
  • As they have more and more intelligence robots or computers these days, so the interaction between intelligence robot(computer) - human is getting more and more important also the emotion recognition and expression are indispensable for interaction between intelligence robot(computer) - human. In this paper, firstly we extract emotional features at speech signal and facial image. Secondly we apply both BL(Bayesian Learning) and PCA(Principal Component Analysis), lastly we classify five emotions patterns(normal, happy, anger, surprise and sad) also, we experiment with decision fusion and feature fusion to enhance emotion recognition rate. The decision fusion method experiment on emotion recognition that result values of each recognition system apply Fuzzy membership function and the feature fusion method selects superior features through SFS(Sequential Forward Selection) method and superior features are applied to Neural Networks based on MLP(Multi Layer Perceptron) for classifying five emotions patterns. and recognized result apply to 2D facial shape for express emotion.

Automatic Image Registration Based on Extraction of Corresponding-Points for Multi-Sensor Image Fusion (다중센서 영상융합을 위한 대응점 추출에 기반한 자동 영상정합 기법)

  • Choi, Won-Chul;Jung, Jik-Han;Park, Dong-Jo;Choi, Byung-In;Choi, Sung-Nam
    • Journal of the Korea Institute of Military Science and Technology
    • /
    • v.12 no.4
    • /
    • pp.524-531
    • /
    • 2009
  • In this paper, we propose an automatic image registration method for multi-sensor image fusion such as visible and infrared images. The registration is achieved by finding corresponding feature points in both input images. In general, the global statistical correlation is not guaranteed between multi-sensor images, which bring out difficulties on the image registration for multi-sensor images. To cope with this problem, mutual information is adopted to measure correspondence of features and to select faithful points. An update algorithm for projective transform is also proposed. Experimental results show that the proposed method provides robust and accurate registration results.

Multi-Scale Dilation Convolution Feature Fusion (MsDC-FF) Technique for CNN-Based Black Ice Detection

  • Sun-Kyoung KANG
    • Korean Journal of Artificial Intelligence
    • /
    • v.11 no.3
    • /
    • pp.17-22
    • /
    • 2023
  • In this paper, we propose a black ice detection system using Convolutional Neural Networks (CNNs). Black ice poses a serious threat to road safety, particularly during winter conditions. To overcome this problem, we introduce a CNN-based architecture for real-time black ice detection with an encoder-decoder network, specifically designed for real-time black ice detection using thermal images. To train the network, we establish a specialized experimental platform to capture thermal images of various black ice formations on diverse road surfaces, including cement and asphalt. This enables us to curate a comprehensive dataset of thermal road black ice images for a training and evaluation purpose. Additionally, in order to enhance the accuracy of black ice detection, we propose a multi-scale dilation convolution feature fusion (MsDC-FF) technique. This proposed technique dynamically adjusts the dilation ratios based on the input image's resolution, improving the network's ability to capture fine-grained details. Experimental results demonstrate the superior performance of our proposed network model compared to conventional image segmentation models. Our model achieved an mIoU of 95.93%, while LinkNet achieved an mIoU of 95.39%. Therefore, it is concluded that the proposed model in this paper could offer a promising solution for real-time black ice detection, thereby enhancing road safety during winter conditions.

Feature Extraction and Fusion for land-Cover Discrimination with Multi-Temporal SAR Data (다중 시기 SAR 자료를 이용한 토지 피복 구분을 위한 특징 추출과 융합)

  • Park No-Wook;Lee Hoonyol;Chi Kwang-Hoon
    • Korean Journal of Remote Sensing
    • /
    • v.21 no.2
    • /
    • pp.145-162
    • /
    • 2005
  • To improve the accuracy of land-cover discrimination in SAB data classification, this paper presents a methodology that includes feature extraction and fusion steps with multi-temporal SAR data. Three features including average backscattering coefficient, temporal variability and coherence are extracted from multi-temporal SAR data by considering the temporal behaviors of backscattering characteristics of SAR sensors. Dempster-Shafer theory of evidence(D-S theory) and fuzzy logic are applied to effectively integrate those features. Especially, a feature-driven heuristic approach to mass function assignment in D-S theory is applied and various fuzzy combination operators are tested in fuzzy logic fusion. As experimental results on a multi-temporal Radarsat-1 data set, the features considered in this paper could provide complementary information and thus effectively discriminated water, paddy and urban areas. However, it was difficult to discriminate forest and dry fields. From an information fusion methodological point of view, the D-S theory and fuzzy combination operators except the fuzzy Max and Algebraic Sum operators showed similar land-cover accuracy statistics.

Multi-parametric MRIs based assessment of Hepatocellular Carcinoma Differentiation with Multi-scale ResNet

  • Jia, Xibin;Xiao, Yujie;Yang, Dawei;Yang, Zhenghan;Lu, Chen
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.13 no.10
    • /
    • pp.5179-5196
    • /
    • 2019
  • To explore an effective non-invasion medical imaging diagnostics approach for hepatocellular carcinoma (HCC), we propose a method based on adopting the multiple technologies with the multi-parametric data fusion, transfer learning, and multi-scale deep feature extraction. Firstly, to make full use of complementary and enhancing the contribution of different modalities viz. multi-parametric MRI images in the lesion diagnosis, we propose a data-level fusion strategy. Secondly, based on the fusion data as the input, the multi-scale residual neural network with SPP (Spatial Pyramid Pooling) is utilized for the discriminative feature representation learning. Thirdly, to mitigate the impact of the lack of training samples, we do the pre-training of the proposed multi-scale residual neural network model on the natural image dataset and the fine-tuning with the chosen multi-parametric MRI images as complementary data. The comparative experiment results on the dataset from the clinical cases show that our proposed approach by employing the multiple strategies achieves the highest accuracy of 0.847±0.023 in the classification problem on the HCC differentiation. In the problem of discriminating the HCC lesion from the non-tumor area, we achieve a good performance with accuracy, sensitivity, specificity and AUC (area under the ROC curve) being 0.981±0.002, 0.981±0.002, 0.991±0.007 and 0.999±0.0008, respectively.

Ensemble convolutional neural networks for automatic fusion recognition of multi-platform radar emitters

  • Zhou, Zhiwen;Huang, Gaoming;Wang, Xuebao
    • ETRI Journal
    • /
    • v.41 no.6
    • /
    • pp.750-759
    • /
    • 2019
  • Presently, the extraction of hand-crafted features is still the dominant method in radar emitter recognition. To solve the complicated problems of selection and updation of empirical features, we present a novel automatic feature extraction structure based on deep learning. In particular, a convolutional neural network (CNN) is adopted to extract high-level abstract representations from the time-frequency images of emitter signals. Thus, the redundant process of designing discriminative features can be avoided. Furthermore, to address the performance degradation of a single platform, we propose the construction of an ensemble learning-based architecture for multi-platform fusion recognition. Experimental results indicate that the proposed algorithms are feasible and effective, and they outperform other typical feature extraction and fusion recognition methods in terms of accuracy. Moreover, the proposed structure could be extended to other prevalent ensemble learning alternatives.

Learning-Based Multiple Pooling Fusion in Multi-View Convolutional Neural Network for 3D Model Classification and Retrieval

  • Zeng, Hui;Wang, Qi;Li, Chen;Song, Wei
    • Journal of Information Processing Systems
    • /
    • v.15 no.5
    • /
    • pp.1179-1191
    • /
    • 2019
  • We design an ingenious view-pooling method named learning-based multiple pooling fusion (LMPF), and apply it to multi-view convolutional neural network (MVCNN) for 3D model classification or retrieval. By this means, multi-view feature maps projected from a 3D model can be compiled as a simple and effective feature descriptor. The LMPF method fuses the max pooling method and the mean pooling method by learning a set of optimal weights. Compared with the hand-crafted approaches such as max pooling and mean pooling, the LMPF method can decrease the information loss effectively because of its "learning" ability. Experiments on ModelNet40 dataset and McGill dataset are presented and the results verify that LMPF can outperform those previous methods to a great extent.

End-to-End Learning-based Spatial Scalable Image Compression with Multi-scale Feature Fusion Module (다중 스케일 특징 융합 모듈을 통한 종단 간 학습기반 공간적 스케일러블 영상 압축)

  • Shin Juyeon;Kang Jewon
    • Proceedings of the Korean Society of Broadcast Engineers Conference
    • /
    • 2022.11a
    • /
    • pp.1-3
    • /
    • 2022
  • 최근 기존의 영상 압축 파이프라인 대신 신경망의 종단 간 학습을 통해 압축을 수행하는 알고리즘의 연구가 활발히 진행되고 있다. 본 논문은 종단 간 학습 기반 공간적 스케일러블 압축 기술을 제안한다. 보다 구체적으로 본 논문은 신경망의 각 계층에서 하위 계층의 학습된 특징 (feature)을 융합하여 상위 계층으로 전달하는 다중 스케일 특징 융합 (multi-scale feature fusion) 모듈을 도입해 상위 계층이 더욱 풍부한 특징 정보를 학습하고 계층 사이의 특징 중복성을 더욱 잘 제거할 수 있도록 한다. 기존 방법 대비 향상 계층(enhancement layer)에서 1.37%의 BD-rate가 향상된 결과를 볼 수 있다.

  • PDF

Multi-Image Stereo Method Using DEM Fusion Technique (DEM 융합 기법을 이용한 다중영상스테레오 방법)

  • Lim Sung-Min;Woo Dong-Min
    • The Transactions of the Korean Institute of Electrical Engineers D
    • /
    • v.52 no.4
    • /
    • pp.212-222
    • /
    • 2003
  • The ability to efficiently and robustly recover accurate 3D terrain models from sets of stereoscopic images is important to many civilian and military applications. A stereo matching has been an important tool for reconstructing three dimensional terrain. However, there exist many factors causing stereo matching error, such as occlusion, no feature or repetitive pattern in the correlation window, intensity variation, etc. Among them, occlusion can be only resolved by true multi-image stereo. In this paper, we present multi-image stereo method using DEM fusion as one of efficient and reliable true multi-image methods. Elevations generated by all pairs of images are combined by the fusion process which accepts an accurate elevation and rejects an outlier. We propose three fusion schemes: THD(Thresholding), BPS(Best Pair Selection) and MS(Median Selection). THD averages elevations after rejecting outliers by thresholding, while BPS selects the most reliable elevation. To determine the reliability of a elevation or detect the outlier, we employ the measure of self-consistency. The last scheme, MS, selects the median value of elevations. We test the effectiveness of the proposed methods with a quantitative analysis using simulated images. Experimental results indicate that all three fusion schemes showed much better improvement over the conventional binocular stereo in natural terrain of 29 Palms and urban site of Avenches.