• Title/Summary/Keyword: Multi-modal Data


Development of Multi-Sensor Station for u-Surveillance to Collaboration-Based Context Awareness (협업기반 상황인지를 위한 u-Surveillance 다중센서 스테이션 개발)

  • Yoo, Joon-Hyuk;Kim, Hie-Cheol
    • Journal of Institute of Control, Robotics and Systems / v.18 no.8 / pp.780-786 / 2012
  • Surveillance has become one of the promising application areas of wireless sensor networks, which allow for pervasive monitoring of environmental phenomena of concern by facilitating context awareness through sensor fusion. Existing systems that depend on a postmortem context analysis of sensor data on a centralized server expose several shortcomings, including a single point of failure, wasteful energy consumption due to unnecessary data transfer, and a lack of scalability. Taking the opposite direction, this paper proposes an energy-efficient, distributed, context-aware surveillance scheme in which the sensor nodes of the wireless sensor network collaborate with their neighbors in a distributed manner to analyze and become aware of the surrounding context. We design and implement multi-modal sensor stations for use as sensor nodes in the wireless sensor network that realizes this distributed context awareness. This paper presents initial experimental performance results for the proposed system. The results show that the multi-modal sensing performance of our sensor station, a key enabling factor for distributed context awareness, is comparable to that of each independent sensor setting. They also show that its initial context-awareness performance is satisfactory for a set of introductory surveillance scenarios at the current interim stage of our ongoing research.

Jointly Image Topic and Emotion Detection using Multi-Modal Hierarchical Latent Dirichlet Allocation

  • Ding, Wanying;Zhu, Junhuan;Guo, Lifan;Hu, Xiaohua;Luo, Jiebo;Wang, Haohong
    • Journal of Multimedia Information System / v.1 no.1 / pp.55-67 / 2014
  • Image topic and emotion analysis is an important component of online image retrieval, which has become very popular in the rapidly growing social media community. However, due to the gap between images and text, there is very limited work in the literature on detecting an image's topics and emotions within a unified framework, even though topics and emotions are two levels of semantics that often work together to comprehensively describe an image. In this work, a unified model, the Joint Topic/Emotion Multi-Modal Hierarchical Latent Dirichlet Allocation (JTE-MMHLDA) model, is proposed; it extends the earlier LDA, mmLDA, and JST models to capture topic and emotion information from heterogeneous data at the same time. Specifically, a two-level graphical model is built so that topics and emotions are shared across the whole document collection. The experimental results on a Flickr dataset indicate that the proposed model efficiently discovers images' topics and emotions, and it significantly outperforms the text-only system by 4.4% and the vision-only system by 18.1% in topic detection, and the text-only system by 7.1% and the vision-only system by 39.7% in emotion detection.

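The full JTE-MMHLDA model is not reproduced here. As a rough point of reference only, the sketch below shows the simpler mmLDA-style idea it builds on: pooling text tokens and quantized visual-word tokens into a single bag of words and running a standard LDA over it with gensim. The toy documents, the "vw_" visual-word naming, and all parameters are invented placeholders, not the authors' setup.

```python
from gensim import corpora, models

# toy "documents": each pools text tokens with quantized visual-word tokens
# (prefixed "vw_") that would come from a codebook over the paired image
docs = [
    ["sunset", "beach", "calm", "vw_12", "vw_12", "vw_87"],
    ["protest", "crowd", "angry", "vw_87", "vw_44", "vw_44"],
    ["puppy", "grass", "happy", "vw_12", "vw_05", "vw_05"],
]

dictionary = corpora.Dictionary(docs)
corpus = [dictionary.doc2bow(doc) for doc in docs]

# standard LDA over the pooled bag of words (no emotion level, unlike JTE-MMHLDA)
lda = models.LdaModel(corpus=corpus, id2word=dictionary,
                      num_topics=2, passes=20, random_state=0)
for topic in lda.print_topics():
    print(topic)
```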

Modeling sharply peaked asymmetric multi-modal circular data using wrapped Laplace mixture (겹친라플라스 혼합분포를 통한 첨 다봉형 비대칭 원형자료의 모형화)

  • Na, Jong-Hwa;Jang, Young-Mi
    • Journal of the Korean Data and Information Science Society / v.21 no.5 / pp.863-871 / 2010
  • Until now, many studies related to circular data have been carried out, but they focus mainly on mildly peaked symmetric or asymmetric cases. In this paper we study a modeling process for sharply peaked asymmetric circular data. Using the wrapped Laplace distribution, first introduced by Jammalamadaka and Kozubowski (2003), and its mixtures, we consider the model-fitting problem for multi-modal as well as unimodal circular data. In particular, we suggest an EM algorithm to find the ML estimates of mixtures of wrapped Laplace distributions. Simulation results show that the suggested EM algorithm is accurate and useful.
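
The abstract does not spell out the asymmetric wrapped Laplace density or the exact EM updates, so the Python sketch below only illustrates the general shape of such an estimator under simplifying assumptions: a symmetric wrapped Laplace density with the wrapping sum truncated, closed-form updates for the mixing weights, and a Nelder-Mead search for each component's location and scale in the M-step.

```python
import numpy as np
from scipy.optimize import minimize

def wrapped_laplace_pdf(theta, mu, b, K=3):
    """Symmetric wrapped Laplace density on [0, 2*pi), with the wrapping
    sum truncated to k = -K..K."""
    k = np.arange(-K, K + 1)
    d = theta[:, None] - mu + 2.0 * np.pi * k[None, :]
    return np.sum(np.exp(-np.abs(d) / b) / (2.0 * b), axis=1)

def em_wrapped_laplace_mixture(theta, n_comp=2, n_iter=30, seed=0):
    theta = np.asarray(theta, float)
    rng = np.random.default_rng(seed)
    mu = rng.uniform(0.0, 2.0 * np.pi, n_comp)   # component locations
    b = np.full(n_comp, 0.5)                     # component scales
    w = np.full(n_comp, 1.0 / n_comp)            # mixing weights
    for _ in range(n_iter):
        # E-step: responsibilities of each component for each observation
        dens = np.stack([w[j] * wrapped_laplace_pdf(theta, mu[j], b[j])
                         for j in range(n_comp)], axis=1)
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M-step: weights in closed form, (mu, b) by numerical search
        w = resp.mean(axis=0)
        for j in range(n_comp):
            def nll(p, r=resp[:, j]):
                m, log_b = p
                f = wrapped_laplace_pdf(theta, m % (2.0 * np.pi), np.exp(log_b))
                return -np.sum(r * np.log(f + 1e-300))
            res = minimize(nll, x0=[mu[j], np.log(b[j])], method="Nelder-Mead")
            mu[j], b[j] = res.x[0] % (2.0 * np.pi), np.exp(res.x[1])
    return w, mu, b

# toy usage: bimodal circular sample built from two wrapped Laplace clusters
rng = np.random.default_rng(1)
sample = np.concatenate([rng.laplace(1.0, 0.2, 300),
                         rng.laplace(4.0, 0.4, 200)]) % (2.0 * np.pi)
print(em_wrapped_laplace_mixture(sample, n_comp=2))
```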

A novel PSO-based algorithm for structural damage detection using Bayesian multi-sample objective function

  • Chen, Ze-peng;Yu, Ling
    • Structural Engineering and Mechanics / v.63 no.6 / pp.825-835 / 2017
  • Significant improvements to methodologies for structural damage detection (SDD) have emerged in recent years. However, many methods involve inverse computations that are prone to be ill-posed or ill-conditioned, leading to low computational efficiency or inaccurate results. To obtain a more accurate solution with satisfactory efficiency, this study proposes a PSO-INM algorithm, which combines the particle swarm optimization (PSO) algorithm and an improved Nelder-Mead method (INM), to solve a multi-sample objective function defined on the basis of Bayesian inference. The PSO-based algorithm, as a heuristic algorithm, reliably explores solutions to the SDD problem once it is converted into a constrained optimization problem, and the multi-sample objective function remains stable under different levels of noise. The advantages of the multi-sample objective function and its superiority over the traditional objective function are studied. Numerical simulation results for a two-storey frame structure show that the proposed method is sensitive to multi-damage cases. To further confirm its accuracy, the ASCE four-storey benchmark frame structure subjected to single and multiple damage cases is employed. Different kinds of modal identification methods are used to extract structural modal data from noise-contaminated acceleration responses. The results show that the proposed method efficiently identifies the locations and extents of the induced damage in the structures.
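
The PSO-INM hybrid and the Bayesian multi-sample objective are not reconstructed here; the Python sketch below is only a plain, box-constrained PSO applied to a placeholder quadratic cost that stands in for a damage-identification objective, to show the kind of search the abstract refers to.

```python
import numpy as np

def pso_minimize(objective, lb, ub, n_particles=30, n_iter=200,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Plain box-constrained particle swarm optimization. `objective` maps a
    candidate parameter vector (e.g., per-element stiffness-reduction factors)
    to a scalar cost."""
    rng = np.random.default_rng(seed)
    lb, ub = np.asarray(lb, float), np.asarray(ub, float)
    dim = lb.size
    x = rng.uniform(lb, ub, size=(n_particles, dim))     # positions
    v = np.zeros_like(x)                                  # velocities
    pbest = x.copy()                                      # personal bests
    pbest_val = np.array([objective(p) for p in x])
    gbest = pbest[np.argmin(pbest_val)].copy()            # global best
    for _ in range(n_iter):
        r1, r2 = rng.random((2, n_particles, dim))
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        x = np.clip(x + v, lb, ub)
        vals = np.array([objective(p) for p in x])
        better = vals < pbest_val
        pbest[better], pbest_val[better] = x[better], vals[better]
        gbest = pbest[np.argmin(pbest_val)].copy()
    return gbest, pbest_val.min()

# toy usage: recover a 3-element "damage" vector from a placeholder quadratic cost
true_damage = np.array([0.0, 0.3, 0.1])
cost = lambda d: float(np.sum((d - true_damage) ** 2))
print(pso_minimize(cost, lb=[0, 0, 0], ub=[1, 1, 1]))
```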

Dynamic Analysis of Carbon-fiber-reinforced Plastic for Different Multi-layered Fabric Structure (적층 직물 구조에 따른 탄소강화플라스틱 소재 동적 특성 분석)

  • Kim, Chan-Jung
    • Transactions of the Korean Society for Noise and Vibration Engineering / v.26 no.4 / pp.375-382 / 2016
  • The mechanical properties of a carbon-fiber-reinforced plastic (CFRP) depend first on its two constituents, carbon fiber and polymer resin, and second on the choice of multi-layered fabric structure. Many combinations of fabric layers, e.g., plain weave and twill weave, can be derived as candidate test specimens for basic mechanical components, so reliable identification of the dynamic nature of the possible multi-layered structures is essential during the development of a CFRP-based component system. In this paper, three kinds of multi-layered structure specimens were prepared, and their dynamic characteristics were identified through a classical modal testing process with an impact hammer. In addition, a design sensitivity analysis based on the transmissibility function was applied to the measured response data, so that the response sensitivity at each resonance frequency could be compared across the three CFRP test specimens. Finally, the CFRP specimens with different multi-layered fabric structures are evaluated on the basis of the experimental results.
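
The design sensitivity analysis itself is specific to the paper, but the transmissibility function it rests on can be illustrated generically. The sketch below, a minimal SciPy example under assumed synthetic signals and parameters, estimates an H1-type transmissibility between two measured channels.

```python
import numpy as np
from scipy import signal

def transmissibility(resp_out, resp_ref, fs, nperseg=1024):
    """H1-type transmissibility estimate between two measured responses:
    cross-spectrum S_or divided by the reference auto-spectrum S_rr."""
    f, s_or = signal.csd(resp_out, resp_ref, fs=fs, nperseg=nperseg)
    _, s_rr = signal.welch(resp_ref, fs=fs, nperseg=nperseg)
    return f, s_or / s_rr

# toy usage: two synthetic acceleration channels sharing a 40 Hz resonance
fs = 2048
t = np.arange(0, 4.0, 1.0 / fs)
rng = np.random.default_rng(0)
base = np.sin(2 * np.pi * 40 * t) * np.exp(-1.5 * t)
ch_ref = base + 0.05 * rng.standard_normal(t.size)
ch_out = 0.6 * base + 0.05 * rng.standard_normal(t.size)
f, T = transmissibility(ch_out, ch_ref, fs)
print(f[np.argmax(np.abs(T))])   # frequency of the dominant transmissibility peak
```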

Effective Multi-Modal Feature Fusion for 3D Semantic Segmentation with Multi-View Images (멀티-뷰 영상들을 활용하는 3차원 의미적 분할을 위한 효과적인 멀티-모달 특징 융합)

  • Hye-Lim Bae;Incheol Kim
    • KIPS Transactions on Software and Data Engineering / v.12 no.12 / pp.505-518 / 2023
  • 3D point cloud semantic segmentation is a computer vision task that involves dividing a point cloud into different objects and regions by predicting the class label of each point. Existing 3D semantic segmentation models have limitations in performing sufficient fusion of multi-modal features while preserving the characteristics of both the 2D visual features extracted from RGB images and the 3D geometric features extracted from the point cloud. Therefore, in this paper, we propose MMCA-Net, a novel 3D semantic segmentation model using 2D-3D multi-modal features. The proposed model effectively fuses the two heterogeneous feature types, 2D visual features and 3D geometric features, by using an intermediate fusion strategy and a multi-modal cross-attention-based fusion operation. The proposed model also extracts context-rich 3D geometric features from an input point cloud consisting of irregularly distributed points by adopting PTv2 as the 3D geometric encoder. We conducted both quantitative and qualitative experiments on the ScanNetv2 benchmark dataset to analyze the performance of the proposed model. In terms of mIoU, the proposed model showed a 9.2% improvement over the PTv2 model, which uses only 3D geometric features, and a 12.12% improvement over the MVPNet model, which uses 2D-3D multi-modal features. These results demonstrate the effectiveness and usefulness of the proposed model.
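
MMCA-Net's exact architecture is not given in the abstract; the PyTorch sketch below only illustrates a generic multi-modal cross-attention fusion step in which per-point 3D features attend to 2D visual features. The module structure, feature dimensions, and head count are assumptions, not the authors' design.

```python
import torch
import torch.nn as nn

class CrossModalFusion(nn.Module):
    """Per-point 3D features attend to 2D visual features; the attended
    context is concatenated back onto the point features and projected."""
    def __init__(self, dim_3d=64, dim_2d=128, dim=64, heads=4):
        super().__init__()
        self.to_q = nn.Linear(dim_3d, dim)
        self.to_kv = nn.Linear(dim_2d, dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.proj = nn.Linear(dim_3d + dim, dim_3d)

    def forward(self, feat_3d, feat_2d):
        # feat_3d: (B, N_points, dim_3d), feat_2d: (B, N_pixels, dim_2d)
        q = self.to_q(feat_3d)
        kv = self.to_kv(feat_2d)
        ctx, _ = self.attn(q, kv, kv)                       # (B, N_points, dim)
        return self.proj(torch.cat([feat_3d, ctx], dim=-1))

# toy usage with placeholder feature tensors
fusion = CrossModalFusion()
pts = torch.randn(2, 1024, 64)   # stand-in for 3D geometric features (e.g., from PTv2)
pix = torch.randn(2, 4096, 128)  # stand-in for 2D visual features from multi-view images
print(fusion(pts, pix).shape)    # torch.Size([2, 1024, 64])
```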

Enhancing Recommender Systems by Fusing Diverse Information Sources through Data Transformation and Feature Selection

  • Thi-Linh Ho;Anh-Cuong Le;Dinh-Hong Vu
    • KSII Transactions on Internet and Information Systems (TIIS) / v.17 no.5 / pp.1413-1432 / 2023
  • Recommender systems aim to recommend items to users by taking into account their probable interests. This study focuses on creating a model that utilizes multiple sources of information about users and items by employing a multimodality approach. The study addresses the task of gathering information from different sources (modalities) and transforming it into a uniform format, resulting in a multi-modal feature description for users and items. This work also aims to transform and represent the features extracted from different modalities so that the information is in a compatible format for integration and contains important, useful information for the prediction model. To achieve this goal, we propose a novel multi-modal recommendation model, which involves extracting latent features of users and items from a utility matrix using matrix factorization techniques. Various transformation techniques are utilized to extract features from other sources of information such as user reviews, item descriptions, and item categories. We also propose the use of Principal Component Analysis (PCA) and feature selection techniques to reduce the data dimensionality, extract important features, and remove noisy features in order to increase the accuracy of the model. We evaluated several experimental models based on different subsets of modalities on the MovieLens and Amazon sub-category datasets. According to the experimental results, the proposed model significantly enhances the accuracy of recommendations compared to SVD, which is acknowledged as one of the most effective models for recommender systems. Specifically, the proposed model reduces the RMSE by 4.8% to 21.43% and increases the precision by 2.07% to 26.49% on the Amazon datasets. Similarly, for the MovieLens dataset, the proposed model reduces the RMSE by 45.61% and increases the precision by 14.06%. Additionally, the experimental results on both datasets demonstrate that combining information from multiple modalities in the proposed model leads to superior outcomes compared to relying on a single type of information.
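
As a minimal sketch of the general recipe described, the example below combines latent user/item factors from the utility matrix (TruncatedSVD as a stand-in for the matrix-factorization step) with PCA-reduced item text features and fits a simple regressor on the concatenation. The tiny rating matrix, item texts, and model choices are placeholders, not the authors' configuration.

```python
import numpy as np
from sklearn.decomposition import PCA, TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import Ridge

# placeholder data: a tiny user-item rating matrix (0 = unrated) and item texts
ratings = np.array([[5, 0, 3],
                    [4, 0, 0],
                    [0, 2, 5]], dtype=float)
item_texts = ["funny family comedy", "dark crime thriller", "space adventure epic"]

# latent user/item factors from the utility matrix (matrix-factorization step)
svd = TruncatedSVD(n_components=2, random_state=0)
user_factors = svd.fit_transform(ratings)   # (n_users, 2)
item_factors = svd.components_.T             # (n_items, 2)

# textual item features, reduced with PCA (dimension-reduction step)
tfidf = TfidfVectorizer().fit_transform(item_texts).toarray()
item_text_feat = PCA(n_components=2).fit_transform(tfidf)

# one feature vector per observed (user, item) pair, then a simple predictor
rows, cols = np.nonzero(ratings)
X = np.hstack([user_factors[rows], item_factors[cols], item_text_feat[cols]])
y = ratings[rows, cols]
model = Ridge(alpha=1.0).fit(X, y)
print(model.predict(X))
```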

Janus - Multi Source Event Detection and Collection System for Effective Surveillance of Criminal Activity

  • Shahabi, Cyrus;Kim, Seon Ho;Nocera, Luciano;Constantinou, Giorgos;Lu, Ying;Cai, Yinghao;Medioni, Gerard;Nevatia, Ramakant;Banaei-Kashani, Farnoush
    • Journal of Information Processing Systems / v.10 no.1 / pp.1-22 / 2014
  • Recent technological advances provide the opportunity to use large amounts of multimedia data from a multitude of sensors with different modalities (e.g., video, text) for the detection and characterization of criminal activity. Their integration can compensate for sensor and modality deficiencies by using data from other available sensors and modalities. However, building such an integrated system at the scale of neighborhoods and cities is challenging due to the large amount of data to be considered and the need to ensure short response times to potential criminal activity. In this paper, we present a system that enables multi-modal data collection at scale and automates the detection of events of interest for the surveillance and reconnaissance of criminal activity. The proposed system showcases novel analytical tools that fuse multimedia data streams to automatically detect and identify specific criminal events and activities. More specifically, the system detects and analyzes series of incidents (an incident is an occurrence or artifact relevant to a criminal activity, extracted from a single media stream) in the spatiotemporal domain to extract events (actual instances of criminal activity), cross-referencing multimodal media streams and incidents in time and space to give a human operator a comprehensive view while avoiding information overload. We present several case studies that demonstrate how the proposed system can provide law enforcement personnel with forensic and real-time tools to identify and track potential criminal activity.
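
Janus's actual pipeline is not reproduced here; as a loose illustration of cross-referencing incidents from different media streams in time and space to form events, the sketch below uses a greedy single-link grouping with assumed Incident fields and thresholds.

```python
from dataclasses import dataclass

@dataclass
class Incident:
    stream: str   # originating media stream, e.g. "video-03" or "text-report"
    t: float      # timestamp in seconds
    x: float      # planar position in metres
    y: float

def group_into_events(incidents, max_dt=60.0, max_dist=100.0):
    """Greedy single-link grouping: an incident joins an existing event if it is
    within max_dt seconds and max_dist metres of any incident already in it."""
    events = []
    for inc in sorted(incidents, key=lambda i: i.t):
        for ev in events:
            if any(abs(inc.t - o.t) <= max_dt and
                   ((inc.x - o.x) ** 2 + (inc.y - o.y) ** 2) ** 0.5 <= max_dist
                   for o in ev):
                ev.append(inc)
                break
        else:
            events.append([inc])
    return events

# toy usage: a video incident and a text incident close in time and space
incidents = [Incident("video-03", 0.0, 10.0, 10.0),
             Incident("text-report", 30.0, 50.0, 20.0),
             Incident("video-07", 5000.0, 900.0, 900.0)]
print(len(group_into_events(incidents)))   # 2 events
```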

Multimodal Sentiment Analysis Using Review Data and Product Information (리뷰 데이터와 제품 정보를 이용한 멀티모달 감성분석)

  • Hwang, Hohyun;Lee, Kyeongchan;Yu, Jinyi;Lee, Younghoon
    • The Journal of Society for e-Business Studies / v.27 no.1 / pp.15-28 / 2022
  • Due to the recent expansion of online markets such as clothing, utilizing customer reviews has become a major marketing measure. User reviews have been used as a tool for analyzing customer sentiment. Sentiment analysis can be broadly classified into machine learning-based and lexicon-based methods. The machine learning-based method learns a classification model from reviews and their labels. As sentiment analysis research has developed, multi-modal models trained on the image and video data in reviews have also been studied. The characteristics of the words in reviews differ depending on the product and customer categories. In this paper, sentiment is analyzed by considering both review data and the metadata of products and users. Gated Recurrent Unit (GRU), Long Short-Term Memory (LSTM), self-attention-based multi-head attention models, and Bidirectional Encoder Representations from Transformers (BERT) are used in this study, and the same Multi-Layer Perceptron (MLP) model is applied to all product information. This paper suggests a multi-modal sentiment analysis model that simultaneously considers user reviews and product meta-information.
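
As an illustration of the fusion pattern described (a sequence encoder over the review plus an MLP over product information, concatenated before classification), the PyTorch sketch below uses a GRU text encoder; the vocabulary size, feature dimensions, and two-class head are placeholders rather than the paper's exact models.

```python
import torch
import torch.nn as nn

class ReviewMetaSentiment(nn.Module):
    """A GRU encodes the review token sequence, an MLP encodes product/user
    metadata, and the two representations are concatenated for classification."""
    def __init__(self, vocab_size=10000, emb=64, hid=64, meta_dim=8, n_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb, padding_idx=0)
        self.gru = nn.GRU(emb, hid, batch_first=True)
        self.meta_mlp = nn.Sequential(nn.Linear(meta_dim, 32), nn.ReLU(),
                                      nn.Linear(32, 32))
        self.head = nn.Linear(hid + 32, n_classes)

    def forward(self, tokens, meta):
        # tokens: (B, L) integer ids, meta: (B, meta_dim) numeric features
        _, h = self.gru(self.embed(tokens))      # h: (1, B, hid)
        fused = torch.cat([h[-1], self.meta_mlp(meta)], dim=-1)
        return self.head(fused)

# toy usage with random token ids and metadata
model = ReviewMetaSentiment()
logits = model(torch.randint(1, 10000, (4, 20)), torch.randn(4, 8))
print(logits.shape)   # torch.Size([4, 2])
```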

Feasibility study on an acceleration signal-based translational and rotational mode shape estimation approach utilizing the linear transformation matrix

  • Seung-Hun Sung;Gil-Yong Lee;In-Ho Kim
    • Smart Structures and Systems / v.32 no.1 / pp.1-7 / 2023
  • In modal analysis, the mode shape reflects the vibration characteristics of a structure and is thus widely used for finite element model updating and structural health monitoring. Generally, the acceleration-based mode shape is suitable for expressing the translational vibration characteristics of structures; however, it has difficulty representing the rotational modes at boundary conditions. Tilt sensors and gyroscopes capable of measuring rotation are used to analyze the overall behavior of a structure, but extracting their mode shapes remains a major challenge under the consistently small vibrations. Herein, we conducted a feasibility study on a multi-mode-shape estimation approach utilizing a single physical-quantity signal. The basic concept of the proposed method is to receive multi-metric dynamic responses from two sensor types and obtain mode shapes through a bridge loading test with relatively large deformation. In addition, the linear transformation matrix for estimating the two mode shapes is derived, and the mode shape corresponding to the gyro-sensor data is obtained from the acceleration response under ambient vibration. Because the structure's behavior in both translational and rotational modes can be confirmed, the proposed method can obtain the total response of the structure while accounting for boundary conditions. To verify the feasibility of the proposed method, we pre-measured dynamic data from five accelerometers and five gyro sensors in a lab-scale test representing bridge structures and obtained a linear transformation matrix for estimating the multi-mode shapes. The mode shapes for both physical quantities could then be extracted using only the acceleration data. Finally, the mode shapes estimated by the proposed method were compared with those obtained from the two sensor types. This study confirmed the applicability of the multi-mode-shape estimation approach to accurate damage assessment using multi-dimensional mode shapes of bridge structures, and the approach can be used to evaluate the behavior of structures under ambient vibration.
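
A minimal NumPy sketch of the core idea, under assumed dimensions and random placeholder data: fit a linear transformation matrix T from paired accelerometer- and gyro-based mode shapes identified in the loading test, then use T to estimate the rotational mode shapes from acceleration-only mode shapes obtained under ambient vibration. The paper's actual derivation of T is not reproduced.

```python
import numpy as np

rng = np.random.default_rng(0)

# placeholder loading-test mode shapes at 5 sensor locations for 3 modes:
# translational shapes from accelerometers, rotational shapes from gyro sensors
phi_acc = rng.standard_normal((5, 3))
phi_gyro = rng.standard_normal((5, 3))

# linear transformation matrix T with phi_gyro ~= T @ phi_acc, fitted by least
# squares from the paired mode shapes identified in the large-deformation test
T = np.linalg.lstsq(phi_acc.T, phi_gyro.T, rcond=None)[0].T

# under ambient vibration only acceleration-based mode shapes are available;
# the rotational mode shapes are then estimated through T
phi_acc_ambient = rng.standard_normal((5, 3))
phi_gyro_est = T @ phi_acc_ambient
print(phi_gyro_est.shape)   # (5, 3)
```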