• Title/Summary/Keyword: multimodal

Search Results: 646

Efficient Multimodal Background Modeling and Motion Detection (효과적인 다봉 배경 모델링 및 물체 검출)

  • Park, Dae-Yong;Byun, Hae-Ran
    • Journal of KIISE: Computing Practices and Letters / v.15 no.6 / pp.459-463 / 2009
  • Background modeling and motion detection are among the most significant real-time video processing techniques. Many studies have addressed the topic, but achieving robustness still demands considerable computation time, which matters even more when other algorithms such as object tracking, classification, or behavior understanding run alongside. In this paper, we propose an efficient multimodal background modeling method, which can be understood as a simplified learning scheme for a Gaussian mixture model. We establish its validity using numerical methods and experimentally demonstrate its detection performance.
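
The abstract does not spell out the update rules, but a heavily simplified per-pixel Gaussian-mixture background model of the kind it alludes to might look like the sketch below (Python/NumPy; the number of modes, learning rate, and matching threshold are assumptions, not values from the paper):

```python
# Illustrative sketch only: a simplified per-pixel Gaussian-mixture background model.
# Mode replacement for unexplained pixels is omitted; parameters are assumed values.
import numpy as np

K, ALPHA, MATCH_SIGMAS = 3, 0.01, 2.5   # modes per pixel, learning rate, match threshold

def init_model(frame):
    h, w = frame.shape
    mu = np.repeat(frame[None].astype(np.float32), K, axis=0)   # (K, h, w) means
    var = np.full((K, h, w), 225.0, np.float32)                  # variances (std = 15)
    wgt = np.full((K, h, w), 1.0 / K, np.float32)                # mixture weights
    return [mu, var, wgt]

def update(frame, model):
    """Update the per-pixel mixture in place and return a foreground mask."""
    mu, var, wgt = model
    x = frame.astype(np.float32)[None]                  # broadcast over the K modes
    d2 = (x - mu) ** 2
    matched = d2 < (MATCH_SIGMAS ** 2) * var             # which modes explain the pixel
    best = np.argmax(np.where(matched, wgt, -1.0), axis=0)
    hit = (np.arange(K)[:, None, None] == best[None]) & matched
    rho = ALPHA * hit                                     # learn only the matched mode
    mu += rho * (x - mu)
    var += rho * (d2 - var)
    wgt = (1 - ALPHA) * wgt + ALPHA * hit
    wgt /= wgt.sum(axis=0, keepdims=True)
    model[:] = [mu, var, wgt]
    return ~matched.any(axis=0)                           # unexplained pixels = motion

# Toy usage on a random stand-in for a grayscale video stream.
frames = np.random.randint(0, 256, (5, 120, 160)).astype(np.uint8)
model = init_model(frames[0])
for f in frames[1:]:
    mask = update(f, model)
print(mask.shape, mask.mean())
```

Each incoming frame updates only the mixture component that best explains each pixel, which is the kind of simplified learning step the abstract contrasts with full Gaussian mixture model training.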

A Watermarking Algorithm for Multimodal Biometric Systems (다중 생체인식 시스템에 적합한 워터마킹 알고리즘)

  • Moon, Dae-Sung;Jung, Seung-Hwan;Kim, Tae-Hae;Chung, Yong-Wha;Moon, Ki-Young
    • Journal of the Korea Institute of Information Security & Cryptology / v.15 no.4 / pp.93-100 / 2005
  • In this paper, we describe biometric watermarking techniques for secure user verification in a remote, multimodal biometric system that employs both fingerprint and face information, and we quantitatively compare their effects on verification accuracy. To hide biometric data with watermarking, we consider two scenarios. In Scenario 1, a fingerprint image serves as the cover work and facial features are hidden in it; conversely, in Scenario 2, fingerprint features are hidden in a facial image. Based on the experimental results, we confirm that Scenario 2 is superior to Scenario 1 in terms of the verification accuracy of the watermarked image.
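
The paper's watermarking algorithm is not described in this abstract; as a generic illustration of Scenario 2 (hiding fingerprint features inside a face image), the sketch below embeds a hypothetical feature vector in the least significant bits of a cover image. LSB embedding is a stand-in, not the authors' method:

```python
# Illustrative sketch only: generic LSB embedding of a (hypothetical) fingerprint
# feature vector into a face image, mirroring the "Scenario 2" direction.
import numpy as np

def embed_lsb(cover: np.ndarray, features: np.ndarray) -> np.ndarray:
    """Hide quantized features in the LSBs of a uint8 cover image."""
    bits = np.unpackbits(features.astype(np.uint8))            # payload as a bit stream
    flat = cover.flatten()
    if bits.size > flat.size:
        raise ValueError("payload larger than cover image")
    flat[:bits.size] = (flat[:bits.size] & 0xFE) | bits         # overwrite least significant bits
    return flat.reshape(cover.shape)

def extract_lsb(stego: np.ndarray, n_bytes: int) -> np.ndarray:
    """Recover the embedded features from the stego image."""
    bits = stego.flatten()[:n_bytes * 8] & 1
    return np.packbits(bits)

# Toy usage with random stand-ins for a face image and quantized fingerprint features.
face = np.random.randint(0, 256, (128, 128), dtype=np.uint8)
minutiae = np.random.randint(0, 256, 64, dtype=np.uint8)
stego = embed_lsb(face, minutiae)
assert np.array_equal(extract_lsb(stego, 64), minutiae)
```

A robust biometric watermark would typically embed in a transform domain rather than raw LSBs; the point here is only the flow of hiding one modality's features inside the other modality's image.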

A Multimodal Fusion Method Based on a Rotation Invariant Hierarchical Model for Finger-based Recognition

  • Zhong, Zhen;Gao, Wanlin;Wang, Minjuan
    • KSII Transactions on Internet and Information Systems (TIIS) / v.15 no.1 / pp.131-146 / 2021
  • Multimodal biometric recognition has been an active topic in recent years because of its convenience. Since fingers are highly convenient for users, finger-based personal identification has been widely used in practice. Hence, taking the Finger-Print (FP), Finger-Vein (FV), and Finger-Knuckle-Print (FKP) as the constituent characteristics, their feature representations help improve the universality and reliability of identification. To fuse the multimodal finger features effectively, a new robust representation algorithm based on a hierarchical model was proposed. First, to obtain more robust features, feature maps were computed by Gabor magnitude feature coding and then described with the Local Binary Pattern (LBP). Second, the resulting Local Gabor Binary Pattern (LGBP) feature maps were processed hierarchically in a bottom-up manner by variable rectangular and circular granules, respectively. Finally, the intensity of each granule was represented by Local-invariant Gray Features (LGFs), yielding Hierarchical Local-Gabor-based Gray Invariant Features (HLGGIFs). Experimental results showed that the proposed algorithm copes with rotation variation of the finger pose and achieves a lower Equal Error Rate (EER) on our in-house database.
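
As a rough illustration of the first stage described above (Gabor magnitude coding followed by LBP description), the following sketch uses scikit-image; the hierarchical granule pooling and the HLGGIF descriptor itself are not reproduced, and the filter parameters are assumptions:

```python
# Illustrative sketch only: LBP-coded Gabor magnitude ("LGBP") feature maps for one
# finger image. Frequencies, orientations, and LBP parameters are assumed values.
import numpy as np
from skimage.filters import gabor
from skimage.feature import local_binary_pattern

def lgbp_maps(image, frequencies=(0.1, 0.2), n_orientations=4, P=8, R=1):
    """Return a list of LBP-coded Gabor magnitude maps for one image."""
    maps = []
    for f in frequencies:
        for k in range(n_orientations):
            theta = k * np.pi / n_orientations
            real, imag = gabor(image, frequency=f, theta=theta)
            magnitude = np.hypot(real, imag)                       # Gabor magnitude coding
            maps.append(local_binary_pattern(magnitude, P, R, method="uniform"))
    return maps

# Toy usage on a random stand-in; a real pipeline would use aligned FP/FV/FKP images.
img = np.random.rand(64, 64)
feature_maps = lgbp_maps(img)
print(len(feature_maps), feature_maps[0].shape)   # 8 maps of shape (64, 64)
```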

A multisource image fusion method for multimodal pig-body feature detection

  • Zhong, Zhen;Wang, Minjuan;Gao, Wanlin
    • KSII Transactions on Internet and Information Systems (TIIS) / v.14 no.11 / pp.4395-4412 / 2020
  • Multisource image fusion has become an active topic in the last few years owing to the higher segmentation rates it enables. To enhance the accuracy of multimodal pig-body feature segmentation, a multisource image fusion method was employed. However, conventional multisource fusion methods cannot deliver high contrast and abundant detail in the fused image. To better segment the shape feature and detect the temperature feature, a new multisource image fusion method, named NSST-GF-IPCNN, was presented. First, the source images were decomposed into a range of multiscale and multidirectional subbands by the Nonsubsampled Shearlet Transform (NSST). Then, to better describe fine-scale texture and edge information, an even-symmetric Gabor filter and an Improved Pulse Coupled Neural Network (IPCNN) were used to fuse the low- and high-frequency subbands, respectively. Next, the fused coefficients were reconstructed into a fusion image using the inverse NSST. Finally, the shape feature was extracted using an automatic thresholding algorithm and refined with morphological operations, and the highest pig-body temperature was then obtained from the segmentation results. Experiments revealed that the presented fusion algorithm achieved a 2.102-4.066% higher average accuracy rate than traditional algorithms while also improving efficiency.
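
NSST, the even-symmetric Gabor rule, and the IPCNN are not reimplemented here; purely as a structural illustration of subband-wise multisource fusion, the sketch below substitutes a Gaussian low/high split, an averaging rule for the low band, and a max-abs rule for the high band:

```python
# Illustrative sketch only: a crude two-band stand-in for the NSST-GF-IPCNN pipeline.
# The decomposition and both fusion rules differ from the paper's actual method.
import numpy as np
from scipy.ndimage import gaussian_filter

def fuse_pair(visible: np.ndarray, thermal: np.ndarray, sigma: float = 3.0) -> np.ndarray:
    """Fuse two registered single-channel images via a simple low/high-band scheme."""
    low_v, low_t = gaussian_filter(visible, sigma), gaussian_filter(thermal, sigma)
    high_v, high_t = visible - low_v, thermal - low_t
    fused_low = 0.5 * (low_v + low_t)                          # stand-in for the Gabor rule
    fused_high = np.where(np.abs(high_v) >= np.abs(high_t),    # stand-in for the IPCNN rule
                          high_v, high_t)
    return fused_low + fused_high                               # "inverse transform" step

# Toy usage with random stand-ins for registered visible and infrared pig-body images.
vis, ir = np.random.rand(256, 256), np.random.rand(256, 256)
fused = fuse_pair(vis, ir)
mask = fused > fused.mean()        # crude stand-in for automatic thresholding (e.g., Otsu)
print(fused.shape, mask.mean())
```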

Real-world multimodal lifelog dataset for human behavior study

  • Chung, Seungeun;Jeong, Chi Yoon;Lim, Jeong Mook;Lim, Jiyoun;Noh, Kyoung Ju;Kim, Gague;Jeong, Hyuntae
    • ETRI Journal / v.44 no.3 / pp.426-437 / 2022
  • To understand the multilateral characteristics of human behavior and the physiological markers related to physical, emotional, and environmental states, extensive lifelog data collection in a real-world environment is essential. Here, we propose a data collection method using multimodal mobile sensing and present a long-term dataset covering 22 subjects and 616 days of experimental sessions. The dataset contains over 10 000 hours of data, including physiological signals such as photoplethysmography, electrodermal activity, and skin temperature, in addition to multivariate behavioral data. Furthermore, it includes 10 372 user labels of emotional states and 590 days of sleep quality data. To demonstrate feasibility, human activity recognition was applied to the sensor data using a convolutional neural network-based deep learning model, achieving 92.78% recognition accuracy. From the activity recognition results, we extracted daily behavior patterns and discovered five representative models by applying spectral clustering. This demonstrates that the dataset contributes toward understanding human behavior using multimodal data accumulated throughout daily life under natural conditions.
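
The abstract does not describe the network, so the sketch below is only a generic 1-D convolutional classifier for windowed multimodal sensor data (channel count, window length, and number of activity classes are assumptions):

```python
# Illustrative sketch only: a generic 1-D CNN for human activity recognition over
# fixed-length sensor windows; not the architecture used in the paper.
import torch
import torch.nn as nn

class HARConvNet(nn.Module):
    def __init__(self, n_channels: int = 6, n_classes: int = 8):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(n_channels, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.MaxPool1d(2),
            nn.Conv1d(32, 64, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),          # pool over time -> one vector per window
        )
        self.classifier = nn.Linear(64, n_classes)

    def forward(self, x):                      # x: (batch, channels, time)
        return self.classifier(self.features(x).squeeze(-1))

# Toy forward pass: 4 windows, 6 sensor channels, 128 samples each.
model = HARConvNet()
logits = model(torch.randn(4, 6, 128))
print(logits.shape)                            # torch.Size([4, 8])
```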

Rapid Functional Enhancement of Ankylosing Spondylitis with Severe Hip Joint Arthritis and Muscle Strain (고관절염과 근 긴장을 동반한 강직성 척추염의 빠른 기능 회복)

  • Hwang, Sangwon;Im, Sang Hee;Shin, Ji Cheol;Park, Jinyoung
    • Clinical Pain / v.18 no.2 / pp.121-125 / 2019
  • Hip joint arthritis deteriorates the quality of life of patients with ankylosing spondylitis (AS). Secondary to the articular inflammatory process, shortened hip-girdle muscles contribute to decreased joint mobility, which may lead to functional impairment. Because the limitation of range of motion (ROM) usually progresses slowly, clinicians regard it as a chronic condition and prescribe long-term therapy. However, with short-term intensive multimodal treatment, a 20-year-old man diagnosed with AS and severely limited hip joint ROM, who relied on crutches, doubled his joint angle and could walk independently within only 2 weeks. The combination included intra-articular steroid injection, electrical twitch-obtaining intramuscular stimulation, extracorporeal shock wave therapy, heat, manual therapy, and stretching exercises. The management focused on relaxation of the hip-girdle muscles as well as direct control of intra-articular inflammation. We therefore emphasize the effectiveness of intensive multimodal treatment in improving function even within a short period.

Multimodal Medical Image Fusion Based on Two-Scale Decomposer and Detail Preservation Model (이중스케일분해기와 미세정보 보존모델에 기반한 다중 모드 의료영상 융합연구)

  • Zhang, Yingmei;Lee, Hyo Jong
    • Proceedings of the Korea Information Processing Society Conference / 2021.11a / pp.655-658 / 2021
  • The purpose of multimodal medical image fusion (MMIF) is to integrate images of different modalities, each carrying different details, into a single result image with rich information, which helps doctors accurately diagnose and treat diseased tissues. Motivated by this purpose, this paper proposes a novel method based on a two-scale decomposer and a detail preservation model. The first step is to use the two-scale decomposer to decompose each source image into energy layers and structure layers, which preserve detail. Then, a structure tensor operator and a max-abs rule are combined to fuse the structure layers. A detail preservation model is proposed for the fusion of the energy layers, which greatly improves image quality. The fused image is obtained by summing the two fused sub-images produced by the above fusion rules. Experiments demonstrate that the proposed method has superior performance compared with state-of-the-art fusion methods.
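
As a structural illustration only, the sketch below mimics the pipeline with crude stand-ins: a box filter as the two-scale decomposer, a gradient-energy choose-max rule in place of the structure tensor operator, and plain averaging in place of the detail preservation model:

```python
# Illustrative sketch only: the paper's decomposer and fusion rules are replaced by
# simple stand-ins; only the two-scale fuse-and-sum structure is shown.
import numpy as np
from scipy.ndimage import uniform_filter, sobel

def two_scale_fuse(img_a: np.ndarray, img_b: np.ndarray, size: int = 15) -> np.ndarray:
    # Two-scale decomposition: smooth energy layer + residual structure layer.
    energy_a, energy_b = uniform_filter(img_a, size), uniform_filter(img_b, size)
    struct_a, struct_b = img_a - energy_a, img_b - energy_b
    # Crude proxy for a structure-tensor rule: keep the structure pixel whose
    # local gradient energy is larger.
    grad_a = sobel(img_a, 0) ** 2 + sobel(img_a, 1) ** 2
    grad_b = sobel(img_b, 0) ** 2 + sobel(img_b, 1) ** 2
    fused_struct = np.where(grad_a >= grad_b, struct_a, struct_b)
    fused_energy = 0.5 * (energy_a + energy_b)   # stand-in for the detail preservation model
    return fused_energy + fused_struct           # sum of the two fused sub-images

# Toy usage on random stand-ins for registered CT/MRI slices.
ct, mri = np.random.rand(128, 128), np.random.rand(128, 128)
print(two_scale_fuse(ct, mri).shape)
```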

The Use of Graphic Novels for Developing Multiliteracies (그래픽노블을 통한 다중문식성의 발달)

  • Yun, Eunja
    • Journal of English Language & Literature / v.56 no.4 / pp.575-596 / 2010
  • The modes of narrative and communication have expanded due to social and cultural changes and technological development. Texts have thus become multimodal, and media hybridity and crossover have been increasing as well. Multimodality requires a new literacy to understand and interpret these multimodal texts beyond existing traditional literacy approaches. The New London Group (2000) argues that multiliteracies are needed to serve today's changing multimodal texts. Kress (2003) likewise argues that visual texts have become prevalent, mingled with other modes such as linguistic, audio, gestural, and spatial modes. Literary texts are no exception to this trend toward multimodality. The recent renaissance of comics, and in particular the new attention to graphic novels, can be interpreted in this historical vein. Compared with comics, no consensus has been reached on defining graphic novels; however, many recent studies have examined the potential of graphic novels for building multiliteracies. In this paper, the graphic novel as a literary genre is explored from a historical perspective, and a definition of graphic novels is attempted. In light of multiliteracies, the paper presents cases showing how graphic novels can be used to build multiliteracies. Lastly, the use of graphic novels in English as a foreign language is introduced. The author hopes that in the age of multimodality, language teachers and students will take into account the potential of graphic novels for language and literacy education and expand their territory of literacy.

Predicting Session Conversion on E-commerce: A Deep Learning-based Multimodal Fusion Approach

  • Minsu Kim;Woosik Shin;SeongBeom Kim;Hee-Woong Kim
    • Asia Pacific Journal of Information Systems / v.33 no.3 / pp.737-767 / 2023
  • With the availability of big customer data and advances in machine learning techniques, the prediction of customer behavior at the session level has attracted considerable attention from marketing practitioners and scholars. This study aims to predict customer purchase conversion at the session level by employing customer profile, transaction, and clickstream data. For this purpose, we develop a multimodal deep learning fusion model with dynamic and static features (DS-fusion). Specifically, we use page views within the focal visit as dynamic features and recency, frequency, monetary value, and clumpiness (RFMC) as static features to comprehensively capture customer characteristics related to buying behavior. Our model combines these features with deep learning architectures for conversion prediction. We validate the proposed model using real-world e-commerce data. The experimental results reveal that our model outperforms unimodal classifiers built on each feature set as well as classical machine learning models with dynamic and static features, including random forest and logistic regression. This study thus sheds light on the promise of machine learning approaches that combine complementary modalities for predicting customer behavior.
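
DS-fusion's exact architecture is not given in the abstract; the sketch below shows a generic two-branch fusion of a clickstream (dynamic) encoder and an RFMC (static) encoder for binary conversion prediction, with all layer sizes assumed:

```python
# Illustrative sketch only: a generic dynamic/static fusion model for session-level
# conversion prediction; not the authors' DS-fusion architecture.
import torch
import torch.nn as nn

class SessionConversionModel(nn.Module):
    def __init__(self, n_page_types: int = 50, static_dim: int = 4):
        super().__init__()
        self.embed = nn.Embedding(n_page_types, 16)            # page-view token embedding
        self.dynamic = nn.GRU(16, 32, batch_first=True)        # clickstream encoder
        self.static = nn.Sequential(nn.Linear(static_dim, 16), nn.ReLU())
        self.head = nn.Sequential(nn.Linear(32 + 16, 32), nn.ReLU(), nn.Linear(32, 1))

    def forward(self, pages, rfmc):                            # pages: (B, T) ints, rfmc: (B, 4)
        _, h = self.dynamic(self.embed(pages))                 # final GRU hidden state
        z = torch.cat([h.squeeze(0), self.static(rfmc)], dim=1)
        return torch.sigmoid(self.head(z)).squeeze(1)          # conversion probability

# Toy usage: 8 sessions of 20 page views each, plus 4 static RFMC values per customer.
model = SessionConversionModel()
prob = model(torch.randint(0, 50, (8, 20)), torch.randn(8, 4))
print(prob.shape)                                              # torch.Size([8])
```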

Enhancing Multimodal Emotion Recognition in Speech and Text with Integrated CNN, LSTM, and BERT Models (통합 CNN, LSTM, 및 BERT 모델 기반의 음성 및 텍스트 다중 모달 감정 인식 연구)

  • Edward Dwijayanto Cahyadi;Hans Nathaniel Hadi Soesilo;Mi-Hwa Song
    • The Journal of the Convergence on Culture Technology / v.10 no.1 / pp.617-623 / 2024
  • Identifying emotions from speech poses a significant challenge due to the complex relationship between language and emotion. Our paper takes on this challenge by employing feature engineering to identify emotions in a multimodal classification task involving both speech and text data. We evaluated two classifiers, a Convolutional Neural Network (CNN) and a Long Short-Term Memory (LSTM) network, each integrated with a BERT-based pre-trained model. Our assessment covers various performance metrics (accuracy, F-score, precision, and recall) across different experimental setups. The findings highlight the proficiency of both models in accurately discerning emotions from text and speech data.
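
The abstract names the ingredients but not the wiring; the sketch below shows one plausible arrangement in PyTorch, fusing a BERT text encoder with a small CNN over speech features (e.g., MFCCs). Feature dimensions, layer sizes, and the number of emotion classes are assumptions:

```python
# Illustrative sketch only: a BERT text branch fused with an audio CNN branch for
# multimodal emotion classification; not the paper's exact model.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class SpeechTextEmotionNet(nn.Module):
    def __init__(self, n_mfcc: int = 40, n_classes: int = 6):
        super().__init__()
        self.bert = AutoModel.from_pretrained("bert-base-uncased")
        self.audio_cnn = nn.Sequential(
            nn.Conv1d(n_mfcc, 64, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(),
        )
        self.head = nn.Linear(self.bert.config.hidden_size + 64, n_classes)

    def forward(self, input_ids, attention_mask, mfcc):
        text_vec = self.bert(input_ids=input_ids,
                             attention_mask=attention_mask).last_hidden_state[:, 0]  # [CLS]
        audio_vec = self.audio_cnn(mfcc)                      # mfcc: (B, n_mfcc, frames)
        return self.head(torch.cat([text_vec, audio_vec], dim=1))

# Toy usage: tokenize one transcript and pair it with random MFCC frames.
tok = AutoTokenizer.from_pretrained("bert-base-uncased")
batch = tok(["I am so happy today"], return_tensors="pt")
model = SpeechTextEmotionNet()
logits = model(batch["input_ids"], batch["attention_mask"], torch.randn(1, 40, 120))
print(logits.shape)                                           # torch.Size([1, 6])
```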