Title/Summary/Keyword: Information Modalities

A Study on Multiple Modalities for Face Anti-Spoofing (얼굴 스푸핑 방지를 위한 다중 양식에 관한 연구)

  • Wu, Chenmou; Lee, Hyo Jong
    • Proceedings of the Korea Information Processing Society Conference / 2021.11a / pp.651-654 / 2021
  • Face anti-spoofing (FAS) techniques play a significant role in defending facial recognition systems against spoofing attacks. Existing FAS methods achieve strong performance by relying on additional annotated modalities. However, labeling these high-cost modalities requires considerable manpower, device resources, and time. In this work, we propose using self-transforming modalities instead of annotated modalities. Three modalities based on the frequency domain and the temporal domain are applied and analyzed, and intuitive visualization analysis shows the advantages of each. Comprehensive experiments on both CNN-based and transformer-based architectures with various modality combinations demonstrate that self-transforming modalities substantially improve the vanilla network. The code is available at https://github.com/chenmou0410/FAS-Challenge2021.
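
The abstract does not specify the exact self-transforming modalities; below is a minimal sketch, assuming one frequency-domain channel (log-magnitude FFT) and one temporal-domain channel (frame differencing) derived from the input frames themselves. All names and shapes are illustrative:

```python
import numpy as np

def frequency_modality(gray: np.ndarray) -> np.ndarray:
    """Frequency-domain channel: log-magnitude of the centered 2-D FFT."""
    return np.log1p(np.abs(np.fft.fftshift(np.fft.fft2(gray))))

def temporal_modality(curr: np.ndarray, prev: np.ndarray) -> np.ndarray:
    """Temporal-domain channel: absolute difference of consecutive frames."""
    return np.abs(curr.astype(np.float32) - prev.astype(np.float32))

# Stand-ins for two consecutive grayscale face crops.
curr, prev = np.random.rand(2, 112, 112)

# Stack the self-transformed channels with the raw frame as network input;
# no extra annotation is required to produce them.
x = np.stack([curr, frequency_modality(curr), temporal_modality(curr, prev)])
```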

Multimodal Biometrics Recognition from Facial Video with Missing Modalities Using Deep Learning

  • Maity, Sayan; Abdel-Mottaleb, Mohamed; Asfour, Shihab S.
    • Journal of Information Processing Systems / v.16 no.1 / pp.6-29 / 2020
  • Biometrics identification using multiple modalities has attracted the attention of many researchers because it produces more robust and trustworthy results than single-modality biometrics. In this paper, we present a novel multimodal recognition system that trains a deep learning network to automatically learn features after extracting multiple biometric modalities from a single data source, i.e., facial video clips. Utilizing the different modalities present in the facial video clips, i.e., left ear, left profile face, frontal face, right profile face, and right ear, we train supervised denoising auto-encoders to automatically extract robust and non-redundant features. The automatically learned features are then used to train modality-specific sparse classifiers that perform the multimodal recognition. Moreover, the proposed technique proved robust when some of the above modalities were missing during testing. The proposed system has three main components: detection, which consists of modality-specific detectors that automatically find images of the different modalities present in facial video clips; feature selection, which uses a supervised denoising sparse auto-encoder network to capture discriminative representations that are robust to illumination and pose variations; and classification, which consists of a set of modality-specific sparse representation classifiers for unimodal recognition, followed by score-level fusion of the recognition results of the available modalities. Experiments conducted on the constrained facial video dataset (WVU) and the unconstrained facial video dataset (HONDA/UCSD) yielded Rank-1 recognition rates of 99.17% and 97.14%, respectively. The multimodal recognition accuracy demonstrates the superiority and robustness of the proposed approach irrespective of the illumination, non-planar movement, and pose variations present in the video clips, even when some modalities are missing.
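
The exact fusion rule is not given in the abstract; a minimal sketch of score-level fusion that tolerates missing modalities might look like the following (the modality names and scores are made up):

```python
import numpy as np

# Hypothetical per-modality match scores against three enrolled identities;
# None marks a modality that was not detected in the test video.
scores = {
    "frontal_face":  np.array([0.91, 0.40, 0.12]),
    "left_profile":  np.array([0.83, 0.35, 0.20]),
    "right_profile": None,
    "left_ear":      np.array([0.77, 0.52, 0.09]),
    "right_ear":     None,
}

def fuse(scores):
    """Average the classifier scores of whichever modalities are available."""
    available = [s for s in scores.values() if s is not None]
    return np.mean(available, axis=0)

print("Rank-1 identity:", int(np.argmax(fuse(scores))))  # -> 0
```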

An Analysis of Collaborative Visualization Processing of Text Information for Developing e-Learning Contents

  • SUNG, Eunmo
    • Educational Technology International / v.10 no.1 / pp.25-40 / 2009
  • The purpose of this study was to explore procedures and modalities of collaborative visualization processing of text information for developing e-Learning contents. Two research questions were investigated: 1) what are the procedures of collaborative visualization processing of text information, and 2) what patterns and modalities can be found in each procedure. The study employed a qualitative research approach based on grounded theory. The results showed that collaborative visualization processing of text information emerged in six steps: identifying text, analyzing text, exploring visual clues, creating visuals, discussing visuals, and elaborating visuals. The process exhibited a systemic and systematic character, resembling a spiral sequence. The modalities of collaborative visualization processing also fell into two dimensions: individual processing through internal representation and social processing through external representation. This case study suggests that a collaborative visualization strategy has strong potential to provide methods for sharing cognitive or thinking systems by drawing on human visual intelligence.

Enhancing Recommender Systems by Fusing Diverse Information Sources through Data Transformation and Feature Selection

  • Thi-Linh Ho; Anh-Cuong Le; Dinh-Hong Vu
    • KSII Transactions on Internet and Information Systems (TIIS) / v.17 no.5 / pp.1413-1432 / 2023
  • Recommender systems aim to recommend items to users by taking into account their probable interests. This study focuses on creating a model that utilizes multiple sources of information about users and items through a multimodality approach. It addresses how to gather information from different sources (modalities) and transform it into a uniform format, resulting in a multi-modal feature description for users and items. The work also transforms and represents the features extracted from the different modalities so that the information is in a compatible format for integration and retains the important, useful information needed by the prediction model. To achieve this goal, we propose a novel multi-modal recommendation model that extracts latent features of users and items from a utility matrix using matrix factorization techniques. Various transformation techniques are used to extract features from other information sources such as user reviews, item descriptions, and item categories. We also propose using Principal Component Analysis (PCA) and feature selection techniques to reduce the data dimension, extract important features, and remove noisy features, thereby increasing the accuracy of the model. We evaluated several experimental models based on different subsets of modalities on the MovieLens and Amazon sub-category datasets. According to the experimental results, the proposed model significantly enhances the accuracy of recommendations compared to SVD, which is acknowledged as one of the most effective models for recommender systems. Specifically, the proposed model reduces the RMSE by 4.8% to 21.43% and increases the Precision by 2.07% to 26.49% on the Amazon datasets. Similarly, on the MovieLens dataset, the proposed model reduces the RMSE by 45.61% and increases the Precision by 14.06%. Additionally, the experimental results on both datasets demonstrate that combining information from multiple modalities in the proposed model leads to superior outcomes compared to relying on a single type of information.
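
As a rough sketch of the feature pipeline described above (matrix factorization for latent features, PCA on side information, then fusion), under the assumption of TF-IDF-style text features; the paper's actual transformations and dimensions differ:

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD, PCA

rng = np.random.default_rng(0)
utility = rng.random((100, 50))    # stand-in users x items rating matrix
item_text = rng.random((50, 300))  # stand-in text features of item descriptions

# Latent item features via matrix factorization of the utility matrix.
item_latent = TruncatedSVD(n_components=16, random_state=0).fit_transform(utility.T)

# PCA compresses the text modality into a compatible, low-noise format.
text_reduced = PCA(n_components=16).fit_transform(item_text)

# Fused multi-modal item description for the downstream prediction model.
item_features = np.hstack([item_latent, text_reduced])  # shape (50, 32)
```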

Uncooperative Person Recognition Based on Stochastic Information Updates and Environment Estimators

  • Kim, Hye-Jin; Kim, Dohyung; Lee, Jaeyeon; Jeong, Il-Kwon
    • ETRI Journal / v.37 no.2 / pp.395-405 / 2015
  • We address the problem of uncooperative person recognition through continuous monitoring. Multiple modalities, such as face, height, clothes color, and voice, can be used when attempting to recognize a person. In general, not all modalities are available in a given frame; furthermore, only some modalities are useful, as some frames in a video sequence are of too low a quality to recognize a person. We propose a method that uses stochastic information updates of temporal modalities and environment estimators to improve person recognition performance. The environment estimators indicate whether a given modality is reliable enough to be used in a particular instance; such indicators let us easily identify and eliminate meaningless data, increasing the overall efficiency of the method. Our proposed method was tested using movie clips acquired in an unconstrained environment that included wide variations in scale and rotation, illumination changes, uncontrolled camera-to-user distances (varying from 0.5 m to 5 m), and natural views of the human body with various types of noise. In this real and challenging scenario, our proposed method delivered outstanding performance.
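
The abstract leaves the update rule unspecified; one way to read "stochastic information updates" gated by environment estimators is a Bayes-style belief update that skips unreliable modalities. Everything below is an illustrative assumption, not the paper's method:

```python
import numpy as np

def update_belief(belief, frame_scores, reliable):
    """Fold per-frame modality evidence into the identity belief,
    using only modalities the environment estimators deem reliable."""
    for modality, likelihood in frame_scores.items():
        if reliable.get(modality, False):
            belief = belief * likelihood
    return belief / belief.sum()

belief = np.full(3, 1 / 3)  # uniform prior over three known people
frame_scores = {"face":   np.array([0.70, 0.20, 0.10]),
                "height": np.array([0.40, 0.40, 0.20]),
                "voice":  np.array([0.50, 0.30, 0.20])}
reliable = {"face": True, "height": True, "voice": False}  # e.g., noisy audio
belief = update_belief(belief, frame_scores, reliable)
```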

Evaluation of In-vehicle Warning Information Modalities by Kansei Engineering (감성공학을 이용한 차내 경고정보 제공방식 평가)

  • Park, Jun-Yeong; O, Cheol; Kim, Myeong-Ju; Jang, Myeong-Sun
    • Journal of Korean Society of Transportation / v.28 no.3 / pp.39-49 / 2010
  • Provision of in-vehicle warning information is of keen interest since it can be used effectively to prevent traffic accidents. This study evaluates the effectiveness of information provision modalities based on Kansei engineering. Various warning information scenarios using different modalities were devised for the evaluation. Statistical analysis techniques including factor analysis, correlation analysis, and the general linear model were used to assess users' affective responses to the information modalities. The evaluation results show that visual information consisting of text and pictogram leads to higher understandability, and the combination of a beep sound and a voice message was identified as the more effective modality for auditory warnings. In addition, users preferred red for the blinking warning signal.
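
For illustration, affective ratings of the warning scenarios could be reduced with factor analysis as follows; the semantic-differential data here are simulated, not the study's:

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(1)
# Simulated Kansei data: 60 subjects rating 10 adjective-pair scales (1-7).
ratings = rng.integers(1, 8, size=(60, 10)).astype(float)

fa = FactorAnalysis(n_components=2, random_state=1)
factor_scores = fa.fit_transform(ratings)  # each subject's affective factors
loadings = fa.components_                  # which scales define each factor
```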

Emotion Recognition based on Multiple Modalities

  • Kim, Dong-Ju; Lee, Hyeon-Gu; Hong, Kwang-Seok
    • Journal of the Institute of Convergence Signal Processing / v.12 no.4 / pp.228-236 / 2011
  • Emotion recognition plays an important role in human-computer interaction research, as it allows more natural, human-like communication between humans and computers. Most previous work on emotion recognition focused on extracting emotions from face, speech, or EEG information separately. This paper therefore presents a novel approach that combines face, speech, and EEG to recognize human emotion. The individual matching scores obtained from face, speech, and EEG are combined using a weighted summation, and the fused score is used to classify the emotion. In the experiments, the proposed approach gives an improvement of more than 18.64% over the most successful unimodal approach and also outperforms approaches that integrate only two modalities. These results confirm that the proposed approach achieves a significant performance improvement and is highly effective.
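
The weighted-summation fusion is straightforward to sketch; the weights and scores below are placeholders, not the values used in the paper:

```python
import numpy as np

def weighted_fusion(face, speech, eeg, w=(0.5, 0.3, 0.2)):
    """Fuse per-emotion matching scores by weighted summation."""
    fused = w[0] * face + w[1] * speech + w[2] * eeg
    return int(np.argmax(fused))  # index of the recognized emotion

# Hypothetical matching scores over four emotion classes.
face   = np.array([0.60, 0.20, 0.10, 0.10])
speech = np.array([0.30, 0.40, 0.20, 0.10])
eeg    = np.array([0.25, 0.25, 0.40, 0.10])
print(weighted_fusion(face, speech, eeg))  # -> 0
```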

Hybrid feature extraction of multimodal images for face recognition

  • Cheema, Usman; Moon, Seungbin
    • Proceedings of the Korea Information Processing Society Conference / 2018.10a / pp.880-881 / 2018
  • Recent technological advancements have made visible, infrared, and thermal imaging systems readily available for security and access control. The increasing use of facial recognition for security and access control has led to emerging spoofing methodologies. To overcome the challenges of occlusion, replay attacks, and disguise, researchers have proposed using multiple imaging modalities. Using infrared and thermal modalities alongside visible imaging helps overcome the shortcomings of visible imaging alone. In this paper we review and propose hybrid feature extraction methods that combine data from multiple imaging systems simultaneously.
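
A minimal sketch of hybrid feature extraction: compute a descriptor per modality and concatenate. Real systems would use stronger descriptors (LBP, HOG, CNN embeddings); the histogram here is only a placeholder:

```python
import numpy as np

def histogram_descriptor(img: np.ndarray, bins: int = 32) -> np.ndarray:
    """Toy per-modality descriptor: a normalized intensity histogram."""
    hist, _ = np.histogram(img, bins=bins, range=(0.0, 1.0))
    return hist / hist.sum()

# Stand-ins for co-registered visible, infrared, and thermal face images.
visible, infrared, thermal = np.random.rand(3, 128, 128)

# Hybrid feature vector: modality descriptors concatenated, shape (96,).
feature = np.concatenate([histogram_descriptor(m)
                          for m in (visible, infrared, thermal)])
```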

Effects of Presentation Modalities of Television Moving Image and Print Text on Children's and Adult's Recall (TV동영상과 신문텍스트의 정보제시특성이 어린이와 성인의 정보기억에 미치는 영향)

  • Choi, E-Jung
    • The Journal of the Korea Contents Association / v.9 no.7 / pp.149-158 / 2009
  • The major purpose of this study is to explore the effect of the presentation modalities of television and print on children's and adults' recall. An experiment was conducted comparing children's and adults' recall of information presented in three different modalities: television moving image with auditory-visual redundancy, television moving image without auditory-visual redundancy, and print text. Results indicated that children remembered more information from the television moving images than from the print versions, regardless of auditory-visual redundancy. For adults, however, the advantage of television was found only for information that had been accompanied by redundant pictures in the television moving image, providing support for the dual-coding hypothesis.

Hybrid Imaging in Oncology

  • Fatima, Nosheen; uz Zaman, Maseeh; Gnanasegaran, Gopinath; Zaman, Unaiza; Shahid, Wajeeha; Zaman, Areeba; Tahseen, Rabia
    • Asian Pacific Journal of Cancer Prevention / v.16 no.14 / pp.5599-5605 / 2015
  • In oncology, various imaging modalities play a crucial role in the diagnosis, staging, restaging, treatment monitoring, and follow-up of various cancers. Stand-alone morphological imaging such as computerized tomography (CT) and magnetic resonance imaging (MRI) provides a wealth of anatomical detail about a tumor but reveals relatively little about tumor physiology. Stand-alone functional imaging such as positron emission tomography (PET) and single photon emission tomography (SPECT) is rich in functional information but provides little insight into tumor morphology. The introduction of the first hybrid modality, PET/CT, is one of the great success stories of this century and has revolutionized patient care in oncology through its high diagnostic accuracy. Spurred on by this success, further hybrid imaging modalities such as SPECT/CT and PET/MR were introduced. It is now time to explore the potential applications of the existing hybrid modalities, develop and implement standardized imaging protocols, and train users in nuclear medicine and radiology. In this review we discuss the three existing hybrid modalities with emphasis on their technical aspects and clinical applications in oncology.