• Title/Abstract/Keywords: Information Modalities


A Study on Multiple Modalities for Face Anti-Spoofing

  • 오신모;이효종
    • Korea Information Processing Society Conference Proceedings / Proceedings of the KIPS 2021 Fall Conference / pp.651-654 / 2021
  • Face anti-spoofing (FAS) techniques play a significant role in defending facial recognition systems against spoofing attacks. Existing FAS methods achieve strong performance by depending on annotated additional modalities. However, labeling these high-cost modalities requires substantial manpower, device resources, and time. In this work, we propose using self-transforming modalities instead of annotated modalities. Three different modalities based on the frequency and temporal domains are applied and analyzed. Intuitive visualization analysis shows the advantages of each modality. Comprehensive experiments on both CNN-based and transformer-based architectures with various modality combinations demonstrate that self-transforming modalities substantially improve the vanilla network. The code is available at https://github.com/chenmou0410/FAS-Challenge2021.
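A minimal sketch of the self-transforming-modality idea, assuming NumPy and OpenCV (the paper's exact transforms live in its repository; the functions below are illustrative, not the authors' code):

```python
import numpy as np
import cv2  # OpenCV, used here only for color conversion

def frequency_modality(frame_bgr):
    """Log-magnitude spectrum of a frame as a frequency-domain modality."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY).astype(np.float32)
    spectrum = np.fft.fftshift(np.fft.fft2(gray))
    log_mag = np.log1p(np.abs(spectrum))
    # Normalize to [0, 1] so the map can be stacked as an extra input channel.
    return (log_mag - log_mag.min()) / (log_mag.max() - log_mag.min() + 1e-8)

def temporal_modality(frame_t, frame_prev):
    """Absolute frame difference as a simple temporal-domain modality."""
    g1 = cv2.cvtColor(frame_t, cv2.COLOR_BGR2GRAY).astype(np.float32)
    g0 = cv2.cvtColor(frame_prev, cv2.COLOR_BGR2GRAY).astype(np.float32)
    return np.abs(g1 - g0) / 255.0
```

Either map can be concatenated with the RGB channels and fed to a CNN or transformer backbone without any extra annotation, which is the point of self-transforming modalities.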

Multimodal Biometrics Recognition from Facial Video with Missing Modalities Using Deep Learning

  • Maity, Sayan;Abdel-Mottaleb, Mohamed;Asfour, Shihab S.
    • Journal of Information Processing Systems / Vol. 16, No. 1 / pp.6-29 / 2020
  • Biometric identification using multiple modalities has attracted the attention of many researchers because it produces more robust and trustworthy results than single-modality biometrics. In this paper, we present a novel multimodal recognition system that trains a deep learning network to automatically learn features after extracting multiple biometric modalities from a single data source, i.e., facial video clips. Utilizing the different modalities present in the facial video clips, i.e., left ear, left profile face, frontal face, right profile face, and right ear, we train supervised denoising auto-encoders to automatically extract robust and non-redundant features. The automatically learned features are then used to train modality-specific sparse classifiers for multimodal recognition. Moreover, the proposed technique proved robust when some of the above modalities were missing during testing. The proposed system has three main components: detection, which consists of modality-specific detectors that automatically locate images of the different modalities present in facial video clips; feature selection, which uses a supervised denoising sparse auto-encoder network to capture discriminative representations that are robust to illumination and pose variations; and classification, which consists of a set of modality-specific sparse representation classifiers for unimodal recognition, followed by score-level fusion of the recognition results of the available modalities. Experiments conducted on the constrained facial video dataset (WVU) and the unconstrained facial video dataset (HONDA/UCSD) yielded Rank-1 recognition rates of 99.17% and 97.14%, respectively. The multimodal recognition accuracy demonstrates the superiority and robustness of the proposed approach irrespective of the illumination, non-planar movement, and pose variations present in the video clips, even when modalities are missing.
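The fusion step only needs whichever modality scores are available for a probe. A minimal sketch of that behavior, with hypothetical scores (not the paper's data or classifiers):

```python
import numpy as np

# Hypothetical per-modality match scores for one probe clip; None marks a
# modality that was not detected in the clip.
scores = {
    "frontal_face": 0.91,
    "left_profile": 0.84,
    "right_profile": None,
    "left_ear": 0.77,
    "right_ear": None,
}

def fuse_scores(scores):
    """Score-level fusion over whichever modalities are present."""
    available = [s for s in scores.values() if s is not None]
    if not available:
        raise ValueError("no modality available for this probe")
    return float(np.mean(available))

print(f"fused score: {fuse_scores(scores):.3f}")  # fused score: 0.840
```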

An Analysis of Collaborative Visualization Processing of Text Information for Developing e-Learning Contents

  • SUNG, Eunmo
    • Educational Technology International / Vol. 10, No. 1 / pp.25-40 / 2009
  • The purpose of this study was to explore the procedures and modalities of collaborative visualization processing of text information for developing e-Learning contents. Two research questions were explored: 1) what are the procedures of collaborative visualization processing of text information, and 2) what patterns and modalities can be found in each procedure. The study employed a qualitative research approach based on grounded theory. The results showed that collaborative visualization processing of text information emerged in six steps: identifying text, analyzing text, exploring visual clues, creating visuals, discussing visuals, and elaborating visuals. The process exhibited a systemic and systematic character, resembling a spiral sequence. The modalities of collaborative visualization processing also divided into two dimensions: individual processing through internal representation, and social processing through external representation. This case study suggests that a collaborative visualization strategy is a promising method for sharing cognitive and thinking systems by drawing on human visual intelligence.

Enhancing Recommender Systems by Fusing Diverse Information Sources through Data Transformation and Feature Selection

  • Thi-Linh Ho;Anh-Cuong Le;Dinh-Hong Vu
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 17, No. 5 / pp.1413-1432 / 2023
  • Recommender systems aim to recommend items to users by taking into account their probable interests. This study focuses on creating a model that utilizes multiple sources of information about users and items through a multimodality approach. It addresses how to gather information from different sources (modalities) and transform it into a uniform format, resulting in a multi-modal feature description for users and items. This work also transforms and represents the features extracted from the different modalities so that the information is in a format compatible for integration and contains important, useful information for the prediction model. To achieve this goal, we propose a novel multi-modal recommendation model that extracts latent features of users and items from a utility matrix using matrix factorization techniques. Various transformation techniques are used to extract features from other sources of information such as user reviews, item descriptions, and item categories. We also propose using Principal Component Analysis (PCA) and feature selection to reduce the data dimension, extract important features, and remove noisy features, thereby increasing the accuracy of the model. We evaluated several experimental models based on different subsets of modalities on the MovieLens and Amazon sub-category datasets. According to the experimental results, the proposed model significantly enhances recommendation accuracy compared to SVD, which is acknowledged as one of the most effective models for recommender systems. Specifically, the proposed model reduces RMSE by 4.8% to 21.43% and increases Precision by 2.07% to 26.49% on the Amazon datasets. Similarly, on the MovieLens dataset, the proposed model reduces RMSE by 45.61% and increases Precision by 14.06%. Additionally, the experimental results on both datasets demonstrate that combining information from multiple modalities in the proposed model leads to superior outcomes compared to relying on a single type of information.
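The core pipeline, latent factors from the utility matrix plus PCA-compressed side features concatenated into one representation, can be sketched as follows; a minimal illustration with scikit-learn on toy data standing in for the paper's datasets (all sizes and names here are hypothetical):

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD, PCA
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy stand-ins for the utility matrix and the item-description modality.
rng = np.random.default_rng(0)
utility = rng.integers(0, 6, size=(100, 40)).astype(float)  # users x items
descriptions = [f"sample description for item {i}" for i in range(40)]

# Latent item factors from the utility matrix (matrix factorization step).
item_factors = TruncatedSVD(n_components=8, random_state=0).fit_transform(utility.T)

# Text modality: TF-IDF, then PCA down to a compact, compatible representation.
tfidf = TfidfVectorizer().fit_transform(descriptions).toarray()
text_features = PCA(n_components=4, random_state=0).fit_transform(tfidf)

# Multi-modal item representation: concatenate the transformed modalities.
item_repr = np.hstack([item_factors, text_features])
print(item_repr.shape)  # (40, 12) -- one uniform feature vector per item
```

A feature-selection pass over `item_repr` would follow in the full model; the point here is only the transform-then-concatenate structure.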

Uncooperative Person Recognition Based on Stochastic Information Updates and Environment Estimators

  • Kim, Hye-Jin;Kim, Dohyung;Lee, Jaeyeon;Jeong, Il-Kwon
    • ETRI Journal / Vol. 37, No. 2 / pp.395-405 / 2015
  • We address the problem of uncooperative person recognition through continuous monitoring. Multiple modalities, such as face, height, clothes color, and voice, can be used when attempting to recognize a person. In general, not all modalities are available in a given frame; furthermore, only some modalities are useful, since some frames in a video sequence are of too low a quality to recognize a person. We propose a method that uses stochastic information updates of temporal modalities and environment estimators to improve person recognition performance. The environment estimators indicate whether a given modality is reliable enough to be used in a particular instance; such indicators let us easily identify and eliminate meaningless data, increasing the overall efficiency of the method. The proposed method was tested on movie clips acquired in an unconstrained environment with wide variation in scale and rotation, illumination changes, uncontrolled camera-to-user distances (varying from 0.5 m to 5 m), and natural views of the human body with various types of noise. In this real and challenging scenario, the proposed method performed outstandingly.
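A minimal sketch of reliability-gated stochastic updates (not the paper's exact formulation; the identities, likelihoods, and environment-estimator outputs below are hypothetical):

```python
# Per-frame identity belief, updated only with modalities that an
# environment estimator deems reliable in that frame.
identities = ["alice", "bob", "carol"]
belief = {name: 1.0 / len(identities) for name in identities}

def update_belief(belief, modality_likelihoods, reliable):
    """Multiplicative (Bayes-style) update using only reliable modalities."""
    updated = dict(belief)
    for modality, lik in modality_likelihoods.items():
        if not reliable.get(modality, False):
            continue  # e.g., face too small or audio too noisy in this frame
        for name in updated:
            updated[name] *= lik[name]
    total = sum(updated.values())
    return {name: p / total for name, p in updated.items()}

# Hypothetical frame: face is reliable, voice is not.
likelihoods = {
    "face": {"alice": 0.7, "bob": 0.2, "carol": 0.1},
    "voice": {"alice": 0.4, "bob": 0.3, "carol": 0.3},
}
belief = update_belief(belief, likelihoods, reliable={"face": True, "voice": False})
print(max(belief, key=belief.get))  # alice
```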

Evaluation of In-vehicle Warning Information Modalities by Kansei Engineering

  • 박준영;오철;김명주;장명순
    • Journal of Korean Society of Transportation / Vol. 28, No. 3 / pp.39-49 / 2010
  • This study used a Kansei (affective) engineering analysis methodology to derive modalities for presenting traffic-safety warning information to which drivers respond effectively on an affective level. Traffic-safety warning information alerts drivers to hazards ahead in advance, inducing appropriate responses for accident avoidance, and can be delivered through in-vehicle devices such as navigation units. Warning information consists of combinations of presentation modalities; nine scenarios were constructed and two rounds of surveys were conducted. The analysis followed Kansei engineering type-I methods using the semantic differential method, correlation analysis, and Quantification Theory Type I, and examined drivers' affective characteristics by gender. The results showed that drivers' affective responses to each presentation modality differed overall by gender. Among the combinations of modalities, 'auditory element: beep + voice guidance', 'message window: text + pictogram', and 'background flashing: red flashing' produced the highest affective response and preference. These results are expected to serve as useful data for designing and delivering effective traffic-safety warning information that takes drivers' affective characteristics into account.

Emotion Recognition based on Multiple Modalities

  • Kim, Dong-Ju;Lee, Hyeon-Gu;Hong, Kwang-Seok
    • Journal of the Institute of Convergence Signal Processing / Vol. 12, No. 4 / pp.228-236 / 2011
  • Emotion recognition plays an important role in human-computer interaction research, enabling more natural, human-like communication between humans and computers. Most previous work on emotion recognition focused on extracting emotions from face, speech, or EEG information separately. This paper therefore presents a novel approach that combines face, speech, and EEG to recognize human emotion. The individual matching scores obtained from face, speech, and EEG are combined using a weighted-summation operation, and the fused score is used to classify the emotion. In the experiments, the proposed approach improves on the most successful unimodal approach by more than 18.64% and also outperforms approaches that integrate only two modalities. These results confirm that the proposed approach achieves a significant performance improvement and is highly effective.
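A minimal sketch of weighted-summation score fusion; the weights and per-class matching scores below are illustrative toy values, not the paper's:

```python
import numpy as np

emotions = ["happy", "sad", "angry", "neutral"]

# Hypothetical per-class matching scores from the three unimodal recognizers.
face_scores = np.array([0.55, 0.15, 0.20, 0.10])
speech_scores = np.array([0.40, 0.25, 0.25, 0.10])
eeg_scores = np.array([0.35, 0.30, 0.20, 0.15])

weights = {"face": 0.5, "speech": 0.3, "eeg": 0.2}  # illustrative, sum to 1

# Weighted summation of the modality scores, then classify by the fused score.
fused = (weights["face"] * face_scores
         + weights["speech"] * speech_scores
         + weights["eeg"] * eeg_scores)
print(emotions[int(np.argmax(fused))])  # "happy" for these toy scores
```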

Hybrid feature extraction of multimodal images for face recognition

  • Cheema, Usman;Moon, Seungbin
    • Korea Information Processing Society Conference Proceedings / Proceedings of the KIPS 2018 Fall Conference / pp.880-881 / 2018
  • Recent technological advances have made visible, infrared, and thermal imaging systems readily available for security and access control. The growing use of facial recognition for security and access control brings with it emerging spoofing methodologies. To overcome the challenges of occlusion, replay attacks, and disguise, researchers have proposed using multiple imaging modalities. Using infrared and thermal modalities alongside visible imaging helps overcome the shortcomings of visible imaging alone. In this paper we review and propose hybrid feature extraction methods that combine data from multiple imaging systems simultaneously.
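A minimal sketch of hybrid feature extraction under stated assumptions: a simple per-modality descriptor concatenated across co-registered visible, infrared, and thermal crops (the descriptor and inputs below are illustrative stand-ins, not the paper's method):

```python
import numpy as np
import cv2

def modality_features(img_gray, bins=32):
    """Normalized intensity histogram; a stand-in for richer descriptors."""
    hist = cv2.calcHist([img_gray], [0], None, [bins], [0, 256]).ravel()
    return hist / (hist.sum() + 1e-8)

# Hypothetical co-registered face crops from the three imaging systems.
visible = np.random.default_rng(0).integers(0, 256, (64, 64), dtype=np.uint8)
infrared = np.random.default_rng(1).integers(0, 256, (64, 64), dtype=np.uint8)
thermal = np.random.default_rng(2).integers(0, 256, (64, 64), dtype=np.uint8)

# Hybrid descriptor: one vector combining all three modalities.
hybrid = np.concatenate([modality_features(m) for m in (visible, infrared, thermal)])
print(hybrid.shape)  # (96,) -- fed to a single downstream classifier
```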

Effects of Presentation Modalities of Television Moving Image and Print Text on Children's and Adults' Recall

  • 최이정
    • Journal of the Korea Contents Association / Vol. 9, No. 7 / pp.149-158 / 2009
  • This study examined how children's and adults' recall of information differs with the presentation modalities of television moving images and print text. In an experiment, the same information stories were presented to children and adults in three formats: 'TV moving image 1 (redundant visuals and audio)', 'TV moving image 2 (separate visuals and audio)', and 'print text', and their recall was compared. The results showed that children remembered TV moving-image information better than print text regardless of whether visuals and audio were redundant. Adults, however, supported the dual-coding hypothesis only when visuals and audio were redundant; only in that case did TV moving images show a clear advantage over print text.

Hybrid Imaging in Oncology

  • Fatima, Nosheen;uz Zaman, Maseeh;Gnanasegaran, Gopinath;Zaman, Unaiza;Shahid, Wajeeha;Zaman, Areeba;Tahseen, Rabia
    • Asian Pacific Journal of Cancer Prevention / Vol. 16, No. 14 / pp.5599-5605 / 2015
  • In oncology, various imaging modalities play a crucial role in the diagnosis, staging, restaging, treatment monitoring, and follow-up of various cancers. Stand-alone morphological imaging such as computed tomography (CT) and magnetic resonance imaging (MRI) provides highly detailed anatomical information about a tumor but reveals little about tumor physiology. Stand-alone functional imaging such as positron emission tomography (PET) and single-photon emission computed tomography (SPECT) is rich in functional information but provides little insight into tumor morphology. The introduction of the first hybrid modality, PET/CT, is one of the great success stories of this century; it has revolutionized patient care in oncology through its high diagnostic accuracy. Spurred on by this success, further hybrid imaging modalities such as SPECT/CT and PET/MR were introduced. It is now time to explore the potential applications of the existing hybrid modalities, develop and implement standardized imaging protocols, and train users in nuclear medicine and radiology. In this review we discuss the three existing hybrid modalities with emphasis on their technical aspects and clinical applications in oncology.