• Title/Summary/Keyword: Multimodal approach

Incomplete Cholesky Decomposition based Kernel Cross Modal Factor Analysis for Audiovisual Continuous Dimensional Emotion Recognition

  • Li, Xia;Lu, Guanming;Yan, Jingjie;Li, Haibo;Zhang, Zhengyan;Sun, Ning;Xie, Shipeng
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.2 / pp.810-831 / 2019
  • Recently, continuous dimensional emotion recognition from audiovisual cues has attracted increasing attention in both theory and practice. The large amount of data involved in the recognition process decreases the efficiency of most bimodal information fusion algorithms. This paper presents a novel algorithm, incomplete Cholesky decomposition based kernel cross-modal factor analysis (ICDKCFA), and applies it to continuous dimensional audiovisual emotion recognition. After the ICDKCFA feature transformation, two basic fusion strategies, feature-level fusion and decision-level fusion, are explored to combine the transformed visual and audio features for emotion recognition. Finally, extensive experiments are conducted to evaluate the ICDKCFA approach on the AVEC 2016 Multimodal Affect Recognition Sub-Challenge dataset. The experimental results show that the ICDKCFA method is faster than the original kernel cross-modal factor analysis while achieving comparable performance. Moreover, the ICDKCFA method performs better than other common information fusion methods, such as fusion based on canonical correlation analysis, kernel canonical correlation analysis, and cross-modal factor analysis.
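
The abstract above attributes the speed-up to incomplete Cholesky decomposition. As a rough illustration of that one step only (not the authors' ICDKCFA), the sketch below builds a low-rank factor of a kernel matrix by greedy pivoting. The RBF kernel, the `gamma` value, the tolerance, and all function names are assumptions made for this example.

```python
import math

def rbf_kernel_matrix(points, gamma=0.5):
    """Dense RBF kernel matrix for a list of 1-D points (toy-sized input only)."""
    return [[math.exp(-gamma * (a - b) ** 2) for b in points] for a in points]

def incomplete_cholesky(K, tol=1e-6):
    """Greedy pivoted incomplete Cholesky of a positive semidefinite matrix K.

    Returns a list of columns g_1..g_m such that K is approximated by the sum
    of outer products g_k g_k^T. Iteration stops once the trace of the
    residual (the sum of the remaining diagonal) falls to tol or below.
    """
    n = len(K)
    residual = [K[i][i] for i in range(n)]  # diagonal of the current residual
    columns = []
    while len(columns) < n and sum(residual) > tol:
        # Pick the index with the largest remaining diagonal entry as pivot.
        pivot = max(range(n), key=residual.__getitem__)
        if residual[pivot] <= 0.0:
            break
        root = math.sqrt(residual[pivot])
        # New column: the pivot column of the residual matrix, scaled.
        col = [
            (K[j][pivot] - sum(g[j] * g[pivot] for g in columns)) / root
            for j in range(n)
        ]
        # Update the residual diagonal, clamping tiny negative round-off.
        residual = [max(r - c * c, 0.0) for r, c in zip(residual, col)]
        columns.append(col)
    return columns

def reconstruct(columns, n):
    """Rebuild the n x n approximation from the factor columns."""
    return [[sum(g[i] * g[j] for g in columns) for j in range(n)] for i in range(n)]

K = rbf_kernel_matrix([0.0, 1.0, 5.0])
low_rank = incomplete_cholesky(K, tol=0.7)   # loose tolerance, fewer columns
full_rank = incomplete_cholesky(K, tol=1e-12)
```

A looser tolerance yields fewer columns, so later kernel computations work with an n x m factor instead of the full n x n matrix. That is the general source of the efficiency gain, under the assumptions stated above.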

Future Challenges and Perspectives of Digital Dance Interventions for Depression in Older Adults

  • Zhiting Zhang;Qingfeng Zhang
    • International Journal of Advanced Culture Technology / v.12 no.2 / pp.72-89 / 2024
  • Depression is a common disorder among the elderly, significantly affecting their quality of life. Traditional dance interventions, although beneficial, have limitations in convenience, personalization, and retention. With the advent of digital technology, digital dance interventions have emerged as a potential solution to these limitations. This paper presents an extensive review of the literature on digital dance interventions. Research databases were searched for studies that focus on the use of digital dance in treating depression among older adults. The review also analyzes advancements in digital dance technology, its application in therapeutic settings, and the evaluation of its efficacy. The paper identifies three main challenges in current digital dance intervention research: real-time dynamic assessment, multimodal dance generation, and improving compliance. Despite these challenges, digital dance interventions show promise in addressing the limitations of traditional dance therapy. The research suggests that integrating human-computer interaction and personalized approaches into digital dance interventions could significantly improve outcomes in elderly patients with depression. Digital dance interventions represent a novel and promising approach to treating depression in older adults. Future research should focus on overcoming the identified challenges and enhancing the effectiveness of these interventions.

Multimodal audiovisual speech recognition architecture using a three-feature multi-fusion method for noise-robust systems

  • Sanghun Jeon;Jieun Lee;Dohyeon Yeo;Yong-Ju Lee;SeungJun Kim
    • ETRI Journal / v.46 no.1 / pp.22-34 / 2024
  • Exposure to varied noisy environments impairs the recognition performance of artificial intelligence-based speech recognition technologies. Degraded-performance services can be utilized as limited systems that assure good performance in certain environments, but impair the general quality of speech recognition services. This study introduces an audiovisual speech recognition (AVSR) model robust to various noise settings, mimicking human dialogue recognition elements. The model converts word embeddings and log-Mel spectrograms into feature vectors for audio recognition. A dense spatial-temporal convolutional neural network model extracts features from log-Mel spectrograms, transformed for visual-based recognition. This approach exhibits improved aural and visual recognition capabilities. We assess the signal-to-noise ratio in nine synthesized noise environments, with the proposed model exhibiting lower average error rates. The error rate for the AVSR model using a three-feature multi-fusion method is 1.711%, compared to the general 3.939% rate. This model is applicable in noise-affected environments owing to its enhanced stability and recognition rate.

Environmental IoT-Enabled Multimodal Mashup Service for Smart Forest Fires Monitoring

  • Elmisery, Ahmed M.;Sertovic, Mirela
    • Journal of Multimedia Information System / v.4 no.4 / pp.163-170 / 2017
  • The Internet of Things (IoT) is a new paradigm for collecting, processing, and analyzing various content in order to detect anomalies and monitor particular patterns in a specific environment. The collected data can be used to discover new patterns and to offer new insights. IoT-enabled data mashup is a new technology that combines various types of information from multiple sources into a single web service. Mashup services open a new horizon for different applications. Environmental monitoring is an important tool for state and private organizations that are located in regions with environmental hazards and seek insights to detect and locate those hazards clearly. These organizations may use an IoT-enabled data mashup service to merge different types of datasets from different IoT sensor networks in order to improve their data analytics performance and the accuracy of their predictions. This paper presents an IoT-enabled data mashup service in which multimedia data is collected from various IoT platforms and then fed into an environmental cognition service that executes image processing techniques such as noise removal, segmentation, and feature extraction in order to detect interesting patterns in hazardous areas. The noise present in the captured images is eliminated using noise removal and background subtraction processes. A Markov-based approach is used to segment the possible regions of interest. The viable features within each region are extracted using a multiresolution wavelet transform and then fed into a discriminative classifier to extract various patterns. Experimental results show accurate detection performance and adequate processing time for the proposed approach. We also provide a data mashup scenario for an IoT-enabled environmental hazard detection service, together with experimental results.

A Study on the Software-oriented approach for Intermodal transportation (인터모달 수송에 대한 소프트웨어적 접근방법 연구)

  • 김영훈;홍순홈;김동희
    • Proceedings of the KSR Conference / 2002.10a / pp.314-319 / 2002
  • The present transportation system is moving away from competition among individual transportation modes, and cooperative relationships among them are urgently needed. Passengers who use various forms of mass transportation require a multimodal environment. In particular, the railroad system is now at the point of considering an intermodal environment. With these issues in mind, this paper discusses intermodal transportation in Chapter 2, examines case studies from Japan and Europe in Chapter 3, and presents considerations for preparing the railway system for an intermodal environment in Chapter 4. As our approach to intermodal transportation, we adopt a software-oriented method that uses information and communication technologies.

Problems on the Door to Door Application of International Air Law Conventions (국제항공운송협약의 Door to Door 운송에의 적용에 관한 문제점)

  • CHOI, Myung-Kook
    • THE INTERNATIONAL COMMERCE & LAW REVIEW / v.78 / pp.1-29 / 2018
  • This article demonstrates that neither the Warsaw Convention System nor the Montreal Convention is designed for multimodal transport, let alone for "Door to Door" transport. The polemic directed against the "Door to Door" application of the Warsaw Convention System and the Montreal Convention is predominantly driven by the text and the drafting philosophy of the said Conventions, which since 1929 have supported unimodalism, with the rule that "the period of the carriage by air does not extend to any carriage by land, by sea or by inland waterway performed outside an airport" playing a profound role in restricting their multimodal aspirations. The drafters of the Montreal Convention were more adventurous than their predecessors with respect to the boundaries of the Montreal Convention. They amended Art. 18(3) by removing the phrase "whether in an aerodrome or on board an aircraft, or, in the case of landing outside an aerodrome, in any place whatsoever"; however, they retained the first sentence of Art. 18(4). The deletion of the airport limitation from Art. 18(3) creates its own paradox. The carrier can be held liable under the Montreal Convention for loss of or damage to cargo while it is in its charge in a warehouse outside an airport. Yet damage to or loss of the same cargo that occurs during its surface transportation to the aforementioned warehouse, and vice versa, is not covered by the Montreal Convention from the moment the cargo crosses the airport's perimeter. Surely, this result could not have been the intention of its drafters: it certainly does not make any commercial sense. I think that a better solution to the paradox is to apply the "functional interpretation" of the term "airport". This would retain the integrity of the text of the Montreal Convention, make sense of the change in the wording of Art. 18(3), and nevertheless retain the Convention's unimodal philosophy.
English courts so far remain loyal to the judgment of the Court of Appeal in Quantum, which constitutes bad news for the supporters of the multimodal scope of the Montreal Convention. According to the US cases, any losses occurring during Door to Door transportation under an air waybill that involves a dominant air segment are subject to the international air law conventions. Any domestic rules that might be applicable to the road segment are blatantly overlooked. Undoubtedly, the US approach makes commercial sense. But justifying this policy decision by arguing that the intention of the drafters of the Warsaw Convention was to cover Door to Door transportation is mistaken. Any expansion to multimodal transport would require an amendment to Arts 18 and 38 of the Montreal Convention, one that is not in the plans for the foreseeable future. Yet there is no doubt that air carriers and freight forwarders will continue to push hard for such expansion, especially in the USA, where courts are more accommodating.

A study of using quality for Radial Basis Function based score-level fusion in multimodal biometrics (RBF 기반 유사도 단계 융합 다중 생체 인식에서의 품질 활용 방안 연구)

  • Choi, Hyun-Soek;Shin, Mi-Young
    • Journal of the Institute of Electronics Engineers of Korea CI / v.45 no.5 / pp.192-200 / 2008
  • Multimodal biometrics is a method for personal authentication and verification that uses two or more types of biometric data. RBF based score-level fusion applies a pattern recognition algorithm to multimodal biometrics, seeking the optimal decision boundary for classifying score feature vectors, each of which consists of the matching scores obtained from several unimodal biometric systems for a given sample. In this setting, all matching scores are assumed to have the same reliability. However, recent research reports that the quality of the input sample affects biometric results. Matching scores with low reliability caused by low-quality samples are currently not considered in pattern recognition modelling for multimodal biometrics. To solve this problem, we propose an RBF based score-level fusion approach that employs quality information about the input biometric data to adjust the decision boundary. The proposed method with quality information showed better recognition performance than both unimodal biometrics and the usual RBF based score-level fusion without quality information.

Multimodal Emotional State Estimation Model for Implementation of Intelligent Exhibition Services (지능형 전시 서비스 구현을 위한 멀티모달 감정 상태 추정 모형)

  • Lee, Kichun;Choi, So Yun;Kim, Jae Kyeong;Ahn, Hyunchul
    • Journal of Intelligence and Information Systems / v.20 no.1 / pp.1-14 / 2014
  • Both researchers and practitioners are showing an increased interest in interactive exhibition services. Interactive exhibition services are designed to respond directly to visitor responses in real time, so as to fully engage visitors' interest and enhance their satisfaction. In order to install an effective interactive exhibition service, it is essential to adopt intelligent technologies that enable accurate estimation of a visitor's emotional state from responses to exhibited stimuli. Studies undertaken so far have attempted to estimate the human emotional state, most of them by gauging either facial expressions or audio responses. However, the most recent research suggests that a multimodal approach that uses people's multiple responses simultaneously may lead to better estimation. Given this context, we propose a new multimodal emotional state estimation model that uses various responses, including facial expressions, gestures, and movements measured by the Microsoft Kinect Sensor. In order to handle a large amount of sensory data effectively, we propose to use stratified sampling-based MRA (multiple regression analysis) as our estimation method. To validate the usefulness of the proposed model, we collected 602,599 responses and emotional state data with 274 variables from 15 people. When we applied our model to the data set, we found that it estimated the levels of valence and arousal within a 10-15% error range. Since our proposed model is simple and stable, we expect that it will be applied not only in intelligent exhibition services, but also in other areas such as e-learning and personalized advertising.

A new human-robot interaction method using semantic symbols

  • Park, Sang-Hyun;Hwang, Jung-Hoon;Kwon, Dong-Soo
    • Proceedings of the Institute of Control, Robotics and Systems Conference (제어로봇시스템학회:학술대회논문집) / 2004.08a / pp.2005-2010 / 2004
  • As robots become more prevalent in human daily life, situations requiring interaction between humans and robots will occur more frequently. Therefore, human-robot interaction (HRI) is becoming increasingly important. Although robotics researchers have made many technical developments in their field, intuitive and easy ways for most common users to interact with robots are still lacking. This paper introduces a new approach to enhance human-robot interaction using a semantic symbol language and proposes a method to acquire the intentions of robot users. In the proposed approach, each semantic symbol represents knowledge about either the environment or an action that a robot can perform. Users' intentions are expressed by symbolized multimodal information. To interpret a user's command, a probabilistic approach is used, which is appropriate for interpreting a freestyle user expression or insufficient input information. Therefore, a first-order Markov model is constructed as a probabilistic model, and a questionnaire is conducted to obtain state transition probabilities for this Markov model. Finally, we evaluated our model to show how well it interprets users' commands.
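
The first-order Markov interpretation described in this abstract can be illustrated with a minimal sketch. The symbol names and transition probabilities below are invented for the example; the paper obtains its probabilities from a questionnaire, and its actual symbol set is not given here.

```python
# Hypothetical transition table: TRANSITIONS[prev][nxt] is the assumed
# probability that symbol `nxt` follows symbol `prev` in a user's command.
TRANSITIONS = {
    "START": {"robot": 0.6, "cup": 0.3, "bring": 0.1},
    "robot": {"bring": 0.7, "go": 0.3},
    "bring": {"cup": 0.8, "book": 0.2},
    "go": {"kitchen": 1.0},
}

def sequence_probability(symbols, transitions=TRANSITIONS):
    """Probability of a symbol sequence under a first-order Markov model.

    Each symbol depends only on the one before it; unseen transitions
    contribute probability 0.
    """
    prob = 1.0
    prev = "START"
    for symbol in symbols:
        prob *= transitions.get(prev, {}).get(symbol, 0.0)
        prev = symbol
    return prob

def most_likely_next(symbol, transitions=TRANSITIONS):
    """Most probable successor of `symbol`, or None if it has none.

    This is one simple way to fill in a command when the input is incomplete.
    """
    options = transitions.get(symbol, {})
    if not options:
        return None
    return max(options, key=options.get)

p_command = sequence_probability(["robot", "bring", "cup"])
guess = most_likely_next("robot")
```

Candidate interpretations of an ambiguous input can then be compared by `sequence_probability`, keeping the most probable one.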

Combining Multi-Criteria Analysis with CBR for Medical Decision Support

  • Abdelhak, Mansoul;Baghdad, Atmani
    • Journal of Information Processing Systems / v.13 no.6 / pp.1496-1515 / 2017
  • One of the most visible developments in Decision Support Systems (DSS) was the emergence of rule-based expert systems. However, despite their success in many sectors, developers of medical rule-based systems have met several critical problems. First, the rules are tied to a clearly stated subject. Second, a rule-based system can only learn by updating its rule base, since it requires explicit knowledge of the domain in use. Solutions to these problems have been sought through improved techniques and tools, improved development paradigms, knowledge modeling languages and ontology, as well as advanced reasoning techniques such as case-based reasoning (CBR), which is well suited to providing decision support in the healthcare setting. However, CBR has some drawbacks, mainly in two interrelated tasks: retrieval and adaptation. For the retrieval task, a major drawback arises when several similar cases are found and, consequently, several solutions. A choice of the best solution must then be made. To overcome these limitations, numerous works on the retrieval task have been conducted, either with simple and convenient procedures or by combining CBR with other techniques. In this paper, we provide a combined approach that uses multi-criteria analysis (MCA) to help the traditional retrieval task of CBR choose the best solution. We then integrate this approach into a decision model to support medical decisions. We also present some preliminary results and suggestions for extending our approach.
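
To make the idea of choosing among several retrieved cases concrete, the sketch below ranks candidate solutions with a plain weighted sum. The abstract does not say which MCA method the authors use, so the weighted sum, the criteria names, the weights, and the scores are all assumptions for illustration.

```python
def rank_solutions(candidates, weights):
    """Rank candidate solutions by a weighted sum of criterion scores.

    candidates: {name: {criterion: score in [0, 1]}} (hypothetical data)
    weights:    {criterion: weight}, assumed to sum to 1
    Returns (name, total) pairs sorted from best to worst.
    """
    scored = []
    for name, scores in candidates.items():
        total = sum(weights[c] * scores[c] for c in weights)
        scored.append((name, total))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Two retrieved cases that are both similar to the query, scored on
# made-up criteria; higher is better for every criterion.
weights = {"similarity": 0.5, "efficacy": 0.3, "safety": 0.2}
candidates = {
    "case_A": {"similarity": 0.9, "efficacy": 0.6, "safety": 0.5},
    "case_B": {"similarity": 0.8, "efficacy": 0.9, "safety": 0.7},
}
ranking = rank_solutions(candidates, weights)
best_name = ranking[0][0]
```

In this example the case with the higher similarity is not the one selected, because the other criteria outweigh the difference. That is the kind of tie-breaking the abstract motivates.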