• Title/Abstract/Keyword: multimodal

Search results: 646 (processing time: 0.026 s)

Multimodal Context Embedding for Scene Graph Generation

  • Jung, Gayoung;Kim, Incheol
    • Journal of Information Processing Systems
    • /
    • Vol. 16, No. 6
    • /
    • pp.1250-1260
    • /
    • 2020
  • This study proposes a novel deep neural network model that can accurately detect objects and their relationships in an image and represent them as a scene graph. The proposed model utilizes several multimodal features, including linguistic features and visual context features, to accurately detect objects and relationships. In addition, in the proposed model, context features are embedded using graph neural networks to depict the dependencies between two related objects in the context feature vector. This study demonstrates the effectiveness of the proposed model through comparative experiments using the Visual Genome benchmark dataset.
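
The abstract above describes embedding context features with a graph neural network so that each object's representation reflects its related objects. The sketch below is a minimal illustration only, not the authors' model: the feature dimensions, the single mean-aggregation step, and the linear relation scorer are assumptions made for the example.

```python
import numpy as np

def embed_context(visual_feats, label_embeds, adjacency):
    """One message-passing step: each object's context vector is the mean of its
    neighbours' combined visual + linguistic features."""
    node_feats = np.concatenate([visual_feats, label_embeds], axis=1)   # (N, Dv+Dl)
    deg = adjacency.sum(axis=1, keepdims=True).clip(min=1)
    context = (adjacency @ node_feats) / deg                            # neighbour mean
    return np.concatenate([node_feats, context], axis=1)                # (N, 2*(Dv+Dl))

def score_relation(ctx_feats, subj, obj, W):
    """Score relation classes for a subject-object pair with a linear layer."""
    pair = np.concatenate([ctx_feats[subj], ctx_feats[obj]])
    return W @ pair                                                     # (num_relations,)

# Toy usage: 3 detected objects on a fully connected graph.
N, Dv, Dl, R = 3, 8, 4, 5
rng = np.random.default_rng(0)
ctx = embed_context(rng.normal(size=(N, Dv)), rng.normal(size=(N, Dl)),
                    np.ones((N, N)) - np.eye(N))
relation_scores = score_relation(ctx, 0, 1, rng.normal(size=(R, 2 * ctx.shape[1])))
```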

AI 멀티모달 센서 기반 보행자 영상인식 알고리즘 (AI Multimodal Sensor-based Pedestrian Image Recognition Algorithm)

  • 신성윤;조승표;조광현
    • 한국컴퓨터정보학회: Conference Proceedings
    • /
    • 한국컴퓨터정보학회 2023 67th Winter Conference Proceedings, Vol. 31, No. 1
    • /
    • pp.407-408
    • /
    • 2023
  • In this paper, we aim to develop a multimodal algorithm that achieves recognition performance of over 95% in daytime illumination environments and over 90% in bad weather (rain and snow) and nighttime illumination environments.

수도권 복합 대중교통망의 복수 대안 경로 탐색 알고리즘 고찰 (A Study on Finding the K Shortest Paths for the Multimodal Public Transportation Network in the Seoul Metropolitan)

  • 박종훈;손무성;오석문;민재홍
    • 한국철도학회: Conference Proceedings
    • /
    • 한국철도학회 2011 Annual Meeting and Autumn Conference Proceedings
    • /
    • pp.607-613
    • /
    • 2011
  • This paper reviews methods for finding multiple reasonable paths in the multimodal public transportation network of the Seoul metropolitan area. In a multimodal network as large as Seoul's, the computation time of the path-finding algorithm is critical, and the resulting paths should reflect the route-choice behavior of public transportation passengers. Alternative-path search methods are divided into path-removal methods and deviation-path methods, and previous studies are analyzed in terms of algorithmic complexity on large-scale networks. When a path-finding algorithm is applied to a public transportation network, transfer and loop constraints must be included to reflect actual travel behavior. A generalized cost function is constructed from smart card data to capture the travel behavior of public transportation users. To validate the algorithm, experiments were conducted on the Seoul metropolitan multimodal public transportation network, consisting of 22,109 nodes and 215,859 links, using the deviation-path method, which is suitable for large-scale networks.
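
The deviation-path idea reviewed above can be illustrated in a few lines. The sketch below is an assumption-laden toy, not the paper's implementation: a made-up six-edge network, a `gen_cost` edge attribute standing in for the smart-card-based generalized cost, and networkx's Yen-style `shortest_simple_paths`; transfer and loop constraints are omitted.

```python
import itertools
import networkx as nx

G = nx.DiGraph()
# Each edge carries a generalized cost (e.g., in-vehicle time plus a transfer penalty).
edges = [("A", "B", 3), ("B", "D", 4), ("A", "C", 2),
         ("C", "D", 6), ("B", "C", 1), ("C", "B", 1)]
G.add_weighted_edges_from(edges, weight="gen_cost")

def k_shortest_paths(graph, source, target, k):
    """Return up to k loopless paths in increasing generalized cost (deviation-path style)."""
    paths = nx.shortest_simple_paths(graph, source, target, weight="gen_cost")
    return list(itertools.islice(paths, k))

for path in k_shortest_paths(G, "A", "D", 3):
    cost = sum(G[u][v]["gen_cost"] for u, v in zip(path, path[1:]))
    print(path, cost)
```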

휴대폰용 멀티모달 인터페이스 개발 - 키패드, 모션, 음성인식을 결합한 멀티모달 인터페이스 (Development of a multimodal interface for mobile phones)

  • 김원우
    • 한국HCI학회: Conference Proceedings
    • /
    • 한국HCI학회 2008 Conference Proceedings, Part 1
    • /
    • pp.559-563
    • /
    • 2008
  • The mobile phone has become an indispensable personal device in modern life, and diverse devices, content, and services are converging on it. Research is also actively under way on means of effectively searching and using such varied, complex functions and large volumes of content and information. This study develops a new interface for entering Korean (Hangul) words on a mobile phone using voice, keypad, and motion, and verifies its usability and effectiveness through a dialing application built on it. The developed multimodal interface retains the advantage of a voice interface, which can reach deep and complex menu trees in a single step, while improving recognition rate and recognition time.

Using Spatial Ontology in the Semantic Integration of Multimodal Object Manipulation in Virtual Reality

  • Irawati, Sylvia;Calderon, Daniela;Ko, Hee-Dong
    • 한국HCI학회: Conference Proceedings
    • /
    • 한국HCI학회 2006 Conference Proceedings, Part 1
    • /
    • pp.884-892
    • /
    • 2006
  • This paper describes a framework for multimodal object manipulation in virtual environments. The gist of the proposed framework is the semantic integration of multimodal input using a spatial ontology and the user context, which merges the interpretation results of the inputs into a single interpretation. The spatial ontology, describing the spatial relationships between objects, is used together with the current user context to resolve ambiguities in the user's commands. These commands are used to reposition objects in the virtual environment. We discuss how the spatial ontology is defined and used to help the user place objects in the virtual environment as they would in the real world.
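
As a rough illustration of how a spatial relation plus context can disambiguate a command, the sketch below resolves which of two chairs is meant by "the chair near the table". The object names, coordinates, and the `near` predicate are invented for the example; this is not the paper's ontology.

```python
import math

# Toy scene: object name -> (x, y, z) position in metres.
objects = {
    "chair_1": (1.0, 0.0, 2.0),
    "chair_2": (4.0, 0.0, 5.0),
    "table_1": (4.5, 0.0, 4.5),
}

def near(a, b, threshold=1.5):
    """Spatial-ontology-style predicate: two objects are 'near' within a distance threshold."""
    return math.dist(objects[a], objects[b]) <= threshold

def resolve(candidates, relation, reference):
    """Pick the candidate that satisfies the stated spatial relation to the reference object."""
    matches = [c for c in candidates if relation(c, reference)]
    return matches[0] if len(matches) == 1 else None   # None -> still ambiguous

# "Move the chair near the table": two chairs exist, but only one is near table_1.
print(resolve(["chair_1", "chair_2"], near, "table_1"))   # chair_2
```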

멀티모달 정보를 이용한 응급상황 인식 시스템 (Emergency situations Recognition System Using Multimodal Information)

  • 김영운;강선경;소인미;한대경;김윤진;정성태
    • 대한전자공학회: Conference Proceedings
    • /
    • 대한전자공학회 2008 Summer Conference
    • /
    • pp.757-758
    • /
    • 2008
  • This paper proposes an emergency-recognition system using multimodal information extracted by an image processing module, a voice processing module, and a gravity sensor processing module. Each processing module detects predefined events such as moving, stopping, and fainting, and transfers them to the multimodal integration module. The multimodal integration module recognizes an emergency situation from the transferred events and re-checks it by asking the user a question and recognizing the answer. Experiments were conducted on fainting motions in a living room and a bathroom. The results show that the proposed system is more robust than previous methods and effectively recognizes emergencies in various situations.
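
A minimal sketch of the integration step follows. The module names, the two-of-three voting rule, and the confirmation dialogue are assumptions for illustration, not the authors' implementation.

```python
PREDEFINED_EVENTS = {"moving", "stopping", "fainting"}

def integrate(events):
    """events: mapping from module name to its detected event,
    e.g. {'vision': 'fainting', 'audio': 'stopping', 'gravity': 'fainting'}."""
    votes = sum(1 for e in events.values() if e == "fainting")
    return votes >= 2                          # a majority of modalities agree

def confirm_with_user(ask):
    """Re-check by asking the user; no answer (None) or a call for help confirms the emergency."""
    answer = ask("Are you all right?")
    return answer is None or answer.strip().lower() in {"no", "help"}

events = {"vision": "fainting", "gravity": "fainting", "audio": "stopping"}
if integrate(events) and confirm_with_user(lambda question: None):
    print("Emergency detected: notify caregiver")
```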

Coupling Particles Swarm Optimization for Multimodal Electromagnetic Problems

  • Pham, Minh-Trien;Song, Min-Ho;Koh, Chang-Seop
    • Journal of Electrical Engineering and Technology
    • /
    • Vol. 5, No. 3
    • /
    • pp.423-430
    • /
    • 2010
  • The particle swarm optimization (PSO) algorithm is designed to find a single global optimum. To find multiple optima of a multimodal function, the PSO must be modified, usually by dividing the swarm into multiple subswarms, each of which searches for its own optimum, yielding multiple optimal points. In this work, we present a new PSO algorithm, called coupling PSO, that finds multiple optima of a multimodal function based on coupled particles. In coupling PSO, each main particle may generate a new particle to form a couple, after which the couple searches for its own optimum using a non-stop-moving PSO algorithm. We tested the proposed algorithm against others, such as clustering PSO and niche PSO, on three analytic functions. The coupling PSO algorithm was also applied to a significant benchmark problem, TEAM workshop problem 22.
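
For readers unfamiliar with multi-swarm approaches, the sketch below runs a bare-bones PSO independently from several regions of a one-dimensional multimodal function so that different peaks can be located. It is a generic illustration only, not the coupling-PSO algorithm proposed in the paper; the inertia and acceleration constants and the test function are arbitrary choices.

```python
import numpy as np

def pso(objective, bounds, n_particles=15, iters=100, seed=0):
    """Bare-bones PSO maximizing a 1-D objective within the given bounds."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    x = rng.uniform(lo, hi, n_particles)
    v = np.zeros(n_particles)
    pbest, pbest_f = x.copy(), objective(x)
    for _ in range(iters):
        gbest = pbest[np.argmax(pbest_f)]
        r1, r2 = rng.random(n_particles), rng.random(n_particles)
        v = 0.7 * v + 1.5 * r1 * (pbest - x) + 1.5 * r2 * (gbest - x)
        x = np.clip(x + v, lo, hi)
        f = objective(x)
        better = f > pbest_f
        pbest[better], pbest_f[better] = x[better], f[better]
    i = np.argmax(pbest_f)
    return pbest[i], pbest_f[i]

f = lambda x: np.sin(5 * x) * np.exp(-0.1 * x**2)   # several local maxima
# Start each subswarm in a different region so that different peaks are captured.
optima = [pso(f, b, seed=s) for s, b in enumerate([(-3, -1), (-1, 1), (1, 3)])]
print(optima)
```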

Multimodal System by Data Fusion and Synergetic Neural Network

  • Son, Byung-Jun;Lee, Yill-Byung
    • International Journal of Fuzzy Logic and Intelligent Systems
    • /
    • Vol. 5, No. 2
    • /
    • pp.157-163
    • /
    • 2005
  • In this paper, we present a multimodal system based on the fusion of two user-friendly biometric modalities: iris and face. To achieve robust identification and verification, we combine these two biometric features. We apply the 2-D discrete wavelet transform to extract low-dimensional feature sets from the iris and face images, and then use Direct Linear Discriminant Analysis (DLDA) to obtain a Reduced Joint Feature Vector (RJFV) from these feature sets. In addition, a Synergetic Neural Network (SNN) computes the matching score of the preprocessed data. The system can operate in two modes: identifying a particular person or verifying a person's claimed identity. Our results for both cases show that the proposed method leads to a reliable person-authentication system.
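
The sketch below illustrates the general pipeline: a 2-D discrete wavelet transform compresses each modality image into a feature vector, per-modality matching scores are computed, and the scores are fused. The cosine-similarity matcher and weighted-sum fusion are stand-ins chosen for brevity; the paper's DLDA projection and synergetic neural network matcher are not reproduced here, and the random images are placeholders.

```python
import numpy as np
import pywt

def dwt_features(image, wavelet="haar", levels=2):
    """Keep only the low-frequency approximation after `levels` 2-D DWT passes."""
    coeffs = image
    for _ in range(levels):
        coeffs, _ = pywt.dwt2(coeffs, wavelet)
    return coeffs.ravel()

def match_score(a, b):
    """Cosine similarity between two feature vectors (illustrative matcher)."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

def fuse(iris_score, face_score, w_iris=0.6):
    """Weighted-sum score-level fusion of the two modalities."""
    return w_iris * iris_score + (1.0 - w_iris) * face_score

rng = np.random.default_rng(1)
iris_probe, iris_gallery = rng.random((64, 64)), rng.random((64, 64))
face_probe, face_gallery = rng.random((64, 64)), rng.random((64, 64))
score = fuse(match_score(dwt_features(iris_probe), dwt_features(iris_gallery)),
             match_score(dwt_features(face_probe), dwt_features(face_gallery)))
print(round(score, 3))
```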

멀티모달 상황인지 미들웨어 기반의 홈앤(HomeN) 매니저 시스템 (HomeN manager system based on multimodal context-aware middleware)

  • 안세열;박성찬;박성수;구명완;정영준;김명숙
    • 대한음성학회: Conference Proceedings
    • /
    • 대한음성학회 2006 Autumn Conference Proceedings
    • /
    • pp.120-123
    • /
    • 2006
  • The provision of personalized user interfaces for mobile devices is expected to serve devices with a wide variety of capabilities and interaction modalities. In this paper, we implement a multimodal context-aware middleware incorporating XML-based languages such as XHTML, VoiceXML, and SCXML. SCXML uses parallel states to invoke both XHTML and VoiceXML content, as well as to gather composite multimodal inputs and synchronize modalities through man-machine I/O. We developed a home-networking service named "HomeN" based on this middleware framework. It demonstrates that users can carry out multimodal scenarios in a clear, concise, and consistent manner under various interactions.
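
The parallel-state idea can be caricatured in a few lines: two modality regions run side by side, and a composite input is emitted only once both have reported, which is roughly what the SCXML parallel states are used for here. The class and event names below are invented for the example; this is not the HomeN middleware.

```python
class ParallelRegions:
    """Toy analogue of SCXML parallel states gathering composite multimodal input."""

    def __init__(self, regions):
        self.pending = {r: None for r in regions}

    def event(self, region, payload):
        """Record one region's input; return the fused input once every region has reported."""
        self.pending[region] = payload
        if all(v is not None for v in self.pending.values()):
            composite = dict(self.pending)
            self.pending = {r: None for r in self.pending}   # reset all regions
            return composite
        return None

mm = ParallelRegions(["voice", "gui"])
print(mm.event("voice", "turn on the light"))     # None: still waiting for the GUI region
print(mm.event("gui", {"room": "living room"}))   # composite multimodal input delivered
```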

Multimodal Interaction on Automultiscopic Content with Mobile Surface Haptics

  • Kim, Jin Ryong;Shin, Seunghyup;Choi, Seungho;Yoo, Yeonwoo
    • ETRI Journal
    • /
    • Vol. 38, No. 6
    • /
    • pp.1085-1094
    • /
    • 2016
  • In this work, we present interactive automultiscopic content with mobile surface haptics for multimodal interaction. Our system consists of a 40-view automultiscopic display and a tablet supporting surface haptics in an immersive room. Animated graphics are projected onto the walls of the room. The 40-view automultiscopic display is placed at the center of the front wall. The haptic tablet is installed at the mobile station to enable the user to interact with the tablet. The 40-view real-time rendering and multiplexing technology is applied by establishing virtual cameras in the convergence layout. Surface haptics rendering is synchronized with three-dimensional (3D) objects on the display for real-time haptic interaction. We conduct an experiment to evaluate user experiences of the proposed system. The results demonstrate that the system's multimodal interaction provides positive user experiences of immersion, control, user interface intuitiveness, and 3D effects.
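
As a rough sketch of the convergence (off-axis) camera layout mentioned above, the code below spaces a row of virtual cameras along a baseline and computes the horizontal frustum shear that makes every view converge on the display plane. The view count matches the 40-view display, but the baseline and convergence distance are invented parameters, and this is not the authors' rendering pipeline.

```python
def camera_offsets(n_views=40, baseline=0.6):
    """Horizontal camera positions (metres), evenly spaced and centred on the viewing axis."""
    step = baseline / (n_views - 1)
    return [(-baseline / 2) + i * step for i in range(n_views)]

def frustum_shear(offset, convergence_dist=2.0):
    """Horizontal shear applied to each view's projection so all views converge on the
    display plane, instead of toeing the cameras in (which would distort the image)."""
    return -offset / convergence_dist

for i, off in enumerate(camera_offsets()[:3]):   # show the first 3 of 40 views
    print(i, round(off, 3), round(frustum_shear(off), 3))
```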