• Title/Summary/Keyword: 멀티 뷰 학습 (multi-view learning)


Multi-view learning review: understanding methods and their application (멀티 뷰 기법 리뷰: 이해와 응용)

  • Bae, Kang Il; Lee, Yung Seop; Lim, Changwon
    • The Korean Journal of Applied Statistics, v.32 no.1, pp.41-68, 2019
  • Multi-view learning considers data from multiple viewpoints and attempts to integrate the information those views provide. It has been actively studied in recent years and has shown superior performance compared with models learned from a single view. With the introduction of deep learning techniques, multi-view learning has produced good results in fields such as image, text, voice, and video. In this study, we introduce how multi-view learning methods solve problems in human behavior recognition, medical applications, information retrieval, and facial expression recognition. In addition, we review the data integration principles of multi-view learning by classifying traditional methods into data integration, classifier integration, and representation integration. Finally, we examine how CNN, RNN, RBM, Autoencoder, and GAN, which are commonly used deep learning methods, are applied to multi-view learning algorithms. We categorize CNN- and RNN-based methods as supervised learning, and RBM-, Autoencoder-, and GAN-based methods as unsupervised learning.
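
As an illustration of the "representation integration" category mentioned in the abstract, below is a minimal PyTorch sketch, not taken from the paper: each view gets its own encoder, and the view embeddings are concatenated before a shared classifier. All dimensions and names are hypothetical.

```python
# Minimal sketch of multi-view representation integration (late fusion).
# Dimensions and names are illustrative, not from the reviewed paper.
import torch
import torch.nn as nn

class TwoViewClassifier(nn.Module):
    def __init__(self, dim_view1, dim_view2, hidden=128, num_classes=10):
        super().__init__()
        # one encoder per view (e.g., image features vs. text features)
        self.enc1 = nn.Sequential(nn.Linear(dim_view1, hidden), nn.ReLU())
        self.enc2 = nn.Sequential(nn.Linear(dim_view2, hidden), nn.ReLU())
        # shared head operates on the concatenated (integrated) representation
        self.head = nn.Linear(2 * hidden, num_classes)

    def forward(self, x1, x2):
        z1, z2 = self.enc1(x1), self.enc2(x2)
        return self.head(torch.cat([z1, z2], dim=-1))

# usage: 32 samples, view 1 has 512-d features, view 2 has 300-d features
model = TwoViewClassifier(512, 300)
logits = model(torch.randn(32, 512), torch.randn(32, 300))
```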

Multimedia Recommender System Based on Contrastive Learning with Modality-Reflective View (모달리티 반영 뷰를 활용하는 대조 학습 기반의 멀티미디어 추천 시스템)

  • SoHee Ban; Taeri Kim; Sang-Wook Kim
    • Proceedings of the Korea Information Processing Society Conference, 2024.05a, pp.635-638, 2024
  • Contrastive learning-based multimedia recommender systems have recently been studied actively. They use an item's various modality features to generate embeddings (views) of users and items and perform contrastive learning over these views. By exploiting the learned views for recommendation, they have achieved considerably higher recommendation accuracy than previous multimedia recommender systems. Nevertheless, we argue that existing contrastive learning-based multimedia recommender systems overlook the importance of correctly reflecting an item's modality features when generating its views, which limits further improvement of recommendation accuracy. This claim is based on the finding of prior multimedia recommender systems that correctly reflecting an item's own modality features in its embedding helps improve recommendation accuracy. In this paper, we therefore propose a new multimedia recommender system that performs contrastive learning through views that correctly reflect an item's modality features (specifically, modality-reflective views). On two real-world public datasets, the proposed method improved recommendation accuracy by up to 6.78% over a state-of-the-art multimedia recommender system.
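
The following is a minimal sketch of the kind of contrastive objective the abstract describes: an in-batch InfoNCE loss that aligns an item's collaborative embedding with a view derived from its modality features. It is an illustrative assumption, not the architecture proposed in the paper; all shapes and names are hypothetical.

```python
# Minimal in-batch contrastive (InfoNCE) sketch between an item embedding and
# a modality-derived view. Illustrative only; not the paper's proposed model.
import torch
import torch.nn.functional as F

def info_nce(item_emb, modality_view, temperature=0.2):
    """Each item's two views form a positive pair; all other items in the
    batch act as negatives."""
    a = F.normalize(item_emb, dim=-1)        # (B, d) collaborative view
    b = F.normalize(modality_view, dim=-1)   # (B, d) modality-reflective view
    logits = a @ b.t() / temperature         # (B, B) pairwise similarities
    targets = torch.arange(a.size(0), device=a.device)
    return F.cross_entropy(logits, targets)

# usage with random embeddings for 64 items of dimension 64
loss = info_nce(torch.randn(64, 64), torch.randn(64, 64))
```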

Design and Implementation of PC-Mechanic Education Application System Using Image Processing (영상처리를 이용한 PC 내부구조 학습 어플리케이션 설계 및 구현)

  • Kim, Won-Jin; Kim, Hyung-Ook; Jo, Sung-Eun; Jang, Soo-Jeong; Moon, Il-Young
    • The Journal of Korean Institute for Practical Engineering Education, v.3 no.2, pp.93-99, 2011
  • We introduce an application for studying for the PC-mechanic certification that uses a multi-touch table. Increasingly, people interact through gestures rather than a mouse and keyboard, and multi-touch tables have become widespread. We additionally present graphics and images, building the application with 3ds Max and C#. The system helps learners prepare for the certification by scaling and dragging components through the camera view, and it also includes domestic PC-mechanic exam questions.


Design of ePub-based Digital Textbooks Integrated Solution for Smart Learning (스마트러닝을 위한 ePub 기반 디지털교과서 통합 솔루션 설계)

  • Heo, Sung-Uk; Kang, Sung-In; Kim, Gwan-Hyung; Choi, Sung-Wook; Oh, Am-Suk
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference, 2013.10a, pp.873-875, 2013
  • As information technology has advanced and the capacity to use and process information has grown, the intelligent, networked education environment has produced diverse learning content and methods through convergence across technologies and services. Recently, with the spread of smart devices in the e-learning industry and growing consumer demand for context-adaptive, self-directed learning, smart learning has emerged as a new form of education system. To apply existing educational content to smart devices under this shifting education paradigm, the structure of content and solutions must be improved, and, from a service-delivery perspective, a standard platform is needed to interconnect diverse educational content and converge educational services. In this paper, we therefore design ePub Solution, an integrated solution software comprising an authoring function for ePub-standard educational multimedia content through a JVM-based PC interface, an information conversion module for reusing material in existing paper-textbook file formats, and an ePub e-book viewer for smart devices.
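
For orientation only, here is a minimal sketch of how an ePub package is assembled with Python's standard zipfile module: a ZIP archive whose first entry is an uncompressed mimetype file, plus META-INF/container.xml pointing at the OPF package document. This is not the ePub Solution described in the paper; a fully valid EPUB 3 file additionally requires a navigation document and dcterms:modified metadata, and all file names and contents below are placeholders.

```python
# Minimal (simplified) ePub packaging sketch; placeholder content only.
import zipfile

CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>"""

CONTENT_OPF = """<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="3.0" unique-identifier="uid">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier id="uid">urn:uuid:00000000-0000-0000-0000-000000000000</dc:identifier>
    <dc:title>Sample Digital Textbook</dc:title>
    <dc:language>ko</dc:language>
  </metadata>
  <manifest>
    <item id="ch1" href="chapter1.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
  <spine><itemref idref="ch1"/></spine>
</package>"""

CHAPTER_XHTML = """<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml"><head><title>Chapter 1</title></head>
<body><p>Sample textbook content.</p></body></html>"""

with zipfile.ZipFile("sample.epub", "w") as z:
    # the mimetype entry must come first and must not be compressed
    z.writestr("mimetype", "application/epub+zip", compress_type=zipfile.ZIP_STORED)
    z.writestr("META-INF/container.xml", CONTAINER_XML, compress_type=zipfile.ZIP_DEFLATED)
    z.writestr("OEBPS/content.opf", CONTENT_OPF, compress_type=zipfile.ZIP_DEFLATED)
    z.writestr("OEBPS/chapter1.xhtml", CHAPTER_XHTML, compress_type=zipfile.ZIP_DEFLATED)
```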


Using Skeleton Vector Information and RNN Learning Behavior Recognition Algorithm (스켈레톤 벡터 정보와 RNN 학습을 이용한 행동인식 알고리즘)

  • Kim, Mi-Kyung; Cha, Eui-Young
    • Journal of Broadcast Engineering, v.23 no.5, pp.598-605, 2018
  • Behavior recognition is a technology that recognizes human behavior from data and can be used in applications such as detecting risky behavior through video surveillance systems. Conventional behavior recognition algorithms rely on 2D camera images together with multi-modal sensors, multi-view setups, or 3D equipment. With two-dimensional data alone, recognition rates for behavior in three-dimensional space were low, while the other approaches were difficult to deploy because of complicated equipment configurations and expensive additional hardware. In this paper, we propose a method of recognizing human behavior from CCTV images alone, without additional equipment, using only RGB and depth information. First, a skeleton extraction algorithm is applied to extract joint and body-part points. We then transform these points into vectors, including displacement vectors and relational vectors, and learn the continuous vector sequences with an RNN model. By applying the learned model to various datasets and measuring recognition accuracy, we verify that performance similar to existing algorithms that use 3D information can be achieved from 2D information alone.
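
A minimal sketch of the two vector types the abstract mentions, under the assumption of 2D joint coordinates: per-joint displacement vectors between consecutive frames and pairwise relational vectors within a frame, concatenated per frame and fed as a sequence into an RNN (an LSTM here). The paper's exact transformation equations may differ; the names and dimensions below are hypothetical.

```python
# Skeleton-vector features + RNN classifier; an illustrative sketch, not the
# paper's exact formulation.
import torch
import torch.nn as nn

def skeleton_to_features(joints):
    """joints: (T, J, 2) 2D joint coordinates over T frames.
    Returns a (T-1, J*2 + J*J*2) feature sequence."""
    disp = joints[1:] - joints[:-1]                      # (T-1, J, 2) frame-to-frame displacement
    rel = joints[:, :, None, :] - joints[:, None, :, :]  # (T, J, J, 2) joint-to-joint offsets
    rel = rel[1:]                                        # align with the displacement frames
    T = disp.shape[0]
    return torch.cat([disp.reshape(T, -1), rel.reshape(T, -1)], dim=-1)

class ActionRNN(nn.Module):
    def __init__(self, in_dim, hidden=128, num_actions=10):
        super().__init__()
        self.rnn = nn.LSTM(in_dim, hidden, batch_first=True)
        self.fc = nn.Linear(hidden, num_actions)

    def forward(self, seq):               # seq: (B, T, in_dim)
        _, (h, _) = self.rnn(seq)         # last hidden state summarizes the sequence
        return self.fc(h[-1])

# usage: 18 joints tracked over 30 frames
feats = skeleton_to_features(torch.randn(30, 18, 2))   # (29, 684)
model = ActionRNN(in_dim=feats.shape[-1])
logits = model(feats.unsqueeze(0))                     # add a batch dimension
```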

Class-Agnostic 3D Mask Proposal and 2D-3D Visual Feature Ensemble for Efficient Open-Vocabulary 3D Instance Segmentation (효율적인 개방형 어휘 3차원 개체 분할을 위한 클래스-독립적인 3차원 마스크 제안과 2차원-3차원 시각적 특징 앙상블)

  • Sungho Song; Kyungmin Park; Incheol Kim
    • The Transactions of the Korea Information Processing Society, v.13 no.7, pp.335-347, 2024
  • Open-vocabulary 3D point cloud instance segmentation (OV-3DIS) is a challenging visual task that segments a 3D scene point cloud into object instances of both base and novel classes. In this paper, we propose Open3DME, a novel model for OV-3DIS, to address important design issues and overcome limitations of existing approaches. First, to improve the quality of class-agnostic 3D masks, our model uses T3DIS, an advanced Transformer-based 3D point cloud instance segmentation model, as its mask proposal module. Second, to obtain semantically text-aligned visual features for each point cloud segment, our model extracts both 2D and 3D features from the point cloud and the corresponding multi-view RGB images using pretrained CLIP and OpenSeg encoders, respectively. Finally, to make effective use of both the 2D and 3D visual features of each point cloud segment during label assignment, our model adopts a unique feature ensemble method. To validate our model, we conducted both quantitative and qualitative experiments on the ScanNet-V2 benchmark dataset, demonstrating significant performance gains.
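
The feature-ensemble step can be pictured with the following minimal sketch: per-mask 2D and 3D features are fused (a simple weighted average is assumed here) and each mask is assigned the open-vocabulary class whose text embedding is most similar. This illustrates the general idea only; it is not Open3DME's actual ensemble method, and all shapes and the weighting scheme are assumptions.

```python
# Illustrative 2D-3D feature ensemble for open-vocabulary label assignment.
import torch
import torch.nn.functional as F

def assign_open_vocab_labels(feat_2d, feat_3d, text_emb, alpha=0.5):
    """feat_2d, feat_3d: (M, d) per-mask visual features from the 2D and 3D
    encoders; text_emb: (C, d) text embeddings of the class prompts.
    Returns a predicted class index per mask."""
    f2 = F.normalize(feat_2d, dim=-1)
    f3 = F.normalize(feat_3d, dim=-1)
    fused = F.normalize(alpha * f2 + (1 - alpha) * f3, dim=-1)  # feature ensemble
    text = F.normalize(text_emb, dim=-1)
    scores = fused @ text.t()                                   # (M, C) cosine similarities
    return scores.argmax(dim=-1)

# usage: 5 mask proposals, 512-d features, 4 candidate class prompts
pred = assign_open_vocab_labels(torch.randn(5, 512), torch.randn(5, 512), torch.randn(4, 512))
```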