• Title/Summary/Keyword: Multimodal Contents

Improved Transformer Model for Multimodal Fashion Recommendation Conversation System (멀티모달 패션 추천 대화 시스템을 위한 개선된 트랜스포머 모델)

  • Park, Yeong Joon;Jo, Byeong Cheol;Lee, Kyoung Uk;Kim, Kyung Sun
    • The Journal of the Korea Contents Association / v.22 no.1 / pp.138-147 / 2022
  • Recently, chatbots have been applied in various fields with good results, and e-commerce platforms are making many attempts to use chatbots in shopping mall product recommendation services. In this paper, for a conversation system that recommends the fashion a user wants based on the conversation between the user and the system and on fashion image information, we use the transformer model, which currently performs well in various AI fields such as natural language processing, voice recognition, and image recognition. We propose a multimodal improved transformer model that increases recommendation accuracy by using dialogue (text) and fashion (image) information together in data preprocessing and data representation. We also propose a method to improve accuracy through data improvement based on an analysis of the data. The proposed system achieves a recommendation accuracy of 0.6563 WKT (Weighted Kendall's tau), an improvement of 0.3191 WKT over the existing system's 0.3372 WKT.
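
For readers unfamiliar with the metric reported in this abstract, the following is a minimal sketch of a weighted Kendall's tau between a reference ranking and a predicted ranking. The pair weighting used here (the sum of reciprocal reference ranks) is an assumption chosen for illustration; the abstract does not state which weighting the authors used, so this is not necessarily the exact score behind the 0.6563 figure.

```python
from itertools import combinations


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def weighted_kendall_tau(reference, predicted):
    """Weighted Kendall's tau between two rankings of the same items.

    Both arguments map item -> rank (0 = best). Each pair (a, b) gets the
    weight 1/(r_a + 1) + 1/(r_b + 1), where r is the reference rank, so
    disagreements near the top of the list count more. This weighting is
    an illustrative assumption, not the paper's stated definition.
    """
    num = 0.0
    den = 0.0
    for a, b in combinations(reference, 2):
        w = 1.0 / (reference[a] + 1) + 1.0 / (reference[b] + 1)
        agree = sign(reference[a] - reference[b]) * sign(predicted[a] - predicted[b])
        num += w * agree  # +w for a concordant pair, -w for a discordant one
        den += w
    return num / den if den else 0.0


# Hypothetical rankings of four fashion items.
reference = {"coat": 0, "shirt": 1, "skirt": 2, "shoes": 3}
swap_top = {"coat": 1, "shirt": 0, "skirt": 2, "shoes": 3}
swap_bottom = {"coat": 0, "shirt": 1, "skirt": 3, "shoes": 2}
print(weighted_kendall_tau(reference, swap_top))
print(weighted_kendall_tau(reference, swap_bottom))
```

With this weighting, swapping the two top-ranked items lowers the score more than swapping the two bottom-ranked items, which is the property that distinguishes a weighted tau from the plain one.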

Environmental IoT-Enabled Multimodal Mashup Service for Smart Forest Fires Monitoring

  • Elmisery, Ahmed M.;Sertovic, Mirela
    • Journal of Multimedia Information System / v.4 no.4 / pp.163-170 / 2017
  • The Internet of Things (IoT) is a new paradigm for collecting, processing, and analyzing various contents in order to detect anomalies and monitor particular patterns in a specific environment. The collected data can be used to discover new patterns and offer new insights. IoT-enabled data mashup is a new technology that combines various types of information from multiple sources into a single web service. Mashup services open a new horizon for different applications. Environmental monitoring is an important tool for state and private organizations that are located in regions with environmental hazards and seek insights to detect hazards and locate them clearly. These organizations may use an IoT-enabled data mashup service to merge different types of datasets from different IoT sensor networks in order to improve their data analytics performance and the accuracy of their predictions. This paper presents an IoT-enabled data mashup service in which multimedia data is collected from various IoT platforms and then fed into an environmental cognition service that executes image processing techniques such as noise removal, segmentation, and feature extraction in order to detect interesting patterns in hazardous areas. The noise present in the captured images is eliminated by noise removal and background subtraction processes. A Markov-based approach is used to segment the possible regions of interest. The viable features within each region are extracted using a multiresolution wavelet transform and then fed into a discriminative classifier to extract various patterns. Experimental results show accurate detection performance and adequate processing time for the proposed approach. We also provide a data mashup scenario for an IoT-enabled environmental hazard detection service, together with experimental results.

Emotion Generation Model for Tutoring Agents (교육용 에이전트를 위한 감성 생성 모델)

  • Choo, Moon Won;Choi, Young Mie
    • Proceedings of the Korea Multimedia Society Conference / 2002.05d / pp.812-822 / 2002
  • The interface metaphor has gradually evolved from the desktop paradigm to an agent-oriented paradigm. Multimedia content can be regarded simply as a multimodal communication interface. In this respect, emotional agents are an active research topic for testing the possibility of realizing anthropomorphized and sympathetic interfaces. In this paper, an emotion generation model for tutoring agents is proposed.

Deformable Registration for MRI Medical Image

  • Li, Binglu;Kim, YoungSeop;Lee, Yong-Hwan
    • Journal of the Semiconductor & Display Technology / v.18 no.2 / pp.63-66 / 2019
  • Thanks to the development of medical imaging technology, different imaging modalities provide a large amount of useful information. However, because each modality captures different information, a single type of image gives only an incomplete picture. Combining different images allows doctors to obtain information from medical images comprehensively. Image registration based on mutual information has become one of the hot topics in the field of image registration because of its high registration accuracy and wide applicability. Because information theory-based registration does not depend on the gray value difference between images, it is well suited to multimodal medical image registration. However, methods based on mutual information have a robustness problem. The essential reason is that mutual information alone does not carry enough information about the relationship between pixel pairs, so it is unstable during the registration process: a large number of local extrema are generated, which ultimately cause misregistration. To overcome the shortcomings of mutual information registration, this paper proposes a registration method that combines image spatial structure information with mutual information.
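
To make the similarity measure in this abstract concrete, here is a minimal sketch of mutual information computed from the joint gray-level histogram of two aligned images. It covers only the plain mutual information term and assumes images given as flat sequences of integer intensities; the spatial structure term that the paper combines with it is not described in the abstract and is not reproduced here.

```python
from collections import Counter
from math import log


def mutual_information(image_a, image_b):
    """Mutual information (in nats) of two equally sized gray-level images.

    Images are flat sequences of integer intensities; pixels at the same
    index are assumed to be spatially aligned.
    """
    if len(image_a) != len(image_b):
        raise ValueError("images must have the same number of pixels")
    n = len(image_a)
    joint = Counter(zip(image_a, image_b))  # joint histogram of intensity pairs
    hist_a = Counter(image_a)               # marginal histogram of image A
    hist_b = Counter(image_b)               # marginal histogram of image B
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        # p_ab / (p_a * p_b), written with raw counts
        mi += p_ab * log(count * n / (hist_a[a] * hist_b[b]))
    return mi


# Toy 2x2 "images" flattened row by row.
fixed = [0, 0, 1, 1]
remapped = [5, 5, 9, 9]     # same structure, different gray values
unrelated = [0, 1, 0, 1]    # statistically independent of `fixed`
print(mutual_information(fixed, remapped))
print(mutual_information(fixed, unrelated))
```

The remapped image scores as high as the image compared with itself, which illustrates why the measure suits multimodal registration: it depends on how intensities co-occur, not on their actual values. A registration loop would evaluate this score over candidate transformations and keep the one that maximizes it.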

Exploratory Study on the Possibilities of Convergence with Music in Writing Classes (글쓰기 수업에서 음악과의 융합 가능성에 대한 탐색적 연구)

  • Lee, Ran
    • The Journal of the Korea Contents Association / v.20 no.8 / pp.88-100 / 2020
  • This is an exploratory study, based on a literature review, that examines the possibility and necessity of a multimodal writing curriculum for liberal education. Its purpose is to analyze existing research that used teaching methods associating music with writing, to identify the educational implications, and finally, in terms of writing education, to suggest possible forms of convergence between writing classes and music drawn from the results of those studies. The studies were categorized into four patterns: WAC, therapeutic effects, materials for writing, and new literacy. Based on Meyrowitz's perspective, first, music can be used as a circumstance, meaning that a teacher can indirectly draw on the emotional, evocative, and healing effects of background music. Second, music can play an important role as material for thinking and writing, which is the most commonly used pattern today; when music is used as a kind of reading material, effects are found in the affective, cognitive, and strategic domains. Third, convergent writing of music and narrative is suggested: in this kind of writing class, music is an independent language that can interact with narrative and construct textual meaning. These three dimensions of convergence have different perspectives, but they sometimes occur at the same time or as a connected pattern. This study proposes that writing teachers need to improve their competence in music as well, and to take a professional interest in developing their skills in teaching convergent writing with music. Finally, this study stresses that team teaching can be an alternative for them.

A Review of Haptic Perception: Focused on Sensation and Application

  • Song, Joobong;Lim, Ji Hyoun;Yun, Myung Hwan
    • Journal of the Ergonomics Society of Korea / v.31 no.6 / pp.715-723 / 2012
  • Objective: The aim of this study is to review research on haptic perception from three perspectives (cutaneous and proprioceptive sensations, active and passive touch, and cognition and emotion) and then to identify issues for implementing haptic interactions. Background: Although haptic technologies have improved and become practical, more research on methods of application is still needed to realize multimodal interaction technology. A systematic approach to exploring haptic perception is required to understand emotional experience and social messages as well as tactile feedback. Method: Content analysis was conducted to analyze trends in haptic-related research. Changes in issues and topics were investigated using sensational dimensions and the different contents delivered via tactile perception. Result: The research opportunities identified were haptic perception in various body segments and emotion-related proprioceptive sensation. Conclusion: Understanding the mechanism by which users perceive haptic stimuli would help to develop effective haptic interaction, and this study provides insight into what to focus on for the future of haptic interaction. Application: This research is expected to help bring presence and emotional response, conveyed through haptic perception, to fields such as human-robot, human-device, and telecommunication interaction.

Improvement of Environment Recognition using Multimodal Signal (멀티 신호를 이용한 환경 인식 성능 개선)

  • Park, Jun-Qyu;Baek, Seong-Joon
    • The Journal of the Korea Contents Association / v.10 no.12 / pp.27-33 / 2010
  • In this study, we conducted classification experiments with a GMM (Gaussian Mixture Model) on combined features extracted from a microphone, a gyro sensor, and an acceleration sensor in 9 different environment types. Existing context-awareness studies have tried to recognize the environmental situation mainly from environmental sound data captured with a microphone, but recognition was limited by the structural characteristics of environmental sound, which is composed of combinations of various noises. Hence we propose additional methods that add gyro sensor and acceleration sensor data in order to reflect the movement of the recognition agent. According to the experimental results, combining acceleration sensor data with the existing environmental sound features improves recognition performance by more than 5% compared with existing methods that use only environmental sound features from the microphone.
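
The classification scheme in this abstract can be sketched as follows: one GMM per environment type, with a fused feature vector assigned to the class whose GMM gives the highest log-likelihood. Several details are assumptions for illustration and do not come from the abstract: fusion by simple concatenation, diagonal covariances, and the hand-written model parameters and class names (in practice the parameters would be estimated from training data, typically with EM).

```python
from math import exp, log, pi


def gmm_log_likelihood(x, components):
    """Log-likelihood of feature vector x under a diagonal-covariance GMM.

    components is a list of (weight, means, variances) tuples.
    """
    log_terms = []
    for weight, means, variances in components:
        log_pdf = 0.0
        for xi, mu, var in zip(x, means, variances):
            log_pdf += -0.5 * (log(2 * pi * var) + (xi - mu) ** 2 / var)
        log_terms.append(log(weight) + log_pdf)
    peak = max(log_terms)  # log-sum-exp for numerical stability
    return peak + log(sum(exp(t - peak) for t in log_terms))


def fuse(sound, gyro, accel):
    """Feature-level fusion by concatenation (an assumed fusion scheme)."""
    return list(sound) + list(gyro) + list(accel)


def classify(x, class_models):
    """Return the class whose GMM assigns x the highest log-likelihood."""
    return max(class_models, key=lambda name: gmm_log_likelihood(x, class_models[name]))


# Hypothetical two-component models over [sound, gyro, accel] features.
models = {
    "office": [(0.6, [0.0, 0.0, 0.0], [1.0, 1.0, 1.0]),
               (0.4, [1.0, 0.0, 0.0], [1.0, 1.0, 1.0])],
    "street": [(0.5, [5.0, 3.0, 3.0], [1.0, 1.0, 1.0]),
               (0.5, [6.0, 4.0, 4.0], [1.0, 1.0, 1.0])],
}
print(classify(fuse([0.2], [0.1], [-0.1]), models))
print(classify(fuse([5.5], [3.5], [3.5]), models))
```

Dropping the gyro and acceleration entries from `fuse` gives the sound-only baseline that the abstract compares against.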

Leveraging Social Media for Enriching Disaster related Location Trustiness (재난 관련 위치 신뢰도 향상을 위한 소셜 미디어 활용)

  • Nguyen, Van-Quyet;Nguyen, Giang-Truong;Nguyen, Sinh-Ngoc;Kim, Kyungbaek
    • Journal of Digital Contents Society / v.18 no.3 / pp.567-575 / 2017
  • Location-based services play an important role in many applications such as disaster warning systems and recommendation systems. These applications often require not only location information (e.g., name, latitude, longitude) but also the impact of events (e.g., earthquakes, typhoons) on locations. Recently, methods for calculating location trustiness from multimodal information such as earthquake information and disaster sensor data have been studied in order to provide the impact of an event on a location. In the previous approach, a linear decrement of the impact value of an event is applied to obtain the location trustiness of a specific location. In this paper, we propose a new approach that enriches location trustiness, that is, the impact of an event on a location, by additionally using social media information. First, we design a collection system for earthquake information and social media data. Second, we present an approach to location trustiness calculation based on earthquake information. Finally, we propose a new approach that enriches location trustiness by augmenting the trustiness in a spatially distributed manner based on social media.
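
The "linear decrement of the impact value" mentioned in this abstract can be illustrated with a short sketch in which an event's impact falls off linearly with great-circle distance and reaches zero at a cutoff radius. The cutoff radius, the function names, and the use of the haversine formula are assumptions made for this example; the abstract does not give the actual decrement formula or how trustiness is derived from it.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    h = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(min(1.0, sqrt(h)))


def linear_impact(event_impact, distance_km, max_radius_km):
    """Impact at a location, decreasing linearly with distance.

    max_radius_km is a hypothetical cutoff beyond which the impact is zero.
    """
    if max_radius_km <= 0:
        raise ValueError("max_radius_km must be positive")
    remaining = 1.0 - distance_km / max_radius_km
    return event_impact * max(0.0, remaining)


# Hypothetical earthquake at (35.0, 129.0) with impact value 10.
d = haversine_km(35.0, 129.0, 35.5, 129.0)
print(linear_impact(10.0, d, max_radius_km=100.0))
```

The approach proposed in the paper would then adjust values like this one using social media reports from the surrounding area.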

A Study on the Design of Digital Twin System and Required Function for Underground Lifelines (지하공동구 디지털 트윈 체계 및 요구기능 설계에 관한 연구)

  • Jeong, Min-Woo;Lee, Hee-Seok;Shin, Dong-Bin
    • The Journal of the Korea Contents Association / v.21 no.7 / pp.248-258 / 2021
  • Twenty-four-hour monitoring is required to maintain the city's lifeline function in underground facilities for public utilities. It is also necessary to develop technology that compensates for the shortage of human resources. General management methods have difficulty reflecting the specific characteristics of underground space management. This study proposes requirements for a digital twin system for underground facilities for public utilities. The concept of space is divided into physical space and virtual space: the physical space comprises the type and layout of the sensors that form the basis for constructing the multimodal image sensor system, and the virtual space comprises the system architecture. The study also suggests system functions according to task. The digital twin is expected to be effective in preventing disasters and maintaining the lifeline function of the city.

Video Highlight Prediction Using GAN and Multiple Time-Interval Information of Audio and Image (오디오와 이미지의 다중 시구간 정보와 GAN을 이용한 영상의 하이라이트 예측 알고리즘)

  • Lee, Hansol;Lee, Gyemin
    • Journal of Broadcast Engineering / v.25 no.2 / pp.143-150 / 2020
  • Huge amounts of content are uploaded every day to various streaming platforms. Among these videos, game and sports videos account for a large portion. Broadcasting companies sometimes create and provide highlight videos, but this work is time-consuming and costly. In this paper, we propose models that automatically predict highlights in games and sports matches. While most previous approaches use visual information exclusively, our models use both audio and visual information and offer a way to understand the short-term and long-term flows of videos. We also describe models that incorporate a GAN to find better highlight features. The proposed models are evaluated on e-sports and baseball videos.