• Title/Summary/Keyword: Multimodal Systems (멀티모달시스템)

116 search results

A Design of AI Cloud Platform for Safety Management on High-risk Environment (고위험 현장의 안전관리를 위한 AI 클라우드 플랫폼 설계)

  • Ki-Bong, Kim
    • Journal of Advanced Technology Convergence
    • /
    • v.1 no.2
    • /
    • pp.01-09
    • /
    • 2022
  • Safety issues at companies and public institutions can no longer be postponed: a major safety accident causes not only direct financial loss but also a serious indirect loss of social trust, and in the case of a fatal accident the damage is even greater. Accordingly, as companies and public institutions expand their investment in industrial safety education and prevention, system development is underway that combines open AI learning-model creation technology enabling safety management services unaffected by user behavior at high-risk industrial sites, inter-AI collaboration among edge terminals, cloud-edge terminal linkage, multi-modal risk-situation determination, and AI model training support. In particular, with the development and spread of artificial intelligence, research applying the technology to safety problems is becoming active. This paper therefore presents a design method for an open cloud platform that can support AI model training for safety management at high-risk sites.

Multicontents Integrated Image Animation within Synthesis for High Quality Multimodal Video (고화질 멀티 모달 영상 합성을 통한 다중 콘텐츠 통합 애니메이션 방법)

  • Jae Seung Roh;Jinbeom Kang
    • Journal of Intelligence and Information Systems
    • /
    • v.29 no.4
    • /
    • pp.257-269
    • /
    • 2023
  • There is currently a burgeoning demand for image synthesis from photos and videos using deep learning models. Existing video synthesis models solely extract motion information from the provided video to generate animation effects on photos. However, these synthesis models encounter challenges in achieving accurate lip synchronization with the audio and maintaining the image quality of the synthesized output. To tackle these issues, this paper introduces a novel framework based on an image animation approach. Within this framework, upon receiving a photo, a video, and audio input, it produces an output that not only retains the unique characteristics of the individuals in the photo but also synchronizes their movements with the provided video, achieving lip synchronization with the audio. Furthermore, a super-resolution model is employed to enhance the quality and resolution of the synthesized output.

A Study on Success Strategies for Generative AI Services in Mobile Environments: Analyzing User Experience Using LDA Topic Modeling Approach (모바일 환경에서의 생성형 AI 서비스 성공 전략 연구: LDA 토픽모델링을 활용한 사용자 경험 분석)

  • Soyon Kim;Ji Yeon Cho;Sang-Yeol Park;Bong Gyou Lee
    • Journal of Internet Computing and Services
    • /
    • v.25 no.4
    • /
    • pp.109-119
    • /
    • 2024
  • This study aims to contribute to early research on on-device AI in an environment where generative AI-based services on mobile and other on-device platforms are increasing. To derive success strategies for generative AI-based chatbot services in a mobile environment, over 200,000 actual user-experience reviews collected from the Google Play Store were analyzed using LDA topic modeling. Interpreting the derived topics through the Information System Success Model (ISSM), tutoring, limitations of responses, and hallucination and outdated information were linked to information quality; multimodal service, response quality, and device-interoperability issues were linked to system quality; inter-device compatibility, utility of the service, quality of premium services, and account-related difficulties were linked to service quality; and finally, creative collaboration was linked to net benefits. Humanization of generative AI emerged as a new experience factor not explained by the existing model. By explaining specific positive and negative experience dimensions from the user's perspective on a theoretical basis, this study suggests directions for future research and provides strategic insights for companies seeking to improve and supplement their services for successful business operations.

Multi-classification of Osteoporosis Grading Stages Using Abdominal Computed Tomography with Clinical Variables : Application of Deep Learning with a Convolutional Neural Network (멀티 모달리티 데이터 활용을 통한 골다공증 단계 다중 분류 시스템 개발: 합성곱 신경망 기반의 딥러닝 적용)

  • Tae Jun Ha;Hee Sang Kim;Seong Uk Kang;DooHee Lee;Woo Jin Kim;Ki Won Moon;Hyun-Soo Choi;Jeong Hyun Kim;Yoon Kim;So Hyeon Bak;Sang Won Park
    • Journal of the Korean Society of Radiology
    • /
    • v.18 no.3
    • /
    • pp.187-201
    • /
    • 2024
  • Osteoporosis is a major global health issue, often remaining undetected until a fracture occurs. To facilitate early detection, deep learning (DL) models were developed to classify osteoporosis from abdominal computed tomography (CT) scans. This study used retrospectively collected data from 3,012 contrast-enhanced abdominal CT scans. The DL models were constructed using image data, demographic/clinical information, and multi-modality data, respectively. Patients were categorized into normal, osteopenia, and osteoporosis groups based on T-scores obtained from dual-energy X-ray absorptiometry. The models showed high accuracy and effectiveness, with the combined-data model performing best, achieving an area under the receiver operating characteristic curve of 0.94 and an accuracy of 0.80. The image-based model also performed well, while the demographic-data model had lower accuracy and effectiveness. In addition, the DL model was interpreted with gradient-weighted class activation mapping (Grad-CAM) to highlight clinically relevant features in the images, revealing the femoral neck as a common fracture site. The study shows that DL can accurately identify osteoporosis stages from clinical data, indicating the potential of abdominal CT scans for early osteoporosis detection and for reducing fracture risk through prompt treatment.
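The multi-modality model the abstract describes, combining a CT image branch with clinical variables, can be sketched as a two-branch network fused before a 3-way head. The layer sizes, backbone, and clinical-feature count below are illustrative assumptions; the paper's exact architecture is not given here.

```python
# Assumed two-branch fusion architecture (illustrative, not the paper's):
# a small CNN encodes the CT slice, an MLP encodes demographic/clinical
# variables, and the concatenated features feed a 3-class head
# (normal / osteopenia / osteoporosis).
import torch
import torch.nn as nn

class MultiModalOsteoNet(nn.Module):
    def __init__(self, n_clinical=5, n_classes=3):
        super().__init__()
        self.cnn = nn.Sequential(                  # image branch
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.mlp = nn.Sequential(                  # clinical branch
            nn.Linear(n_clinical, 16), nn.ReLU(),
        )
        self.head = nn.Linear(16 + 16, n_classes)  # fused classifier

    def forward(self, image, clinical):
        fused = torch.cat([self.cnn(image), self.mlp(clinical)], dim=1)
        return self.head(fused)

model = MultiModalOsteoNet()
logits = model(torch.randn(4, 1, 64, 64), torch.randn(4, 5))
```

Grad-CAM would then be computed against the chosen class logit with respect to the last convolutional feature map of the image branch, which is how the femoral-neck region can be highlighted.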

A Study on Biometric Model for Information Security (정보보안을 위한 생체 인식 모델에 관한 연구)

  • Jun-Yeong Kim;Se-Hoon Jung;Chun-Bo Sim
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.19 no.1
    • /
    • pp.317-326
    • /
    • 2024
  • Biometric recognition is a technology that identifies a person by extracting information on biometric and behavioral characteristics with a specific device. Cyber threats such as forgery, duplication, and hacking of biometric characteristics are increasing in the field of biometrics. In response, security systems have been strengthened and made more complex, which makes them difficult for individuals to use. To address this, multimodal biometric models are being studied. Existing studies have suggested feature fusion methods, but comparisons between these methods are insufficient. Therefore, in this paper, we compared and evaluated fusion methods for multimodal biometric models using fingerprint, face, and iris images. VGG-16, ResNet-50, EfficientNet-B1, EfficientNet-B4, EfficientNet-B7, and Inception-v3 were used for feature extraction, and the 'Sensor-Level', 'Feature-Level', 'Score-Level', and 'Rank-Level' fusion methods were compared and evaluated. In the comparative evaluation, the EfficientNet-B7 model showed 98.51% accuracy and high stability with the 'Feature-Level' fusion method. However, because the EfficientNet-B7 model is large, model lightweighting studies are needed for biometric feature fusion.
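Two of the fusion levels the paper compares can be contrasted in a few lines. This is a generic illustration under assumed random embeddings, not the paper's implementation: feature-level fusion concatenates per-modality feature vectors before a single matcher, while score-level fusion matches each modality separately and combines the scores.

```python
# Illustrative feature-level vs score-level fusion with random stand-ins
# for fingerprint / face / iris embeddings (128-dim each, assumed sizes).
import numpy as np

rng = np.random.default_rng(0)
enrolled = {m: rng.normal(size=128) for m in ("fingerprint", "face", "iris")}
# A probe from the same person: slightly perturbed versions of the enrolled vectors.
probe = {m: v + rng.normal(scale=0.1, size=v.shape) for m, v in enrolled.items()}

# Feature-level: concatenate modality features into one vector, match once.
fused = np.concatenate(list(enrolled.values()))     # 384-dim fused feature

# Score-level: match each modality independently, then combine the scores.
def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

scores = [cosine(enrolled[m], probe[m]) for m in enrolled]
fused_score = sum(scores) / len(scores)             # simple-mean rule
```

Sensor-level fusion would instead merge the raw captures before feature extraction, and rank-level fusion would combine per-modality candidate rankings, completing the four levels the paper evaluates.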

Design and Evaluation of Multisensory Interactions to Improve Artwork Appreciation Accessibility for the Visually Impaired People (시각장애인의 미술 작품 감상 접근성을 높이는 다중감각 인터랙션의 설계 및 평가)

  • Park, Gyeongbin;Jo, Sunggi;Jung, Chanho;Choi, Hyojin;Hong, Taelim;Jung, Jaeho;Yang, Changjun;Wang, Chao;Cho, Jundong;Lee, Sangwon
    • Science of Emotion and Sensibility
    • /
    • v.23 no.1
    • /
    • pp.41-56
    • /
    • 2020
  • This study proposes multisensory interaction techniques that help visually impaired people appreciate and understand artworks through non-visual senses such as touch, hearing, and smell. To verify the developed interaction techniques, a user study based on qualitative interviews was conducted with visually impaired people about their experience of appreciating artwork through the multisensory interaction system. The user test shows that the multisensory interactions generally help them appreciate and understand the artwork and also give them satisfaction through the appreciation experience. However, it also indicates that some multisensory interactions confused the visually impaired participants or could not be perceived during appreciation. Based on these outcomes, this study contributes specific development guidelines and indicators for non-visual multisensory interactions as a technical alternative for improving the accessibility of cultural and artistic activities for the visually impaired. Furthermore, this study is expected to contribute to building a technical background that can provide comprehensive sensory experiences not only to blind people but also to non-blind people such as children and the elderly, through universal interaction techniques that go beyond the existing vision-oriented, fragmentary experience.