• Title/Summary/Keyword: Visual Intelligence


Modelling of the Information Process with Visual and Audio in Human Brain (두뇌의 시·청각 정보처리 과정의 모델링)

  • 김성주;서재용;조현찬;김성현;전홍태
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 2002.05a / pp.187-190 / 2002
  • The human brain simultaneously performs functions such as judgment, inference, and memory using inputs of many different forms. For this reason, the human brain can be regarded as a vast intelligent information processor. Information-processing mechanisms are being developed in various forms, and most intelligent information-processing mechanisms apply soft computing techniques. In this paper, we use soft computing techniques to model the brain's visual and auditory information processing as a single structure. Visual information and auditory information are processed in separate modules, and the overall structure is modular so that processing can ultimately draw on both sensory inputs. By modeling the simultaneous processing of two different kinds of information, we aim to solve complex problems, handle diverse cases, and lay a foundation for modeling the human brain.
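The modular structure the abstract describes — separate visual and auditory pathways whose outputs are finally combined — can be sketched in a few lines. The module functions, features, and fusion rule below are illustrative assumptions, not the paper's actual model:

```python
import numpy as np

def visual_module(pixels):
    """Toy visual pathway: reduce an image-like array to a feature vector."""
    return np.array([pixels.mean(), pixels.std()])

def audio_module(samples):
    """Toy auditory pathway: reduce a waveform to a feature vector."""
    return np.array([np.abs(samples).mean(), samples.std()])

def fuse(v_feat, a_feat):
    """Late fusion: concatenate modality features for a joint decision stage."""
    return np.concatenate([v_feat, a_feat])

image = np.random.default_rng(0).random((8, 8))
sound = np.sin(np.linspace(0, 2 * np.pi, 100))
joint = fuse(visual_module(image), audio_module(sound))
print(joint.shape)  # (4,)
```

The key design point is that each modality is processed independently and only the final stage sees both, matching the paper's module-per-sense architecture.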


Music Generation Algorithm based on the Color-Emotional Effect of a Painting (그림의 색채 감정 효과를 기반으로 한 음악 생성 알고리즘)

  • Choi, Hee Ju;Hwang, Jung-Hun;Ryu, Shinhye;Kim, Sangwook
    • Journal of Korea Multimedia Society / v.23 no.6 / pp.765-771 / 2020
  • To enable AI (artificial intelligence) to perceive visual emotions, this study attempts to create music centered on color, the element that evokes emotion in paintings. Previous image-based music generation studies, lacking musical elements, were limited to playing notes unrelated to the picture. In this paper, we propose a new algorithm that sets the musical group from the average color of the picture and then produces music by adding a diatonic chord progression and deleting notes using the median value. The results obtained with the proposed algorithm are then analyzed.
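The pipeline the abstract outlines — average color, a diatonic progression, median-based note deletion — can be sketched as follows. The brightness-to-progression mapping, the C major scale, and the above-median pruning rule are all illustrative assumptions standing in for the paper's actual rules:

```python
import numpy as np

C_MAJOR = [60, 62, 64, 65, 67, 69, 71]  # MIDI pitches of the C major scale

def average_color(img):
    """Mean RGB over all pixels of an (H, W, 3) array."""
    return img.reshape(-1, 3).mean(axis=0)

def pick_scale_degrees(avg_rgb):
    """Map average brightness to a diatonic progression (I-IV-V-I for bright
    pictures); the brightness rule here is an assumption for illustration."""
    progression = [0, 3, 4, 0] if avg_rgb.mean() > 127 else [0, 5, 3, 4]
    return [C_MAJOR[d] for d in progression]

def median_prune(notes):
    """Keep only notes above the median value, a stand-in for the paper's
    median-based sound deletion step."""
    med = int(np.median(notes))
    return [n for n in notes if n > med]

img = np.full((4, 4, 3), 200, dtype=np.uint8)  # a uniformly bright picture
roots = pick_scale_degrees(average_color(img))  # [60, 65, 67, 60]
melody = median_prune(roots)                    # [65, 67]
print(roots, melody)
```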

Research Trends for Deep Learning-Based High-Performance Face Recognition Technology (딥러닝 기반 고성능 얼굴인식 기술 동향)

  • Kim, H.I.;Moon, J.Y.;Park, J.Y.
    • Electronics and Telecommunications Trends / v.33 no.4 / pp.43-53 / 2018
  • As face recognition (FR) has been well studied over the past decades, FR technology has been applied to many real-world applications such as surveillance and biometric systems. In real-world scenarios, however, FR performance is known to degrade significantly owing to variations in face images such as pose, illumination, and low resolution. Recently, visual intelligence technology has grown rapidly owing to advances in deep learning, which have also improved FR performance; deep-learning based FR has even been reported to surpass the performance of human perception. In this article, we discuss deep-learning based high-performance FR technologies in terms of representative deep-learning based FR architectures and recent FR algorithms robust to face image variations (i.e., pose-robust FR, illumination-robust FR, and video FR). In addition, we survey the large face image datasets widely adopted for performance evaluation of the most recent deep-learning based FR algorithms.

CAD system development for design of limit gauges (限界 게이지의 自動 設計에 관한 硏究)

  • Lee, Dong-Ju;Lee, Kwang-Gil
    • Journal of the Korean Society for Precision Engineering / v.13 no.1 / pp.38-44 / 1996
  • A CAD system for the design and drawing of limit gauges was constructed and developed. The system was implemented in Visual Basic. Using this system, drawings, together with the data needed to manufacture limit gauges, are generated on screen, to file, and to a printer. The database was built by referring to handbooks, textbooks, and the relevant standards and regulations. In actual applications the system proved to be a powerful tool for the design and drawing of limit gauges: its output drawings are in good agreement with the drawings and data of the relevant standards and regulations.


A Study on Quantitative Space Analysis Model - Focused on a Visual Analysis and Image Analysis by Digital Image Processing - (정량적 공간분석 모델에 관한 연구 - 시각 분석과 영상처리에 의한 이미지 분석 모델을 중심으로 -)

  • 이혁준;이종석
    • Korean Institute of Interior Design Journal / no.37 / pp.136-143 / 2003
  • Users' demands on space are diversifying. These demands include rational space and form, harmonious composition with the surroundings, and the aesthetic satisfaction brought by personal tastes and preferences. In addition, the models introduced in the design process, in their various forms, tend to lack an objective standard for decision making, so a clear alternative plan and process are difficult to find. To address these problems, the aims of this study are to propose a model for analyzing images and space using image-processing techniques studied in the field of artificial intelligence, based on acquired digital images, and to verify the applicability of such an analysis model, together with 'Isovist', to quantitative analysis. The model can be combined with other analysis models, such as digital image processing and 'Isovist', and further study can address the limitations of this work.
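An isovist — the set of points visible from a given viewpoint — is the quantitative spatial measure the abstract relies on. A minimal sketch on an occupancy grid, where the grid layout, ray count, and visibility rule are illustrative assumptions rather than the paper's method:

```python
import numpy as np

def isovist_area(grid, origin, n_rays=360, step=0.25):
    """Count grid cells visible from `origin` by casting rays outward
    and stopping each ray at the first wall cell (value 1)."""
    h, w = grid.shape
    visible = set()
    for theta in np.linspace(0, 2 * np.pi, n_rays, endpoint=False):
        x, y = float(origin[1]) + 0.5, float(origin[0]) + 0.5
        dx, dy = np.cos(theta) * step, np.sin(theta) * step
        while True:
            x, y = x + dx, y + dy
            r, c = int(y), int(x)
            if not (0 <= r < h and 0 <= c < w) or grid[r, c] == 1:
                break  # ray left the grid or hit a wall
            visible.add((r, c))
    return len(visible)

room = np.zeros((9, 9), dtype=int)
room[4, 2:7] = 1                       # an internal wall blocking sight lines
open_area = isovist_area(room, (1, 4)) # viewpoint above the wall
print(open_area)
```

Comparing `open_area` for rooms with and without the wall gives exactly the kind of quantitative, viewpoint-dependent measure of spatial openness the study uses for analysis.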

Speech Emotion Recognition Using 2D-CNN with Mel-Frequency Cepstrum Coefficients

  • Eom, Youngsik;Bang, Junseong
    • Journal of information and communication convergence engineering / v.19 no.3 / pp.148-154 / 2021
  • With the advent of context-aware computing, many attempts have been made to understand emotions. Among them, Speech Emotion Recognition (SER) recognizes a speaker's emotions from speech information. SER succeeds by selecting distinctive features and classifying them in an appropriate way. In this paper, the performance of SER using neural network models (e.g., a fully connected network (FCN) and a convolutional neural network (CNN)) with Mel-Frequency Cepstral Coefficients (MFCC) is examined in terms of the accuracy and distribution of emotion recognition. On the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS) dataset, after tuning model parameters, a two-dimensional Convolutional Neural Network (2D-CNN) with MFCC showed the best performance, with an average accuracy of 88.54% over five emotions (anger, happiness, calm, fear, and sadness) of men and women. In addition, examining the distribution of recognition accuracies across the neural network models shows that the 2D-CNN with MFCC can be expected to achieve an overall accuracy of 75% or more.
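The reason MFCC-style features pair naturally with a 2D-CNN is that they turn a 1-D waveform into a 2-D time-by-coefficient array. A simplified sketch of that transformation, using only NumPy; real MFCCs also apply a mel filterbank before the DCT, which this toy version omits:

```python
import numpy as np

def mfcc_like(signal, n_fft=256, hop=128, n_coeff=13):
    """Framed log power spectrum followed by a DCT, yielding a 2-D
    (frames x coefficients) array suitable as CNN input. This is a
    simplified stand-in for a real MFCC pipeline (no mel filterbank)."""
    frames = [signal[i:i + n_fft]
              for i in range(0, len(signal) - n_fft + 1, hop)]
    spec = np.abs(np.fft.rfft(np.array(frames) * np.hanning(n_fft), axis=1)) ** 2
    log_spec = np.log(spec + 1e-10)
    n = log_spec.shape[1]
    # DCT-II basis, keeping the first n_coeff coefficients
    dct = np.cos(np.pi / n * (np.arange(n) + 0.5)[None, :]
                 * np.arange(n_coeff)[:, None])
    return log_spec @ dct.T

wave = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)  # 1 s of A4 at 16 kHz
feat = mfcc_like(wave)
print(feat.shape)  # (frames, 13) — the 2-D "image" the 2D-CNN consumes
```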

Railroad Surface Defect Segmentation Using a Modified Fully Convolutional Network

  • Kim, Hyeonho;Lee, Suchul;Han, Seokmin
    • KSII Transactions on Internet and Information Systems (TIIS) / v.14 no.12 / pp.4763-4775 / 2020
  • This research aims to develop a deep learning-based method that automatically detects and segments the defects on railroad surfaces to reduce the cost of visual inspection of the railroad. We developed our segmentation model by modifying a fully convolutional network model [1], a well-known segmentation model used for machine learning, to detect and segment railroad surface defects. The data used in this research are images of the railroad surface with one or more defect regions. Railroad images were cropped to a suitable size, considering the long height and relatively narrow width of the images. They were also normalized based on the variance and mean of the data images. Using these images, the suggested model was trained to segment the defect regions. The proposed method showed promising results in the segmentation of defects. We consider that the proposed method can facilitate decision-making about railroad maintenance, and potentially be applied for other analyses.
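The preprocessing the abstract describes — cropping the tall, narrow rail images to a suitable size and normalizing by mean and variance — can be sketched as below. The patch size, stride, and global (rather than per-channel) normalization are illustrative assumptions:

```python
import numpy as np

def crop_and_normalize(img, crop_h=64, crop_w=64):
    """Crop a tall, narrow rail image into square patches along its height,
    then normalize the batch by its mean and standard deviation."""
    h, w = img.shape
    patches = [img[r:r + crop_h, :crop_w]
               for r in range(0, h - crop_h + 1, crop_h)]
    patches = np.stack(patches).astype(np.float64)
    return (patches - patches.mean()) / (patches.std() + 1e-8)

rail = np.random.default_rng(1).random((256, 64))  # tall, narrow rail image
batch = crop_and_normalize(rail)
print(batch.shape)  # (4, 64, 64): four square patches ready for the FCN
```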

Transforming Text into Video: A Proposed Methodology for Video Production Using the VQGAN-CLIP Image Generative AI Model

  • SukChang Lee
    • International Journal of Advanced Culture Technology / v.11 no.3 / pp.225-230 / 2023
  • With the development of AI technology, there is growing discussion of Text-to-Image Generative AI. We present a generative-AI video production method and a methodology for producing personalized AI-generated videos, with the aim of broadening the landscape of the video domain. We examine the procedural steps of AI-driven video production and directly implement a video-creation approach using the VQGAN-CLIP model. The outputs of the VQGAN-CLIP model exhibited relatively moderate resolution and frame rate and were predominantly abstract images, characteristics that suggest applicability to OTT-based video content or the visual arts. AI-driven video production techniques are expected to see increased use in future work.

Deep Reinforcement Learning in ROS-based autonomous robot navigation

  • Roland, Cubahiro;Choi, Donggyu;Jang, Jongwook
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2022.05a / pp.47-49 / 2022
  • Robot navigation has improved markedly since the rediscovery of the potential of Artificial Intelligence (AI) and the attention it has garnered in research circles. A notable achievement in the area was the application of Deep Learning (DL) to computer vision, with outstanding everyday applications such as face recognition and object detection. However, robotics in general still depends on human input in areas such as localization and navigation. In this paper, we propose a case study of robot navigation based on deep reinforcement learning. We examine the benefits of switching from traditional ROS-based navigation algorithms to machine-learning approaches and methods. We describe the state of the art by introducing the concepts of Reinforcement Learning (RL), Deep Learning (DL), and Deep Reinforcement Learning (DRL) before focusing on DRL-based visual navigation. The case study is a prelude to real-life deployment in which a mobile navigation agent learns to navigate unknown areas.
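The RL loop underlying DRL navigation can be shown with tabular Q-learning on a toy corridor; DRL replaces the table below with a neural network fed by visual input. The environment, reward shaping, and hyperparameters are illustrative assumptions, not the paper's setup:

```python
import numpy as np

# Tabular Q-learning on a 1-D corridor: the agent starts at state 0 and
# must learn to reach the goal at state 4 by moving left (-1) or right (+1).
rng = np.random.default_rng(0)
n_states, goal, actions = 5, 4, [-1, 1]
Q = np.zeros((n_states, len(actions)))
alpha, gamma, eps = 0.5, 0.9, 0.1  # learning rate, discount, exploration

for _ in range(500):  # training episodes
    s = 0
    while s != goal:
        # epsilon-greedy action selection
        a = rng.integers(2) if rng.random() < eps else int(Q[s].argmax())
        s2 = min(max(s + actions[a], 0), n_states - 1)
        r = 1.0 if s2 == goal else -0.01       # small step penalty (assumed)
        # Q-learning update toward the bootstrapped target
        Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])
        s = s2

# The learned greedy policy moves right toward the goal from every state.
policy = [int(Q[s].argmax()) for s in range(goal)]
print(policy)  # [1, 1, 1, 1]
```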


A Study on the Automatic Door Speed Control Design by the Identification of Auxiliary Pedestrian Using Artificial Intelligence (AI) (인공지능(AI)를 활용한 보조보행기구 식별에 따른 자동문 속도 조절 설계에 대한 연구)

  • Kim, yu-min;Choi, kyu-min;Shin, jun-pyo;Seong, Seung-min;Lee, byung-kwon
    • Proceedings of the Korean Society of Computer Information Conference / 2021.01a / pp.237-239 / 2021
  • In this paper, we propose a method of adjusting automatic door speed after recognizing assistive walking devices with the YOLO system. Based on data from neural network training with Visual Studio, OpenCV, and CUDA so that assistive walking devices can be recognized, we implemented adjustment of the automatic door's speed by recognizing assistive walking devices through real-time monitoring with a Raspberry Pi and camera module. This allows people with mobility impairments to enter and exit buildings smoothly.
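The control decision downstream of the detector can be sketched simply: if the YOLO-style detector reports an assistive device in the current frame, the door opens at a slower speed. The class names and speed values below are assumptions for illustration, not values from the paper:

```python
# Labels an object detector might report for assistive walking devices
# (hypothetical class names, not the paper's actual label set).
ASSISTIVE_CLASSES = {"wheelchair", "walker", "crutches"}

def door_speed(detections, normal=1.0, slow=0.4):
    """Return the door opening speed for a frame's detected labels:
    slow down whenever any assistive device is present."""
    return slow if ASSISTIVE_CLASSES.intersection(detections) else normal

print(door_speed(["person"]))                # 1.0
print(door_speed(["person", "wheelchair"]))  # 0.4
```

In a deployment, `detections` would be the label list produced per frame by the trained network running on the Raspberry Pi camera feed.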
