• Title/Abstract/Keywords: Visual Intelligence

Search results: 251 (processing time: 0.028 s)

Speech Emotion Recognition Using 2D-CNN with Mel-Frequency Cepstrum Coefficients

  • Eom, Youngsik;Bang, Junseong
    • Journal of information and communication convergence engineering / Vol. 19, No. 3 / pp. 148-154 / 2021
  • With the advent of context-aware computing, many attempts have been made to understand emotions. Among these, Speech Emotion Recognition (SER) is a method of recognizing a speaker's emotions from speech information. Successful SER depends on selecting distinctive features and classifying them appropriately. In this paper, the performance of SER using neural network models (e.g., fully connected network (FCN), convolutional neural network (CNN)) with Mel-Frequency Cepstral Coefficients (MFCC) is examined in terms of the accuracy and distribution of emotion recognition. On the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS) dataset, after tuning model parameters, a two-dimensional Convolutional Neural Network (2D-CNN) model with MFCC showed the best performance, with an average accuracy of 88.54% across five emotions (anger, happiness, calm, fear, and sadness) for male and female speakers. In addition, examining the distribution of emotion recognition accuracies across the neural network models suggests that the 2D-CNN with MFCC can be expected to achieve an overall accuracy of 75% or more.
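As a rough illustration of the mel-scale step that underlies MFCC features, the sketch below converts between hertz and mels and spaces filter center frequencies evenly on the mel scale. The 2595/700 formula is one commonly used convention; the function names, filter count, and frequency range are illustrative assumptions and are not taken from the paper.

```python
import math


def hz_to_mel(f_hz: float) -> float:
    """Convert a frequency in Hz to mels (common 2595 * log10(1 + f/700) convention)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(n_filters: int, f_min: float, f_max: float) -> list[float]:
    """Center frequencies (Hz) of n_filters filters spaced evenly in mel between f_min and f_max."""
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_filters + 1)
    return [mel_to_hz(lo + step * (i + 1)) for i in range(n_filters)]


# Assumed settings for illustration only: 10 filters covering 0 to 8000 Hz.
centers = mel_center_frequencies(n_filters=10, f_min=0.0, f_max=8000.0)
```

Because the spacing is uniform in mel, the centers are dense at low frequencies and sparse at high frequencies, which is the property that makes MFCC a compact description of speech spectra.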

Railroad Surface Defect Segmentation Using a Modified Fully Convolutional Network

  • Kim, Hyeonho;Lee, Suchul;Han, Seokmin
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 14, No. 12 / pp. 4763-4775 / 2020
  • This research aims to develop a deep learning-based method that automatically detects and segments defects on railroad surfaces in order to reduce the cost of visual railroad inspection. We developed our segmentation model by modifying a fully convolutional network model [1], a well-known segmentation model, to detect and segment railroad surface defects. The data used in this research are images of the railroad surface with one or more defect regions. Because the images are tall and relatively narrow, they were cropped to a suitable size. They were also normalized using the mean and variance of the image data. Using these images, the proposed model was trained to segment the defect regions. The proposed method showed promising results in defect segmentation. We consider that it can facilitate decision-making about railroad maintenance and could potentially be applied to other analyses.
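The preprocessing described in the abstract (cropping tall images and normalizing by the dataset mean and variance) might look roughly like the following sketch. The tile height, the toy image, and the function names are assumptions made for illustration; the paper's actual crop size and pipeline are not given here.

```python
from statistics import fmean, pstdev


def crop_tiles(image, tile_h):
    """Split a tall image (a list of pixel rows) into consecutive tiles of tile_h rows.

    Rows left over at the bottom that do not fill a whole tile are dropped.
    """
    return [image[r:r + tile_h] for r in range(0, len(image) - tile_h + 1, tile_h)]


def normalize(tiles):
    """Standardize every pixel using the mean and standard deviation of all tiles."""
    pixels = [p for tile in tiles for row in tile for p in row]
    mean = fmean(pixels)
    std = pstdev(pixels) or 1.0  # guard against constant images
    return [[[(p - mean) / std for p in row] for row in tile] for tile in tiles]


# Toy 6x2 "image" standing in for a tall, narrow railroad photo (assumed data).
image = [[r * 2 + c for c in range(2)] for r in range(6)]
tiles = crop_tiles(image, tile_h=2)
normalized = normalize(tiles)
```

Computing the statistics over the whole dataset, rather than per tile, keeps brightness differences between tiles visible to the model.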

Transforming Text into Video: A Proposed Methodology for Video Production Using the VQGAN-CLIP Image Generative AI Model

  • SukChang Lee
    • International Journal of Advanced Culture Technology / Vol. 11, No. 3 / pp. 225-230 / 2023
  • With the development of AI technology, discussion of text-to-image generative AI is growing. We present a generative AI video production method and outline a methodology for producing personalized AI-generated videos, with the aim of broadening the video domain. We also examine the procedural steps involved in AI-driven video production and directly implement a video creation approach using the VQGAN-CLIP model. The outputs of the VQGAN-CLIP model had relatively moderate resolution and frame rate and were predominantly abstract images. These characteristics suggest potential applicability to OTT-based video content or the visual arts. AI-driven video production techniques are expected to see wider use in the future.

Deep Reinforcement Learning in ROS-based autonomous robot navigation

  • Roland, Cubahiro;Choi, Donggyu;Jang, Jongwook
    • 한국정보통신학회:학술대회논문집 / 한국정보통신학회 2022 Spring Conference / pp. 47-49 / 2022
  • Robot navigation has improved markedly since the rediscovery of the potential of Artificial Intelligence (AI) and the attention it has garnered in research circles. A notable achievement in the area was the application of Deep Learning (DL) to computer vision, with outstanding everyday applications such as face recognition and object detection. However, robotics in general still depends on human input in certain areas such as localization and navigation. In this paper, we present a case study of robot navigation based on deep reinforcement learning (DRL). We look into the benefits of switching from traditional ROS-based navigation algorithms to machine learning approaches. We describe the state of the art by introducing the concepts of Reinforcement Learning (RL), DL, and DRL before focusing on visual navigation based on DRL. The case study is a prelude to real-world deployment in which a mobile navigation agent learns to navigate unknown areas.

A Study on the Automatic Door Speed Control Design by the Identification of Auxiliary Pedestrian Using Artificial Intelligence (AI)

  • 김유민;최규민;신준표;성승민;이병권
    • 한국컴퓨터정보학회:학술대회논문집 / 한국컴퓨터정보학회 Proceedings of the 63rd Winter Conference, 2021, Vol. 29, No. 1 / pp. 237-239 / 2021
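  • This paper proposes a method for adjusting the speed of an automatic door after recognizing assistive walking devices with the YOLO system. Using Visual Studio, OpenCV, and CUDA, a neural network was trained to recognize assistive walking devices. Based on the trained data, a Raspberry Pi and a camera module monitor the entrance in real time, recognize assistive walking devices, and adjust the speed of the automatic door accordingly. This allows people with mobility impairments to enter and leave buildings smoothly.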
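The control step that follows detection can be sketched as a simple lookup from detected classes to a door speed. Everything named here is hypothetical: the class labels, speed factors, confidence threshold, and the `Detection` type are assumptions for illustration, and the detector itself (YOLO in the paper) is represented only by its output.

```python
from dataclasses import dataclass

# Hypothetical class labels and speed factors; the abstract does not list them.
SLOW_CLASSES = {"wheelchair": 0.4, "walker": 0.5, "crutches": 0.6}
DEFAULT_SPEED = 1.0  # normal door speed as a relative factor


@dataclass
class Detection:
    label: str         # class name reported by the detector
    confidence: float  # detector score in [0, 1]


def door_speed(detections, min_confidence=0.5):
    """Return the slowest speed factor required by any confident detection."""
    speeds = [
        SLOW_CLASSES[d.label]
        for d in detections
        if d.confidence >= min_confidence and d.label in SLOW_CLASSES
    ]
    return min(speeds, default=DEFAULT_SPEED)


frame = [Detection("wheelchair", 0.91), Detection("crutches", 0.80)]
speed = door_speed(frame)
```

Taking the minimum means the door adapts to the person who needs the most time when several people are detected in the same frame.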

Fraudulent Smart Contract Detection Using CNN Models

  • 박다은;박용범
    • 반도체디스플레이기술학회지 / Vol. 22, No. 3 / pp. 73-77 / 2023
  • As the DeFi market continues to expand, fraudulent activities using smart contracts have also increased. HoneyPot and Ponzi schemes are well-known frauds that exploit smart contracts. While several studies have demonstrated the potential to detect smart contracts implementing these scams, there has been a lack of research focusing on simultaneously detecting both types of fraud. This paper addresses this gap by harnessing artificial intelligence to conduct experiments for the detection of both HoneyPot and Ponzi schemes. The study employs the CNN (Convolutional Neural Network) model, commonly used for malware detection. To effectively utilize CNN, the bytecode of smart contracts is transformed into visual representations. The experimental results showcase a recall rate of 0.89 and an F1 score of 0.85, indicating promising detection capabilities.
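One plausible way to turn contract bytecode into a visual representation is to treat each byte as a grayscale pixel and wrap the byte stream into fixed-width rows. The sketch below does that under assumed settings; the image width, zero padding, and function name are illustrative, and the paper's actual encoding may differ.

```python
def bytecode_to_image(hex_bytecode: str, width: int = 8) -> list[list[int]]:
    """Map hex-encoded bytecode to rows of grayscale pixel values (0-255).

    The final row is zero-padded so that every row has exactly `width` pixels.
    """
    hex_str = hex_bytecode[2:] if hex_bytecode.startswith("0x") else hex_bytecode
    data = bytes.fromhex(hex_str)
    padded = data + bytes(-len(data) % width)  # pad up to a multiple of width
    return [list(padded[i:i + width]) for i in range(0, len(padded), width)]


# Toy input: a short bytecode fragment used only as sample data.
image = bytecode_to_image("0x6080604052348015", width=4)
```

The resulting 2D array can be fed to a CNN in the same way as any single-channel image.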

Reliable Fault Diagnosis Method Based on An Optimized Deep Belief Network for Gearbox

  • Oybek Eraliev;Ozodbek Xakimov;Chul-Hee Lee
    • 드라이브 ㆍ 컨트롤 / Vol. 20, No. 4 / pp. 54-63 / 2023
  • High and intermittent loading cycles induce fatigue damage to transmission components, resulting in premature gearbox failure. To identify gearbox defects, numerous vibration-based diagnostics techniques, using several artificial intelligence (AI) algorithms, have recently been presented. In this paper, an optimized deep belief network (DBN) model for gearbox problem diagnosis was designed based on time-frequency visual pattern identification. To optimize the hyperparameters of the model, a particle swarm optimization (PSO) approach was integrated into the DBN. The proposed model was tested on two gearbox datasets: a wind turbine gearbox and an experimental gearbox. The optimized DBN model demonstrated strong and robust performance in classification accuracy. In addition, the accuracy of the generated datasets was compared using traditional ML and DL algorithms. Furthermore, the proposed model was evaluated on different partitions of the dataset. The results showed that, even with a small amount of sample data, the optimized DBN model achieved high accuracy in diagnosis.
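To show how particle swarm optimization can search a hyperparameter space, here is a minimal PSO sketch in plain Python. The objective is a toy stand-in for a validation error, not a DBN; the inertia and acceleration constants, the bounds, and all names are assumptions for illustration rather than the authors' settings.

```python
import random


def pso(objective, bounds, n_particles=20, n_iters=60, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `objective` over box `bounds` with a basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                      # best position seen by each particle
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]     # best position seen by the swarm
    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move the particle and clamp it to the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def toy_validation_error(params):
    """Hypothetical error surface with its minimum at lr = 1e-3 and 64 hidden units."""
    lr_exp, hidden = params
    return (lr_exp + 3.0) ** 2 + ((hidden - 64.0) / 32.0) ** 2


best, err = pso(toy_validation_error, bounds=[(-5.0, -1.0), (16.0, 256.0)])
```

In a real setting each call to the objective would train and validate a model, so the particle count and iteration budget would be chosen with that cost in mind.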

Task Planning Algorithm with Graph-based State Representation

  • 변성완;오윤선
    • 로봇학회논문지 / Vol. 19, No. 2 / pp. 196-202 / 2024
  • The ability to understand a given environment and plan a sequence of actions leading to a goal state is crucial for personal service robots. With recent advances in deep learning, numerous studies have proposed methods for state representation in planning. However, previous works lose explicit information about relationships between objects when the state observation is converted into a single visual embedding containing all state information. In this paper, we introduce a graph-based state representation that incorporates both object and relationship features. To leverage this representation in the task planning problem, we propose a Graph Neural Network (GNN)-based subgoal prediction model. This model can extract rich information about objects and their interconnected relationships from a given state graph. Moreover, a search-based algorithm is integrated with the pre-trained subgoal prediction model and a state transition module to explore diverse states and find a proper sequence of subgoals. The proposed method is trained on a synthetic task dataset collected in a simulation environment and demonstrates a higher success rate with fewer additional searches than baseline methods.
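A graph-based state can be sketched as a set of (subject, relation, object) triples, with planning done by searching over states. The example below uses plain breadth-first search with a hand-written transition function as a stand-in; it does not implement the paper's GNN subgoal predictor, and the objects, places, and function names are hypothetical.

```python
from collections import deque


def move(state, obj, src, dst):
    """Return the state after moving obj from src to dst, or None if obj is not on src."""
    if (obj, "on", src) not in state:
        return None
    return frozenset(state - {(obj, "on", src)} | {(obj, "on", dst)})


def successors(state, objects, places):
    """Yield (action description, next state) for every applicable move."""
    for obj in objects:
        for src in places:
            for dst in places:
                if src != dst:
                    nxt = move(state, obj, src, dst)
                    if nxt is not None:
                        yield f"move {obj}: {src} -> {dst}", nxt


def plan(start, goal, objects, places):
    """Breadth-first search for a shortest action sequence reaching all goal relations."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, actions = queue.popleft()
        if goal <= state:  # every goal triple holds in this state
            return actions
        for action, nxt in successors(state, objects, places):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, actions + [action]))
    return None


objects, places = ["cup", "plate"], ["table", "shelf", "sink"]
start = frozenset({("cup", "on", "table"), ("plate", "on", "table")})
goal = frozenset({("cup", "on", "shelf"), ("plate", "on", "sink")})
steps = plan(start, goal, objects, places)
```

Keeping relations explicit as triples is what lets the goal be stated as a subset test, which a single flat embedding would not support directly.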

A Usefulness of KEDI-Individual Basic Learning Skills Test as a Diagnostic Tool of Learning Disorders

  • 김지혜;이명주;홍성도;김승태
    • Journal of the Korean Academy of Child and Adolescent Psychiatry / Vol. 8, No. 1 / pp. 101-112 / 1997
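  • This study examines the usefulness of the Basic Learning Skills Test, an achievement test, in diagnosing learning disorders. The learning disorder group comprised 48 children of two types: 34 with verbal learning disorder (VLD) and 14 with nonverbal learning disorder (NVLD). Their performance on an intelligence test and the Basic Learning Skills Test was compared with that of two comparison groups: 11 children with dysthymia and 20 normal children. On the intelligence test, the VLD group showed significant deficits in vocabulary, verbally mediated learning tasks, and verbal-auditory attention tasks, whereas the NVLD group showed inefficiency across performance functions in general, including visual-perceptual accuracy, speed of psychomotor coordination, and visual-spatial organization. On the Basic Learning Skills Test, the VLD group showed significant deficits in phonological coding tasks, arithmetic ability, and word recognition tasks. In addition, a discriminant analysis that included the subtests of the Basic Learning Skills Test alongside the intelligence test subtests not only yielded a higher classification rate than an analysis without them but also produced a discriminant function that significantly discriminated the VLD group. A factor analysis was conducted to examine the properties of the subtests, which were then categorized accordingly. Finally, the limitations of this study and of the Basic Learning Skills Test are discussed.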

A study on User Experience of Artificial Intelligence speaker

  • 조규은;김승인
    • 한국융합학회논문지 / Vol. 9, No. 8 / pp. 127-133 / 2018
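  • This study analyzes technology trends in artificial intelligence (AI) speakers, which are being actively developed as a core technology of the Fourth Industrial Revolution, and aims to propose a direction for their development through a case analysis of AI speakers released in Korea and abroad. As the research method, the technical background of AI speakers was first reviewed through a literature study, and cases of domestic and overseas AI speakers were then surveyed. The results show attempts to extend into visual interfaces in order to overcome the inherent limitations of voice. Among these attempts, AI speakers with built-in screens deserve attention. AI speakers should go beyond simply providing convenience functions and become a platform for human-computer interaction. The implications presented in this study are expected to serve as reference material for predicting the future direction of AI speaker services in Korea.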