• Title/Summary/Keyword: AI Image Recognition

Real time instruction classification system

  • Sang-Hoon Lee;Dong-Jin Kwon
    • International Journal of Internet, Broadcasting and Communication
    • /
    • v.16 no.3
    • /
    • pp.212-220
    • /
    • 2024
  • With the recent advancement of society, AI technology has made significant strides, especially in computer vision and voice recognition. This study introduces a system that uses these technologies to recognize users through a camera and relay voice commands within a vehicle. The system uses the YOLO (You Only Look Once) machine learning algorithm, widely used for object recognition, to identify specific users. For voice command recognition, a machine learning model based on spectrogram voice analysis identifies specific commands. This design aims to enhance security and convenience by preventing anyone other than registered users from accessing vehicles and IoT devices. The system converts camera input into YOLO inputs to determine whether a person is present. It also collects voice data through a microphone embedded in the device or computer and converts it into time-domain spectrogram data, which serves as input to the voice recognition model. The camera image data and voice data are passed through pre-trained models for inference, and the inference results enable the recognition of simple commands within a limited space. This study demonstrates the feasibility of building a device management system for a confined space that enhances security and user convenience through a simple real-time system model. Finally, our work aims to provide practical solutions in application fields such as smart homes and autonomous vehicles.
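The audio-to-spectrogram step described in this abstract can be illustrated with a short-time Fourier transform. This is a minimal sketch, not the authors' pipeline: the frame length, hop size, Hann window, sample rate, and the `spectrogram` function name are all assumptions chosen for illustration.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Magnitude spectrogram via a short-time Fourier transform (Hann window).

    frame_len and hop are illustrative defaults, not values from the paper.
    """
    signal = np.asarray(signal, dtype=float)
    if signal.size < frame_len:
        raise ValueError("signal is shorter than one frame")
    window = np.hanning(frame_len)
    n_frames = 1 + (signal.size - frame_len) // hop
    # Slice the signal into overlapping frames and apply the window to each.
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # rfft keeps the non-negative frequencies: frame_len // 2 + 1 bins per frame.
    return np.abs(np.fft.rfft(frames, axis=1)).T  # shape: (freq_bins, time_frames)

# Hypothetical input: one second of a 1 kHz tone sampled at 16 kHz.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)

spec = spectrogram(tone)
peak_bin = int(spec.mean(axis=1).argmax())
peak_hz = peak_bin * sample_rate / 256  # bin index converted back to Hz
```

The resulting 2-D array (frequency by time) is the kind of image-like input a command classifier could consume; a real system would typically add log scaling or a mel filter bank, which this sketch omits.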

Spontaneous Speech Emotion Recognition Based On Spectrogram With Convolutional Neural Network (CNN 기반 스펙트로그램을 이용한 자유발화 음성감정인식)

  • Guiyoung Son;Soonil Kwon
    • The Transactions of the Korea Information Processing Society
    • /
    • v.13 no.6
    • /
    • pp.284-290
    • /
    • 2024
  • Speech emotion recognition (SER) is a technique that analyzes a speaker's voice patterns, including vibration, intensity, and tone, to determine their emotional state. Interest in artificial intelligence (AI) techniques has grown, and they are now widely used in medicine, education, industry, and the military. However, most existing studies have obtained their impressive results using acted speech recorded by skilled actors in controlled environments. There is a mismatch between acted and spontaneous speech, since acted speech contains more explicit emotional expression than spontaneous speech. For this reason, spontaneous speech emotion recognition remains a challenging task. This paper aims to perform emotion recognition on spontaneous speech data and improve its performance. To this end, we implement deep learning-based speech emotion recognition using a VGG (Visual Geometry Group) network after converting 1-dimensional audio signals into 2-dimensional spectrogram images. The experimental evaluations are performed on the Korean spontaneous emotional speech database from AI-Hub, which covers 7 emotions: joy, love, anger, fear, sadness, surprise, and neutral. Using a 2-dimensional time-frequency spectrogram, we achieved average accuracies of 83.5% for adults and 73.0% for young people. In conclusion, our findings demonstrate that the proposed framework outperforms current state-of-the-art techniques for spontaneous speech and shows promising performance despite the difficulty of quantifying emotional expression in spontaneous speech.

Neural Networks for Pattern Recognition (Pattern 인식을 위한 Neural Network)

  • Kim, Myeong-Won;Lee, Gwang-Lo
    • ETRI Journal
    • /
    • v.11 no.1
    • /
    • pp.41-58
    • /
    • 1989
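  • Neural network research starts from the idea of applying insights drawn from the brain to engineering. It aims to establish an information-processing paradigm, the basis for information-processing devices that work through mechanisms resembling the structure of the brain, and to study the techniques needed to apply that paradigm concretely to individual information-processing problems. The computational characteristics of neural networks, namely parallel processing, learning, and efficient handling of noisy information, make them particularly well suited to pattern recognition problems. This paper reviews the history of neural networks and existing models, and surveys application areas of neural networks with new computational structures and methods. It thereby describes how they can be applied efficiently to problems that are difficult to solve with conventional AI techniques, such as pattern recognition (images, characters, speech), robot vision, and control, and discusses the future outlook for neural networks.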

Implementation of AI-based Object Recognition Model for Improving Driving Safety of Electric Mobility Aids (객체 인식 모델과 지면 투영기법을 활용한 영상 내 다중 객체의 위치 보정 알고리즘 구현)

  • Dong-Seok Park;Sun-Gi Hong;Jun-Mo Park
    • Journal of the Institute of Convergence Signal Processing
    • /
    • v.24 no.2
    • /
    • pp.119-125
    • /
    • 2023
  • In this study, we photograph obstacles such as crosswalks, side spheres, manholes, braille blocks, partial ramps, temporary safety barriers, stairs, and inclined curbs that hinder or inconvenience vulnerable people who travel with electric mobility aids. We develop an optimal AI model that classifies and automatically recognizes the photographed objects, and implement an algorithm that efficiently identifies obstacles in front of an electric mobility aid. To allow the objects to be learned and detected with high probability, the dataset was labeled with polygon annotations. The model was developed using Mask R-CNN in the Detectron2 framework, which can detect objects labeled as polygons. Images were acquired in two groups, the general public and the transportation-vulnerable, and image information was collected in two areas of the test bed. Among the Mask R-CNN training parameter settings, the model trained with IMAGES_PER_BATCH: 2, BASE_LEARNING_RATE: 0.001, and MAX_ITERATION: 10,000 showed the highest performance, 68.532, allowing the user to recognize driving risks and obstacles quickly and accurately.
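The three training settings named in this abstract can be written as Detectron2 solver overrides. The sketch below assumes that the abstract's names correspond to Detectron2's `SOLVER.IMS_PER_BATCH`, `SOLVER.BASE_LR`, and `SOLVER.MAX_ITER` keys; that mapping is an inference, not something the abstract states. It builds the flat key/value list without importing Detectron2, so it runs on its own.

```python
# Settings exactly as reported in the abstract.
ABSTRACT_SETTINGS = {
    "IMAGES_PER_BATCH": 2,
    "BASE_LEARNING_RATE": 0.001,
    "MAX_ITERATION": 10000,
}

# Assumed mapping from the abstract's names to Detectron2 config keys.
DETECTRON2_KEYS = {
    "IMAGES_PER_BATCH": "SOLVER.IMS_PER_BATCH",
    "BASE_LEARNING_RATE": "SOLVER.BASE_LR",
    "MAX_ITERATION": "SOLVER.MAX_ITER",
}

def to_override_list(settings):
    """Flatten settings into the alternating [key, value, ...] override list."""
    opts = []
    for name, value in settings.items():
        opts.extend([DETECTRON2_KEYS[name], value])
    return opts

opts = to_override_list(ABSTRACT_SETTINGS)
```

With Detectron2 installed, a list in this form could be applied to a config object with `cfg.merge_from_list(opts)` after loading a Mask R-CNN base config; which base config the authors used is not given in the abstract.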

Hybrid-Domain High-Frequency Attention Network for Arbitrary Magnification Super-Resolution (임의배율 초해상도를 위한 하이브리드 도메인 고주파 집중 네트워크)

  • Yun, Jun-Seok;Lee, Sung-Jin;Yoo, Seok Bong;Han, Seunghwoi
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.25 no.11
    • /
    • pp.1477-1485
    • /
    • 2021
  • Recently, super-resolution has been studied intensively, but mostly for upscaling models with integer magnification. However, demand for arbitrary magnification is emerging in representative real-world applications of super-resolution, such as object recognition and display image quality improvement. In this paper, we propose a model that supports arbitrary magnification by using the weights of an existing integer magnification model. This model converts super-resolution results into the DCT spectral domain to expand the space for arbitrary magnification. To reduce the loss of high-frequency image information caused by expansion in the DCT spectral domain, we propose a high-frequency attention network for arbitrary magnification so that the model can properly restore high-frequency spectral information. To recover high-frequency information properly, the proposed network uses channel attention layers. These layers can learn correlations between RGB channels, and residual structures allow the model to be made deeper.
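One common way to resize an image by a non-integer factor in the DCT domain is to zero-pad its DCT coefficients and invert at the larger size. The sketch below shows that idea only; it is an assumption about what "expanding the space" in the DCT spectral domain could look like, not the paper's network, and `dct_matrix` and `dct_resize` are hypothetical helper names.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix C: C @ x is the DCT of x, and C.T inverts it."""
    k = np.arange(n)[:, None]  # frequency index (rows)
    i = np.arange(n)[None, :]  # sample index (columns)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)  # DC row has its own normalization
    return c

def dct_resize(image, out_h, out_w):
    """Upscale a 2-D array to (out_h, out_w) by zero-padding its DCT spectrum."""
    in_h, in_w = image.shape
    if out_h < in_h or out_w < in_w:
        raise ValueError("this sketch only upscales")
    coeffs = dct_matrix(in_h) @ image @ dct_matrix(in_w).T
    padded = np.zeros((out_h, out_w))
    padded[:in_h, :in_w] = coeffs  # new high-frequency coefficients stay zero
    # Rescale so that the mean intensity is preserved at the new size.
    padded *= np.sqrt((out_h * out_w) / (in_h * in_w))
    return dct_matrix(out_h).T @ padded @ dct_matrix(out_w)

# Hypothetical 4x4 input upscaled by 1.5x vertically and 2.5x horizontally.
low_res = np.arange(16.0).reshape(4, 4)
up = dct_resize(low_res, 6, 10)
```

Because the padded coefficients are zero, the output contains no new high-frequency detail, which is the gap the abstract's high-frequency attention network is meant to address.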

Implementation of Low-cost Autonomous Car for Lane Recognition and Keeping based on Deep Neural Network model

  • Song, Mi-Hwa
    • International Journal of Internet, Broadcasting and Communication
    • /
    • v.13 no.1
    • /
    • pp.210-218
    • /
    • 2021
  • A CNN (Convolutional Neural Network) is a type of deep neural network most commonly used for visual image analysis. Accordingly, an AI autonomous driving model was constructed through real-time image processing, with a crosswalk image of a road used as an obstacle. In this paper, we propose a low-cost model that can actually implement autonomous driving based on a CNN. The best-known deep neural network technique for autonomous driving is investigated, and an end-to-end model is applied. In particular, we show that training and self-driving on a simulated road are possible through a practical approach to lane detection and keeping.

Research on the development of automated tools to de-identify personal information of data for AI learning - Based on video data - (인공지능 학습용 데이터의 개인정보 비식별화 자동화 도구 개발 연구 - 영상데이터기반 -)

  • Hyunju Lee;Seungyeob Lee;Byunghoon Jeon
    • Journal of Platform Technology
    • /
    • v.11 no.3
    • /
    • pp.56-67
    • /
    • 2023
  • Recently, provisions on the de-identification of personal information, a long-standing wish of the data-based industry, were revised and specified in August 2020. This laid the foundation for putting data, described as the crude oil[2] of the fourth industrial era, to active use in industry. However, some people are concerned about infringement of the basic rights of data subjects[3]. Accordingly, a development study was conducted on the Batch De-Identification Tool, an automation tool for de-identifying personal information. In this study, we first developed an image labeling tool to label human faces (eyes, nose, mouth) and car license plates at various resolutions in order to build training data. Second, an object recognition model was trained so that the object recognition module can perform de-identification of personal information. The automated de-identification tool developed in this research shows the possibility of proactively eliminating privacy violations through online services. These results suggest that data-based industries can maximize the value of data while balancing privacy and utilization.

Implementation of Driver Management Supervision System through AI Image Recognition Module (AI 영상 인식 모듈을 통한 운전자 관리 감독 시스템 구현)

  • Hyun Jun Suh;Min Ji Kim;Jae Hyun Shim;Seung Don Lee;JeongEun Nah
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2023.11a
    • /
    • pp.813-814
    • /
    • 2023
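  • Every year, traffic accidents caused by drowsy and inattentive driving continue to cause casualties and property damage. To address the shortcomings and limitations of existing systems that focus only on immediate drowsiness detection, this paper implements a driver management and supervision system that, after detecting risky behavior, stores the photo and data from that moment and converts them into a score, with the goal of improving driving habits over the long term. By encouraging safe driving among people who drive for long hours, such as truck drivers, and helping to establish a sound driving culture, the system can play a positive role in traffic safety.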

Automated Inspection System for Micro-pattern Defection Using Artificial Intelligence (인공지능(AI)을 활용한 미세패턴 불량도 자동화 검사 시스템)

  • Lee, Kwan-Soo;Kim, Jae-U;Cho, Su-Chan;Shin, Bo-Sung
    • Journal of the Korean Society of Industry Convergence
    • /
    • v.24 no.6_2
    • /
    • pp.729-735
    • /
    • 2021
  • Recently, artificial intelligence (AI) has been developed and used in various fields. In particular, AI recognition technology can perceive and distinguish images, so it can play a significant role in quality inspection. For autonomous driving technology to be stable, the semiconductors inside automobiles must be protected from external electromagnetic (EM) waves. A thin polymeric film with hole-shaped micro-patterns created by laser processing can serve as a shield. The shielding efficiency of the film can be increased by a hole structure with appropriate pitch and size. However, because micro-machining is sensitive to some parameters, the holes cannot all have the same shape, and defective patterns may even be produced during processing. Inspecting every pattern with only an optical microscope is extremely time-consuming. In this paper, we introduce an AI inspection system based on a web-based AI tool. We evaluate the usefulness of the AI model by calculating the area under the ROC (Receiver Operating Characteristic) curve. The AI system classifies the micro-patterns as normal or abnormal, displays the result as text on real-time images, and saves the images as files accordingly. Furthermore, when the run button is pressed, a robot arm driven by two Arduino motors moves the film on the optical microscope stage for raster scanning. The AI system can therefore inspect all the micro-patterns of a film automatically. If the system could collect much more labeled data, we believe the AI inspection would become more precise and accurate. The system could also be applied to image-based inspection of other products.
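The area under the ROC curve used in this abstract equals the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one. The sketch below computes it by direct pairwise comparison; the labels and scores are made-up illustration data, not results from the paper, and treating "abnormal" as the positive class is an assumption.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count as half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    # Count positive/negative pairs in which the positive sample ranks higher.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical data: 1 = abnormal pattern, 0 = normal pattern.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
auc = roc_auc(labels, scores)
```

In this example 8 of the 9 positive/negative pairs are ordered correctly, so the AUC is 8/9. The pairwise form is quadratic in the number of samples; for large datasets a rank-based or library implementation is the usual choice.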

Grade Analysis and Two-Stage Evaluation of Beef Carcass Image Using Deep Learning (딥러닝을 이용한 소도체 영상의 등급 분석 및 단계별 평가)

  • Kim, Kyung-Nam;Kim, Seon-Jong
    • The Journal of the Convergence on Culture Technology
    • /
    • v.8 no.2
    • /
    • pp.385-391
    • /
    • 2022
  • Quality evaluation of beef carcasses is an important issue in the livestock industry. Recently, AI-based monitoring systems have been helping quality managers make accurate decisions based on the analysis of beef carcass images or result information. The dataset used by such a system is an important factor in its performance. Existing datasets may differ in surface orientation or resolution. In this paper, we propose a two-stage classification model that can efficiently manage the grading of beef carcass images using deep learning. To overcome the problem of varying image conditions, a new dataset of 1,300 images was constructed. The recognition rate of a deep network for 5-grade classification on the new dataset was 72.5%. Two-stage evaluation increases reliability by taking advantage of the large difference between grades 1++ and 1+ on the one hand and grades 1, 2, and 3 on the other. In two experiments using the proposed two-stage model, recognition rates of 73.7% and 77.2% were obtained. The proposed method will therefore be efficient if we have a dataset that yields a 100% recognition rate in the first stage.
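The two-stage evaluation in this abstract can be pictured as a coarse classifier that picks a grade group, followed by a fine classifier that picks the grade within that group. The sketch below assumes the grouping is {1++, 1+} versus {1, 2, 3}, which is one reading of the abstract. The function names, the single "marbling score" feature, and the thresholds are all hypothetical stand-ins for trained deep networks.

```python
from typing import Callable, Dict, Sequence

Features = Sequence[float]

def two_stage_grade(
    features: Features,
    coarse: Callable[[Features], str],
    fine: Dict[str, Callable[[Features], str]],
) -> str:
    """Stage 1 picks a grade group; stage 2 picks the grade inside that group."""
    group = coarse(features)
    if group not in fine:
        raise KeyError(f"no second-stage classifier for group {group!r}")
    return fine[group](features)

# Hypothetical stand-ins for trained models, keyed on a made-up marbling score.
def coarse_stub(f: Features) -> str:
    return "high" if f[0] >= 0.6 else "low"

fine_stubs = {
    "high": lambda f: "1++" if f[0] >= 0.8 else "1+",
    "low": lambda f: "1" if f[0] >= 0.4 else ("2" if f[0] >= 0.2 else "3"),
}

grades = [two_stage_grade([score], coarse_stub, fine_stubs)
          for score in (0.85, 0.65, 0.5, 0.3, 0.1)]
```

A first-stage mistake cannot be corrected in the second stage, which is why the abstract's closing remark ties the method's efficiency to the first-stage recognition rate.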