• Title/Summary/Keyword: Image Recognition Technologies

Detection Algorithm of Road Surface Damage Using Adversarial Learning (적대적 학습을 이용한 도로 노면 파손 탐지 알고리즘)

  • Shim, Seungbo
    • The Journal of The Korea Institute of Intelligent Transport Systems
    • /
    • v.20 no.4
    • /
    • pp.95-105
    • /
    • 2021
  • Road surface damage detection is essential for a comfortable driving environment and the prevention of safety accidents, and road management institutes therefore use automated, technology-based inspection equipment and systems. Among these automation technologies, sensors that detect road surface damage play an important role, and several studies on deep-learning-based sensing have been conducted in recent years. Road images and label images are needed to develop such deep learning algorithms; securing label images, however, requires considerable time and labor. In this paper, an adversarial learning method, one of the semi-supervised learning techniques, is proposed to solve this problem. For its implementation, a lightweight deep neural network model was trained using 5,327 road images and 1,327 label images. After experiments on 400 road images, a model with a mean intersection over union of 80.54% and an F1 score of 77.85% was obtained. In this way, a technology was developed that improves recognition performance by adding only unlabeled road images to training, and it is expected to be used for road surface management in the future.
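
To make the adversarial semi-supervised idea concrete, below is a minimal PyTorch-style sketch (assumed names, not the author's code): a fully convolutional discriminator judges whether a class-probability map looks like a one-hot ground-truth map, labeled images contribute a cross-entropy loss, and unlabeled road images contribute only through the adversarial term. `PatchDiscriminator`, `seg_net`, and `lam_adv` are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PatchDiscriminator(nn.Module):
    """Judges, patch by patch, whether a probability map looks like a label map."""
    def __init__(self, num_classes):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(num_classes, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(128, 1, 4, stride=2, padding=1),   # one "real/fake" logit per patch
        )

    def forward(self, prob_map):
        return self.net(prob_map)

def train_step(seg_net, disc, opt_seg, opt_disc,
               img_l, label_l, img_u, num_classes, lam_adv=0.01):
    """One step on a labeled batch (img_l, label_l) and an unlabeled batch img_u.
    seg_net is any segmentation network returning per-class logits (B, C, H, W);
    label_l is a (B, H, W) tensor of class indices (dtype long)."""
    bce = nn.BCEWithLogitsLoss()

    # --- update the segmentation network ---
    opt_seg.zero_grad()
    logits_l = seg_net(img_l)
    loss_ce = F.cross_entropy(logits_l, label_l)          # supervised loss
    prob_u = F.softmax(seg_net(img_u), dim=1)             # unlabeled prediction
    adv_logit = disc(prob_u)
    # adversarial loss: unlabeled predictions should look "real" to the discriminator
    loss_adv = bce(adv_logit, torch.ones_like(adv_logit))
    (loss_ce + lam_adv * loss_adv).backward()
    opt_seg.step()

    # --- update the discriminator ---
    opt_disc.zero_grad()
    onehot = F.one_hot(label_l, num_classes).permute(0, 3, 1, 2).float()
    real_logit = disc(onehot)                               # ground-truth maps are "real"
    fake_logit = disc(F.softmax(logits_l.detach(), dim=1))  # predictions are "fake"
    loss_d = (bce(real_logit, torch.ones_like(real_logit)) +
              bce(fake_logit, torch.zeros_like(fake_logit)))
    loss_d.backward()
    opt_disc.step()
    return loss_ce.item(), loss_d.item()
```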

A Study on Interactive Talking Companion Doll Robot System Using Big Data for the Elderly Living Alone (빅데이터를 이용한 독거노인 돌봄 AI 대화형 말동무 아가야(AGAYA) 로봇 시스템에 관한 연구)

  • Song, Moon-Sun
    • The Journal of the Korea Contents Association
    • /
    • v.22 no.5
    • /
    • pp.305-318
    • /
    • 2022
  • This study focuses on the care effectiveness of interactive AI robots and develops an AI companion robot called 'Agaya' (AGAYA) to contribute to more human-centered, personalized care for the elderly living alone. First, by applying P-TTS technology, intimacy can be maximized by letting users choose the voice of the person they most want to hear. Second, a memory-storage and memory-recall function allows users to find healing in their own way. Third, the robot is given the five sensory roles of eyes, nose, mouth, ears, and hands in order to provide better personalized services. Fourth, the study attempted to develop features such as warm-temperature maintenance, aroma, sterilization and fine-dust removal, and a convenient charging method. These capabilities are expected to expand the effective use of interactive robots by elderly people and to contribute to building a positive image of the elderly, who can plan their remaining old age productively and independently.

Study on the Performance Evaluation of Encoding and Decoding Schemes in Vector Symbolic Architectures (벡터 심볼릭 구조의 부호화 및 복호화 성능 평가에 관한 연구)

  • Youngseok Lee
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology
    • /
    • v.17 no.4
    • /
    • pp.229-235
    • /
    • 2024
  • Recent years have seen active research on methods for efficiently processing and interpreting large volumes of data in the fields of artificial intelligence and machine learning. One of these data processing technologies, Vector Symbolic Architecture (VSA), offers an innovative approach to representing complex symbols and data using high-dimensional vectors. VSA has garnered particular attention in applications such as natural language processing, image recognition, and robotics. This study quantitatively evaluates the characteristics and performance of VSA methodologies by applying five of them to the MNIST dataset and measuring key performance indicators, such as encoding speed, decoding speed, memory usage, and recovery accuracy, across different vector lengths. BSC and VT demonstrated relatively fast encoding and decoding speeds, while MAP and HRR were relatively slow. In terms of memory usage, BSC was the most efficient, whereas MAP used the most memory. Recovery accuracy was highest for MAP and lowest for BSC. The results of this study provide a basis for selecting an appropriate VSA methodology depending on the application area.
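
As an illustration of what encoding and decoding mean in a VSA, here is a minimal sketch of one flavor, the Binary Spatter Code (BSC): binding is element-wise XOR, bundling is a bit-wise majority vote, and decoding "cleans up" a noisy vector against an item memory by Hamming distance. The dimensionality and the toy record below are illustrative assumptions, not the paper's experimental setup.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 10_000                                    # hypervector dimensionality

def rand_hv():                                # random binary hypervector
    return rng.integers(0, 2, D, dtype=np.uint8)

def bind(a, b):                               # XOR binding (its own inverse)
    return a ^ b

def bundle(vectors):                          # bit-wise majority vote
    return (np.sum(vectors, axis=0) * 2 > len(vectors)).astype(np.uint8)

def cleanup(v, item_memory):                  # nearest stored item by Hamming distance
    names = list(item_memory)
    dists = [np.count_nonzero(v ^ item_memory[n]) for n in names]
    return names[int(np.argmin(dists))]

# Encode the record {shape: circle, color: red, size: small} as one hypervector.
items = {k: rand_hv() for k in
         ["shape", "circle", "color", "red", "size", "small"]}
record = bundle([bind(items["shape"], items["circle"]),
                 bind(items["color"], items["red"]),
                 bind(items["size"], items["small"])])

# Decode: unbinding with the "shape" key yields a noisy "circle",
# which the cleanup memory recovers.
print(cleanup(bind(record, items["shape"]), items))   # -> circle
```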

Artificial Vision Project by Micro-Bio Technologies

  • Kim Sung June;Jung Hum;Yu Young Suk;Yu Hyeong Gon;Cho Dong il;Lee Byeong Ho;Ku Yong Sook;Kim Eun Mi;Seo Jong Mo;Kim Hyo kyum;Kim Eui tae;Paik Seung June;Yoon Il Young
    • The Korean Society of Visualization: Conference Proceedings (한국가시화정보학회 학술대회논문집)
    • /
    • 2002.04a
    • /
    • pp.51-78
    • /
    • 2002
  • A number of research groups worldwide are studying electronic implants that can be mounted on the retina, optic nerve, or visual cortex to restore vision in patients suffering from retinal degeneration. The implants consist of a neural interface made of biocompatible materials, one or more integrated circuits for stimulus generation, a camera, an image processor, and a telemetric channel. The realization of this class of neural prosthetic devices is largely due to the explosive development of micro- and nano-electronics technologies in the late 20th century and, more recently, of biotechnologies. Animal experiments have shown promise, and human experiments in progress indicate that recognition of images can be obtained and improved over time. We at NBS-ERC of SNU started our own retinal implant project in 2000. We selected polyimide as the biomaterial for an epi-retinal stimulator. In-vitro and in-vivo biocompatibility studies have been performed on the electrode arrays, showing good affinity to retinal pigment epithelial cells and no harmful effects. The implant also showed very good stability and safety in rabbit eyes for 12 weeks. We have also demonstrated that, through proper stimulation of the inner retina, meaningful vision can be obtained.

Comparative Analysis of CNN Deep Learning Model Performance Based on Quantification Application for High-Speed Marine Object Classification (고속 해상 객체 분류를 위한 양자화 적용 기반 CNN 딥러닝 모델 성능 비교 분석)

  • Lee, Seong-Ju;Lee, Hyo-Chan;Song, Hyun-Hak;Jeon, Ho-Seok;Im, Tae-ho
    • Journal of Internet Computing and Services
    • /
    • v.22 no.2
    • /
    • pp.59-68
    • /
    • 2021
  • As artificial intelligence (AI) technologies, which have grown rapidly in recent years, began to be applied to marine environments such as ships, research on CNN-based models specialized for digital video has become active. In the E-Navigation service, which combines various technologies to detect floating objects that pose a collision risk, reduce human error, and prevent fires inside ships, real-time processing is of great importance. Adding more functions, however, requires high-performance processors, which raises prices and places a cost burden on shipowners. This study therefore proposes a method that processes information at a high rate while maintaining accuracy by applying quantization techniques to a deep learning model. First, videos were pre-processed to suit the detection of floating objects at sea, ensuring efficient delivery of the video data to the deep learning model. Second, quantization, one of the lightweight techniques for deep learning models, was applied to reduce memory usage and increase processing speed. Finally, the proposed model, with video pre-processing and quantization applied, was deployed on various embedded boards to measure its accuracy and processing speed and test its performance. The proposed method reduced memory usage by a factor of about four and improved processing speed roughly four to five times while maintaining the original recognition accuracy.
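
The memory argument behind quantization can be seen in a small, self-contained sketch: mapping a float32 weight tensor to int8 with a scale and zero-point stores it in one quarter of the memory at the cost of a small rounding error. This is an illustrative example, not the authors' deployment pipeline or embedded-board code.

```python
import numpy as np

def quantize_int8(w):
    """Map a float32 tensor to int8 plus an affine scale and zero-point."""
    w_min, w_max = float(w.min()), float(w.max())
    scale = (w_max - w_min) / 255.0 if w_max > w_min else 1.0
    zero_point = np.round(-w_min / scale).astype(np.int32) - 128
    q = np.clip(np.round(w / scale) + zero_point, -128, 127).astype(np.int8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    return (q.astype(np.float32) - zero_point) * scale

w = np.random.randn(256, 256).astype(np.float32)     # a toy weight tensor
q, scale, zp = quantize_int8(w)
w_hat = dequantize(q, scale, zp)

print("memory ratio:", w.nbytes / q.nbytes)           # -> 4.0
print("max abs error:", np.abs(w - w_hat).max())      # small quantization noise
```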

Digital Library Interface Research Based on EEG, Eye-Tracking, and Artificial Intelligence Technologies: Focusing on the Utilization of Implicit Relevance Feedback (뇌파, 시선추적 및 인공지능 기술에 기반한 디지털 도서관 인터페이스 연구: 암묵적 적합성 피드백 활용을 중심으로)

  • Hyun-Hee Kim;Yong-Ho Kim
    • Journal of the Korean Society for information Management
    • /
    • v.41 no.1
    • /
    • pp.261-282
    • /
    • 2024
  • This study proposed and evaluated electroencephalography (EEG)-based and eye-tracking-based methods to determine relevance by utilizing users' implicit relevance feedback while navigating content in a digital library. For this, EEG/eye-tracking experiments were conducted on 32 participants using video, image, and text data. To assess the usefulness of the proposed methods, deep learning-based artificial intelligence (AI) techniques were used as a competitive benchmark. The evaluation results showed that EEG component-based methods (av_P600 and f_P3b components) demonstrated high classification accuracy in selecting relevant videos and images (faces/emotions). In contrast, AI-based methods, specifically object recognition and natural language processing, showed high classification accuracy for selecting images (objects) and texts (newspaper articles). Finally, guidelines for implementing a digital library interface based on EEG, eye-tracking, and artificial intelligence technologies have been proposed. Specifically, a system model based on implicit relevance feedback has been presented. Moreover, to enhance classification accuracy, methods suitable for each media type have been suggested, including EEG-based, eye-tracking-based, and AI-based approaches.

A Study on Formative Elements in 3D Animation Character -Focusing on Characters' Visual Recognition Elements of Form through Elements of Form and Formation Method of Form- (3D 애니메이션 캐릭터의 조형성 연구 -<겨울왕국> 캐릭터를 중심으로 조형의 구성요소와 원리를 통한 시각인지요소에 관한 연구-)

  • Kim, Hye Sung;Sung, Re-A
    • Cartoon and Animation Studies
    • /
    • s.36
    • /
    • pp.45-74
    • /
    • 2014
  • We can anticipate that animations will form one of the main axes of popular culture in the coming visual age. Research on animation has recently been active, but it mainly focuses on its value for the culture industry or on production technologies and methods. Of course, research dealing with animation characters has also appeared steadily. This study focuses on the 'formative elements' of 3D animation characters and attempts to differentiate itself from other research by deriving new logic theoretically. Moving beyond character research that has relied merely on theoretical grounds, this study intends to figure out how the audience, the consumers who actually watch and respond to animations, recognizes characters, and to identify related problems and possible solutions. In particular, this study examines the formative characteristics of 3D animation characters through the characters in <Frozen>, one of the animations that have achieved artistic value as well as commercial success. To that end, the study conducted not only a literature review but also various surveys and the Delphi method. The researcher also devised an analysis frame to evaluate the formative elements through in-depth discussion with experts. With this frame, covering the Elements of Form, Formation Methods of Form, and Visual Recognition Elements of Form, the study examined how the audience recognizes 3D characters. The process of recognizing an image is influenced differently by socio-cultural environment, sex, age, and level of knowledge. This was meant to investigate the current visual culture and the public's perspective through the characters in <Frozen> that represent this visual mode.

Memory-Free Skin-Detection Algorithm and Implementation of Hardware Design for Small-Sized Display Device (소형 DISPLAY 장치를 위한 비 메모리 피부 검출 알고리즘 및 HARDWARE 구현)

  • Im, Jeong-Uk;Song, Jin-Gun;Ha, Joo-Young;Kang, Bong-Soon
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.11 no.8
    • /
    • pp.1456-1464
    • /
    • 2007
  • Research on skin-tone detection has been conducted continuously as face and skin recognition grow in importance for security, surveillance, information management, and password control systems in airports, harbors, and general companies. With the rapid spread of image communication and electronic transactions over wide-area communication networks, the importance of accurate skin color detection has also been increasing. In this paper, the boundaries of skin color are established using the Cb and Cr information of the YCbCr color model, derived from hundreds of compiled portrait images of each race, and an efficient yet simple skin-detection structure is proposed that decides, with an adaptively set skin range, whether each pixel lies within the skin boundaries. Because the method operates as a one-dimensional process that uses no memory, it can be applied to relatively small hardware and systems such as mobile devices. By adding a selective mode, it not only improves skin detection but also shows results comparable to previous face recognition technologies that use complicated algorithms.
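
A minimal sketch of the per-pixel, memory-free idea is shown below: each pixel is converted to YCbCr and tested independently against fixed Cb/Cr bounds. The threshold values used here are common illustrative ranges from the literature, not the exact boundaries derived in the paper, and the OpenCV-based conversion is an assumption.

```python
import numpy as np
import cv2  # OpenCV, assumed available for the color-space conversion

CB_RANGE = (77, 127)    # illustrative Cb bounds for skin tones
CR_RANGE = (133, 173)   # illustrative Cr bounds for skin tones

def skin_mask(bgr_image):
    """Return a binary mask; each pixel is tested independently (memory-free)."""
    ycrcb = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2YCrCb)  # OpenCV order: Y, Cr, Cb
    cr, cb = ycrcb[..., 1], ycrcb[..., 2]
    return ((CB_RANGE[0] <= cb) & (cb <= CB_RANGE[1]) &
            (CR_RANGE[0] <= cr) & (cr <= CR_RANGE[1]))

# Example: mask = skin_mask(cv2.imread("portrait.jpg"))
```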

The way to make training data for deep learning model to recognize keywords in product catalog image at E-commerce (온라인 쇼핑몰에서 상품 설명 이미지 내의 키워드 인식을 위한 딥러닝 훈련 데이터 자동 생성 방안)

  • Kim, Kitae;Oh, Wonseok;Lim, Geunwon;Cha, Eunwoo;Shin, Minyoung;Kim, Jongwoo
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.1
    • /
    • pp.1-23
    • /
    • 2018
  • Since the beginning of the 21st century, various high-quality services have emerged with the growth of the internet and information and communication technologies. In particular, the E-commerce industry, in which Amazon and eBay stand out, is growing explosively. As E-commerce grows, customers can easily find and compare products because more products are registered at online shopping malls. However, a problem has arisen with this growth: with so many products registered, it has become difficult for customers to find what they really need in the flood of products. When customers search with a generalized keyword, too many products come up; on the contrary, few products are found when customers type in product details, because concrete product attributes are rarely registered as text. In this situation, automatically recognizing text in images can be a solution. Because the bulk of product details is written in catalogs in image format, most product information cannot be found by the current text-based search system. If the information in these images can be converted to text, customers can search for products by their details, which makes shopping more convenient. Various existing OCR (Optical Character Recognition) programs can recognize text in images, but they are hard to apply to catalogs because they fail in certain circumstances, for example when the text is not large enough or the fonts are inconsistent. Therefore, this research suggests a way to recognize keywords in catalogs with deep learning, the state of the art in image recognition since the 2010s. The Single Shot Multibox Detector (SSD), a well-regarded model for object detection, can be used with its structure re-designed to account for the differences between text and general objects. However, because deep learning models are trained by supervised learning, the SSD model needs a large amount of labeled training data. Collecting such data manually, by labelling the location and class of every keyword in catalogs, causes many problems: keywords can be missed through human error, the work is too time-consuming given the scale of data needed, it becomes costly if many workers are hired to shorten the time, and finding images that contain specific keywords to be trained is itself difficult. To solve this data issue, this research developed a program that creates training data automatically. The program composes images containing various keywords and pictures, similar to catalogs, and saves the location information of each keyword at the same time. With this program, data can be collected efficiently and the performance of the SSD model improves: the model recorded an 81.99% recognition rate with 20,000 images created by the program. Moreover, this research tested the efficiency of the SSD model under different data conditions to analyze which features of the data influence the performance of recognizing text in images. As a result, it was found that the number of labeled keywords, the addition of overlapping keyword labels, the presence of unlabeled keywords, the spacing among keywords, and the differences in background images are all related to the performance of the SSD model. This test can lead to performance improvements of the SSD model and other deep-learning-based text recognizers through higher-quality data. The SSD model re-designed to recognize text in images and the program developed for creating training data are expected to contribute to improving search systems in E-commerce: suppliers can spend less time registering keywords for products, and customers can search for products using the details written in the catalog.
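
A minimal sketch of the automatic training-data generation idea is shown below: keywords are drawn at random positions onto a background image with Pillow, and the bounding box of each keyword is saved as the label. The keyword list, font file, and output format are hypothetical, not the settings used in the paper.

```python
import json
import random
from PIL import Image, ImageDraw, ImageFont

KEYWORDS = ["cotton", "waterproof", "free size"]       # hypothetical keyword set
FONT = ImageFont.truetype("NanumGothic.ttf", 32)       # any available TrueType font

def make_sample(background_path, out_image, out_label, n_words=3):
    """Render keywords onto a catalog-like background and save their boxes."""
    img = Image.open(background_path).convert("RGB")
    draw = ImageDraw.Draw(img)
    labels = []
    for _ in range(n_words):
        word = random.choice(KEYWORDS)
        x = random.randint(0, max(1, img.width - 200))   # crude margin for the text
        y = random.randint(0, max(1, img.height - 40))
        draw.text((x, y), word, fill=(0, 0, 0), font=FONT)
        left, top, right, bottom = draw.textbbox((x, y), word, font=FONT)
        labels.append({"keyword": word, "bbox": [left, top, right, bottom]})
    img.save(out_image)
    with open(out_label, "w", encoding="utf-8") as f:
        json.dump(labels, f, ensure_ascii=False)

# Example: make_sample("catalog_bg.jpg", "train_0001.jpg", "train_0001.json")
```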

Invariant Classification and Detection for Cloth Searching (의류 검색용 회전 및 스케일 불변 이미지 분류 및 검색 기술)

  • Hwang, Inseong;Cho, Beobkeun;Jeon, Seungwoo;Choe, Yunsik
    • Journal of Broadcast Engineering
    • /
    • v.19 no.3
    • /
    • pp.396-404
    • /
    • 2014
  • Clothing search, which is very difficult due to the unstructured nature of clothing, has long sought to reduce recognition error and computational complexity. However, there are few concrete examples covering the whole process of learning and recognition for clothing, and the related technologies still show many limitations. In this paper, the whole process, including identifying both the person and the clothing in an image and analyzing both its color and texture pattern, is presented concretely for classification. In particular, a deformable search descriptor, LBPROT_35, is proposed for identifying clothing patterns. The proposed method is scale- and rotation-invariant, so a high detection rate is obtained even when the scale and angle of the image change. In addition, a color classifier with color-space quantization is proposed so as not to lose color similarity. In simulation, a database was built by training on a total of 810 clothing images collected from the internet, and some of them were used for testing. As a result, the proposed method shows good performance, with a 94.4% matching rate, while the former Dense-SIFT method achieves 63.9%.
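
To illustrate the rotation-invariance idea behind a descriptor like LBPROT_35, below is a minimal sketch of a rotation-invariant local binary pattern (LBP), the family of texture descriptor it builds on: each 8-bit neighbourhood code is rotated to its minimal circular shift, so the histogram of codes does not change when the image is rotated. The actual LBPROT_35 construction in the paper, including its scale handling, is more elaborate; this only shows the basic mechanism.

```python
import numpy as np

def rotation_invariant_lbp(gray):
    """Return rotation-invariant 8-neighbour LBP codes for the interior pixels."""
    g = gray.astype(np.int32)
    c = g[1:-1, 1:-1]                                    # centre pixels
    # eight neighbours in circular order around the centre
    neigh = [g[:-2, :-2], g[:-2, 1:-1], g[:-2, 2:], g[1:-1, 2:],
             g[2:, 2:],   g[2:, 1:-1],  g[2:, :-2], g[1:-1, :-2]]
    code = np.zeros_like(c)
    for i, n in enumerate(neigh):
        code |= (n >= c).astype(np.int32) << i           # one bit per neighbour

    # rotate each 8-bit code to its minimal circular shift
    min_code = code.copy()
    for shift in range(1, 8):
        rotated = ((code >> shift) | (code << (8 - shift))) & 0xFF
        min_code = np.minimum(min_code, rotated)
    return min_code.astype(np.uint8)

# The texture descriptor is the histogram of these codes:
# hist = np.bincount(rotation_invariant_lbp(img).ravel(), minlength=256)
```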