• Title/Abstract/Keywords: Deep Learning AI

Search results: 622 items; processing time: 0.031 s

Sign Language Translation Wearable Device Using Motion Recognition

  • 이준영;강현수;김성준;손준호;유동준;박양우
    • Korea Society of Computer and Information: Conference Proceedings / Proceedings of the 68th Summer Conference (2023), Vol. 31, No. 2 / pp. 453-454 / 2023
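  • At present, people who are congenitally deaf or who have speech impairments face considerable difficulty conversing with others. Besides having trouble using shops, they must put up with inconvenience even in simple everyday exchanges because of their limited ability to convey language. Currently, sign language has to be translated at designated locations using a separate device with a built-in display. To address this problem, this study developed a system that applies deep learning to recognize and translate sign language and outputs the resulting text on a display. The system was implemented by applying the AI framework MediaPipe and the SVM algorithm on a Raspberry Pi. The developed system provides translation results for gestures. Because it enables translation wherever conversation is needed rather than only at designated locations, it is expected to reduce the communication difficulties faced by people who are deaf or have speech impairments.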

Improving Accuracy of Instance Segmentation of Teeth

  • Jongjin Park
    • International Journal of Internet, Broadcasting and Communication / Vol. 16, No. 1 / pp. 280-286 / 2024
  • In this paper, a layered UNet with warmup and dropout tricks is used to perform instance segmentation of teeth, using data labeled for each individual tooth, and to improve segmentation performance. The previously proposed layered UNet showed very good performance in tooth segmentation without distinguishing tooth numbers. To perform instance segmentation, we labeled teeth CBCT data according to the tooth numbering system of the FDI World Dental Federation notation. The colors assigned to the labeled teeth follow those of the AI-Hub teeth dataset. Simulation results show that the layered UNet also segments each tooth very well, distinguishing tooth numbers by color. The layered UNet model using the warmup trick performed best, with IoU values of 0.80 and 0.77 on the training and validation data, respectively. More labeled data will be needed to further improve instance segmentation performance. The results of this paper can be used to develop medical software that requires tooth recognition, for applications such as orthodontic treatment, wisdom tooth extraction, and implant surgery.
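
The IoU figures quoted in this abstract can be illustrated with a minimal sketch. It assumes that predictions and ground truth are label maps in which each pixel holds a tooth number, with 0 as background; the function name `iou_for_label` and the tiny example maps are hypothetical and are not taken from the paper.

```python
def iou_for_label(pred, target, label):
    """Intersection over union for one label in two label maps (nested lists)."""
    inter = 0
    union = 0
    for row_p, row_t in zip(pred, target):
        for p, t in zip(row_p, row_t):
            in_pred = (p == label)
            in_target = (t == label)
            inter += int(in_pred and in_target)
            union += int(in_pred or in_target)
    # Assumed convention: a label absent from both maps counts as a perfect match.
    return inter / union if union else 1.0


# Hypothetical 2x3 label maps; 11 and 21 stand in for tooth numbers.
pred = [[11, 11, 0],
        [0, 21, 21]]
target = [[11, 0, 0],
          [0, 21, 21]]

print(iou_for_label(pred, target, 11))  # 0.5
print(iou_for_label(pred, target, 21))  # 1.0
```

Averaging such per-label values over all teeth and images gives one common way to report a single IoU score; whether the paper averages this way is not stated in the abstract.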

Fuel Consumption Prediction and Life Cycle History Management System Using Historical Data of Agricultural Machinery

  • Jung Seung Lee;Soo Kyung Kim
    • Journal of Information Technology Applications and Management / Vol. 29, No. 5 / pp. 27-37 / 2022
  • This study aims to gather agricultural machinery history data by linking with related organizations, collecting it through IoT sensors, and receiving input from machine users and managers, and then to analyze the data with AI algorithms. The goal is to track and manage history data across all stages of agricultural machinery production, purchase, operation, and disposal. First, to build the deep learning algorithms, LSTM (Long Short-Term Memory) is used to estimate fuel consumption and recommend maintenance from historical data of agricultural machines such as tractors and combines, and C-LSTM (Convolution Long Short-Term Memory) is used to diagnose and determine failures. Second, to collect historical data automatically, IoT sensors including a GPS module, gyro sensor, acceleration sensor, and temperature and humidity sensor are attached to the machinery. Third, an interface is designed that automatically collects event-type data, such as production, purchase, and disposal, from related organizations and integrates the history data for the entire life cycle.

Trademark detection system using deep learning-based algorithms in a metaverse environment

  • 이지은;이형수;신용태
    • Korea Society of Computer and Information: Conference Proceedings / Proceedings of the 69th Winter Conference (2024), Vol. 32, No. 1 / pp. 1-4 / 2024
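  • Since COVID-19, the metaverse, in which the virtual and the real converge to enable social, economic, and cultural activities and value creation, has been emerging as a next-generation core industry. As a result, a growing number of companies are seeking to build metaverse platforms using their own technologies and IP (Intellectual Property), and new legal issues surrounding intellectual property rights are appearing. This paper therefore proposes a trademark detection system for metaverse environments that uses YOLOv5, a deep learning-based object detection model, to protect against trademark infringement.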

Protocol Classification Based on Traffic Flow and Deep Learning

  • 박예진;조영필
    • Korea Information Processing Society: Conference Proceedings / Korea Information Processing Society 2024 Spring Conference / pp. 836-838 / 2024
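  • This paper recognizes the rapidly growing potential for VPN abuse in modern society and emphasizes the importance of distinguishing VPN from Non-VPN traffic. To overcome the limitations of traditional port-based classification and packet analysis approaches, it proposes a new method that combines traffic flow features with artificial intelligence (AI) techniques to distinguish VPN and Non-VPN protocols. Traffic flow features are extracted from a packet dataset collected by the authors and combined with packet payloads to generate images. Feeding these images to a CNN model distinguishes the protocols with high accuracy. In the experiments, the proposed method achieved an accuracy of 99.71%, demonstrating that it can contribute to traffic classification and stronger network security.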
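
The step of combining flow features with payload bytes into an image can be sketched roughly as follows. The abstract does not describe the actual encoding, so everything here is an assumption for illustration: the function name `flow_to_image`, the feature scaling to [0, 1], the ordering (features first, then payload), and the zero-padding to a square grid.

```python
def flow_to_image(flow_features, payload, side=16):
    """Pack flow features and payload bytes into a side x side grayscale grid.

    flow_features: floats assumed to be pre-normalized to [0, 1] (hypothetical).
    payload: raw packet payload as a bytes object.
    """
    # Map each normalized feature to a pixel intensity in 0..255.
    pixels = [max(0, min(255, int(f * 255))) for f in flow_features]
    # Payload bytes are already in 0..255 and are appended after the features.
    pixels += list(payload)
    size = side * side
    # Truncate long inputs, zero-pad short ones (assumed layout).
    pixels = pixels[:size] + [0] * (size - len(pixels))
    return [pixels[r * side:(r + 1) * side] for r in range(side)]


# Tiny hypothetical example: two features and a two-byte payload in a 2x2 grid.
image = flow_to_image([0.0, 1.0], b"\x10\x20", side=2)
print(image)  # [[0, 255], [16, 32]]
```

A grid like this could then be passed to a CNN as a single-channel image; the image size and channel layout used in the paper are not given in the abstract.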

Multichannel Convolution Neural Network Classification for the Detection of Histological Pattern in Prostate Biopsy Images

  • Bhattacharjee, Subrata;Prakash, Deekshitha;Kim, Cho-Hee;Choi, Heung-Kook
    • Journal of Korea Multimedia Society / Vol. 23, No. 12 / pp. 1486-1495 / 2020
  • The analysis of digital microscopy images plays a vital role in computer-aided diagnosis (CAD) and prognosis. The main purpose of this paper is to develop a machine learning technique to predict the histological grades in prostate biopsy. To perform multiclass classification, an AI-based deep learning algorithm, a multichannel convolutional neural network (MCCNN), was developed by connecting layers of artificial neurons inspired by the human brain. The histological grades used for the analysis are benign, grade 3, grade 4, and grade 5. The proposed approach aims to classify multiple patterns of images extracted from the whole slide image (WSI) of a prostate biopsy based on the Gleason grading system. The MCCNN model takes three input channels (red, green, and blue), extracts computational features from each channel, and concatenates them for multiclass classification. Stain normalization was carried out for each histological grade to standardize the intensity and contrast level of the images. The proposed model was trained, validated, and tested with histopathological images and achieved average accuracies of 96.4%, 94.6%, and 95.1%, respectively.

Similarity Analysis Between SAR Target Images Based on Siamese Network

  • 박지훈
    • Journal of the Korea Institute of Military Science and Technology / Vol. 25, No. 5 / pp. 462-475 / 2022
  • Unlike in the field of electro-optical (EO) image analysis, there has been little interest in similarity metrics between synthetic aperture radar (SAR) target images. A reliable and objective similarity analysis for SAR target images is expected to enable verification of the SAR measurement process and to provide guidelines for target CAD modeling that can be used to simulate realistic SAR target images. For this purpose, this paper presents a similarity analysis method based on a Siamese network that quantifies subjective assessment through distance learning on similar and dissimilar SAR target image pairs. The proposed method is applied to MSTAR SAR target images with slightly different depression angles, and the resulting metrics are compared and analyzed against qualitative evaluation. Since image similarity is somewhat related to recognition performance, the capacity of the proposed method for target recognition is further checked experimentally using the confusion matrix.
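
Distance learning on similar and dissimilar pairs is often done with a contrastive loss, sketched below. The abstract does not say which loss the paper uses, so this is an assumed stand-in; the function names, the margin value, and the toy embeddings are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two embedding vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(emb_a, emb_b, similar, margin=1.0):
    """Contrastive loss for one pair of embeddings from the two Siamese branches.

    Similar pairs are pulled together (loss grows with distance); dissimilar
    pairs are pushed apart until they are at least `margin` away.
    """
    d = euclidean(emb_a, emb_b)
    if similar:
        return d ** 2
    return max(0.0, margin - d) ** 2


# Toy 2-D embeddings (hypothetical values).
print(contrastive_loss([0.0, 0.0], [3.0, 4.0], similar=True))   # 25.0
print(contrastive_loss([0.0, 0.0], [3.0, 4.0], similar=False))  # 0.0
```

After training, the distance between the two embeddings serves as the similarity score for an image pair.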

Korean Lip-Reading: Data Construction and Sentence-Level Lip-Reading

  • 조선영;윤수성
    • Journal of the Korea Institute of Military Science and Technology / Vol. 27, No. 2 / pp. 167-176 / 2024
  • Lip-reading is the task of inferring a speaker's utterance from silent video by learning lip movements. It is very challenging because of ambiguities inherent in lip movement, such as different characters that produce the same lip appearance. Recent advances in deep learning models such as the Transformer and the Temporal Convolutional Network have improved lip-reading performance. However, most previous work deals with English lip-reading, which cannot be applied directly to Korean, and there is no large-scale Korean lip-reading dataset. In this paper, we introduce the first large-scale Korean lip-reading dataset, with more than 120k utterances collected from TV broadcasts including news, documentaries, and dramas. We also present a preprocessing method that uniformly extracts a facial region of interest, and we propose a transformer-based model that uses grapheme units for sentence-level Korean lip-reading. We demonstrate through dataset statistics and experimental results that our dataset and model are appropriate for Korean lip-reading.

Artificial Intelligence Plant Doctor: Plant Disease Diagnosis Using GPT4-vision

  • Yoeguang Hue;Jea Hyeoung Kim;Gang Lee;Byungheon Choi;Hyun Sim;Jongbum Jeon;Mun-Il Ahn;Yong Kyu Han;Ki-Tae Kim
    • Research in Plant Disease / Vol. 30, No. 1 / pp. 99-102 / 2024
  • Integrated pest management is essential for controlling plant diseases that reduce crop yields. In the event of an outbreak, rapid diagnosis is crucial for identifying the cause and minimizing damage. Diagnosis methods range from indirect visual observation, which can be subjective and inaccurate, to machine learning and deep learning predictions that may suffer from biased data. Direct molecular-based methods, while accurate, are complex and time-consuming. However, large multimodal models such as GPT-4 combine image recognition with natural language processing to provide more accurate diagnostic information. This study introduces a GPT-4-based system for diagnosing plant diseases that draws on a detailed knowledge base with 1,420 host plants, 2,462 pathogens, and 37,467 pesticide instances from the official plant disease and pesticide registries of Korea. The AI plant doctor offers interactive advice on diagnosis, control methods, and pesticide use for diseases in Korea and is accessible at https://pdoc.scnu.ac.kr/.

Design and Implementation of Reinforcement Learning Agent Using PPO Algorithm for Match 3 Gameplay

  • 박대근;이완복
    • Journal of Convergence for Information Technology / Vol. 11, No. 3 / pp. 1-6 / 2021
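  • Automated play in match-3 puzzle games has mainly been implemented with the MCTS (Monte Carlo Tree Search) algorithm, but because MCTS searches slowly, the general trend is to combine MCTS with a DNN (Deep Neural Network) or to implement the AI with reinforcement learning. In this study, we designed and implemented a reinforcement learning agent that applies the PPO (Proximal Policy Optimization) algorithm, using the Unity 3D engine, which is widely used for match-3 game development, and the machine learning SDK provided by Unity's developer. Evaluating the agent confirmed a performance improvement of about 44%. The experiments showed that the agent learns the game rules and makes better strategic decisions as training proceeds, and that it plays the puzzle game better than ordinary people. Because the agent plays better than typical human players, adjusting the gap between machine and human play levels and applying it to game level design is expected to help speed up stage development in the future.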
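
The core of PPO is its clipped surrogate objective, sketched below for a single sample. This shows the general algorithm, not the paper's agent or the Unity SDK; the function name `ppo_clip_term` and the default `epsilon=0.2` are illustrative assumptions.

```python
def ppo_clip_term(ratio, advantage, epsilon=0.2):
    """Clipped surrogate objective for one sample.

    ratio: new-policy probability divided by old-policy probability of the action.
    advantage: estimated advantage of that action.
    Returns min(ratio * A, clip(ratio, 1 - eps, 1 + eps) * A), which PPO maximizes.
    """
    clipped_ratio = max(1.0 - epsilon, min(1.0 + epsilon, ratio))
    return min(ratio * advantage, clipped_ratio * advantage)


# With a positive advantage, gains from raising the ratio above 1 + eps are capped.
capped = ppo_clip_term(1.5, 2.0)      # uses the clipped ratio 1.2 instead of 1.5
# With a negative advantage, the more pessimistic (clipped) value is kept.
pessimistic = ppo_clip_term(0.5, -1.0)
# When the ratio is inside the clip range, the objective is unchanged.
unchanged = ppo_clip_term(1.0, 3.0)
```

The clipping limits how far a single update can move the policy away from the one that collected the data, which is the main reason PPO trains stably on game environments.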