• Title/Summary/Keyword: Image Learning

Usability of CPR Training System based on Extended Reality (확장현실 기반의 심폐소생술 교육 시스템의 사용성 평가)

  • Lee, Youngho;Kim, Sun Kyung;Choi, Jongmyung;Park, Gun Woo;Go, Younghye
    • Journal of Internet of Things and Convergence / v.8 no.6 / pp.115-122 / 2022
  • The importance of CPR training for laypersons has recently been emphasized as a way to improve the survival rate of out-of-hospital cardiac arrest patients. For such training to be effective, an accurate and realistic training strategy is required. In this study, we develop an extended reality (XR) based CPR training system and evaluate its usability. The system consists of three applications. The first transmits a 3D heart anatomy image registered to the manikin to the smart glasses to guide the chest compression point. The second provides visual and auditory information about the CPR process through the smart glasses while a smartwatch sends vibration notifications to guide the compression rate. The third, the 'Add-on-kit', is a device that detects the depth and speed of chest compressions via sensors installed on the manikin and sends immediate feedback to a smartphone. The one hundred laypersons who participated in this study agreed that the XR based CPR training system is realistic and effective. XR based registration technology will contribute to improving the efficiency of CPR training by enhancing realism, immersion, and self-directed learning.

Data augmentation in voice spoofing problem (데이터 증강기법을 이용한 음성 위조 공격 탐지모형의 성능 향상에 대한 연구)

  • Choi, Hyo-Jung;Kwak, Il-Youp
    • The Korean Journal of Applied Statistics / v.34 no.3 / pp.449-460 / 2021
  • ASVspoof 2017 deals with the detection of replay attacks and aims to distinguish real human voices from fake ones. A spoofed voice is one that reproduces the original voice through different types of microphones and speakers. Data augmentation for image data has been actively studied, and several studies have attempted data augmentation on voice. However, few have attempted to augment data for voice replay attacks, so this paper explores how audio modification through data augmentation techniques affects the detection of replay attacks. A total of seven data augmentation techniques were applied; among them, the dynamic value change (DVC) and pitch techniques helped improve performance. DVC and pitch improved the base model's EER by about 8%, and DVC in particular showed a noticeable improvement in accuracy in some of the 57 replay configurations. The greatest increase was achieved in RC53, where DVC led to an approximately 45% improvement over the base model's accuracy. High-end recording and playback devices, which were previously difficult to detect, were well identified. Based on this study, we found that the DVC and pitch data augmentation techniques help improve performance in the voice spoofing detection problem.
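
The abstract above reports its results as equal error rate (EER), the operating point at which the false acceptance rate and the false rejection rate are equal. The following is a minimal sketch of how EER can be computed from detector scores. It assumes that higher scores mean "more likely genuine"; the function name and the toy scores are illustrative and do not come from the paper.

```python
import numpy as np

def equal_error_rate(genuine_scores, spoof_scores):
    """Approximate the EER by sweeping a decision threshold over all scores.

    Assumption: a higher score means the trial is more likely genuine.
    """
    genuine = np.asarray(genuine_scores, dtype=float)
    spoof = np.asarray(spoof_scores, dtype=float)
    # Candidate thresholds: every observed score, plus one above the maximum.
    thresholds = np.sort(np.concatenate([genuine, spoof]))
    thresholds = np.append(thresholds, thresholds[-1] + 1.0)
    # False rejection rate: genuine trials scored below the threshold.
    frr = np.array([(genuine < t).mean() for t in thresholds])
    # False acceptance rate: spoofed trials scored at or above the threshold.
    far = np.array([(spoof >= t).mean() for t in thresholds])
    # Pick the threshold where the two error rates are closest.
    idx = int(np.argmin(np.abs(far - frr)))
    return float((far[idx] + frr[idx]) / 2.0)

# Toy scores (hypothetical): one genuine and one spoofed trial overlap.
eer = equal_error_rate([0.9, 0.8, 0.2], [0.1, 0.3, 0.7])
```

A lower EER is better, so an augmentation technique helps when the EER of the augmented model falls below that of the base model.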

Hyperparameter Optimization for Image Classification in Convolutional Neural Network (합성곱 신경망에서 이미지 분류를 위한 하이퍼파라미터 최적화)

  • Lee, Jae-Eun;Kim, Young-Bong;Kim, Jong-Nam
    • Journal of the Institute of Convergence Signal Processing / v.21 no.3 / pp.148-153 / 2020
  • To obtain high accuracy with a convolutional neural network (CNN), it is necessary to set optimal hyperparameters. However, the exact hyperparameter values that yield high performance are not known in advance, and the optimal values differ depending on the type of dataset, so they must be found through experiments. In addition, since the range of hyperparameter values is wide and the number of combinations is large, the experiments should be designed before searching for the optimal values in order to save time and computational cost. In this paper, we suggest an algorithm that uses the design of experiments and a grid search to determine the optimal hyperparameters for a classification problem. The algorithm uses a factorial design of experiments to determine the hyperparameter values that yield high performance. It is shown that computation time can be reduced and accuracy improved by performing a grid search after reducing the search range of each hyperparameter through the experimental design. Moreover, the experimental results show that the learning rate is the hyperparameter with the greatest effect on the performance of the model.
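
The two-stage idea in the abstract above (a factorial design to narrow the ranges, then a grid search inside them) can be sketched as follows. This is an illustration under stated assumptions, not the authors' algorithm: `toy_accuracy` is a hypothetical stand-in for training a CNN, deliberately built so that the learning rate dominates, and the rule of keeping the better half of each range is one simple choice among many.

```python
import itertools

def toy_accuracy(lr, batch_size, dropout):
    # Hypothetical response surface standing in for "train a CNN, return accuracy".
    return (0.95 - 50.0 * (lr - 0.02) ** 2
            - 1e-4 * abs(batch_size - 64)
            - 0.05 * (dropout - 0.2) ** 2)

# Low and high level of each factor for the two-level full factorial design.
levels = {"lr": (0.001, 0.1), "batch_size": (32, 128), "dropout": (0.1, 0.5)}

def main_effects(objective, levels):
    """Main effect of a factor: mean response at its high level minus at its low level."""
    names = list(levels)
    runs = []
    for combo in itertools.product((0, 1), repeat=len(names)):
        params = {n: levels[n][i] for n, i in zip(names, combo)}
        runs.append((combo, objective(**params)))
    effects = {}
    for k, name in enumerate(names):
        high = [y for combo, y in runs if combo[k] == 1]
        low = [y for combo, y in runs if combo[k] == 0]
        effects[name] = sum(high) / len(high) - sum(low) / len(low)
    return effects

def reduced_grid(levels, effects, points=5):
    """Keep the half of each range that the main effect favours (an assumption)."""
    grid = {}
    for name, (low, high) in levels.items():
        mid = (low + high) / 2.0
        lo, hi = (mid, high) if effects[name] > 0 else (low, mid)
        grid[name] = [lo + (hi - lo) * i / (points - 1) for i in range(points)]
    return grid

def grid_search(objective, grid):
    """Exhaustively evaluate every combination in the grid and keep the best."""
    names = list(grid)
    best_params, best_score = None, float("-inf")
    for values in itertools.product(*(grid[n] for n in names)):
        params = dict(zip(names, values))
        score = objective(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

effects = main_effects(toy_accuracy, levels)
most_influential = max(effects, key=lambda n: abs(effects[n]))
best_params, best_score = grid_search(toy_accuracy, reduced_grid(levels, effects))
```

With three factors, the factorial stage costs 8 evaluations and halves each range, so the subsequent grid of 5 points per factor is spent entirely inside the more promising region.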

A research on the possibility of restoring cultural assets of artificial intelligence through the application of artificial neural networks to roof tile(Wadang)

  • Kim, JunO;Lee, Byong-Kwon
    • Journal of the Korea Society of Computer and Information / v.26 no.1 / pp.19-26 / 2021
  • Cultural assets excavated in historical areas have their own characteristics reflecting the background of their times, and their patterns and characteristics change little by little with history and with the area over which they spread. Cultural properties excavated in some areas represent the culture of the time, and some maintain their intact appearance, but most are damaged, lost, or broken into parts, and many experts are mobilized to research their composition and repair the damaged parts. The purpose of this research is to learn the patterns and characteristics of the past with artificial neural networks for such restoration research, and to restore the lost parts of excavated cultural assets using a Generative Adversarial Network (GAN)[1]. In this process, the GAN restores the damaged or lost parts from the parts of the excavated cultural asset that remain. To recover the damaged parts, the network is trained on 2D images of complete cultural assets. The research focuses on how well the network not only recovers damaged parts but also reproduces colors and materials. Finally, by applying the trained neural network to real damaged cultural assets, we confirmed the extent of the recovered area and the limitations of the approach.

Object Detection Based on Hellinger Distance IoU and Objectron Application (Hellinger 거리 IoU와 Objectron 적용을 기반으로 하는 객체 감지)

  • Kim, Yong-Gil;Moon, Kyung-Il
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.22 no.2 / pp.63-70 / 2022
  • Although 2D object detection has improved greatly in recent years with the advance of deep learning methods and the use of large labeled image datasets, 3D object detection from 2D imagery remains a challenging problem in applications such as robotics, owing to the lack of data and the diversity of appearances and shapes of objects within a category. Google has just announced the launch of Objectron, which has a novel data pipeline using mobile augmented reality session data. However, it is also a 2D-driven 3D object detection technique. This study explores a more mature 2D object detection method and applies its 2D projection to the Objectron 3D lifting system. Most object detection methods use bounding boxes to encode and represent object shape and location. In this work, we explore a stochastic representation of object regions using Gaussian distributions. We also present a similarity measure for the Gaussian distributions based on the Hellinger distance, which can be viewed as a stochastic Intersection-over-Union. Our experimental results show that the proposed Gaussian representations are closer to the annotated segmentation masks in available datasets. Thus, the low-accuracy problem, one of several limitations of Objectron, can be alleviated.
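
To make the Hellinger-based similarity in the abstract above concrete, the sketch below uses the standard closed form of the Hellinger distance between two multivariate Gaussians. Two parts are assumptions for illustration only: the mapping from a box to a Gaussian (mean at the box center, variances of a uniform distribution over the box) and the use of one minus the distance as an IoU-like score. The paper may define both differently.

```python
import numpy as np

def box_to_gaussian(x_min, y_min, x_max, y_max):
    """Assumed mapping: box center as mean, uniform-distribution variances as covariance."""
    mean = np.array([(x_min + x_max) / 2.0, (y_min + y_max) / 2.0])
    w, h = x_max - x_min, y_max - y_min
    cov = np.diag([w ** 2 / 12.0, h ** 2 / 12.0])
    return mean, cov

def hellinger_distance(mean1, cov1, mean2, cov2):
    """Closed-form Hellinger distance between two Gaussians (value in [0, 1])."""
    avg_cov = (cov1 + cov2) / 2.0
    diff = mean1 - mean2
    # Bhattacharyya coefficient of the two Gaussians.
    coeff = (np.linalg.det(cov1) ** 0.25 * np.linalg.det(cov2) ** 0.25
             / np.sqrt(np.linalg.det(avg_cov)))
    coeff *= np.exp(-0.125 * (diff @ np.linalg.solve(avg_cov, diff)))
    # Clamp to guard against tiny negative values from floating-point rounding.
    return float(np.sqrt(max(0.0, 1.0 - coeff)))

def hellinger_similarity(box_a, box_b):
    """IoU-like score in [0, 1]: 1 for identical boxes, near 0 for distant ones."""
    return 1.0 - hellinger_distance(*box_to_gaussian(*box_a), *box_to_gaussian(*box_b))

# Two hypothetical boxes in (x_min, y_min, x_max, y_max) form that overlap by half.
score = hellinger_similarity((0.0, 0.0, 2.0, 2.0), (1.0, 0.0, 3.0, 2.0))
```

Unlike the ordinary box IoU, this score stays smooth and non-zero for boxes that do not overlap but lie close together, which is one reason a Gaussian representation can be attractive.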

Road Extraction from Images Using Semantic Segmentation Algorithm (영상 기반 Semantic Segmentation 알고리즘을 이용한 도로 추출)

  • Oh, Haeng Yeol;Jeon, Seung Bae;Kim, Geon;Jeong, Myeong-Hun
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.40 no.3 / pp.239-247 / 2022
  • Cities are becoming more complex due to rapid industrialization and population growth. In particular, urban areas are changing rapidly due to housing site development, reconstruction, and demolition. Accurate road information is therefore necessary for various purposes, such as high-definition maps for autonomous driving. In the Republic of Korea, accurate spatial information can be generated through the existing map production process. However, covering a large area is limited by time and cost. Roads, one of the map elements, are a hub and an essential means of transportation that provides many different resources for human civilization. Therefore, it is essential to update road information accurately and quickly. This study uses semantic segmentation algorithms such as LinkNet, D-LinkNet, and NL-LinkNet to extract roads from drone images and then applies hyperparameter optimization to the models with the highest performance. As a result, the LinkNet model using a pre-trained ResNet-34 as the encoder achieved 85.125 mIoU. Subsequent studies should focus on comparing these results with those of studies using state-of-the-art object detection algorithms or semi-supervised learning-based semantic segmentation techniques. The results of this study can be applied to speed up the existing map update process.
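
The mIoU figure quoted above is the mean, over classes, of the intersection-over-union between predicted and ground-truth masks. A minimal sketch of that computation follows. It assumes integer class labels per pixel and averages only over classes that actually occur; the function name and the tiny masks are illustrative, and the paper's exact evaluation protocol is not stated in the abstract.

```python
import numpy as np

def mean_iou(y_true, y_pred, num_classes):
    """Mean intersection-over-union across classes from per-pixel integer labels."""
    y_true = np.asarray(y_true, dtype=int).ravel()
    y_pred = np.asarray(y_pred, dtype=int).ravel()
    # Confusion matrix: rows are ground-truth classes, columns are predictions.
    cm = np.bincount(y_true * num_classes + y_pred,
                     minlength=num_classes ** 2).reshape(num_classes, num_classes)
    intersection = np.diag(cm)
    union = cm.sum(axis=0) + cm.sum(axis=1) - intersection
    # Skip classes absent from both the ground truth and the prediction.
    valid = union > 0
    return float((intersection[valid] / union[valid]).mean())

# Hypothetical 2x2 masks: 0 = background, 1 = road.
truth = [[0, 0], [1, 1]]
prediction = [[0, 1], [1, 1]]
miou = mean_iou(truth, prediction, num_classes=2)
```

In this toy case the background IoU is 1/2 and the road IoU is 2/3, so the mean is 7/12; multiplying by 100 gives the percentage form in which scores such as 85.125 are usually reported.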

A Study on Problems and Improvement Plans of Non-Face-to-Face Midi Classes (비대면 미디 수업의 문제점과 개선 방안 연구)

  • Baek, Sung-Hyun
    • Journal of Korea Entertainment Industry Association / v.15 no.4 / pp.267-277 / 2021
  • Due to COVID-19, both teachers and learners have had to participate in non-face-to-face classes. These classes brought about many problems, as adequate preparations could not be made for such an abrupt situation. This study attempted to understand and improve the problems that occur during non-face-to-face MIDI classes. The findings are as follows. First, the equipment available differed between in-person and non-face-to-face classes. This problem could be mitigated by using Reaper, a DAW that can be installed and freely used without functional limits regardless of operating system. Second, latency could not be reduced when the screen share function of Zoom was used, since it was impossible to select the audio interface's drivers in the DAW. On the teacher's side, this problem was mitigated by receiving the audio output again as input and sending it. In addition, learners who use Windows and have no audio interface usually suffer from latency during practice; this latency can be reduced by installing ASIO4ALL. Third, image degradation and screen disconnection occurred due to a lack of resources. Two computers were connected using a capture board, and the screen disconnection could be mitigated by distributing resources while maintaining high resolution. A system for non-face-to-face MIDI classes could be successfully established, as one more computer was connected using Vienna Ensemble Pro and more plug-ins were used by securing additional resources. Consequently, the problems of non-face-to-face MIDI classes could be understood and improved.

A Comparative Study on the Object Detection of Deposited Marine Debris (DMD) Using YOLOv5 and YOLOv7 Models (YOLOv5와 YOLOv7 모델을 이용한 해양침적쓰레기 객체탐지 비교평가)

  • Park, Ganghyun;Youn, Youjeong;Kang, Jonggu;Kim, Geunah;Choi, Soyeon;Jang, Seonwoong;Bak, Suho;Gong, Shinwoo;Kwak, Jiwoo;Lee, Yangwon
    • Korean Journal of Remote Sensing / v.38 no.6_2 / pp.1643-1652 / 2022
  • Deposited Marine Debris (DMD) can negatively affect marine ecosystems, fishery resources, and maritime safety, and is mainly detected by sonar sensors, lifting frames, and divers. Given the limitations of cost and time, recent efforts integrate underwater images with artificial intelligence (AI). We conducted a comparative study of the You Only Look Once Version 5 (YOLOv5) and You Only Look Once Version 7 (YOLOv7) models for detecting DMD in underwater images, with the aim of more accurate and efficient management of DMD. For DMD objects such as glass, metal, fish traps, tires, wood, and plastic, the two models achieved a Mean Average Precision (mAP@0.5) of over 0.85. A more objective evaluation and further improvement of the models are expected once an extensive image database is constructed.
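
The "@0.5" in mAP@0.5 refers to the overlap threshold used to decide whether a detection counts as correct: a predicted box matches a ground-truth box of the same class when their intersection-over-union is at least 0.5. The sketch below shows only that matching criterion, not the full average-precision calculation; the function names and boxes are hypothetical.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x_min, y_min, x_max, y_max)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Width and height of the overlap, clamped at zero when the boxes are disjoint.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def is_true_positive(pred_box, gt_box, threshold=0.5):
    """A detection is counted as correct when its IoU reaches the threshold."""
    return box_iou(pred_box, gt_box) >= threshold

# Hypothetical boxes: the first prediction overlaps well, the second does not.
ground_truth = (0.0, 0.0, 2.0, 2.0)
good_hit = is_true_positive((0.0, 0.0, 2.0, 1.5), ground_truth)
poor_hit = is_true_positive((1.0, 0.0, 3.0, 2.0), ground_truth)
```

Average precision then summarizes the precision-recall curve obtained by ranking detections by confidence, and mAP averages that value over the object classes.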

The Performance Improvement of U-Net Model for Landcover Semantic Segmentation through Data Augmentation (데이터 확장을 통한 토지피복분류 U-Net 모델의 성능 개선)

  • Baek, Won-Kyung;Lee, Moung-Jin;Jung, Hyung-Sup
    • Korean Journal of Remote Sensing / v.38 no.6_2 / pp.1663-1676 / 2022
  • Recently, a number of deep-learning-based land cover segmentation studies have been introduced. Some studies noted that the performance of land cover segmentation deteriorated due to insufficient training data. In this study, we verified the improvement of land cover segmentation performance through data augmentation. U-Net was used as the segmentation model, and a 2020 satellite-derived land cover dataset was used as the study data. The pixel accuracies were 0.905 and 0.923 for the U-Net trained on the original and augmented data, respectively. The mean F1 scores of those models were 0.720 and 0.775, respectively, indicating better performance with data augmentation. In addition, the F1 scores for the building, road, paddy field, upland field, forest, and unclassified area classes were 0.770, 0.568, 0.433, 0.455, 0.964, and 0.830 for the U-Net trained on the original data. Data augmentation proved effective in that the F1 scores of every class improved, to 0.838, 0.660, 0.791, 0.530, 0.969, and 0.860, respectively. Although we applied data augmentation without considering class balance, a comparison of the two models shows that data augmentation can mitigate the biased segmentation performance caused by data imbalance. This study is expected to help demonstrate the importance and effectiveness of data augmentation in various image processing fields.
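
The abstract above compares models by pixel accuracy and per-class F1 score. As a minimal sketch of how both metrics follow from a confusion matrix, the code below assumes integer class labels per pixel; the function name and the six toy labels are illustrative and are not data from the study.

```python
import numpy as np

def pixel_accuracy_and_f1(y_true, y_pred, num_classes):
    """Return overall pixel accuracy and an array of per-class F1 scores."""
    y_true = np.asarray(y_true, dtype=int).ravel()
    y_pred = np.asarray(y_pred, dtype=int).ravel()
    # Confusion matrix: rows are ground-truth classes, columns are predictions.
    cm = np.bincount(y_true * num_classes + y_pred,
                     minlength=num_classes ** 2).reshape(num_classes, num_classes)
    tp = np.diag(cm).astype(float)
    fp = cm.sum(axis=0) - tp   # predicted as the class but labeled otherwise
    fn = cm.sum(axis=1) - tp   # labeled as the class but predicted otherwise
    denom = 2.0 * tp + fp + fn
    # F1 = 2TP / (2TP + FP + FN); classes with no pixels at all get 0.
    f1 = np.divide(2.0 * tp, denom, out=np.zeros_like(tp), where=denom > 0)
    accuracy = tp.sum() / cm.sum()
    return float(accuracy), f1

# Hypothetical labels for six pixels and three classes.
accuracy, f1_scores = pixel_accuracy_and_f1([0, 0, 1, 1, 2, 2], [0, 1, 1, 1, 2, 0], 3)
mean_f1 = float(f1_scores.mean())
```

Because pixel accuracy is dominated by the largest classes, per-class F1 is the more informative of the two when classes are imbalanced, which is why the abstract reports both.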

Implementation of CoMirror System with Video Call and Messaging Function between Smart Mirrors (스마트 미러간 화상 통화와 메시징 기능을 가진 CoMirror 시스템 구현)

  • Hwang, Kitae;Kim, Kyung-Mi;Kim, Yu-Jin;Park, Chae-Won;Yoo, Song-Yeon;Jung, Inhwan;Lee, Jae-Moon
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.22 no.6 / pp.121-127 / 2022
  • A smart mirror is an IoT device that attaches a display and an embedded computer to a mirror and provides various information to the user along with the mirror function. This paper goes beyond treating smart mirrors only as stand-alone devices that provide information to users: it constructs a network in which smart mirrors are connected, and proposes and implements the CoMirror system, which allows users to talk and share information with other smart mirror users. The CoMirror system has a structure in which several CoMirror clients are connected to one CoMirror server. The CoMirror client consists of a Raspberry Pi, a mirror film, a touch pad, a display device, a web camera, and so on. The server provides functions such as face learning and recognition, user management, relaying messages between clients, and setting up video calls. Through the server, users can exchange text, image, and audio messages with other CoMirror users and make 1:1 video calls.