• Title/Summary/Keyword: Image Training

A Case Study on Digital Interactive Training Content <Tamagotchi> and <Peridot>

  • DongHee Choi;Jeanhun Chung
    • International journal of advanced smart convergence / v.12 no.4 / pp.306-313 / 2023
  • Having a pet is one of the ways people in modern society relieve stress and find peace of mind. Today the companion animal has moved beyond being a real 'living entity': once programmed into digital content, the animal's upbringing can be enjoyed in a virtual space. This paper studies detailed elements such as character design, interaction, and realism in 'Tamagotchi (1996)', which can be called the beginning of digital training content, and 'Peridot (2023)', a recently released augmented reality-based training title. What the two share is that both are training content running on portable electronic devices. However, while Tamagotchi's character lived on a simple black-and-white screen, Peridot's character operates in the real world projected onto the screen through augmented reality. Communication with the character in Tamagotchi was limited to pressing buttons, whereas in Peridot the user can pet the character by touching the smartphone screen. In addition, through object and step recognition, the sense of reality was confirmed to have become stronger, with toys thrown by the user on the screen bouncing off real objects. We hope this study will serve as a useful reference for the development of digital training content in the near future.

A Practical Digital Video Database based on Language and Image Analysis

  • Liang, Yiqing
    • Proceedings of the Korea Database Society Conference / 1997.10a / pp.24-48 / 1997
  • Supported by DARPA's Image Understanding (IU) program, under the "Video Retrieval Based on Language and Image Analysis" project, and by DARPA's Computer Assisted Education and Training Initiative (CAETI) program. Objective: develop practical systems for automatic understanding and indexing of video sequences using both the audio and video tracks. (omitted)

Comparison Study of the Performance of CNN Models with Multi-view Image Set on the Classification of Ship Hull Blocks (다시점 영상 집합을 활용한 선체 블록 분류를 위한 CNN 모델 성능 비교 연구)

  • Chon, Haemyung;Noh, Jackyou
    • Journal of the Society of Naval Architects of Korea / v.57 no.3 / pp.140-151 / 2020
  • It is important to identify the location of ship hull blocks, together with their exact block identification numbers, when scheduling the shipbuilding process. Wrong information on the location or identification number of a hull block lowers productivity, because time is spent finding where the block actually is. To solve this problem, a system is needed that tracks the location of the blocks and identifies their identification numbers automatically. There have been many studies on location-tracking systems for hull blocks on the stockyard, but no research on identifying the blocks themselves. This study compares the performance of five Convolutional Neural Network (CNN) models on the classification of hull blocks from multi-view image sets, in order to identify blocks on the stockyard. The CNN models are open algorithms from the ImageNet Large Scale Visual Recognition Challenge (ILSVRC). Four scaled hull block models were used to acquire the images of ship hull blocks. The CNN models were trained, and transfer-learned, with the original training data and with augmented versions of that data, and 20 tests and predictions covering the five CNN models and four training conditions were performed. To compare the classification performance of the CNN models, accuracy and average F1-score derived from the confusion matrix were adopted as performance measures. As a result of the comparison, the ResNet-152v2 model shows the highest accuracy and average F1-score on both the full-block and the cropped-block prediction image sets.
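
As an aside for readers implementing the comparison, the sketch below derives the two performance measures named in this abstract, accuracy and average F1-score, from a multi-class confusion matrix. It is a minimal reconstruction, not the authors' code; the 4-class example matrix (one row per scaled hull block model) is hypothetical.

```python
# Accuracy and average (macro) F1-score from a confusion matrix.
import numpy as np

def accuracy_and_average_f1(cm: np.ndarray):
    """cm[i, j] = number of samples of true class i predicted as class j."""
    tp = np.diag(cm).astype(float)
    accuracy = tp.sum() / cm.sum()
    precision = tp / np.maximum(cm.sum(axis=0), 1)  # column sums: predicted counts
    recall = tp / np.maximum(cm.sum(axis=1), 1)     # row sums: true counts
    f1 = 2 * precision * recall / np.maximum(precision + recall, 1e-12)
    return accuracy, f1.mean()

# Hypothetical 4-class confusion matrix (four scaled hull block models).
cm = np.array([[48, 1, 1, 0],
               [2, 45, 2, 1],
               [0, 3, 46, 1],
               [1, 0, 2, 47]])
acc, avg_f1 = accuracy_and_average_f1(cm)
print(f"accuracy={acc:.3f}, average F1={avg_f1:.3f}")
```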

Extracting the Point of Impact from Simulated Shooting Target based on Image Processing (영상처리 기반 모의 사격 표적지 탄착점 추출)

  • Lee, Tae-Guk;Lim, Chang-Gyoon;Kim, Kang-Chul;Kim, Young-Min
    • Journal of Internet Computing and Services / v.11 no.1 / pp.117-128 / 2010
  • Many studies have addressed simulated shooting training systems that replace real military and police shooting training. In this paper, we propose a method that extracts the point of impact from a simulated shooting target based on image processing, instead of using a sensor-based approach. The point of impact is extracted by analyzing the image captured by the camera on the muzzle of the gun, and the final shooting result is calculated by mapping the target to the coordinates of the point of impact. The recognition system is divided into recognizing the projection zone, extracting the point of impact within the projection zone, and calculating the shooting result from that point. We find the vertices of the projection zone after converting the captured image to a binary image, and then extract the point of impact inside it. We present the extraction process step by step and provide experiments that validate the results. The experiments show that the exact vertices of the projection area and the point of impact are found, and the converted final result is shown on the interface.
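
A minimal OpenCV-style sketch of the pipeline this abstract outlines: binarize the captured frame, approximate the projection zone's vertices from its largest contour, and take the centroid of the impact mark inside the zone. The file name, threshold, and centroid step are illustrative assumptions, not the authors' exact procedure.

```python
import cv2
import numpy as np

frame = cv2.imread("muzzle_camera_frame.png", cv2.IMREAD_GRAYSCALE)  # hypothetical frame

# 1. Convert the captured image to a binary image of the bright projection zone.
_, binary = cv2.threshold(frame, 200, 255, cv2.THRESH_BINARY)

# 2. Find the vertices of the projection zone: approximate its largest
#    contour with a quadrilateral.
contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
zone = max(contours, key=cv2.contourArea)
quad = cv2.approxPolyDP(zone, 0.02 * cv2.arcLength(zone, True), True)

# 3. Extract the point of impact: here, the centroid of the dark mark
#    inside the zone (a simplification of the paper's extraction step).
mask = np.zeros_like(binary)
cv2.drawContours(mask, [quad], -1, 255, -1)
impact = cv2.bitwise_and(cv2.bitwise_not(binary), mask)
m = cv2.moments(impact)
if m["m00"] > 0:
    x, y = m["m10"] / m["m00"], m["m01"] / m["m00"]
    print(f"point of impact at ({x:.1f}, {y:.1f}) in image coordinates")
```

Mapping the quadrilateral onto the target's own coordinate frame (for example with cv2.getPerspectiveTransform) would then give the final shooting result.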

The development of food image detection and recognition model of Korean food for mobile dietary management

  • Park, Seon-Joo;Palvanov, Akmaljon;Lee, Chang-Ho;Jeong, Nanoom;Cho, Young-Im;Lee, Hae-Jeung
    • Nutrition Research and Practice / v.13 no.6 / pp.521-528 / 2019
  • BACKGROUND/OBJECTIVES: The aim of this study was to develop a Korean food image detection and recognition model for use on mobile devices for accurate estimation of dietary intake. MATERIALS/METHODS: We collected food images by taking pictures or by searching web images and built an image dataset for training a complex recognition model for Korean food. Augmentation techniques were performed to increase the dataset size. The training dataset contained more than 92,000 images categorized into 23 groups of Korean food. All images were down-sampled to a fixed resolution of 150 × 150 and then randomly divided into training and testing groups at a ratio of 3:1, resulting in 69,000 training images and 23,000 test images. We used a Deep Convolutional Neural Network (DCNN) for the complex recognition model and compared the results with those of other large-scale image recognition networks: AlexNet, GoogLeNet, the Very Deep Convolutional Network (VGG), and ResNet. RESULTS: Our complex food recognition model, K-foodNet, had higher test accuracy (91.3%) and a faster recognition time (0.4 ms) than the other networks. CONCLUSION: The results showed that K-foodNet achieved better performance in detecting and recognizing Korean food than other state-of-the-art models.
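
A minimal sketch, under an assumed file layout, of the data preparation this abstract describes: down-sample every image to the fixed 150 × 150 resolution and split the set randomly at a 3:1 ratio. The directory name is hypothetical.

```python
import random
from pathlib import Path
from PIL import Image

random.seed(0)
paths = sorted(Path("korean_food_images").rglob("*.jpg"))  # hypothetical layout
random.shuffle(paths)

cut = len(paths) * 3 // 4  # 3:1 ratio -> 75% training, 25% testing
train_paths, test_paths = paths[:cut], paths[cut:]

def load(path: Path) -> Image.Image:
    """Down-sample an image to the fixed 150 x 150 input resolution."""
    return Image.open(path).convert("RGB").resize((150, 150))

print(f"{len(train_paths)} training images, {len(test_paths)} test images")
```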

CycleGAN-based Object Detection under Night Environments (CycleGAN을 이용한 야간 상황 물체 검출 알고리즘)

  • Cho, Sangheum;Lee, Ryong;Na, Jaemin;Kim, Youngbin;Park, Minwoo;Lee, Sanghwan;Hwang, Wonjun
    • Journal of Korea Multimedia Society / v.22 no.1 / pp.44-54 / 2019
  • Recently, image-based object detection has made great progress with the introduction of the Convolutional Neural Network (CNN). Many approaches, such as the Region-based CNN, Fast R-CNN, and Faster R-CNN, have been proposed to achieve better performance in object detection, and YOLO has shown the best performance considering both accuracy and computational complexity. However, these data-driven detection methods, including YOLO, share a fundamental problem: they cannot guarantee good performance without a large training database. In this paper, we propose a data sampling method using CycleGAN to solve this problem; CycleGAN can convert styles while retaining the characteristics of a given input image. We generate the missing data samples so that a more robust object detector can be trained without the effort of collecting a larger database. We present extensive experimental results using day-time and night-time road images and validate that the proposed method improves night-time object detection accuracy without training on night-time object databases: we convert the day-time training images into synthesized night-time images and train the detection model with the real day-time images and the synthesized night-time images.
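
The augmentation idea reads, in outline, as below: pass real day-time images through a trained CycleGAN day-to-night generator and train the detector on the union of real day images and synthesized night images, reusing the day-time labels since the translation preserves object geometry. `day2night` stands in for a trained generator network; it and the surrounding names are assumptions, not the paper's code.

```python
import torch
from torch import nn

def synthesize_night(day_batch: torch.Tensor, day2night: nn.Module) -> torch.Tensor:
    """day_batch: (N, 3, H, W) images in [-1, 1]; returns night-styled images."""
    day2night.eval()
    with torch.no_grad():
        return day2night(day_batch)

# Detector training set = real day-time images + their synthesized night-time
# counterparts, each synthesized image keeping the original bounding boxes.
```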

Adaptive Hyperspectral Image Classification Method Based on Spectral Scale Optimization

  • Zhou, Bing;Bingxuan, Li;He, Xuan;Liu, Hexiong
    • Current Optics and Photonics / v.5 no.3 / pp.270-277 / 2021
  • The adaptive sparse representation (ASR) can effectively combine the structural information of a sample dictionary with the sparsity of coding coefficients. The algorithm accounts for the correlation between training samples and can convert between a sparse representation-based classifier (SRC) and collaborative representation classification (CRC) under different training samples. Unlike SRC and CRC, which use fixed norm constraints, ASR adaptively adjusts the constraints based on the correlation between training samples, seeking a balance between the l1 and l2 norms and greatly strengthening the robustness and adaptability of the classification algorithm. Correlation coefficients (CC), in turn, can identify pixels with strong correlation. This article therefore proposes a hyperspectral image classification method, correlation coefficients and adaptive sparse representation (CCASR), that fuses ASR and CC. The method has three steps. First, we select the pixel to be tested and calculate the CC values between it and the various training samples. Then we represent the pixel using ASR and calculate the reconstruction error corresponding to each category. Finally, the target pixel is classified according to the reconstruction error and the CC value. The method is verified on two experimental datasets. On the Indian Pines hyperspectral image, the overall accuracy of CCASR reaches 0.9596; on hyperspectral images taken by the HIS-300, the classification accuracy of the proposed method reaches 0.9354, better than other commonly used methods.
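
The three-step decision rule can be sketched as follows. For brevity this substitutes a simple l2-regularized (CRC-style) coder for the full ASR solver, and the way the CC value weights the residual is an illustrative assumption; only the overall structure (code the pixel per class, then classify by reconstruction error and CC) follows the abstract.

```python
import numpy as np

def classify_pixel(y: np.ndarray, class_dicts: list, lam: float = 1e-2) -> int:
    """y: (B,) test spectrum; class_dicts[c]: (B, n_c) training samples of class c."""
    scores = []
    for D in class_dicts:
        # Ridge coding: alpha = argmin ||y - D a||^2 + lam ||a||^2
        alpha = np.linalg.solve(D.T @ D + lam * np.eye(D.shape[1]), D.T @ y)
        residual = np.linalg.norm(y - D @ alpha)    # reconstruction error
        cc = np.corrcoef(y, D @ alpha)[0, 1]        # correlation coefficient
        scores.append(residual / max(cc, 1e-6))     # low error, high CC wins
    return int(np.argmin(scores))
```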

A DCT Learning Combined RRU-Net for the Image Splicing Forgery Detection (DCT 학습을 융합한 RRU-Net 기반 이미지 스플라이싱 위조 영역 탐지 모델)

  • Young-min Seo;Jung-woo Han;Hee-jung Kwon;Su-bin Lee;Joongjin Kook
    • Journal of the Semiconductor & Display Technology / v.22 no.1 / pp.11-17 / 2023
  • This paper proposes a lightweight deep learning network for detecting image splicing forgery. Research on image forgery detection with CNNs and on detecting and localizing forgery at the pixel level is ongoing. Among this work, CAT-Net, which learns the discrete cosine transform (DCT) coefficients of images together with the images themselves, was released in 2022. In CAT-Net, the DCT coefficients are handled by a JPEG artifact learning module that is combined with the backbone model through pre-training, after which its weights are fixed. The dataset used for that pre-training is not publicly available, and the backbone has a relatively large number of network parameters, which causes overfitting on small datasets and hinders generalization performance. In this paper, the learning module is instead designed to learn DCT-domain characteristics in real time during network training, without pre-training. The proposed DCT RRU-Net combines RRU-Net, which detects forgery by learning from images alone, with the JPEG artifact learning module. We confirm that it has fewer network parameters than CAT-Net, that its forgery detection performance is better than that of RRU-Net, and that its network architecture and training method improve generalization across various datasets.
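
As a sketch of what learning from the DCT domain during training (rather than with fixed pre-trained weights) can look like, the block below computes an 8 × 8 block-wise DCT of an image tensor with a fixed orthonormal DCT-II basis; its output could feed a learnable branch of the network. This illustrates the idea only and is not the actual DCT RRU-Net module.

```python
import math
import torch

def dct_basis(n: int = 8) -> torch.Tensor:
    """Orthonormal DCT-II basis matrix (rows are frequencies)."""
    k = torch.arange(n, dtype=torch.float32)
    basis = torch.cos(math.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    basis[0] *= 1 / math.sqrt(2)
    return basis * math.sqrt(2 / n)

def blockwise_dct(x: torch.Tensor, n: int = 8) -> torch.Tensor:
    """x: (N, 1, H, W) with H and W divisible by n; returns per-block DCT coefficients."""
    C = dct_basis(n).to(x.device)
    blocks = x.unfold(2, n, n).unfold(3, n, n)  # (N, 1, H/n, W/n, n, n)
    return C @ blocks @ C.T                     # 2-D DCT of every block
```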

Building-up and Feasibility Study of Image Dataset of Field Construction Equipments for AI Training (인공지능 학습용 토공 건설장비 영상 데이터셋 구축 및 타당성 검토)

  • Na, Jong Ho;Shin, Hyu Soun;Lee, Jae Kang;Yun, Il Dong
    • KSCE Journal of Civil and Environmental Engineering Research / v.43 no.1 / pp.99-107 / 2023
  • Recently, the rate of deaths and safety accidents at construction sites has been the highest among all industries. To apply artificial intelligence technology to construction sites, it is essential to secure a dataset that can serve as basic training data. In this paper, a large number of images were collected at actual construction sites, and the major construction equipment objects operated at civil engineering sites were defined. Construction of an optimal training dataset was completed by annotating about 90,000 images. The reliability of the dataset was verified with an mAP of over 90% using YOLO, a representative model in the field of object detection. The construction equipment training dataset built in this study has been released and is currently available on the public data portal of the Ministry of Public Administration and Security. This dataset is expected to be freely used for applications of object detection technology on construction sites, especially in the field of construction safety.
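
For reference, an object-detection dataset of this kind is typically annotated in the YOLO label format, one text file per image, as sketched below. The class index, class name, and box values are hypothetical.

```python
# One YOLO-format label line per object:
# <class_id> <x_center> <y_center> <width> <height>, all normalized to [0, 1].
label = "2 0.512 0.634 0.380 0.295\n"  # e.g. class 2 = "excavator" (assumed mapping)
with open("site_frame_000001.txt", "w") as f:  # hypothetical file name
    f.write(label)
```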

CycleGAN Based Translation Method between Asphalt and Concrete Crack Images for Data Augmentation (데이터 증강을 위한 순환 생성적 적대 신경망 기반의 아스팔트와 콘크리트 균열 영상 간의 변환 기법)

  • Shim, Seungbo
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.21 no.5 / pp.171-182 / 2022
  • Safe use of a structure requires that it be maintained in an undamaged state, so a typical factor determining the safety of a structure is the presence of cracks. Cracks arise for various reasons, damage structures in various ways, and take different shapes; worse, if they are left unattended, the risk of structural failure increases and can proceed to catastrophe. Hence, methods of checking structural damage using deep learning and computer vision technology have recently been introduced. These methods usually presuppose a large amount of training image data, yet the amount of training image data is always insufficient, and this insufficiency particularly degrades the performance of deep learning crack detection algorithms. In this study, a method of augmenting crack image data based on the image translation technique was therefore developed. Specifically, the method obtains crack image data for training a deep learning neural network model by transforming an asphalt crack image into a concrete crack image, or vice versa. Ultimately, a robust crack detection algorithm is expected to result from the increased diversity of the training data.
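
The key constraint behind such image-to-image translation is CycleGAN's cycle-consistency loss: translating to the other domain and back should reproduce the input. A minimal sketch follows, where G (asphalt to concrete) and F (concrete to asphalt) stand in for generator networks being trained; the names are assumptions, not the paper's code.

```python
import torch
from torch import nn

def cycle_consistency_loss(asphalt: torch.Tensor, concrete: torch.Tensor,
                           G: nn.Module, F: nn.Module) -> torch.Tensor:
    """L1 penalty for failing to reconstruct inputs after a round trip."""
    l1 = nn.L1Loss()
    loss_a = l1(F(G(asphalt)), asphalt)    # asphalt -> concrete -> asphalt
    loss_c = l1(G(F(concrete)), concrete)  # concrete -> asphalt -> concrete
    return loss_a + loss_c
```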