• Title/Summary/Keyword: Convolutional Neural Network (CNN)

Search Result 966, Processing Time 0.03 seconds

Machine Learning Data Extension Way for Confirming Genuine of Trademark Image which is Rotated (회전한 상표 이미지의 진위 결정을 위한 기계 학습 데이터 확장 방법)

  • Gu, Bongen
    • Journal of Platform Technology
    • /
    • v.8 no.1
    • /
    • pp.16-23
    • /
    • 2020
  • For protecting copyright for trademark, convolutional neural network can be used to confirm genuine of trademark image. For this, repeated training one trademark image degrades the performance of machine learning because of overfitting problem. Therefore, this type of machine learning application generates training data in various way. But if genuine trademark image is rotated, this image is classified as not genuine trademark. In this paper, we propose the way for extending training data to confirm genuine of trademark image which is rotated. Our proposed way generates rotated image from genuine trademark image as training data. To show effectiveness of our proposed way, we use CNN machine learning model, and evaluate the accuracy with test image. From evaluation result, our way can be used to generate training data for machine learning application which confirms genuine of rotated trademark image.

  • PDF

Multiple Sclerosis Lesion Detection using 3D Autoencoder in Brain Magnetic Resonance Images (3D 오토인코더 기반의 뇌 자기공명영상에서 다발성 경화증 병변 검출)

  • Choi, Wonjune;Park, Seongsu;Kim, Yunsoo;Gahm, Jin Kyu
    • Journal of Korea Multimedia Society
    • /
    • v.24 no.8
    • /
    • pp.979-987
    • /
    • 2021
  • Multiple Sclerosis (MS) can be early diagnosed by detecting lesions in brain magnetic resonance images (MRI). Unsupervised anomaly detection methods based on autoencoder have been recently proposed for automated detection of MS lesions. However, these autoencoder-based methods were developed only for 2D images (e.g. 2D cross-sectional slices) of MRI, so do not utilize the full 3D information of MRI. In this paper, therefore, we propose a novel 3D autoencoder-based framework for detection of the lesion volume of MS in MRI. We first define a 3D convolutional neural network (CNN) for full MRI volumes, and build each encoder and decoder layer of the 3D autoencoder based on 3D CNN. We also add a skip connection between the encoder and decoder layer for effective data reconstruction. In the experimental results, we compare the 3D autoencoder-based method with the 2D autoencoder models using the training datasets of 80 healthy subjects from the Human Connectome Project (HCP) and the testing datasets of 25 MS patients from the Longitudinal multiple sclerosis lesion segmentation challenge, and show that the proposed method achieves superior performance in prediction of MS lesion by up to 15%.

Development of a Sign Language Learning Assistance System using Mediapipe for Sign Language Education of Deaf-Mutility (청각장애인의 수어 교육을 위한 MediaPipe 활용 수어 학습 보조 시스템 개발)

  • Kim, Jin-Young;Sim, Hyun
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.16 no.6
    • /
    • pp.1355-1362
    • /
    • 2021
  • Recently, not only congenital hearing impairment, but also the number of people with hearing impairment due to acquired factors is increasing. The environment in which sign language can be learned is poor. Therefore, this study intends to present a sign language (sign language number/sign language text) evaluation system as a sign language learning assistance tool for sign language learners. Therefore, in this paper, sign language is captured as an image using OpenCV and Convolutional Neural Network (CNN). In addition, we study a system that recognizes sign language behavior using MediaPipe, converts the meaning of sign language into text-type data, and provides it to users. Through this, self-directed learning is possible so that learners who learn sign language can judge whether they are correct dez. Therefore, we develop a sign language learning assistance system that helps us learn sign language. The purpose is to propose a sign language learning assistance system as a way to support sign language learning, the main language of communication for the hearing impaired.

A General Acoustic Drone Detection Using Noise Reduction Preprocessing (환경 소음 제거를 통한 범용적인 드론 음향 탐지 구현)

  • Kang, Hae Young;Lee, Kyung-ho
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.32 no.5
    • /
    • pp.881-890
    • /
    • 2022
  • As individual and group users actively use drones, the risks (Intrusion, Information leakage, and Sircraft crashes and so on) in no-fly zones are also increasing. Therefore, it is necessary to build a system that can detect drones intruding into the no-fly zone. General acoustic drone detection researches do not derive location-independent performance by directly learning drone sound including environmental noise in a deep learning model to overcome environmental noise. In this paper, we propose a drone detection system that collects sounds including environmental noise, and detects drones by removing noise from target sound. After removing environmental noise from the collected sound, the proposed system predicts the drone sound using Mel spectrogram and CNN deep learning. As a result, It is confirmed that the drone detection performance, which was weak due to unstudied environmental noises, can be improved by more than 7%.

Application of CCTV Image and Semantic Segmentation Model for Water Level Estimation of Irrigation Channel (관개용수로 CCTV 이미지를 이용한 CNN 딥러닝 이미지 모델 적용)

  • Kim, Kwi-Hoon;Kim, Ma-Ga;Yoon, Pu-Reun;Bang, Je-Hong;Myoung, Woo-Ho;Choi, Jin-Yong;Choi, Gyu-Hoon
    • Journal of The Korean Society of Agricultural Engineers
    • /
    • v.64 no.3
    • /
    • pp.63-73
    • /
    • 2022
  • A more accurate understanding of the irrigation water supply is necessary for efficient agricultural water management. Although we measure water levels in an irrigation canal using ultrasonic water level gauges, some errors occur due to malfunctions or the surrounding environment. This study aims to apply CNN (Convolutional Neural Network) Deep-learning-based image classification and segmentation models to the irrigation canal's CCTV (Closed-Circuit Television) images. The CCTV images were acquired from the irrigation canal of the agricultural reservoir in Cheorwon-gun, Gangwon-do. We used the ResNet-50 model for the image classification model and the U-Net model for the image segmentation model. Using the Natural Breaks algorithm, we divided water level data into 2, 4, and 8 groups for image classification models. The classification models of 2, 4, and 8 groups showed the accuracy of 1.000, 0.987, and 0.634, respectively. The image segmentation model showed a Dice score of 0.998 and predicted water levels showed R2 of 0.97 and MAE (Mean Absolute Error) of 0.02 m. The image classification models can be applied to the automatic gate-controller at four divisions of water levels. Also, the image segmentation model results can be applied to the alternative measurement for ultrasonic water gauges. We expect that the results of this study can provide a more scientific and efficient approach for agricultural water management.

A Study of Tram-Pedestrian Collision Prediction Method Using YOLOv5 and Motion Vector (YOLOv5와 모션벡터를 활용한 트램-보행자 충돌 예측 방법 연구)

  • Kim, Young-Min;An, Hyeon-Uk;Jeon, Hee-gyun;Kim, Jin-Pyeong;Jang, Gyu-Jin;Hwang, Hyeon-Chyeol
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.10 no.12
    • /
    • pp.561-568
    • /
    • 2021
  • In recent years, autonomous driving technologies have become a high-value-added technology that attracts attention in the fields of science and industry. For smooth Self-driving, it is necessary to accurately detect an object and estimate its movement speed in real time. CNN-based deep learning algorithms and conventional dense optical flows have a large consumption time, making it difficult to detect objects and estimate its movement speed in real time. In this paper, using a single camera image, fast object detection was performed using the YOLOv5 algorithm, a deep learning algorithm, and fast estimation of the speed of the object was performed by using a local dense optical flow modified from the existing dense optical flow based on the detected object. Based on this algorithm, we present a system that can predict the collision time and probability, and through this system, we intend to contribute to prevent tram accidents.

Optimizing CNN Structure to Improve Accuracy of Artwork Artist Classification

  • Ji-Seon Park;So-Yeon Kim;Yeo-Chan Yoon;Soo Kyun Kim
    • Journal of the Korea Society of Computer and Information
    • /
    • v.28 no.9
    • /
    • pp.9-15
    • /
    • 2023
  • Metaverse is a modern new technology that is advancing quickly. The goal of this study is to investigate this technique from the perspective of computer vision as well as general perspective. A thorough analysis of computer vision related Metaverse topics has been done in this study. Its history, method, architecture, benefits, and drawbacks are all covered. The Metaverse's future and the steps that must be taken to adapt to this technology are described. The concepts of Mixed Reality (MR), Augmented Reality (AR), Extended Reality (XR) and Virtual Reality (VR) are briefly discussed. The role of computer vision and its application, advantages and disadvantages and the future research areas are discussed.

A Worker-Driven Approach for Opening Detection by Integrating Computer Vision and Built-in Inertia Sensors on Embedded Devices

  • Anjum, Sharjeel;Sibtain, Muhammad;Khalid, Rabia;Khan, Muhammad;Lee, Doyeop;Park, Chansik
    • International conference on construction engineering and project management
    • /
    • 2022.06a
    • /
    • pp.353-360
    • /
    • 2022
  • Due to the dense and complicated working environment, the construction industry is susceptible to many accidents. Worker's fall is a severe problem at the construction site, including falling into holes or openings because of the inadequate coverings as per the safety rules. During the construction or demolition of a building, openings and holes are formed in the floors and roofs. Many workers neglect to cover openings for ease of work while being aware of the risks of holes, openings, and gaps at heights. However, there are safety rules for worker safety; the holes and openings must be covered to prevent falls. The safety inspector typically examines it by visiting the construction site, which is time-consuming and requires safety manager efforts. Therefore, this study presented a worker-driven approach (the worker is involved in the reporting process) to facilitate safety managers by developing integrated computer vision and inertia sensors-based mobile applications to identify openings. The TensorFlow framework is used to design Convolutional Neural Network (CNN); the designed CNN is trained on a custom dataset for binary class openings and covered and deployed on an android smartphone. When an application captures an image, the device also extracts the accelerometer values to determine the inclination in parallel with the classification task of the device to predict the final output as floor (openings/ covered), wall (openings/covered), and roof (openings / covered). The proposed worker-driven approach will be extended with other case scenarios at the construction site.

  • PDF

Visual Explanation of a Deep Learning Solar Flare Forecast Model and Its Relationship to Physical Parameters

  • Yi, Kangwoo;Moon, Yong-Jae;Lim, Daye;Park, Eunsu;Lee, Harim
    • The Bulletin of The Korean Astronomical Society
    • /
    • v.46 no.1
    • /
    • pp.42.1-42.1
    • /
    • 2021
  • In this study, we present a visual explanation of a deep learning solar flare forecast model and its relationship to physical parameters of solar active regions (ARs). For this, we use full-disk magnetograms at 00:00 UT from the Solar and Heliospheric Observatory/Michelson Doppler Imager and the Solar Dynamics Observatory/Helioseismic and Magnetic Imager, physical parameters from the Space-weather HMI Active Region Patch (SHARP), and Geostationary Operational Environmental Satellite X-ray flare data. Our deep learning flare forecast model based on the Convolutional Neural Network (CNN) predicts "Yes" or "No" for the daily occurrence of C-, M-, and X-class flares. We interpret the model using two CNN attribution methods (guided backpropagation and Gradient-weighted Class Activation Mapping [Grad-CAM]) that provide quantitative information on explaining the model. We find that our deep learning flare forecasting model is intimately related to AR physical properties that have also been distinguished in previous studies as holding significant predictive ability. Major results of this study are as follows. First, we successfully apply our deep learning models to the forecast of daily solar flare occurrence with TSS = 0.65, without any preprocessing to extract features from data. Second, using the attribution methods, we find that the polarity inversion line is an important feature for the deep learning flare forecasting model. Third, the ARs with high Grad-CAM values produce more flares than those with low Grad-CAM values. Fourth, nine SHARP parameters such as total unsigned vertical current, total unsigned current helicity, total unsigned flux, and total photospheric magnetic free energy density are well correlated with Grad-CAM values.

  • PDF

RoutingConvNet: A Light-weight Speech Emotion Recognition Model Based on Bidirectional MFCC (RoutingConvNet: 양방향 MFCC 기반 경량 음성감정인식 모델)

  • Hyun Taek Lim;Soo Hyung Kim;Guee Sang Lee;Hyung Jeong Yang
    • Smart Media Journal
    • /
    • v.12 no.5
    • /
    • pp.28-35
    • /
    • 2023
  • In this study, we propose a new light-weight model RoutingConvNet with fewer parameters to improve the applicability and practicality of speech emotion recognition. To reduce the number of learnable parameters, the proposed model connects bidirectional MFCCs on a channel-by-channel basis to learn long-term emotion dependence and extract contextual features. A light-weight deep CNN is constructed for low-level feature extraction, and self-attention is used to obtain information about channel and spatial signals in speech signals. In addition, we apply dynamic routing to improve the accuracy and construct a model that is robust to feature variations. The proposed model shows parameter reduction and accuracy improvement in the overall experiments of speech emotion datasets (EMO-DB, RAVDESS, and IEMOCAP), achieving 87.86%, 83.44%, and 66.06% accuracy respectively with about 156,000 parameters. In this study, we proposed a metric to calculate the trade-off between the number of parameters and accuracy for performance evaluation against light-weight.