• Title/Summary/Keyword: GPU model

Detection of Smoking Behavior in Images Using Deep Learning Technology (딥러닝 기술을 이용한 영상에서 흡연행위 검출)

  • Dong Jun Kim;Yu Jin Choi;Kyung Min Park;Ji Hyun Park;Jae-Moon Lee;Kitae Hwang;In Hwan Jung
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.23 no.4 / pp.107-113 / 2023
  • This paper proposes a method for detecting smoking behavior in images using artificial intelligence. Because smoking is an action rather than a static state, object detection was combined with pose estimation, which can recognize actions. A smoker detection model was trained to find smokers in images, and the characteristics of smoking behavior were applied to pose estimation to detect the behavior itself. YOLOv8 was used for object detection and OpenPose for pose estimation. In addition, when an image contains both smokers and non-smokers, a method of separating only people was applied. The proposed method was implemented in Python on Google Colab with an NVIDIA Tesla T4 GPU, and in testing the smoking behavior in the given video was detected perfectly.

Deep Learning based Dynamic Taint Detection Technique for Binary Code Vulnerability Detection (바이너리 코드 취약점 탐지를 위한 딥러닝 기반 동적 오염 탐지 기술)

  • Kwang-Man Ko
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.16 no.3 / pp.161-166 / 2023
  • In recent years, new and variant attacks on binary code have increased, often exposing the limitations of techniques that detect malicious code in source programs and defend against attacks. Advanced vulnerability detection for binary code based on machine learning and deep learning is required, along with the capability to defend against and respond to attacks. In this paper, we propose a malware clustering method that takes as input dynamic taint information obtained by tracing the execution path of binary code and groups malware by the characteristics of that information. Malware vulnerability detection was applied to a three-layer few-shot learning model, and F1-scores were calculated for each layer on CPU and GPU. We obtained 97-98% performance in training and 80-81% detection performance in testing.
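
The F1-score reported in the abstract above is the harmonic mean of precision and recall. The following is a minimal Python sketch, assuming binary malicious/benign labels; the function name `f1_score` and the counts in the example are hypothetical and do not come from the paper.

```python
def f1_score(tp, fp, fn):
    """Return the F1-score from true-positive, false-positive, and false-negative counts.

    Uses the closed form 2*TP / (2*TP + FP + FN), which equals the harmonic
    mean of precision and recall. Returns 0.0 when nothing was predicted or
    labeled positive.
    """
    denom = 2 * tp + fp + fn
    if denom == 0:
        return 0.0
    return 2 * tp / denom


# Hypothetical counts: 8 detected malware samples, 2 false alarms, 2 misses.
score = f1_score(tp=8, fp=2, fn=2)
print(score)
```

With equal numbers of false alarms and misses, precision and recall coincide, so the F1-score equals both.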

Speed-up Techniques for High-Resolution Grid Data Processing in the Early Warning System for Agrometeorological Disaster (농업기상재해 조기경보시스템에서의 고해상도 격자형 자료의 처리 속도 향상 기법)

  • Park, J.H.;Shin, Y.S.;Kim, S.K.;Kang, W.S.;Han, Y.K.;Kim, J.H.;Kim, D.J.;Kim, S.O.;Shim, K.M.;Park, E.W.
    • Korean Journal of Agricultural and Forest Meteorology / v.19 no.3 / pp.153-163 / 2017
  • The objective of this study is to speed up the model that estimates weather variables (e.g., minimum/maximum temperature, sunshine hours, and PRISM (Parameter-elevation Regression on Independent Slopes Model) based precipitation) used in the Agrometeorological Early Warning System (http://www.agmet.kr). Weather estimation currently runs on high-performance multi-core CPUs with 8 physical cores and 16 logical threads. Nonetheless, the server is not dedicated even to handling a single county, indicating that calculating the 10 counties of the Seomjin River Basin involves very high overhead. To reduce this overhead, several caching and parallelization techniques were applied, and their performance and applicability were measured. The results are as follows: (1) for simple calculations such as Growing Degree Days accumulation, input/output (I/O) takes significantly longer than the calculation itself, suggesting the need for a technique that reduces disk I/O bottlenecks; (2) when there are many I/O operations, it is advantageous to distribute them across several servers, provided each server has its own cache for input data so that the servers do not compete for the same resource; and (3) a GPU-based parallel processing method is most suitable for models with large computational loads, such as PRISM.
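
To illustrate why a calculation such as Growing Degree Days accumulation is cheap compared with reading the grids, the following is a minimal Python sketch. It is not the system's code: the simple-average formula, the base temperature of 10 °C, and the function name `accumulate_gdd` are assumptions, and each grid is flattened to a plain list of cells.

```python
def accumulate_gdd(tmin_days, tmax_days, base=10.0):
    """Accumulate growing degree days per grid cell.

    tmin_days, tmax_days: one list of cell temperatures (deg C) per day.
    base: assumed base temperature; the abstract does not specify one.
    """
    total = [0.0] * len(tmin_days[0])
    for tmin, tmax in zip(tmin_days, tmax_days):
        for i, (lo, hi) in enumerate(zip(tmin, tmax)):
            # Daily contribution: mean temperature above the base, never negative.
            total[i] += max((lo + hi) / 2.0 - base, 0.0)
    return total


# Two days, two grid cells (hypothetical values).
tmin_days = [[8.0, 12.0], [4.0, 14.0]]
tmax_days = [[16.0, 20.0], [10.0, 22.0]]
print(accumulate_gdd(tmin_days, tmax_days))
```

Each cell needs only an addition, a subtraction, and a comparison per day, so for high-resolution grids the time spent loading the daily inputs can easily dominate.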

Visualizing sphere-contacting areas on automobile parts for ECE inspection

  • Inui, Masatomo;Umezu, Nobuyuki;Kitamura, Yuuki
    • Journal of Computational Design and Engineering / v.2 no.1 / pp.55-66 / 2015
  • To satisfy the safety regulations of the Economic Commission for Europe (ECE), the surface regions of automobile parts must have a sufficient degree of roundness if there is any chance that they could contact a sphere of 50.0 mm radius (exterior parts) or 82.5 mm radius (interior parts). In this paper, a new offset-based method is developed to automatically detect the possible sphere-contacting shape of such parts. A polyhedral model that precisely approximates the part shape is given as input, and the offset shape of the model is obtained as the Boolean union of the expanded shapes of all surface triangles. We adopt a triple-dexel representation of the 3D model to enable stable and precise Boolean union computations. To accelerate the dexel operations in these Boolean computations, a new parallel processing method with a pseudo-list structure and axis-aligned bounding boxes is developed. The possible sphere-contacting shape of the part surface is then extracted from the offset shape as a set of points or a set of polygons.

Development of a Low-cost Industrial OCR System with an End-to-end Deep Learning Technology

  • Subedi, Bharat;Yunusov, Jahongir;Gaybulayev, Abdulaziz;Kim, Tae-Hyong
    • IEMEK Journal of Embedded Systems and Applications / v.15 no.2 / pp.51-60 / 2020
  • Optical character recognition (OCR) has been studied for decades because it is useful in a wide variety of settings. OCR performance has recently improved significantly thanks to deep learning, and demand for commercial-grade but affordable OCR systems is growing. We have developed a low-cost, high-performance OCR system for industry on the cheapest embedded developer kit that supports GPU acceleration. To achieve high accuracy for industrial use on limited computing resources, we chose a state-of-the-art text recognition algorithm that uses an end-to-end deep learning network as a baseline model. The model was then improved by replacing the feature extraction network with the one best suited to our conditions. Among the candidate networks, EfficientNet-B3 showed the best performance: excellent recognition accuracy with relatively low memory consumption. In addition, we optimized the model, written with TensorFlow's Python API, using TensorFlow-TensorRT integration and TensorFlow's C++ API, respectively.

Olefin/Paraffin Separation through Facilitated Transport Membranes in Solid State

  • Hong, Seong-Uk;Won, Jong-Ok;Hong, Jae-Min;Park, Hyun-Chae;Kang, Yong-Soo
    • Proceedings of the Membrane Society of Korea Conference / 1999.07a / pp.15-18 / 1999
  • A simple mathematical model for facilitated mass transport through a fixed-site carrier membrane was derived by assuming an instantaneous, microscopic concentration (activity) fluctuation. The model demonstrates that the facilitation factor depends on the extent of concentration fluctuation, the ratio of the time scales of diffusion and chemical reaction, and the ratio of the carrier concentration to the solute solubility in the matrix. The model was examined against experimental data on oxygen transport in membranes containing metalloporphyrin carriers, and the agreement was exceptional (within 10% error). The basic concept of this approach was applied to separate olefin from olefin/paraffin mixtures. A proprietary carrier developed here yielded a propylene-over-propane selectivity of more than 120 and a propylene permeance exceeding 40 GPU (gas permeation units).

Virtual view synthesis using unsupervised learning depth estimation model (비지도 학습 깊이 예측 모델을 이용한 가상시점 합성)

  • Song, Min-Ki;Yang, Ji-Hee;Hwang, Dong-Ho;Park, Goo-Man
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2019.11a / pp.155-157 / 2019
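  • To address the problems of conventional virtual view synthesis using DERS and VSRS, this paper proposes applying an unsupervised learning model to virtual view synthesis. Unlike conventional DERS, the proposed method can predict depth without specifying a disparity search range, and because it predicts depth from a monocular image, it can synthesize a wider range of viewpoints. In addition, whereas the conventional approach must process depth and the synthesized image separately, the proposed method performs both in a single pass, and because it is implemented on a GPU, it is faster than conventional synthesis methods.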

Convolutional Neural Network with Particle Filter Approach for Visual Tracking

  • Tyan, Vladimir;Kim, Doohyun
    • KSII Transactions on Internet and Information Systems (TIIS) / v.12 no.2 / pp.693-709 / 2018
  • In this paper, we propose a compact Convolutional Neural Network (CNN)-based tracker combined with a particle filter architecture, in which the CNN model operates as an accurate candidate estimator, while the particle filter predicts the target motion dynamics, lowers the overall number of calculations, and refines the resulting target bounding box. Experiments were conducted on the Online Object Tracking Benchmark (OTB) [34] dataset, and a comparative analysis against other state-of-the-art trackers was performed based on accuracy and precision, indicating that the proposed algorithm outperforms all state-of-the-art trackers included in the OTB dataset, specifically TLD [16], MIL [1], SCM [36], and ASLA [15]. In addition, a comprehensive speed performance analysis showed an average frames-per-second (FPS) rate among the top-10 trackers from the OTB dataset [34].

Enhanced Network Intrusion Detection using Deep Convolutional Neural Networks

  • Naseer, Sheraz;Saleem, Yasir
    • KSII Transactions on Internet and Information Systems (TIIS) / v.12 no.10 / pp.5159-5178 / 2018
  • Network intrusion detection is a rapidly growing field of information security due to its importance for modern IT infrastructure. Researchers in machine learning and data mining have devised many supervised and unsupervised learning techniques to achieve reliable detection of anomalies. In this paper, a deep convolutional neural network (DCNN) based intrusion detection system (IDS) is proposed, implemented, and analyzed. The deep CNN core of the proposed IDS is fine-tuned using randomized search over the configuration space. The proposed system is trained and tested on the NSLKDD training and testing datasets using a GPU. The proposed DCNN model is compared with other classifiers using well-known metrics, including the receiver operating characteristic (RoC) curve, area under the RoC curve (AuC), accuracy, precision-recall curve, and mean average precision (mAP). The experimental results of the proposed DCNN-based IDS are promising for real-world application in anomaly detection systems.

Integer-Pel Motion Estimation for HEVC on Compute Unified Device Architecture (CUDA)

  • Lee, Dongkyu;Sim, Donggyu;Oh, Seoung-Jun
    • IEIE Transactions on Smart Processing and Computing / v.3 no.6 / pp.397-403 / 2014
  • A new video compression standard called High Efficiency Video Coding (HEVC) has recently been released. HEVC provides higher coding performance than previous standards, but at the cost of a significant increase in encoding complexity, particularly in motion estimation (ME). At the same time, Graphics Processing Units (GPUs) have become more powerful. This paper proposes a parallel integer-pel ME (IME) algorithm for HEVC on a GPU using the Compute Unified Device Architecture (CUDA). The proposed IME introduces concurrent parallel reduction (CPR), which performs several parallel reduction (PR) operations concurrently to solve two problems in conventional PR: low thread utilization and high thread synchronization latency. The proposed encoder reduces the portion of IME in the encoder to almost zero with a 2.3% increase in bitrate. The proposed IME is up to 172.6 times faster than the IME in the HEVC reference model.
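
The conventional parallel reduction (PR) that the abstract above contrasts with CPR can be sketched as follows. This is a sequential Python simulation of the tree pattern, not the paper's CUDA code and not CPR itself; the function name `tree_reduce_min` and the matching-cost values are hypothetical.

```python
def tree_reduce_min(costs):
    """Return (min_cost, index) using a pairwise tree reduction.

    Each pass of the while-loop stands for one synchronized step on a GPU,
    where "thread" i compares slot i with slot i + half. The threads are
    simulated sequentially here.
    """
    if not costs:
        raise ValueError("costs must be non-empty")
    # Pair each cost with its candidate index so the winner can be located.
    slots = [(c, i) for i, c in enumerate(costs)]
    n = len(slots)
    while n > 1:
        half = (n + 1) // 2
        # Only n - half "threads" do work in this step; the rest sit idle.
        for i in range(n - half):
            slots[i] = min(slots[i], slots[i + half])
        n = half
    return slots[0]


# Hypothetical matching costs for five candidate motion vectors.
print(tree_reduce_min([40, 12, 33, 12, 57]))
```

The number of active threads roughly halves at every step and each step needs a synchronization point, which corresponds to the two problems of conventional PR that the abstract names.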