• Title/Summary/Keyword: DNN


Improving Multi-DNN Computational Performance of Embedded Multicore Processors through a Global Queue (글로벌 큐를 통한 임베디드 멀티코어 프로세서의 멀티 DNN 연산 성능 향상)

  • Cho, Ho-jin;Kim, Myung-sun
    • Journal of the Korea Institute of Information and Communication Engineering / v.24 no.6 / pp.714-721 / 2020
  • DNNs are increasingly used in embedded systems such as robots and autonomous vehicles. Achieving high recognition accuracy greatly increases computational complexity, and multiple DNNs often run aperiodically. The ability to process multiple DNNs in embedded environments is therefore a crucial issue, and multicore-based platforms are being released in response. However, most DNN models are executed as a batch process, so when multiple DNNs run together on a multicore processor, the deviation in execution time between DNNs may be large, and the end-to-end execution time of all DNNs can be long depending on how they are allocated to the cores. In this paper, we address these problems with a framework that decomposes each DNN into individual layers and distributes them to the cores through a global queue. In the experiments, the total DNN execution time was reduced by 31%, and when running multiple identical DNNs, the deviation in execution time was reduced by up to 95.1%.
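
The layer-level dispatch described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' framework: the function name `run_with_global_queue`, the representation of a DNN as a list of Python callables, and the use of threads to stand in for cores are all assumptions made here. Python threads share the interpreter lock, so the sketch shows only the scheduling pattern, not a real speedup.

```python
import queue
import threading


def run_with_global_queue(dnns, num_workers=4):
    """Run several DNNs layer by layer through one shared (global) queue.

    dnns: dict mapping a DNN name to (layers, initial_input), where layers is
          a non-empty list of callables. This representation is hypothetical.
    Returns a dict mapping each DNN name to its final output.
    """
    if not dnns:
        return {}

    tasks = queue.Queue()      # the global queue shared by every worker
    results = {}
    lock = threading.Lock()
    remaining = len(dnns)      # number of DNNs that have not finished yet

    # Seed the queue with the first layer of every DNN.
    for name, (_, first_input) in dnns.items():
        tasks.put((name, 0, first_input))

    def worker():
        nonlocal remaining
        while True:
            item = tasks.get()
            if item is None:   # sentinel: all DNNs are done
                return
            name, idx, data = item
            layers, _ = dnns[name]
            out = layers[idx](data)
            if idx + 1 < len(layers):
                # The next layer of this DNN becomes runnable on any worker.
                tasks.put((name, idx + 1, out))
            else:
                with lock:
                    results[name] = out
                    remaining -= 1
                    all_done = remaining == 0
                if all_done:
                    for _ in range(num_workers):
                        tasks.put(None)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Because a layer is enqueued only after its predecessor finishes, the layers of one DNN still run in order, while idle workers are free to pick up layers from any other DNN.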

Optimal Solution of a Large-scale Travelling Salesman Problem applying DNN and k-opt (DNN과 k-opt를 적용한 대규모 외판원 문제의 최적 해법)

  • Lee, Sang-Un
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.15 no.4 / pp.249-257 / 2015
  • This paper introduces a heuristic algorithm for the NP-hard travelling salesman problem. To determine the initial path, the proposed algorithm applies SW-DNN, DW-DNN, and DC-DNN, which are modified forms of the prevalent Double-sided Nearest Neighbor Search, and takes the minimum value among them. To optimize the initial solution, it applies 2-opt, 2.5-opt, and 3-opt local search (k-opt) to the candidate delete edges, and 4-opt to the undeleted ones among them. When tested on TSP-1 (26 European cities) and TSP-2 (49 U.S. cities), the proposed algorithm obtained the optimal result in both cases, countering the prevalent belief that the optimal solution is unattainable and suggesting its use as a general algorithm for the travelling salesman problem.
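
The general pattern of building an initial tour greedily and then improving it by local search can be sketched as below. This sketch uses the plain single-ended nearest-neighbor construction and plain 2-opt only; it does not implement the paper's SW-DNN, DW-DNN, or DC-DNN variants, nor its 2.5-opt, 3-opt, or 4-opt steps. The function names are chosen here for illustration.

```python
import math


def tour_length(points, tour):
    """Total length of the closed tour visiting points in the given order."""
    n = len(tour)
    return sum(
        math.dist(points[tour[i]], points[tour[(i + 1) % n]]) for i in range(n)
    )


def nearest_neighbor_tour(points, start=0):
    """Greedy construction: always move to the closest unvisited point."""
    unvisited = set(range(len(points))) - {start}
    tour = [start]
    while unvisited:
        last = points[tour[-1]]
        nxt = min(unvisited, key=lambda j: math.dist(last, points[j]))
        unvisited.remove(nxt)
        tour.append(nxt)
    return tour


def two_opt(points, tour):
    """Reverse tour segments for as long as a reversal shortens the tour."""
    best = tour[:]
    improved = True
    while improved:
        improved = False
        for i in range(1, len(best) - 1):
            for j in range(i + 1, len(best)):
                candidate = best[:i] + best[i:j + 1][::-1] + best[j + 1:]
                # Small tolerance avoids looping on floating-point noise.
                if tour_length(points, candidate) < tour_length(points, best) - 1e-12:
                    best = candidate
                    improved = True
    return best
```

A typical call is `two_opt(points, nearest_neighbor_tour(points))`. Neither step guarantees optimality on its own; 2-opt only guarantees that no single segment reversal can shorten the result further.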

Priority-based Multi-DNN scheduling framework for autonomous vehicles (자율주행차용 우선순위 기반 다중 DNN 모델 스케줄링 프레임워크)

  • Cho, Ho-Jin;Hong, Sun-Pyo;Kim, Myung-Sun
    • Journal of the Korea Institute of Information and Communication Engineering / v.25 no.3 / pp.368-376 / 2021
  • With the recent development of deep learning technology, autonomous things technology is attracting attention, and DNNs are widely used in embedded systems such as drones and autonomous vehicles. Embedded systems are being released that can perform large-scale operations and process multiple DNNs for high recognition accuracy without relying on the cloud. DNNs with various levels of priority coexist within these systems. DNNs related to the safety-critical applications of autonomous vehicles have the highest priority and must be handled first. In this paper, we propose a priority-based scheduling framework for cases where multiple DNNs are executed simultaneously. Even if a low-priority DNN is already executing, a high-priority DNN can preempt it, which guarantees the fast response required by the safety-critical applications of autonomous vehicles. Extensive experiments on an actual commercial board showed a performance improvement of up to 76.6%.
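
Priority-based preemption of this kind can be illustrated with a small simulation. The abstract does not state at what granularity preemption happens; the sketch below assumes it happens at layer boundaries, that every layer takes one time step, and that a single executor is available. The function name `schedule` and the job tuple layout are hypothetical.

```python
import heapq


def schedule(jobs):
    """Simulate priority scheduling with preemption at layer boundaries.

    jobs: list of (arrival_step, priority, name, num_layers) with
          num_layers >= 1. A lower priority number means more urgent.
    Returns the list of job names in the order their layers were executed
    (None marks an idle step).
    """
    pending = sorted(jobs)      # ordered by arrival step
    ready = []                  # heap of (priority, seq, name, remaining_layers)
    trace = []
    step = 0
    i = 0
    while i < len(pending) or ready:
        # Admit every job that has arrived by the current step.
        while i < len(pending) and pending[i][0] <= step:
            _, prio, name, layers = pending[i]
            heapq.heappush(ready, (prio, i, name, layers))
            i += 1
        if ready:
            # Run one layer of the most urgent ready job, then re-queue it.
            prio, seq, name, remaining = heapq.heappop(ready)
            trace.append(name)
            if remaining > 1:
                heapq.heappush(ready, (prio, seq, name, remaining - 1))
        else:
            trace.append(None)
        step += 1
    return trace
```

With a four-layer low-priority job arriving at step 0 and a two-layer high-priority job arriving at step 2, the low-priority job runs two layers, is preempted for both high-priority layers, and then resumes.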

Multi-DNN Acceleration Techniques for Embedded Systems with Tucker Decomposition and Hidden-layer-based Parallel Processing (터커 분해 및 은닉층 병렬처리를 통한 임베디드 시스템의 다중 DNN 가속화 기법)

  • Kim, Ji-Min;Kim, In-Mo;Kim, Myung-Sun
    • Journal of the Korea Institute of Information and Communication Engineering / v.26 no.6 / pp.842-849 / 2022
  • With the development of deep learning technology, DNNs are increasingly used in embedded systems such as unmanned vehicles, drones, and robots. In an autonomous driving system, for example, it is crucial to run several DNNs simultaneously, each with high accuracy and a large amount of computation. However, running multiple DNNs simultaneously on an embedded system with relatively low performance increases the inference time. This may cause the system to malfunction, because the operation that depends on the inference result is not performed in time. To solve this problem, the proposed solution first reduces the computation by applying Tucker decomposition to DNN models with a large amount of computation, and then runs the DNN models in parallel as much as possible, in units of hidden layers, on the GPU. The experimental results show that the DNN inference time decreases by up to 75.6% compared with the case before applying the proposed technique.
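
To give a sense of why Tucker decomposition reduces computation, the sketch below counts the weights of one convolution layer before and after a Tucker-2 factorization. The abstract does not say which Tucker variant or ranks the authors use. The sketch assumes the common Tucker-2 form, in which a d x d convolution is replaced by a 1 x 1 convolution, a smaller d x d convolution, and another 1 x 1 convolution. Biases are ignored and all names are illustrative.

```python
def conv_params(in_ch, out_ch, kernel):
    """Weights in a standard kernel x kernel convolution (biases ignored)."""
    return in_ch * out_ch * kernel * kernel


def tucker2_conv_params(in_ch, out_ch, kernel, rank_in, rank_out):
    """Weights after a Tucker-2 factorization of the same layer.

    The layer becomes three convolutions:
      1 x 1           : in_ch    -> rank_in
      kernel x kernel : rank_in  -> rank_out
      1 x 1           : rank_out -> out_ch
    """
    return (
        in_ch * rank_in
        + rank_in * rank_out * kernel * kernel
        + rank_out * out_ch
    )
```

For a 3 x 3 layer with 64 input and 128 output channels, choosing ranks 16 and 32 reduces the weight count from 73,728 to 9,728. How much accuracy is retained depends on the chosen ranks and usually requires fine-tuning.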

Comparison of Audio Event Detection Performance using DNN (DNN을 이용한 오디오 이벤트 검출 성능 비교)

  • Chung, Suk-Hwan;Chung, Yong-Joo
    • The Journal of the Korea institute of electronic communication sciences / v.13 no.3 / pp.571-578 / 2018
  • Deep learning techniques have recently shown superior performance in various kinds of pattern recognition. However, it has been debated whether DNNs outperform conventional machine learning techniques when classification experiments use only a small amount of training data. In this study, we compared the performance of the conventional GMM and SVM with that of a DNN, a deep learning technique, in audio event detection. When tested on the same data, the DNN showed superior overall performance, but the SVM outperformed the DNN in segment-based F-score.

Trends in Neuromorphic Software Platform for Deep Neural Network (딥 뉴럴 네트워크 지원을 위한 뉴로모픽 소프트웨어 플랫폼 기술 동향)

  • Yu, Misun;Ha, Youngmok;Kim, Taeho
    • Electronics and Telecommunications Trends / v.33 no.4 / pp.14-22 / 2018
  • Deep neural networks (DNNs) are widely used in various domains such as speech and image recognition. DNN software frameworks such as TensorFlow and Caffe have contributed to the popularity of DNNs because of their easy programming environments. In addition, many companies are developing neuromorphic processing units (NPUs) such as Tensor Processing Units (TPUs) and Graphics Processing Units (GPUs) to improve the performance of DNN processing. However, there is a large gap between NPUs and DNN software frameworks due to the lack of framework support for various NPUs. This gap can be bridged by a DNN software platform that includes DNN-optimized compilers and DNN libraries. In this paper, we review the technical trends of DNN software platforms.

Effective Recognition of Velopharyngeal Insufficiency (VPI) Patient's Speech Using DNN-HMM-based System (DNN-HMM 기반 시스템을 이용한 효과적인 구개인두부전증 환자 음성 인식)

  • Yoon, Ki-mu;Kim, Wooil
    • Journal of the Korea Institute of Information and Communication Engineering / v.23 no.1 / pp.33-38 / 2019
  • This paper proposes an effective method for recognizing the speech of VPI patients using a DNN-HMM-based speech recognition system, and evaluates its recognition performance against a GMM-HMM-based system. The proposed method employs a speaker adaptation technique to improve VPI speech recognition. To make effective use of the small amount of available VPI speech for model adaptation, we propose using simulated VPI speech to generate a prior model for speaker adaptation, together with selective learning of the DNN weight matrices. We also apply a Linear Input Network (LIN) based model adaptation technique to the DNN model. The proposed speaker adaptation method yields a 2.35% improvement in average accuracy over the GMM-HMM-based ASR system. The experimental results demonstrate that the proposed DNN-HMM-based speech recognition system is more effective than the conventional GMM-HMM system for VPI speech when only a small amount of speech data is available.

Implementation of CNN in the view of mini-batch DNN training for efficient second order optimization (효과적인 2차 최적화 적용을 위한 Minibatch 단위 DNN 훈련 관점에서의 CNN 구현)

  • Song, Hwa Jeon;Jung, Ho Young;Park, Jeon Gue
    • Phonetics and Speech Sciences / v.8 no.2 / pp.23-30 / 2016
  • This paper describes implementation schemes for CNNs from the viewpoint of mini-batch DNN training, for efficient second-order optimization. The parameters of a CNN are trained with the same procedure used to update DNN parameters, simply by arranging an input image as a sequence of local patches, which is equivalent to mini-batch DNN training. Through this conversion, second-order optimization, which provides higher performance, can easily be applied to train the CNN parameters. In both image recognition on the MNIST DB and syllable automatic speech recognition, the proposed CNN implementation scheme performs better than the DNN-based one.
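
The equivalence this abstract relies on can be checked with a small pure-Python sketch: arranging an image as a batch of flattened local patches and passing that batch through a fully connected layer gives the same values as sliding the kernel over the image. This covers only the forward pass for a single filter with stride 1 and no padding; it is not the authors' implementation, and the function names are chosen here for illustration.

```python
def extract_patches(image, k):
    """Flatten every k x k patch of the image (stride 1, no padding)."""
    h, w = len(image), len(image[0])
    return [
        [image[r + dr][c + dc] for dr in range(k) for dc in range(k)]
        for r in range(h - k + 1)
        for c in range(w - k + 1)
    ]


def dense_layer(batch, weights):
    """Fully connected layer with one output unit, applied to each batch row."""
    return [sum(x * w for x, w in zip(row, weights)) for row in batch]


def conv2d_direct(image, kernel):
    """Reference sliding-window convolution (cross-correlation form)."""
    k = len(kernel)
    h, w = len(image), len(image[0])
    return [
        [
            sum(
                image[r + dr][c + dc] * kernel[dr][dc]
                for dr in range(k)
                for dc in range(k)
            )
            for c in range(w - k + 1)
        ]
        for r in range(h - k + 1)
    ]
```

Each patch plays the role of one sample in a mini-batch, and the flattened kernel plays the role of the weight vector of a DNN unit, so a parameter update procedure written for mini-batch DNN training can be reused for the convolution weights.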

Indoor Space Recognition using Super-pixel and DNN (DNN과 슈퍼픽셀을 이용한 실내 공간 인식)

  • Kim, Kisang;Choi, Hyung-Il
    • Journal of Internet Computing and Services / v.19 no.3 / pp.43-48 / 2018
  • In this paper, we propose an indoor space recognition method using a DNN and super-pixels. To recognize an indoor space from an image, a segmentation process is required to divide the image; a super-pixel algorithm is used to divide it into segments of appropriate size. To recognize each segment, features are extracted using the proposed method. The extracted features are learned by a DNN, and each segment is recognized using the DNN model. Experimental results compare the performance of the proposed method with that of existing algorithms.

TPMP: A Privacy-Preserving Technique for DNN Prediction Using ARM TrustZone (TPMP : ARM TrustZone을 활용한 DNN 추론 과정의 기밀성 보장 기술)

  • Song, Suhyeon;Park, Seonghwan;Kwon, Donghyun
    • Journal of the Korea Institute of Information Security & Cryptology / v.32 no.3 / pp.487-499 / 2022
  • Machine learning techniques such as deep learning have been widely used in recent years. Recently, deep learning has been performed in a trusted execution environment (TEE) such as ARM TrustZone to improve security on edge and embedded devices with limited computing resources. To mitigate the problem of limited TEE memory, we propose TPMP, which uses that memory efficiently through DNN model partitioning. Through optimized memory scheduling, TPMP runs in the TEE DNN models that could not be run there with existing memory scheduling methods, achieving high confidentiality for the DNN. TPMP required a similar amount of computational resources to previous methodologies.