Rapid and Brief Communication: GPU implementation of neural networks

  • Oh, Kyoung-Su (School of Media, College of Information Science, Soongsil University)
  • Jung, Kee-Chul (School of Media, College of Information Science, Soongsil University)
  • Published : 2007.02.05

Abstract

A graphics processing unit (GPU) is used to speed up an artificial neural network. The GPU implements the matrix multiplication of the neural network, improving the running time of a text detection system. Preliminary results show a 20-fold performance improvement on an ATI RADEON 9700 PRO board. The parallelism of the GPU is fully utilized by accumulating many input feature vectors and weight vectors and then converting the many inner-product operations into a single matrix operation. Areas for further research include benchmarking performance on various hardware and developing GPU-aware learning algorithms. © 2004 Pattern Recognition Society. Published by Elsevier Ltd. All rights reserved.
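The batching idea in the abstract can be illustrated with a minimal sketch. It is not the authors' GPU code: it runs on the CPU in plain Python, and the function names (`layer_one_at_a_time`, `layer_batched`, `matmul`) and the toy numbers are assumptions made for illustration. It only shows that computing one inner product per (input vector, weight vector) pair gives the same result as stacking the inputs into a matrix X and the weights into a matrix W and evaluating the single product X·Wᵀ, which is the form a parallel device can process in one pass.

```python
def layer_one_at_a_time(weights, inputs):
    """Compute every neuron response as a separate inner product."""
    outputs = []
    for x in inputs:                      # one input feature vector at a time
        row = []
        for w in weights:                 # one weight vector (neuron) at a time
            row.append(sum(wi * xi for wi, xi in zip(w, x)))
        outputs.append(row)
    return outputs


def matmul(a, b):
    """Plain matrix product of a (n x k) and b (k x m), as nested lists."""
    cols = list(zip(*b))                  # columns of b
    return [[sum(p * q for p, q in zip(row, col)) for col in cols] for row in a]


def layer_batched(weights, inputs):
    """Stack inputs as rows of X and compute X * W^T as one matrix operation."""
    w_transposed = [list(col) for col in zip(*weights)]   # features x neurons
    return matmul(inputs, w_transposed)


# Toy data (hypothetical): 2 neurons with 3 weights each, 3 accumulated inputs.
weights = [[1, 2, 3],
           [4, 5, 6]]
inputs = [[1, 0, 1],
          [0, 1, 0],
          [2, 2, 2]]

separate = layer_one_at_a_time(weights, inputs)
batched = layer_batched(weights, inputs)
print(separate == batched)   # True
print(batched)               # [[4, 10], [2, 5], [12, 30]]
```

Each row of the result holds the responses of all neurons to one input vector. Accumulating more input vectors only adds rows to X, so the amount of work per matrix operation grows while the number of operations stays at one.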
