• Title/Summary/Keyword: Xavier


TVM-based Performance Optimization for Image Classification in Embedded Systems (임베디드 시스템에서의 객체 분류를 위한 TVM기반의 성능 최적화 연구)

  • Cheonghwan Hur;Minhae Ye;Ikhee Shin;Daewoo Lee
    • IEMEK Journal of Embedded Systems and Applications / v.18 no.3 / pp.101-108 / 2023
  • Optimizing the performance of deep neural networks on embedded systems is a challenging task that requires efficient compilers and runtime systems. We propose a TVM-based approach that consists of three steps: quantization, auto-scheduling, and ahead-of-time compilation. Our approach reduces the computational complexity of models without significant loss of accuracy, and generates optimized code for various hardware platforms. We evaluate our approach on three representative CNNs using the ImageNet dataset on the NVIDIA Jetson AGX Xavier board and show that it outperforms baseline methods in terms of processing speed.
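
The quantization step named in this abstract can be illustrated with a minimal sketch. The abstract does not say which quantization scheme the authors use, so this assumes plain symmetric per-tensor int8 quantization; the function names are hypothetical and this is not TVM's API.

```python
def quantize_int8(values):
    """Symmetric per-tensor int8 quantization (an assumed scheme, not the paper's)."""
    max_abs = max(abs(v) for v in values)
    # One scale for the whole tensor maps [-max_abs, max_abs] onto [-127, 127].
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Map int8 codes back to approximate floating-point values."""
    return [c * scale for c in codes]


# Illustrative weights only; they are not taken from the paper.
weights = [0.5, -1.27, 0.02, 1.0]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

The rounding error per value is bounded by half the scale, which is one way to see why accuracy loss can stay small while arithmetic moves to 8-bit integers.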

A Study on the Improvement of YOLOv7 Inference Speed in Jetson Embedded Platform (Jetson 임베디드 플랫폼에서의 YOLOv7 추론 속도 개선에 관한 연구)

  • Bo-Chan Kang;Dong-Young Yoo
    • Proceedings of the Korea Information Processing Society Conference / 2023.11a / pp.154-155 / 2023
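  • Since the open-source YOLO (You Only Look Once) object detection algorithm was released, industry has been moving away from high-performance computers and adopting embedded systems for efficiency and for use in special environments. However, on NVIDIA's Jetson Nano, inference with the PyTorch YOLOv7 deep learning model does not run. Optimization for the limited power, memory, and compute capability is therefore essential. This paper describes the optimization process for deploying deep learning models on NVIDIA's Jetson embedded platforms (Xavier NX, Orin AGX, and Nano), converts PyTorch YOLOv7 models of various sizes to TensorRT on each platform, and measures and compares FPS (Frames Per Second). The measurements show that, for YOLOv7 inference on each embedded platform, TensorRT had about 4.1 times less FPS variability and about 2.25 times higher FPS than PyTorch.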
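
The kind of comparison this abstract reports (FPS speedup and FPS variability) can be sketched as follows. The abstract does not define "variability", so the sketch assumes the sample standard deviation of per-frame FPS; the timing values are made-up placeholders, not measurements from the paper.

```python
import statistics


def fps_summary(frame_times_s):
    """Return (mean FPS, standard deviation of FPS) from per-frame latencies in seconds."""
    fps = [1.0 / t for t in frame_times_s]
    return statistics.mean(fps), statistics.stdev(fps)


# Hypothetical per-frame latencies in seconds, for illustration only.
pytorch_times = [0.10, 0.125, 0.08, 0.10]
tensorrt_times = [0.045, 0.044, 0.046, 0.045]

pt_mean, pt_std = fps_summary(pytorch_times)
trt_mean, trt_std = fps_summary(tensorrt_times)

speedup = trt_mean / pt_mean          # how many times faster TensorRT runs
variability_ratio = pt_std / trt_std  # how many times steadier TensorRT is
```

In practice the latencies would be collected over many frames after a warm-up period, since the first few inferences usually include one-off initialization cost.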

An implicit damage-plastic model for concrete

  • Gustavo Luz Xavier da Costa
    • Computers and Concrete / v.33 no.3 / pp.301-308 / 2024
  • This paper proposes a numerically-based methodology to implicitly model irreversible deformations in concrete through a damage model. Plasticity theory is not explicitly employed, although resemblances are still present. A scalar isotropic damage model is adopted and the damage variable is split in two: one part contributing to stiffness degradation (cracking) and the other contributing to irreversible deformations (plasticity). The proposed methodology is thermodynamically consistent, as it consists of a damage model rewritten in different terms. Its Finite Element coding is presented, indicating that only minor changes are necessary. It is also demonstrated that nonlinear algorithms are unnecessary to model concrete cracking and plasticity. Experimental data from direct tension and four-point bending tests under cyclic loading are compared to the proposed methodology. A numerical case study of low-cycle fatigue is also presented. It can be concluded that the model is simple, feasible and capable of capturing the essentials of cracking and plasticity.

Unilateral caudate infarct following pituitary adenoma resection

  • Xavier Wong-Achi;Luis Rodriguez-Hernandez;Jose Herrera-Castro;Marcos Sangrador-Deitos;Juan Luis Gomez-Amador;Ulises Garcia-Gonzalez
    • Journal of Cerebrovascular and Endovascular Neurosurgery / v.26 no.2 / pp.210-215 / 2024
  • Cerebral ischemic complications after pituitary surgery are not frequently reported. Multiple mechanisms have been proposed, including vasospasm, and delayed cerebral ischemia resulting from postoperative subarachnoid bleeding. Given the unknown etiology of vasospasm following these situations, little is known about its prevention. Through a case report and bibliographic review, the authors warn about the importance of recognizing key signs postoperatively that could indicate increased risk for cerebral vasospasm and must be recognized in a timely manner, with appropriate treatment strategies implemented once these symptoms present.

Initialization by using truncated distributions in artificial neural network (절단된 분포를 이용한 인공신경망에서의 초기값 설정방법)

  • Kim, MinJong;Cho, Sungchul;Jeong, Hyerin;Lee, YungSeop;Lim, Changwon
    • The Korean Journal of Applied Statistics / v.32 no.5 / pp.693-702 / 2019
  • Deep learning has gained popularity for classification and prediction tasks. Neural networks become deeper as more data becomes available. Saturation is the phenomenon in which the gradient of an activation function approaches 0; it can happen when weight values are too large. The issue of saturation, which limits the ability of weights to learn, has received increasing attention. To resolve this problem, Glorot and Bengio (Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, 249-256, 2010) claimed that efficient neural network training is possible when data flows in varied ways between layers, and argued that the variance of the output of each layer and the variance of its input should be equal. They proposed an initialization method that satisfies this condition. In this paper, we propose a new initialization method that adopts the truncated normal distribution and the truncated Cauchy distribution. We decide where to truncate the distribution while adapting the initialization method of Glorot and Bengio (2010). The output and input variances are made equal by setting them to the variance of the truncated distribution. This shapes the distribution so that the initial weights neither grow too large nor fall too close to zero. To compare the performance of our proposed method with existing methods, we conducted experiments on MNIST and CIFAR-10 data using DNN and CNN. Our proposed method outperformed existing methods in terms of accuracy.
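
The idea in this abstract of matching a truncated distribution's variance to the Glorot and Bengio target can be sketched as below. This is an illustration under stated assumptions, not the authors' procedure: it covers only the truncated normal case, assumes the target variance 2 / (fan_in + fan_out), and uses a hypothetical cutoff `a` (in standard deviations), since the abstract does not give the truncation point.

```python
import math
import random


def truncated_std_normal_variance(a):
    """Variance of a standard normal restricted to [-a, a]."""
    pdf = math.exp(-0.5 * a * a) / math.sqrt(2.0 * math.pi)
    mass = math.erf(a / math.sqrt(2.0))  # P(-a <= Z <= a)
    return 1.0 - 2.0 * a * pdf / mass


def truncated_glorot_weights(fan_in, fan_out, a=2.0, rng=random):
    """Sample fan_in * fan_out weights from a truncated normal whose variance
    equals the assumed Glorot target 2 / (fan_in + fan_out)."""
    target_var = 2.0 / (fan_in + fan_out)
    # Truncation shrinks the variance, so the scale is inflated to compensate.
    sigma = math.sqrt(target_var / truncated_std_normal_variance(a))
    weights = []
    while len(weights) < fan_in * fan_out:
        z = rng.gauss(0.0, 1.0)
        if abs(z) <= a:  # rejection sampling: discard draws outside the cutoff
            weights.append(sigma * z)
    return weights
```

For example, `truncated_glorot_weights(100, 100, rng=random.Random(0))` yields 10,000 weights bounded in magnitude by `a * sigma`, with sample variance close to the 0.01 target.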

A Study on Family Relations Drawn in Xavier Dolan's "It's Only the End of the World" (자비에 돌란의 <단지 세상의 끝>에 그려진 가족관계 연구)

  • Kim, Tae-Hyung
    • Journal of the Korea Academia-Industrial cooperation Society / v.20 no.12 / pp.622-628 / 2019
  • "It's Only the End of the World" is a work that expresses in depth the perception of, attitude toward, and reflection on a person's death. Built on the simple story of a protagonist who has been diagnosed with AIDS and visits his hometown for the first time in 12 years, this work constantly asks us what a family is. The visit, which aims to tell his family of his condition, restore his relationship with them as he had wished, and above all part from them gracefully, instead reveals the resentment, hatred and criticism among family members that they had been hiding or trying to accept. Must family relationships always be understood, forgiven and cared for? The director looks into the abyss of the relationship and reveals the painful truth we wanted to hide. And we realize that this painful truth is a reality. Louis's negative stance, and the complaints and dissatisfaction of the family members who were waiting for him, were wholly inadequate to narrow the gap. This family, whose members each carry a wound and do not really understand one another, shows a deep bond of feelings toward each other, though they are tied together in a 'family' community.

Implementation of 3D Road Surface Monitoring System for Vehicle based on Line Laser (선레이저 기반 이동체용 3차원 노면 모니터링 시스템 구현)

  • Choi, Seungho;Kim, Seoyeon;Kim, Taesik;Min, Hong;Jung, Young-Hoon;Jung, Jinman
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.20 no.6 / pp.101-107 / 2020
  • Road surface measurement is an essential process for quantifying the degree and displacement of roughness in road surface management. For safer road surface management and quick maintenance, it is important to measure the road surface accurately from a sensor mounted on a vehicle. In this paper, we propose a sophisticated road surface measurement system that can take measurements from a moving vehicle. The proposed system supports more accurate measurement of the road surface by using a high-performance line laser sensor. It can also measure the transverse and longitudinal profiles by matching the position information acquired from the RTK, and the velocity adaptive update algorithm allows a manager to monitor the road in real time. To evaluate the proposed system, the Gocator laser sensor, MRP module, and NVIDIA Xavier processor were mounted on a test vehicle and tested on the road surface. Our evaluation results demonstrate that our system measures the profile accurately, based on the MSE. Our proposed system can be used not only for evaluating the condition of roads but also for evaluating the impact of adjacent excavation.

A Performance Study on CPU-GPU Data Transfers of Unified Memory Device (통합메모리 장치에서 CPU-GPU 데이터 전송성능 연구)

  • Kwon, Oh-Kyoung;Gu, Gibeom
    • KIPS Transactions on Computer and Communication Systems / v.11 no.5 / pp.133-138 / 2022
  • Recently, as GPU performance has improved, GPUs have become more common in HPC and artificial intelligence, but GPU programming remains a major obstacle to productivity. In particular, because managing host memory and GPU memory separately is difficult, convenience and performance are being actively researched, and various CPU-GPU memory transfer programming methods have been proposed. Meanwhile, many SoC (System on a Chip) products that bundle a CPU, a GPU, and integrated memory into one large silicon package, such as Apple M1 and NVIDIA Tegra, have recently emerged. This study examines the performance of data transfers between the CPU and GPU on such integrated memory devices, which show different characteristics from the existing environment in which host memory and GPU memory are separated. We compare performance by CPU-GPU data transfer method on NVIDIA SoC chips, which are integrated memory devices, and on NVIDIA SMX-based V100 GPU devices. The experimental workload for the comparison is a two-dimensional matrix transposition example frequently used in HPC applications. We analyzed the following performance factors: the difference in GPU kernel performance according to the CPU-GPU memory transfer method for each GPU device, the transfer performance difference between page-locked memory and pageable memory, overall performance comparison, and performance comparison by workload size. Through this experiment, it was confirmed that the NVIDIA Xavier can maximize the benefits of integrated memory in the SoC chip by supporting I/O cache consistency.

Computational Materials Engineering: Recent Applications of VASP in the MedeA® Software Environment

  • Wimmer, Erich;Christensen, Mikael;Eyert, Volker;Wolf, Walter;Reith, David;Rozanska, Xavier;Freeman, Clive;Saxe, Paul
    • Journal of the Korean Ceramic Society / v.53 no.3 / pp.263-272 / 2016
  • Electronic structure calculations have become a powerful foundation for computational materials engineering. Four major factors have enabled this unprecedented evolution, namely (i) the development of density functional theory (DFT), (ii) the creation of highly efficient computer programs to solve the Kohn-Sham equations, (iii) the integration of these programs into productivity-oriented computational environments, and (iv) the phenomenal increase of computing power. In this context, we describe recent applications of the Vienna Ab-initio Simulation Package (VASP) within the MedeA® computational environment, which provides interoperability with a comprehensive range of modeling and simulation tools. The focus is on technological applications including microelectronic materials, Li-ion batteries, high-performance ceramics, silicon carbide, and Zr alloys for nuclear power generation. A discussion of current trends including high-throughput calculations concludes this article.

Should Male Circumcision be Advocated for Genital Cancer Prevention?

  • Morris, Brian J.;Mindel, Adrian;Tobian, Aaron A.R.;Hankins, Catherine A.;Gray, Ronald H.;Bailey, Robert C.;Bosch, Xavier;Wodak, Alex D.
    • Asian Pacific Journal of Cancer Prevention / v.13 no.9 / pp.4839-4842 / 2012
  • The recent policy statement by the Cancer Council of Australia on infant circumcision and cancer prevention and the announcement that the quadrivalent human papillomavirus (HPV) vaccine will be made available for boys in Australia prompted us to provide an assessment of genital cancer prevention. While HPV vaccination of boys should help reduce anal cancer in homosexual men and cervical cancer in women, it will have little or no impact on penile or prostate cancer. Male circumcision can reduce cervical, penile and possibly prostate cancer. Promotion of both HPV vaccination and male circumcision will synergistically maximize genital cancer prevention.