Search | Korea Science

PartitionTuner: An operator scheduler for deep-learning compilers supporting multiple heterogeneous processing units

Misun Yu;Yongin Kwon;Jemin Lee;Jeman Park;Junmo Park;Taeho Kim
- ETRI Journal / v.45 no.2 / pp.318-328 / 2023
Embedded systems such as mobile platforms now include multiple processing units (PUs) that can operate in parallel, such as central processing units (CPUs) and neural processing units (NPUs). Deep-learning compilers can generate machine code optimized for these embedded systems from a deep neural network (DNN). However, the deep-learning compilers proposed so far generate either code that executes DNN operators sequentially on a single processing unit or parallel code for graphics processing units (GPUs). In this study, we propose PartitionTuner, an operator scheduler for deep-learning compilers that supports multiple heterogeneous PUs, including CPUs and NPUs. PartitionTuner generates an operator-scheduling plan that uses all available PUs simultaneously to minimize overall DNN inference time. Operator scheduling is based on an analysis of the DNN architecture and on performance profiles of individual operators and operator groups measured on the heterogeneous PUs. In experiments on seven DNNs, PartitionTuner generates scheduling plans that perform 5.03% better than a static type-based operator-scheduling technique for SqueezeNet. In addition, PartitionTuner outperforms recent profiling-based operator-scheduling techniques for ResNet50, ResNet18, and SqueezeNet by 7.18%, 5.36%, and 2.73%, respectively.
https://doi.org/10.4218/etrij.2021-0446

Test Suit Generation System for Retargetable C Compilers (재겨냥성 C 컴파일러를 위한 테스트 집합 생성 시스템)

Woo, Gyun;Bae, Jung-Ho;Jang, Han-Il;Lee, Yun-Jung;Chae, Heung-Seok
- The KIPS Transactions:PartA / v.16A no.4 / pp.245-254 / 2009
With the increasing adoption of embedded processors, the need to develop compilers for them in a timely manner is also growing. Retargeting, which constructs a new compiler by modifying the back-end of an existing one, has been adopted as a viable approach. This paper proposes a test suite generation system for testing retargetable C compilers. The proposed system generates the test suite using the concept of grammar coverage. A test suite that satisfies grammar coverage of the source language is generally very large, so the proposed system also provides a facility to reduce its size. According to the experimental results, the reduced test suite detects 75% of the compiler faults detected by the original test suite, although it is on average only 10% of the original's size. This result indicates that the proposed reduction technique can be used effectively in the early phase of embedded compiler development.
https://doi.org/10.3745/KIPSTA.2009.16-A.4.245

Development of a Code Generation Support System in Integrated Development Environment of an Educational Compiler

Kwon, Jung-Hoon;Bae, Jong-Min
- Journal of the Korea Society of Computer and Information / v.21 no.11 / pp.159-166 / 2016
The compiler course is one of the important courses in computer science. Because of its broad coverage and complexity, it requires a more efficient learning environment. One solution is to provide an integrated development environment for educational compilers that enables practice-oriented classes and increases students' interest. This paper presents a code generation support system developed within an integrated development environment for an educational compiler. The system helps students understand the process of code generation and visualizes the relation among the source language, the AST, and the target language. It makes it easier for students to develop their own compilers.
https://doi.org/10.9708/jksci.2016.21.11.159

Comparison of two retargetable compilers: GCC and SoarGen

Zhiwen, Zheng;Ahn, Minwook;Youn, Jonghee M.;Kim, Yongjoo;Kwon, Yongin;Paek, Yunheung
- Proceedings of the Korea Information Processing Society Conference / 2009.11a / pp.17-18 / 2009
This paper presents an empirical comparison of two retargetable compilers, GCC and SoarGen. SoarGen is our retargetable compiler. According to our experimental results, targeting ODALRISC with SoarGen is easier and faster than with GCC. The average retarget time of SoarGen is much less than that of GCC.
https://doi.org/10.3745/PKIPS.y2009m11a.17

Optimization of the computing environment to improve the speed of the modeling (WRF and CMAQ) calculation of the National Air Quality Forecast System (국가 대기질 예보 시스템의 모델링(기상 및 대기질) 계산속도 향상을 위한 전산환경 최적화 방안)

Myoung, Jisu;Kim, Taehee;Lee, Yonghee;Suh, Insuk;Jang, Limsuk
- Journal of Environmental Science International / v.27 no.8 / pp.723-735 / 2018
Because the national air quality forecast requires reliable model data very quickly, this study investigates an optimal configuration for the modeling system by varying the types of compilers and libraries and the number of CPU cores. We set up twelve experimental configurations combining compilers (PGI and Intel), MPI libraries (mvapich-2.0, mvapich-2.2, and mpich-3.2), and NetCDF versions (NetCDF-3.6.3 and NetCDF-4.1.3), and measured wall-clock time for the WRF and CMAQ models on the computing resources we built. Across compiler and library types, WRF (30 min 30 s) and CMAQ (47 min 22 s) performed best with the combination of the Intel compiler, mvapich-2.0, and NetCDF-3.6.3. In the optimization by number of CPU cores, the WRF model performed best with 140 cores (five calculation servers) and the CMAQ model with 120 cores (five calculation servers). The WRF model showed clear differences depending on the number of CPU cores rather than on the types of compilers and libraries, whereas the CMAQ model showed its largest differences across combinations of compilers and libraries.
https://doi.org/10.5322/JESI.2018.27.8.723

Effective Integer Promotion Bug Detection Technique for Embedded Software (효과적인 내장형 소프트웨어의 정수 확장 (Integer Promotion) 버그 검출 기법)

Kim, Yunho;Kim, Taejin;Kim, Moonzoo;Lee, Ho-jung;Jang, Hoon;Park, Mingyu
- Journal of KIISE / v.43 no.6 / pp.692-699 / 2016
C compilers for the 8-bit MCUs used in washing machines and refrigerators often deviate from the C standard to improve runtime performance. Developers who are unaware of the differences between standard-conforming C compilers and C compilers for 8-bit MCUs can introduce bugs that do not appear in a standard C environment but do appear in embedded systems using 8-bit MCUs. Bug detectors that assume a standard C environment have difficulty detecting such bugs. In this paper, we introduce integer promotion bugs, which arise because the integer promotion rules of C compilers for 8-bit MCUs differ from those of the C standard, and propose five bug patterns in which integer promotion bugs occur. We have developed an integer promotion bug detection tool and applied it to washing machine control software developed by LG Electronics. The tool successfully detected 27 integer promotion bugs in the washing machine control software.
https://doi.org/10.5626/JOK.2016.43.6.692
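
For readers unfamiliar with integer promotion, the following is a minimal sketch of the kind of discrepancy the abstract above describes. It is not taken from the paper and is not claimed to be one of its five bug patterns. In standard C, `uint8_t` operands are promoted to `int` before arithmetic. The function `sum_without_promotion` is a hypothetical model, written in standard C with an explicit cast, of a compiler that instead performs the addition in 8 bits. All function names here are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Standard C: both uint8_t operands are promoted to int before the
   addition, so the sum is not truncated to 8 bits. */
unsigned sum_with_promotion(uint8_t a, uint8_t b) {
    return (unsigned)(a + b);
}

/* ASSUMPTION: models a non-conforming compiler that adds in 8 bits.
   The explicit cast truncates the promoted sum modulo 256. */
unsigned sum_without_promotion(uint8_t a, uint8_t b) {
    return (uint8_t)(a + b);
}

/* Hypothetical range check as it behaves under standard promotion. */
int exceeds_limit_standard(uint8_t a, uint8_t b, unsigned limit) {
    return sum_with_promotion(a, b) > limit;
}

/* The same check under the assumed 8-bit arithmetic model. */
int exceeds_limit_8bit_model(uint8_t a, uint8_t b, unsigned limit) {
    return sum_without_promotion(a, b) > limit;
}

/* Prints both results for one input pair and checks that they differ
   for the values used here. */
void demo_promotion_difference(void) {
    uint8_t a = 200, b = 100;
    unsigned promoted = sum_with_promotion(a, b);
    unsigned truncated = sum_without_promotion(a, b);
    printf("with promotion: %u, 8-bit model: %u\n", promoted, truncated);
    assert(promoted != truncated);
}
```

Calling `demo_promotion_difference()` from a `main` function shows the two sums diverging for 200 + 100. A check such as `exceeds_limit_standard(200, 100, 250)` is true under standard promotion, while the 8-bit model reports false, which is the sort of silent behavioral difference a detector assuming standard C would miss.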

Development of an Eclipse-based IDE for Educational Compilers (이클립스 기반의 교육용 컴파일러 통합개발환경)

Sung, U-Kyung;Kang, Hyun-Syug;Bae, Jong-Min
- The Journal of Korean Association of Computer Education / v.14 no.5 / pp.9-18 / 2011
Compiler development projects, which are designed and taught in compiler courses, allow students to practice and gain valuable experience and techniques in developing compilers. However, both instructors and students face difficulties: hands-on time during an academic year is often insufficient, and compilers involve a relatively high level of technology. In addition, the target systems of most educational compilers use interpreter-based technologies, which do little to draw students' attention. As a result, compiler courses often end up being more theoretical than practical. This paper presents a new integrated development environment (IDE) that helps overcome these difficulties and allows students to obtain both theoretical and practical knowledge more efficiently. The development environment includes a reference compiler with Mindstorms® NXT robots as the target system, a compiler development tool, a target language test tool, and a code generation visualizer. It is developed as a plug-in for the popular Eclipse IDE, which enables easy access and great expandability. This integrated development environment allows students to understand compilers better and start their development faster.

Performance Comparison between LLVM and GCC Compilers for the AE32000 Embedded Processor

Park, Chanhyun;Han, Miseon;Lee, Hokyoon;Cho, Myeongjin;Kim, Seon Wook
- IEIE Transactions on Smart Processing and Computing / v.3 no.2 / pp.96-102 / 2014
The embedded processor market has grown rapidly and consistently with the appearance of mobile devices. In an embedded system, power consumption and execution time are important factors affecting performance. System performance is determined by both hardware and software. Even when the hardware architecture is high-end, software runs slowly if the code quality is low. This study compared the performance of two major compilers, LLVM and GCC, on a 32-bit EISC embedded processor. The dynamic instruction counts and static code sizes produced by these compilers were evaluated with the EEMBC benchmarks. LLVM generally performed better in the ALU-intensive benchmarks, whereas GCC produced better register allocation and jump optimization. The dynamic instruction count and static code size of GCC were on average 8% and 7% lower than those of LLVM, respectively.
https://doi.org/10.5573/IEIESPC.2014.3.2.96

An XML Compiler Generator using Object Oriented Attribute Grammar and SML (객체지향 속성 문법과 SML을 이용한 XML 컴파일러 생성기)

Choi, Jong-Myung;Yoo, Chae-Woo
- The KIPS Transactions:PartA / v.11A no.2 / pp.149-158 / 2004
XML, a standard for representing data and document structure, is widely used in every area, and XML compilers must be written to process XML documents according to a user's intention. Because writing XML compilers by hand takes time and money, generators that produce XML compilers automatically are needed. In this paper, we introduce an XML compiler generator named XCC. It reads a DTD and semantic rules, and it generates an XML compiler and Java classes that correspond to the elements defined in the DTD.
https://doi.org/10.3745/KIPSTA.2004.11A.2.149

Trends of Compiler Development for AI Processor (인공지능 프로세서 컴파일러 개발 동향)

Kim, J.K.;Kim, H.J.;Cho, Y.C.P.;Kim, H.M.;Lyuh, C.G.;Han, J.;Kwon, Y.
- Electronics and Telecommunications Trends / v.36 no.2 / pp.32-42 / 2021
The rapid growth of deep-learning applications has spurred R&D on artificial intelligence (AI) processors. A dedicated software framework, such as a compiler and runtime APIs, is required to achieve maximum processor performance. There are various compilers and frameworks for AI training and inference. In this study, we present the features and characteristics of AI compilers, training frameworks, and inference engines. In addition, we focus on the internals of compiler frameworks, which are based on either basic linear algebra subprograms or an intermediate representation. For in-depth insight, we present the compiler infrastructure, internal components, and operation flow of ETRI's "AI-Ware." The significant role of the software framework is evidenced by the optimized neural processing unit code that the compiler produces after various optimization passes, such as scheduling, architecture-considering optimization, schedule selection, and power optimization. We conclude the study with thoughts about the future of state-of-the-art AI compilers.
https://doi.org/10.22648/ETRI.2021.J.360204
