Integrated Search | Korea Science

PartitionTuner: An operator scheduler for deep-learning compilers supporting multiple heterogeneous processing units

Misun Yu;Yongin Kwon;Jemin Lee;Jeman Park;Junmo Park;Taeho Kim
- ETRI Journal
- Vol. 45, No. 2
- pp. 318-328
- 2023
Embedded systems such as mobile platforms now have multiple processing units (PUs) that can operate in parallel, such as central processing units (CPUs) and neural processing units (NPUs). Deep-learning compilers can generate machine code optimized for these embedded systems from a deep neural network (DNN). However, the deep-learning compilers proposed so far generate either code that executes DNN operators sequentially on a single processing unit or parallel code for graphics processing units (GPUs). In this study, we propose PartitionTuner, an operator scheduler for deep-learning compilers that supports multiple heterogeneous PUs, including CPUs and NPUs. PartitionTuner can generate an operator-scheduling plan that uses all available PUs simultaneously to minimize overall DNN inference time. Operator scheduling is based on an analysis of the DNN architecture and on the performance profiles of individual operators and operator groups measured on heterogeneous PUs. In experiments on seven DNNs, PartitionTuner generates scheduling plans that perform 5.03% better than a static type-based operator-scheduling technique for SqueezeNet. In addition, PartitionTuner outperforms recent profiling-based operator-scheduling techniques for ResNet50, ResNet18, and SqueezeNet by 7.18%, 5.36%, and 2.73%, respectively.
https://doi.org/10.4218/etrij.2021-0446
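
The idea of scheduling operators from per-PU performance profiles can be illustrated with a small sketch. This is not PartitionTuner's algorithm, which the abstract does not detail; it is a generic greedy earliest-finish-time list scheduler. The function name `schedule_operators`, the operator names, and the latency values are all hypothetical, and data-transfer costs between PUs are ignored.

```python
from typing import Dict, List, Tuple


def schedule_operators(
    ops: List[str],
    deps: Dict[str, List[str]],
    profile: Dict[str, Dict[str, float]],
) -> Tuple[Dict[str, str], float]:
    """Assign each operator to the PU on which it finishes earliest.

    ops      -- operator names in topological order (assumed, not checked)
    deps     -- deps[op] lists the operators whose outputs op consumes
    profile  -- profile[op][pu] is the profiled latency of op on pu;
                every op needs at least one PU entry
    Returns the operator-to-PU plan and the resulting makespan.
    """
    pu_free: Dict[str, float] = {}  # time at which each PU becomes idle
    finish: Dict[str, float] = {}   # finish time of each scheduled operator
    plan: Dict[str, str] = {}
    for op in ops:
        # An operator can start only after all of its inputs are ready.
        ready = max((finish[d] for d in deps.get(op, [])), default=0.0)
        best_pu, best_finish = None, float("inf")
        for pu, latency in profile[op].items():
            start = max(ready, pu_free.get(pu, 0.0))
            if start + latency < best_finish:
                best_pu, best_finish = pu, start + latency
        plan[op] = best_pu
        finish[op] = best_finish
        pu_free[best_pu] = best_finish
    return plan, max(finish.values(), default=0.0)


if __name__ == "__main__":
    # Hypothetical two-branch graph with made-up latencies.
    ops = ["conv1", "branch_a", "branch_b", "concat"]
    deps = {"branch_a": ["conv1"], "branch_b": ["conv1"],
            "concat": ["branch_a", "branch_b"]}
    profile = {
        "conv1": {"cpu": 4.0, "npu": 1.0},
        "branch_a": {"cpu": 3.0, "npu": 2.0},
        "branch_b": {"cpu": 3.0, "npu": 2.0},
        "concat": {"cpu": 1.0},
    }
    print(schedule_operators(ops, deps, profile))
```

In this toy graph the two branches run on different PUs at the same time, so the makespan is 5.0 instead of the 6.0 obtained when both branches are serialized on the NPU.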

Test Suite Generation System for Retargetable C Compilers (재겨냥성 C 컴파일러를 위한 테스트 집합 생성 시스템)

우균;배정호;장한일;이윤정;채흥석
- 정보처리학회논문지A
- Vol. 16A, No. 4
- pp. 245-254
- 2009
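As the use of embedded processors grows, so does the need to develop compilers for them in a timely manner. Retargeting, which builds a new compiler by modifying the back end of an existing one, has been adopted as a suitable technique. This paper proposes a test suite generation system for testing retargetable C compilers. The proposed system generates test suites using the concept of grammar coverage. In general, generating a test suite from the grammar of the source programming language yields a very large test suite. When a compiler must be released quickly, however, the sheer size of the test suite can be a problem. The proposed system therefore includes a feature that reduces the test suite by taking the intermediate code into account. According to the experimental results, although the reduced test suite is on average only 10% of the size of the original, it detects about 75% of the compiler errors that the original test suite can detect. This indicates that the proposed reduction technique can be used effectively in the early stages of embedded compiler development.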
https://doi.org/10.3745/KIPSTA.2009.16-A.4.245 KSCI
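
Coverage-based test suite reduction can be sketched as a greedy set-cover selection. The abstract does not state the paper's actual reduction criterion, so the sketch below is only an assumption-laden illustration: each test is mapped to the set of intermediate-code items it exercises, and tests are picked until every item is covered. The function name `reduce_test_suite`, the test names, and the item names are hypothetical.

```python
def reduce_test_suite(coverage):
    """Greedily pick tests until every covered item is hit at least once.

    coverage -- dict mapping a test name to the set of items it covers
                (for example, intermediate-code instructions; assumed here)
    Returns the selected test names in the order they were chosen.
    """
    uncovered = set().union(*coverage.values())
    selected = []
    while uncovered:
        # Choose the test that covers the most still-uncovered items;
        # ties are broken by test name so the result is deterministic.
        best = max(sorted(coverage), key=lambda t: len(coverage[t] & uncovered))
        selected.append(best)
        uncovered -= coverage[best]
    return selected


if __name__ == "__main__":
    # Hypothetical coverage data.
    coverage = {
        "t1": {"ADD", "LOAD"},
        "t2": {"ADD"},
        "t3": {"MUL", "STORE", "LOAD"},
        "t4": {"STORE"},
    }
    print(reduce_test_suite(coverage))  # ['t3', 't1']
```

Greedy selection does not guarantee the smallest possible suite, but it is simple and keeps every item covered by the original suite.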

Development of a Code Generation Support System in Integrated Development Environment of an Educational Compiler

Kwon, Jung-Hoon;Bae, Jong-Min
- 한국컴퓨터정보학회논문지
- Vol. 21, No. 11
- pp. 159-166
- 2016
The compiler course is one of the important courses in computer science. Because of its broad coverage and complexity, it requires a more efficient learning environment. One solution is to provide an integrated development environment for educational compilers that enables practice-oriented classes and increases students' interest. This paper presents a code generation support system developed within an integrated development environment for an educational compiler. Our system helps students understand the code generation process and visualizes the relationships among the source language, the AST, and the target language. It makes it easier for students to develop their own compilers.
https://doi.org/10.9708/jksci.2016.21.11.159 KSCI

Comparison of two retargetable compilers: GCC and SoarGen

Zhiwen, Zheng;Ahn, Minwook;Youn, Jonghee M.;Kim, Yongjoo;Kwon, Yongin;Paek, Yunheung
- 한국정보처리학회:학술대회논문집
- Korea Information Processing Society (한국정보처리학회) 2009 Fall Conference
- pp. 17-18
- 2009
This paper presents an empirical comparison of two retargetable compilers, GCC and SoarGen. SoarGen is our retargetable compiler. According to our experimental results, retargeting SoarGen to ODALRISC proved easier and faster than retargeting GCC. The average retargeting time of SoarGen is much less than that of GCC.
https://doi.org/10.3745/PKIPS.y2009m11a.17

Optimization of the computing environment to improve the speed of the modeling (WRF and CMAQ) calculation of the National Air Quality Forecast System (국가 대기질 예보 시스템의 모델링(기상 및 대기질) 계산속도 향상을 위한 전산환경 최적화 방안)

명지수;김태희;이용희;서인석;장임석
- 한국환경과학회지
- Vol. 27, No. 8
- pp. 723-735
- 2018
Because the national air quality forecast requires reliable model data to be delivered very quickly, this study investigated the optimal configuration of the modeling system through optimization experiments that varied the compilers, the libraries, and the number of CPU cores. We set up twelve experimental configurations from combinations of compilers (PGI and Intel), MPI libraries (mvapich-2.0, mvapich-2.2, and mpich-3.2), and NetCDF versions (NetCDF-3.6.3 and NetCDF-4.1.3), and measured the wall-clock time of the WRF and CMAQ models on the computing resources we built. Across compiler and library types, WRF (30 min 30 s) and CMAQ (47 min 22 s) performed best with the combination of the Intel compiler, mvapich-2.0, and NetCDF-3.6.3. In the experiments on the number of CPU cores, the WRF model performed best with 140 cores (five calculation servers) and the CMAQ model with 120 cores (five calculation servers). The WRF model showed clearer differences depending on the number of CPU cores than on the types of compilers and libraries, whereas the CMAQ model showed its largest differences across combinations of compilers and libraries.
https://doi.org/10.5322/JESI.2018.27.8.723 KSCI
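
The count of twelve experiments follows from combining two compilers, three MPI libraries, and two NetCDF versions (2 x 3 x 2 = 12). A minimal sketch of enumerating that grid is given below; the option names come from the abstract, while the function name `build_configurations` is hypothetical and no timing data is included.

```python
from itertools import product

# Options named in the abstract.
COMPILERS = ["PGI", "Intel"]
MPI_LIBRARIES = ["mvapich-2.0", "mvapich-2.2", "mpich-3.2"]
NETCDF_VERSIONS = ["NetCDF-3.6.3", "NetCDF-4.1.3"]


def build_configurations():
    """Return every (compiler, MPI library, NetCDF version) combination."""
    return list(product(COMPILERS, MPI_LIBRARIES, NETCDF_VERSIONS))


if __name__ == "__main__":
    configs = build_configurations()
    print(len(configs))  # 12
    for compiler, mpi, netcdf in configs:
        print(compiler, mpi, netcdf)
```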

Effective Integer Promotion Bug Detection Technique for Embedded Software (효과적인 내장형 소프트웨어의 정수 확장 (Integer Promotion) 버그 검출 기법)

김윤호;김태진;김문주;이호정;장훈;박민규
- 정보과학회 논문지
- Vol. 43, No. 6
- pp. 692-699
- 2016
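C compilers for the 8-bit MCUs embedded in home appliances such as washing machines and refrigerators may compile code without following standard C language rules in order to speed up software execution. If developers do not precisely understand the differences between an ordinary C compiler and a C compiler for 8-bit MCUs, they can introduce bugs that do not occur in a standard C environment but do occur in embedded systems using 8-bit MCUs, and such bugs are difficult to find with bug detection tools that assume a standard C environment. This paper introduces the integer promotion bugs that arise when using 8-bit MCU compilers that do not follow the standard C integer promotion rules, and proposes five bug patterns for detecting them. We developed a tool that detects these integer promotion bug patterns and used it to analyze LG Electronics washing machine software, finding 27 integer promotion bugs that occur when the wrong compiler options are selected.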
https://doi.org/10.5626/JOK.2016.43.6.692 KSCI
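
To make the effect of skipping integer promotion concrete, the following sketch models the arithmetic in Python. In standard C, two `unsigned char` operands are promoted to `int` before addition, so `200 + 100` yields 300; if a compiler performs the addition in 8 bits instead, the result wraps to 44. The function names are hypothetical, the masking only simulates 8-bit behavior, and the example is not claimed to be one of the paper's five bug patterns.

```python
def add_u8_standard(a, b):
    """Standard C semantics: 8-bit operands are promoted to int, so no wrap."""
    return a + b


def add_u8_no_promotion(a, b):
    """Model of an 8-bit addition without promotion: wraps modulo 256."""
    return (a + b) & 0xFF


def exceeds_limit(a, b, limit):
    """Evaluate `a + b > limit` under both models.

    Returns (result with promotion, result without promotion).
    """
    return add_u8_standard(a, b) > limit, add_u8_no_promotion(a, b) > limit


if __name__ == "__main__":
    print(add_u8_standard(200, 100))      # 300
    print(add_u8_no_promotion(200, 100))  # 44
    print(exceeds_limit(200, 100, 250))   # (True, False)
```

A range check such as `a + b > 250` therefore takes different branches depending on whether promotion is applied, which is the kind of divergence a tool that assumes standard C would not flag.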

Development of an Eclipse-based IDE for Educational Compilers (이클립스 기반의 교육용 컴파일러 통합개발환경)

성우경;강현석;배종민
- 컴퓨터교육학회논문지
- Vol. 14, No. 5
- pp. 9-18
- 2011
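A compiler development project carried out in a compiler course offers students considerable experience and skills. However, teaching is difficult because lecture time within a single semester is insufficient and compiler development is highly demanding. In addition, because the compiler's target system is usually implemented as an interpreter, it is hard to engage students' interest. As a result, compiler education tends to become theory-centered. To overcome these limitations, this paper presents an integrated development environment (IDE) with which students can more easily learn both the theory and practice of compilers. The IDE includes a reference compiler that targets the Mindstorms NXT robot, compiler construction tools, a target-language testing tool, and a code generation visualization tool. It is developed as Eclipse plug-ins, which makes it convenient and extensible. The IDE helps students understand and develop compilers more easily.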

Performance Comparison between LLVM and GCC Compilers for the AE32000 Embedded Processor

Park, Chanhyun;Han, Miseon;Lee, Hokyoon;Cho, Myeongjin;Kim, Seon Wook
- IEIE Transactions on Smart Processing and Computing
- Vol. 3, No. 2
- pp. 96-102
- 2014
The embedded processor market has grown rapidly and consistently with the appearance of mobile devices. In an embedded system, power consumption and execution time are important factors affecting performance. System performance is determined by both hardware and software. Even when the hardware architecture is high-end, software can run slowly because of low-quality code. This study compared the performance of two major compilers, LLVM and GCC, on a 32-bit EISC embedded processor. The dynamic instruction counts and static code sizes produced by these compilers were evaluated with the EEMBC benchmarks. LLVM generally performed better in the ALU-intensive benchmarks, whereas GCC produced better register allocation and jump optimization. The dynamic instruction count and static code size of GCC were on average 8% and 7% lower than those of LLVM, respectively.
https://doi.org/10.5573/IEIESPC.2014.3.2.96 KSCI

An XML Compiler Generator using Object Oriented Attribute Grammar and SML (객체지향 속성 문법과 SML을 이용한 XML 컴파일러 생성기)

최종명;유재우
- 정보처리학회논문지A
- Vol. 11A, No. 2
- pp. 149-158
- 2004
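XML is a standardized metalanguage for representing data and documents and is being used in a growing number of fields, but processing XML documents correctly in each field requires writing an XML compiler. Because writing an XML compiler takes considerable time and effort, a method for generating XML compilers automatically is needed. This paper introduces XCC, an XML compiler generator that automatically generates XML compilers capable of processing XML documents according to their semantics. XCC takes the DTD of an XML document as input and uses the relationships among XML elements to generate Java classes related by inheritance and composition. XCC also takes semantic rules as input and generates an XML compiler that processes XML documents according to their semantics. By generating XML compilers automatically, XCC can reduce the cost of developing software for XML document processing.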
https://doi.org/10.3745/KIPSTA.2004.11A.2.149 KSCI

Trends of Compiler Development for AI Processor (인공지능 프로세서 컴파일러 개발 동향)

김진규;김혜지;조용철;김현미;여준기;한진호;권영수
- 전자통신동향분석
- Vol. 36, No. 2
- pp. 32-42
- 2021
The rapid growth of deep-learning applications has spurred R&D on artificial intelligence (AI) processors. A dedicated software framework, such as a compiler and runtime APIs, is required to achieve maximum processor performance. There are various compilers and frameworks for AI training and inference. In this study, we present the features and characteristics of AI compilers, training frameworks, and inference engines. In addition, we focus on the internals of compiler frameworks, which are based on either basic linear algebra subprograms or intermediate representation. For in-depth insight, we present the compiler infrastructure, internal components, and operation flow of ETRI's "AI-Ware." The software framework's significant role is evident from the optimized neural processing unit code produced by the compiler after various optimization passes, such as scheduling, architecture-considering optimization, schedule selection, and power optimization. We conclude the study with thoughts about the future of state-of-the-art AI compilers.
https://doi.org/10.22648/ETRI.2021.J.360204

73 search results (processing time: 0.022 s)
