• Title/Summary/Keyword: Optimizing compiler


Getting Feedback on a Compiler's Optimization Decisions, Enabling More Code-Optimization Opportunities

  • Min, Gyeong Il;Park, Sewon;Han, Miseon;Kim, Seon Wook
    • IEIE Transactions on Smart Processing and Computing / v.4 no.6 / pp.450-454 / 2015
  • Short execution time is a major performance factor for computer systems. It is directly determined by code quality, which is influenced by the compiler's optimizations. However, a compiler is limited in how well it can optimize source code when it lacks sufficient information. Thus, if programmers can learn why a compiler fails to apply an optimization, they can rewrite the code so that the compiler understands it more easily, and thus improve performance. In this paper, we propose a compiler that gives the programmer the reasons for failed optimizations and accepts additional information from the programmer to optimize better. As a result, we obtain performance improvements, i.e., reduced execution time and code size, by taking advantage of additional optimization opportunities.

Development of a Prototyping Tool for New Memory Subsystem

  • Cho, Jungseok;Cho, Doosan
    • International Journal of Internet, Broadcasting and Communication / v.11 no.1 / pp.69-74 / 2019
  • The compiler is the key component of a prototyping framework for a new memory system. Such compiler-centric prototyping tools have several components, including a compiler, linker, assembler, and standard libraries. Developing all of them from scratch takes considerable cost and manpower. Therefore, developers usually use a development framework to build these prototyping tools efficiently. Such a framework should be free of licensing issues if the development results are to be commercialized. Thus, developers should look for a framework that is free of licensing issues and provides the complete development environment needed for actual execution. There are three representative compiler-centric development frameworks: GCC, Clang (LLVM), and Microsoft Visual Studio. They differ somewhat depending on the release version, and each has some limitations regarding free and commercial use. We chose LLVM here to explain the development of prototyping tools. This information will help accelerate the development of prototyping tools and reduce system development costs.

Implementation of Optimizing Compiler for Bus-based VLIW Processors (버스기반의 VLIW형 프로세서를 위한 최적화 컴파일러 구현)

  • Hong, Seung-Pyo;Moon, Soo-Mook
    • Journal of KIISE: Computer Systems and Theory / v.27 no.4 / pp.401-407 / 2000
  • Modern microprocessors exploit instruction-level parallelism to increase performance. In particular, VLIW processors supported by a parallelizing compiler are increasingly used in specific applications such as high-end DSP and graphics processing. A bus-based VLIW architecture was proposed for these applications; it was designed to reduce the overhead of the forwarding unit and the instruction width. In this paper, an optimizing scheduling compiler developed for the proposed bus-based VLIW processor is introduced. First, the method for modeling the interconnections between buses and the resource usage patterns is described. Then, on the basis of this model, machine-dependent optimization techniques such as bus-to-register promotion, copy coalescing, and operand substitution were implemented. Optimization techniques for general-purpose VLIW microprocessors, such as selective scheduling and enhanced pipelining scheduling (EPS), were also implemented. The experimental results show a performance gain of about 20% on multimedia application benchmarks.


Technology of the next generation low power memory system

  • Cho, Doosan
    • International Journal of Internet, Broadcasting and Communication / v.10 no.4 / pp.6-11 / 2018
  • As embedded memory technology evolves, traditional Static Random Access Memory (SRAM) technology has reached the end of its development. As manufacturing process technology advances, next-generation memory technology is strongly needed because of the exponentially increasing leakage current of SRAM. Non-volatile memories such as STT-MRAM (Spin Torque Transfer Magnetic Random Access Memory) and PCM (Phase Change Memory) are good candidates for replacing SRAM in embedded memory systems. They have many advantages in terms of power consumption, leakage power, size (density), and latency. Nonetheless, non-volatile memories have two major problems that hinder their use as next-generation memory. First, the lifetime of a non-volatile memory cell is limited by the number of write operations. Second, a write operation takes more time and power than a read operation of the same size. These disadvantages can be addressed by the compiler. Because the disadvantages of non-volatile memory lie in write operations, the compiler can mitigate them when it decides the data layout, by optimizing write operations so that much of the data is allocated to SRAM. This study provides insights into how such compiler and architectural designs can be developed.

Implementation of C++ IDL Compiler (C++ IDL 컴파일러 구현)

  • Park, Chan-Mo;Lee, Joon
    • Journal of the Korea Institute of Information and Communication Engineering / v.5 no.5 / pp.970-976 / 2001
  • In this paper, the OMG IDL CFE, provided by SunSoft, is used to take IDL definitions as input and parse them. omniORB3 is used to provide the functionality of the ORB. Sun's CFE produces an AST after parsing the input. The nodes of the AST are instances of classes derived from the CFE classes. As the compiler back end visits the nodes of the AST using the iterator class UTL_ScopeActiveIterator, it dumps the output code. Two files are generated during processing. The code generation routines are invoked from BE_produce.cc, and code is produced while visiting the root of the AST, idl_global->root(). The dump* functions, which emit the code, are called according to the type of each node. In this paper, the C++ mapping of IDL definitions is tested, and it yields the same result as omniidl, which is provided by omniORB3. The resulting code behaves correctly on omniORB3. In the future, we are interested in optimizing the performance of marshalling code via the IDL compiler.


An Optimizing Compiler for VLIW Microcontrollers (VLIW형 마이크로컨트롤러를 위한 최적화 컴파일러의 구현)

  • Hong, Seung-Pyo;Moon, Soo-Mook
    • Proceedings of the Korean Information Science Society Conference / 1998.10a / pp.759-761 / 1998
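  • Since the mid-1990s, high-performance processors have exploited instruction-level parallelism to improve performance. In particular, in microcontrollers, which do not need to consider binary compatibility of executables, the VLIW architecture is widely used because it can provide more functional units with the same hardware. In such VLIW microcontrollers, the task of extracting parallelism falls entirely on software, so the compiler has a very large effect on performance. In this paper, we analyze the architecture and grouping conditions of the microcontroller, implement an optimizing compiler for VLIW microcontrollers using selective scheduling and software pipelining, and measure its performance.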


Code Generation Techniques for the Optimized Energy Consumption (최적화된 에너지 소비를 위한 코드 생성 기술)

  • Ko, Kwang-Man;So, Kyoung-Young
    • The Journal of the Korea Contents Association / v.8 no.12 / pp.63-71 / 2008
  • Recently, with the advent and evolution of embedded processors developed for specific application areas, research on software development that supports these processors and their commercial use has been revitalized. In particular, in mobile devices with built-in embedded processors, software management is as important as hardware management because power and energy are limited. In this paper, we propose a code generation technique that takes energy dissipation into account, using the verified retargetable compiler backend tool EXPRESSION. To this end, we describe efficient code generation patterns and present various performance results.

Compiler Optimization for Parallelism and Locality Improvement (병렬성 및 지역성 증진을 위한 컴파일러 최적화)

  • Jim, Jin-Mi;Byeon, Seok-U;Pyo, Chang-U;Lee, Man-Ho
    • The Transactions of the Korea Information Processing Society / v.6 no.2 / pp.307-314 / 1999
  • In this paper, we study techniques for transforming sequential programs for the purposes of 'exploiting parallelism' and 'improving locality'. Based on an analysis of the loops of sequential programs in terms of dependency and locality, two transformation techniques, loop distribution and loop fusion, are applied to them. The transformed programs can easily be expressed as parallel programs with thread notation, having coarse-grain parallelism and improved locality. This means that these transformations can be useful tools for constructing optimizing and automatically parallelizing compilers. Applying these techniques to SPEC95 on a Solaris machine with four SPARC processors shows an improvement in execution time.
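
As a rough illustration of the two transformations named in this abstract (not the paper's own code), the Python sketch below shows loop distribution splitting one loop into two independent loops, and loop fusion merging two loops over the same range. All function and variable names are hypothetical, and the loops are assumed to have no cross-iteration dependencies.

```python
def original(a, b):
    """One loop containing two statements that do not depend on each other."""
    n = len(a)
    c, d = [0] * n, [0] * n
    for i in range(n):
        c[i] = a[i] + 1   # S1
        d[i] = b[i] * 2   # S2 (independent of S1)
    return c, d


def distributed(a, b):
    """Loop distribution: S1 and S2 each get their own loop.

    Because neither loop reads what the other writes, each loop could be
    handed to a separate thread (coarse-grain parallelism).
    """
    n = len(a)
    c, d = [0] * n, [0] * n
    for i in range(n):
        c[i] = a[i] + 1
    for i in range(n):
        d[i] = b[i] * 2
    return c, d


def unfused(a):
    """Two loops: the second re-reads tmp only after the first has finished."""
    n = len(a)
    tmp, out = [0] * n, [0] * n
    for i in range(n):
        tmp[i] = a[i] * a[i]
    for i in range(n):
        out[i] = tmp[i] + a[i]
    return out


def fused(a):
    """Loop fusion: the squared value is consumed right after it is produced,
    so it no longer has to travel through a temporary array (better locality)."""
    n = len(a)
    out = [0] * n
    for i in range(n):
        t = a[i] * a[i]
        out[i] = t + a[i]
    return out
```

Both transformations preserve results only when the dependencies allow it, which is why the abstract describes a dependency and locality analysis before applying them.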


Optimizing Constant Value Generation in Just-in-time Compiler for 64-bit JavaScript Engine (64-bit 자바스크립트 적시 컴파일러를 위한 상수 값 생성 최적화)

  • Choi, Hyung-Kyu;Lee, Jehyung
    • Journal of KIISE / v.43 no.1 / pp.34-39 / 2016
  • JavaScript is widely used in web pages with HTML. Many JavaScript engines adopt Just-in-time compilers to accelerate the execution of JavaScript programs. Recently, many newly introduced devices are adopting 64-bit CPUs instead of 32-bit ones, and Just-in-time compilers for 64-bit CPUs are slowly being introduced in JavaScript engines. However, there are many inefficiencies in the currently available Just-in-time compilers for 64-bit devices. In particular, code size is significantly larger than on 32-bit devices, mainly due to the 64-bit wide addresses in 64-bit devices. In this paper, we address the inefficiencies introduced by 64-bit wide addresses and values in the Just-in-time compiler for the V8 JavaScript engine and propose more efficient ways of generating constant values and addresses to reduce code size. We implemented the proposed optimization in the V8 JavaScript engine and measured code size as well as performance improvements with the Octane and SunSpider benchmarks. We observed a 3.6% performance gain and 0.7% code size reduction in Octane and a 0.32% performance gain and 2.8% code size reduction in SunSpider.

A Systematic Generation of Register-Reuse Chains (레지스터 재활용 사슬의 체계적 생성)

  • Lee, Hyuk-Jae
    • The Transactions of the Korean Institute of Electrical Engineers A / v.48 no.12 / pp.1564-1574 / 1999
  • In order to improve the efficiency of optimizing compilers, the integration of register allocation and instruction scheduling has been extensively studied. One of the promising integration techniques is register allocation based on register-reuse chains. However, the generation of register-reuse chains in the previous approach was not completely systematic, and consequently it created unnecessary dependencies that restrict instruction scheduling. This paper proposes a new register allocation technique based on a systematic generation of register-reuse chains. The first phase of the proposed technique generates register-reuse chains that are optimal in the sense that no additional dependencies are created. Thus, register allocation can be done without restricting instruction scheduling. When the optimal register-reuse chains require more registers than are available, the second phase reduces the number of required registers by merging register-reuse chains. Chain merging always generates additional dependencies and consequently constrains the execution order of instructions. A heuristic is developed for the second phase in order to reduce the additional dependencies created by merging chains. For a matrix multiplication program, the number of registers resulting from the first phase is small enough to fit into the available registers for most basic blocks. In addition, it is shown that the restriction on instruction scheduling is reduced by the proposed merging heuristic of the second phase.
