• Title/Summary/Keyword: code cache (코드 캐시)

A Real-Time JPEG2000 Codec Implementation on ARM9 Processor (ARM9 프로세서용 실시간 JPEG2000 코덱의 구현)

  • Kim, Young-Tae;Cho, Shi-Won;Lee, Dong-Wook
    • Journal of the Institute of Convergence Signal Processing
    • /
    • v.8 no.3
    • /
    • pp.149-155
    • /
    • 2007
  • In this paper, we propose a real-time implementation of a JPEG2000 codec on the ARM9 processor. The codec is designed to separate control code from data management code so that system resources such as the processor and memory are used effectively. In embedded settings such as cellular phones in particular, it is very important to provide good service with a limited processor and limited internal memory. Because ARM9 series processors do not provide floating-point support, operations that need highly repetitive floating-point computation, such as the DWT (discrete wavelet transform), take a large amount of computation time. The proposed codec was programmed using fixed-point arithmetic to overcome this weakness. Cache-aware code optimization was also applied to further improve computational speed.

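The fixed-point approach mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the Q-format (`FRAC_BITS = 13`), the helper names, and the boundary handling are assumptions made for illustration. It applies the first lifting (predict) step of the 9/7 wavelet using integer arithmetic only.

```python
# Sketch only: FRAC_BITS, the helper names, and the edge handling are
# assumptions; the abstract does not state which fixed-point format was used.
FRAC_BITS = 13  # assumed number of fractional bits


def to_fixed(x: float) -> int:
    """Convert a float coefficient to a fixed-point integer."""
    return int(round(x * (1 << FRAC_BITS)))


def fixed_mul(coef_q: int, value: int) -> int:
    """Multiply a fixed-point coefficient by an integer sample.

    Adding half of the scale before the arithmetic right shift rounds
    the result to the nearest integer.
    """
    return (coef_q * value + (1 << (FRAC_BITS - 1))) >> FRAC_BITS


# First lifting coefficient of the 9/7 wavelet, stored in fixed point.
ALPHA_Q = to_fixed(-1.586134342)


def predict_step(even: list[int], odd: list[int]) -> list[int]:
    """odd[i] += alpha * (even[i] + even[i + 1]), integer-only.

    The right edge reuses the last even sample (a simple assumed extension).
    """
    out = []
    for i, d in enumerate(odd):
        right = even[i + 1] if i + 1 < len(even) else even[i]
        out.append(d + fixed_mul(ALPHA_Q, even[i] + right))
    return out
```

Every multiplication here is an integer multiply followed by a shift, which is the kind of operation a processor without floating-point support can execute directly.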

Acceleration of LU-SGS Code on Latest Microprocessors Considering the Increase of Level 2 Cache Hit-Rate (최신 마이크로프로세서에서 2차 캐쉬 적중률 증가를 고려한 LU-SGS 코드의 가속)

  • Choi, J.Y.;Oh, Se-Jong
    • Journal of the Korean Society for Aeronautical & Space Sciences
    • /
    • v.30 no.7
    • /
    • pp.68-80
    • /
    • 2002
  • An approach for composing performance-optimized computational code is suggested for the latest microprocessors. The code optimization concept, called localization here, maximizes use of the second-level cache that is common to all the latest computer systems and minimizes access to system main memory. In this study, localized optimization of the LU-SGS (Lower-Upper Symmetric Gauss-Seidel) code for solving fluid dynamic equations was carried out at three different levels and tested on several of today's most widely used microprocessor architectures. Depending on the system, the localized optimization ran up to 7.35 times faster than the baseline algorithm while producing exactly the same solution on the same computer system.

An Optimization Technique for Irregular Data Access Patterns on Software Controlled On-Chip Memory SubSystems (소프트웨어 제어 온칩 메모리 서브시스템에서 불규칙 데이터 접근 패턴 최적화 기법)

  • Cho, Doo-San;Cho, Jung-Seok
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 2012.06a
    • /
    • pp.212-214
    • /
    • 2012
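  • Most data-intensive applications contain both regular and irregular memory access patterns in their kernel code. Most memory access pattern optimization techniques to date have focused on regular patterns. In encryption and communication applications, however, irregular patterns often account for the majority of memory accesses. Designing a generalized optimization technique that uses on-chip memory efficiently for such irregular access patterns is difficult, so the field has seen little progress. To address the irregular memory access pattern optimization problem, we propose a data clustering technique. Clustering computes the temporal and spatial locality of the accessed data and groups the data with the greatest benefit into a single block, which serves as the basic unit kept resident in on-chip memory. The technique reduces energy consumption by about 19% compared with a conventional cache memory.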

The Analysis and Design of Thread Model for Java Virtual Machine (자바가상머신 쓰레드 모델 분석 및 설계)

  • 유용선;박윤미;류현수;이철훈
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 2004.10a
    • /
    • pp.625-627
    • /
    • 2004
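  • With the recent growth of the Internet, network-based Java technology is being applied to mobile devices such as PDAs and cell phones and to a variety of information appliances, bringing advantages such as platform independence, portability, security, and mobility. However, applications written in Java run more slowly than those written in C or C++. Solving this problem requires improving the performance of the Java virtual machine. Active research has been under way in many areas, including garbage collection for memory management, bytecode translation in software or hardware, and faster access using inline caches. In this paper, we analyze and design a thread structure to improve the performance of the KVM (kilo-virtual machine), which runs on mobile platforms.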

Meltdown Threat Dynamic Detection Mechanism using Decision-Tree based Machine Learning Method (의사결정트리 기반 머신러닝 기법을 적용한 멜트다운 취약점 동적 탐지 메커니즘)

  • Lee, Jae-Kyu;Lee, Hyung-Woo
    • Journal of Convergence for Information Technology
    • /
    • v.8 no.6
    • /
    • pp.209-215
    • /
    • 2018
  • In this paper, we propose a method to detect and block rapidly increasing Meltdown malicious code using a dynamic sandbox tool. Although some patches are available for the Meltdown vulnerability, they are sometimes intentionally left unapplied because they degrade system performance. We therefore propose a machine learning method that overcomes the limitations of existing signature-based detection for infrastructure without active patches. First, to understand the principle of Meltdown, we analyze operating system and processor mechanisms such as virtual memory, memory privilege checks, pipelining and speculative execution, and the CPU cache. We then extract data using the Linux strace tool to detect Meltdown malware. Finally, we implement a decision-tree-based dynamic detection mechanism to identify Meltdown malicious code efficiently.

ANC Caching Technique for Replacement of Execution Code on Active Network Environment (액티브 네트워크 환경에서 실행 코드 교체를 위한 ANC 캐싱 기법)

  • Jang Chang-bok;Lee Moo-Hun;Cho Sung-Hoon;Choi Eui-In
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.30 no.9B
    • /
    • pp.610-618
    • /
    • 2005
  • As the Internet and computing capability have developed, many users obtain a great deal of information through the network, and the requirements of network users have rapidly grown and diversified. Because current networks take a long time to accommodate user requirements, technologies such as active networks have been studied to address this. An active node in an active network can store and process execution code in addition to forwarding packets as in current networks. The execution code needed to process a packet arriving at an active node must therefore be present on that node; if it is not, the node has to request it from the previous active node or the code server. Fetching the execution code this way, however, introduces transfer delay and increases network traffic and execution time. Used execution code therefore needs to be stored in a cache on the active node to reduce execution time and the number of requests. This paper proposes an ANC caching technique that reduces the number of execution code requests and the code execution time by storing execution code efficiently on the active node. By reducing requests for execution code to the previous active node, the ANC caching technique may decrease network traffic and code execution time.
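
The general idea of caching execution code on a node can be sketched as follows. This is not the ANC technique itself, whose replacement policy the abstract does not describe; it is a plain least-recently-used (LRU) cache swapped in for illustration. The `CodeCache` class and the `fetch` callback, which stands in for a request to the previous active node or code server, are hypothetical names.

```python
from collections import OrderedDict
from typing import Callable


class CodeCache:
    """LRU cache of execution code (illustrative; not the ANC policy)."""

    def __init__(self, capacity: int, fetch: Callable[[str], bytes]):
        self.capacity = capacity
        self._fetch = fetch          # hypothetical remote request
        self._store: OrderedDict[str, bytes] = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, code_id: str) -> bytes:
        if code_id in self._store:
            # Hit: mark as most recently used, no network request needed.
            self._store.move_to_end(code_id)
            self.hits += 1
            return self._store[code_id]
        # Miss: request the code remotely, then cache it locally.
        self.misses += 1
        code = self._fetch(code_id)
        self._store[code_id] = code
        if len(self._store) > self.capacity:
            # Evict the least recently used entry.
            self._store.popitem(last=False)
        return code


if __name__ == "__main__":
    cache = CodeCache(2, fetch=lambda cid: cid.encode())
    for cid in ["a", "b", "a", "c", "b"]:
        cache.get(cid)
    print(cache.hits, cache.misses)
```

Each hit avoids one request to another node, which is the source of the traffic and delay savings the abstract describes.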

Memory Hierarchy Optimization in Embedded Systems using On-Chip SRAM (On-Chip SRAM을 이용한 임베디드 시스템 메모리 계층 최적화)

  • Kim, Jung-Won;Kim, Seung-Kyun;Lee, Jae-Jin;Jung, Chang-Hee;Woo, Duk-Kyun
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.36 no.2
    • /
    • pp.102-110
    • /
    • 2009
  • The memory wall is the growing disparity in speed between the CPU and memory outside the CPU chip. An economical solution is a memory hierarchy organized into several levels, such as processor registers, cache, main memory, and disk storage. We introduce, for the first time, a novel memory hierarchy optimization technique for Linux-based embedded systems that uses on-chip SRAM. The technique uses the virtual memory system to allocate on-chip SRAM to code and data selected by programmers. Experiments with nine applications indicate that runtime can be improved by up to 35%, with an average of 14%, and energy consumption can be reduced by up to 40%, with an average of 15%.

Design of Global Buffer Manager in SAN-based Cluster File Systems (SAN 환경의 대용량 클러스터 파일 시스템을 위한 광역 버퍼 관리기의 설계)

  • Lee, Kyu-Woong
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.15 no.11
    • /
    • pp.2404-2410
    • /
    • 2011
  • This paper presents a design overview of the cluster file system SANique™, which is based on a SAN (Storage Area Network) environment. The design issues and problems of a conventional global buffer manager are also illustrated for a large set of clustered computing hosts. We propose an efficient global buffer management method that provides greater scalability and availability. The proposed method reuses the list of lock information maintained by our cluster lock manager. The global buffer manager can easily find and determine the location of the requested cached data block based on that lock information. We present pseudocode for the global buffer manager and an illustration of global cache operation in a cluster environment.

Performance Improvement Through Aggressive Instruction Packing (적극적인 명령어 압축을 통한 성능향상)

  • Ji, Seung-Hyeon;Kim, Seok-Il
    • The KIPS Transactions:PartA
    • /
    • v.9A no.2
    • /
    • pp.231-240
    • /
    • 2002
  • This paper proposes balancing scheduling effort more evenly between the compiler and the processor by introducing independently scheduled VLIW instructions. The Aggressively Packed VLIW (APVLIW) processor is aimed specifically at independently scheduling Very Long Instruction Word (VLIW) instructions with dependency information. The APVLIW processor independently schedules each instruction within a long instruction using pairs of functional units and dynamic schedulers. Every dynamic scheduler checks for data dependencies and resource collisions while scheduling each instruction. This scheduling is especially effective in applications containing loops. We simulate the architecture and show that the APVLIW processor performs significantly better than the VLIW processor for a wide range of cache sizes and across various numerical benchmark applications.

External Merge Sorting in Tajo with Variable Server Configuration (매개변수 환경설정에 따른 타조의 외부합병정렬 성능 연구)

  • Lee, Jongbaeg;Kang, Woon-hak;Lee, Sang-won
    • Journal of KIISE
    • /
    • v.43 no.7
    • /
    • pp.820-826
    • /
    • 2016
  • There is a growing requirement for big data processing which extracts valuable information from a large amount of data. The Hadoop system employs the MapReduce framework to process big data. However, MapReduce has limitations such as inflexible and slow data processing. To overcome these drawbacks, SQL query processing techniques known as SQL-on-Hadoop were developed. Apache Tajo, one of the SQL-on-Hadoop techniques, was developed by a Korean development group. External merge sort is one of the heavily used algorithms in Tajo for query processing. The performance of external merge sort in Tajo is influenced by two parameters, sort buffer size and fanout. In this paper, we analyzed the performance of external merge sort in Tajo with various sort buffer sizes and fanouts. In addition, we figured out that there are two major causes of differences in the performance of external merge sort: CPU cache misses which increase as the sort buffer size grows; and the number of merge passes determined by fanout.
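
The relationship between the two parameters and the number of merge passes can be sketched with the textbook external merge sort model. This is an assumption made for illustration, not Tajo's actual code: the function name and the byte-based arguments are hypothetical. Run generation produces one sorted run per sort buffer of input, and each merge pass combines up to `fanout` runs into one.

```python
def merge_passes(data_bytes: int, sort_buffer_bytes: int, fanout: int) -> int:
    """Number of merge passes in the textbook external merge sort model.

    Assumes one initial sorted run per full or partial sort buffer and a
    merge that combines up to `fanout` runs at a time.
    """
    if sort_buffer_bytes <= 0 or fanout < 2:
        raise ValueError("sort buffer must be positive and fanout at least 2")
    # Ceiling division without floating point: number of initial runs.
    runs = -(-data_bytes // sort_buffer_bytes)
    passes = 0
    while runs > 1:
        runs = -(-runs // fanout)  # each pass merges groups of `fanout` runs
        passes += 1
    return passes


if __name__ == "__main__":
    # 100 initial runs: fanout 10 needs two passes, fanout 100 needs one.
    print(merge_passes(1000, 10, 10), merge_passes(1000, 10, 100))
```

Under this model a larger sort buffer means fewer initial runs and a larger fanout means fewer passes, while the abstract notes that a larger sort buffer also raises CPU cache misses, so the two effects pull in opposite directions.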