• Title/Summary/Keyword: Memory Allocation Techniques (메모리 할당 기법)


Implementation of Virtual OS Application using Server Based Computing (서버 기반 컴퓨팅을 이용한 가상 OS 활용 및 구현)

  • Sagong, Hyeon; Shin, Jang Won; Kwak, Jong Wook
    • Proceedings of the Korea Information Processing Society Conference / 2010.11a / pp.1670-1673 / 2010
  • In Server Based Computing, data and processing are handled on the server, so data can be integrated and managed effectively. This paper proposes a method that uses server based computing to give each user a personal desktop environment in which the necessary information and applications can be accessed anytime and anywhere. To maximize server utilization and reduce wasted resources in this environment, a server virtualization technique and a virtual OS memory allocation algorithm are introduced. The memory allocation policy that depends on the number of servers and users is named hard handoff, and it allocates memory to users appropriately. In addition, when memory is reallocated for an existing user, the immutable OS and a separate user data space are managed independently, which shortens the reconnection time of the virtual OS.
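
The entry above only names the hard handoff allocation policy; the paper's actual formula is not given here. As a rough, hypothetical illustration of dividing a server's memory among virtual OS users while reserving a slice for an immutable base OS image, one might write something like the following C sketch (all constants and the split between the immutable OS and user data space are assumptions, not the authors' method):

```c
#include <stdio.h>

#define SERVER_MEM_MB      65536   /* assumed total memory of one server      */
#define IMMUTABLE_OS_MB     2048   /* assumed share reserved for the base OS  */
#define MIN_USER_MEM_MB      512   /* assumed floor per virtual-OS session    */

/* Hypothetical per-user allocation: share what is left after the immutable
 * OS image evenly among active users, but never go below a fixed floor.     */
static int memory_per_user_mb(int active_users)
{
    if (active_users <= 0)
        return 0;
    int pool  = SERVER_MEM_MB - IMMUTABLE_OS_MB;
    int share = pool / active_users;
    return share < MIN_USER_MEM_MB ? MIN_USER_MEM_MB : share;
}

int main(void)
{
    for (int users = 1; users <= 128; users *= 2)
        printf("%3d users -> %5d MB per virtual OS\n",
               users, memory_per_user_mb(users));
    return 0;
}
```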

A Dynamic Allocation Scheme for Improving Memory Utilization in Xen (Xen에서 메모리 이용률 향상을 위한 동적 할당 기법)

  • Lee, Kwon-Yong; Park, Sung-Yong
    • Journal of KIISE:Computer Systems and Theory / v.37 no.3 / pp.147-160 / 2010
  • System virtualization has drawn attention to server consolidation as a way to use system resources efficiently. Many studies have tried to utilize a server machine more efficiently through virtualization and to improve the performance of the virtualization software, mostly by controlling the resource allocation of virtual machines dynamically with a focus on the CPU, or by managing resources across machines using migration. Research on memory management, however, has been largely lacking: in server consolidation, memory is typically allocated to virtual machines statically. Unfortunately, static allocation leaves a large amount of idle memory and lowers memory utilization, and this underutilization causes side effects such as additional load on other system resources and performance degradation of services running in the virtual machines. In this paper, we propose a dynamic memory allocation scheme for Xen that adjusts the memory allocation of virtual machines to improve utilization without degrading performance. Using an AR model to predict memory usage and an ACO (Ant Colony Optimization) algorithm to optimize memory utilization, the system runs more virtual machines without degrading server performance. As a result, we obtained 1.4 times better utilization than static allocation.
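
The abstract mentions using an AR model to predict each VM's memory usage before adjusting its allocation; the paper's model order and parameters are not given here, and the ACO optimization step is omitted. A minimal sketch of one-step-ahead prediction with an AR(1) model fitted by ordinary least squares might look like this (illustrative only, with hypothetical usage samples):

```c
#include <stdio.h>

/* One-step-ahead memory-usage prediction with an AR(1) model:
 *     x[t] ~ c + phi * x[t-1]
 * fitted to a window of past samples by ordinary least squares. */
static double ar1_predict(const double *x, int n)
{
    double mx = 0.0, my = 0.0;
    for (int t = 1; t < n; t++) { mx += x[t - 1]; my += x[t]; }
    mx /= (n - 1); my /= (n - 1);

    double num = 0.0, den = 0.0;
    for (int t = 1; t < n; t++) {
        num += (x[t - 1] - mx) * (x[t] - my);
        den += (x[t - 1] - mx) * (x[t - 1] - mx);
    }
    double phi = (den != 0.0) ? num / den : 0.0;
    double c   = my - phi * mx;
    return c + phi * x[n - 1];            /* predicted next usage */
}

int main(void)
{
    /* Hypothetical memory-usage samples (MB) for one VM. */
    double usage[] = { 900, 950, 1010, 1080, 1150, 1230, 1300, 1390 };
    int n = sizeof usage / sizeof usage[0];
    printf("predicted next usage: %.0f MB\n", ar1_predict(usage, n));
    /* A balloon target could then be set to the prediction plus a margin. */
    return 0;
}
```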

Improvement of Address Pointer Assignment in DSP Code Generation (DSP용 코드 생성에서 주소 포인터 할당 성능 향상 기법)

  • Lee, Hee-Jin; Lee, Jong-Yeol
    • Journal of the Institute of Electronics Engineers of Korea CI / v.45 no.1 / pp.37-47 / 2008
  • Exploiting the address generation units typically provided in DSPs plays an important role in DSP code generation, since these units perform fast address computation in parallel with the central data path. Offset assignment optimizes the memory layout of program variables by taking advantage of the address generation units, and consists of a memory layout generation step and an address pointer assignment step. In this paper, we propose an effective address pointer assignment method that minimizes the number of address calculation instructions in DSP code generation. The proposed approach reduces the time complexity of a conventional address pointer assignment algorithm with fixed memory layouts by using minimum cost-nodes breaking. To reduce memory requirements and processing time, we employ a powerful pruning technique. Moreover, because the memory layout affects the result of the address pointer assignment algorithm, our approach iteratively improves the initial solution by changing the memory layout at each iteration. We applied the proposed approach to about 3,000 sequences from the OffsetStone benchmarks to demonstrate its effectiveness. Experimental results show an average improvement of 25.9% in the address codes over previous work.
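
The offset assignment problem above is usually stated for an AGU that can post-increment or post-decrement an address register by one for free, so an extra address-arithmetic instruction is only needed when consecutive accesses are more than one location apart in the chosen memory layout. The sketch below merely counts that cost for a given layout and access sequence with a single address register; it illustrates the cost model, not the paper's assignment algorithm:

```c
#include <stdio.h>
#include <stdlib.h>

/* Cost of an access sequence under a given memory layout, assuming a single
 * address register and an AGU with free post-increment/decrement by 1.
 * Any step whose distance in the layout exceeds 1 needs an explicit
 * address-calculation instruction. */
static int address_cost(const int *pos, const int *seq, int len)
{
    int cost = 0;
    for (int i = 1; i < len; i++)
        if (abs(pos[seq[i]] - pos[seq[i - 1]]) > 1)
            cost++;
    return cost;
}

int main(void)
{
    /* Variables a..d numbered 0..3; pos[v] is v's slot in the layout. */
    int layout1[] = { 0, 1, 2, 3 };   /* layout a b c d */
    int layout2[] = { 0, 2, 1, 3 };   /* layout a c b d */
    int seq[] = { 0, 2, 1, 2, 3, 1 }; /* access sequence a c b c d b */
    int len = sizeof seq / sizeof seq[0];

    printf("layout a b c d: %d extra address instructions\n",
           address_cost(layout1, seq, len));
    printf("layout a c b d: %d extra address instructions\n",
           address_cost(layout2, seq, len));
    return 0;
}
```

Running it shows the second layout needs one extra address instruction instead of two, which is why the memory layout and the pointer assignment have to be optimized together.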

Memory Allocation Scheme for Reducing False Sharing on Multiprocessor Systems (다중처리기 시스템에서 거짓 공유 완화를 위한 메모리 할당 기법)

  • Han, Boo-Hyung; Cho, Seong-Je
    • Journal of KIISE:Computer Systems and Theory / v.27 no.4 / pp.383-393 / 2000
  • In shared memory multiprocessor systems, false sharing occurs when several independent data objects that are not shared but are accessed by different processors are allocated to the same coherency unit of memory. False sharing is one of the major factors that can degrade the performance of memory coherency protocols. This paper presents a new shared memory allocation scheme that reduces false sharing in parallel applications where a master processor controls the allocation of all shared objects. Our scheme first allocates objects in a temporary address space and later places each object in the address space of the processor that first accesses it. The goal is to allocate independent objects that may have different access patterns to different pages. We use execution-driven simulation of real parallel applications to evaluate the effectiveness of our scheme. Experimental results show that our scheme reduces a considerable number of false sharing faults with low overhead.
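
The scheme above separates independently accessed objects at allocation time; its details belong to the paper's master-processor allocator. A generic way to see the underlying problem is the standard cache-line padding trick: give each processor's private data its own coherency unit so that updates by different processors never invalidate each other's cache lines. A minimal sketch of that standard technique (the 64-byte line size is an assumption, and this is not the paper's allocator):

```c
#include <stdio.h>
#include <stdlib.h>

#define CACHE_LINE 64   /* assumed coherency-unit size in bytes */

/* Per-processor counter padded to a full cache line, so counters used by
 * different processors never share a coherency unit (no false sharing). */
struct padded_counter {
    long value;
    char pad[CACHE_LINE - sizeof(long)];
};

int main(void)
{
    int nproc = 8;

    /* Align the array itself to a cache-line boundary as well. */
    struct padded_counter *ctr =
        aligned_alloc(CACHE_LINE, nproc * sizeof *ctr);
    if (!ctr)
        return 1;

    for (int p = 0; p < nproc; p++)
        ctr[p].value = 0;

    printf("each counter occupies %zu bytes (one coherency unit)\n",
           sizeof(struct padded_counter));
    free(ctr);
    return 0;
}
```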


An Assignment Method for Loop with Loop-Carried Dependence (루프 캐리 종속성을 가진 루프의 할당 기법)

  • Kim, Hyeon-Cheol; Yu, Gi-Yeong
    • Journal of KIISE:Computer Systems and Theory / v.28 no.8 / pp.379-389 / 2001
  • This paper proposes a new loop assignment scheme for efficiently executing loops whose iterations have dependence relations. We also examine variants, based on the proposed scheme, of existing self-scheduling schemes that assign loop iterations to shared-memory multiprocessors through a central queue, so that they can be applied to loops with loop-carried dependence. The proposed CDSS (Carried-Dependence Self-Scheduling) scheme, which assigns the loop in three stages while taking the dependence distance into account, is likewise a self-scheduling algorithm based on a central work queue and requires no separate scheduler. Comparative analysis against the modified assignment schemes, while varying the dependence distance, the number of processors, the number of iterations, and the scheduling operation time, shows that the proposed scheme maintains good load balance and reduces loop execution time relative to the other modified schemes. Across various experimental environments, the average loop execution time improved in the order of the proposed CDSS, modified SS, Factoring, GSS, and CSS schemes.
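
The three CDSS assignment stages are not reproduced in the abstract above, so the following is only a generic illustration of why the dependence distance matters when self-scheduling such a loop: if iteration i depends on iteration i-d, then any d consecutive iterations are mutually independent and can be handed out from a central queue as one wave, while waves must respect the dependence order. The sketch below merely partitions the iteration space that way; it is not the CDSS algorithm:

```c
#include <stdio.h>

/* Loop with loop-carried dependence of distance D:
 *     a[i] = a[i - D] + 1;
 * Iterations i .. i+D-1 are mutually independent, so a scheduler may hand
 * them out to different processors as one "wave"; waves must run in order.
 * Here the waves are only printed to show the partitioning. */
#define N 20
#define D 4   /* dependence distance */

int main(void)
{
    for (int start = 0; start < N; start += D) {
        int end = (start + D < N) ? start + D : N;
        printf("wave %2d: iterations %2d..%2d (independent of each other)\n",
               start / D, start, end - 1);
        /* In a real self-scheduling run, each iteration in this wave would
         * be grabbed by an idle processor from a central work queue.       */
    }
    return 0;
}
```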


Dynamic Memory Allocation for Scientific Workflows in Containers (컨테이너 환경에서의 과학 워크플로우를 위한 동적 메모리 할당)

  • Adufu, Theodora; Choi, Jieun; Kim, Yoonhee
    • Journal of KIISE / v.44 no.5 / pp.439-448 / 2017
  • The workloads of large high-performance computing (HPC) scientific applications are steadily becoming "bursty" due to variable resource demands throughout their execution life-cycles. However, the over-provisioning of virtual resources for optimal performance during execution remains a key challenge in the scheduling of scientific HPC applications. While over-provisioning of virtual resources guarantees peak performance of scientific applications in virtualized environments, it results in increased amounts of idle resources that are unavailable for use by other applications. Herein, we propose a memory resource reconfiguration approach that allows the quick release of idle memory resources to new applications in OS-level virtualized systems, based on the application's resource-usage pattern profile data. We deployed a scientific workflow application in Docker, a light-weight OS-level virtualization system. In the proposed approach, memory allocation is fine-tuned for containers at each stage of the workflow's execution life-cycle, improving overall memory resource utilization.
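
The entry above describes fine-tuning a container's memory limit at each stage of a workflow from profiled usage; the paper's controller is not shown here. One plausible way to act on such a per-stage profile is to resize the container between stages, for example with `docker update --memory`. The C sketch below only builds and prints such commands from a hypothetical profile table, so the stage names, sizes, and container name are all assumptions:

```c
#include <stdio.h>

/* Hypothetical per-stage memory profile for one scientific workflow. */
struct stage { const char *name; int mem_mb; };

int main(void)
{
    const char *container = "workflow_ctr";   /* assumed container name */
    struct stage stages[] = {
        { "preprocess", 1024 },
        { "simulate",   8192 },
        { "analyze",    2048 },
        { "visualize",   512 },
    };
    char cmd[256];

    for (size_t i = 0; i < sizeof stages / sizeof stages[0]; i++) {
        /* Resize the container before the stage starts; a real controller
         * would execute this (e.g. via system()) and add a safety margin. */
        snprintf(cmd, sizeof cmd, "docker update --memory %dm %s",
                 stages[i].mem_mb, container);
        printf("stage %-10s -> %s\n", stages[i].name, cmd);
    }
    return 0;
}
```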

Analysis of GPU Performance and Memory Efficiency according to Task Processing Units (작업 처리 단위 변화에 따른 GPU 성능과 메모리 접근 시간의 관계 분석)

  • Son, Dong Oh; Sim, Gyu Yeon; Kim, Cheol Hong
    • Smart Media Journal / v.4 no.4 / pp.56-63 / 2015
  • Modern GPUs can execute massively parallel computations by exploiting many GPU cores. The GPGPU architecture, one approach to exploiting the GPU's outstanding computational resources, executes general-purpose applications as well as graphics applications effectively. In this paper, we investigate the impact of the number of CTAs (Cooperative Thread Arrays) per SM (Streaming Multiprocessor) on performance and memory efficiency, since analyzing this relation provides insight for researchers who study GPUs to improve performance. Our simulation results show that most benchmarks improve performance as the number of CTAs per SM increases. On the other hand, some benchmarks show no improvement, because the kernel generates only a few CTAs or because not enough CTAs can be executed simultaneously. To classify the performance behavior according to the number of CTAs per SM more precisely, we also analyze the relation between performance and memory stalls, DRAM stalls due to interconnect congestion, and pipeline stalls at the memory stage. We expect our analysis to help studies that aim to improve parallelism and memory efficiency in GPGPU architectures.
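
The number of CTAs that can be resident on an SM is bounded by the SM's thread, register, and shared-memory budgets and by a hardware cap, which is why varying it changes both parallelism and memory behavior as described above. The sketch below computes that bound from per-CTA requirements and SM limits; all the specific figures are illustrative assumptions, not values from the paper:

```c
#include <stdio.h>

static int min_int(int a, int b) { return a < b ? a : b; }

/* Maximum CTAs resident on one SM, limited by threads, registers,
 * shared memory, and a hardware cap (all figures are illustrative). */
static int max_ctas_per_sm(int threads_per_cta, int regs_per_thread,
                           int smem_per_cta)
{
    const int SM_THREADS = 1536;     /* assumed threads per SM          */
    const int SM_REGS    = 32768;    /* assumed 32-bit registers per SM */
    const int SM_SMEM    = 49152;    /* assumed shared memory per SM, B */
    const int SM_CTA_CAP = 8;        /* assumed hardware CTA cap        */

    int by_threads = SM_THREADS / threads_per_cta;
    int by_regs    = SM_REGS / (regs_per_thread * threads_per_cta);
    int by_smem    = smem_per_cta ? SM_SMEM / smem_per_cta : SM_CTA_CAP;

    return min_int(min_int(by_threads, by_regs),
                   min_int(by_smem, SM_CTA_CAP));
}

int main(void)
{
    /* Example kernel: 256 threads per CTA, 20 registers/thread, 8 KB smem. */
    printf("resident CTAs per SM: %d\n", max_ctas_per_sm(256, 20, 8192));
    return 0;
}
```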

Preventive Adaption Threshold Mechanism in Buffer Allocation for Shared Memory Buffer (공유 메모리 버퍼에서의 예방적 적응 한계치 버퍼 할당 기법)

  • Shin, Tae-Ho; Lee, Sung-Chang; Lee, Hyeong-Ho
    • Journal of the Institute of Electronics Engineers of Korea TC / v.38 no.10 / pp.24-33 / 2001
  • Delay, delay variation, and packet loss rate are the principal QoS (Quality of Service) metrics of packet communication. This paper proposes a new buffer allocation mechanism that improves packet loss performance when multiple logical buffers share a single physical memory buffer. In the proposed mechanism, the dynamic threshold moves along a curved track instead of the straight line used in the DT (dynamic threshold) mechanism. To evaluate its effectiveness, the proposed mechanism is compared in several respects with previously proposed mechanisms, including NC (no control), ST (Static Threshold), and DT.
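
For context, in the classic DT scheme that this entry compares against, a logical queue may accept an arriving cell only while its length stays below a threshold proportional to the currently unused part of the shared memory; the proposed mechanism bends that linear relation into a curve whose exact form is not given in the abstract. A sketch of the baseline linear DT admission test (parameters are illustrative):

```c
#include <stdio.h>
#include <stdbool.h>

#define B      1000   /* total shared buffer (cells) */
#define NQUEUE    4   /* number of logical queues    */
#define ALPHA   1.0   /* DT control parameter        */

static int qlen[NQUEUE];

/* Classic linear dynamic-threshold admission test: queue q may accept a
 * cell while qlen[q] < ALPHA * (B - total occupancy). The mechanism in
 * the paper replaces this straight-line threshold with a curved one.    */
static bool dt_accept(int q)
{
    int used = 0;
    for (int i = 0; i < NQUEUE; i++)
        used += qlen[i];
    return used < B && qlen[q] < ALPHA * (B - used);
}

int main(void)
{
    qlen[0] = 300; qlen[1] = 300; qlen[2] = 200; qlen[3] = 0;
    for (int q = 0; q < NQUEUE; q++)
        printf("queue %d: %s\n", q, dt_accept(q) ? "accept" : "drop");
    return 0;
}
```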


EAST: An Efficient and Advanced Space-management Technique for Flash Memory using Reallocation Blocks (재할당 블록을 이용한 플래시 메모리를 위한 효율적인 공간 관리 기법)

  • Kwon, Se-Jin; Chung, Tae-Sun
    • Journal of KIISE:Computing Practices and Letters / v.13 no.7 / pp.476-487 / 2007
  • Flash memory offers attractive features for data storage, such as non-volatility, shock resistance, fast access, and low power consumption. However, it has one main drawback: its contents must be erased before they can be updated. Furthermore, flash memory can only be erased a limited number of times. To overcome these limitations, flash memory needs a software layer called the flash translation layer (FTL). The basic function of the FTL is to translate logical addresses from a file system such as the file allocation table (FAT) into physical addresses in flash memory. In this paper, a new FTL algorithm called EAST (an efficient and advanced space-management technique) is proposed. EAST improves performance by optimizing the number of log blocks, by applying state transitions, and by using reallocation blocks. Experimental results show that EAST outperforms FAST, an enhanced log block scheme, particularly when flash memory usage is not full.
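
The FTL's basic address-translation role described above can be illustrated with a toy block-level mapping table: a logical sector number is split into a logical block and an offset, the block map gives the physical block, and the offset indexes a page inside it. This shows only the generic translation step, not EAST's log-block or reallocation-block management:

```c
#include <stdio.h>

#define PAGES_PER_BLOCK 64
#define NUM_BLOCKS      16

/* Toy block-level mapping table: logical block -> physical block. */
static int block_map[NUM_BLOCKS] = {
    5, 9, 0, 12, 3, 7, 1, 14, 2, 10, 4, 8, 6, 15, 11, 13
};

/* Translate a logical sector number to a physical page number. */
static int ftl_translate(int lsn)
{
    int logical_block = lsn / PAGES_PER_BLOCK;
    int offset        = lsn % PAGES_PER_BLOCK;
    return block_map[logical_block] * PAGES_PER_BLOCK + offset;
}

int main(void)
{
    int lsn = 130;   /* logical block 2, offset 2 */
    printf("logical sector %d -> physical page %d\n", lsn, ftl_translate(lsn));
    return 0;
}
```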

A Simple Implementation of Dynamical Memory Allocation in Old-fashioned Singleton's Mixed-radix Fast Fourier Transformation Code (구식 싱글턴 혼합기수 고속푸리에변환 코드에 대한 간단한 동적메모리 할당방법 프로그래밍)

  • Kim, In-Gee
    • Journal of the Korean Magnetics Society / v.22 no.2 / pp.33-36 / 2012
  • We propose a simple prescription for resolving the general-$N$ problem in the old-fashioned mixed-radix fast Fourier transformation FORTRAN subroutine written by Singleton in 1968. After a brief investigation of the problem, we discuss our prescription together with a worst-case analysis of the dynamical allocation. The analysis reveals that our implementation is superior, at least for multi-variate data sets, to previously proposed data-copying methods.
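
The abstract does not spell out the prescription, but the general-$N$ limitation in Singleton's 1968 routine stems from fixed-size work arrays whose required lengths depend on the prime factorization of the transform length. The C sketch below only shows the preparatory step such a dynamic-allocation fix needs: factor N at run time and size the scratch storage from the largest prime factor instead of a compile-time constant (illustrative only, not the paper's code):

```c
#include <stdio.h>
#include <stdlib.h>
#include <complex.h>

/* Factor n and return its largest prime factor (1 if n == 1). */
static int largest_prime_factor(int n)
{
    int largest = 1;
    for (int p = 2; p * p <= n; p++)
        while (n % p == 0) { largest = p; n /= p; }
    return n > 1 ? n : largest;
}

int main(void)
{
    int n = 3 * 7 * 11 * 13;             /* transform length, general N */
    int maxf = largest_prime_factor(n);

    /* Instead of FORTRAN work arrays with a hard-coded maximum factor,
     * allocate scratch space sized by the actual largest factor of N.  */
    double complex *scratch = malloc((size_t)maxf * sizeof *scratch);
    if (!scratch)
        return 1;

    printf("N = %d, largest prime factor = %d, scratch = %d complex values\n",
           n, maxf, maxf);
    free(scratch);
    return 0;
}
```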