• Title/Abstract/Keywords: Heterogeneous Memory

71 search results, processing time 0.02 seconds

Algorithmic GPGPU Memory Optimization

  • Jang, Byunghyun;Choi, Minsu;Kim, Kyung Ki
    • JSTS:Journal of Semiconductor Technology and Science
    • /
    • Vol. 14, No. 4
    • /
    • pp.391-406
    • /
    • 2014
  • The performance of General-Purpose computation on Graphics Processing Units (GPGPU) depends heavily on memory access behavior. This sensitivity is due to a combination of the underlying Massively Parallel Processing (MPP) execution model present on GPUs and the lack of architectural support for handling irregular memory access patterns. Application performance can be significantly improved by applying memory-access-pattern-aware optimizations that exploit knowledge of the characteristics of each access pattern. In this paper, we present an algorithmic methodology to semi-automatically find the best mapping of the memory accesses present in serial loop nests to underlying data-parallel architectures, based on a comprehensive static memory access pattern analysis. To that end, we present a simple yet powerful mathematical model that captures all memory access pattern information present in serial data-parallel loop nests. We then show how this model is used in practice to select the most appropriate memory space for data and to search for an appropriate thread mapping and work group size from a large design space. To evaluate the effectiveness of our methodology, we report execution speedups for selected benchmark kernels that cover a wide range of memory access patterns commonly found in GPGPU workloads. Our experimental results are reported using the industry-standard heterogeneous programming language, OpenCL, targeting the NVIDIA GT200 architecture.

Relationship of Working Memory, Processing Speed, and Fluid Reasoning in Psychiatric Patients

  • Kim, Se-Jin;Park, Eun Hee
    • Psychiatry investigation
    • /
    • Vol. 15, No. 12
    • /
    • pp.1154-1161
    • /
    • 2018
  • Objective: The present study aimed to investigate the relationship between cognitive factors (working memory and processing speed) and fluid reasoning (Gf) in psychiatric patients using a standardized clinical tool. Methods: We included the responses of 115 heterogeneous patients who were diagnosed with the MINI-Plus 5.0 and to whom the WAIS-IV/WMS-IV was administered. For our analysis, structural equation modeling (SEM) was conducted to evaluate which cognitive variables are closely related to Gf. Results: Visual working memory was the strongest predictor of Gf compared with the other cognitive factors. Conclusion: Processing speed was capable of predicting Gf when visual working memory was controlled. The inter-relationship among Gf and the other cognitive factors and its clinical implications are further discussed.

An Efficient Buffer Cache Management Scheme for Heterogeneous Storage Environments

  • 이세환;고건;반효경
    • Journal of KIISE: Computer Systems and Theory
    • /
    • Vol. 37, No. 5
    • /
    • pp.285-291
    • /
    • 2010
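  • Flash memory has many advantages over hard disks, such as smaller size, greater resistance to physical shock, and lower power consumption, but its price per unit of capacity is still too high for it to replace hard disks entirely. Recently, some mobile devices such as notebook computers have attempted to use hard disks and flash memory together to maximize the strengths of both media. However, existing operating systems are optimized for single-storage environments rather than heterogeneous storage environments, and so fail to exploit these advantages fully. To address this, this paper proposes a new buffer cache management scheme that uses three techniques. First, it detects I/O access patterns, analyzes the performance characteristics of each block's storage location, and then allocates buffer cache space based on dynamic marginal utility. Second, it applies prefetching selectively according to the I/O access pattern and the storage device characteristics. Third, when a block is evicted from the buffer cache to storage, it decides whether the hard disk or the flash memory is the more suitable medium based on the block's access pattern and stores the block on that medium. Trace-based simulation shows that, compared with existing schemes, the proposed techniques improve the buffer cache hit ratio by 29.9% and the total execution time by 49.5%.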

Rapid Data Allocation Technique for Multiple Memory Bank Architectures

  • 조정훈;백윤홍;최준식
    • Korean Institute of Information Scientists and Engineers (KIISE): Conference Proceedings
    • /
    • Proceedings of the KIISE 2003 Fall Conference, Vol.30 No.2 (1)
    • /
    • pp.196-198
    • /
    • 2003
  • Virtually every digital signal processor (DSP) supports on-chip multi-memory banks that allow the processor to access multiple words of data from memory in a single instruction cycle. Also, all existing fixed-point DSPs have an irregular, heterogeneous register architecture that contains multiple register files distributed and dedicated to different sets of instructions. Although several studies have addressed how to assign data efficiently to multi-memory banks, most of them assumed processors with relatively simple, homogeneous general-purpose registers. Therefore, several vendor-provided compilers for DSPs were unable to assign data efficiently to multiple data memory banks, thereby often failing to generate highly optimized code for their machines. This paper presents an algorithm that helps the compiler assign data efficiently to multi-memory banks. Our algorithm differs from previous work in that it assigns variables to memory banks in separate, decoupled code generation phases instead of a single, tightly coupled phase. The experimental results reveal that our decoupled algorithm greatly simplifies our code generation process; thus our compiler runs extremely fast, yet generates target code that is comparable in quality to the code generated by a coupled approach.

Ultimate Heterogeneous Integration Technology for Super-Chip

  • 이강욱
    • Journal of the Microelectronics and Packaging Society
    • /
    • Vol. 17, No. 4
    • /
    • pp.1-9
    • /
    • 2010
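  • This paper discusses the current status and challenges of three-dimensional (3-D) integration technology and the need for the new 3-D integration technologies that will be required in the future. It introduces self-assembled wafer-level integration technology, known as super-chip technology, and 3-D heterogeneous integration technology. By applying a new integration technique that uses the surface tension of a liquid to mount many known good dies (KGDs) onto a support substrate in a single batch, and by stacking self-assembled wafers composed only of KGDs in multiple layers, chips of different sizes were successfully stacked. In addition, 3-D heterogeneous integration technology was used to integrate heterogeneous devices in three dimensions into a single system, including electrical devices such as CMOS LSI and MEMS sensors, optical devices such as PDs and VCSELs, and micro-fluidic devices. These technologies are key to the future practical use of TSVs and to realizing the super-chip, the ultimate 3-D IC.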

Designing Hybrid HDD using SLC/MLC combined Flash Memory

  • 홍성철;신동군
    • Journal of KIISE: Computing Practices and Letters
    • /
    • Vol. 16, No. 7
    • /
    • pp.789-793
    • /
    • 2010
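  • Flash memory-based non-volatile caches have recently emerged as an effective solution for improving the performance and power consumption of storage devices. To improve storage performance and reduce power consumption with a non-volatile cache, it is preferable to use multi-level-cell (MLC) flash memory, which is inexpensive and offers large capacity. However, because the lifetime of MLC flash memory is much shorter than that of single-level-cell (SLC) flash memory, the lifetime of the entire storage device may be shortened. To minimize this weakness, a non-volatile cache that combines SLC and MLC flash memory can be considered. This paper proposes a new hybrid hard disk architecture that uses combined SLC and MLC flash memory as a buffer.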

A Novel Memory Hierarchy for Flash Memory Based Storage Systems

  • Yim, Keno-Soo
    • JSTS:Journal of Semiconductor Technology and Science
    • /
    • Vol. 5, No. 4
    • /
    • pp.262-269
    • /
    • 2005
  • Semiconductor scientists and engineers ideally want non-volatile memory devices that are both faster and cheaper. In practice, no single device satisfies this desire because a faster device is expensive and a cheaper one is slow. Therefore, in this paper, we use heterogeneous non-volatile memories and construct an efficient hierarchy for them. First, a small RAM device (e.g., MRAM, FRAM, or PRAM) is used as a write buffer for flash memory devices. Since the buffer is faster and does not require an erase operation, writes complete quickly in the buffer, shortening the write latency. Also, if a write targets data already stored in the buffer, the write is processed directly in the buffer, saving one write operation to flash storage. Second, we use several types of flash memory (e.g., SLC and MLC flash memories) in order to reduce the overall storage cost. Specifically, write requests are classified into two types, hot and cold, where hot data is likely to be modified in the near future. Only hot data is stored in the faster SLC flash, while cold data is kept in slower MLC flash or NOR flash. The evaluation results show that the proposed hierarchy improves the access time of flash memory storage in a cost-effective manner thanks to the locality in memory accesses.
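
The two mechanisms in this abstract (a small write buffer that absorbs repeated writes, and hot/cold routing to SLC or MLC flash) can be illustrated with a toy Python model. This is only a sketch under stated assumptions: the class name `HybridFlashStore`, the oldest-first flush order, and the write-count threshold used to label data as hot are all hypothetical and are not taken from the paper.

```python
from collections import OrderedDict


class HybridFlashStore:
    """Toy model: a small RAM write buffer in front of SLC (hot) and MLC (cold) flash."""

    def __init__(self, buffer_capacity=2, hot_threshold=2):
        self.buffer = OrderedDict()          # page -> data, oldest entry first
        self.buffer_capacity = buffer_capacity
        self.hot_threshold = hot_threshold   # hypothetical cut-off, not from the paper
        self.write_counts = {}               # page -> number of writes requested so far
        self.slc = {}                        # faster flash, holds hot pages
        self.mlc = {}                        # slower flash, holds cold pages
        self.flash_writes = 0                # writes that actually reached flash

    def write(self, page, data):
        self.write_counts[page] = self.write_counts.get(page, 0) + 1
        if page in self.buffer:
            # The page is already buffered: overwrite it in place, no flash write.
            self.buffer[page] = data
            self.buffer.move_to_end(page)
            return
        if len(self.buffer) >= self.buffer_capacity:
            self._flush_oldest()
        self.buffer[page] = data

    def _flush_oldest(self):
        page, data = self.buffer.popitem(last=False)
        is_hot = self.write_counts[page] >= self.hot_threshold
        target, other = (self.slc, self.mlc) if is_hot else (self.mlc, self.slc)
        other.pop(page, None)                # drop any stale copy on the other medium
        target[page] = data
        self.flash_writes += 1


store = HybridFlashStore(buffer_capacity=2, hot_threshold=2)
requests = [("a", "v1"), ("a", "v2"), ("b", "v1"), ("c", "v1"), ("d", "v1")]
for page, data in requests:
    store.write(page, data)

print(store.slc, store.mlc, store.flash_writes)
```

In this run, five write requests result in two flash writes: the second write to page `a` is absorbed by the buffer, `a` is later flushed to SLC as hot, and `b` is flushed to MLC as cold.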

Reevaluating the overhead of data preparation for asymmetric multicore system on graphics processing

  • Pei, Songwen;Zhang, Junge;Jiang, Linhua;Kim, Myoung-Seo;Gaudiot, Jean-Luc
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 10, No. 7
    • /
    • pp.3231-3244
    • /
    • 2016
  • As processor design transitions from homogeneous multicore processors to heterogeneous multicore processors, the traditional Amdahl's law cannot meet the new challenges posed by asymmetric multicore systems. To further investigate the factors that affect the Overhead of Data Preparation (ODP) for asymmetric multicore systems, we evaluate an asymmetric multicore system built from a CPU and a GPU by measuring the overheads of memory transfer, kernel computation, cache misses, and synchronization. This paper demonstrates that decreasing the overhead of data preparation is a promising approach to improving the overall performance of heterogeneous systems.

Neural network heterogeneous autoregressive models for realized volatility

  • Kim, Jaiyool;Baek, Changryong
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 25, No. 6
    • /
    • pp.659-671
    • /
    • 2018
  • In this study, we consider an extension of the heterogeneous autoregressive (HAR) model for realized volatility that incorporates a neural network (NN) structure. Since HAR is a linear model, we expect that adding a neural network term will explain the delicate nonlinearity of realized volatility. Three neural network-based HAR models, namely HAR-NN, HAR($\infty$)-NN, and HAR-AR(22)-NN, are considered, with performance measured by out-of-sample forecasting errors. The results of the study show that HAR-NN provides a slightly wider interval than traditional HAR and shows more peaks and valleys at the turning points. This implies that the HAR-NN model can capture sharper changes due to higher volatility than the traditional HAR model. The HAR-NN model is therefore recommended for prediction intervals to account for higher volatility in the stock market. An empirical analysis of the multinational realized volatility of stock indexes shows that the HAR-NN model, which adds daily, weekly, and monthly volatility averages to the neural network model, exhibits the best performance.
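
The daily, weekly, and monthly volatility averages mentioned in this abstract are the regressors of the linear HAR model. The Python sketch below builds only those regressors. It assumes the conventional window lengths of 5 and 22 trading days, which the abstract does not state, and the function name `har_features` is hypothetical. The neural network term of HAR-NN is not sketched.

```python
def har_features(rv, weekly=5, monthly=22):
    """Build (regressors, targets) for a linear HAR regression.

    Each regressor row holds the latest daily value, the mean of the last
    `weekly` values, and the mean of the last `monthly` values; the target
    is the next day's realized volatility. Window lengths are assumptions.
    """
    rows, targets = [], []
    for t in range(monthly - 1, len(rv) - 1):
        daily = rv[t]
        week_avg = sum(rv[t - weekly + 1 : t + 1]) / weekly
        month_avg = sum(rv[t - monthly + 1 : t + 1]) / monthly
        rows.append((daily, week_avg, month_avg))
        targets.append(rv[t + 1])
    return rows, targets


# Synthetic series 1.0, 2.0, ..., 24.0 used only to show the shapes involved.
rv = [float(i) for i in range(1, 25)]
X, y = har_features(rv)
print(X)  # two rows of (daily, weekly average, monthly average)
print(y)  # the corresponding next-day values
```

The rows of `X` and the values in `y` could then be passed to any least-squares routine to estimate the linear HAR coefficients.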

Integer-Valued HAR(p) model with Poisson distribution for forecasting IPO volumes

  • SeongMin Yu;Eunju Hwang
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 30, No. 3
    • /
    • pp.273-289
    • /
    • 2023
  • In this paper, we develop a new time series model for predicting non-negative integer-valued IPO (initial public offering) data. The proposed model is based on the integer-valued autoregressive (INAR) model with a Poisson thinning operator. Just as the heterogeneous autoregressive (HAR) model uses daily, weekly, and monthly averages in a cascade form, the integer-valued heterogeneous autoregressive (INHAR) model is designed to reflect long memory efficiently. The parameters of the INHAR model are estimated using the conditional least squares estimate and the Yule-Walker estimate. Through simulations, bias and standard error are calculated to compare the performance of the estimates. The fit of the model to Korea's IPO data is evaluated using performance measures such as mean absolute error (MAE), root mean square error (RMSE), and mean absolute percentage error (MAPE). The results show that the INHAR model performs better than the traditional INAR model. The empirical analysis of Korea's IPO data indicates that our proposed model is efficient in forecasting monthly IPO volumes.
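
The three performance measures named in this abstract follow their standard definitions, sketched below in Python. MAE is taken here as mean absolute error. The sample numbers are made up for illustration and are not results from the paper. MAPE divides by the actual value, so this sketch assumes every actual count is nonzero.

```python
import math


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent; actual values must be nonzero."""
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


# Illustrative monthly counts only (not data from the paper).
actual = [4, 2, 5, 1]
predicted = [3, 2, 7, 1]
print(mae(actual, predicted), rmse(actual, predicted), mape(actual, predicted))
```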