Integrated Search | Korea Science

김태균
- Proceedings of the Korean Information Science Society Conference / Proceedings of the Korean Information Science Society 2000 Fall Conference, Vol.27 No.2 (3) / pp.609-611 / 2000
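CC-NUMA systems retain the advantages of SMP systems, namely ease of programming, a flexible working environment, and ease of management, while also providing the scalability that SMP lacked. They are also regarded as a promising architectural alternative for overcoming the memory wall, the problem that arises because memory speed has barely changed while processor speed has increased rapidly. Because the logical distance between nodes in a CC-NUMA system is long, communication between processing nodes is the key factor affecting system performance. The remote cache installed in each node is therefore important as a means of minimizing inter-node communication. In this paper, we identify the types of inter-node data communication in CC-NUMA systems and analyze how their frequency changes with the block size of the remote cache. Execution-driven simulation using the instruction simulator CacheMire and FFT, one of the II benchmarks, showed that as the remote cache block size increases, both the number of inter-node communications and the absolute amount of data transferred decrease.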
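The counting methodology described in the abstract can be illustrated with a minimal sketch. It is not the authors' CacheMire setup: the fully associative LRU cache, the `count_remote_transfers` helper, the cache size, and the sequential trace are all assumptions made for illustration, and coherence traffic is not modeled.

```python
from collections import OrderedDict


def count_remote_transfers(addresses, block_size, cache_bytes):
    """Replay byte addresses through a toy fully associative LRU remote cache.

    Returns (misses, bytes_transferred). Every miss is modeled as one
    inter-node transfer of a whole block (an assumption of this sketch).
    """
    num_blocks = cache_bytes // block_size
    cache = OrderedDict()  # block number -> None, ordered by recency
    misses = 0
    for addr in addresses:
        block = addr // block_size
        if block in cache:
            cache.move_to_end(block)       # hit: refresh LRU position
        else:
            misses += 1                    # miss: fetch block from a remote node
            cache[block] = None
            if len(cache) > num_blocks:
                cache.popitem(last=False)  # evict the least recently used block
    return misses, misses * block_size


# Hypothetical trace: one sequential pass over 4 KiB of 8-byte words.
trace = list(range(0, 4096, 8))
results = {bs: count_remote_transfers(trace, bs, cache_bytes=1024)
           for bs in (16, 32, 64, 128)}
for bs, (misses, nbytes) in results.items():
    print(bs, misses, nbytes)
```

For this sequential trace the transfer count halves each time the block size doubles while the total bytes moved stay constant. The reduction in data volume that the abstract reports for FFT depends on the real access pattern, which this sketch does not capture.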

김나현;김정범
- The Journal of the Korea Institute of Electronic Communication Sciences / Vol.18 No.5 / pp.777-784 / 2023
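A sense amplifier is an essential peripheral circuit in memory design, used to detect a small differential input signal and amplify it into a digital signal. This paper proposes a high-speed sense amplifier suitable for in-memory computing circuits. The proposed circuit reduces the sensing delay through a transistor, Mtail, that provides an additional discharge path, and improves circuit performance by applying m-GDI (modified Gate Diffusion Input). Compared with the conventional structure, the sensing delay is reduced by 16.82%, the PDP (Power Delay Product) by 17.23%, and the EDP (Energy Delay Product) by 31.1%. The proposed circuit was implemented in TSMC's 65nm CMOS process, and the validity of the study was verified through SPECTRE simulation.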
https://doi.org/10.13067/JKIECS.2023.18.5.777
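The three reported reductions can be cross-checked with a little arithmetic. The sketch below assumes the common definitions PDP = power × delay and EDP = PDP × delay; the abstract does not state which definitions the authors used, and the `relative_metrics` helper is hypothetical.

```python
def relative_metrics(delay_reduction, pdp_reduction):
    """Derive power and EDP ratios (proposed / conventional) from reported reductions.

    Assumes PDP = P * t and EDP = PDP * t, both measured at the same operating point.
    """
    delay_ratio = 1.0 - delay_reduction
    pdp_ratio = 1.0 - pdp_reduction
    power_ratio = pdp_ratio / delay_ratio  # P = PDP / t
    edp_ratio = pdp_ratio * delay_ratio    # EDP = PDP * t
    return power_ratio, edp_ratio


# Reductions reported in the abstract: delay 16.82%, PDP 17.23%.
power_ratio, edp_ratio = relative_metrics(0.1682, 0.1723)
edp_reduction_pct = (1.0 - edp_ratio) * 100
print(round(power_ratio, 4), round(edp_reduction_pct, 2))
```

Under these assumptions the implied EDP reduction is about 31.15%, in line with the reported 31.1%, and the implied power ratio is just under 1. This suggests the gain comes almost entirely from the shorter delay rather than from lower power.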

김정근
- Journal of the Semiconductor & Display Technology / Vol.22 No.3 / pp.142-148 / 2023
In this paper, we identify performance issues in executing compute kernels from PolyBench on Processing-in-Memory (PIM) devices. PolyBench comprises compute kernels that are the core computational units of various data-intensive workloads, such as deep learning. Using our in-house simulator, we measured and compared various performance metrics of these workloads on traditional out-of-order and in-order processors and on PIM-based systems. As a result, the PIM-based system improves performance over the other computing models owing to the short-term data reuse characteristic of the PolyBench kernels. However, some kernels perform poorly on PIM-based systems without a multi-layer cache hierarchy because of their long-term data reuse characteristics. Our evaluation and analysis therefore suggest that further research should consider dynamic, workload-pattern-adaptive approaches to overcome the performance degradation of computational kernels with long-term data reuse and hidden data locality.
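One common way to make "short-term" versus "long-term" data reuse concrete is the reuse distance of each access. The sketch below is an illustration under that assumption, not the metric or simulator used in the paper; the `reuse_distances` helper and the two tiny traces are hypothetical.

```python
def reuse_distances(trace):
    """Return the reuse distance of each access (None for first-time accesses).

    Reuse distance = number of distinct addresses touched since the previous
    access to the same address. This simple version is quadratic in the trace
    length and is meant for small illustrative traces only.
    """
    last_seen = {}
    distances = []
    for i, addr in enumerate(trace):
        if addr in last_seen:
            between = trace[last_seen[addr] + 1:i]
            distances.append(len(set(between)))
        else:
            distances.append(None)
        last_seen[addr] = i
    return distances


short_term = ["a", "a", "b", "b", "c", "c"]  # each address is reused immediately
long_term = ["a", "b", "c", "a", "b", "c"]   # reuse only after the full working set
print(reuse_distances(short_term))
print(reuse_distances(long_term))
```

Accesses with a small reuse distance can be served from a small buffer close to the PIM logic, whereas large reuse distances only hit in a cache big enough to hold the intervening working set. That is consistent with the abstract's observation about systems lacking a multi-layer cache hierarchy.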

김정근
- Journal of the Semiconductor & Display Technology / Vol.22 No.3 / pp.78-83 / 2023
This paper proposes TP-Sim, a lightweight trace-driven Processing-In-Memory (PIM) simulator. TP-Sim is a General Purpose PIM (GP-PIM) simulator that evaluates various performance-related metrics of PIM systems. Using instruction and memory traces extracted with the Intel Pin tool, TP-Sim can replay trace files on multiple PIM architecture models to compare their performance. To demonstrate the usability of TP-Sim, we evaluated three different system configurations on the STREAM benchmark. Compared to a traditional host-CPU-only system with a conventional memory hierarchy, a simple GP-PIM architecture achieved better performance, even when the host CPU has the same number of in-order cores. For further study, we also extend TP-Sim as part of a heterogeneous system simulator that contains a CPU, GPGPU, and PIM as its primary processor and co-processors.
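The idea of trace-driven replay can be sketched in a few lines. This is not TP-Sim's trace format or timing model: the `stream_triad_trace` and `replay` helpers, the `(op, address)` record layout, and the latency values are all hypothetical placeholders chosen for illustration.

```python
def stream_triad_trace(n, word=8):
    """Generate a hypothetical memory trace for the triad a[i] = b[i] + s * c[i]."""
    a_base, b_base, c_base = 0, n * word, 2 * n * word
    trace = []
    for i in range(n):
        trace.append(("R", b_base + i * word))  # load b[i]
        trace.append(("R", c_base + i * word))  # load c[i]
        trace.append(("W", a_base + i * word))  # store a[i]
    return trace


def replay(trace, read_latency, write_latency):
    """Replay a trace on a one-access-at-a-time model and return total cycles."""
    cycles = 0
    for op, _addr in trace:
        cycles += read_latency if op == "R" else write_latency
    return cycles


trace = stream_triad_trace(1000)
# Latencies are assumed placeholders, not measured or published values.
host_cycles = replay(trace, read_latency=100, write_latency=100)  # off-chip access
pim_cycles = replay(trace, read_latency=30, write_latency=30)     # near-memory access
print(host_cycles, pim_cycles)
```

The same trace is replayed against two configurations, which is the property that makes trace-driven simulation lightweight: the workload is captured once and only the architecture model changes between runs.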