Search | Korea Science

Gated Recurrent Unit based Prefetching for Graph Processing (그래프 프로세싱을 위한 GRU 기반 프리페칭)

Shivani Jadhav;Farman Ullah;Jeong Eun Nah;Su-Kyung Yoon
- Journal of the Semiconductor & Display Technology / v.22 no.2 / pp.6-10 / 2023
Data that is likely to be accessed can be predicted and stored in the cache ahead of time to prevent cache misses, reducing the processor's request and wait times. As a result, the processor can keep working without stalls, hiding memory latency. A prefetcher improves performance by exploiting the temporal and spatial locality of memory accesses to predict which memory address will be accessed next. We propose a prefetcher that applies the Gated Recurrent Unit (GRU) model, which is well suited to time series data. The currently accessed address is represented in binary and used as training data, and the GRU model is trained on the difference (delta) between consecutive memory accesses. Finally, using the GRU model with the learned memory access patterns, the proposed data prefetcher predicts the memory address to be accessed next. We compared the model with a multi-layer perceptron (MLP), and our prefetcher showed better results.
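The delta-based preprocessing described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the history window of 3, the 16-bit width, and the choice to binary-encode the deltas (rather than the raw addresses) are all assumptions made for illustration, and the GRU itself is left out.

```python
from typing import List, Tuple


def address_deltas(addresses: List[int]) -> List[int]:
    """Differences between consecutive memory addresses."""
    return [b - a for a, b in zip(addresses, addresses[1:])]


def to_bits(value: int, width: int = 16) -> List[int]:
    """Two's-complement bit vector of `value`, most significant bit first."""
    masked = value & ((1 << width) - 1)
    return [(masked >> i) & 1 for i in range(width - 1, -1, -1)]


def make_training_pairs(
    addresses: List[int], window: int = 3, width: int = 16
) -> List[Tuple[List[List[int]], int]]:
    """Build (history, target) pairs for a sequence model.

    history: `window` consecutive deltas, each encoded as a bit vector
    target:  the delta that follows the history (hypothetical layout)
    """
    deltas = address_deltas(addresses)
    pairs = []
    for i in range(len(deltas) - window):
        history = [to_bits(d, width) for d in deltas[i:i + window]]
        pairs.append((history, deltas[i + window]))
    return pairs
```

A predicted delta would be added to the current address to obtain the address to prefetch; a regular stride such as 0, 8, 16, 24 produces a constant delta, which is the kind of pattern a recurrent model can pick up.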

Efficient Processing of Grouped Aggregation on Non-Uniformed Memory Access Architecture (비균등 메모리 접근 구조에서의 효율적인 그룹화 집단 연산의 처리)

Choe, Seongjun;Min, Jun-Ki
- Database Research / v.34 no.3 / pp.14-27 / 2018
Recently, Non-Uniform Memory Access (NUMA) architecture was proposed to alleviate the memory bottleneck problem that occurs in Symmetric Multiprocessing (SMP) architecture. In addition, since the aggregation operator is an important operator that provides properties and summaries of data, its efficiency is crucial to the overall performance of a system. Thus, in this paper, we propose an efficient aggregation processing technique on NUMA architecture. Our proposed technique consists of a partition phase and a merge phase. In the partition phase, the target relation is partitioned into several partial relations according to the grouping attribute. Since each thread can then process the aggregation operator on its partial relation independently, we prevent remote memory access during the merge phase. Furthermore, in the merge phase, we improve the performance of aggregation processing by letting each thread compute the aggregation with a local hash table and by avoiding the lock contention that arises when merging the aggregation results generated by all threads into one.
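The partition-then-aggregate idea can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the aggregate is a plain sum, and Python threads do not model NUMA memory placement. The point is that partitioning by the grouping attribute gives each worker a disjoint set of groups, so each can use a private hash table and the final combination needs no locks.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def partition(rows, num_partitions):
    """Split (key, value) rows so that each group lands in exactly one partition."""
    parts = [[] for _ in range(num_partitions)]
    for key, value in rows:
        parts[hash(key) % num_partitions].append((key, value))
    return parts


def aggregate_partition(part):
    """Aggregate one partition with a thread-local hash table (here: SUM)."""
    local = defaultdict(int)
    for key, value in part:
        local[key] += value
    return dict(local)


def grouped_sum(rows, num_partitions=4):
    parts = partition(rows, num_partitions)
    with ThreadPoolExecutor(max_workers=num_partitions) as pool:
        partials = list(pool.map(aggregate_partition, parts))
    result = {}
    for partial in partials:
        # Key sets are disjoint across partitions, so this is a plain
        # concatenation of results rather than a contended merge.
        result.update(partial)
    return result
```

For example, `grouped_sum([("a", 1), ("b", 2), ("a", 3)])` groups both `"a"` rows into the same partition before summing them.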

Memory management in high-speed Viterbi decoders (고속 Viterbi 복호기를 위한 메모리 관리)

임민중
- Journal of the Korean Institute of Telematics and Electronics C / v.35C no.7 / pp.30-36 / 1998
Memory management is one of the most important problems in implementing Viterbi decoders. This paper introduces a novel traceback scheme for memory management in high-speed Viterbi decoders. The new method balances the read and write operations by inserting dummy write operations into the traceback process, resulting in simpler memory access schemes. It is suitable for VLSI implementation because it has minimal memory requirements, needs no global interconnections, and uses a very simple address generation scheme for accessing memory contents.

A Design and Verification of an Efficient Control Unit for Optical Processor (광프로세서를 위한 효율적인 제어회로 설계 및 검증)

Lee Won-Joo
- Journal of the Institute of Electronics Engineers of Korea CI / v.43 no.4 s.310 / pp.23-30 / 2006
This paper presents the design and verification of a circuit that improves the control-operation problems of the Stored Program Optical Computer (SPOC), an optical computer that uses $LiNbO_3$ optical switching elements. Because the SPOC memory uses a Delay Line Memory (DLM) architecture and even instructions that need no operands must go through the memory access stages, the SPOC has two problems: memory access takes excessive time, and unnecessary operations are executed in the Arithmetic Logic Unit (ALU) because the desired operations cannot be executed selectively. In this paper, the circuit is improved by decoding instructions before locating operands, which removes the memory access for instructions that need no operands. Unnecessary operations are reduced by sending operands only to specific operational units rather than to all operational units in the ALU. We show that the total execution time of a program is minimized by using the Dual Instruction Register (DIR) architecture.

Study on the Performance Evaluation and Analysis of Mobile Cache Memory

Lee, Sangmin;Kim, Jongwan;Kim, Ji Young;Oh, Dukshin
- Journal of the Korea Society of Computer and Information / v.25 no.6 / pp.99-107 / 2020
In this paper, we analyze the characteristics of mobile cache, which is used to improve data access speed when executing applications on mobile devices, and verify the importance of mobile cache through a cache data access experiment. The mobile device market has grown at a fast pace over the past decade; however, battery limitations and size and price considerations restrict the use of fast hardware. Thus, performance is supplemented by using a memory buffer structure such as cache memory. The analysis mainly focuses on cache size, the hierarchical structure of the cache, the cache replacement policy, and the effect these features have on mobile performance. For the experimental data, we applied a data set from a microprocessor system study that was originally used to test cache performance. In the experimental results, the average data access speed on a mobile device was up to 10 times faster with cache memory than without it. Accordingly, cache memory helped improve the performance of a mobile device when the specifications were otherwise identical.
https://doi.org/10.9708/jksci.2020.25.06.099

Reducing the Overhead of Virtual Address Translation Process (가상주소 변환 과정에 대한 부담의 줄임)

U, Jong-Jeong
- The Transactions of the Korea Information Processing Society / v.3 no.1 / pp.118-126 / 1996
A memory hierarchy is a useful mechanism for improving memory access speed and enlarging the program space by layering memories and separating program spaces from memory spaces. However, it needs at least two memory accesses for each data reference: a Translation Lookaside Buffer (TLB) access for the address translation and a data cache access for the desired data. If the cache size grows beyond the product of the page size and the cache associativity, it is difficult to access the TLB in parallel with the cache, which lengthens the critical timing path in the processor. To achieve such parallel accesses, we present the hybrid mapped TLB, which combines a direct mapped TLB with a very small fully-associative mapped TLB. The former reduces the TLB access time, while the latter removes the conflict misses of the former. Trace-driven simulation shows that, under the given workloads, the proposed TLB is effective even when a fully-associative mapped TLB with only four entries is added, because the effects of its increased misses are offset by its speed benefits.
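One way to picture a hybrid mapped TLB is the sketch below. The abstract does not state how entries move between the two structures, so the policy here is an assumption: entries evicted from the direct mapped part by a conflict are kept in the small fully-associative part, which is replaced in least-recently-used order. The class name, sizes, and method names are hypothetical.

```python
from collections import OrderedDict


class HybridTLB:
    """Direct mapped TLB backed by a tiny fully-associative buffer (sketch)."""

    def __init__(self, direct_entries=64, assoc_entries=4):
        self.direct = [None] * direct_entries  # each slot: (vpn, pfn) or None
        self.assoc = OrderedDict()             # vpn -> pfn, least recent first
        self.assoc_entries = assoc_entries

    def lookup(self, vpn):
        """Return the physical frame number for `vpn`, or None on a TLB miss."""
        slot = self.direct[vpn % len(self.direct)]
        if slot is not None and slot[0] == vpn:
            return slot[1]                     # fast path: direct mapped hit
        if vpn in self.assoc:
            self.assoc.move_to_end(vpn)        # refresh recency
            return self.assoc[vpn]             # conflict miss absorbed here
        return None

    def insert(self, vpn, pfn):
        """Install a translation after a miss has been resolved."""
        index = vpn % len(self.direct)
        victim = self.direct[index]
        self.direct[index] = (vpn, pfn)
        self.assoc.pop(vpn, None)              # keep a single copy per vpn
        if victim is not None and victim[0] != vpn:
            self.assoc[victim[0]] = victim[1]  # assumed policy: keep the victim
            self.assoc.move_to_end(victim[0])
            if len(self.assoc) > self.assoc_entries:
                self.assoc.popitem(last=False)  # drop the least recent entry
```

With 4 direct entries, virtual pages 1 and 5 map to the same slot; after inserting both, page 1 is still found through the associative part instead of causing a conflict miss.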

A Study on the Methods to Manage Private Records Utilizing AtoM (Access to Memory): Focused on 'Archive Village' (AtoM을 활용한 민간기록물 관리방안 -'기록사랑마을' 중심으로-)

Yuk, Hye-In;Kim, Yong;Jang, Jun-Kab
- Journal of the Korean BIBLIA Society for library and Information Science / v.26 no.2 / pp.79-105 / 2015
This study aims to propose management plans for the 'Archive Villages' that are operated to protect important private records and archives. Since 2008, the Archive Villages of the National Archives of Korea gained nearly 3,000 additional records through 2014. However, many users have had difficulty learning the status of the recorded material, let alone accessing information about the records. The underlying problem is that the records are difficult to manage and use. The purpose of this study is to propose a plan for managing the records by realizing the 'computerization of records', one of the official opinions raised in a previous study. The project faces the issues of 'human resources and costs' and 'the burden of system construction'. Taking these problems into account, this study implements a records management system utilizing AtoM (Access to Memory).
https://doi.org/10.14699/kbiblia.2015.26.2.079

A Low Power Phase-Change Random Access Memory Using A Selective Data Write Scheme (선택적 데이터 쓰기 기법을 이용한 저전력 상변환 메모리)

Yang, Byung-Do
- Journal of the Institute of Electronics Engineers of Korea SD / v.44 no.1 / pp.45-50 / 2007
This paper proposes a low power selective data write (SDW) scheme for phase-change random access memory (PRAM). PRAM consumes a large amount of write power because large write currents are required for a long time. The SDW scheme first reads the stored data during a write operation and then writes the input data only where the input and stored data differ. It can therefore reduce the write power consumption to half. A 1K-bit PRAM test chip organized as $128 \times 8$ bits is implemented in a $0.8\,\mu m$ CMOS technology with a $0.8\,\mu m$ GST cell.
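The read-before-write idea can be modeled at the bit level with a short sketch. This is a software illustration only, not the circuit from the paper; the function name and the 8-bit word width are assumptions. An XOR of the stored and incoming words marks the cells that differ, and only those cells would receive a write current.

```python
def selective_write(stored: int, incoming: int, width: int = 8):
    """Return (new_value, cells_written) for one selective write.

    Only bits that differ between `stored` and `incoming` are counted
    as written; unchanged cells are skipped.
    """
    mask = (1 << width) - 1
    changed = (stored ^ incoming) & mask      # 1 where the cell must change
    cells_written = bin(changed).count("1")
    new_value = incoming & mask
    return new_value, cells_written
```

For uniformly random data, each bit differs from the stored bit with probability one half, so on average about half of the cells are written, which is in line with the halving of write power stated in the abstract.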

A Design of Direct Memory Access (DMA) Controller For H.264 Encoder (H.264 Encoder용 Direct Memory Access (DMA) 제어기 설계)

Song, In-Keun
- Journal of the Korea Institute of Information and Communication Engineering / v.14 no.2 / pp.445-452 / 2010
In this paper, we design a fully hardware-based controller for an H.264 Level 3 encoder of the Baseline profile. The designed controller module first stores the images supplied from a CMOS Image Sensor (CIS) in main memory, and then reads or stores the image data in macroblock units according to the encoder operation. The DMA controller requires 478 cycles to process one macroblock. To verify our design, we developed a reference C encoder compatible with JM 9.4 and verified the designed controller using test vectors generated from the reference C code. The cycle count of the designed DMA controller is 40% lower than that of a design using Xilinx MIG.
https://doi.org/10.6109/jkiice.2010.14.2.445

Improving the Reliability by Straight Channel of As₂Se₃-based Resistive Random Access Memory (As₂Se₃ 기반 Resistive Random Access Memory의 채널 직선화를 통한 신뢰성 향상)

Nam, Ki-Hyun;Kim, Chung-Hyeok
- Journal of the Korean Institute of Electrical and Electronic Material Engineers / v.29 no.6 / pp.327-331 / 2016
Resistive random access memory (ReRAM) with a metallic conduction channel mechanism is based on the electrochemical control of metal in a solid electrolyte thin film. Amorphous chalcogenide materials have both solid electrolyte characteristics and optical reactivity. The optical reactivity has been used to improve the memory switching characteristics of amorphous $As_2Se_3$-based ReRAM. This study focuses on forming holographic lattice patterns in the amorphous $As_2Se_3$ thin film to obtain a straight conductive channel. The optical parameters of the amorphous $As_2Se_3$ thin film, namely the refractive index and the extinction coefficient, were measured with an n&k thin film analyzer. A He-Cd laser (wavelength: 325 nm) was selected based on these basic optical parameters. The straightened conduction channel was formed by holographic lithography using the He-Cd laser. $Ag^+$ ions that are periodically photo-diffused by holographic lithography serve as the straight channel patterns. The fabricated ReRAM operated at a lower voltage and showed better reliability.
https://doi.org/10.4313/JKEM.2016.29.6.327

Search Result 1,134, Processing Time 0.03 seconds
