• Title/Summary/Keyword: Near-Memory Processing

Efficient Hybrid Transactional Memory Scheme using Near-optimal Retry Computation and Sophisticated Memory Management in Multi-core Environment

  • Jang, Yeon-Woo;Kang, Moon-Hwan;Chang, Jae-Woo
    • Journal of Information Processing Systems / v.14 no.2 / pp.499-509 / 2018
  • Hybrid transactional memory (HyTM) has recently attracted considerable research interest because it combines the advantages of hardware transactional memory (HTM) and software transactional memory (STM). Existing HyTM-based studies use a Bloom filter for concurrency control of transactions, but they fail to overcome the false positive errors typical of Bloom filters. In addition, the existing studies rely on a global lock, and global lock-based memory allocation is very inefficient in a multi-core environment. In this paper, we propose an efficient hybrid transactional memory scheme that uses near-optimal retry computation and sophisticated memory management to process transactions efficiently in a multi-core environment. First, we propose a near-optimal retry computation algorithm that uses machine learning algorithms to provide an efficient HTM configuration according to the characteristics of a given workload. Second, we provide efficient concurrency control for transactions in different environments by using a sophisticated Bloom filter. Third, we propose a memory management scheme optimized for the CPU cache line to provide fast transaction processing. Finally, our performance evaluation on the Stanford Transactional Applications for Multi-Processing (STAMP) benchmarks shows that our HyTM scheme achieves up to 2.5 times better performance than the state-of-the-art algorithms.
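
The false positive problem mentioned in this abstract can be illustrated with a minimal Bloom filter sketch. Everything below is an assumption made for illustration: the class name `BloomFilter`, the SHA-256-based hashing, and the helper `count_false_positives` are hypothetical and are not taken from the paper, whose "sophisticated" filter is not described in the abstract. The sketch only shows why a plain Bloom filter used as a write-set signature can report conflicts for addresses that were never written.

```python
import hashlib


class BloomFilter:
    """Plain Bloom filter (illustrative only, not the paper's variant)."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int serves as the bit array

    def _positions(self, key: int):
        # Derive num_hashes bit positions from salted SHA-256 digests.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key: int) -> None:
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key: int) -> bool:
        # True means "possibly present"; False means "definitely absent".
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


def count_false_positives(bf: BloomFilter, absent_keys) -> int:
    """Count keys that were never added but are still reported as present."""
    return sum(1 for key in absent_keys if bf.might_contain(key))


if __name__ == "__main__":
    # Hypothetical write set: 100 addresses tracked in a deliberately tiny filter.
    signature = BloomFilter(num_bits=64, num_hashes=2)
    for addr in range(100):
        signature.add(addr)
    # Addresses 1000..1999 were never written, so every hit is a false conflict.
    print(count_false_positives(signature, range(1000, 2000)))
```

A filter never yields false negatives, so real conflicts are always detected; the cost is spurious conflicts, which shrink as the bit array grows relative to the number of tracked addresses but never vanish entirely.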

Increased Ventrolateral Prefrontal Cortex Activation during Accurate Eyewitness Memory Retrieval: An Exploratory Functional Near-Infrared Spectroscopy Study (목격 여부에 따른 배가쪽 이마앞 영역의 활성화 차이: Functional Near-Infrared Spectroscopy Study 연구)

  • Ham, Keunsoo;Kim, Ki Pyoung;Jeong, Hojin;Yoo, Seong Ho
    • The Korean Journal of Legal Medicine / v.42 no.4 / pp.146-152 / 2018
  • We investigated the neural correlates of accurate eyewitness memory retrieval using functional near-infrared spectroscopy. We analyzed oxygenated hemoglobin ($HbO_2$) concentration in the prefrontal cortex during an eyewitness memory retrieval task and examined regional $HbO_2$ differences between observed objects (targets) and unobserved objects (lures). We found that target objects elicited increased activation in the bilateral ventrolateral prefrontal cortex, which is known for monitoring retrieval processing via bottom-up attentional processing. Our results suggest that bottom-up attentional mechanisms could be different during accurate eyewitness memory retrieval. These findings indicate that investigating retrieval mechanisms using functional near-infrared spectroscopy might be useful for establishing an accurate eyewitness recognition model.

Design of Low Power Current Memory Circuit based on Voltage Scaling (Voltage Scaling 기반의 저전력 전류메모리 회로 설계)

  • Yeo, Sung-Dae;Kim, Jong-Un;Cho, Tae-Il;Cho, Seung-Il;Kim, Seong-Kweon
    • The Journal of the Korea Institute of Electronic Communication Sciences / v.11 no.2 / pp.159-164 / 2016
  • A wireless communication system must be implemented with low-power circuits because it runs on a battery with limited energy. Current-mode circuits have therefore been studied because they consume constant power regardless of frequency changes. However, the clock-feedthrough problem arises from leakage of stored energy during memory operation. In this paper, we propose a current memory circuit that minimizes the clock-feedthrough problem and introduce a technique for ultra-low-power operation based on dynamic voltage scaling. The current memory circuit was designed with the BSIM3 model of a $0.35\,\mu\mathrm{m}$ process and operated in the near-threshold region. Simulation results show that clock-feedthrough is minimized with a memory MOS width of $2\,\mu\mathrm{m}$, a switch MOS width of $0.3\,\mu\mathrm{m}$, and a dummy MOS width of $13\,\mu\mathrm{m}$ at a switching frequency of 1 MHz. The power consumption was calculated to be $3.7\,\mu\mathrm{W}$ at a near-threshold supply voltage of 1.2 V.

Trends in Compute Express Link(CXL) Technology (CXL 인터커넥트 기술 연구개발 동향)

  • S.Y. Kim;H.Y. Ahn;Y.M. Park;W.J. Han
    • Electronics and Telecommunications Trends / v.38 no.5 / pp.23-33 / 2023
  • With the widespread demand from data-intensive tasks such as machine learning and large-scale databases, the amount of data processed in modern computing systems is increasing exponentially. Such data-intensive tasks require large amounts of memory to rapidly process and analyze massive data. However, existing computing system architectures face challenges when building large-scale memory owing to various structural issues such as CPU specifications. Moreover, large-scale memory may cause problems including memory overprovisioning. The Compute Express Link (CXL) allows computing nodes to use large amounts of memory while mitigating related problems. Hence, CXL is attracting great attention in industry and academia. We describe the overarching concepts underlying CXL and explore recent research trends in this technology.

Enhanced Prediction Algorithm for Near-lossless Image Compression with Low Complexity and Low Latency

  • Son, Ji Deok;Song, Byung Cheol
    • IEIE Transactions on Smart Processing and Computing / v.5 no.2 / pp.143-151 / 2016
  • This paper presents new prediction methods to improve compression performance of the so-called near-lossless RGB-domain image coder, which is designed to effectively decrease the memory bandwidth of a system-on-chip (SoC) for image processing. First, variable block size (VBS)-based intra prediction is employed to eliminate spatial redundancy for the green (G) component of an input image on a pixel-line basis. Second, inter-color prediction (ICP) using spectral correlation is performed to predict the R and B components from the previously reconstructed G-component image. Experimental results show that the proposed algorithm improves coding efficiency by up to 30% compared with an existing algorithm for natural images, and improves coding efficiency with low computational cost by about 50% for computer graphics (CG) images.
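
The idea of predicting R and B from the reconstructed G component can be sketched as follows. This is not the paper's predictor, which the abstract does not specify. As stated assumptions, the sketch uses the simplest possible inter-color predictor (the co-located reconstructed G sample) and a JPEG-LS-style uniform residual quantizer with error bound `delta`. The function names are hypothetical.

```python
def quantize_residual(residual: int, delta: int) -> int:
    """Map a residual to a quantization index so that |error| <= delta."""
    step = 2 * delta + 1
    sign = -1 if residual < 0 else 1
    return sign * ((abs(residual) + delta) // step)


def encode_channel(channel, green_recon, delta):
    """Predict each R (or B) sample from the co-located reconstructed G sample
    and quantize the prediction residual."""
    return [quantize_residual(c - g, delta) for c, g in zip(channel, green_recon)]


def decode_channel(indices, green_recon, delta):
    """Rebuild the channel from quantization indices and the same G predictor."""
    step = 2 * delta + 1
    # Clipping to the 8-bit range cannot increase the error, because the
    # original samples already lie in [0, 255].
    return [min(255, max(0, g + q * step)) for q, g in zip(indices, green_recon)]


if __name__ == "__main__":
    red = [200, 13, 90, 255, 0]        # hypothetical 8-bit R samples
    green = [190, 20, 91, 250, 7]      # hypothetical reconstructed G samples
    indices = encode_channel(red, green, delta=2)
    recon = decode_channel(indices, green, delta=2)
    print(max(abs(a - b) for a, b in zip(red, recon)))
```

Because R and G are strongly correlated in natural images, the residuals cluster near zero and cost few bits to code; setting `delta` to 0 makes the same scheme lossless.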

Enhanced Prediction for Low Complexity Near-lossless Compression (낮은 복잡도의 준무손실 압축을 위한 향상된 예측 기법)

  • Son, Ji Deok;Song, Byung Cheol
    • Journal of Broadcast Engineering / v.19 no.2 / pp.227-239 / 2014
  • This paper proposes an enhanced prediction method for a conventional near-lossless coder to effectively lower the external memory bandwidth of an image processing SoC. First, we use an already reconstructed green component as the basis of the predictor for the other color components, because the RGB color components are usually highly correlated. Next, we improve prediction performance by applying variable block size prediction. Lastly, we use minimal internal memory and improve temporal prediction performance by using a template dictionary sampled from the previous frame. Experimental results show that the proposed algorithm outperforms previous work: coding efficiency improves by approximately 30% for natural images and by 60% on average for CG images.

Development of a GB-SAR (II) : Focusing Algorithms (GB-SAR의 개발 (II) : 영상화 기법)

  • Lee, Hoon-Yol;Sung, Nak-Hoon;Kim, Jung-Ho;Cho, Seong-Jun
    • Korean Journal of Remote Sensing / v.23 no.4 / pp.247-256 / 2007
  • In this paper, we introduce GB-SAR focusing algorithms for image formation and suggest an optimized solution. We compare the characteristics, advantages, and limitations of the Deramp-FFT (DF) algorithm and the Range-Doppler (RD) algorithm in terms of their image formation principles, memory usage, and processing time. We found that the DF algorithm is efficient in memory and processing time but cannot focus the near range. The RD algorithm can focus the entire range but, considering the refinement on the rail length, it has much redundancy in memory and processing time. In conclusion, we optimized GB-SAR focusing by using the DF algorithm for the far-range case and the RD algorithm for the near-range case.

An Exploratory Structural Analysis of the Accident Causing Factors in Railway Traffic Controllers (철도관제사의 사고유발 요인에 관한 탐색적 구조분석)

  • Kim, Kyung-Nam;Shin, Tack-Hyun
    • Journal of the Korea Society for Simulation / v.27 no.1 / pp.119-126 / 2018
  • This study exploratorily tested the factors that cause human error in railway traffic controllers, using an AMOS structural equation model. Based on a literature survey, fatigue and stress were set as exogenous variables; errors in information processing, namely cognitive, memory, storage, and execution errors, as endogenous variables; and accidents and incidents (near-misses) as dependent variables. Results from AMOS on questionnaires from 201 railway traffic controllers showed a clear causal path: 'stress $\rightarrow$ memory error $\rightarrow$ storage error $\rightarrow$ incident (near-miss) $\rightarrow$ accident'. This result suggests that, to reduce traffic controllers' accidents, it is necessary to reduce memory and execution errors in information processing, with effective stress management as a precondition.

An Efficient Algorithm for a Block Angular Linear Program with the Same Blocks (부분문제가 같은 블록대각형 선형계획문제의 효율적인 방볍)

  • 양병학;박순달
    • Journal of the Korean Operations Research and Management Science Society / v.12 no.2 / pp.42-50 / 1987
  • The objective of this paper is to develop an efficient method with a small memory requirement for a feed-mixing problem on a microcomputer. First, the method uses the decomposition principle to reduce the memory requirement. Next, the decomposition principle is modified to fit the problem. Further, four different variations for solving the subproblems are designed to improve the efficiency of the principle. In tests of processing time, the best variation uses the dual simplex method, uses the optimal basis of the previous subproblem as the initial basis, and has an (M+1)-dimensional master problem. In general, convergence of the solution becomes slower near the optimal value. This paper introduces a termination criterion for a sufficiently good solution. According to the tests, a 5% tolerance is acceptable with respect to the trade-off between processing time and optimal value.

Weighted Competitive Update Protocol for DSM Systems (DSM 시스템에서 통신 부하의 가중치를 고려한 경쟁적인 갱신 프로토콜)

  • Im, Seong-Hwa;Baek, Sang-Hyeon;Kim, Jae-Hun;Kim, Seong-Su
    • The Transactions of the Korea Information Processing Society / v.6 no.8 / pp.2245-2252 / 1999
  • Since distributed shared memory (DSM) provides the user with a simple shared memory abstraction, the user does not have to be concerned with data movement between hosts. Each node in a DSM system has a processor, memory, and a connection to a network. Memory is divided into pages, and a page can have multiple copies on different nodes. To maintain data consistency between nodes, two conventional protocols are used: the write-update protocol and the invalidate protocol. The performance of these protocols depends on the system parameters and the memory access patterns. To adapt to memory access patterns, the competitive update protocol updates those copies of a page that are expected to be used in the near future, while selectively invalidating other copies. We present weighted competitive update protocols that consider the different communication bandwidth of each connection between two nodes. Simulation results show that the weighted competitive update protocol improves performance.

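
The competitive update idea in the last abstract can be sketched roughly as below. The abstract does not state the paper's weighting rule, so this sketch assumes one: each remote update adds the cost of the link it arrived over to a running total, and the copy is invalidated once that total reaches the assumed cost of re-fetching the page. The names `PageCopy`, `refetch_cost`, and `link_cost` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PageCopy:
    """One node's cached copy of a shared page (illustrative model only)."""

    refetch_cost: float        # assumed cost of fetching the page again
    pending_cost: float = 0.0  # update cost accumulated since the last local access
    valid: bool = True

    def local_access(self) -> bool:
        """Local read or write. Returns True on a hit, False on a miss.

        A miss models a re-fetch, so the copy is valid afterwards either way.
        """
        hit = self.valid
        self.valid = True
        self.pending_cost = 0.0
        return hit

    def remote_update(self, link_cost: float) -> bool:
        """Handle an update sent by a remote writer over a link of given cost.

        Returns True if the copy stays valid and is updated, False if it is
        invalidated by this update or was already invalid.
        """
        if not self.valid:
            return False
        self.pending_cost += link_cost
        if self.pending_cost >= self.refetch_cost:
            # The copy has not been used locally for long enough that further
            # updates are assumed to cost more than a later re-fetch would.
            self.valid = False
            return False
        return True


if __name__ == "__main__":
    copy = PageCopy(refetch_cost=3.0)
    print([copy.remote_update(1.0) for _ in range(3)])  # cheap link
    print(copy.local_access())                          # miss triggers re-fetch
```

With uniform link costs this reduces to the unweighted protocol, where a copy is dropped after a fixed number of unused updates; under the assumed rule, copies reached over costlier links are dropped after fewer updates.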