Search | Korea Science

Counter-Based Approaches for Efficient WCET Analysis of Multicore Processors with Shared Caches

Ding, Yiqiang;Zhang, Wei
- Journal of Computing Science and Engineering, v.7 no.4, pp.285-299, 2013
To enable hard real-time systems to take advantage of multicore processors, it is crucial to obtain the worst-case execution time (WCET) of programs running on multicore processors. However, this is challenging and complicated because of the inter-thread interference caused by the shared resources in a multicore processor. Recent research used the combined cache conflict graph (CCCG) to model and compute the worst-case inter-thread interference on a shared L2 cache in a multicore processor, which is called the CCCG-based approach in this paper. Although it can compute the WCET safely and accurately, its computational complexity is exponential and prohibitive for a large number of cores. In this paper, we propose three counter-based approaches to significantly reduce the complexity of multicore WCET analysis, while achieving absolute safety with tightness close to that of the CCCG-based approach. The basic counter-based approach simply counts the worst-case number of cache line blocks mapped to a cache set of a shared L2 cache from all the concurrent threads, and compares it with the associativity of the cache set to compute the worst-case cache behavior. The enhanced counter-based approach uses techniques to enhance the accuracy of calculating the counters. The hybrid counter-based approach combines the enhanced counter-based approach and the CCCG-based approach to further improve the tightness of analysis without significantly increasing the complexity. Our experiments on a 4-core processor indicate that the enhanced counter-based approach overestimates the WCET by 14% on average compared to the CCCG-based approach, while its average running time is less than 1/380 that of the CCCG-based approach. The hybrid approach reduces the overestimation to only 2.65%, while its running time is less than 1/150 that of the CCCG-based approach on average.
https://doi.org/10.5626/JCSE.2013.7.4.285
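
The basic counter-based idea described in the abstract above (count the cache lines that all concurrent threads map to each shared L2 set, then compare the count with the associativity) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the modulo set-index mapping, the labels, and the assumption that threads never share a cache line are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def count_lines_per_set(thread_addresses, line_size, num_sets):
    """Count distinct cache lines mapped to each set across all threads.

    Assumption: threads do not share lines, so a line is keyed by
    (thread_id, line_number). The set index is line_number % num_sets.
    """
    lines_per_set = defaultdict(set)
    for thread_id, addresses in thread_addresses.items():
        for address in addresses:
            line = address // line_size
            lines_per_set[line % num_sets].add((thread_id, line))
    return {s: len(lines) for s, lines in lines_per_set.items()}


def classify_sets(counts, associativity):
    """Compare each per-set counter with the associativity.

    If a set never holds more lines than it has ways, no line in it can be
    evicted by another thread; otherwise interference must be assumed.
    """
    return {
        s: "no-conflict" if n <= associativity else "possible-conflict"
        for s, n in counts.items()
    }


# Hypothetical example: 64-byte lines, 4 sets, 2-way associative shared cache.
example_accesses = {0: [0, 64, 256], 1: [512, 128]}
example_counts = count_lines_per_set(example_accesses, line_size=64, num_sets=4)
example_classes = classify_sets(example_counts, associativity=2)
```

In this example, set 0 receives three lines (two from thread 0 and one from thread 1), which exceeds the two ways, so it is the only set flagged as a possible conflict.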

Comparing Separate and Statically-Partitioned Caches for Time-Predictable Multicore Processors

Wu, Lan;Ding, Yiqiang;Zhang, Wei
- Journal of Computing Science and Engineering, v.8 no.1, pp.25-33, 2014
In this paper, we quantitatively compare two time-predictable multicore cache architectures, separate caches and statically-partitioned caches, through extensive simulation. Current research focuses primarily on partitioned-cache architectures to achieve time predictability for hard real-time multicore systems. However, our experiments reveal that separate caches lead to much better performance and energy efficiency than statically-partitioned caches, while both are adequate for timing analysis of real-time multicore applications.
https://doi.org/10.5626/JCSE.2014.8.1.25

Improving Reliability and Security in IEEE 802.15.4 Wireless Sensor Networks (IEEE 802.15.4 센서 네트워크에서의 신뢰성 및 보안성 향상 기법)

Shon, Tae-Shik;Park, Yong-Suk
- The KIPS Transactions:PartC, v.16C no.3, pp.407-416, 2009
Recently, a growing range of application services has been considered for wireless sensor networks, and reliable and secure communication in sensor networks has therefore become an essential issue. This paper studies such communication in IEEE 802.15.4-based sensor networks. We present IMHRS (IEEE 802.15.4 MAC-based Hybrid hop-by-hop Reliability Scheme), which employs EHHR (Enhanced Hop-by-Hop Reliability), using Hop-cache and Hop-ack, and ALC (Adaptive Link Control), which considers link status and packet type. In addition, energy efficiency is addressed through the HAS (Hybrid Adaptive Security) Framework, which selects a security suite depending on the network and application type. The presented schemes are evaluated through simulations and experiments, and a prototype system is developed and tested to show their potential efficiency.
https://doi.org/10.3745/KIPSTC.2009.16-C.3.407

Analysis on the Performance and Temperature of the 3D Quad-core Processor according to Cache Organization (캐쉬 구성에 따른 3차원 쿼드코어 프로세서의 성능 및 온도 분석)

Son, Dong-Oh;Ahn, Jin-Woo;Choi, Hong-Jun;Kim, Jong-Myon;Kim, Cheol-Hong
- Journal of the Korea Society of Computer and Information, v.17 no.6, pp.1-11, 2012
As process technology scales down, multi-core processors face serious problems such as increased interconnection delay, high power consumption, and thermal problems. To solve these problems in 2D multi-core processors, researchers have focused on the 3D multi-core processor architecture. Compared to the 2D multi-core processor, the 3D multi-core processor decreases interconnection delay by reducing wire length significantly, since cores on different layers are connected using vertical through-silicon vias (TSVs). However, the power density in the 3D multi-core processor is increased dramatically compared to that in the 2D multi-core processor, because multiple cores are stacked vertically. Increased power density causes thermal problems, resulting in high cooling costs and a negative impact on reliability. Therefore, temperature should be considered together with performance in designing 3D multi-core processors. In this work, we analyze the temperature of the cache in quad-core processors while varying the cache organization. Then, we propose a low-temperature cache organization to overcome the thermal problems. Our evaluation shows that the peak temperature of the instruction cache is lower than the threshold, whereas the peak temperature of the data cache is higher than the threshold when the cache is composed of many ways. According to the results, our proposed cache organization not only efficiently reduces the peak temperature but also reduces the performance degradation for 3D quad-core processors.
https://doi.org/10.9708/jksci.2012.17.6.001

Fuzzy Relevance-based Transcoding for Differentiated Streaming Media Service in the Proxy System (프록시 시스템에서 차별화된 스트리밍 미디어 서비스를 위한 퍼지 적합도 기반 트랜스 코딩)

Lee, Chong-Deuk
- Journal of the Korea Academia-Industrial cooperation Society, v.12 no.6, pp.2785-2792, 2011
Problems such as delay, congestion, and crosstalk in the proxy system degrade not only QoS (Quality of Service) but also the responsiveness and reliability of the streaming media service. To solve these problems, this paper proposes an FRTP (Fuzzy Relevance-based Transcoding Proxy) mechanism. The proposed FRTP mechanism analyzes fuzzy similarity for partitioned segment versions of media objects to create an FRTG (Fuzzy Relevance-based Transcoding Graph). The FRTG determines the transcoding for partitioned media object segment versions, and the resulting transcoding improves DSR (Delay Saving Ratio), CHPR (Cache Hit Precision Ratio), and CHRR (Cache Hit Recall Ratio). The proposed mechanism is simulated to evaluate these performance parameters. Simulation results show that the proposed mechanism outperforms existing mechanisms in DSR, CHPR, and CHRR.
https://doi.org/10.5762/KAIS.2011.12.6.2785

Design of a User Location Prediction Algorithm Using the Cache Scheme (캐시 기법을 이용한 위치 예측 알고리즘 설계)

Son, Byoung-Hee;Kim, Sang-Hee;Nahm, Eui-Seok;Kim, Hag-Bae
- The Journal of Korean Institute of Communications and Information Sciences, v.32 no.6B, pp.375-381, 2007
This paper focuses on prediction algorithms among context-awareness technologies. With a representative algorithm, Bayesian networks, it is difficult both to realize context-aware services and to reduce processing time in a real-time environment. Moreover, it is hard to be sure about the accuracy and reliability of the prediction. One of the simplest algorithms is the sequential matching algorithm, which we extend with the proposed cache scheme. The resulting algorithm suits context-aware services that adapt to the user's habits, and it reduces processing time by 48.7% on average. Thus, we propose a design method for a user location prediction algorithm that uses sequential matching with the cache scheme, taking the user's habits or behavior into consideration. This approach differs from conventional prediction algorithms.

DJFS: Providing Highly Reliable and High-Performance File System with Small-Sized NVRAM

Kim, Junghoon;Lee, Minho;Song, Yongju;Eom, Young Ik
- ETRI Journal, v.39 no.6, pp.820-831, 2017
File systems and applications implement their own update protocols to guarantee data consistency, which is one of the most crucial aspects of computing systems. However, we found that storage devices are substantially under-utilized when preserving data consistency because these protocols generate massive storage write traffic with many disk cache flush operations and force-unit-access (FUA) commands. In this paper, we present DJFS (Delta-Journaling File System), which provides both a high level of performance and data consistency for different applications. We made three technical contributions to achieve our goal. First, to remove all storage accesses with disk cache flush operations and FUA commands, DJFS uses small-sized NVRAM for a file system journal. Second, to reduce the access latency and space requirements of NVRAM, DJFS attempts to journal compressed differences of the modified blocks. Finally, to relieve explicit checkpointing overhead, DJFS aggressively reflects the checkpoint transactions to the file system area in the unit of the specified region. Our evaluation on the TPC-C SQLite benchmark shows that, using our novel optimization schemes, DJFS outperforms Ext4 by up to 64.2 times with only 128 MB of NVRAM.
https://doi.org/10.4218/etrij.17.0117.0558
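
The general idea of journaling a compressed difference instead of a whole modified block, mentioned in the abstract above, can be sketched as follows. The abstract does not say how DJFS computes or compresses differences, so the XOR delta, the use of `zlib`, and the function names here are assumptions chosen only to show why a small change to a block yields a small journal record.

```python
import zlib


def make_delta(old_block: bytes, new_block: bytes) -> bytes:
    """Return a compressed XOR difference between two equal-sized blocks."""
    if len(old_block) != len(new_block):
        raise ValueError("blocks must have the same size")
    # Unchanged bytes XOR to zero, so the difference compresses well.
    xor = bytes(a ^ b for a, b in zip(old_block, new_block))
    return zlib.compress(xor)


def apply_delta(old_block: bytes, delta: bytes) -> bytes:
    """Rebuild the new block from the old block and a journaled delta."""
    xor = zlib.decompress(delta)
    return bytes(a ^ b for a, b in zip(old_block, xor))


# Hypothetical example: a 4 KB block in which only four bytes change.
old = bytes(4096)
new = bytearray(old)
new[10:14] = b"data"
new = bytes(new)
delta = make_delta(old, new)
restored = apply_delta(old, delta)
```

Because most of the XOR result is zero, the journaled delta is far smaller than the 4 KB block, which is the space saving a delta journal relies on.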

Static Timing Analysis of Shared Caches for Multicore Processors

Zhang, Wei;Yan, Jun
- Journal of Computing Science and Engineering, v.6 no.4, pp.267-278, 2012
State-of-the-art techniques in multicore timing analysis are limited to analyzing multicores with shared instruction caches only. This paper proposes a uniform framework to analyze the worst-case performance for both shared instruction caches and data caches in a multicore platform. Our approach is based on a new concept called the address flow graph, which can be used to model both instruction and data accesses for timing analysis. Our experiments, as a proof-of-concept study, indicate that the proposed approach can accurately compute the worst-case performance for real-time threads running on a dual-core processor with a shared L2 cache (storing either instructions or data).
https://doi.org/10.5626/JCSE.2012.6.4.267

Adaptive Inter-Agent Communication Protocol for Large-Scale Mobile Agent Systems (대규모 이동 에이전트 시스템을 위한 적응적 에이전트간 통신 프로토콜)

Ahn, Jin-Ho
- The KIPS Transactions:PartA, v.13A no.4 s.101, pp.351-362, 2006
This paper proposes an adaptive inter-agent communication protocol that considerably reduces both the amount of agent location information maintained by each service node and the message delivery time, while avoiding dependency on the home node of a mobile agent. To achieve this goal, the protocol enables each mobile agent to autonomously leave its location information on only a few of the nodes it visits. It may also significantly reduce the agent cache updating frequency of each service node by keeping the identifier of the location manager of each agent in the smart agent location cache of the node. Simulation results show that, compared with the traditional protocol, the proposed protocol reduces message delivery overhead by about 76% to 80% and the amount of agent location information each service node must maintain by about 76% to 79%.
https://doi.org/10.3745/KIPSTA.2006.13A.4.351

Data Deduplication Method using PRAM Cache in SSD Storage System (SSD 스토리지 시스템에서 PRAM 캐시를 이용한 데이터 중복제거 기법)

Kim, Ju-Kyeong;Lee, Seung-Kyu;Kim, Deok-Hwan
- Journal of the Institute of Electronics and Information Engineers, v.50 no.4, pp.117-123, 2013
In recent cloud storage environments, SSDs (Solid-State Drives) are increasingly replacing traditional hard disk drives. Managing SSD space efficiently has become important: an SSD provides fast I/O performance because it has no mechanical movement, but it wears out and does not support in-place updates. To manage SSD space efficiently, data deduplication is frequently used. However, this technique incurs considerable overhead because it consists of data chunking, hashing, and hash matching operations. In this paper, we propose a new data deduplication method using a PRAM cache. The proposed method uses hierarchical hash tables and LRU (Least Recently Used) for data replacement in PRAM. The first hash table, in DRAM, stores hash values of data cached in the PRAM, and the second hash table, in PRAM, stores hash values of data in SSD storage. The method also enhances data reliability against power failure by maintaining a backup of the first hash table in PRAM. Experimental results show that the average writing frequency and operation time of the proposed method are 44.2% and 38.8% less than those of the existing data deduplication method, respectively, when three workloads are used.
https://doi.org/10.5573/ieek.2013.50.4.117
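
The two-level lookup described in the abstract above (a first hash table for chunks cached in PRAM, a second for chunks already on the SSD, with LRU replacement in the cache) can be sketched roughly as below. This is an illustrative model only, not the paper's design: the class name `DedupCache`, the SHA-1 digest, the return labels, and the policy of writing an evicted chunk to the SSD are all assumptions, and plain Python dictionaries stand in for the DRAM, PRAM, and SSD structures.

```python
import hashlib
from collections import OrderedDict


class DedupCache:
    """Toy model of deduplication with a small LRU cache in front of an SSD."""

    def __init__(self, cache_capacity: int):
        self.cache_capacity = cache_capacity
        # First table: digests of chunks held in the cache, kept in LRU order.
        self.cache_table = OrderedDict()
        # Second table: digests of chunks already stored on the SSD.
        self.ssd_table = {}
        self.ssd_writes = 0

    def write(self, chunk: bytes) -> str:
        digest = hashlib.sha1(chunk).hexdigest()
        if digest in self.cache_table:
            # Duplicate of a cached chunk: refresh its LRU position only.
            self.cache_table.move_to_end(digest)
            return "dup-cache"
        if digest in self.ssd_table:
            # Duplicate of a chunk on the SSD: no new write is needed.
            return "dup-ssd"
        self.cache_table[digest] = chunk
        if len(self.cache_table) > self.cache_capacity:
            # Evict the least recently used chunk and write it to the SSD.
            old_digest, old_chunk = self.cache_table.popitem(last=False)
            self.ssd_table[old_digest] = old_chunk
            self.ssd_writes += 1
        return "new"


# Hypothetical workload with a two-entry cache.
cache = DedupCache(cache_capacity=2)
results = [cache.write(c) for c in (b"a", b"b", b"a", b"c", b"b")]
```

In this workload, the repeated `b"a"` is caught by the first table, inserting `b"c"` evicts `b"b"` to the SSD, and the final `b"b"` is then caught by the second table, so only one SSD write occurs for five write requests.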
