Search | Korea Science

An Optimized Cache Coherence Protocol in Multiprocessor System Connected by Slotted Ring (슬롯링으로 연결된 다중처리기 시스템에서 최적화된 캐쉬일관성 프로토콜)

Min, Jun-Sik; Chang, Tae-Mu
The Transactions of the Korea Information Processing Society, v.7 no.12, pp.3964-3975, 2000
There are two policies for maintaining consistency among the multiple processor caches in a multiprocessor system: write invalidate and write update. In the write invalidate policy, whenever a processor attempts to write to a cached block, it must invalidate all other copies of that block in the system. These frequent invalidations result in a high cache miss ratio. The write update policy, on the other hand, updates the other copies instead of invalidating them. This policy has to transfer the updated contents through the interconnection network whether the updated block is private or not, so the network suffers from heavy transaction traffic. In this paper we present an efficient cache coherence protocol for shared-memory multiprocessor systems connected by a slotted ring. The protocol is based on the write update policy, but the updated contents are transferred only when a shared block is updated; if the updated block is private, nothing is transferred. We analyze the proposed protocol and run simulations to compare it with the previous version.
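
The selective update idea in this abstract can be illustrated with a toy model. The sketch below is not the authors' protocol: the `Ring` and `Cache` classes, the way sharing is detected (by asking the other caches directly), and the message counter are all assumptions made for illustration, and slot timing on the ring is ignored entirely.

```python
class Ring:
    """Hypothetical stand-in for the slotted ring; it only counts update messages."""

    def __init__(self):
        self.caches = []
        self.update_messages = 0


class Cache:
    """Toy per-processor cache mapping block address -> value."""

    def __init__(self, ring):
        self.lines = {}
        self.ring = ring
        ring.caches.append(self)

    def _other_holders(self, addr):
        # Assumption: sharing is detected by inspecting the other caches.
        # A real protocol would track this with per-block state bits.
        return [c for c in self.ring.caches if c is not self and addr in c.lines]

    def read(self, addr, memory):
        if addr not in self.lines:
            holders = self._other_holders(addr)
            # Prefer a cached copy, which may be newer than main memory.
            self.lines[addr] = holders[0].lines[addr] if holders else memory[addr]
        return self.lines[addr]

    def write(self, addr, value):
        self.lines[addr] = value
        holders = self._other_holders(addr)
        if holders:
            # Shared block: send one update over the ring and refresh every copy.
            self.ring.update_messages += 1
            for cache in holders:
                cache.lines[addr] = value
        # Private block: no ring traffic at all.
```

In this model a write to a block held by a single cache leaves `update_messages` unchanged, while a write to a block held by two caches increments it once and refreshes the second copy.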

DNS-based Dynamic Load Balancing Method on a Distributed Web-server System (분산 웹 서버 시스템에서의 DNS 기반 동적 부하분산 기법)

Moon, Jong-Bae; Kim, Myung-Ho
Journal of KIISE:Computer Systems and Theory, v.33 no.3, pp.193-204, 2006
In most existing distributed Web systems, incoming requests are distributed to servers via the Domain Name System (DNS). Although such systems are simple to implement, the address caching mechanism easily causes load imbalance among servers. Moreover, the DNS itself must be modified before it can balance load according to each server's state. In this paper, we propose a new dynamic load balancing method that uses dynamic DNS updates and a round-robin mechanism. The proposed method performs effective load balancing without modifying the DNS. In this method, a server can be dynamically added to or removed from the DNS list according to its load. Removing an overloaded server from the DNS list shortens the response time. For dynamic scheduling, we propose a scheduling algorithm that considers CPU, memory, and network usage, and a scheduling policy can be selected based on resource usage. The proposed system can be managed easily with a GUI-based management tool. Experiments show that the modules implemented in this paper have a low impact on the proposed system. Furthermore, experiments show that the proposed system achieves a shorter response time and a higher file transfer rate than a pure Round-Robin DNS.
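
A minimal sketch of the add/remove-by-load idea follows. It is an illustration under stated assumptions, not the paper's implementation: the `DnsList` class, the weighted `load_score` formula, its weights, and the 0.8 threshold are all hypothetical, and a real deployment would issue dynamic DNS updates rather than edit an in-memory list.

```python
def load_score(cpu, mem, net, weights=(0.5, 0.3, 0.2)):
    """Combine CPU, memory, and network usage (each in 0..1) into one score.

    The weights are an assumption; the abstract does not give a formula.
    """
    return weights[0] * cpu + weights[1] * mem + weights[2] * net


class DnsList:
    """Toy model of the server list that a round-robin DNS answers from."""

    def __init__(self, threshold=0.8):
        self.threshold = threshold  # hypothetical overload threshold
        self.active = []            # servers currently listed in DNS
        self._next = 0              # round-robin cursor

    def report(self, server, cpu, mem, net):
        """Add or remove a server according to its most recent load report."""
        overloaded = load_score(cpu, mem, net) > self.threshold
        if overloaded and server in self.active:
            self.active.remove(server)
        elif not overloaded and server not in self.active:
            self.active.append(server)

    def resolve(self):
        """Return the next listed server in round-robin order."""
        if not self.active:
            raise LookupError("no server is currently listed")
        server = self.active[self._next % len(self.active)]
        self._next += 1
        return server
```

With two lightly loaded servers, `resolve()` alternates between them; once one reports a score above the threshold it drops out of `active` and all subsequent lookups go to the other.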

AS B-tree: A study on the enhancement of the insertion performance of B-tree on SSD (AS B-트리: SSD를 사용한 B-트리에서 삽입 성능 향상에 관한 연구)

Kim, Sung-Ho; Roh, Hong-Chan; Lee, Dae-Wook; Park, Sang-Hyun
The KIPS Transactions:PartD, v.18D no.3, pp.157-168, 2011
Recently, flash memory has been used as the main storage device in mobile devices, and flash SSDs are gaining popularity as a major storage device in laptop and desktop computers, and even in enterprise-level servers. Unlike on HDDs, an overwrite on flash memory cannot be performed unless it is preceded by an erase of the same block. To address this, an FTL (Flash memory Translation Layer) is employed on flash memory. Even when a modified data block is overwritten at the same logical address, the FTL writes the updated block to a physical address different from the previous one and maps the logical address to the new physical address. This lets flash memory avoid the high block-erase cost. A flash SSD has an array of NAND flash memory packages, so it can access several packages in parallel. To take advantage of this internal parallelism, it is beneficial for DBMSs to request I/O operations on sequential logical addresses. However, the B-tree, which is the representative index structure of current relational DBMSs, produces excessive I/O operations in random order when its nodes are updated. Therefore, the original B-tree is not well suited to SSDs. In this paper, we propose the AS (Always Sequential) B-tree, which, for every update operation, writes the updated node at the logical address immediately following the previously written node. In the experiments, the AS B-tree improved B-tree insertion performance by 21%.
https://doi.org/10.3745/KIPSTD.2011.18D.3.157
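
The "always sequential" write pattern can be sketched as an append-only node store. This is an illustration only: the `SequentialNodeStore` class and its `latest` mapping table are assumptions, since the abstract does not say how the AS B-tree locates the newest copy of a node.

```python
class SequentialNodeStore:
    """Toy node store in which every node write goes to the next logical address."""

    def __init__(self):
        self.log = []     # list index doubles as the logical address
        self.latest = {}  # node_id -> address of the newest version (assumed table)

    def write(self, node_id, payload):
        """Append the node right after the previously written one."""
        addr = len(self.log)
        self.log.append((node_id, payload))
        self.latest[node_id] = addr
        return addr

    def read(self, node_id):
        """Return the most recently written version of a node."""
        return self.log[self.latest[node_id]][1]
```

Updating the same node repeatedly yields addresses 0, 1, 2, and so on, so the device sees a sequential write stream regardless of which tree node changed; older versions stay behind in the log as garbage that a real system would have to reclaim.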

A Temperature- and Supply-Insensitive 1Gb/s CMOS Open-Drain Output Driver for High-Bandwidth DRAMs (High-Bandwidth DRAM용 온도 및 전원 전압에 둔감한 1Gb/s CMOS Open-Drain 출력 구동 회로)

Kim, Young-Hee; Sohn, Young-Soo; Park, Hong-Jung; Wee, Jae-Kyung; Choi, Jin-Hyeok
Journal of the Institute of Electronics Engineers of Korea SD, v.38 no.8, pp.54-61, 2001
A fully on-chip open-drain CMOS output driver was designed for high-bandwidth DRAMs such that its output voltage swing is insensitive to variations in temperature and supply voltage. An auto refresh signal was used to update the contents of the current control register, which determines which of the six binary-weighted transistors of the output driver are turned on. Because the auto refresh signal is available in DRAM chips, the output driver of this work does not require any external signal to update the current control register. While the update is in progress, a negative feedback loop is formed to hold the low-level output voltage ($V_{OL}$) equal to the reference voltage ($V_{OL.ref}$), which is generated by a low-voltage bandgap reference circuit. Test results showed successful operation at data rates up to 1Gb/s. The worst-case variations of $V_{OL.ref}$ and $V_{OL}$ of the proposed output driver were measured to be 2.5% and 7.5%, respectively, over a temperature range of $20^{\circ}C$ to $90^{\circ}C$ and a supply voltage range of 2.25V to 2.75V, while the worst-case variation of $V_{OL}$ of the conventional output driver was measured to be 24% over the same temperature and supply voltage ranges.

Design of Fast Operation Method In NAND Flash Memory File System (NAND 플래시 메모리 파일 시스템에 빠른 연산을 위한 설계)

Jin, Jong-Won; Lee, Tae-Hoon; Chung, Ki-Dong
Journal of KIISE:Computing Practices and Letters, v.14 no.1, pp.91-95, 2008
Flash memory is widely used in embedded systems because of benefits such as non-volatility, shock resistance, and low power consumption. However, NAND flash memory suffers from out-of-place updates, limited erase cycles, and page-based read/write operations. To address these problems, log-structured file systems such as YAFFS were proposed. However, YAFFS sequentially scans an array of all block information to allocate a free block for a write operation. In addition, before the write operation, YAFFS reads the array of block information to find an invalid block to erase. These scans can reduce the performance of the file system. This paper suggests a fast operation method for NAND flash file systems that solves the above-mentioned problems. We implemented the proposed methods in YAFFS and measured their performance against the original technique.

An Extended R-Tree Indexing Method using Prefetching in Main Memory (메인 메모리에서 선반입을 사용한 확장된 R-Tree 색인 기법)

Kang, Hong-Koo; Kim, Dong-O; Hong, Dong-Sook; Han, Ki-Joon
Journal of Korea Spatial Information System Society, v.6 no.1 s.11, pp.19-29, 2004
Recently, studies have been performed to improve the cache performance of the R-Tree in main memory. A general method of improving the cache performance of the R-Tree is to reduce the size of an entry so that a node can store more entries and its fanout increases. However, this method generally requires an additional process to reduce the information in entries and does not support incremental updates. In addition, a cache miss always occurs when moving between a parent node and a child node. To solve these problems efficiently, this paper proposes and evaluates the PR-Tree, an extended R-Tree indexing method that uses prefetching in main memory. The PR-Tree can produce a wider node to optimize prefetching without additional modifications to the R-Tree. Moreover, the PR-Tree reduces the cache misses that occur when moving between a parent node and a child node. In our simulation, the search performance, the update performance, and the node split performance of the PR-Tree improve by up to 38%, 30%, and 67%, respectively, compared with the original R-Tree.

Application-aware Design Parameter Exploration of NAND Flash Memory

Bang, Kwanhu; Kim, Dong-Gun; Park, Sang-Hoon; Chung, Eui-Young; Lee, Hyuk-Jun
JSTS:Journal of Semiconductor Technology and Science, v.13 no.4, pp.291-302, 2013
NAND flash memory (NFM) based storage devices, e.g. Solid State Drives (SSDs), are rapidly replacing conventional storage devices, e.g. Hard Disk Drives (HDDs). As NAND flash memory technology advances, its specification has evolved to support denser cells and larger pages and blocks. However, efforts to fully understand the impact of these changes on design objectives such as performance, power, and cost for various applications are often neglected. Our research shows that this recent trend can adversely affect the design objectives depending on the characteristics of applications. Past works mostly focused on improving specific design objectives of NFM-based systems via various architectural solutions for a given NFM specification. Several other works attempted to model and characterize NFM but did not assess the system-level impacts of individual parameters. To the best of our knowledge, this paper is the first work that treats the specification of NFM as the design parameters of NAND flash storage devices (NFSDs) and analyzes the characteristics of various synthesized and real traces and their interaction with the design parameters. Our research shows that optimizing design parameters depends heavily on the characteristics of applications. The main contribution of this research is to understand the effects of low-level specifications of NFM, e.g. cell type, page size, and block size, on system-level metrics such as performance, cost, and power consumption in various applications with different characteristics, e.g. request length, update ratios, and read-and-modify ratios. Experimental results show that the optimized page and block sizes can achieve up to 15 times better performance than the conventional NFM configuration in various applications. The results can be used to optimize the system-level objectives of a system with specific applications, e.g. embedded systems with NFM chips, or to predict the future direction of NFM.
https://doi.org/10.5573/JSTS.2013.13.4.291

A Global IPv6 Unicast Address Lookup Scheme Using Variable Multiple Hashing (가변적인 복수 해슁을 이용한 글로벌 IPv6 유니캐스트 주소 검색 구조)

Park Hyun-Tae; Moon Byung-In; Kang Sung-Ho
The Journal of Korean Institute of Communications and Information Sciences, v.31 no.5B, pp.378-389, 2006
IP address lookup has become an increasingly critical issue for high-speed networking because of the advent of IPv6 with its 128-bit addresses. In this paper, a novel global IPv6 unicast address lookup scheme is proposed for next-generation Internet routers. The proposed scheme performs variable multiple hashing based on prefix grouping. Accordingly, it can not only minimize overflows with a proper number of memory modules but also reduce the memory size required to organize forwarding tables. It builds forwarding tables quickly and searches them with only a single memory access. In addition, forwarding tables are easy to update incrementally. In a simulation using CERNET routing data from the 6bone test phase, we compared the proposed scheme with a similar scheme that uses uniform multiple hashing. As a result, we verified that the number of overflows is reduced by 50% and the memory size of the forwarding tables is reduced by 15% with 8 tables.

A Compressed Hot-Cold Clustering to Improve Index Operation Performance of Flash Memory-SSD Systems (플래시메모리-SSD의 인덱스 연산 성능 향상을 위한 압축된 핫-콜드 클러스터링 기법)

Byun, Si-Woo
Journal of the Korea Academia-Industrial cooperation Society, v.11 no.1, pp.166-174, 2010
SSDs are among the best storage media for portable and desktop computers. Their features include non-volatility, low power consumption, and fast access time for read operations, which are sufficient to make flash memory a major database storage component for desktop and server computers. However, traditional B-Tree-based index management schemes need to be improved because flash memory operations are relatively slow compared with RAM. To achieve this goal, we propose a new index management scheme based on compressed hot-cold clustering, called CHC-Tree. CHC-Tree-based index management improves index operation performance by dividing index nodes into hot or cold segments, compressing the pointers and keys in the index nodes, and clustering the hot or cold segments. The offset compression technique, which uses the unused free area in cold index nodes, reduces the number of slow erase operations in index node insert/delete processes. Simulation results show that our scheme significantly reduces write and erase operation overheads, improving the index search performance of B-Tree by up to 26 percent and the index update performance by up to 23 percent.
https://doi.org/10.5762/KAIS.2010.11.1.166

Enhancing LRU Buffer Replacement Policy with Delayed Write of Not-cold-dirty-pages for Flash Memory (플래시 메모리를 위한 Not-cold-Page 쓰기지연을 통한 LRU 버퍼교체 정책 개선)

Jung Ho-Young; Park Sung-Min; Cha Jae-Hyuk; Kang Soo-Yong
Journal of KIISE:Computer Systems and Theory, v.33 no.9, pp.634-641, 2006
Flash memory has many advantages, such as non-volatility and fast I/O speed, but it also has disadvantages, such as out-of-place updates and asymmetric read/write/erase speeds. For the performance of flash memory storage, it is essential that the buffer replacement algorithm reduce the number of write operations, which in turn affects the number of erase operations. This paper proposes a new buffer replacement algorithm that delays the writes of not-cold dirty pages in the buffer cache of flash storage. We show that this algorithm effectively decreases the number of write and erase operations without much degradation of the hit ratio. As a result, the overall performance of flash I/O is improved.
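
One way to picture the delayed-write idea is an LRU buffer whose eviction step skips dirty pages that are not cold. The sketch below is an assumption-heavy illustration, not the paper's algorithm: the `DelayedWriteLRU` class is hypothetical, and "not cold" is assumed here to mean "referenced more than once while buffered", a definition the abstract does not give.

```python
from collections import OrderedDict


class DelayedWriteLRU:
    """Toy LRU buffer that postpones flushing not-cold dirty pages."""

    def __init__(self, capacity):
        self.capacity = capacity
        # page_id -> {"dirty": bool, "hits": int}; least recently used first.
        self.pages = OrderedDict()
        self.flash_writes = 0

    def access(self, page_id, write=False):
        if page_id in self.pages:
            meta = self.pages[page_id]
            meta["hits"] += 1
            meta["dirty"] = meta["dirty"] or write
            self.pages.move_to_end(page_id)
            return
        if len(self.pages) >= self.capacity:
            self._evict()
        self.pages[page_id] = {"dirty": write, "hits": 1}

    def _evict(self):
        # Scan from the LRU end for a page that is clean or cold-dirty.
        for page_id, meta in self.pages.items():
            not_cold_dirty = meta["dirty"] and meta["hits"] > 1  # assumed definition
            if not not_cold_dirty:
                victim = page_id
                break
        else:
            # Every page is not-cold dirty: fall back to plain LRU.
            victim = next(iter(self.pages))
        if self.pages[victim]["dirty"]:
            self.flash_writes += 1  # evicting a dirty page costs a flash write
        del self.pages[victim]
```

With a capacity of two, a twice-referenced dirty page survives the eviction that plain LRU would have charged a flash write for, and a clean page is dropped instead.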
