• 제목/요약/키워드: reducing memory

검색결과 424건 처리시간 0.026초

Analysis on the Performance and Temperature of the 3D Quad-core Processor according to Cache Organization (캐쉬 구성에 따른 3차원 쿼드코어 프로세서의 성능 및 온도 분석)

  • Son, Dong-Oh;Ahn, Jin-Woo;Choi, Hong-Jun;Kim, Jong-Myon;Kim, Cheol-Hong
    • Journal of the Korea Society of Computer and Information
    • /
    • 제17권6호
    • /
    • pp.1-11
    • /
    • 2012
  • As the process technology scales down, multi-core processors cause serious problems such as increased interconnection delay, high power consumption and thermal problems. To solve the problems in 2D multi-core processors, researchers have focused on the 3D multi-core processor architecture. Compared to the 2D multi-core processor, the 3D multi-core processor decreases interconnection delay by reducing wire length significantly, since each core on different layers is connected using vertical through-silicon via(TSV). However, the power density in the 3D multi-core processor is increased dramatically compared to that in the 2D multi-core processor, because multiple cores are stacked vertically. Unfortunately, increased power density causes thermal problems, resulting in high cooling cost, negative impact on the reliability. Therefore, temperature should be considered together with performance in designing 3D multi-core processors. In this work, we analyze the temperature of the cache in quad-core processors varying cache organization. Then, we propose the low-temperature cache organization to overcome the thermal problems. Our evaluation shows that peak temperature of the instruction cache is lower than threshold. The peak temperature of the data cache is higher than threshold when the cache is composed of many ways. According to the results, our proposed cache organization not only efficiently reduces the peak temperature but also reduces the performance degradation for 3D quad-core processors.

A Study on the Pixel-Paralled Image Processing System for Image Smoothing (영상 평활화를 위한 화소-병렬 영상처리 시스템에 관한 연구)

  • Kim, Hyun-Gi;Yi, Cheon-Hee
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • 제39권11호
    • /
    • pp.24-32
    • /
    • 2002
  • In this paper we implemented various image processing filtering using the format converter. This design method is based on realized the large processor-per-pixel array by integrated circuit technology. These two types of integrated structure are can be classify associative parallel processor and parallel process DRAM(or SRAM) cell. Layout pitch of one-bit-wide logic is identical memory cell pitch to array high density PEs in integrate structure. This format converter design has control path implementation efficiently, and can be utilize the high technology without complicated controller hardware. Sequence of array instruction are generated by host computer before process start, and instructions are saved on unit controller. Host computer is executed the pixel-parallel operation starting at saved instructions after processing start. As a result, we obtained three result that 1)simple smoothing suppresses higher spatial frequencies, reducing noise but also blurring edges, 2) a smoothing and segmentation process reduces noise while preserving sharp edges, and 3) median filtering, like smoothing and segmentation, may be applied to reduce image noise. Median filtering eliminates spikes while maintaining sharp edges and preserving monotonic variations in pixel values.

Implementation of Multicore-Aware Load Balancing on Clusters through Data Distribution in Chapel (클러스터 상에서 다중 코어 인지 부하 균등화를 위한 Chapel 데이터 분산 구현)

  • Gu, Bon-Gen;Carpenter, Patrick;Yu, Weikuan
    • The KIPS Transactions:PartA
    • /
    • 제19A권3호
    • /
    • pp.129-138
    • /
    • 2012
  • In distributed memory architectures like clusters, each node stores a portion of data. How data is distributed across nodes influences the performance of such systems. The data distribution scheme is the strategy to distribute data across nodes and realize parallel data processing. Due to various reasons such as maintenance, scale up, upgrade, etc., the performance of nodes in a cluster can often become non-identical. In such clusters, data distribution without considering performance cannot efficiently distribute data on nodes. In this paper, we propose a new data distribution scheme based on the number of cores in nodes. We use the number of cores as the performance factor. In our data distribution scheme, each node is allocated an amount of data proportional to the number of cores in it. We implement our data distribution scheme using the Chapel language. To show our data distribution is effective in reducing the execution time of parallel applications, we implement Mandelbrot Set and ${\pi}$-Calculation programs with our data distribution scheme, and compare the execution times on a cluster. Based on experimental results on clusters of 8-core and 16-core nodes, we demonstrate that data distribution based on the number of cores can contribute to a reduction in the execution times of parallel programs on clusters.

Access Control of XML Documents Including Update Operators (갱신 연산을 고려한 XML문서의 접근제어)

  • Lim Chung-Hwan;Park Seog
    • Journal of KIISE:Databases
    • /
    • 제31권6호
    • /
    • pp.567-584
    • /
    • 2004
  • As XML becomes popular as the way of presenting information on the web, how to secure XML data becomes an important issue. So far study on XML security has focused on security of data communications by using digital sign or encryption technology. But, it now requires not just to communicate secure XML data on communication but also to manage query process to access XML data since XML data becomes more complicated and bigger. We can manage XML data queries by access control technique. Right now current XML data access control only deals with read operation. This approach has no option to process update XML queries. In this paper, we present XML access control model and technique that can support both read and update operations. In this paper, we will propose the operation for XML document update. Also, We will define action type as a new concept to manage authorization information and process update queries. It results in both minimizing access control steps and reducing memory cost. In addition, we can filter queries that have no access rights at the XML data, which it can reduce unnecessary tasks for processing unauthorized query. As a result of the performance evaluation, we show our access control model is proved to be better than other access control model in update query. But it has a little overhead to decide action type in select query.

Processor Design Technique for Low-Temperature Filter Cache (필터 캐쉬의 저온도 유지를 위한 프로세서 설계 기법)

  • Choi, Hong-Jun;Yang, Na-Ra;Lee, Jeong-A;Kim, Jong-Myon;Kim, Cheol-Hong
    • Journal of the Korea Society of Computer and Information
    • /
    • 제15권1호
    • /
    • pp.1-12
    • /
    • 2010
  • Recently, processor performance has been improved dramatically. Unfortunately, as the process technology scales down, energy consumption in a processor increases significantly whereas the processor performance continues to improve. Moreover, peak temperature in the processor increases dramatically due to the increased power density, resulting in serious thermal problem. For this reason, performance, energy consumption and thermal problem should be considered together when designing up-to-date processors. This paper proposes three modified filter cache schemes to alleviate the thermal problem in the filter cache, which is one of the most energy-efficient design techniques in the hierarchical memory systems : Bypass Filter Cache (BFC), Duplicated Filter Cache (DFC) and Partitioned Filter Cache (PFC). BFC scheme enables the direct access to the L1 cache when the temperature on the filter cache exceeds the threshold, leading to reduced temperature on the filter cache. DFC scheme lowers temperature on the filter cache by appending an additional filter cache to the existing filter cache. The filter cache for PFC scheme is composed of two half-size filter caches to lower the temperature on the filter cache by reducing the access frequency. According to our simulations using Wattch and Hotspot, the proposed partitioned filter cache shows the lowest peak temperature on the filter cache, leading to higher reliability in the processor.

Mining Frequent Sequential Patterns over Sequence Data Streams with a Gap-Constraint (순차 데이터 스트림에서 발생 간격 제한 조건을 활용한 빈발 순차 패턴 탐색)

  • Chang, Joong-Hyuk
    • Journal of the Korea Society of Computer and Information
    • /
    • 제15권9호
    • /
    • pp.35-46
    • /
    • 2010
  • Sequential pattern mining is one of the essential data mining tasks, and it is widely used to analyze data generated in various application fields such as web-based applications, E-commerce, bioinformatics, and USN environments. Recently data generated in the application fields has been taking the form of continuous data streams rather than finite stored data sets. Considering the changes in the form of data, many researches have been actively performed to efficiently find sequential patterns over data streams. However, conventional researches focus on reducing processing time and memory usage in mining sequential patterns over a target data stream, so that a research on mining more interesting and useful sequential patterns that efficiently reflect the characteristics of the data stream has been attracting no attention. This paper proposes a mining method of sequential patterns over data streams with a gap constraint, which can help to find more interesting sequential patterns over the data streams. First, meanings of the gap for a sequential pattern and gap-constrained sequential patterns are defined, and subsequently a mining method for finding gap-constrained sequential patterns over a data stream is proposed.

Low Power TLB Supporting Multiple Page Sizes without Operation System (운영체제 도움 없이 멀티 페이지를 지원하는 저전력 TLB 구조)

  • Jung, Bo-Sung;Lee, Jung-Hoon
    • Journal of the Korea Society of Computer and Information
    • /
    • 제18권12호
    • /
    • pp.1-9
    • /
    • 2013
  • Even though the multiple pages TLB are effective in improving the performance, a conventional method with OS support cannot utilize multiple page sizes in user application. Thus, we propose a new multiple-TLB structure supporting multiple page sizes for high performance and low power consumption without any operating system support. The proposed TLB is organised as two parts of a S-TLB(Small TLB) with a small page size and a L-TLB(Large TLB) with a large page size. Both are designed as fully associative bank structures. The S-TLB stores small pages are evicted from the L-TLB, and the L-TLB stores large pages including a small page generated by the CPU. Each one bank module of S-TLB and L-TLB can be selectively accessed base on particular one and two bits of the virtual address generated from CPU, respectively. Energy savings are achieved by reducing the number of entries accessed at a time. Also, this paper proposed the simple 1-bit LRU policy to improve the performance. The proposed LRU policy can present recently referenced block by using an additional one bit of each entry on TLBs. This method can simply select a least recently used page from the L-TLB. According to the simulation results, the proposed TLB can reduce Energy * Delay by about 76%, 57%, and 6% compared with a fully associative TLB, a ARM TLB, and a Dual TLB, respectively.

Magnetization Switching of MTJs with CoFeSiB/Ru/CoFeSiB Free Layers (CoFeSiB/Ru/CoFeSiB 자유층을 갖는 자기터널 접합의 스위칭 자기장)

  • Lee, S.Y.;Lee, S.W.;Rhee, J.R.
    • Journal of the Korean Magnetics Society
    • /
    • 제17권3호
    • /
    • pp.124-127
    • /
    • 2007
  • Magnetic tunnel junctions (MTJs), which consisted of amorphous CoFeSiB layers, were investigated. The CoFeSiB layers were used to substitute for the traditionally used CoFe and/or NiFe layers with an emphasis given on understanding the effect of the amorphous free layer on the switching characteristics of the MTJs. CoFeSiB has a lower saturation magnetization ($M_s\;:\;560\;emu/cm^3$) and a higher anisotropy constant ($K_u\;:\;2800\;erg/cm^3$) than CoFe and NiFe, respectively. An exchange coupling energy ($J_{ex}$) of $-0.003\;erg/cm^2$ was observed by inserting a 1.0 nm Ru layer in between CoFeSiB layers. In the Si/$SiO_2$/Ta 45/Ru 9.5/IrMn 10/CoFe 7/$AlO_x$/CoFeSiB 7 or CoFeSiB (t)/Ru 1.0/CoFeSiB (7-t)/Ru 60 (in nm) MTJs structure, it was found that the size dependence of the switching field originated in the lower $J_{ex}$ using the experimental and simulation results. The CoFeSiB synthetic antiferromagnet structures were proved to be beneficial for the switching characteristics such as reducing the coercivity ($H_c$) and increasing the sensitivity in micrometer size, even in submicrometer sized elements.

Modified SMPO for Type-II Optimal Normal Basis (Type-II 최적 정규기저에서 변형된 SMPO)

  • Yang Dong-Jin;Chang Nam-Su;Ji Sung-Yeon;Kim Chang-Han
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • 제16권2호
    • /
    • pp.105-111
    • /
    • 2006
  • Cryptographic application and coding theory require operations in finite field $GF(2^m)$. In such a field, the area and time complexity of implementation estimate by memory and time delay. Therefore, the effort for constructing an efficient multiplier in finite field have been proceeded. Massey-Omura proposed a multiplier that uses normal bases to represent elements $CH(2^m)$ [11] and Agnew at al. suggested a sequential multiplier that is a modification of Massey-Omura's structure for reducing the path delay. Recently, Rayhani-Masoleh and Hasan and S.Kwon at al. suggested a area efficient multipliers for modifying Agnew's structure respectively[2,3]. In [2] Rayhani-Masoleh and Hasan proposed a modified multiplier that has slightly increased a critical path delay from Agnew at al's structure. But, In [3] S.Kwon at al. proposed a modified multiplier that has no loss of a time efficiency from Agnew's structure. In this paper we will propose a multiplier by modifying Rayhani-Masoleh and Hassan's structure and the area-time complexity of the proposed multiplier is exactly same as that of S.Kwon at al's structure for type-II optimal normal basis.

The past, present and future of silkworm as a natural health food (천연 건강식품인 누에의 과거, 현재 그리고 미래)

  • Kim, Kee-Young;Koh, Young Ho
    • Food Science and Industry
    • /
    • 제55권2호
    • /
    • pp.154-165
    • /
    • 2022
  • Humans have been breeding the mulberry silkworm for the long period of time to obtain silk fabric and nutrient-rich pupae. Currently, silkworm larvae, pupae, and silk-Fibroin hydrolysates are registered as food raw materials, while silkworm feces and Bombyx batryticatus are registered as Korean traditional medicines. Among sericulture products, individually recognized health functional food ingredients include silk-protein acid-hydrolysates for immunity enhancement, Fibroin-hydrolysates for memory improvement, and freeze-dried 5th instar and 3rd-day-silkworm powder for lowering-blood sugar. Recently, HongJam produced by steaming and freeze-drying mature silkworms were reported to have various health-promoting effects such as preventing the onset of Alzheimer's disease and Parkinson's disease, enhancing gastro-intestinal functions, improving skin-whitening and hair growth, and extending healthspan. By consuming silkworm products with various health-promoting effects, it is possible to increase the healthspan of human beings, thereby reducing personal and national medical expenses, resulting in increasing the individual's happiness.