• Title/Summary/Keyword: memory efficiency

Search Result 712, Processing Time 0.029 seconds

VLSI Implementation of Hopfield Neural Network (Hopfield 신령회로망의 VLSI 구현에 관한 연구)

  • 박성범;오재혁;이창호
    • Journal of the Korean Institute of Telematics and Electronics B
    • /
    • v.30B no.11
    • /
    • pp.66-73
    • /
    • 1993
  • This paper presents an analog circuit implementation and experimental resuls of the Hopfield type neural network. The proposed architecture enables the reconfiguration betwewn feedback and feedforward networks and employs new circuit designs for the weight supply and storage, analog multilier, nd current-voltage converter, in order to achieve area efficiency as well as function al versatility. The layout design of the eight-neuron neural network is tested as an associative memory to verify its applicability to real world.

  • PDF

Meshfree/GFEM in hardware-efficiency prospective

  • Tian, Rong
    • Interaction and multiscale mechanics
    • /
    • v.6 no.2
    • /
    • pp.197-210
    • /
    • 2013
  • A fundamental trend of processor architecture evolving towards exaflops is fast increasing floating point performance (so-called "free" flops) accompanied by much slowly increasing memory and network bandwidth. In order to fully enjoy the "free" flops, a numerical algorithm of PDEs should request more flops per byte or increase arithmetic intensity. A meshfree/GFEM approximation can be the class of the algorithm. It is shown in a GFEM without extra dof that the kind of approximation takes advantages of the high performance of manycore GPUs by a high accuracy of approximation; the "expensive" method is found to be reversely hardware-efficient on the emerging architecture of manycore.

A Study on the Determination of System Characteristic Constants Using Diakoptic Method (분할법을 이용한 계통발생정수 결정에 관한 연구)

  • 김준현;유석구;송기철
    • The Transactions of the Korean Institute of Electrical Engineers
    • /
    • v.32 no.11
    • /
    • pp.387-393
    • /
    • 1983
  • This paper descrives the computational algorithm for determination of system characteristic constants using diakoptic method. When the system charateristic contants of large scale power system are calculated by using sensitivity matrix method which requires the inversion of Jacobian matrix, the necessary computing time and memory requirements increase very largely. Diakoptic mothods is benefit to get around these problems, and it is applied to a model system compared with the integrated method to discuss the efficiency of the algorithm.

Rapid plasmid mapping computer program (Plasmid의 제한효소 지도 작성을 위한 콤퓨터 프로그램)

  • 이동훈;김영준;이승택;강현삼
    • Korean Journal of Microbiology
    • /
    • v.24 no.1
    • /
    • pp.12-17
    • /
    • 1986
  • A new computer algorithm is described to order the restriction fragments of plasmid DNA which has been cleaved with several restriction endonucleases in single or double digestions rapidly with realistic error rates. The permutation and high weight on small fragments methods construct all logical circular map solutions. The program is written in Apple BASIC and run on an Apple II plus microcomputer with 64K memory. Several examples are presented which indicate the high efficiency of the profram in construction possible restriction map for YEp24.

  • PDF

Memory Efficient Parallel Ray Casting Algorithm for Unstructured Grid Volume Rendering on Multi-core CPUs (비정렬 격자 볼륨 렌더링을 위한 다중코어 CPU기반 메모리 효율적 광선 투사 병렬 알고리즘)

  • Kim, Duksu
    • Journal of KIISE
    • /
    • v.43 no.3
    • /
    • pp.304-313
    • /
    • 2016
  • We present a novel memory-efficient parallel ray casting algorithm for unstructured grid volume rendering on multi-core CPUs. Our method is based on the Bunyk ray casting algorithm. To solve the high memory overhead problem of the Bunyk algorithm, we allocate a fixed size local buffer for each thread and the local buffers contain information of recently visited faces. The stored information is used by other rays or replaced by other face's information. To improve the utilization of local buffers, we propose an image-plane based ray grouping algorithm that makes ray groups have high coherency. The ray groups are then distributed to computing threads and each thread processes the given groups independently. We also propose a novel hash function that uses the index of faces as keys for calculating the buffer index each face will use to store the information. To see the benefits of our method, we applied it to three unstructured grid datasets with different sizes and measured the performance. We found that our method requires just 6% of the memory space compared with the Bunyk algorithm for storing face information. Also it shows compatible performance with the Bunyk algorithm even though it uses less memory. In addition, our method achieves up to 22% higher performance for a large-scale unstructured grid dataset with less memory than Bunyk algorithm. These results show the robustness and efficiency of our method and it demonstrates that our method is suitable to volume rendering for a large-scale unstructured grid dataset.

Buffer Cache Management for Low Power Consumption (저전력을 위한 버퍼 캐쉬 관리 기법)

  • Lee, Min;Seo, Eui-Seong;Lee, Joon-Won
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.35 no.6
    • /
    • pp.293-303
    • /
    • 2008
  • As the computing environment moves to the wireless and handheld system, the power efficiency is getting more important. That is the case especially in the embedded hand-held system and the power consumed by the memory system takes the second largest portion in overall. To save energy consumed in the memory system we can utilize low power mode of SDRAM. In the case of RDRAM, nap mode consumes less than 5% of the power consumed in active or standby mode. However hardware controller itself can't use this facility efficiently unless the operating system cooperates. In this paper we focus on how to minimize the number of active units of SDRAM. The operating system allocates its physical pages so that only a few units of SDRAM need to be activated and the unnecessary SDRAM can be put into nap mode. This work can be considered as a generalized and system-wide version of PAVM(Power-Aware Virtual Memory) research. We take all the physical memory into account, especially buffer cache, which takes an half of total memory usage on average. Because of the portion of buffer cache and its importance, PAVM approach cannot be robust without taking the buffer cache into account. In this paper, we analyze the RAM usage and propose power-aware page allocation policy. Especially the pages mapped into the process' address space and the buffer cache pages are considered. The relationship and interactions of these two kinds of pages are analyzed and exploited for energy saving.

Wall Cuckoo: A Method for Reducing Memory Access Using Hash Function Categorization (월 쿠쿠: 해시 함수 분류를 이용한 메모리 접근 감소 방법)

  • Moon, Seong-kwang;Min, Dae-hong;Jang, Rhong-ho;Jung, Chang-hun;NYang, Dae-hun;Lee, Kyung-hee
    • KIPS Transactions on Computer and Communication Systems
    • /
    • v.8 no.6
    • /
    • pp.127-138
    • /
    • 2019
  • The data response speed is a critical issue of cloud services because it directly related to the user experience. As such, the in-memory database is widely adopted in many cloud-based applications for achieving fast data response. However, the current implementation of the in-memory database is mostly based on the linked list-based hash table which cannot guarantee the constant data response time. Thus, cuckoo hashing was introduced as an alternative solution, however, there is a disadvantage that only half of the allocated memory can be used for storing data. Subsequently, bucketized cuckoo hashing (BCH) improved the performance of cuckoo hashing in terms of memory efficiency but still cannot overcome the limitation that the insert overhead. In this paper, we propose a data management solution called Wall Cuckoo which aims to improve not only the insert performance but also lookup performance of BCH. The key idea of Wall Cuckoo is that separates the data among a bucket according to the different hash function be used. By doing so, the searching range among the bucket is narrowed down, thereby the amount of slot accesses required for the data lookup can be reduced. At the same time, the insert performance will be improved because the insert is following up the operation of the lookup. According to analysis, the expected value of slot access required for our Wall Cuckoo is less than that of BCH. We conducted experiments to show that Wall Cuckoo outperforms the BCH and Sorting Cuckoo in terms of the amount of slot access in lookup and insert operations and in different load factor (i.e., 10%-95%).

An efficient search of binary tree for huffman decoding based on numeric interpretation of codewords

  • Kim, Byeong-Il;Chang, Tae-Gyu;Jeong, Jong-Hoon
    • Proceedings of the IEEK Conference
    • /
    • 2002.07a
    • /
    • pp.133-136
    • /
    • 2002
  • This paper presents a new method of Huffman decoding which gives a significant improvement of processing efficiency based on the reconstruction of an efficient one-dimensional array data structure incorporating the numeric interpretation of the accrued codewords in the binary tree. In the Proposed search method, the branching address is directly obtained by the arithematic operation with the incoming digit value eliminating the compare instruction needed in the binary tree search. The proposed search method gives 30% of improved Processing efficiency and the memory space of the reconstructed Huffman table is reduced to one third compared to the ordinary ‘compare and jump’ based binary tree. The experimental result with the six MPEG-2 AAC test files also shows about 198% of performance improvement compared to those of the widely used conventional sequential search method.

  • PDF

A Study on the Real Time Remote Monitoring Technology based on Embedded Linux System (임베디드 리눅스 시스템 기반 실시간 원격 모니터링 기법 연구)

  • Hong, Hang-Seol;Kim, Seong-Hwan;Park, Jang-Hyeon
    • Proceedings of the Korean Institute of Intelligent Systems Conference
    • /
    • 2007.04a
    • /
    • pp.11-14
    • /
    • 2007
  • The Embedded Linux System has been developed as a system which can be used with a processor of low efficiency and small-sized memory. Unlike the usual Linux and Windows web server, it has some limitations in the install of application programs, compatibility and scalability when transferring data through web server in real-time. In this paper, we present a real-time remote monitoring system which is very useful to the embedded linux system. The presented system use Java Script without the additional programs at the Embedded Linux System web server and confirm the efficiency of the system through the existing real-time remote monitoring techniques.

  • PDF

Computational Efficiency of Resamplers in Multi-Stage Structure (재표본화에서 다단계 구현의 계산 효율성)

  • Kim Rin-Chul
    • Journal of Broadcast Engineering
    • /
    • v.11 no.1 s.30
    • /
    • pp.138-141
    • /
    • 2006
  • This paper evaluates the computational efficiency of sample-rate converters with rational factors in multi-stage structure in terms of memory requirement and multiplications per second. We describe resolution preserving and mutual prime conditions, and then present a method for designing the converter from which optimal rational-valued conversion factors for each stage can be yielded directly. As an example, we show an implementation of the 44.1-to-48KHz sample-rate converter in 2-stage structure.