• Title/Summary/Keyword: In-Memory Computing

Evaluation of Network Reliability Using Most Probable States

  • Oh, Dae-Ho;Park, Dong-Ho;Lee, Seung-Min
    • Proceedings of the Korean Reliability Society Conference / 2001.06a / pp.463-469 / 2001
  • An algorithm is presented for generating the most probable system states in decreasing order of probability, based on the probability of each unit. The proposed algorithm is compared with existing methods with respect to memory requirement and execution time. Our method is simpler and, judging from the computational experiments, requires less memory than previously known methods while taking comparable execution time for an acceptable level of the criterion. (An illustrative sketch follows this entry.)
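
The entry above concerns listing system states in decreasing order of probability. As a point of reference only, not the authors' memory-efficient algorithm, the sketch below brute-forces all 2^n states of n independent two-state units with hypothetical per-unit up-probabilities and sorts them by probability, which makes the problem statement concrete.

```c
/* Reference baseline: enumerate all states of n independent two-state units
 * and list them in decreasing order of probability. The paper's algorithm
 * achieves the same ordering with far less memory; this brute force only
 * illustrates the problem. */
#include <stdio.h>
#include <stdlib.h>

#define N_UNITS 4  /* hypothetical example size */

typedef struct { unsigned mask; double prob; } state_t;

static int cmp_desc(const void *a, const void *b) {
    double pa = ((const state_t *)a)->prob, pb = ((const state_t *)b)->prob;
    return (pa < pb) - (pa > pb);   /* sort by probability, descending */
}

int main(void) {
    /* hypothetical probability that each unit is up */
    const double p[N_UNITS] = { 0.95, 0.90, 0.80, 0.99 };
    const unsigned n_states = 1u << N_UNITS;
    state_t *states = malloc(n_states * sizeof *states);

    for (unsigned m = 0; m < n_states; m++) {
        double prob = 1.0;
        for (int i = 0; i < N_UNITS; i++)
            prob *= (m & (1u << i)) ? p[i] : (1.0 - p[i]);
        states[m].mask = m;
        states[m].prob = prob;
    }
    qsort(states, n_states, sizeof *states, cmp_desc);

    for (unsigned k = 0; k < n_states; k++) {
        printf("P=%.6f  state=", states[k].prob);
        for (int i = N_UNITS - 1; i >= 0; i--)
            putchar((states[k].mask & (1u << i)) ? '1' : '0');
        putchar('\n');
    }
    free(states);
    return 0;
}
```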

Computational Complexity Comparison of Second-Order Volterra Filtering Algorithms

  • Im, Sungin
    • The Journal of the Acoustical Society of Korea / v.16 no.2E / pp.38-46 / 1997
  • The objective of this paper is to compare the computational complexity of five algorithms for computing time-domain second-order Volterra filter outputs, in terms of the number of real multiplication and addition operations required for implementation. This study shows that if the filter memory length is greater than or equal to 16, the fast algorithm using the overlap-save method and the frequency-domain symmetry properties of the quadratic coefficients is the most efficient among the algorithms investigated in this paper. When the filter memory length is less than 16, the algorithm using the time-domain symmetry properties is better than any other algorithm. (A direct-form sketch follows this entry.)
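
A direct time-domain evaluation of a second-order Volterra filter shows where the multiplications counted in such a comparison come from. The sketch below is a generic direct form that exploits the symmetry of the quadratic kernel by summing only over j >= i; the memory length and coefficients are illustrative, not taken from the paper.

```c
/* Direct time-domain second-order Volterra filter:
 *   y[n] = sum_i h1[i] x[n-i] + sum_i sum_{j>=i} h2[i][j] x[n-i] x[n-j]
 * Only the upper triangle of h2 is used (kernel symmetry), which roughly
 * halves the number of quadratic multiplications. */
#include <stdio.h>

#define M   4          /* filter memory length (illustrative) */
#define LEN 8          /* input length (illustrative) */

static double volterra2(const double *x, int n,
                        const double h1[M], const double h2[M][M]) {
    double y = 0.0;
    for (int i = 0; i < M; i++) {
        double xi = (n - i >= 0) ? x[n - i] : 0.0;
        y += h1[i] * xi;                       /* linear term */
        for (int j = i; j < M; j++) {
            double xj = (n - j >= 0) ? x[n - j] : 0.0;
            y += h2[i][j] * xi * xj;           /* symmetric quadratic term */
        }
    }
    return y;
}

int main(void) {
    const double x[LEN]   = { 1, -1, 0.5, 0.25, 0, -0.5, 1, 2 };
    const double h1[M]    = { 0.5, 0.25, 0.125, 0.0625 };
    const double h2[M][M] = {
        { 0.10, 0.05, 0.02, 0.01 },
        { 0.00, 0.08, 0.03, 0.01 },
        { 0.00, 0.00, 0.05, 0.02 },
        { 0.00, 0.00, 0.00, 0.04 },
    };
    for (int n = 0; n < LEN; n++)
        printf("y[%d] = %f\n", n, volterra2(x, n, h1, h2));
    return 0;
}
```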

Meshfree/GFEM in hardware-efficiency prospective

  • Tian, Rong
    • Interaction and multiscale mechanics / v.6 no.2 / pp.197-210 / 2013
  • A fundamental trend of processor architectures evolving towards exaflops is rapidly increasing floating-point performance (so-called "free" flops) accompanied by much more slowly increasing memory and network bandwidth. To fully exploit the "free" flops, a numerical algorithm for PDEs should request more flops per byte, i.e., increase its arithmetic intensity. A meshfree/GFEM approximation can be such a class of algorithm. It is shown, using a GFEM without extra degrees of freedom, that this kind of approximation takes advantage of the high performance of manycore GPUs through its high approximation accuracy; the "expensive" method turns out, conversely, to be hardware-efficient on emerging manycore architectures. (A small arithmetic-intensity sketch follows this entry.)
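
Arithmetic intensity, the ratio of floating-point operations to bytes moved, is the quantity this abstract says an algorithm must raise to exploit the "free" flops. The toy calculation below contrasts two generic kernels under simple counting assumptions (double precision, each operand moved from memory once); the numbers are illustrative and are not measurements from the paper.

```c
/* Toy arithmetic-intensity (flops per byte) estimates for two generic kernels,
 * assuming 8-byte doubles and that each operand is moved from memory once. */
#include <stdio.h>

int main(void) {
    const double n = 4096.0;          /* illustrative problem size */
    const double bytes_per_double = 8.0;

    /* AXPY: y = a*x + y -> 2n flops, ~3n doubles moved (read x, read y, write y) */
    double axpy_flops = 2.0 * n;
    double axpy_bytes = 3.0 * n * bytes_per_double;

    /* Dense matrix-vector product: 2n^2 flops, ~(n^2 + 3n) doubles moved */
    double gemv_flops = 2.0 * n * n;
    double gemv_bytes = (n * n + 3.0 * n) * bytes_per_double;

    printf("AXPY arithmetic intensity: %.3f flops/byte\n", axpy_flops / axpy_bytes);
    printf("GEMV arithmetic intensity: %.3f flops/byte\n", gemv_flops / gemv_bytes);
    /* Higher flops/byte makes better use of "free" flops on bandwidth-limited hardware. */
    return 0;
}
```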

Analysis of Cloud Service Providers

  • Lee, Yo-Seob
    • International Journal of Advanced Culture Technology / v.9 no.3 / pp.315-320 / 2021
  • Currently, cloud computing is a technology that is greatly changing the IT field. For many businesses, numerous cloud services are available in the form of customizable, reliable, and cost-effective web applications. Most cloud service providers offer functions such as IoT, machine learning, AI services, blockchain, AR & VR, mobile services, and containers, in addition to basic cloud services that support the scalability of processors, memory, and storage. In this paper, we examine the most widely used cloud service providers and compare the services they provide.

Performance Analysis of Clustering and Non-clustering Methods in Flash Memory Environment (플래시 메모리 환경에서 클러스터링 방법과 비 클러스터링 방법의 성능 분석)

  • Bae, Duck-Ho;Chang, Ji-Woong;Kim, Sang-Wook
    • Journal of KIISE:Computing Practices and Letters / v.14 no.6 / pp.599-603 / 2008
  • Flash memory has unique characteristics: the write operation is much more costly than the read operation, and in-place updating is not allowed. In this paper, we analyze how these characteristics of flash memory affect the performance of clustering and non-clustering in record management, and show that non-clustering is more suitable in a flash memory environment, which does not hold in a disk environment. We also discuss the problems of the existing non-clustering method and identify important design factors for record management methods in a flash memory environment.

Improvement Method and Performance Analysis of Shared Memory in Dual Core Embedded Linux system (듀얼코어 임베디드 리눅스 시스템에서 공유 메모리 성능 개선 방안 및 성능 분석)

  • Jung, Ji-Sung;Kim, Chang-Bong
    • Journal of Internet Computing and Services / v.11 no.4 / pp.95-106 / 2010
  • Recently, multiple processes communicate with one another, sharing resources and information to cooperate in complicated programming environments. The kernel provides IPC (Inter-Process Communication) mechanisms so that processes can communicate with each other. Shared memory is a technique that allows many processes to access an identical memory area in the Linux environment. In this paper, we propose a method for improving the performance of shared memory in a dual-core embedded Linux system consisting of different cores and different operating systems. We construct an MPC2530F (ARM926F+ARM946E) Linux system and measure its performance. We attempt a performance enhancement on each CPU for each process that uses shared memory. (A minimal POSIX shared-memory sketch follows this entry.)
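
The entry above measures shared-memory IPC on an embedded dual-core Linux board. As a generic single-board illustration of the mechanism, not the authors' dual-core setup or measurement code, the sketch below creates a POSIX shared-memory object (the object name is hypothetical), maps it in a parent and a forked child, and passes a message through it.

```c
/* Minimal POSIX shared-memory IPC example: the parent writes a message into
 * a shared mapping and the forked child reads it from the same pages.
 * Compile with: cc shm_demo.c -lrt (the -lrt flag is needed on older glibc). */
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define SHM_NAME "/demo_shm"   /* hypothetical object name */
#define SHM_SIZE 4096

int main(void) {
    int fd = shm_open(SHM_NAME, O_CREAT | O_RDWR, 0600);
    if (fd < 0) { perror("shm_open"); return 1; }
    if (ftruncate(fd, SHM_SIZE) < 0) { perror("ftruncate"); return 1; }

    char *buf = mmap(NULL, SHM_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (buf == MAP_FAILED) { perror("mmap"); return 1; }

    pid_t pid = fork();
    if (pid == 0) {                       /* child: read after the parent writes */
        sleep(1);                         /* crude sync; a real program would use a semaphore */
        printf("child read: %s\n", buf);
        return 0;
    }
    snprintf(buf, SHM_SIZE, "hello via shared memory");   /* parent writes */
    waitpid(pid, NULL, 0);

    munmap(buf, SHM_SIZE);
    close(fd);
    shm_unlink(SHM_NAME);                 /* remove the shared-memory object */
    return 0;
}
```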

JMP+RAND: Mitigating Memory Sharing-Based Side-Channel Attack by Embedding Random Values in Binaries (JMP+RAND: 바이너리 난수 삽입을 통한 메모리 공유 기반 부채널 공격 방어 기법)

  • Kim, Taehun;Shin, Youngjoo
    • KIPS Transactions on Computer and Communication Systems / v.9 no.5 / pp.101-106 / 2020
  • Since computers became available, much effort has been made to achieve information security. Although memory protection defense mechanisms have been studied the most among them, problems with existing memory protection mechanisms have surfaced as computer performance has improved, and new defense mechanisms are needed due to the advent of side-channel attacks. In this paper, we propose JMP+RAND, which embeds 5 to 8 bytes of random values per page to defend against memory sharing-based side-channel attacks and to bridge the gap left by existing memory protection mechanisms. Unlike existing side-channel defense mechanisms, JMP+RAND uses static binary rewriting and contiguous jmp instructions with random values to defend against side-channel attacks in advance. We numerically calculated the time required for a memory sharing-based side-channel attack against a binary that adopts the JMP+RAND technique and verified that such attacks are infeasible in realistic time. JMP+RAND has very low overhead on modern architectures because branch prediction makes the jmp instruction branch quickly and accurately. Since random values can be embedded only in specific programs using JMP+RAND, it is expected to be highly efficient when used together with memory deduplication, especially in cloud computing environments. (An illustrative byte-level sketch follows this entry.)
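
The core idea described above, keeping the random bytes out of the execution path by jumping over them, can be illustrated at the byte level. The sketch below is not the authors' static binary rewriter; it merely emits an x86-64 short jmp (opcode 0xEB) whose 8-bit displacement skips a run of 5 to 8 random filler bytes, which is the instruction pattern the abstract describes.

```c
/* Illustration of the jmp-over-random-bytes pattern: emit
 *   EB <len>  <len random bytes>
 * so execution skips the random filler. This only builds the byte sequence;
 * a real implementation would patch it into code pages via binary rewriting. */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

static size_t emit_jmp_rand(unsigned char *out, unsigned n_random) {
    size_t k = 0;
    out[k++] = 0xEB;                       /* short jmp with rel8 displacement */
    out[k++] = (unsigned char)n_random;    /* displacement: skip the filler bytes */
    for (unsigned i = 0; i < n_random; i++)
        out[k++] = (unsigned char)(rand() & 0xFF);   /* random filler bytes */
    return k;
}

int main(void) {
    unsigned char patch[16];
    srand((unsigned)time(NULL));

    unsigned n_random = 5 + (unsigned)(rand() % 4);  /* 5..8 bytes, as in the abstract */
    size_t len = emit_jmp_rand(patch, n_random);

    printf("emitted %zu bytes:", len);
    for (size_t i = 0; i < len; i++)
        printf(" %02X", patch[i]);
    printf("\n");
    return 0;
}
```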

MATE: Memory- and Retraining-Free Error Correction for Convolutional Neural Network Weights

  • Jang, Myeungjae;Hong, Jeongkyu
    • Journal of information and communication convergence engineering / v.19 no.1 / pp.22-28 / 2021
  • Convolutional neural networks (CNNs) are among the most frequently used artificial intelligence techniques. Among CNN-based applications, small and timing-sensitive applications have emerged that must be reliable to prevent severe accidents. However, because such small and timing-sensitive systems do not have sufficient resources, they lack proper error protection schemes. In this paper, we propose MATE, a low-cost CNN weight error correction technique. Based on the observation that not all mantissa bits are closely related to accuracy, MATE replaces some mantissa bits in each weight with error correction codes. Therefore, MATE can provide high data protection without requiring additional memory space or modifying the memory architecture. The experimental results demonstrate that MATE retains nearly the same accuracy as the ideal error-free case on erroneous DRAM and achieves approximately 60% accuracy even with extremely high bit error rates. (A simplified bit-level sketch follows this entry.)
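
To make the "replace some mantissa bits with a code" idea concrete, the sketch below stores a simple 8-bit XOR checksum of the upper 24 bits of an IEEE-754 float32 weight in its lowest 8 mantissa bits. This is a detection-oriented simplification under assumed parameters; the actual MATE scheme uses proper error correction codes rather than this checksum.

```c
/* Simplified MATE-like encoding: sacrifice the 8 least significant mantissa
 * bits of a float32 weight to hold an XOR checksum of the remaining 24 bits.
 * Real MATE uses error *correction* codes; this sketch only detects errors. */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

static uint8_t checksum24(uint32_t hi24) {       /* XOR of the three upper bytes */
    return (uint8_t)((hi24 >> 16) ^ (hi24 >> 8) ^ hi24);
}

static float encode(float w) {
    uint32_t bits; memcpy(&bits, &w, sizeof bits);
    uint32_t hi24 = bits >> 8;                   /* sign, exponent, high mantissa */
    bits = (hi24 << 8) | checksum24(hi24);       /* low 8 mantissa bits -> checksum */
    memcpy(&w, &bits, sizeof w);
    return w;
}

static int check(float w) {                      /* 1 if protected bits look intact */
    uint32_t bits; memcpy(&bits, &w, sizeof bits);
    return checksum24(bits >> 8) == (uint8_t)(bits & 0xFF);
}

int main(void) {
    float weight = 0.123456789f;                 /* hypothetical CNN weight */
    float stored = encode(weight);
    printf("original %.9f, encoded %.9f, check=%d\n", weight, stored, check(stored));

    /* Simulate a DRAM bit flip in one of the protected bits. */
    uint32_t bits; memcpy(&bits, &stored, sizeof bits);
    bits ^= 1u << 20;
    float corrupted; memcpy(&corrupted, &bits, sizeof corrupted);
    printf("after bit flip: check=%d (error detected)\n", check(corrupted));
    return 0;
}
```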

Analyzing the Overhead of the Memory Mapped File I/O for In-Memory File Systems (메모리 파일시스템에서 메모리 매핑을 이용한 파일 입출력의 오버헤드 분석)

  • Choi, Jungsik;Han, Hwansoo
    • KIISE Transactions on Computing Practices / v.22 no.10 / pp.497-503 / 2016
  • Emerging next-generation storage technologies such as non-volatile memory will help eliminate almost all of the storage latency that has plagued previous storage devices. In conventional storage systems, the latency of slow storage devices dominates access latency; hence, software efficiency is not critical. With low-latency storage, software costs can quickly become the dominant factor. Hence, researchers have proposed memory-mapped file I/O to avoid the software overhead. Mapping a file into user memory space enables users to access the file directly, making it possible to avoid the complicated I/O stack. This minimizes the number of user/kernel mode switches, and there is no data copy between kernel and user areas. Despite the benefits of memory-mapped file I/O, its overhead still needs to be addressed, because the existing memory-mapped file I/O mechanism is designed for slow block devices. In this paper, we identify the overheads of memory-mapped file I/O via experiments. (A minimal mmap sketch follows this entry.)
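
Memory-mapped file I/O as discussed above replaces read/write system calls with direct loads and stores into a mapped region. The sketch below is a generic Linux example (the file name and sizes are illustrative): it maps a file with mmap and modifies it in place without any explicit read or write call.

```c
/* Memory-mapped file I/O: map a file and access its bytes directly, avoiding
 * per-request system calls and the kernel-to-user data copy of read()/write(). */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void) {
    const char *path = "demo.dat";               /* illustrative file name */
    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0) { perror("open"); return 1; }
    if (ftruncate(fd, 4096) < 0) { perror("ftruncate"); return 1; }

    char *map = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) { perror("mmap"); return 1; }

    /* Loads and stores go straight to the mapped pages; no read()/write()
     * calls and no user/kernel data copies are involved. */
    memcpy(map, "written through the mapping", 28);
    printf("first bytes: %.28s\n", map);

    if (msync(map, 4096, MS_SYNC) < 0) perror("msync");  /* flush to backing store */
    munmap(map, 4096);
    close(fd);
    return 0;
}
```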

Efficient External Memory Algorithm for Finding the Maximum Suffix of a String (스트링의 최대 서픽스를 계산하는 효율적인 외부 메모리 알고리즘)

  • Kim, Sung-Kwon;Kim, Soo-Cheol;Cho, Jung-Sik
    • The KIPS Transactions:PartA / v.15A no.4 / pp.239-242 / 2008
  • We study the problem of finding the maximum suffix of a string on the external memory model of computation with one disk. In this model, we are primarily interested in designing algorithms that reduce the number of I/Os between the disk and internal memory. A string of length N has N suffixes and, among these, the lexicographically largest one is called the maximum suffix of the string. Finding the maximum suffix of a string plays a crucial role in solving some string problems. In this paper, we present an external memory algorithm for computing the maximum suffix of a string of length N. The algorithm uses four blocks of internal memory and performs at most 4(N/L) disk I/Os, where L is the size of a block. (An in-memory sketch of the underlying scan follows this entry.)
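
The external-memory algorithm above targets the disk I/O bound; the underlying computation is a left-to-right scan that fits in a few lines for the in-memory case. The sketch below is the standard linear-time two-candidate scan for the maximum suffix; the paper's contribution is performing such a scan with at most 4(N/L) block I/Os, which this sketch does not attempt.

```c
/* Standard in-memory linear-time computation of the maximum (lexicographically
 * largest) suffix: maintain the best candidate start i, a challenger j, and a
 * matched length k. The external-memory version streams the string in blocks. */
#include <stdio.h>
#include <string.h>

static size_t max_suffix_start(const char *s, size_t n) {
    size_t i = 0, j = 1, k = 0;
    while (j + k < n) {
        if (s[i + k] == s[j + k]) {
            k++;                                   /* extend the current match */
        } else if (s[i + k] > s[j + k]) {
            j = j + k + 1;                         /* challenger loses; skip past mismatch */
            k = 0;
        } else {
            i = (i + k + 1 > j) ? i + k + 1 : j;   /* challenger wins; advance candidate */
            j = i + 1;
            k = 0;
        }
    }
    return i;                                      /* start index of the maximum suffix */
}

int main(void) {
    const char *s = "banana";                      /* illustrative input */
    size_t start = max_suffix_start(s, strlen(s));
    printf("maximum suffix of \"%s\" starts at %zu: \"%s\"\n", s, start, s + start);
    return 0;
}
```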