• Title/Summary/Keyword: High-Throughput Computing


A Pipelined Parallel Optimized Design for Convolution-based Non-Cascaded Architecture of JPEG2000 DWT (JPEG2000 이산웨이블릿변환의 컨볼루션기반 non-cascaded 아키텍처를 위한 pipelined parallel 최적화 설계)

  • Lee, Seung-Kwon;Kong, Jin-Hyeung
    • Journal of the Institute of Electronics Engineers of Korea SD / v.46 no.7 / pp.29-38 / 2009
  • In this paper, a high-performance pipelined computing design of parallel multiplier, temporal buffer, and parallel accumulator is presented for a convolution-based non-cascaded architecture aimed at real-time Discrete Wavelet Transform (DWT) processing. The number of convolution multiplications in the DWT is reduced to about 1/4 by exploiting the symmetry of the filter coefficients and the up/down sampling, and computation is accelerated 3-5 times by LUT-based distributed-arithmetic (DA) multiplication, in which multiple filter coefficients are multiplied with an image datum in parallel to produce product terms. Further, computed product terms are reused by storing them in the temporal buffer, saving both computation and dynamic power by 50%. The convolved product terms of image data and filter coefficients are realigned and stored in the temporal buffer for accumulation, and the buffer is managed as parallel aligned storage for high-speed sequential retrieval by the parallel accumulators. The computation is pipelined as parallel multiplier, temporal buffer, and parallel accumulator, where the degree of parallelism of the temporal buffer and accumulator is optimized against the throughput of the parallel DA multiplier to improve pipelining performance. The proposed architecture is back-end designed with a 0.18 µm library and verified to sustain a 30 fps throughput for SVGA (800×600) images at 90 MHz.
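A rough software illustration of the first saving described above, under assumptions of my own (the function and filter names are placeholders; the LUT-based DA multiplier and the temporal buffer are hardware structures not reproduced here): only every second output of the analysis convolution is computed, and samples that share a symmetric coefficient are added before the multiply, which together cut the multiplications to roughly 1/4 of a naive full-rate convolution.

```python
# Illustrative sketch (not the paper's RTL design): a 1-D analysis filtering
# step that folds the two savings described in the abstract into software form.
# Names (h_lowpass, dwt_analysis_symmetric) are placeholders for illustration.

def dwt_analysis_symmetric(x, h):
    """Convolve-and-decimate with a symmetric filter h (h[k] == h[-1-k]).

    Multiplications per kept output drop from len(h) to about len(h)/2 by
    pairing samples that share a coefficient, and only every second output
    is computed (down-sampling by 2), giving the ~1/4 reduction versus a
    naive full-rate convolution.
    """
    L = len(h)
    assert all(abs(h[k] - h[L - 1 - k]) < 1e-12 for k in range(L)), "filter must be symmetric"
    half = (L + 1) // 2
    y = []
    for n in range(0, len(x) - L + 1, 2):        # decimation: skip odd outputs
        acc = 0.0
        for k in range(half):                    # one multiply per coefficient pair
            mirror = L - 1 - k
            if mirror == k:                      # middle tap of an odd-length filter
                acc += h[k] * x[n + k]
            else:
                acc += h[k] * (x[n + k] + x[n + mirror])
        y.append(acc)
    return y

# Example: a 9-tap symmetric low-pass filter applied to a short signal.
h_lowpass = [0.02, -0.05, -0.1, 0.3, 0.7, 0.3, -0.1, -0.05, 0.02]
print(dwt_analysis_symmetric(list(range(32)), h_lowpass))
```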

Performance Evaluation and Analysis on Single and Multi-Network Virtualization Systems with Virtio and SR-IOV (가상화 시스템에서 Virtio와 SR-IOV 적용에 대한 단일 및 다중 네트워크 성능 평가 및 분석)

  • Jaehak Lee;Jongbeom Lim;Heonchang Yu
    • The Transactions of the Korea Information Processing Society / v.13 no.2 / pp.48-59 / 2024
  • As hardware-level virtualization support functions have matured, user applications with diverse workloads now run efficiently on virtualization systems. SR-IOV is one such function: it gives instances direct access to PCI devices, delivering high I/O performance by minimizing hypervisor and operating system intervention. With SR-IOV, network I/O acceleration can be realized in virtualization systems, which otherwise have relatively long I/O paths compared to bare-metal systems and frequent context switches between user space and kernel space. To exploit these advantages, network resource management policies that derive optimal network performance when SR-IOV is applied to an instance such as a virtual machine (VM) or container are being actively studied. This paper evaluates and analyzes the network performance of SR-IOV-based I/O acceleration against Virtio in terms of 1) network delay, 2) network throughput, 3) network fairness, 4) performance interference, and 5) multi-network configurations. The contributions of this paper are as follows. First, the network I/O paths of Virtio and SR-IOV in a virtualization system are clearly explained. Second, the network performance of Virtio and SR-IOV is analyzed across the above metrics. Third, the system overhead of the SR-IOV network under high VM density and its potential for optimization are experimentally confirmed. The experimental results and analysis are expected to inform network resource management policies for virtualization systems that run network-intensive services such as smart factories, connected cars, deep learning inference models, and crowdsourcing.
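As an illustration of the first two metrics listed above, the following is a minimal sketch under assumptions of my own (loopback endpoint, placeholder host, port, and payload sizes; this is not the paper's testbed or tooling) of how round-trip delay and bulk throughput can be probed over a TCP socket. In the paper's setting, the client side would run inside a VM whose NIC is backed by either a Virtio device or an SR-IOV virtual function.

```python
# Illustrative sketch only: probes for (1) network delay and (2) network
# throughput; everything runs over loopback purely to show the measurement
# logic. Host, port, and payload sizes are placeholders.

import socket, threading, time

HOST, PORT = "127.0.0.1", 50007
PING = b"p"                           # 1-byte message for round-trip probing
BULK = b"y" * (8 * 1024 * 1024)       # 8 MiB payload for the throughput run

def server():
    with socket.socket() as srv:
        srv.bind((HOST, PORT))
        srv.listen(2)
        for _ in range(2):            # one latency connection, one throughput connection
            conn, _ = srv.accept()
            with conn:
                mode = conn.recv(1)
                if mode == b"L":      # latency: echo each ping until the client closes
                    while (data := conn.recv(1)):
                        conn.sendall(data)
                elif mode == b"T":    # throughput: drain to EOF, then acknowledge once
                    while conn.recv(65536):
                        pass
                    conn.sendall(b"k")

threading.Thread(target=server, daemon=True).start()
time.sleep(0.2)                       # give the server thread time to start

# (1) network delay: average round-trip time of small messages
with socket.socket() as cli:
    cli.connect((HOST, PORT))
    cli.sendall(b"L")
    t0 = time.perf_counter()
    for _ in range(200):
        cli.sendall(PING)
        cli.recv(1)
    rtt_ms = (time.perf_counter() - t0) / 200 * 1e3

# (2) network throughput: time one bulk transfer end to end
with socket.socket() as cli:
    cli.connect((HOST, PORT))
    cli.sendall(b"T")
    t0 = time.perf_counter()
    cli.sendall(BULK)
    cli.shutdown(socket.SHUT_WR)      # signal EOF so the server can acknowledge
    cli.recv(1)
    mbits = len(BULK) * 8 / (time.perf_counter() - t0) / 1e6

print(f"avg RTT {rtt_ms:.3f} ms, throughput {mbits:.1f} Mbit/s")
```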

Optimistic Concurrency Control based on 2-Version and TimeStamp for Broadcast Environment : OCC/2VTS (방송환경에서 이중 버전과 타임스탬프에 기반을 둔 낙관적 동시성 제어 기법)

  • Lee, Uk-Hyun;Hwang, Bu-Hyun
    • The KIPS Transactions:PartD / v.8D no.2 / pp.132-144 / 2001
  • The broadcast environment is characterized by asymmetric communication: the communication capacity available from the server to clients is typically much greater than in the opposite direction. In addition, most mobile computing systems only allow mobile clients to issue read-only transactions for retrieving information such as stock data, traffic information, and news updates. Because previous concurrency control protocols do not consider these characteristics, their performance degrades when they are applied to a broadcast environment with high data contention. In this paper, we propose OCC/2VTS (Optimistic Concurrency Control based on 2-Version and TimeStamp), which is well suited to the broadcast environment. OCC/2VTS lets each client process and commit query transactions on its own by keeping two versions of each data item in its cache. If the values of the accessed data items are not changed twice by invalidation reports after a query transaction starts, the query transaction commits safely, independently of the commitment of update transactions. OCC/2VTS thus reduces the number of times clients must contact the server to commit, and because the broadcast validation reports include the latest updated values, it also reduces requests to the server for recent data. As a result, OCC/2VTS makes full use of the asymmetric bandwidth and improves transaction throughput by raising the query transaction commit ratio as much as possible.
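The following is a minimal sketch of how I read the client-side rule described in the abstract; the class names and report format are my own placeholders, not the authors' code. Each cached item keeps its two most recent versions, and a read-only query transaction started at broadcast time ts_start can commit locally as long as every item it read still has a version that was current at ts_start (i.e., the item has not been updated twice since the query began).

```python
# Illustrative sketch (my reading of the scheme, not the authors' code): each
# cached item keeps its two most recent versions plus the broadcast timestamps
# at which they became current.

class TwoVersionCache:
    def __init__(self):
        self.items = {}   # key -> list of (timestamp, value), newest last, length <= 2

    def apply_invalidation_report(self, report, timestamp):
        """Install broadcast updates, keeping only the two newest versions."""
        for key, value in report.items():
            versions = self.items.setdefault(key, [])
            versions.append((timestamp, value))
            if len(versions) > 2:
                versions.pop(0)           # the oldest version is lost here

    def read_as_of(self, key, ts_start):
        """Return the value current at ts_start, or None if it was overwritten twice."""
        candidates = [(ts, v) for ts, v in self.items.get(key, []) if ts <= ts_start]
        return candidates[-1][1] if candidates else None

class QueryTransaction:
    def __init__(self, cache, ts_start):
        self.cache, self.ts_start = cache, ts_start
        self.aborted = False

    def read(self, key):
        value = self.cache.read_as_of(key, self.ts_start)
        if value is None:                 # snapshot version no longer available
            self.aborted = True
        return value

    def commit(self):
        return not self.aborted           # commits locally, no uplink to the server

# Example: a query started at broadcast time 10 survives one later update of
# 'stock_A' (it still sees the version current at time 10), but a query that
# started before two updates must abort.
cache = TwoVersionCache()
cache.apply_invalidation_report({"stock_A": 100}, timestamp=10)
txn = QueryTransaction(cache, ts_start=10)
cache.apply_invalidation_report({"stock_A": 105}, timestamp=20)   # first update: ok
print(txn.read("stock_A"), txn.commit())                          # -> 100 True
cache.apply_invalidation_report({"stock_A": 110}, timestamp=30)   # second update
txn2 = QueryTransaction(cache, ts_start=10)
print(txn2.read("stock_A"), txn2.commit())                        # -> None False
```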


Neighbor Caching for P2P Applications in Multi-hop Wireless Ad Hoc Networks (멀티 홉 무선 애드혹 네트워크에서 P2P 응용을 위한 이웃 캐싱)

  • 조준호;오승택;김재명;이형호;이준원
    • Journal of KIISE:Information Networking / v.30 no.5 / pp.631-640 / 2003
  • Because of multi-hop wireless communication, P2P applications in ad hoc networks suffer poor performance. We propose a neighbor caching strategy to overcome this shortcoming and show that it is more efficient than self caching, in which each node stores data only in its own cache. With neighbor caching, a node can instantly extend its caching storage by borrowing storage from idle neighbors, avoiding multi-hop wireless communication with a data source far away from itself. We also present a ranking-based prediction that selects the most appropriate neighbor in which to store data. A node using the ranking-based prediction can choose a neighbor that is likely to keep the data for a long time and can avoid caching low-ranked data; the ranking-based prediction therefore improves the throughput of neighbor caching. In the simulation results, we observe that neighbor caching performs better as the network becomes larger, the idle time becomes longer, and the cache becomes smaller. We also show that the ranking-based prediction is an adaptive algorithm that adjusts how often data is moved to neighbors, making neighbor caching flexible with respect to the idleness of the nodes.
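The following is a minimal sketch of the neighbor-selection idea as I read it; the ranking formula, threshold, and names are placeholders rather than the paper's heuristic. When the local cache is full, an item worth keeping is offloaded to the neighbor predicted to remain idle the longest, while low-ranked items are simply dropped instead of consuming a neighbor's storage.

```python
# Illustrative sketch only (names and the ranking formula are placeholders, not
# the paper's exact heuristic): a node offloads items to the neighbor predicted
# to stay idle the longest, and skips items whose access rank is too low.

from dataclasses import dataclass, field

@dataclass
class Neighbor:
    node_id: str
    expected_idle_time: float      # predicted seconds the neighbor stays idle
    free_slots: int                # spare cache capacity it is willing to lend

@dataclass
class NeighborCache:
    capacity: int
    neighbors: list
    local: dict = field(default_factory=dict)     # key -> value
    remote: dict = field(default_factory=dict)    # key -> node_id holding it

    def rank_neighbor(self, n):
        # Simple ranking: prefer neighbors likely to stay idle and with room.
        return n.expected_idle_time * min(n.free_slots, 1)

    def put(self, key, value, access_rank, min_rank=1.0):
        if len(self.local) < self.capacity:
            self.local[key] = value
            return "local"
        if access_rank < min_rank:
            return "dropped"                      # not worth a neighbor's storage
        best = max(self.neighbors, key=self.rank_neighbor, default=None)
        if best is None or best.free_slots == 0:
            return "dropped"
        best.free_slots -= 1
        self.remote[key] = best.node_id           # value handed to the neighbor
        return f"offloaded to {best.node_id}"

    def where(self, key):
        if key in self.local:
            return "local"
        return self.remote.get(key, "miss -> fetch from distant source")

# Example: a node with a one-slot cache and two idle neighbors.
cache = NeighborCache(capacity=1, neighbors=[
    Neighbor("n1", expected_idle_time=30.0, free_slots=2),
    Neighbor("n2", expected_idle_time=5.0, free_slots=4),
])
print(cache.put("a", b"...", access_rank=3.0))    # fits locally
print(cache.put("b", b"...", access_rank=3.0))    # offloaded to n1 (longest idle)
print(cache.put("c", b"...", access_rank=0.2))    # low-ranked data is not offloaded
print(cache.where("b"))                           # n1
```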