• Title/Summary/Keyword: data Parallel


Performance Improvement of Prediction-Based Parallel Gate-Level Timing Simulation Using Prediction Accuracy Enhancement Strategy (예측정확도 향상 전략을 통한 예측기반 병렬 게이트수준 타이밍 시뮬레이션의 성능 개선)

  • Yang, Seiyang
    • KIPS Transactions on Computer and Communication Systems
    • /
    • v.5 no.12
    • /
    • pp.439-446
    • /
    • 2016
  • This paper proposes an efficient prediction accuracy enhancement strategy to improve the performance of prediction-based parallel event-driven gate-level timing simulation. The strategy applies static double prediction and dynamic prediction to the input and output values of local simulations. Double prediction falls back to a second set of static prediction data once the first prediction fails, while dynamic prediction uses the simulation results accumulated during the actual parallel simulation run as prediction data. As a result, communication overhead and synchronization overhead, the main bottlenecks of parallel simulation, are greatly reduced. With the two proposed techniques, we observed about a 5x simulation performance improvement over commercial parallel multi-core simulation on six test designs.

Reliability Estimation in Bivariate Pareto Model with Bivariate Type I Censored Data

  • Cho, Jang-Sik;Cho, Kil-Ho;Kang, Sang-Gil
    • Proceedings of the Korean Data and Information Science Society Conference (한국데이터정보과학회:학술대회논문집)
    • /
    • 2003.10a
    • /
    • pp.31-38
    • /
    • 2003
  • In this paper, we obtain estimators of system reliability for the bivariate Pareto model with bivariate Type I censored data. We obtain estimators and approximate confidence intervals of the reliability of the parallel system based on the likelihood function and on the relative frequency, respectively. We also present a numerical example using a computer-generated data set.


Design of a 155.52 Mbps CMOS data transmitter (155.52 Mbps CMOS 데이타 트랜스미터의 설계)

  • 채상훈;김길동;송원철
    • Journal of the Korean Institute of Telematics and Electronics B
    • /
    • v.33B no.3
    • /
    • pp.62-68
    • /
    • 1996
  • A CMOS transmitter ASIC for ATM switching systems and similar applications was designed to transmit 155.52 Mbps serial data converted from 19.44 Mbps parallel data. An analog PLL generates the 155.52 MHz clock for data synchronization from a 19.44 MHz reference clock, while a digital circuit performs the parallel-to-serial data conversion. Circuit simulations confirm that PLL locking and data conversion are accomplished successfully. The area of the designed ASIC chip is $1.3 \times 1.0\,mm^2$. The locking time and the power consumption of the chip are about 600 nsec and less than 150 mW, respectively.
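
The two rates in this abstract differ by exactly a factor of 8 (19.44 x 8 = 155.52), which is consistent with a byte-wide parallel input. The Python sketch below models that parallel-to-serial step in software. The 8-bit width is inferred from the rate ratio, and the bit order and the name `serialize_bytes` are assumptions for illustration; the abstract does not describe the actual circuit.

```python
def serialize_bytes(words, msb_first=True):
    """Flatten 8-bit parallel words into a list of serial bits.

    Assumption: each parallel word is 8 bits wide, inferred from the
    155.52 / 19.44 rate ratio. The bit order is also an assumption.
    """
    bits = []
    for word in words:
        if not 0 <= word <= 0xFF:
            raise ValueError("each parallel word must fit in 8 bits")
        order = range(7, -1, -1) if msb_first else range(8)
        bits.extend((word >> i) & 1 for i in order)
    return bits


PARALLEL_RATE_MHZ = 19.44  # word clock taken from the abstract
WORD_WIDTH_BITS = 8        # assumed width, see the note above
SERIAL_RATE_MHZ = PARALLEL_RATE_MHZ * WORD_WIDTH_BITS  # about 155.52
```

Each input word produces eight output bits, so the serial side must be clocked eight times faster than the parallel side, which is the role of the PLL-generated clock in the abstract.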


Design and Implementation of parallel Media server in current system environment (기존 시스템 환경에서의 병렬 미디어 서버의 설계 및 구현)

  • 김경훈;류재상;김서균;남지승
    • Proceedings of the IEEK Conference
    • /
    • 2000.06c
    • /
    • pp.97-100
    • /
    • 2000
  • As network resources have become faster and demand for multimedia services over the network has grown, demand for media server systems has increased. Such media servers address the bottleneck of their internal storage devices by using a parallel system that takes advantage of fast network resources. Many vendors have proposed their own media server systems to solve these problems fundamentally, but most require major modification of infrastructure components and introduce additional drawbacks. For example, a storage mechanism for a specific medium may require a new file system that is entirely different from the traditional one, and an algorithm for enhancing performance may not suit a traditional operating system environment. In this paper, we design a parallel media server based on the web interface of a traditional system and implement a media server program. The implemented server performs parallel processing through the web interface without any modification of the traditional system. The control load of merging the distributed data is borne only by the client and the control server, so the load on the storage servers is minimized. In addition, the data transfer protocol for streaming media includes a retransfer algorithm and a client admission control policy, both of which affect the performance of the whole system.


High Throughput Parallel KMP Algorithm Considering CPU-GPU Memory Hierarchy (CPU-GPU 메모리 계층을 고려한 고처리율 병렬 KMP 알고리즘)

  • Park, Soeun;Kim, Daehee;Lee, Myungho;Park, Neungsoo
    • The Transactions of The Korean Institute of Electrical Engineers
    • /
    • v.67 no.5
    • /
    • pp.656-662
    • /
    • 2018
  • Pattern matching algorithms are widely used in application fields such as bioinformatics and intrusion detection. Among the many string matching algorithms, the KMP (Knuth-Morris-Pratt) algorithm is commonly used because of its fast execution time on large texts. However, the processing speed of the KMP algorithm is still limited when the text size increases significantly. In this paper, we propose a high-throughput parallel KMP algorithm that considers the CPU-GPU memory hierarchy, based on OpenCL for GPGPU (General-Purpose computing on Graphics Processing Units). We focus on optimizing the allocation of work-items and work-groups, copying the pattern data and the failure table to local memory, and overlapping data transfer with the string matching operations. The experimental results show that the optimized parallel KMP algorithm runs about 3.6 times faster than the non-optimized parallel KMP algorithm.
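
The following is a minimal CPU-only Python sketch of the general idea behind chunked KMP matching, not the authors' OpenCL implementation. It builds the failure table once, splits the text into chunks, and extends each chunk by `len(pattern) - 1` characters so that matches straddling a chunk boundary are not lost. The function names and the overlap scheme are assumptions for illustration. In a GPU version each chunk would presumably map to a work-item, whereas here the chunks are processed in a plain loop.

```python
def build_failure_table(pattern):
    """fail[i] = length of the longest proper prefix of pattern[:i+1]
    that is also a suffix of it."""
    fail = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail


def kmp_search(text, pattern, fail, offset=0):
    """Return start positions of all matches in text, shifted by offset."""
    matches = []
    k = 0
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = fail[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            matches.append(offset + i - len(pattern) + 1)
            k = fail[k - 1]
    return matches


def chunked_kmp(text, pattern, num_chunks):
    """Search independent, overlapping chunks (hypothetical scheme)."""
    if not pattern or num_chunks < 1:
        raise ValueError("need a non-empty pattern and at least one chunk")
    fail = build_failure_table(pattern)  # shared, read-only table
    chunk = max(1, -(-len(text) // num_chunks))  # ceiling division
    matches = []
    for start in range(0, len(text), chunk):
        # Overlap by len(pattern) - 1 so boundary matches are found once.
        end = min(len(text), start + chunk + len(pattern) - 1)
        matches.extend(kmp_search(text[start:end], pattern, fail, start))
    return matches
```

Because every chunk only reads the pattern and the failure table, these two small arrays are natural candidates for fast shared storage, which is consistent with the local memory copy the abstract mentions.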

Parallel Model Feature Extraction to Improve Performance of a BCI System (BCI 시스템의 성능 개선을 위한 병렬 모델 특징 추출)

  • Chum, Pharino;Park, Seung-Min;Sim, Kwee-Bo
    • Journal of Institute of Control, Robotics and Systems
    • /
    • v.19 no.11
    • /
    • pp.1022-1028
    • /
    • 2013
  • It is well known that, based on the CSP (Common Spatial Pattern) algorithm, an EEG (Electroencephalography) signal can be linearly projected onto spaces that optimize the discrimination between two patterns. Sharing the disadvantages of linear time-invariant systems, CSP suffers from the non-stationary nature of EEGs, which causes classification performance in a BCI (Brain-Computer Interface) system to drop significantly between the training data and the test data. The authors suggest a simple idea based on a parallel model of CSP filters to improve the performance of BCI systems. The model was tested with a simple CSP algorithm (without any elaborate regularizing methods) and a perceptron learning algorithm as the classifier to determine the improvement of the system. The simulation showed that the parallel model could improve classification performance by over 10% compared to conventional CSP methods.

Prediction of Sunspot Number Time Series using the Parallel-Structure Fuzzy Systems (병렬구조 퍼지시스템을 이용한 태양흑점 시계열 데이터의 예측)

  • Kim Min-Soo;Chung Chan-Soo
    • The Transactions of the Korean Institute of Electrical Engineers D
    • /
    • v.54 no.6
    • /
    • pp.390-395
    • /
    • 2005
  • Sunspots are dark areas that grow and decay on the lowest level of the sun that is visible from the Earth. Short-term predictions of solar activity are essential to help plan missions and to design satellites that will survive for their useful lifetimes. This paper presents a parallel-structure fuzzy system (PSFS) for predicting the sunspot number time series. The PSFS consists of multiple component fuzzy systems connected in parallel. Each component fuzzy system in the PSFS predicts future data independently from past time series data, using its own embedding dimension and time delay. The embedding dimension determines the number of inputs of each component fuzzy system, and the time delay determines the interval between those inputs in the time series. Depending on the embedding dimension and the time delay, each component fuzzy system therefore takes different input-output pairs. The PSFS computes the final predicted value as the average of the outputs of all component fuzzy systems in order to reduce the effect of error accumulation.
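
The parallel structure described above can be illustrated with a short Python sketch. It shows how an embedding dimension and a time delay turn a series into input-output pairs, and how the outputs of several components are averaged. The component predictor here is a nearest-neighbour lookup used purely as a stand-in, since the abstract does not specify the fuzzy systems themselves. One-step-ahead prediction and all function names are likewise assumptions.

```python
def embed(series, dim, delay):
    """Build (inputs, target) pairs: dim past samples spaced delay apart,
    paired with the next value of the series (one-step-ahead, assumed)."""
    span = (dim - 1) * delay
    pairs = []
    for t in range(span, len(series) - 1):
        inputs = [series[t - j * delay] for j in range(dim - 1, -1, -1)]
        pairs.append((inputs, series[t + 1]))
    return pairs


def latest_inputs(series, dim, delay):
    """Input vector ending at the most recent sample."""
    t = len(series) - 1
    return [series[t - j * delay] for j in range(dim - 1, -1, -1)]


def component_predict(series, dim, delay):
    """Stand-in for one component fuzzy system: return the target of the
    training pair whose inputs are closest to the latest inputs."""
    pairs = embed(series, dim, delay)
    query = latest_inputs(series, dim, delay)

    def squared_distance(pair):
        return sum((a - b) ** 2 for a, b in zip(pair[0], query))

    return min(pairs, key=squared_distance)[1]


def parallel_predict(series, configs):
    """Average the independent predictions of all components.
    configs is a list of (embedding dimension, time delay) tuples."""
    outputs = [component_predict(series, dim, delay) for dim, delay in configs]
    return sum(outputs) / len(outputs)
```

For example, `embed([1, 2, 3, 4, 5], 2, 2)` pairs the inputs `[1, 3]` with the target `4` and `[2, 4]` with `5`. Components with different `(dim, delay)` settings see different views of the same series, which is why averaging them can smooth out individual errors.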

VDI Performance Optimization with Hybrid Parallel Processing in Thick Client System under Heterogeneous Multi-Core Environment (Heterogeneous 멀티 코어 환경의 Thick Client에서 VDI 성능 최적화를 위한 혼합 병렬 처리 기법 연구)

  • Kim, Myeong-Seob;Huh, Eui-Nam
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.38B no.3
    • /
    • pp.163-171
    • /
    • 2013
  • Recently, demand for processing High Definition (HD) video and 3D applications on low-end mobile devices has grown, and the volume of content data has increased as well. This is becoming a major issue in cloud computing, where a Virtual Desktop Infrastructure (VDI) service needs efficient data processing to provide Quality of Experience (QoE). In this paper, we propose three kinds of Thick-Thin VDI service that can share and delegate VDI service based on a thick client using the CPU and GPU. Furthermore, we propose and discuss a VDI service optimization method for a mixed CPU and GPU heterogeneous environment, using OpenMP for CPU parallel processing and CUDA for GPU parallel processing.

Design of the new parallel processing architecture for commercial applications (상용 응용을 위한 병렬처리 구조 설계)

  • 한우종;윤석한;임기욱
    • Journal of the Korean Institute of Telematics and Electronics B
    • /
    • v.33B no.5
    • /
    • pp.41-51
    • /
    • 1996
  • This paper proposes a new parallel processing system based on a cluster architecture that provides the scalability of a parallel processing system while maintaining the characteristics of a shared memory multiprocessor. Recently, low-cost, high-performance microprocessors have led to the construction of large-scale parallel processing systems. Such systems provide high scalability but are mainly used for scientific applications with large data parallelism. A shared memory multiprocessor system such as TICOM is currently used as a server for commercial applications; however, shared memory multiprocessor systems are known to have very limited scalability. The proposed architecture supports the scalability and performance of a parallel processing system while remaining adaptable to commercial applications, and can therefore overcome the limitations of the shared memory multiprocessor. The architecture and characteristics of the proposed system are described. A proprietary hierarchical crossbar network is designed for this system, with its protocol, routing and switching technique, and signal transfer technique optimized for the proposed architecture. The design trade-offs for the network are described, and simulation using SES/workbench shows that the network fits the proposed architecture.


A Parallel Loop Scheduling Algorithm on Multiprocessor System Environments (다중프로세서 시스템 환경에서 병렬 루프 스케쥴링 알고리즘)

  • 이영규;박두순
    • Journal of Korea Multimedia Society
    • /
    • v.3 no.3
    • /
    • pp.309-319
    • /
    • 2000
  • The purpose of parallel scheduling in a multiprocessor environment is to schedule with minimum synchronization overhead and to balance the load of a parallel application program. Each processor computes a chunk of iterations and is allocated that chunk to execute in parallel. In doing so, the processors frequently access mutually exclusive global memory, which imposes considerable scheduling overhead and creates a bottleneck. In addition, when the distribution of parallel iterations differs among the chunks allocated to the processors, the differing execution times of the chunks cause load imbalance and degrade overall scheduling performance. In this paper, we investigate the problems of conventional algorithms with respect to achieving minimum scheduling overhead and load balance. We then present a new parallel loop scheduling algorithm that considers data locality and processor affinity.
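
To make the chunk trade-off concrete, here is a small Python sketch of guided self-scheduling, one well-known conventional chunking scheme in which each request takes a fraction `1/P` of the remaining iterations. It is shown only as an example of the overhead versus balance tension the abstract describes. The abstract does not say which conventional algorithms were examined, and the proposed algorithm itself is not reproduced here. The function name is hypothetical.

```python
import math


def guided_chunks(total_iterations, num_processors):
    """Chunk sizes handed out by guided self-scheduling:
    each request takes ceil(remaining / P) iterations."""
    if num_processors < 1:
        raise ValueError("need at least one processor")
    remaining = total_iterations
    chunks = []
    while remaining > 0:
        size = math.ceil(remaining / num_processors)
        chunks.append(size)
        remaining -= size
    return chunks
```

Each chunk corresponds to one access to the shared loop index, so fewer and larger chunks mean less synchronization, while the shrinking chunk sizes toward the end help even out the finishing times of the processors.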
