• Title/Summary/Keyword: Parallel processor

A Linear Clustering Method for the Scheduling of the Directed Acyclic Graph Model with Multiprocessors Using Genetic Algorithm (다중프로세서를 갖는 유방향무환그래프 모델의 스케쥴링을 위한 유전알고리즘을 이용한 선형 클러스터링 해법)

  • Sung, Ki-Seok; Park, Jee-Hyuk
    • Journal of Korean Institute of Industrial Engineers / v.24 no.4 / pp.591-600 / 1998
  • Scheduling a parallel computing system consists of two procedures: assigning tasks to the available processors and ordering the tasks within each processor. The assignment procedure is equivalent to clustering. A clustering is classified as linear or nonlinear according to the precedence relationships among the tasks in each cluster. A parallel computing system can be modeled as a Directed Acyclic Graph (DAG). By granularity theory, a DAG is categorized as Coarse Grain Type (CDAG) or Fine Grain Type (FDAG). We propose a linear clustering method for scheduling a CDAG using a genetic algorithm. The method exploits the property that the optimal schedule of a CDAG is a linear clustering. We present computational comparisons between the proposed method for CDAGs and an existing method for general DAGs, including both CDAG and FDAG.

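
To make the linear/nonlinear distinction in the abstract above concrete, here is a minimal Python sketch. It assumes the common definition that a clustering is linear when every pair of tasks in the same cluster is ordered by precedence, so that no two independent tasks share a cluster. The example graph, the task names, and the helper names `descendants` and `is_linear_clustering` are illustrative assumptions, not taken from the paper, and the genetic algorithm itself is not shown.

```python
from itertools import combinations


def descendants(dag):
    """Return {task: set of all tasks reachable from it} for a DAG
    given as {task: [direct successors]}."""
    memo = {}

    def visit(node):
        if node not in memo:
            reach = set()
            for nxt in dag.get(node, []):
                reach.add(nxt)
                reach |= visit(nxt)
            memo[node] = reach
        return memo[node]

    for node in dag:
        visit(node)
    return memo


def is_linear_clustering(dag, clusters):
    """True if every pair of tasks inside each cluster lies on a common
    precedence path (assumed definition of a linear clustering)."""
    reach = descendants(dag)
    for cluster in clusters:
        for a, b in combinations(cluster, 2):
            a_before_b = b in reach.get(a, set())
            b_before_a = a in reach.get(b, set())
            if not (a_before_b or b_before_a):
                return False  # two independent tasks share a cluster
    return True


# Hypothetical diamond-shaped task graph: t1 -> {t2, t3} -> t4
dag = {"t1": ["t2", "t3"], "t2": ["t4"], "t3": ["t4"], "t4": []}
linear = [["t1", "t2", "t4"], ["t3"]]      # each cluster is a chain
nonlinear = [["t1", "t4"], ["t2", "t3"]]   # t2 and t3 are independent

print(is_linear_clustering(dag, linear))
print(is_linear_clustering(dag, nonlinear))
```

A scheduler restricted to linear clusterings only has to search over chains of the DAG, which is the kind of search-space reduction the abstract attributes to the CDAG property.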

A New Architecture of Call Processor Based On Data flow System (데이타 흐름 시스템을 이용한 호처리 프로세서의 구조)

  • Lim, In-Taek; Lee, Sung-Gyu; Han, Young-Chul
    • Proceedings of the KIEE Conference / 1987.07b / pp.965-968 / 1987
  • Conventional major electronic switching systems based on stored program control employ a von Neumann-style control processor. Such a processor has strict limitations: it essentially lacks concurrency in executing instructions, which has brought about the software bottleneck problem, and its call processing capability is limited when the system's capacity is expanded. In this paper, a new call control processor architecture based on the data flow system is proposed, aiming at a fundamental resolution of these limitations. The processor has a number of advantages, such as expandability of the system's capacity and parallel processing of calls.

High-throughput Low-complexity Mixed-radix FFT Processor using a Dual-path Shared Complex Constant Multiplier

  • Nguyen, Tram Thi Bao; Lee, Hanho
    • JSTS:Journal of Semiconductor Technology and Science / v.17 no.1 / pp.101-109 / 2017
  • This paper presents a high-throughput low-complexity 512-point eight-parallel mixed-radix multipath delay feedback (MDF) fast Fourier transform (FFT) processor architecture for orthogonal frequency division multiplexing (OFDM) applications. To decrease the number of twiddle factor (TF) multiplications, a mixed-radix $2^4/2^3$ FFT algorithm is adopted. Moreover, a dual-path shared canonical signed digit (CSD) complex constant multiplier using a multi-layer scheme is proposed for reducing the hardware complexity of the TF multiplication. The proposed FFT processor is implemented using TSMC 90-nm CMOS technology. The synthesis results demonstrate that the proposed FFT processor can lead to a 16% reduction in hardware complexity and higher throughput compared to conventional architectures.
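
The canonical signed digit (CSD) representation mentioned in the abstract above can be illustrated in software. The sketch below recodes a non-negative integer constant into digits from {-1, 0, 1} with no two adjacent nonzero digits, then multiplies by it using only shifts, additions, and subtractions. It shows the CSD idea only; the paper's dual-path shared multiplier and multi-layer scheme are not modeled, and the function names `to_csd` and `csd_multiply` are assumptions made for this example.

```python
def to_csd(n):
    """Return the CSD digits of a non-negative integer n,
    least significant digit first, each digit in {-1, 0, 1}."""
    digits = []
    while n != 0:
        if n & 1:
            # n % 4 == 1 -> digit +1, n % 4 == 3 -> digit -1,
            # which forces the next digit to be 0.
            d = 2 - (n & 3)
            n -= d
        else:
            d = 0
        digits.append(d)
        n >>= 1
    return digits


def csd_multiply(x, digits):
    """Multiply x by the constant encoded in CSD digits using
    shift-and-add/subtract only, as a constant multiplier would."""
    acc = 0
    for shift, d in enumerate(digits):
        if d == 1:
            acc += x << shift
        elif d == -1:
            acc -= x << shift
    return acc


# 7 = 8 - 1, so CSD needs two nonzero digits where binary 111 needs three.
print(to_csd(7))
print(csd_multiply(5, to_csd(7)))
```

Each nonzero digit corresponds to one adder or subtractor in a hardwired constant multiplier, which is why fewer nonzero digits can reduce hardware cost.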

Multithread video coding processor for the videophone (동영상 전화기용 다중 스레드 비디오 코딩 프로세서)

  • 김정민; 홍석균; 이일완; 채수익
    • Journal of the Korean Institute of Telematics and Electronics A / v.33A no.5 / pp.155-164 / 1996
  • The architecture of a programmable video codec IC that employs multiple vector processors on a single chip is described. The vector processors operate in parallel and communicate with one another through on-chip shared memories. A single scalar control processor schedules each vector processor independently to achieve real-time video coding with special vector instructions. With programmable interconnection buses, the proposed architecture performs multiprocessing of tasks and data in video coding. Therefore, it provides good parallelism as well as good programmability. In particular, it supports multithread video coding, which processes several independent image sequences simultaneously. We explain its scheduling, multithread video coding, and vector processor architectures. We implemented a prototype video codec in a 0.8 µm CMOS cell-based technology for a multi-standard videophone. This codec can execute video encoding and decoding simultaneously for QCIF images at a frame rate of 30 Hz.

HPC(High Performance Computer) Linux Clustering for UltraSPARC(64bit-RISC processor) (UltraSPARC(64bit-RISC processor)을 위한 고성능 컴퓨터 리눅스 클러스터링)

  • 김기영; 조영록; 장종권
    • Proceedings of the IEEK Conference / 2003.11b / pp.45-48 / 2003
  • Network systems for high-performance microprocessors are now easy to buy, and progress in computer architecture has brought high bandwidth and low latency. Coupling PC-based commodity technology with distributed computing methodologies provides an important advance in the development of single-user dedicated systems. Recently, high-performance, low-cost PCs and workstations have been joined by networks, which has intensified the development of cluster systems that resemble supercomputers. Unix, Linux, BSD, and NT (the Windows series) can be used as the operating system (OS) of a cluster system. We chose Linux for its low cost, high performance, and open technical documentation. This paper benchmarks the performance of Beowulf clustering on the UltraSPARC-1K (64-bit RISC processor). The benchmark tools are MPI (Message Passing Interface) and NetPIPE. Beowulf is a class of experimental parallel workstations developed to evaluate and characterize the design space of this new operating point in price-performance.

Design and Implementation of Xcent-Net

  • Park, Kyoung; Hahn, Jong-Seok; Sim, Won-Sae; Hahn, Woo-Jong
    • Journal of Electrical Engineering and Information Science / v.2 no.6 / pp.74-81 / 1997
  • Xcent-Net is a new system network designed to support a clustered SMP called SPAX (Scalable Parallel Architecture based on Xbar) that is being developed by ETRI. It is a duplicated hierarchical crossbar network that provides the connections among 16 clusters of 128 nodes. Xcent-Net is designed as a packet-switched, virtual cut-through routed, point-to-point network. Variable-length packets contain up to 64 bytes of data. The packets are transmitted via full-duplex, 32-bit-wide channels using a source-synchronous transmission technique. Its plesiochronous clocking scheme eliminates the global clock distribution problem. A two-level priority-based round-robin scheme is adopted to resolve traffic congestion. A clear-to-send mechanism is used as the packet-level flow control scheme. Most functions are built into the Xcent router, which is implemented as an ASIC. This paper describes the architecture and the functional features of Xcent-Net and discusses its implementation.

Hardware Implementation of Genetic Algorithm Processor for EHW (EHW를 위한 Genetic Algorithm Processor 구현)

  • Kim, Jin-Jung; Kim, Yong-Hun; Choi, Yun-Ho; Chung, Duck-Jin
    • Proceedings of the KIEE Conference / 1999.07g / pp.2827-2829 / 1999
  • Genetic algorithms have been described as a method for solving large-scale optimization problems with complex constraints. Their major drawback, slowness, can be overcome by a hardware implementation, the genetic algorithm processor (GAP). In this study, we propose a GAP that effectively combines the advantages of survival-based GA, steady-state GA, and tournament selection. By making effective use of pipelined parallel processing and a handshaking protocol, the proposed GAP exhibits a 50% speed-up over a survival-based GA that runs one million crossovers per second (1 MHz). It can be used for high-speed processing, such as the central processor of EHW, robot control, and many optimization problems.

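
Two of the ingredients named in the abstract above, tournament selection and steady-state replacement, can be sketched in software as follows. This is a plain Python illustration under stated assumptions, not the paper's hardware design: the bit-string encoding, one-point crossover, the 10% mutation rate, the replace-the-worst policy, and all function names are choices made for this example.

```python
import random


def tournament(pop, fitness, rng, k=2):
    """Tournament selection: draw k individuals at random, keep the fittest."""
    return max(rng.sample(pop, k), key=fitness)


def one_point_crossover(a, b, rng):
    """Child takes the head of parent a and the tail of parent b."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def steady_state_ga(fitness, n_bits=16, pop_size=20, steps=500, seed=0):
    """Steady-state GA: each step creates one child and lets it replace
    the current worst individual if the child is at least as fit."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(steps):
        child = one_point_crossover(tournament(pop, fitness, rng),
                                    tournament(pop, fitness, rng), rng)
        if rng.random() < 0.1:                 # assumed mutation rate
            child[rng.randrange(n_bits)] ^= 1  # flip one random bit
        worst = min(range(pop_size), key=lambda j: fitness(pop[j]))
        if fitness(child) >= fitness(pop[worst]):
            pop[worst] = child
    return max(pop, key=fitness)


# Toy objective (an assumption): maximize the number of 1 bits.
best = steady_state_ga(sum)
print(best, sum(best))
```

Because only one individual changes per step, selection, crossover, and replacement form a short fixed sequence of stages, which is the kind of structure that lends itself to pipelining.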

Optical Look-ahead Carry Full-adder Using Dual-rail Coding

  • Gil Sang Keun
    • Journal of the Optical Society of Korea / v.9 no.3 / pp.111-118 / 2005
  • In this paper, a new optical parallel binary arithmetic processor (OPBAP) capable of computing arbitrary n-bit look-ahead carry full-addition is proposed and implemented. Conventional Boolean algebra is used to implement the OPBAP with two optical logic processor schemes. One is the space-variant optical logic gate processor (SVOLGP); the other is the shadow-casting optical logic array processor (SCOLAP). SVOLGP can process logical AND and OR operations at different positions in space simultaneously by using free-space interconnection logic filters, while SCOLAP can perform any of the 16 possible Boolean logic functions by using a spatial instruction-control filter. A dual-rail encoding method is adopted because the complement of an input is needed in the arithmetic process. An experiment on the OPBAP for 8-bit look-ahead carry full-addition is performed. The experimental results show that the proposed OPBAP is capable of optical look-ahead carry full-addition with high computing speed regardless of the data length.
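
The logic behind look-ahead carry addition with dual-rail inputs, as described in the abstract above, can be sketched in software. In the sketch below each bit travels as the pair (x, NOT x), so XOR can be formed from AND and OR alone, and every carry is computed directly from the generate and propagate signals rather than rippled from the previous stage. This models only the Boolean logic, not the optical SVOLGP or SCOLAP hardware, and all function names are assumptions made for this example.

```python
def to_bits(n, width=8):
    """Little-endian list of the low `width` bits of n."""
    return [(n >> i) & 1 for i in range(width)]


def from_bits(bits):
    """Inverse of to_bits."""
    return sum(b << i for i, b in enumerate(bits))


def dual_rail(x):
    """Encode a bit as the pair (x, NOT x)."""
    return (x, 1 - x)


def xor_dr(a, b):
    """XOR from AND/OR only, using the complements the dual rail supplies:
    (a AND NOT b) OR (NOT a AND b)."""
    return (a[0] & b[1]) | (a[1] & b[0])


def lookahead_add(a_bits, b_bits, c0=0):
    """Add two equal-length little-endian bit lists.
    Returns (sum_bits, carry_out)."""
    a = [dual_rail(x) for x in a_bits]
    b = [dual_rail(x) for x in b_bits]
    g = [x[0] & y[0] for x, y in zip(a, b)]    # generate:  a_i AND b_i
    p = [xor_dr(x, y) for x, y in zip(a, b)]   # propagate: a_i XOR b_i
    carries = [c0]
    for i in range(len(g)):
        # c_{i+1} = g_i OR p_i g_{i-1} OR ... OR p_i ... p_0 c_0,
        # expanded from the inputs only (no dependence on carries[i]).
        c, prod = g[i], p[i]
        for j in range(i - 1, -1, -1):
            c |= prod & g[j]
            prod &= p[j]
        c |= prod & c0
        carries.append(c)
    sums = [xor_dr(dual_rail(p[i]), dual_rail(carries[i]))
            for i in range(len(p))]
    return sums, carries[-1]


s, cout = lookahead_add(to_bits(200), to_bits(100))
print(from_bits(s), cout)
```

Since each carry depends only on the inputs, all carries could in principle be evaluated at the same time, which is consistent with the abstract's claim of speed independent of data length.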

A Design of Superscalar Digital Signal Processor (다중 명령어 처리 DSP 설계)

  • Park, Sung-Wook
    • Journal of the Korean Institute of Intelligent Systems / v.18 no.3 / pp.323-328 / 2008
  • This paper presents a Digital Signal Processor that achieves high throughput for both decision-intensive and computation-intensive tasks. The proposed processor employs a multiplier, two ALUs, and a load/store unit as operational units. These four units are controlled by a superscalar control scheme and work in parallel, which differs from prior DSP architectures. The performance was evaluated by implementing the AC-3 decoding algorithm, and a 37.8% improvement was achieved. This study is especially valuable for consumer electronics applications, which require very low cost.

Image Processing Processor Design for Artificial Intelligence Based Service Robot (인공지능 기반 서비스 로봇을 위한 영상처리 프로세서 설계)

  • Moon, Ji-Youn; Kim, Soo-Min
    • The Journal of the Korea institute of electronic communication sciences / v.17 no.4 / pp.633-640 / 2022
  • As service robots are applied to various fields, interest is growing in image processing processors that can quickly and accurately run the image processing algorithm suited to each task. This paper introduces an image processing processor design method applicable to robots. The proposed processor consists of an AGX board, an FPGA board, a LiDAR-Vision board, and a Backplane board. It enables the operation of the CPU, GPU, and FPGA. The proposed method is verified through simulation experiments.