• Title/Summary/Keyword: parallel communication

An Efficient Array Algorithm for VLSI Implementation of Vector-radix 2-D Fast Discrete Cosine Transform (Vector-radix 2차원 고속 DCT의 VLSI 구현을 위한 효율적인 어레이 알고리듬)

  • 신경욱;전흥우;강용섬
    • The Journal of Korean Institute of Communications and Information Sciences, v.18 no.12, pp.1970-1982, 1993
  • This paper describes an efficient array algorithm for parallel computation of the vector-radix two-dimensional (2-D) fast discrete cosine transform (VR-FCT), and its VLSI implementation. By mapping the 2-D VR-FCT onto a 2-D array of processing elements (PEs), the butterfly structure of the VR-FCT can be efficiently implemented with high concurrency and local communication geometry. The proposed array algorithm features architectural modularity, regularity and locality, so it is well suited to VLSI realization. Also, no transposition memory is required, which is inevitable in the conventional row-column decomposition approach. It has a time complexity of $O(N + N_{nzd} \cdot \log_2 N)$ for an (N×N) 2-D DCT, where $N_{nzd}$ is the number of non-zero digits in the canonic signed digit (CSD) code. By adopting CSD arithmetic in the circuit design, the number of additions is reduced by about 30% compared with 2's complement arithmetic. The computational accuracy analysis for finite wordlength processing is presented. From simulation results, it is estimated that an (8×8) 2-D DCT (with $N_{nzd}$=4) can be computed in about 0.88 μs at a 50 MHz clock frequency, resulting in a throughput of about 72 megapixels per second.
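The CSD idea in the abstract above can be illustrated with a minimal sketch. It assumes the standard non-adjacent-form recoding of a non-negative integer; the function names `to_csd`, `nonzero_digits` and `throughput_mpixels` are hypothetical and are not taken from the paper. The last helper only checks the quoted throughput, reading the stated computation time as roughly 0.88 microseconds for 64 pixels, which is the reading consistent with about 72 megapixels per second.

```python
def to_csd(n: int) -> list[int]:
    """Return CSD digits of a non-negative integer, least significant first.

    Every digit is -1, 0 or +1, and no two adjacent digits are non-zero.
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    digits = []
    while n > 0:
        if n % 2 == 1:
            # n % 4 == 1 gives digit +1; n % 4 == 3 gives digit -1,
            # which pushes a carry into the higher bits.
            d = 2 - (n % 4)
            n -= d
        else:
            d = 0
        digits.append(d)
        n //= 2
    return digits


def nonzero_digits(n: int) -> int:
    """Count the non-zero CSD digits; each one costs an add or subtract."""
    return sum(1 for d in to_csd(n) if d != 0)


def throughput_mpixels(n: int, time_us: float) -> float:
    """Pixels per microsecond, which equals megapixels per second."""
    return (n * n) / time_us
```

For example, 7 is `111` in binary (three non-zero bits) but `8 - 1` in CSD (two non-zero digits), which is where the saving in additions comes from.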

A Study on Korean Dual System: K-Dual System (한국형 듀얼 시스템(K-Dual 시스템)의 구축 및 운영방안에 관한 연구)

  • Lee, Moon-Su;Lee, Woo-Young;Oh, Chang-Heon
    • Journal of Practical Engineering Education, v.5 no.2, pp.139-149, 2013
  • In this paper, we propose a new Korean dual system, the K-Dual System, and discuss the concepts and details of its three sub-models: the IPP, $W^3$ and corporate university models. The proposed K-Dual System is a new and unique educational model that combines academic study and industrial work in order to solve various problems of the existing work-study parallel educational systems in Korea. The K-Dual System has two tracks, academic and vocational. The academic track has a long-term field experience training program, the IPP program. The vocational track, on the other hand, is divided into two sub-models, $W^3$ and corporate university. Finally, we summarize several key points for the successful setup and operation of the K-Dual System.

Trace-Back Viterbi Decoder with Sequential State Transition Control (순서적 역방향 상태천이 제어에 의한 역추적 비터비 디코더)

  • 정차근
    • Journal of the Institute of Electronics Engineers of Korea TC, v.40 no.11, pp.51-62, 2003
  • This paper presents novel survivor memory management and decoding techniques with sequential backward state transition control in the trace-back Viterbi decoder. The Viterbi algorithm is a maximum likelihood decoding scheme that estimates the likelihood of the encoder state for channel error detection and correction. This scheme is applied to a broad range of digital communication tasks, such as intersymbol interference removal and channel equalization. To achieve an area-efficient VLSI chip design with high throughput in the Viterbi decoder, in which recursive operation is implied, more research is required to obtain a simple, systematic parallel ACS architecture and survivor memory management. As a solution to this problem, this paper addresses a progressive decoding algorithm with sequential backward state transition control in the trace-back Viterbi decoder. Compared to conventional trace-back decoding techniques, the total memory required by the proposed method can be greatly reduced. Furthermore, the proposed method can be implemented with a simple pipelined structure with a systolic-array-type architecture. Peripheral logic circuitry for controlling memory access is not required, and memory access bandwidth can be reduced. Therefore, the proposed method offers high area efficiency and low power consumption with high throughput. Finally, examples of decoding results for received data with channel noise, together with application results, are provided to evaluate the efficiency of the proposed method.
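For orientation, the following is a minimal sketch of a conventional hard-decision trace-back Viterbi decoder, that is, the baseline the abstract compares against and not the proposed sequential backward state transition scheme. The rate-1/2, constraint-length-3 code with generators (7, 5) octal and all function names are assumptions chosen for illustration. The `history` list plays the role of the survivor memory that the paper aims to shrink.

```python
G = (0b111, 0b101)  # assumed generators (7, 5) octal, constraint length 3
N_STATES = 4        # state = the two most recent input bits


def parity(x: int) -> int:
    return bin(x).count("1") & 1


def step(state: int, bit: int):
    """Return (next_state, output_pair) for one encoder transition."""
    reg = (bit << 2) | state
    return reg >> 1, tuple(parity(reg & g) for g in G)


def encode(bits):
    state, out = 0, []
    for b in bits:
        state, symbols = step(state, b)
        out.extend(symbols)
    return out


def viterbi_decode(received):
    inf = float("inf")
    metrics = [0] + [inf] * (N_STATES - 1)  # encoder starts in state 0
    history = []  # survivor memory: per step, per state, (prev_state, bit)
    for t in range(0, len(received), 2):
        r = received[t:t + 2]
        new_metrics = [inf] * N_STATES
        decisions = [None] * N_STATES
        for s in range(N_STATES):
            if metrics[s] == inf:
                continue
            for b in (0, 1):
                # Add-compare-select (ACS) with Hamming branch metrics.
                ns, out = step(s, b)
                m = metrics[s] + (out[0] != r[0]) + (out[1] != r[1])
                if m < new_metrics[ns]:
                    new_metrics[ns] = m
                    decisions[ns] = (s, b)
        metrics = new_metrics
        history.append(decisions)
    # Trace back from the best final state through the survivor memory.
    state = min(range(N_STATES), key=lambda s: metrics[s])
    bits = []
    for decisions in reversed(history):
        state, b = decisions[state]
        bits.append(b)
    bits.reverse()
    return bits
```

Decoding only starts after the whole block has been processed here, so the survivor memory grows with the block length; reducing that memory and its access bandwidth is the concern the abstract raises.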

A Dynamical Load Balancing Method for Data Streaming and User Request in WebRTC Environment (WebRTC 환경에 데이터 스트리밍 및 사용자 요청에 따른 동적로드 밸런싱 방법)

  • Ma, Linh Van;Park, Sanghyun;Jang, Jong-hyun;Park, Jaehyung;Kim, Jinsul
    • Journal of Digital Contents Society, v.17 no.6, pp.581-592, 2016
  • WebRTC has quickly grown into one of the world's most advanced real-time communication technologies on platforms such as web and mobile. Despite this advantage, current WebRTC technology does not efficiently handle large streams between peers or a large number of user requests on the signaling server. In this paper, we address the problem by delivering the data flow with dynamic load balancing algorithms. We analyze the users issuing requests and direct their streaming requests to a load balancing component. More specifically, the component determines the amount of requested resource and the available resource on the response server, then delivers streaming data to the requesting user in parallel or alternately. To show how the method works, we first demonstrate the load-balancing algorithm using the network simulation tool OPNET, then implement the method on an Ubuntu server. In addition, we compare the results of our work with the original implementation of WebRTC; the comparison shows that the method performs more efficiently and dynamically than the original.
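The abstract does not give the balancing rule itself, so the following is only a rough sketch of the general idea it describes: compare the requested resource with what each server has available, serve from one server when possible, and otherwise split the stream across servers in parallel. The function name `dispatch`, the capacity units and the greedy splitting rule are all assumptions, not the authors' algorithm.

```python
def dispatch(requested: int, available: dict) -> dict:
    """Return a hypothetical plan mapping server name to allocated capacity.

    requested: capacity needed by the stream (arbitrary integer units).
    available: free capacity per server, in the same units.
    """
    best = max(available, key=available.get)
    if available[best] >= requested:
        # One server can carry the whole stream.
        return {best: requested}
    # Otherwise split in parallel, largest free capacity first.
    plan, remaining = {}, requested
    for server in sorted(available, key=available.get, reverse=True):
        if remaining <= 0:
            break
        share = min(available[server], remaining)
        if share > 0:
            plan[server] = share
            remaining -= share
    if remaining > 0:
        raise RuntimeError("not enough capacity across all servers")
    return plan
```

With servers `{"a": 5, "b": 3}`, a request for 4 units goes entirely to `a`, while a request for 7 units is split as 5 from `a` and 2 from `b`.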

Efficient Algorithms for Motion Parameter Estimation in Object-Oriented Analysis-Synthesis Coding (객체지향 분석-합성 부호화를 위한 효율적 움직임 파라미터 추정 알고리듬)

  • Lee Chang Bum;Park Rae-Hong
    • The KIPS Transactions:PartB, v.11B no.6, pp.653-660, 2004
  • Object-oriented analysis-synthesis coding (OOASC) subdivides each image of a sequence into a number of moving objects, then estimates and compensates the motion of each object. It employs a motion parameter technique for estimating the motion information of each object. The motion parameter technique, which relies on gradient operators, has a high computational load. The main objective of this paper is to present efficient motion parameter estimation techniques using a hierarchical structure in OOASC. To achieve this goal, this paper proposes two algorithms: the hybrid motion parameter estimation method (HMPEM) and the adaptive motion parameter estimation method (AMPEM), both using the hierarchical structure. HMPEM uses the proposed hierarchical structure, in which six or eight motion parameters are estimated by a parameter verification process in a low-resolution image whose size is one fourth of that of the original image. AMPEM uses the same hierarchical structure with a motion detection criterion that measures the amount of motion based on temporal co-occurrence matrices for adaptive estimation of the motion parameters. This method is fast and easily implemented using parallel processing techniques. Theoretical analysis and computer simulation show that the peak signal-to-noise ratio (PSNR) of the image reconstructed by the proposed method lies between those of images reconstructed by the conventional 6- and 8-parameter estimation methods, while the computational load is reduced by a factor of about four.
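The low-resolution level of such a hierarchy can be sketched as follows. The abstract only states that the low-resolution image is one fourth the size of the original; averaging each 2×2 block is an assumption made here for illustration, and `downsample_2x2` is a hypothetical name.

```python
def downsample_2x2(image):
    """Average each 2x2 block of a grayscale image given as a list of rows.

    The result has half the width and half the height, so one fourth
    of the pixels of the input.
    """
    height, width = len(image), len(image[0])
    if height % 2 or width % 2:
        raise ValueError("image dimensions must be even")
    return [
        [
            (image[r][c] + image[r][c + 1]
             + image[r + 1][c] + image[r + 1][c + 1]) / 4
            for c in range(0, width, 2)
        ]
        for r in range(0, height, 2)
    ]
```

Because a gradient-based estimator visits every pixel, running it on the quarter-size image touches about one fourth as many samples, which matches the order of the reduction the abstract reports.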

Design and Implementation of An I/O System for Irregular Application under Parallel System Environments (병렬 시스템 환경하에서 비정형 응용 프로그램을 위한 입출력 시스템의 설계 및 구현)

  • No, Jae-Chun;Park, Seong-Sun;;Gwon, O-Yeong
    • Journal of KIISE:Computer Systems and Theory, v.26 no.11, pp.1318-1332, 1999
  • In this paper we present the design, implementation and evaluation of a runtime system based on collective I/O techniques for irregular applications. We present two designs, namely "Collective I/O" and "Pipelined Collective I/O". In the first scheme, all processors participate in the I/O simultaneously, which makes scheduling of I/O requests simpler but creates a possibility of contention at the I/O nodes. In the second approach, processors are divided into several groups so that only one group performs I/O at a time while the next group performs communication to rearrange data, and this entire process is pipelined to reduce I/O node contention dynamically. In other words, the design provides support for dynamic contention management. We then present a software caching method using collective I/O that reduces I/O cost by reusing data already present in the memory of other nodes. Finally, chunking and on-line compression mechanisms are included in both models. We demonstrate that we can obtain significantly higher I/O performance than has been possible so far. The performance results are presented on an Intel Paragon and on the ASCI/Red teraflops machine. Application-level I/O bandwidth of up to 55% of the peak is observed.
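The overlap described for the pipelined scheme can be pictured with a small schedule generator. This is a sketch under the assumption that each group first communicates to rearrange its data and then performs I/O in the following step; the function name `pipeline_schedule` and the two-stage model are illustrative and not taken from the paper.

```python
def pipeline_schedule(num_groups: int):
    """Return a list of (step, communicating_group, io_group) tuples.

    None marks a stage that is idle in that step. At most one group
    performs I/O in any step, while the next group communicates.
    """
    schedule = []
    for step in range(num_groups + 1):
        comm_group = step if step < num_groups else None
        io_group = step - 1 if step >= 1 else None
        schedule.append((step, comm_group, io_group))
    return schedule
```

For three groups this yields four steps: group 0 communicates alone, then group 0 does I/O while group 1 communicates, and so on until group 2 finishes its I/O alone.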

COMS Electrical Power Subsystem Preliminary Design (통신해양기상위성 전력계 예비설계)

  • Gu, Ja-Chun;Kim, Ui-Chan
    • Journal of Satellite, Information and Communications, v.1 no.2, pp.95-100, 2006
  • The COMS (Communication, Ocean and Meteorological Satellite) EPS (Electrical Power Subsystem) is derived from an enhanced Eurostar 3000 version. The Eurostar 3000 EPS operates fully autonomously in nominal conditions or in the event of a failure and provides a high level of reconfiguration capability. This paper introduces the COMS EPS preliminary design result. The COMS EPS consists of a battery, a solar array wing, a PSR (Power Supply Regulator), a PRU (Pyrotechnic Unit), a SADM (Solar Array Drive Mechanism) and relay and fuse brackets. The COMS EPS can offer a bus power capability of 3 kW. The solar array is made of a deployable wing with two panels. A single type of solar cell is selected: GaAs/Ge triple-junction cells. The baseline Li-ion battery has ten cell modules in series, each with five cells in parallel. The PSR, together with the battery and the solar array wing, generates a power bus fully regulated at 50 V. The power bus is protected and distributed centrally by the relay and fuse brackets. The PRU provides power for firing actuator devices. The solar array wing is rotated by the SADM under control of the attitude and orbit control subsystem. The control and monitoring of the EPS, especially of the battery, is performed by the PSR in combination with the on-board software.

A Study on the Interference of Harmonic Frequency during the Change of Urban Transit's Signalling Systems (도시철도 신호시스템의 절체에 따른 주파수 간섭 연구)

  • Jeong, Rag-Gyo;Kim, Beak-Hyun;Joung, Eui-Jin
    • Journal of the Korea Academia-Industrial cooperation Society, v.11 no.2, pp.469-475, 2010
  • The railway signalling system plays an essential role in safe and efficient train operation by controlling train intervals and train routes. Its reliability and safety are very important because a failure can lead to a train collision or derailment as well as a service stop. Until now, the conventional wayside signal mode has generally been used in railway signalling. It carries, however, a risk of accidents caused by human error, because the driver reads the signal lamp status and controls the train speed by eye. Obsolete systems also need to be refurbished. For these reasons, the onboard signal mode, which transmits speed control information to the train using computer and communication equipment, has recently been introduced. To replace an obsolete signal system, the switch-over has to be carried out while passenger service continues. In this paper, we verify through trial assessment cases in which interference problems, arising during the switch-over procedure and during the series of system verification processes while trains operate on the new system in parallel with the existing system, are solved by adding specific functionalities.

Simulation of YUV-Aware Instructions for High-Performance, Low-Power Embedded Video Processors (고성능, 저전력 임베디드 비디오 프로세서를 위한 YUV 인식 명령어의 시뮬레이션)

  • Kim, Cheol-Hong;Kim, Jong-Myon
    • Journal of KIISE:Computing Practices and Letters, v.13 no.5, pp.252-259, 2007
  • With the rapid development of multimedia applications and wireless communication networks, consumer demand for video-over-wireless capability on mobile computing systems is growing rapidly. In this regard, this paper introduces YUV-aware instructions that enhance performance and efficiency in the processing of color images and video. Traditional multimedia extensions (e.g., MMX, SSE, VIS, and AltiVec) depend solely on generic subword parallelism, whereas the proposed YUV-aware instructions support parallel operations on two packed 16-bit YUV values (6-bit Y, 5-bit U and V) in a 32-bit datapath architecture, providing greater concurrency and efficiency for color image and video processing. Moreover, the ability to reduce the data format size reduces system cost. Experimental results on a representative dynamically scheduled embedded superscalar processor show that YUV-aware instructions achieve an average speedup of 3.9x over the baseline superscalar performance. This is in contrast to MMX (a representative Intel multimedia extension), which achieves a speedup of only 2.1x over the same baseline superscalar processor. In addition, YUV-aware instructions outperform MMX instructions in energy reduction (75.8% reduction with YUV-aware instructions, but only 54.8% reduction with MMX instructions, relative to the baseline).
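The packed data format mentioned above can be modelled in software as a rough sketch. The abstract gives the field widths (6-bit Y, 5-bit U, 5-bit V, two pixels per 32-bit word) but not the bit layout or the instruction semantics, so the field order, the saturating add and every name below are assumptions for illustration only.

```python
FIELD_MAX = (63, 31, 31)  # largest values of the 6-bit Y, 5-bit U, 5-bit V fields


def pack_yuv655(y: int, u: int, v: int) -> int:
    """Pack one pixel into 16 bits, assuming Y occupies the high bits."""
    if not (0 <= y <= 63 and 0 <= u <= 31 and 0 <= v <= 31):
        raise ValueError("component out of range")
    return (y << 10) | (u << 5) | v


def unpack_yuv655(p: int):
    return (p >> 10) & 0x3F, (p >> 5) & 0x1F, p & 0x1F


def pack_word(pixel0, pixel1) -> int:
    """Place two packed pixels in one 32-bit word (pixel0 in the low half)."""
    return (pack_yuv655(*pixel1) << 16) | pack_yuv655(*pixel0)


def unpack_word(word: int):
    return unpack_yuv655(word & 0xFFFF), unpack_yuv655(word >> 16)


def saturating_add_word(a: int, b: int) -> int:
    """Software model of a hypothetical YUV-aware add.

    All six fields of the two packed pixels are added independently and
    clamped to their own maximum, so no carry crosses a field boundary.
    """
    pixels = [
        tuple(min(x + y, m) for x, y, m in zip(pa, pb, FIELD_MAX))
        for pa, pb in zip(unpack_word(a), unpack_word(b))
    ]
    return pack_word(pixels[0], pixels[1])
```

A generic subword add would treat the word as two 16-bit lanes and let carries run across the Y, U and V fields; the field-aware version above is the behaviour such an instruction would need to provide in hardware.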

Serialized Multitasking Code Generation from Dataflow Specification (데이타 플로우 명세로부터 직렬화된 멀티태스킹 코드 생성)

  • Kwon, Seong-Nam;Ha, Soon-Hoi
    • Journal of KIISE:Computer Systems and Theory, v.35 no.9_10, pp.429-440, 2008
  • As embedded systems become more complex, software development becomes more important in the entire design process. Most embedded applications consist of multiple tasks that are executed in parallel. A dataflow model, which expresses concurrency naturally, is therefore preferred to a sequential programming language for developing multitask software. To execute multitasking code, an operating system (OS) is essential for scheduling the tasks and handling the communication between them. However, multitasking code must run without an OS when the target hardware platform cannot run one, or when the target platforms are candidates in design space exploration (DSE), because porting an OS to every candidate platform is very costly. For this reason, we propose a serialized multitasking code generation technique from a dataflow specification. In the proposed technique, a task is specified with a dataflow model and generated as C code. Code generation consists of two steps. First, each block in a task is generated as a separate function. Second, the generated functions are scheduled by a multitasking scheduler that is also generated automatically. To make it easy to write a customized scheduler manually, the data structure and information of each task are defined. A preliminary experiment with a DivX player confirms that the code generated by the proposed framework executes efficiently and correctly on the target system.
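The two generation steps can be pictured with a small hand-written analogue. The paper generates C code; the sketch below uses Python only to show the shape of the result, and the block functions, the task table layout and the round-robin policy are all assumptions rather than the authors' generated output. Each block is a separate function, and a single loop serializes the tasks without any OS threads.

```python
from collections import deque

channel = deque()  # FIFO standing in for the inter-task communication buffer
log = []           # values consumed by the sink block

# Step 1: each block of a task becomes a separate function.
def src_block(ctx):
    ctx["value"] = ctx.get("count", 0)
    ctx["count"] = ctx["value"] + 1

def send_block(ctx):
    channel.append(ctx["value"])

def recv_block(ctx):
    ctx["value"] = channel.popleft() if channel else None

def sink_block(ctx):
    if ctx["value"] is not None:
        log.append(ctx["value"] * 2)

# Per-task data structure: block list, position of the next block, local state.
TASKS = [
    {"name": "producer", "blocks": [src_block, send_block], "pc": 0, "ctx": {}},
    {"name": "consumer", "blocks": [recv_block, sink_block], "pc": 0, "ctx": {}},
]

# Step 2: a scheduler interleaves the block functions of all tasks.
def run(tasks, steps):
    for _ in range(steps):
        for task in tasks:
            block = task["blocks"][task["pc"]]
            block(task["ctx"])
            task["pc"] = (task["pc"] + 1) % len(task["blocks"])
```

Because each task keeps its own position and local state in a plain table, swapping in a different scheduling policy only means replacing `run`, which is in the spirit of the customizable scheduler the abstract mentions.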