• Title/Summary/Keyword: dataflow


Hardware Synthesis From Coarse-Grained Dataflow Specification For Fast HW/SW Cosynthesis (빠른 하드웨어/소프트웨어 통합합성을 위한 데이타플로우 명세로부터의 하드웨어 합성)

  • Jung, Hyun-Uk;Ha, Soon-Hoi
    • Journal of KIISE:Computer Systems and Theory / v.32 no.5 / pp.232-242 / 2005
  • This paper concerns automatic hardware synthesis from a dataflow graph (DFG) specification for fast HW/SW cosynthesis. A node in the DFG represents a coarse-grained block such as FIR or DCT, and a port in a block may consume multiple data samples per invocation, which distinguishes our approach from behavioral synthesis and complicates the problem. In the presented design methodology, a dataflow graph with a specified algorithm can be mapped to various hardware structures according to the resource allocation and schedule information. This simplifies the management of the area/performance tradeoff in hardware design and widens the design space of hardware implementations of a dataflow graph compared with previous approaches. Experiments with several examples demonstrate the usefulness of the proposed technique.
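
The multi-rate ports mentioned in this abstract can be illustrated with a minimal sketch. It is not the paper's synthesis method. It only computes how many times each block must fire per schedule iteration so that the samples produced and consumed on every edge balance, as in synchronous dataflow. The function name `repetition_vector`, the edge tuple layout, and the three-block example are assumptions made for illustration.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def repetition_vector(actors, edges):
    """Return the smallest positive firing counts that balance every edge.

    edges: list of (src, produced_per_firing, dst, consumed_per_firing).
    Assumes a connected graph with consistent rates; neither is checked here.
    """
    rate = {actors[0]: Fraction(1)}  # relative firing rate, anchored at the first actor
    changed = True
    while changed:
        changed = False
        for src, produced, dst, consumed in edges:
            # Balance equation per edge: rate[src] * produced == rate[dst] * consumed
            if src in rate and dst not in rate:
                rate[dst] = rate[src] * produced / consumed
                changed = True
            elif dst in rate and src not in rate:
                rate[src] = rate[dst] * consumed / produced
                changed = True
    # Scale the fractional rates to the smallest all-integer solution.
    scale = reduce(lambda a, b: a * b // gcd(a, b),
                   (r.denominator for r in rate.values()), 1)
    return {a: int(rate[a] * scale) for a in actors}


# Hypothetical chain: A produces 2 samples per firing, B consumes 3 and
# produces 1, C consumes 2.
counts = repetition_vector(["A", "B", "C"], [("A", 2, "B", 3), ("B", 1, "C", 2)])
print(counts)
```

With these rates, A fires three times for every two firings of B and one firing of C, so each edge carries the same number of samples in and out per iteration.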

Design and Implementation of a Massively Parallel Multithreaded Architecture: DAVRID

  • Sangho Ha;Kim, Junghwan;Park, Eunha;Yoonhee Hah;Sangyong Han;Daejoon Hwang;Kim, Heunghwan;Seungho Cho
    • Journal of Electrical Engineering and Information Science / v.1 no.2 / pp.15-26 / 1996
  • Massively parallel architectures (MPAs) should address two fundamental issues for scalability: synchronization and communication latency. Dataflow architectures suffer from excessive synchronization overhead and inefficient execution of sequential programs, although they can exploit the massive parallelism inherent in programs. In contrast, MPAs based on the von Neumann computational model may suffer from inefficient synchronization mechanisms and communication latency. DAVRID (DAtaflow/Von Neumann RISC hybrID) is a massively parallel multithreaded architecture that takes advantage of both the von Neumann and dataflow models. It achieves good single-thread performance and tolerates synchronization and communication latency. In this paper, we describe the DAVRID architecture in detail and evaluate its performance through simulation runs over several benchmarks.

A Method for Distributed Database Processing with Optimized Communication Cost in Dataflow model (데이터플로우 모델에서 통신비용 최적화를 이용한 분산 데이터베이스 처리 방법)

  • Jun, Byung-Uk
    • Journal of Internet Computing and Services / v.8 no.1 / pp.133-142 / 2007
  • Large database processing is one of the most important techniques in the information society. Since most large databases are regionally distributed, distributed database processing has come to the fore. Communication and data compression are the basic technologies for large database processing. To get the most out of these technologies, the execution time of each task, the size of the data, and the communication time between processors should be considered. In this paper, a dataflow scheme and a vertically layered allocation algorithm are used to optimize distributed large database processing. The basic concept of this method is to rearrange processes with the communication time between processors taken into account. The paper also introduces a measurement model of the execution time, the size of the output data, and the communication time in order to implement the proposed scheme.

Co-scheduling Technique of Dataflow Applications with Shared Processor Allocation (프로세서 공유를 이용한 데이터 플로우 어플리케이션의 동시 스케줄링 기법)

  • Kang, Duseok;Kang, Shinhaeng;Yang, Hoeseok;Ha, Soonhoi
    • KIISE Transactions on Computing Practices / v.22 no.1 / pp.1-7 / 2016
  • When multiple applications run concurrently on a multi-processor system, interference between applications makes it difficult to guarantee real-time constraints. We propose a novel interference analysis technique that allows processors to be shared among dataflow applications while satisfying real-time constraints. Based on the interference analysis, we develop a co-scheduling technique that aims to minimize resource usage. Compared to an existing technique that converts application graphs to real-time tasks, the proposed technique shows better results in terms of resource usage, especially when applied to applications with tight time constraints.

Serialized Multitasking Code Generation from Dataflow Specification (데이타 플로우 명세로부터 직렬화된 멀티태스킹 코드 생성)

  • Kwon, Seong-Nam;Ha, Soon-Hoi
    • Journal of KIISE:Computer Systems and Theory / v.35 no.9_10 / pp.429-440 / 2008
  • As embedded systems become more complex, software development becomes more important in the entire design process. Most embedded applications consist of multiple tasks that are executed in parallel. So, a dataflow model, which expresses concurrency naturally, is preferred to a sequential programming language for developing multitask software. For the execution of multitasking code, an operating system is essential to schedule the tasks and to handle the communication between them. However, multitasking code must be executed without an OS when the target hardware platform cannot run an OS or when the target platforms are candidates in design space exploration (DSE), because it is very costly to port an OS to every candidate platform. For this reason, we propose a serialized multitasking code generation technique from dataflow specification. In the proposed technique, a task is specified with a dataflow model and generated as C code. Code generation consists of two steps: first, each block in a task is generated as a separate function; second, the generated functions are scheduled by a multitasking scheduler that is also generated automatically. To make it easy to write a customized scheduler manually, the data structure and information of each task are defined. A preliminary experiment with a DivX player confirms that the code generated by the proposed framework executes efficiently and correctly on the target system.
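
The two-step structure described in this abstract (one function per block, plus a scheduler that interleaves the tasks without an OS) can be sketched as follows. The paper generates C code; this sketch uses Python only to show the control structure, and the round-robin policy, the function `run_serialized`, and the task and block names are all assumptions rather than details taken from the paper.

```python
def run_serialized(tasks, rounds):
    """Interleave tasks in a single thread, one block function per turn.

    tasks: dict mapping a task name to its ordered list of block functions.
    A per-task counter remembers which block runs next, so no OS-level
    context switching is needed. Round-robin is an assumed policy.
    """
    next_block = {name: 0 for name in tasks}
    trace = []
    for _ in range(rounds):
        for name, blocks in tasks.items():
            block = blocks[next_block[name]]
            trace.append(block())  # run exactly one block, then yield to the next task
            next_block[name] = (next_block[name] + 1) % len(blocks)
    return trace


# Hypothetical tasks: each block is a separate function, as in step one.
def video_decode():
    return "video.decode"


def video_display():
    return "video.display"


def audio_decode():
    return "audio.decode"


trace = run_serialized(
    {"video": [video_decode, video_display], "audio": [audio_decode]},
    rounds=2,
)
print(trace)
```

Because every block returns to the scheduler loop when it finishes, the tasks share one call stack, which is what makes the generated code portable to platforms that cannot run an OS.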

New buffer mapping method for Hybrid SPM with Buffer sharing (하이브리드 SPM을 위한 버퍼 공유를 활용한 새로운 버퍼 매핑 기법)

  • Lee, Daeyoung;Oh, Hyunok
    • IEMEK Journal of Embedded Systems and Applications / v.11 no.4 / pp.209-218 / 2016
  • This paper proposes a new lifetime-aware buffer mapping method for a synchronous dataflow (SDF) graph on a hybrid memory system with DRAM and PRAM. Since the number of write operations on PRAM is limited, the number of samples written to PRAM is minimized to maximize the lifetime of the PRAM. We improve the utilization of DRAM by mapping more buffers to DRAM through buffer sharing. The problem is formally formulated and solved optimally using answer set programming. In experiments, the buffer mapping method with buffer sharing improves the PRAM lifetime by 63%.

An Efficient Parallel Algorithm for Solving Large Sparse Linear Systems of Equations (대형 Sparse 선형시스템 방정식을 풀기위한 효과적인 병렬 알고리즘)

  • Chae, Soo-Hoan;Lee, Jin
    • The Journal of Korean Institute of Communications and Information Sciences / v.14 no.4 / pp.388-397 / 1989
  • This paper describes an intelligent iterative parallel algorithm for solving large sparse linear systems of equations, and proposes a static dataflow computer architecture for the implementation of the algorithm. Implemented with the Jacobi iterative method, the intelligent algorithm reduces the parallel execution time by reducing the time of each individual inner product operation.
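
For reference, a minimal sketch of the textbook Jacobi iteration on a sparse system is given below. It is not the paper's "intelligent" variant or its dataflow architecture. It only shows why the method parallelizes well: each row's inner product uses the previous iterate alone. The sparse row layout (one dict of nonzeros per row), the function name `jacobi`, and the 2x2 example are assumptions.

```python
def jacobi(rows, b, iterations=50):
    """Approximate the solution of A x = b with Jacobi iteration.

    rows[i] is a dict {column: value} holding only the nonzero entries of row i.
    Assumes a nonzero diagonal and a matrix for which Jacobi converges
    (for example, a strictly diagonally dominant one).
    """
    x = [0.0] * len(b)
    for _ in range(iterations):
        # Every component depends only on the previous iterate, so the
        # per-row inner products below are independent of one another
        # and could be evaluated in parallel.
        x = [
            (b[i] - sum(v * x[j] for j, v in row.items() if j != i)) / row[i]
            for i, row in enumerate(rows)
        ]
    return x


# Hypothetical system: 4x + y = 6 and x + 3y = 7, whose exact solution is x = 1, y = 2.
rows = [{0: 4.0, 1: 1.0}, {0: 1.0, 1: 3.0}]
solution = jacobi(rows, [6.0, 7.0])
print(solution)
```

Storing only the nonzeros keeps each inner product proportional to the number of nonzero entries in the row, which is the operation the abstract says the algorithm speeds up.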

A Review of Structural Testing Methods for ASIC based AI Accelerators

  • Umair, Saeed;Irfan Ali, Tunio;Majid, Hussain;Fayaz Ahmed, Memon;Ayaz Ahmed, Hoshu;Ghulam, Hussain
    • International Journal of Computer Science & Network Security / v.23 no.1 / pp.103-111 / 2023
  • Implementing a conventional DFT solution for arrays of DNN accelerators with a large number of processing elements (PEs), without considering the architectural characteristics of the PEs, may incur overwhelming test overheads. Recent DFT-based techniques have utilized the homogeneity and dataflow of arrays at the PE level and core level to reduce test pattern volume, test time, test power, and ATPG runtime. This paper reviews these contemporary test solutions for ASIC-based DNN accelerators. Mainly, the proposed test architectures and pattern application methods, along with their objectives, are reviewed. It is observed that exploiting architectural characteristics such as the homogeneity and dataflow of PEs/arrays results in reduced test overheads.

Reconfigurable SoC Design with Hierarchical FSM and Synchronous Dataflow Model (Hierarchical FSM과 Synchronous Dataflow Model을 이용한 재구성 가능한 SoC의 설계)

  • 이성현;유승주;최기영
    • Journal of the Institute of Electronics Engineers of Korea SD / v.40 no.8 / pp.619-630 / 2003
  • We present a method of runtime configuration scheduling in reconfigurable SoC design. As the model of computation, we use a popular formal model, hierarchical FSM (HFSM) combined with synchronous dataflow (SDF), or HFSM-SDF for short. In reconfigurable SoC design with the HFSM-SDF model, configuration scheduling becomes challenging due to the dynamic behavior of the system, such as concurrent execution of state transitions (by AND relation), complex control flow (HFSM), and complex schedules of SDF actor firing. This makes it hard to hide configuration latency efficiently with compile-time static configuration scheduling. To resolve the problem, it is necessary to know the exact order of required configurations during runtime and to perform runtime configuration scheduling. To obtain the exact order of configurations, we exploit the inherent property of HFSM-SDF that the execution order of SDF actors can be determined before executing the state transition of the top FSM. After obtaining the order information and storing it in the ready configuration queue (ready CQ), we execute the state transition. During the execution, whenever an FPGA resource is available, a new configuration is selected from the ready CQ and fetched by the runtime configuration scheduler. We applied the method to an MPEG4 decoder and IS95 design and obtained up to 21.8% improvement in system runtime with negligible memory overhead.

A Visual Programming Environment for Medical Image Processing (의료영상처리를 위한 시각 프로그래밍 환경)

  • Sung, Chong-Won;Kim, Jin-Ho;Kim, Jee-In
    • The Transactions of the Korea Information Processing Society / v.7 no.8 / pp.2349-2360 / 2000
  • In medical image processing, if new technologies are developed, they are applied to real clinical cases. The results are to be analyzed by doctors to improve the new technologies. So, it is important for doctors to have a tool that helps them apply the new technologies to clinical cases and analyze the clinical results. In this paper, we design and implement a visual programming environment where non-programming experts, such as medical doctors, can easily compose a medical image processing application program. A set of image processing functions is implemented and represented as icons. The user selects functions by clicking the corresponding icons. Users can easily find the necessary functions in the visualized library. A user selects a function from the visualized library and puts the function node onto a canvas of the Visual Programming Interface. The user connects nodes to compose a dataflow diagram. The connected dataflow diagram shows the flow of the program. A Hyperbolic Tree is helpful in visualizing a set of function icons on a single screen because it provides both the whole structure of the function library and the details of the focused functions at the same time. We also developed a GUI builder where the user interfaces of the medical image processing applications are composed. Therefore, non-programming experts such as physicians can apply new medical image processing algorithms to clinical cases without performing complex computer programming procedures.