• Title/Summary/Keyword: parallel search (병렬탐색)

Search Result 188, Processing Time 0.03 seconds

Design of Systolic Array for High Speed Processing of Block Matching Motion Estimation Algorithm (블록 정합 움직임추정 알고리즘의 고속처리를 위한 시스토릭 어레이의 설계)

  • 추봉조;김혁진;이수진
    • Journal of the Korea Society of Computer and Information / v.3 no.2 / pp.119-124 / 1998
  • The Block Matching Motion Estimation (BMME) algorithm demands a very large amount of computing power, and many fast algorithms have been proposed for it. These algorithms suffer from two problems: a larger VLSI size caused by non-localized search block data, and no reuse of input data between processing steps. In this paper, we design a systolic array with high processing capacity that constrains the number of input/output pins and reuses input data to keep the VLSI size small. The proposed systolic array optimizes memory access time through iterative reuse of the search block input data, and it becomes independent of problem size because the algorithm's parallelism is increased and all connections between processing elements are localized in space and time. The designed systolic array reduces the time complexity of motion vector computation from O(N^6) to O(N^3) and requires O(N) input/output pins.
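
To make the cost mentioned in the abstract concrete, the following is a minimal software sketch of full-search block matching using the sum of absolute differences (SAD). It is not the authors' systolic array. The function names `sad` and `full_search`, the block size, and the search range are illustrative assumptions.

```python
def sad(ref, cur, ry, rx, cy, cx, block):
    """Sum of absolute differences between two block x block windows."""
    return sum(
        abs(cur[cy + i][cx + j] - ref[ry + i][rx + j])
        for i in range(block)
        for j in range(block)
    )


def full_search(ref, cur, top, left, block=2, search=1):
    """Return ((dy, dx), cost) for the best match of one block of `cur` in `ref`.

    ref, cur: frames as lists of rows; (top, left) is the block's corner in `cur`.
    All names and defaults are hypothetical, chosen for illustration only.
    """
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = None, None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            # Skip candidate windows that fall outside the reference frame.
            if y < 0 or x < 0 or y + block > height or x + block > width:
                continue
            cost = sad(ref, cur, y, x, top, left, block)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dy, dx), cost
    return best_mv, best_cost
```

Every candidate displacement re-reads the whole window, which is the kind of repeated input access that a data-reusing hardware array is meant to avoid.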

Motion Estimation Specific Instructions and Their Hardware Architecture for ASIP (ASIP을 위한 움직임 추정 전용 연산기 구조 및 명령어 설계)

  • Hwang, Sung-Jo;SunWoo, Myung-Hoon
    • Journal of the Institute of Electronics Engineers of Korea SP / v.48 no.3 / pp.106-111 / 2011
  • This paper presents an ASIP (Application-specific Instruction Processor) for motion estimation that employs specific IME instructions and a programmable, reconfigurable hardware architecture for various video codecs such as H.264/AVC and MPEG4. With the proposed instructions and hardware accelerator, it can meet the real-time processing requirement of High Definition (HD) video. Through parallel operations and SAD unit control using pattern information, the proposed IME instructions support not only the full search algorithm but also other fast search algorithms. The hardware size is 77K gates for each Processing Element Group (PEG), which has 256 SAD PEs. The proposed ASIP runs at 160MHz with sixteen PEGs and can process 1080p video at 30 frames per second in real time.

Implementation of an Optimal Many-core Processor for Beamforming Algorithm of Mobile Ultrasound Image Signals (모바일 초음파 영상신호의 빔포밍 기법을 위한 최적의 매니코어 프로세서 구현)

  • Choi, Byong-Kook;Kim, Jong-Myon
    • Journal of the Korea Society of Computer and Information / v.16 no.8 / pp.119-128 / 2011
  • This paper introduces a design space exploration of many-core processors that meet the high performance and low power requirements of the beamforming algorithm for mobile ultrasound image signals. For the design space exploration, we mapped different amounts of ultrasound image data to each processing element (PE) of the many-core processor, and then determined an optimal many-core architecture in terms of execution time, energy efficiency, and area efficiency. Experimental results indicate that PE=4096 and PE=1024 provide the highest energy efficiency and area efficiency, respectively. In addition, PE=4096 achieves 46x better energy efficiency and 10x better area efficiency than the TI DSP C6416, which is widely used in ultrasound imaging devices.

A Study on the Parallel Escape Maze through Cooperative Activities of Humanoid Robots (인간형 로봇들의 협력 작업을 통한 미로 동시 탈출에 관한 연구)

  • Jun, Bong-Gi
    • Journal of the Korea Institute of Information and Communication Engineering / v.18 no.6 / pp.1441-1446 / 2014
  • This paper proposes a cooperative method by which a robot swarm escapes from a maze. Each robot can move freely, collecting essential data with its sensors and making its own decisions; however, a central control system is required to coordinate all robots during the escape. The robots explore unknown parts of the maze and send the information to the system, which analyzes it and maps the escape route. Three issues were considered for effective escape by multiple robots. First, the maze should be divided. Second, dead ends should be blocked. Finally, once the first robots arrive at the destination, a shortcut should be provided so that the others can escape quickly. The parallel-escape algorithms were applied to mazes of different sizes so that the robot swarm can escape them effectively.
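
The dead-end blocking step can be illustrated with the classic dead-end filling technique on a grid map. This is a minimal sketch under assumptions of my own (a 0/1 grid, four-way movement, a function named `block_dead_ends`); the abstract does not describe the authors' actual data structures.

```python
def block_dead_ends(grid, start, goal):
    """Return a copy of `grid` with dead-end corridors walled off.

    grid: list of rows, 0 = open cell, 1 = wall (hypothetical encoding).
    start, goal: (row, col) cells that are never blocked.
    """
    g = [row[:] for row in grid]
    rows, cols = len(g), len(g[0])

    def open_neighbours(r, c):
        return [
            (nr, nc)
            for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
            if 0 <= nr < rows and 0 <= nc < cols and g[nr][nc] == 0
        ]

    # Seed the work list with open cells that have at most one open neighbour.
    stack = [
        (r, c)
        for r in range(rows)
        for c in range(cols)
        if g[r][c] == 0
        and (r, c) not in (start, goal)
        and len(open_neighbours(r, c)) <= 1
    ]
    while stack:
        r, c = stack.pop()
        if g[r][c] != 0 or (r, c) in (start, goal):
            continue
        nbrs = open_neighbours(r, c)
        if len(nbrs) <= 1:
            g[r][c] = 1          # block the dead end
            stack.extend(nbrs)   # its neighbour may now be a dead end too
    return g
```

After filling, only cells that lie on some route between `start` and `goal` (or on a loop) remain open, so later robots have fewer branches to explore.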

Code Optimization in DNA Computing for the Hamiltonian Path Problem (해밀톤 경로 문제를 위한 DNA 컴퓨팅에서 코드 최적화)

  • 김은경;이상용
    • Journal of KIISE:Software and Applications / v.31 no.4 / pp.387-393 / 2004
  • DNA computing is a technology that applies the massive parallelism of biomolecules to information processing, and it has been used to solve NP-complete problems. However, when DNA computing alone is used to solve NP-complete problems, it sometimes fails to find a solution or takes a long time. In this paper, we propose an algorithm called ACO (Algorithm for Code Optimization) that can efficiently express DNA sequences and create good codes through composition and separation processes repeated as many times as the number of reactions in the DNA coding method. We also applied ACO to the Hamiltonian path problem, an NP-complete problem. As a result, ACO could express DNA codes of variable lengths more efficiently than Adleman's DNA computing algorithm. In addition, compared to Adleman's DNA computing algorithm, ACO reduced search time and biological error rate by 50% and could find accurate paths in a short time.
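
For readers unfamiliar with the Hamiltonian path problem, the generate-and-filter idea behind DNA-based search can be mimicked in conventional software. The sketch below is only a sequential analogue: it is neither ACO nor a DNA encoding, and the function name `hamiltonian_paths` and the vertex numbering `0..n-1` are assumptions.

```python
from itertools import permutations


def hamiltonian_paths(n, edges, start, end):
    """List every path from `start` to `end` that visits all n vertices exactly once.

    edges: iterable of directed (u, v) pairs over vertices 0..n-1.
    """
    edge_set = set(edges)
    inner = [v for v in range(n) if v not in (start, end)]
    found = []
    # Generate every candidate ordering, then keep only those whose
    # consecutive vertices are all joined by an edge.
    for middle in permutations(inner):
        path = (start, *middle, end)
        if all((a, b) in edge_set for a, b in zip(path, path[1:])):
            found.append(path)
    return found
```

The number of candidates grows factorially with `n`, which is why approaches that examine many candidates at once are attractive for this problem.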

A Simple Stereo Matching Algorithm using PBIL and its Alternative (PBIL을 이용한 소형 스테레오 정합 및 대안 알고리즘)

  • Han Kyu-Phil
    • The KIPS Transactions:PartB / v.12B no.4 s.100 / pp.429-436 / 2005
  • This paper proposes a simple stereo matching algorithm using population-based incremental learning (PBIL) to mitigate general problems of genetic algorithms, such as memory consumption and inefficient search. PBIL is a variation of genetic algorithms that uses stochastic search and competitive learning based on a probability vector. Because it uses a probability vector, the structure of PBIL is simpler than that of other genetic algorithm families, such as serial and parallel ones. The PBIL strategy is simplified and adapted to stereo matching. Thus, the gene pool, chromosome crossover, and gene mutation are removed, while the evolution rule, that fitter chromosomes should have higher survival probabilities, is preserved. As a result, memory space is decreased, matching rules are simplified, and computation cost is reduced. In addition, a scheme that controls the distance of neighbors for disparity smoothness is inserted to obtain wide-area consistency of disparities, similar to the result of coarse-to-fine matchers. Because of this scheme, the proposed algorithm can produce a stable disparity map with a small fixed-size window. Finally, an alternative version of the proposed algorithm that does not use a probability vector is also presented for simpler set-ups.
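
The role of the probability vector is easiest to see in the textbook form of PBIL on bit strings. The sketch below is that generic form, not the simplified stereo-matching variant the abstract describes; the names `update_probabilities` and `pbil`, and all parameter defaults, are assumptions.

```python
import random


def update_probabilities(prob, winner, lr):
    """Shift each entry of the probability vector toward the winning bit string."""
    return [(1 - lr) * p + lr * bit for p, bit in zip(prob, winner)]


def pbil(fitness, n_bits, pop_size=20, lr=0.1, generations=50, seed=0):
    """Maximise `fitness` over bit lists.

    Returns (best sample seen, final probability vector).
    """
    rng = random.Random(seed)
    prob = [0.5] * n_bits  # start with no preference for 0 or 1
    best, best_fit = None, None
    for _ in range(generations):
        # Sample a temporary population from the vector; no gene pool is kept.
        samples = [
            [1 if rng.random() < p else 0 for p in prob]
            for _ in range(pop_size)
        ]
        winner = max(samples, key=fitness)
        prob = update_probabilities(prob, winner, lr)
        if best_fit is None or fitness(winner) > best_fit:
            best, best_fit = winner, fitness(winner)
    return best, prob
```

For example, `pbil(sum, 8)` searches for the bit list with the most ones. Only the vector persists between generations, which is the source of the memory saving the abstract mentions.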

Hardware/Software Partitioning Methodology for Reconfigurable System (재구성형 시스템을 위한 하드웨어/소프트웨어 분할 기법)

  • Kim, Jun-Yong;Ahn, Seong-Yong;Lee, Jeong-A.
    • The KIPS Transactions:PartA / v.11A no.5 / pp.303-312 / 2004
  • In this paper, we propose a methodology that solves the hardware-software partitioning problem in reconfigurable systems using Y-chart design space exploration, and we implement a simulator according to the methodology. The methodology generates a mapping set between tasks and hardware elements using the hardware element model and the application model. We evaluate the throughput by simulating the cases in each mapping set. With the throughput evaluation result, we can select the mapping case with the highest throughput. We also propose a heuristic that shortens simulation time by reducing the mapping set on the basis of the relationship between workload and parallelism. Simulation results show that we can reduce the size of the mapping set, which poses difficulties for hardware-software partitioning, by up to 80%.
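
The select-the-best-mapping step can be sketched as an exhaustive enumeration over task-to-element assignments. Everything in this example is hypothetical: the task and element names, the `COST` table, and the toy throughput model stand in for the paper's simulator, and the pruning heuristic is not reproduced.

```python
from itertools import product

# Hypothetical execution cost of each task on each hardware element.
COST = {
    ("fft", "cpu"): 8, ("fft", "fpga"): 2,
    ("fir", "cpu"): 4, ("fir", "fpga"): 3,
}


def toy_throughput(mapping):
    """Toy model: throughput is limited by the most heavily loaded element."""
    load = {}
    for task, elem in mapping.items():
        load[elem] = load.get(elem, 0) + COST[(task, elem)]
    return 1 / max(load.values())


def best_mapping(tasks, elements, throughput):
    """Evaluate every task-to-element mapping and return the best one with its score."""
    best, best_score = None, None
    for assignment in product(elements, repeat=len(tasks)):
        mapping = dict(zip(tasks, assignment))
        score = throughput(mapping)
        if best_score is None or score > best_score:
            best, best_score = mapping, score
    return best, best_score
```

The mapping set has `len(elements) ** len(tasks)` entries, so it grows quickly with the number of tasks; this is the growth that motivates reducing the set before simulation.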

An Exploratory Study on Policy Decision Making with Artificial Intelligence: Applying Problem Structuring Typology on Success and Failure Cases (인공지능을 활용한 정책의사결정에 관한 탐색적 연구: 문제구조화 유형으로 살펴 본 성공과 실패 사례 분석)

  • Eun, Jong-Hwan;Hwang, Sung-Soo
    • Informatization Policy / v.27 no.4 / pp.47-66 / 2020
  • The rapid development of artificial intelligence technologies such as machine learning and deep learning is expanding their impact in the public administration and public policy sphere. This paper is an exploratory study on policy decision-making in the age of artificial intelligence, aimed at designing automated configuration and operation through data analysis and algorithm development. The theoretical framework was composed of the types of policy problems according to the degree of problem structuring, and success and failure cases were classified and analyzed to derive implications. In short, the more difficult the problem structuring, the greater the possibility of failure or side effects of decision-making using artificial intelligence. Concerns about the neutrality of the algorithm were also presented. As a policy suggestion, a subcommittee was proposed in which experts in technical and social aspects play a professional role in establishing the AI promotion system in Korea. Although the subcommittee works independently, the paper suggests that governance is needed in which the results of its activities can be synthesized and integrated.

Task Balancing Scheme of MPI Gridding for Large-scale LiDAR Data Interpolation (대용량 LiDAR 데이터 보간을 위한 MPI 격자처리 과정의 작업량 발란싱 기법)

  • Kim, Seon-Young;Lee, Hee-Zin;Park, Seung-Kyu;Oh, Sang-Yoon
    • Journal of the Korea Society of Computer and Information / v.19 no.9 / pp.1-10 / 2014
  • In this paper, we propose an MPI gridding algorithm for LiDAR data that minimizes communication between cores. LiDAR data collected from aircraft is 3D spatial information that is used in various applications. Because LiDAR data often has a higher resolution than actually required, or includes non-surface information, the raw data must be filtered. To use the filtered data, it is reconstructed by interpolation, using a data structure that supports searching adjacent locations. Since the processing time of LiDAR data is directly proportional to its size, there have been many studies on high-performance parallel processing systems using MPI. However, previously proposed parallel methods can suffer performance degradation from imbalanced data sizes among cores or from communication overhead for resolving boundary condition inconsistencies. We conduct empirical experiments to verify the effectiveness of the proposed algorithm. The results show that the total execution time of the proposed method is up to 4.2 times shorter than that of the conventional method on heterogeneous clusters.
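
The imbalance problem can be illustrated without MPI. The sketch below bins points into grid cells and then assigns cells to workers greedily, largest first, so that point counts stay close. It is a generic balancing heuristic, not the authors' scheme, and the function names and the `(x, y, z)` tuple layout are assumptions.

```python
import heapq
from collections import Counter


def grid_cell_counts(points, cell_size):
    """Count points per grid cell; each point is an (x, y, z) tuple."""
    return Counter(
        (int(x // cell_size), int(y // cell_size)) for x, y, _ in points
    )


def balance_cells(cell_counts, n_workers):
    """Assign cells to workers: largest cells first, each to the least-loaded worker."""
    heap = [(0, w) for w in range(n_workers)]  # (current load, worker id)
    heapq.heapify(heap)
    assignment = {w: [] for w in range(n_workers)}
    ordered = sorted(cell_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    for cell, count in ordered:
        load, w = heapq.heappop(heap)
        assignment[w].append(cell)
        heapq.heappush(heap, (load + count, w))
    loads = {
        w: sum(cell_counts[c] for c in cells)
        for w, cells in assignment.items()
    }
    return assignment, loads
```

Splitting the area into equal-sized regions instead would give each worker the same number of cells but not the same number of points, which is the imbalance the abstract describes.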

Serialized Multitasking Code Generation from Dataflow Specification (데이타 플로우 명세로부터 직렬화된 멀티태스킹 코드 생성)

  • Kwon, Seong-Nam;Ha, Soon-Hoi
    • Journal of KIISE:Computer Systems and Theory / v.35 no.9_10 / pp.429-440 / 2008
  • As embedded systems become more complex, software development becomes more important in the entire design process. Most embedded applications consist of multiple tasks that are executed in parallel. A dataflow model, which expresses concurrency naturally, is therefore preferred over a sequential programming language for developing multitask software. To execute multitasking code, an operating system (OS) is normally essential to schedule the tasks and to handle communication between them. However, multitasking code must run without an OS when the target hardware platform cannot run one, or when the target platforms are candidates in a design space exploration (DSE), because porting an OS to every candidate platform is very costly. For this reason, we propose a technique that generates serialized multitasking code from a dataflow specification. In the proposed technique, a task is specified with a dataflow model and generated as C code. Code generation consists of two steps. First, each block in a task is generated as a separate function. Second, the generated functions are scheduled by a multitasking scheduler that is also generated automatically. To make it easy to write a customized scheduler manually, the data structure and information of each task are defined. A preliminary experiment with a DivX player confirms that the code generated by the proposed framework executes efficiently and correctly on the target system.
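
The idea of running several tasks in one thread by calling per-block functions from a scheduler can be sketched as follows. The paper generates C code; this Python sketch only illustrates the control structure, and the round-robin policy, the function name `run_serialized`, and the task layout are assumptions rather than the generated scheduler itself.

```python
def run_serialized(tasks, iterations):
    """Run all tasks in a single thread without OS threads.

    tasks: dict mapping a task name to its list of block functions
           (hypothetical layout). On each turn, every task executes
           its next block, wrapping around at the end of the list.
    Returns the list of block results in execution order.
    """
    log = []
    positions = {name: 0 for name in tasks}
    for _ in range(iterations):
        for name, blocks in tasks.items():
            block = blocks[positions[name]]
            log.append(block())
            positions[name] = (positions[name] + 1) % len(blocks)
    return log
```

Because each block is an ordinary function that runs to completion, the scheduler needs no context switching, which is what allows such code to run on a platform without an OS.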