• Title/Summary/Keyword: parallel algorithms

GPGPU based Depth Image Enhancement Algorithm (GPGPU 기반의 깊이 영상 화질 개선 기법)

  • Han, Jae-Young;Ko, Jin-Woong;Yoo, Jisang
    • Journal of the Korea Institute of Information and Communication Engineering / v.17 no.12 / pp.2927-2936 / 2013
  • In this paper, we propose a noise reduction and hole removal algorithm to improve the quality of depth images used for creating 3D content. The proposed algorithm uses both the depth image and the corresponding color image. First, an intensity image is generated by converting the RGB color space into the HSI color space. Noise is then removed using the differences in distance and depth between the reference pixel and its neighbor pixels in the depth image, together with the difference in their intensity values in the color image. Next, the proposed hole filling method fills the detected holes using the Euclidean distance and the difference in intensity values between the reference and neighbor pixels in the color image. Finally, we apply a parallel GPGPU structure to the proposed algorithm to speed up processing for real-time applications. The experimental results show that the proposed algorithm performs better than other conventional algorithms. In particular, it is more effective in reducing edge blurring and in removing noise and holes.
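
The weighting described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: the Gaussian weights, the parameter names (`radius`, `sigma_s`, `sigma_d`, `sigma_i`) and the convention that a depth of 0 marks a hole are all assumptions made for illustration. Valid pixels are smoothed with spatial, depth and intensity terms, while holes are filled with spatial and intensity terms only. Each output pixel depends only on the input images, which is what makes this kind of filter a natural fit for one GPU thread per pixel.

```python
import math

def hsi_intensity(rgb):
    """Intensity channel of HSI: the mean of R, G and B for each pixel."""
    return [[(r + g + b) / 3.0 for (r, g, b) in row] for row in rgb]

def filter_depth(depth, intensity, radius=2,
                 sigma_s=2.0, sigma_d=10.0, sigma_i=10.0):
    """Joint-bilateral-style depth filter (illustrative sketch).

    Assumption: depth <= 0 marks a hole. Holes never contribute as
    neighbors, and a hole pixel is filled without the depth term.
    """
    h, w = len(depth), len(depth[0])
    out = [row[:] for row in depth]
    for y in range(h):
        for x in range(w):
            is_hole = depth[y][x] <= 0
            num = den = 0.0
            for ny in range(max(0, y - radius), min(h, y + radius + 1)):
                for nx in range(max(0, x - radius), min(w, x + radius + 1)):
                    d = depth[ny][nx]
                    if d <= 0:
                        continue  # skip holes in the neighborhood
                    # spatial (Euclidean distance) weight
                    wgt = math.exp(-((ny - y) ** 2 + (nx - x) ** 2)
                                   / (2 * sigma_s ** 2))
                    # intensity-difference weight from the color image
                    di = intensity[ny][nx] - intensity[y][x]
                    wgt *= math.exp(-(di ** 2) / (2 * sigma_i ** 2))
                    # depth-difference weight, only when the center is valid
                    if not is_hole:
                        dd = d - depth[y][x]
                        wgt *= math.exp(-(dd ** 2) / (2 * sigma_d ** 2))
                    num += wgt * d
                    den += wgt
            if den > 0:
                out[y][x] = num / den
    return out
```

Because the loop body for one `(y, x)` never reads `out`, the two outer loops could be mapped directly onto a grid of GPU threads.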

Prefetching Policy based on File Access Pattern and Cache Area (파일 접근 패턴과 캐쉬 영역을 고려한 선반입 기법)

  • Lim, Jae-Deok;Hwang-Bo, Jun-Hyeong;Koh, Kwang-Sik;Seo, Dae-Hwa
    • The KIPS Transactions:PartA / v.8A no.4 / pp.447-454 / 2001
  • Various caching and prefetching algorithms have been investigated to identify an effective method for improving the performance of I/O devices. A prefetching algorithm decreases the processing time of a system by reducing the number of disk accesses when an I/O is needed. This paper proposes the AMBA prefetching method, an extended version of the OBA prefetching method. The AMBA method prefetches blocks continuously as long as sufficient disk bandwidth is available, so prefetching is expected to remain efficient even when the data request rate is excessive. To prevent cache pollution, the AMBA method limits the number of data blocks to be prefetched within the cache area. It can be implemented in a user-level file system on the Linux operating system. In particular, the proposed prefetching policy improves system performance by about 30-40% for large files that are accessed sequentially.
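
The idea of prefetching ahead while capping the prefetched blocks can be sketched as follows. This is a simplified illustration, not the AMBA method itself: the class name `BoundedSequentialPrefetcher`, the `prefetch_limit` parameter and the rule "prefetch only after two consecutive block numbers" are assumptions, and disk bandwidth is not modeled.

```python
from collections import OrderedDict

class BoundedSequentialPrefetcher:
    """Read-ahead with a hard cap on prefetched blocks (illustrative)."""

    def __init__(self, prefetch_limit, read_block):
        self.prefetch_limit = prefetch_limit  # max blocks held in the prefetch area
        self.read_block = read_block          # callable: block number -> data
        self.prefetched = OrderedDict()       # block number -> data
        self.last_block = None
        self.disk_reads = 0                   # every access to the backing store
        self.hits = 0                         # requests served from the prefetch area

    def _fetch(self, block):
        self.disk_reads += 1
        return self.read_block(block)

    def read(self, block):
        if block in self.prefetched:
            data = self.prefetched.pop(block)
            self.hits += 1
        else:
            data = self._fetch(block)
        sequential = self.last_block is not None and block == self.last_block + 1
        self.last_block = block
        if sequential:
            # Keep reading ahead, but never beyond the prefetch area,
            # so prefetched data cannot crowd out the rest of the cache.
            nxt = block + 1
            while len(self.prefetched) < self.prefetch_limit:
                if nxt not in self.prefetched:
                    self.prefetched[nxt] = self._fetch(nxt)
                nxt += 1
        else:
            # Pattern broken: drop stale read-ahead blocks.
            self.prefetched.clear()
        return data
```

With a sequential scan, every request after the first two is served from the prefetch area, while the number of prefetched blocks stays at or below `prefetch_limit`.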

Parallel solution of linear systems on the CRAY-2 using multi/micro tasking library (CRAY-2에서 멀티/마이크로 태스킹 라이브러리를 이용한 선형시스템의 병렬해법)

  • Ma, Sang-Back
    • The Transactions of the Korea Information Processing Society / v.4 no.11 / pp.2711-2720 / 1997
  • Multitasking and microtasking on the CRAY machine provide another way to improve computational power. Since the CRAY-2 has 4 processors, a speedup of up to 4 can be achieved with properly designed algorithms. In this paper we present two parallelizations of linear system solution on the CRAY-2 with the multitasking and microtasking libraries. One is LU decomposition of dense matrices, and the other is the iterative solution of large sparse linear systems with the preconditioner proposed by Radicati di Brozolo. In the first case we realized a speedup of 1.3 with 2 processors for a matrix of dimension 600 with multitasking, and in the second case a speedup of around 3 with 4 processors for a matrix of dimension 8192 with microtasking. In the first case the speedup is limited because of the nonuniform vector lengths. In the second case the ILU(0) preconditioner with Radicati's technique seems to realize a reasonably high speedup with 4 processors.
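
The remark about nonuniform vector lengths can be seen in a plain LU decomposition. The sketch below is a textbook Doolittle factorization without pivoting (so it assumes nonzero pivots), not the paper's CRAY-2 code. The returned `vector_lengths` list is added here only to show how the length of the row updates shrinks at every elimination step.

```python
def lu_decompose(a):
    """In-place-style Doolittle LU without pivoting (assumes nonzero pivots).

    Returns (lu, vector_lengths): lu holds the multipliers of L below the
    diagonal and U on and above it; vector_lengths[k] is the length of
    each row update performed at elimination step k.
    """
    n = len(a)
    lu = [row[:] for row in a]
    vector_lengths = []
    for k in range(n - 1):
        vector_lengths.append(n - k - 1)  # shrinks by one every step
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]          # multiplier l_ik
            for j in range(k + 1, n):     # row update of length n - k - 1
                lu[i][j] -= lu[i][k] * lu[k][j]
    return lu, vector_lengths
```

For a matrix of dimension 600 the update length runs from 599 down to 1, so late elimination steps offer little work to spread across vector units or processors.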

Interactive Visualization Technique for Adaptive Mesh Refinement Data Using Hierarchical Data Structures and Graphics Hardware (계층적 자료구조와 그래픽스 하드웨어를 이용한 적응적 메쉬 세분화 데이타의 대화식 가시화)

  • ;Chandrajit Bajaj
    • Journal of KIISE:Computer Systems and Theory / v.31 no.5_6 / pp.360-370 / 2004
  • Adaptive mesh refinement (AMR) is one of the popular computational simulation techniques used in various scientific and engineering fields. Although AMR data is organized in a hierarchical multi-resolution data structure, traditional volume visualization algorithms such as ray-casting and splatting cannot handle this form without converting it to a sophisticated data structure. In this paper, we present a hierarchical multi-resolution splatting technique using k-d trees and octrees for AMR data that is suitable for implementation on the latest consumer PC graphics hardware. We describe a graphical user interface for setting the transfer function and the viewing and rendering parameters interactively. Experimental results obtained on a general-purpose PC equipped with an nVIDIA GeForce3 card are presented to demonstrate that the proposed techniques can render AMR data interactively (over 20 frames per second). Our scheme can easily be applied to parallel rendering of time-varying AMR data.

All-port Broadcasting Algorithms on Wormhole Routed Star Graph Networks (웜홀 라우팅을 지원하는 스타그래프 네트워크에서 전 포트 브로드캐스팅 알고리즘)

  • Kim, Cha-Young;Lee, Sang-Kyu;Lee, Ju-Young
    • Journal of KIISE:Computer Systems and Theory / v.29 no.2 / pp.65-74 / 2002
  • Recently, many researchers have considered star networks as attractive alternatives to the widely used hypercube for interconnection networks in parallel processing systems. One of the fundamental communication problems on star graph networks is broadcasting. In this paper we consider the broadcasting problem in star graph networks using wormhole routing. In a wormhole-routed system, minimizing link contention is more critical for system performance than the distance between two communicating nodes. We use Hamiltonian paths in the star graph to set up link-disjoint communication paths. We present a broadcast algorithm in an n-dimensional star graph of N(=n!) nodes such that the total completion time is no larger than $([\log_n n!]+1)$ steps, where $([\log_n n!]+1)$ is the lower bound. This result is a significant improvement over the previous n-1 step broadcasting algorithm.
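
A short counting argument shows where a logarithmic bound of this shape comes from. Each node of the n-dimensional star graph has n-1 links, so in the all-port model an informed node can inform at most n-1 others per step, and the number of informed nodes grows by at most a factor of n per step. The sketch below computes the smallest number of steps t with n^t >= n!. The function name `min_broadcast_steps` is hypothetical, and the result is only this counting bound; the exact rounding in the abstract's expression may differ from it.

```python
from math import factorial

def min_broadcast_steps(n):
    """Smallest t such that n**t >= n! (counting bound, all-port model).

    Uses exact integer arithmetic to avoid floating-point log errors.
    """
    target = factorial(n)  # number of nodes in the n-dimensional star graph
    informed, steps = 1, 0
    while informed < target:
        informed *= n      # each informed node reaches at most n - 1 new nodes
        steps += 1
    return steps

# Compare the counting bound with the earlier (n - 1)-step algorithm.
for n in range(3, 11):
    print(n, factorial(n), min_broadcast_steps(n), n - 1)
```

The gap between the two columns widens as n grows, which is the improvement the abstract refers to.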

Implementation of Neural Network Accelerator for Rendering Noise Reduction on OpenCL (OpenCL을 이용한 랜더링 노이즈 제거를 위한 뉴럴 네트워크 가속기 구현)

  • Nam, Kihun
    • The Journal of the Convergence on Culture Technology / v.4 no.4 / pp.373-377 / 2018
  • In this paper, we propose an implementation of a neural network accelerator for reducing rendering noise using OpenCL. Among the rendering algorithms, we select ray tracing to ensure high-quality graphics. Ray tracing renders an image with rays, and using fewer rays results in noise. Using more rays produces a higher-quality image but takes longer to compute. To reduce the operation time while using fewer rays, a learning-based filtering algorithm using a neural network was applied. However, it does not always produce an optimal result. In this paper, we present a new approach to matrix multiplication, based on General Matrix Multiplication, for improved performance. As the development environment, we used OpenCL, which is specialized for high-speed parallel processing. The proposed architecture was verified using a Kintex UltraScale XKU6909T-2FDFG1157C FPGA board. Calculating the parameters is about 1.12 times faster than with the Verilog-HDL structure.

A Scalable ECC Processor for Elliptic Curve based Public-Key Cryptosystem (타원곡선 기반 공개키 암호 시스템 구현을 위한 Scalable ECC 프로세서)

  • Choi, Jun-Baek;Shin, Kyung-Wook
    • Journal of the Korea Institute of Information and Communication Engineering / v.25 no.8 / pp.1095-1102 / 2021
  • A scalable ECC architecture that offers a flexible trade-off between performance and hardware complexity is proposed. For architectural scalability, a modular arithmetic unit based on a one-dimensional array of processing elements (PEs) that perform finite field operations on 32-bit words in parallel was implemented, and the number of PEs can be set in the range of 1 to 8 at circuit synthesis. Scalable algorithms for word-based Montgomery multiplication and Montgomery inversion were adopted. The scalable ECC processor (sECCP), implemented in 180-nm CMOS technology, required 100 kGEs and 8.8 kbits of RAM when NPE=1, and 203 kGEs and 12.8 kbits of RAM when NPE=8. The performance of sECCP with NPE=1 and NPE=8 was analyzed to be 110 PSMs/sec and 610 PSMs/sec, respectively, on the P256R elliptic curve when operating at a 100 MHz clock.
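
Word-based Montgomery multiplication, mentioned in this abstract, can be sketched in software as follows. This is the generic word-serial form of the algorithm with 32-bit words, not the sECCP datapath: the function names are hypothetical, Python integers stand in for the word arrays that hardware PEs would process, and the modulus is assumed to be odd with both operands smaller than it.

```python
def _num_words(modulus, word_bits):
    """Number of word_bits-wide words needed to hold the modulus."""
    return -(-modulus.bit_length() // word_bits)

def montgomery_multiply(a, b, modulus, word_bits=32):
    """Return a * b * R^-1 mod modulus, with R = 2**(word_bits * num_words).

    Assumes modulus is odd and 0 <= a, b < modulus.
    """
    mask = (1 << word_bits) - 1
    # n_prime = -modulus^-1 mod 2**word_bits (needs Python 3.8+ for pow)
    n_prime = (-pow(modulus, -1, 1 << word_bits)) & mask
    t = 0
    for i in range(_num_words(modulus, word_bits)):
        a_i = (a >> (i * word_bits)) & mask    # i-th word of a
        t += a_i * b                           # partial product
        m = ((t & mask) * n_prime) & mask      # makes the low word of t vanish
        t = (t + m * modulus) >> word_bits     # exact division by 2**word_bits
    if t >= modulus:                           # single conditional subtraction
        t -= modulus
    return t

def to_montgomery(x, modulus, word_bits=32):
    """Map x to its Montgomery form x * R mod modulus."""
    r = 1 << (word_bits * _num_words(modulus, word_bits))
    return (x * r) % modulus

def from_montgomery(x, modulus, word_bits=32):
    """Map a Montgomery-form value back to the ordinary residue."""
    return montgomery_multiply(x, 1, modulus, word_bits)
```

In hardware, the additions inside the loop are themselves split across words, which is the work an array of PEs can share; the outer loop structure stays the same regardless of how many PEs are instantiated.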

Parallelization of Probabilistic RoadMap for Generating UAV Path on a DTED Map (DTED 맵에서 무인기 경로 생성을 위한 Probabilistic RoadMap 병렬화)

  • Noh, Geemoon;Park, Jihoon;Min, Chanoh;Lee, Daewoo
    • Journal of the Korean Society for Aeronautical & Space Sciences / v.50 no.3 / pp.157-164 / 2022
  • In this paper, we describe how to implement the mountainous terrain, radar, and air defense network for UAV path planning in a 3-D environment, and perform path planning and re-planning using the PRM algorithm, a sampling-based path planning algorithm. In the original PRM algorithm, the check for obstacles between nodes is performed pairwise between nodes and continuously, so the amount of calculation is greatly affected by the number of nodes and the link distance between nodes. To address this, the proposed LineGridMask method simplifies the obstacle check and reduces the calculation time of path planning through parallelization. Finally, a performance comparison with existing PRM algorithms confirmed that computation time was reduced by up to 88% in path planning and up to 94% in re-planning.
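
The pairwise obstacle check that dominates PRM cost can be illustrated with a simple sampled line test over a height grid. This is not the LineGridMask method, whose details the abstract does not give; the function names, the `(x, y, z)` node layout, the `samples_per_cell` parameter and the use of a thread pool are all assumptions. The sketch only shows that each edge check is independent of the others, which is what makes the step parallelizable.

```python
from concurrent.futures import ThreadPoolExecutor

def segment_is_clear(height_map, start, end, samples_per_cell=2):
    """True if the straight segment stays above the terrain at every sample.

    height_map[row][col] is the terrain height; nodes are (x, y, z) with
    x as column index and y as row index (non-negative, inside the grid).
    """
    (x0, y0, z0), (x1, y1, z1) = start, end
    steps = max(1, int(max(abs(x1 - x0), abs(y1 - y0)) * samples_per_cell))
    for s in range(steps + 1):
        t = s / steps
        x = x0 + t * (x1 - x0)
        y = y0 + t * (y1 - y0)
        z = z0 + t * (z1 - z0)
        # nearest grid cell; coordinates are assumed non-negative
        if z <= height_map[int(y + 0.5)][int(x + 0.5)]:
            return False
    return True

def check_edges(height_map, edges, workers=4):
    """Check many candidate edges; each check is independent of the rest."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(
            lambda e: segment_is_clear(height_map, e[0], e[1]), edges))
```

In CPython the thread pool mainly demonstrates the independence of the checks rather than a real speedup; a process pool or a GPU kernel with one thread per edge would be the usual way to exploit it.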

Performance Analysis for Privacy-preserving Data Collection Protocols (개인정보보호를 위한 데이터 수집 프로토콜의 성능 분석)

  • Lee, Jongdeog;Jeong, Myoungin;Yoo, Jincheol
    • Journal of the Korea Institute of Information and Communication Engineering / v.25 no.12 / pp.1904-1913 / 2021
  • With the proliferation of smartphones and the development of IoT technology, it has become possible to collect personal data for public purposes. However, users are afraid of voluntarily providing their private data due to privacy issues. To remedy this problem, three main techniques have been studied: data disturbance, traditional encryption, and homomorphic encryption. In this work, we perform simulations to compare them in terms of accuracy, message length, and computation delay. Experimental results show that the data disturbance method is fast but inaccurate, while the traditional encryption method is accurate but slow. Like traditional encryption algorithms, the homomorphic encryption algorithm is relatively effective in preserving privacy because it allows computation on encrypted data without decryption, but it also requires high computation costs. However, its main cost, arithmetic operations, can be processed in parallel. Also, data analysis using homomorphic encryption requires decryption only once, regardless of the number of data items.
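
The "decrypt only once" property can be illustrated with an additively homomorphic scheme. The abstract does not say which scheme was simulated, so the sketch below uses a toy Paillier cryptosystem as an assumed example; the tiny default primes and the function names are for illustration only and offer no real security.

```python
import math
import random

def keygen(p=1009, q=1013):
    """Toy Paillier key pair from two small primes (insecure, demo only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # valid because the generator is g = n + 1
    return n, (lam, mu)

def encrypt(n, m):
    """Encrypt 0 <= m < n as (1 + n)^m * r^n mod n^2."""
    n2 = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return ((1 + m * n) * pow(r, n, n2)) % n2

def add_encrypted(n, ciphertexts):
    """Multiplying ciphertexts adds the underlying plaintexts."""
    n2 = n * n
    total = 1
    for c in ciphertexts:
        total = (total * c) % n2
    return total

def decrypt(n, private_key, c):
    lam, mu = private_key
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n

# Each user encrypts locally; the collector aggregates and decrypts once.
n, private_key = keygen()
readings = [12, 30, 58]
aggregate = add_encrypted(n, [encrypt(n, v) for v in readings])
print(decrypt(n, private_key, aggregate))
```

The per-user encryptions and the modular multiplications in the aggregation are independent of one another, which is the part of the cost the abstract notes can be processed in parallel.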

Analysis on Contents and Problem solving methods of Fraction Division in Korean Elementary Mathematics Textbooks (우리나라 초등 수학 교과서에 제시된 분수 나눗셈 내용과 해결 방법 분석)

  • Lee, Daehyun
    • Journal of the Korean School Mathematics Society / v.25 no.2 / pp.105-124 / 2022
  • The contents of fraction division in textbooks are important because situations and problem solving methods in textbooks have changed with each revision of the curriculum, and because the contents of textbooks directly affect students' learning. This study therefore analyzed the achievement standards of the curriculum, the formula types and situations, and the process of introducing non-standard and standard algorithms presented in Korean mathematics textbooks. The results are as follows: there was little difference in the achievement standards of the curriculum, but there was a difference in the arrangement of contents by grade in textbooks. The types of formula also differed between textbooks. The situations became more diverse, and recent textbooks have moved toward using the non-standard and the standard algorithm in parallel. In conclusion, I proposed categorizing rather than splitting the types of fraction division, connecting the non-standard and standard algorithms, and preparing methods to pursue generalization and justification according to the common characteristics in the process of introducing the standard algorithm.