• Title/Summary/Keyword: Parallel computer architecture

The Methods of Promoting Greenness and the Target Levels of Greenness in Streetscape Suggested by Computer Simulation - The Case of Seoul - (경관 시뮬레이션을 통한 가로 녹시율 증진방안 및 목표수준 설정 - 서울시를 사례로 -)

  • Cho Yong-Hyeon
    • Journal of the Korean Institute of Landscape Architecture / v.34 no.2 s.115 / pp.26-35 / 2006
  • The purpose of this research is to suggest planting methods and reasonable target levels of IGS for promoting green streetscapes in Seoul. Using three-dimensional computer simulations, various greening methods were applied to evaluate their effectiveness in promoting green streetscapes. The results suggest that promoting tree planting along car lanes is more effective than along pedestrian sidewalks. On wide streets, tree height has a positive effect on promoting green streetscapes. On both car lanes and pedestrian sidewalks, tree planting in a zig-zag pattern and in a parallel pattern produced the highest greening effects, at similar levels. The width of the strip in side-strip planting has a positive effect on promoting green streetscapes. Stratified planting is very effective. Promoting green walls along pedestrian sidewalks is more effective than along car lanes. Combining the results of the IGS survey of public officials with the complex simulations suggests that the optimal level of IGS ranges from 12.0% in alleys to 54.0% in the car lanes of arterial roads.

Modular Multiplier based on Cellular Automata Over $GF(2^m)$ (셀룰라 오토마타를 이용한 $GF(2^m)$ 상의 곱셈기)

  • 이형목;김현성;전준철;유기영
    • Journal of KIISE:Computer Systems and Theory / v.31 no.1_2 / pp.112-117 / 2004
  • In this paper, we propose a multiplication architecture suited to cellular automata in the finite field $GF(2^m)$. The proposed least-significant-bit-first multiplier is based on an irreducible all-one polynomial and has a latency of $(m+1)$ and a critical path of one AND-gate delay plus one XOR-gate delay ($D_{AND}+D_{XOR}$). In particular, it is efficient for VLSI implementation and, because it is a parallel structure with regularity and modularity, has potential as a basic architecture for division, exponentiation, and inversion. Moreover, our architecture can be used as a basic architecture for well-known public-key information services in $GF(2^m)$ such as the Diffie-Hellman key exchange protocol, the Digital Signature Algorithm, and the ElGamal cryptosystem.
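
The arithmetic that an all-one-polynomial (AOP) multiplier performs can be sketched in software. This is not the authors' cellular-automata circuit; it is a minimal sketch that assumes the standard AOP identity $x^{m+1} = 1$ modulo $p(x) = x^m + x^{m-1} + \dots + 1$, so that multiplying by $x$ becomes a cyclic shift over $m+1$ bits. The function name `aop_multiply` and the bit-packed integer representation (bit $i$ holds the coefficient of $x^i$) are illustrative assumptions.

```python
def aop_multiply(a: int, b: int, m: int) -> int:
    """Multiply a and b in GF(2^m) defined by the all-one polynomial of degree m.

    Assumes the degree-m all-one polynomial is irreducible (true for m = 2, 4, 10, ...).
    Operands and result are bit-packed: bit i is the coefficient of x^i.
    """
    n = m + 1                    # work with m+1 coefficients, where x^(m+1) = 1
    mask = (1 << n) - 1          # m+1 one-bits; also the bit pattern of p(x)
    acc = 0
    shifted = a                  # holds a * x^i in the (m+1)-bit cyclic ring
    for i in range(n):           # least significant bit of b first
        if (b >> i) & 1:
            acc ^= shifted       # addition in GF(2) is XOR
        # multiply by x: rotate left by one position over m+1 bits
        shifted = ((shifted << 1) & mask) | (shifted >> m)
    # final reduction: x^m = x^(m-1) + ... + 1, i.e. add p(x) if bit m is set
    if (acc >> m) & 1:
        acc ^= mask
    return acc


# x^3 * x = x^4 = x^3 + x^2 + x + 1 in GF(2^4) with p(x) = x^4 + x^3 + x^2 + x + 1
print(bin(aop_multiply(0b1000, 0b0010, 4)))  # 0b1111
```

The loop runs over $m+1$ bit positions and each step is one conditional XOR plus a rotation, which is the kind of regular, repeated structure that maps naturally onto identical hardware cells.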

Design of a Parallel Multiplier for Irreducible Polynomials with All Non-zero Coefficients over GF($p^m$) (GF($p^m$)상에서 모든 항의 계수가 0이 아닌 기약다항식에 대한 병렬 승산기의 설계)

  • Park, Seung-Yong;Hwang, Jong-Hak;Kim, Heung-Soo
    • Journal of the Institute of Electronics Engineers of Korea SC / v.39 no.4 / pp.36-42 / 2002
  • In this paper, we propose a multiplication algorithm for two polynomials over the finite field GF($p^m$) defined by an irreducible polynomial with all non-zero coefficients. Using the proposed algorithm, we construct a multiplier with a modular architecture and parallel input and output. The proposed multiplier is composed of $(m+1)^2$ identical cells, each consisting of one mod(p) addition gate and one mod(p) multiplication gate. It requires no clock; its delay is that of one mod(p) multiplication gate plus m mod(p) addition gates. Because the architecture is regular and modular, it is well suited to VLSI implementation.
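
For readers unfamiliar with GF($p^m$) arithmetic, the operation such a multiplier computes can be sketched as a schoolbook polynomial product followed by reduction modulo the field polynomial. This is a generic reference computation, not the paper's cell array or its algorithm. The function name `gf_pm_multiply`, the low-degree-first coefficient lists, and the example polynomial $x^2 + x + 2$ over GF(3) (irreducible, with all coefficients non-zero) are assumptions chosen for illustration.

```python
def gf_pm_multiply(a, b, modulus, p):
    """Multiply two GF(p^m) elements given as coefficient lists (lowest degree first).

    modulus is the monic irreducible polynomial of degree m, also lowest degree
    first, so len(modulus) == m + 1 and modulus[-1] == 1. a and b have length m.
    """
    m = len(modulus) - 1
    # schoolbook product: every partial product uses one mod-p multiply and one mod-p add
    prod = [0] * (2 * m - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] = (prod[i + j] + ai * bj) % p
    # reduce from the highest degree down by subtracting multiples of the modulus
    for k in range(len(prod) - 1, m - 1, -1):
        c = prod[k]
        if c:
            for t in range(m + 1):
                prod[k - m + t] = (prod[k - m + t] - c * modulus[t]) % p
    return prod[:m]


# GF(3^2) with modulus x^2 + x + 2: x * x = x^2 = 2x + 1
print(gf_pm_multiply([0, 1], [0, 1], [2, 1, 1], 3))  # [1, 2]
```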

The Effect of Mesh Reordering on Laplacian Smoothing for Nonuniform Memory Access Architecture-based High Performance Computing Systems (NUMA구조를 가진 고성능 컴퓨팅 시스템에서의 메쉬 재배열의 라플라시안 스무딩에 대한 효과)

  • Kim, Jibum
    • Journal of the Institute of Electronics and Information Engineers / v.51 no.3 / pp.82-88 / 2014
  • We study the effect of mesh reordering on Laplacian smoothing for parallel high performance computing systems. Specifically, we use the Reverse Cuthill-McKee algorithm to reorder meshes and use Laplacian smoothing to improve mesh quality on nonuniform memory access (NUMA) architecture-based parallel high performance computing systems. First, we investigate the effect of mesh reordering on Laplacian smoothing for a single-core system, and then extend the idea to NUMA-based high performance computing systems.
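
The two ingredients named in this abstract can be sketched in a few lines. The sketch assumes a 2D mesh given as vertex coordinates plus neighbor lists, a basic breadth-first Reverse Cuthill-McKee ordering, and a simultaneous-update Laplacian smoothing step that moves each free vertex to the average of its neighbors. The function names, the `fixed` set for boundary vertices, and the `bandwidth` helper are illustrative assumptions, not the paper's implementation.

```python
from collections import deque


def reverse_cuthill_mckee(adj):
    """Return a vertex ordering (list of old ids) that tends to reduce bandwidth."""
    n = len(adj)
    visited = [False] * n
    order = []
    # start each connected component from a low-degree vertex
    for start in sorted(range(n), key=lambda v: len(adj[v])):
        if visited[start]:
            continue
        visited[start] = True
        queue = deque([start])
        while queue:
            v = queue.popleft()
            order.append(v)
            # visit unvisited neighbors in order of increasing degree
            for w in sorted(adj[v], key=lambda u: len(adj[u])):
                if not visited[w]:
                    visited[w] = True
                    queue.append(w)
    order.reverse()  # the "reverse" in Reverse Cuthill-McKee
    return order


def bandwidth(adj, order):
    """Largest index distance between neighboring vertices under an ordering."""
    pos = {v: i for i, v in enumerate(order)}
    return max((abs(pos[v] - pos[w]) for v in range(len(adj)) for w in adj[v]), default=0)


def laplacian_smooth(coords, adj, fixed, iterations=1):
    """Move every non-fixed vertex to the mean position of its neighbors."""
    pts = list(coords)
    for _ in range(iterations):
        new_pts = list(pts)
        for v, nbrs in enumerate(adj):
            if v in fixed or not nbrs:
                continue
            new_pts[v] = (sum(pts[u][0] for u in nbrs) / len(nbrs),
                          sum(pts[u][1] for u in nbrs) / len(nbrs))
        pts = new_pts
    return pts
```

A smaller bandwidth means that the neighbors read while smoothing a vertex sit closer together in memory, which is the locality effect that reordering is meant to exploit.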

A Processor Architecture with Effective Memory System for Sort-Last Parallel Rendering (Sort-Last 병렬 렌더링을 위한 효과적인 메모리 프로세서 구조)

  • Yoon Duk-Ki;Kim Kyoung-So;Lee Kyung-Ho;Park Wo-Chan
    • Proceedings of the Korea Information Processing Society Conference / 2006.05a / pp.1363-1366 / 2006
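  • In this paper, we propose a parallel rendering processor that allows each graphics accelerator to use a pixel cache while increasing performance and solving the consistency problem. The proposed architecture reduces the latency caused by pixel cache misses. To achieve these two results, an effective memory structure is incorporated into the current new pixel cache architecture. Experimental results show that the proposed architecture achieves nearly linear speedup with 16 or more rasterizers.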

Retina-Motivated CMOS Vision Chip Based on Column Parallel Architecture and Switch-Selective Resistive Network

  • Kong, Jae-Sung;Hyun, Hyo-Young;Seo, Sang-Ho;Shin, Jang-Kyoo
    • ETRI Journal / v.30 no.6 / pp.783-789 / 2008
  • A bio-inspired vision chip for edge detection was fabricated using 0.35 $\mu$m double-poly four-metal complementary metal-oxide-semiconductor technology. It mimics the edge detection mechanism of a biological retina. This type of vision chip offers several advantages, including compact size, high speed, and dense system integration. Low resolution and relatively high power consumption are common limitations of these chips because of their complex circuit structure. We have tried to overcome these problems by rearranging and simplifying their circuits. A vision chip of $160 \times 120$ pixels has been fabricated on a $5 \times 5$ mm$^2$ silicon die. It consumes less than 10 mW of power.

A Cost-Effective Hardware Image Compositor for Sort-Last Parallel Visualization Clusters (후정렬 병렬 가시화 클러스터를 위한 저비용의 하드웨어 영상 합성기)

  • Taropa Emanuel;Lee Won-Jong;Srini Vason P.;Han Tack-Don
    • Proceedings of the Korean Information Science Society Conference / 2005.07a / pp.712-714 / 2005
  • Real-time 3D visualization of large datasets requires a distributed rendering architecture and dedicated hardware for image composition. Previous work in this domain has relied on prohibitively expensive cluster systems in which hardware composition is done by complicated schemes. In this paper, we propose a low-cost hardware compositor for a high performance visualization cluster. We present the system's design and the results obtained using Simulink [1] for our image composition scheme.

Performance Analysis of DNN inference using OpenCV Built in CPU and GPU Functions (OpenCV 내장 CPU 및 GPU 함수를 이용한 DNN 추론 시간 복잡도 분석)

  • Park, Chun-Su
    • Journal of the Semiconductor & Display Technology / v.21 no.1 / pp.75-78 / 2022
  • Deep neural networks (DNNs) have become an essential data processing architecture for implementing many computer vision tasks. Recently, DNN-based algorithms have achieved much higher recognition accuracy than traditional algorithms based on shallow learning. However, training DNNs and running inference with them require far more computational capability than everyday computer use. Moreover, as DNNs grow in size and depth, CPUs may be inadequate because they process serially by default. GPUs offer greater speed than CPUs because of their parallel processing nature. In this paper, we analyze the inference time complexity of DNNs using the well-known computer vision library OpenCV. We measure and analyze inference time complexity for three cases: CPU, GPU-Float32, and GPU-Float16.
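
A measurement of this kind might be set up roughly as follows. This is a hedged sketch rather than the author's benchmark: it assumes an OpenCV build with CUDA support for the two GPU cases, and the helper names `configure_net` and `mean_inference_time`, the mode strings, and the warm-up and run counts are illustrative assumptions. Only the `cv2.dnn` backend and target constants and the `setPreferableBackend` / `setPreferableTarget` calls come from OpenCV itself.

```python
import time


def configure_net(net, mode):
    """Select one of the three cases on an existing cv2.dnn network object."""
    import cv2  # imported lazily so the timing helper also works without OpenCV

    if mode == "cpu":
        net.setPreferableBackend(cv2.dnn.DNN_BACKEND_OPENCV)
        net.setPreferableTarget(cv2.dnn.DNN_TARGET_CPU)
    elif mode == "gpu-fp32":
        net.setPreferableBackend(cv2.dnn.DNN_BACKEND_CUDA)
        net.setPreferableTarget(cv2.dnn.DNN_TARGET_CUDA)
    elif mode == "gpu-fp16":
        net.setPreferableBackend(cv2.dnn.DNN_BACKEND_CUDA)
        net.setPreferableTarget(cv2.dnn.DNN_TARGET_CUDA_FP16)
    else:
        raise ValueError(f"unknown mode: {mode}")


def mean_inference_time(forward, warmup=3, runs=10):
    """Average wall-clock seconds per call of forward(), after warm-up calls."""
    for _ in range(warmup):      # warm-up absorbs one-time initialization costs
        forward()
    start = time.perf_counter()
    for _ in range(runs):
        forward()
    return (time.perf_counter() - start) / runs


# Hypothetical usage (model file and input image are placeholders):
#   net = cv2.dnn.readNet("model.onnx")
#   net.setInput(cv2.dnn.blobFromImage(image, size=(224, 224)))
#   for mode in ("cpu", "gpu-fp32", "gpu-fp16"):
#       configure_net(net, mode)
#       print(mode, mean_inference_time(net.forward))
```

Warm-up runs are kept separate because the first forward pass on a GPU target typically includes initialization work that would otherwise distort the average.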

FUZZY HYPERCUBES: A New Inference Machines

  • Kang, Hoon
    • Journal of the Korean Institute of Intelligent Systems / v.2 no.2 / pp.34-41 / 1992
  • A robust and reliable learning and reasoning mechanism based on fuzzy set theory and fuzzy associative memories is presented. The mechanism stores an initial knowledge base a priori via approximate learning and uses this information in decision-making systems via fuzzy inferencing. We call this fuzzy computer architecture a 'fuzzy hypercube'; it processes all the rules in parallel in one clock period. Fuzzy hypercubes can be applied to the control of a class of complex and highly nonlinear systems that suffer from vagueness and uncertainty. Moreover, evidential aspects of a fuzzy hypercube are treated to assess the degree of certainty or reliability together with parameter sensitivity.

Architecture of General and Intelligent Parallel Processing System (범용성과 지능성을 갖는 병렬 처리기 구조)

  • Lee, Hyung;Choi, Sung-Hyuk;Kim, Jung-Bae;Park, Jong-Won
    • Proceedings of the Korea Information Processing Society Conference / 2000.10a / pp.601-604 / 2000
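  • In this paper, we propose a parallel processing system with a Semi-MIMD architecture to improve the efficiency of the SIMD parallel processing system based on Park's multi-access memory, which was proposed for processing massive amounts of image data in real time.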