• Title/Summary/Keyword: Parallel Computer

Search Result 1,772, Processing Time 0.036 seconds

PARALLEL DYNAMIC OCTAL COMPACT MAPPING

  • Min, Yong-Sik
    • Journal of applied mathematics & informatics
    • /
    • v.3 no.1
    • /
    • pp.35-46
    • /
    • 1996
  • This paper suggests a new coding method for the parallel machine which compresses the data be reducing redundancy. Paral-lel Dynamic octal Compact Mapping (PDOCM) compresses at least 1 byte per word compared with other coding techniques and achieves a 54. 188-fold speedup with 64 processors to transmit 10 million charac-ters.

J2.5dPathway: A 2.5D Visualization Tool to Display Selected Nodes in Biological Pathways, in Parallel Planes

  • Ham, Sung-Il;Song, Eun-Ha;Yang, San-Duk;Thong, Chin-Ting;Rhie, Arang;Galbadrakh, Bulgan;Lee, Kyung-Eun;Park, Hyun-Seok;Lee, San-Ho
    • Genomics & Informatics
    • /
    • v.7 no.3
    • /
    • pp.171-174
    • /
    • 2009
  • The characteristics of metabolic pathways make them particularly amenable to layered graph drawing methods. This paper presents a visual Java-based tool for drawing and annotating biological pathways in two- and a-half dimensions (2.5D) as an alternative to three-dimensional (3D) visualizations. Such visualization allows user to display different groups of clustered nodes, in different parallel planes, and to see a detailed view of a group of objects in focus and its place in the context of the whole system. This tool is an extended version of J2dPathway.

A VISION SYSTEM IN ROBOTIC WELDING

  • Absi Alfaro, S. C.
    • Proceedings of the KWS Conference
    • /
    • 2002.10a
    • /
    • pp.314-319
    • /
    • 2002
  • The Automation and Control Group at the University of Brasilia is developing an automatic welding station based on an industrial robot and a controllable welding machine. Several techniques were applied in order to improve the quality of the welding joints. This paper deals with the implementation of a laser-based computer vision system to guide the robotic manipulator during the welding process. Currently the robot is taught to follow a prescribed trajectory which is recorded a repeated over and over relying on the repeatability specification from the robot manufacturer. The objective of the computer vision system is monitoring the actual trajectory followed by the welding torch and to evaluate deviations from the desired trajectory. The position errors then being transfer to a control algorithm in order to actuate the robotic manipulator and cancel the trajectory errors. The computer vision systems consists of a CCD camera attached to the welding torch, a laser emitting diode circuit, a PC computer-based frame grabber card, and a computer vision algorithm. The laser circuit establishes a sharp luminous reference line which images are captured through the video camera. The raw image data is then digitized and stored in the frame grabber card for further processing using specifically written algorithms. These image-processing algorithms give the actual welding path, the relative position between the pieces and the required corrections. Two case studies are considered: the first is the joining of two flat metal pieces; and the second is concerned with joining a cylindrical-shape piece to a flat surface. An implementation of this computer vision system using parallel computer processing is being studied.

  • PDF

A Sclable Parallel Labeling Algorithm on Mesh Connected SIMD Computers (메쉬 구조형 SIMD 컴퓨터 상에서 신축적인 병렬 레이블링 알고리즘)

  • 박은진;이갑섭성효경최흥문
    • Proceedings of the IEEK Conference
    • /
    • 1998.10a
    • /
    • pp.731-734
    • /
    • 1998
  • A scalable parallel algorithm is proposed for efficient image component labeling with local operatos on a mesh connected SIMD computer. In contrast to the conventional parallel labeling algorithms, where a single pixel is assigned to each PE, the algorithm presented here is scalable and can assign m$\times$m pixel set to each PE according to the input image size. The assigned pixel set is converted to a single pixel that has representative value, and the amount of the required memory and processing time can be highly reduced. For N$\times$N image, if m$\times$m pixel set is assigned to each PE of P$\times$P mesh, where P=N/m, the time complexity due to the communication of each PE and the computation complexity are reduced to O(PlogP) bit operations and O(P) bit operations, respectively, which is 1/m of each of the conventional method. This method also diminishes the amount of memory in each PE to O(P), and can decrease the number of PE to O(P2) =Θ(N2/m2) as compared to O(N2) of conventional method. Because the proposed parallel labeling algorithm is scalable, we can adapt to the increase of image size without the hardware change of the given mesh connected SIMD computer.

  • PDF

A Comparative Performance Study for Compute Node Sharing

  • Park, Jeho;Lam, Shui F.
    • Journal of Computing Science and Engineering
    • /
    • v.6 no.4
    • /
    • pp.287-293
    • /
    • 2012
  • We introduce a methodology for the study of the application-level performance of time-sharing parallel jobs on a set of compute nodes in high performance clusters and report our findings. We assume that parallel jobs arriving at a cluster need to share a set of nodes with the jobs of other users, in that they must compete for processor time in a time-sharing manner and other limited resources such as memory and I/O in a space-sharing manner. Under the assumption, we developed a methodology to simulate job arrivals to a set of compute nodes, and gather and process performance data to calculate the percentage slowdown of parallel jobs. Our goal through this study is to identify a better combination of jobs that minimize performance degradations due to resource sharing and contention. Through our experiments, we found a couple of interesting behaviors for overlapped parallel jobs, which may be used to suggest alternative job allocation schemes aiming to reduce slowdowns that will inevitably result due to resource sharing on a high performance computing cluster. We suggest three job allocation strategies based on our empirical results and propose further studies of the results using a supercomputing facility at the San Diego Supercomputing Center.

Performance Analysis of the XMESH Topology for the Massively Parallel Computer Architecture (대규모 병렬컴퓨터를 위한 교차메쉬구조 및 그의 성능해석)

  • 김종진;최흥문
    • Journal of the Korean Institute of Telematics and Electronics B
    • /
    • v.32B no.5
    • /
    • pp.720-729
    • /
    • 1995
  • We proposed a XMESH(crossed-mesh) topology as a suitable interconnection for the massively parallel computer architectures, and presented performance analysis of the proposed interconnection topology. Horizontally, the XMESH has the same links as those of the toroidal mesh(TMESH) or toroid, but vertically, it has diagonal cross links instead of the vertical links. It reveals desirable interconnection characteristics for the massively parallel computers as the number of nodes increases, while retaining the same structural advantages of the TMESH such as the symmetric structure, periodic placement of subsystems, and constant degree, which are highly recommended features for VLSI/WSI implementations. Furthermore, n*k XMESH can be easily expanded without increasing the diameter as long as n.leq.k.leq.n+4. Analytical performance evaluations show that the XMESH has a shorter diameter, a shorter mean internode distance, and a higher message completion rate than the TMESH or the diagonal mesh(DMESH). To confirm these results, an optimal self-routing algorithm for the proposed topology is developed and is used to simulate the average delay, the maximum delay, and the throughput in the presence of contention. In all cases, the XMESH is shown to outperform the TMESH and the DMESH regardless of the communication load conditions or the number of nodes of the networks, and can provide an attractive alternative to those networks in implementing massively parallel computers.

  • PDF

Parallel Model Feature Extraction to Improve Performance of a BCI System (BCI 시스템의 성능 개선을 위한 병렬 모델 특징 추출)

  • Chum, Pharino;Park, Seung-Min;Sim, Kwee-Bo
    • Journal of Institute of Control, Robotics and Systems
    • /
    • v.19 no.11
    • /
    • pp.1022-1028
    • /
    • 2013
  • It is well knowns that based on the CSP (Common Spatial Pattern) algorithm, the linear projection of an EEG (Electroencephalography) signal can be made to spaces that optimize the discriminant between two patterns. Sharing disadvantages from linear time invariant systems, CSP suffers from the non-stationary nature of EEGs causing the performance of the classification in a BCI (Brain-Computer Interface) system to drop significantly when comparing the training data and test data. The author has suggested a simple idea based on the parallel model of CSP filters to improve the performance of BCI systems. The model was tested with a simple CSP algorithm (without any elaborate regularizing methods) and a perceptron learning algorithm as a classifier to determine the improvement of the system. The simulation showed that the parallel model could improve classification performance by over 10% compared to conventional CSP methods.

A Reconfigurable Parallel Processor for Efficient Processing of Mobile Multimedia (모바일 멀티미디어의 효율적 처리를 위한 재구성형 병렬 프로세서의 구조)

  • Yoo, Se-Hoon;Kim, Ki-Chul;Yang, Yil-Suk;Roh, Tae-Moon
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.44 no.10
    • /
    • pp.23-32
    • /
    • 2007
  • This paper proposes a reconfigurable parallel processor architecture which can efficiently implement various multimedia applications, such as 3D graphics, H.264/H.263/MPEG-4, JPEG/JPEG2000, and MP3. The proposed architecture directly connects memories and processors so that memory access time and power consumption are reduced. It supports floating-point operations needed in the geometry stage of 3D graphics. It adopts partitioned SIMD to reduce hardware costs. Conditional execution of instructions is used for easy development of parallel algorithms.

An Efficient Face Detection Method using Skin Color Information and Parallel Processing in Multi-Core SoC (멀티코어 SoC에서 피부색상 정보와 병렬처리를 이용한 효율적인 얼굴 검출 방법)

  • Kim, Hong-Hee;Lee, Jae-Heung
    • Journal of IKEEE
    • /
    • v.16 no.4
    • /
    • pp.375-381
    • /
    • 2012
  • In this paper, we present an implementation of Viola-Jones algorithm in a multi-core SoC by using skin color information and a parallel processing method. In order to reduce unnecessary operations and improve the detection speed, we adopted a face detection algorithm based on skin color and deleted background image. The algorithm is functionally divided into several parts taking account of the size and the dependency so that the divided functions can be proceeded in parallel. Experiment results in SoC with built-in Cortex-A9 multi core show that it is about 1.8 times faster than the existing algorithm which is not divided.

Simulation of Deformable Objects using GLSL 4.3

  • Sung, Nak-Jun;Hong, Min;Lee, Seung-Hyun;Choi, Yoo-Joo
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.11 no.8
    • /
    • pp.4120-4132
    • /
    • 2017
  • In this research, we implement a deformable object simulation system using OpenGL's shader language, GLSL4.3. Deformable object simulation is implemented by using volumetric mass-spring system suitable for real-time simulation among the methods of deformable object simulation. The compute shader in GLSL 4.3 which helps to access the GPU resources, is used to parallelize the operations of existing deformable object simulation systems. The proposed system is implemented using a compute shader for parallel processing and it includes a bounding box-based collision detection solution. In general, the collision detection is one of severe computing bottlenecks in simulation of multiple deformable objects. In order to validate an efficiency of the system, we performed the experiments using the 3D volumetric objects. We compared the performance of multiple deformable object simulations between CPU and GPU to analyze the effectiveness of parallel processing using GLSL. Moreover, we measured the computation time of bounding box-based collision detection to show that collision detection can be processed in real-time. The experiments using 3D volumetric models with 10K faces showed the GPU-based parallel simulation improves performance by 98% over the CPU-based simulation, and the overall steps including collision detection and rendering could be processed in real-time frame rate of 218.11 FPS.