• Title/Summary/Keyword: parallel machines


Design of an Image Processing ASIC Architecture using Parallel Approach with Zero or Little (통신부담을 감소시킨 영상처리를 위한 병렬처리 방식 ASIC구조 설계)

  • 안병덕;정지원;선우명훈
    • The Journal of Korean Institute of Communications and Information Sciences / v.19 no.10 / pp.2043-2052 / 1994
  • This paper proposes a new parallel ASIC architecture for real-time image processing, called the Sliding Memory Plane (SliM) Image Processor, that reduces inter-processing-element (inter-PE) communication overhead. The SliM Image Processor consists of $3\times3$ processing elements (PEs) connected in a mesh topology. Because the topology scales easily, a set of SliM Image Processors can form a mesh-connected SIMD parallel architecture called the SliM Array Processor. Sliding means that all pixels are slid into all neighboring PEs without interrupting the PEs and without a coprocessor or a DMA controller. Since inter-PE communication and computation occur simultaneously, the inter-PE communication overhead, a significant disadvantage of existing machines, greatly diminishes. Two I/O planes provide buffering and reduce the data I/O overhead. In addition, a by-passing path provides eight-way connectivity even with only four links. With these salient features, SliM shows a significant performance improvement. This paper presents the architectures of a PE and of the SliM Image Processor and describes the design of an instruction set.


Development of a Parallel Cell-Based DSMC Method Using Unstructured Meshes (비정렬격자에서 병렬화된 격자중심 직접모사 기법 개발)

  • Kim, Hyeong-Sun;Kim, Min-Gyu;Gwon, O-Jun
    • Journal of the Korean Society for Aeronautical & Space Sciences / v.30 no.2 / pp.1-11 / 2002
  • In the present study, a parallel DSMC technique based on a cell-based data structure is developed for the efficient simulation of rarefied gas flows, especially on PC clusters. Dynamic load balancing is achieved by decomposing the computational domain into several sub-domains and accounting for the number of particles and the number of cells in each sub-domain. A mesh adaptation algorithm is also applied to improve the resolution of the solution and to reduce the grid dependency. It was demonstrated that accurate solutions can be obtained after several levels of mesh adaptation starting from a coarse initial grid. The method was applied to a two-dimensional supersonic leading-edge flow and the axisymmetric Rothe nozzle flow to validate its efficiency. The present method was found to be an effective tool for simulating rarefied gas flows on PC-based parallel machines.
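The load-balancing idea in this abstract (weighting each sub-domain by both its particle count and its cell count) can be illustrated with a minimal sketch. The weights `w_particle` and `w_cell` and the greedy assignment are assumptions made for illustration only; the paper's own decomposition method is not described in the abstract, and a real solver would also need to keep each sub-domain spatially contiguous.

```python
import heapq

def balance_cells(particles_per_cell, n_domains, w_particle=1.0, w_cell=0.5):
    """Greedily assign cells to sub-domains by weighted load.

    The load of a cell is w_particle * (particles in the cell) + w_cell.
    Both weights are hypothetical values chosen for this sketch.
    """
    # Min-heap of (current load, domain id): the lightest domain is popped first.
    heap = [(0.0, d) for d in range(n_domains)]
    heapq.heapify(heap)
    assignment = {}
    # Place the heaviest cells first so the lighter ones can even out the loads.
    order = sorted(range(len(particles_per_cell)),
                   key=lambda c: particles_per_cell[c], reverse=True)
    for cell in order:
        load, dom = heapq.heappop(heap)
        assignment[cell] = dom
        load += w_particle * particles_per_cell[cell] + w_cell
        heapq.heappush(heap, (load, dom))
    loads = {dom: load for load, dom in heap}
    return assignment, loads

if __name__ == "__main__":
    assignment, loads = balance_cells([8, 1, 1, 6, 2, 2], n_domains=2)
    print(assignment, loads)
```

In a dynamic setting the same routine would be rerun whenever particle migration makes the per-domain loads drift apart.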

Transonic/Supersonic Nonlinear Aeroelastic Analysis of a Complete Aircraft Using High Speed Parallel Processing Technique (고속 병렬처리 기법을 이용한 전기체 항공기 형상의 천음속/초음속 비선형 공탄성 해석)

  • Kim, Dong-Hyun;Kwon, Hyuk-Jun;Lee, In;Kwon, Oh-Joon;Paek, Seung-Kil;Hyun, Yong-Hee
    • Journal of the Korean Society for Aeronautical & Space Sciences / v.30 no.8 / pp.46-55 / 2002
  • A nonlinear aeroelastic analysis system for transonic and supersonic flows has been developed using a high-speed parallel processing technique on network-based PC-cluster machines. The system couples advanced numerical techniques such as computational structural dynamics (CSD), the finite element method (FEM) and computational fluid dynamics (CFD). An unsteady Euler solver on dynamic unstructured meshes is employed and coupled with computational aeroelastic solvers, so the system can provide accurate engineering data for the structural and aeroelastic design of flight vehicles. To demonstrate its practical potential, transonic and supersonic flutter analyses have been conducted for a complete aircraft model under development in Korea.

Web access prediction based on parallel deep learning

  • Togtokh, Gantur;Kim, Kyung-Chang
    • Journal of the Korea Society of Computer and Information / v.24 no.11 / pp.51-59 / 2019
  • Due to the exponential growth of access information on the web, the need to predict web users' next access has increased. Various models such as Markov models, deep neural networks, support vector machines, and fuzzy inference models have been proposed for web access prediction. For deep learning based on neural network models, training on large-scale web usage data takes a very long time. To address this problem, deep neural network models are trained in parallel on a cluster of computers. In this paper, we investigate the impact of several important Spark parameters related to data partitions, shuffling, compression, and locality (basic Spark parameters) on training a Multi-Layer Perceptron model on a Spark standalone cluster. Based on this investigation, we tuned the basic Spark parameters and applied the tuned configuration when training a Multi-Layer Perceptron model for web access prediction. Through experiments, we show the accuracy of the proposed web access prediction model. We also show that the tuned basic Spark parameters reduce training time for the Multi-Layer Perceptron model compared with the default Spark configuration.
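As a rough illustration of the four parameter groups named in this abstract (partitions, shuffling, compression, locality), the sketch below collects one standard Spark configuration property per group and turns them into `spark-submit --conf` arguments. The property names are standard Spark settings, but the values, the `TUNED_CONF` name and the `to_submit_args` helper are assumptions for illustration; the abstract does not state which properties or values the authors actually tuned.

```python
# Hypothetical values; only the property names come from Spark's configuration.
TUNED_CONF = {
    "spark.default.parallelism": "48",     # data partitions
    "spark.shuffle.compress": "true",      # shuffling
    "spark.io.compression.codec": "lz4",   # compression
    "spark.locality.wait": "3s",           # locality
}

def to_submit_args(conf):
    """Turn a property dict into a flat list of spark-submit '--conf' arguments."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args

if __name__ == "__main__":
    # The resulting list would be spliced into a spark-submit command line.
    print(" ".join(to_submit_args(TUNED_CONF)))
```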

Servo Drives State of the Art in Industrial Applications - A Survey

  • Kennel R.;Kobs G.;Weber R.
    • Proceedings of the KIPE Conference / 2001.10a / pp.321-325 / 2001
  • Servo drives with microcomputer control make it possible to use modern and sophisticated control algorithms. As an additional feature, parallel and/or redundant software and hardware structures can be implemented to realise safe motion or similar security functions. Unfortunately, microcomputer control also has some impact on the behaviour of servo drives. Control algorithm, cycle time, sensors and interface have to be perfectly synchronised. Special control schemes are necessary on the line side (power supply) to meet current EMC requirements. This contribution presents experiences and results obtained from a modern digital drive system, pointing out the influences of low- and high-accuracy position sensors and the interdependencies mentioned above.


Study on the Reduction Method of Magnetic Noise and Vibration in Home Electric Motors (가전기기용전동기의 전자소음과 진동의 방지대책에 관한 연구)

  • 황영문;조철제
    • 전기의세계 / v.26 no.5 / pp.74-82 / 1977
  • This study presents a method for reducing the noise and vibration of home electric motors coupled to a mechanical load that causes relatively large vibration amplitudes. The noise and vibration factors are analysed in three categories: the pattern related to the armature reaction, the pattern related to the circulating current by induction, and other patterns that are affected by the additive magnetic field and that affect the mechanical constants. From the systematic relations between these patterns and the damping effects, a fundamental measure for reducing noise and vibration can be derived. Vibration measurements and analysis were carried out according to a planned experiment, and the test motor was chosen randomly from the production line of a factory where home electric machines were mass-produced. Based on this fundamental measure, the suppression of noise and vibration was analysed according to the number of slots, the amount of rotor skew, and whether the stator winding was connected in series or in parallel.


Fabrication and Characteristics of 30 MN Strain Gage Type Force Sensor (30 MN 스트레인 게이지 방식 힘 센서의 제작 및 특성)

  • Kang, D.I.;Song, H.K.;Lee, J.T.
    • Journal of Sensor Science and Technology / v.3 no.2 / pp.24-32 / 1994
  • A force sensor of 30 MN capacity was fabricated using a build-up technique in which three load cells of 10 MN capacity are arranged in parallel. A column spring element was adopted as the shape of the strain gage type load cell. Temperature compensation circuits were used to reduce the error of each load cell. The total error of the fabricated force sensor was estimated to be less than 0.1 %. The force sensor may be used in industry to calibrate or test material testing machines with capacities above 4.5 MN.


Optimum Scheduling Algorithm for Job Sequence, Common Due Date Assignment and Makespan to Minimize Total Costs for Multijob in Multimachine Systems (다수(多数) 기계(機械)의 총비용(總費用)을 최소화(最小化)하는 최적작업순서, 공통납기일 및 작업완료일 결정을 위한 일정계획(日程計劃))

  • No, In-Gyu;Kim, Sang-Cheol
    • Journal of Korean Institute of Industrial Engineers / v.12 no.1 / pp.1-11 / 1986
  • This research is concerned with the problem of scheduling n jobs on m identical parallel machines when all jobs have a common due date. The objective is to develop an optimum scheduling algorithm that determines an optimal job sequence, the optimal value of the common due date and the minimum makespan so as to minimize total cost. The total cost is based on the common due date cost, the earliness cost, the tardiness cost and the flow time cost of each job in the selected sequence. A numerical example is given to illustrate the scheduling algorithm.
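The cost structure described in this abstract can be made concrete with a small evaluation routine. This is only a sketch under stated assumptions: all jobs are available at time zero (so flow time equals completion time), every cost term is linear, and the unit costs `c_due`, `c_early`, `c_tardy` and `c_flow` are hypothetical names. It evaluates a given schedule and does not reproduce the paper's optimization algorithm.

```python
def total_cost(machine_sequences, due_date,
               c_due=1.0, c_early=1.0, c_tardy=1.0, c_flow=1.0):
    """Evaluate a schedule on identical parallel machines.

    machine_sequences: one list of processing times per machine, in run order.
    Returns (total cost, makespan). Linear costs are an assumption of this sketch.
    """
    cost = c_due * due_date          # cost of setting the common due date
    makespan = 0
    for sequence in machine_sequences:
        t = 0
        for processing_time in sequence:
            t += processing_time     # completion time of this job
            cost += c_early * max(0, due_date - t)   # earliness
            cost += c_tardy * max(0, t - due_date)   # tardiness
            cost += c_flow * t                       # flow time
        makespan = max(makespan, t)
    return cost, makespan

if __name__ == "__main__":
    # Two machines, three jobs, common due date 4.
    print(total_cost([[2, 3], [4]], due_date=4))
```

A search over job sequences and due dates could call this routine as its objective function.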


Distributed Indexing Methods for Moving Objects based on Spark Stream

  • Lee, Yunsou;Song, Seokil
    • International Journal of Contents / v.11 no.1 / pp.69-72 / 2015
  • Existing parallel main-memory spatial index structures generally use light-weight locking techniques to avoid the trade-off between query freshness and CPU cost. However, lock-based methods still have limits such as thrashing, a well-known problem of such methods. In this paper, we propose a distributed index structure for moving objects that exploits the parallelism of multiple machines. The proposed index is a lock-free multi-version concurrency technique based on the D-Stream model of Spark Streaming. The proposed method exploits the multi-version nature of D-Streams.
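The multi-version idea in this abstract can be sketched without Spark: each micro-batch of position updates produces a new read-only index version, and readers always query the most recently published version, so no reader ever waits on a lock. The class below is a single-process illustration in plain Python; `VersionedGridIndex`, the uniform grid and the batch interface are assumptions for this sketch, not the paper's data structure.

```python
from collections import defaultdict

class VersionedGridIndex:
    """Grid index over moving objects, rebuilt as a new version per batch."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.positions = {}        # latest known position per object id
        self.published = (0, {})   # (version number, cell -> tuple of object ids)

    def _cell(self, x, y):
        return (int(x // self.cell_size), int(y // self.cell_size))

    def apply_batch(self, updates):
        """Build the next version from one micro-batch of {id: (x, y)} updates."""
        self.positions.update(updates)
        grid = defaultdict(list)
        for object_id, (x, y) in self.positions.items():
            grid[self._cell(x, y)].append(object_id)
        frozen = {cell: tuple(sorted(ids)) for cell, ids in grid.items()}
        # Publishing is a single reference swap; older versions stay intact
        # for any reader that still holds them.
        self.published = (self.published[0] + 1, frozen)

    def query_cell(self, x, y):
        """Return the ids stored in the cell containing (x, y), latest version."""
        _, grid = self.published
        return grid.get(self._cell(x, y), ())

if __name__ == "__main__":
    index = VersionedGridIndex(cell_size=10)
    index.apply_batch({"a": (1, 1), "b": (12, 3)})
    index.apply_batch({"a": (15, 2)})
    print(index.published[0], index.query_cell(12, 3))
```

The trade-off of this design is freshness: a query sees updates only after the batch containing them has been published.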

Deep Learning Model Parallelism (딥러닝 모델 병렬 처리)

  • Park, Y.M.;Ahn, S.Y.;Lim, E.J.;Choi, Y.S.;Woo, Y.C.;Choi, W.
    • Electronics and Telecommunications Trends / v.33 no.4 / pp.1-13 / 2018
  • Deep learning (DL) models have been widely applied to AI applications such as image recognition and language translation with big data. Recently, DL models have become larger and more complicated, and have merged together. To accelerate the training of a large-scale deep learning model, a few distributed deep learning frameworks provide model parallelism, which partitions the model parameters for non-shared parallel access and updates across multiple machines. As a training acceleration method, however, model parallelism is not as commonly used as data parallelism because it is difficult to make efficient. This paper provides a comprehensive survey of the state of the art in model parallelism by comparing the implementation technologies of several deep learning frameworks that support it, and suggests future research directions for improving model parallelism technology.
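The parameter partitioning described in this abstract can be illustrated on a single linear layer. In the sketch below the rows of a weight matrix are split into shards, each shard computes its slice of the output independently (as a separate machine would), and the slices are concatenated. The function names and the row-wise split are assumptions for illustration; the frameworks surveyed in the paper may partition models differently, and this sketch covers only the forward pass.

```python
def matvec(rows, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in rows]

def split_rows(weights, n_workers):
    """Partition the output rows of a weight matrix into contiguous shards."""
    size = -(-len(weights) // n_workers)   # ceiling division
    return [weights[i:i + size] for i in range(0, len(weights), size)]

def model_parallel_forward(weights, x, n_workers):
    """Compute weights @ x with each shard handled by a separate (simulated) worker."""
    shards = split_rows(weights, n_workers)
    # Each worker owns one shard of the parameters and needs only the input x.
    partial_outputs = [matvec(shard, x) for shard in shards]
    # Gather: concatenate the output slices in shard order.
    return [y for part in partial_outputs for y in part]

if __name__ == "__main__":
    W = [[1, 0], [0, 1], [2, 3]]
    x = [4, 5]
    print(model_parallel_forward(W, x, n_workers=2))
```

Because no shard shares parameters with another, each worker can update its own rows without synchronizing weights, which is the non-shared access the abstract refers to.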