• Title/Summary/Keyword: multi-core (멀티코어)


The Development of a MATLAB-based Discrete Event Simulation Framework for the Engagement Simulations of the Weapon Systems (무기체계 교전 시뮬레이션을 위한 매트랩 기반 이산사건시뮬레이션 프레임워크의 개발)

  • Hwang, Kun-Chul;Lee, Min-Gyu;Kim, Jung-Hoon
    • Journal of the Korea Society for Simulation / v.21 no.2 / pp.31-39 / 2012
  • A simulation framework is a basic software tool used to develop simulation applications. This paper describes the development of a discrete event simulation framework based on the DEVS (Discrete EVent System Specification) formalism, written in MATLAB, a language widely used in technical computing and engineering disciplines. Built on MATLAB object-oriented programming, the new framework combines the convenience of the MATLAB language with the sophisticated architecture of the DEVS formalism. It therefore supports the productivity, flexibility, and extensibility required to develop simulation applications for weapon system engagements. Moreover, by providing batch simulation functionality based on MATLAB parallel computing technology, it promises a computation speed that increases in proportion to the number of CPU cores in a multi-core processor.
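
As a rough illustration of the DEVS idea mentioned in the abstract above, the following Python sketch runs one atomic model defined by a time-advance function, an output function, and an internal transition. It is not the authors' MATLAB framework; the names `Launcher` and `simulate` are hypothetical and chosen only for this example.

```python
class Launcher:
    """Hypothetical atomic model: fires one round every `interval` seconds."""

    def __init__(self, rounds, interval):
        self.rounds = rounds
        self.interval = interval

    def time_advance(self):
        # Time until the next internal event; infinity means the model is passive.
        return self.interval if self.rounds > 0 else float("inf")

    def output(self):
        # In DEVS, output is produced just before the internal transition.
        return "fire"

    def internal_transition(self):
        self.rounds -= 1


def simulate(model, end_time):
    """Jump the clock from event to event until end_time is reached."""
    clock = 0.0
    log = []
    while True:
        ta = model.time_advance()
        if clock + ta > end_time:
            break
        clock += ta
        log.append((clock, model.output()))
        model.internal_transition()
    return log


events = simulate(Launcher(rounds=3, interval=2.0), end_time=10.0)
```

Because each call to `simulate` is independent of the others, a batch of such runs with different parameters could in principle be distributed across cores, which is the kind of batch parallelism the abstract refers to.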

Design and Simulation for Out-of-Order Execution Processor of a Fully Pipelined Scheme (완전한 파이프라인 방식의 비순차실행 프로세서의 설계 및 모의실행)

  • Lee, Jongbok
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.20 no.5 / pp.143-149 / 2020
  • Currently, multi-core processors serve as the main central processing units of computer systems, and each core is a high-performance out-of-order processor so as to maximize system performance. Early out-of-order execution processors based on the Tomasulo algorithm targeted floating-point instructions and took several cycles to execute because of complex structures such as the reorder buffer and reservation stations. However, for a processor to properly exploit out-of-order execution and increase instruction throughput, it must operate in a fully pipelined manner. In this paper, a fully pipelined out-of-order processor with speculative execution is designed in VHDL and verified with GHDL. In simulation, a program composed of ARM instructions executes successfully.

Study on LLVM application in Parallel Computing System (병렬 컴퓨팅 시스템에서 LLVM 응용 연구)

  • Cho, Jungseok;Cho, Doosan;Kim, Yongyeon
    • The Journal of the Convergence on Culture Technology / v.5 no.1 / pp.395-399 / 2019
  • To support various parallel computing systems, it is necessary to extend LLVM IR so that it supports vectors and matrices more efficiently, and to design a new algorithm for translating LLVM IR to machine code. As the IR example in the paper shows, LLVM IR is basically composed of RISC-style instructions, so RISC instruction generation follows naturally, but vector instructions are not supported. New IR structures, instruction generation algorithms, and related extensions are needed to support vectors and matrices more robustly. To this end, it is important to map each instruction in the LLVM IR to the appropriate instruction in the target (vector/matrix) architecture, which is the task of the instruction selection algorithm. To make this mapping efficient, it is necessary to understand the meaning of each LLVM IR instruction, to compare it with the syntax and meaning of each instruction of the target architecture, and to select the instruction that matches the pattern.

Particle Dispersion Model Speed Improvement and Evaluation for Quick Reaction to Pollutant Accidents (신속한 오염사고 대응을 위한 입자 분산 모형의 속도 개선 및 평가)

  • Shin, Jaehyun;Seong, Hoje;Park, Inhwan;Rhee, Dong Sop
    • The Journal of the Korea Contents Association / v.20 no.12 / pp.537-546 / 2020
  • This study deals with the development and improvement of a particle dispersion model for quick response to water pollutant accidents. The model is based on shear dispersion theory, in which vertical mixing is performed step by step using a vertical and molecular diffusion algorithm. For quick response to chemical accidents, a multi-core algorithm is applied to the particle dispersion model. After multi-core operation was added to the model using OpenMP directives, the relationship between calculation time, particle size, and the number of cores used for parallel execution was determined in order to estimate the model run time for chemical accident responses. The results identify adequate modeling conditions for quick response to chemical accidents and increase the applicability of the model.
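
To make the parallelization idea in the abstract above concrete, here is a minimal Python sketch of a one-dimensional random-walk particle tracker whose particles are split into independent chunks. It is an illustration under stated assumptions, not the authors' shear dispersion model: the functions `split_counts` and `track_particles`, the 1D geometry, and all parameter values are hypothetical.

```python
import math
import random


def split_counts(total, workers):
    """Divide `total` particles into near-equal chunks, one per worker."""
    base, extra = divmod(total, workers)
    return [base + (1 if i < extra else 0) for i in range(workers)]


def track_particles(count, steps, dt, velocity, diffusion, seed):
    """Advect and diffuse `count` particles along one axis."""
    rng = random.Random(seed)
    sigma = math.sqrt(2.0 * diffusion * dt)  # standard deviation of one random-walk step
    positions = [0.0] * count
    for _ in range(steps):
        for i in range(count):
            positions[i] += velocity * dt + rng.gauss(0.0, sigma)
    return positions


# The chunks do not interact, so each one could be handed to a separate core
# (for example by an OpenMP parallel loop in a compiled implementation).
# Here they are simply run one after another.
chunks = split_counts(2000, workers=4)
positions = []
for worker_id, count in enumerate(chunks):
    positions += track_particles(count, steps=50, dt=1.0,
                                 velocity=0.2, diffusion=0.5, seed=worker_id)
mean_position = sum(positions) / len(positions)
```

Since particles do not interact in this sketch, run time on one core grows with the particle count, and splitting the count across cores is the natural way to shorten it.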

Parallel Implementations of the Self-Organizing Network for Normal Mixtures (병렬처리를 통한 정규혼합분포의 추정)

  • Lee, Chul-Hee;Ahn, Sung-Mahn
    • Communications for Statistical Applications and Methods / v.19 no.3 / pp.459-469 / 2012
  • This article proposes two parallel implementations of the self-organizing network for normal mixtures. In principle, self-organizing networks should be implementable in a parallel computing environment without issue. However, the network for normal mixtures has an inherent problem with purely parallel operation, because the conditional expectations of the mixing proportions must be estimated in each iteration. This article presents the results of parallel implementations of the network written in Java. According to the results, both implementations achieved faster execution without any performance degradation.

Implementation of OpenVG Accelerator based on Multi-Core GP-GPU (멀티코어 GP-GPU 기반의 OpenVG 가속기 구현)

  • Lee, Kwang-Yeob;Park, Jong-Il;Lee, Chan-Ho
    • Journal of IKEEE / v.15 no.3 / pp.248-254 / 2011
  • Recently, the processing burden on the CPU has been growing as mobile device performance improves, because of graphical user interfaces with various graphical effects and content that uses 3D graphical effects or Flash animation. GPUs have therefore been introduced into mobile devices to support a variety of content. In this paper, an OpenVG accelerator is implemented on a multi-core GP-GPU. The accelerator is verified using the sample image provided by the Khronos Group, and all functions are processed by the instruction set alone, without dedicated hardware. The Tiger image was processed at 2 frames/sec.

A Rendezvous Router Decision Algorithm Considering Routing Table Size (라우팅 테이블의 크기를 고려한 랑데부 라우터 선정 알고리즘)

  • Cho, Kee-Seong;Jang, Hee-Seon;Kim, Dong-Whee
    • The KIPS Transactions: Part C / v.13C no.7 s.110 / pp.905-912 / 2006
  • In the core based tree (CBT) and protocol independent multicast-sparse mode (PIM-SM) multicasting protocols, which provide multicast services over a shared tree, network efficiency is determined by the location of the rendezvous point (RP). In this paper, a new algorithm is proposed that allocates the RP using estimated values of the total cost and the size (number of entries) of the routing tables, in order to control both efficiently. The numerical results show that the proposed algorithm reduces the total cost by 5.37% and the size of the routing tables by 13.35% compared with the previous algorithm.
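
The trade-off described in the abstract above can be sketched as follows. This is a hypothetical scoring scheme, not the paper's algorithm: it assumes a connected undirected graph given as a dictionary of edge weights, builds a shortest-path shared tree from each candidate RP to the group members, and scores each candidate by tree cost plus a weighted count of on-tree routers (a stand-in for routing table entries). The names `evaluate_rp`, `choose_rp`, and `weight` are assumptions.

```python
import heapq


def shortest_paths(graph, source):
    """Dijkstra: return (distance, predecessor) maps from source."""
    dist = {source: 0}
    prev = {}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    return dist, prev


def evaluate_rp(graph, rp, members):
    """Return (tree cost, number of on-tree routers) for a candidate RP."""
    _, prev = shortest_paths(graph, rp)
    tree_edges = set()
    tree_nodes = {rp}
    for member in members:
        node = member
        while node != rp:  # walk back toward the RP along the shortest path
            tree_edges.add((prev[node], node))
            tree_nodes.add(node)
            node = prev[node]
    cost = sum(graph[u][v] for u, v in tree_edges)
    return cost, len(tree_nodes)


def choose_rp(graph, candidates, members, weight=1.0):
    """Pick the candidate minimizing cost + weight * on-tree routers."""
    def score(rp):
        cost, entries = evaluate_rp(graph, rp, members)
        return cost + weight * entries
    return min(candidates, key=score)


# Toy topology: hub H reaches every member directly, X only via A.
graph = {
    "H": {"A": 1, "D": 1, "E": 1, "X": 3},
    "A": {"H": 1, "X": 1},
    "D": {"H": 1},
    "E": {"H": 1},
    "X": {"A": 1, "H": 3},
}
best = choose_rp(graph, candidates=["X", "H"], members=["A", "D", "E"])
```

In this toy topology the hub `H` gives both a cheaper tree and fewer on-tree routers than `X`, so it is selected regardless of `weight`; in less symmetric networks the weight decides how much routing state is traded against cost.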

Fast and Efficient Implementation of Neural Networks using CUDA and OpenMP (CUDA와 OPenMP를 이용한 빠르고 효율적인 신경망 구현)

  • Park, An-Jin;Jang, Hong-Hoon;Jung, Kee-Chul
    • Journal of KIISE: Software and Applications / v.36 no.4 / pp.253-260 / 2009
  • Many algorithms for computer vision and pattern recognition have recently been implemented on the GPU (graphics processing unit) for faster computation. However, such implementations have two problems. First, the programmer must master the fundamentals of graphics shading languages, which require prior knowledge of computer graphics. Second, in a job that needs much cooperation between CPU and GPU, which is usual in image processing and pattern recognition but not in graphics, the CPU should generate as much raw feature data for GPU processing as possible to use GPU performance effectively. This paper proposes a faster and more efficient implementation of neural networks on both the GPU and a multi-core CPU. To solve the first problem, we use CUDA (compute unified device architecture), which is easy to program thanks to its simple C-like style, rather than a graphics shading language. Moreover, OpenMP (Open Multi-Processing) is used to process multiple data concurrently with a single instruction on the multi-core CPU, which results in effective use of the GPU memories. In the experiments, we implemented a neural network-based text extraction system using the proposed architecture, and it ran about 15 times faster than an implementation on the GPU alone without OpenMP.

Synthesis of Multi-Terminalized Magnetic-Cored Dendrimer for Adsorption of Chromium and Enhancement of Magnetic Recovery (크롬 흡착 및 자성회수율 향상을 위한 멀티터미널 자성코어 덴드리머의 합성)

  • Yeo, In-Hwan;Jang, Jun-Won;Kim, Lyung-Joo;Park, Jae-Woo
    • Journal of Korean Society of Environmental Engineers / v.34 no.9 / pp.613-622 / 2012
  • A chromium adsorbent suited to rapid magnetic recovery and recycling was developed through the synthesis of a Multi-Terminalized Magnetic-core Dendrimer (MTMD). Divergence through coprecipitation and rotation growth was used for synthesis. The dendrimer was multi-terminalized with methyl propionate and glutaric acid. The properties of the synthesized sample were analyzed by XRD, FT-IR, TEM, EDS, TGA, and a zeta potential analyzer. The magnetic core of the MTMD had a magnetite crystal structure, and the size of the 4th-generation dendrimer was found to be 15 nm to 20 nm. TGA showed that the dendrimer branches accounted for about 7% in the first-generation dendrimer, and a further weight loss of 3% occurred as the generation increased. The zeta potential of the dendrimer changed from 25.26 mV to -6.53 mV upon multi-terminalization. In the adsorption experiment, the MTMD adsorbed more than 80% within 5 minutes and showed an adsorptivity of 6.308 mg/g. Compared with the COOH dendrimer (COOH-D), its magnetic recovery time was reduced by more than half, and 100% could be recovered within 30 minutes. In the regeneration experiment with chromium, it maintained the same adsorptivity over four runs.

Heterogeneous Face Recognition Using Texture Feature Descriptors (텍스처 기술자들을 이용한 이질적 얼굴 인식 시스템)

  • Bae, Han Byeol;Lee, Sangyoun
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.14 no.3 / pp.208-214 / 2021
  • Recently, many intelligent security scenarios and criminal investigations require matching photos with non-photo images. Existing face recognition systems cannot sufficiently meet these needs. In this paper, we propose an algorithm that improves the performance of heterogeneous face recognition systems by reducing the modality gap between sketches and photos of the same person. The proposed algorithm extracts the texture features of each image with texture descriptors (gray level co-occurrence matrix, multiscale local binary pattern) and, based on these, generates a transformation matrix through eigenfeature regularization and extraction techniques. The identity of a sketch image is finally recognized from the scores computed between the resulting vectors, after score normalization.