• Title/Summary/Keyword: Xeon


Benchmarking the Intel Xeon Phi Coprocessor with Intel MKL library (인텔 MKL 라이브러리를 이용한 Xeon Phi Coprocessor 벤치마크)

  • Park, Young-Soo;Park, Koo-Rack;Kim, Jin-Mook
    • Proceedings of the Korean Society of Computer Information Conference / 2014.07a / pp.1-4 / 2014
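  • The Intel Many Integrated Core (MIC) architecture combines 61 cores on a single chip. Intel MIC, branded as Xeon Phi, delivers twice the single-precision GFLOPS of an Intel Xeon E5 CPU and is an architecture optimized for numerical computation. We benchmarked a Xeon Phi 7120P, which has a clock speed of 1.238 GHz and 61 cores with 4 threads per core; its theoretical peak double-precision performance is stated as about 2 TFLOPS. We wrote a parallel program for the Intel x86 architecture using OpenMP and the Intel MKL (Math Kernel Library), and benchmarked single-precision performance on the Intel Xeon Phi and the Xeon E5 CPU while increasing the number of threads, in order to compare the theoretical performance of the two. We also examined how, in a parallel environment using OpenMP and Intel MKL, the vector unit size affects performance in addition to the usual CPU performance indicators of clock speed and core count.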

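The abstract above names clock speed, core count, and vector unit size as the factors behind theoretical peak performance. A minimal sketch of that arithmetic follows, using the 61 cores and 1.238 GHz quoted for the Xeon Phi 7120P. The 512-bit vector width, the factor of 2 for one fused multiply-add per lane per cycle, and the function name `peak_gflops` are assumptions made for illustration and are not taken from the paper.

```python
def peak_gflops(cores, clock_ghz, vector_bits, element_bits, flops_per_lane_per_cycle=2):
    """Estimate theoretical peak GFLOPS for a vector processor.

    flops_per_lane_per_cycle=2 assumes one fused multiply-add
    (a multiply plus an add) per vector lane per cycle.
    """
    lanes = vector_bits // element_bits  # elements processed per vector instruction
    return cores * clock_ghz * lanes * flops_per_lane_per_cycle


# Cores and clock are from the abstract; the 512-bit width is an assumption.
single_precision = peak_gflops(cores=61, clock_ghz=1.238, vector_bits=512, element_bits=32)
double_precision = peak_gflops(cores=61, clock_ghz=1.238, vector_bits=512, element_bits=64)

print(f"single precision: {single_precision:.1f} GFLOPS")
print(f"double precision: {double_precision:.1f} GFLOPS")
```

Halving the element size doubles the number of lanes, so under these assumptions the single-precision estimate is twice the double-precision one. This is the sense in which vector unit size matters alongside clock speed and core count.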

Comparison of Parallel Computation Performances for 3D Wave Propagation Modeling using a Xeon Phi x200 Processor (제온 파이 x200 프로세서를 이용한 3차원 음향 파동 전파 모델링 병렬 연산 성능 비교)

  • Lee, Jongwoo;Ha, Wansoo
    • Geophysics and Geophysical Exploration / v.21 no.4 / pp.213-219 / 2018
  • In this study, we performed 3D wave propagation modeling on a Xeon Phi x200 processor and compared its parallel computation performance with that of a Xeon CPU. Unlike the 1st-generation Xeon Phi coprocessor, codenamed Knights Corner, the 2nd-generation Xeon Phi x200 processor can run an operating system directly and therefore requires no additional communication between its internal memory and the main memory. With its large main memory and high-bandwidth memory, the Xeon Phi x200 processor can run large-scale computations independently. To compare parallel computation performance, we performed the modeling using the MPI (Message Passing Interface) and OpenMP (Open Multi-Processing) libraries. Numerical examples using the SEG/EAGE salt model demonstrated that the Xeon Phi, with its large number of computational cores and high-bandwidth memory, achieves modeling 2.69 to 3.24 times faster than the 12-core CPU.

Memory-Efficient High Performance Parallelization of Aho-Corasick Algorithm on Intel Xeon Phi (Intel Xeon Phi 에서의 Aho-Corasick 알고리즘을 위한 메모리 친화적인 고성능 병렬화)

  • Tran, Nhat-Phuong;Jeong, Yosang;Lee, Myungho
    • Proceedings of the Korea Information Processing Society Conference / 2014.04a / pp.87-89 / 2014
  • The Aho-Corasick (AC) algorithm is a multi-pattern string matching algorithm commonly used in applications with real-time performance requirements. In this paper, we parallelize the AC algorithm on Intel's Many Integrated Core (MIC) architecture, the Xeon Phi coprocessor. We propose a new technique to compress the Deterministic Finite Automaton structure that represents the set of pattern strings against which the input data is inspected for possible matches. The new technique reduces cache misses and leads to significantly improved performance on the Xeon Phi.

Analysis of AGTL+ Driver Application (AGTL+ 구동기 분석)

  • Sim, W.S.;Hahn, J.S.;Hahn, W.J.
    • Electronics and Telecommunications Trends / v.13 no.6 s.54 / pp.46-55 / 1998
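  • To select an I/O driver matched to Intel's Xeon processor bus, we surveyed AGTL+, the driver technology of the Xeon processor, and analyzed whether AGTL+ and GTL+ drivers can be used together. For the case where GTL+ (NTL) is used alongside AGTL+, we simulated the signal and timing characteristics as a function of wiring topology, termination, and damping resistor value. The simulation results show that, at bus speeds of 80 MHz or below, four Xeon processors and GTL+ (NTL) drivers can be combined and operated stably.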

Performance Evaluation of Microservers to drive for Cloud Computing Applications (클라우드 컴퓨팅 응용 구동을 위한 마이크로서버 성능평가)

  • Myeong-Hoon Oh
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.23 no.4 / pp.85-91 / 2023
  • To support the use of KOSMOS, this paper presents performance evaluation results obtained with CloudSuite, an application-service-based benchmark program for cloud computing. CloudSuite offers several distinct applications as cloud services in two groups: offline applications and online applications running on containers. Compared with other microservers with hardware specifications similar to those of KOSMOS, KOSMOS was superior in all CloudSuite benchmark applications. KOSMOS also showed higher performance than Intel Xeon CPU-based servers in an offline application: in an experimental configuration using multiple KOSMOS nodes, it reduced the completion time of Graph Analytics by 30.3% and 72.3% compared to two Intel Xeon CPU-based servers.

A Study of Distribute Computing Performance Using a Convergence of Xeon-Phi Processor and Quantum ESPRESSO (퀀텀 에스프레소와 제온 파이 프로세서의 융합을 이용한 분산컴퓨팅 성능에 대한 연구)

  • Park, Young-Soo;Park, Koo-Rack;Kim, Dong-Hyun
    • Journal of the Korea Convergence Society / v.7 no.5 / pp.15-21 / 2016
  • Recently, the degree of integration of processors has increased rapidly. Clock speeds, however, have not increased; instead, the number of cores per processor has grown. In this paper, we analyze the performance of the Intel Xeon Phi, a representative many-core processor currently used to accelerate computation. We used Quantum ESPRESSO, which performs its calculations with the FFTW library, and benchmarked the Xeon Phi while varying the number of MPI ranks. The results show good performance when four jobs are handled on one physical core, but performance degrades when the number of MPI ranks is expanded beyond four. Through this convergence, the performance of Quantum ESPRESSO was found to improve, and the hardware characteristics of the Xeon Phi could be examined.

Parallelizing 3D Frequency-domain Acoustic Wave Propagation Modeling using a Xeon Phi Coprocessor (제온 파이 보조 프로세서를 이용한 3차원 주파수 영역 음향파 파동 전파 모델링 병렬화)

  • Ryu, Donghyun;Jo, Sang Hoon;Ha, Wansoo
    • Geophysics and Geophysical Exploration / v.20 no.3 / pp.129-136 / 2017
  • 3D seismic data processing methods such as full waveform inversion and reverse-time migration require 3D wave propagation modeling and heavy computation. We compared the efficiency and accuracy of a Xeon Phi coprocessor with those of a high-end server CPU using 3D frequency-domain wave propagation modeling. We applied OpenMP parallel programming to the time-domain finite-difference algorithm, taking the characteristics of Xeon Phi coprocessors into account, and obtained the frequency-domain wavefield by applying the Fourier transform through running integration. A numerical test of frequency-domain wavefield modeling was performed using the 3D SEG/EAGE salt velocity model. As a result, we obtained an accurate frequency-domain wavefield and attained a 1.44x speedup on the Xeon Phi coprocessor compared to the CPU.
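The running-integration Fourier transform mentioned in the abstract above can be sketched as follows: instead of storing the full time history and transforming it afterwards, each new time sample is multiplied by exp(-i 2π f t) Δt and added to a running sum per frequency. This is an illustrative sketch only, not the authors' code; the function name `running_dft`, the scalar test signal standing in for the wavefield at a single grid point, and the chosen frequencies are all assumptions.

```python
import cmath
import math


def running_dft(samples, dt, freqs):
    """Accumulate frequency-domain values one time sample at a time.

    In a time-domain solver, each sample would be produced by a time step,
    so only the running sums (one per frequency) need to be kept in memory.
    """
    acc = [0j] * len(freqs)
    for n, u in enumerate(samples):
        t = n * dt
        for k, f in enumerate(freqs):
            acc[k] += u * cmath.exp(-2j * math.pi * f * t) * dt
    return acc


# Hypothetical test signal: a 5 Hz cosine sampled for 1 second.
dt = 0.001
signal = [math.cos(2 * math.pi * 5.0 * n * dt) for n in range(1000)]

# Evaluate the running sum at 5 Hz (the signal frequency) and at 10 Hz.
spectrum = running_dft(signal, dt, freqs=[5.0, 10.0])
print([abs(value) for value in spectrum])
```

The response concentrates at the frequency present in the signal, while memory use is independent of the number of time steps. That property is what makes the approach attractive for large 3D grids.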

Server and Client Simulator for Web-based 3D Image Communication

  • Ko, Jung-Hwan;Lee, Sang-Tae;Kim, Eun-Soo
    • Journal of Information Display / v.5 no.4 / pp.38-44 / 2004
  • In this paper, a server and client simulator for a web-based multi-view 3D image communication system is implemented using IEEE 1394 digital cameras, an Intel Xeon server computer, and Microsoft's DirectShow programming library. In the proposed system, a two-view image is first captured with the IEEE 1394 stereo camera. The data is then compressed on the Intel Xeon server computer by extracting its disparity information and transmitted to the client system, where multi-view images are generated by the intermediate view reconstruction method and finally displayed on the 3D display monitor. Experiments show that the proposed system can display an 8-view image with an 8-bit grey level at a frame rate of 15 fps.

Distributed Stream Processing System with apache Hadoop for PTAM on Xeon Phi Cluster (PTAM을 위한 제온파이 기반 하둡 분산 스트림 프로세싱 시스템)

  • Seo, Jae Min;Cho, Kyu Nam;Kim, Do Hyung;Jeong, Chang-Sung
    • Proceedings of the Korea Information Processing Society Conference / 2015.10a / pp.184-186 / 2015
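  • In this paper, we propose a new distributed stream processing system for PTAM. PTAM was designed to run on a single system, which points to one of its limitations: when the computational load of Bundle Adjustment grows, building the map requires a great deal of time and resources. We therefore propose a system that distributes the computational load through Hadoop and runs the PEs (Processing Elements) on a Xeon Phi system.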

A Virtualized Kernel for Effective Memory Test (효과적인 메모리 테스트를 위한 가상화 저널)

  • Park, Hee-Kwon;Youn, Dea-Seok;Choi, Jong-Moo
    • Journal of KIISE:Computer Systems and Theory / v.34 no.12 / pp.618-629 / 2007
  • In this paper, we propose an effective memory test environment, called a virtualized kernel, for 64-bit multi-core computing environments. Effectiveness here means that all of the physical memory space can be tested, even the space occupied by the kernel itself, without rebooting. To provide this capability, our virtualized kernel offers four mechanisms. The first is direct access to physical memory in both kernel and user mode, which allows various test patterns to be applied to any location in physical memory. The second is kernel virtualization, so that two or more kernel images can run at different locations in physical memory. The third is isolation of the memory space used by different instances of the virtualized kernel. The fourth is kernel hibernation, which enables context switching between kernels. We implemented the proposed virtualized kernel by modifying Linux kernel 2.6.18, the latest version at the time, running on an Intel Xeon system with two 64-bit dual-core CPUs with hyper-threading technology and 2 GB of main memory. Experimental results show that two instances of the virtualized kernel run at different locations in physical memory and that kernel hibernation works as designed. As a result, every location in physical memory can be tested without rebooting.