• Title/Summary/Keyword: 고성능 컴퓨팅 (high-performance computing)

Optimization of Graph Processing based on In-Storage Processing (스토리지 내 프로세싱 방식을 사용한 그래프 프로세싱의 최적화 방법)

  • Song, Nae Young;Han, Hyuck;Yeom, Heon Young
    • KIISE Transactions on Computing Practices / v.23 no.8 / pp.473-480 / 2017
  • In recent years, semiconductor-based storage devices such as flash-memory solid-state drives (SSDs) have reached high performance. In addition, there is a trend toward using the resources of the storage device's internal controller, such as its central processing unit (CPU) and memory, according to the needs of the application. This concept is called In-Storage Processing (ISP). A storage device equipped with the ISP function can process part of the operations otherwise executed on the host system, thus reducing the load on the host. Moreover, since the data are processed in the storage device, less data is transferred to the host. In this paper, we propose a method to optimize graph query processing by utilizing these ISP functions, and show that the optimized graph processing method improves the performance of the Graph 500 benchmark by up to 20%.

Dynamic Power Management for Webpage Loading on Mobile Devices (모바일 웹 페이지 로딩에서 동적 관리 기법)

  • Park, Hyunjae;Choi, Youngjune
    • Journal of KIISE / v.42 no.12 / pp.1623-1628 / 2015
  • As the performance of mobile devices has increased, high-end multicore CPUs have become the norm in smartphones. However, such high-performance devices suffer from battery depletion because of the energy consumed by software, despite increases in battery capacity. The required resources are dynamic and varied, and user interaction is highly random; thus, Linux-based power management such as dynamic voltage and frequency scaling (DVFS) is needed to meet these requirements. To reduce power consumption, we propose a method that restricts the CPU frequency during data download while maintaining responsiveness to the user. This can compensate for the weakness of existing Linux-based power management techniques such as DVFS with respect to webpage loading. By implementing our method at the application level, we confirm that the energy consumed by webpage loading is reduced.

An Efficient Implementation of MPI over VMMC for Myrinet (Myrinet 상에서 VMMC를 기반으로 하는 효율적인 MPI 구현)

  • Kim, Ho-Joong;Maeng, Seung-Ryoul
    • Journal of KIISE: Computing Practices and Letters / v.7 no.5 / pp.539-547 / 2001
  • Cluster systems employ high-speed interconnection networks and efficient communication layers to gain high performance and scalability. However, the diversity of implementation mechanisms among these communication layers causes a lack of portability. One solution is to provide standard communication APIs such as MPI. This paper introduces MPI-VMMC, an MPI implementation on VMMC. Although the direct deposit transfer mechanism used in VMMC is not well suited to the Send/Recv mechanism used in MPI, the proposed sub-layer placed between MPI and VMMC efficiently translates from one mechanism to the other. We also use lazy pointer and selective zero-copy transfer techniques to gain high performance. The peak performance of MPI-VMMC is 90.7 Mbytes/sec, which is about 95% of that of the base communication layer.

Economic Impact of HEMOS-Cloud Services for M&S Support (M&S 지원을 위한 HEMOS-Cloud 서비스의 경제적 효과)

  • Jung, Dae Yong;Seo, Dong Woo;Hwang, Jae Soon;Park, Sung Uk;Kim, Myung Il
    • KIPS Transactions on Computer and Communication Systems / v.10 no.10 / pp.261-268 / 2021
  • Cloud computing is a computing paradigm in which users can utilize computing resources in a pay-as-you-go manner. In a cloud system, resources can be dynamically scaled up and down according to user demand so that the total cost of ownership can be reduced. Modeling and Simulation (M&S) technology is a well-known simulation-based method for obtaining engineering analysis results through CAE software without physical experiments. In general, M&S technology is used in Finite Element Analysis (FEA), Computational Fluid Dynamics (CFD), Multibody Dynamics (MBD), and optimization. The M&S workflow is divided into pre-processing, analysis, and post-processing steps. Pre- and post-processing are GPU-intensive jobs that consist of 3D modeling via CAE software, whereas analysis is CPU- or GPU-intensive. Because a general-purpose desktop needs a long time to analyze complicated 3D models, CAE software requires a workstation with a high-end CPU and GPU to run smoothly. In other words, executing M&S requires high-performance computing resources. To mitigate the cost of acquiring such computing resources, we propose the HEMOS-Cloud service, an integrated cloud and cluster computing environment. The HEMOS-Cloud service provides CAE software and computing resources to users in business or academia who want to use M&S. In this paper, the economic ripple effect of the HEMOS-Cloud service was analyzed using inter-industry analysis. Using expert-guided coefficients, the estimated results are a production inducement effect of KRW 7.4 billion, a value-added effect of KRW 4.1 billion, and an employment inducement effect of 50 persons per KRW 1 billion.

Analysis on the Active/Inactive Status of Computational Resources for Improving the Performance of the GPU (GPU 성능 저하 해결을 위한 내부 자원 활용/비활용 상태 분석)

  • Choi, Hongjun;Son, Dongoh;Kim, Jongmyon;Kim, Cheolhong
    • The Journal of the Korea Contents Association / v.15 no.7 / pp.1-11 / 2015
  • In recent high-performance computing systems, GPGPU has been widely used to process general-purpose applications as well as graphics applications, since the GPU provides computational resources optimized for massively parallel processing. Unfortunately, GPGPU does not fully exploit the GPU's computational resources when executing general-purpose applications, because the applications cannot be optimized for the GPU architecture. Therefore, we provide research guidelines for improving the performance of computing systems that use GPGPU. To accomplish this, we analyze the factors that degrade GPU performance. To classify the causes of these factors clearly, we define five GPU core statuses: fully active, partially active, idle, memory stall, and GPU core stall. Every status except fully active causes performance degradation. We evaluate the ratio of each GPU core status according to the characteristics of the benchmarks to find the specific reasons that degrade GPU performance. According to our simulation results, the partially active, idle, memory stall, and GPU core stall statuses are induced by underutilization of computational resources, low parallelism, a high number of memory requests, and structural hazards, respectively.

A Benchmark of Micro Parallel Computing Technology for Real-time Control in Smart Farm (MPICH vs OpenMP) (스마트 시설환경 실시간 제어를 위한 마이크로 병렬 컴퓨팅 기술 분석)

  • Min, Jae-Ki;Lee, DongHoon
    • Proceedings of the Korean Society for Agricultural Machinery Conference / 2017.04a / pp.161-161 / 2017
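  • The control elements of a smart facility environment include factors directly involved in regulating the environment, such as heaters, window openers, water and nutrient-solution valves, ventilation fans, and dehumidifiers, together with elements indirectly related to control, such as communication for information exchange and user interfaces. Control based on mathematical logic, such as PID control, can coexist with control by nonlinear learning models based on the knowledge of expert managers. Conventional sequence-based control may be limited in its ability to link these diverse elements together. The conventional approach of determining the amount and timing of control from sufficient data acquired as a time series may cope poorly with exceptional situations. Such situations can be divided into those that arise unavoidably from changes in natural conditions and those caused by system errors. In this study, we investigated a method that analyzes, in real time, the various environmental factors changing within the facility and performs the corresponding control, thereby complementing a control system prepared according to mathematical, predictable logic. In the past, high-performance computing (HPC) was an advanced technology that increased computing power by interconnecting many computers over a high-speed network, and it required large investments in cost and scale. With the advance of cell phones and mobile devices, small microprocessors have developed to the point that application processors (APs) with clock speeds of up to 2 GHz have recently appeared. We studied ways to apply APs, whose advantages are low power consumption and a small platform despite relatively low performance, to the real-time control of facility environments. We compared the performance of AP-based micro-clustering on three systems that differed in CPU clock, memory size, and number of cores: 1) 1.5 GHz, 8 processors, 32 cores, 1 GByte/processor, 32-bit Linux (ARMv7l); 2) 2.0 GHz, 4 processors, 32 cores, 2 GByte/processor, 32-bit Linux (ARMv7l); 3) 1.5 GHz, 8 processors, 32 cores, 2 GByte/processor, 64-bit Linux (AArch64). MPICH (www.mpich.org) and OpenMP (www.openmp.org) were used as the development libraries for parallel computing. The time taken to find the primes among the integers up to 2,500,000,000 was 1) 17 s, 2) 13 s, and 3) 3 s, and the time taken for a 2D FFT on a $12800 \times 12800$ matrix was 1) 10 s, 2) 8 s, and 3) 2 s. Case 3 can be judged faster than a commercial desktop with a clock speed of up to 3 GHz. The results were approximately the same for both libraries. When 3D linear interpolation was performed at 1-second intervals on the 3D measurement data obtained in a previous study, the results were the same within a small margin with 4 or fewer cores, whereas with 8 or more cores the trend was similar to the results above. We will continue to study AP-based micro-clustering technology, taking into account field deployability, construction cost, and power consumption.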
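
The prime-counting benchmark in the abstract above can be illustrated with a small sketch. The abstract does not give the authors' algorithm or code, so everything here is an assumption: the range is split into one chunk per worker, each chunk is sieved independently, and the partial counts are summed, which mirrors the scatter-and-reduce structure an MPICH or OpenMP program would typically use. Python threads stand in for ranks or OpenMP threads only to show the structure; they are not expected to reproduce the reported timings. The names `base_primes`, `count_primes_in_chunk`, and `count_primes` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from math import isqrt


def base_primes(limit):
    """Return all primes <= limit using a plain sieve of Eratosthenes."""
    if limit < 2:
        return []
    sieve = bytearray([1]) * (limit + 1)
    sieve[0] = sieve[1] = 0
    for p in range(2, isqrt(limit) + 1):
        if sieve[p]:
            # Cross out p*p, p*p + p, ... up to limit.
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [i for i, flag in enumerate(sieve) if flag]


def count_primes_in_chunk(lo, hi, primes):
    """Count primes in [lo, hi) by crossing out multiples of the base primes."""
    lo = max(lo, 2)
    if hi <= lo:
        return 0
    segment = bytearray([1]) * (hi - lo)
    for p in primes:
        if p * p >= hi:
            break
        # First multiple of p inside the chunk, but never p itself.
        start = max(p * p, ((lo + p - 1) // p) * p)
        segment[start - lo :: p] = bytearray(len(range(start, hi, p)))
    return sum(segment)


def count_primes(n, workers=4):
    """Count primes <= n by giving each worker one contiguous chunk of [0, n]."""
    primes = base_primes(isqrt(n))
    step = (n + 1 + workers - 1) // workers
    bounds = [(lo, min(lo + step, n + 1)) for lo in range(0, n + 1, step)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial = pool.map(
            lambda b: count_primes_in_chunk(b[0], b[1], primes), bounds
        )
        return sum(partial)  # plays the role of the final reduce step
```

Calling `count_primes(100)` counts the primes up to 100. Because each chunk needs only the shared list of base primes, the chunks share no mutable state, which is what makes this kind of workload easy to distribute over many small cores.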

An FSI Simulation of the Metal Panel Deflection in a Shock Tube Using Illinois Rocstar Simulation Suite (일리노이 록스타 해석환경을 활용한 충격파관 내 금속패널 변형의 유체·구조 연성 해석)

  • Shin, Jung Hun;Sa, Jeong Hwan;Kim, Han Gi;Cho, Keum Won
    • Transactions of the Korean Society of Mechanical Engineers A / v.41 no.5 / pp.361-366 / 2017
  • With the recent development of computing architecture and application software technology, real-world simulation, the ultimate goal of computer simulation, is emerging as a practical issue in several research sectors. In this paper, the motion of a metal plate in a square shock tube over a small time interval was calculated using Illinois Rocstar, a supercomputing-based fluid-structure-combustion multi-physics simulation tool developed in a US national R&D program at the University of Illinois. The simulation results were then compared with those from experiments. The coupled solvers for unsteady compressible fluid dynamics and for structural analysis were based on a finite-volume structured grid system and a large-deformation linear elastic model, respectively. A strong correlation between calculation and experiment was shown, probably because of the predictor-corrector time-integration scheme. In the future, additional validation studies and code improvements for higher accuracy will be conducted to obtain a reliable open-source research software tool.

Cache Replacement Strategies considering Location and Region Properties of Data in Mobile Database Systems (이동 데이타베이스 시스템에서 데이타의 위치와 영역 특성을 고려한 캐쉬 교체 기법)

  • Kim, Ho-Sook;Yong, Hwan-Seung
    • Journal of KIISE: Databases / v.27 no.1 / pp.53-63 / 2000
  • The mobile computing service market is growing rapidly owing to the development of low-cost wireless network technology and high-performance mobile computing devices. In recent years, several methods have been proposed to deal effectively with the restrictions of the mobile computing environment, such as limited bandwidth, frequent disconnection, and short-lived batteries. Among those methods, caching is widely studied: from the data transmitted by a mobile support station, it selects those that are likely to be accessed in the near future and stores them in the local cache of a mobile host. Existing cache replacement methods are limited in efficiency because they do not take into account the characteristics of user mobility and the spatial attributes of geographical data. In this paper, we show that the value and the semantics of the data stored in the cache of a mobile host change according to the movement of the mobile host. We argue that this is because data that are geographically near are better suited to answering a user's query in the mobile environment. We also define, using the spatial attributes of data, the region on which the spatial location of geographical data has an effect. Finally, we propose two new cache replacement methods that efficiently support user mobility and the spatial attributes of data. One is based on the location of data and the other on the meaningful region of data. Through a comparative analysis with previous methods, we show that they improve the cache hit ratio. We also show that performance varies with data density; on this basis, we argue that different cache replacement methods are required for regions with different data densities.
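
The idea of a location-based replacement policy can be sketched as follows. The abstract does not state the exact rule the authors use, so this sketch assumes the simplest reading: when the cache is full, evict the entry whose data location is farthest (in Euclidean distance) from the current position of the mobile host. The class name `LocationCache` and its methods are hypothetical, and the region-based variant mentioned in the abstract is not modeled.

```python
from math import hypot


class LocationCache:
    """Toy cache that evicts the entry farthest from the mobile host."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = {}  # key -> ((x, y), value)

    def put(self, key, location, value, host_position):
        """Insert an entry, evicting the farthest one first if the cache is full."""
        if key not in self.items and len(self.items) >= self.capacity:
            victim = max(
                self.items,
                key=lambda k: self._distance(self.items[k][0], host_position),
            )
            del self.items[victim]
        self.items[key] = (location, value)

    def get(self, key):
        """Return the cached value, or None on a cache miss."""
        entry = self.items.get(key)
        return None if entry is None else entry[1]

    @staticmethod
    def _distance(a, b):
        return hypot(a[0] - b[0], a[1] - b[1])
```

With a capacity of two, entries at (0, 0) and (10, 0), and the host at (9, 0), inserting a third entry evicts the one at (0, 0). A policy such as LRU would instead evict by access time, regardless of where the host currently is.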

Design and Implementation of Initial OpenSHMEM Based on PCI Express (PCI Express 기반 OpenSHMEM 초기 설계 및 구현)

  • Joo, Young-Woong;Choi, Min
    • KIPS Transactions on Computer and Communication Systems / v.6 no.3 / pp.105-112 / 2017
  • PCI Express is a bus technology that connects the processor and peripheral I/O devices, and it is widely used as an industry standard because of its high speed and low power. In addition, like Ethernet and InfiniBand, PCI Express is a system interconnect technology used in high-performance computing and computer clusters. The PGAS (partitioned global address space) programming model is often used to implement one-sided RDMA (remote direct memory access) on multi-host systems such as computer clusters. In this paper, we design and implement an OpenSHMEM API based on PCI Express that maintains the existing features of OpenSHMEM, in order to implement RDMA over PCI Express. We experiment with the implemented OpenSHMEM API using a matrix multiplication example on a system in which PCs are connected with the NTB (non-transparent bridge) technology of PCI Express. PCI Express interconnection networks are currently very expensive and not yet widely available to the general public. Nevertheless, we implemented and evaluated a PCI Express based interconnection network on the RDK evaluation board. In addition, we implemented the OpenSHMEM software stack, which has recently attracted great interest.

Development of Low-Power IoT Sensor and Cloud-Based Data Fusion Displacement Estimation Method for Ambient Bridge Monitoring (상시 교량 모니터링을 위한 저전력 IoT 센서 및 클라우드 기반 데이터 융합 변위 측정 기법 개발)

  • Park, Jun-Young;Shin, Jun-Sik;Won, Jong-Bin;Park, Jong-Woong;Park, Min-Yong
    • Journal of the Computational Structural Engineering Institute of Korea / v.34 no.5 / pp.301-308 / 2021
  • It is important to develop a digital SOC (Social Overhead Capital) maintenance system for preemptive maintenance in response to the rapid aging of social infrastructure. Using IoT sensors deployed on structures, abnormal signals induced from the structures can be detected quickly and optimal decisions can be made promptly. In this study, a digital SOC monitoring system incorporating a multimetric IoT sensor was developed for long-term monitoring, for use with a cloud-computing server for automated and powerful data analysis, and for establishing databases. It performs (1) multimetric sensing, (2) long-term operation, and (3) LTE-based direct communication. For multimetric sensing, the developed sensor had three axes of acceleration sensing and five strain sensing channels, and it had an event-driven power management system that activated the sensors only when vibration exceeded a predetermined limit or a timer was triggered. The power management system reduced power consumption, and additional solar panel charging enabled long-term operation. Data from the sensors were transmitted to the server in real time via low-power LTE-CAT M1 communication, which does not require an additional gateway device. Furthermore, the cloud server was developed to receive multi-variable data from the sensor and to run a displacement fusion algorithm that obtains reference-free structural displacement for ambient structural assessment. The proposed digital SOC system was experimentally validated on a steel railroad bridge and a concrete girder bridge.
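
The event-driven wake-up rule described in the abstract above (activate when vibration exceeds a limit, or when a timer fires) can be sketched in a few lines. This is a minimal illustration, not the sensor's firmware: the function name `wake_times`, the sample format, and the choice to restart the timer at every wake-up are all assumptions.

```python
def wake_times(samples, threshold, timer_period):
    """Return the times at which the sensor would wake up.

    samples: list of (time_s, vibration) pairs in increasing time order.
    threshold: vibration magnitude above which the sensor wakes (assumed unit).
    timer_period: seconds after the last wake-up at which the timer fires.
    """
    wakes = []
    # Assumption: the timer starts counting at the first sample.
    last_wake = samples[0][0] if samples else 0.0
    for t, vibration in samples:
        vibration_event = abs(vibration) > threshold
        timer_event = (t - last_wake) >= timer_period
        if vibration_event or timer_event:
            wakes.append(t)
            last_wake = t  # assumption: any wake-up restarts the timer
    return wakes
```

For samples taken once per second with a single vibration spike at t = 2 s, a threshold of 0.3, and a 3 s timer, the sensor wakes at t = 2 s (vibration) and again at t = 5 s (timer). It stays asleep otherwise, which is where the power saving comes from.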