• Title/Summary/Keyword: 고성능 컴퓨팅 시스템 (high-performance computing system)

Search Result 172, Processing Time 0.022 seconds

An Installation and Model Assessment of the UM, U.K. Earth System Model, in a Linux Cluster (U.K. 지구시스템모델 UM의 리눅스 클러스터 설치와 성능 평가)

  • Daeok Youn;Hyunggyu Song;Sungsu Park
    • Journal of the Korean earth science society / v.43 no.6 / pp.691-711 / 2022
  • A state-of-the-art Earth system model, serving as a virtual Earth, is required for studies of current and future climate change and climate crises. Such a complex numerical model can account for almost all human activities and natural phenomena affecting the Earth's atmosphere. The Unified Model (UM) from the United Kingdom Meteorological Office (UK Met Office) is among the best Earth system models as a scientific tool for studying the atmosphere. However, owing to the expensive numerical integration and the substantial output size of the UM, individual research groups have had to rely solely on supercomputers. The limitations of these computing resources, especially environments that are blocked from outside network connections, reduce the efficiency and effectiveness of research with the model and of improvements to its component codes. Therefore, this study presents detailed guidance for installing a new version of the UM on high-performance parallel computers (Linux clusters) owned by individual researchers, which should make it easier for researchers to work with the UM. The numerical integration performance of the UM on Linux clusters was also evaluated for two model resolutions, N96L85 (1.875° × 1.25° with 85 vertical levels up to 85 km) and N48L70 (3.75° × 2.5° with 70 vertical levels up to 80 km). The one-month integration times using 256 cores for the AMIP and CMIP simulations at N96L85 resolution were 169 and 205 min, respectively. The one-month integration time for an N48L70 AMIP run using 252 cores was 33 min. Simulated 2-m surface temperature and precipitation intensity were compared with ERA5 reanalysis data. The spatial distributions of the simulated results were qualitatively comparable to those of ERA5, despite quantitative differences caused by the different resolutions and atmosphere-ocean coupling. In conclusion, this study confirms that the UM can be successfully installed and used on high-performance Linux clusters.
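The timings quoted in this abstract can be turned into two common throughput figures. The following is a minimal sketch, assuming a constant pace and ignoring queueing, I/O stalls, and differences in month length; the function names are hypothetical and not taken from the paper.

```python
# Timings quoted in the abstract: (cores, wall-clock minutes per simulated month).
RUNS = {
    "N96L85 AMIP": (256, 169),
    "N96L85 CMIP": (256, 205),
    "N48L70 AMIP": (252, 33),
}

MINUTES_PER_DAY = 24 * 60
MONTHS_PER_YEAR = 12


def core_hours_per_month(cores: int, minutes: float) -> float:
    """Core-hours consumed to integrate one simulated month."""
    return cores * minutes / 60.0


def simulated_years_per_day(minutes: float) -> float:
    """Simulated years completed per wall-clock day at a constant pace (assumption)."""
    return MINUTES_PER_DAY / minutes / MONTHS_PER_YEAR


def summarize(runs=RUNS):
    """Map each run name to (core-hours per month, simulated years per day)."""
    return {
        name: (core_hours_per_month(cores, minutes), simulated_years_per_day(minutes))
        for name, (cores, minutes) in runs.items()
    }


if __name__ == "__main__":
    for name, (core_hours, sypd) in summarize().items():
        print(f"{name}: {core_hours:.1f} core-hours/month, {sypd:.2f} simulated years/day")
```

Under these assumptions the N48L70 run costs roughly a fifth of the core-hours of the N96L85 AMIP run per simulated month, which is the kind of trade-off a group sizing its own cluster would want to check.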

A Study on Scalability of Profiling Method Based on Hardware Performance Counter for Optimal Execution of Supercomputer (슈퍼컴퓨터 최적 실행 지원을 위한 하드웨어 성능 카운터 기반 프로파일링 기법의 확장성 연구)

  • Choi, Jieun;Park, Guenchul;Rho, Seungwoo;Park, Chan-Yeol
    • KIPS Transactions on Computer and Communication Systems / v.9 no.10 / pp.221-230 / 2020
  • A supercomputer that shares limited resources among multiple users needs a way to optimize application execution. To this end, it is useful for system administrators to have prior information and hints about the applications to be executed. In most high-performance computing system operations, administrators strive to increase system productivity by collecting information about execution duration and resource requirements from users when they submit tasks. They also use profiling techniques that generate the necessary information from statistics such as system usage to increase system utilization. In a previous study, we proposed a scheduling optimization technique built on a hardware performance counter-based profiling technique that characterizes applications without requiring an understanding of the source code. In this paper, we construct a profiling testbed cluster to support optimal execution on the supercomputer and evaluate the scalability of the profiling method for analyzing application characteristics in that cluster environment. We also show experimentally that the profiling method remains usable for actual scheduling optimization even when the application class is reduced or the number of nodes used for profiling is minimized. When the number of nodes used for profiling was reduced to 1/4, the execution time of the application increased by 1.08% compared with profiling on all nodes, and scheduling optimization performance improved by up to 37% compared with sequential execution. In addition, profiling with a reduced problem size cut the cost of collecting profiling data to a quarter and yielded a performance improvement of up to 35%.

A Contextual Information and Physics-based Mobile Augmented Reality Contents Manipulation Method (맥락 정보와 물리적 속성 부여가 가능한 모바일 증강 현실 콘텐츠 조작 방법)

  • Hong, Dong-Pyo;Lee, Jeong-Gyu;Chae, Chang-Hun;Lee, Jong-Weon;Ko, Kwang-Hee;Woo, Woon-Taek
    • 한국HCI학회:학술대회논문집 / 2009.02a / pp.526-530 / 2009
  • In this paper, we propose a contextual information and physics-based contents manipulation method for a mobile augmented reality (AR) authoring system. Owing to the proliferation of ubiquitous computing in information technology (IT) and advances in sensor technology and mobile devices, AR systems that were previously possible only on PCs are now feasible on mobile devices. In addition, many AR systems have been proposed that utilize sensory data and reflect it in the augmented contents. The proposed method provides appropriate visual cues for 3D manipulation of the augmented contents. In addition, users can manipulate the augmented contents with sensory information by assigning sensors to the contents. Moreover, it supports not only a physics-based contents loader that enables users to specify physical properties for the contents, but also the transform matrix between AR and physics engine coordinates. To show the feasibility of the proposed method, we implemented a mobile augmented reality authoring system. We believe that the proposed method can be a key factor for context-aware mobile AR authoring systems.


A High Performance Flash Memory Solid State Disk (고성능 플래시 메모리 솔리드 스테이트 디스크)

  • Yoon, Jin-Hyuk;Nam, Eyee-Hyun;Seong, Yoon-Jae;Kim, Hong-Seok;Min, Sang-Lyul;Cho, Yoo-Kun
    • Journal of KIISE:Computing Practices and Letters / v.14 no.4 / pp.378-388 / 2008
  • Flash memory has been attracting attention as the next mass storage medium for mobile computing systems such as notebook computers and UMPCs (Ultra Mobile PCs) due to its low power consumption, high shock and vibration resistance, and small size. A storage system with flash memory excels in random read, sequential read, and sequential write. However, it falls short in random write because flash memory physically cannot overwrite data unless the data is first erased. To overcome this shortcoming, we propose an SSD (Solid State Disk) architecture with two novel features. First, we use non-volatile FRAM (Ferroelectric RAM) in conjunction with NAND flash memory, combining FRAM's fast access speed and ability to overwrite with NAND flash memory's low and affordable price. Second, the architecture categorizes host write requests into small random writes and large sequential writes, and processes them with two different buffer management schemes, each optimized for its type of write request. This scheme has been implemented in an SSD prototype and evaluated with a standard PC environment benchmark. The results show that our architecture outperforms a conventional HDD and other commercial SSDs by more than three times in throughput for random access workloads.
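The request categorization described in this abstract can be illustrated with a small router. This is only a sketch under stated assumptions: the 8-sector threshold, the contiguity rule, and the names `WriteRouter` and `submit` are hypothetical, and the abstract does not say how the authors' prototype actually classifies requests or which buffer is backed by which memory.

```python
from dataclasses import dataclass, field

SMALL_WRITE_SECTORS = 8  # hypothetical threshold, not taken from the paper


@dataclass
class WriteRouter:
    """Splits host writes into two queues, one per buffer-management scheme."""

    small_random: list = field(default_factory=list)
    large_sequential: list = field(default_factory=list)
    next_lba: int = -1  # address right after the previous request

    def submit(self, lba: int, sectors: int) -> str:
        # Assumption: a request that continues the previous one counts as sequential.
        contiguous = lba == self.next_lba
        self.next_lba = lba + sectors
        if sectors > SMALL_WRITE_SECTORS or contiguous:
            self.large_sequential.append((lba, sectors))
            return "large_sequential"
        self.small_random.append((lba, sectors))
        return "small_random"


if __name__ == "__main__":
    router = WriteRouter()
    for lba, sectors in [(100, 4), (104, 4), (5000, 64), (7, 1)]:
        print(lba, sectors, router.submit(lba, sectors))
```

Keeping the two classes in separate queues is what lets each buffer policy be tuned independently, which is the design point the abstract emphasizes.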

Volume Rendering System of e-Science Electron Microscopy using Grid (Gird를 이용한 e-사이언스 전자현미경 볼륨 랜더링 시스템)

  • Jeong, Won-Gu;Jeong, Jong-Man;Lee, Ho;Choe, Sang-Su;Ahn, Young-heon;Hur, Man-Hoi;Kim, Jay;Kim, Eunsung;Jung, Im Y.;Yeom, Heon Y.;Cho, Kum Won;Kweon, Hee-Seok
    • Proceedings of the Korea Contents Association Conference / 2007.11a / pp.560-564 / 2007
  • Korea Basic Science Institute (KBSI) has three general electron microscopes, including the only High Voltage Electron Microscope (HVEM) in Korea. Images observed through an electron microscope are captured and saved at successive tilt steps, and reconstructing them in 3D is an essential process for giving observers a better view. In this process, a warping method minimizes distortion in the regions that fall outside the camera's focus. All of these image processing and 3D reconstruction steps require a high-performance computer; here, a number of Grid node personal computers share the work and complete it in a short time. The purpose of the Grid node personal computers is to let owners efficiently share diverse, heterogeneous computing resources, and the Grid also addresses problems that arise in building such a system, such as task scheduling, resource management, security, performance measurement, and status monitoring. The Grid node personal computers take on the role of a high-performance computer that ordinary individuals find hard to access, and image processing with the warping method provides the basis for a reconstruction that is closer in shape to the real object under observation. An electron microscope volume rendering system built on Grid node personal computers and the warping process can offer observers a more convenient and faster experimental environment, and gives them experimental results that resemble the real shapes and are easy to understand.


Real-Time GPU Task Monitoring and Node List Management Techniques for Container Deployment in a Cluster-Based Container Environment (클러스터 기반 컨테이너 환경에서 실시간 GPU 작업 모니터링 및 컨테이너 배치를 위한 노드 리스트 관리기법)

  • Jihun, Kang;Joon-Min, Gil
    • KIPS Transactions on Computer and Communication Systems / v.11 no.11 / pp.381-394 / 2022
  • Recently, due to the personalization and customization of data, Internet-based services face increasing requirements for real-time processing, such as real-time AI inference and data analysis, which must be handled immediately according to the user's situation or requirements. Real-time tasks have a set deadline from the start of each task to the return of its results, and meeting that deadline is directly linked to the quality of the service. However, traditional container systems are limited in operating real-time tasks because they cannot allocate and manage deadlines for tasks executed in containers. In addition, tasks such as AI inference and data analysis typically use graphics processing units (GPUs), and such tasks affect each other's performance because performance isolation is not provided between containers. Moreover, the resource usage of a node alone cannot determine the deadline guarantee rate of each container or whether a new real-time container should be deployed. In this paper, we propose a monitoring technique for tracking and managing deadlines and the execution status of real-time GPU tasks in containers, and a node list management technique for placing containers on appropriate nodes so that deadlines are met. Furthermore, we demonstrate experimentally that the proposed technique has a very small impact on the system.

An Efficient Buffer Page Replacement Strategy for System Software on Flash Memory (플래시 메모리상에서 시스템 소프트웨어의 효율적인 버퍼 페이지 교체 기법)

  • Park, Jong-Min;Park, Dong-Joo
    • Journal of KIISE:Databases / v.34 no.2 / pp.133-140 / 2007
  • Flash memory has penetrated our lives in various forms. For example, flash memory is an important storage component of ubiquitous computing and mobile products such as cell phones, MP3 players, PDAs, and portable storage kits. Behind this wide acceptance are the many advantages of flash memory, such as low power consumption, non-volatility, stability, and portability. In addition to these strengths, the recent development of gigabyte-range flash memory suggests that flash memory might replace part of the storage area now dominated by hard disks. However, flash memory cannot be overwritten in place: a block must be erased before it is overwritten. As a result, the costs of reading, writing, and erasing differ in flash memory [1][5][6]. Since this difference has not been considered in the traditional buffer replacement techniques adopted in system software such as operating systems and DBMSs, a new buffer replacement strategy is necessary. In this paper, we propose a new buffer replacement strategy that reflects the different I/O costs and is applicable to flash memory, and compare it with other buffer replacement strategies using workloads with a Zipfian distribution and real data.
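The idea of letting asymmetric I/O costs steer buffer replacement can be illustrated with a clean-first variant of LRU. The abstract does not describe the authors' strategy, so this is a swapped-in, well-known idea used only as a sketch; the class name, the `window` parameter, and the write counter are assumptions made for illustration.

```python
from collections import OrderedDict


class CleanFirstLRU:
    """LRU buffer that prefers evicting clean pages near the LRU end."""

    def __init__(self, capacity: int, window: int):
        self.capacity = capacity
        self.window = window        # how many LRU-end pages to scan for a clean victim
        self.pages = OrderedDict()  # page_id -> dirty flag, oldest first
        self.flash_writes = 0       # dirty evictions that would be written back to flash

    def access(self, page_id: int, write: bool = False) -> None:
        if page_id in self.pages:
            # Hit: keep the page dirty if it ever was written.
            dirty = self.pages.pop(page_id) or write
        else:
            if len(self.pages) >= self.capacity:
                self._evict()
            dirty = write
        self.pages[page_id] = dirty  # (re)insert at the MRU end

    def _evict(self) -> None:
        candidates = list(self.pages.items())[: self.window]
        for page_id, dirty in candidates:
            if not dirty:
                del self.pages[page_id]  # clean victim: no flash write needed
                return
        # No clean page in the window: fall back to plain LRU and pay for a write.
        self.pages.popitem(last=False)
        self.flash_writes += 1


if __name__ == "__main__":
    buf = CleanFirstLRU(capacity=3, window=2)
    for page, write in [(1, True), (2, False), (3, False), (4, False)]:
        buf.access(page, write)
    print(list(buf.pages), buf.flash_writes)
```

Plain LRU would have evicted dirty page 1 here and paid for a flash write; the clean-first scan evicts page 2 instead, trading a possible extra read later for an avoided write and erase.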

FAST : A Log Buffer Scheme with Fully Associative Sector Translation for Efficient FTL in Flash Memory (FAST :플래시 메모리 FTL을 위한 완전연관섹터변환에 기반한 로그 버퍼 기법)

  • Park Dong-Joo;Choi Won-Kyung;Lee Sang-Won
    • The KIPS Transactions:PartA / v.12A no.3 s.93 / pp.205-214 / 2005
  • Flash memory is rapidly being adopted as storage for personal information devices, ubiquitous computing environments, mobile phones, consumer electronics, and so on. This is because flash memory offers low power consumption, non-volatile storage, high performance, physical stability, and portability. However, unlike hard disks, it has the weakness that an already written block cannot be overwritten in place. To make an overwrite possible, an erase operation must first be performed on the written block, which significantly lowers the performance of flash memory. To address this problem, the flash memory controller maintains a system software module called the flash translation layer (FTL). Of the many proposed FTL schemes, the log block buffer scheme is the best known so far. This scheme uses a small number of log blocks of flash memory as a write buffer, which reduces the number of erase operations caused by overwrites and leads to good performance. However, this scheme suffers from low page usability of log blocks. In this paper, we propose an enhanced log block buffer scheme, FAST (Fully Associative Sector Translation), which improves the page usability of each log block by fully associating the sectors to be written by overwrites with the entire set of log blocks. We also show that our FAST scheme outperforms the log block buffer scheme.
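The page-usability argument in this abstract can be made concrete by counting how many log blocks a sequence of overwrites occupies under each association policy. This is a deliberately simplified sketch: it assumes an unlimited supply of log blocks, ignores merge and erase operations, and uses a hypothetical block size of four pages, so it illustrates the idea rather than the FAST implementation.

```python
PAGES_PER_BLOCK = 4  # hypothetical; real flash blocks hold far more pages


def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division."""
    return -(-a // b)


def log_blocks_block_associative(writes):
    """Baseline: each log block is dedicated to a single data block."""
    pages_used = {}
    for sector in writes:
        data_block = sector // PAGES_PER_BLOCK
        pages_used[data_block] = pages_used.get(data_block, 0) + 1
    return sum(ceil_div(n, PAGES_PER_BLOCK) for n in pages_used.values())


def log_blocks_fully_associative(writes):
    """Fully associative: an overwrite goes to the next free page of any log block."""
    return ceil_div(len(writes), PAGES_PER_BLOCK)


if __name__ == "__main__":
    scattered = [0, 4, 8, 12]  # one overwrite in each of four data blocks
    print(log_blocks_block_associative(scattered))  # four mostly empty log blocks
    print(log_blocks_fully_associative(scattered))  # one fully used log block
```

With scattered overwrites the dedicated policy leaves three of every four log pages unused, while the fully associative policy packs the same writes into a single block; for writes that stay within one data block the two policies occupy the same number of blocks.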

An Integrated Access Control for Sharing of E-Science Grid Resources (유휴 멀티 e-Science 그리드 자원 공유를 위한 통합 자원 접근 제어)

  • Jung, Im-Y.;Jung, Eun-Jin;Yeom, Heon-Y.
    • Journal of KIISE:Computer Systems and Theory / v.35 no.9_10 / pp.452-465 / 2008
  • This paper proposes a lightweight, seamless integrated access control for global e-Science resource sharing. E-Science, based on Grid computing, was designed to help scientists remotely control and process Grid resources such as high-end equipment and remote machines. As many researchers use e-Science Grids, researchers in one Grid often have to wait for or give up on Grid resources even when there are idle resources in other Grids. In this case, provided that proper compensation is given, Grid resource sharing benefits both the researchers and the Grids that provide their resources. However, sharing Grid resources globally is not simple, because each e-Science Grid is designed specifically for resource sharing within its Virtual Organization (VO) and already has its own access control policy for its resources. This paper proposes a new integrated access control for e-Science Grid resource sharing. The access control is lightweight because it requires no a priori service level agreements (SLAs) among the Grids that share their resources, and seamless because users can use the shared resources as if they belonged to their own Grids, without additional registration with the other Grids.

Odysseus/IR: a High-Performance ORDBMS Tightly-Coupled with IR Features (오디세우스/IR: 정보 검색 기능과 밀결합된 고성능 객체 관계형 DBMS)

  • Whang Kyu-Young;Lee Min-Jae;Lee Jae-Gil;Kim Min-Soo;Han Wook-Shin
    • Journal of KIISE:Computing Practices and Letters / v.11 no.3 / pp.209-215 / 2005
  • Conventional ORDBMS vendors provide extension mechanisms for adding user-defined types and functions to their own DBMSs. Here, the extension mechanisms are implemented using a high-level interface. We call this technique loose-coupling. The advantage of loose-coupling is that it is easy to implement. However, it is not preferable for implementing new data types and operations in large databases when high performance is required. In this paper, we propose to use the notion of tight-coupling to satisfy this requirement. In tight-coupling, new data types and operations are integrated into the core of the DBMS engine. Thus, they are supported in a consistent manner with high performance. This tight-coupling architecture is being used to incorporate information retrieval (IR) features and spatial database features into the Odysseus/IR ORDBMS that has been under development at KAIST/AITrc. In this paper, we introduce Odysseus/IR and explain its tightly-coupled IR features (U.S. patented). We then demonstrate a web search engine that is capable of managing 20 million web pages in a non-parallel configuration using Odysseus/IR.