• Title/Summary/Keyword: large-scale scientific data (거대 과학 데이터)

Search Result 21, Processing Time 0.021 seconds

Implementation of marine static data collection and DB storage algorithms (해양 정적 데이터 수집 및 DB 저장 알고리즘 구현)

  • Seung-Hwan Choi;Gi-Jo Park;Ki-Sook Chung;Woo-Sug Jung;Kyung-Seok Kim
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.23 no.2 / pp.95-101 / 2023
  • Globally, the utilization and management of marine spatial information is growing in importance, and analysis of such data is emerging as a major driving force for R&D. In Korea, collecting marine data from the past to the present and extracting its value is expected to play an important role in the country's future scientific development. In particular, marine static data constitutes a very large database, and because its collection requires high costs and advanced observation techniques, the collected data must be stored and preserved without loss. In addition, the Disaster Safety Intelligence Convergence Center's "Marine Digital Twin Establishment and Utilization-Based Technology Research" task requires the collection and analysis of marine data. This paper therefore surveys the current status of static marine data and presents a series of algorithms that collect the data and store it in a database.

K-WeldPredictor based on SSH Port Forwarding using Supercomputer (슈퍼컴퓨터를 활용한 SSH 포트 포워딩 기반의 용접 시뮬레이터)

  • Kim, Myung-Il;Kim, Seung-Hae;Yang, Yuping
    • Proceedings of the Korean Information Science Society Conference / 2012.06b / pp.7-9 / 2012
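  • Supercomputers have mainly been used to solve large-scale scientific problems, but their use in industry for developing new products and technologies has recently been expanding. To encourage this trend, industrial R&D staff need an environment in which they can run simulations on a supercomputer easily and conveniently. Such an environment provides a foundation for low-cost, high-efficiency product development and can help improve the global competitiveness of industry. Welding is widely used to join not only large structures such as ships and aircraft but also small metal structures. K-WeldPredictor is a supercomputer-based simulator that allows arc welding, which accounts for about 80% of welding technology, to be simulated from a website. It was developed through technical cooperation with WeldPredictor in the United States, and a separate web server was used to comply with the supercomputer's strict security requirements. In particular, SSH local port forwarding was used for secure data transfer between the web server and the supercomputer.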
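
As a rough illustration of the SSH local port forwarding mentioned in this abstract, the sketch below builds the argument list for an OpenSSH client invocation using the standard `-N` (no remote command) and `-L` (local forward) options. The function name, ports, and host names are hypothetical placeholders; the abstract does not describe K-WeldPredictor's actual configuration.

```python
def local_forward_command(local_port, target_host, target_port, gateway):
    """Build an OpenSSH argument list for local port forwarding.

    Connections to localhost:local_port are tunnelled through `gateway`
    and delivered to target_host:target_port as seen from the gateway.
    All names and values here are illustrative assumptions.
    """
    for port in (local_port, target_port):
        if not 1 <= port <= 65535:
            raise ValueError(f"invalid TCP port: {port}")
    # -N: do not run a remote command, only forward ports.
    # -L: local_port:target_host:target_port forwarding specification.
    return ["ssh", "-N", "-L", f"{local_port}:{target_host}:{target_port}", gateway]


# Hypothetical example: a web server reaching a job service behind a login node.
cmd = local_forward_command(8080, "localhost", 9000, "user@login.example.org")
print(" ".join(cmd))
```

The resulting list could be passed to `subprocess.Popen` on a machine with an OpenSSH client installed; key management and host verification are left out of this sketch.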

A synchronous/asynchronous hybrid parallel method for some eigenvalue problems on distributed systems

  • 박필성
    • Proceedings of the Korean Society of Computational and Applied Mathematics Conference / 2003.09a / pp.11-11 / 2003
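  • Today, solutions are being attempted for problems too large to be handled by a single supercomputer. Such problems can be run effectively in a GRID environment, in which geographically distributed supercomputers, databases, scientific instruments, and display devices are connected by high-speed networks. GRID emerged in the mid-1990s from research on distributed computing for science and engineering, and its range of applications is gradually widening. However, a distributed environment such as GRID differs in many respects from a conventional single parallel system, and earlier techniques cannot simply be applied as they are. Conventional parallel systems mainly use synchronous algorithms, which require synchronization to obtain the same results as serial computation and for which load balancing is essential. However, on a heterogeneous cluster where processor performance differs, or in a GRID environment that uses geographically distributed computing resources, idle time inevitably grows, owing not only to heterogeneity but also to delays in transmitting messages over the network. Asynchronous iteration was introduced as one way to resolve the computational delays caused by the need for synchronization, and it remains an active research topic. Its aim is to reduce the idle time of fast processors by removing as many synchronization points from the algorithm as possible. In an asynchronous algorithm, each processor proceeds with its next task without waiting for updated data to arrive from other processors. Compared with a synchronous algorithm, which exchanges updated data simultaneously before moving to the next step, an asynchronous algorithm often uses data that have not yet been updated, so its convergence rate relative to the amount of computation can be slower overall. However, because each processor computes with almost no idle time, it takes less wall clock time than a synchronous algorithm, and results up to 50% faster have been reported. Research to date, however, has been limited to solving linear systems that satisfy certain convergence conditions, and has been reported only for shared-memory systems, where implementation is comparatively easy. In this study, to assess whether asynchronous iteration can be applied to finding the dominant eigenpairs of a matrix, we first experimented with the theoretically simple power method and concluded that purely asynchronous iteration has difficulty converging. We therefore propose a hybrid parallel algorithm that adds asynchronous elements to a synchronous algorithm, and we implemented it with MPI (Message Passing Interface) on the Hydra cluster at the University of Suwon. The results show that the hybrid algorithm substantially mitigates the drop in overall convergence speed that occurs when a particular node performs markedly worse than the others.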
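
For readers unfamiliar with the power method named in this abstract, here is a minimal serial sketch in plain Python. It shows only the textbook iteration for the dominant eigenpair, not the synchronous/asynchronous hybrid scheme or its MPI implementation, which the abstract does not specify. The function name, starting vector, tolerance, and iteration cap are assumptions chosen for illustration.

```python
import math


def power_method(matrix, num_iters=200, tol=1e-10):
    """Estimate the dominant eigenpair of a square matrix (list of rows).

    Serial textbook power iteration; parameters are illustrative defaults.
    """
    n = len(matrix)
    vec = [1.0] * n  # assumed starting vector
    eigval = 0.0
    for _ in range(num_iters):
        # Multiply the current vector by the matrix.
        product = [sum(row[j] * vec[j] for j in range(n)) for row in matrix]
        norm = math.sqrt(sum(x * x for x in product))
        if norm == 0.0:
            raise ValueError("iteration collapsed to the zero vector")
        new_vec = [x / norm for x in product]
        # Rayleigh quotient of the normalized vector estimates the eigenvalue.
        av = [sum(row[j] * new_vec[j] for j in range(n)) for row in matrix]
        new_eigval = sum(new_vec[i] * av[i] for i in range(n))
        converged = abs(new_eigval - eigval) < tol
        vec, eigval = new_vec, new_eigval
        if converged:
            break
    return eigval, vec


value, vector = power_method([[2.0, 0.0], [0.0, 1.0]])
print(value, vector)
```

In a parallel setting, the matrix-vector product is the step that would be split across processors, and the normalization is a natural synchronization point, which is where waiting on a slow node would show up.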

External Merge Sorting in Tajo with Variable Server Configuration (매개변수 환경설정에 따른 타조의 외부합병정렬 성능 연구)

  • Lee, Jongbaeg;Kang, Woon-hak;Lee, Sang-won
    • Journal of KIISE / v.43 no.7 / pp.820-826 / 2016
  • There is a growing need for big data processing, which extracts valuable information from large amounts of data. The Hadoop system employs the MapReduce framework to process big data. However, MapReduce has limitations such as inflexible and slow data processing. To overcome these drawbacks, SQL query processing techniques known as SQL-on-Hadoop were developed. Apache Tajo, one of the SQL-on-Hadoop systems, was developed by a Korean development group. External merge sort is one of the most heavily used algorithms in Tajo's query processing. Its performance in Tajo is influenced by two parameters, sort buffer size and fanout. In this paper, we analyzed the performance of external merge sort in Tajo with various sort buffer sizes and fanouts. We found two major causes of the performance differences: CPU cache misses, which increase as the sort buffer size grows, and the number of merge passes, which is determined by the fanout.
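
The dependence of merge passes on fanout can be sketched with the standard textbook model of external merge sort: the input is first cut into sorted runs of one sort buffer each, and every pass then merges up to `fanout` runs into one. This is a generic model under those assumptions, not Tajo's actual implementation, and the function and parameter names are hypothetical.

```python
def merge_passes(data_size, sort_buffer_size, fanout):
    """Count merge passes in a textbook external merge sort.

    data_size and sort_buffer_size share one unit (e.g. MB); fanout is the
    number of runs merged at once. A generic model, not Tajo-specific.
    """
    if data_size <= 0:
        return 0
    if sort_buffer_size <= 0 or fanout < 2:
        raise ValueError("sort_buffer_size must be positive and fanout at least 2")
    # Initial sorted runs: one per buffer-load of input (ceiling division).
    runs = -(-data_size // sort_buffer_size)
    passes = 0
    while runs > 1:
        # Each pass merges groups of up to `fanout` runs into single runs.
        runs = -(-runs // fanout)
        passes += 1
    return passes


print(merge_passes(1000, 10, 10))  # 100 runs -> 10 -> 1
print(merge_passes(1000, 10, 2))   # 100 runs halved (rounding up) each pass
```

Under this model, a larger fanout lowers the pass count roughly logarithmically, while a larger sort buffer lowers the initial run count; the cache-miss effect the abstract reports for large buffers is not captured here.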

Analysis of privacy issues and countermeasures in neural network learning (신경망 학습에서 프라이버시 이슈 및 대응방법 분석)

  • Hong, Eun-Ju;Lee, Su-Jin;Hong, Do-won;Seo, Chang-Ho
    • Journal of Digital Convergence / v.17 no.7 / pp.285-292 / 2019
  • With the spread of PCs, SNS, and IoT, large amounts of data are being generated, and the volume is increasing exponentially. Artificial neural network learning, which makes use of these huge amounts of data, has attracted attention in many fields in recent years. It has shown tremendous potential in speech recognition and image recognition and is widely applied to a variety of complex areas such as medical diagnosis, artificial intelligence games, and face recognition. The results of artificial neural networks are accurate enough to surpass humans. Despite these advantages, privacy problems remain in artificial neural network learning. Training data contains various kinds of information, including sensitive personal information, so privacy can be exposed by malicious attackers. Privacy risks arise when an attacker interferes with training to degrade it or attacks a model that has completed training. In this paper, we analyze recently proposed attack methods against neural network models and the corresponding privacy protection methods.

Improving the I/O Performance of Disk-Based Graph Engine by Graph Ordering (디스크 기반 그래프 엔진의 입출력 성능 향상을 위한 그래프 오더링)

  • Lim, Keunhak;Kim, Junghyun;Lee, Eunjae;Seo, Jiwon
    • KIISE Transactions on Computing Practices / v.24 no.1 / pp.40-45 / 2018
  • With the advent of big data and social networks, large-scale graph processing has become a popular research topic. Recently, an optimization technique called Gorder was proposed to improve the performance of in-memory graph processing. It improves performance by optimizing the graph layout in memory for better cache locality. However, because it is designed for in-memory graph processing systems, the technique is not suitable for disk-based graph engines, and the cost of applying it is significantly high. To solve this problem, we propose a new graph ordering called I/O Order. I/O Order considers the characteristics of I/O accesses on SSDs and HDDs to improve the performance of disk-based graph engines. In addition, the algorithmic complexity of I/O Order is lower than that of Gorder, so it is cheaper to apply. I/O Order reduces the pre-processing cost by up to 9.6 times compared with Gorder, while its performance is 2 times higher than that of the Random ordering in low-locality graph algorithms.

Analysis of the Abstract Structure in Scientific Papers by Gifted Students and Exploring the Possibilities of Artificial Intelligence Applied to the Educational Setting (과학 영재의 논문 초록 구조 분석 및 이에 대한 인공지능의 활용 가능성 탐색)

  • Bongwoo Lee;Hunkoog Jho
    • Journal of The Korean Association For Science Education / v.43 no.6 / pp.573-582 / 2023
  • This study aimed to explore the potential use of artificial intelligence in science education for gifted students by analyzing the structure of abstracts written by students at a gifted science academy and comparing the performance of various elements extracted using AI. The study involved an analysis of 263 graduation theses from S Science High School over five years (2017-2021), focusing on the frequency and types of background, objectives, methods, results, and discussions included in their abstracts. This was followed by an evaluation of their accuracy using AI classification methods with fine-tuning and prompts. The results revealed that the frequency of elements in the abstracts written by gifted students followed the order of objectives, methods, results, background, and discussions. However, only 57.4% of the abstracts contained all the essential elements, such as objectives, methods, and results. Among these elements, fine-tuned AI classification showed the highest accuracy, with background, objectives, and results demonstrating relatively high performance, while methods and discussions were often inaccurately classified. These findings suggest the need for a more effective use of AI, through providing a better distribution of elements or appropriate datasets for training. Educational implications of these findings were also discussed.

A Transdisciplinary and Humanistic Approach on the Impacts by Artificial Intelligence Technology (인공지능과 디지털 기술 발달에 따른 트랜스/포스트휴머니즘에 관한 학제적 연구)

  • Kim, Dong-Yoon;Bae, Sang-Joon
    • Journal of Broadcast Engineering / v.24 no.3 / pp.411-419 / 2019
  • Nowadays we can hardly consider or imagine anything without taking into account what is called Artificial Intelligence. Even broadcasting media technologies cannot be thought of outside this newly emerging technology of A.I. Since the last part of the 20th century, this technology has seemingly been accelerating its development thanks to an enormous computational capacity for processing data. In conjunction with firmly established worldwide platform companies like GAFA (Google, Amazon, Facebook, Apple), the key cutting-edge technologies dubbed NBIC (Nanotech, Biotech, Information Technology, Cognitive science) converge to change the map of current civilization by affecting the human relationship with the world and hence modifying what is essential in humans. Under the sign of the converging technologies, relatively recently coined concepts such as 'trans(post)humanism' are emerging in the academic sphere in North America and the major European regions. Even though the so-called trans(post)human movements are prevailing in the major technological hubs, these terms have not yet reached unanimous acceptance among experts from diverse fields. Indeed, trans(post)humanism, as a somewhat obscure term, has been a largely controversial trend, because opinions differ widely among scientific, philosophical, medical, and engineering scholars such as Peter Sloterdijk, K. N. Hayles, Neil Badmington, Raymond Kurzweil, Hans Moravec, Laurent Alexandre, and Gilbert Hottois, to name just a few. However, considering the dazzling development of artificial intelligence technology, which basically functions in conjunction with the cybernetic communication system first conceived by Norbert Wiener, the MIT mathematician, we cannot avoid questioning what A.I. signifies and how it will affect the current media communication environment.

A New Data Warehousing System Architecture Supporting High Performance View Maintenance (고성능 뷰 관리를 지원하는 새로운 데이터 웨어하우징 시스템 구조)

  • Kim, Jeom-Su;Lee, Do-Heon;Lee, Dong-Ik
    • Journal of KIISE:Software and Applications / v.26 no.10 / pp.1156-1166 / 1999
  • A decision support system (DSS) commonly requires fast access to a tremendous volume of information. A data warehouse is a database that stores information extracted, filtered, and integrated in advance from several relevant local databases so that aggregate queries can be answered quickly. The information stored in the data warehouse can be regarded as materialized views over the local databases, and each materialized view must be updated as the corresponding local databases change to preserve data consistency. In this paper, we propose a data warehousing system architecture allowing information sharing (DAWINS) and a non-compensating materialized view maintenance algorithm (NCA). The DAWINS architecture places a local data manager for each local database, which manages the information extracted from it and allows that information to be shared among the individual view managers. Unlike pre- or post-compensating algorithms, which must remove the effects of some events on other views during view maintenance, NCA does not require any additional query processing, because a local data manager in DAWINS already maintains the effects of update events occurring in local systems.

An Embodiment of High Energy Physics Data Grid System (고에너지물리 데이타 그리드 시스템의 구현)

  • Cho Ki-Hyeon;Han Dae-Hee;Kwon Ki-Hwan;Kim Jin-Cheol;Yang Yu-Chul;Oh Young-Do;Kong Dae-Jung;Suh Jun-Suhk;Kim Dong-Hee;Son Dong-Chul
    • Journal of KIISE:Computer Systems and Theory / v.33 no.7 / pp.390-398 / 2006
  • The objective of high energy physics (HEP) is to understand the basic properties of elementary particles and their interactions. The CMS (Compact Muon Solenoid) experiment at CERN will produce a few petabytes of data, and its collaboration numbers around 2000 physicists. This amount of data cannot be processed with the current concept of computing. Therefore, high energy physics uses the concepts of Tiers and the Data Grid. We also apply the Data Grid to current high energy physics experiments. In this paper, we report on the High Energy Physics Data Grid System as an application of the Grid.