• Title/Summary/Keyword: Scientific Workflow


A Design of Integrated Scientific Workflow Execution Environment for A Computational Scientific Application (계산 과학 응용을 위한 과학 워크플로우 통합 수행 환경 설계)

  • Kim, Seo-Young;Yoon, Kyoung-A;Kim, Yoon-Hee
    • Journal of Internet Computing and Services / v.13 no.1 / pp.37-44 / 2012
  • Numerous scientists engaged in compute-intensive research require more computing capacity than before, even as computing resources and techniques become increasingly advanced. For this reason, many e-Science environments have been actively funded and established around the world, yet scientists still look for an intuitive experimental environment that guarantees improved facilities without additional configuration or installation. In this paper, we present an integrated scientific workflow execution environment for scientific applications that supports workflow design on high-performance computing infrastructure and is accessible from a web browser. The portal automatically executes computation jobs consecutively, in the order defined by its workflow design tool, and its execution service batches jobs over distributed grid resources according to each job's characteristics. The portal's workflow editor presents a high-level, easy-to-use front end together with a monitoring service that shows the status of workflow execution in real time, so users can inspect intermediate data during an experiment. Scientists can therefore take advantage of this environment to improve the productivity of HTC-based research.
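
The execution model described in this abstract, consecutive jobs run in the order fixed by the workflow design, is essentially DAG execution in topological order. Below is a minimal sketch of that idea in Python; it is our illustration, not the portal's code, and `submit`/`poll` are hypothetical batch-system hooks.

```python
# Sketch only: run workflow jobs in dependency order over a batch system.
# submit() and poll() are hypothetical grid hooks, not a real API.
from graphlib import TopologicalSorter  # Python 3.9+

def run_workflow(dag, submit, poll):
    """dag: {job: set of jobs it depends on}; submit(job) -> job id;
    poll(job_id) blocks until the grid job finishes."""
    ts = TopologicalSorter(dag)
    ts.prepare()
    while ts.is_active():
        ready = ts.get_ready()                     # jobs whose inputs are done
        ids = {job: submit(job) for job in ready}  # batch to grid resources
        for job, jid in ids.items():
            poll(jid)                              # wait for completion
            ts.done(job)                           # unlock dependent jobs
```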

Development of a Grid-based Framework for High-Performance Scientific Knowledge Discovery (그리드 기반의 고성능 과학기술지식처리 프레임워크 개발)

  • Jeong, Chang-Hoo;Choi, Sung-Pil;Yoon, Hwa-Mook;Choi, Yun-Soo
    • The Journal of the Korea Contents Association / v.9 no.12 / pp.877-885 / 2009
  • In this paper, we propose SINDI-Grid, a high-performance framework for scientific and technological knowledge discovery based on grid computing. Exploiting the advantages of grid computing, namely large-volume data repositories and high-speed computing power, the SINDI-Grid framework provides a variety of grid services for distributed data analysis and scientific knowledge processing. The SINDI-Workflow tool builds on these services to design and execute scientific and technological knowledge discovery applications that integrate various information-processing algorithms.

Bioworks - A scientific workflow platform for problem solving in biological domain (Workflow 기반의 생명정보 분석 자동화 환경 구축에 관한 연구)

  • Hahn, Youngmahn;Lee, Sang-Joo
    • Proceedings of the Korea Contents Association Conference / 2007.11a / pp.550-552 / 2007
  • Bioworks is designed to model, visualize, automate, and execute biological analysis processes as workflows in the biotechnology field. It makes it easier to construct workflow models of complex biological analysis processes and to report the analysis results of each step. Users can carry out their analysis processes using the visualization and validation modules provided as plug-in programs. Bioworks also converts workflows to an XML format so that they can be shared, together with Web Services, to improve research efficiency.
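
As a rough illustration of the XML sharing mentioned in the abstract, a small workflow could be serialized as follows; the element names and tools are invented for the example and are not Bioworks' actual schema.

```python
# Illustrative only: a two-step analysis workflow serialized to XML.
import xml.etree.ElementTree as ET

workflow = ET.Element("workflow", name="align-and-report")  # invented names
ET.SubElement(workflow, "step", order="1", tool="blast")
ET.SubElement(workflow, "step", order="2", tool="clustalw")
print(ET.tostring(workflow, encoding="unicode"))
```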


Design of Grid Workflow System Scheduler for Task Pipelining (작업 파이프라이닝을 위한 그리드 워크플로우 스케줄러 설계)

  • Lee, In-Seon
    • Journal of the Korea Society of Computer and Information / v.15 no.7 / pp.1-10 / 2010
  • The power of computational grid resources can be brought to the user's desktop by employing workflow managers, which also help scientists conveniently assemble and run their own scientific workflows. Typically, stage-in, processing, and stage-out execute serially, and workflow systems automate this sequence. However, as data sizes grow exponentially and more and more scientific workflows require multiple processing steps to obtain the desired output, we argue that data movement will account for a large portion of the overall running time. In this paper, we improve staging time and design a new scheduler that lets the system execute as many jobs concurrently as possible. Our simulation study shows that this approach improves running time by 10% to 40%.
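
The staging improvement argued for here is, in essence, pipelining: overlap the stage-in of the next job with the processing of the current one instead of running stage-in, process, and stage-out strictly in series. A hedged sketch under the assumption of independent jobs (not the paper's scheduler):

```python
# Sketch: hide data movement behind computation with one prefetch thread.
from concurrent.futures import ThreadPoolExecutor

def run_pipelined(jobs, stage_in, process, stage_out):
    """stage_in/process/stage_out are caller-supplied job hooks."""
    with ThreadPoolExecutor(max_workers=1) as mover:
        prefetch = mover.submit(stage_in, jobs[0])
        for i, job in enumerate(jobs):
            prefetch.result()                  # this job's data has arrived
            if i + 1 < len(jobs):              # start moving the next job's
                prefetch = mover.submit(stage_in, jobs[i + 1])
            process(job)                       # compute while data moves
            stage_out(job)
```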

Workflow of Cryo-Electron Microscopy and Status of Domestic Infrastructure

  • Choi, Ki Ju;Shin, Jae In;Lee, Sung Hun
    • Applied Microscopy / v.48 no.1 / pp.6-10 / 2018
  • Cryo-electron microscopy (cryo-EM) allows analysis of the near-native structures of samples such as proteins, viruses, and sub-cellular organelles at the sub-nanometer scale. With the recent development of analytical methods, this technique has achieved remarkable results, and its importance gained wide recognition through last year's Nobel Prize in Chemistry. To help promote knowledge of this technique, this paper introduces the basic workflows of cryo-EM and the domestic cryo-EM service institutes.

Topology-based Workflow Scheduling in Commercial Clouds

  • Ji, Haoran;Bao, Weidong;Zhu, Xiaomin;Xiao, Wenhua
    • KSII Transactions on Internet and Information Systems (TIIS) / v.9 no.11 / pp.4311-4330 / 2015
  • Cloud computing has become a new paradigm by enabling on-demand provisioning of applications, platforms, and computing resources for clients. Workflow scheduling has always been treated as one of the most challenging problems in clouds. Commercial clouds are widely used in scientific research such as biology, astronomy, and weather forecasting, and given the commercial nature of clouds, it is important for a cloud service provider to pursue profit, not least when providing services to workflow tasks. In this paper, we address workflow scheduling in commercial clouds. Our work takes communication into account, which has often been ignored, and proposes a topology-based workflow-scheduling algorithm named Resource Auction Algorithm (REAL) with the objective of increasing profit. The algorithm performs well at finding the optimal schedule for a sample workflow. We also find that there exists a certain resource amount that yields the highest profit, which encourages further development of this research. Experimental results demonstrate that the analysis of the most profitable strategies is reasonable and that REAL efficiently produces an optimized schedule with low computational complexity.
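
To convey the flavor of topology-aware, profit-driven scheduling, here is a toy greedy stand-in; it is not the REAL auction algorithm itself, and the pricing callbacks are hypothetical.

```python
# Toy sketch: walk tasks in topological order and place each on the VM
# that maximizes profit after communication (transfer) cost is charged.
def schedule(tasks, deps, vms, price, run_cost, comm_cost):
    """tasks: ids in topological order; deps: {task: set of parents};
    price(t), run_cost(t, vm), comm_cost(vm_a, vm_b): pricing callbacks."""
    placement = {}
    for t in tasks:
        def profit(vm):
            transfer = sum(comm_cost(placement[p], vm) for p in deps[t])
            return price(t) - run_cost(t, vm) - transfer
        placement[t] = max(vms, key=profit)
    return placement
```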

A Data Placement Scheme for the Characteristics of Data Intensive Scientific Workflow Applications (데이터 집약 과학 워크플로우 응용의 특성을 고려한 데이터 배치 기법)

  • Ahn, Julim;Kim, Yoonhee
    • KNOM Review / v.21 no.2 / pp.46-52 / 2018
  • In data-intensive scientific workflow experiments that leverage a cloud computing environment, large amounts of data can be distributed across multiple data centers in the cloud, and the intermediate data generated during execution can be transmitted between them. Because this intermediate data is reused, the execution results change according to where the data is located. Existing data placement strategies, however, do not consider the characteristics of scientific applications. In this paper, we define data-intensive tasks and propose runtime data placement within those intervals. Using the proposed scheme, we analyze scenarios that vary the number of data-intensive tasks defined in this study and derive the results. In addition, we compare performance by analyzing runtime data placement times and runtime data placement overhead.
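
One simple reading of runtime placement for a data-intensive interval is to co-locate each intermediate dataset with the data center that hosts most of its consumer tasks, so less data crosses data-center boundaries. A sketch under that assumption (the structure and names are ours, not the paper's):

```python
# Sketch: majority-vote placement of intermediate data across data centers.
from collections import Counter

def place_at_runtime(datasets, consumers, task_site):
    """consumers: {dataset: [tasks that read it]};
    task_site: {task: data center where the task runs}."""
    placement = {}
    for d in datasets:
        votes = Counter(task_site[t] for t in consumers[d])
        placement[d] = votes.most_common(1)[0][0]  # majority center wins
    return placement
```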

An Integrated Scientific Workflow Environment over Multiple Infrastructures for Engineering Education of Aerodynamics (다중 인프라 기반의 공력 설계 교육을 위한 과학 워크플로우 통합 환경)

  • Kim, Seoyoung;Kang, Hyejeong;Kim, Yoonhee;Kim, Chongam
    • Journal of Korea Multimedia Society / v.16 no.2 / pp.234-240 / 2013
  • Scientists around the world have long carried out e-Science research to improve computational performance and the flexibility of their experiments, yet they still have difficulty securing high-performance computing facilities. In aerodynamics, for example, a single experiment can consume a tremendous budget and span more than six months, even though researchers have developed diverse improved mathematical methods and relied on advanced computing technologies to reduce runtime and cost. In this paper, we propose a scientific workflow environment over multiple infrastructures for engineering education in aircraft design optimization and demonstrate its advantages. Because it offers diverse kinds of computing resources, it provides elastic capacity regardless of the number of experimental tasks and limitations of space. Applying this environment to engineering education can also improve educational efficiency.

A Multi-objective Optimization Approach to Workflow Scheduling in Clouds Considering Fault Recovery

  • Xu, Heyang;Yang, Bo;Qi, Weiwei;Ahene, Emmanuel
    • KSII Transactions on Internet and Information Systems (TIIS) / v.10 no.3 / pp.976-995 / 2016
  • Workflow scheduling is one of the most challenging problems in cloud computing, especially when service reliability is considered. To improve cloud service reliability, fault-tolerance techniques such as fault recovery can be employed. In practice, fault recovery affects the performance of workflow scheduling, and this impact deserves detailed study, yet only a few works on workflow scheduling consider fault recovery and its impact. In this paper, we investigate workflow scheduling in clouds, considering the probability that cloud resources may fail during execution. We formulate the problem as a multi-objective optimization model whose first objective is to minimize the overall completion time and whose second is to minimize the overall execution cost. Based on this model, we develop a heuristic algorithm called Min-min based Time and Cost Tradeoff (MTCT). We perform extensive simulations with four real-world scientific workflows to verify the validity of the proposed model and to evaluate the performance of our algorithm. The results show that, as expected, fault recovery has a significant impact on both performance criteria, and that the proposed MTCT algorithm is useful for real-life workflow scheduling when both optimization objectives are considered.
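
A hedged reconstruction of a Min-min style time/cost tradeoff follows; this is our sketch, not the authors' MTCT implementation. `eta` and `cost` are assumed estimation callbacks, and `alpha` weighs the two objectives, with both metrics assumed pre-normalized.

```python
# Sketch: Min-min over a weighted sum of completion time and cost.
def min_min_tradeoff(tasks, resources, eta, cost, alpha=0.5):
    """eta(t, r): estimated completion time; cost(t, r): execution cost."""
    def score(t, r):
        return alpha * eta(t, r) + (1 - alpha) * cost(t, r)

    schedule, pending = [], set(tasks)
    while pending:
        best = None
        for t in pending:                 # each task's best resource first,
            r = min(resources, key=lambda r: score(t, r))
            if best is None or score(t, r) < score(*best):
                best = (t, r)             # then the smallest pair overall
        schedule.append(best)
        pending.remove(best[0])
    return schedule
```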

Priority Data Handling in Pipeline-based Workflow (파이프라인 기반 워크플로우의 우선 데이터 처리 방안)

  • Jeon, Wonpyo;Heo, Daeyoung;Hwang, Suntae
    • KIISE Transactions on Computing Practices / v.23 no.12 / pp.691-697 / 2017
  • Volcanic ash is predicted to be the main source of damage from a potential volcanic disaster around Mount Baekdu and the Korean peninsula. Computer simulations that predict the diffusion of volcanic ash must be performed for the prevailing meteorological conditions within a predetermined time, so we propose a pipelined workflow that parallelizes the simulation software. Because of the nature of volcanic calamities, the simulations must be carried out for various plausible conditions, given that the parameters cannot be precisely determined, even at the time of an eruption. Among the given conditions, the one with the highest probability should be computed first, so that an initial response to the disaster can be based on its results; further action can then be taken as subsequent results arrive. The computations run on a volcanic-disaster damage prediction system hosted on a computing server with limited performance, so the computing resources must be distributed optimally. We propose a method through which specific data can be processed first in the proposed pipeline-based workflow.
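
The priority handling described here can be pictured as a max-priority dispatch in front of the pipeline: the highest-probability condition enters first, so its results are available earliest. A minimal illustration with invented names (the actual system schedules pipeline stages, not whole runs):

```python
# Sketch: feed simulation conditions to the pipeline by descending probability.
import heapq

def dispatch(conditions, run_pipeline):
    """conditions: iterable of (probability, condition) pairs."""
    queue = [(-p, i, c) for i, (p, c) in enumerate(conditions)]
    heapq.heapify(queue)                   # negate p: heapq is a min-heap
    while queue:
        _, _, cond = heapq.heappop(queue)
        run_pipeline(cond)                 # most probable condition first
```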