http://dx.doi.org/10.3745/KTSDE.2016.5.11.593

Development of a Simulation Prediction System Using Statistical Machine Learning Techniques  

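Lee, Ki Yong (Division of Computer Science, Sookmyung Women's University)
Shin, YoonJae (Division of Computer Science, Sookmyung Women's University)
Choe, YeonJeong (Division of Computer Science, Sookmyung Women's University)
Kim, SeonJeong (Supercomputing Convergence Research Center, Korea Institute of Science and Technology Information)
Suh, Young-Kyoon (Supercomputing Convergence Research Center, Korea Institute of Science and Technology Information)
Sa, Jeong Hwan (Supercomputing Convergence Research Center, Korea Institute of Science and Technology Information)
Lee, JongSuk Luth (Supercomputing Convergence Research Center, Korea Institute of Science and Technology Information)
Cho, Kum Won (Supercomputing Convergence Research Center, Korea Institute of Science and Technology Information)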
Publication Information
KIPS Transactions on Software and Data Engineering / v.5, no.11, 2016, pp. 593-606
Abstract
Computer simulation is widely used to model the behavior of systems in many computational science and engineering fields, including computational fluid dynamics, nanophysics, computational chemistry, structural dynamics, and computer-aided optimal design. As demands on simulation accuracy and complexity grow, however, the cost of executing simulations is rising rapidly. It is therefore important to reduce total execution time, especially when a simulation is repeated a huge number of times with varying input parameter values. In this paper, we develop a simulation service system that can provide the result of a requested simulation without actually executing it, by recording simulation results and then returning previously obtained or predicted results. To avoid repetitive simulation, the system provides two main functions: (1) storing simulation-result records in a database and (2) predicting, from the database, the result of a requested simulation using statistical machine learning techniques. In our experiments, we evaluate the prediction performance of the system using real airfoil simulation result data. The system showed a very low average error rate, with a minimum of 0.9% for one output variable. Using the system, any user can promptly receive the predicted outcome of a simulation without actually running it, avoiding the heavy burden that execution would otherwise impose on computing and storage resources.
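The two functions described in the abstract (storing results, then answering a request from stored or predicted values) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the class name `SimulationCache`, the in-memory dictionary standing in for the database, and the inverse-distance-weighted k-nearest-neighbor predictor, which is only one example of a statistical learning technique and is not confirmed by the abstract to be the one the authors use.

```python
import math


class SimulationCache:
    """Hypothetical store-then-predict service; not the authors' implementation."""

    def __init__(self, k=3):
        self.k = k          # number of neighbors used for prediction (assumed)
        self.records = {}   # in-memory stand-in for the result database

    def store(self, inputs, output):
        """Record the output of a simulation that was actually executed."""
        self.records[tuple(inputs)] = float(output)

    def request(self, inputs):
        """Return (value, source) without running the simulation."""
        key = tuple(inputs)
        if key in self.records:
            # The same input parameters were simulated before.
            return self.records[key], "stored"
        return self._predict(key), "predicted"

    def _predict(self, key):
        """Inverse-distance-weighted average of the k nearest stored results."""
        if not self.records:
            raise ValueError("no stored results to predict from")
        nearest = sorted(
            self.records.items(),
            key=lambda item: math.dist(item[0], key),
        )[: self.k]
        # Distances are non-zero here because exact matches are handled
        # in request() before this method is called.
        weights = [1.0 / math.dist(x, key) for x, _ in nearest]
        weighted_sum = sum(w * y for w, (_, y) in zip(weights, nearest))
        return weighted_sum / sum(weights)


# Toy usage with made-up one-parameter records (not airfoil data).
cache = SimulationCache(k=2)
cache.store((0.0,), 0.0)
cache.store((2.0,), 4.0)
print(cache.request((2.0,)))  # answered from a stored record
print(cache.request((1.0,)))  # answered by prediction from neighbors
```

In a sketch like this, the stored records double as training data, so prediction quality depends on how densely past simulations cover the input space.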
Keywords
Simulation; Simulation Result Prediction; Statistical Machine Learning;