http://dx.doi.org/10.7472/jksii.2019.20.6.73

KISTI-ML Platform: A Community-based Rapid AI Model Development Tool for Scientific Data  

Lee, Jeongcheol (Center for Computational Science Platform, Korea Institute of Science and Technology Information (KISTI))
Ahn, Sunil (Center for Computational Science Platform, Korea Institute of Science and Technology Information (KISTI))
Publication Information
Journal of Internet Computing and Services / v.20, no.6, 2019, pp. 73-84
Abstract
Machine learning as a service (MLaaS) has recently attracted considerable attention across almost all industries and research groups. Its main appeal is that building a productive service model requires only the data itself: no network servers, no storage, and no in-house data scientists. However, machine learning remains difficult for most developers, especially in traditional scientific fields, where well-structured big data is scarce. Experimental and applied researchers rarely share their results with one another, so building big data in a specific research area is itself a major challenge. In this paper, we introduce the KISTI-ML platform, a community-based rapid AI model development tool for scientific data. It provides a user-friendly online development environment in which machine learning beginners can use their own data to generate code automatically. Users can share datasets and Jupyter interactive notebooks with authorized community members, including know-how such as data preprocessing for feature extraction, hidden network design, and other engineering techniques.
Keywords
Machine Learning; Big Data; MLaaS; Platform; Scientific Data