http://dx.doi.org/10.14400/JDC.2018.16.12.317

Issue Analysis on Gas Safety Based on a Distributed Web Crawler Using Amazon Web Services  

Kim, Yong-Young (Division of International Business, Konkuk University)
Kim, Yong-Ki (Department of Computer Engineering, Chungbuk National University)
Kim, Dae-Sik (Department of Computer Engineering, Chungbuk National University)
Kim, Mi-Hye (Department of Computer Engineering, Chungbuk National University)
Publication Information
Journal of Digital Convergence / v.16, no.12, 2018, pp. 317-325
Abstract
To create new economic value and strengthen national competitiveness, governments and major private companies around the world continue to show interest in big data and to invest heavily in it. Collecting objective data, such as news, presupposes that the integrity and quality of the data can be secured. Researchers and practitioners who want to base decisions or trend analyses on large volumes of objective data, such as portal news, face a basic problem with existing crawler methods: the data collection itself gets blocked. In this study, we implemented a method of collecting web data that addresses this problem of existing crawlers by using the cloud service platform provided by Amazon Web Services (AWS). We then collected articles on 'gas safety' and analyzed the issues they raise. The analysis confirmed that, to ensure gas safety, strategies should be established and systematically operated around five categories: accident/occurrence, prevention, maintenance/management, government/policy, and target.
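The abstract does not say how the crawling work is divided, so the following is only a minimal sketch of one plausible reading: the list of search-result pages is dealt out across several workers, so that no single worker sends every request to the portal. The base URL, the `q` and `page` query parameters, and both function names are hypothetical and are not taken from the paper. Each batch is assumed to be handed to a separate cloud instance.

```python
from urllib.parse import urlencode


def build_page_urls(base_url, query, pages):
    """Build one search-result URL per page number (hypothetical URL layout)."""
    return [
        f"{base_url}?{urlencode({'q': query, 'page': page})}"
        for page in range(1, pages + 1)
    ]


def split_round_robin(urls, num_workers):
    """Deal URLs out to workers in turn so each gets a near-equal share."""
    if num_workers < 1:
        raise ValueError("num_workers must be at least 1")
    return [urls[i::num_workers] for i in range(num_workers)]


# Assumed example values: 10 result pages shared by 3 workers.
urls = build_page_urls("https://news.example.com/search", "gas safety", 10)
batches = split_round_robin(urls, 3)
for worker_id, batch in enumerate(batches):
    print(worker_id, len(batch))
```

With these assumed values, each worker requests three or four pages instead of ten. Whether that is enough to avoid blocking depends on the portal's limits, which the abstract does not describe.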
Keywords
Gas Safety; Issue Analysis; Crawler; Distributed Web Crawler; AWS; Big Data
Citations & Related Records
Times Cited By KSCI: 4