http://dx.doi.org/10.7838/jsebs.2021.26.4.001

Semi-automatic Data Fusion Method for Spatial Datasets  

Yoon, Jong-chan (Department of Electrical and Computer Engineering, University of Seoul)
Kim, Han-joon (Department of Electrical and Computer Engineering, University of Seoul)
Publication Information
The Journal of Society for e-Business Studies / v.26, no.4, 2021, pp. 1-13
Abstract
Advances in big data technologies have made it possible to process volumes of data that previously could not be handled. As a result, an automated process for selecting and fusing data has become a necessity rather than an option for building big data-based services. In this paper, we propose an automation technique that creates meaningful new information by fusing datasets that contain spatial information. First, the given datasets are embedded using the Node2Vec model and the keywords of each dataset. Next, the semantic similarity between every pair of datasets is obtained by computing the cosine similarity of their embedding vectors. Then, a person intervenes to select candidate datasets that have one or more spatial identifiers from among the dataset pairs with relatively high similarity, and the selected pairs are fused and visualized. Through this semi-automatic data fusion process, we show that significant fused information that cannot be obtained from a single dataset can be generated.
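The similarity and candidate-shortlisting steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the embedding vectors are made-up toy values standing in for Node2Vec output, and the dataset names, the `spatial_ids` mapping, and the helper functions are all hypothetical. The sketch only ranks dataset pairs by cosine similarity and keeps pairs in which both datasets have at least one spatial identifier. The final selection is left to a person, as in the paper.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen for this sketch (an assumption)
    return dot / (norm_u * norm_v)


def rank_dataset_pairs(embeddings):
    """Return (similarity, name_a, name_b) tuples, most similar pair first."""
    pairs = [
        (cosine_similarity(embeddings[a], embeddings[b]), a, b)
        for a, b in combinations(sorted(embeddings), 2)
    ]
    return sorted(pairs, reverse=True)


# Toy embedding vectors; in the paper these would come from Node2Vec
# applied to the keywords of each dataset. All values are hypothetical.
embeddings = {
    "bus_stops": [1.0, 0.0, 1.0],
    "subway_stations": [1.0, 0.0, 0.9],
    "restaurant_reviews": [0.0, 1.0, 0.1],
}

# Hypothetical spatial identifiers found in each dataset.
spatial_ids = {
    "bus_stops": {"road_name_address", "coordinates"},
    "subway_stations": {"coordinates"},
    "restaurant_reviews": set(),  # no spatial identifier
}

ranked = rank_dataset_pairs(embeddings)

# Shortlist for human review: both datasets need a spatial identifier.
candidates = [
    (score, a, b)
    for score, a, b in ranked
    if spatial_ids[a] and spatial_ids[b]
]

for score, a, b in candidates:
    print(f"{a} <-> {b}: {score:.3f}")
```

In this toy setup only the pair of transit datasets survives the filter, because the third dataset has no spatial identifier to join on.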
Keywords
Spatial Data; Data Fusion; Embedding; Semantic Similarity; Big Data; Dataset