Comprehensive Knowledge Archive Network harvester improvement for efficient open-data collection and management

  • Kim, Dasol; Gil, Myeong-Seon; Nguyen, Minh Chau; Won, Heesun; Moon, Yang-Sae
    • ETRI Journal / v.43 no.5 / pp.835-855 / 2021
  • With the recent increase in data disclosure, the Comprehensive Knowledge Archive Network (CKAN), an open-source data distribution platform, is drawing much attention. CKAN is used together with additional extensions, such as Datastore and Datapusher for data management and Harvest and DCAT for data collection. This study identifies problems in CKAN itself and in the Harvest Extension. First, data deletion in CKAN causes two problems: data inconsistency and wasted storage space. Second, the Harvest Extension causes three additional problems: source deletion, which deletes only the source without deleting the data themselves; job stop, in which a job cannot be deleted during data collection; and service interruption, in which the service cannot be provided even though the data exist. Based on these observations, we propose an improved CKAN that provides a new deletion function solving the data inconsistency and storage space waste problems. In addition, we present an improved Harvest Extension that solves the three problems of the legacy Harvest Extension. We verify the correctness and usefulness of the improved CKAN and Harvest Extension functions through actual implementation and extensive experiments.
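
The source-deletion problem named in the abstract can be illustrated with a small toy model. This is only a sketch under stated assumptions: `ToyCatalog` and all of its methods are hypothetical names invented for illustration, the model is a plain in-memory structure, and it reflects neither CKAN's actual data model or API nor the authors' implementation. It contrasts deleting only a source record with deleting the source together with the datasets collected from it.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCatalog:
    """Hypothetical in-memory stand-in for a harvesting catalog (not CKAN)."""
    sources: dict = field(default_factory=dict)   # source_id -> source URL
    datasets: dict = field(default_factory=dict)  # dataset_id -> source_id

    def harvest(self, source_id, url, dataset_ids):
        # Register a source and record which datasets came from it.
        self.sources[source_id] = url
        for dataset_id in dataset_ids:
            self.datasets[dataset_id] = source_id

    def delete_source_only(self, source_id):
        # Removes the source record but leaves its datasets behind.
        self.sources.pop(source_id, None)

    def delete_source_cascade(self, source_id):
        # Removes the source and every dataset harvested from it.
        self.sources.pop(source_id, None)
        self.datasets = {
            d: s for d, s in self.datasets.items() if s != source_id
        }

    def orphaned_datasets(self):
        # Datasets whose originating source no longer exists.
        return sorted(
            d for d, s in self.datasets.items() if s not in self.sources
        )
```

In this toy model, calling `delete_source_only("src-a")` after harvesting two datasets from `src-a` leaves both datasets listed by `orphaned_datasets()`, whereas `delete_source_cascade("src-a")` leaves none.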