http://dx.doi.org/10.9717/kmms.2014.17.2.240

Effective Indexing for Evolving Data Collection by Using Ontology  

Kim, Jong Wook (Department of Media Software, Sangmyung University)
Bae, Myung Soo (Department of Radiology, Asan Medical Center, Seoul)
Abstract
Data created and shared on the Web is characterized by massive amounts of user-generated content across various applications, and by content that evolves dynamically with user interests. To benefit from Web data, it is therefore essential to provide (a) mechanisms that enable scalable processing of large data collections and (b) organization schemes that reduce the navigational overhead within complex and dynamically growing content. Of these two pressing needs, this paper focuses on the latter: we develop an indexing scheme that leverages ontologies to reduce the time and effort needed to access a relevant piece of information. In particular, considering the evolving nature of Web content, the proposed technique computes, from an existing large ontology, the sub-ontology that best matches a given data collection. Case studies show that the proposed indexing scheme helps organize dynamically evolving content.
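The abstract does not spell out how the sub-ontology is computed, so the following is only a minimal sketch of the general idea and not the authors' algorithm. It assumes the ontology is a tree stored as a child-to-parent mapping and that a concept "matches" the collection when enough documents mention its label; the function name `extract_sub_ontology`, the `min_docs` cutoff, and the toy data are all hypothetical.

```python
from collections import Counter


def extract_sub_ontology(parent_of, documents, min_docs=1):
    """Return the part of the ontology that the collection actually uses.

    parent_of: dict mapping each concept to its parent (the root maps to None).
    documents: iterable of documents, each a list of lowercase terms.
    min_docs:  hypothetical cutoff, the minimum number of documents that
               must mention a concept for it to be kept.
    """
    # Count, for every concept, how many documents mention it.
    doc_freq = Counter()
    for doc in documents:
        for term in set(doc):
            if term in parent_of:
                doc_freq[term] += 1

    # Keep sufficiently frequent concepts plus all of their ancestors,
    # so the result remains a connected hierarchy.
    kept = set()
    for concept, count in doc_freq.items():
        if count < min_docs:
            continue
        node = concept
        while node is not None and node not in kept:
            kept.add(node)
            node = parent_of.get(node)

    return {concept: parent_of.get(concept) for concept in kept}


# Toy ontology and collection (illustrative data only).
ontology = {
    "science": None,
    "biology": "science",
    "physics": "science",
    "genetics": "biology",
    "optics": "physics",
}
collection = [["genetics", "dna"], ["genetics", "biology"], ["optics"]]

sub_ontology = extract_sub_ontology(ontology, collection, min_docs=2)
```

Under these assumptions only "genetics" reaches the cutoff, so the result keeps "genetics" together with its ancestors "biology" and "science". Because the function depends only on the current documents, it can simply be re-run as the collection evolves.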
Keywords
Evolving Content; Ontology; Navigation
Citations & Related Records
Times Cited By KSCI: 1