Acknowledgement
Supported by: National Research Foundation of Korea
References
- C. Xiao, W. Wang, X. Lin, and J. X. Yu, "Efficient similarity joins for near-duplicate detection," Proc. of the 17th international conference on World Wide Web, pp. 131-140, 2008.
- S. Sarawagi, A. Kirpal, "Efficient set joins on similarity predicates," Proc. of the 2004 ACM SIGMOD international conference on Management of data, pp. 743-754, 2004.
- S. Chaudhuri, V. Ganti, R. Kaushik, "A primitive operator for similarity joins in data cleaning," Proc. of the IEEE 22nd International Conference on Data Engineering, p. 5, 2006.
- X. Yang, B. Wang, and C. Li, "Cost-Based Variable-Length-Gram Selection for String Collections to Support Approximate Queries Efficiently," Proc. of the 2008 ACM SIGMOD international conference on Management of data, pp. 353-364, 2008.
- C. Li, B. Wang, and X. Yang, "VGRAM: Improving Performance of Approximate Queries on String Collections Using Variable-Length Grams," Proc. of the 33rd international conference on Very large data bases, pp. 303-314, 2007.
- C. Xiao, W. Wang, X. Lin, "Ed-Join: an efficient algorithm for similarity joins with edit distance constraints," Proc. of the VLDB Endowment, Vol. 1, No. 1, pp. 933-944, 2008. https://doi.org/10.14778/1453856.1453957
- J. Kim, H. Lee, "Efficient exact similarity searches using multiple token orderings," Proc. of the IEEE 28th International Conference on Data Engineering, pp. 822-833, 2012.
- J. Kim, "An effective candidate generation method for improving performance of edit similarity query processing," Information Systems, pp. 116-128, 2015.
- V. N. Anh and A. Moffat, "Inverted Index Compression Using Word-Aligned Binary Codes," Information Retrieval, pp. 151-166, 2005.
- H. Williams, J. Zobel, "Compressing Integers for Fast File Access," The Computer Journal, pp. 193-201, 1999.
- M. Zukowski, S. Heman, N. Nes, P. Boncz, "Super-Scalar RAM-CPU Cache Compression," Proc. of the IEEE 22nd International Conference on Data Engineering, p. 59, 2006.
- A. Clauset, C. Shalizi, M. Newman, "Power-law distributions in empirical data," SIAM Review, pp. 661-703, 2009.