http://dx.doi.org/10.13089/JKIISC.2013.23.3.383

Dummy Data Insert Scheme for Privacy Preserving Frequent Itemset Mining in Data Stream  

Jung, Jay Yeol (Graduate School of Information Management and Security, Korea University)
Kim, Kee Sung (Graduate School of Information Management and Security, Korea University)
Jeong, Ik Rae (Graduate School of Information Management and Security, Korea University)
Abstract
Data stream mining is a technique for obtaining useful information by analyzing data generated in real time. Within data stream mining, frequent itemset mining finds the itemsets that occur frequently while the data is being transmitted, and these itemsets are used for pattern analysis and marketing in various fields. Existing frequent itemset mining techniques have a problem: a malicious attacker who sniffs the transmitted data can learn the data provider's real-time information. This problem can be solved by inserting dummy data, so that an attacker cannot distinguish the original data from the dummy data in the transmitted stream. In this paper, we propose a method for privacy-preserving frequent itemset mining that uses dummy data insertion. In addition, the proposed method is computationally efficient because it does not require encryption or other mathematical operations.
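The abstract does not say how the dummy data is generated or how the miner separates it from the original data, so the following is only a minimal sketch of the general idea, not the authors' scheme. It assumes the provider tells the miner which stream positions hold dummies through some separate channel; the function names (`insert_dummies`, `strip_dummies`, `frequent_itemsets`) and parameters (`dummy_prob`, `min_support`, `max_size`) are hypothetical.

```python
import random
from collections import Counter
from itertools import combinations

def insert_dummies(transactions, item_universe, dummy_prob, seed=0):
    """Interleave dummy transactions; return (mixed_stream, dummy_positions).

    Assumption: dummy_positions reaches the miner out of band.
    """
    rng = random.Random(seed)
    mixed, dummy_positions = [], set()
    for t in transactions:
        if rng.random() < dummy_prob:
            # Dummy transaction drawn from the same item universe, so an
            # eavesdropper sees it as ordinary traffic.
            size = rng.randint(1, min(3, len(item_universe)))
            dummy_positions.add(len(mixed))
            mixed.append(frozenset(rng.sample(item_universe, size)))
        mixed.append(frozenset(t))
    return mixed, dummy_positions

def strip_dummies(mixed, dummy_positions):
    """Miner side: drop the positions known to be dummies."""
    return [t for i, t in enumerate(mixed) if i not in dummy_positions]

def frequent_itemsets(transactions, min_support, max_size=2):
    """Brute-force support counting for itemsets of up to max_size items."""
    counts = Counter()
    for t in transactions:
        for k in range(1, max_size + 1):
            for subset in combinations(sorted(t), k):
                counts[subset] += 1
    threshold = min_support * len(transactions)
    return {s for s, c in counts.items() if c >= threshold}

# Example usage with made-up data.
real = [{"a", "b"}, {"a", "b", "c"}, {"a"}, {"b", "c"}]
universe = ["a", "b", "c", "d", "e"]
mixed, dummies = insert_dummies(real, universe, dummy_prob=0.5, seed=42)
recovered = strip_dummies(mixed, dummies)
```

In this sketch an eavesdropper who mines `mixed` counts supports over a stream polluted with dummies, while the miner who knows `dummies` mines `recovered` and obtains the same frequent itemsets as from the original data. No encryption is involved, which matches the efficiency claim only in spirit; the real scheme may work quite differently.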
Keywords
Data Stream; Privacy Preserving; Frequent Itemset Mining