Data Mining for Uncertain Data Based on Difference Degree of Concept Lattice

  • Qian Wang (School of Computer Science and Technology, Zhoukou Normal University)
  • Shi Dong (School of Computer Science and Technology, Zhoukou Normal University)
  • Hamad Naeem (School of Computer Science and Technology, Zhoukou Normal University)
  • Received : 2022.03.10
  • Accepted : 2022.11.01
  • Published : 2024.06.30

Abstract

With the rapid development of database technology and the widespread use of database management systems, the volume of stored data keeps growing. Data mining is now widely applied in scientific research, financial investment, marketing, insurance, medical care, and other fields. We discuss data mining technology and analyze its open problems, which motivate research into new data mining methods. Some existing studies do not consider the differences between attributes, which leads to redundancy when constructing concept lattices. This paper proposes a new method for mining uncertain data based on a concept lattice with connotation difference degree (c_diff). The method defines two rules. Excluding attributes with poor discriminative power from the process accelerates the construction of the concept lattice. A new technique for calculating c_diff is also introduced; it does not scan the full database at each layer, which reduces the number of database scans. Experimental results show that the proposed method saves considerable time and improves mining accuracy compared with the U-Apriori algorithm.
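The abstract compares against U-Apriori but does not define c_diff or the two rules, so the sketch below illustrates only the baseline idea: level-wise frequent itemset mining over uncertain data using expected support (the sum, over transactions, of the product of the item probabilities). The function names, the dictionary-based transaction layout, the sample probabilities, and the `min_esup` threshold are all assumptions made for illustration; this is not the authors' implementation.

```python
from itertools import combinations
from math import prod


def expected_support(itemset, transactions):
    """Expected support: sum over transactions of the product of the
    existential probabilities of the items (absent items count as 0)."""
    return sum(
        prod(t.get(item, 0.0) for item in itemset)
        for t in transactions
    )


def frequent_itemsets(transactions, min_esup):
    """Level-wise search in the style of U-Apriori (illustrative only).

    Each transaction is a dict mapping item -> probability in [0, 1].
    Returns a dict mapping frozenset(itemset) -> expected support.
    """
    items = sorted({i for t in transactions for i in t})
    level = [frozenset([i]) for i in items]
    result = {}
    while level:
        # Keep only candidates that reach the expected-support threshold.
        kept = {}
        for cand in level:
            esup = expected_support(cand, transactions)
            if esup >= min_esup:
                kept[cand] = esup
        result.update(kept)

        # Join step: merge pairs of frequent k-itemsets into (k+1)-itemsets
        # and keep a candidate only if all of its k-subsets are frequent.
        next_level = set()
        for a, b in combinations(list(kept), 2):
            union = a | b
            if len(union) == len(a) + 1 and all(
                frozenset(sub) in kept
                for sub in combinations(union, len(a))
            ):
                next_level.add(union)
        level = list(next_level)
    return result


# Hypothetical uncertain database: item -> existence probability.
transactions = [
    {"a": 0.9, "b": 0.8},
    {"a": 0.5, "c": 0.7},
    {"a": 0.8, "b": 0.6, "c": 0.4},
]

if __name__ == "__main__":
    for itemset, esup in frequent_itemsets(transactions, min_esup=1.0).items():
        print(sorted(itemset), round(esup, 3))
```

Note that this baseline rescans every transaction for every candidate at every level, which is the per-layer scanning cost the abstract says the c_diff calculation is designed to reduce.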

Acknowledgement

This work was supported by the Open Foundation of the State Key Laboratory of Networking and Switching Technology (Beijing University of Posts and Telecommunications) (SKLNST-2020-2-01) and by the Key Scientific Research Projects of Colleges and Universities in Henan Province (Grant No. 23A520054).
