Acknowledgement
This research was supported by the National Research Foundation of Korea (NRF) grant funded by the Korean government (Ministry of Science and ICT) (No. NRF-2020R1F1A1A01071036).
References
- Agrawal R and Srikant R (1994). Fast algorithms for mining association rules. In Proceedings of the 20th International Conference on Very Large Data Bases, VLDB, 125, 487-499.
- Agrawal R, Gehrke J, Gunopulos D, and Raghavan P (1998). Automatic subspace clustering of high dimensional data for data mining applications. In Proceedings of the 1998 ACM SIGMOD International Conference on Management of Data, 94-105.
- Barnett V and Lewis T (1984). Outliers in Statistical Data (2nd ed), Wiley, Chichester.
- Beckmann N, Kriegel HP, Schneider R, and Seeger B (1990). The R*-tree: An efficient and robust access method for points and rectangles. In Proceedings of the 1990 ACM SIGMOD International Conference on Management of Data, 322-331.
- Bennett KP, Fayyad U, and Geiger D (1999). Density-based indexing for approximate nearest-neighbor queries. In Proceedings of the Fifth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 233-243.
- Beyer K, Goldstein J, Ramakrishnan R, and Shaft U (1999). When is "nearest neighbor" meaningful? In International Conference on Database Theory, Springer, Berlin, 217-235.
- Breunig MM, Kriegel HP, Ng RT, and Sander J (2000). LOF: Identifying density-based local outliers. In Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data, 93-104.
- Campos GO, Zimek A, Sander J, et al. (2016). On the evaluation of unsupervised outlier detection: Measures, datasets, and an empirical study, Data Mining and Knowledge Discovery, 30, 891-927. https://doi.org/10.1007/s10618-015-0444-8
- Durrant RJ and Kaban A (2009). When is 'nearest neighbour' meaningful: A converse theorem and implications, Journal of Complexity, 25, 385-397. https://doi.org/10.1016/j.jco.2009.02.011
- Eskin E, Arnold A, Prerau M, Portnoy L, and Stolfo S (2002). A geometric framework for unsupervised anomaly detection. In Applications of Data Mining in Computer Security, Springer, Boston, 77-101.
- Fawcett T and Provost F (1997). Adaptive fraud detection, Data Mining and Knowledge Discovery, 1, 291-316. https://doi.org/10.1023/A:1009700419189
- Hawkins DM (1980). Identification of Outliers, Chapman and Hall, London.
- Houle ME, Kriegel HP, Kroger P, Schubert E, and Zimek A (2010). Can shared-neighbor distances defeat the curse of dimensionality? In International Conference on Scientific and Statistical Database Management, Springer, Berlin, 482-500.
- Keller F, Muller E, and Bohm K (2012). HiCS: High contrast subspaces for density-based outlier ranking. In 2012 IEEE 28th International Conference on Data Engineering, 1037-1048.
- Kriegel HP, Kroger P, Schubert E, and Zimek A (2009). Outlier detection in axis-parallel subspaces of high dimensional data. In Pacific-Asia Conference on Knowledge Discovery and Data Mining, Springer, Berlin, 831-838.
- Lazarevic A and Kumar V (2005). Feature bagging for outlier detection. In Proceedings of the Eleventh ACM SIGKDD International Conference on Knowledge Discovery in Data Mining, 157-166.
- Liu FT, Ting KM, and Zhou ZH (2008). Isolation forest. In 2008 Eighth IEEE International Conference on Data Mining, 413-422.
- Muller E, Schiffer M, and Seidl T (2011). Statistical selection of relevant subspace projections for outlier ranking. In 2011 IEEE 27th International Conference on Data Engineering, 434-445.
- Muller E, Assent I, Iglesias P, Mulle Y, and Bohm K (2012). Outlier ranking via subspace analysis in multiple views of the data. In 2012 IEEE 12th International Conference on Data Mining, 529-538.
- Nguyen HV, Muller E, Vreeken J, Keller F, and Bohm K (2013). CMI: An information-theoretic contrast measure for enhancing subspace cluster and outlier detection. In Proceedings of the 2013 SIAM International Conference on Data Mining, 198-206.
- Parsons L, Haque E, and Liu H (2004). Subspace clustering for high dimensional data: A review, ACM SIGKDD Explorations Newsletter, 6, 90-105. https://doi.org/10.1145/1007730.1007731
- Penny KI and Jolliffe IT (2001). A comparison of multivariate outlier detection methods for clinical laboratory safety data, Journal of the Royal Statistical Society: Series D (The Statistician), 50, 295-307. https://doi.org/10.1111/1467-9884.00279
- Powers DM (2020). Evaluation: From Precision, Recall and F-Measure to ROC, Informedness, Markedness and Correlation, arXiv preprint arXiv:2010.16061.
- Procopiuc CM, Jones M, Agarwal PK, and Murali TM (2002). A Monte Carlo algorithm for fast projective clustering. In Proceedings of the 2002 ACM SIGMOD International Conference on Management of Data, 418-427.
- Schubert E and Zimek A (2019). ELKI: A large open-source library for data analysis - ELKI Release 0.7.5 "Heidelberg", arXiv preprint arXiv:1902.03616.
- Silverman BW (1986). Density Estimation for Statistics and Data Analysis, 26, CRC Press.
- Steinbiss V, Tran BH, and Ney H (1994). Improvements in beam search. In Third International Conference on Spoken Language Processing.
- Stephens MA (1970). Use of the Kolmogorov-Smirnov, Cramer-von Mises and related statistics without extensive tables, Journal of the Royal Statistical Society: Series B (Methodological), 32, 115-122. https://doi.org/10.1111/j.2517-6161.1970.tb00821.x
- Tukey JW (1977). Exploratory Data Analysis, Addison-Wesley, Reading, MA.
- Zimek A, Schubert E, and Kriegel HP (2012). A survey on unsupervised outlier detection in high-dimensional numerical data, Statistical Analysis and Data Mining: The ASA Data Science Journal, 5, 363-387. https://doi.org/10.1002/sam.11161