Analytical Approach for Scalable Feature Selection

Yang, Jae-Kyung; Lee, Tae-Han

Journal of Korean Society of Industrial and Systems Engineering

Vol. 29, No. 2 / pp. 75-82 / 2006 / 2005-0461 (pISSN) / 2287-7975 (eISSN)

Society of Korea Industrial and System Engineering

Yang, Jae-Kyung (Department of Industrial and Information Systems Engineering, Research Center of Industrial Technology, Chonbuk National University);
Lee, Tae-Han (Department of Industrial and Information Systems Engineering, Research Center of Industrial Technology, Chonbuk National University)

Published: 2006.06.30

Abstract

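This study proposes an optimization-based feature selection method built on the nested partitions (NP) method, which is grounded in combinatorial optimization theory. The new method uses a heuristic search procedure to find good feature subsets, and it uses random sampling of the data instances (records) to improve the processing time of feature selection. To reduce processing time, the approach adopts a two-stage sampling method that determines the sample size needed to guarantee convergence to a near-optimal solution. This provides a theoretical basis for a precise statement about the quality of the final feature subset when the algorithm terminates in finite time. To illustrate the main results, five data sets of various types were used, and results from five replications of each experiment are reported. They show that the new approach is more efficient in processing time than the existing feature selection method based on the plain NP method.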
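The abstract does not state the sample-size rule used in the paper. As a rough illustration of how a two-stage scheme can work, the sketch below applies a Rinott-style rule from the ranking-and-selection literature: a small first-stage sample estimates the variance of a feature subset's score, and that variance sets the total number of instance samples. The function name `second_stage_size`, the constant `h`, and the tolerance `epsilon` are assumptions for illustration, not values or names taken from the paper.

```python
import math
import statistics


def second_stage_size(first_stage_scores, h, epsilon):
    """Return the total number of samples suggested by a Rinott-style rule.

    first_stage_scores: scores of one feature subset, each measured on a
        separate random sample of instances (hypothetical input).
    h: confidence-related constant, normally looked up in a table
        (assumed to be supplied by the caller).
    epsilon: indifference zone, i.e. the score difference that is
        considered negligible (assumed to be supplied by the caller).
    """
    n0 = len(first_stage_scores)
    if n0 < 2:
        raise ValueError("need at least two first-stage observations")
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    # Sample variance of the first-stage scores.
    s2 = statistics.variance(first_stage_scores)
    # Noisier subsets receive more samples; always take at least one more
    # observation than the first stage already used.
    return max(n0 + 1, math.ceil(h * h * s2 / (epsilon * epsilon)))


# Hypothetical first-stage accuracies of one candidate feature subset.
scores = [0.80, 0.90, 0.70, 0.85, 0.75]
total = second_stage_size(scores, h=3.0, epsilon=0.05)
```

With these illustrative inputs the rule asks for 23 samples in total, so 18 more would be drawn in the second stage. A subset whose first-stage scores are all identical would get only the minimum of one extra sample.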

