Calculation of a Threshold for Decision of Similar Features in Different Spatial Data Sets

Korean title (translated): Calculating a Threshold for Identifying Matching Objects in Heterogeneous Spatial Data Sets

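  • Kim, Jiyoung (Department of Civil and Environmental Engineering, College of Engineering, Graduate School, Seoul National University) ;
  • Huh, Yong (Engineering Research Institute, Seoul National University) ;
  • Yu, Kiyun (Department of Civil and Environmental Engineering, College of Engineering, Seoul National University) ;
  • Kim, Jung Ok (Engineering Research Institute, Seoul National University)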
  • Received : 2013.01.09
  • Accepted : 2013.02.15
  • Published : 2013.02.28

Abstract

Matching features between two different spatial data sets resembles binary classification: each candidate pair is judged to be either matching or non-matching. In this paper, we calculated the matching threshold by applying the equal error rate (EER) to the matching of spatial data sets. The EER is widely used to set thresholds in biometrics, a field in which binary classification is a central research topic. As the threshold is varied progressively, the set of pairs classified as matching changes, and precision and recall change with it. The point at which the trade-off between these two indexes appears is the EER, which is taken as the threshold. Applying this EER-based method to the training data yielded a threshold of 0.802 in shape similarity. Applying the estimated threshold to the test data gave a high F-measure of 0.940, where the F-measure is the index used to evaluate matching performance. We therefore confirmed that the EER yields an accurate threshold without researcher intervention and that EER-based threshold calculation is appropriate for matching different spatial data sets.
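
The threshold search described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes each candidate pair has a shape-similarity score in [0, 1] and a ground-truth label, and it reads the "trade-off point" as the threshold at which precision and recall are closest to each other. In biometrics, the EER is conventionally the point where the false acceptance and false rejection rates are equal, so this reading is an assumption. The function names and the toy scores and labels are hypothetical.

```python
from typing import List, Tuple


def precision_recall(scores: List[float], labels: List[bool],
                     threshold: float) -> Tuple[float, float]:
    """Precision and recall when pairs scoring >= threshold are declared matches."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and not y)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y)
    precision = tp / (tp + fp) if tp + fp else 1.0  # no predicted matches
    recall = tp / (tp + fn) if tp + fn else 0.0     # no true matches
    return precision, recall


def balance_threshold(scores: List[float], labels: List[bool]) -> float:
    """Sweep the observed scores and return the one where |precision - recall| is smallest."""
    def gap(t: float) -> float:
        p, r = precision_recall(scores, labels, t)
        return abs(p - r)
    return min(sorted(set(scores)), key=gap)


def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Hypothetical training pairs: similarity score and whether the pair truly matches.
scores = [0.95, 0.90, 0.85, 0.80, 0.75, 0.70, 0.60, 0.50]
labels = [True, True, True, False, True, False, False, False]

threshold = balance_threshold(scores, labels)
p, r = precision_recall(scores, labels, threshold)
print(threshold, p, r, f_measure(p, r))
```

On this toy data the balance point falls at a score of 0.80, where precision and recall are both 0.75. In the setting the abstract describes, the threshold would be estimated on training pairs and the F-measure then computed on separate test pairs.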
