http://dx.doi.org/10.14699/kbiblia.2017.28.4.301

A Method for Same Author Name Disambiguation in Domestic Academic Papers  

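Shin, Daye (Department of Library and Information Science, Graduate School, Kyungpook National University)
Yang, Kiduk (Department of Library and Information Science, Kyungpook National University)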
Publication Information
Journal of the Korean BIBLIA Society for Library and Information Science / v.28, no.4, 2017, pp. 301-319
Abstract
Author name disambiguation is the task of identifying a single author who appears under different names and distinguishing different authors who share the same name. It is important for correctly assessing authors' research achievements, for finding experts in a given area, and for the effective operation of scholarly information services such as citation indexes. In this study, we corrected errors in the data, normalized it, and applied rule-based author name disambiguation, then compared the results with a baseline machine learning approach to see whether human intervention could improve on machine learning performance. Email-based disambiguation on the corrected and normalized data improved the F-measure by more than 0.1 over machine learning. This result demonstrates that human pattern identification and inference, which made possible both the data correction and normalization process and the formulation of the disambiguation rules, can complement the weaknesses of machine learning and improve author name disambiguation results.
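The abstract does not spell out the rules themselves, so the following Python sketch is only an illustration of the general idea, not the authors' method. It assumes one simple rule (records with the same author name and the same normalized email address belong to the same person) and scores the result with a pairwise F-measure. The function names, the normalization steps, and the sample records are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def normalize_email(email):
    """Trim and lowercase an address; return None if it has no '@'."""
    cleaned = (email or "").strip().lower()
    return cleaned if "@" in cleaned else None


def cluster_by_email(records):
    """Assumed rule: same name + same normalized email = same author.

    Records without a usable email are left as singleton clusters.
    """
    clusters = defaultdict(set)
    for rec_id, name, email in records:
        key_email = normalize_email(email)
        key = (name, key_email) if key_email else (name, None, rec_id)
        clusters[key].add(rec_id)
    return list(clusters.values())


def same_author_pairs(clusters):
    """All unordered record pairs that a clustering puts together."""
    pairs = set()
    for cluster in clusters:
        pairs.update(frozenset(p) for p in combinations(sorted(cluster), 2))
    return pairs


def pairwise_f1(predicted, truth):
    """Pairwise F-measure of a predicted clustering against the truth."""
    pred_pairs = same_author_pairs(predicted)
    true_pairs = same_author_pairs(truth)
    correct = len(pred_pairs & true_pairs)
    if correct == 0:
        return 0.0
    precision = correct / len(pred_pairs)
    recall = correct / len(true_pairs)
    return 2 * precision * recall / (precision + recall)


# Made-up records: (record id, author name, email as printed in the paper).
records = [
    (1, "Kim, Minsu", "MKim@Example.ac.kr "),
    (2, "Kim, Minsu", "mkim@example.ac.kr"),
    (3, "Kim, Minsu", "other@example.org"),
    (4, "Kim, Minsu", None),
]
truth = [{1, 2, 4}, {3}]  # hypothetical gold-standard clusters

predicted = cluster_by_email(records)
score = pairwise_f1(predicted, truth)
```

In this toy data, normalization lets records 1 and 2 match despite differences in case and whitespace, while record 4, which has no email, stays unmatched. That gap is the kind of case where such a rule would need support from other evidence.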
Keywords
Author Name Disambiguation; Machine Learning; Rule-based Method; Heuristic
Citations & Related Records
Times Cited By KSCI: 6