http://dx.doi.org/10.1633/JIM.2011.42.1.047

Author Graph Generation based on Author Disambiguation  

Kang, In-Su (School of Computer, Kyungsung University)
Publication Information
Journal of Information Management / v.42, no.1, 2011, pp. 47-62
Abstract
In an ideal author graph, each node represents an individual author. Automatically generated author graphs, however, mostly use author names as nodes because resolving names into individuals is difficult. Using names as nodes merges namesakes, who should be separate nodes, into a single node, which may distort the characteristics of the graph. This study proposes an algorithm that resolves author name ambiguity based on co-authorship and then produces an author graph whose nodes are authors rather than author names. Because the algorithm relies on scientific collaboration relationships, its clustering results tend to minimize over-clustering errors at the expense of under-clustering errors. In experiments, the algorithm is applied to real citation records containing Korean namesakes, and the results are discussed.
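The idea of co-authorship-based disambiguation can be illustrated with a minimal Python sketch. It is not the paper's algorithm, whose details the abstract does not give. It assumes one simple rule: two records carrying the same author name refer to the same person if they share at least one other co-author name. The function name `build_author_graph`, the input format, and the sample names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def find(parent, x):
    # Union-find lookup with path halving.
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def build_author_graph(records):
    """records: list of author-name lists, one list per paper.

    Returns (node_of, edges). node_of maps each (record_index, name)
    mention to an author node id; edges is a set of frozenset pairs
    of author node ids that co-occur on at least one paper.
    """
    # Each (record index, name) pair is one author mention.
    mentions = [(i, name) for i, names in enumerate(records) for name in set(names)]
    parent = {m: m for m in mentions}

    # Only mentions with the same name can be the same person.
    by_name = defaultdict(list)
    for m in mentions:
        by_name[m[1]].append(m)

    # Assumed rule (not taken from the paper): merge two same-name
    # mentions when their papers share at least one other co-author name.
    for name, group in by_name.items():
        for a, b in combinations(group, 2):
            shared = (set(records[a[0]]) & set(records[b[0]])) - {name}
            if shared:
                parent[find(parent, a)] = find(parent, b)

    node_of = {m: find(parent, m) for m in mentions}

    # Connect author nodes (not name strings) that appear on the same paper.
    edges = set()
    for i, names in enumerate(records):
        for x, y in combinations(sorted(set(names)), 2):
            edges.add(frozenset((node_of[(i, x)], node_of[(i, y)])))
    return node_of, edges


# Hypothetical sample data: three papers listing the name "Kim, J.".
records = [
    ["Kim, J.", "Lee, S."],
    ["Kim, J.", "Lee, S.", "Park, H."],
    ["Kim, J.", "Choi, M."],
]
node_of, edges = build_author_graph(records)
```

In this sample, the "Kim, J." mentions on the first two papers are merged through the shared co-author "Lee, S.", while the "Kim, J." on the third paper stays a separate node. If that third paper was in fact written by the same person, the split is an under-clustering error, which is the kind of error a co-authorship-only rule tends to accept in exchange for rarely merging distinct people.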
Keywords
Author Graph; Author Disambiguation; Community Detection