• Title/Summary/Keyword: Same-Name Authors

Email Extraction and Utilization for Author Disambiguation (저자 식별을 위한 전자메일의 추출 및 활용)

  • Kang, In-Su
    • The Journal of the Korea Contents Association / v.8 no.6 / pp.261-268 / 2008
  • In a bibliographic record, the author of a paper is represented by his or her personal name. However, using names to identify authors can degrade the recall and precision of paper and author searches, since the same name can be shared by many different individuals and one person can write his or her name in different forms. To solve this problem, same-name author instances must be disambiguated into distinct persons. As features for author resolution, previous studies have exploited bibliographic attributes such as co-authors, titles, and publication information. This study applies authors' email addresses to author name disambiguation. We first address the extraction of email addresses from full-text papers, and then evaluate and analyze the effect of email addresses on author resolution using a large-scale test set.
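
The extraction step mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not describe the actual extraction rules, so the patterns below are assumptions: `EMAIL_RE` is a deliberately simple address pattern, and `GROUP_RE` handles the grouped form `{a, b}@domain` that often appears in paper headers. The function name `extract_emails` is hypothetical.

```python
import re

# Assumed, simplified patterns; not the rules used in the paper.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")
GROUP_RE = re.compile(r"\{([^{}]+)\}\s*@\s*([A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+)")


def extract_emails(text):
    """Return lowercased email addresses found in text, without duplicates."""
    found = []
    # Expand grouped forms such as "{kim, lee}@example.ac.kr" first.
    for match in GROUP_RE.finditer(text):
        domain = match.group(2)
        for local in re.split(r"[,;\s]+", match.group(1).strip()):
            if local:
                found.append(f"{local}@{domain}".lower())
    # Then collect ordinary addresses from the rest of the text.
    remaining = GROUP_RE.sub(" ", text)
    found.extend(m.group(0).lower() for m in EMAIL_RE.finditer(remaining))
    # dict.fromkeys keeps first-seen order while dropping duplicates.
    return list(dict.fromkeys(found))


header = "Contact: {kim, lee}@example.ac.kr or Park.J@Example.org."
print(extract_emails(header))
```

Grouped addresses are listed before ordinary ones, so the result does not strictly follow document order. Lowercasing is a normalization choice made here so that the same address written with different capitalization is counted once.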

Name Disambiguation using Cycle Detection Algorithm Based on Social Networks (사회망 기반 순환 탐지 기법을 이용한 저자명 명확화 기법)

  • Shin, Dong-Wook; Kim, Tae-Hwan; Jeong, Ha-Na; Choi, Joong-Min
    • Journal of KIISE: Software and Applications / v.36 no.4 / pp.306-319 / 2009
  • A name is a key feature for distinguishing people, but names often fail to discriminate between people because one author may have multiple names and multiple authors may share the same name. Such name ambiguity affects the performance of document retrieval, web search, and database integration. Bibliographic information in particular may contain many errors, since different authors can share a name and an author name may be misspelled or abbreviated. To solve these problems, the names entered into the database must be disambiguated. In this paper, we propose a method that resolves name ambiguity using social networks constructed from the relations between authors. We evaluated the effectiveness of the proposed system on DBLP data, which provides computer science bibliographic information.
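
The abstract does not spell out the cycle detection algorithm, so the following is only one plausible reading, stated as an assumption: build a co-author graph in which each occurrence of the ambiguous name is a separate node, and treat two occurrences as merge candidates when a short path through other authors connects them (merging them would close a cycle). The function names, the `max_hops` parameter, and the sample data are all hypothetical.

```python
from collections import deque
from itertools import combinations


def build_graph(papers, ambiguous):
    """Build an undirected co-author graph.

    Each occurrence of the ambiguous name becomes its own node (name, paper
    index); all other names are assumed unambiguous and map to one node each.
    """
    graph = {}
    for p, authors in enumerate(papers):
        nodes = [(a, p) if a == ambiguous else a for a in authors]
        for u, v in combinations(nodes, 2):
            graph.setdefault(u, set()).add(v)
            graph.setdefault(v, set()).add(u)
    return graph


def connected_within(graph, src, dst, max_hops):
    """Breadth-first search: is dst reachable from src in at most max_hops edges?"""
    seen = {src}
    queue = deque([(src, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == dst:
            return True
        if dist == max_hops:
            continue
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return False


papers = [
    ["Kim, J", "Lee, S"],
    ["Kim, J", "Park, H"],
    ["Lee, S", "Park, H"],
    ["Kim, J", "Choi, M"],
]
graph = build_graph(papers, "Kim, J")
# Occurrences in papers 0 and 1 are linked via Lee, S and Park, H (3 hops).
print(connected_within(graph, ("Kim, J", 0), ("Kim, J", 1), max_hops=3))
# The occurrence in paper 3 shares no co-author chain with paper 0.
print(connected_within(graph, ("Kim, J", 0), ("Kim, J", 3), max_hops=3))
```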

An analysis on the bibliographical description of the Hong-ssi Tok-so-rok(홍씨독서록) (홍씨독서록의 목록기술방식에 대한 고찰)

  • Lee Sang-Yong
    • Journal of the Korean Society for Library and Information Science / v.27 / pp.215-228 / 1994
  • This study analyzes the background and circumstances of the bibliographical description method used in the Hong-ssi Tok-so-rok, an annotated classified bibliography of Korean and Chinese books edited for the Hongs and their clan. The conclusions are as follows. Each entry of the bibliography is entered under its title, generally followed by the bibliographic elements of volumes, period of writing, author's name, functional word of authorship, and annotation. The period of writing is stated by the dynasty name for the first author within each class. However, some anonymous works and government-compiled works are recorded with the king's shrine name or the reign title. Entries are arranged in chronological order within each class. The writer's name is generally given as 'surname + given name'. However, it is sometimes recorded in one of the following forms: appellation (hao, 호) or posthumous title + surname + given name; surname + appellation or posthumous title + given name; appellation (hao, 호) or posthumous title + surname + Sonsaeng (선생) + given name; surname + government position title + given name; appellation (hao, 호) + surname + cha (자, master); surname + ssi (씨); etc. A married woman's name is stated as her husband's surname followed by the Chinese character 부 or 절부, signifying wife or virtuous woman, and then her given name. Works written or compiled by the king's order (명찬서) are generally described in the form 명제신 + functional word of authorship. Names of government agencies are occasionally stated as the authors of government publications or government-compiled works. The functional words of authorship are given in phrases such as 소작야 and 소편야 instead of 저, 찬, etc. More notably, in collections of individual writers' works, the wording 지문야 or 지시야 is written after the author's name. More complicated descriptive forms are seen in entries for works of shared authorship and mixed responsibility. Two or more monographic works by the same author that fall in the same class are annotated together.

A Method for Same Author Name Disambiguation in Domestic Academic Papers (국내 학술논문의 동명이인 저자명 식별을 위한 방법)

  • Shin, Daye; Yang, Kiduk
    • Journal of the Korean BIBLIA Society for Library and Information Science / v.28 no.4 / pp.301-319 / 2017
  • Author name disambiguation involves identifying a single author who appears under different names and distinguishing different authors who share the same name. It is important for correctly assessing authors' research achievements and finding experts in given areas, as well as for the effective operation of scholarly information services such as citation indexes. In this study, we performed error correction and normalization of the data and applied rule-based author name disambiguation, comparing it with baseline machine learning disambiguation to see whether human intervention could improve machine learning performance. The corrected and normalized email-based disambiguation improved F-measure by more than 0.1 over machine learning. This demonstrates the potential of human pattern identification and inference, which enabled the data correction and normalization process as well as the formation of the disambiguation rules, to complement the weaknesses of machine learning and improve author name disambiguation results.

Application of Machine Learning Techniques for Resolving Korean Author Names (한글 저자명 중의성 해소를 위한 기계학습기법의 적용)

  • Kang, In-Su
    • Journal of the Korean Society for Information Management / v.25 no.3 / pp.27-39 / 2008
  • In bibliographic data, the use of personal names to indicate authors makes it difficult to specify a particular author, since numerous authors share the same personal name. Resolving same-name author instances into different individuals is called author resolution, which consists of two steps: calculating author similarities and then clustering same-name author instances into different person groups. Author similarities are computed from similarities of author-related bibliographic features such as coauthors, titles of papers, and publication information, using supervised or unsupervised methods. Supervised approaches employ machine learning techniques to automatically learn the author similarity function from author-resolved training samples. So far, however, only a few machine learning methods have been investigated for author resolution. This paper provides a comparative evaluation of a variety of recent high-performing machine learning techniques on author disambiguation, and compares several methods of processing author disambiguation features such as coauthors and titles of papers.
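
The two-step structure described above (pairwise similarity, then clustering) can be sketched as follows. This is an unsupervised illustration under stated assumptions, not the supervised methods the paper evaluates: the similarity is a hand-weighted Jaccard overlap of coauthors and title words, and clustering is single-link merging via union-find. The record fields, weights, and threshold are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard overlap of two collections, 0.0 when both are empty."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


def similarity(r1, r2, w_coauthor=0.7, w_title=0.3):
    """Step 1: similarity of two same-name author instances (assumed weights)."""
    return (w_coauthor * jaccard(r1["coauthors"], r2["coauthors"])
            + w_title * jaccard(r1["title"].lower().split(),
                                r2["title"].lower().split()))


def cluster(records, threshold=0.3):
    """Step 2: single-link clustering; returns groups of record indices."""
    parent = list(range(len(records)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(records)), 2):
        if similarity(records[i], records[j]) >= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(records)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


records = [
    {"coauthors": ["Lee, S", "Kim, P"], "title": "Author name disambiguation with coauthors"},
    {"coauthors": ["Lee, S"], "title": "Clustering same-name authors"},
    {"coauthors": ["Park, Y"], "title": "Protein folding simulation"},
]
print(cluster(records))
```

In a supervised setting, the fixed weights in `similarity` would be replaced by a function learned from author-resolved training pairs, while the clustering step could stay the same.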

WordNet-Based Category Utility Approach for Author Name Disambiguation (저자명 모호성 해결을 위한 개념망 기반 카테고리 유틸리티)

  • Kim, Je-Min; Park, Young-Tack
    • The KIPS Transactions: Part B / v.16B no.3 / pp.225-232 / 2009
  • Author name disambiguation is essential for improving the performance of document indexing, retrieval, and web search. It resolves the conflict that arises when multiple authors share the same name label. This paper introduces a novel approach that exploits ontologies and a WordNet-based category utility for author name disambiguation. Our method uses author knowledge in the form of a populated ontology with various types of properties: titles, abstracts, and co-authors of papers, and authors' affiliations. The author ontology was constructed semi-automatically for the artificial intelligence and semantic web areas using the OWL API and heuristics. Author name disambiguation determines the correct author from the candidate authors in the populated author ontology. Candidate authors are evaluated using the proposed WordNet-based category utility to resolve the ambiguity. Category utility is a tradeoff between intra-class similarity and inter-class dissimilarity of author instances, where author instances are described in terms of attribute-value pairs. The WordNet-based category utility exploits concept information in WordNet for semantic analysis during disambiguation. Experiments using the WordNet-based category utility increase the number of disambiguations by about 10% compared with plain category utility, and increase overall accuracy by around 98%.
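
For readers unfamiliar with category utility, the sketch below computes the standard (non-WordNet) form for instances given as attribute-value pairs: CU = (1/K) Σ_k P(C_k) [Σ_{i,j} P(A_i = v_ij | C_k)² − Σ_{i,j} P(A_i = v_ij)²]. This is the textbook definition, assumed here as the baseline; the paper's WordNet-based extension is not reproduced, and the attribute names in the example are hypothetical.

```python
from collections import Counter


def _sum_sq_probs(instances):
    """Sum over all (attribute, value) pairs of P(attribute = value) squared."""
    n = len(instances)
    counts = Counter((attr, val) for inst in instances for attr, val in inst.items())
    return sum((c / n) ** 2 for c in counts.values())


def category_utility(clusters):
    """Standard category utility of a partition; clusters is a list of lists of dicts."""
    all_instances = [inst for c in clusters for inst in c]
    if not all_instances:
        return 0.0
    n = len(all_instances)
    base = _sum_sq_probs(all_instances)  # predictability without any clustering
    gain = sum(len(c) / n * (_sum_sq_probs(c) - base) for c in clusters if c)
    return gain / len(clusters)


a = {"affiliation": "Univ X", "topic": "ai"}
b = {"affiliation": "Univ Y", "topic": "web"}
print(category_utility([[a, a], [b, b]]))  # clean split
print(category_utility([[a, b], [a, b]]))  # mixed split
```

A partition that groups instances with matching attribute values scores higher than one that mixes them, which is the property used to choose among candidate authors.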

Automatic Clustering of Same-Name Authors Using Full-text of Articles (논문 원문을 이용한 동명 저자 자동 군집화)

  • Kang, In-Su; Jung, Han-Min; Lee, Seung-Woo; Kim, Pyung; Goo, Hee-Kwan; Lee, Mi-Kyung; Goo, Nam-Ang; Sung, Won-Kyung
    • Proceedings of the Korea Contents Association Conference / 2006.11a / pp.652-656 / 2006
  • Bibliographic information retrieval systems require bibliographic data such as authors, organizations, and sources of publication to be uniquely identified using keys. In particular, when authors are represented simply by their names, users bear the burden of manually discriminating between different authors with the same name. Previous approaches to the same-name author problem rely on bibliographic data such as co-author information and titles of articles. However, these methods cannot handle single-author articles, or articles whose titles share no common terms. To complement the previous methods, this study introduces a classification-based approach that uses the similarity between the full texts of articles. Experiments on recent domestic proceedings showed that the proposed method has the potential to supplement the previous metadata-based approaches.
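
The abstract does not say which text similarity measure is used, so as an assumption the sketch below uses cosine similarity over raw term-frequency vectors, a common baseline for comparing full texts. The tokenizer and the sample strings are hypothetical stand-ins for real article bodies.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Bag-of-words term frequencies over lowercase alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(u, v):
    """Cosine similarity of two term-frequency Counters (0.0 if either is empty)."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm = (math.sqrt(sum(x * x for x in u.values()))
            * math.sqrt(sum(x * x for x in v.values())))
    return dot / norm if norm else 0.0


doc_a = term_vector("ontology based author disambiguation")
doc_b = term_vector("author disambiguation using ontology reasoning")
doc_c = term_vector("parametric cad model exchange")
print(cosine(doc_a, doc_b) > cosine(doc_a, doc_c))
```

Because it compares article bodies rather than metadata, a measure like this still produces a signal for single-author papers and for papers whose titles have no words in common.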

Review of Author Name Disambiguation Techniques for Citation Analysis (인용분석에서의 모호한 저자명 식별을 위한 방법들에 관한 고찰)

  • Kim, Hyun-Jung
    • Journal of the Korean BIBLIA Society for Library and Information Science / v.23 no.3 / pp.5-17 / 2012
  • In citation analysis, author names are often used as the unit of analysis, and some authors are indexed under the same name in the bibliographic databases from which citation counts are obtained. There are many techniques for author name disambiguation, using supervised, unsupervised, or semi-supervised learning algorithms. Unsupervised approaches use machine learning algorithms to extract the necessary bibliographic information from large-scale databases and digital libraries, while supervised approaches combine manually built training datasets for clustering author groups with learning algorithms for author name disambiguation. The study examines various techniques for author name disambiguation in the hope of finding ways to improve the precision of citation counts in citation analysis, as well as to obtain better results in information retrieval.

An OSI and SN Based Persistent Naming Approach for Parametric CAD Model Exchange (기하공간정보(OSI)와 병합정보(SN)을 이용한 고유 명칭 방법)

  • Han S.H.; Mun D.H.
    • Korean Journal of Computational Design and Engineering / v.11 no.1 / pp.27-40 / 2006
  • The exchange of parameterized feature-based CAD models is important for product data sharing among different organizations and automation systems. The role of feature-based modeling is to generate the shape of a product and capture design intent in a CAD system. A feature is generated by referring to topological entities in a solid. Identifying the topological entities referenced by a feature is essential for exchanging feature-based CAD models through a neutral format. If the CAD data contains the modification history in addition to the construction history, a matching mechanism is also required to find the entity in the new model (post-edit model) that corresponds to a given entity in the old model (pre-edit model). This is known as the persistent naming problem. Additional problems arise from the exchange of parameterized feature-based CAD models. The authors analyze previous studies on persistent naming and the characteristics of exchanging parameterized feature-based CAD models, and propose a solution to the persistent naming problem. The solution consists of two parts: (a) naming of topological entities based on object space information (OSI) and secondary names (SN); and (b) name matching under the proposed naming.

Unity and Consistency in the Romanization of Korean Personal Names. (한국인의 로마자 인명 표기의 통일성과 일관성: ≪영어영문학≫게재자를 중심으로)

  • 김혜숙
    • Korean Journal of English Language and Linguistics / v.1 no.3 / pp.417-435 / 2001
  • The aim of this paper is two-fold. First, it examines how university English teachers romanize their personal names and compares this with the romanization used by the general public, to see whether the two groups are consistent with each other. Second, it explores whether the teachers romanize their personal names consistently and, if they do not, how their romanizations vary. The data used in this study are the romanized names of the 313 authors who published articles in The Journal of English Language and Literature from 1991 to 2000. The study shows that the English teachers and the general public differ in the order of the given name and surname as well as in formatting. Most of the English teachers prefer to put their surnames last, while the majority of the general public put their surnames first. The English teachers opt for Gn-Gn and Gngn, whereas the general public select Gn Gn for their given names. However, both groups generally spell the surname with the same Roman letters. The study also shows that the English teachers frequently reverse the order of the given name and surname and change the formatting of their given names. They do, however, spell their names rather consistently. This result indicates that Koreans may be flexible about the order of the given name and surname and the formatting of their given names. However, they are unlikely to change the spelling of their names even when a new policy on personal names is promulgated.
