• Title/Summary/Keyword: data citation

A Rule-based Approach to Identifying Citation Text from Korean Academic Literature (한국어 학술 문헌의 본문 인용문 인식을 위한 규칙 기반 방법)

  • Kang, In-Su
    • Journal of the Korean Society for Information Management / v.29 no.4 / pp.43-60 / 2012
  • Identifying citing sentences in article full text is a prerequisite for a variety of future academic information services, such as citation-based automatic summarization, automatic generation of review articles, sentiment analysis of citing statements, and information retrieval based on citation contexts. However, finding citing sentences is not easy because of implicit citing sentences, which have no explicit citation markers. While several methods have been proposed to attack this problem for English, such automatic methods are hard to find for Korean academic literature. This article presents a rule-based approach to identifying Korean citing sentences. Experiments show that the proposed method found 30% of the implicit citing sentences in our test data with nearly 70% precision.

Quality Factor: A new Bibliometric Measure for Assessing the Quality of Faculty Research Performance (Quality Factor: 교수연구업적평가를 위한 새로운 계량 지표)

  • Choi, Eun-Ju; Yang, Kiduk; Lee, Hye-Kyung
    • Journal of Korean Library and Information Science Society / v.47 no.2 / pp.287-304 / 2016
  • This paper introduces a new bibliometric measure called Quality Factor, which assesses multiple facets of faculty research performance. Quality Factor is computed from a combination of publication count, citation count, h-index, and Impact Factor. To analyze the relationship between Quality Factor and other bibliometric measures (publication count, citation count, h-index, g-index, Impact Factor), the study collected publication data for 189 Korean Library and Information Science professors from 2001 to 2014, ranked the faculty by each bibliometric measure, and computed Spearman's rank correlations between the rankings. Overall, Quality Factor was correlated with citation-driven measures (citation count, h-index, g-index), but the scatterplot and rank-interval analyses showed it to be distinctive and more discriminating than the other measures.

A Comparative Study on Interdisciplinarity in the Fields of Science and Technology Based on Journal Citation and Web Link Analyses (학술지 인용과 웹 링크 분석을 통한 과학기술분야의 학제성 비교 연구)

  • Jung, Ho-Yeun; Chung, Young-Mee
    • Journal of the Korean Society for Information Management / v.24 no.3 / pp.179-200 / 2007
  • This study identifies the interdisciplinary structures of eight disciplines in science and technology using data from journal citations and web links, and compares interdisciplinarity among these disciplines. Interdisciplinarity refers to interdisciplinary connections among scientific fields, and the degree of interdisciplinarity is measured by the number of associated fields and the rate of self-citation. A rearranged classification scheme for science and technology was adopted to identify the subject categories of journals and web pages. Web link analysis revealed a few additional interdisciplinary connections that were not identified by the journal citation analysis, demonstrating that it is a useful means of investigating the interdisciplinarity of scientific fields. In addition, in most cases the interdisciplinarity of the engineering fields was found to be greater than that of the natural science fields in both analyses.

Domain Analysis on Economics by Utilizing Cocitation Analysis of Multiple Authorship (복수저자기반 동시인용분석을 활용한 지적구조 분석: 경제학 분야를 중심으로)

  • Kwak, Sun-Young; Chung, Eun-Kyung
    • Journal of the Korean Society for Information Management / v.29 no.1 / pp.115-134 / 2012
  • Author co-citation analysis is generally based on the frequency of the first author, because most citation databases include only the first author in the bibliographic information. The purpose of this study is to provide a better knowledge structure by taking multiple authorship into account in author co-citation analysis. To this end, four different data sets were prepared: (1) counting the first author only, (2) counting all the authors without limiting the total frequency, (3) counting all the authors while limiting the total frequency, and (4) counting adjusted frequencies based on the order of authorship. The findings show clear differences between the knowledge structure obtained by counting all the authors and the one obtained by counting only the first author. In addition, depending on the counting method, there are subtle changes in the author membership of clusters.
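
As a rough illustration of how first-author counting and all-author counting can diverge, the sketch below tallies author pairs that appear together in one citing paper's reference list. It is not the procedure used in the study: the function name `cocitation_counts`, the input layout, the sample names, and the choice to treat co-authors of a single cited work as co-cited are all assumptions made for this example, and the frequency limits and order-based weights of data sets (3) and (4) are not modelled.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(reference_lists, first_author_only=True):
    """Count author pairs cited together by the same citing paper.

    reference_lists: one item per citing paper; each item is a list of
    cited works, and each cited work is a list of author names in
    byline order (hypothetical layout for this sketch).
    """
    counts = Counter()
    for cited_works in reference_lists:
        authors = set()
        for work in cited_works:
            # Keep only the first author, or every author of the work.
            authors.update(work[:1] if first_author_only else work)
        # Each unordered author pair is counted once per citing paper.
        for pair in combinations(sorted(authors), 2):
            counts[pair] += 1
    return counts


# Two citing papers, each citing two works (illustrative names).
refs = [
    [["Kim", "Lee"], ["Park"]],
    [["Kim"], ["Choi", "Park"]],
]
first_only = cocitation_counts(refs, first_author_only=True)
all_authors = cocitation_counts(refs, first_author_only=False)
print(first_only[("Kim", "Park")])   # 1
print(all_authors[("Kim", "Park")])  # 2
```

In the second citing paper, Park is a second author, so the Kim and Park pair is only visible when all authors are counted.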

Citations to arXiv Preprints by Indexed Journals and Their Impact on Research Evaluation

  • Ferrer-Sapena, Antonia; Aleixandre-Benavent, Rafael; Peset, Fernanda; Sanchez-Perez, Enrique A.
    • Journal of Information Science Theory and Practice / v.6 no.4 / pp.6-16 / 2018
  • This article presents an approach to studying two fundamental aspects of the prepublication of scientific manuscripts in specialized repositories (arXiv). The first is the size of the interaction between "standard papers" in journals appearing in the Web of Science (WoS), now Clarivate Analytics, and "non-standard papers" (manuscripts appearing in arXiv). Specifically, we analyze the citations found in the WoS to articles in arXiv. The second aspect is how publication in arXiv affects the citation count of authors. The question is whether prepublishing in arXiv benefits authors by increasing their citations, or instead produces a dispersion that would diminish the relevance of their publications in evaluation processes. Data were collected from arXiv, the websites of the journals, Google Scholar, and WoS following a specific ad hoc procedure. The number of citations from journal articles indexed in WoS to preprints in arXiv is not large. We show that citation counts for regular papers and preprints obtained from different sources (arXiv, the journal's website, WoS) give completely different results. This suggests a rather scattered picture of citations that could distort the citation count of a given article against the author's interest. However, the number of WoS references to arXiv preprints is small, which minimizes this potential negative effect.

Some Improvements on H-Index: Measuring Research Outputs by Citations (연구성과 측정을 위한 h-지수의 개량에 관한 연구)

  • Lee, Jae-Yun
    • Journal of the Korean Society for Information Management / v.23 no.3 s.61 / pp.167-186 / 2006
  • The h-index, also called the Hirsch index, is a new tool for measuring research outputs by citations. The h-index is not only easy to calculate but also robust enough to handle various citation data. After Hirsch proposed it in 2005, many researchers applied the h-index to their own areas, and others tried to improve its weak points, such as low discriminating power. This article first reviews several of these efforts and then suggests novel indexes that measure research outputs by citations more fairly and reasonably. Calculating these indexes on both artificial and real data showed that the newly suggested indexes can replace the h-index and its variants.
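
For reference, a minimal sketch of the standard h-index definition (the largest h such that h papers have at least h citations each). It does not reproduce the improved indexes proposed in the article; the function name `h_index` and the sample citation lists are illustrative assumptions.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank  # the top `rank` papers all have at least `rank` citations
        else:
            break
    return h


# Two hypothetical researchers with very different citation totals.
print(h_index([10, 8, 5, 4, 3]))     # 4
print(h_index([100, 80, 50, 4, 3]))  # 4
```

Both lists give the same value even though the second has far more citations, which is one way to see the low discriminating power mentioned in the abstract.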

Paper Recommendation Using SPECTER with Low-Rank and Sparse Matrix Factorization

  • Panpan Guo; Gang Zhou; Jicang Lu; Zhufeng Li; Taojie Zhu
    • KSII Transactions on Internet and Information Systems (TIIS) / v.18 no.5 / pp.1163-1185 / 2024
  • With the sharp increase in the volume of literature data, researchers must spend considerable time and energy locating desired papers. Paper recommendation is a necessary means of solving this problem. Unfortunately, the large amount of data combined with sparsity makes personalized paper recommendation challenging. Traditional matrix decomposition models have cold-start issues. Most overlook the importance of information and fail to consider the introduction of noise when using side information, resulting in unsatisfactory recommendations. This study proposes a paper recommendation method (PR-SLSMF) using document-level representation learning with citation-informed transformers (SPECTER) and low-rank and sparse matrix factorization; it uses SPECTER to learn paper content representations. The model calculates the similarity between papers and constructs a weighted heterogeneous information network (HIN) that includes citation and content similarity information. The method combines LSMF with the HIN, effectively alleviating data sparsity and cold-start issues and avoiding topic drift. We validated the effectiveness of this method, and the necessity of adding side information, on two real datasets.

Court's Criteria for Judging Research Misconduct and JRPE Goals

  • HWANG, Hee-Joong
    • Journal of Research and Publication Ethics / v.1 no.1 / pp.23-28 / 2020
  • Purpose: Focusing on Supreme Court precedents, we intend to establish criteria for judging research misconduct. Research design, data and methodology: In addition, I would like to propose the criteria for judging research misconduct by the KODISA, which applies the court's standards well in practice, and guidelines for preventing research misconduct. Research design, data and methodology: After classifying the cases of research misconduct into six types, the court's judgment and practical application will be reviewed. Results: First, research misconduct that has passed the disciplinary prescription can be punished. This is because the state of illegality continues to this day. Second, even if there were no punishment regulations at the time of research misconduct, it can be retroactively punished with the current punishment regulations. This is because research ethics is a universal and common standard and does not change. Third, if there is a fact that infringes on intellectual property rights, it is presumed unwritten intentions. Therefore, the act of taking and using the work of another person without permission or proper citation procedure, even if it is unintentional and for the public interest, is research misconduct. Fourth, if there is an inappropriate citation notation, the intention of research misconduct is presumed. It is the judgment of the court that even if a quotation is marked, if it is incomplete, it is recognized as plagiarism. Fifth, if the author uses the work of another person without proper source indication, it is plagiarism even if the other person who owns the copyright agrees to it. The understanding or consent of some parties does not justify research misconduct in violation of public trust. Sixth, it is research misconduct to create a new work without citing one's previous work. In addition, even if there is a citation, if the subsequent writing is not original, it is research misconduct.
Conclusions: Academia should clarify the scope of research misconduct by referring to the Research Ethics Regulations of KODISA, and deal with research results that lack value as creative works in the same way as research misconduct.

Research on the type of technology convergence in the medical device industry based on topic modeling and citation analysis (토픽모델링과 인용 분석에 기반한 의료기기 산업의 기술융합 유형 연구)

  • Lee, Seonjae; Lee, Sungjoo; Seol, Hyeonju
    • Journal of the Korea Convergence Society / v.12 no.7 / pp.207-220 / 2021
  • Industrial convergence takes various forms and is driven by various factors, and understanding and categorizing the direction of convergence according to the factors behind it is essential for establishing a company's customized convergence strategy and the government's corporate support policy. In this study, the type of convergence is analyzed from the perspective of knowledge flow between heterogeneous technologies. For this purpose, the study uses the results of topic modeling on patent text together with the citation information of the patents allocated to each topic. The presented methodology is verified through a case study in the medical device field. With the proposed methodology, companies can predict the flow of convergence and use it as decision-making data to create new business opportunities. The government and research institutions are also expected to find it useful as basic data for policy preparation.

The Effect of Patent Citation Relationship on Business Performance: A Social Network Analysis Perspective (특허 인용 관계가 기업 성과에 미치는 영향 : 소셜네트워크분석 관점)

  • Park, Jun Hyung; Kwahk, Kee-Young
    • Journal of Intelligence and Information Systems / v.19 no.3 / pp.127-139 / 2013
  • With the advent of the knowledge-based society, interest in intellectual property has increased. Firms have tried to achieve productive outcomes through continuous innovative activity. In particular, ICT firms, which lead the high-tech industry, have tried to manage intellectual property more systematically. Firms' interest in patents has increased as a way to manage innovative activity and knowledge property. A patent carries not only simple information but also important value as information on technology, management, and rights. Moreover, because a patent contains detailed content on technology development activity, it is regarded as valuable data. Patents, which reflect technology diffusion and research outcomes, are closely interrelated with business performance, as patents are considered a significant indicator of a firm's level of innovation. As patent information representing companies' intellectual capital has accumulated continuously, quantitative analysis has become possible. Patent data have the advantage that related industry information and standardized information can be obtained easily. Through patents, the flow of knowledge can be determined. Patent information can be analyzed at various levels, from the individual patent to the nation. Patent information is used to analyze technical status and effects on performance. A patent with a high citation frequency is regarded as having high technological value. Patent information analysis includes both citation index analysis, which uses the number of citations, and network analysis, which uses citation relationships. Network analysis can provide information on the flows of knowledge and technological change, and it can indicate future research directions. Studies using patent citation analysis vary both academically and practically.
For citation index research, studies analyzing influential major patents have been conducted, and for network analysis research, studies tracing the flows of technology in a certain industry have been conducted. Social network analysis is applied not only in sociology but also in management consulting and corporate knowledge management. Research on how a company's network position affects business performance has been conducted from various angles in the field of network analysis. Social network analysis can be presented in visual form, and network indicators are available through quantitative analysis. Social network analysis is used when analyzing outcomes in terms of network position, and it focuses largely on centrality and structural holes. Centrality indicates that actors in central positions among other actors have an advantage in exerting stronger influence on exchange relationships. Degree centrality, betweenness centrality, and closeness centrality are used for centrality analysis. Structural holes refer to empty places in a social structure and are measured by efficiency and constraint. This study analyzes firms' networks in terms of patents and how network characteristics influence business performance. For this purpose, seventy-four ICT companies listed in the S&P 500 were chosen as the sample. UCINET6 was used to analyze network structural characteristics such as outdegree centrality, betweenness centrality, and efficiency. Regression analysis was then conducted to find out how these network characteristics are related to business performance. Each network index was found to have a significant impact on net income, i.e., business performance. However, efficiency was found to be negatively associated with business performance: as efficiency increases, net income decreases.
Furthermore, in the multiple regression analysis with the three network indexes, only betweenness centrality was statistically significant. Patent citation network analysis shows the flows of knowledge between firms, and it can be expected to contribute to companies' management strategies by analyzing their structural positions in the network.
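
To make one of the network indicators concrete, here is a minimal plain-Python sketch of normalised outdegree centrality on a small directed citation network. It is a stand-in for illustration only, not the UCINET6 workflow used in the study; the function name `outdegree_centrality`, the firm labels, and the convention that an edge points from the citing firm to the cited firm are assumptions.

```python
def outdegree_centrality(edges):
    """Normalised outdegree centrality for a directed graph.

    edges: iterable of (citing, cited) pairs. A node's score is the number
    of distinct other nodes it points to, divided by (n - 1).
    """
    edges = list(edges)
    nodes = {node for edge in edges for node in edge}
    targets = {node: set() for node in nodes}
    for citing, cited in edges:
        if citing != cited:  # ignore self-citations
            targets[citing].add(cited)
    denom = len(nodes) - 1
    return {
        node: (len(cited_set) / denom if denom > 0 else 0.0)
        for node, cited_set in targets.items()
    }


# Hypothetical firms: A cites B and C, B cites C.
scores = outdegree_centrality([("A", "B"), ("A", "C"), ("B", "C")])
print(scores["A"], scores["B"], scores["C"])  # 1.0 0.5 0.0
```

Betweenness centrality and structural-hole efficiency need shortest-path and ego-network computations and are not covered by this sketch.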