1. INTRODUCTION
Metrics/indices play a crucial role for peer-based, metrics-based, or hybrid research evaluation approaches. Selection and usage of indices to appraise quantity and impact of the productive core is a sensitive subject for Research Performance Evaluation (RPE). In evaluative scientometric studies, these parameters are measured by Activity Indicator (AI), Observed Impact Indicator (OII), journal related indices, and/or other newly introduced global indices (h and h-type indices). These indicators stand for the quantity, impact, influence, or quality of the scholarly communication. AI measures the quantity of the productivity core (publication) while OII stands for impact of productivity core (citation and its subsequent metrics).
Disciplinary perspectives, the use of indicators in different contexts, the arbitrary nature of indicators, and electronic publishing scenarios have turned the attention of scientometricians, policymakers, and researchers of other fields to modifying the existing indices and to discovering new metrics to gauge quantity and quality. Citation, its subsequent metrics, and the root indicator publications have a sound place in the decision-making process.
In 2005, Hirsch proposed h-index, which was immediately noticed by the scientometricians and warmly welcomed by all stakeholders. It is defined as: “A scientist has index h if h of his/her Np papers has at least h citations each and the other (Np − h) papers have no more than h citations each” (Hirsch, 2005, p. 16569). The said index aims to measure the impact of scholarly communication in terms of quality (citation) and productivity (publication) in an objective manner. It represents the most productive core of an author’s output in terms of the most cited papers (Burrell, 2007). A continuous debate among scientometricians, policymakers, as well as researchers of other fields has made h-index one of the hottest topics in the history of scientometric research.
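Hirsch's definition translates directly into a short procedure: rank the papers by citations in descending order and find the last rank whose paper has at least that many citations. The following is a minimal sketch; the function name `h_index` and the citation counts are illustrative and not taken from the study's data.

```python
def h_index(citations):
    """Return the h-index for a list of per-paper citation counts."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank  # the paper at this rank still has at least `rank` citations
        else:
            break
    return h


# Hypothetical author with six papers.
print(h_index([10, 8, 5, 4, 3, 0]))  # prints 4
```

Four papers have at least four citations each, and the remaining two have no more than four, so h = 4.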
2. BACKGROUND OF THE STUDY
Rousseau (2006) introduced the term Hirsch core (h-core), which is a group of high-performance publications with respect to the scientist’s career (Jin, et al., 2007). A good indicator should be intuitive and sensitive to the number of uncited papers (Tol, 2009). Such an index should extend beyond the h-core papers (Vinkler, 2007) and “must assign a positive score to each new citation as it occurs” (Anderson, et al., 2008). Notwithstanding its appeal, the h-index also suffers from several implicit disadvantages, such as sensitivity to highly cited papers (Egghe, 2006a; Egghe, 2006b; Norris & Oppenheim, 2010), giving more weight to one or a few highly cited publications (Glänzel, 2006; Egghe, 2006a; Costas & Bordons, 2007), lacking sensitivity to performance change (Bihui, et al., 2007), disadvantaging earlier career work (Glänzel, 2006; Burrell, 2007), and being time dependent (Burrell, 2007). While highly cited papers may represent breakthrough results in computing the h-index (Vinkler, 2007), the index is also criticized for its lack of accuracy and precision (Lehmann, et al., 2005).
Soon after the h-index was introduced, several modifications and improvements were proposed. Due to its persuasive nature, field dependence, self-citation, multi-authorship, and career length were also taken into account (Bornmann, et al., 2011; Norris & Oppenheim, 2009). It is important to note that most of the new indices focus on the h-core only, while the citation distribution in the head and tail cores remains ignored due to their formulaic limitations (Parthap, 2010; Bornmann, et al., 2011; Zhang, 2009; 2013). Fig. 1 shows the head, tail, and h-core. The publications and citations that define the h-index are called the h-core, whereas publications with more or fewer citations than the h-core are defined as the head and tail, respectively.
Fig. 1 h, head, and tail core (a modification of Hirsch’s h-index figure)
The literature reveals that the h-index not only incorporates quantity and quality, but is also simple, efficient, and easy to use. It is favored over other research evaluation metrics for its blend of objectivity and subjectivity and for its scientific and persuasive nature. This index is insensitive to highly cited as well as zero-cited articles and is robust (van Raan 2006; Cronin & Meho 2006; Imperial & Navarro 2007; Oppenheim 2007; Luz, et al., 2008; Bornmann, et al., 2008; Bouabid & Martin 2008; Lazaridis 2009; Norris & Oppenheim, 2010; Tahira, et al., 2013). These underpinnings have led to the introduction of numerous h-type indices, mostly focused on citation distribution issues. We refer to the review studies by Norris and Oppenheim (2010) and Bornmann, et al. (2011). Although the h-index has made its place in Research Performance Evaluation (RPE), there is a need to address its inherent reservations more closely and logically.
3. METHODOLOGY
The actor CPP was considered as a multiplicative connection to the Corrected Quality Ratio (CQ) to incorporate the overall quality of production (Lindsey, 1978). The hG-H model used it to link to publications (Schubert & Glänzel, 2007), and the p-index uses it with citations as the quantity indicator (Parthap, 2010). We consider the CPP actor in order to deal with the core issue of citation distribution that is evident in the classic h-index. The aim is to address the implicit dimensions of the original h-index.
Our proposed index uses ‘Citation Per Publication’ (CPP) as a balancing correction to improve the original h-index underpinnings related to citation distribution issues in the head and tail cores. It is expressed as a multiplicative connection between h and CPP with the geometric mean of these functions (\(\sqrt[3]{h \times h \times CPP}\)) (Fig. 1). We employed a geometric mean to compute different functions, which are multiplied together to produce a single “figure of merit” (Spizman & Weinstein, 2008).
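To make the formula concrete, the following minimal sketch computes h-cpp as the cube root of h × h × CPP. The function names and all citation counts are hypothetical, chosen only to show how two authors with the same h-index are separated once CPP enters the index.

```python
def h_index(citations):
    """Return the h-index for a list of per-paper citation counts."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


def h_cpp(citations):
    """Return the cube root of h * h * CPP, where CPP = total citations / papers."""
    papers = len(citations)
    if papers == 0:
        return 0.0
    h = h_index(citations)
    cpp = sum(citations) / papers
    return (h * h * cpp) ** (1 / 3)


# Three hypothetical authors, all with h = 4.
author_a = [10, 8, 5, 4, 3, 0]                    # moderate head, CPP = 5
author_b = [50, 30, 5, 4, 3, 0]                   # heavily cited head, higher CPP
author_c = [4, 4, 4, 4, 0, 0, 0, 0, 0, 0, 0, 0]   # long uncited tail, CPP below h

for name, cites in [("A", author_a), ("B", author_b), ("C", author_c)]:
    print(name, h_index(cites), round(h_cpp(cites), 3))
```

In this toy data the three authors share h = 4, yet h-cpp ranks B above A and A above C, and the value for C falls below its h-index because its CPP is lower than h.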
Keeping in view the foundation issues of original h-index (see Table 1), we have designed three categories from the proposed h-type indices: modified h-indices, h-type indices dependent on h-core, and h-type indices independent of h-core. These categories are concerned with h-core, head, and tail citation distributions. For the present study, we have considered at least one index from these categories. To avoid redundancy, a few indices which fall in these categories, like hw (Egghe & Rousseau, 2008) and v-index (Riikonen & Vihinen, 2008), are not considered. These selected indices along with the proposed h-cpp index are examined and evaluated to check their performance for evaluation purposes.
Two experiments are conducted at the author level. Jacso wrote a series of articles on the pros and cons of popular online reference-enhanced databases, e.g., Google Scholar, Scopus, and Web of Science (WoS™) (Jacso, 2005a; 2005b; 2008a; 2008b; 2008c). He found WoS™ appropriate for calculating h-index scores (Jacso, 2008c).
The study first refers to the case of the 100 most productive Malaysian related engineers, using data from WoS™ over a ten-year period (2001-2010). Our search term was ‘Malaysia’ and we limited the search to those WoS™ categories that have the word ‘engineering’ in common. The term ‘Malaysian related engineers’ is used for researchers who are affiliated with 11 selected Malaysian universities (> 50 publications) under nine WoS™ engineering categories, for the document types articles and reviews only. The second data set, used as the benchmark, is the 100 most prolific economists dataset from Tol’s study (Tol, 2009), used with his permission.
4. EMINENCE OF SCIENTISTS
The eminence of scientists is manifested by their activity and impact indicators. Overall, much fluctuation is observed among scientists’ positions when applying the original h, H’, and h-cpp indices. The CPP as a quality measure is criticized owing to its penalizing of high productivity (Hirsch, 2005; Tahira, et al., 2013). This fact is evident in Table 2. We discuss the positioning order of these authors by employing the four Cole and Cole (1973) criteria based on publication and citation behavior of author publishing.
Table 1. Salient Features of h-type Indices of Three Designed Categories
either give some insight or add value in one or another way. Here, the question immediately arises, which index is the best to accomplish different dimensions of performance evaluation, with less reservation, or is there any possible improvement to handle the quantity and quality aspects of research evaluation?
Publication is the base from which other measures, such as activity, observed, expected, and relative impact indicators, are developed. Publication is an indicator that is rather easy to handle and can be manipulated purposely. Eventually, these strategies have an effect on impact indices. Let us elaborate the case with a four-group analysis at the author level.
To explore the effect of these strategies on publication and impact behavior, we applied the Cole and Cole (1967; 1973) dichotomous cross-classification criteria to our data on the 100 most productive Malaysian related engineers. We used Costas and Bordons’ (2008) denomination of the groups, as mentioned in Table 2. We formed four groups by using the 50th percentile (median) of P and of CPP as thresholds. The medians of the ‘total number of documents’ and the ‘citations per document rate’ in this case were P50 = 17 and P50 = 4.6, respectively. Researchers are classified into four groups named ‘top producer,’ ‘big producer,’ ‘selective,’ and ‘silent’ (as illustrated in Table 2).
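The dichotomous cross classification can be sketched as follows. The mapping of high/low P and high/low CPP onto the four group names follows the usual reading of the Costas and Bordons denomination and is an assumption here, as is the treatment of values equal to the threshold as "high"; the function name `classify_authors` and the sample authors are hypothetical.

```python
from statistics import median


def classify_authors(authors, p_threshold=None, cpp_threshold=None):
    """Assign each author to one of four groups using P and CPP thresholds.

    authors: dict mapping a name to a (P, CPP) tuple.
    Thresholds default to the medians of the supplied data.
    """
    if p_threshold is None:
        p_threshold = median([p for p, _ in authors.values()])
    if cpp_threshold is None:
        cpp_threshold = median([cpp for _, cpp in authors.values()])

    # Assumed mapping: (high P?, high CPP?) -> group name.
    labels = {
        (True, True): "top producer",
        (True, False): "big producer",
        (False, True): "selective",
        (False, False): "silent",
    }
    groups = {}
    for name, (p, cpp) in authors.items():
        # Assumption: values equal to the threshold count as "high".
        groups[name] = labels[(p >= p_threshold, cpp >= cpp_threshold)]
    return groups


# Hypothetical authors, classified with the medians reported in the text.
sample = {"A": (40, 9.1), "B": (35, 2.0), "C": (10, 8.5), "D": (8, 1.2)}
print(classify_authors(sample, p_threshold=17, cpp_threshold=4.6))
```

Passing the thresholds explicitly reproduces a fixed cut (here P50 = 17 and CPP P50 = 4.6), while omitting them recomputes the medians from whatever data are supplied.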
5. DESCRIPTIVE ANALYSIS AND BOX PLOTS ILLUSTRATIONS OF FOUR GROUPS
The average Citation per Publication (CPP) of selective researchers, as calculated from their group data, is almost the same as that of top producers (8.712 and 8.012) (see Table 3). On the other hand, the big and low producer groups have nearly the same average CPP (3.285 and 3.287). The four groups of Malaysian related engineers are compared for their performance via box plot illustrations (Fig. 2a-c). Compared with the h and g indices, the plots of the revised index demonstrate a better median for extreme upper and lower values.
6. SIGNIFICANCE IN THE DIFFERENCE BETWEEN TYPES OF SCIENTISTS
van Raan (2006) empirically concluded that the h-index is not so good at discriminating between excellent and good peer-rated chemistry groups. Costas and Bordons (2007) observed that highly visible scientists might be underestimated. The performance of traditional metrics (total publications and total citations) and of the h-index was observed to be similar in an institutional-level case study of two groups (RU and non-RU Malaysian universities) using engineering departmental data. We found that only CPP is an exception for RU and non-RU universities (Tahira, et al., 2013). On the other hand, at the researcher level, Costas and Bordons (2008) compared the h and g-indices in a four-group analysis. They argued that the g-index is slightly better at distinguishing authors due to a longer core. Schreiber (2010) also made a similar observation.
In order to determine whether the proposed revision creates any difference between types of scientists, we employed the Mann-Whitney U test on six variables, as shown in Table 4. We hypothesized that these indices are good at discriminating at group level. The test statistic is examined by the Asymptotic Sig. (2-tailed) and Exact Sig. (2-tailed) values, together with their point probability. With reference to the h and g indices, we see no significant difference between big producers and selective researchers, whereas h-cpp does discriminate among all groups, including big producers and selective researchers. In Costas and Bordons’ (2008) case, similar findings were observed for these two groups of natural resources scientists in relation to the h and g indices. They argued that the g-index is slightly better because it is sensitive to selective scientists, and this group shows on average a higher g-index/h-index ratio and better positioning in the g-index ranking.
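For readers who want to see what the group comparison involves, the following is a minimal pure-Python sketch of the Mann-Whitney U statistic with a two-sided asymptotic p-value (normal approximation with tie correction, no continuity correction). It covers only the asymptotic significance, not the exact significance reported in Table 4, and it is not the statistical package used in the study. The function name and the index values for the two groups are hypothetical.

```python
import math


def mann_whitney_u(x, y):
    """Return (U, two-sided asymptotic p-value) for two independent samples."""
    n1, n2 = len(x), len(y)
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])

    # Assign midranks so that tied values share the average of their ranks.
    ranks = [0.0] * len(pooled)
    tie_term = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2  # average of 1-based ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = midrank
        t = j - i
        tie_term += t ** 3 - t
        i = j

    rank_sum_x = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u1 = rank_sum_x - n1 * (n1 + 1) / 2
    u2 = n1 * n2 - u1
    u = min(u1, u2)

    n = n1 + n2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u, 1.0
    z = (u - mean_u) / math.sqrt(var_u)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return u, p_value


# Hypothetical index values for two groups of researchers.
big_producers = [3.1, 3.4, 2.9, 3.8, 3.0]
selective = [4.2, 4.9, 4.4, 5.1, 3.9]
print(mann_whitney_u(big_producers, selective))
```

With samples as small as these, an exact test is generally preferable to the normal approximation, which is one reason to report both significance values.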
Table 2. Typology of Malaysian Related Engineers
Table 3. Descriptive Analysis
Fig. 2 (a-c). Box plot illustrations of h-index, g-index, and h-cpp
Table 4. Statistical Significance in Differences Between Types of Scientists (Mann-Whitney U)
7. VALIDATION OF REVISED INDEX
High correlation among h-type indices is observed in several studies. On the basis of correlation alone, it is not justified to differentiate among the performance of different indices. For the evaluation of the models in Table 1, we apply correlation analysis and three-stage statistical techniques: multiple regression (R) with its Mean Square Error (MSE) and Mean Absolute Error (MAE). MSE and MAE can help to differentiate the performance of these models better (Willmott & Matsuura, 2005). First, we evaluate the case of the 100 Malaysian related engineers; after that, we re-examine the dataset of the 100 most prolific economists in Tol’s study for the same set of indices.
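The two error measures can be sketched as follows. This is a minimal illustration, not the regression pipeline used in the study: the observed values and the two sets of fitted values are hypothetical, chosen so that both fits have the same MAE but different MSE.

```python
def mse(observed, predicted):
    """Mean Square Error: average of squared residuals."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def mae(observed, predicted):
    """Mean Absolute Error: average of absolute residuals."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


# Hypothetical observed values and two hypothetical model fits.
observed = [3.0, 5.0, 8.0, 12.0]
fit_a = [3.5, 4.5, 8.5, 11.5]   # four small residuals of 0.5
fit_b = [3.0, 5.0, 8.0, 10.0]   # one large residual of 2.0

print(mse(observed, fit_a), mae(observed, fit_a))  # prints 0.25 0.5
print(mse(observed, fit_b), mae(observed, fit_b))  # prints 1.0 0.5
```

Because MSE squares the residuals, it penalizes the single large miss in the second fit more heavily, whereas MAE weights all residuals equally; reporting both gives a fuller picture of model fit than the correlation coefficient alone.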
8. MALAYSIAN RELATED ENGINEERS CASE
A whole set of h-type indices is considered for the first case (Table 5). The results indicate that all indices show a high correlation with the traditional metrics, but this relation is stronger with the OII. Only H' shows no correlation with AI and A. H' and h-cpp have a high correlation with CPP (>0.8). The A-index is h-core dependent, and the last two models address the head and tail citation distribution. On the other hand, g (a modification of and substitute for the h-index) and R (h-core dependent) exhibit very good correlation (>0.7), while q2 and hg as composite indices give values of >0.7 and >0.2 with CPP, respectively.
The proposed model (h-cpp) exhibits a highly significant ‘R’, like the other studied indices with the exception of g and R, while lower values of MSE and MAE are observed for h-cpp than for all competing indices (Table 6).
9. PROLIFIC ECONOMICS RESEARCHERS
In the second case (based on Tol’s study), we could evaluate only the h, g, h-cpp, and hg models because the authors’ complete citation data were not available. The high-order correlation of these indices with OII (C and CPP) is presented in Table 7. It is observed that among all indices, h-cpp shows a better correlation with CPP, whereas for C, its correlation is higher than that of h and lower than those of the g and hg indices. All of the studied models (Table 8) have significantly high values of R (>0.9). The revised index shows a slightly higher value of R than the h and hg indices. However, h-cpp shows low values of MSE and MAE in all cases. The revised index is intuitively reasonable and simple to compute. The new development provides a better model fit with fewer statistical errors.
Table 5. Results of Correlation Matrix
Table 6. Results of Regression Analysis
Table 7. Results of Correlation Matrix
Table 8. Results of Regression Analysis
10. CONCLUDING REMARKS
The sole use of CPP as a quality measure is criticized owing to its penalizing of high productivity (Hirsch, 2005; Tahira, et al., 2013). When this actor (CPP) is used with other metrics/indices as CQ, it characterizes the scientific output of researchers with aggregated values in a more balanced way, as observed in the cases of the p-index (Parthap, 2009) and the recently proposed h-cpp. This incorporation holds h as representative of the ‘Quantity of the Productive Core’ and CPP as the ‘Impact of the Productive Core.’ Previously, the actor CPP was used with P and C to equate with the value of the h-index (Schubert & Glänzel, 2007; Parthap, 2010).
In order to tackle the implicit disadvantages of the h-index, we have proposed a revision named h-cpp and empirically examined it for research performance evaluation. The incorporation of CPP as CQ with the h-index makes it sensitive to hyper-cited articles, to publications cited less than the index value, to zero-cited publications, and to cases of identical h-index. CPP is a potential actor along with the h-index to rectify inaccuracy and unfairness for broader impact. Reflection on h-type indices shows that another potential evaluative composite index is the p-index. This composite index incorporates CPP as a corrected quality ratio with the assumption that \(h^2\) is nearly proportional to ‘C’; it assigns more weight to total citations and aims to equate with the h-index.
The beauty of the revised index is that it works closely with h-index theory and includes the implicit dimensions, with a sort of normalization in the dataset. Its value can be greater than, equal to, or less than the classic h-index. A single number cannot reflect all aspects (van Raan, 2005). Although this revision checks h-index robustness as several other h-type indices do (g, hg, \(q^2\), etc.), h-cpp as a composite indicator can be more informative, economical, and robust for RPE. The fact that stands out as fundamental is the need to address the existing underpinnings logically, so as to incorporate the reservations of a good index for research evaluation in a single composite number. Another possibility is to bracket CPP with the h-index in one set (representing both the quantity and impact core) for evaluation purposes, rather than using CQ. We suggest more discussion and analysis at different aggregate levels with various composite indices to explore the dimensions of research activity.
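The relation between the two composite indices can be written out explicitly. The sketch below takes the p-index in the form usually attributed to Parthap (2010), \(p = \sqrt[3]{C \times C / P}\); that form is not spelled out in this paper and is quoted here as an assumption. The constant \(a\) is a hypothetical proportionality factor introduced only for illustration.

\[
\begin{aligned}
CPP &= \frac{C}{P}, \qquad p = \sqrt[3]{C \times CPP}, \qquad h\text{-}cpp = \sqrt[3]{h \times h \times CPP} \\
C \approx a\,h^{2} \;&\Rightarrow\; p \approx \sqrt[3]{a\,h^{2} \times CPP} = \sqrt[3]{a}\;\bigl(h\text{-}cpp\bigr)
\end{aligned}
\]

Under the stated proportionality assumption the two indices differ only by the constant factor \(\sqrt[3]{a}\). Where \(C\) departs from \(a\,h^{2}\), the p-index follows total citations while h-cpp stays anchored to the h-core.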
ACKNOWLEDGEMENTS
One of the authors would like to thank Universiti Teknologi Malaysia and the Higher Education Commission of Pakistan for the award of the International Doctoral Fellowship (IDF) and Partial Support Grant.
References
- Alonso, S., Cabrerizo, F. J., Herrera-Viedma, E., & Herrera, F. (2010). hg-index: A new index to characterize the scientific output of researchers based on the h- and g- indices. Scientometrics, 82(2), 391-400. https://doi.org/10.1007/s11192-009-0047-5
- Anderson, T. R., Hankin, R. K. S., & Killworth, P. D. (2008). Beyond the Durfee square: Enhancing the h-index to score total publication output. Scientometrics, 76, 577-588. https://doi.org/10.1007/s11192-007-2071-2
- Bihui, J., LiMing, L., Rousseau, R., & Egghe, L. (2007). The R- and AR-indices: Complementing the h-index. Chinese Science Bulletin, 52(6), 855-863. https://doi.org/10.1007/s11434-007-0145-9
- Bornmann, L., Mutz, R., Hug, S. E., & Daniel, H.D. (2011). A multilevel meta-analysis of studies reporting correlations between the h index and 37 different h index variants. Journal of Informetrics, 5(3), 346-359. https://doi.org/10.1016/j.joi.2011.01.006
- Bornmann, L., Wallon, G., & Ledin, A. (2008). Is the h index related to (standard) bibliometric measures and to the assessments by peers? An investigation of the h index by using molecular life sciences data. Research Evaluation, 17(2), 149-156. https://doi.org/10.3152/095820208X319166
- Bouabid, H., & Martin, B. (2009). Evaluation of Moroccan research using a bibliometric-based approach: Investigation of the validity of the h-index. Scientometrics, 78(2), 203-217. https://doi.org/10.1007/s11192-007-2005-4
- Burrell, Q. L. (2007). On the h-index, the size of the Hirsch core and Jin's A-index. Journal of Informetrics, 1, 170-177. https://doi.org/10.1016/j.joi.2007.01.003
- Cabrerizo, F.J., Alonso, S., Herrera-Viedma, E., & Herrera, F. (2009). q2-Index: Quantitative and qualitative evaluation based on the number and impact of papers in the Hirsch Core. Journal of Informetrics, 4(1), 23-28.
- Costas, R., & Bordons, M. (2007). The h-index: Advantages, limitations and its relation with other bibliometric indicators at the micro level. Journal of Informetrics, 1, 193-203. https://doi.org/10.1016/j.joi.2007.02.001
- Cronin, B., & Meho, L. (2006). Using the h-index to rank influential information scientists. Journal of the American Society for Information Science and Technology, 57(9), 1275-1278. https://doi.org/10.1002/asi.20354
- Egghe, L. (2006a). An improvement of the h-index: The g-index. ISSI Newsletter, 2(1), 8-9.
- Egghe, L. (2006b). Theory and practice of the g-index. Scientometrics, 69(1), 131-152. https://doi.org/10.1007/s11192-006-0144-7
- Egghe, L., & Rousseau, R. (2008). An h-index weighted by citation impact. Information Processing & Management, 44(2), 770-780. https://doi.org/10.1016/j.ipm.2007.05.003
- Egghe, L. (2012). Remarks on the paper of A. De Visscher, "What does the g-index really measure?" Journal of the American Society for Information Science and Technology, 63(10), 2118-2121. https://doi.org/10.1002/asi.22651
- Glänzel, W. (2006). On the opportunities and limitations of the h-index. Science Focus, 1(1), 10-11.
- Hirsch, J. E. (2005). An index to quantify an individual's scientific research output. Proceedings of the National Academy of Sciences, 102(46), 16569-72. https://doi.org/10.1073/pnas.0507655102
- Imperial, J., & Navarro, A. (2007). Usefulness of Hirsch's h-index to evaluate scientific research in Spain. Scientometrics, 71(2), 271-282. https://doi.org/10.1007/s11192-007-1665-4
- Jin, B. H., Liang, L. M., Rousseau, R., & Egghe, L. (2007). The R- and AR-indices: Complementing the h-index. Chinese Science Bulletin, 52(6), 855-863. https://doi.org/10.1007/s11434-007-0145-9
- Jin, B. H. (2006). h-Index: An evaluation indicator proposed by scientist. Science Focus, 1(1), 8-9.
- Lazaridis, L. (2010). Ranking university departments using the mean h-index. Scientometrics, 82(2), 11-16.
- Lehmann, S., Jackson, A. D., & Lautrup, B. (2008). A quantitative analysis of indicators of scientific performance. Scientometrics, 76(2), 369-390. https://doi.org/10.1007/s11192-007-1868-8
- Lindsey, D. (1978). The corrected quality ratio: A composite index of scientific contribution to knowledge. Social Studies of Science, 8(3), 349-354. https://doi.org/10.1177/030631277800800307
- Luz, M. P. et. al. (2008). Institutional H-Index: The performance of a new metric in the evaluation of Brazilian Psychiatric Post-graduation Programs. Scientometrics, 77(2), 361-368. https://doi.org/10.1007/s11192-007-1964-9
- Mehta, C. R., & Pate, N. R. (2011). IBM SPSS Exact Tests. IBM.P.236. Retrieved on September, 30, 2013, from http://admin-apps.webofknowledge.com/jcr/jcr?pointofentry= home&sid= 4bowr-74js6miuqyl3av.
- Moed, H. F. (2008). UK research assessment exercise: Informed judgments on research quality or quantity? Scientometrics, 74(1), 153-161. https://doi.org/10.1007/s11192-008-0108-1
- Norris, M., & Oppenheim, C. (2010). The h-index: A broad review of a new bibliometric indicator. Journal of Documentation, 66(5), 681-705. https://doi.org/10.1108/00220411011066790
- Oppenheim, C. (2007). Using the h-index to rank influential British researchers in information science and librarianship. Journal of the American Society for Information Science and Technology, 58(2), 297-301. https://doi.org/10.1002/asi.20460
- Parthap, G. (2010). The 100 most prolific economists using the p-index. Scientometrics, 84(1), 167-172. https://doi.org/10.1007/s11192-009-0068-0
- Riikonen, P., & Vihinen, M. (2008). National research contributions: A case study on Finnish biomedical research. Scientometrics, 77(2), 207-222. https://doi.org/10.1007/s11192-007-1962-y
- Rousseau, R. (2006). New developments related to the Hirsch index. Industrial Sciences and Technology, Belgium. Retrieved from http://eprints.rclis.org/6376/.
- Cole, S., & Cole, J. R. (1973). Social stratification in science. Chicago: University of Chicago Press.
- Schreiber, M. (2010). Revisiting the g-index: The average number of citations in the g-core. Journal of the American Society for Information Science and Technology, 61(1), 169-174. https://doi.org/10.1002/asi.21218
- Schreiber, M. (2010). Twenty Hirsch index variants and other indicators giving more or less preference to highly cited papers. Retrieved on October 31, 2013, from arxiv.org/pdf/1005.5227 https://doi.org/10.1002/andp.201000046
- Schreiber, M. (2013). Do we need the g-index? Retrieved on October 27, 2013, from http://arxiv.org/ftp/arxiv/papers/1301/1301.4028.pdf
- Schubert, A., & Glänzel, W. (2007). A systematic analysis of Hirsch-type indices for journals. Journal of Informetrics, 1, 179-184. https://doi.org/10.1016/j.joi.2006.12.002
- Spizman, L., & Weinstein. M. A. (2008). A note on utilizing the geometric mean: When, why and how the forensic economist should employ the geometric mean. Journal of Legal Economics, 15(1), 43-55.
- Tahira, M., Alias, R. A., & Bakri, A. (2013). Scientometric assessment of engineering in Malaysian universities. Scientometrics, 96(3), 865-879. https://doi.org/10.1007/s11192-013-0961-4
- Tol, R. S. J. (2009). The h-index and its alternatives: An application to the 100 most prolific economists. Scientometrics, 80(2), 319-326.
- Willmott, C. J., & Matsuura, K. (2005). Advantages of the Mean Absolute Error (MAE) over the Root Mean Square Error (RMSE) in assessing average model performance. Climate Research, 30, 79-82. https://doi.org/10.3354/cr030079
- van Raan, A. F. J. (2005). Fatal attraction: Conceptual and methodological problems in the ranking of universities by bibliometric methods. Scientometrics, 62(1), 133-143. https://doi.org/10.1007/s11192-005-0008-6
- van Raan, A. F. J. (2006). Comparison of the Hirsch-index with standard bibliometric indicators and with peer judgment for 147 chemistry research groups. Scientometrics, 67(3), 491-502. https://doi.org/10.1556/Scient.67.2006.3.10
- Vinkler, P. (2007). Eminence of scientists in the light of the h-index and other scientometric indicators. Journal of Information Science, 33, 481-491. https://doi.org/10.1177/0165551506072165
- Zhang, C. T. (2009). The e-index, complementing the h-index for Excess Citations. PLOS ONE 4(5), e5429. https://doi.org/10.1371/journal.pone.0005429
- Zhang, C.T. (2013). The h' Index, effectively improving the h-index based on the citation distribution. PLOS ONE, 8(4), e59912. https://doi.org/10.1371/journal.pone.0059912