Durham University

University Library

Using publication and citation metrics appropriately

Many publication or citation metrics have both strengths and weaknesses which may determine how appropriate or relevant they are to use for the purpose identified. These vary from identified bias in existing metrics, to the quality and appropriateness of the data used in their calculation.

What can influence citation metrics?

When using any citation metric, you should be aware of the factors that may affect it.

  • Document Type
  • Subject or Discipline
  • Research Type
  • Equality and Diversity
  • Research Career Stage
  • Time since publication
  • Source of Citation Data
  • Size of citation dataset

See also: Tahamtan, I., Safipour Afshar, A. & Ahamdzadeh, K. Scientometrics (2016) 107: 1195. doi:10.1007/s11192-016-1889-2 (subscription access only).

Document type: in general terms, the type of output can affect the number of citations an output might be expected to accrue.

  • Citation rates differ between document types, for example between monographs and journal articles.
  • Different types of article follow different citation rates: review papers generally attract more citations than non-review papers.
  • Different 'standard' metrics may include or exclude differing document types, and may not be appropriate in some subject areas.

For example, see Glänzel, W. & Moed, H.F. Scientometrics (2002) 53: 171. doi:10.1023/A:1014848323806 (page 8, table 3), Hammarfelt, B. Journal of the American Society for Information Science and Technology (2010) 62(5): 819-830. doi:10.1002/asi.21504 or Torres-Salinas, D. et al. Online Information Review (2013) 38(1). doi:10.1108/OIR-10-2012-0169

Subject or Discipline: Publication and citation rates vary across disciplines, and are not directly comparable. This can be illustrated by comparing the aggregate JIF or Citescore for different subject categories side by side.

Aggregate Citescore by subject, May 2017

JIF 2016 by subject

Research Type: The type of research may have an impact upon the potential citation rate it may attract.

  • Applied research has been observed to attract fewer citations on average than "basic" or "pure" research in some fields.
  • One example study (Fawcett, 2012) indicates that biologists tend to cite heavily mathematical biology papers less frequently than more qualitative papers, irrespective of their comparative ‘impact’ on subsequent research.

Read further: Yegros-Yegros, A. and Rafols, I. PLoS One (2015) 10(8): e0135095. doi:10.1371/journal.pone.0135095 and Fawcett, T and Higginson, A. D. PNAS (2012) 109(29). 11735-11739. doi:10.1073/pnas.1205259109.

Gender inequality: There is a growing body of evidence suggesting various aspects of gender bias and inequality within the academic community, and this extends to publication and citation rates. Women, on average, publish less than men, are less likely to be listed as either the first or last author, and are less likely to be involved in international collaborations (which tend to attract a greater number of citations). A global cross-disciplinary study also suggested that papers with female authors in dominant author positions receive, on average, fewer citations than those with male authors in the same positions.

Read further: West, J. D. et al PLoS One (2013) 8(7): e66212. doi:10.1371/journal.pone.0066212, Symonds, M. R. E. et al PLoS One (2006) 1(1): e127. doi:10.1371/journal.pone.0000127 and Sugimoto, C. et al. Nature (2013) 504. 211-213. doi:10.1038/504211a.

Research Career Stage: The career stage of an author can impact both on citation rate, and the appropriateness of some metrics.

  • There is an element of "prestige" of an author and the so-called Matthew effect whereby the more citations an author has, the more likely they are to accrue further citations in part due to their prominence and prestige within their field of research.
  • Some metrics, such as the h-index, are measures of both impact and productivity, and may not fully reflect the impact of an author with only a few publications to their name.

Further reading: See Bornmann, L. et al Journal of Informetrics (2012) 6(1). 11-18. doi:10.1016/j.joi.2011.08.004, Collet, F. et al Strategic Organization (2014) 12(3). 157-179. doi:10.1177/1476127014530124 or for an alternative conclusion, Wang, J Journal of Informetrics (2014) 8(2). 329-339. doi:10.1016/j.joi.2014.01.006.

Time since publication: Citations are accrued over time, and thus the date at which a metric is calculated, and the date range from which citation data is collected, will affect the outcome.

  • Citations may take longer to appear in some disciplines than they might in others.
  • It can be difficult to assess the citation impact of more recent publications which have not had time to be disseminated and assimilated into the research conversation.
  • A research area may rise and fall in popularity over time, meaning that citation accrual may not always reflect a current subject norm.

Source of citation data: Scopus, Web of Science and Google Scholar provide different coverage in terms of the publications indexed, the types of publication indexed and the date coverage of those publications. This will affect any metrics calculated from these datasets: your h-index calculated using data from Scopus is likely to differ from your h-index calculated using citation data from Google Scholar.

Size of dataset: Most outputs do not attract large numbers of citations; a few attract many citations and thus "inflate" the average of the dataset as a whole. As no source of citation data is complete, all citation-based metrics are calculated from a sample of the complete data. The smaller the 'sample' dataset, the greater the impact extreme outliers are likely to have on any metric which uses the arithmetic mean of the dataset.

If a metric is liable to be affected in this way, the confidence interval for its value will tend to widen as the dataset gets smaller, to the point where a direct comparison may suggest the opposite of what is actually true.

For example, is 15 (X) definitely greater than 14 (Y)?

If you can only be 95% confident that each value is correct within a range of +/- 8% (e.g. you are 95% confident that X has a value within the limits of 13.8 and 16.2, and Y has a value within the limits of 12.88 and 15.12), does that change how the metric should (or could) be used? In this example, within the stated confidence interval it could be true that X was actually 13.9 and Y was 15.1.
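
The comparison above can be checked with a few lines of Python. This is only a minimal sketch: the function names are hypothetical, and it assumes the uncertainty is expressed as a simple symmetric percentage of each value, as in the example.

```python
def interval(value, relative_margin):
    """Return the (lower, upper) bounds for value +/- relative_margin."""
    margin = value * relative_margin
    return (value - margin, value + margin)


def intervals_overlap(a, b):
    """True if two (lower, upper) intervals share at least one value."""
    return a[0] <= b[1] and b[0] <= a[1]


# Values from the example: X = 15 and Y = 14, each within +/- 8%.
x_range = interval(15, 0.08)
y_range = interval(14, 0.08)

# If the ranges overlap, the data cannot rule out Y being greater than X.
print(x_range, y_range, intervals_overlap(x_range, y_range))
```

When the two ranges overlap, as they do here, the ranking of X above Y should be treated as uncertain rather than established.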

Why are publications cited?

"The popular view that citation rate is a measure of scientific quality is not supported by the bibliometric expert community. Bibliometricians generally see the citation rate as a proxy measure of scientific impact or of impact on the relevant scientific communities. This is one of the dimensions of scientific or scholarly quality."

'The Metric Tide: Literature Review' (July 2015)

If the rate of citation is to be seen as an indicator or proxy for impact or quality, then this assumes that a citation is awarded in recognition of the contribution that research has made. In most cases, a citation is made to reference an idea, a conclusion, a methodology or observed data which underpins, contributed to or is required to understand the research being reported in the citing publication.

But a publication may have been cited for a variety of reasons:

  • Citation to highlight, correct or refer to research found to be, or later considered to be, of poor or flawed methodology, conclusion or application of research theory.
  • Citation to satisfy an expected high profile peer reviewer.
  • Citation to "assist" a colleague (or the author) by highlighting their previous research and 'boosting' their citation rates.
  • Citation to provide an impression of a wider community/audience interested in the topic.
  • Citation as part of any informal (and unethical) citation cartel activity.

It should be recognised that these 'negative' or 'false' citations may then be included in the calculation of various metrics as 'positive' indicators of a publication's 'impact' or 'value'.

H-index and M-index

Example: An author has published 22 publications. Of these publications, at least 8 have received at least 8 citations each. The author does not have 9 publications which have received at least 9 citations. Therefore, that author has an h-index of 8. The author has been actively publishing for 4 years since their first article was published. Therefore, the author has an m-index of 2 (the h-index of 8 divided by 4 years).
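
As a rough illustration of how these two values are derived, the sketch below computes an h-index from a list of citation counts and then divides it by the number of years since first publication to give the m-index. The citation counts are invented to match the example (22 publications, 8 of them with at least 8 citations), and the function names are hypothetical.

```python
def h_index(citations):
    """Largest h such that h publications have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def m_index(citations, years_since_first_publication):
    """h-index divided by the number of years since first publication."""
    return h_index(citations) / years_since_first_publication


# Hypothetical citation counts for the 22 publications in the example.
example_citations = [40, 25, 18, 12, 10, 9, 8, 8,
                     7, 6, 5, 5, 4, 3, 3, 2, 2, 1, 1, 0, 0, 0]

print(h_index(example_citations))     # h-index
print(m_index(example_citations, 4))  # m-index after 4 years
```

With these counts the h-index is 8 and the m-index is 8 / 4 = 2.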

For a detailed definition of these metrics, see our Overview of metrics pages.

For a useful review and critique of the h-index, see Barnes, C (2017) 'The h-index Debate: An Introduction for Librarians' The Journal of Academic Librarianship, 43(6) pp487-494, doi:10.1016/j.acalib.2017.08.013.

Limitations of the h-index

  • The h-index does not take into account differences in rates of publication and citation across disciplines, or fields of study within the same discipline, and so is of limited use as a comparative metric for authors across different research areas.
  • The importance of an author's highly cited papers may not be reflected by an author's h-index. In the example above, one of the author's papers may have been cited over 1000 times, but if they only had 7 other publications with at least 8 citations, then this is not reflected in their h-index.
  • An author's h-index can never be higher than the number of papers they have published. Therefore, the h-index is less favourable to early career researchers with fewer publications to their name.
  • An author with a "high" h-index for their subject discipline will not see any negative impact on their h-index should they stop publishing, or fail to be cited, for several years. An h-index will not therefore accurately reflect the impact of any decline in productivity or relevance of research.
  • The h-index does not account for those who work part-time or may take a career break. This may affect those who work outside of academia and so publish less frequently for a period, and is also more likely to disadvantage women or those with health difficulties or caring responsibilities.
  • In disciplines where papers may have very high numbers of authors ("kilo-papers"), the h-index does not take into account the level of the contribution of an author to that paper.
  • An author's h-index is dependent upon the data source used, so will vary between Web of Science, Scopus and other sources.

Limitations of the m-index

  • The m-index is often not a stable metric for early career researchers. For early career researchers with a lower h-index, a small change in their h-index may produce a larger change in the m-index than would be seen for a mid-late career researcher.
  • As with the h-index, in disciplines where papers may have very high numbers of authors ("kilo-papers"), the m-index does not take into account the level of the contribution of an author to that paper.
  • The m-index does not correct the h-index's propensity to discriminate against those who may work part-time or have career interruptions (often women), and may increase the effect of such bias.
  • Also as with the h-index, highly cited papers may not be reflected, no account is taken of differences in rates of publication and citation across disciplines, or fields of study within the same discipline, and it is dependent upon the data source used, so will vary between Web of Science, Scopus and other sources.
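
The instability of the m-index for early career researchers can be seen with some simple arithmetic. The sketch below uses invented figures for two hypothetical researchers and assumes the m-index is the h-index divided by years since first publication; in each case the h-index rises by exactly one.

```python
def m_index(h, years_since_first_publication):
    """h-index divided by the number of years since first publication."""
    return h / years_since_first_publication


# Hypothetical early career researcher: 2 years active, h-index rises from 2 to 3.
early_change = m_index(3, 2) - m_index(2, 2)

# Hypothetical senior researcher: 20 years active, h-index rises from 30 to 31.
senior_change = m_index(31, 20) - m_index(30, 20)

print(round(early_change, 2), round(senior_change, 2))
```

The same one-point rise in the h-index moves the early career researcher's m-index ten times as far as the senior researcher's.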

Journal level metrics

"The Journal Impact Factor ... was originally created as a tool to help librarians identify journals to purchase, not as a measure of the scientific quality of research in an article. With that in mind, it is critical to understand that the Journal Impact Factor has a number of well-documented deficiencies as a tool for research assessment."
The San Francisco Declaration on Research Assessment (2012)

  • Citation distribution
  • Subject Classification
  • Potential for gaming
  • Numerators & Denominators
  • Negative citations
  • Encouragement of negative behaviours

Discussion and use of Journal Metrics

There is some discussion across the academic community around the when, where and how of using journal level metrics as a basis for any evaluation of the research output of an individual author or group of authors.

However, it remains the case that in many situations, a value is placed upon where an author has published, not just what they have published, and this may impact upon your career as a researcher.

  • Recruitment panels may officially, or unofficially, use the perceived 'ranking' of a journal to support long- or short-listing of candidates.
  • Some national governments, including China, Turkey and South Korea, have in the past offered financial incentives to publish in 'high impact' journals.
  • Some University rankings use journal level metrics to a certain extent, such as the Shanghai Academic Ranking of World Universities.
  • Whilst officially journal level metrics such as the Journal Impact Factor (JIF) were not used as a means of assessment in the UK's Research Excellence Framework (REF), in some discipline areas there was a correlation between the REF2014 results and the JIF of the journals from which articles were submitted. This may influence where individuals and institutions aim to publish their research in the future.

Read further: Casadevall, A & Fang, F. C. mBio (2014) 5(2). doi:10.1128/mBio.00064-14, Franzoni, C. et al Science (2011) 333(6043). doi:10.1126/science.1197286, [Editorial] Nature (2016) 535(466). doi:10.1038/535466a.

Distribution of citations

“The vast majority of the journal’s papers — fully 85% — have fewer citations than the average. The impact factor is a statistically indefensible indicator of journal performance.”

Professor Stephen Curry, Imperial College London (2012)

The Journal Impact Factor (JIF), Citescore and other metrics present a measure of the 'average citations per article' a journal received over a set period. However, the distribution of citations to articles is often highly skewed, with some very highly cited articles and many articles which may not have received any citations at all. Many argue, therefore, that such journal level metrics are a poor indicator of the quality of an individual article.

One counter argument to this is that journals with a high Citescore or JIF are often better able to be selective about which articles they publish.
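
To illustrate the skew described above, the following sketch uses a small, entirely invented set of citation counts for the articles in a hypothetical journal. It compares the mean (the kind of average behind 'citations per article' metrics) with the median, and counts how many articles fall below the mean.

```python
from statistics import mean, median

# Invented citation counts for 20 articles in a hypothetical journal:
# most are cited a handful of times, two are cited very heavily.
citations = [0, 0, 0, 0, 1, 1, 1, 2, 2, 2,
             3, 3, 4, 4, 5, 5, 6, 8, 60, 150]

average = mean(citations)
middle = median(citations)
share_below_average = sum(c < average for c in citations) / len(citations)

print(average, middle, share_below_average)
```

In this invented dataset the two heavily cited articles pull the mean to 12.85, while the median is 2.5 and 90% of the articles have fewer citations than the 'average' article.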

Read further: Lariviere, V et al. bioRxiv (2016) doi:10.1101/062109, Colquhoun, D Nature (2003) 423, De Marchi, M & Rocchi, M Scientometrics (2001) 51(2) doi:10.1023/A:1012705818635, Blanford, J Journal of Materials Science (2016) 51(23) doi:10.1007/s10853-016-0285-x

Limitations of subject classifications

There are notable differences in publication and citation culture between disciplines, in terms of how frequently authors publish, in what format they publish, and what and how frequently they cite previous literature.

If this is accepted, it follows that an author publishing predominantly in history journals cannot meaningfully be compared with an author publishing predominantly in economics journals purely on the basis of the JIF or Citescore of the journals in which they have published; such a comparison does not take these subject differences into account.

An additional difficulty can be in comparing journals within a single subject classification, which may not reflect the specialisation of a particular journal or research area. For example:

  • The top-ranked journal (by 2016 JIF) in the JCR category "History" is the Economic History Review. The Journal of the History of Sexuality is ranked 50th within the same category, but it would be inappropriate to compare research published in these journals based on their JIF.
  • The top-ranked journal (by Citescore as at May 2017) in the ASJC category "History" is Social Studies of Science. The Journal of African Archaeology is ranked 48th within the same category, but it would be inappropriate to compare research published in these journals based on their Citescore ranking.

Potential for Gaming

There is a recognised potential for ‘influencing’ the impact factor of a journal, which may be in the interests of editors, publishers or authors with a stake in that journal. Not all such activities are necessarily unethical, but they do have an impact on the citation rate of articles in the journal.

For example:

  • Encouraging the citation of particular articles, or articles in particular journals, will influence journal metrics calculated for that journal.
  • Selection (or rejection) of particular types of study which generally attract fewer citations, but are still valuable to the scientific community.
    • For example, confirmatory or negative studies, irrespective of the quality of the research.
  • Editorial or comment pieces to direct attention to particular articles.
  • Suggesting, requesting or requiring the inclusion of citation to particular articles or studies previously published within the same journal (or linked journals).

Whilst not common, each year the new JCR edition is usually accompanied by a short list of titles which have been “suppressed” from that year’s edition. Whilst the JCR does not assume any motive on behalf of any party, suppression is sometimes due to concerns about attempts to influence citation culture or journal metrics. Titles are identified by evaluating the most recent year’s citation data for extreme outliers in citation behaviour. In the past these have included pairs of ‘donor’ and ‘recipient’ journals showing increased concentrations of citations between their published articles.

Read further: Journal Citation Reports: Title Suppressions, Van Noorden, R. Nature Blogs (2012), Falagas, M. E. and Alexiou, V. G. Archivum Immunologiae et Therapiae Experimentalis (2008) 56(223) doi: 10.1007/s00005-008-0024-3

Numerators and denominators

Differences between how metrics such as JIF and Citescore are calculated can be reflected in different rankings based upon the type of articles published in a journal and the types of article a particular metric may deem ‘citeable’.

One criticism of some journal level metrics is that some journal content (for example, letters, editorial or commentary material and other ‘front matter’) is not deemed ‘citeable’, and so these article types are not included in the ‘denominator’ of the metric calculation, but any citations those articles do attract may still be counted in the ‘total citations’ to the journal (the numerator). In some disciplines there has been a trend towards these types of material attracting more citations than they did previously.

This can lead to journals with a higher proportion of ‘front matter’ ranking higher on some journal level metrics (for example, the Journal Impact Factor) whilst performing comparatively less well on others (Citescore), which makes it difficult to offer a clear comparison across ranking systems.
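
The effect of the numerator and denominator choice can be sketched with two hypothetical journals. The code below is a deliberately simplified stand-in, not the published formula for any real metric: the first function counts all citations but divides only by ‘citeable’ items (the pattern criticised above), while the second counts every document type on both sides. The journals, figures and the set of ‘citeable’ types are all assumptions for illustration.

```python
# Document types treated as 'citeable' here (an assumption for illustration).
CITEABLE_TYPES = {"article", "review"}


def citations_over_citeable_items(journal):
    """All citations (numerator) divided by 'citeable' items only (denominator)."""
    total_citations = sum(d["citations"] for d in journal.values())
    citeable_items = sum(
        d["items"] for doc_type, d in journal.items() if doc_type in CITEABLE_TYPES
    )
    return total_citations / citeable_items


def citations_over_all_items(journal):
    """All citations divided by all items, whatever their document type."""
    total_citations = sum(d["citations"] for d in journal.values())
    all_items = sum(d["items"] for d in journal.values())
    return total_citations / all_items


# Hypothetical journal A publishes a lot of front matter; journal B publishes none.
journal_a = {
    "article": {"items": 100, "citations": 300},
    "editorial": {"items": 100, "citations": 100},
}
journal_b = {"article": {"items": 100, "citations": 350}}

for name, journal in (("A", journal_a), ("B", journal_b)):
    print(name, citations_over_citeable_items(journal), citations_over_all_items(journal))
```

Under the first calculation journal A scores 4.0 against 3.5 for journal B; under the second, A drops to 2.0 while B stays at 3.5, so the two journals swap places depending on which approach is used.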

Read further: Bergstrom, C. T. and West, J., Gasparyan, A. Y. et al JKMS (2017) 32(2) doi:10.3346/jkms.2017.32.2.173, Liu, X-L, Gai, S-S and Zhou, J PLoS One (2016) 11(3) doi:10.1371/journal.pone.0151414

Negative citations

An article can attract a large number of citations due to disagreement with its findings, methodology and/or conclusions. This is all a valid and essential part of the scholarly discussion.

However, where these citations arise because of a significant flaw or failure in the research reported in the article, unless that article is retracted (which may not be justified), those citations will still contribute to the aggregated total of citations used in the calculation of most journal level metrics.

This means that a journal which published a significantly flawed, but not retracted, research paper may see a boost in its journal level metrics, in part because of the citations to that paper.

Encouragement of negative behaviours

One of the greatest concerns amongst the academic community about journal level metrics is not about the reliability of the metrics or the limitations of their use, but about incentivising some negative behaviours in authors and the communication of research.

Such behaviours might include:

  • Influence on choice of journal based upon perception of journal level metrics as mark of quality, rather than the appropriateness of the journal for reaching the target audience.
  • Potential for authors working in multi-disciplinary research to be under undue pressure to publish in journals based on a perception of ‘good’ journal level metrics which do not match their own specific field of study.
  • Pressures on editors and publishers to select articles for publication based on likely level of citation rather than quality or value of research.
  • The use of ‘fake’ Impact Factors by some predatory journals, and the levying of page and colour diagram charges by some ‘high impact’ journals.
  • Impacts upon recruitment, promotion, evaluation and selection of researchers where the ‘venue’ of publication might carry undue weight in comparison to the actual quality of their research.

Read further: See the San Francisco Declaration on Research Assessment or, if you like your research delivered with a certain level of ‘humour for neuroscientists’, Paulus, F PLoS One (2015) doi: 10.1371/journal.pone.0142537.

Article level metrics

When using any publication or citation metrics, using a range of metrics is always advisable in preference to reliance on a single metric. Below are some of the issues to consider when looking at some of the metrics which you may come across or wish to use.

Publication and Citation counts

  • Be aware of differences in publication and citation cultures across disciplines.
  • An Early Career Researcher will usually have fewer publications than a senior academic, face different barriers to getting their research published, and be less likely to attract citations to their authored articles than a well-known author.
  • See also all the influences on citation metrics as summarised in the Overview section.

Citation Impact and FWCI

As with any metric calculated from (or based on) the arithmetic mean of a set of data, outlying values can have a disproportionate effect in smaller datasets.

You are therefore advised to use such metrics with caution when assessing the performance of an individual author, or a small set of publications.
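
The sketch below gives a sense of scale for this effect. It uses invented numbers: every publication in a set has 5 citations except one, which has 500, and the function (a hypothetical helper) reports how far that single outlier shifts the mean for sets of different sizes.

```python
from statistics import mean


def mean_shift_from_outlier(typical, size, outlier):
    """How much the mean changes when one of `size` typical values is an outlier."""
    baseline = [typical] * size
    with_outlier = baseline[:-1] + [outlier]
    return mean(with_outlier) - mean(baseline)


# One publication with 500 citations among publications with 5 citations each.
small_set_shift = mean_shift_from_outlier(typical=5, size=10, outlier=500)
large_set_shift = mean_shift_from_outlier(typical=5, size=1000, outlier=500)

print(small_set_shift, large_set_shift)
```

In a set of 10 publications the single outlier raises the mean by 49.5 citations; in a set of 1,000 it raises the mean by about 0.5.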

Outputs in Top Journals

If there are already issues identified in the calculation of various journal level metrics, and in the ranking of journals based on these metrics, then these issues also affect any metric which uses those rankings to assess an author or group of publications. In addition, other considerations should be taken into account:

  • Researchers publishing in new subject areas may be served better by publishing in new or niche journals which do not always rank highly, if at all, in some journal rankings which require several years of publication and citation data to feature in the rankings.
  • In some subject areas, such journal metrics are of less or little importance, or do not reflect the range of subjects across a discipline effectively.
  • Where high impact journals charge publication costs, some authors without funder or institutional support may be priced out of publishing in those journals.

Your Academic Liaison Librarian

James Bisset

Senior Manager
Library Research Services

0191 334 1589

Metrics Top Tips

  1. Always use quantitative metrics together with qualitative inputs, such as expert opinion or peer review.
  2. Always use more than one quantitative metric to get the richest perspective. 
  3. If comparing entities, normalise the data to account for differences in subject area, year of publication and document type.
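
One simple way to picture the normalisation in tip 3 is to divide each publication's citation count by the average for publications of the same subject, year and document type. The sketch below does this for a tiny invented dataset. It is an illustration only: the function name and data are hypothetical, and real field-normalised indicators draw their reference averages from a full citation database rather than from the sample being assessed.

```python
from collections import defaultdict
from statistics import mean


def normalised_citation_scores(papers):
    """Citations divided by the mean for the same subject, year and document type."""
    by_group = defaultdict(list)
    for paper in papers:
        key = (paper["subject"], paper["year"], paper["doc_type"])
        by_group[key].append(paper["citations"])
    expected = {key: mean(counts) for key, counts in by_group.items()}

    scores = {}
    for paper in papers:
        key = (paper["subject"], paper["year"], paper["doc_type"])
        baseline = expected[key]
        # A group with no citations at all has no meaningful baseline.
        scores[paper["id"]] = paper["citations"] / baseline if baseline else None
    return scores


# Invented data: history articles are cited less often than economics articles.
papers = [
    {"id": "H1", "subject": "history", "year": 2015, "doc_type": "article", "citations": 2},
    {"id": "H2", "subject": "history", "year": 2015, "doc_type": "article", "citations": 4},
    {"id": "H3", "subject": "history", "year": 2015, "doc_type": "article", "citations": 6},
    {"id": "E1", "subject": "economics", "year": 2015, "doc_type": "article", "citations": 10},
    {"id": "E2", "subject": "economics", "year": 2015, "doc_type": "article", "citations": 20},
    {"id": "E3", "subject": "economics", "year": 2015, "doc_type": "article", "citations": 30},
]

scores = normalised_citation_scores(papers)
print(scores["H3"], scores["E1"])
```

Here the history article with 6 citations scores 1.5 (above its group average), while the economics article with 10 citations scores 0.5 (below its group average), even though its raw count is higher.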

See our pages on Responsible Metrics for further information.