Understanding Core Data - Baselines & Percentiles

Essential Science Indicators Data Information


Baselines are measures of cumulative citation frequency across large groups of papers that provide expected citation rates for groups or tiers of papers. Since citation frequency is highly skewed, with many infrequently cited papers and relatively few highly cited papers, average citation rates should not be interpreted as representing the central tendency of the distribution, but rather as guidelines or benchmarks. Similarly, percentiles, or other fixed percentage cuts, indicate the citation rates for specific top segments of the citation distribution.
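To illustrate why an average is a benchmark rather than a measure of the typical paper, the following minimal sketch compares the mean and the median of a skewed set of citation counts. The counts are made up for illustration and are not Essential Science Indicators data.

```python
from statistics import mean, median

# Hypothetical citation counts for ten papers (illustrative only, not ESI data).
citations = [0, 0, 1, 1, 2, 3, 4, 6, 12, 171]

average = mean(citations)    # pulled upward by the single highly cited paper
middle = median(citations)   # citation count of the "typical" paper

# Count how many papers fall below the average citation rate.
below_average = sum(1 for c in citations if c < average)

print(f"average={average}, median={middle}, papers below average={below_average}")
```

In this invented sample, nine of the ten papers sit below the average, which is why the average is better read as a guideline than as a central value.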


The term "percentile" denotes a citation threshold at or above which a fixed fraction of papers fall. Although the term usually refers to a 1 percent cut, it is used here to denote any fixed fraction of top papers ordered by citation count. The levels we have selected for listing by field and year are 0.01%, 0.1%, 1.0%, and 10%.


Types of items counted

Papers are defined as regular scientific articles, review articles, proceedings papers, and research notes. Letters to the editor, correction notices, and abstracts are not counted. Only Clarivate Analytics-indexed journal articles or papers are counted.

Journals included

Essential Science Indicators℠ counts are based on a Clarivate Analytics journal set (see complete journal list for Essential Science Indicators) categorized into 22 broad fields. Fields are defined by a unique grouping of journals, with no journal being assigned to more than one field. The Multidisciplinary field contains journals such as Science and Nature, whose articles would be assigned to specific fields in an article-level classification. This should be taken into account when analyzing the field ranking of an individual scientist, institution, or country.

Time period for counts

The count period for baselines is 10 years, plus partial-year counts for the current year (data is updated six times a year). For the all-years counts, any paper in the 10+ year period can be cited by any item in that same period. For individual-year counts, citations are cumulated from the beginning year to the end of the 10+ year period. Clarivate Analytics database years, that is, the years in which items entered the Clarivate Analytics database, are used to define the time periods.

Average Citation Rates

Average citation rates are calculated for each year of the 10-year period, based on a cumulation of citations from the year of publication to the current year. (Averages are calculated by adding up the citation counts of individual papers and dividing by the number of papers.) An average for the full 10-year period is also given in "All Years." Rates are given for individual fields or all fields combined.
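The calculation described above (sum of citation counts divided by number of papers, per year and for all years) can be sketched as follows. The record layout and the function name average_citation_rates are assumptions made for this example, and the numbers are invented.

```python
from collections import defaultdict

# Hypothetical (publication_year, citation_count) records for one field.
papers = [(2001, 12), (2001, 0), (2001, 3), (2002, 7), (2002, 1)]

def average_citation_rates(records):
    """Return average citations per paper for each year and for all years."""
    if not records:
        raise ValueError("at least one paper is required")
    totals = defaultdict(int)   # total citations per publication year
    counts = defaultdict(int)   # number of papers per publication year
    for year, cites in records:
        totals[year] += cites
        counts[year] += 1
    rates = {year: totals[year] / counts[year] for year in sorted(totals)}
    # The all-years average pools every paper; it is not the mean of yearly means.
    rates["All Years"] = sum(totals.values()) / sum(counts.values())
    return rates

rates = average_citation_rates(papers)
print(rates)
```

Note that the "All Years" value pools all papers before dividing, so years with more papers carry more weight.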

An average of 10.3 for physics in 1991 means that, on average, papers in physics journals were cited 10.3 times from 1991 to the present. Ten-year averages for each field (or all fields) can be used as baselines for the citations per paper values given in the scientist, institution, country, and journal rankings, provided that the entity published papers over the same 10-year period.

Field averages for individual years can be used to compare the performance of individual papers published in the given year, whether those papers are among the highly cited papers listed in Essential Science Indicators℠ or papers from the Web of Science®.
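One simple way to make such a comparison is to divide a paper's citation count by the field average for its publication year. The ratio below is an illustrative assumption, not a measure defined in this document. The baseline of 10.3 comes from the physics 1991 example above, while the paper's citation count and the function name relative_to_baseline are hypothetical.

```python
def relative_to_baseline(citations, baseline):
    """Ratio of a citation count to the field/year baseline (assumed measure)."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return citations / baseline

physics_1991_baseline = 10.3   # field average quoted in the text
paper_citations = 31           # hypothetical 1991 physics paper

ratio = relative_to_baseline(paper_citations, physics_1991_baseline)
print(f"The paper is cited {ratio:.2f} times the field average")
```

A ratio above 1 indicates a paper cited more often than the field average for its year. Because the distribution is skewed, most papers can be expected to fall below 1.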


Percentiles

The distribution of citations over papers is highly skewed, approximating a power law distribution, with relatively many infrequently cited papers and few highly cited papers. One method for making a selection is to rank papers in descending order by citation frequency and select the top fraction of papers. The percentile table shows the citation count threshold for four different percentile cuts for each field and year, as well as for all fields.

For example, a threshold of 44 citations for 1993 papers in astrophysics will select about 1 percent of the 1993 papers in the astrophysics journal set.
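The ranking method described above can be sketched as follows: sort the citation counts in descending order, take the top fraction, and report the citation count of the last paper inside that fraction as the threshold. The function name citation_threshold, the rounding rule (rounding the number of top papers up), and the sample data are all assumptions made for this example. The actual Essential Science Indicators procedure may differ.

```python
import math

def citation_threshold(citation_counts, fraction):
    """Citation count of the last paper inside the top `fraction` of papers."""
    if not citation_counts:
        raise ValueError("at least one paper is required")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    ranked = sorted(citation_counts, reverse=True)
    # Assumption: round the number of top papers up, with a minimum of one.
    n_top = max(1, math.ceil(len(ranked) * fraction))
    return ranked[n_top - 1]

# Hypothetical field/year with 200 papers and a skewed distribution.
counts = [0] * 120 + [1] * 50 + [5] * 20 + [20] * 8 + [44, 60]

threshold_1pct = citation_threshold(counts, 0.01)
selected_1pct = [c for c in counts if c >= threshold_1pct]

threshold_10pct = citation_threshold(counts, 0.10)
selected_10pct = [c for c in counts if c >= threshold_10pct]

print(threshold_1pct, len(selected_1pct))    # threshold and papers at or above it
print(threshold_10pct, len(selected_10pct))
```

In this invented sample, the 10% threshold selects 30 of the 200 papers rather than 20, because many papers are tied at the threshold value. Ties of this kind are one reason a threshold selects only about the stated fraction of papers.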

