Dear colleagues,

In another context, some of us had an exchange about the special report
"Changing US Output of Scientific Articles: 1988-2003" which the NSF issued
in 2007. (The data are rather similar to those used in the Indicators
series of the NSF.) Since the problems are of a more general nature and the
public interest is involved, I decided to post a lightly edited version of my
contribution to that debate on this list.

The issues relate nicely to my recent paper in JASIST about the Caveats of
using scientometric indicators. While the paper in JASIST focuses on the use
of citations in evaluations, this contribution mentions some of the problems
when counting publications. Some of these discussions have been around for
more than twenty years without having been resolved. 

Best wishes,  Loet


Dear colleague,

Thank you for your extensive reaction to our correspondence. I find the work
of the US-NSF in this area very useful and frequently use the Indicators
series. However, I am aware of the flaws in the database used, while the
general public is not. You may wish to reflect on how these data are used,
which sometimes differs from the intention of your agency to provide neutral

For example, in your report on "Changing Output" you and your coauthors
state on p. 3 "What prompted this study?" and provide an alarming curve.
This curve is partly an artifact of the major development in the science
system during the 1990s, that is, international coauthorship relations.
Under fractional counting, this affects the US more strongly than the EU because
most international coauthorship relations within the EU are among EU member
states. The impossibility of distinguishing between a "real" decline and this
effect when using fractional counting has been discussed in the literature.
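
To make the mechanism concrete, here is a minimal sketch in Python. It is only an illustration: the country codes, the mapping to regions, and the two small sets of papers are hypothetical and are not taken from the NSF or ISI data. It assumes that each paper is split equally among its addresses. In both periods the US and the EU take part in two papers each, but in the second period the US papers have a partner abroad while the EU papers have a partner in another member state.

```python
from collections import defaultdict

# Hypothetical mapping from country codes to the units being compared.
REGION = {"US": "US", "DE": "EU", "FR": "EU", "JP": "Other"}


def count_papers(papers):
    """Return (whole, fractional) paper counts per region.

    Each paper is given as a list of country codes, one per address.
    Whole counting credits a region with 1 for every paper it takes part in.
    Fractional counting splits each paper equally among its addresses.
    """
    whole = defaultdict(int)
    fractional = defaultdict(float)
    for countries in papers:
        regions = [REGION[c] for c in countries]
        for region in set(regions):
            whole[region] += 1
        for region in regions:
            fractional[region] += 1.0 / len(regions)
    return dict(whole), dict(fractional)


# Hypothetical early period: no international coauthorship at all.
early = [["US"], ["US"], ["DE"], ["FR"]]
# Hypothetical later period: every paper is internationally coauthored,
# but the EU partners are other EU member states.
late = [["US", "JP"], ["US", "JP"], ["DE", "FR"], ["FR", "DE"]]

for label, papers in (("early", early), ("late", late)):
    whole, fractional = count_papers(papers)
    print(label, "whole:", whole, "fractional:", fractional)
```

In this toy case the whole counts for the US and the EU are the same in both periods, while the fractional count of the US is halved and that of the EU is unchanged. The apparent decline is produced by the counting rule, not by a change in the number of papers with US participation.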

You make reference to some of that literature on p. 2 when you state:
"Nonetheless, these more targeted studies and others like them, which use
other, more theoretically driven analytical models and less-comprehensive
databases, suggest significant avenues for further research." Some of the
authors cited indeed use less-comprehensive databases, but others use the
full databases. Other authors who have used the full database have discussed
the limitations of the fixed set, but are never cited in NSF studies.
You may take my own studies since the late 1980s as a case in point, but let
me also point to the Hungarian center.

A major difference concerns the issue of fixed versus changing journal sets.
While everybody can check the changing (ISI) journal set, the fixed
journal set used by your organization is difficult to access or reconstruct.
More fundamentally, I have elaborated on the analytical point about fixing a
set in my paper "Dynamic and Evolutionary Updates of Classificatory Schemes
in Scientific Journal Structures," [Journal of the American Society for
Information Science and Technology (JASIST), 53(12) (2002) 987-994]. I
provide arguments for why the journal set should be fixed ex post and not ex
ante: in a dynamic set one updates and fixes with hindsight, that is, based
on the current understanding of categories. This point has been completely
ignored by your organization.

The most amazing thing for me is that you and your colleagues provide data
in this report only up to 2003, although it was issued in 2007. At the same time, I
presented a paper at the ISSI conference in Madrid (June 2007) entitled "Is
the United States losing ground in science? A global perspective on the
world science system," coauthored with Caroline Wagner (SRI; cc) and
accepted for publication by Scientometrics in April 2007 [available from my
website]. We provided data up to and including 2006, as everyone in this specialty
nowadays has data available for 2007. The JCR of the ISI appears in June of
the next year, and I can imagine that your organization also needs half a
year for the data processing. However, your data run several years behind,
which makes them of limited use as a source of reliable statistics for
policy making. (Nevertheless, policy makers may find them useful if they fit
their purposes.)

Had you studied the more recent statistics, you would have noticed
that the EU and the US exhibit similar development trends during the last
decade. (It sometimes seems almost as if the EU and the US are increasingly
coupled systems.) Let me attach/insert the latest data:

The linear growth of South Korea may be levelling off (as expected), but the
exponential growth of the Chinese contribution still continues. The US
contribution is growing more slowly than that of the EU, but in the above-mentioned
article you will find the arguments about weaknesses in the EU (which are more
worrisome than those of the US, in my opinion). The data for this graph were
collected on January 21, 2008, in collaboration with my coauthor Ping Zhou (cc;
only articles + reviews + notes + letters, ISI-dataset).

In summary, my suggestions for improvements in your series of studies would
be (since you asked for this):
1. try to extend the data used by the NSF with the latest available; make
the statistics current;
2. be aware of the effects of international collaboration on fractional
counting. (I expect major developments using fractional counting because of
the changes in patterns of international collaboration in recent years.
Caroline Wagner and I are working on a study about this.)
3. if you wish to use fixed journal sets, fix them ex post (see my paper
about this; we do this now regularly in detailed studies);
4. also provide references to publications that were critical of
previous reports of your agency.
With best wishes, 

