The funder identification problem

David Wojick dwojick at CRAIGELLACHIE.US
Thu May 29 16:20:45 EDT 2014

Dear all,
Here is an excerpt from the May 8 issue of Inside Public Access that gives 
the flavor of the funder identification problem. It is a very interesting 

"The complexity of Federal Funder identification

The core challenge in the US Public Access program is to precisely identify 
the funders of the research that leads to a given journal article. This 
sounds easy but it can be a difficult and complex process. The US 
Government is a vast and complex organization, with hundreds of different 
offices sponsoring research. Moreover, each office can be referred to in 
many different ways, creating a major name disambiguation problem in the 
funder data.

CHORUS and FundRef are attacking this funder identification problem using a 
standardized menu of funder names and DOIs. The basic idea is that the 
submitting author will pick out the standard names of all the offices that 
contributed to the research that underlies the submitted article. Again 
this sounds simple but it is not, because building a comprehensive taxonomy 
of all possible funders is far from simple.

To begin with they have elected to build this menu to identify all the 
funders in the world, not just the US Federal funders. As a result the menu 
of funders already has six thousand names and it will probably have many 
thousands more before it stabilizes. The size of the funder list alone thus 
creates a big discovery problem, because many funders have similar names.

Then there is the hierarchy problem, especially within the vast US 
Government complex. Funding offices occur at many different scales, which 
are arranged within one another in a tree-like organization chart. For 
example in the US Energy Department there may be five or more layers of 
funding offices. Saying which layer should be named in the funding data for 
a given article is not simple. Moreover if offices in different layers are 
named for different articles, then the resulting data will have to somehow 
be aggregated by layer in order to be useful. To make matters worse there 
are also cross-cutting programs that involve multiple offices. In short any 
taxonomy of US Federal funding offices is going to be a complex system, not 
a simple listing.

Given these complexities it may be better to have an editor name the 
funders based on the acknowledgements section of the article, rather than 
presenting the author with a complex taxonomy of possible funders. There 
seems to be some experimentation in this direction, but it is a 
labor-intensive solution. The question is also whether the resulting data 
would be accurate enough for agency purposes, given that acknowledgement 
has been a relatively informal process. There is also the question of when to 
collect this funder data, given the labor involved. Should it be upon 
submission or after acceptance?"
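
As a rough illustration of the discovery problem described in the excerpt, here is a minimal Python sketch of looking up a free-text funder name against a standardized menu. The menu entries, the function name candidate_funders, and the cutoff value are all assumptions made for this example; this is not how CHORUS or FundRef actually match names.

```python
from difflib import get_close_matches

# Hypothetical menu entries for illustration only; the real list has thousands.
FUNDER_MENU = [
    "Office of Science",
    "Office of Basic Energy Sciences",
    "Office of Fossil Energy",
    "National Science Foundation",
    "National Institutes of Health",
]


def candidate_funders(query, menu=FUNDER_MENU, limit=3, cutoff=0.5):
    """Return up to `limit` menu names that resemble the free-text query.

    Matching is case-insensitive and uses difflib's similarity ratio,
    best match first. The cutoff of 0.5 is an arbitrary assumption.
    """
    by_lowercase = {name.lower(): name for name in menu}
    hits = get_close_matches(
        query.lower(), list(by_lowercase), n=limit, cutoff=cutoff
    )
    return [by_lowercase[hit] for hit in hits]


if __name__ == "__main__":
    # Several "Office of ..." entries may come back for one query,
    # which is the similar-names problem in miniature.
    print(candidate_funders("office of energy science"))
```

With thousands of entries, a single query can return several plausible candidates, so the author or editor still has to choose among them.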

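The aggregation-by-layer point in the excerpt can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the office names, the PARENT table, and the functions roll_up and count_by_layer are hypothetical, and cross-cutting programs with more than one parent are deliberately not handled.

```python
from collections import Counter

# Hypothetical child -> parent links; None marks the top of the tree.
PARENT = {
    "Agency": None,
    "Office A": "Agency",
    "Program A1": "Office A",
    "Subprogram A1a": "Program A1",
    "Office B": "Agency",
}


def path_from_root(office):
    """List the offices from the top of the tree down to `office`."""
    path = []
    while office is not None:
        path.append(office)
        office = PARENT[office]
    return path[::-1]


def roll_up(office, layer):
    """Map an office to its ancestor at `layer` (0 = top of the tree).

    An office that already sits above the requested layer is returned as is.
    """
    path = path_from_root(office)
    return path[min(layer, len(path) - 1)]


def count_by_layer(article_funders, layer):
    """Count articles per office after rolling every funder up to `layer`."""
    counts = Counter()
    for funders in article_funders.values():
        # A set ensures each article counts once per rolled-up office.
        for office in {roll_up(name, layer) for name in funders}:
            counts[office] += 1
    return counts


if __name__ == "__main__":
    articles = {
        "article-1": ["Subprogram A1a"],
        "article-2": ["Office A", "Program A1"],
        "article-3": ["Office B"],
    }
    print(count_by_layer(articles, layer=1))
```

Here articles naming offices at different depths are all counted at the second layer, so article-1 and article-2 both land under "Office A". A cross-cutting program would need more than one parent, which turns the tree into a graph and makes the roll-up ambiguous.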

At 06:30 AM 5/29/2014, you wrote:
>Stevan, the well established fact that you do not like the US Public 
>Access program and CHORUS is somewhat beside the point. I am tracking what 
>is actually happening in the US, not what you wish would happen. There are 
>about 20 federal agencies preparing to implement Public Access, 
>representing perhaps $100 billion/year in funding (we really do not know 
>how much leads to journal articles). To my knowledge none of them is going 
>to do it your way.
>There are however some big bibliometric issues here. Linking articles to 
>funding should provide for new forms of bibliometric assessment of agency 
>and research program performance. At this point we do not even know which 
>research funding programs are leading to journal articles, much less their 
>impact. It is a whole new world to explore.
>But getting accurate article funding data is turning out to be difficult, 
>in part due to the incredible complexity of the Federal funding system. In 
>the CHORUS pilot they found a high incidence of cases where the FundRef 
>funder data did not match the article acknowledgement funder statements. 
>Solving this funder data problem is now a major effort, one I am tracking.
>In fact to me the bibliometric issues are far more interesting than the OA 
>issues. The bibliometric community should be more heavily involved in the 
>US Public Access program. The agency offices that are designing the 
>various agency programs know very little about bibliometrics, because they 
>have never dealt with journal articles before. They mostly process final 
>research reports. Thus they are not thinking about how the funder data 
>will be used for performance evaluation; rather their focus is on 
>providing access, getting the articles in and out the door, as it were.
>I think performance evaluation is going to be a very big deal, because of 
>the huge sums involved in Federal R&D.
>>>David Wojick, Ph.D.
>>>Inside Public Access

More information about the SIGMETRICS mailing list