Query on usage research and logfiles

Phil Davis pmd8 at CORNELL.EDU
Tue Dec 2 09:30:17 EST 2008

You'll need to remember that a) the interface will affect download counts; 
and b) different sites count downloads differently.

a). Even the same content found on different publishers' websites will 
show different patterns of usage [1].

b). Different sites often use different protocols for filtering out 
downloads from known indexing robots (e.g., Google), double-clicks, and 
the like. For example, we found that known robots were responsible for 
nearly half of the downloads for free online articles (compared to 
subscription access articles) within the first six months [2].

As a result, simply aggregating multiple data sources is not going to 
give you a consistent readership count.
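
To make point b) concrete, here is a minimal sketch of how two sites 
could report different counts from identical raw traffic. Everything in 
it is an assumption made for illustration: the log records, the 
"googlebot" marker, the 30-second double-click window, and the 
count_downloads helper are hypothetical and do not describe any 
particular publisher's or repository's actual protocol.

```python
from datetime import datetime, timedelta

# Hypothetical log records: (timestamp, client id, user agent, article id).
RAW_LOG = [
    ("2008-12-02 09:00:00", "10.0.0.1", "Mozilla/5.0", "art-1"),
    ("2008-12-02 09:00:05", "10.0.0.1", "Mozilla/5.0", "art-1"),    # double-click
    ("2008-12-02 09:10:00", "10.0.0.2", "Googlebot/2.1", "art-1"),  # indexing robot
    ("2008-12-02 09:20:00", "10.0.0.3", "Mozilla/5.0", "art-1"),
    ("2008-12-02 09:30:00", "10.0.0.1", "Mozilla/5.0", "art-1"),    # genuine repeat visit
]


def count_downloads(log, robot_markers=(), double_click_window=None):
    """Count downloads under one (assumed) filtering policy.

    robot_markers: substrings that identify a robot's user agent.
    double_click_window: repeat requests by the same client for the same
    article within this timedelta of the previous request are not counted.
    """
    last_seen = {}  # (client, article) -> time of the most recent request
    count = 0
    for stamp, client, agent, article in sorted(log):
        # Skip requests whose user agent matches a known robot marker.
        if any(marker.lower() in agent.lower() for marker in robot_markers):
            continue
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        key = (client, article)
        previous = last_seen.get(key)
        last_seen[key] = when  # updated even when the request is not counted
        if (double_click_window is not None and previous is not None
                and when - previous <= double_click_window):
            continue
        count += 1
    return count


# Site A applies no filtering; site B filters robots and double-clicks.
site_a = count_downloads(RAW_LOG)
site_b = count_downloads(
    RAW_LOG,
    robot_markers=("googlebot",),
    double_click_window=timedelta(seconds=30),
)
print(site_a, site_b)
```

The same five requests yield two different totals depending on the 
policy applied, so adding up figures from sites that filter differently 
mixes incompatible units. If the raw logs are available, applying one 
policy uniformly to all of them is one way to get comparable numbers.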

[1] Davis, P. M., & Price, J. S. (2006). eJournal interface can 
influence usage statistics: implications for libraries, publishers, and 
Project COUNTER. Journal of the American Society for Information Science 
and Technology, 57(9), 1243-1248. http://arxiv.org/abs/cs.IR/0602060

[2] Davis, P. M., Lewenstein, B. V., Simon, D. H., Booth, J. G., & 
Connolly, M. J. L. (2008). Open access publishing, article downloads and 
citations: randomised trial. BMJ, 337, 586-. 

--Phil Davis

Armbruster, Chris wrote:
> Some promising usage research based on logfiles of publishers, repositories etc. has been going on. 
> I have two queries in respect of comparative usage research:
> - If I want to compare usage across two or more repositories (or publishers’ sites), how important is the uniformity of logfile formats or production? If logfiles are configured differently, how difficult is it to achieve comparable data?
> - If I want to compare usage for a specific item across repositories and publishers’ sites (i.e. a pre-print versus the published version): What might be suitable procedures for doing so?
> And a more general query: 
> Could specific logfile configurations make it problematic to understand the source and nature of usage?
> Thanks.
> Chris Armbruster

Philip M. Davis
PhD Student
Department of Communication
301 Kennedy Hall
Cornell University, Ithaca, NY 14853
email: pmd8 at cornell.edu
phone: 607 255-2124
