[Eurchap] Using Search Engines and Web Crawlers for Web Research

Thelwall, Mike (Dr) M.Thelwall at wlv.ac.uk
Thu Jun 17 11:23:58 EDT 2004


Accepted abstract from the forthcoming ASIST-AoIR workshop
http://www.asis.org/Chapters/europe/announcements/AoIR.htm

Using Search Engines and Web Crawlers for Web Research

Mike Thelwall
School of Computing and Information Technology, University of Wolverhampton,
Wulfruna Street, Wolverhampton WV1 1SB, UK. 

The importance of the Web for publishing and seeking information is a
powerful argument for social scientists to develop sound methods to analyse
it. This abstract suggests some lessons that web researchers can learn from
link analysis in university web spaces.
Experts in citation analysis have conjectured that hyperlinks may be usable
in an analogous way to citations: counts of links to web pages, web sites or
countries could be used to measure their online impact or to track the flow
of scholarly communication. After many studies of this issue, hyperlinking
between university web sites has been shown to be very different to
citations in its patterns of use. The findings, summarised below, point to
the care needed to interpret any hyperlink data.
	Why are inter-university links created? Relatively few, probably
fewer than 1%, are directly equivalent to citations in the sense of
connecting two scholarly pieces of work comparable to journal articles or
conference papers. Many links are purely symbolic: they do not refer to
information but acknowledge a relationship between the source and target
university, such as joint membership of a research project. Nevertheless,
about 90% of inter-university hyperlinks indicate some kind of scholarly or
educational connection, with the rest relating to recreational,
administrative or support activities.
	Are inter-university link counts useful? Counts of links to a
university can be used to estimate its research productivity or, if this is
already known, to assess whether its web presence is at the level expected
for that productivity.
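	As a rough illustration of the second use, the sketch below fits a
straight line to link counts against a research productivity measure and
reports how far each university sits above or below the fitted line. The
linear model, the function names and the figures are all assumptions made
for illustration; the abstract does not specify how the expected level
should be modelled.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys on xs; returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("productivity values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def web_presence_gaps(productivity, inlinks):
    """Observed minus expected link count for each university.

    A positive gap means more inlinks than the fitted line predicts.
    Assumes a linear relationship, which is an illustrative choice only.
    """
    slope, intercept = fit_line(productivity, inlinks)
    return [y - (intercept + slope * x) for x, y in zip(productivity, inlinks)]


# Hypothetical figures: a productivity score and an inlink count
# for three universities.
gaps = web_presence_gaps([1.0, 2.0, 3.0], [2.0, 4.0, 9.0])
```

A gap is only as meaningful as the fitted model, so any large positive or
negative value would still need to be checked against the underlying links.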
	For those analysing hyperlinks outside academic sites, the most
important lesson from this research is the need to assess the meaning and
value of link counts through (a) investigations of the purpose of random
samples of individual links, and (b) correlating link counts with related
measures that are hypothesised to be similar. Both approaches are needed:
researchers should not assume that the obvious reasons for link creation
are the ones that apply in practice.
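	The two checks, (a) and (b), could be sketched as follows. This is
a minimal illustration, not the procedure used in the studies summarised
here: the function names, the fixed seed, the choice of Spearman rank
correlation and the sample figures are all assumptions.

```python
import random


def sample_links(links, k, seed=0):
    """(a) Draw up to k links uniformly at random, without replacement,
    so that their purpose can be classified by hand."""
    rng = random.Random(seed)  # fixed seed so the sample can be reproduced
    return rng.sample(links, min(k, len(links)))


def average_ranks(values):
    """Rank values from 1 upwards; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """(b) Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical data: inlink counts and a related measure for five sites.
link_counts = [120, 45, 300, 80, 15]
related_measure = [3.1, 1.2, 2.9, 2.0, 0.5]
rho = spearman(link_counts, related_measure)

# Hypothetical link records (source, target) to be classified manually.
links = [("site%d" % i, "site%d" % (i + 1)) for i in range(50)]
to_classify = sample_links(links, 10)
```

A rank correlation is used here because link counts are typically highly
skewed; a high value alone does not show why the links were created, which
is why the manual classification in (a) is still needed.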




