Faceted classification, was Patent Classification as Indicators of Cognitive Structures

David E. Wojick dwojick at HUGHES.NET
Fri Mar 23 16:49:31 EDT 2007

So-called faceted taxonomies, etc., are indeed a reflection of the 
fact that there are multiple, important ways to classify the same 
information. I know of no formal mathematical definition of faceting, 
though there may be one. My theory is designed to delineate all 
possible structures, taxonomic and otherwise, in a body of 
information. Many of the classification systems in use today combine 
and confuse multiple, distinct structures. By "structure" I mean the 
application of a single well-defined relation, of which there are a 
great many.


It suggests to me that it might be a faceted classification system - 
or a system that is applied in a faceted manner.

For the purposes of a sort of thought experiment, we might consider 
MeSH to be faceted.  For an article on, say, leg fractures of adult 
human males we might have:
  fractures, bone
  femur OR fibula OR patella OR tibia
  femoral neck fractures, etc.
  maybe even "activity"/adverse effects or some such...

If you tried to cluster on, say, femur you would get a weak 
co-classification because there are indeed many articles on leg 
breaks, but also bone cancer and genetic anomalies, etc.
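
A minimal sketch of the point above, using made-up index records rather 
than real MeSH data: it takes every record carrying one descriptor and 
reports what share of them also carries each other descriptor. The 
function name, the records and the descriptor labels are all 
illustrative assumptions.

```python
from collections import Counter

def co_assignment_shares(records, descriptor):
    """Share of records with `descriptor` that also carry each other descriptor."""
    hits = [r for r in records if descriptor in r]
    if not hits:
        return {}
    counts = Counter(d for r in hits for d in r if d != descriptor)
    return {d: n / len(hits) for d, n in counts.items()}

# Hypothetical records; each set is the descriptors assigned to one article.
records = [
    {"Femur", "Fractures, Bone", "Adult", "Male"},
    {"Femur", "Bone Neoplasms"},
    {"Femur", "Genetic Diseases, Inborn"},
    {"Tibia", "Fractures, Bone", "Adult"},
]

shares = co_assignment_shares(records, "Femur")
for name, share in sorted(shares.items()):
    print(f"{name}: {share:.2f}")
```

In this toy set no other descriptor is shared by more than a third of 
the "Femur" records, which is the kind of weak co-classification the 
thought experiment anticipates.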

Maybe I should read the article :)

Christina K. Pikas, MLS
R.E. Gibson Library & Information Center
The Johns Hopkins University Applied Physics Laboratory
Voice  240.228.4812 (Washington), 443.778.4812 (Baltimore)
Fax 443.778.5353

From: ASIS&T Special Interest Group on Metrics 
[mailto:SIGMETRICS at LISTSERV.UTK.EDU] On Behalf Of David E. Wojick
Sent: Friday, March 23, 2007 11:38 AM
Subject: Re: [SIGMETRICS] Patent Classification as Indicators of 
Cognitive Structures

I have a general theory of the structure of information that may be 
relevant. I have not published it but there is a brief essay on it 

It suggests that in typical cases there will be many useful ways to 
systematically organize or classify a given body of information. If 
so then there is probably no single way that will be generally useful 
or representative. A working instance of this principle is the NASA 
taxonomy system, with 11 independent taxonomies. However, many of the 
underlying structures are not taxonomies, nor even tree-structures. 
Many are networks with convergence as well as tree-like divergence.
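
One way to make the tree/network distinction concrete, as a rough 
sketch only: treat a structure as a list of (parent, child) pairs under 
a single relation and look for nodes with more than one parent. Such 
nodes are where convergence occurs, so the structure is not a pure 
tree. The relation and the node names below are hypothetical and are 
not taken from the NASA taxonomies.

```python
from collections import defaultdict

def convergent_nodes(edges):
    """Return the nodes that have more than one parent under one relation."""
    parents = defaultdict(set)
    for parent, child in edges:
        parents[child].add(parent)
    return sorted(node for node, ps in parents.items() if len(ps) > 1)

# Hypothetical "is narrower than" relation: every child has one parent.
tree_edges = [("Bone", "Femur"), ("Bone", "Tibia"), ("Femur", "Femoral neck")]

# Adding a second parent for "Femur" introduces convergence.
network_edges = tree_edges + [("Leg", "Femur")]

print(convergent_nodes(tree_edges))     # []
print(convergent_nodes(network_edges))  # ['Femur']
```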

Interestingly, since some of these structures are based on the way 
the things the information is about are related to one another, some 
of the most important structures are unknown until science finds them.

David Wojick


<http://www.leydesdorff.net/wip06>Patent Classifications as 
Indicators of Cognitive Structures

Paper to be presented at the Annual Meeting of the

Society for the Social Studies of Science (4S),

Montreal, October 2007

<http://www.leydesdorff.net/wipo06/wipo06.pdf>pdf-version of 
the full paper

<http://www.leydesdorff.net/wipo06/paper/index.htm>html-version of 
the full paper

Using the 138,751 patents filed in 2006 under the Patent Cooperation 
Treaty, co-classification analysis is pursued on the basis of three- 
and four-digit codes in the International Patent Classification (IPC, 
8th edition). The initial hypothesis that classifications might be 
considered as the organizers of patents into classes, and that 
therefore co-classification patterns would be useful for mapping, is 
discarded in favor of using co-word analysis among titles of patents. 
The classifications hang weakly together, even at the four-digit 
level; at the country level, more specificity can be made visible. 
The co-classifications among the patents enable us to analyze and 
visualize the relations among technologies at different levels of 
aggregation. However, countries are not the appropriate units of 
analysis because patent portfolios are largely similar in many 
advanced countries in terms of the classes attributed.

The following files are input files for Pajek based on the cosines 
between the 4-digit classifications for each country separately and 
for the complete set ("World"):
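
For readers who want to see how such a file could be produced, here is 
a hedged sketch. It assumes each patent is reduced to its set of 
4-digit codes and that the cosine is taken between the binary 
occurrence vectors of two classes, i.e. co-occurrences divided by the 
square root of the product of the two class counts. Whether the paper 
normalizes in exactly this way is not stated here. The toy patents and 
the helper names class_cosines and to_pajek are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations

def class_cosines(patents):
    """Cosine between binary occurrence vectors of co-assigned classes."""
    counts = Counter()  # number of patents carrying each class
    co = Counter()      # number of patents carrying both classes of a pair
    for codes in patents:
        codes = sorted(set(codes))
        counts.update(codes)
        co.update(combinations(codes, 2))
    return {(a, b): n / math.sqrt(counts[a] * counts[b])
            for (a, b), n in co.items()}

def to_pajek(cosines):
    """Render the weighted pairs as text in Pajek's .net layout."""
    labels = sorted({c for pair in cosines for c in pair})
    index = {label: i + 1 for i, label in enumerate(labels)}
    lines = [f"*Vertices {len(labels)}"]
    lines += [f'{index[label]} "{label}"' for label in labels]
    lines.append("*Edges")
    lines += [f"{index[a]} {index[b]} {w:.4f}"
              for (a, b), w in sorted(cosines.items())]
    return "\n".join(lines)

# Hypothetical patents, each reduced to its set of 4-digit codes.
patents = [
    {"A61K", "C07D"},
    {"A61K", "C07D", "A61P"},
    {"A61K", "A61P"},
    {"H04L"},
]

cos = class_cosines(patents)
print(to_pajek(cos))
```

Classes that never co-occur with another class (H04L in the toy data) 
do not appear as vertices in this sketch, since only co-assigned pairs 
are written out.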

<http://www.leydesdorff.net/wipo06/world.zip>World (135,536 patents; zipped)

<http://www.leydesdorff.net/wipo06/AD.txt>Andorra (4 patents)
<http://www.leydesdorff.net/wipo06/AE.txt>United Arab Emirates (15 patents)
<http://www.leydesdorff.net/wipo06/AG.txt>Antigua and Barbuda (4 patents)
...., etc.

<http://www.leydesdorff.net/wipo06/ES.txt>Spain (1114 patents)
<http://www.leydesdorff.net/wipo06/FI.txt>Finland (1651 patents)
<http://www.leydesdorff.net/wipo06/FR.zip>France (6958 patents; zipped)
...., etc.

<http://www.leydesdorff.net/wipo06/NL.txt>Netherlands (3287 patents)
<http://www.leydesdorff.net/wipo06/NO.txt>Norway (665 patents)
<http://www.leydesdorff.net/wipo06/NZ.txt>New Zealand (444 patents)

....., etc.

Loet Leydesdorff
Amsterdam School of Communications Research (ASCoR)
Kloveniersburgwal 48, 1012 CX Amsterdam
Tel.: +31-20- 525 6598; fax: +31-20- 525 3681
<mailto:loet at leydesdorff.net>loet at leydesdorff.net ; 

Now available: 
Knowledge-Based Economy: Modeled, Measured, Simulated. 385 pp.; US$ 
Self-Organization of the Knowledge-Based Society; 
Challenge of Scientometrics


"David E. Wojick, Ph.D." <WojickD at osti.gov>
Senior Consultant -- The DOE Science Accelerator 
A strategic initiative of the Office of Scientific and Technical 
Information, US Department of Energy

(540) 858-3150
391 Flickertail Lane, Star Tannery, VA 22654 USA
http://www.bydesign.com/powervision/resume.html provides my bio and 
client list.
presents some of my own research on information structure and 