new Arrowsmith one-node search tool is live

Smalheiser, Neil Nsmalheiser at PSYCH.UIC.EDU
Mon Dec 8 16:37:11 EST 2014

We have maintained a free, public Arrowsmith two-node search tool for over a decade. Now, at long last, we have implemented a one-node search tool and invite feedback and suggestions. Whereas the Arrowsmith two-node literature tool is designed for scientists to ASSESS a hypothesis relating two literatures A and C, the one-node tool is designed to help scientists FIND a promising hypothesis in the first place. For example, given an existing drug, one may wish to repurpose it, i.e., find some disease that has NOT previously been treated with the drug, yet may be promising in this regard. Conversely, for a given disease, one may wish to find drugs that have never been used to treat the disease, but that would be promising to try because the disease exhibits multiple phenotypes or molecular alterations that are also affected (in the opposite direction) by the drug [1]. Our late colleague Don Swanson had hosted a one-node search tool, but it was small-scale and not user-friendly, and the web service was not maintained. We have now programmed a novel version of the one-node search tool that was proposed in [2].

The user enters a PubMed search for the A literature, representing a problem domain (e.g. Huntington disease). Next, the user is prompted to choose a category of Medical Subject Headings (MeSH) to search within; the category encompasses a set of literatures describing entities (or classes of entities) that represent possible approaches or solutions to the problem. (Alternatively, the user can choose the Free Format option and enter a list of PubMed search queries, one on each line.)

For example, to search among different classes of drugs according to their molecular mechanism using the MeSH Tree option, the user would drill down from Chemicals and Drugs to Chemical Actions and Uses to Pharmacologic Actions and finally to Molecular Mechanisms of Pharmacological Action [D27.505.519]. This category includes about twenty classes of drugs, including Alkylating Agents [D27.505.519.124], Angiotensin Receptor Antagonists [D27.505.519.162], Antacids [D27.505.519.170], Antifoaming Agents [D27.505.519.178], and so on. Once the user chooses this MeSH term category, the software will carry out a series of two-node searches, each consisting of A = Huntington disease vs. C = one of the drug classes. Each two-node search is characterized by the total number of articles in A and in C, by nAC (the number of articles in the intersection of A and C), by the total number of B-terms, and by pR, the proportion of B-terms that are predicted to be relevant for meaningful linkage. The results of each two-node search are stored temporarily by job ID, so users can return to them without re-running the search.
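As a rough illustration of how a one-node search fans out into two-node searches, here is a minimal Python sketch. It is not the Arrowsmith implementation: the function names, the dictionary layout, and the plain-text query strings are all assumptions made for this example. It only shows the pairing of one A query with each candidate C query, plus the combined query whose hit count would correspond to nAC.

```python
def parse_free_format(text):
    """Split Free Format input into queries, one per non-empty line.

    Hypothetical helper; the real tool's input handling is not described here.
    """
    return [line.strip() for line in text.splitlines() if line.strip()]


def build_two_node_queries(a_query, c_queries):
    """Pair a single A-literature query with each candidate C-literature query."""
    jobs = []
    for c_query in c_queries:
        jobs.append({
            "A": a_query,
            "C": c_query,
            # Articles matching both queries form the intersection counted by nAC.
            "AC": f"({a_query}) AND ({c_query})",
        })
    return jobs


# Illustrative queries only; real searches would use full PubMed syntax.
candidates = parse_free_format("alkylating agents\n\n antacids \n")
jobs = build_two_node_queries("huntington disease", candidates)
for job in jobs:
    print(job["AC"])
```

Each entry in `jobs` stands for one two-node search; in the actual tool, each such search also yields the B-term list and the pR estimate.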

For screening purposes, we suggest that a promising C-literature can be regarded as one that has a very low nAC value (that is, it has not been studied much in the context of the A-literature previously) but a very high pR value (that is, the A and C literatures share a lot of implicit information [3]). Note that pR values < 0.1 are probably at chance levels, whereas a pR value > 0.3 is reasonably high [3]. Once an interesting pair of literatures has been found, one then needs to examine in detail how the B-terms link the A and C literatures and what it means! Often, an initial one-node search may be followed by additional one-node searches (i.e., choosing progressively more specific MeSH categories). As specific candidates are identified, two-node searches, other literature searches, and pragmatic considerations are needed to assess how promising each candidate is for further study.
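The screening heuristic above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `TwoNodeResult` record, the `screen` function, and the nAC cutoff of 5 are invented for the example (the post says only that nAC should be "very low"), and the sample numbers are made up, not Arrowsmith output. The pR threshold of 0.3 is the value suggested above.

```python
from dataclasses import dataclass


@dataclass
class TwoNodeResult:
    c_label: str  # name of the candidate C-literature
    n_ac: int     # number of articles shared by A and C
    p_r: float    # proportion of B-terms predicted to be relevant


def screen(results, max_n_ac=5, min_p_r=0.3):
    """Keep C-literatures with low nAC and high pR, best candidates first.

    max_n_ac is an assumed cutoff; min_p_r follows the suggested 0.3 level.
    """
    keep = [r for r in results if r.n_ac <= max_n_ac and r.p_r > min_p_r]
    # Highest pR first; ties broken by the smaller existing overlap.
    return sorted(keep, key=lambda r: (-r.p_r, r.n_ac))


# Invented values for illustration only.
results = [
    TwoNodeResult("class W", n_ac=0, p_r=0.42),
    TwoNodeResult("class X", n_ac=120, p_r=0.55),  # already well studied with A
    TwoNodeResult("class Y", n_ac=2, p_r=0.08),    # pR near chance level
    TwoNodeResult("class Z", n_ac=3, p_r=0.35),
]
shortlist = [r.c_label for r in screen(results)]
print(shortlist)
```

A shortlist like this is only a starting point; each surviving candidate still needs the detailed B-term review described above.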

References:
1. Swanson DR, Smalheiser NR. An interactive system for finding complementary literatures: a stimulus to scientific discovery. Artificial Intelligence 1997; 91: 183-203.
2. Smalheiser NR. Literature-based discovery: beyond the ABCs. J. Am. Information Sci. Technol. 2011; 63: 218-224.
3. Torvik VI, Smalheiser NR. A quantitative model for linking two disparate sets of articles in Medline. Bioinformatics 2007; 23(13): 1658-1665.

Neil R. Smalheiser, MD, PhD
University of Illinois at Chicago
Psychiatric Institute MC912
1601 W. Taylor Street
Chicago, IL 60612

More information about the SIGMETRICS mailing list