Monev V. "Introduction to similarity searching in chemistry" MATCH-COMMUNICATIONS IN MATHEMATICAL AND IN COMPUTER CHEMISTRY, (51): 7-38 APR 2004

Eugene Garfield garfield at CODEX.CIS.UPENN.EDU
Mon Aug 9 17:34:20 EDT 2004


E-mail Address: vmonev at orgchm.bas.bg
Reprints in pdf format are available on request from the author

The author has cited a paper by Eugene Garfield:
"Reaction Similarity and Retrieval," Current Contents #5, p. 3-5, January 30,
1995.
The URL for this paper has changed. The correct URL is:
http://www.isinet.com/essays/chemicalliterature/15.html/


Author(s): Monev, V

Title: Introduction to similarity searching in chemistry

Source: MATCH-COMMUNICATIONS IN MATHEMATICAL AND IN COMPUTER CHEMISTRY,
(51): 7-38 APR 2004


Abstract:
The similarity concept and its database implementation, similarity
searching, are reviewed in the context of chemoinformatics.
Similarity is defined in terms of matches/overlap, dissimilarity in terms of
mismatches/difference, for qualitative/quantitative characteristics.
Similarity, dissimilarity and composite measures are constructed from
similarity or/and dissimilarity components. Asymmetric measures are
constructed by unequal weighting of dissimilarity components. Whole objects
or local regions of them are compared, yielding global or local similarity.
Asymmetric local similarity is obtained by treating the objects in the
comparison unequally, e.g. by ignoring parts of them. Global characteristics
provide overall descriptions of objects, local characteristics provide
sufficient locational information for object alignment/superposition to be
effected. Similar objects are likely to have similar properties (the similar
property principle).
In chemical similarity searching, molecules, fragments of molecules,
reactions, mixtures, journal articles, etc. are selected as objects of
interest. The selection of characteristics and their encoding is illustrated
using the atom pair and topological torsion descriptors, as well as their
variants of increased fuzziness. Similarity measure selection is still very
much a matter of trial and error. Standard query object specification is
made easier by using query by example, multiple searches using a single
query yield a highly informative hyperlinked screen, and joint queries
involve more than one object. Similarity scores illustrate results from
similarity searches and measures of their effectiveness. Areas of
application include direct and reverse property prediction, data mining,
virtual screening, diversity analysis, pharmacophore searching, ligand
docking, structure elucidation, pattern matching, and signature analysis.
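
The abstract's construction of measures from match and mismatch components, and of asymmetric measures by weighting the mismatch components unequally, can be sketched with a Tversky-style index over feature sets. This is only an illustrative sketch, not the paper's own formulation: the function name `tversky`, the weights `alpha` and `beta`, and the placeholder feature labels are assumptions introduced here.

```python
def tversky(a, b, alpha=1.0, beta=1.0):
    """Tversky-style similarity between two feature collections.

    common  = features present in both objects (the match component)
    only_a  = features present only in a (a mismatch component)
    only_b  = features present only in b (a mismatch component)
    alpha and beta weight the two mismatch components; unequal
    weights make the measure asymmetric.
    """
    a, b = set(a), set(b)
    common = len(a & b)
    only_a = len(a - b)
    only_b = len(b - a)
    denom = common + alpha * only_a + beta * only_b
    # Two empty feature sets are treated as identical (an assumption).
    return common / denom if denom else 1.0


# Hypothetical feature labels standing in for encoded descriptors.
query = {"f1", "f2", "f3", "f4"}
target = {"f1", "f2", "f3", "f5", "f6"}

symmetric = tversky(query, target)             # equal weights: Tanimoto, 0.5
query_view = tversky(query, target, 1.0, 0.0)  # ignore target-only features, 0.75
target_view = tversky(target, query, 1.0, 0.0) # roles swapped, 0.6
```

With equal weights the index reduces to the Tanimoto coefficient and is symmetric. Setting one weight to zero ignores the features unique to one object, so swapping the two arguments changes the score.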
Addresses: Bulgarian Acad Sci, Inst Organ Chem, BU-1113 Sofia, Bulgaria
Reprint Address: Monev, V, Bulgarian Acad Sci, Inst Organ Chem, BU-1113
Sofia, Bulgaria.

Cited References:
2001, SPSS REFERENCE MANUA.
*FT T, 2003, 9 INT C FUZZ THEOR T.
*MDDR, 2002, MOL DES DRUG DAT REP.
AMAT L, 2001, J CHEM INF COMP SCI, V41, P978.
BILGIC T, 1999, HDB FUZZY SETS SYSTE, V1, P195.
BRADSHAW J, 1997, INTRO TVERSKY SIMILA.
BRADSHAW J, 2001, EUROMUG01.
CARBO R, 1980, INT J QUANTUM CHEM, V17, P1185.
CARBO R, 1987, INT J QUANTUM CHEM, V32, P517.
CARBODORCA R, 1998, THEOCHEM-J MOL STRUC, V451, P11.
CARHART RE, 1985, J CHEM INF COMP SCI, V25, P64.
CHEN X, 1999, J CHEM INF COMP SCI, V39, P887.
CIOSLOWSKI J, 1991, INT J QUANTUM CHEM Q, V25, P81.
CIOSLOWSKI J, 1991, J AM CHEM SOC, V113, P64.
CIOSLOWSKI J, 1991, J AM CHEM SOC, V113, P6756.
CIOSLOWSKI J, 1992, THEOCHEM, V255, P9.
CIOSLOWSKI J, 1993, J AM CHEM SOC, V115, P11213.
CIOSLOWSKI J, 1998, ENCY COMPUTATIONAL C, V2, P892.
COMMITTEE NRC, 1995, MATH CHALLENGES THEO.
COOPER DL, 1989, J COMPUT AID MOL DES, V3, P253.
CRIPPEN GM, 1999, J COMPUT CHEM, V20, P1577.
DELANEY MF, 1985, J CHEM INF COMP SCI, V25, P27.
DIXON SL, 1999, J MED CHEM, V42, P2887.
DOWNS GM, 1996, REV COMP CH, V7, P1.
FISANICK W, 1994, J CHEM INF COMP SCI, V34, P130.
FRATEV F, 1979, J MOL STRUCT, V56, P245.

GARFIELD E, 2002, REACTION SIMILARITY.
http://www.isinet.com/essays/chemicalliterature/15.html/

GASTEIGER J, 2003, HDB CHEMINFORMATICS.
GILLET VJ, 1998, J CHEM INF COMP SCI, V38, P165.
GILLET VJ, 1999, J CHEM INF COMP SCI, V39, P169.
GILLET VJ, 2002, COMBINING DIFFERENT.
GOWER JC, 1985, ENCY STATISTICAL SCI, V5, P397.
GRETHE G, 1990, J CHEM INF COMP SCI, V30, P511.
HAGADONE TR, 1992, J CHEM INF COMP SCI, V32, P515.
HOLLIDAY JD, 2002, COMB CHEM HIGH T SCR, V5, P155.
HORVATH D, 2000, ACTUAL CHIMIQUE, V9, P64.
HULL RD, 2001, J MED CHEM, V44, P1177.
HULL RD, 2001, J MED CHEM, V44, P1185.
JAMES CA, 2000, DAYLIGHT THEORY MANU.
JOHNSON MA, 1990, CONCEPTS APPL MOL SI.
KEARSLEY SK, 1996, J CHEM INF COMP SCI, V36, P118.
KLEIN DJ, 1997, J CHEM INF COMP SCI, V37, P656.
KOCHEV N, 2003, CHEMOINFORMATICS TXB, P291.
MESTRES J, 1997, J COMPUT CHEM, V18, P934.
MEZEY PG, 1999, MOL PHYS, V96, P169.
NILAKANTAN R, 1987, J CHEM INF COMP SCI, V27, P82.
PONEC R, 1990, COLLECT CZECH CHEM C, V55, P896.
RAREY M, 1998, J COMPUT AID MOL DES, V12, P471.
RAREY M, 2001, J COMPUT AID MOL DES, V15, P497.
RHODES N, 2000, J CHEM INF COMP SCI, V40, P210.
RICHARDS WG, 1988, CHEM BRIT, V24, P1141.
ROBINSON DD, 1999, J CHEM INF COMP SCI, V39, P594.
ROBINSON DD, 2000, J CHEM INF COMP SCI, V40, P503.
SANTINI S, 1999, IEEE T PATTERN ANAL, V21, P871.
SHERIDAN RP, 1996, J CHEM INF COMP SCI, V36, P128.
SHERIDAN RP, 1998, J CHEM INF COMP SCI, V38, P915.
SHERIDAN RP, 2000, J CHEM INF COMP SCI, V40, P1456.
SHERIDAN RP, 2001, J CHEM INF COMP SCI, V41, P1395.
SHERIDAN RP, 2002, J CHEM INF COMP SCI, V42, P103.
SINGH SB, 2001, J MED CHEM, V44, P1564.
SKVORTSOVA MI, 1998, J CHEM INF COMP SCI, V38, P785.
SNEATH PHA, 1966, J THEOR BIOL, V12, P157.
SNEATH PHA, 1973, NUMERICAL TAXONOMY.
SZABO A, 1989, MODERN QUANTUM CHEM.
TRINAJSTIC N, 1986, INT J QUANTUM CHEM Q, V20, P699.
VOIGT JH, 2001, J CHEM INF COMP SCI, V41, P702.
WANG P, 1996, IEEE T SYST MAN CY B, V26, P321.
WILLETT P, 1987, SIMILARITY CLUSTERIN.
WILLETT P, 1998, ENCY COMPUTATIONAL C, P2748.
WILLIAMS A, 2000, CURR OPIN DRUG DISC, V3, P298.

Cited Reference Count: 70
Times Cited: 0

Publisher: UNIV BAYREUTH, DEPT MATHEMATICS
Publisher Address: C/O PROF DR A KERBER, D-95440 BAYREUTH, GERMANY

ISSN: 0340-6253
29-char Source Abbrev.: MATCH-COMMUN MATH COMPUT CHEM
ISO Source Abbrev.: Match-Commun. Math. Comput. Chem.
Source Item Page Count: 32
Subject Category: CHEMISTRY, MULTIDISCIPLINARY; COMPUTER SCIENCE,
INTERDISCIPLINARY APPLICATIONS; MATHEMATICS, INTERDISCIPLINARY APPLICATIONS
ISI Document Delivery No.: 833JD


