Affiliations 

  • 1 Faculty of Information Science and Technology, National University of Malaysia, 43600 UKM Bangi, Malaysia
J Chem Inf Model, 2010 Aug 23;50(8):1340-9.
PMID: 20672867 DOI: 10.1021/ci1001235

Abstract

This paper discusses the weighting of two-dimensional fingerprints for similarity-based virtual screening, specifically the use of weights that assign greatest importance to the substructural fragments that occur least frequently in the database that is being screened. Virtual screening experiments using the MDL Drug Data Report and World of Molecular Bioactivity databases show that the use of such inverse frequency weighting schemes can result, in some circumstances, in marked increases in screening effectiveness when compared with the use of conventional, unweighted fingerprints. Analysis of the characteristics of the various schemes demonstrates that such weights are best used to weight the fingerprint of the reference structure in a similarity search, with the database structures' fingerprints unweighted. However, the increases in performance resulting from such weights are only observed with structurally homogeneous sets of active molecules; when the actives are diverse, the best results are obtained using conventional, unweighted fingerprints for both the reference structure and the database structures.
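The reference-only weighting described above can be illustrated with a minimal sketch. The abstract does not state the exact weighting formula or similarity coefficient, so everything specific here is an assumption made for illustration: the weight log(N / n_f) for a fragment occurring in n_f of N database molecules, the continuous form of the Tanimoto coefficient, the treatment of fragments unseen in the database, and the hypothetical function names `inverse_frequency_weights`, `tanimoto`, and `rank`. Fingerprints are represented as sets of fragment identifiers.

```python
import math
from collections import Counter


def inverse_frequency_weights(database):
    """Weight each fragment by log(N / n_f); rare fragments get large weights.

    Assumed scheme for illustration; the abstract does not give the formula.
    """
    n = len(database)
    # Count each fragment at most once per molecule.
    counts = Counter(frag for fp in database for frag in set(fp))
    return {frag: math.log(n / c) for frag, c in counts.items()}


def tanimoto(x, y):
    """Continuous Tanimoto coefficient for sparse vectors {fragment: value}."""
    dot = sum(value * y[frag] for frag, value in x.items() if frag in y)
    denom = sum(v * v for v in x.values()) + sum(v * v for v in y.values()) - dot
    return dot / denom if denom else 0.0


def rank(reference, database, weights=None):
    """Rank database molecules against a reference fingerprint.

    With weights=None both sides are binary (the conventional search).
    Otherwise only the reference is weighted; database fingerprints stay binary.
    """
    if weights is None:
        ref_vec = {frag: 1.0 for frag in reference}
    else:
        # Assumption: a fragment absent from the database is treated as if
        # it occurred in exactly one molecule.
        unseen = math.log(len(database))
        ref_vec = {frag: weights.get(frag, unseen) for frag in reference}
    scores = [
        (i, tanimoto(ref_vec, {frag: 1.0 for frag in fp}))
        for i, fp in enumerate(database)
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy data: fragment 1 occurs in every molecule, fragments 3-6 occur once each.
database = [{1, 2, 3}, {1, 2, 4}, {1, 5}, {1, 2, 6}]
reference = {1, 3, 7}

weights = inverse_frequency_weights(database)
unweighted = rank(reference, database)
weighted = rank(reference, database, weights)
```

In this toy data the ubiquitous fragment 1 receives zero weight, so in the weighted search only the molecule that shares the rare fragment 3 with the reference scores above zero, whereas the unweighted search gives every molecule some credit for fragment 1. This mirrors the intuition behind the scheme; it says nothing about when weighting helps in practice, which the abstract reports depends on how structurally homogeneous the actives are.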
