Statistical confidence for variable selection in QSAR models via Monte Carlo cross-validation

Konovalov, Dmitry A., Sim, Nigel, Deconinck, Eric, Vander Heyden, Yvan, and Coomans, Danny (2008) Statistical confidence for variable selection in QSAR models via Monte Carlo cross-validation. Journal of Chemical Information and Modeling, 48 (2). pp. 370-383.

View at Publisher Website: http://dx.doi.org/10.1021/ci700283s

Abstract

A new variable selection wrapper method, the Monte Carlo variable selection (MCVS) method, was developed within the framework of the Monte Carlo cross-validation (MCCV) approach. The MCVS method reports variable selection results as P-values, the most conventional and common measure in statistical hypothesis testing, thus allowing a clear and simple statistical interpretation of the results. The MCVS method is equally applicable to multiple-linear-regression (MLR)-based and non-MLR-based quantitative structure-activity relationship (QSAR) models. To demonstrate the workings of the new approach, the method was applied with MLR to the blood-brain barrier (BBB) permeation and human intestinal absorption (HIA) QSAR problems. Starting from more than 1600 molecular descriptors, only two descriptors, TPSA(NO) and ALOGP, yielded acceptably low P-values for the BBB and HIA problems, respectively. The new method has been implemented in the QSAR-BENCH v2 program, which is freely available for academic use (including its Java source code) from www.dmitrykonovalov.org.
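As a rough illustration of the MCCV framework the abstract refers to, the sketch below repeatedly splits a data set at random into training and validation parts, fits a single-descriptor linear model on the training part, and averages the squared validation error. It is only a generic MCCV loop under stated assumptions: the function names (`fit_line`, `mccv_mse`), the 50% validation fraction, the number of splits, and the synthetic data are all hypothetical, and the way MCVS turns such repeated splits into P-values is not reproduced here.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x with a single descriptor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return mean_y - slope * mean_x, slope


def mccv_mse(xs, ys, n_splits=200, valid_fraction=0.5, seed=0):
    """Mean squared validation error averaged over random train/validation splits."""
    n = len(xs)
    n_valid = max(1, int(round(n * valid_fraction)))
    if n - n_valid < 2:
        raise ValueError("need at least two training points")
    rng = random.Random(seed)
    indices = list(range(n))
    total = 0.0
    for _ in range(n_splits):
        rng.shuffle(indices)  # a fresh random split on every iteration
        valid, train = indices[:n_valid], indices[n_valid:]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        total += sum((ys[i] - (a + b * xs[i])) ** 2 for i in valid) / n_valid
    return total / n_splits


# Synthetic, hypothetical data: one descriptor determines the activity,
# the other is pure noise.
data_rng = random.Random(1)
informative = [i / 10 for i in range(20)]
noise = [data_rng.random() for _ in range(20)]
activity = [2.0 * x + 1.0 for x in informative]

print("informative descriptor:", mccv_mse(informative, activity))
print("noise descriptor:      ", mccv_mse(noise, activity))
```

In this toy setting the informative descriptor should give a much smaller averaged validation error than the noise descriptor, which is the kind of contrast a wrapper method can build on when ranking descriptors.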

Item ID: 8714
Item Type: Article (Refereed Research - C1)
ISSN: 1549-9596
Date Deposited: 25 Feb 2010 22:50
FoR Codes: 01 MATHEMATICAL SCIENCES > 0104 Statistics > 010401 Applied Statistics @ 100%
SEO Codes: 97 EXPANDING KNOWLEDGE > 970101 Expanding Knowledge in the Mathematical Sciences @ 100%
Citation Count from Web of Science: 18
Downloads: Total: 2