Indexed by:Journal article
Date of Publication:2009-04-01
Journal:QSAR & COMBINATORIAL SCIENCE
Included Journals:SCIE, Scopus
Volume:28
Issue:4
Page Number:396-405
ISSN No.:1611-020X
Key Words:Androgen receptor; Classification; QSAR; Random forest
Abstract:The purpose of the present study was to develop in silico models that reliably discriminate androgenic from nonandrogenic compounds, based on a large, diverse dataset of 205 compounds. Random Forest (RF) was applied as a new classification method; its performance in classifying these compounds in terms of their Quantitative Structure-Activity Relationships (QSAR) was evaluated and compared with that of the widely used Partial Least Squares (PLS) analysis on the same dataset. The predictive power of both methods was verified with five-fold cross-validation and an independent test set. For the RF model, the cross-validation prediction accuracies for the androgenic and nonandrogenic compounds are 81.0% and 77.0%, respectively, and an average of 87.3% of compounds are correctly classified in the external tests. PLS performs slightly worse, with average prediction accuracies of 75% and 74.7% for cross-validation and external validation, respectively. Our analysis demonstrates that RF is a powerful tool capable of building models for these data and should be valuable for virtual screening of androgen receptor-binding ligands.
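
The evaluation protocol summarized in the abstract (five-fold cross-validation with accuracy reported separately for each class) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `stratified_folds` and `per_class_accuracy`, the stratified round-robin fold assignment, and the toy labels are all assumptions made for illustration, and no classifier or data from the paper is reproduced.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=5, seed=0):
    """Assign sample indices to k folds with similar class ratios (hypothetical helper)."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal indices round-robin so each fold gets a near-equal share of the class.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


def per_class_accuracy(y_true, y_pred):
    """Fraction of correctly classified samples within each true class."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, guess in zip(y_true, y_pred):
        total[truth] += 1
        if truth == guess:
            correct[truth] += 1
    return {label: correct[label] / total[label] for label in total}


# Toy labels only (1 = androgenic, 0 = nonandrogenic); not the paper's data.
toy_labels = [1] * 10 + [0] * 15
folds = stratified_folds(toy_labels, k=5)
```

In a full workflow, each fold would be held out in turn, a model (for example RF or PLS) fitted on the remaining four folds, and `per_class_accuracy` computed on the held-out predictions before averaging over folds.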