Release Time: 2019-03-09
Indexed by: Journal Article
Date of Publication: 2009-04-01
Journal: QSAR & COMBINATORIAL SCIENCE
Included Journals: Scopus, SCIE
Volume: 28
Issue: 4
Page Number: 396-405
ISSN: 1611-020X
Key Words: Androgen receptor; Classification; QSAR; Random forest
Abstract: The purpose of the present study was to develop in silico models for the reliable prediction of androgenic and nonandrogenic compounds, based on a large, diverse dataset of 205 compounds. Random Forest (RF) was applied as a new classification method; its performance in classifying these compounds in terms of their Quantitative Structure-Activity Relationships (QSAR) was evaluated and compared with that of the widely used Partial Least Squares (PLS) analysis on the same dataset. The predictive power of both methods was verified with five-fold cross-validation and an independent test set. For the RF model, the prediction accuracies for the androgenic and nonandrogenic compounds are 81.0% and 77.0%, respectively, in cross-validation, with an average of 87.3% of compounds correctly classified in the external tests. PLS performs slightly worse, with average prediction accuracies of 75% and 74.7% for the cross-validation and external validation, respectively. Our analysis demonstrates that RF is a powerful tool capable of building models for such data and should be valuable for virtual screening of androgen receptor-binding ligands.
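As an illustration of the validation scheme the abstract describes, the following minimal Python sketch shows one way to form five class-balanced folds and to compute a separate prediction accuracy for each class. It does not reproduce the study's RF or PLS models. The function names `stratified_k_folds` and `per_class_accuracy` are hypothetical, and the 100/105 class split is an assumption made only for the example, since the abstract gives just the total of 205 compounds.

```python
import random
from collections import defaultdict


def stratified_k_folds(labels, k=5, seed=0):
    """Split sample indices into k folds with similar class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    position = 0
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal indices round-robin, continuing where the previous class
        # stopped, so fold sizes differ by at most one sample.
        for idx in indices:
            folds[position % k].append(idx)
            position += 1
    return folds


def per_class_accuracy(y_true, y_pred):
    """Return the fraction of correctly predicted samples for each true class."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    return {label: correct[label] / total[label] for label in total}


# Assumed class split (hypothetical): only the total of 205 is from the abstract.
labels = ["androgenic"] * 100 + ["nonandrogenic"] * 105
folds = stratified_k_folds(labels, k=5, seed=0)

# Toy predictions, unrelated to the study's results.
toy_true = ["androgenic", "androgenic", "nonandrogenic", "nonandrogenic", "nonandrogenic"]
toy_pred = ["androgenic", "nonandrogenic", "nonandrogenic", "nonandrogenic", "androgenic"]
toy_accuracy = per_class_accuracy(toy_true, toy_pred)
```

In a cross-validation run, each fold would serve once as the held-out set while a classifier is trained on the other four, and `per_class_accuracy` would be applied to the pooled held-out predictions. Reporting accuracy per class, as the abstract does for RF, shows whether a model favours one class over the other.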