Indexed by:Journal article
Date of Publication:2011-03-25
Journal:ANALYTICA CHIMICA ACTA
Included Journals:PubMed, SCIE, EI
Volume:690
Issue:1
Page Number:53-63
ISSN No.:0003-2670
Key Words:P2Y(12); Genetic algorithm; Support vector machine; Descriptor selection
Abstract:A genetic algorithm (GA)-support vector machine (SVM) coupled approach is proposed for optimizing the subset of 2D molecular descriptors generated for a series of antagonists of P2Y(12), a member of the G-protein-coupled receptor family, with the statistical performance and efficiency of the model simultaneously enhanced by SVM kernel-based nonlinear projection. To our knowledge, this is the first QSAR study to predict P2Y(12) inhibition activity from an unusually large dataset of 364 structurally diverse P2Y(12) antagonists. In addition, three other widely used approaches, namely partial least squares (PLS), random forest (RF), and Gaussian process (GP) routines, each combined with GA (GA-PLS, GA-RF, and GA-GP, respectively), are employed and compared with the GA-SVM method on several rigorous evaluation criteria. The results indicate that the GA-SVM model is a powerful tool for predicting the activity of P2Y(12) antagonists: it yields a conventional correlation coefficient R² of 0.976 and a cross-validated R²(CV) of 0.829 for the training set, as well as R²(pred) of 0.811 for the test set, significantly outperforming the other three methods, whose averages are R² = 0.894, R²(CV) = 0.741, and R²(pred) = 0.693. With its excellent predictive capacity in both internal and external validation, the proposed model should be helpful for screening and optimizing potential P2Y(12) antagonists prior to chemical synthesis in drug development. (C) 2011 Elsevier B.V. All rights reserved.
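Illustrative sketch:The GA-SVM coupling described in the abstract can be pictured as a genetic algorithm that searches over boolean descriptor masks, scoring each mask by the cross-validated R² of an SVM regressor trained on the selected columns. The following is a minimal sketch of that idea, not the authors' implementation. It assumes scikit-learn's `SVR` with an RBF kernel, arbitrary hyperparameters, truncation selection, uniform crossover, bit-flip mutation, and synthetic data standing in for the descriptor matrix. The names `fitness` and `ga_select` are hypothetical.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR


def fitness(mask, X, y):
    """Mean 3-fold cross-validated R^2 of an RBF-kernel SVR on the selected columns."""
    model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0))
    return cross_val_score(model, X[:, mask], y, cv=3, scoring="r2").mean()


def repair(mask, rng):
    """Guarantee that at least one descriptor is selected."""
    if not mask.any():
        mask[rng.integers(mask.size)] = True
    return mask


def ga_select(X, y, pop_size=12, generations=5, p_mut=0.1, seed=0):
    """Toy GA over boolean descriptor masks; returns (best_mask, best_score)."""
    rng = np.random.default_rng(seed)
    n_feat = X.shape[1]
    # Random initial population: each descriptor is kept with probability 0.5.
    pop = np.array([repair(rng.random(n_feat) < 0.5, rng) for _ in range(pop_size)])
    best_mask, best_score = pop[0].copy(), -np.inf
    for _ in range(generations):
        scores = np.array([fitness(ind, X, y) for ind in pop])
        order = np.argsort(scores)[::-1]
        if scores[order[0]] > best_score:
            best_mask, best_score = pop[order[0]].copy(), scores[order[0]]
        parents = pop[order[: pop_size // 2]]        # truncation selection
        children = []
        while len(children) < pop_size:
            a, b = parents[rng.integers(len(parents), size=2)]
            cross = rng.random(n_feat) < 0.5         # uniform crossover
            child = np.where(cross, a, b)
            child ^= rng.random(n_feat) < p_mut      # bit-flip mutation
            children.append(repair(child, rng))
        pop = np.array(children)
    return best_mask, float(best_score)


# Synthetic stand-in data: only columns 0 and 3 carry signal.
rng = np.random.default_rng(1)
X = rng.normal(size=(60, 10))
y = 2.0 * X[:, 0] - X[:, 3] + 0.1 * rng.normal(size=60)
mask, score = ga_select(X, y)
print("selected descriptors:", mask.nonzero()[0], "CV R^2:", round(score, 3))
```

In a real QSAR workflow the matrix `X` would hold computed 2D descriptors and `y` the measured activities, and an external test set would be held out before the search so that a predictive R² can be reported separately from the cross-validated value used as the GA fitness.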