Data source: CROSBI

Nonlinear Multivariate Regression Outperforms Several Concisely Designed Neural Networks on Three QSPR Data Sets (CROSBI ID 86904)

Journal article | original scientific paper | international peer review

Lučić, Bono ; Amić, Dragan ; Trinajstić, Nenad. Nonlinear Multivariate Regression Outperforms Several Concisely Designed Neural Networks on Three QSPR Data Sets // Journal of Chemical Information and Computer Sciences, 40 (2000), 2; 403-413

Statement of responsibility

Lučić, Bono ; Amić, Dragan ; Trinajstić, Nenad

English

Nonlinear Multivariate Regression Outperforms Several Concisely Designed Neural Networks on Three QSPR Data Sets

Neural networks (NNs) are accepted as the most powerful nonlinear technique in QSAR and QSPR modeling. However, the NN models are often very robust, containing a large number of parameters optimized during the training procedure. We have recently found (J. Chem. Inf. Comput. Sci. 1999, 39, 121-132) that the simpler nonlinear multiregression (MR) models are significantly better than the robust NNs, according to the same statistical parameters. In the present paper we investigated whether the nonlinear MR models are also better than the concisely designed NN models. Nonlinear MR models were generated in the following way. First, nonlinear terms (2-fold and 3-fold cross-products of the initial descriptors) were calculated and added to the initial descriptors. Then, a combination of two powerful techniques for descriptor selection (CROMRsel for the best selection and CROMRiisel for approximate, i-by-i stepwise selection) was used to detect the most important descriptors in the MR models. For boiling points (BPs) of 150 alkanes, the 20-descriptor MR model produced a cross-validated (CV) standard error of 2.88 K, whereas the best NN model (with 70-80 adjusted weights) had 3.60 K. Prediction of the BPs of 50 compounds using the 17-descriptor MR model (obtained on 100 compounds) gave a standard error of 3.58 K. In the modeling of 243 chemical shifts, the CV standard errors were 0.89 ppm and 1.19 ppm with the 15- and 9-descriptor MR models, respectively. The best NN models adjusted 60-90 weights and achieved 1.42 ppm. The standard error in predicting 83 chemical shifts using the 10-descriptor MR model obtained on 160 samples was 1.25 ppm. This data set also shows that model quality depends on the scaling procedure used to transform the initial descriptors. In the modeling of sublimation enthalpy, the CV correlation coefficient was 0.97 using the best 4-descriptor MR model versus 0.93 obtained using an NN with 50 adjusted weights. The CV correlation coefficient in predicting the sublimation enthalpies of 21 compounds using the 4-descriptor MR model was 0.98. This is, to our knowledge, the first unambiguous result that shows a way of obtaining nonlinear MR models with better fitted, cross-validated, and predictive performance than the corresponding NN models. Moreover, the nonlinear MR models are significantly simpler than the NN models, which allows one to establish the functional relationships between the modeled property/activity and the descriptors.
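
The workflow outlined in the abstract (extend the descriptor set with cross-products, select a small subset, judge it by cross-validation) can be illustrated with a minimal sketch. It is not the authors' CROMRsel or CROMRiisel code: greedy forward selection stands in for their selection procedures, squared and cubed terms are counted among the cross-products, the error is a plain leave-one-out root-mean-square error, and the data and the function names `add_cross_products`, `loo_cv_error`, and `forward_select` are hypothetical.

```python
from itertools import combinations_with_replacement

import numpy as np


def add_cross_products(X, max_order=3):
    """Return X extended with all 2-fold..max_order-fold column products.

    Assumption: repeated indices (squares, cubes) count as cross-products.
    """
    p = X.shape[1]
    cols = [X[:, j] for j in range(p)]
    names = [(j,) for j in range(p)]  # each name lists the source columns
    for order in range(2, max_order + 1):
        for idx in combinations_with_replacement(range(p), order):
            cols.append(np.prod(X[:, list(idx)], axis=1))
            names.append(idx)
    return np.column_stack(cols), names


def loo_cv_error(X, y):
    """Leave-one-out RMS error of a least-squares model with intercept."""
    n = len(y)
    A = np.column_stack([np.ones(n), X])
    resid = np.empty(n)
    for i in range(n):
        mask = np.arange(n) != i
        coef, *_ = np.linalg.lstsq(A[mask], y[mask], rcond=None)
        resid[i] = y[i] - A[i] @ coef
    return float(np.sqrt(np.mean(resid ** 2)))


def forward_select(X, y, n_select):
    """Greedy forward selection (a stand-in, not CROMRsel/CROMRiisel)."""
    chosen = []
    for _ in range(n_select):
        remaining = [j for j in range(X.shape[1]) if j not in chosen]
        best = min(remaining, key=lambda j: loo_cv_error(X[:, chosen + [j]], y))
        chosen.append(best)
    return chosen


# Synthetic example: the property depends on one product term and one
# linear term, so a 2-descriptor nonlinear MR model should describe it.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 3))
y = 2.0 * X[:, 0] * X[:, 1] - X[:, 2] + rng.normal(scale=0.01, size=60)

X_ext, names = add_cross_products(X)
chosen = forward_select(X_ext, y, n_select=2)
print([names[j] for j in chosen], loo_cv_error(X_ext[:, chosen], y))
```

Because the selected model is linear in a handful of explicit terms, its coefficients can be read off directly, which is the interpretability advantage over NN weights that the abstract points to.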

Publication details

40 (2)

2000.

403-413

published

0095-2338

Related fields

Chemistry

Indexing