Methods for building linear regression equations in the “structure-property” problems
Abstract
This paper describes alternative approaches to building linear regression equations for problems concerned with describing the physicochemical parameters of molecules. The approaches considered include the ordinary least squares, least absolute deviation, and orthogonal distance methods. For problems with multicollinear predictor sets, principal component regression and L2-regularization were applied. Special attention is given to approaches that make it possible to reduce the number of predictors (L1-regularization and least angle regression). For data with noticeable errors in both the dependent and independent variables, the orthogonal distance method was examined as an alternative to the least squares approach. The adequacy of the previously investigated least absolute deviation of orthogonal distances (LADOD) method has been demonstrated.
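The difference between the least squares and orthogonal distance criteria can be illustrated with a minimal sketch for a single predictor. This is not the authors' implementation: the function names `fit_ols` and `fit_orthogonal`, the synthetic data, and the noise levels are all assumptions made for illustration. Ordinary least squares minimizes squared vertical residuals, while the orthogonal fit (computed here through a singular value decomposition of the centered data, assuming equal error variance in both variables and a non-vertical line) minimizes squared perpendicular distances.

```python
import numpy as np

def fit_ols(x, y):
    """Slope and intercept minimizing the sum of squared vertical residuals."""
    design = np.column_stack([x, np.ones_like(x)])
    (slope, intercept), *_ = np.linalg.lstsq(design, y, rcond=None)
    return slope, intercept

def fit_orthogonal(x, y):
    """Slope and intercept minimizing the sum of squared perpendicular distances.

    Assumes equal error variance in x and y and a non-vertical best-fit line.
    """
    x_mean, y_mean = x.mean(), y.mean()
    centered = np.column_stack([x - x_mean, y - y_mean])
    # The right singular vector with the largest singular value
    # gives the direction of the best-fit line through the centroid.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    dx, dy = vt[0]
    slope = dy / dx
    return slope, y_mean - slope * x_mean

# Hypothetical data: noise is added to BOTH the predictor and the response.
rng = np.random.default_rng(0)
true_x = np.linspace(0.0, 10.0, 50)
x = true_x + rng.normal(scale=0.5, size=true_x.size)
y = 2.0 * true_x + 1.0 + rng.normal(scale=0.5, size=true_x.size)

ols = fit_ols(x, y)
odr = fit_orthogonal(x, y)
print("OLS slope, intercept:       ", ols)
print("Orthogonal slope, intercept:", odr)
```

When the predictor carries noise, the least squares slope is biased toward zero, so the orthogonal slope is never smaller in magnitude than the least squares slope for the same data. On noise-free data lying exactly on a line, the two fits coincide.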