Methods for building linear regression equations in the “structure-property” problems

Keywords: least squares method, least absolute deviations method, regularization, principal component regression, orthogonal distance method, physicochemical molecular properties

Abstract

Alternative approaches to building linear regression equations for describing the physicochemical parameters of molecules are described. The approaches considered include the ordinary least squares, least absolute deviations, and orthogonal distance methods. For problems with multicollinear predictor sets, principal component regression and L2-regularization were applied. Special attention is given to approaches that make it possible to reduce the number of predictors (L1-regularization and the least angle methods). For data with noticeable errors in both the dependent and independent variables, the orthogonal distance method was examined as an alternative to the least squares approach. The adequacy of the previously investigated least absolute deviation of orthogonal distances (LADOD) method is demonstrated.
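To make the difference between three of the fitting criteria named in the abstract concrete, here is a minimal sketch for the simplest case of one predictor and an intercept. It is not the authors' implementation: the function names and the demo data are hypothetical, the least absolute deviations fit uses a brute-force search that is only practical for small data sets, and the orthogonal distance fit assumes equal error variance in both variables.

```python
from itertools import combinations
from math import sqrt


def _centered_sums(x, y):
    """Return means and centered sums of squares and cross-products."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return mx, my, sxx, syy, sxy


def fit_ols(x, y):
    """Ordinary least squares: minimize the sum of squared vertical residuals."""
    mx, my, sxx, _, sxy = _centered_sums(x, y)
    slope = sxy / sxx
    return slope, my - slope * mx


def fit_lad(x, y):
    """Least absolute deviations: minimize the sum of absolute vertical residuals.

    For a straight line, at least one optimal solution passes through two
    data points, so every pair of points is tried (O(n^3), small n only).
    """
    best = None
    for i, j in combinations(range(len(x)), 2):
        if x[i] == x[j]:
            continue  # a vertical line cannot be written as y = a*x + b
        slope = (y[j] - y[i]) / (x[j] - x[i])
        intercept = y[i] - slope * x[i]
        cost = sum(abs(yk - (slope * xk + intercept)) for xk, yk in zip(x, y))
        if best is None or cost < best[0]:
            best = (cost, slope, intercept)
    return best[1], best[2]


def fit_orthogonal(x, y):
    """Orthogonal distance fit: minimize squared perpendicular distances.

    Closed form for one predictor, assuming equal error variance in x and y.
    """
    mx, my, sxx, syy, sxy = _centered_sums(x, y)
    if sxy == 0:
        raise ValueError("slope is undefined when x and y are uncorrelated")
    slope = (syy - sxx + sqrt((syy - sxx) ** 2 + 4 * sxy ** 2)) / (2 * sxy)
    return slope, my - slope * mx


# Hypothetical data: y = 2x + 1 with one gross outlier in the last point.
x_demo = [0.0, 1.0, 2.0, 3.0, 4.0]
y_demo = [1.0, 3.0, 5.0, 7.0, 30.0]

for name, fit in [("OLS", fit_ols), ("LAD", fit_lad), ("orthogonal", fit_orthogonal)]:
    slope, intercept = fit(x_demo, y_demo)
    print(f"{name}: slope={slope:.3f}, intercept={intercept:.3f}")
```

On data like this, the least absolute deviations line stays on the four collinear points, while the two squared-error criteria are pulled toward the outlier. For multi-predictor problems a linear programming or iteratively reweighted formulation would normally replace the brute-force search.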

Published
2018-09-03
How to Cite
Berdnyk, M. I., Onizhuk, M. O., & Ivanov, V. V. (2018). Methods for building linear regression equations in the “structure-property” problems. Kharkiv University Bulletin. Chemical Series, (30), 6-17. https://doi.org/10.26565/2220-637X-2018-30-01