COMPARATIVE ANALYSIS OF MACHINE LEARNING MODELS AND REGRESSIONS FOR CAR PRICE PREDICTION

Keywords: car price, regression, neural networks, ensemble of models

Abstract

The purpose of this study is a comparative analysis of the predictive quality of several machine learning and regression models. The model factors are the consumer characteristics of a used car: brand, transmission type, drive type, engine type, mileage, body type, year of manufacture, seller's region in Ukraine, condition of the car, accident history, average price of comparable cars in Ukraine, engine volume, number of doors, availability of extra equipment, number of passenger seats, first registration of the car, and whether the car was imported from abroad. Qualitative variables were encoded either as binary variables or by mean target encoding. Data on more than 200 thousand cars were used for modeling. All models were evaluated in Python using the Sklearn, Catboost, Statsmodels and Keras libraries. The following regression and machine learning models were considered: linear regression; polynomial regression; decision tree; neural network; models based on the "k-nearest neighbors", "random forest" and "gradient boosting" algorithms; and an ensemble of models. The article presents the best option from each class of models according to the criteria R2, MAE, MAD and MAPE. Non-linear models were found to predict the price of a passenger car best. The modeling results show that the dependence between the price of a car and its characteristics is best described by the ensemble of models, which includes a neural network and models using the "random forest" and "gradient boosting" algorithms. The ensemble showed an average relative approximation error of 11.2% and an average relative forecast error of 14.34%. All non-linear models for car price have approximately the same predictive quality in this research: their MAPE values differ by no more than 2%.
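Two techniques named in the abstract, mean target encoding and the MAPE criterion, can be illustrated with a minimal Python sketch. It is not the authors' code: the toy brands, prices and per-model predictions are made up, the function names are hypothetical, and the unweighted averaging of the three models is an assumption, since the abstract does not say how the ensemble combines its members.

```python
from collections import defaultdict
from statistics import mean


def mean_target_encode(categories, targets):
    """Map each category to the mean target value observed for it."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for cat, y in zip(categories, targets):
        sums[cat] += y
        counts[cat] += 1
    return {cat: sums[cat] / counts[cat] for cat in sums}


def mape(actual, predicted):
    """Mean absolute percentage error, expressed in percent."""
    return 100.0 * mean(abs(a - p) / abs(a) for a, p in zip(actual, predicted))


def average_ensemble(*model_predictions):
    """Unweighted average of several models' predictions (an assumption)."""
    return [mean(preds) for preds in zip(*model_predictions)]


# Toy, made-up data: a qualitative factor (brand) and the target (price).
brands = ["A", "B", "A", "C", "B"]
prices = [10000.0, 20000.0, 12000.0, 8000.0, 22000.0]

# Replace each brand with the mean price observed for that brand.
encoding = mean_target_encode(brands, prices)
encoded_brand = [encoding[b] for b in brands]

# Hypothetical predictions standing in for a neural network,
# a random forest and a gradient boosting model.
pred_nn = [10500.0, 19000.0, 12500.0, 8200.0, 21000.0]
pred_rf = [9800.0, 20500.0, 11500.0, 7900.0, 23000.0]
pred_gb = [10300.0, 20200.0, 12300.0, 8200.0, 22300.0]

ensemble_pred = average_ensemble(pred_nn, pred_rf, pred_gb)
ensemble_mape = mape(prices, ensemble_pred)
```

In practice the encoding would be computed on the training sample only and then applied to the test sample, so that test prices do not leak into the factors.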

References

Osokina, O. (2015). Econometric modeling of the cost of the car Renault Duster in the secondary market. Moscow: VII International Student Scientific Conference "Student Scientific Forum". p.8. (in Russian)

Zhurkina, E. (2015). Econometric modeling of the cost of the car Ford Fiesta in the secondary market, the example calculations. Moscow: VII International Student Scientific Conference "Student Scientific Forum". p.6. (in Russian)

Mrochko, A., Batojargalov, B. (2015). Econometric analysis of the market for used cars (BMW, Mercedes, Audi). Moscow: International student scientific newsletter, 6, 28. (in Russian)

Valeeva, Z., Isavnin, A. (2016). Econometric modeling of price car in the secondary market in the city of Naberezhnye Chelny. Moscow: Fundamental research, 6, 154-158. (in Russian)

Utakaeva, I.H. (2019). Experience of econometric modeling using the statistical analysis package Python. Bulletin of the Altai Academy of Economics and Law, 2, 346-351. (in Russian)

Gegic, E., Isakovic, B., Keco, D., Masetic, Z. & Kevric, J. (2019). Car Price Prediction using Machine Learning Techniques. Saraevo: TEM Journal, 1 (8), 113-118.

Kanwal, N., Sadaqat, J. (2017). Vehicle Price Prediction System using Machine Learning Techniques. New York: International Journal of Computer Applications, 167, 27-31.

Ozcalici, M. (2017). Predicting Second-Hand Car Sales Price Using Decision Trees and Genetic Algorithms. Istanbul: Alphanumeric Journal, 5, 103-114.

Autoria. Retrieved from https://auto.ria.com.

Statsmodels. Retrieved from https://www.statsmodels.org/.

Sklearn. Retrieved from https://scikit-learn.org/.

Catboost. Retrieved from https://catboost.ai/.

Keras. Retrieved from https://keras.io/.

How to Build, Develop and Deploy a Machine Learning Model to predict cars price using Neural Networks (2019). Medium. Retrieved from https://medium.com/thelaunchpad/how-to-build-develop-and-deploy-a-machine-learning-model-to-predict-cars-price-using-neural-7f7439a37300.

Published
2019-12-26
How to Cite
Kovpak, E., & Orlov, F. (2019). COMPARATIVE ANALYSIS OF MACHINE LEARNING MODELS AND REGRESSIONS FOR CAR PRICE PREDICTION. Bulletin of V. N. Karazin Kharkiv National University Economic Series, (97), 31-40. https://doi.org/10.26565/2311-2379-2019-97-04
Section
Modelling, simulation and information technology in economics and management