Economic series prediction basing on internet users sentiment analysis

  • Катерина Юріївна Кононова V.N. Karazin Kharkov National University, Svobody Square, 4, Kharkiv, Kharkivs'ka oblast, 61000, Ukraine http://orcid.org/0000-0001-6990-5746
  • Антон Олегович Дек V.N. Karazin Kharkov National University, Svobody Square, 4, Kharkiv, Kharkivs'ka oblast, 61000, Ukraine
  • Вадим Вікторович Марков V.N. Karazin Kharkov National University, Svobody Square, 4, Kharkiv, Kharkivs'ka oblast, 61000, Ukraine
  • Максим Вікторович Шпакович Kharkiv National University of Radioelectronics, Nauky Ave, 14, Kharkiv, Kharkivs'ka oblast, 61000, Ukraine
Keywords: bitcoin price forecasting, microblog posts, financial news feeds, latent semantic analysis, opinion mining, neural networks

Abstract

This article proposes a set of models for forecasting economic time series based on the analysis of objective and subjective internet content. Bitcoin (BTC), the first and currently the most popular cryptocurrency, was chosen as the object of analysis. The exponential growth of the cryptocurrency market in recent years makes forecasting the BTC rate a relevant problem. The theoretical framework of the research is the concept of behavioral finance, which holds that traders behave irrationally and that their decisions depend largely on psychological factors. The aim of the research is to develop a methodology for forecasting currency rates and a set of models based on the analysis of factual (objective) and conceptual (subjective) internet content. Factual information is drawn from the news feeds of specialized news portals; conceptual information is drawn from users’ microblog posts. The set of models comprises: 1) scripts for parsing news portals and microblogs; 2) an algorithm that generates factual and conceptual exogenous variables using latent semantic analysis, sentiment analysis and Granger causality analysis; 3) the chosen mathematical forecasting tools, namely feedforward neural networks and recurrent networks with long short-term memory (LSTM), whose sets of inputs were formed by genetic algorithms. Processing the database of news feeds and tweets yielded a set of exogenous factors that includes four of the fourteen factual variables (infrastructure, activity, dissemination and expect) and two of the eight conceptual ones (calm and confusion). The optimization of the neural network architecture was automated with genetic algorithms: the chromosome length equaled the number of variables, and individuals underwent crossover (hybridization), mutation and selection based on their fitness function, the MSE on the validation dataset.
A comparative analysis of different neural network architectures confirmed that internet content is useful for forecasting economic time series and demonstrated the high adequacy of the developed models.
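
The genetic selection of model inputs described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a binary chromosome with one gene per candidate variable and uses the MSE on a validation set as the fitness to minimize, as the abstract states. Everything else is an assumption made for illustration: a small least-squares linear model stands in for the neural networks, selection keeps the better half of the population, crossover is one-point, and the data are synthetic. The names `fit_linear`, `validation_mse` and `evolve`, and all parameter values, are hypothetical.

```python
import random


def fit_linear(X, y, ridge=1e-6):
    """Least-squares weights from the normal equations (stand-in model)."""
    n = len(X[0])
    # A = X^T X + ridge * I keeps the system well conditioned; b = X^T y.
    A = [[sum(r[i] * r[j] for r in X) + (ridge if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    b = [sum(r[i] * t for r, t in zip(X, y)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        b[c], b[p] = b[p], b[c]
        for r in range(c + 1, n):
            f = A[r][c] / A[c][c]
            for k in range(c, n):
                A[r][k] -= f * A[c][k]
            b[r] -= f * b[c]
    # Back substitution.
    w = [0.0] * n
    for i in reversed(range(n)):
        w[i] = (b[i] - sum(A[i][k] * w[k] for k in range(i + 1, n))) / A[i][i]
    return w


def validation_mse(mask, X_train, y_train, X_val, y_val):
    """Fitness: fit on the variables switched on in mask, score on validation."""
    cols = [i for i, bit in enumerate(mask) if bit]

    def design(X):
        # A bias column followed by the selected variables.
        return [[1.0] + [row[i] for i in cols] for row in X]

    w = fit_linear(design(X_train), y_train)
    preds = [sum(wi * xi for wi, xi in zip(w, row)) for row in design(X_val)]
    return sum((p - t) ** 2 for p, t in zip(preds, y_val)) / len(y_val)


def evolve(n_vars, fitness, pop_size=20, generations=30, p_mut=0.05, rng=None):
    """Binary genetic algorithm; chromosome length equals the number of variables."""
    rng = rng or random.Random(0)
    pop = [[rng.randint(0, 1) for _ in range(n_vars)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness)                  # lower validation MSE is better
        parents = pop[: pop_size // 2]         # selection: keep the better half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_vars)     # one-point crossover
            child = a[:cut] + b[cut:]
            # Mutation: flip each gene with probability p_mut.
            child = [1 - g if rng.random() < p_mut else g for g in child]
            children.append(child)
        pop = parents + children
    return min(pop, key=fitness)


# Synthetic data (assumption): only variables 0 and 3 drive the target.
data_rng = random.Random(42)


def make_data(n):
    X = [[data_rng.uniform(-1, 1) for _ in range(6)] for _ in range(n)]
    y = [3 * r[0] - 2 * r[3] + data_rng.gauss(0, 0.1) for r in X]
    return X, y


X_tr, y_tr = make_data(80)
X_va, y_va = make_data(40)


def fitness(mask):
    return validation_mse(mask, X_tr, y_tr, X_va, y_va)


best = evolve(6, fitness)
print("selected mask:", best, "validation MSE:", fitness(best))
```

Because the better half of each generation is carried over unchanged, the best validation MSE in the population never gets worse between generations. To follow the article more closely, `validation_mse` would train a feedforward or LSTM network on the selected inputs instead of the linear stand-in.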

Published
2017-03-01
How to Cite
Кононова, К. Ю., Дек, А. О., Марков, В. В., & Шпакович, М. В. (2017). Economic series prediction basing on internet users sentiment analysis. Bulletin of V. N. Karazin Kharkiv National University Economic Series, (91), 90-99. Retrieved from https://periodicals.karazin.ua/economy/article/view/8072
Section
Modelling, simulation and information technology in economics and management