Usage of graph databases for social graph modeling
Abstract
This article examines graph database management systems and reviews their main characteristics and capabilities. Problems that arise during social network development and that can be solved with a graph data model are identified. Three of the currently most popular graph database management systems, Neo4j, OrientDB and ArangoDB, were chosen for the study. The selected databases are compared on whether the software is proprietary or freely distributed, whether the documentation is up to date, whether the developers actively support the product, whether there is a community where questions can be answered, and how much time is needed to master the database. Typical social network queries, which must quickly return results at a large search depth, were written in Cypher, OrientDB SQL and AQL, the query languages of Neo4j, OrientDB and ArangoDB respectively. Query execution speed was then compared across the selected databases. For this purpose, a graph with 5000 nodes and 24900 connections was built using the Barabási-Albert model for generating random scale-free networks. Test tasks were generated that find the friends of three users to a depth of 5, and the average time per request was estimated over several executions. Conclusions are drawn and recommendations are given for selecting the best graph database for implementing a social network.
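The benchmark setup described in the abstract can be illustrated with a small in-memory sketch. It is not the study's code: the actual queries were written in Cypher, OrientDB SQL and AQL and run against the databases themselves. The function names, the attachment parameter m = 5, the random seed, the number of timing runs and the three user IDs are all assumptions made for illustration. With 5000 nodes and m = 5 this textbook construction yields 24975 edges, close to but not equal to the 24900 reported, since the abstract does not state the seed configuration that was used.

```python
import random
import time
from collections import deque


def barabasi_albert(n, m, seed=None):
    """Return an adjacency dict for a Barabási-Albert graph with n nodes.

    Each node added after the first m attaches to m distinct existing
    nodes, chosen with probability proportional to their current degree.
    """
    if not 1 <= m < n:
        raise ValueError("need 1 <= m < n")
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    targets = list(range(m))  # the first new node links to all m seed nodes
    endpoints = []            # each node appears once per incident edge
    for new in range(m, n):
        for t in targets:
            adj[new].add(t)
            adj[t].add(new)
        endpoints.extend(targets)
        endpoints.extend([new] * m)
        # Sampling uniformly from `endpoints` samples proportionally to degree.
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(endpoints))
        targets = list(chosen)
    return adj


def friends_within_depth(adj, start, max_depth):
    """Breadth-first search: nodes reachable from start in 1..max_depth hops."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the depth limit
        for neighbour in adj[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, depth + 1))
    seen.discard(start)
    return seen


def average_seconds(func, runs=5):
    """Mean wall-clock time of func() over several runs."""
    total = 0.0
    for _ in range(runs):
        t0 = time.perf_counter()
        func()
        total += time.perf_counter() - t0
    return total / runs


if __name__ == "__main__":
    graph = barabasi_albert(5000, 5, seed=42)  # m and seed are assumptions
    edges = sum(len(nbrs) for nbrs in graph.values()) // 2
    print("nodes:", len(graph), "edges:", edges)
    for user in (0, 2500, 4999):  # three arbitrary users, chosen for illustration
        found = friends_within_depth(graph, user, 5)
        avg = average_seconds(lambda: friends_within_depth(graph, user, 5))
        print(user, len(found), f"{avg:.6f} s")
```

Timings from this sketch reflect only a Python traversal of an in-memory adjacency structure and say nothing about the relative speed of the three databases; it is meant only to make the graph generation, the depth-limited friend search and the averaging step concrete.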