KEMBAR78
A Comparison of Current Graph Database Models | PDF | Databases | Conceptual Model
0% found this document useful (0 votes)
105 views7 pages

A Comparison of Current Graph Database Models

Uploaded by

sheltonma
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
105 views7 pages

A Comparison of Current Graph Database Models

Uploaded by

sheltonma
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 7

2012 IEEE 28th International Conference on Data Engineering Workshops

A Comparison of Current Graph Database Models


Renzo Angles
Department of Computer Science, Engineering Faculty, Universidad de Talca
Camino Los Niches, Km. 1, Curicó, Chile
renzoangles@gmail.com

Abstract— The limitations of traditional databases, in partic- The benefits of using a graph data model are given by: the
ular the relational model, to cover the requirements of current introduction of a level of abstraction which allows a more
applications has lead the development of new database technolo- natural modeling of graph data; query languages and operators
gies. Among them, the Graph Databases are calling the attention
of the database community because in trendy projects where a for querying directly the graph structure; and ad-hoc structures
database is needed, the extraction of worthy information relies and algorithms for storing and querying graphs.
on processing the graph-like structure of the data. In this paper Motivation. In order to choose the most suitable graph
we present a systematic comparison of current graph database
models. Our review includes general features (for data storing database, to be used in some application domain, we need
and querying), data modeling features (i.e., data structures, to known its features, advantages and disadvantages. In par-
query languages, and integrity constraints), and the support for ticular, the data model implemented by a database play an
essential graph queries. important role in its selection because this ensure (in advance)
I. I NTRODUCTION an innate support for storing and querying the data for a
desired application domain.
The limitations of traditional databases, in particular the
relational model, to cover the requirements of current applica- Related Work. Concerning the origins of graph databases,
tion domains, has lead the development of new technologies Angles and Gutierrez [2] developed a survey on graph database
called NOSQL databases [1]. According to its data model, models proposed before the year 2002. The authors synthe-
these databases can be categorized as: Wide-column stores, sized the notion of a “graph database model” and compare the
which follow the BigTable model of Google (e.g., Cassandra); proposals available at the moment. It is important to emphasize
Document stores, which are oriented to store semi-structured that most of the works reviewed by the authors followed a
data (e.g., MongoDB); Key-value stores, which implement a theoretical interest more than practical developments. Hence,
key to value persistent map for data indexing and retrieval a practical evaluation of the models is not available.
(e.g. BerkeleyDB); and Graph Databases, which are oriented With respect to the recent developments in the area. Pere
to store graph-like data. Burton [9] reviewed six graph databases (Neo4j, Hyper-
Activity around graph databases flourished in the first half GraphDB, DEX, InfoGrid, Sones and VertexDB) and pub-
of the nineties and then the topic almost disappeared [2]. Re- lished a comparison-matrix that included information like
cently the area is gaining attention because in trendy projects software features (e.g., license), schema features (e.g., types
where a database is needed (for example chemistry [3], biology of nodes and edges), query features (e.g., language and traver-
[4], Web Mining [5] and semantic Web [6]), the importance sals), general database features (e.g., transactions, indexing),
of the information relies on the relations more or equal than database operation utilities (e.g., protocols), language bindings
on the entities (a basic principle of every graph database). and operating systems. This work summarizes the features
Moreover, the continued emergence and increase of massive but does not include major discussion nor analysis. Another
and complex graph-like data makes a graph database a crucial informal review [10] included an interesting questionnaire
requirement. This renascence is showed by the availability of about desirable features for graph databases. Domingues-Sal
several graph databases systems. et al.[11] evaluated the performance of three graph databases
One of the most important elements conforming a database (Neo4j, HypergraphDB and DEX) and an RDF Database
is its database model (or simply data model). In the most (Jena). The tests, that included the evaluation of several typical
general sense a data model is a collection of conceptual tools graph operations over different graph sizes, shown that DEX
used to model representations of real-world entities and the and Neo4j were the most efficient implementations.
relations among these entities [7]. From a database point of In summary, a systematic analysis and comparison of the
view, a data model consists of three components: a set of data current graph data models is, to the best of our knowledge,
structure types, a set of operators or inference rules, and a set not available.
of integrity rules [8]. Objective and Contribution. In this paper we compare
Graph database models can be characterized as those where current graph databases concentrating on their data model
data structures for the schema and instances are modeled as features, that is data structures, query facilities, and integrity
graphs or generalizations of them, and data manipulation is ex- constraints. We restrict our study to the logical level and avoid
pressed by graph-oriented operations and type constructors [2]. physical and implementation considerations. Additionally, and

978-0-7695-4748-0/12 $26.00 © 2012 IEEE 162


171
DOI 10.1109/ICDEW.2012.31

Authorized licensed use limited to: Zhejiang University. Downloaded on October 12,2021 at 11:58:12 UTC from IEEE Xplore. Restrictions apply.
considering that graph databases are oriented to store and Graph Stores. This category grouped implementations pro-
query graph data, we evaluate graph databases in terms of viding basic facilities for storing and querying graphs. Among
their support for querying a set of essential graph queries. them, Filament [20] is a project for a graph storage library
The paper is organized as follows: In Section II, we sur- with default support for SQL through JDB; G-Store [21]
veyed current graph databases. Section III, presents the com- is a basic storage manager for large vertex-labeled graphs;
parison of graph database models. The support for essential redis graph [22] provides a basic Python implementation for
graph queries is discussed in Section IV. Finally, in Section storing graphs; and VertexDB [23] implements a graph store
V we draw some conclusions and future work. on top of TokyoCabinet (a B-tree key/value disk store). Ad-
ditionally, we can mention CloudGraph [24], Horton [25] and
II. C URRENT G RAPH DATABASES
Trinity [26] as prototypes of graph databases.
In the last time there have been an increasing work on graph
Technologies related to graph databases. Graph databases
databases. Next we present a list of current implementations,
are oriented to store any type of graph, hence they are distinct
including a short description of each of them. Considering
from specialized data management tools that use graph notions
their level of maturity, in terms of the facilities provided by
in their implementation. Among them we can mention: Web-
a database management system, we consider two types of
oriented databases, which use graph structures for modeling
developments: graph databases and graph stores. Additionally,
data used in Web-oriented applications (e.g., InfoGrid and
we review several developments related to graph databases.
FlockDB); Document-oriented databases, which implement
Graph Databases. We assume that a graph database must graph algorithms in order to traverse the relations between
provide most of the major components in database manage- documents (e.g., OrientDB [27]); and Triple Stores (also called
ment systems, being them: external interfaces (user interface RDF databases), which are oriented to store RDF graphs
or API), database languages (for data definition, manipulation consisting in statements of the form subject-predicate-object
and querying), query optimizer, database engine (middle-level (e.g., 4Store, Virtuoso and Bigdata). Giraph [28] is a graph
model), storage engine (low-level model), transaction engine, processing infrastructure that runs on Hadoop (see Pregel);
management and operation features (tuning, backup, recovery, AnGrapa [29] is an large-scale graph data management frame-
etc.). Among the developments satisfying the above condition, work for analytical processing; GoldenOrb [30] is a cloud-
we found AllegroGraph, DEX, HypergraphDB, InfiniteGraph, based open source project for massive-scale graph analysis;
Neo4J and Sones. Phoebus [31] is an implementation of Google’s Pregel for
AllegroGraph[12] is one of the precursors in the current distributed processing of very large graphs.
generation of graph databases. Although it was born as a Additionally, we can found several in-memory graph tools
graph database, its current development is oriented to meet characterized by their restriction to work with small graphs.
the Semantic Web standards (i.e., RDF/S, SPARQL and For example, we can mention complex analysis tools (e.g.,
OWL). Additionally, AllegroGraph provides special features Cytoscape), and graph visualization tools (e.g., JUNG, IGraph,
for GeoTemporal Reasoning and Social Network Analysis. GraphViz, Gephi and NodeXL).
DEX[13], [14] provides a Java library for management of
persistent and temporary graphs. Its implementation, based on III. C OMPARISON OF G RAPH DATABASE M ODELS
bitmaps and other secondary structures, is oriented to ensure A comparison among databases is typically done by either
a good performance in the management of very large graphs. using a set of common features or by defining a general model
HyperGraphDB [15], [16] is a database that implements the used as a comparison basis. The evaluation presented in this
hypergraph data model where the notion of edge is extended section is oriented to evaluate the data model provided by each
to connect more than two nodes. This model allows a natural graph database, in terms of data structures, query language and
representation of higher-order relations, and is particularly use- integrity constraints.
ful for modeling data of areas like knowledge representation, As an initial approach we consider some general features for
artificial intelligence and bio-informatics. data storing, operation and manipulation (see Table I). From
InfiniteGraph [17] is a database oriented to support large- the point of view of data storing, we review the support for
scale graphs in a distributed environment. It aims the efficient three storing schemas (i.e., main memory, external memory,
traversal of relations across massive and distributed data and back-end storage) and the implementation of indexes. It is
stores. Its focus of attention is to extend business, social and important to emphasize that managing a huge amount of data
government intelligence with graph analysis. is a important requirement in real-life applications for graph
Neo4j [18] is based on a network oriented model where databases. Hence the support for external memory storage is
relations are first class objects. It implements an object- a main requirement. Additionally, indexes are the basis to
oriented API, a native disk-based storage manager for graphs, improve data retrieval operations.
and a framework for graph traversals. From the point of view of data operation and manipulation,
Sones [19] is a graph database which provides an inherent we evaluate whether a graph database implements database
support for high-level data abstraction concepts for graphs languages, application programming interfaces (API) and
(e.g., walks). It defines its own graph query language and a graphical user interfaces (GUI). We consider three database
underlying distributed file system. languages: the Data Definition Language, which allows to

172
163

Authorized licensed use limited to: Zhejiang University. Downloaded on October 12,2021 at 11:58:12 UTC from IEEE Xplore. Restrictions apply.
modify the schema of the database by adding, changing, or We consider four graph data structures: simple graphs,
deleting its objects; the Data Manipulation Language, which hypergraphs, nested graphs and attributed graphs. The basic
allows to insert, delete and update data in the database; and structure is a simple flat graph defined as a set of nodes
the Query Language, which allows to retrieve data by using (or vertices) connected by edges (i.e., a binary relation over
a query expression. Data operation and manipulation features the set of nodes). An Hypergraph extends this notion by
are summarized in Table II. allowing an edge to relate an arbitrary set of nodes (called
In comparison with the traditional approach in databases, an hyperedge). A nested graph is a graph whose nodes can
where high-level languages for data operation and manipu- be themselves graphs (called hypernodes). Attributed graphs
lation are provided, the most common mechanism in graph are graphs where nodes and edges can contain attributes for
databases is the use of APIs. It means several advantages: describing their properties [32]. Additionally, over the above
standard vocabulary (for functions and procedures), easy de- types of graphs, we consider directed or undirected edges,
velopment of applications, and an unlimited power for query- labeled or unlabeled nodes/edges, and attributed nodes/edges
ing data. However, it also brings serious problems: lower level (i.e., edges between edges are possible).
of abstraction (for the general user), programming language Note that most graph databases are based on simple graphs
restrictions, implementation-dependent efficiency, and decid- or attributed graphs. Only two support hypergraphs and no
ability problems. one nested graphs. We can remark that hypergraphs and
An important feature, not included in Table I, is the support attributed graphs can be modeled by nested graphs. In contrast,
to import and export data in different data formats. Although the multilevel nesting provided by nested graphs cannot be
there exists some data formats for encoding graphs (e.g, modeled by any of the other structures [2].
GraphML and TGV) none of them has been selected as In comparison with past graph database models, the in-
the standard one. This issue is particularly relevant for data clusion of attributes for nodes and edges is a particular
exchange and sharing. feature in current proposals. The introduction of attributes is
TABLE I oriented to improve the speed of retrieval for the data directly
DATA STORING FEATURES related to a given node. This feature shows the influence of
implementation issues in the selection and definition of the
Graph Main External Backend Indexes
data structures (and consequently of the data model).
Database memory memory Storage
AllegroGraph • • • TABLE III
DEX • • • G RAPH DATA STRUCTURES
Filament • •
G-Store •
HyperGraphDB • • • • Graphs Nodes Edges
• •

Attributed graphs
InfiniteGraph

Node attribution

Edge attribution
Neo4j • • •
Simple graphs

Nested graphs

Node labeled

Edge labeled
Hypergraphs

Sones • •
vertexDB • •

Directed
Graph Database
TABLE II AllegroGraph • • • •
DEX • • • • • •
O PERATION AND MANIPULATION FEATURES Filament • • • •
G-Store • • • •
Data Data Query API GUI HyperGraphDB • • • •
Graph Definition Manipulat. Language InfiniteGraph • • • • • •
Database Language Language Neo4j • • • • • •
AllegroGraph • • • • • Sones • • • • • • •
DEX • vertexDB • • • •
Filament •
G-Store • • •
HyperGraphDB • The expressive power for data modeling can be analyzed by
InfiniteGraph •
Neo4j •
comparing the support for representing entities, properties and
Sones • • • • • relations at both instance and schema levels. This evaluation
vertexDB • is shown in Table IV.
At the schema level we found that models support the defini-
tion of node, attribute and relation types. We also evaluate the
A. Graph data structures support for several nodes and relations at the instance level: an
The data structures refer to the types of entities or objects object node, identified by an object-ID, represents an instance
that can be used to model data. In the case of graph databases, of a node type; a value node represents an entity identified
the data structure is naturally defined around the notions of by a primitive value (i.e., its name); a complex node can
graphs, nodes and edges (see Table III). represent an special complex entity, for example a tuple or a

173
164

Authorized licensed use limited to: Zhejiang University. Downloaded on October 12,2021 at 11:58:12 UTC from IEEE Xplore. Restrictions apply.
set; an object relation, identified by a relation-ID, is an instance Data retrieval is the main objective in current graph
of a relation type; a simple relation represents a node-edge- databases. AllegroGraph supports reasoning via its Prolog
node instance; a complex relation is a relation with special implementation. Data analysis is supported in terms of special
semantics, for example grouping, derivation, and inheritance. functions (e.g., shortest path) for querying graph properties.
Value nodes and simple relations are supported by all the The lack of a standard query language is a disadvantage
models. The reason is that both conform the most basic and of current graph databases. Recall that in mature databases
simple model for representing graph data. The inclusion of the operation of the database is performed via standard and
object-oriented concepts (e.g., IDs for objects) for representing well-defined database languages. Instead, the focus in current
entities and relations reflects the use of APIs as the favorite graph databases is to provide APIs for popular programming
high-level interface for the database. Note that this issue is not languages. Hence, the selection is hardly determined by the
new in graph databases. In fact, it was naturally introduced by programmer skills or by application requirements.
the so called graph object-oriented data models [2].
Finally, the use of objects (for both nodes and relations) TABLE V
is different of using values. For example, an object node C OMPARISON OF QUERY FACILITIES (• INDICATES SUPPORT, AND ◦
represents an entity identified by an object-ID, but it does PARTIAL SUPPORT )

not represent the value-name of the entity. In this case, it is


necessary to introduce an explicit property or relation “name” Type Use

Graphical Q. L.
in order to specify the name of the entity. The same applies for

Query Lang.
relations. This issue generates an unnatural form of modeling

Reasoning
Retrieval

Analysis
graph data.

API
TABLE IV Graph Database
AllegroGraph ◦ • • • • •
R EPRESENTATION OF ENTITIES AND RELATIONS
DEX • • •
Filament • •
Schema Instance G-Store • •
HyperGraphDB • •
Complex relations

• •
Simple relations

InfiniteGraph
Object relations
Complex nodes
Property types

Relation types

Neo4j ◦ • •
Object nodes

Value nodes
Node types

Sones • • • •
vertexDB • •

Graph Database
AllegroGraph • •
DEX • • • • • •
Filament • • C. Integrity constraints
G-Store • •
HyperGraphDB • • • • • Integrity constraints are general statements and rules that
InfiniteGraph • • • • • • define the set of consistent database states, or changes of state,
Neo4j • • • • or both [2]. Table VI shows that integrity constraints are poorly
Sones • • •
vertexDB • •
studied in graph databases. In fact, there are not important
variations of the notions studied in the past.
We consider several integrity constraints: types checking,
to test the consistency of an instance with respect to the
B. Query languages definitions in the schema; node/edge identity, to verify that
A query language is a collection of operators or inference an entity or relation can be identified by either a value (e.g.,
rules that can be applied to any valid instance of the database, name or ID) or the values of its attributes (e.g., neighborhood
this with the objective of manipulating and querying data in identification); referential integrity, to test that only existing
any combination desired [2]. As is shown in Table III, query entities are referenced; cardinality checking, to verify unique-
languages are not frequent in current graph databases. In fact, ness of properties or relations; functional dependency, to test
there is not proposal for a standard one. that an element in the graph determines the value of another;
AllegroGraph supports SPARQL, the standard query lan- and graph pattern constraints, to verify an structural restriction
guage for RDF. SPARQL is based on graph pattern matching (e.g., path constraints).
but is not oriented to querying the graph structure of RDF The support for evolving schemas is a characteristic of graph
data. Neo4j is developing Cypher, a query language for databases that is commonly used to justify the lack of integrity
property graphs. G-Store and Sones include SQL-based query constraints. We aim that is not a valid argument assuming that
languages with special instructions for querying graphs. To the data consistency in a database is equal or even more important
best of our knowledge, there is not a formal definition of the than a flexible schema. Moreover, an evolving schema can be
semantics for the above query languages, making a systematic supported by allowing flexible structures in the schema (as in
study of their complexity and expressive power difficult. semi-structure data models). For example, the definition of a

174
165

Authorized licensed use limited to: Zhejiang University. Downloaded on October 12,2021 at 11:58:12 UTC from IEEE Xplore. Restrictions apply.
relation type as optional, enables the user to decide either the complexity. For example, finding simple paths with desired
inclusion or the absence of such relation for a given entity. properties in direct graphs is an NP-complete problem [34].
3) Pattern matching queries: Graph pattern matching con-
TABLE VI
sists in to find all sub-graphs of a data graph that are
C OMPARISON OF INTEGRITY CONSTRAINTS
isomorphic to a pattern graph. Particularly, it deals with two
problems: the graph isomorphism problem that has a unknown

Graph pattern constrains


computational complexity; and the sub-graph isomorphism

Functional dependency
Cardinality checking
problem which is an NP-complete problem [33].

Referential integrity
Node/edge identity
Types checking
Pattern matching has attracted a great deal of attention
in database theory [33], [34], [46], [47], data mining [48],
bioinformatics [49], and semantic Web [50].
4) Summarization queries: This type of queries are not
Graph Database related to consult the graph structure. Instead they are based
DEX • • •
HyperGraphDB • •
on special functions that allow to summarize or operate on
InfiniteGraph • • the query results, normally returning a single value. Aggregate
Sones • • functions (e.g., average, Count, maximum, etc.) are included
in this group.
Additionally, we consider functions to compute some prop-
IV. C URRENT G RAPH DATABASES AND THEIR SUPPORT erties of a graph and its elements. For example: the order of
FOR QUERYING GRAPHS the graph (i.e., the number of vertices), the degree of a node
(i.e., the number of neighbors of the node), the minimum,
In the database literature we can found several works
maximum and average degree in the graph, the length of a path
studying problems related to storing and querying graphs [33],
(i.e., the number of edges in the path), the distance between
[34]. Such problems play an important role for applications
nodes (i.e., the length of a shortest path between the nodes),
that use graphs as their data model [4]. In this section, we
the diameter of the graph (i.e., the greatest distance between
compare graph databases in terms of their facilities to solve
any two nodes), etc.
several queries which can be considered essential in graphs.
In Table VII, we summarized the support that current graph
We grouped then in adjacency, reachability, pattern matching
databases provide for answering the queries defined above.
and summarization queries.1
Recall that most graph databases implements an API instead
1) Adjacency queries: The primary notion in this type of
of a query language. Hence, we evaluate whether an API
queries is node/edge adjacency. Two nodes are adjacent (or
can answer an essential query by using a combination of
neighbors) when there is and edge between them. Similarly,
basic functions, more than the facilities to implement an
two edges are adjacent when they share a common node.
algorithm that solve the query (feature clearly supported by
Some typical queries in this group are: basic node/edge
a programming language).
adjacency, to test whether two nodes (edges) are adjacent [36];
Additionally, Table VIII presents the results of a previous
and k-neighborhood of a node, to list all the neighbors of the
study [35] about (past) graph query languages and their sup-
node [37]. Adjacency queries are useful in several contexts, for
port for querying essential graph queries. Such study provides
example in molecular biology [38], information retrieval[39]
a positive conclusion about the feasibility of developing a well-
and the semantic Web [40].
designed graph query language. It is important to note that
2) Reachability queries: This type of queries is character-
these languages were well-studied from a theoretical point
ized by path or traversal problems. The problem of reachability
of view. Hence, they provide a formal background for the
tests whether two given nodes are connected by a path.
definition of a standard query language for graph databases.
In this context, we can consider two types of paths: fixed-
length paths, which contain a fixed number of nodes and V. C ONCLUSIONS
edges; and regular simple paths, which allow some node and
In this paper we surveyed current graph databases and
edge restrictions (e.g., regular expressions). A related but more
compare them according to their data modeling features. We
complicated problem is to find the shortest path, that is to
shown that most graph database models provide an innate
compute the quickest/shortest route between two nodes.
support for different graph structures, query facilities in the
Reachability queries are studied and required in several
form of APIs (most of the models) and query languages
application domains. For example in database theory [41],
(a few of them), and basic notions of integrity constraints.
[42] (where recursive queries are in practice graph traversals),
Additionally, we defined a set of essential graph queries and
spatial databases [43], biological databases [44], and the
evaluated the query facilities provided by graph databases in
semantic Web [45]. One of the challenges to incorporate reach-
order to answer such queries.
ability queries into a query language is their computational
The review shown that some aspects of current graph
1 A similar classification was proposed and used in a previous evaluation database models deserve more development. In particular, the
of graph query languages [35]. definition of standard graph database languages (for defining,

175
166

Authorized licensed use limited to: Zhejiang University. Downloaded on October 12,2021 at 11:58:12 UTC from IEEE Xplore. Restrictions apply.
TABLE VII [4] B. A. Eckman and P. G. Brown, “Graph data management for molecular
C URRENT GRAPH DATABASES AND THEIR SUPPORT FOR ESSENTIAL and cell biology,” IBM Journal of Research and Development, vol. 50,
no. 6, pp. 545–560, 2006.
GRAPH QUERIES [5] A. Schenker, H. Bunke, M. Last, and A. Kandel, Graph-Theoretic
Techniques for Web Content Mining, ser. Series in Machine Perception
and Artificial Intelligence. World Scientific, 2005, vol. 62.
Adjacency Reachability
[6] J. Hayes and C. Gutierrez, “Bipartite Graphs as Intermediate Model for

Node/edge adjacency

Regular simple paths


RDF,” in Proceedings of the 3th International Semantic Web Conference

Fixed-length paths
(ISWC), ser. LNCS, no. 3298. Springer-Verlag, Nov 2004, pp. 47–61.

Pattern matching
k-neighborhood
[7] A. Silberschatz, H. F. Korth, and S. Sudarshan, “Data Models,” ACM

Summarization
Shortest path
Computing Surveys, vol. 28, no. 1, pp. 105–108, 1996.
[8] E. F. Codd, “Data Models in Database Management,” in Proceedings
of the 1980 Workshop on Data abstraction, Databases and Conceptual
Modeling. ACM Press, 1980, pp. 112–114.
Graph Database [9] P. Urbón, “Nosql graph database matrix,”
Allegro • • • http://nosql.mypopescu.com/post/619181345/nosql-graph-database-
DEX • • • • • matrix, May 2010.
Filament • • • [10] “Short overview on the emerging world of graph databases,”
G-Store • • • • • http://www.graph-database.org/overview.html.
HyperGraph • • [11] D. Dominguez-Sal, P. Urbón-Bayes, A. Giménez-Vañó, S. Gómez-
Infinite • • • • • Villamor, N. Martı́nez-Bazán, and J. L. Larriba-Pey, “Survey of graph
Neo4j • • • • • database performance on the hpc scalable graph analysis benchmark,”
Sones • • in Proc. of the 2010 international conference on Web-age information
vertexDB • • • • management (WAIM). Springer-Verlag, 2010, pp. 37–48.
[12] “AllegroGraph,” http://www.franz.com/agraph/allegrograph/.
[13] N. Martı́nez-Bazan, V. Muntés-Mulero, S. Gómez-Villamor, J. Nin, M.-
TABLE VIII A. Sánchez-Martı́nez, and J.-L. Larriba-Pey, “DEX: High-Performance
Exploration on Large Graphs for Information Retrieval,” in Proceedings
PAST GRAPH QUERY LANGUAGES AND THEIR SUPPORT FOR ESSENTIAL
of the 16th Conference on Information and Knowledge Management
GRAPH QUERIES (• INDICATES SUPPORT, AND ◦ PARTIAL SUPPORT ) (CIKM). ACM, 2007, pp. 573–582.
[14] “DEX,” http://www.sparsity-technologies.com/dex.
[15] “HyperGraphDB,” http://www.hypergraphdb.org/.
Distance between nodes

[16] B. Iordanov, “Hypergraphdb: a generalized graph database,” in Pro-


Node/edge adjacency

Regular simple paths

ceedings of the 2010 international conference on Web-age information


Fixed-length paths

management (WAIM). Springer-Verlag, 2010, pp. 25–36.


Degree of a node

[17] “Infinitegraph,” http://infinitegraph.com/.


[18] “Neo4j,” http://neo4j.org/.
[19] “Sones GraphDB,” http://www.sones.com/.
Diameter

[20] “Filament,” http://filament.sourceforge.net/.


[21] “G-Store,” http://g-store.sourceforge.net/.
Graph Database [22] “redis graph: Graph database for python,”
G ◦ • • https://github.com/amix/redis graph.
G+ • • • • • • [23] “vertexdb,” http://www.dekorte.com/projects/opensource/vertexdb/.
GraphLog • • • • • • [24] “Cloudgraph,” http://www.cloudgraph.com/.
Gram • • • [25] “Horton,” http://research.microsoft.com/en-us/projects/ldg.
GraphDB ◦ • • [26] “Trinity,” http://research.microsoft.com/en-us/projects/trinity/.
Lorel • • • [27] “Orientdb,” http://www.orientechnologies.com/.
F-G ◦ • • [28] “Giraph,” https://github.com/aching/Giraph.
[29] “Angrapa - the graph package,” http://wiki.apache.org/hama/GraphPackage.
[30] “Goldenorb,” http://www.goldenorbos.org/.
[31] “Phoebus,” https://github.com/xslogic/phoebus.
[32] H. Ehrig, U. Prange, and G. Taentzer, “Fundamental theory for typed
manipulating and querying the data) and notions of integrity attributed graph transformation,” in Proc. of the 2nd Int. Conference on
constraints (for preserving the consistency of the database). Graph Transformation (ICGT), ser. LNCS, no. 3256. Springer, 2004,
As future work we plan to develop an empirical evaluation pp. 162–177.
[33] M. Yannakakis, “Graph-Theoretic Methods in Database Theory,” in
of current graph databases; this oriented to make a quantita- Proceedings of the 9th Symposium on Principles of Database Systems
tive and qualitative analysis of their support for storing and (PODS). ACM Press, 1990, pp. 230–242.
querying graph data. [34] D. Shasha, J. T. L. Wang, and R. Giugno, “Algorithmics and Ap-
plications of Tree and Graph Searching,” in Proceedings of the 21th
Symposium on Principles of Database Systems (PODS). ACM Press,
ACKNOWLEDGMENT 2002, pp. 39–52.
This work was supported by the Chilean Fondecyt Project [35] R. Angles and C. Gutierrez, “Querying RDF Data from a Graph
Database Perspective,” in Proceedings of the 2nd European Semantic
No. 11100364. Web Conference (ESWC), ser. LNCS, no. 3532, 2005, pp. 346–360.
[36] L. Kowalik, “Adjacency queries in dynamic sparse graphs,” Information
R EFERENCES Processing Letters, vol. 102, pp. 191–195, May 2007.
[37] A. N. Papadopoulos and Y. Manolopoulos, Nearest Neighbor Search -
[1] “NOSQL Databases,” http://nosql-database.org/. A Database Perspective, ser. Series in Computer Science. Springer,
[2] R. Angles and C. Gutierrez, “Survey of graph database models,” ACM 2005.
Computing Surveys (CSUR), vol. 40, no. 1, pp. 1–39, 2008. [38] T. Seidl and H. peter Kriegel, “A 3d molecular surface representation
[3] A. Nayak. and I. Stojmenovic, Handbook of Applied Algorithms: Solving supporting neighborhood queries,” in Proc. of the 3rd Conference on
Scientific, Engineering, and Practical Problems. Wiley-IEEE Press, Intelligent Systems for Molecular Biology (ISMB). Springer, 1995, pp.
2008, ch. Graph Theoretic Models in Chemistry and Molecular Biology, 240–258.
pp. 85–113.

176
167

Authorized licensed use limited to: Zhejiang University. Downloaded on October 12,2021 at 11:58:12 UTC from IEEE Xplore. Restrictions apply.
[39] C.-S. Chang and A. L. P. Chen, “Supporting conceptual and neighbor-
hood queries on the world wide web,” IEEE Transactions on Systems,
Man, and Cybernetics (TSMC), vol. 28, no. 2, pp. 300–308, 1998.
[40] R. Guha, R. McCool, and E. Miller, “Semantic Search,” in Proceedings
of the 12th International Conference on World Wide Web (WWW). ACM
Press, 2003, pp. 700–709.
[41] P. Barceló, C. Hurtado, L. Libkin, and P. Wood, “Expressive Languages
for Path Queries over Graph-structured Data,” in Proc. of the 29th
ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database
Systems (PODS), 2010, pp. 3–14.
[42] W. Fan, J. Li, S. Ma, N. Tang, and Y. Wu, “Adding regular expressions
to graph reachability and pattern queries,” in Proc. of the IEEE 27th
International Conference on Data Engineering (ICDE), 2011, pp. 39–
50.
[43] J. Paredaens and B. Kuijpers, “Data Models and Query Languages for
Spatial Databases,” Data & Knowledge Engineering (DKE), vol. 25, no.
1-2, pp. 29–53, 1998.
[44] A. Matono, T. Amagasa, M. Yoshikawa, and S. Uemura, “An Eficient
Pathway Search Using an Indexing Scheme for RDF,” Genome Infor-
matics, no. 14, pp. 374–375, 2003.
[45] A. Gubichev and T. Neumann, “Path query processing on very large rdf
graphs,” in Proc. of the 14th International Workshop on the Web and
Databases (WebDB), 2011.
[46] P. Barcelo, L. Libkin, and J. Reutter, “Querying graph patterns,” in Proc.
of the 30th ACM SIGMOD-SIGACT-SIGART Symposium on Principles
of Database Systems (PODS), 2011, pp. 199–210.
[47] J. Cheng, J. X. Yu, B. Ding, P. S. Yu, and H. Wang, “Fast graph pattern
matching,” in Proc. of the 2008 IEEE 24th International Conference on
Data Engineering (ICDE). IEEE Computer Society, 2008, pp. 913–922.
[48] T. Washio and H. Motoda, “State of the Art of Graph-based Data
Mining,” SIGKDD Explorer Newsletter, vol. 5, no. 1, pp. 59–68, 2003.
[49] X. Wang, “Finding patterns on protein surfaces: Algorithms and appli-
cations to protein classification,” IEEE Transactions on Knowledge and
Data Engineering, vol. 17, pp. 1065–1078, 2005.
[50] J. Carroll, “Matching RDF Graphs,” in Proceedings of the International
Semantic Web Conference (ISWC), 2002.

177
168

Authorized licensed use limited to: Zhejiang University. Downloaded on October 12,2021 at 11:58:12 UTC from IEEE Xplore. Restrictions apply.

You might also like