NOSQL Databases
As another example, consider an application such as Facebook, with millions of users who
submit posts, many with images and videos; then these posts must be displayed on pages of
other users using the social media relationships among the users. User profiles, user
relationships, and posts must all be stored in a huge collection of data stores, and the appropriate
posts must be made available to the sets of users that have signed up to see these posts. Some
of the data for this type of application is not suitable for a traditional relational system and
typically needs multiple types of databases and data storage systems.
Some of the organizations that were faced with these data management and storage
applications decided to develop their own systems:
Google developed a proprietary NOSQL system known as BigTable, which is used in
many of Google’s applications that require vast amounts of data storage, such as Gmail,
Google Maps, and Web site indexing. Apache Hbase is an open source NOSQL system
based on similar concepts. Google’s innovation led to the category of NOSQL systems
known as column-based or wide column stores; they are also sometimes referred to as
column family stores.
Amazon developed a NOSQL system called DynamoDB that is available through
Amazon’s cloud services. This innovation led to the category known as key-value data
stores or sometimes key-tuple or key-object data stores.
Facebook developed a NOSQL system called Cassandra, which is now open source
and known as Apache Cassandra. This NOSQL system uses concepts from both key-
value stores and column-based systems.
Other software companies started developing their own solutions and making them
available to users who need these capabilities—for example, MongoDB and CouchDB,
which are classified as document-based NOSQL systems or document stores.
Another category of NOSQL systems is the graph-based NOSQL systems, or graph
databases; these include Neo4J and GraphBase, among others.
Some NOSQL systems, such as OrientDB, combine concepts from many of the
categories discussed above.
In addition to the newer types of NOSQL systems listed above, it is also possible to
classify database systems based on the object model or on the native XML model as
NOSQL systems, although they may not have the high-performance and replication
characteristics of the other types of NOSQL systems.
NOSQL systems share a number of characteristics that relate to distributed databases and distributed systems; we discuss these characteristics next.
1. Scalability: There are two kinds of scalability in distributed systems: horizontal and vertical.
In NOSQL systems, horizontal scalability is generally used, where the distributed system is
expanded by adding more nodes for data storage and processing as the volume of data grows.
Vertical scalability, on the other hand, refers to expanding the storage and computing power of
existing nodes. In NOSQL systems, horizontal scalability is employed while the system is
operational, so techniques for distributing the existing data among new nodes without
interrupting system operation are necessary.
2. Availability, Replication and Eventual Consistency: Many applications that use NOSQL
systems require continuous system availability. To accomplish this, data is replicated over two
or more nodes in a transparent manner, so that if one node fails, the data is still available on
other nodes. Replication improves data availability and can also improve read performance,
because read requests can often be serviced from any of the replicated data nodes. However,
write performance suffers because an update must be applied to every copy of the replicated data items; this slowdown is more pronounced if serializable consistency is required. Many NOSQL applications do not require serializable consistency, so a more relaxed form of consistency known as eventual consistency is used.
3. Replication Models: Two major replication models are used in NOSQL systems:
master-slave and master-master replication.
Master-slave replication requires one copy to be the master copy; all write operations must
be applied to the master copy and then propagated to the slave copies, usually using eventual
consistency (the slave copies will eventually be the same as the master copy). For reads, the
master-slave paradigm can be configured in various ways. One configuration requires all reads
to also be at the master copy, so this would be similar to the primary site or primary copy
methods of distributed concurrency control, with similar advantages and disadvantages.
Another configuration would allow reads at the slave copies but would not guarantee that the
values are the latest writes, since writes to the slave nodes can be done after they are applied to
the master copy.
Master-master replication allows reads and writes at any of the replicas but may not
guarantee that reads at nodes that store different copies see the same values. Different users
may write the same data item concurrently at different nodes of the system, so the values of the
item will be temporarily inconsistent. A reconciliation method to resolve conflicting write
operations of the same data item at different nodes must be implemented as part of the master-
master replication scheme.
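To make the replication discussion concrete, the following Python sketch shows one possible reconciliation policy, last-writer-wins based on write timestamps; the class and function names, and the use of timestamps as the tie-breaker, are illustrative assumptions rather than the mechanism of any particular NOSQL system.

# Minimal sketch of last-writer-wins reconciliation for master-master replication.
# Each replica records (value, timestamp) for a data item; the names and the
# timestamp-based policy are illustrative assumptions, not a specific system's API.
from dataclasses import dataclass

@dataclass
class VersionedValue:
    value: str
    timestamp: float  # time of the write that produced this value

def reconcile(versions):
    # Resolve conflicting copies of one data item by keeping the latest write.
    return max(versions, key=lambda v: v.timestamp)

# Two nodes accepted concurrent writes to the same data item:
copy_at_node1 = VersionedValue("address = '12 Main St'", timestamp=1700000010.0)
copy_at_node2 = VersionedValue("address = '99 Oak Ave'", timestamp=1700000012.5)

winner = reconcile([copy_at_node1, copy_at_node2])
print(winner.value)  # the later write wins; the earlier update is discarded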
4. Sharding of Files: In many NOSQL applications, files (or collections of data objects) can
have many millions of records (or documents or objects), and these records can be accessed
concurrently by thousands of users. So it is not practical to store the whole file in one node.
Sharding of the file records is often employed in NOSQL systems. This serves to distribute
the load of accessing the file records to multiple nodes. The combination of sharding the file
records and replicating the shards works in tandem to improve load balancing as well as data
availability.
5. High-Performance Data Access: In many NOSQL applications, it is necessary to find
individual records or objects (data items) from among the millions of data records or objects in
a file. To achieve this, most systems use one of two techniques: hashing or range partitioning
on object keys. The majority of accesses to an object will be by providing the key value rather
than by using complex query conditions. The object key is similar to the concept of object id.
In hashing, a hash function h(K) is applied to the key K, and the location of the object with key
K is determined by the value of h(K).
In range partitioning, the location is determined via a range of key values; for example,
location i would hold the objects whose key values K are in the range Ki_min ≤ K ≤ Ki_max. In applications that require range queries, where multiple objects within a range of key values are retrieved, range partitioning is preferred. Other indexes can also be used to locate objects based
on attribute conditions different from the key K.
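The following Python sketch contrasts the two techniques; the number of nodes, the hash function, and the range boundaries are illustrative assumptions.

# Sketch of locating an object by its key using hashing versus range partitioning;
# node count, hash function, and range boundaries are arbitrary illustrative values.
NUM_NODES = 4

def node_by_hashing(key: str) -> int:
    # h(K) determines the node that stores the object with key K
    return hash(key) % NUM_NODES

# Range partitioning: location i holds keys in the range Ki_min <= K <= Ki_max
RANGES = [(0, 2_499_999), (2_500_000, 4_999_999),
          (5_000_000, 7_499_999), (7_500_000, 9_999_999)]

def node_by_range(key: int) -> int:
    for i, (k_min, k_max) in enumerate(RANGES):
        if k_min <= key <= k_max:
            return i
    raise ValueError("key outside all partitions")

print(node_by_hashing("P123"))   # a single-object lookup is served by one node
print(node_by_range(3_700_000))  # a range query over nearby keys touches few nodes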
NOSQL systems emphasize performance and flexibility over modeling power and complex
querying. We discuss some of these characteristics next.
1. Not Requiring a Schema: The flexibility of not requiring a schema is achieved in many
NOSQL systems by allowing semi-structured, self-describing data. The users can specify a
partial schema in some systems to improve storage efficiency, but it is not required to have a
schema in most of the NOSQL systems. As there may not be a schema to specify constraints,
any constraints on the data would have to be programmed in the application programs that
access the data items. There are various languages for describing semi-structured data, such as
JSON (JavaScript Object Notation) and XML (Extensible Markup Language).
JSON is used in several NOSQL systems, but other methods for describing semi-structured
data can also be used.
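For illustration, the short Python sketch below prints two self-describing JSON documents from a hypothetical project collection; because no schema is enforced, the second document carries a field that the first one lacks.

# Two self-describing JSON documents; no schema forces them to share the same fields.
# The field names and values are hypothetical examples.
import json

project1 = {"_id": "P1", "Pname": "ProductX", "Plocation": "Bellaire"}
project2 = {"_id": "P20", "Pname": "Reorganization", "Plocation": "Houston",
            "Budget": 150000}   # an extra field that appears only in this document

print(json.dumps(project1))
print(json.dumps(project2))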
2. Less Powerful Query Languages: Many applications that use NOSQL systems may not
require a powerful query language such as SQL, because search (read) queries in these systems
often locate single objects in a single file based on their object keys. NOSQL systems typically
provide a set of functions and operations as a programming API (application programming
interface), so the programmer reads and writes the data objects by calling the appropriate operations. In many cases, the operations are called CRUD operations, for
Create, Read, Update, and Delete. In other cases, they are known as SCRUD because of an
added Search (or Find) operation. Some NOSQL systems also provide a high-level query
language, but it may not have the full power of SQL; only a subset of SQL querying capabilities
would be provided. In particular, many NOSQL systems do not provide join operations as part
of the query language itself; the joins need to be implemented in the application programs.
3. Versioning: Some NOSQL systems provide storage of multiple versions of the data items,
with the timestamps of when the data version was created.
Additional categories can be added as follows to include some systems that are not easily placed in the four categories discussed above, as well as some other types of systems that were available even before the term NOSQL became widely used.
5. Hybrid NOSQL systems: These systems have characteristics from two or more of the above
four categories.
6. Object databases
7. XML databases
Even keyword-based search engines store large amounts of data with fast search access, so their stored data can be considered a form of large NOSQL big data store.
The CAP Theorem
In a system with data replication, concurrency control becomes more complex because there
can be multiple copies of each data item. So if an update is applied to one copy of an item, it
must be applied to all other copies in a consistent manner. The possibility exists that one copy
of an item X is updated by a transaction T1 whereas another copy is updated by a transaction
T2, so two inconsistent copies of the same item exist at two different nodes in the distributed
system. If two other transactions T3 and T4 want to read X, each may read a different copy of
item X.
The CAP theorem, which was originally introduced as the CAP principle, can be used to
explain some of the competing requirements in a distributed system with replication. The three
letters in CAP refer to three desirable properties of distributed systems with replicated data:
consistency (among replicated copies), availability (of the system for read and write
operations) and partition tolerance (in the face of the nodes in the system being partitioned
by a network fault).
Availability means that each read or write request for a data item will either be processed
successfully or will receive a message that the operation cannot be completed.
Partition tolerance means that the system can continue operating if the network connecting the
nodes has a fault that results in two or more partitions, where the nodes in each partition can
only communicate among each other.
Consistency means that the nodes will have the same copies of a replicated data item visible
for various transactions.
The CAP theorem states that it is not possible to guarantee all three of the desirable
properties—consistency, availability, and partition tolerance—at the same time in a distributed
system with data replication. If this is the case, then the distributed system designer would have
to choose two properties out of the three to guarantee. It is generally assumed that in many
traditional (SQL) applications, guaranteeing consistency through the ACID properties is
important. On the other hand, in a NOSQL distributed data store, a weaker consistency level is
often acceptable, and guaranteeing the other two properties (availability, partition tolerance) is
important. Hence, weaker consistency levels are often used in NOSQL systems instead of
guaranteeing serializability. In particular, a form of consistency known as eventual
consistency is often adopted in NOSQL systems.
Document-Based NOSQL Systems and MongoDB
MongoDB documents are stored in BSON (Binary JSON) format, which is a variation of JSON
with some additional data types and is more efficient for storage than JSON. Individual
documents are stored in a collection. We will use a simple example based on our COMPANY
database that we used throughout this book. The operation createCollection is used to create
each collection. For example, the following command can be used to create a collection called project to hold PROJECT objects from the COMPANY database:
db.createCollection("project", { capped : true, size : 1310720, max : 500 } )
The first parameter “project” is the name of the collection, which is followed by an optional
document that specifies collection options. In our example, the collection is capped; this
means it has upper limits on its storage space (size) and number of documents (max). The
capping parameters help the system choose the storage options for each collection.
For our example, we will create another document collection called worker to hold information
about the EMPLOYEEs who work on each project; for example:
db.createCollection("worker", { capped : true, size : 5242880, max : 2000 } )
Each document in a collection has a unique ObjectId field, called _id, which is automatically
indexed in the collection unless the user explicitly requests no index for the _id field. The value
of ObjectId can be specified by the user, or it can be system-generated if the user does not
specify an _id field for a particular document.
System-generated ObjectIds have a specific format, which combines the timestamp when the
object is created (4 bytes, in an internal MongoDB format), the node id (3 bytes), the process
id (2 bytes), and a counter (3 bytes) into a 12-byte Id value.
User-generated ObjectIds can have any value specified by the user as long as it uniquely identifies the document, and so these Ids are similar to primary keys in relational systems.
A collection does not have a schema. The structure of the data fields in documents is chosen
based on how documents will be accessed and used, and the user can choose a normalized
design (similar to normalized relational tuples) or a denormalized design (similar to XML
documents or complex objects). Interdocument references can be specified by storing in one
document the ObjectId or ObjectIds of other related documents.
Figure 24.1(a) shows a simplified MongoDB document containing some of the data from Figure
5.6 from the COMPANY database example that is used throughout the book. In our example,
the _id values are user-defined, and the documents whose _id starts with P (for project) will be
stored in the “project” collection, whereas those whose _id starts with W (for worker) will be
stored in the “worker” collection.
Another option is to use the design in Figure 24.1(b), where worker references are embedded
in the project document, but the worker documents themselves are stored in a separate “worker”
collection.
A third option in Figure 24.1(c) would use a normalized design, similar to First Normal Form
relations. The choice of which design option to use depends on how the data will be accessed.
MongoDB CRUD Operations
MongoDB has several CRUD operations, where CRUD stands for create, read, update, and delete. Documents can be created and inserted into their collections using the insert operation,
whose format is:
db.<collection_name>.insert(<document(s)>)
The parameters of the insert operation can include either a single document or an array of
documents, as shown in Figure 24.1(d). The delete operation is called remove, and the format
is:
db.<collection_name>.remove(<condition>)
The documents to be removed from the collection are specified by a Boolean condition on
some of the fields in the collection documents. There is also an update operation, which has a
condition to select certain documents, and a $set clause to specify the update. It is also possible
to use the update operation to replace an existing document with another one but keep the same
ObjectId. For read queries, the main command is called find, and the format is:
db.<collection_name>.find(<condition>)
General Boolean conditions can be specified as <condition>, and the documents in the
collection that return true are selected for the query result.
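As a hedged illustration, the following sketch performs the same CRUD operations through the Python driver (pymongo); the server address, collection names, and document values are assumptions made for this example.

# Sketch of the CRUD operations using the pymongo driver; the server address and
# the document values are illustrative assumptions.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
db = client["company"]

# Create: insert a single document and an array of documents
db.project.insert_one({"_id": "P1", "Pname": "ProductX", "Plocation": "Bellaire"})
db.worker.insert_many([
    {"_id": "W1", "Ename": "John Smith", "ProjectId": "P1", "Hours": 32.5},
    {"_id": "W2", "Ename": "Joyce English", "ProjectId": "P1", "Hours": 20.0},
])

# Read: find the documents that satisfy a Boolean condition
for w in db.worker.find({"ProjectId": "P1"}):
    print(w["Ename"], w["Hours"])

# Update: a condition selects the documents, and $set specifies the new values
db.worker.update_one({"_id": "W2"}, {"$set": {"Hours": 25.0}})

# Delete: remove the documents that satisfy the condition
db.worker.delete_many({"ProjectId": "P1"})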
Replication in MongoDB.
The concept of replica set is used in MongoDB to create multiple copies of the same data set
on different nodes in the distributed system, and it uses a variation of the master-slave
approach for replication. For example, suppose that we want to replicate a particular document
collection C. A replica set will have one primary copy of the collection C stored in one node
N1, and at least one secondary copy (replica) of C stored at another node N2. Additional copies
can be stored in nodes N3, N4, etc., as needed, but the cost of storage and update (write)
increases with the number of replicas. The total number of participants in a replica set must be
at least three, so if only one secondary copy is needed, a participant in the replica set known as
an arbiter must run on the third node N3. The arbiter does not hold a replica of the collection
but participates in elections to choose a new primary if the node storing the current primary
copy fails. If the total number of members in a replica set is n (one primary plus i secondaries,
for a total of n = i + 1), then n must be an odd number; if it is not, an arbiter is added to ensure
the election process works correctly if the primary fails.
In MongoDB replication, all write operations must be applied to the primary copy and then
propagated to the secondaries. For read operations, the user can choose the particular read
preference for their application. The default read preference processes all reads at the primary
copy, so all read and write operations are performed at the primary node. In this case, secondary
copies are mainly to make sure that the system continues operation if the primary fails, and
MongoDB can ensure that every read request gets the latest document value. To increase read
performance, it is possible to set the read preference so that read requests can be processed at
any replica (primary or secondary); however, a read at a secondary is not guaranteed to get the
latest version of a document because there can be a delay in propagating writes from the
primary to the secondaries.
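The following pymongo sketch shows how an application might select a read preference; the replica-set name, host names, and collection are illustrative assumptions.

# Sketch of choosing a read preference with pymongo; the replica-set name and the
# host addresses are illustrative assumptions.
from pymongo import MongoClient, ReadPreference

client = MongoClient("mongodb://host1:27017,host2:27017,host3:27017/?replicaSet=rs0")

# Default behavior: reads and writes both go to the primary copy
projects_primary = client.company.project

# Allow reads at any replica; a secondary may return a slightly stale document
projects_any = client.company.project.with_options(
    read_preference=ReadPreference.SECONDARY_PREFERRED)

doc = projects_any.find_one({"_id": "P1"})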
Sharding in MongoDB.
When a collection holds a very large number of documents or requires a large storage space,
storing all the documents in one node can lead to performance problems, particularly if there
are many user operations accessing the documents concurrently using various CRUD
operations.
Sharding of the documents in the collection—also known as horizontal partitioning—divides
the documents into disjoint partitions known as shards. This allows the system to add more
nodes as needed by a process known as horizontal scaling of the distributed system, and to
store the shards of the collection on different nodes to achieve load balancing. Each node will
process only those operations pertaining to the documents in the shard stored at that node. Also,
each shard will contain fewer documents than if the entire collection were stored at one node,
thus further improving performance.
There are two ways to partition a collection into shards in MongoDB—range partitioning and
hash partitioning. Both require that the user specify a particular document field to be used as
the basis for partitioning the documents into shards. The partitioning field—known as the
shard key in MongoDB—must have two characteristics: it must exist in every document in the
collection, and it must have an index. The ObjectId can be used, but any other field possessing
these two characteristics can also be used as the basis for sharding. The values of the shard key
are divided into chunks either through range partitioning or hash partitioning, and the
documents are partitioned based on the chunks of shard key values.
Range partitioning creates the chunks by specifying a range of key values; for example, if the
shard key values ranged from one to ten million, it is possible to create ten ranges—1 to
1,000,000; 1,000,001 to 2,000,000; … ; 9,000,001 to 10,000,000—and each chunk would
contain the key values in one range.
Hash partitioning applies a hash function h(K) to each shard key K, and the partitioning of keys
into chunks is based on the hash values. In general, if range queries are commonly applied to
a collection (for example, retrieving all documents whose shard key value is between 200 and
400), then range partitioning is preferred because each range query will typically be submitted
to a single node that contains all the required documents in one shard. If most searches retrieve
one document at a time, hash partitioning may be preferable because it randomizes the
distribution of shard key values into chunks.
When sharding is used, MongoDB queries are submitted to a module called the query router,
which keeps track of which nodes contain which shards based on the particular partitioning
method used on the shard keys. The query (CRUD operation) will be routed to the nodes that
contain the shards that hold the documents that the query is requesting. If the system cannot
determine which shards hold the required documents, the query will be submitted to all the
nodes that hold shards of the collection. Sharding and replication are used together; sharding
focuses on improving performance via load balancing and horizontal scalability, whereas
replication focuses on ensuring system availability when certain nodes fail in the distributed
system.
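As a hedged sketch, the commands below enable range-based and hash-based sharding through the Python driver; they assume a sharded cluster (config servers and a mongos query router) is already running at the address shown, and the shard key choices are illustrative.

# Sketch of enabling sharding via the driver; it assumes a mongos query router is
# reachable at the address below, and the shard keys are illustrative choices.
from pymongo import MongoClient

client = MongoClient("mongodb://mongos-host:27017")

client.admin.command("enableSharding", "company")

# Range partitioning on the shard key Pnumber (chunks are ranges of key values)
client.admin.command("shardCollection", "company.project", key={"Pnumber": 1})

# Hash partitioning on the shard key _id (chunks are based on hashed key values)
client.admin.command("shardCollection", "company.worker", key={"_id": "hashed"})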
DynamoDB Overview
The DynamoDB system is an Amazon product and is available as part of Amazon’s AWS/SDK
platforms (Amazon Web Services/Software Development Kit). It can be used as part of
Amazon’s cloud computing services, for the data storage component.
DynamoDB data model. The basic data model in DynamoDB uses the concepts of tables,
items, and attributes. A table in DynamoDB does not have a schema; it holds a collection of
self-describing items. Each item will consist of a number of (attribute, value) pairs, and
attribute values can be single-valued or multivalued. So basically, a table will hold a collection
of items, and each item is a self-describing record (or object). DynamoDB also allows the user
to specify the items in JSON format, and the system will convert them to the internal storage
format of DynamoDB.
When a table is created, it is required to specify a table name and a primary key; the primary
key will be used to rapidly locate the items in the table. Thus, the primary key is the key and
the item is the value for the DynamoDB key-value store.
The primary key attribute must exist in every item in the table. The primary key can be one of
the following two types:
■ A single attribute. The DynamoDB system will use this attribute to build a hash index on
the items in the table. This is called a hash type primary key. The items are not ordered in
storage by the value of the hash attribute.
■ A pair of attributes. This is called a hash and range type primary key. The primary key will
be a pair of attributes (A, B): attribute A will be used for hashing, and because there will be
multiple items with the same value of A, the B values will be used for ordering the records with
the same A value. A table with this type of key can have additional secondary indexes defined
on its attributes. For example, if we want to store multiple versions of some type of items in a
table, we could use ItemID as hash and Date or Timestamp (when the version was created) as
range in a hash and range type primary key.
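A minimal sketch of such a hash and range type primary key, using the boto3 Python SDK for DynamoDB, follows; the table name, attribute names, region, and item values are illustrative assumptions.

# Sketch of a hash and range type primary key with the boto3 SDK; table name,
# attributes, region, and values are illustrative assumptions.
import boto3

dynamodb = boto3.resource("dynamodb", region_name="us-east-1")

table = dynamodb.create_table(
    TableName="ItemVersions",
    KeySchema=[
        {"AttributeName": "ItemId", "KeyType": "HASH"},      # hashing attribute A
        {"AttributeName": "CreatedAt", "KeyType": "RANGE"},   # ordering attribute B
    ],
    AttributeDefinitions=[
        {"AttributeName": "ItemId", "AttributeType": "S"},
        {"AttributeName": "CreatedAt", "AttributeType": "S"},
    ],
    BillingMode="PAY_PER_REQUEST",
)
table.wait_until_exists()

# Each item is a self-describing set of (attribute, value) pairs
table.put_item(Item={"ItemId": "I100", "CreatedAt": "2024-01-15T10:00:00",
                     "Status": "draft"})
response = table.get_item(Key={"ItemId": "I100", "CreatedAt": "2024-01-15T10:00:00"})
print(response.get("Item"))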
Voldemort Key-Value Distributed Data Store
Voldemort is an open source key-value data store; some of its main features are as follows:
■ Simple basic operations. A collection of (key, value) pairs is kept in a Voldemort store. In
our discussion, we will assume the store is called s. The basic interface for data storage and
retrieval is very simple and includes three operations: get, put, and delete. The operation
s.put(k, v) inserts an item as a key-value pair with key k and value v. The operation s.delete(k)
deletes the item whose key is k from the store, and the operation v = s.get(k) retrieves the value
v associated with key k. The application can use these basic operations to build its own
requirements. At the basic storage level, both keys and values are arrays of bytes (strings).
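A minimal in-memory Python sketch of this get/put/delete interface follows; it is a stand-in that mimics the interface described above, not the actual Voldemort client API.

# Minimal in-memory sketch of the get/put/delete interface; a stand-in for a real
# Voldemort store, not its actual client library.
class SimpleKeyValueStore:
    def __init__(self):
        self._items = {}           # key -> value, both treated as opaque strings

    def put(self, k, v):
        self._items[k] = v         # s.put(k, v) inserts the (k, v) pair

    def get(self, k):
        return self._items.get(k)  # v = s.get(k) retrieves the value for key k

    def delete(self, k):
        self._items.pop(k, None)   # s.delete(k) removes the item with key k

s = SimpleKeyValueStore()
s.put("emp:123", '{"Name": "John Smith", "Dno": 5}')
print(s.get("emp:123"))
s.delete("emp:123")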
■ High-level formatted data values. The values v in the (k, v) items can be specified in JSON
(JavaScript Object Notation), and the system will convert between JSON and the internal
storage format. Other data object formats can also be specified if the application provides the
conversion (also known as serialization) between the user format and the storage format as a
Serializer class. The Serializer class must be provided by the user and will include operations
to convert the user format into a string of bytes for storage as a value, and to convert back a
string (array of bytes) retrieved via s.get(k) into the user format. Voldemort has some built-in
serializers for formats other than JSON.
■ Consistent hashing for distributing (key, value) pairs. A variation of the data distribution
algorithm known as consistent hashing is used in Voldemort for data distribution among the
nodes in the distributed cluster of nodes. A hash function h(k) is applied to the key k of each
(k, v) pair, and h(k) determines where the item will be stored. The method assumes that h(k) is
an integer value, usually in the range 0 to Hmax = 2^n − 1, where n is chosen based on the desired
range for the hash values. This method is best visualized by considering the range of all possible
integer hash values 0 to Hmax to be evenly distributed on a circle (or ring). The nodes in the
distributed system are then also located on the same ring; usually each node will have several
locations on the ring. The positioning of the points on the ring that represent the nodes is done
in a pseudorandom manner.
An item (k, v) will be stored on the node whose position in the ring follows the position of h(k)
on the ring in a clockwise direction. In Figure 24.2(a), we assume there are three nodes in the
distributed cluster labeled A, B, and C, where node C has a bigger capacity than nodes A and
B. In a typical system, there will be many more nodes. On the circle, two instances each of A
and B are placed, and three instances of C (because of its higher capacity), in a pseudorandom
manner to cover the circle. Figure 24.2(a) indicates which (k, v) items are placed in which
nodes based on the h(k) values.
Figure 24.2(a)
The h(k) values that fall in the parts of the circle marked as range 1 in Figure 24.2(a) will have
their (k, v) items stored in node A because that is the node whose label follows h(k) on the ring
in a clockwise direction; those in range 2 are stored in node B; and those in range 3 are stored
in node C. This scheme allows horizontal scalability because when a new node is added to the
distributed system, it can be added in one or more locations on the ring depending on the node
capacity. Only a limited percentage of the (k, v) items will be reassigned to the new node from
the existing nodes based on the consistent hashing placement algorithm. Also, those items
assigned to the new node may not all come from only one of the existing nodes because the
new node can have multiple locations on the ring.
Figure 24.2(b)
For example, if a node D is added and it has two placements on the ring as shown in Figure
24.2(b), then some of the items from nodes B and C would be moved to node D. The items
whose keys hash to range 4 on the circle (see Figure 24.2(b)) would be migrated to node D.
This scheme also allows replication by placing the number of specified replicas of an item on
successive nodes on the ring in a clockwise direction. The sharding is built into the method,
and different items in the store (file) are located on different nodes in the distributed cluster,
which means the items are horizontally partitioned (sharded) among the nodes in the distributed
system. When a node fails, its load of data items can be distributed to the other existing nodes
whose labels follow the labels of the failed node in the ring. And nodes with higher capacity
can have more locations on the ring, as illustrated by node C in Figure 24.2(a), and thus store
more items than smaller-capacity nodes.
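The Python sketch below implements the ring placement just described, giving each node several pseudorandom positions on the ring and more positions to higher-capacity nodes; the hash function, ring size, and node capacities are illustrative assumptions.

# Sketch of consistent hashing with multiple ring positions (virtual nodes) per
# physical node; the hash function and capacities are illustrative assumptions.
import bisect
import hashlib

def h(key: str, ring_size: int = 2**32) -> int:
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % ring_size

class ConsistentHashRing:
    def __init__(self):
        self._positions = []   # sorted hash positions on the ring
        self._owner = {}       # ring position -> node name

    def add_node(self, node: str, capacity: int = 1):
        # higher-capacity nodes get more positions on the ring (like node C)
        for i in range(capacity):
            pos = h(f"{node}#{i}")
            bisect.insort(self._positions, pos)
            self._owner[pos] = node

    def node_for(self, key: str) -> str:
        # the item goes to the first node position that follows h(key) clockwise
        pos = h(key)
        idx = bisect.bisect_right(self._positions, pos) % len(self._positions)
        return self._owner[self._positions[idx]]

ring = ConsistentHashRing()
ring.add_node("A", capacity=2)
ring.add_node("B", capacity=2)
ring.add_node("C", capacity=3)   # bigger capacity, so more ring positions
print(ring.node_for("emp:123"))  # adding a node D later remaps only nearby keys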
Oracle key-value store. Oracle, well known for its SQL relational database system, also offers a system based on the key-value store concept; this system is called the Oracle NoSQL Database.
Redis key-value cache and store. Redis differs from the other systems discussed here because
it caches its data in main memory to further improve performance. It offers master-slave
replication and high availability, and it also offers persistence by backing up the cache to disk.
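A short sketch using the redis-py Python client follows; the host, port, key name, and time-to-live value are illustrative assumptions, and a running Redis server is assumed.

# Sketch using the redis-py client; host, port, and keys are illustrative, and a
# Redis server is assumed to be running.
import redis

r = redis.Redis(host="localhost", port=6379)
r.set("emp:123:name", "John Smith")   # the value is held in main memory
print(r.get("emp:123:name"))          # returns b'John Smith'
r.expire("emp:123:name", 3600)        # optional time-to-live for cached data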
Apache Cassandra. Cassandra is a NOSQL system that is not easily placed in a single category; it is sometimes listed in the column-based NOSQL category or in the key-value category. It offers features from several NOSQL categories and is used by Facebook as well as many other customers.
Column-Based or Wide Column NOSQL Systems
As with other NOSQL systems, unique keys are associated with stored data items for fast
access, but the keys identify cells in the storage system. Because the focus is on high
performance when storing huge amounts of data, the data model includes some storage-related
concepts. We discuss the Hbase data modeling concepts and define the terminology next. It is
important to note that the use of the words table, row, and column is not identical to their use
in relational databases, but the uses are related.
■ Tables and Rows. Data in Hbase is stored in tables, and each table has a table name. Data
in a table is stored as self-describing rows. Each row has a unique row key, and row keys are
strings that must have the property that they can be lexicographically ordered, so characters
that do not have a lexicographic order in the character set cannot be used as part of a row key.
■ Column Families, Column Qualifiers, and Columns. A table is associated with one or
more column families. Each column family will have a name, and the column families
associated with a table must be specified when the table is created and cannot be changed later.
Figure 24.3(a) shows how a table may be created; the table name is followed by the names of
the column families associated with the table. When the data is loaded into a table, each column
family can be associated with many column qualifiers, but the column qualifiers are not
specified as part of creating a table. So the column qualifiers make the model a self-describing
data model because the qualifiers can be dynamically specified as new rows are created and
inserted into the table. A column is specified by a combination of
ColumnFamily:ColumnQualifier. Basically, column families are a way of grouping together
related columns (attributes in relational terminology) for storage purposes, except that the
column qualifier names are not specified during table creation. Rather, they are specified when
the data is created and stored in rows, so the data is self-describing since any column qualifier
name can be used in a new row of data (see Figure 24.3(b)). However, it is important that the
application programmers know which column qualifiers belong to each column family, even
though they have the flexibility to create new column qualifiers on the fly when new data rows
are created. The concept of column family is somewhat similar to vertical partitioning, because
columns (attributes) that are accessed together because they belong to the same column family
are stored in the same files. Each column family of a table is stored in its own files using the
HDFS file system.
■ Versions and Timestamps. Hbase can keep several versions of a data item, along with the
timestamp associated with each version. The timestamp is a long integer number that
represents the system time when the version was created, so newer versions have larger
timestamp values. Hbase uses midnight ‘January 1, 1970 UTC’ as timestamp value zero, and
uses a long integer that measures the number of milliseconds since that time as the system
timestamp value. It is also possible for the user to define the timestamp value explicitly in a
Date format rather than using the system-generated timestamp.
■ Cells. A cell holds a basic data item in Hbase. The key (address) of a cell is specified by a
combination of (table, rowid, columnfamily, columnqualifier, timestamp). If timestamp is left
out, the latest version of the item is retrieved unless a default number of versions is specified,
say the latest three versions. The default number of versions to be retrieved, as well as the
default number of versions that the system needs to keep, are parameters that can be specified
during table creation.
■ Namespaces. A namespace is a collection of tables. A namespace basically specifies a
collection of one or more tables that are typically used together by user applications, and it
corresponds to a database that contains a collection of tables in relational terminology.
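As a hedged illustration of the concepts above, the Python sketch below uses the third-party happybase client, which talks to Hbase through its Thrift gateway; the host, table name, column families, column qualifiers, and values are illustrative assumptions.

# Sketch of tables, column families, qualifiers, and cells using the third-party
# happybase client; host, table, families, and values are illustrative assumptions.
import happybase

connection = happybase.Connection("localhost")

# Column families are fixed at table creation; column qualifiers are not
connection.create_table("EMPLOYEE", {"Name": dict(), "Address": dict()})

table = connection.table("EMPLOYEE")

# Each cell is addressed by (row key, columnfamily:columnqualifier); qualifiers
# such as Fname and Lname are supplied only when the row is stored
table.put(b"row1", {b"Name:Fname": b"John", b"Name:Lname": b"Smith",
                    b"Address:City": b"Houston"})

row = table.row(b"row1")
print(row[b"Name:Fname"])   # the latest version of the cell is returned by default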
Graph Databases and Neo4j
In conventional graph theory, nodes and relationships are generally called vertices and edges.
The Neo4j graph data model somewhat resembles how data is represented in the ER and EER
models, but with some notable differences. Comparing the Neo4j graph model with ER/EER
concepts, nodes correspond to entities, node labels correspond to entity types and subclasses,
relationships correspond to relationship instances, relationship types correspond to
relationship types, and properties correspond to attributes. One notable difference is that a
relationship is directed in Neo4j, but is not in ER/EER. Another is that a node may have no
label in Neo4j, which is not allowed in ER/EER because every entity must belong to an entity
type. A third crucial difference is that the graph model of Neo4j is used as a basis for an actual
high-performance distributed database system whereas the ER/EER model is mainly used for
database design.
Figure 24.4(a) shows how a few nodes can be created in Neo4j. There are various ways in
which nodes and relationships can be created; for example, by calling appropriate Neo4j
operations from various Neo4j APIs. We will just show the high-level syntax for creating nodes
and relationships; to do so, we will use the Neo4j CREATE command, which is part of the
high-level declarative query language Cypher.
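A minimal sketch of such CREATE statements, issued through the official Neo4j Python driver, follows; the connection URI, credentials, labels, and property values are illustrative assumptions.

# Sketch of creating nodes and a relationship with Cypher CREATE statements via the
# Neo4j Python driver; the URI, credentials, and property values are illustrative.
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

with driver.session() as session:
    # Nodes with a label and properties enclosed in curly brackets
    session.run("CREATE (e:EMPLOYEE {EmpId: 'E1', Lname: 'Smith'})")
    session.run("CREATE (d:DEPARTMENT {Dno: 5, Dname: 'Research'})")

    # A directed relationship of type WorksFor between the two nodes
    session.run(
        "MATCH (e:EMPLOYEE {EmpId: 'E1'}), (d:DEPARTMENT {Dno: 5}) "
        "CREATE (e)-[:WorksFor]->(d)"
    )

driver.close()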
■ Labels and properties. When a node is created, the node label can be specified. It is also
possible to create nodes without any labels. In Figure 24.4(a), the node labels are EMPLOYEE,
DEPARTMENT, PROJECT, and LOCATION, and the created nodes correspond to some of
the data from the COMPANY database with a few modifications; for example, we use EmpId
instead of SSN, and we only include a small subset of the data for illustration purposes.
Properties are enclosed in curly brackets { … }. It is possible that some nodes have multiple
labels; for example, the same node can be labeled as PERSON, EMPLOYEE, and
MANAGER by listing all the labels separated by the colon symbol as follows:
PERSON:EMPLOYEE:MANAGER. Having multiple labels is similar to an entity belonging
to an entity type (PERSON) plus some subclasses of PERSON (namely EMPLOYEE and
MANAGER) in the EER model but can also be used for other purposes.
■ Relationships and relationship types. Figure 24.4(b) shows a few example relationships in
Neo4j based on the COMPANY database.
The arrow (→) specifies the direction of the relationship, but the relationship can be traversed in either
direction. The relationship types (labels) in Figure 24.4(b) are WorksFor, Manager, LocatedIn,
and WorksOn; only relationships with the relationship type WorksOn have properties (Hours)
in Figure 24.4(b).
■ Paths. A path specifies a traversal of part of the graph. It is typically used as part of a query
to specify a pattern, where the query will retrieve from the graph data that matches the pattern.
A path is typically specified by a start node, followed by one or more relationships, leading to
one or more end nodes that satisfy the pattern. It is somewhat similar to the concepts of path
expressions in the context of query languages for object databases (OQL) and XML (XPath
and XQuery).
■ Optional Schema. A schema is optional in Neo4j. Graphs can be created and used without
a schema, but in Neo4j version 2.0, a few schema-related functions were added. The main
features related to schema creation involve creating indexes and constraints based on the labels
and properties. For example, it is possible to create the equivalent of a key constraint on a
property of a label, so all nodes in the collection of nodes associated with the label must have
unique values for that property.
■ Indexing and node identifiers. When a node is created, the Neo4j system creates an internal
unique system-defined identifier for each node. To retrieve individual nodes using other
properties of the nodes efficiently, the user can create indexes for the collection of nodes that
have a particular label. Typically, one or more of the properties of the nodes in that collection
can be indexed. For example, Empid can be used to index nodes with the EMPLOYEE label,
Dno to index the nodes with the DEPARTMENT label, and Pno to index the nodes with the
PROJECT label.
Neo4j Interfaces and Distributed System Characteristics
Neo4j has other interfaces that can be used to create, retrieve, and update nodes and
relationships in a graph database. It also has two main versions: the enterprise edition, which
comes with additional capabilities, and the community edition. We discuss some of the
additional features of Neo4j in this subsection.
■ Enterprise edition vs. community edition.
Both editions support the Neo4j graph data model and storage system, as well as the Cypher
graph query language, and several other interfaces, including a high-performance native API; language drivers for several popular programming languages such as Java, Python, and PHP; and the REST (Representational State Transfer) API. In addition, both editions support ACID
properties. The enterprise edition supports additional features for enhancing performance, such
as caching and clustering of data and locking.
■ Graph visualization interface.
Neo4j has a graph visualization interface, so that a subset of the nodes and edges in a database
graph can be displayed as a graph. This tool can be used to visualize query results in a graph
representation.
■ Master-slave replication.
Neo4j can be configured on a cluster of distributed system nodes (computers), where one node
is designated the master node. The data and indexes are fully replicated on each node in the
cluster.
Various ways of synchronizing the data between master and slave nodes can be configured in
the distributed cluster.
■ Caching.
A main memory cache can be configured to store the graph data for improved performance.
■ Logical logs.
Logs can be maintained to recover from failures.