Key-value pair Databases – Redis & DynamoDB
Dr. Richa Sharma
Commonwealth University
Introduction
The simplest database design for storing data is the key-value (KV)
pair (just like a hash table).
Key-value database – a type of non-relational database,
NoSQL database.
Stores data as a collection of key-value pairs in which a key
serves as a unique identifier. Both keys and values can be
anything, ranging from simple objects to complex
compound objects!
These databases are highly partitionable and allow
horizontal scaling!
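The hash-table analogy above can be made concrete with a minimal sketch. It assumes a single in-memory Python dict stands in for the database; the class name `KeyValueStore` and its methods are hypothetical and not taken from any real product.

```python
class KeyValueStore:
    """Toy in-memory key-value store backed by a Python dict (a hash table)."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        # Keys are unique identifiers: writing to an existing key overwrites it.
        self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)

    def delete(self, key):
        # Report whether the key was actually present.
        if key in self._data:
            del self._data[key]
            return True
        return False


store = KeyValueStore()
store.put("user:42", {"name": "Asha", "themes": ["dark"]})  # compound value
store.put("visits", 7)                                      # simple value
print(store.get("user:42")["name"])
```

The value can be a simple number or a nested structure; the store never inspects it, which is what keeps the design simple.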
Introduction
Database administrators can quickly pull the data by
identifying a specific key.
A query language is not necessary when retrieving data
from a KV database, which is convenient for users
who lack query language knowledge.
Apart from being simple, a few more benefits of a KV
database include:
◦ Scalability
◦ Performance
◦ Ease of use
Benefits: KV Database
Scalability: Databases often become a bottleneck in web-based
applications. KV databases, however, scale horizontally and
automatically distribute data across servers, which reduces the
bottleneck at any single server.
Performance: KV databases process constant read-write
operations with low-overhead server calls:
◦ Lower latency and shorter response times give better performance at
scale.
◦ Because they are based on simple, single-table structures rather than multiple
interrelated tables (as in an RDBMS), no joins are required, which makes
lookups faster.
Ease of use: No specific query language is needed, and the database can be
used directly from programming languages!
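The scalability point, distributing data across servers, can be sketched with simple hash partitioning. This is an illustration under stated assumptions: three plain dicts play the role of storage nodes, and `shard_for`, `put`, and `get` are hypothetical helper names. Production systems usually use more elaborate schemes (for example consistent hashing) so that adding a node moves fewer keys.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a stable hash of the key."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


servers = [dict() for _ in range(3)]  # three hypothetical storage nodes


def put(key, value):
    servers[shard_for(key, len(servers))][key] = value


def get(key):
    # The same hash sends the read to the node that received the write.
    return servers[shard_for(key, len(servers))].get(key)


for i in range(1000):
    put(f"session:{i}", i)

print([len(s) for s in servers])  # how many keys each node ended up with
```

Because each key lives on exactly one node and no operation spans nodes, reads and writes for different keys do not contend for the same server.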
Application areas: KV Database
Session management: Web applications create a session
when a user logs in, and each user session has a unique identifier.
Session data (such as profile information, messages,
personalized data, and themes) is never queried by
anything other than a primary key, so a fast key-value store is
a better fit for storing session data.
In fact, KV databases may provide smaller per-page overhead
than relational databases.
Shopping cart: A KV database can handle receiving and
processing billions of orders. Key-value stores have built-in
redundancy, which can handle the loss of storage nodes.
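As a minimal sketch of the session and shopping-cart use cases, the code below keeps each session under its unique identifier and only ever reads it back by that identifier. The dict `sessions` stands in for the KV database, and `create_session` and `add_to_cart` are hypothetical helpers.

```python
import uuid

sessions = {}  # session_id -> session data; stands in for the KV store


def create_session(user_name):
    """Create a session under a fresh unique identifier and return the id."""
    session_id = uuid.uuid4().hex
    sessions[session_id] = {"user": user_name, "theme": "light", "cart": []}
    return session_id


def add_to_cart(session_id, product, quantity=1):
    # The session is located by its key alone; no other query is needed.
    sessions[session_id]["cart"].append({"product": product, "qty": quantity})


sid = create_session("asha")
add_to_cart(sid, "notebook", 2)
print(sessions[sid]["cart"])
```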
Application areas: KV Database
Caching: KV databases are well suited to storing data
temporarily for faster retrieval, e.g.
social media applications caching frequently
accessed data such as news feed content.
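The caching use case can be sketched as a small cache with a time-to-live in front of a slower data source. Everything named here is an assumption made for illustration: `TTLCache`, `load_feed_from_database` (a stand-in for an expensive query), and `get_feed`.

```python
import time


class TTLCache:
    """Key-value cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (expiry time, value)

    def set(self, key, value):
        self._entries[key] = (self._clock() + self._ttl, value)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # entry is stale: drop it and report a miss
            return None
        return value


def load_feed_from_database(user_id):
    # Hypothetical slow call to the primary database.
    return [f"post for {user_id}"]


feed_cache = TTLCache(ttl_seconds=60)


def get_feed(user_id):
    key = f"feed:{user_id}"
    feed = feed_cache.get(key)
    if feed is None:                 # cache miss: load and remember the result
        feed = load_feed_from_database(user_id)
        feed_cache.set(key, feed)
    return feed


print(get_feed("u1"))
```

The `clock` parameter exists only so the expiry logic can be exercised without waiting for real time to pass.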
Features: KV Database
Support for complex data types: KV databases support simple as well as
complex data types such as arrays, nested dictionaries,
images, videos, and semi-structured data.
No need for table joins: the flexibility of storing data as
key-value pairs accommodates all the needed information
in a single table, and therefore no table joins are needed!
Sorted keys: A key-value store can sort keys so that data
is stored systematically, which also helps in implementing
partitioning! Sorting can be done alphabetically, numerically,
by data size, or chronologically.
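To illustrate why sorted keys are useful, here is a minimal sketch that keeps keys in order and answers a range lookup with two binary searches. The class `SortedKeyStore` is hypothetical; date strings are used as keys because their alphabetical order matches chronological order.

```python
import bisect


class SortedKeyStore:
    """Toy store that keeps its keys sorted so that ranges can be read cheaply."""

    def __init__(self):
        self._keys = []    # always kept in sorted order
        self._values = {}

    def put(self, key, value):
        if key not in self._values:
            bisect.insort(self._keys, key)  # insert while preserving order
        self._values[key] = value

    def range(self, start, end):
        """Return (key, value) pairs with start <= key < end, in key order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, end)
        return [(k, self._values[k]) for k in self._keys[lo:hi]]


events = SortedKeyStore()
events.put("2024-03-01", "signup")
events.put("2024-01-15", "visit")
events.put("2024-02-20", "purchase")
print(events.range("2024-02-01", "2024-04-01"))
```

The same ordering is what makes range-based partitioning possible: a contiguous slice of the key space can be assigned to one node.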
Features: KV Database
Secondary key support: key-value stores allow defining two
or more different keys or secondary indexes to access the
same data.
Replication: most key-value stores offer built-in
replication support by automatically copying data across
multiple storage nodes! This helps in auto-recovery from
failures.
ACID support: Some advanced KV databases provide native,
server-side support for ACID, allowing for coordinated,
all-or-nothing changes to multiple items. With transaction
support, developers can extend the scale, performance, and
enterprise benefits to a broader set of applications.
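Secondary key support can be sketched as extra lookup tables that point back to the primary key. This is an illustration only: the dicts `users`, `by_email`, and `by_city` and the helper `put_user` are hypothetical, and a real store maintains such indexes internally.

```python
from collections import defaultdict

users = {}                  # primary key (user id) -> record
by_email = {}               # unique secondary index: email -> user id
by_city = defaultdict(set)  # non-unique secondary index: city -> user ids


def put_user(user_id, record):
    """Write a record and keep both secondary indexes in step with it."""
    old = users.get(user_id)
    if old is not None:
        # Remove index entries that pointed at the previous version.
        by_email.pop(old["email"], None)
        by_city[old["city"]].discard(user_id)
    users[user_id] = record
    by_email[record["email"]] = user_id
    by_city[record["city"]].add(user_id)


put_user("u1", {"email": "asha@example.com", "city": "Pune"})
put_user("u2", {"email": "ravi@example.com", "city": "Pune"})
put_user("u1", {"email": "asha@example.com", "city": "Delhi"})  # u1 moves

print(users[by_email["ravi@example.com"]], sorted(by_city["Pune"]))
```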
Limitations: KV Database
Absence of complex queries: KV databases don't support
complex queries. Data operations in a KV database are
mainly simple commands such as get,
put, and delete, so there are limits to how much
data filtering and sorting can be done.
Schema mismanagement: A KV database does not enforce
a schema, so development teams have to plan the data
model systematically to avoid long-term problems. The
lack of a tight schema also means that the application is
responsible for the proper interpretation of the data it
consumes/reads.
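Since the database does not enforce a schema, the check has to live in the application. The sketch below assumes values are stored as JSON strings and that the application expects a `name` and an `age`; `REQUIRED_FIELDS` and `decode_user` are hypothetical names.

```python
import json

# Hypothetical application-level schema: field name -> expected Python type.
REQUIRED_FIELDS = {"name": str, "age": int}


def decode_user(raw: str) -> dict:
    """Parse a stored value and verify it has the shape the application expects."""
    record = json.loads(raw)
    for field, expected_type in REQUIRED_FIELDS.items():
        if not isinstance(record.get(field), expected_type):
            raise ValueError(
                f"field {field!r} is missing or is not {expected_type.__name__}"
            )
    return record


store = {
    "user:1": json.dumps({"name": "Asha", "age": 31}),
    "user:2": json.dumps({"name": "Ravi", "age": "unknown"}),  # accepted by the store
}

print(decode_user(store["user:1"]))
```

The store accepts both values without complaint; only the reading code notices that `user:2` does not match what it expects.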
Redis
Introduction - Redis
The simplest implementation of a KV database can be thought of as a
file system where the file path is the key and
the file contents are the value.
With such an implementation, we don’t need to pre-define
tables.
Redis is an example of such a simple NoSQL
database. Redis keeps its data in memory and offers values as
strings, hashes, lists, sets, and sorted sets.
With a simple architecture and easy-to-use commands,
Redis delivers very high performance!
CRUD operations
• Redis provides simple operations to add data to and retrieve data from the
database:
• SET command to add a key-value pair to the database. We need to
provide both arguments when setting: key and value.
• GET command to retrieve a value from the database by
providing its key as the input.
• MSET and MGET commands for adding multiple KV pairs in one go
and for retrieving multiple values in one go, respectively!
• INCR command for those cases where the value stored at a key is
an integer and needs incrementing by one.
For example: a counter of page views kept under a single
key.
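The behaviour of these commands can be imitated in a few lines of Python. This is not a Redis client: `MiniRedis` is a hypothetical in-memory class written only to show what SET, GET, MSET, MGET, and INCR do to the stored data. Real clients such as redis-py expose similarly named methods but talk to a running server.

```python
class MiniRedis:
    """In-memory imitation of a few Redis string commands (not a real client)."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = str(value)  # values are kept as strings here
        return "OK"

    def get(self, key):
        return self._data.get(key)    # None plays the role of a missing key

    def mset(self, mapping):
        for key, value in mapping.items():
            self.set(key, value)
        return "OK"

    def mget(self, *keys):
        return [self.get(key) for key in keys]

    def incr(self, key):
        # A missing key counts as 0; a non-integer value raises ValueError.
        new_value = int(self._data.get(key, "0")) + 1
        self._data[key] = str(new_value)
        return new_value


r = MiniRedis()
r.set("course", "databases")
r.mset({"a": 1, "b": 2})
r.incr("page:views")
views = r.incr("page:views")
print(r.get("course"), r.mget("a", "b", "missing"), views)
```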
Concept of Transactions
Although Redis has no concept of tables, it does have a notion of
transactions for cases where commands such as SET and INCR need to be
executed together. Commands issued after MULTI are queued and then run
together by EXEC.
In the event of failure, similar to ROLLBACK in SQL, one can
stop a transaction with the DISCARD command in Redis.
However, unlike ROLLBACK, it won’t revert the database; it
will simply not run the queued commands at all. The effect is
identical, though the underlying
mechanism is different.
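The difference between discarding and rolling back can be sketched as a command queue: nothing touches the data until the queue is executed, so discarding has nothing to undo. `MiniTransaction` below is a hypothetical in-memory imitation, not the Redis API.

```python
class MiniTransaction:
    """Queues commands; exec() runs them in order, discard() throws them away."""

    def __init__(self, data):
        self._data = data
        self._queue = []

    def set(self, key, value):
        self._queue.append(("set", key, value))   # queued, not yet applied

    def incr(self, key):
        self._queue.append(("incr", key, None))   # queued, not yet applied

    def discard(self):
        self._queue.clear()  # the data was never modified, so nothing to revert

    def exec(self):
        results = []
        for command, key, value in self._queue:
            if command == "set":
                self._data[key] = value
                results.append("OK")
            else:
                self._data[key] = int(self._data.get(key, 0)) + 1
                results.append(self._data[key])
        self._queue.clear()
        return results


db = {"stock": 5}
tx = MiniTransaction(db)

tx.set("order:1", "pending")
tx.incr("stock")
tx.discard()
after_discard = dict(db)   # snapshot: the queued commands never ran

tx.set("order:1", "pending")
tx.incr("stock")
results = tx.exec()
print(after_discard, results, db)
```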
Attributes of Redis
Communication interface – Redis provides a command-line
interface as well as a client GUI interface with mouse-clickable
icons!
Durability – Persistence in Redis ranges from no
persistence to a moderate level of persistence, depending
on the configuration we set up!
Security – Redis is not, by design, supposed to be a fully
secure server. If one wants Redis security, it’d be better
to have a good firewall and SSH security.
Attributes of Redis
Database Replication – Unlike some other NoSQL databases,
Redis supports master-slave replication. One server is
the master by default if we don’t set it as a slave of
anything. Data will be replicated to any number of slave
servers! Making a server a slave also takes only a simple
command-line instruction.
Performance – Incredibly High!
Scalability – Highly scalable.
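Master-slave replication can be pictured with the sketch below, in which every write to the master is copied to each attached slave. The `Node` class is hypothetical, and the copy happens immediately for simplicity, whereas Redis replication is asynchronous.

```python
class Node:
    """Toy server: a master until it is attached to another node as a slave."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.slaves = []

    def attach_slave(self, slave):
        slave.data = dict(self.data)  # initial full copy of the master's data
        self.slaves.append(slave)

    def set(self, key, value):
        self.data[key] = value
        for slave in self.slaves:     # forward every write to each slave
            slave.data[key] = value

    def get(self, key):
        return self.data.get(key)


master = Node("master")
master.set("a", 1)

slave1 = Node("slave-1")
slave2 = Node("slave-2")
master.attach_slave(slave1)
master.attach_slave(slave2)
master.set("b", 2)

print(slave1.data, slave2.data)
```

Reads can then be served from any slave, which spreads read load and leaves a copy of the data if the master is lost.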
DynamoDB
Introduction - DynamoDB
• DynamoDB is a cloud-based database available through Amazon
Web Services (AWS).
• This makes it easy to set up, start working with, and maintain.
• One just needs to sign up for an AWS account, create a
DynamoDB table, and get started!
• DynamoDB does require some operations-style thinking and
preparation, but one does not need to provide any XML
configuration file, unlike some other NoSQL databases!!
• DynamoDB is a database that runs itself, and yet is capable of
webscale, consistent performance!
Core Concepts - DynamoDB
DynamoDB is a KV pair database and resembles Redis
in that it stores key-value pairs, but these pairs
are stored in tables (much like a relational DB).
These tables need to be created and defined in advance,
whereas Redis has no such requirement!!
Key-value pairs stored in DynamoDB are known as items;
these are much like rows in a relational DB.
DynamoDB, in addition, provides query-processing support
just like a relational DB (but there is no concept of table joins in
DynamoDB)! This allows retrieving keys for given values too, unlike
Redis.
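The table-and-item model can be sketched in memory as follows. `MiniTable` is a hypothetical class, not the DynamoDB API: the key attribute is declared when the table is created, other attributes are free-form, and `find` imitates looking up keys by value by checking every item.

```python
class MiniTable:
    """Toy table: the key attribute is fixed up front, other attributes are free."""

    def __init__(self, name, key_attribute):
        self.name = name
        self.key_attribute = key_attribute
        self._items = {}

    def put_item(self, item):
        if self.key_attribute not in item:
            raise ValueError(f"item must contain {self.key_attribute!r}")
        self._items[item[self.key_attribute]] = dict(item)

    def get_item(self, key_value):
        return self._items.get(key_value)  # direct lookup by key

    def find(self, attribute, value):
        """Return the keys of items whose attribute equals value."""
        return [
            key for key, item in self._items.items()
            if item.get(attribute) == value
        ]


lectures = MiniTable("myTable", key_attribute="ItemName")
lectures.put_item({"ItemName": "Database-Introduction", "Level": "basic"})
lectures.put_item({"ItemName": "Key-Value-Stores", "Level": "basic", "Slides": 24})
lectures.put_item({"ItemName": "Query-Tuning", "Level": "advanced"})

print(lectures.get_item("Key-Value-Stores"), lectures.find("Level", "basic"))
```

Items in the same table need not share attributes beyond the key, which is where the model departs from a relational row.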
Webscale facts for DynamoDB
• How webscale one can get with DynamoDB:
• One can store as many items as one wants in any
DynamoDB table.
• Each item (the equivalent of a row in an SQL database)
can hold as many attributes as one wants, although there is
a hard size limit of 400 KB per item.
• With the right data modelling, one may experience good
performance even when the tables store petabytes of data.
• Over 100,000 AWS customers currently use DynamoDB.
• DynamoDB handles well over a trillion total requests a day
(across all AWS customers).
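A rough pre-check against the 400 KB item limit might look like the sketch below. The size estimate is an assumption made for illustration (UTF-8 bytes of each attribute name plus the string form of its value); DynamoDB's own accounting differs by data type, so treat the result as approximate. `approximate_item_size` and `fits_in_one_item` are hypothetical helpers.

```python
MAX_ITEM_BYTES = 400 * 1024  # the 400 KB per-item limit


def approximate_item_size(item: dict) -> int:
    """Estimate item size as name bytes plus value bytes (approximation only)."""
    total = 0
    for name, value in item.items():
        total += len(name.encode("utf-8"))
        if isinstance(value, bytes):
            total += len(value)
        else:
            total += len(str(value).encode("utf-8"))
    return total


def fits_in_one_item(item: dict) -> bool:
    return approximate_item_size(item) <= MAX_ITEM_BYTES


small = {"ItemName": "Database-Introduction"}
large = {"blob": "x" * (400 * 1024)}

print(approximate_item_size(small), fits_in_one_item(small), fits_in_one_item(large))
```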
Advantages- DynamoDB
Because DynamoDB is available in the cloud through Amazon Web Services
(AWS), the database team is freed from handling unforeseeable
hardware outages and network failures, and from scaling out disk
capacity to meet unexpected spikes in demand.
All data in DynamoDB is stored on high-performing Solid
State Disks (SSDs) and automatically replicated across
multiple availability zones within an AWS region (which
guarantees redundancy even within a single region)!!
One can expect genuine downtime from DynamoDB only
in the rare event that an entire AWS datacenter goes down.
CRUD operations
DynamoDB supports String, Number, Boolean, Binary,
and Null values, a total of 5 scalar data types, alongside
document types (list, map) and set types!
Besides querying like a relational DB for retrieving data (e.g.
select * from table-name), DynamoDB provides the following
commands for writing to and reading from the database:
◦ Put-item: this command writes an item (i.e. a
key-value pair) to a table in DynamoDB. The item needs to
be provided in JSON format, with each value tagged by its
type (S for String). Example:
aws dynamodb put-item --table-name myTable --item '{"ItemName": {"S": "Database-Introduction"}}'
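The JSON that put-item expects tags every value with a type descriptor: S for String, N for Number, BOOL for Boolean, B for Binary, and NULL for Null. The sketch below builds that JSON from ordinary Python values for the five scalar types only. The helpers `to_attribute_value` and `to_item_json` are hypothetical, as are the `Slides` and `Published` attributes.

```python
import base64
import json


def to_attribute_value(value):
    """Wrap a Python scalar in its DynamoDB type descriptor."""
    if value is None:
        return {"NULL": True}
    if isinstance(value, bool):        # test bool first: bool is a subclass of int
        return {"BOOL": value}
    if isinstance(value, (int, float)):
        return {"N": str(value)}       # numbers are sent as strings
    if isinstance(value, bytes):
        return {"B": base64.b64encode(value).decode("ascii")}
    if isinstance(value, str):
        return {"S": value}
    raise TypeError(f"unsupported type: {type(value).__name__}")


def to_item_json(record: dict) -> str:
    return json.dumps({name: to_attribute_value(v) for name, v in record.items()})


item_json = to_item_json(
    {"ItemName": "Database-Introduction", "Slides": 24, "Published": True}
)
print(item_json)
```

The printed string can be passed as the `--item` argument of put-item, wrapped in single quotes on the command line.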
CRUD operations
DynamoDB provides the following commands for writing to and
reading from the database:
◦ Scan: this command retrieves all the data from a
table and is similar to the query: select * from <table>.
Example:
aws dynamodb scan --table-name myTable
◦ Get-item: this command retrieves one
item from the table by its key. Example:
aws dynamodb get-item --table-name myTable --key '{"ItemName": {"S": "Database-Introduction"}}'
Attributes of DynamoDB
Nature of problem and usage of database – Simple
database applications that require quick storage of data for
processing purposes, where security is not a major concern!
Unique characteristic of database – a unique blend of
extreme scalability, predictably solid performance as we
scale out, and freedom from operational burdens!
Communication interface of database – provides a command-line
interface as well as a client GUI interface with mouse-clickable
icons!
Attributes of DynamoDB
Database Replication – Automated replication
through cloud web-services.
Security – Being cloud-based, security can be a
concern for confidential data.
Performance – Good!
Scalability – Highly scalable.