Using ScyllaDB for Extreme Scale Workloads

Tzach Livyatan, VP Product, ScyllaDB
Attila Tóth, Developer Advocate, ScyllaDB

Poll
How often do you scale your database?

Presenters
Attila Tóth, Developer Advocate
+ Working as a software engineer / dev advocate in the data space for 6+ years
+ Lives in Budapest, Hungary
Tzach Livyatan, VP Product
+ Working as a product manager for ages
+ Lives in Tel Aviv, Israel

Agenda
+ Why ScyllaDB?
+ Scylla Use Cases
+ Design For High Throughput and Low Latency
+ Coming Soon

Why ScyllaDB?
Best High Availability in the industry
Best Disaster Recovery in the industry
Best scalability in the industry
Best Price/Performance in the industry
Auto-tune - out of the box performance
Compatible with Cassandra & DynamoDB
The power of Cassandra at the speed of Redis with the usability of DynamoDB
No Lock-in
Open Source Software

400+ Gamechangers Leverage ScyllaDB

NoSQL - By Availability vs Consistency
Pick two: Availability, Partition Tolerance, Consistency
Or use a more granular model, like PACELC

NoSQL - By data model
+ Document store
+ Wide Column
+ Key-value: Simple DB
+ Graph store

What is important for data-intensive applications?
+ High Throughput
+ Low Latency
+ Predictable Cost

Predictable performance at scale
+ Low Latency: Our low-level design plus adaptive capabilities keep P99s predictably low.
+ High Throughput: Sustain millions of ops/sec with low P99s. No item or partition size limits, no throttling down your workloads.
+ Global Scale: Operate at a global scale with high availability, fewer nodes and reduced administration.

ScyllaDB Architecture
Active/active, replicated, auto-sharded

Tunable, Eventual Consistency
[Diagram: several Apps connect to the active/active, replicated, auto-sharded cluster, each choosing its own consistency level, e.g. CL = Local Quorum or CL = One]
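To make the tunable consistency levels concrete, here is a minimal sketch of how many replica acknowledgements a request waits for. It assumes the usual Cassandra-style quorum rule (a majority of the replication factor); the function name `replicas_required` is hypothetical and is not part of any ScyllaDB driver.

```python
def replicas_required(consistency_level: str, replication_factor: int) -> int:
    """Return how many replicas must acknowledge a request.

    For LOCAL_QUORUM, replication_factor is the replication factor
    of the local datacenter only (an assumption of this sketch).
    """
    cl = consistency_level.upper()
    if cl == "ONE":
        return 1
    if cl in ("QUORUM", "LOCAL_QUORUM"):
        # A quorum is a strict majority of the replicas.
        return replication_factor // 2 + 1
    if cl == "ALL":
        return replication_factor
    raise ValueError(f"unsupported consistency level: {consistency_level}")


if __name__ == "__main__":
    for level in ("ONE", "LOCAL_QUORUM", "ALL"):
        print(level, replicas_required(level, replication_factor=3))
```

With a replication factor of 3, CL = One waits for a single replica while CL = Local Quorum waits for two, which is the latency versus consistency trade-off each App picks per request.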

External cache vs. ScyllaDB cache

ScyllaDB embedded caching
Enable/disable cache per table:
    CREATE TABLE caching (
        k int PRIMARY KEY,
        v1 int,
        v2 int
    ) WITH caching = {'enabled': 'true'};
Disable cache per query:
    SELECT * FROM users BYPASS CACHE;
    SELECT name FROM users WHERE userid IN (199, 200, 207) BYPASS CACHE;

What a Difference a Database Makes
+ ScyllaDB vs. DynamoDB: 1/5th the cost, 20x higher throughput
+ ScyllaDB vs. Google Bigtable: 1/5th the cost
+ ScyllaDB vs. Cassandra: 2-20x lower latency

From Redis + Elasticsearch to ScyllaDB
<1ms P99
Zero downtime
TCO

“ScyllaDB provides a baseline that simplifies the whole <config> process and reduces risk and anxiety. Once in production, rather than rely on constant human intervention, ScyllaDB becomes self-tuning, dynamically adapting to real-world workloads.”
- Mark Smith, Discord
size, fewer nodes
8x throughput, ms P99
operational complexity

“This not only reduced TCO, but also reduced the pain that the database
engineering team was taking to actually maintain the cluster in a healthy state.”
– Niraj Kothari, Dir. Platforms Engineering
55 C* nodes to 6!
80% EC2 costs
5x growth in clusters

From Redis to ScyllaDB for Data Stores, Fraud Detection, Ad Targeting
TCO
Speed of Redis
Scalability

962 C* nodes to 78
60% TCO
95% latency
“By moving to ScyllaDB Enterprise software running on AWS EC2 infrastructure and on-premises, Comcast improved P99 latency by more than 95% and were able to rip out a UI cache layer.”

From Redis to Cassandra to ScyllaDB Cloud
<1ms avg Latency
4-8ms P99
Fault Tolerance

Real-time workloads on 3 AWS nodes
Out-of-order solved
Process all Zillow data in <1 day with no performance hit to real-time
“No one even realizes we are processing the entirety of Zillow’s property and listings data.”
– Dan Podhola, Principal Engineer

“It was comparable to the solution with Kafka, and we didn’t have to
add, manage, and maintain another data product in our ecosystem.”
– Daniel Belenky, Palo Alto Networks
operational complexity
operational costs
(for 1,000+ dbs!)
app throughput

Clone the repo!
+ GitHub: https://github.com/scylladb/1m-ops-demo
+ You can do the demo yourself with:
  + ScyllaDB Cloud or
  + ScyllaDB Enterprise - running under your own AWS account
+ Add your AWS credentials in variables.tf
+ Then run Terraform
+ Config:
  + Loader instances: 3 (i4i.8xlarge)
  + ScyllaDB nodes: 3 (i4i.8xlarge)
  + us-east-1

Horizontal & Vertical Scaling
+ 1000's Nodes Cluster
+ 2000 Clusters
+ K8S Deployment
+ 60TB per Node
+ 256 Cores per Node
+ 1B Operations per Second

Deep Technical Advancements
Unique Close-to-Metal Architecture
+ Built in C++ (no Java overhead)
+ System and DC Aware
+ Sharding Per Core
+ Shard-Aware Drivers
+ Auto-Tuning
+ Network, Processor, NUMA, Storage

ScyllaDB Design Decisions
1 C++ instead of Java
2 All Things Async
3 Shard per Core
4 Unified Cache
5 I/O Scheduler
6 Autonomous

Threads Shards
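The shard-per-core idea can be illustrated with a small sketch: every partition key maps deterministically to exactly one shard, so a single core owns that data and threads do not contend for it. This is a simplified stand-in, not ScyllaDB's actual algorithm. The SHA-256 hash and the plain modulo are assumptions made for illustration, and `shard_for_key` is a hypothetical name.

```python
import hashlib


def shard_for_key(partition_key: str, shard_count: int) -> int:
    """Map a partition key to one shard (one shard per CPU core).

    Assumption: SHA-256 plus modulo stands in for the real token
    function and token-to-shard mapping, which work differently.
    """
    if shard_count <= 0:
        raise ValueError("shard_count must be positive")
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    token = int.from_bytes(digest[:8], "big")
    return token % shard_count


if __name__ == "__main__":
    # The same key always lands on the same shard of an 8-core node.
    for key in ("user:199", "user:200", "user:207"):
        print(key, "-> shard", shard_for_key(key, shard_count=8))
```

Because the mapping is a pure function of the key, a shard-aware driver could compute it on the client side and send each request straight to the owning core.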

Legacy NoSQL vs. Scylla
+ Legacy NoSQL: Key cache, Row cache, On-heap / Off-heap, Linux page cache, SSTables (Complex Tuning)
+ Scylla: Unified cache, SSTables

Legacy NoSQL: page fault in the Linux page cache (App thread, Kernel, SSD)
1 Page fault
2 Suspend thread
3 Initiate I/O
4 Context switch
5 I/O completes
6 Interrupt
7 Context switch
8 Map page
9 Resume thread

[Diagram: Query, Commitlog, and Compaction each feed their own Queue into the Userspace I/O Scheduler, which dispatches requests to the Disk]
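The per-class queues in front of the disk can be sketched as a small share-based scheduler. This is an illustrative model only, not Seastar's actual I/O scheduler. The class `IOScheduler`, the share values, and the unit request cost are all assumptions of this sketch.

```python
from collections import deque


class IOScheduler:
    """Toy userspace scheduler: one FIFO queue per I/O class.

    Each class has a number of shares. On dispatch, the non-empty
    class that has consumed the least I/O relative to its shares
    goes next, so classes receive disk time roughly in proportion
    to their shares.
    """

    def __init__(self, shares):
        self.shares = dict(shares)
        self.queues = {name: deque() for name in self.shares}
        self.consumed = {name: 0.0 for name in self.shares}

    def submit(self, io_class, request, cost=1.0):
        # Requests wait in userspace instead of piling up in the kernel.
        self.queues[io_class].append((request, cost))

    def dispatch(self):
        ready = [name for name, queue in self.queues.items() if queue]
        if not ready:
            return None
        chosen = min(ready, key=lambda n: self.consumed[n] / self.shares[n])
        request, cost = self.queues[chosen].popleft()
        self.consumed[chosen] += cost
        return chosen, request


if __name__ == "__main__":
    # Hypothetical shares: queries are favoured over background compaction.
    sched = IOScheduler({"query": 4, "commitlog": 2, "compaction": 1})
    for i in range(4):
        for io_class in ("query", "commitlog", "compaction"):
            sched.submit(io_class, f"{io_class}-{i}")
    while (item := sched.dispatch()) is not None:
        print(item)
```

The design point the slide makes is that the database, not the kernel, decides which request reaches the disk next, so background compaction cannot starve latency-sensitive queries.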

Scylla Design Decisions
[Diagram: the Seastar Scheduler arbitrates Memtable, Compaction, Query, Repair, and Commitlog work across SSD, WAN, and CPU; the Compaction Backlog Monitor and the Memory Monitor each adjust priority]

https://play.instruqt.com/scylladb/invite/fwtkeaxygujs
Coming Soon!
Tablets

Tablets
Resharding is cheap.
SSTables split at tablet boundary.
Reassign tablets to shards (logical operation).

Implementation - Metadata
+ Introduce a new layer of indirection - the tablets table
+ Each table has its own token range to node mapping
+ Mapping can change independently of node addition and removal
+ Different tables can have different tablet counts
+ Managed by Raft
[Diagram: Query -> Token -> system.tablets -> Replica Set]
Implementation - Data Path
+ Each tablet replica is isolated into its own memtable + SSTables
  + Forms its own little Log-Structured Merge Tree
  + With compaction and stuff
+ Can be migrated as a unit
  + Migration: copy the unit
  + Cleanup: delete the unit
+ Split/merge as the table grows/shrinks

Implementation - Load Balancer
+ Hosted on one node
  + But can be migrated freely if the node is down
+ Synchronized via Raft
+ Collects statistics on tables and tablets
+ Migrates to balance space
+ Evacuates nodes to decommission
+ Migrates to balance CPU load
+ Rebuilds and repairs
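A greedy sketch of the "migrates to balance" step: repeatedly move one tablet from the most loaded node to the least loaded one until the counts differ by at most one. This is an assumption-heavy illustration, not ScyllaDB's balancer. It counts tablets only (no space or CPU statistics), ignores replica placement rules, and the function name `balance` is hypothetical.

```python
def balance(assignment):
    """Plan tablet migrations that even out tablet counts per node.

    assignment: dict mapping node name -> list of tablet ids.
    Returns (new_assignment, migrations), where each migration is a
    (tablet_id, source_node, destination_node) tuple.
    """
    nodes = {node: list(tablets) for node, tablets in assignment.items()}
    migrations = []
    if not nodes:
        return nodes, migrations
    while True:
        src = max(nodes, key=lambda n: len(nodes[n]))
        dst = min(nodes, key=lambda n: len(nodes[n]))
        if len(nodes[src]) - len(nodes[dst]) <= 1:
            return nodes, migrations
        # A tablet is migrated as a unit: copy to dst, then clean up on src.
        tablet = nodes[src].pop()
        nodes[dst].append(tablet)
        migrations.append((tablet, src, dst))


if __name__ == "__main__":
    # Two empty nodes were just added next to a node holding six tablets.
    plan, moves = balance({"n1": [0, 1, 2, 3, 4, 5], "n2": [], "n3": []})
    for move in moves:
        print("migrate tablet %s from %s to %s" % move)
    print({node: len(tablets) for node, tablets in plan.items()})
```

Evacuating a node for decommission can be modelled the same way by treating that node's target count as zero, though that variant is not shown here.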

Demo Time!
Tablets
https://play.instruqt.com/scylladb/invite/fwtkeaxygujs

Upcoming: Tablet File-based streaming
+ Similar to Cassandra Zero-copy Streaming
+ But better ;-)
+ Tablets are always owned by the replica
+ Simply copy, done.
+ Up to 75% faster than Open Source for Streaming

Performance Improvements
+ Up to 1.5x Higher Throughput than Open Source
+ Up to 35% Lower Latencies (mean and P99)

Network (RPC) Compression Improvements
+ Improved network compression for RPC traffic
+ Option of Zstd instead of LZ4
+ Periodically trained dictionaries, instead of compression per message
+ See Łukasz Paszkowski on Cheating the Cloud: 50% Savings with Compression Dictionaries at P99 CONF
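Why a shared dictionary helps with small messages can be shown with a short sketch. It swaps in Python's standard `zlib` with a preset dictionary (`zdict`) in place of Zstd or LZ4, and it uses a hand-written dictionary rather than a periodically trained one. The sample message and dictionary contents are made up for illustration.

```python
import zlib


def compress(message: bytes, dictionary: bytes = b"") -> bytes:
    """Compress one message, optionally priming the compressor with a dictionary."""
    if dictionary:
        compressor = zlib.compressobj(level=6, zdict=dictionary)
    else:
        compressor = zlib.compressobj(level=6)
    return compressor.compress(message) + compressor.flush()


def decompress(blob: bytes, dictionary: bytes = b"") -> bytes:
    """Inverse of compress(); the receiver must hold the same dictionary."""
    if dictionary:
        decompressor = zlib.decompressobj(zdict=dictionary)
    else:
        decompressor = zlib.decompressobj()
    return decompressor.decompress(blob) + decompressor.flush()


if __name__ == "__main__":
    # Hypothetical dictionary capturing byte patterns shared by many messages.
    dictionary = b'{"user_id": , "event": "page_view", "region": "us-east-1"}'
    message = b'{"user_id": 4211, "event": "page_view", "region": "us-east-1"}'

    per_message = compress(message)
    with_dictionary = compress(message, dictionary)
    print("per-message bytes:", len(per_message))
    print("with dictionary bytes:", len(with_dictionary))
    assert decompress(with_dictionary, dictionary) == message
```

A single small message carries too little repetition for per-message compression to find, while a dictionary shared by both ends of the connection lets the compressor reference patterns seen in earlier traffic.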

Serverless (VM Based..)
Typeless Sizeless Limitless

Consistent metadata + Elasticity = Much More

Poll
How long does it take for you to scale your existing database?

Keep Learning
Visit our blog for more on ScyllaDB engineering: scylladb.com/category/engineering
ONLINE | MARCH 11 + 12, 2025
CALL FOR SPEAKERS

Thank you
for joining us today.
@scylladb scylladb/
slack.scylladb.com
@scylladb company/scylladb/
scylladb/
