The Path to ScyllaDB 5.2

ScyllaDB 5.2 and Beyond
Fresh from the ScyllaDB Oven
Avi Kivity, CTO and Co-Founder

Agenda
■ Increasing Streaming Robustness
■ Autoparallel Queries
■ WebAssembly User Defined Functions
■ Per-partition Throttling
■ Alternator Updates
■ Consistent Schema and Topology
■ New SSD Disk Modeling
■ Taming Corner Cases
■ What’s Cooking Now

Repair-Based Node Operations
■ Resumable bootstrap/decommission
■ Stream from primary replica
■ Or a quorum if primary is missing
■ Increases resilience and improves correctness

Autoparallel Queries
■ Aggregations previously done via Spark or custom code
■ Instead, recognize certain CQL patterns
■ Dispatch to all nodes, all vcpus within nodes
SELECT COUNT(*)
FROM t
[Diagram: the query is dispatched to Node 1 through Node 5]
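The fan-out on this slide can be pictured as a scatter-gather: each shard computes a partial count over the rows it owns, and the partial counts are summed. The sketch below is a toy model under stated assumptions, not ScyllaDB's implementation; the `CLUSTER` layout, `count_shard`, and `parallel_count` are hypothetical names invented for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical cluster layout: node name -> list of shards (one per vCPU),
# each shard holding the rows it owns. All names and data are illustrative.
CLUSTER = {
    "node1": [["a", "b"], ["c"]],
    "node2": [["d"], ["e", "f", "g"]],
    "node3": [[], ["h"]],
}


def count_shard(rows):
    """Partial aggregate computed locally by one shard."""
    return len(rows)


def parallel_count(cluster):
    """Fan the count out to every shard on every node, then sum the partials."""
    shards = [rows for node_shards in cluster.values() for rows in node_shards]
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(count_shard, shards))
    return sum(partials)


total = parallel_count(CLUSTER)
```

Only the small partial counts travel back to the coordinator, which is why this pattern suits aggregations such as COUNT.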

WebAssembly UDF/UDA
■ Push compute into database
■ Use any language*
■ Computations run in a WASM sandbox
■ Use case: analytics
*as long as it’s Rust

Per Partition Rate Limit
■ New CQL table attribute to limit access rate to partition
■ Works for reads and writes
■ Prevent bot accounts from spamming database
■ “Hot partition”
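To make the idea concrete, here is a minimal sketch of a per-partition limiter using a fixed one-second window per partition key. It illustrates the concept only and is not how ScyllaDB implements the feature; the class name, the `allow` method, and the injected `now` clock are assumptions made for this example.

```python
from collections import defaultdict


class PerPartitionRateLimiter:
    """Fixed-window counter per partition key (illustrative only)."""

    def __init__(self, max_ops_per_second):
        self.max_ops_per_second = max_ops_per_second
        # partition key -> [window start in whole seconds, ops seen in that window]
        self._windows = defaultdict(lambda: [None, 0])

    def allow(self, partition_key, now):
        """Return True if an operation at time `now` (in seconds) is admitted."""
        window = self._windows[partition_key]
        second = int(now)
        if window[0] != second:
            # A new one-second window starts: reset the counter.
            window[0], window[1] = second, 0
        if window[1] >= self.max_ops_per_second:
            return False  # this partition is over its limit; reject
        window[1] += 1
        return True


limiter = PerPartitionRateLimiter(max_ops_per_second=2)
# Four requests against the same partition within the same second.
decisions = [limiter.allow("bot-account", now=10.0 + i * 0.1) for i in range(4)]
```

Other partitions are unaffected when one key is throttled, and separate limiter instances could model independent read and write limits.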

Alternator Updates
■ Time-to-live Expiration
■ Improved performance

Consistent Schema and Topology
■ Eliminate classes of operator errors
■ Concurrent schema changes
■ Concurrent topology operations
■ Lay groundwork for more advanced features
■ Concurrent node bootstrap/decommission
■ Tablets
■ Strong consistency

New SSD Disk Modeling
ScyllaDB knows more about the disk operating envelope

Reverse Queries
■ 4.5 and older slow for large partitions
■ 4.6 fast, but skipped cache
■ 5.0+ fast, supports cache
■ Works well with paging
SELECT *
FROM tab
WHERE …
ORDER BY clustering_key DESC

Better Handling of Tombstones
■ Queries that encounter large consecutive tombstone runs are now well supported
■ Partitions with many range tombstones work well

Improved Out-of-Memory Handling
■ Escalating countermeasures as memory usage increases
■ Prevent new queries from starting
■ Allow only one query to make progress
■ Kill all but one query
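The escalation on this slide can be read as a ladder of thresholds. The sketch below shows that shape only; the function name and every threshold value are made up for illustration and are not ScyllaDB's actual limits.

```python
def admission_policy(memory_used_fraction):
    """Map memory pressure to a countermeasure.

    The thresholds are hypothetical; only the ordering of the
    countermeasures comes from the slide.
    """
    if memory_used_fraction >= 0.95:
        return "kill all but one query"
    if memory_used_fraction >= 0.85:
        return "allow only one query to make progress"
    if memory_used_fraction >= 0.70:
        return "prevent new queries from starting"
    return "normal operation"


stages = [admission_policy(f) for f in (0.50, 0.75, 0.90, 0.99)]
```

Each step is more disruptive than the previous one, so the harshest measure is reserved for the point where milder ones have failed to relieve memory pressure.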

Repair-Based Tombstone Garbage Collection
■ Eliminate gc_grace_seconds
■ Tie tombstone garbage collection to last repair
■ Improves performance for clusters that have frequent repair
■ Improves correctness for clusters that missed repair
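The difference between the two garbage-collection rules can be sketched as two predicates. This is a simplified model under stated assumptions: the function names and the plain integer timestamps (in seconds) are hypothetical, and real purge decisions involve more conditions than shown here.

```python
def purgeable_with_gc_grace(deletion_time, now, gc_grace_seconds):
    """Timeout-based rule: purge once gc_grace_seconds have elapsed."""
    return now - deletion_time >= gc_grace_seconds


def purgeable_after_repair(deletion_time, last_repair_time):
    """Repair-based rule: purge only tombstones written before the last repair."""
    return last_repair_time is not None and deletion_time < last_repair_time


TEN_DAYS = 10 * 24 * 3600  # illustrative grace period

# Frequently repaired cluster: a repair finished an hour after the delete.
frequent_old_rule = purgeable_with_gc_grace(1_000, now=5_000, gc_grace_seconds=TEN_DAYS)
frequent_new_rule = purgeable_after_repair(1_000, last_repair_time=4_600)

# Cluster that missed repair: the grace period expired with no repair at all.
missed_old_rule = purgeable_with_gc_grace(1_000, now=1_000 + TEN_DAYS, gc_grace_seconds=TEN_DAYS)
missed_new_rule = purgeable_after_repair(1_000, last_repair_time=None)
```

In the first case the repair-based rule lets the tombstone go early, which is the performance gain. In the second it keeps a tombstone that the timeout rule would have dropped before all replicas had seen it, which is the correctness gain.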

Nudging the CQL Grammar Towards SQL
■ Relaxing constraints
■ Reconciling semantic oddities
■ Increasing the scope of autoparallel queries

Object Storage
■ A spectrum of cost/performance tradeoffs
■ RAM: Extremely fast (100ns), very expensive
■ NVMe: Very fast (100µs), expensive
■ HDD: Slow (10ms), cheap
■ Cloud Object storage (S3 and similar)
■ Slow (40ms), cheap
■ Infinitely expandable
■ Easy to manipulate
■ Shared access

Use Cases for Object Storage
■ Very dense databases
■ Where latency is not critical
■ Tiered storage
■ Mix service levels and cost
■ Optimize both cost and latency
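The tiered-storage tradeoff can be worked through as simple arithmetic using the rough latency figures from the Object Storage slide (100µs for NVMe, 40ms for object storage). The sketch assumes a two-tier setup in which a given fraction of reads is served from NVMe and the rest from object storage; the function name and the 95% figure are hypothetical.

```python
import math

NVME_LATENCY_US = 100        # from the slide: NVMe, about 100 microseconds
OBJECT_LATENCY_US = 40_000   # from the slide: object storage, about 40 milliseconds


def expected_read_latency_us(nvme_fraction):
    """Average read latency when nvme_fraction of reads hit the NVMe tier."""
    if not 0.0 <= nvme_fraction <= 1.0:
        raise ValueError("nvme_fraction must be between 0 and 1")
    return (nvme_fraction * NVME_LATENCY_US
            + (1.0 - nvme_fraction) * OBJECT_LATENCY_US)


# Hypothetical workload: 95% of reads are served from the NVMe tier.
mixed_latency_us = expected_read_latency_us(0.95)
```

With 95% of reads on NVMe the average comes out at about 2.1ms, dominated by the 5% of reads that reach object storage, so the tier mix has to follow how latency-sensitive the workload is.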

Thank You
Stay in Touch
Avi Kivity
avi@scylladb.com
@AviKivity
@avikivity
