The document presents a comparison between Apache Cassandra versions 3 and 4, and ScyllaDB, focusing on performance improvements and key features. It highlights Cassandra 4's enhancements such as support for JDK 11, increased speed and scalability, and better compression settings, while showcasing Scylla's superior performance in latency, throughput, and node management. The conclusions suggest that Scylla not only outperforms Cassandra but also offers long-term savings and additional features that could outweigh the challenges of migration.
Presenters
Karol Baryła
Karol is a Junior Software Engineer at ScyllaDB. He often
participates in security CTF competitions as a member of team
"Armia Prezesa" where he solves web security and reverse
engineering tasks. He is currently pursuing an MSc in Computer
Science at the University of Warsaw.
Piotr Grabowski
Piotr is a software engineer working at ScyllaDB. From a young age,
he participated in many competitive programming contests. Piotr
holds a BSc in Computer Science from the University of Warsaw and
is now pursuing an MSc. For the past year, he worked on Kafka
connectors and Scylla Java Driver.
+ The Real-Time Big Data Database
+ Drop-in replacement for Apache Cassandra
and Amazon DynamoDB
+ Outstanding performance & low tail latency
+ Open Source, Enterprise and Cloud options
+ Founded by the creators of KVM hypervisor
+ HQs: Palo Alto, CA, USA; Herzelia, Israel;
Warsaw, Poland
About ScyllaDB
On July 27th, 2021, the Cassandra team
released version 4.0, 6 years after the
release of version 3.0.
Let’s see how much Cassandra improved
during those 6 years, and how well it holds
up against Scylla 4.4 now.
1. Increased speed and scalability
a. Zero Copy Streaming: streaming data up to 5x faster
b. Up to 25% faster throughput on reads and writes
2. Support for JDK 11
3. New configuration settings, better security and observability
4. Better compression settings (support for Zstd)
5. A shift to a 12-month release cycle
Cassandra 4.0 new features
1. Latency at different throughputs
a. Gaussian distribution
b. Disk-intensive distribution
c. Memory-intensive distribution
2. Adding a single new node
3. Doubling cluster size
4. Replacing a node
Benchmarked operations
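To give a feel for what a Gaussian key distribution means for a latency benchmark, here is a minimal sketch. It assumes that "Gaussian distribution" refers to how request keys are drawn from the key range; the slides do not define the disk-intensive and memory-intensive distributions, so they are not modelled here. The function names, the standard deviation, and the uniform baseline are all hypothetical and exist only for contrast.

```python
import random

def gaussian_keys(n_keys, n_samples, rng):
    """Draw keys from a normal distribution centred on the middle of the key range."""
    mean = n_keys / 2
    stddev = n_keys / 6  # assumption: almost all draws land inside [0, n_keys)
    keys = []
    while len(keys) < n_samples:
        k = int(rng.gauss(mean, stddev))
        if 0 <= k < n_keys:  # discard the rare out-of-range draws
            keys.append(k)
    return keys

def uniform_keys(n_keys, n_samples, rng):
    """Baseline for contrast: every key is equally likely."""
    return [rng.randrange(n_keys) for _ in range(n_samples)]

def middle_fraction(keys, n_keys):
    """Fraction of requests that hit the middle 20% of the key range."""
    lo, hi = int(n_keys * 0.4), int(n_keys * 0.6)
    return sum(lo <= k < hi for k in keys) / len(keys)

if __name__ == "__main__":
    rng = random.Random(42)
    n_keys, n_samples = 1_000_000, 10_000
    g = gaussian_keys(n_keys, n_samples, rng)
    u = uniform_keys(n_keys, n_samples, rng)
    print(f"gaussian: {middle_fraction(g, n_keys):.1%} of requests hit the middle 20% of keys")
    print(f"uniform:  {middle_fraction(u, n_keys):.1%} of requests hit the middle 20% of keys")
```

Under these assumptions, the Gaussian workload concentrates requests on a subset of keys, which tends to make caching more effective than it is under a uniform workload.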
+ 3 vs 3:
+ Cluster nodes: 3x i3.4xlarge (16vCPU, 122GiB RAM, up to 10Gbps network, 2x1.9TB NVMe)
+ Loader nodes: 3x c5n.9xlarge (36vCPU, 96GiB RAM, up to 50Gbps network)
+ 4 vs 40:
+ Scylla cluster: 4x i3.metal (72vCPU, 512GiB RAM, up to 25Gbps network, 8x1.9TB NVMe)
+ Cassandra cluster: 40x i3.4xlarge (16vCPU, 122GiB RAM, up to 10Gbps network, 2x1.9TB NVMe)
+ Loader nodes: 15x c5n.9xlarge (36vCPU, 96GiB RAM, up to 50Gbps network)
+ Java version: JDK 16 (Cassandra 4.0), JDK 8 (Cassandra 3.11)
Benchmark setup - 3 vs 3 and 4 vs 40
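The 4 vs 40 comparison is easier to read with the per-node figures added up. The sketch below does only that arithmetic, using the instance specs listed above; the `Cluster` class and its field names are hypothetical helpers, not part of any benchmark tooling.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Cluster:
    name: str
    nodes: int
    vcpus_per_node: int
    ram_gib_per_node: int
    drives_per_node: int
    drive_tb: float

    @property
    def total_vcpus(self):
        return self.nodes * self.vcpus_per_node

    @property
    def total_ram_gib(self):
        return self.nodes * self.ram_gib_per_node

    @property
    def total_nvme_tb(self):
        return self.nodes * self.drives_per_node * self.drive_tb

# Per-node figures copied from the setup slide.
SCYLLA_4 = Cluster("Scylla, 4x i3.metal", 4, 72, 512, 8, 1.9)
CASSANDRA_40 = Cluster("Cassandra, 40x i3.4xlarge", 40, 16, 122, 2, 1.9)

if __name__ == "__main__":
    for c in (SCYLLA_4, CASSANDRA_40):
        print(f"{c.name}: {c.total_vcpus} vCPU, "
              f"{c.total_ram_gib} GiB RAM, {c.total_nvme_tb:.1f} TB NVMe")
```

On these numbers, the 40-node Cassandra cluster has 640 vCPUs and 4880 GiB of RAM against 288 vCPUs and 2048 GiB for the 4-node Scylla cluster.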
+ Cassandra 3 officially supports only Java 8
+ Cassandra 4 officially supports Java 8 and Java 11
+ Java 11 introduced ZGC - as an experimental feature
+ ZGC is considered production ready from Java 15
+ We used Java 16 in benchmarks to use the full power of ZGC
+ ZGC has extremely short pauses, which reduces Cassandra’s tail latencies.
What causes latency improvements?
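The version rules above can be sketched as a small helper that picks the JVM flags needed to turn ZGC on. `-XX:+UseZGC` and `-XX:+UnlockExperimentalVMOptions` are standard HotSpot flags; the helper itself (`zgc_jvm_flags`) is hypothetical, and how the flags were passed to Cassandra in the benchmark is not stated in the slides.

```python
from typing import List

def zgc_jvm_flags(java_major: int) -> List[str]:
    """Return the JVM flags that enable ZGC for a given Java major version."""
    if java_major < 11:
        # ZGC was introduced in Java 11, so Java 8 cannot use it.
        raise ValueError("ZGC is not available before Java 11")
    if java_major < 15:
        # Java 11-14: ZGC is experimental and must be unlocked explicitly.
        return ["-XX:+UnlockExperimentalVMOptions", "-XX:+UseZGC"]
    # Java 15 and later: ZGC is production ready, one flag is enough.
    return ["-XX:+UseZGC"]

if __name__ == "__main__":
    for version in (11, 16):
        print(f"Java {version}: {' '.join(zgc_jvm_flags(version))}")
```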
How much data do you have under management in your own transactional
database systems?
+ <1 terabyte
+ 1 to 50 terabytes
+ 50-100 terabytes
+ >100 terabytes
Quick Poll
Summary of results
+ Cassandra 4 has much better tail latencies than Cassandra 3.
+ Scylla performs 3-4 times better than Cassandra when adding/replacing nodes.
+ Scylla adds 25% capacity to a 40 TB optimized cluster 11x faster than Cassandra 4.0.
+ Scylla performs major compaction 32x faster than Cassandra 4.0.
+ Scylla has 2x-5x better throughput than Cassandra 4.0 on the same 3-node cluster
+ Scylla has 3x-8x better throughput than Cassandra 4.0 on the same 3-node cluster while
P99 <10ms
+ A 40 TB cluster is 2.5x cheaper with Scylla while providing 42% more throughput under
P99 latency of 10 ms
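The last bullet combines two figures, and a rough way to read them together is as throughput per dollar. The sketch below assumes "2.5x cheaper" means the Cassandra cluster costs 2.5 times as much as the Scylla one, and that the two ratios can simply be multiplied; the function name is hypothetical.

```python
def throughput_per_dollar_ratio(cost_ratio: float, throughput_gain: float) -> float:
    """Relative throughput per unit cost.

    cost_ratio: how many times more the baseline cluster costs (2.5 here).
    throughput_gain: extra throughput as a fraction (0.42 for 42% more).
    """
    return cost_ratio * (1.0 + throughput_gain)

if __name__ == "__main__":
    ratio = throughput_per_dollar_ratio(cost_ratio=2.5, throughput_gain=0.42)
    print(f"Scylla delivers about {ratio:.2f}x the throughput per dollar")
```

Under these assumptions the combined figure works out to 2.5 × 1.42 = 3.55.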
Should I upgrade to Scylla or Cassandra 4?
1. Upgrading is hard, so why not upgrade to Scylla right away?
a. Upgrading is problematic anyway - you should make backups, you risk downtime.
b. Migrating from Cassandra to Scylla is a bit more involved - but the benefits are worth it.
2. Upgrading to Scylla will save you money in the long run.
3. Scylla offers better performance and lower latencies compared to Cassandra 4.
4. Scylla offers exciting new features:
a. Scylla CDC
b. Kubernetes support with Scylla Operator
c. Scylla Cloud
Download Scylla Open Source:
scylladb.com/download
Learn more: https://university.scylladb.com/
Experience Scylla for Yourself