4 use cases for C* to Scylla
Cassandra -> Scylla
4 Key Use Cases Where Users will See Immediate Benefit
Greg Matza
Which C* Use Cases Will See Immediate Benefit With Scylla?
+ Is your Dataset > 10 TB?
+ Do you have > 40k read ops/sec?
+ Is your Application sensitive to Long-tail latency?
+ Do you have a Caching layer in front of Cassandra?
Scylla Supports Huge Datasets
- Amazon’s new i3en instances have up to 60 TB of NVMe storage
- Scylla can use all of this disk. Benchmarks show:
  - 15 hours to add a 45 TB node (10 hrs ingestion + 5 hrs compaction)
  - 6 hours to stream a new node, splitting one 45 TB node into two 22.5 TB nodes (4 hrs streaming + 2 hrs cleanup)
  - >1 million ops per second per node with 80% cache miss and 99p latency stable at 2 ms
  - Detailed benchmark data here and here
- Cassandra is typically limited to 1-2 TB per node.
Scylla Supports Huge Datasets
- Scenario:
- 40 TB raw data, RF=3, TWCS
- 2 Datacenters
- Total data stored is 40 TB * 3 replicas * 2 DCs = 240 TB
            TB/node  Node calculation             # nodes  Node type  Annual cost per node  Total cost
Cassandra   1.5      240 TB / 1.5 TB = 160 nodes  160      i3.2xl     $5k                   $800k
Scylla      45       240 TB / 45 TB = 5.3 nodes   6        i3en.24xl  $50k                  $300k
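The storage-bound sizing in this scenario can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the per-node capacities and annual prices are the round figures quoted on the slide, not measured values.

```python
import math


def nodes_for_storage(raw_tb, replication_factor, datacenters, tb_per_node):
    """Return (total_tb, node_count) for a cluster sized by disk capacity."""
    total_tb = raw_tb * replication_factor * datacenters
    # A fractional node is not possible, so round up.
    return total_tb, math.ceil(total_tb / tb_per_node)


def annual_cost_k(node_count, cost_per_node_k):
    """Annual cluster cost in thousands of dollars."""
    return node_count * cost_per_node_k


# Slide figures: 40 TB raw data, RF=3, 2 datacenters.
total_tb, cassandra_nodes = nodes_for_storage(40, 3, 2, tb_per_node=1.5)
_, scylla_nodes = nodes_for_storage(40, 3, 2, tb_per_node=45)

print(total_tb)                              # 240
print(cassandra_nodes, annual_cost_k(cassandra_nodes, 5))   # 160 800
print(scylla_nodes, annual_cost_k(scylla_nodes, 50))        # 6 300
```

Rounding up is what turns 5.3 nodes into 6 in the Scylla row.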
Scylla Supports Heavy Reads
- On i3 hardware, Scylla handles read throughput at approximately the same rate as write throughput
  - Scylla can handle sustained read or write throughput of 10,000 ops/core (1 KB payload, NVMe disk)
  - Throughput scales linearly with the number of cores
- Cassandra typically has per-node limitations on read throughput
  - Cassandra can handle sustained read throughput of about 20,000 ops/node (1 KB payload, NVMe disk)
  - Larger core counts, faster networking, or better I/O do not significantly increase throughput
Scylla Supports Heavy Reads
- Scenario:
- 80k reads/sec + 20k writes/sec, 1 TB raw data, RF=3
- Both Scylla and Cassandra are running on i3.4xlarge
- Given RF=3, each application-layer operation is counted as 3 ops against the cluster, as it will act on all 3 replicas
- Total operations are (80k reads * 3 replicas) + (20k writes * 3 replicas) = 300k ops/sec (240k reads + 60k writes)
            Limiting factor per node  Node calculation  # of nodes  Annual cost per node  Total cost
Cassandra   20k reads                 240k/20k = 12     12          $5k                   $60k
Scylla      80k ops                   300k/80k = 3.75   4           $5k                   $20k
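The throughput-bound sizing works the same way, except that the limiting factor differs per database: the slide sizes Cassandra on replica-level reads only and Scylla on total replica-level ops. A minimal sketch, with a hypothetical function name and the slide's per-node limits taken as given:

```python
import math


def nodes_for_throughput(app_reads, app_writes, replication_factor,
                         per_node_limit, reads_only=False):
    """Node count for a cluster sized by ops/sec rather than by disk."""
    # Per the slide, each application-level op is counted once per replica.
    replica_reads = app_reads * replication_factor
    replica_writes = app_writes * replication_factor
    load = replica_reads if reads_only else replica_reads + replica_writes
    return math.ceil(load / per_node_limit)


# Slide figures: 80k reads/sec + 20k writes/sec at RF=3.
cassandra_nodes = nodes_for_throughput(80_000, 20_000, 3,
                                       per_node_limit=20_000, reads_only=True)
scylla_nodes = nodes_for_throughput(80_000, 20_000, 3, per_node_limit=80_000)

print(cassandra_nodes, cassandra_nodes * 5)  # 12 60  (nodes, $k per year)
print(scylla_nodes, scylla_nodes * 5)        # 4 20
```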
Long-tail latency sensitive
- Due to Garbage Collection, Compaction, Repair and other operations, Cassandra typically will have tightly bounded average latency, but 95p or 99p latencies will show regular 5x to 20x spikes
- Scylla has no Garbage Collection, includes its own on-board caching, and actively manages its own I/O and CPU scheduling. This, among other things, allows it to deliver tightly bounded 95p, 99p or even max latency.
- I/O and CPU scheduling actively manage tasks in a prioritized manner, so background tasks like compaction or repair are almost always(*) put behind queries or writes.
- *Almost always, because we do have a backpressure mechanism: if you are in danger of losing your node due to OOM or out-of-disk, we will prioritize the tasks needed to save the node above queries.
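The prioritization rule described above can be pictured with a toy priority queue. This is only an illustration of the stated rule, not Scylla's scheduler: the task kinds, the `node_at_risk` flag, and the simplification that every background task counts as "needed to save the node" are all assumptions made for the example.

```python
import heapq
from itertools import count

QUERY, BACKGROUND = "query", "background"


def run_order(tasks, node_at_risk=False):
    """Return task names in the order a strict-priority scheduler would run them.

    tasks is a list of (kind, name) pairs. Lower priority number runs first.
    """
    def priority(kind):
        if node_at_risk:
            # Backpressure case: node-saving work jumps ahead of queries.
            return 0 if kind == BACKGROUND else 1
        # Normal case: queries and writes go ahead of background work.
        return 0 if kind == QUERY else 1

    seq = count()  # tie-breaker that preserves arrival order
    heap = [(priority(kind), next(seq), name) for kind, name in tasks]
    heapq.heapify(heap)
    return [heapq.heappop(heap)[2] for _ in range(len(heap))]


tasks = [(BACKGROUND, "compaction"), (QUERY, "read"),
         (BACKGROUND, "repair"), (QUERY, "write")]

print(run_order(tasks))                     # ['read', 'write', 'compaction', 'repair']
print(run_order(tasks, node_at_risk=True))  # ['compaction', 'repair', 'read', 'write']
```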
Long-tail latency sensitive
- Scenario:
- “Customer 360” Use Case
- 3 nodes 8vCPU/64 GB RAM
- 1.4 TB dataset
- 20k reads/sec
- Test run by long-time C* DBA as part of a Scylla vs. Cassandra POC
[Figure: Read Latencies, 99p. Cassandra’s latency vs. Scylla’s latency. Top 3 North American Telecom]
Scylla Does Not Require a Caching Layer
- Read-heavy or latency-sensitive use cases with Cassandra usually require a Redis, Memcached or other caching layer to meet those requirements
- Scylla has a built-in caching layer, allowing for simpler application-side logic and lower node counts
  - no cache invalidation issues
  - no cold cache issues
  - no try/catch application logic on cache misses
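To make the application-side difference concrete, here is a rough sketch of the cache-aside logic an external cache typically pushes into the application, next to a direct read. Plain dictionaries stand in for the cache and the database, and all function names are hypothetical; no real Redis, Memcached, or Scylla driver API is shown.

```python
def read_with_external_cache(key, cache, database):
    """Cache-aside read: the application handles the miss and fills the cache."""
    try:
        return cache[key]
    except KeyError:
        value = database[key]   # fall back to the database on a miss
        cache[key] = value      # populate the cache for the next reader
        return value


def write_with_external_cache(key, value, cache, database):
    """Write path: the application must also invalidate the stale cache entry."""
    database[key] = value
    cache.pop(key, None)


def read_direct(key, database):
    """With caching built into the database, the read is a single call."""
    return database[key]


database = {"user:1": "alice"}
cache = {}

print(read_with_external_cache("user:1", cache, database))  # alice (miss, then fill)
write_with_external_cache("user:1", "bob", cache, database)
print("user:1" in cache)                                    # False (invalidated)
print(read_direct("user:1", database))                      # bob
```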
Scylla Does Not Require a Caching Layer
- Scenario:
- Comcast needed <10 ms max latency at 200k ops/sec, with a balanced read/write mix
- Was implemented in Cassandra with 60 nodes of Varnish (cache) + 600 nodes of Cassandra
- Scylla replaced the entire infrastructure with only 60 nodes
- Case: https://www.scylladb.com/tech-talk/comcast-grow-small-get-big-experiences-with-scylla/
Version         Apache Cassandra 2.1.8       Scylla Enterprise 2018.1.11
Data Layer      600 nodes i3.2xlarge         60 nodes i3.2xlarge
Caching Layer   60 nodes Varnish m4.4xlarge  No caching
OpEx            $3.7 million/yr              $328k/yr
