[D3T1S03] Amazon DynamoDB design puzzlers
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
AWS DATA & AI ROADSHOW 2024
Hyuk Lee
DynamoDB SA
AWS
Amazon DynamoDB design puzzlers
Agenda
We’re going to collaboratively explore a set of design puzzlers:
1. Starter puzzlers to get us warmed up
2. Three main dish deep dives
3. Dessert
Starter puzzlers
I’m seeing 10,000 requests per
second but am consuming
20,000 WCUs. Why?
10,000 consuming 20,000
Oh, and my table’s average item size is under 1 KB. Does that change
your answer?
Could indexes have something to do with this?
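A rough way to reason about this puzzler is to estimate the write capacity units (WCUs) a single request consumes. The sketch below assumes the standard accounting: one WCU per 1 KB of item size (rounded up) for a standard write, and twice that for a transactional write. The `index_writes` parameter is a hypothetical simplification that charges each extra index write as one more standard write of the same size; real index costs depend on which attributes change.

```python
import math


def wcus_for_write(item_size_bytes, transactional=False, index_writes=0):
    """Estimate the WCUs consumed by one write request (simplified model)."""
    # One WCU covers up to 1 KB; larger items round up to the next whole unit.
    base_units = max(1, math.ceil(item_size_bytes / 1024))
    # Transactional writes are charged double the standard cost.
    table_units = base_units * 2 if transactional else base_units
    # Assumption: each index write costs one standard write of the same size.
    return table_units + index_writes * base_units


# 10,000 requests per second at 2 WCUs each would explain 20,000 WCUs.
per_request = wcus_for_write(800, transactional=True)
print(10_000 * per_request)
```

Under this model, an 800-byte item costs 1 WCU as a plain write, so a 2x ratio with sub-1 KB items points toward transactions or index maintenance rather than item size.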
I want to sample 1,000 random
items from my table. How?
Pulling a random item
How does a scan work?
How does a parallel scan work?
How many total segments can
we use with a parallel scan?
Is there a need to use limit?
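One possible reading of these hints: a parallel scan splits the table into `TotalSegments` slices (the Scan API accepts up to 1,000,000), so picking a random `Segment` and reading with `Limit` set to 1 pulls an item from a random slice of the key space. The sketch below only builds the request parameters; the function name and the table name are hypothetical, and the result is an approximation of random sampling rather than a strictly uniform draw.

```python
import random

# Upper bound accepted by the Scan API's TotalSegments parameter.
MAX_TOTAL_SEGMENTS = 1_000_000


def random_scan_requests(table_name, samples, total_segments=MAX_TOTAL_SEGMENTS, rng=None):
    """Build Scan parameters that each read one item from a random segment."""
    if not 1 <= total_segments <= MAX_TOTAL_SEGMENTS:
        raise ValueError("total_segments must be between 1 and 1,000,000")
    rng = rng or random.Random()
    return [
        {
            "TableName": table_name,
            "TotalSegments": total_segments,
            # Segment numbers are zero-based: 0 .. total_segments - 1.
            "Segment": rng.randrange(total_segments),
            # Limit=1 stops after evaluating a single item.
            "Limit": 1,
        }
        for _ in range(samples)
    ]


requests = random_scan_requests("my-table", samples=1000)
print(len(requests))
```

Each dictionary could be passed to a DynamoDB client's scan call. A segment may be empty on a small table, so a caller would likely need to retry with a new random segment until it has collected enough items.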
Main dish -
choosing userID as partition key
I configured userID as my
partition key because it has
high cardinality.
Am I safe now?
High cardinality for partition key
High cardinality
Sorted by date time
Filter by Status
High cardinality for partition key
Choosing DeviceID as the partition key makes sense for an IoT
workload.
What if a specific user (a “heavy user”) generates 10,000x more
traffic than a normal user?
Can you always use userID as the partition key and other attributes as
the sort key?
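The slides pose the heavy-user question without recording an answer in the text. One commonly used response, shown here only as an illustration, is write sharding: spreading a heavy user's items across several partition key values by appending a suffix. The function names, the `#` separator, and the shard count are all hypothetical choices.

```python
import random


def sharded_partition_key(user_id, shard_count, rng=None):
    """Pick one of shard_count partition key values for a write."""
    rng = rng or random.Random()
    return f"{user_id}#{rng.randrange(shard_count)}"


def all_shard_keys(user_id, shard_count):
    """List every partition key value a reader must query for this user."""
    return [f"{user_id}#{n}" for n in range(shard_count)]


print(all_shard_keys("user42", 3))
```

The trade-off is on the read side: fetching everything for a sharded user means one query per shard key, with the results merged by the application.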
Main dish -
always single table design?
I am running an IoT service and
want to provide the last 3
months of data to customers.
Should I use a single table or
multiple tables in DynamoDB?
IoT service with the last 3 months of data
Will you design it with a single table or multiple tables?
Here are your options with a single-table design.
Single Table
Throughput = provisioned vs on-demand mode
Storage = Standard vs Standard-IA
TTL enabled
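With TTL enabled, each item carries a Number attribute holding its expiry time in Unix epoch seconds, and DynamoDB deletes expired items in the background. A minimal sketch of computing that value follows, assuming a 90-day retention as an approximation of "3 months" and a hypothetical function name `expires_at`.

```python
from datetime import datetime, timedelta, timezone


def expires_at(written_at, retention_days=90):
    """Return the TTL value (Unix epoch seconds) for an item written at written_at."""
    return int((written_at + timedelta(days=retention_days)).timestamp())


written = datetime(2024, 5, 1, 12, 0, tzinfo=timezone.utc)
print(expires_at(written))
```

The returned integer would be stored in whichever attribute the table's TTL setting names.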
IoT service with the last 3 months of data
Here is one of your options with a multiple-table design.
March: no writes, some reads, and lots of data. On-demand mode & Standard-IA.
April: no writes, some reads, and lots of data. On-demand mode & Standard-IA.
May (we are here!): many writes & reads, starting from zero data. Provisioned mode & Reserved Capacity & Standard.
“Access pattern matters a lot!”
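With one table per month, the application has to route writes to the current month's table and fan reads out over the retained months. The sketch below shows one way to derive those table names; the `iot_YYYY_MM` naming scheme and the function name are hypothetical.

```python
from datetime import date


def month_tables(today, months=3, prefix="iot"):
    """Return table names for the last `months` months, oldest first."""
    names = []
    year, month = today.year, today.month
    for _ in range(months):
        names.append(f"{prefix}_{year:04d}_{month:02d}")
        # Step back one month, rolling over the year boundary when needed.
        month -= 1
        if month == 0:
            year, month = year - 1, 12
    return list(reversed(names))


tables = month_tables(date(2024, 5, 15))
print(tables)        # tables to read from
print(tables[-1])    # current month's table, which receives the writes
```

Retiring the oldest month then becomes a table deletion rather than a stream of per-item TTL deletes.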
Main dish -
is on-demand mode unlimited?
I created a table using
on-demand mode. A new large-scale
service will launch next week.
Am I safe now?
Is on-demand mode unlimited?
I need 1M WCU/RCU in my account.
How can I pre-warm my table?
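The extracted slides do not record the answer to the pre-warming question. One approach that is often described is to switch the table temporarily to provisioned mode at the target capacity and then switch it back to on-demand. The sketch below only builds the two UpdateTable parameter sets; it assumes the account and table quotas have already been raised to allow the requested capacity, and the function and table names are hypothetical.

```python
def prewarm_requests(table_name, read_units, write_units):
    """Build two UpdateTable parameter sets: provision high, then return to on-demand."""
    provision = {
        "TableName": table_name,
        "BillingMode": "PROVISIONED",
        "ProvisionedThroughput": {
            "ReadCapacityUnits": read_units,
            "WriteCapacityUnits": write_units,
        },
    }
    # Assumption: this second call is issued only after the first update completes.
    back_to_on_demand = {
        "TableName": table_name,
        "BillingMode": "PAY_PER_REQUEST",
    }
    return provision, back_to_on_demand


step1, step2 = prewarm_requests("launch-table", 1_000_000, 1_000_000)
print(step1["BillingMode"], "->", step2["BillingMode"])
```

Billing mode switches are rate limited, so this kind of change is usually planned ahead of the launch rather than attempted on the day.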
Dessert
I have PITR on a large table.
Someone ran a bad bulk job
Monday at noon that deleted
important items. How can I
recover?
PITR
Is there anything better than a full table restore at 11:50 AM?
Hint: Don’t do a restore, do an export
Hint: Don’t do a full export
Consider an incremental export from 11:30 AM to 12:30 PM that puts
on disk the OldImage and NewImage of all modified items!
What are the cost savings if the table is 100 TB?
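The incremental export described above can be sketched as a set of parameters for the ExportTableToPointInTime API, covering 30 minutes on either side of the incident and requesting both old and new images. The table ARN, bucket name, and incident date below are hypothetical placeholders, and the function name is illustrative.

```python
from datetime import datetime, timedelta, timezone


def incremental_export_request(table_arn, bucket, incident_time,
                               before=timedelta(minutes=30),
                               after=timedelta(minutes=30)):
    """Build ExportTableToPointInTime parameters for a window around an incident."""
    return {
        "TableArn": table_arn,
        "S3Bucket": bucket,
        "ExportFormat": "DYNAMODB_JSON",
        "ExportType": "INCREMENTAL_EXPORT",
        "IncrementalExportSpecification": {
            "ExportFromTime": incident_time - before,
            "ExportToTime": incident_time + after,
            # Include the item as it was before and after each change.
            "ExportViewType": "NEW_AND_OLD_IMAGES",
        },
    }


# Hypothetical incident: a Monday at noon UTC.
incident = datetime(2024, 5, 20, 12, 0, tzinfo=timezone.utc)
request = incremental_export_request(
    "arn:aws:dynamodb:us-east-1:123456789012:table/example",  # placeholder ARN
    "example-export-bucket",                                   # placeholder bucket
    incident,
)
print(request["IncrementalExportSpecification"]["ExportFromTime"])
```

The export covers only the items modified inside the window, which is why its size tracks the hour of changes rather than the full 100 TB table.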
Incremental export to Amazon S3
Incremental export file format
{
  "Metadata": { "WriteTimestampMicros": "1680109764000000" },
  "Key": { "PK": { "S": "CUST#200" } },
  "OldImage": {
    "PK": { "S": "CUST#200" },
    "FirstName": { "S": "Mary" },
    "LastName": { "S": "Grace" }
  },
  "NewImage": {
    "PK": { "S": "CUST#200" },
    "FirstName": { "S": "Mary" },
    "LastName": { "S": "Smith" }
  }
}
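Given records in this shape, recovering from the bad bulk job amounts to finding the entries that represent deletions and collecting their `OldImage`. The sketch below assumes the export data is read as one JSON record per line and that a deletion appears as a record with an `OldImage` but no `NewImage`; the function name and sample data are hypothetical.

```python
import json


def deleted_items(lines):
    """Return the OldImage of every record that looks like a deletion."""
    restored = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        # Assumption: a deleted item has an OldImage and no NewImage.
        if "OldImage" in record and "NewImage" not in record:
            restored.append(record["OldImage"])
    return restored


sample = [
    # An update: both images present, so it is not restored.
    '{"OldImage": {"PK": {"S": "CUST#200"}, "LastName": {"S": "Grace"}},'
    ' "NewImage": {"PK": {"S": "CUST#200"}, "LastName": {"S": "Smith"}}}',
    # A deletion: only the old image remains.
    '{"OldImage": {"PK": {"S": "CUST#300"}, "LastName": {"S": "Lee"}}}',
]
print(deleted_items(sample))
```

Each returned image is already in DynamoDB JSON, so it could be written back to the table after checking that the item has not been legitimately recreated since the incident.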
DynamoDB usage patterns
Write intensive
• IoT / Monitoring
• User activity logging
• Tracking
• Notification service
• Step count
• Chatting
• Cart
• Recommendation
Consistent latency
• Ad
• Session store
• User profile
• Metadata store for assets
• Product detail
• Purchase history
No Ops
• DevOps
• Global service
• No versions
• No maintenance
Thank you!
Hyuk Lee
hyuklee@amazon.com
