Lambda and DynamoDB best practices

Lambda & DynamoDB Best Practices
a talk by Yan Cui

“Best practice is usually just someone else’s opinion”
- random person on the internet

The “goodness” of a practice is tied to
the context in which it is applied

“Good ideas that work for most people, most of the time”

Yan Cui
http://theburningmonk.com
@theburningmonk
AWS user for 10 years

Developer Advocate @
Independent Consultant: advise, training delivery
Running serverless in production since 2016

01. Observability from the start

Observability
A measure of how well the internal state of a
system can be inferred from its external outputs

everything fails, all the time

incident happened → system repaired
user impact in between: reduce MTTR

user impact
MTTDiscovery (mean time to discovery)

“What alerts should I have?”

It depends on what you’re building…

Lambda
error rate %
throttle count
DLQ error count
iterator age
regional concurrency

API Gateway
p90/95/99 latency
success rate %
4xx rate %
5xx rate %

SQS
message age
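
CloudWatch publishes Errors and Invocations counts for Lambda, so an "error rate %" alert has to be derived from the two. The following is a minimal sketch of that arithmetic; the helper names and the 1% default threshold are assumptions for illustration, not values from the talk.

```javascript
// Hypothetical helper: turn raw counts into an error-rate percentage.
// `errors` and `invocations` stand in for the sums of Lambda's Errors and
// Invocations metrics over one evaluation period.
function errorRatePercent(errors, invocations) {
  if (invocations === 0) return 0; // no traffic, nothing to alert on
  return (errors * 100) / invocations;
}

// Assumed threshold of 1%; pick a value that fits your own service.
function shouldAlert(errors, invocations, thresholdPercent = 1) {
  return errorRatePercent(errors, invocations) > thresholdPercent;
}

console.log(errorRatePercent(5, 1000)); // 0.5
console.log(shouldAlert(5, 1000));      // false
console.log(shouldAlert(30, 1000));     // true
```

Alerting on a percentage rather than a raw error count keeps the alarm meaningful at both low and high traffic.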

user impact
finding root cause

the needle is here
somewhere…

+ high-value
structured logs +
metrics + alerts

most of my
troubleshooting

errors are captured
and categorized

frequency and trends

did errors correlate
to a deployment?

invocation event,
env vars, logs, etc.

Lambda invocations +
every IO-request

complex (non-IO)
biz logic

system metrics for
AWS services
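
One possible way to produce the structured logs mentioned in this section is to write a single JSON object per log line. The sketch below assumes nothing beyond Node.js built-ins; `formatLog`, `log`, and the field names are hypothetical.

```javascript
// Hypothetical logger: emits one JSON object per line so that log tooling
// can filter on fields instead of parsing free text.
function formatLog(level, message, fields = {}) {
  return JSON.stringify({
    level,
    message,
    timestamp: new Date().toISOString(),
    ...fields,
  });
}

const log = {
  info: (message, fields) => console.log(formatLog('info', message, fields)),
  error: (message, fields) => console.error(formatLog('error', message, fields)),
};

// Field names such as orderId and latencyMs are made up for illustration.
log.info('order placed', { orderId: 'abc-123', latencyMs: 42 });
```

Keeping the fields flat and consistently named makes it easier to query for, say, every error tied to one order.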

02. One account per team per environment

Resource Limits
no. of DynamoDB tables
no. of API Gateway regional APIs
no. of API Gateway edge-optimized APIs
no. of Kinesis shards
no. of IAM roles
no. of S3 buckets
no. of CloudFormation stacks
no. of SNS subscription filters
no. of SSM parameters
…

Throughput Limits
DynamoDB read & write
API Gateway requests/second
Lambda concurrent executions
SSM parameter ops/second
…

Compartmentalise security breaches

One account per Team per Environment

Isolate critical/high-throughput services
to their own accounts

org-formation
infrastructure-as-code
CloudFormation-like YML syntax
template landing zones
> org-formation update
> org-formation perform-tasks
https://github.com/OlafConijn/AwsOrganizationFormation

03. Load secrets at runtime
SSM Parameter Store
Secret 1
Secret 2
IAM
Environment:
SECRET_1: …
SECRET_2: …
yay!

Secrets should NEVER be in plain text in env variables

SSM Parameter Store IAM
fetch at cold start,
cache,
invalidate every x mins
Secret 1
Secret 2
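
The "fetch at cold start, cache, invalidate every x mins" pattern could be sketched roughly as follows. Everything named here (`createSecretCache`, `fetchSecrets`, the stubbed loader, the 5-minute TTL) is an assumption for illustration; in a real function the loader would call SSM Parameter Store or Secrets Manager.

```javascript
// Hypothetical cache: `fetchSecrets` is any async function that loads the
// secrets (for example from SSM Parameter Store); it is injected here so the
// sketch stays independent of a specific SDK version.
function createSecretCache(fetchSecrets, ttlMs, now = () => Date.now()) {
  let cached = null;
  let expiresAt = 0;

  return async function getSecrets() {
    if (cached === null || now() >= expiresAt) {
      cached = await fetchSecrets(); // first call (cold start) or expired entry
      expiresAt = now() + ttlMs;
    }
    return cached;
  };
}

// Declared outside the handler so warm invocations reuse the cached value.
// The 5-minute TTL and the stubbed loader are assumptions for illustration.
const getSecrets = createSecretCache(
  async () => ({ SECRET_1: 'stub-value' }),
  5 * 60 * 1000
);

async function handler(event) {
  const secrets = await getSecrets();
  return { hasSecret: Boolean(secrets.SECRET_1) };
}
```

The expiry means a rotated secret is picked up without a redeploy, at the cost of one extra fetch per TTL window per container.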

https://github.com/middyjs/middy

SSM Parameter Store IAM
Secret 1
Secret 2
switch to Higher
Throughput for production

Secrets Manager IAM
Secret 1
Secret 2
built-in rotation,
more expensive

04. Principle of least privilege

network boundary
full-trust zero-trust networking
compromised nodes give attackers
access to our entire system

trust no-one
authenticate and
authorize every request
use IAM to protect
internal APIs

network security is a bonus, not the only line of defense

Set environment variable
AWS_NODEJS_CONNECTION_REUSE_ENABLED
to “1”
(for Node.js functions running AWS SDK v2.x)

Use Database Proxies when working with RDS

Smaller deployment artefact === faster cold start

Adding more memory DOESN’T help reduce cold start duration
(except for JVM functions)

Use Lambda Layers as a deployment optimization,
NOT as a package manager

const AWS = require('aws-sdk') // loads the entire SDK
const DynamoDB = require('aws-sdk/clients/dynamodb') // loads only the DynamoDB client

Prefer Lambda Destinations over DLQs

DLQ: payload
Lambda Destinations: payload, context(s), and response

Use DocumentClient instead of AWS.DynamoDB
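
Part of the appeal of DocumentClient is that it accepts plain JavaScript values, whereas the low-level client expects each attribute wrapped in a type descriptor such as `S`, `N`, or `BOOL`. The sketch below only illustrates that difference for three scalar types; `toAttributeValue` and `toItem` are hypothetical helpers, not the SDK's own conversion code.

```javascript
// Hypothetical converter covering strings, numbers, and booleans only.
function toAttributeValue(value) {
  if (typeof value === 'string') return { S: value };
  if (typeof value === 'number') return { N: String(value) }; // numbers travel as strings
  if (typeof value === 'boolean') return { BOOL: value };
  throw new Error(`unsupported type in this sketch: ${typeof value}`);
}

// Wrap every attribute of a plain object in its type descriptor.
function toItem(obj) {
  return Object.fromEntries(
    Object.entries(obj).map(([key, value]) => [key, toAttributeValue(value)])
  );
}

// What you would write with DocumentClient (plain object) versus the shape
// the low-level client expects (the printed result).
console.log(toItem({ id: 'user-1', age: 30, active: true }));
```

With DocumentClient this wrapping and unwrapping happens for you, so handler code can work with ordinary objects.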

Use PAY_PER_REQUEST billing mode as default

Use BatchGetItem and BatchWriteItem to
read/write multiple items
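
Both batch operations cap the number of items per call: BatchWriteItem accepts up to 25 put or delete requests and BatchGetItem up to 100 keys. A minimal sketch of splitting a larger workload into compliant batches follows; the `chunk` helper and the sample data are assumptions for illustration.

```javascript
// Limits documented for DynamoDB batch operations.
const MAX_BATCH_WRITE_ITEMS = 25;
const MAX_BATCH_GET_ITEMS = 100;

// Split an array into consecutive chunks of at most `size` elements.
function chunk(items, size) {
  const chunks = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// Hypothetical data: 60 items to write.
const items = Array.from({ length: 60 }, (_, i) => ({ id: `item-${i}` }));
const writeBatches = chunk(items, MAX_BATCH_WRITE_ITEMS);
console.log(writeBatches.map((b) => b.length)); // [ 25, 25, 10 ]
```

Each batch response can also report unprocessed items or keys, so a real caller would retry those, ideally with backoff.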

Avoid Scan unless you absolutely have to

Use caching to avoid DynamoDB calls

Use high cardinality keys as hash key

Learn single-table design patterns
But don’t turn it into a religion

single-table design
Steep learning curve.
Difficult to add new access patterns.
Can’t monitor usage cost by entity type.
Difficult to use DynamoDB streams.

“But what about all the cost savings from Single-Table Design?!”
Only matters when running at scale.

A best practice for Amazon is probably not best for you.

https://theburningmonk.com/hire-me
Advise
Training Delivery
“Fundamentally, Yan has improved our team by increasing our
ability to derive value from AWS and Lambda in particular.”
Nick Blair
Tech Lead

@theburningmonk
theburningmonk.com
github.com/theburningmonk
