AWS Redshift
AWS Redshift²⁸⁸ is a cloud data warehouse designed by AWS. The key features of Redshift include
the ability to query exabyte data in seconds through the columnar design. In practice, this means
excellent performance regardless of the size of the data.
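To make the columnar idea concrete, here is a minimal sketch in plain Python. It is not how Redshift is implemented; it only illustrates why an aggregate over one column can skip the other columns when values are stored column by column. The sample records and the helper names (`to_columns`, `sum_row_store`, `sum_column_store`) are hypothetical, and the sketch assumes every record has the same fields.

```python
# Hypothetical sample records, laid out row by row (one dict per record).
ROWS = [
    {"buyerid": 1, "qtysold": 2, "comment": "first order"},
    {"buyerid": 2, "qtysold": 5, "comment": "bulk order"},
    {"buyerid": 1, "qtysold": 4, "comment": "repeat order"},
]


def to_columns(rows):
    """Pivot row-oriented records into one list per column."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def sum_row_store(rows, field):
    """Row layout: every whole record is visited to read one field."""
    return sum(row[field] for row in rows)


def sum_column_store(columns, field):
    """Column layout: only the list for the requested field is read."""
    return sum(columns[field])


if __name__ == "__main__":
    columns = to_columns(ROWS)
    print(sum_row_store(ROWS, "qtysold"))
    print(sum_column_store(columns, "qtysold"))
```

Both functions return the same total. The difference is how much data each one has to touch, which is the property a columnar warehouse exploits at scale.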
Learn to use AWS Redshift in the following screencast.
Video Link: https://www.youtube.com/watch?v=vXSH24AJzrU²⁸⁹
Key actions in a Redshift Workflow
In general, the key actions are as described in the Redshift getting started guide²⁹⁰. These are the
critical steps to set up a workflow.
• Cluster Setup
• IAM Role configuration (what can the role do?)
• Set up Security Group (i.e., open port 5439)
• Set up Schema
    create table users(
    userid integer not null distkey sortkey,
    username char(8),
• Copy data from S3
    copy users from 's3://awssampledbuswest2/tickit/allusers_pipe.txt'
    credentials 'aws_iam_role=<iam-role-arn>'
    delimiter '|' region 'us-west-2';
• Query
²⁸⁸https://aws.amazon.com/redshift/?whats-new-cards.sort-by=item.additionalFields.postDateTime&whats-new-cards.sort-order=desc
²⁸⁹https://www.youtube.com/watch?v=vXSH24AJzrU
²⁹⁰https://docs.aws.amazon.com/redshift/latest/gsg/getting-started.html
    SELECT firstname, lastname, total_quantity
    FROM
    (SELECT buyerid, sum(qtysold) total_quantity
    FROM sales
    GROUP BY buyerid
    ORDER BY total_quantity desc limit 10) Q, users
    WHERE Q.buyerid = userid
    ORDER BY Q.total_quantity desc;
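The shape of the query step can be tried locally before a cluster exists. The sketch below runs the same top-buyers query against an in-memory SQLite database from the Python standard library. SQLite is only a stand-in here, not Redshift: it has no `distkey` or `sortkey`, so those are omitted, and the table columns and sample rows are hypothetical, reduced to the fields the query needs. The `limit` value is passed as a parameter, which is an addition for the demo.

```python
import sqlite3

# Same structure as the Redshift query, with the limit parameterized.
TOP_BUYERS_SQL = """
SELECT firstname, lastname, total_quantity
FROM
  (SELECT buyerid, sum(qtysold) AS total_quantity
   FROM sales
   GROUP BY buyerid
   ORDER BY total_quantity DESC LIMIT ?) Q, users
WHERE Q.buyerid = userid
ORDER BY Q.total_quantity DESC;
"""


def make_demo_db():
    """Create an in-memory database with hypothetical users and sales."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users (userid INTEGER NOT NULL, firstname TEXT, lastname TEXT)"
    )
    conn.execute("CREATE TABLE sales (buyerid INTEGER, qtysold INTEGER)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?, ?)",
        [(1, "Ada", "Lovelace"), (2, "Alan", "Turing"), (3, "Grace", "Hopper")],
    )
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [(1, 2), (2, 5), (1, 4), (3, 1), (2, 3)],
    )
    conn.commit()
    return conn


def top_buyers(conn, limit=10):
    """Return (firstname, lastname, total_quantity) for the biggest buyers."""
    return conn.execute(TOP_BUYERS_SQL, (limit,)).fetchall()


if __name__ == "__main__":
    for row in top_buyers(make_demo_db(), limit=2):
        print(row)
```

The inner query totals `qtysold` per buyer and keeps the largest totals. The outer query joins those totals to `users` to attach names. Against a real cluster, the same SQL would be sent over a connection to the Redshift endpoint on port 5439 instead.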
Summary of AWS Redshift
The high-level takeaway for AWS Redshift is the following.
• Mostly managed
• Deep Integration with AWS
• Columnar
• Competitor to Oracle and GCP BigQuery
• Predictable performance on massive datasets