
Cloud Computing For Data Analysis - AWS Redshift

AWS Redshift is a cloud data warehouse designed for high performance, enabling quick queries on large datasets through its columnar design. Key actions in setting up a Redshift workflow include cluster setup, IAM role configuration, security group setup, schema setup, and data copying from S3. Redshift is mostly managed, deeply integrated with AWS, and serves as a competitor to Oracle and GCP BigQuery.


AWS Redshift

AWS Redshift²⁸⁸ is a cloud data warehouse designed by AWS. The key features of Redshift include
the ability to query exabyte-scale data in seconds through the columnar design. In practice, this means
excellent performance regardless of the size of the data.
Learn to use AWS Redshift in the following screencast.
Video Link: https://www.youtube.com/watch?v=vXSH24AJzrU²⁸⁹
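To make the columnar idea concrete, here is a toy sketch in Python. It is not how Redshift stores data internally; the `rows` sample and the helper names are made up for illustration. It only shows why an aggregate over one column can skip the other columns when each column is stored separately.

```python
# Toy comparison of row-oriented and column-oriented layouts.
# Hypothetical sample data; this is NOT Redshift's actual storage engine.
rows = [
    {"buyerid": 1, "qtysold": 4, "pricepaid": 120.0},
    {"buyerid": 2, "qtysold": 1, "pricepaid": 35.5},
    {"buyerid": 1, "qtysold": 2, "pricepaid": 60.0},
]

# Columnar layout: one list per column instead of one record per row.
columns = {name: [row[name] for row in rows] for name in rows[0]}


def total_row_oriented(rows, column):
    """Sum one field by walking every full row record."""
    return sum(row[column] for row in rows)


def total_column_oriented(columns, column):
    """Sum one field by reading only that column's values."""
    return sum(columns[column])


total_qty_rows = total_row_oriented(rows, "qtysold")
total_qty_cols = total_column_oriented(columns, "qtysold")
print(total_qty_rows, total_qty_cols)
```

Both functions return the same total. The difference is in what has to be read: the column-oriented version never touches `buyerid` or `pricepaid`, which is the intuition behind columnar warehouses handling analytic queries on wide tables well.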

Key actions in a Redshift Workflow


In general, the key actions are as described in the Redshift getting started guide²⁹⁰. These are the
critical steps to set up a workflow.
• Cluster Setup

• IAM Role configuration (what can the role do?)

• Set up Security Group (i.e., open port 5439)

• Set up Schema

    create table users(
        userid integer not null distkey sortkey,
        username char(8)
        -- the remaining columns are cut off in this excerpt; see the getting started guide for the full table
    );

• Copy data from S3

    copy users from 's3://awssampledbuswest2/tickit/allusers_pipe.txt'
    credentials 'aws_iam_role=<iam-role-arn>'
    delimiter '|' region 'us-west-2';

• Query

    SELECT firstname, lastname, total_quantity
    FROM
        (SELECT buyerid, sum(qtysold) total_quantity
         FROM sales
         GROUP BY buyerid
         ORDER BY total_quantity desc limit 10) Q, users
    WHERE Q.buyerid = userid
    ORDER BY Q.total_quantity desc;

²⁸⁸ https://aws.amazon.com/redshift/?whats-new-cards.sort-by=item.additionalFields.postDateTime&whats-new-cards.sort-order=desc
²⁸⁹ https://www.youtube.com/watch?v=vXSH24AJzrU
²⁹⁰ https://docs.aws.amazon.com/redshift/latest/gsg/getting-started.html
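The schema, copy, and query steps can be rehearsed locally before a cluster exists. The following is a minimal sketch that uses Python's built-in `sqlite3` module as a stand-in for Redshift. The sample names and quantities are invented for illustration, the tables keep only the columns the query needs, and Redshift-specific pieces (`distkey`, `sortkey`, and the `copy` command) are left out because SQLite does not support them. The pipe-delimited strings play the role of the S3 file.

```python
import sqlite3

# Hypothetical pipe-delimited sample data standing in for the files on S3.
USERS_PIPE = "1|Ada|Lovelace\n2|Alan|Turing\n3|Grace|Hopper"
SALES_PIPE = "1|4\n2|1\n1|2\n3|10"


def parse_pipe(text):
    """Split pipe-delimited text into one list of fields per line."""
    return [line.split("|") for line in text.splitlines() if line]


# In-memory SQLite database: a local stand-in, not a Redshift cluster.
conn = sqlite3.connect(":memory:")

# Schema step (simplified; no distkey or sortkey in SQLite).
conn.execute(
    "create table users(userid integer not null, firstname text, lastname text)"
)
conn.execute(
    "create table sales(buyerid integer not null, qtysold integer not null)"
)

# Load step: the rough equivalent of `copy ... delimiter '|'`.
conn.executemany(
    "insert into users values (?, ?, ?)",
    [(int(uid), first, last) for uid, first, last in parse_pipe(USERS_PIPE)],
)
conn.executemany(
    "insert into sales values (?, ?)",
    [(int(buyer), int(qty)) for buyer, qty in parse_pipe(SALES_PIPE)],
)

# Query step: the same top-buyers query used in the workflow.
QUERY = """
SELECT firstname, lastname, total_quantity
FROM
    (SELECT buyerid, sum(qtysold) total_quantity
     FROM sales
     GROUP BY buyerid
     ORDER BY total_quantity desc limit 10) Q, users
WHERE Q.buyerid = userid
ORDER BY Q.total_quantity desc;
"""
top_buyers = conn.execute(QUERY).fetchall()
for firstname, lastname, total_quantity in top_buyers:
    print(firstname, lastname, total_quantity)
```

With this sample data the buyers come back ordered by total quantity sold, largest first. On a real cluster the same `SELECT` runs unchanged; only the table definitions and the load step differ.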

Summary of AWS Redshift


The high-level takeaway for AWS Redshift is the following.

• Mostly managed
• Deep Integration with AWS
• Columnar
• Competitor to Oracle and GCP BigQuery
• Predictable performance on massive datasets
