Query Optimization in BigQuery
Advanced BigQuery
© 2018 Google LLC. All rights reserved.
Prerequisites
Some knowledge of BigQuery and/or SQL
(anyone who falls asleep will be taunted)
© 2018 Google LLC. All rights reserved.
1 Query execution deep-dive
Agenda
2 Query optimization
© 2018 Google LLC. All rights reserved.
Query execution deep-dive
BigQuery architecture
© 2018 Google LLC. All rights reserved.
BigQuery architecture
Decoupled storage and compute
SQL:2011
Compliant
Replicated, Distributed BigQuery High-Available Cluster
Storage Compute
Streaming (99.9999999999% durability) (Dremel) REST API
Ingest
Distributed Web UI, CLI
Memory Shuffle
Tier
Client
Libraries
Free Bulk In 7
Loading Petabit Network languages
© 2018 Google LLC. All rights reserved.
Dremel architecture
Master
Shard Shard Shard Shard
● Dynamic serving tree
Shard Shard Shard Shard
Shard Shard Shard Shard
Distributed storage ● Columnar storage
© 2018 Google LLC. All rights reserved.
Query execution by example
© 2018 Google LLC. All rights reserved.
Simple query execution
SELECT COUNT(*) FROM
Master
wikipedia_benchmark.Wiki1B
WHERE title LIKE "G%o%o"
Shard
Stage 2: Sum (1 slot)
Shard Shard Shard Shard Stage 1: Filter, Count (220 slots)
Distributed storage
© 2018 Google LLC. All rights reserved.
Simple query execution - explain plan
SELECT COUNT(*) FROM
wikipedia_benchmark.Wiki1B
WHERE title LIKE "G%o%o"
© 2018 Google LLC. All rights reserved.
Simple query execution - more data
SELECT COUNT(*) FROM
wikipedia_benchmark.Wiki10B
WHERE title LIKE "G%o%o"
SELECT COUNT(*) FROM
wikipedia_benchmark.Wiki100
B
WHERE title LIKE "G%o%o"
© 2018 Google LLC. All rights reserved.
Aggregation
© 2018 Google LLC. All rights reserved.
Aggregation with high cardinality
SELECT foo, COUNT(*) as
foo cnt
Z 16 cnt
A 10 FROM `...`
B 9
C 1
GROUP BY 1
ORDER BY 2 DESC
LIMIT 2
foo cnt foo cnt
A 3 A 7 ● Can’t discard “B” or “C” until after all previous stages
C 1 B 9 are complete.
Z 11 Z 5
● High cardinality “foo” will overwhelm master node.
Distributed storage
© 2018 Google LLC. All rights reserved.
Aggregation with Shuffle
foo cnt
Z 16 SELECT foo, COUNT(*) as cnt
A 10 FROM `...`
GROUP BY 1
[A-M]
[N-Z]
ORDER BY 2 DESC
foo cnt LIMIT 2
foo cnt
A 10
Z 16
B 9
● Shuffle puts like values in the same node
foo cnt foo cnt
A 3 A 7 ● Scalable, since you never have to return more than
C 1 B 9 N from each node in middle tier
Z 11 Z 5
Distributed storage
© 2018 Google LLC. All rights reserved.
Aggregation with Shuffle query plan
SELECT language,
MAX(views) as views
FROM
wikipedia_benchmark.Wiki1B
WHERE title LIKE "G%o%"
GROUP BY 1
ORDER BY 2 DESC
LIMIT 100
© 2018 Google LLC. All rights reserved.
Shuffle aggregation execution
SELECT language, MAX(views) as views
Master
FROM `wikipedia_benchmark.Wiki1B`
WHERE title LIKE "G%o%"
GROUP BY 1 ORDER BY 2 DESC LIMIT 100
Shard
Stage 3: SORT, LIMIT (1 slot)
Shard Shard Shard Shard
Stage 2: GROUP BY, SORT, LIMIT (289
slots)
Shuffle
Shard Shard Shard Shard Stage 1: Partial GROUP BY (40,859 sinks)
Distributed storage
© 2018 Google LLC. All rights reserved.
Multi-Shuffle query plan
SELECT language, SUM(views) as views
FROM (
SELECT title, language,
MAX(views) as views
FROM wikipedia_benchmark.Wiki100B
WHERE title LIKE "G%o%"
GROUP BY title, language
)
GROUP BY language
ORDER BY views desc
LIMIT 100
© 2018 Google LLC. All rights reserved.
Shuffle overview
● Shuffle gets invoked whenever mapping from stage N to stage N+1 isn’t statically determined
● Super fast and extra magic
● Shuffle has quotas: shuffle too much and you spill to disk
● Shuffle quotas increase with reserved slots / fixed price reservations
© 2018 Google LLC. All rights reserved.
JOINs
© 2018 Google LLC. All rights reserved.
Large JOIN (Shuffle)
Master
Shard Shard Hash join
Independent shuffles
Shard Shard Shard Shard SELECT c.author.name a, c2.a m
FROM github_repos.commits c
JOIN (SELECT committer.name a, commit
FROM github_repos.commits) c2
Distributed storage
ON c.commit = c2.commit
LIMIT 1000
© 2018 Google LLC. All rights reserved.
Shuffle JOIN query plan
SELECT
c.author.name a, c2.a m
FROM github_repos.commits c
JOIN (
SELECT
committer.name a, commit
FROM github_repos.commits) c2
ON c.commit = c2.commit
LIMIT 1000
© 2018 Google LLC. All rights reserved.
Small JOIN (Broadcast)
Master SELECT
c.author.name a, c2.a m
FROM github_repos.commits c
JOIN (
SELECT
committer.name a,
commit
Shard Shard Shard Left table
FROM github_repos.commits)
c2
Shard Right table ON
c.commit = c2.commit
WHERE c2.a = 'tom'
Distributed storage
LIMIT 1000
© 2018 Google LLC. All rights reserved.
Broadcast JOIN query plan
SELECT c.author.name a,
c2.a m
FROM github_repos.commits c
JOIN (
SELECT
committer.name a, commit
FROM github_repos.commits) c2
ON c.commit = c2.commit
WHERE c2.a = 'tom'
LIMIT 1000
© 2018 Google LLC. All rights reserved.
Auto-awesome
© 2018 Google LLC. All rights reserved.
Repartitioning
Master SELECT title FROM Wiki10B
GROUP BY title ORDER BY title
LIMIT 1000
Shard
Stage 2 (X shards)
Shard Shard Shard Shard Stage 2.1 (Y shards)
Stage 2.2 (Z shards)
Shuffle
Stage 1: Read
Shard Shard Shard Shard
Distributed storage
© 2018 Google LLC. All rights reserved.
Repartitioning query plan
SELECT title
FROM ....Wiki10B
GROUP BY title
ORDER BY title
LIMIT 1000
© 2018 Google LLC. All rights reserved.
Skewed JOINs
© 2018 Google LLC. All rights reserved.
Skewed JOIN query
SELECT l.language, r.language, COUNT(l.title), COUNT(l.ot)
FROM (
SELECT
IF(title like "%c%", "c", title) as title,
title as ot, language
FROM wikipedia_benchmark.Wiki1B)
JOIN (
SELECT title, language
FROM wikipedia_benchmark.Wiki1B
GROUP BY title, language) r
ON l.title = r.title
WHERE l.language <> r.language
GROUP BY 1, 2
© 2018 Google LLC. All rights reserved.
Skewed JOIN query plan
© 2018 Google LLC. All rights reserved.
Skewed JOIN
Master
Shard Shard One shard gets too much data
Independent shuffles
Shard Shard Shard Shard
Can’t just apply the redispatch trick
because you’d need to reshuffle both sides
Distributed storage
© 2018 Google LLC. All rights reserved.
Ordering
© 2018 Google LLC. All rights reserved.
Large order-bys
foo {A, B, C, SELECT foo
D, F, G, H, FROM table
K, L, M, Q,
Z} ORDER BY foo
foo {A, D, F, foo {A, C, H, foo {B, K, L,
Master node needs to sort and store
G, Q} Z} M} all values
Distributed storage
© 2018 Google LLC. All rights reserved.
Overloaded order-by query plan
SELECT title
FROM Wiki1B
ORDER BY title
© 2018 Google LLC. All rights reserved.
Order-by and limit
foo {A, B, C} SELECT foo
drop {D, F, FROM table
H, K, L}
ORDER BY foo
LIMIT 3
foo {A, D, F}
foo {A, C, H} foo {B, K, L}
drop {G, Q}
drop {Z} drop {M} Can drop values over the limit at
each node
Distributed storage
© 2018 Google LLC. All rights reserved.
Order-by and limit query plan
SELECT title
FROM Wiki1B
ORDER BY title
LIMIT 1000
© 2018 Google LLC. All rights reserved.
Query optimization
Legacy SQL Standard SQL
#legacySQL #standardSQL
● Non-standard SQL dialect ● SQL 2011-compliant
○ Includes extensions for nested
● Typically slower execution because there and repeated data
is no query plan optimizer
● More “standard” syntax than BigQuery
● Supports table decorators LegacySQL
○ Planned but not implemented yet
for Standard SQL ● Has a query plan optimizer
○ Will push down predicates when it
can do so safely
● Force standard SQL with the
#standardSQL query prefix
© 2018 Google LLC. All rights reserved.
Best practices for query structure
© 2018 Google LLC. All rights reserved.
Optimization: necessary columns only
Original code Optimized Reasoning
select select Only select the columns necessary, especially in inner
* dim1,
queries. SELECT * is cost inefficient and may also hurt
from metric1
`dataset.table` from
performance.
`dataset.table`
If the number of columns to return is large, consider using
SELECT * EXCEPT to exclude unneeded columns.
© 2018 Google LLC. All rights reserved.
Optimization: ORDER BY with LIMIT
Original code Optimized Reasoning
select select Writing results for a query with an ORDER BY clause can
t.dim1, t.dim1,
result in Resources Exceeded errors. Because the final
t.dim2, t.dim2,
t.metric1 t.metric1
sorting must be done on a single slot, if you are
from from attempting to order a very large result set, the final
`dataset.table` t `dataset.table` t sorting can overwhelm the slot that is processing the
order by metric1 desc order by metric1 desc data.
limit 1000
If you are sorting a very large number of values, use a
LIMIT clause.
© 2018 Google LLC. All rights reserved.
Best practices for JOINs
© 2018 Google LLC. All rights reserved.
Optimization: WHERE, ASAP
Original code Optimized Reasoning
select select WHERE clauses should be executed as soon as
t1.dim1, t1.dim1,
possible, especially within joins, so the tables to be
sum(t1.metric1) sum(t1.metric1)
from from
joined are as small as possible.
`dataset.table1` t1 `dataset.table1` t1
left join left join WHERE clauses may not always be necessary, as
`dataset.table2` t2 `dataset.table2` t2 standard SQL will do its best to push down filters.
on on
Review the explanation plan to see if filtering is
t1.dim1 = t2.dim1 t1.dim1 = t2.dim1
where ifnull(t2.dim2, 'def') = 'abc' where t2.dim2 = 'abc' happening as early as possible, and either fix the
group by 1; group by 1; condition or use a subquery to filter in advance.
© 2018 Google LLC. All rights reserved.
Optimization: late aggregation
Original code Optimized Reasoning
select select Aggregate as late and as seldom as possible,
t1.dim1, t1.dim1,
because aggregation is very costly.
sum(t1.m1) sum(t1.m1)
sum(t1.m2) sum(t2.m2)
from (select from (select BUT: If a table can be reduced drastically by
dim1, dim1, aggregation in preparation for being joined, then
sum(metric1) m1 metric1 m1 aggregate it early.
from `dataset.table1` group by 1) t1 from `dataset.table1`) t1
join (select join (select
dim1, dim1, Caution: With JOINS, this only works if the two
sum(metric2) m2 metric2 m2 tables are already aggregated to the same level
from `dataset.table2` group by 1) t2 from `dataset.table2`) t2 (i.e., if there is only one row for every join key
on t1.dim1 = t2.dim1 on t1.dim1 = t2.dim1
value).
group by 1; group by 1;
© 2018 Google LLC. All rights reserved.
Optimization: JOIN pattern
Original code Optimized Reasoning
select select When you create a query by using a JOIN, consider the
t1.dim1, t1.dim1,
order in which you are merging the data. The standard
sum(t1.metric1), sum(t1.metric1),
sum(t2.metric2) sum(t2.metric2)
SQL query optimizer can determine which table should
from from be on which side of the join, but it is still recommended
small_table t1 large_table t2 to order your joined tables appropriately.
join join
large_table t2 small_table t1
The best practice is to place the largest table first,
on on
t1.dim1 = t2.dim1 t1.dim1 = t2.dim1 followed by the smallest, and then by decreasing size.
where t1.dim1 = ‘abc’ where t1.dim1 = ‘abc’
group by 1; group by 1;
© 2018 Google LLC. All rights reserved.
Best practices for functions
© 2018 Google LLC. All rights reserved.
Optimization: SQL UDFs
Original code Optimized Reasoning
create temporary function create temporary function JavaScript UDFs are a performance
multiply(x INT64, y INT64) multiply(x INT64, y INT64)
killer because they have to spin up a
returns INT64 as
language js (x * y);
V8 subprocess evaluate.
as “””
return x * y; select multiply(2, 2) as result; Prefer SQL UDFs where possible.
“””;
select multiply(2, 2) as result;
© 2018 Google LLC. All rights reserved.
Optimization: case-insensitive search
Original code Optimized Reasoning
select select LOWER() and UPPER() operations have a
count(*) c count(*) c
hard time when dealing with Unicode text:
from from
`dataset.table` `dataset.table`
each character needs to be mapped
where where individually and they can also be multibytes.
lower(text) LIKE ‘%bigquery%’ OR regexp_contains(
lower(text) LIKE ‘%big query%’ text, '(?i)(bigquery|big query)') Use REGEX_MATCH() and add the case
insensitive (?i) modifier to your regular
expression to do case-insensitive searches.
Multiple search values over the same field
can be combined into a single regex.
© 2018 Google LLC. All rights reserved.
Optimization: approximate functions
Original code Optimized Reasoning
select select If the SQL aggregation function you're using
dim1, dim1,
has an equivalent approximation function, the
count(distinct dim2) approx_count_distinct(dim2)
from from
approximation function will yield faster query
`dataset.table` `dataset.table` performance.
group by 1; group by 1;
Approximate functions produce a result which
is generally within 1% of the exact number.
© 2018 Google LLC. All rights reserved.
Optimization: approximate functions
Size Error
1M 0.59%
10M 0.94%
100M 0.39%
1B 0.51%
10B 0.32%
100B 0.32%
© 2018 Google LLC. All rights reserved.
Optimization: latest record
Original code Optimized Reasoning
select select Using the ROW_NUMBER() function can
* except(rn) event.*
fail with Resources Exceeded errors as
from ( from (
select *, select array_agg(
data volume grows if there are too many
row_number() over( t order by t.created_at desc limit 1 elements to ORDER BY in a single
partition by id )[offset(0)] event partition.
order by created_at desc) rn from
from `dataset.table` t
Using ARRAY_AGG() in standard SQL
`dataset.table` t group by
) id allows the query to run more efficiently
where rn = 1 ) because the ORDER BY is allowed to
order by created_at order by created_at drop everything except the top record on
each GROUP BY.
© 2018 Google LLC. All rights reserved.
Optimization: NTILE functions
Original code Optimized Reasoning
select with QuantInfo AS ( Using NTILE() function can fail with
individual_id, select
Resources Exceeded errors as data
ntile(3) over o, qval
(order by sales desc) AS sales_third from unnest((
volume grows if there are too many
from select APPROX_QUANTILES(sales, 2) elements to ORDER BY in a single
`dataset.table` from `dataset.table` partition.
)) AS qval
with offset 0
Using APPROX_QUANTILES() in
),
select standard SQL allows the query to run
individual_id, more efficiently because it doesn’t
(select require a global ORDER BY for all rows in
MIN(o)
the table.
from QuantInfo
where sales <= QuantInfo.qval
) as sales_third
from `dataset.table`
© 2018 Google LLC. All rights reserved.