KEMBAR78
Beyond EXPLAIN: Query Optimization From Theory To Code | PDF
Beyond EXPLAIN
Yuto Hayamizu
Ryoji Kawamichi
Query Optimization
From Theory To Code
2016/5/20
PGCon 2016 @ Ottawa
Before Relational …
• Querying was physical
• Need to understand
physical organization
• Navigate query execution
by yourself
2016/5/20 2
DEPT
EMP
PROJ
“Which file is this table stored in?”
“How are records linked?”
“Which access path is fast for this table?”
“What is the best order of joining tables”
…
Historically …
Before Relational …
• Querying was physical
• Need to understand
physical organization
• Navigate query execution
by yourself
2016/5/20 3
After Relational …
DEPT
EMP
PROJ
“Which file is this table stored in?”
“How are records linked?”
“Which access path is fast for this table?”
“What is the best order of joining tables”
…
• Querying is logical
• Physical organization is
black-boxed
• Just declare what you want
Historically …
Fill the Gap: Physical and Logical
2016/5/20 4
SELECT * FROM DEPT D, EMP E
WHERE E.D_ID = D.ID AND ...
• Storage I/O strategy
• Access path selection
• Join method selection
• Aggregation, sorting
• Resource allocation
• ...
Query Optimizer
If optimizer perfectly fills the gap...
2016/5/20 5
We don’t need EXPLAIN
Reality Is Tough
• Optimizer is NOT PERFECT
• Generated plans are not always optimal, sometimes
far from optimal
• We have to take care of physical behavior
• That’s why EXPLAIN is so much explained
2016/5/20 6
Go Beyond EXPLAIN
• Deeper understanding of optimization, better
control of your databases
• Theoretical fundamentals of query optimization
• From basic framework to cutting-edge technologies
• PostgreSQL Optimizer implementation
• Focusing on basic scan and join methods
• Behavior observation with TPC-H benchmark
2016/5/20 7
Outline
• Introduction
• Theory: Query Optimization Framework
• Code: PostgreSQL Optimizer
• Theory: Cutting-Edge Technologies Overview
• Summary
2016/5/20 8
Query Optimization Framework
• Cost-based optimization
• Plan selection with estimated execution cost
• Most of modern optimizers, including PostgreSQL,
are cost-based
• Rule-based optimization
• Plan selection with heuristically ranked rules
• Easy to produce the same result
• Hard to evaluate wide variety of plans
• Ex) Oracle (~10g), Hive (~0.13)
2016/5/20 9
Main Challenges in Cost-based Optimization
• Cost modeling is HARD
• Overhead of CPU, I/O, memory access, network, …
• Cardinality estimation is HARD
• Output size of scans, joins, aggregations, …
• Join ordering search is HARD
• Combinatorial explosion of join ordering and access path
• Exhaustive search is NP-hard
2016/5/20 10
System-R optimizer (1979)
• “The standard”
• Cost estimation with I/O and CPU
• Cardinality estimation with table statistics
• Bottom-up plan search
• Many of modern optimizers are “System-R style”
• PostgreSQL, MySQL, DB2, Oracle, ...
2016/5/20 11
Cost/Cardinality Estimation
• [#page fetched],[#storage API calls]
are estimated with cost formula and following
statistics
2016/5/20 12
CPU costI/O cost
COST = [#page fetched] + W * [#storage API calls]
weight parameter
• NCARD(T) ... the cardinality of relation T
• TCARD(T) ... the number of pages in relation T
• ICARD(I) ... the number of distinct keys in index I
• NINDX(I) ... the number of pages in index I
Bottom-up Plan Search
• Candidate plans for single relation
• The cheapest access path
• N-relation join ordering search
• Select the cheapest plans for each relation
• Then, find optimal join orderings of every 2-relation join
• Then, find optimal join orderings of every 3-relation join
• ... until N-relation
2016/5/20 13
Ex) A ⨝ B ⨝ C ⨝ D
Ex) A ⨝ B ⨝ C ⨝ D
Ex) A ⨝ B ⨝ C ⨝ D
Ex) A ⨝ B ⨝ C ⨝ D
Ex) A ⨝ B ⨝ C ⨝ D
Volcano/Cascades (1993)
• Top-down transformational plan search
• Yet another optimization approach
• Not well known as “System-R style”, but widely used in
practice
Ex) SQL Server, Apache Hive (Apache Calcite), Greenplum
Orca
• Extensible optimization framework
2016/5/20 19
Extensible Optimization Framework
Query Optimizer Generator
• Generalized expression of query plan not limited
to relational data model
• Users (optimizer developers) defines actual
implementations:
• Logical operator ... corresponds to relational algebra
• Physical algorithm ... corresponds to scan & join
methods such as sequential scan, index scan, hash
join, nested loop join
2016/5/20 20
Top-down Transformational Search
• Starts from an initial “logical plan”
• Generate alternative plans with:
A) Logical operator transformation
B) Physical algorithm selection
C) Enforcing sorting order
2016/5/20 21
Join Select T
Join
Select R Select S
Proj
Join
Select T
JoinSelect R
Select S
Proj
Change join ordering
Join Select T
Join
Select R
Select SProj
Projection push down
Example: 3-way join with projection
Top-down Transformational Search
• Starts from an initial “logical plan”
• Generate alternative plans with:
A) Logical operator transformation
B) Physical algorithm selection
C) Enforcing sorting order
2016/5/20 22
Join Select T
Join
Select R Select S
ProjExample: 3-way join with projection
HashJoin Select T
Join
SeqScan R SeqScan S
Proj
Join IdxScan T
Join
Select R Select S
Proj
…
Top-down Transformational Search
• Starts from an initial “logical plan”
• Generate alternative plans with:
A) Logical operator transformation
B) Physical algorithm selection
C) Enforcing sorting order
2016/5/20 23
Join Select T
Join
Select R Select S
Proj
Example: 3-way join with projection
Join Select T
Join
Sort Sort
Proj
Select R Select S
Enforce sorting order
merge join of R and S is possible now
Benefits of Top-down approach
• Possible to intentionally limit search space
• Effective pruning with branch-and-bound
• Limit search space with search time deadline
2016/5/20 24
Cost-based Optimization Basics
Two major cost-based optimization style
• System-R
• Cost modeling with statistics
• Bottom-up search
• Volcano/Cascades
• Extensible optimizer generator
• Cost estimation is user’s responsibility
• Top-down transformational search
2016/5/20 25
Outline
• Introduction
• Theory: Query Optimization Framework
• Code: PostgreSQL Optimizer
• Theory: Cutting-Edge Technologies Overview
• Summary
2016/5/20 26
PostgreSQL Optimizer
“System-R style” optimization
• Bottom-up plan search with dynamic programming
• CPU and I/O operation based cost modeling
2016/5/20 27
Seq. I/O Random I/O CPU cost per tuple
Cost of single operation
• seq_page_cost
• random_page_cost
• cpu_tuple_cost
• cpu_index_tuple_cost
• cpu_operator_cost
• (parallel_tuple_cost)
Estimated number of each operation
• Cardinality estimation with
statistics
• Cost formula for each plan type
• SeqScan, IndexScan
• NestLoopJoin, HashJoin,
MergeJoin, ...
Detailed Look At Basic Scan Types
• Sequential scan
• Efficient for accessing large potion of tables
• Index scan
• Efficient for accessing a fraction of data
2016/5/20 28
Execution cost
Query selectivity
Sequential scan
of SeqScan
29
= (# pages in a table)
= (# tuples in a table)
= #qual_operator
= (#tuples) × (weight factor of A)
+ (#tuples) × (weight factor of B)
+ ・・・
・・・ WHERE AND AND ・・・A B
cost_seqscan()
@optimizer/path/costsize.c
of IndexScan
Consists of:
(A) CPU cost of searching B+-tree
(B) CPU cost of scanning index tuples in leaf pages
(C) I/O cost of leaf pages
(D) I/O cost of heap pages
(E) CPU cost of scanning heap tuples
2014/12/04 30
of IndexScan
(A) B+-tree search
(B) Scanning index tuples in leaf pages
2014/12/04 31
+= log2(#index_tuples)
I/O cost of internal pages
Assumed to be always cached in the buffer
+= #qual_operator
× #leaf_pages ✕ #ituple_per_page × σ
Selectivity σ
Comes from statistics
Mackert and Lohman function(Yao function)
I/O count estimation with consideration of buffer caching
of IndexScan
(C) I/O cost of index leaf pages
2014/12/04 32
+= Y(effective_cache_size, #leaf_pages)
Selectivity σ
I/Ocount
of IndexScan
(D) I/O cost of heap pages
(E) CPU cost of scanning heap tuples
・ Estimate the number of scanned tuples from σ
2014/12/04 33
+= α2 × #match_pages
Correlation between index and heap ordering: α
α = 0 : I/O pattern is random α = 1 : I/O pattern is sequential
+= (1-α2) × #match_tuples
Detailed Look At Join Methods
• Hash join
• Efficient for joining large number of records
• Usually combined with sequential scans
• Nested Loop Join
• Efficient for joining small number of records
• Usually combined with index scans or small table
sequential scans
2016/5/20 34
of HashJoin
2014/12/04 35
of HashJoin
Build phase
• Cost += Cost(inner)
2014/12/04 36
+= #qual_op × #inner_tuples
+= #inner_tuples
Hashing cost
of HashJoin
2014/12/04 37
Build phase
• Cost += Cost(inner)
+= #qual_op × #inner_tuples
+= #inner_tuples
of HashJoin
2014/12/04 38
Build phase
• Cost += Cost(inner)+
Probe phase
• Cost += Cost(outer)+
+= #qual_op × #inner_tuples
+= #inner_tuples
+= #qual_op × (1 + #bucket_size × 0.5)
× #outer_tuples
+= #match_tuples
Hashing & table lookup (bucket search) cost
recordrecordrecordrecordrecordrecordrecordrecordrecordrecordrecordrecordrecordrecordrecordtuple
of HashJoin
2014/12/04 39
#buckets: 2 #buckets : 4
build
4 tuples are compared for
lookup in average 2 tuples are compared for lookup
in average
0.E+00
5.E+06
1.E+07
2.E+07
2.E+07
3.E+07
3.E+07
4.E+07
4.E+07
10000 100000 1000000 10000000 100000000
Estimated cost of 2-way HashJoin
# of records
16 records
recordrecordrecordrecordrecordrecordrecordtuple
recordrecordrecordrecordrecordrecordrecordtuple
recordrecordrecordtuple
recordrecordrecordtuple
recordrecordrecordtuple
recordrecordrecordtuple
of NestLoopJoin
2014/12/04 40
R ⨝ S Scan R r2
r1
r3
r4
Scan S with r1
s1
s2
s3
ReScan S with r2
s1
s2
s3
outer inner
of NestLoopJoin
• When #outer_tuples = 1
2014/12/04 41
R ⨝ S Scan R
r1
Scan S with r1
s1
s2
s3
outer inner
Cost = Cost(outer) + Cost(inner) +
+= #inner_tuples
+= #qual_operator × #inner_tuples
of NestLoopJoin
• When #outer_tuples > 1
R ⨝ S Scan R r2
r1
r3
r4
Scan S with r1
s1
s2
s3
ReScan S with r2
s1
s2
s3
outer inner
Cost = Cost(outer) + Cost(inner) +
+ (#outer_tuples - 1) × Cost(ReScan inner)
Higher buffer hit ratio in ReScan
→ Cost of ReScan is lower than cost of IndexScan
+= #inner_tuples × #outer_tuples
+= #qual_operator × #inner_tuples × #outer_tuples
See How It Works
• TPC-H Benchmark
• Specification and tools for benchmarking data
warehouse workload
• Open source implementation: DBT-3, pg_tpch
• Schema, data generation rules and queries
• Experiments with 100GB
• Scale Factor = 100
2016/5/20 43
Experimental Setup
• Dell R720xd
• Xeon (2sockets, 16cores)
• x24 NL-SAS HDD
• With PostgreSQL 9.5
• Default cost parameter settings
• SeqScan & HashJoin
• enable_seqscan = on, enable_hashjoin = on
and disables other methods
• IndexScan & NestLoopJoin
• enable_indexscan = on, enable_nestloop = on
and disables other methods
2016/5/20 44
TPC-H Q.1: The Simplest Case
2016/5/20 45
5.E+05
5.E+06
5.E+07
5.E+08
1 10 100 1000
Estimated cost
Selectivity (l_shipdate)
IndexScan
SeqScan
10
100
1000
10000
1 10 100 1000
Execution time (sec)
Selectivity(l_shipdate)
IndexScan
SeqScan
• Good trend estimation for each
method
• Estimated break-event point is
errorneus
• IndexScan should be more
expensive (need parameter
calibration)
SELECT count(*), ... FROM lineitem
WHERE l_shipdate BETWEEN [X] AND [Y]
TPC-H Q.3
2016/5/20 46
Estimated cost
SeqScan
customer
SeqScan
orders
SeqScan
lineitem
Hash
HashJoin
HashJoin
IndexScan
orders
IndexScan
lineitem
NestLoop
NestLoop
IndexScan
customer
Execution time (sec)
Selectivity
1.E+00
1.E+02
1.E+04
1.E+06
1.E+08
1 10 100 1000 10000 100000 1000000
1
10
100
1000
10000
1 10 100 1000 10000 100000 1000000
Selectivity
NestLoop+IndexScan
HashJoin+SeqScan
NestLoop+IndexScan
HashJoin+SeqScan Similar result as in Q.1
• Good trend estimation for each
• Erroneous break-event point
without parameter calibration
SELECT count(*), ...
FROM customer, orders, lineitem
WHERE c_custkey = o_custkey AND
o_orderkey = l_orderkey AND
c_custkey < [X] AND
c_mktsegment = ‘MACHINERY’;
100
1000
10000
100000
1 10 100 1000 10000
More Complex Case
TPC-H Q.4: Semi-Join Query
• Plan selection for semi-
join tend to be unstable
2016/5/20 47Selectivity
1.E+05
1.E+06
1.E+07
1.E+08
1.E+09
1 10 100 1000 10000
HashJoin+SeqScan
NestLoop+IndexScan
Estimated cost
Execution time (sec)
HashJoin+SeqScan
NestLoop+IndexScan SELECT count(*), ...
FROM orders
WHERE
o_orderdate >= ‘1995-01-01’ AND
o_orderdate < ‘1995-01-01’
+ interval ‘3 month’ AND
EXISTS(
SELECT * FROM lineitem
WHERE l_orderkey = o_orderkey
AND l_commitdate < l_receiptdate)
Selectivity
1.E+06
1.E+07
1.E+08
1.E+09
1.E+10
1.E+11
1.E+00 1.E+02 1.E+04 1.E+06 1.E+08
More Complex Case
TPC-H Q.22: Anti-Join Query
• Difficulties in overall cost
trend estimation
2016/5/20 48
Selectivity
1
10
100
1000
10000
1.E+00 1.E+02 1.E+04 1.E+06 1.E+08
Estimated cost
Execution time (sec)
Selectivity
HashJoin+SeqScan
NestLoop+IndexScan
HashJoin+SeqScan
NestLoop+IndexScan
SELECT count(*), ...
FROM supplier, lineitem l1, orders, nation
WHERE s_suppkey = l1.l_suppkey AND
o_orderkey = l1.l_orderkey AND
o_orderstatus = 'F' AND
l1.l_receiptdate > l1.l_commitdate AND
EXISTS (
SELECT * FROM lineitem l2
WHERE l2.l_orderkey = l1.l_orderkey
AND l2.l_suppkey <> l1.l_suppkey)
AND NOT EXIST (
SELECT * FROM lineitem l3
WHERE l3.l_orderkey = l1.l_orderkey
AND l3.l_suppkey <> l1.l_suppkey
AND l3.l_receiptdate > l3.l_commitdate)
AND s_nationkey = n_nationkey
AND n_name = ‘JAPAN'
Summary: PostgreSQL
Optimizer
• Detailed look at cost modeling of basic methods
• SeqScan, IndexScan
• HashJoin, NestedLoopJoin
• Observation with TPC-H benchmark
• Good cost trend estimation for simple join queries
• Erroneous cheapest plan selection without parameter tuning
• Difficulties with semi-join and anti-join queries
2016/5/20 49
Outline
• Introduction
• Theory: Query Optimization Framework
• Code: PostgreSQL Optimizer
• Theory: Cutting-Edge Technologies Overview
• Summary
2016/5/20 50
Cutting-Edge Technologies
• Traditional optimization was a “closed” problem
• “Rethink the contract” ー Surajit Chaudhuri
• Feedback from previous execution
• Dynamic integration with execution
2016/5/20 51
cardinality
estimation
cost model
plan space
enumeration
(SQL) query plan
Mid-query Re-optimization
• Detects sub-optimality of executing query plan
• Query plans are annotated for later estimation
improvement
• Runtime statistics collection
• Statistics collector probes are inserted into operators of
executing query plan
• Plan modification strategy
• Discard current execution and re-optimize whole plan
• Re-optimizer only subtree of the plan that are not
started yet
• Save partial execution result and generate new SQL
using the result
2016/5/20 52
[N. Kabra et.al., SIGMOD’98]
Plan Bouquet
• Generate a set of plans for each selectivity range
• Estimation improvement with runtime statistics
collection
• Evaluation with PostgreSQL
2016/5/20 53
[A. Dutt et.al., SIGMOD’14]
Bounding Impact of Estimation Error
• “Uncertainty” analysis of cost estimation
• Optimality sensitivity to estimation error
• Execute partially to reduce uncertainty
2016/5/20 54
[T. Neumann et.al., BTW Conf ‘13]
Outline
• Introduction
• Theory: Query Optimization Framework
• Code: PostgreSQL Optimizer
• Theory: Cutting-Edge Technologies Overview
• Summary
2016/5/20 55
Summary
• Cost-based optimization framework
• System-R style bottom-up optimization
• Volcano style top-down optimization
• Detailed look at PostgreSQL optimizer
• Cost modeling of basic scan and join method
• Experiment with TPC-H benchmark
• Brief overview of cutting-edge technologies
2016/5/20 56

Beyond EXPLAIN: Query Optimization From Theory To Code

  • 1.
    Beyond EXPLAIN Yuto Hayamizu RyojiKawamichi Query Optimization From Theory To Code 2016/5/20 PGCon 2016 @ Ottawa
  • 2.
    Before Relational … •Querying was physical • Need to understand physical organization • Navigate query execution by yourself 2016/5/20 2 DEPT EMP PROJ “Which file is this table stored in?” “How are records linked?” “Which access path is fast for this table?” “What is the best order of joining tables” … Historically …
  • 3.
    Before Relational … •Querying was physical • Need to understand physical organization • Navigate query execution by yourself 2016/5/20 3 After Relational … DEPT EMP PROJ “Which file is this table stored in?” “How are records linked?” “Which access path is fast for this table?” “What is the best order of joining tables” … • Querying is logical • Physical organization is black-boxed • Just declare what you want Historically …
  • 4.
    Fill the Gap:Physical and Logical 2016/5/20 4 SELECT * FROM DEPT D, EMP E WHERE E.D_ID = D.ID AND ... • Storage I/O strategy • Access path selection • Join method selection • Aggregation, sorting • Resource allocation • ... Query Optimizer
  • 5.
    If optimizer perfectlyfills the gap... 2016/5/20 5 We don’t need EXPLAIN
  • 6.
    Reality Is Tough •Optimizer is NOT PERFECT • Generated plans are not always optimal, sometimes far from optimal • We have to take care of physical behavior • That’s why EXPLAIN is so much explained 2016/5/20 6
  • 7.
    Go Beyond EXPLAIN •Deeper understanding of optimization, better control of your databases • Theoretical fundamentals of query optimization • From basic framework to cutting-edge technologies • PostgreSQL Optimizer implementation • Focusing on basic scan and join methods • Behavior observation with TPC-H benchmark 2016/5/20 7
  • 8.
    Outline • Introduction • Theory:Query Optimization Framework • Code: PostgreSQL Optimizer • Theory: Cutting-Edge Technologies Overview • Summary 2016/5/20 8
  • 9.
    Query Optimization Framework •Cost-based optimization • Plan selection with estimated execution cost • Most of modern optimizers, including PostgreSQL, are cost-based • Rule-based optimization • Plan selection with heuristically ranked rules • Easy to produce the same result • Hard to evaluate wide variety of plans • Ex) Oracle (~10g), Hive (~0.13) 2016/5/20 9
  • 10.
    Main Challenges inCost-based Optimization • Cost modeling is HARD • Overhead of CPU, I/O, memory access, network, … • Cardinality estimation is HARD • Output size of scans, joins, aggregations, … • Join ordering search is HARD • Combinatorial explosion of join ordering and access path • Exhaustive search is NP-hard 2016/5/20 10
  • 11.
    System-R optimizer (1979) •“The standard” • Cost estimation with I/O and CPU • Cardinality estimation with table statistics • Bottom-up plan search • Many of modern optimizers are “System-R style” • PostgreSQL, MySQL, DB2, Oracle, ... 2016/5/20 11
  • 12.
    Cost/Cardinality Estimation • [#pagefetched],[#storage API calls] are estimated with cost formula and following statistics 2016/5/20 12 CPU costI/O cost COST = [#page fetched] + W * [#storage API calls] weight parameter • NCARD(T) ... the cardinality of relation T • TCARD(T) ... the number of pages in relation T • ICARD(I) ... the number of distinct keys in index I • NINDX(I) ... the number of pages in index I
  • 13.
    Bottom-up Plan Search •Candidate plans for single relation • The cheapest access path • N-relation join ordering search • Select the cheapest plans for each relation • Then, find optimal join orderings of every 2-relation join • Then, find optimal join orderings of every 3-relation join • ... until N-relation 2016/5/20 13
  • 14.
    Ex) A ⨝B ⨝ C ⨝ D
  • 15.
    Ex) A ⨝B ⨝ C ⨝ D
  • 16.
    Ex) A ⨝B ⨝ C ⨝ D
  • 17.
    Ex) A ⨝B ⨝ C ⨝ D
  • 18.
    Ex) A ⨝B ⨝ C ⨝ D
  • 19.
    Volcano/Cascades (1993) • Top-downtransformational plan search • Yet another optimization approach • Not well known as “System-R style”, but widely used in practice Ex) SQL Server, Apache Hive (Apache Calcite), Greenplum Orca • Extensible optimization framework 2016/5/20 19
  • 20.
    Extensible Optimization Framework QueryOptimizer Generator • Generalized expression of query plan not limited to relational data model • Users (optimizer developers) defines actual implementations: • Logical operator ... corresponds to relational algebra • Physical algorithm ... corresponds to scan & join methods such as sequential scan, index scan, hash join, nested loop join 2016/5/20 20
  • 21.
    Top-down Transformational Search •Starts from an initial “logical plan” • Generate alternative plans with: A) Logical operator transformation B) Physical algorithm selection C) Enforcing sorting order 2016/5/20 21 Join Select T Join Select R Select S Proj Join Select T JoinSelect R Select S Proj Change join ordering Join Select T Join Select R Select SProj Projection push down Example: 3-way join with projection
  • 22.
    Top-down Transformational Search •Starts from an initial “logical plan” • Generate alternative plans with: A) Logical operator transformation B) Physical algorithm selection C) Enforcing sorting order 2016/5/20 22 Join Select T Join Select R Select S ProjExample: 3-way join with projection HashJoin Select T Join SeqScan R SeqScan S Proj Join IdxScan T Join Select R Select S Proj …
  • 23.
    Top-down Transformational Search •Starts from an initial “logical plan” • Generate alternative plans with: A) Logical operator transformation B) Physical algorithm selection C) Enforcing sorting order 2016/5/20 23 Join Select T Join Select R Select S Proj Example: 3-way join with projection Join Select T Join Sort Sort Proj Select R Select S Enforce sorting order merge join of R and S is possible now
  • 24.
    Benefits of Top-downapproach • Possible to intentionally limit search space • Effective pruning with branch-and-bound • Limit search space with search time deadline 2016/5/20 24
  • 25.
    Cost-based Optimization Basics Twomajor cost-based optimization style • System-R • Cost modeling with statistics • Bottom-up search • Volcano/Cascades • Extensible optimizer generator • Cost estimation is user’s responsibility • Top-down transformational search 2016/5/20 25
  • 26.
    Outline • Introduction • Theory:Query Optimization Framework • Code: PostgreSQL Optimizer • Theory: Cutting-Edge Technologies Overview • Summary 2016/5/20 26
  • 27.
    PostgreSQL Optimizer “System-R style”optimization • Bottom-up plan search with dynamic programming • CPU and I/O operation based cost modeling 2016/5/20 27 Seq. I/O Random I/O CPU cost per tuple Cost of single operation • seq_page_cost • random_page_cost • cpu_tuple_cost • cpu_index_tuple_cost • cpu_operator_cost • (parallel_tuple_cost) Estimated number of each operation • Cardinality estimation with statistics • Cost formula for each plan type • SeqScan, IndexScan • NestLoopJoin, HashJoin, MergeJoin, ...
  • 28.
    Detailed Look AtBasic Scan Types • Sequential scan • Efficient for accessing large potion of tables • Index scan • Efficient for accessing a fraction of data 2016/5/20 28 Execution cost Query selectivity Sequential scan
  • 29.
    of SeqScan 29 = (#pages in a table) = (# tuples in a table) = #qual_operator = (#tuples) × (weight factor of A) + (#tuples) × (weight factor of B) + ・・・ ・・・ WHERE AND AND ・・・A B cost_seqscan() @optimizer/path/costsize.c
  • 30.
    of IndexScan Consists of: (A)CPU cost of searching B+-tree (B) CPU cost of scanning index tuples in leaf pages (C) I/O cost of leaf pages (D) I/O cost of heap pages (E) CPU cost of scanning heap tuples 2014/12/04 30
  • 31.
    of IndexScan (A) B+-treesearch (B) Scanning index tuples in leaf pages 2014/12/04 31 += log2(#index_tuples) I/O cost of internal pages Assumed to be always cached in the buffer += #qual_operator × #leaf_pages ✕ #ituple_per_page × σ Selectivity σ Comes from statistics
  • 32.
    Mackert and Lohmanfunction(Yao function) I/O count estimation with consideration of buffer caching of IndexScan (C) I/O cost of index leaf pages 2014/12/04 32 += Y(effective_cache_size, #leaf_pages) Selectivity σ I/Ocount
  • 33.
    of IndexScan (D) I/Ocost of heap pages (E) CPU cost of scanning heap tuples ・ Estimate the number of scanned tuples from σ 2014/12/04 33 += α2 × #match_pages Correlation between index and heap ordering: α α = 0 : I/O pattern is random α = 1 : I/O pattern is sequential += (1-α2) × #match_tuples
  • 34.
    Detailed Look AtJoin Methods • Hash join • Efficient for joining large number of records • Usually combined with sequential scans • Nested Loop Join • Efficient for joining small number of records • Usually combined with index scans or small table sequential scans 2016/5/20 34
  • 35.
  • 36.
    of HashJoin Build phase •Cost += Cost(inner) 2014/12/04 36 += #qual_op × #inner_tuples += #inner_tuples Hashing cost
  • 37.
    of HashJoin 2014/12/04 37 Buildphase • Cost += Cost(inner) += #qual_op × #inner_tuples += #inner_tuples
  • 38.
    of HashJoin 2014/12/04 38 Buildphase • Cost += Cost(inner)+ Probe phase • Cost += Cost(outer)+ += #qual_op × #inner_tuples += #inner_tuples += #qual_op × (1 + #bucket_size × 0.5) × #outer_tuples += #match_tuples Hashing & table lookup (bucket search) cost
  • 39.
    recordrecordrecordrecordrecordrecordrecordrecordrecordrecordrecordrecordrecordrecordrecordtuple of HashJoin 2014/12/04 39 #buckets:2 #buckets : 4 build 4 tuples are compared for lookup in average 2 tuples are compared for lookup in average 0.E+00 5.E+06 1.E+07 2.E+07 2.E+07 3.E+07 3.E+07 4.E+07 4.E+07 10000 100000 1000000 10000000 100000000 Estimated cost of 2-way HashJoin # of records 16 records recordrecordrecordrecordrecordrecordrecordtuple recordrecordrecordrecordrecordrecordrecordtuple recordrecordrecordtuple recordrecordrecordtuple recordrecordrecordtuple recordrecordrecordtuple
  • 40.
    of NestLoopJoin 2014/12/04 40 R⨝ S Scan R r2 r1 r3 r4 Scan S with r1 s1 s2 s3 ReScan S with r2 s1 s2 s3 outer inner
  • 41.
    of NestLoopJoin • When#outer_tuples = 1 2014/12/04 41 R ⨝ S Scan R r1 Scan S with r1 s1 s2 s3 outer inner Cost = Cost(outer) + Cost(inner) + += #inner_tuples += #qual_operator × #inner_tuples
  • 42.
    of NestLoopJoin • When#outer_tuples > 1 R ⨝ S Scan R r2 r1 r3 r4 Scan S with r1 s1 s2 s3 ReScan S with r2 s1 s2 s3 outer inner Cost = Cost(outer) + Cost(inner) + + (#outer_tuples - 1) × Cost(ReScan inner) Higher buffer hit ratio in ReScan → Cost of ReScan is lower than cost of IndexScan += #inner_tuples × #outer_tuples += #qual_operator × #inner_tuples × #outer_tuples
  • 43.
    See How ItWorks • TPC-H Benchmark • Specification and tools for benchmarking data warehouse workload • Open source implementation: DBT-3, pg_tpch • Schema, data generation rules and queries • Experiments with 100GB • Scale Factor = 100 2016/5/20 43
  • 44.
    Experimental Setup • DellR720xd • Xeon (2sockets, 16cores) • x24 NL-SAS HDD • With PostgreSQL 9.5 • Default cost parameter settings • SeqScan & HashJoin • enable_seqscan = on, enable_hashjoin = on and disables other methods • IndexScan & NestLoopJoin • enable_indexscan = on, enable_nestloop = on and disables other methods 2016/5/20 44
  • 45.
    TPC-H Q.1: TheSimplest Case 2016/5/20 45 5.E+05 5.E+06 5.E+07 5.E+08 1 10 100 1000 Estimated cost Selectivity (l_shipdate) IndexScan SeqScan 10 100 1000 10000 1 10 100 1000 Execution time (sec) Selectivity(l_shipdate) IndexScan SeqScan • Good trend estimation for each method • Estimated break-event point is errorneus • IndexScan should be more expensive (need parameter calibration) SELECT count(*), ... FROM lineitem WHERE l_shipdate BETWEEN [X] AND [Y]
  • 46.
    TPC-H Q.3 2016/5/20 46 Estimatedcost SeqScan customer SeqScan orders SeqScan lineitem Hash HashJoin HashJoin IndexScan orders IndexScan lineitem NestLoop NestLoop IndexScan customer Execution time (sec) Selectivity 1.E+00 1.E+02 1.E+04 1.E+06 1.E+08 1 10 100 1000 10000 100000 1000000 1 10 100 1000 10000 1 10 100 1000 10000 100000 1000000 Selectivity NestLoop+IndexScan HashJoin+SeqScan NestLoop+IndexScan HashJoin+SeqScan Similar result as in Q.1 • Good trend estimation for each • Erroneous break-event point without parameter calibration SELECT count(*), ... FROM customer, orders, lineitem WHERE c_custkey = o_custkey AND o_orderkey = l_orderkey AND c_custkey < [X] AND c_mktsegment = ‘MACHINERY’;
  • 47.
    100 1000 10000 100000 1 10 1001000 10000 More Complex Case TPC-H Q.4: Semi-Join Query • Plan selection for semi- join tend to be unstable 2016/5/20 47Selectivity 1.E+05 1.E+06 1.E+07 1.E+08 1.E+09 1 10 100 1000 10000 HashJoin+SeqScan NestLoop+IndexScan Estimated cost Execution time (sec) HashJoin+SeqScan NestLoop+IndexScan SELECT count(*), ... FROM orders WHERE o_orderdate >= ‘1995-01-01’ AND o_orderdate < ‘1995-01-01’ + interval ‘3 month’ AND EXISTS( SELECT * FROM lineitem WHERE l_orderkey = o_orderkey AND l_commitdate < l_receiptdate) Selectivity
  • 48.
    1.E+06 1.E+07 1.E+08 1.E+09 1.E+10 1.E+11 1.E+00 1.E+02 1.E+041.E+06 1.E+08 More Complex Case TPC-H Q.22: Anti-Join Query • Difficulties in overall cost trend estimation 2016/5/20 48 Selectivity 1 10 100 1000 10000 1.E+00 1.E+02 1.E+04 1.E+06 1.E+08 Estimated cost Execution time (sec) Selectivity HashJoin+SeqScan NestLoop+IndexScan HashJoin+SeqScan NestLoop+IndexScan SELECT count(*), ... FROM supplier, lineitem l1, orders, nation WHERE s_suppkey = l1.l_suppkey AND o_orderkey = l1.l_orderkey AND o_orderstatus = 'F' AND l1.l_receiptdate > l1.l_commitdate AND EXISTS ( SELECT * FROM lineitem l2 WHERE l2.l_orderkey = l1.l_orderkey AND l2.l_suppkey <> l1.l_suppkey) AND NOT EXIST ( SELECT * FROM lineitem l3 WHERE l3.l_orderkey = l1.l_orderkey AND l3.l_suppkey <> l1.l_suppkey AND l3.l_receiptdate > l3.l_commitdate) AND s_nationkey = n_nationkey AND n_name = ‘JAPAN'
  • 49.
    Summary: PostgreSQL Optimizer • Detailedlook at cost modeling of basic methods • SeqScan, IndexScan • HashJoin, NestedLoopJoin • Observation with TPC-H benchmark • Good cost trend estimation for simple join queries • Erroneous cheapest plan selection without parameter tuning • Difficulties with semi-join and anti-join queries 2016/5/20 49
  • 50.
    Outline • Introduction • Theory:Query Optimization Framework • Code: PostgreSQL Optimizer • Theory: Cutting-Edge Technologies Overview • Summary 2016/5/20 50
  • 51.
    Cutting-Edge Technologies • Traditionaloptimization was a “closed” problem • “Rethink the contract” ー Surajit Chaudhuri • Feedback from previous execution • Dynamic integration with execution 2016/5/20 51 cardinality estimation cost model plan space enumeration (SQL) query plan
  • 52.
    Mid-query Re-optimization • Detectssub-optimality of executing query plan • Query plans are annotated for later estimation improvement • Runtime statistics collection • Statistics collector probes are inserted into operators of executing query plan • Plan modification strategy • Discard current execution and re-optimize whole plan • Re-optimizer only subtree of the plan that are not started yet • Save partial execution result and generate new SQL using the result 2016/5/20 52 [N. Kabra et.al., SIGMOD’98]
  • 53.
    Plan Bouquet • Generatea set of plans for each selectivity range • Estimation improvement with runtime statistics collection • Evaluation with PostgreSQL 2016/5/20 53 [A. Dutt et.al., SIGMOD’14]
  • 54.
    Bounding Impact ofEstimation Error • “Uncertainty” analysis of cost estimation • Optimality sensitivity to estimation error • Execute partially to reduce uncertainty 2016/5/20 54 [T. Neumann et.al., BTW Conf ‘13]
  • 55.
    Outline • Introduction • Theory:Query Optimization Framework • Code: PostgreSQL Optimizer • Theory: Cutting-Edge Technologies Overview • Summary 2016/5/20 55
  • 56.
    Summary • Cost-based optimizationframework • System-R style bottom-up optimization • Volcano style top-down optimization • Detailed look at PostgreSQL optimizer • Cost modeling of basic scan and join method • Experiment with TPC-H benchmark • Brief overview of cutting-edge technologies 2016/5/20 56