Overview
Alternative Methods
Details of the Method
Implementation
Ongoing and Future Work
Dynamic Symbolic Database Application Testing
Chengkai Li, Christoph Csallner
University of Texas at Arlington
June 7, 2010
DBTest 2010 Chengkai Li, Christoph Csallner Dynamic Symbolic Database Application Testing: 1/30
Motivation
Maximizing code coverage is an important goal in testing.
Database applications: input can be user-supplied queries.
Query results will be used as program values in program logic.
Different queries thus result in different execution paths.
To maximize code coverage: we need to enumerate queries in
an effective way.
Our Method
Generate queries dynamically by inverting branching conditions in
existing program execution paths.
1. Monitor the program’s execution paths by dynamic symbolic execution (e.g., Dart, Pex).
2. Invert a branching condition on some covered path → a new test query.
3. Execute the query, bringing in new tuples.
4. The new tuples will cover new paths.
5. Repeat steps 1-4.
Illustration of the Idea
After the initial query
q1 = c1 ∧ c2
Execution tree (maintained by the dynamic symbolic engine):
each path to a leaf node represents an execution path that was encountered
for tuples satisfying the branching conditions on that path.
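[Figure: execution tree rooted at true, containing the conditions c1, c2, c3, c4, c5, !c5, and !c4; (q1) labels the part of the tree covered by q1.]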
if (z > 0) {          // c1
    if (z < 100) {    // c2
        // ...
    }
}
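As a rough sketch of how branching conditions such as c1 and c2 could be turned into a test query, the following Python fragment builds a conjunctive selection query from (attribute, operator, constant) triples. The helper `to_sql`, the table name `r`, and the assumption that z is read from a column named `z` are illustrative and do not come from the slides.

```python
# Hypothetical helper: build a single-relation conjunctive selection query
# from branching conditions of the form (attribute, operator, constant).
def to_sql(table, conjuncts):
    if not conjuncts:
        return f"SELECT * FROM {table}"
    where = " AND ".join(f"{a} {op} {v}" for a, op, v in conjuncts)
    return f"SELECT * FROM {table} WHERE {where}"

# q1 = c1 AND c2, with c1: z > 0 and c2: z < 100 (table name "r" is assumed).
q1 = to_sql("r", [("z", ">", 0), ("z", "<", 100)])
print(q1)
```

Constants are interpolated directly only because this sketch uses integer literals; real code would bind them as query parameters.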
Illustration of the Idea
Candidate queries after the initial query
Each dashed edge represents an inverted branching condition and thus
a candidate query.
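[Figure: the same execution tree, extended with dashed edges for the inverted conditions !c3, !c2, and !c1; (q1) labels the part of the tree covered by q1.]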
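One possible way to derive candidate queries from the covered paths is sketched below in Python. It assumes the covered paths form the chain c1, c2, c3, c4 with the leaves c5, !c5, and !c4, which is a guess at the shape of the figure; the function names are hypothetical.

```python
def negate(label):
    """Invert a condition label: c1 <-> !c1."""
    return label[1:] if label.startswith("!") else "!" + label

def candidate_queries(covered_paths):
    """Each uncovered 'prefix + inverted condition' is a candidate query."""
    covered = {tuple(p[:i + 1]) for p in covered_paths for i in range(len(p))}
    candidates = []
    for path in covered_paths:
        for i, cond in enumerate(path):
            cand = tuple(path[:i]) + (negate(cond),)
            if cand not in covered and cand not in candidates:
                candidates.append(cand)
    return candidates

# Assumed paths encountered while testing the tuples returned by q1.
paths = [
    ["c1", "c2", "c3", "c4", "c5"],
    ["c1", "c2", "c3", "c4", "!c5"],
    ["c1", "c2", "c3", "!c4"],
]
cands = candidate_queries(paths)
for cand in cands:
    print(" AND ".join(cand))
```

Under this assumed tree shape the candidates are !c1, c1 ∧ !c2, and c1 ∧ c2 ∧ !c3, one per inverted condition.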
Illustration of the Idea
The second test query
q2=!c1
[Figure: the same execution tree with the conditions c1 to c5 and their inversions; the labels (q1) and (q2) mark the two test queries.]
Illustration of the Idea
After the second test query
q2=!c1
Candidate queries are again shown as dashed edges.
[Figure: execution tree after q2, extended with the conditions c6, c7, !c7, c11, !c11, c12, !c12, and !c6; the labels (q1) and (q2) mark the test queries issued so far.]
Illustration of the Idea
The third test query
q3=!c1 ∧ c6 ∧ c7
[Figure: the same execution tree; the labels (q1), (q2), and (q3) mark the three test queries.]
Illustration of the Idea
After the third test query
q3=!c1 ∧ c6 ∧ c7
[Figure: execution tree after q3, extended with the conditions c8, c9, !c9, !c8, c10, and !c10; the labels (q1), (q2), and (q3) mark the test queries issued so far.]
Illustration of the Idea
The fourth test query
q4 = !c1 ∧ c6 ∧ !c7 ∧ !c11 ∧ !c12
[Figure: the same execution tree; the labels (q1), (q2), (q3), and (q4) mark the four test queries.]
Illustration of the Idea
After the fourth test query
q4 = !c1 ∧ c6 ∧ !c7 ∧ !c11 ∧ !c12
[Figure: execution tree after q4, with the conditions c1 to c12 and their inversions; the labels (q1), (q2), (q3), and (q4) mark the test queries issued so far.]
Advantages of the Proposed Method
Real data, no mock database (which can be hard to generate).
No need to worry about whether a mock database is representative.
Given the large space of possible program paths, we test only those
that can be encountered with real data.
This is especially useful for applications that only read existing
data.
Alternative Method 1: Brute force
Test the program on every tuple in the database.
Too costly:
  Testing resources are limited.
  Many tuples result in the same execution path, so effort is wasted.
It may not be possible to retrieve all tuples:
  Security constraints.
  Query capability constraints (e.g., deep-Web databases).
Alternative Method 2: Sample the existing database
Sample first, then test the program on every tuple in the sample.
A representative database sample may not trigger a set of program
execution paths that is representative of the paths encountered
in production use.
E.g., a column may have 1 million distinct values, of which only a few
particular values trigger certain paths.
Ours can be viewed as a sampling technique that is aware of the
program structure.
Alternative Method 3: Generate custom mock databases
Generate a mock database such that its data will expose a bug in the program.
Will expose potential program bugs.
But users may not care about them.
Because many “bugs” will never occur in practice.
Because the mock database generator typically cannot generate
fully realistic databases.
Alternative Method 4: Static Analysis
Static program analysis is typically:
(+) Fast
(-) Imprecise: misses bugs and gives false alarms
Our approach: Test = execute the program (dynamic analysis)
(+) Fully precise: no false alarms
(-) Resource-hungry, will still miss bugs
Our (dynamic) analysis reasons about program + existing database
contents. We are not aware of any static analysis that does that.
Assumptions/Limitations
Queries
  Single-relation conjunctive selection queries.
  Each conjunct has the form a ⊙ v, where a is an attribute, v is a constant
  value, and ⊙ is one of <, ≤, >, ≥, =, or ≠.
  No grouping, aggregation, join, insertion, deletion, or update.
Programs
  Follow tuple-wise semantics.
  If a branching condition depends on a database tuple, the
  condition can be rewritten in the same form as the query
  conjuncts: a ⊙ v.
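Because every conjunct has the form a ⊙ v, inverting a branching condition only requires swapping the comparison operator. A minimal Python sketch, with the hypothetical names `NEGATED_OP` and `invert`, and with SQL spellings chosen for the operators:

```python
# Each comparison operator maps to the operator of its logical negation.
NEGATED_OP = {
    "<": ">=", "<=": ">",
    ">": "<=", ">=": "<",
    "=": "<>", "<>": "=",
}

def invert(conjunct):
    """Return the negation of an (attribute, operator, constant) conjunct."""
    a, op, v = conjunct
    return (a, NEGATED_OP[op], v)

print(invert(("z", ">", 0)))    # the inverse of c1: z > 0
print(invert(("z", "<", 100)))  # the inverse of c2: z < 100
```

This sketch ignores NULL values: under SQL's three-valued logic a tuple whose attribute is NULL satisfies neither a conjunct nor its inverse.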
Iterative Testing Method
q ← define an initial test query; 𝒬 ← {q}
repeat
    𝒯 ← run q and get the first n_q result tuples
    for each tuple t in 𝒯 do
        run the program over t and update the execution tree tree_𝒬 with newly encountered execution paths
    tree_𝒬^c ← the complement tree of tree_𝒬
    𝒬_c ← get the candidate queries based on tree_𝒬^c
    q ← select a query from 𝒬_c
    𝒬 ← 𝒬 ∪ {q}
until stopping criteria satisfied
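The loop can be tried end to end on a toy example. The sketch below is not the authors' tool: it uses an in-memory SQLite table `r(z)`, a hand-instrumented toy `program` that reports the conditions it followed, a fixed n, and a placeholder rule that simply takes the first candidate. All of these names and choices are assumptions made for illustration.

```python
import sqlite3

NEG = {"<": ">=", "<=": ">", ">": "<=", ">=": "<"}

def negate(conjunct):
    a, op, v = conjunct
    return (a, NEG[op], v)

def program(z):
    """Toy program under test; returns the branching conditions it followed."""
    path = []
    def branch(op, v):
        taken = {"<": z < v, "<=": z <= v, ">": z > v, ">=": z >= v}[op]
        path.append(("z", op, v) if taken else ("z", NEG[op], v))
        return taken
    if branch(">", 0):
        branch("<", 100)
    else:
        branch("<", -50)
    return tuple(path)

def run_tests(conn, initial_query, n=2, max_queries=10):
    covered, issued, q = set(), [], initial_query
    while q is not None and len(issued) < max_queries:
        issued.append(q)
        where = " AND ".join(f"{a} {op} {v}" for a, op, v in q)
        rows = conn.execute(f"SELECT z FROM r WHERE {where} LIMIT {n}").fetchall()
        for (z,) in rows:
            covered.add(program(z))  # update the set of covered paths
        prefixes = {p[:i + 1] for p in covered for i in range(len(p))}
        # Complement tree: every covered prefix with its last condition inverted.
        candidates = [p[:i] + (negate(p[i]),)
                      for p in sorted(covered) for i in range(len(p))]
        candidates = [c for c in candidates
                      if c not in prefixes and c not in issued]
        q = candidates[0] if candidates else None  # placeholder selection rule
    return covered, issued

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE r (z INTEGER)")
conn.executemany("INSERT INTO r VALUES (?)", [(5,), (50,), (150,), (-10,), (-70,)])
covered, issued = run_tests(conn, (("z", ">", 0), ("z", "<", 100)))
print(len(issued), "queries covered", len(covered), "paths")
```

With the five sample values, the loop issues three queries and covers all four paths of the toy program; the loop ends because no uncovered candidate remains.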
Challenges
How to
decide how many tuples to retrieve for a query?
choose the next test query?
design stopping condition for testing?
Optimization Goals
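Given program $\mathcal{P}$ and a set of test queries $\mathcal{Q} = \{q_i\}$
maximize coverage
$\mathrm{Path}(\mathcal{P}, \mathcal{R}, \mathcal{Q}) = \{\mathrm{Path}_t \mid t \in \bigcup_i \mathcal{T}_i\}$, where $\mathcal{T}_i$ is the first $n_i$ tuples for query $q_i$.
minimize cost
$\mathrm{cost}(\mathcal{Q}) = \sum_i \mathrm{cost}(q_i)$
$\mathrm{cost}(q_i) = \mathrm{q\_cost}(q_i) + \mathrm{t\_cost}(q_i) = w + c \times n_i + t \times n_i$
t_cost: $t$ is the test cost per tuple.
q_cost: $w$ is the query cost to get the first result tuple, $c$ is the query cost to
get each additional tuple.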
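The cost model can be evaluated directly. The following Python sketch transcribes the two formulas; the numeric values for w, c, t, and the n_i are made up for illustration only.

```python
def query_cost(n, w, c, t):
    """cost(q_i) = w + c * n_i + t * n_i, as stated on the slide."""
    return w + c * n + t * n

def total_cost(ns, w, c, t):
    """cost(Q) = sum of cost(q_i) over all test queries."""
    return sum(query_cost(n, w, c, t) for n in ns)

# Hypothetical figures: two queries retrieving 3 and 2 tuples.
ns = [3, 2]
print(query_cost(3, w=10, c=1, t=5))   # 10 + 3 + 15
print(total_cost(ns, w=10, c=1, t=5))
```

With these numbers the first query costs 28 and both queries together cost 50, so the fixed cost w dominates when few tuples are retrieved per query.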
Why only n_i tuples for a query q_i?
Multiple tuples will result in the same program execution path. After a
certain number of initial tuples, most or all distinct paths may have
been encountered.
Fewer retrieved and tested tuples mean both lower testing cost and lower
query execution cost.
How to choose next q and n
Greedy Approach
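Given candidate query q,
$\mathrm{score}(q) = \dfrac{\mathrm{cost}'(q)}{|\mathrm{Path}'(\mathcal{P}, \mathcal{R}, \mathcal{M}, \mathcal{Q} \cup \{q\})| - |\mathrm{Path}(\mathcal{P}, \mathcal{R}, \mathcal{M}, \mathcal{Q})|}$
$|\mathrm{Path}'(\mathcal{P}, \mathcal{R}, \mathcal{M}, \mathcal{Q} \cup \{q\})|$: estimate of $|\mathrm{Path}(\mathcal{P}, \mathcal{R}, \mathcal{M}, \mathcal{Q} \cup \{q\})|$
$\mathrm{cost}'(q)$: estimate of $\mathrm{cost}(q)$
(both are functions of n)
Find the q that minimizes score(q).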
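A small Python sketch of the greedy choice follows. It reads the score as estimated cost per newly covered path, fixes n for every candidate, and uses invented estimates; the candidate names and numbers are illustrative only.

```python
def score(est_cost, est_paths_with_q, paths_now):
    """Estimated cost per newly covered path; infinite if nothing is gained."""
    gain = est_paths_with_q - paths_now
    return float("inf") if gain <= 0 else est_cost / gain

def pick_next(candidates, paths_now):
    """Greedy step: choose the candidate with the smallest score."""
    return min(candidates,
               key=lambda c: score(c["cost"], c["paths"], paths_now))

# Hypothetical estimates for three candidate queries; 4 paths covered so far.
candidates = [
    {"name": "!c1", "cost": 40, "paths": 7},
    {"name": "c1 AND !c2", "cost": 30, "paths": 5},
    {"name": "c1 AND c2 AND !c3", "cost": 20, "paths": 4},
]
best = pick_next(candidates, paths_now=4)
print(best["name"])
```

Here !c1 wins: it is the most expensive candidate, but it is expected to add three paths, which gives the lowest cost per new path. A fuller version would also search over n for each candidate.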
Estimating the Coverage and Cost
Estimating the Coverage
Estimate the query result size of leaf node (query).
The result sizes for intermediate nodes are accumulated.
[Figure: execution tree annotated with estimated result sizes]
c1 (100)
  c2 (20)
    c3 (10)
    !c3 (10)
  !c2 (80)
    c4 (40)
      c5 (30)
      !c5 (10)
    !c4 (40)
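The accumulation step can be sketched in Python as a bottom-up sum. The tree shape below is an assumption: it pairs the sizes from the figure with the nodes so that every parent equals the sum of its children (c1 has children c2 and !c2, c2 has c3 and !c3, !c2 has c4 and !c4, c4 has c5 and !c5). The function name `accumulate` is hypothetical.

```python
def accumulate(node):
    """Leaves carry an estimated result size; an inner node's size is the
    sum of the sizes of its children."""
    if "children" not in node:
        return node["size"]
    node["size"] = sum(accumulate(child) for child in node["children"])
    return node["size"]

# Assumed tree; only the leaf sizes are given as input.
c4 = {"name": "c4", "children": [{"name": "c5", "size": 30},
                                 {"name": "!c5", "size": 10}]}
c2 = {"name": "c2", "children": [{"name": "c3", "size": 10},
                                 {"name": "!c3", "size": 10}]}
not_c2 = {"name": "!c2", "children": [c4, {"name": "!c4", "size": 40}]}
root = {"name": "c1", "children": [c2, not_c2]}

total = accumulate(root)
print(total, c2["size"], not_c2["size"], c4["size"])
```

The accumulated sizes (100 for c1, 20 for c2, 80 for !c2, 40 for c4) match the numbers shown for the intermediate nodes.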
Estimating the Cost
Estimate both the cost of retrieving the initial tuple and the total cost,
e.g., with EXPLAIN (supported by major DBMSs).
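As one example, PostgreSQL's EXPLAIN prints a startup cost and a total cost for each plan node in the form `cost=<startup>..<total>`. The Python sketch below extracts these numbers from a single plan line; the sample line is invented, and the helper name `parse_explain_line` is hypothetical. Other DBMSs use different output formats.

```python
import re

# Matches e.g. "cost=0.00..35.50 rows=850" in a PostgreSQL-style plan line.
COST_RE = re.compile(r"cost=(\d+(?:\.\d+)?)\.\.(\d+(?:\.\d+)?) rows=(\d+)")

def parse_explain_line(line):
    """Return (startup_cost, total_cost, estimated_rows) from one plan line."""
    m = COST_RE.search(line)
    if m is None:
        raise ValueError("no cost estimate found")
    return float(m.group(1)), float(m.group(2)), int(m.group(3))

# Invented sample output line, not taken from a real database.
sample = "Seq Scan on r  (cost=0.00..35.50 rows=850 width=4)"
startup, total, rows = parse_explain_line(sample)
print(startup, total, rows)
```

The startup cost could serve as an estimate for w, and the estimated row count as an estimate of the query result size.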
Stopping Condition for Testing
testing resource limit reached
no more candidate queries
no candidate query can return non-empty result
total number of encountered tuples (associated with distinct
paths) equals the table size
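The four conditions can be combined into a single check, sketched below in Python. The function and parameter names are hypothetical, and the test for empty results relies on estimated result sizes, so it is only as reliable as those estimates.

```python
def should_stop(budget_left, candidates, est_sizes, tuples_accounted, table_size):
    """Return True as soon as any of the four stopping conditions holds."""
    if budget_left <= 0:                      # testing resource limit reached
        return True
    if not candidates:                        # no more candidate queries
        return True
    if all(est_sizes.get(q, 0) == 0 for q in candidates):
        return True                           # every candidate looks empty
    return tuples_accounted >= table_size     # all tuples are accounted for

# Hypothetical state: budget remains, one candidate with a non-empty estimate.
keep_going = should_stop(5, ["!c1"], {"!c1": 12}, 80, 100)
done = should_stop(5, ["!c1"], {"!c1": 12}, 100, 100)
print(keep_going, done)
```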
Implementation
Overview
Fully automated tool
Analyze Java bytecode programs (any Java program, no need for
source code)
Rewrite application bytecode at load-time: after each application
bytecode instruction, insert a call to our dynamic symbolic engine
Use inserted calls to maintain an accurate symbolic
representation of program state
Treat calls to the database (e.g., via JDBC) differently: represent
returned values as symbolic variables and track how the program
uses them, i.e., in path conditions
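The idea of tracking how database values are used in branching conditions can be imitated in a few lines of Python. This is only an analogy: the tool described here instruments Java bytecode, whereas the sketch wraps a value in a hypothetical `SymbolicInt` class that logs each comparison the program performs on it.

```python
class SymbolicInt:
    """Concrete value plus a log of the comparisons the program made on it."""
    def __init__(self, name, value, trace):
        self.name, self.value, self.trace = name, value, trace

    def _record(self, op, neg_op, other, outcome):
        # Store the condition that actually held on this execution.
        self.trace.append((self.name, op if outcome else neg_op, other))
        return outcome

    def __gt__(self, other):
        return self._record(">", "<=", other, self.value > other)

    def __lt__(self, other):
        return self._record("<", ">=", other, self.value < other)

    def __ge__(self, other):
        return self._record(">=", "<", other, self.value >= other)

    def __le__(self, other):
        return self._record("<=", ">", other, self.value <= other)

trace = []
z = SymbolicInt("z", 150, trace)  # stands in for a value read from a result set
if z > 0:          # c1
    if z < 100:    # c2
        pass
print(trace)
```

For z = 150 the recorded path condition is z > 0 followed by z >= 100, that is, c1 ∧ !c2 in the notation of the earlier illustration.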
Implementation
Details
Use Java 5 instrumentation facilities
Use third-party open source bytecode instrumentation framework
ASM
Implement on top of new dynamic symbolic engine Dsc:
Allows handling of regular (non-query) program inputs
Solve constraints on regular program inputs with powerful
third-party satisfiability modulo theories (SMT) constraint solver
Z3
Ongoing and Future Work
Several directions
Finish prototype implementation
Evaluate on realistic applications
Compare with mock-database generation techniques + compare
with traditional database sampling techniques:
Can we achieve higher coverage of the application code that is
reachable with the existing database contents?
How to deal with database insert, update, delete?
Thank you!
Contact
cli@uta.edu, csallner@uta.edu
References
Dynamic Symbolic Execution Systems
Dart: C programs, by Godefroid et al. [PLDI’05]
jCute: Java programs, by Sen et al. [CAV’06]
Klee: C programs, by Cadar et al. [OSDI’08]
Pex: .Net programs (C#, etc.), by Tillmann et al. [TAP’08]
Database application testing via mock database generation
jCute extension: Java programs, by Emmi et al. [ISSTA’07]
Qex (Pex extension): .Net programs (C#, etc.), by Veanes et al. [ICFEM’09]
References
Main tools used by our prototype implementation
ASM: http://asm.ow2.org/
Z3: http://research.microsoft.com/en-us/um/redmond/projects/z3/