Automated Program Repair, Distinguished lecture at MPI-SWS

Trustworthy Software & Automated Program
Repair
Abhik Roychoudhury
Professor, National University of Singapore
MPI Distinguished Lecture 2019
1

Working in
Program Analysis
and Software
Security (2001-)
Singapore
Cybersecurity
Consortium
(2016)
National Satellite
of Excellence in
Trustworthy
Software
Systems (2019)
2
Comprehensive university with
Science, Arts, Engineering, Medicine, Law, Business, Public Policy, Music, Computing, …
Public university: 30K undergraduate students, 10K+ graduate students overall.
2500+ faculty members overall, 100+ in Computing (two departments: CS and IS).
http://www.nus.edu.sg/about#corporate-information

Snapshot of
the talk
• Search problems in software error detection.
• Fuzzing and Symbolic Execution : Random search and
logical analysis.
• Random Search techniques are becoming more effective.
• [PRELUDE]
• The problem of program repair, as opposed to error
detection.
• Symbolic technique produces higher quality patches
than random or biased random search.
• Novel view of symbolic execution for spec. inference.
• [MAIN PART of the TALK]
3

Trustworthy
SW
• Fuzz Testing
– Feed semi-random inputs to find hangs and crashes
• Continuous fuzzing
– Incrementally find new “problems” in software
• Crash reproduction
– Re-construct a reported crash, crashing input not
included due to privacy
• Reaching nooks and corners
• Localizing reported observable errors
• Patching reported errors from input-output examples
Space of Problems
(Search Problems?)
4

Trustworthy
SW
Search
Problems
• Random Search
– Less systematic
– Easy set-up, execute up to a time budget
– Use objective function to steer search.
• Symbolic Execution
– Systematic
– More involved set-up, solver calls.
– Use logical formula to steer search.
5

Use of Random
Search -
Fuzzing
Input: Seed Inputs S
T✗ = ∅
T = S
if T = ∅ then
    add empty file to T
end if
repeat
    t = chooseNext(T)
    p = assignEnergy(t)
    for i from 1 to p do
        t' = mutate_input(t)
        if t' crashes then
            add t' to T✗
        else if isInteresting(t') then
            add t' to T
        end if
    end for
until timeout reached or abort-signal
Output: Crashing Inputs T✗
6

Intuition
if (condition1)
    return // short path, frequented by many, many inputs
else if (condition2)
    exit // short path, frequented by many inputs
else ….
[CCS16, and its adoption]
7

Results
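$$p(i) = \begin{cases} 0 & \text{if } f(i) > \mu \\ \min\left(\dfrac{\alpha(i)}{\beta} \cdot 2^{s(i)},\ M\right) & \text{otherwise} \end{cases}$$
α(i): energy that the fuzzer's default schedule assigns to the input exercising path i
β: a constant
s(i): number of times the input exercising path i has been chosen for fuzzing
f(i): number of generated inputs (fuzz) exercising path i (path frequency)
µ: mean number of fuzz exercising a discovered path (average path frequency)
M: maximum energy expendable on a state
Integrated into the main line of the AFL fuzzer within a year of
publication (CCS16).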
8
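The power schedule on this slide can be sketched in C as follows. This is a minimal sketch under stated assumptions, not code from AFL or AFLFast: the function name `assign_energy` and all parameter names are hypothetical, and `alpha` stands for whatever energy the baseline schedule would have assigned.

```c
#include <stdint.h>

/* Sketch of the slide's schedule (names are illustrative assumptions):
 *   p(i) = 0                                 if f(i) > mu
 *   p(i) = min((alpha / beta) * 2^s(i), M)   otherwise
 */
uint64_t assign_energy(uint64_t alpha, uint64_t beta, unsigned s,
                       uint64_t f, double mu, uint64_t max_energy)
{
    if ((double)f > mu)
        return 0;                 /* path is exercised more than average: skip */

    double e = (double)alpha / (double)beta;   /* beta is assumed non-zero */

    /* Double the energy s times, stopping early once the cap M is reached. */
    for (unsigned k = 0; k < s && e > 0.0 && e < (double)max_energy; k++)
        e *= 2.0;

    return e >= (double)max_energy ? max_energy : (uint64_t)e;
}
```

With `alpha = 100` and `beta = 10`, an input on a low-frequency path starts at energy 10 and doubles each time it is picked again until it hits the cap, while any input on an above-average-frequency path gets energy 0.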

SEARCH(A, L, U, X, found, j){
    int j, found = 0;
    while (L <= U && found == 0){
        j = (L+U)/2;
        if (X == A[j]){ found = 1; }
        else if (X < A[j]){ U = j - 1; }
        else { L = j + 1; }
    }
    if (found == 0){ j = L - 1; }
}
SEARCH(A, 1, 5, 20, found, j)
SEARCH(A, 1, 5, X, found, j)
SEARCH(A, N, N+4, X, found, j)
SEARCH(A, 1, M, X, found, j)
Testing ?
Comprehension??
Verification ???
Blurring the lines
“Program testing and
program proving can be
considered as extreme
alternatives. ….
This paper describes a
practical approach between
these two extremes …
Each symbolic execution
result may be equivalent to a
large number of normal tests”
9

Symbolic
Execution
int test_me(int Climb, int Up){
    int sep, upward;
    if (Climb > 0){
        sep = Up;
    } else {
        sep = add100(Up);
    }
    if (sep > 150){
        upward = 1;
    } else {
        upward = 0;
    }
    if (upward < 0){
        abort();
    } else return upward;
}
10
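Treating `Climb` and `Up` as symbols, `test_me` has four feasible paths, and the `abort` branch is infeasible on all of them because `upward` is always 0 or 1. The sketch below writes those hand-derived path conditions next to a runnable copy of the function. It assumes that `add100` simply adds 100 (the slide does not define it) and that `Up + 100` does not overflow; `path_of` is a hypothetical helper introduced only for illustration.

```c
#include <stdlib.h>

/* Assumed helper: the slide calls add100 without defining it. */
static int add100(int v) { return v + 100; }

int test_me(int Climb, int Up)
{
    int sep, upward;
    if (Climb > 0) sep = Up; else sep = add100(Up);
    if (sep > 150) upward = 1; else upward = 0;
    if (upward < 0) abort();          /* infeasible: upward is 0 or 1 */
    return upward;
}

/* Index of the hand-derived path condition that concrete inputs satisfy. */
int path_of(int Climb, int Up)
{
    if (Climb > 0 && Up > 150)         return 0;  /* result 1 */
    if (Climb > 0 && Up <= 150)        return 1;  /* result 0 */
    if (Climb <= 0 && Up + 100 > 150)  return 2;  /* result 1 */
    return 3;              /* Climb <= 0 && Up + 100 <= 150, result 0 */
}
```

One concrete input per path condition is enough to cover every feasible path, which is the sense in which one symbolic result stands for many ordinary tests.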

Symbolic Execution
int test_me(int Climb, int Up){
    int sep, upward;
    if (Climb > 0){
        sep = Up;
    } else {
        sep = add100(Up);
    }
    if (sep > 150){
        upward = 1;
    } else {
        upward = 0;
    }
    if (upward < 0){
        abort();
    } else return upward;
}
11
Execute IF(r): FORK
[provided r is unresolved]
Then: PC := PC ∧ r
Else: PC := PC ∧ ¬r
Execute IF(r)
Resolved branch condition r
using concrete values
Suppose true, PC := PC ∧ r, OR
Suppose false, PC := PC ∧ ¬r

Fuzzing vs. Symbolic Execution
Bug Finding
- Symbolic execution tree construction e.g. KLEE
[Modeling system environment]
- Grey-box fuzz testing for systematic path exploration
inspired by concolic execution
AFLFast
12

Fuzzing vs. Symbolic Execution
Reachability Analysis
Reachability of a location in the program
- Traverse the symbolic execution tree using search
strategies e.g. Hercules
- Encode it as an optimization
problem inside the genetic search
of grey-box fuzzing AFLGo
[CCS17]
13

Fuzzing vs
Symbolic
Execution
φ1 = (x>y)∧(x+y>10)
φ2 = ¬(x>y)∧(x+y>10)
Directed fuzzing as an optimization problem!
1. Instrumentation Time:
• Instrument program to aggregate distance values.
2. Runtime, for each input
• decide how long to be fuzzed based on distance.
• If input is closer to the targets, it is fuzzed for longer.
14
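The rule "closer to the targets means fuzzed for longer" can be illustrated with a small C sketch. This is a deliberately simplified stand-in: AFLGo's actual schedule is based on simulated annealing, whereas the code below just interpolates linearly between a minimum and a maximum energy. The names `normalized_distance` and `directed_energy`, and the energy bounds, are assumptions made for illustration.

```c
/* Map a seed's aggregated distance into [0, 1], given the smallest and
 * largest distances observed so far (assumed to be tracked by the fuzzer). */
double normalized_distance(double d, double d_min, double d_max)
{
    if (d_max <= d_min)
        return 0.0;                       /* all seeds equally distant */
    double n = (d - d_min) / (d_max - d_min);
    if (n < 0.0) n = 0.0;
    if (n > 1.0) n = 1.0;
    return n;
}

/* Linear interpolation (a simplification, not AFLGo's annealing schedule):
 * distance d_min gets e_max mutations, distance d_max gets e_min. */
unsigned directed_energy(double d, double d_min, double d_max,
                         unsigned e_min, unsigned e_max)
{
    double closeness = 1.0 - normalized_distance(d, d_min, d_max);
    return e_min + (unsigned)(closeness * (double)(e_max - e_min) + 0.5);
}
```

A seed at the minimum observed distance receives the full budget, and the budget shrinks monotonically as the seed's distance to the targets grows.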

Digression: Fuzzing
vs Symbolic
Execution
Neuro-symbolic execution
[NDSS19]
15
(Ack: figure from P Saxena and co-authors)

Trustworthy
SW
Search
Problems
• Random Search
– Less systematic
– Easy set-up, execute up to a time budget
– Use objective function to steer search.
– Enhance the effectiveness of search, with symbolic
execution as inspiration
• Symbolic Execution
– More involved set-up, solver calls.
– Use logical formula to steer search.
• Novel view of symbolic execution for spec. inference
– Beyond error detection, self-healing
16

Beyond Error Detection
In the absence of formal specifications, analyze the
buggy program and its artifacts such as execution
traces via various heuristics to glean a specification
about how it can pass tests and what could have gone
wrong!
Specification Inference
(application: self-healing)
17
Buggy
Program
Tests

Program
Repair
REPLACE THIS FLOW
Buggy
Program
Tests
18

Repair: Why?
Education
Productivity
Security
19

Search
Applicability
Scalability
Over-fitting
Large program?
Large search space?
20
(Ack: figure from C Le Goues)

Over-fitting
Tests with
oracles
Buggy
Program
Symbolic
Formulae
Program
Repair
Patched
Program
21

Example
Test id    a    b    c   oracle        Pass
1         -1   -1   -1   INVALID       yes
2          1    1    1   EQUILATERAL   yes
3          2    2    3   ISOSCELES     yes
4          2    3    2   ISOSCELES     yes
5          3    2    2   ISOSCELES     no
6          2    3    4   SCALENE       no
1 int triangle(int a, int b, int c){
2 if (a <= 0 || b <= 0 || c <= 0)
3 return INVALID;
4 if (a == b && b == c)
5 return EQUILATERAL;
6 if (a == b || b != c) // bug!
7 return ISOSCELES;
8 return SCALENE;
9 }
Correct fix
(a == b || b == c || a == c)
Traverse all mutations of line 6?
Hard to generate the fix, since (a == c) or (c == a) never
appears anywhere else in the program!
22
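The example can be replayed directly. The sketch below runs the six tests from the table against the buggy condition and against the correct fix; the harness names (`count_failures`, `suite`, `TriangleFn`) are assumptions added for illustration, while the two conditions are taken from the slide.

```c
#include <stddef.h>

enum Kind { INVALID, EQUILATERAL, ISOSCELES, SCALENE };
typedef enum Kind (*TriangleFn)(int, int, int);

enum Kind triangle_buggy(int a, int b, int c)
{
    if (a <= 0 || b <= 0 || c <= 0) return INVALID;
    if (a == b && b == c)           return EQUILATERAL;
    if (a == b || b != c)           return ISOSCELES;   /* bug from the slide */
    return SCALENE;
}

enum Kind triangle_fixed(int a, int b, int c)
{
    if (a <= 0 || b <= 0 || c <= 0)   return INVALID;
    if (a == b && b == c)             return EQUILATERAL;
    if (a == b || b == c || a == c)   return ISOSCELES; /* correct fix */
    return SCALENE;
}

struct Case { int a, b, c; enum Kind oracle; };

/* The six tests from the slide's table. */
static const struct Case suite[] = {
    {-1, -1, -1, INVALID},   {1, 1, 1, EQUILATERAL},
    { 2,  2,  3, ISOSCELES}, {2, 3, 2, ISOSCELES},
    { 3,  2,  2, ISOSCELES}, {2, 3, 4, SCALENE},
};

/* Number of tests whose observed output differs from the oracle. */
int count_failures(TriangleFn fn)
{
    int failures = 0;
    for (size_t k = 0; k < sizeof suite / sizeof suite[0]; k++)
        if (fn(suite[k].a, suite[k].b, suite[k].c) != suite[k].oracle)
            failures++;
    return failures;
}
```

Calling `count_failures(triangle_buggy)` reports the two failing tests (5 and 6), and `count_failures(triangle_fixed)` reports none.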

Example
Test id    a    b    c   oracle        Pass
1         -1   -1   -1   INVALID       yes
2          1    1    1   EQUILATERAL   yes
3          2    2    3   ISOSCELES     yes
4          2    3    2   ISOSCELES     yes
5          3    2    2   ISOSCELES     no
6          2    3    4   SCALENE       no
1 int triangle(int a, int b, int c){
2 if (a <= 0 || b <= 0 || c <= 0)
3 return INVALID;
4 if (a == b && b == c)
5 return EQUILATERAL;
6 if (a == b || b != c) // bug!
7 return ISOSCELES;
8 return SCALENE;
9 }
Correct fix
(a == b || b == c || a == c)
Automatically generate the constraint
f(2,2,3) ∧ f(2,3,2) ∧ f(3,2,2) ∧ ¬f(2,3,4)
Solution
f(a,b,c) = (a == b || b == c || a == c)
23
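Once the constraint on f is known, finding f becomes a small synthesis problem. The sketch below is one possible way to solve it, assuming a restricted candidate space: disjunctions over the three equality atoms `a == b`, `b == c`, `a == c`, encoded as a bit mask. The encoding and the function names are assumptions for illustration; the constraint itself is the one on the slide.

```c
/* Each bit selects one atom of the candidate disjunction. */
enum { EQ_AB = 1, EQ_BC = 2, EQ_AC = 4 };

static int eval_candidate(unsigned mask, int a, int b, int c)
{
    return ((mask & EQ_AB) && a == b) ||
           ((mask & EQ_BC) && b == c) ||
           ((mask & EQ_AC) && a == c);
}

/* Repair constraint from the slide:
 * f(2,2,3) && f(2,3,2) && f(3,2,2) && !f(2,3,4) */
static int satisfies_constraint(unsigned mask)
{
    return  eval_candidate(mask, 2, 2, 3) &&
            eval_candidate(mask, 2, 3, 2) &&
            eval_candidate(mask, 3, 2, 2) &&
           !eval_candidate(mask, 2, 3, 4);
}

/* Returns the first satisfying mask, or 0 if no candidate satisfies it. */
unsigned synthesize_condition(void)
{
    for (unsigned mask = 1; mask <= 7; mask++)
        if (satisfies_constraint(mask))
            return mask;
    return 0;
}
```

In this candidate space only the full disjunction survives, because each of the three ISOSCELES tests makes exactly one atom true; that is how the constraint forces `a == c` into the patch even though it appears nowhere in the program.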

Comparison
Syntax-based Schematic
1. Where to fix, which line?
2. Generate patches in the candidate line.
3. Validate the candidate patches against the correctness criterion.
for e in Search-space {
    Validate e against Tests
}
Semantics-based Schematic
1. Where to fix, which line(s)?
2. What values should be returned by those lines, e.g. <inp == 1, ret == 0>?
3. What are the expressions which will return such values?
for t in Tests {
    generate repair constraint Ψt
}
Synthesize e from ∧t Ψt
24

Specification Inference [ICSE13]
[Figure: pipeline with the labels Program, Test input t, Concrete Execution (concrete values), Symbolic execution of the statement var = f(live_vars) // X, and Oracle (expected output). Output: Value-set or Constraint.]
25

Example
inhibit   up_sep   down_sep   Observed o/p   Oracle   Pass
1            0       100      0              0        yes
1           11       110      0              1        no
0          100        50      1              1        yes
1          -20        60      0              1        no
0            0        10      0              0        yes
int is_upward(int inhibit, int up_sep, int down_sep){
    int bias;
    if (inhibit)
        bias = down_sep; // bias = up_sep + 100
    else bias = up_sep;
    if (bias > down_sep)
        return 1;
    else return 0;
}
26

Debugging
• Given a test suite T
– fail(s) ≡ # of failing executions in which s occurs
– pass(s) ≡ # of passing executions in which s occurs
– allfail ≡ total # of failing executions
– allpass ≡ total # of passing executions
• allfail + allpass = |T|
• Can also use other metrics like Ochiai.
$$\mathrm{Score}(s) = \frac{\dfrac{\mathrm{fail}(s)}{\mathrm{allfail}}}{\dfrac{\mathrm{fail}(s)}{\mathrm{allfail}} + \dfrac{\mathrm{pass}(s)}{\mathrm{allpass}}}$$
[Figure: flowchart with the boxes Buggy Program, Test Suite, "Investigate what this statement should be; generate a fixed statement", a YES/NO decision, and Fixed Program.]
27
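The suspiciousness score on this slide is a direct computation over four counters. The sketch below assumes a hypothetical function name `suspiciousness` and assumes a score of 0 when a statement is never executed or a denominator would be zero; the slide does not specify those corner cases.

```c
/* Score(s) = (fail(s)/allfail) / (fail(s)/allfail + pass(s)/allpass).
 * Zero-denominator handling is an assumption, not stated on the slide. */
double suspiciousness(unsigned fail_s, unsigned pass_s,
                      unsigned allfail, unsigned allpass)
{
    double f = allfail ? (double)fail_s / (double)allfail : 0.0;
    double p = allpass ? (double)pass_s / (double)allpass : 0.0;
    if (f + p == 0.0)
        return 0.0;                 /* statement never executed */
    return f / (f + p);
}
```

For instance, a statement executed by both failing tests and by one of three passing tests scores (2/2) / (2/2 + 1/3) = 0.75, while a statement executed only by failing tests scores 1.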

Example
28

Example
29

Example
• Accumulated constraints
– f(1, 11, 110) > 110 ∧
– f(1, 0, 100) ≤ 100 ∧
– …
• Find an f satisfying this constraint
– By fixing the set of operators appearing in f
• Candidate methods
• Search over the space of expressions
• Program synthesis with a fixed set of operators
– Can also be achieved by second-order constraint solving
• Generated fix
– f(inhibit, up_sep, down_sep) = up_sep + 100
30
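The "search over the space of expressions" option can be sketched as a plain enumeration. The code below assumes a tiny, hypothetical expression space of three templates (a constant, `down_sep + c`, `up_sep + c`) with `c` between -200 and 200, and checks each candidate for `bias` against the five tests of the `is_upward` example. It is an illustrative enumeration, not the synthesis procedure of any particular tool.

```c
#include <stddef.h>

struct Test { int inhibit, up_sep, down_sep, oracle; };

/* The five tests from the is_upward example. */
static const struct Test suite[] = {
    {1,   0, 100, 0}, {1,  11, 110, 1}, {0, 100, 50, 1},
    {1, -20,  60, 1}, {0,   0,  10, 0},
};

/* Hypothetical candidate templates for the right-hand side of line 4. */
enum Template { T_CONST, T_DOWN_PLUS_C, T_UP_PLUS_C, T_COUNT };

static int candidate(enum Template t, int c, int up_sep, int down_sep)
{
    switch (t) {
    case T_CONST:       return c;
    case T_DOWN_PLUS_C: return down_sep + c;
    default:            return up_sep + c;
    }
}

/* Does is_upward, with line 4 replaced by the candidate, pass every test? */
static int passes_all(enum Template t, int c)
{
    for (size_t k = 0; k < sizeof suite / sizeof suite[0]; k++) {
        const struct Test *tc = &suite[k];
        int bias = tc->inhibit
                 ? candidate(t, c, tc->up_sep, tc->down_sep)
                 : tc->up_sep;
        if ((bias > tc->down_sep) != tc->oracle)
            return 0;
    }
    return 1;
}

/* Returns 1 and fills the outputs with the first passing candidate. */
int synthesize_fix(enum Template *t_out, int *c_out)
{
    for (int t = 0; t < T_COUNT; t++)
        for (int c = -200; c <= 200; c++)
            if (passes_all((enum Template)t, c)) {
                *t_out = (enum Template)t;
                *c_out = c;
                return 1;
            }
    return 0;
}
```

Within this space the constraints f(1,0,100) ≤ 100 and f(1,11,110) > 110 rule out every constant and every `down_sep + c`, and pin the remaining template to `up_sep + 100`, matching the generated fix on the slide.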

Repair
Workflow
31

Simplified
Workflow, but
Applicability
Over-fitting
Scalability
[ICSE15] 32

Comparison
Tool                 #Programs   Equivalent   Same Loc.   Diff.
SemFix [ICSE13]      44          17%          46%         6.36
DirectFix [ICSE15]   44          53%          95%         2.31
33

Workflows
Applicability
Over-fitting
Scalability
34

Angelix
35

Repair Constraint
• SemFix work (ICSE 2013)
– Example: for an identified expression e to be fixed
• [ X > 0 ] ∧ f(t) == X for each test t
• DirectFix work (ICSE 2015)
– Whole Program as repair constraint
– Use the principle of minimality to synthesize a minimal patch.
• Angelix work (ICSE 2016)
– Example: for identified expressions e1, e2, … to be fixed
– [ (X == 1) ∨ (X == 2) ∨ (X == 3) ] ∧ f(t) == X for each test t.
– [ (X == 1 ∧ Y == 1) ∨ (X == 2 ∧ Y == 2) ] ∧ f(t) == X ∧ g(t) == Y for each test t.
36

Scalability
Subject     LoC
wireshark   2814K
php         1046K
gzip        491K
gmp         145K
libtiff     77K
Average time: 32 minutes
[Figure: bar chart comparing Angelix, SPR and GenProg on wireshark, php, gzip, gmp, libtiff and overall; axis ticks from 0 to 35.]
37

Patch Quality
38

Experience
of others
• “The core technique in Angelix using symbolic
execution and program synthesis works well”.
• “It can potentially suffer from poor fault localization”.
• “With better fault localization, the patch synthesis
seems hard to improve in effectiveness”
– Can still be improved in terms of efficiency
• [Anecdotal comments only from user groups]
[ICSE16 and its
usage]
39

Specification
Inference
• Two approaches
– Get property of function f via symbolic execution, and
synthesize a function f satisfying these properties.
– Directly solve for function f by building a second-order
symbolic execution engine.
• Allow for existentially quantified second order variables.
• Restrict their interpretation to a language e.g. linear
integer arithmetic
Term = Var | Constant | Term + Term | Term - Term | Constant * Term
• Example SAT
– ρ(0) > 0 ∧ ρ(1) ≤ 0
– Satisfying solution ρ = λx. 1 - x
40
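The example constraint can be solved by brute force when the language of ρ is restricted. The sketch below assumes the one-variable linear fragment ρ(x) = c0 + c1·x with small integer coefficients and prefers the candidate with the smallest |c0| + |c1|. Both the coefficient range and the cost function are assumptions made for illustration; a second-order symbolic execution engine would delegate this search to a constraint solver instead.

```c
#include <stdlib.h>

/* Find rho(x) = c0 + c1*x with rho(0) > 0 and rho(1) <= 0,
 * searching coefficients in [-3, 3] and minimising |c0| + |c1|.
 * Returns 1 if a solution exists in that range. */
int solve_rho(int *c0_out, int *c1_out)
{
    int best = -1;
    for (int c0 = -3; c0 <= 3; c0++) {
        for (int c1 = -3; c1 <= 3; c1++) {
            int rho0 = c0;            /* rho(0) */
            int rho1 = c0 + c1;       /* rho(1) */
            if (!(rho0 > 0 && rho1 <= 0))
                continue;
            int cost = abs(c0) + abs(c1);
            if (best < 0 || cost < best) {
                best = cost;
                *c0_out = c0;
                *c1_out = c1;
            }
        }
    }
    return best >= 0;
}
```

The cheapest solution has c0 = 1 and c1 = -1, that is ρ = λx. 1 - x, the satisfying solution shown on the slide.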

Term = Var | Constant | Term + Term |
Term - Term | Constant * Term
Second order Program Repair
41
Buggy Program:
scanf("%d", &x);
for (i = 0; i < 10; i++){
    int t = ρ(i, x);
    if (t > 0) printf("1");
    else printf("0");
}
Sample Test:
P(5) → "1110000000", expected "1111111000"
Synthesis Specification: ∃ρ. ⋁i πi ∧ output = expected
Solve for ρ directly
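The sample test can be turned into a small search for ρ. The sketch below assumes ρ(i, x) = a·i + b·x + c with integer coefficients in [-3, 3] and picks the candidate with the smallest |a| + |b| + |c|. The coefficient range, the cost function and the function names are assumptions for illustration, and exhaustive enumeration stands in for the constraint solving that a second-order symbolic execution engine would perform.

```c
#include <stdlib.h>
#include <string.h>

/* Output of the slide's loop for input x when rho(i, x) = a*i + b*x + c. */
static void run_program(int a, int b, int c, int x, char out[11])
{
    for (int i = 0; i < 10; i++)
        out[i] = (a * i + b * x + c > 0) ? '1' : '0';
    out[10] = '\0';
}

/* Search for coefficients that make the program print `expected` on x.
 * Returns 1 and fills the outputs with the lowest-cost solution found. */
int repair_rho(int x, const char *expected,
               int *a_out, int *b_out, int *c_out)
{
    int best = -1;
    char out[11];
    for (int a = -3; a <= 3; a++)
        for (int b = -3; b <= 3; b++)
            for (int c = -3; c <= 3; c++) {
                run_program(a, b, c, x, out);
                if (strcmp(out, expected) != 0)
                    continue;
                int cost = abs(a) + abs(b) + abs(c);
                if (best < 0 || cost < best) {
                    best = cost;
                    *a_out = a; *b_out = b; *c_out = c;
                }
            }
    return best >= 0;
}
```

For the sample test, `repair_rho(5, "1111111000", ...)` yields a = -1, b = 1, c = 2, i.e. ρ(i, x) = x - i + 2. Several other coefficient choices also pass this single test, which illustrates why one test alone can over-fit.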

Second order Program Repair
42
Buggy Program:
scanf("%d", &x);
for (i = 0; i < 10; i++){
    int t = ρ(i, x);
    if (t > 0) printf("1");
    else printf("0");
}
Sample Test:
P(5) → "1110000000", expected "1111111000"
Synthesis Specification: ∃ρ. ⋁i πi ∧ output = expected
Solve for ρ directly
[Figure: symbolic execution tree branching on ρ(0,5) > 0, then ρ(1,5) > 0, then ρ(2,5) > 0, with Yes/No edges; one branch is marked UNSAT.]

Encoding for Synthesis
43
… error_severity(1);
return;
}
// r(ent->fts_info, ent->fts_errno, prev_depth)
else if (ent->fts_info == FTS_SLNONE){
if (symlink_loop(ent->fts_accpath))
…

Digression:
Library
Synthesis
44

(Test-based)
Program
Repair
Syntax-based Schematic
for e in Search-space {
    Validate e against Tests
}
Semantic Schematic
for t in Tests {
    generate repair constraint Ψt
}
Synthesize e from ∧t Ψt
45

Middle Road
中道
46

Test-
equivalence
based repair
scanf("%d", &x);
for (i = 0; i < 10; i++)
    if (x - i > 0)
        printf("1");
    else
        printf("0");
Consider all inequalities
αx ± βi [> ≥ = ≠] γ
Sequence of values                 Equivalence class (x = 4)
{T, T, T, T, T, T, T, T, T, T}     {x > 0, x > 1, …}
{T, T, T, T, T, T, T, T, T, F}     {x - i > -5, …}
{T, T, T, T, T, T, T, T, F, T}     EMPTY
{T, T, T, T, T, T, T, T, F, F}     {x - i > -4, …}
{T, T, T, T, T, T, T, F, T, T}     EMPTY
{T, T, T, T, T, T, T, F, T, F}     EMPTY
{T, T, T, T, T, T, T, F, F, T}     EMPTY
…
47
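The grouping in the table can be reproduced by computing, for each candidate condition, the sequence of truth values it produces over the ten loop iterations. The sketch below assumes candidates of the form alpha·x + beta·i > gamma only (the slide also allows ≥, = and ≠) and packs the sequence into a 10-bit signature; two candidates with the same signature are test-equivalent for that input. The function name `signature` is hypothetical.

```c
/* Bit i of the result is 1 when alpha*x + beta*i > gamma holds in
 * iteration i (i = 0..9) of the slide's loop, for the given input x. */
unsigned signature(int alpha, int beta, int gamma, int x)
{
    unsigned sig = 0;
    for (int i = 0; i < 10; i++)
        if (alpha * x + beta * i > gamma)
            sig |= 1u << i;
    return sig;
}
```

With x = 4, both `x > 0` and `x > 1` produce the all-true signature 0x3FF, so they fall into the same class and only one of them needs to be executed; `x - i > -5` yields 0x1FF (false only in the last iteration) and `x - i > -4` yields 0xFF, matching the rows of the table.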

Efficiency
48
[TOSEM18]

Repair with
Fuzzing
49

Fix2Fit
50
Integration of repair into programming environments?
Number of plausible patches that can be reduced if the tests are
empowered with more oracles
[ISSTA19]

Provably
Correct
From Tests?
From Programs?
51

Analyzing
Linux Busybox
52
[ICSE18]

Other
Applications:
Education
Education
Productivity
Security
Intelligent tutoring systems: automated grading and
hint generation via program repair
Detailed study at IIT Kanpur, India [FSE17, and ongoing]
53

Repair in steps
54

Most Relevant Results
Semantic Program Repair Using a Reference Implementation. ICSE 2018.
Angelix: Scalable Multiline Program Patch Synthesis via Symbolic Analysis. ICSE 2016.
DirectFix: Looking for Simple Program Repairs. ICSE 2015.
SemFix: Program Repair via Semantic Analysis. ICSE 2013.
Symbolic execution with second order existential constraints. ESEC-FSE 2018.
Crash-Avoiding Program Repair. ISSTA 2019.
A Feasibility Study of Using Automated Program Repair for Introductory Programming Assignments. ESEC-FSE 2017.
ACKNOWLEDGEMENT: National Cyber Security Research program from NRF Singapore
http://www.comp.nus.edu.sg/~tsunami/
ACKNOWLEDGEMENT: Sergey Mechtaev, Semantic Program Repair, ACM SIGSOFT Outstanding Doctoral Dissertation.
55

Other Results (mostly background technology)
Coverage-based Greybox Fuzzing as Markov Chain. CCS 2016.
Directed Greybox Fuzzing. CCS 2017.
Neuro-Symbolic Execution: Augmenting Symbolic Execution with Neural Constraints. NDSS 2019.
Hercules: Reproducing Crashes in Real-World Application Binaries. ICSE 2015.
ACKNOWLEDGEMENT: National Cyber Security Research program from NRF Singapore
DSO National Laboratories, Singapore.
56

Relevant Projects
in Singapore
https://www.comp.nus.edu.sg/~nsoe-tss/
https://www.comp.nus.edu.sg/~tsunami/
Consortium: 47 companies
57

Summary
Figure taken from:
Automated Program Repair
C. Le Goues, M. Pradel, A. Roychoudhury
Review Article, Communications of the ACM, 2019.
Selectively HIRING at VARIOUS LEVELS:
Post Docs, …
+
Open Positions for
ASST PROFs in NUS CS Department
58
