Demand-Driven Context-Sensitive Alias
         Analysis for Java

                      Dacong (Tony) Yan
                      Guoqing (Harry) Xu
                       Atanas Rountev
                     Ohio State University

     PRESTO: Program Analyses and Software Tools Research Group, Ohio State University
Alias Analysis
    • Many static analysis tools need highly precise and
      efficiently-computed alias information
    • Desirable properties
      – Demand-driven: query “could x and y alias?”
      – Client-driven: client-defined time budget for a query
    • Typical solution: use points-to analysis
      – Compute the objects that x may point to; same for y;
        are there common objects?
    • Goal: answer alias queries precisely and efficiently,
      without computing the complete points-to sets
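
The "typical solution" above can be sketched in a few lines of Java. This is a minimal illustration, not the authors' implementation: the class name `PointsToAliasCheck`, the string-based object names, and the precomputed `pointsTo` map are all assumptions made for the example.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: answers "could x and y alias?" from precomputed points-to sets.
final class PointsToAliasCheck {
    // pointsTo maps each variable name to the abstract objects it may point to.
    static boolean mayAlias(Map<String, Set<String>> pointsTo, String x, String y) {
        // Copy x's set so the caller's map is not modified.
        Set<String> common = new HashSet<>(pointsTo.getOrDefault(x, Set.of()));
        // Keep only the objects that y may also point to.
        common.retainAll(pointsTo.getOrDefault(y, Set.of()));
        return !common.isEmpty();
    }
}
```

The check itself is cheap; the cost lies in computing both points-to sets first, which is the work the demand-driven alias analysis aims to avoid.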

Redundancy in Points-to Analysis

    m(a) {
        b = a;
        x = a.f;
        y = b.f;
    }

    • Query: do x and y alias?
      – With points-to analysis: compute the objects O that x may
        point to, and the same for y
    • x and y are read from field f of a and b, so the query
      reduces to: do a and b alias?
      – With points-to analysis: again compute the objects O that
        a and b may point to
      – Directly: b = a, so a and b alias
    • Result: x and y alias
Our Approach
    • Alias analysis
      – Demand-driven and client-driven
      – Field-sensitive and calling-context-sensitive
      – Does not require complete points-to set computation
      – Better performance through method summaries
    • Symbolic Points-to Graph
      – A specialized program representation that enables the
        demand-driven alias analysis
      – Facilitates computation and use of method summaries



Program Representation
    • Intraprocedural Symbolic Points-To Graph
      – Introduce a symbolic object node s for each
         • formal parameter
         • field read a.fld
         • a call site that returns a reference-typed value
      – Compute intraprocedural points-to relationships

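     m(a) {
       c = new …; // o1
       a.f = c;
       return c.g;
     }

     Graph for m:
       a → sa          (sa: symbolic object for formal parameter a)
       c → o1
       sa –f→ o1       (a.f = c)
       o1 –g→ sg       (sg: symbolic object for the field read c.g)
       ret → sg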
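
One way to hold such a graph in memory is sketched below, assuming string node names and at most one object per variable, which suffices for this example. The class `SymbolicPointsToGraph`, its methods, and the `FieldEdge` record are hypothetical names introduced for illustration, not the representation used by the authors.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, minimal container for an intraprocedural symbolic points-to graph.
final class SymbolicPointsToGraph {
    // A field points-to edge: source object --field--> target object.
    record FieldEdge(String source, String field, String target) {}

    // Simplification: each variable points to a single object node.
    final Map<String, String> varPointsTo = new LinkedHashMap<>();
    final List<FieldEdge> fieldEdges = new ArrayList<>();

    void pointsTo(String variable, String object) {
        varPointsTo.put(variable, object);
    }

    void fieldEdge(String source, String field, String target) {
        fieldEdges.add(new FieldEdge(source, field, target));
    }

    // Graph for: m(a) { c = new ...; a.f = c; return c.g; }
    static SymbolicPointsToGraph forExampleMethod() {
        SymbolicPointsToGraph graph = new SymbolicPointsToGraph();
        graph.pointsTo("a", "sa");        // symbolic object for the formal parameter a
        graph.pointsTo("c", "o1");        // allocation site o1
        graph.fieldEdge("sa", "f", "o1"); // a.f = c
        graph.fieldEdge("o1", "g", "sg"); // field read c.g introduces symbolic object sg
        graph.pointsTo("ret", "sg");      // return c.g
        return graph;
    }
}
```

Calling `SymbolicPointsToGraph.forExampleMethod()` yields three variable edges and two field edges, mirroring the slide's diagram.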
Interprocedural Symbolic Points-To Graph
    • Connect the intraprocedural graphs using entry
      and exit edges

    m(a) {
      c = new …; // o1
      a.f = c;
      return c.g;
    }

    d = new …; // o2
    b = m(d); // call m

    Graph for m:       a → sa, c → o1, sa –f→ o1, o1 –g→ sg, ret → sg
    Graph for caller:  d → o2, b → sm   (sm: symbolic object for the call site)
    entry_m edge:      connects o2 (actual argument) with sa (formal parameter)
    exit_m edge:       connects sg (returned object) with sm (call result)
Alias Analysis Formulation
    • Field-sensitivity and calling-context-sensitivity:
      matched points-to edges and entry/exit edges
    • Traverses the object nodes (including symbolic
      ones) and the edges between them

    Example graph:
      a → o1, b → o2, c → s3, d → s2
      o1 –f– o2,  s3 –f– s2
      o2 –entry1– s1,  s1 –exit1– s2

    • Query: do a and c alias?
      – Equivalent question: may o1 and s3 represent the same
        concrete object?
      – Path from o1 to s3: o1 –f– o2 –entry1– s1 –exit1– s2 –f– s3
      – Edge labels along the path: f, entry1, exit1, f
        (the two f edges match, and entry1 matches exit1)
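
The matching of labels such as f, entry1, exit1, f can be illustrated with a balanced-label check. The sketch below is a simplification and not the analysis itself: it assumes every label must be fully matched, that the second f on a path closes the first, and it uses hypothetical names (`MatchedPathCheck`, `Label`, `Kind`). It keeps separate stacks for fields and call sites so the two kinds of matching stay independent.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// Hypothetical, simplified check that a path's edge labels are properly matched.
final class MatchedPathCheck {
    enum Kind { FIELD_OPEN, FIELD_CLOSE, ENTRY, EXIT }

    // name is the field name for field labels, or the call-site id for entry/exit labels.
    record Label(Kind kind, String name) {}

    static boolean isMatched(List<Label> path) {
        Deque<String> fields = new ArrayDeque<>(); // field edges awaiting a match
        Deque<String> calls = new ArrayDeque<>();  // entry edges awaiting their exit
        for (Label label : path) {
            switch (label.kind()) {
                case FIELD_OPEN -> fields.push(label.name());
                case ENTRY -> calls.push(label.name());
                case FIELD_CLOSE -> {
                    // Must close the most recently opened field edge.
                    if (fields.isEmpty() || !fields.pop().equals(label.name())) return false;
                }
                case EXIT -> {
                    // Must return to the call site that was most recently entered.
                    if (calls.isEmpty() || !calls.pop().equals(label.name())) return false;
                }
            }
        }
        return fields.isEmpty() && calls.isEmpty();
    }
}
```

Under these assumptions the example path is accepted, while a path that enters through call site 1 and exits through call site 2 is rejected as an unrealizable calling context.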
Using Method Summaries
    • Several reachability strings for a selected method
      – Each string is a sequence of field points-to edges
        between boundary objects of the method

      m(a) {
        c = new …; // o1
        a.f = c;
        return c.g;
      }

      Graph for m:  a → sa, c → o1, sa –f→ o1, o1 –g→ sg, ret → sg
      summary(m):   sa –(f, g)→ sg

    • Selecting methods for summarization
      – Compute a summary for a method only if it is invoked
        from many different call sites
      – Summarization ratio for method m
          • the number of incoming call graph edges of m,
            divided by the average number of incoming call
            graph edges for all methods
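
The summarization ratio defined above is simple arithmetic over the call graph. A possible sketch follows, with the caveat that the class `SummarySelection`, the map-based input, and the choice of `>=` when comparing against a threshold are assumptions; the slides define only the ratio itself.

```java
import java.util.Map;

// Hypothetical helper for choosing which methods to summarize.
final class SummarySelection {
    // incomingEdges maps each method to its number of incoming call graph edges.
    static double summarizationRatio(Map<String, Integer> incomingEdges, String method) {
        double average = incomingEdges.values().stream()
                .mapToInt(Integer::intValue)
                .average()
                .orElse(0.0);
        if (average == 0.0) {
            return 0.0; // no call edges at all, so there is nothing to compare against
        }
        return incomingEdges.getOrDefault(method, 0) / average;
    }

    // Assumption: a method is selected when its ratio reaches the threshold.
    static boolean shouldSummarize(Map<String, Integer> incomingEdges,
                                   String method, double threshold) {
        return summarizationRatio(incomingEdges, method) >= threshold;
    }
}
```

For instance, with incoming edge counts of 6, 2, and 1 for three methods, the average is 3, so the first method has ratio 2.0 and the second has ratio about 0.67.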
Experimental Evaluation
    • 19 Java programs
      – Number of methods in the whole-program call graph
        ranging from 2344 to 8789
    • Experiment 1: compare precision with a field- and
      context-sensitive, demand-driven, client-driven
      points-to analysis [PLDI’06]
      – Under the same time budget per alias query
      – Result 1: for 14 programs, the number of alias pairs is
        lower when using our analysis
      – Result 2: summarization leads to better precision
    • Experiment 2: compare running times with and
      without method summaries
Running Time Reduction When Using Method Summaries

     $\text{Running Time Reduction} = (RT_{\text{no-summ}} - RT_{\text{summ}}) / RT_{\text{no-summ}}$

          24% average reduction with threshold T=2
Conclusions
 • A demand-driven alias analysis
     – Answers alias queries directly, without computing the
       complete points-to sets
     – Selects methods for online summarization to reduce
       analysis running time
     – Outperforms a highly precise state-of-the-art points-to
       analysis




Thank you


