KEMBAR78
Comparing 7e794d0a3f...6a1e32d532 · git/git · GitHub
Skip to content
Permalink

Comparing changes

Choose two branches to see what’s changed or to start a new pull request. If you need to, you can also or learn more about diff comparisons.

Open a pull request

Create a new pull request by comparing changes across two branches. If you need to, you can also . Learn more about diff comparisons here.
base repository: git/git
Failed to load repositories. Confirm that selected base ref is valid, then try again.
Loading
base: 7e794d0a3f
Choose a base ref
...
head repository: git/git
Failed to load repositories. Confirm that selected head ref is valid, then try again.
Loading
compare: 6a1e32d532
Choose a head ref
  • 6 commits
  • 9 files changed
  • 1 contributor

Commits on Aug 20, 2018

  1. t/perf: factor boilerplate out of test_perf

    About half of test_perf() is boilerplate preparing to run
    _any_ test, and the other half is specifically running a
    timing test. Let's split it into two functions, so that we
    can reuse the boilerplate in future commits.
    
    Signed-off-by: Jeff King <peff@peff.net>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    peff authored and gitster committed Aug 20, 2018
    Configuration menu
    Copy the full SHA
    968e77a View commit details
    Browse the repository at this point in the history
  2. t/perf: factor out percent calculations

    This will let us reuse the code when we add new values to
    aggregate besides times.
    
    Signed-off-by: Jeff King <peff@peff.net>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    peff authored and gitster committed Aug 20, 2018
    Configuration menu
    Copy the full SHA
    5a924a6 View commit details
    Browse the repository at this point in the history
  3. t/perf: add infrastructure for measuring sizes

    The main objective of scripts in the perf framework is to
    run "test_perf", which measures the time it takes to run
    some operation. However, it can also be interesting to see
    the change in the output size of certain operations.
    
    This patch introduces test_size, which records a single
    numeric output from the test and shows it in the aggregated
    output (with pretty printing and relative size comparison).
    
    Signed-off-by: Jeff King <peff@peff.net>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    peff authored and gitster committed Aug 20, 2018
    Configuration menu
    Copy the full SHA
    22bec79 View commit details
    Browse the repository at this point in the history
  4. t/perf: add perf tests for fetches from a bitmapped server

    A server with bitmapped packs can serve a clone very
    quickly. However, fetches are not necessarily made any
    faster, because we spend a lot less time in object traversal
    (which is what bitmaps help with) and more time finding
    deltas (because we may have to throw out on-disk deltas if
    the client does not have the base).
    
    As a first step to making this faster, this patch introduces
    a new perf script to measure fetches into a repo of various
    ages from a fully-bitmapped server.
    
    We separately measure the work done by the server (in
    pack-objects) and that done by the client (in index-pack).
    Furthermore, we measure the size of the resulting pack.
    
    Breaking it down like this (instead of just doing a regular
    "git fetch") lets us see how much each side benefits from
    any changes. And since we know the pack size, if we estimate
    the network speed, then one could calculate a complete
    wall-clock time for the operation (though the script does
    not do this automatically).
    
    Signed-off-by: Jeff King <peff@peff.net>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    peff authored and gitster committed Aug 20, 2018
    Configuration menu
    Copy the full SHA
    198b349 View commit details
    Browse the repository at this point in the history

Commits on Aug 21, 2018

  1. pack-bitmap: save "have" bitmap from walk

    When we do a bitmap walk, we save the result, which
    represents (WANTs & ~HAVEs); i.e., every object we care
    about visiting in our walk. However, we throw away the
    haves bitmap, which can sometimes be useful, too. Save it
    and provide an access function so code which has performed a
    walk can query it.
    
    A few notes on the accessor interface:
    
     - the bitmap code calls these "haves" because it grew out
       of the want/have negotiation for fetches. But really,
       these are simply the objects that would be flagged
       UNINTERESTING in a regular traversal. Let's use that
       more universal nomenclature for the external module
       interface. We may want to change the internal naming
       inside the bitmap code, but that's outside the scope of
       this patch.
    
     - it still uses a bare "sha1" rather than "oid". That's
       true of all of the bitmap code. And in this particular
       instance, our caller in pack-objects is dealing with the
       bare sha1 that comes from a packed REF_DELTA (we're
       pointing directly to the mmap'd pack on disk). That's
       something we'll have to deal with as we transition to a
       new hash, but we can wait and see how the caller ends up
       being fixed and adjust this interface accordingly.
    
    Signed-off-by: Jeff King <peff@peff.net>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    peff authored and gitster committed Aug 21, 2018
    Configuration menu
    Copy the full SHA
    30cdc33 View commit details
    Browse the repository at this point in the history
  2. pack-objects: reuse on-disk deltas for thin "have" objects

    When we serve a fetch, we pass the "wants" and "haves" from
    the fetch negotiation to pack-objects. That tells us not
    only which objects we need to send, but we also use the
    boundary commits as "preferred bases": their trees and blobs
    are candidates for delta bases, both for reusing on-disk
    deltas and for finding new ones.
    
    However, this misses some opportunities. Modulo some special
    cases like shallow or partial clones, we know that every
    object reachable from the "haves" could be a preferred base.
    We don't use all of them for two reasons:
    
      1. It's expensive to traverse the whole history and
         enumerate all of the objects the other side has.
    
      2. The delta search is expensive, so we want to keep the
         number of candidate bases sane. The boundary commits
         are the most likely to work.
    
    When we have reachability bitmaps, though, reason 1 no
    longer applies. We can efficiently compute the set of
    reachable objects on the other side (and in fact already did
    so as part of the bitmap set-difference to get the list of
    interesting objects). And using this set conveniently
    covers the shallow and partial cases, since we have to
    disable the use of bitmaps for those anyway.
    
    The second reason argues against using these bases in the
    search for new deltas. But there's one case where we can use
    this information for free: when we have an existing on-disk
    delta that we're considering reusing, we can do so if we
    know the other side has the base object. This in fact saves
    time during the delta search, because it's one less delta we
    have to compute.
    
    And that's exactly what this patch does: when we're
    considering whether to reuse an on-disk delta, if bitmaps
    tell us the other side has the object (and we're making a
    thin-pack), then we reuse it.
    
    Here are the results on p5311 using linux.git, which
    simulates a client fetching after `N` days since their last
    fetch:
    
     Test                         origin              HEAD
     --------------------------------------------------------------------------
     5311.3: server   (1 days)    0.27(0.27+0.04)     0.12(0.09+0.03) -55.6%
     5311.4: size     (1 days)               0.9M              237.0K -73.7%
     5311.5: client   (1 days)    0.04(0.05+0.00)     0.10(0.10+0.00) +150.0%
     5311.7: server   (2 days)    0.34(0.42+0.04)     0.13(0.10+0.03) -61.8%
     5311.8: size     (2 days)               1.5M              347.7K -76.5%
     5311.9: client   (2 days)    0.07(0.08+0.00)     0.16(0.15+0.01) +128.6%
     5311.11: server   (4 days)   0.56(0.77+0.08)     0.13(0.10+0.02) -76.8%
     5311.12: size     (4 days)              2.8M              566.6K -79.8%
     5311.13: client   (4 days)   0.13(0.15+0.00)     0.34(0.31+0.02) +161.5%
     5311.15: server   (8 days)   0.97(1.39+0.11)     0.30(0.25+0.05) -69.1%
     5311.16: size     (8 days)              4.3M                1.0M -76.0%
     5311.17: client   (8 days)   0.20(0.22+0.01)     0.53(0.52+0.01) +165.0%
     5311.19: server  (16 days)   1.52(2.51+0.12)     0.30(0.26+0.03) -80.3%
     5311.20: size    (16 days)              8.0M                2.0M -74.5%
     5311.21: client  (16 days)   0.40(0.47+0.03)     1.01(0.98+0.04) +152.5%
     5311.23: server  (32 days)   2.40(4.44+0.20)     0.31(0.26+0.04) -87.1%
     5311.24: size    (32 days)             14.1M                4.1M -70.9%
     5311.25: client  (32 days)   0.70(0.90+0.03)     1.81(1.75+0.06) +158.6%
     5311.27: server  (64 days)   11.76(26.57+0.29)   0.55(0.50+0.08) -95.3%
     5311.28: size    (64 days)             89.4M               47.4M -47.0%
     5311.29: client  (64 days)   5.71(9.31+0.27)     15.20(15.20+0.32) +166.2%
     5311.31: server (128 days)   16.15(36.87+0.40)   0.91(0.82+0.14) -94.4%
     5311.32: size   (128 days)            134.8M              100.4M -25.5%
     5311.33: client (128 days)   9.42(16.86+0.49)    25.34(25.80+0.46) +169.0%
    
    In all cases we save CPU time on the server (sometimes
    significant) and the resulting pack is smaller. We do spend
    more CPU time on the client side, because it has to
    reconstruct more deltas. But that's the right tradeoff to
    make, since clients tend to outnumber servers. It just means
    the thin pack mechanism is doing its job.
    
    From the user's perspective, the end-to-end time of the
    operation will generally be faster. E.g., in the 128-day
    case, we saved 15s on the server at a cost of 16s on the
    client. Since the resulting pack is 34MB smaller, this is a
    net win if the network speed is less than 270Mbit/s. And
    that's actually the worst case. The 64-day case saves just
    over 11s at a cost of just under 11s. So it's a slight win
    at any network speed, and the 40MB saved is pure bonus. That
    trend continues for the smaller fetches.
    
    The implementation itself is mostly straightforward, with
    the new logic going into check_object(). But there are two
    tricky bits.
    
    The first is that check_object() needs access to the
    relevant information (the thin flag and bitmap result). We
    can do this by pushing these into program-lifetime globals.
    
    The second is that the rest of the code assumes that any
    reused delta will point to another "struct object_entry" as
    its base. But of course the case we are interested in here
    is the one where don't have such an entry!
    
    I looked at a number of options that didn't quite work:
    
     - we could use a flag to signal a reused delta, but it's
       not a single bit. We have to actually store the oid of
       the base, which is normally done by pointing to the
       existing object_entry. And we'd have to modify all the
       code which looks at deltas.
    
     - we could add the reused bases to the end of the existing
       object_entry array. While this does create some extra
       work as later stages consider the extra entries, it's
       actually not too bad (we're not sending them, so they
       don't cost much in the delta search, and at most we'd
       have 2*N of them).
    
       But there's a more subtle problem. Adding to the existing
       array means we might need to grow it with realloc, which
       could move the earlier entries around. While many of the
       references to other entries are done by integer index,
       some (including ones on the stack) use pointers, which
       would become invalidated.
    
       This isn't insurmountable, but it would require quite a
       bit of refactoring (and it's hard to know that you've got
       it all, since it may work _most_ of the time and then
       fail subtly based on memory allocation patterns).
    
     - we could allocate a new one-off entry for the base. In
       fact, this is what an earlier version of this patch did.
       However, since the refactoring brought in by ad635e8
       (Merge branch 'nd/pack-objects-pack-struct', 2018-05-23),
       the delta_idx code requires that both entries be in the
       main packing list.
    
    So taking all of those options into account, what I ended up
    with is a separate list of "external bases" that are not
    part of the main packing list. Each delta entry that points
    to an external base has a single-bit flag to do so; we have a
    little breathing room in the bitfield section of
    object_entry.
    
    This lets us limit the change primarily to the oe_delta()
    and oe_set_delta_ext() functions. And as a bonus, most of
    the rest of the code does not consider these dummy entries
    at all, saving both runtime CPU and code complexity.
    
    Signed-off-by: Jeff King <peff@peff.net>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    peff authored and gitster committed Aug 21, 2018
    Configuration menu
    Copy the full SHA
    6a1e32d View commit details
    Browse the repository at this point in the history
Loading