KEMBAR78
Optimize `any::swap` by achabense · Pull Request #5710 · microsoft/STL · GitHub
Skip to content

Conversation

achabense
Copy link
Contributor

@achabense achabense commented Sep 3, 2025

This pr removes an unnecessary copy in any::swap.

Hopefully fixes #5698.

Benchmark result:

------------------------------------------------------
Benchmark            Time             CPU   Iterations
------------------------------------------------------
bm<trivial>       14.5 ns         14.8 ns     49777778
bm<small>         21.1 ns         21.0 ns     32000000
bm<large>         15.2 ns         15.0 ns     44800000

->
------------------------------------------------------
Benchmark            Time             CPU   Iterations
------------------------------------------------------
bm<trivial>       10.1 ns         9.52 ns     64000000
bm<small>         15.9 ns         15.3 ns     40727273
bm<large>         10.6 ns         10.7 ns     64000000

@achabense achabense requested a review from a team as a code owner September 3, 2025 16:27
@github-project-automation github-project-automation bot moved this to Initial Review in STL Code Reviews Sep 3, 2025
@StephanTLavavej StephanTLavavej added the performance Must go faster label Sep 8, 2025
@StephanTLavavej StephanTLavavej self-assigned this Sep 8, 2025
@StephanTLavavej
Copy link
Member

Thanks! 😻 I pushed a change to further simplify and optimize any::swap().

Here are benchmarks comparing main, your change (v2), and my change (v3). The speedups I measured on my 5950X for v2 were similar to what you measured (1.44, 1.33, 1.43).

Benchmark main v2 v3 v2 Speedup v3 Speedup
bm<trivial> 13.6 ns 8.74 ns 6.19 ns 1.56 2.20
bm<small> 18.6 ns 13.5 ns 9.41 ns 1.38 1.98
bm<large> 9.65 ns 7.89 ns 4.27 ns 1.22 2.26

I believe that v3 passes our test coverage and handles self-swap scenarios equally well.

@StephanTLavavej StephanTLavavej removed their assignment Oct 3, 2025
@StephanTLavavej StephanTLavavej moved this from Initial Review to Ready To Merge in STL Code Reviews Oct 3, 2025
@StephanTLavavej StephanTLavavej moved this from Ready To Merge to Merging in STL Code Reviews Oct 3, 2025
@StephanTLavavej
Copy link
Member

I'm mirroring this to the MSVC-internal repo - please notify me if any further changes are pushed.

@StephanTLavavej StephanTLavavej merged commit b7c03bb into microsoft:main Oct 4, 2025
39 checks passed
@github-project-automation github-project-automation bot moved this from Merging to Done in STL Code Reviews Oct 4, 2025
@StephanTLavavej
Copy link
Member

Thanks for optimizing and benchmarking this! ⏱️ 🏎️ 🎉

@achabense
Copy link
Contributor Author

Initially, I didn't dare to remove that additional copy in _Assign in v2, as I thought that might change the behavior for some (cursed) self-swaps (swapping with internal any; see a.swap(b) and b.swap(a) below). Since the copy is removed finally (v3), this becomes more disturbing so I rechecked what really happens in v2, v3 and the original version (before this pr).

Summary:

  • No version (even the original version) can actually deal with those swaps. (And the net effect of v2, v3 and the original version are the same.)
  • It's because for such cases, the "correct behavior" doesn't exist at all, i.e. these swaps cannot be well defined.
  • So v3 is correct and optimal for the current implementation.

(I'm only talking about cursed swaps here. All these versions are supposed to handle normal swaps correctly (including normal self-swaps).)


Suppose any a contains any b, which contains auto c.

  • a is "large", as it contains another any. (large ~ contains only an owning pointer, and when moved-from the pointer gets exchanged.)
  • b may contain whatever possible.
  • (If a contains b via unique_ptr<any> instead, it's "small" but also exchanged when moved-from, so the following analysis applies too.)

v2 impl (equivalent version):

any _Old_this = _STD move(*this);
{
    any _Old_that = _STD move(_That);
    reset();
    _Move_from(_Old_that);
}
_That.reset();
_That._Move_from(_Old_this);

Effects:

// If `a.swap(b)` happens first:
any old_a = move(a);     // old_a contains (owns) &b, a is moved-from (exchanged / emptied).
{
    any old_b = move(b); // contains move(c), b is moved-from.
    a.reset();           // noop.
    a._Move_from(old_b); // a contains move(move(c))
}
b.reset();               // b is emptied.
b._Move_from(old_a);     // b owns &b. (!!)

// If `b.swap(a)` happens first:
any old_b = move(b);     // contains move(c), b is moved-from.
{
    any old_a = move(a); // owns &b, a is emptied.
    b.reset();           // b is emptied.
    b._Move_from(old_a); // b owns &b. (!!)
}
a.reset();               // noop.
a._Move_from(old_b);     // a contains move(move(c)).

v3 impl:

any _Old = _STD move(*this);
reset();
_Move_from(_That);
_That.reset();
_That._Move_from(_Old);

Effects:

// If `a.swap(b)` happens first:
any old_a = move(a); // old_a owns &b, a is emptied.
a.reset();           // noop.
a._Move_from(b);     // a contains move(c), b is moved-from.
b.reset();           // b is emptied.
b._Move_from(old_a); // b owns &b. (!!)

// If `b.swap(a)` happens first:
any old_b = move(b); // old_b contains move(c), b is moved-from.
b.reset();           // b is emptied.
b._Move_from(a);     // b owns &b, a is emptied. (!!)
a.reset();           // noop.
a._Move_from(old_b); // b contains move(c).

In all these cases (either v2 or v3, either a.swap(b) or b.swap(a)), a contains move(c) (or move(move(c))), but b "owns" itself finally (lines containing (!!) comment). The swap will not crash immediately as after the swap, no entity is responsible for destroying b, so its dtor is never called.

This seems a serious problem, but it turns out that, at least this pr doesn't make anything worse - the original exchange-based implementation already have the same problem.

Take the following examples for example. On my visual studio (using the original exchange-based version), the first example compiles to a program that quickly reaches several GBs. (Which suggests the same leak pattern.) The second program tries to destroy b and crash due to stack-overflow. (Which suggests the same self-owning pattern.)

// 1
#include<any>
int main()
{
    for (;;) {
        std::any a = std::make_any<std::any>(42);
        std::any& b = *std::any_cast<std::any>(&a);
        a.swap(b);
    }
}

// 2
#include<any>
int main()
{
    std::any a = std::make_any<std::any>(42);
    std::any& b = *std::any_cast<std::any>(&a);
    a.swap(b);
    b.~any();
}

Initially, I thought this may be due to the implementation chooses to hold data in-place for trivial/small types. However, it turns out even if std::any always hold data via a pointer p, the same problem happens. For pointer-only implementation, the only reasonable way to implement any::swap is to swap their pointer:

std::swap(a.p, b.p);
// (and swap type info)

Note that after this, the same issue happens - now b "owns" itself.


I believe these cursed swaps cannot be well-defined.

Conceptually, a must be implicitly owned by something else, either on stack, allocated or inside some other classes, so the actual case is:

a's unknown owner (can be stack) contains any a, which contains any b, which contains auto c.

If the swap must be defined for a and b, the only meaningful effect (I can imagine) is to transfer b's ownership to a's initial owner, and let b contain a instead (and a contain c). But these are impossible to implement within any::swap. (Not generally implementable in C++.)

As to the current C++ standard, it only says the effect of any::swap being "exchanging the state". (Ref.). (Which doesn't look quite helpful; I'm not familiar with the standard, so maybe some rules listed elsewhere already prohibit these swaps?)

void swap(any& rhs) noexcept;
Effects: Exchanges the states of *this and rhs.

(Copy and move assignments are not cursed likewise as they are explicitly defined in terms of creating a temporary then swap.)

@StephanTLavavej
Copy link
Member

Thanks for the careful analysis!

@achabense achabense deleted the Any-swap-and-bench branch October 22, 2025 07:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance Must go faster

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

Questions about the precondition of std::any's assignment operators

2 participants