It's already been a whole decade since the inception of what would soon become cpp-sort, and it had accumulated a number of deprecated components over the years; it was high time to blow all that dust off with a breaking change. It was also a good opportunity to start requiring C++17: it's indeed been a very long time since I last had my hands on a compiler that did not support most C++17 features (even in continuous integration, which made guaranteeing support for older compilers more difficult). That change allows to get rid of a lot of internal polyfills—naturally leading to reduced maintenance effort—, and to slightly improve user experience here and there.
This release also focuses on improving the ecosystem around measures of disorder: over the years, this project has become one of the first results when searching for "measures of presortedness", yet I was forced to admit that my own knowledge of the topic was extremely surface-level, and that the documentation was far from enough when not outright incorrect. Beyond clarifying the difference between the terms "measure of disorder" and "measure of presortedness", the wiki now clearly documents some important axioms and properties, what measures of disorder satisfy them, and how they relate to adaptive sorting. The test suite was made more robust in this area, with RapidCheck being used to perform much more thorough tests of the different axioms, properties, lemmas and theorems from the literature. Special thanks to Slimane Benloucif for all the resources, enlightening conversations, and exchanges of points of views, without whom this release would have been much more tamer on this front.
Overall version 2.0.0 is almost solely focused on quality and making the project a better resource, without adding a single big new feature. Do not worry though, I do plan to bring back features to the table in the near future, with new sorters and measures of disorder on the way. Among my various goals, I would also like to add more safety-related features to the library, and to document some sorting-adjacent topics and implementation tricks from the library in a less rigid form on my blog. What I do manage to work on will obviously depend on my motivation, though I do hope that leaving C++14 behind will help me with that.
Note: if you're an avid reading of the library's release notes, you might remember that I had been teasing a v2.0.0 breaking change for a long time, with ambitious changes. That effort is still undergoing, but does not correspond to the current version: I have hit a lot of road-blockers along the way, and progress has mostly stalled. I can't make any promise about a timeline, but as of today that future version targets C++23.
Removals
cppsort::sort
, cppsort::stable_sort
and default_sorter
(#168)
The very concept of cpp-sort is to put a handful of sorting-related algorithms and tools into the users' hands and to give them some amount of choice. Providing a default sorter kind of went against that, and the one I picked at the time was supposedly "best effort" but truly ad-hoc, likely not answering anybody's needs. Additionally it was a bloated mess, leading to measurable compilation slowdown and unreadable error messages.
If you feel nostalgic, you can still combine the bits and pieces yourself to bring back its spirit:
struct old_default_sorter:
cppsort::self_sort_adapter<
cppsort::hybrid_adapter<
cppsort::small_array_adapter<
cppsort::low_comparisons_sorter,
std::make_index_sequence<14u>
>,
cppsort::quick_sorter,
cppsort::pdq_sorter
>
>
{};
template<>
struct cppsort::stable_adapter<default_sorter>:
cppsort::merge_sorter
{
stable_adapter() = default;
constexpr explicit stable_adapter(const default_sorter&) noexcept:
stable_adapter()
{}
};
cppsort::sort
and cppsort::stable_sort
(#167)
Despite looking cute and easy to use, these functions basically brought nothing to the table except a well-known interface for default_sorter
, which got nuked as mentioned in the previous section. stable_sort
was merely a wrapper hiding stable_adapter
, and as such not much more useful. Both of the functions had heavy enough template tinkering to make overloads non-ambiguous, which happened to make compile times slower and error messages ever more unreadable when something went wrong.
In v2.0.0, the preferred solution is to use sorters and adapters directly.
drop_merge_sorter
, split_sorter
and verge_sorter
These sorters have respectively been replaced with drop_merge_adapter
, split_adapter
and verge_adapter
. If you want the old sorters back, you need to explicitly wrap them as follows:
struct drop_merge_sorter:
cppsort::drop_merge_adapter<cppsort::pdq_sorter>
{
drop_merge_sorter() = default;
};
struct split_sorter:
cppsort::split_adapter<cppsort::pdq_sorter>
{
split_sorter() = default;
};
struct verge_sorter:
cppsort:verge_adapter<
cppsort::hybrid_adapter<
cppsort::pdq_sorter,
cppsort::quick_merge_sorter
>
>
{
verge_sorter() = default;
};
template<>
struct cppsort::stable_adapter<verge_sorter>:
cppsort::stable_t<
cppsort::verge_adapter<
cppsort::hybrid_adapter<
cppsort::pdq_sorter,
cppsort::quick_merge_sorter
>
>
>
{
stable_adapter() = default;
constexpr explicit stable_adapter(const verge_sorter&):
cppsort::stable_t<
cppsort::verge_adapter<
cppsort::hybrid_adapter<
cppsort::pdq_sorter,
cppsort::quick_merge_sorter
>
>
>()
{}
};
Those reimplementations are also examples of how the existing components of the library compose.
block_sorter
The old block_sorter
has been renamed to wiki_sorter
a few versions ago. The reason behind that change is that block sorts are nowadays a class of sorting algorithms more than a single algorithm: grail_sorter
is another block sort, and many more have been invented in the last decade (many of them by members of the Discord server The Studio, @The-Studio-Discord), some of which differ widely in how they are implemented.
As such if felt unjust to keep the name "block sort" for a specific implementation thereof, especially since there is not one consensual "basic" block sort (unlike insertion, selection or heapsorts). Ironically enough, "wikisort" is also a bad name according to its author, who intended for his project to be "like a wiki around techniques to implement a block sort", but that's what the algorithm is generally called nowadays.
counting_adapter
This old sorter adapter eventually gave birth to a subcategory of adapters with a slightly different interface: metrics. It can be replaced with metrics::comparisons
, which similarly counts the number of comparisons performed by a comparison sorter.
Metrics are a category of sorter adapters meant to gather information about a sorter while it is running, that return the gathered information wrapped into cppsort::utility::metric
. As such, the main difference between counting_adapter
and metrics::comparisons
is that the latter returns utility::metric<CountType, comparisons_tag>
instead of CountType
.
probe::par
When I originally started adding measures of disorder to the library, I was discovering the very concept of disorder, and the literature around it at the same time. Most such measures had no implementation, neither in the paper nor in the wild, and the naming was pretty confusing. As a result lots of mistakes were made: along incorrectly implemented measures, I also accidentally implemented the same measure twice, under two different names:
A few years later as I was going through the literature and optimizing my implementation, I realized my mistake and removed
utility::static_const
static_const
was a hack originally used by Eric Niebler to solve ODR issues while circumventing the lack of inline
variables. We used it to create global instances of sorters:
namespace
{
constexpr auto&& awesome_sort
= cppsort::utility::static_const<awesome_sorter>::value;
}
With C++17 we get inline
variables, which render static_const
useless, hence its deletion:
inline constexpr awesome_sorter awesome_sort{};
utility::make_integer_range
and utility::make_index_range
Old unused features that don't even interact with the rest of the library.
Other breaking changes
probe::enc
The implementation of probe::enc
was changed: the previous implementation returned
The main difference for users is that it returns max_for_size
accordingly changes from
Miscellaneous
The following changes are less likely to immediately affect code in the wild, but are worth mentioning:
utility::projection_base
is now a CRTP base class, which can help the compiler to better optimize projection chains (#212).- The comparison adapter
make_projection_compare
was renamed inprojection_compare
, and the returned type fromprojection_compare
toprojection_compare_t
. This change makes the naming similar to the other comparison adapters in the library. - The O(n²) fallback of the measure of disorder
probe::osc
was removed: it was used when there wasn't enough heap memory available for the main O(n log n) algorithm, but didn't do so gracefully: it was all or nothing. I wanted to move away from such surprising changes of complexity, but also to get rid of quadratic algorithms as much as reasonable. - CMake options that didn't start with
CPPSORT_
were removed. Equivalents options with aCPPSORT_
prefix can be used instead.
Bug fixes
- Fix
probe::osc
.max_for_size(1)
which used to return some garbage value. - Fix
probe::osc
.max_for_size
for even sizes: it used to unconditionally return$\frac{|X|(|X| - 2) - 1}{2}$ , which is fine with odd sizes, but the actual bound for even sizes is$\frac{|X|(|X| - 2)}{2}$ .
Improvements
- Replace internal uses of
std::is_trivial
, deprecated in C++26. - Bump downloaded version of libassert to v3.9.1.
Tooling
Benchmarks:
- The errorbar-plot benchmark now cycles through its list of markers when there's not enough markers for all displayed lines.
- A new
dist::max
distribution accepting a factor parameter in the range [0.0, 1.0] was added. It generates an ascending sequence$X$ of elements, except that its first element is swapped with the one at position$\lfloor factor * |X| \rfloor$ . This can be used to test whether an algorithm is Max-adaptive. The following graph displays how regulardist::max
is:
Documentation:
- Replace all occurrences of "iterable" with either "range", "container", "collection" or "sequence" depending on the context.
- Rewrite, expand and clarify the documentation about measures of disorder:
- Define what a measure of disorder is, make it clear that measures of presortedness are a subset of measures of disorder that satisfy a specific set of additional axioms.
- When a measure of disorder we implement does not satisfy one of those axioms, clearly document it, with a counterexample.
- Document the monotonicity and prefix monotonicity properties, described as desirable by Valdimir Estivill-Castro in Sorting and Measures of Disorder.
- Specify whether the measures of disorder we implement are monotonic or not.
- Clearly document the specific notation used through this page of the documentation.
- Provide better definitions of what constitutes an adaptive sorting algorithm, and of the operator used to establish a partial order on measures of disorder.
- Mention that
$Exc$ , the minimum number of exchanges to bring a collection in sorted order, is NP-hard (issue #233) in the presence of elements that compare equivalent. That result was proven by Amir et al in On the Cost of Interchange Rearrangement in Strings, and explains whyprobe::exc
does not always return the smallest possible number of exchanges in the presence of equivalent elements. - Clarify that
probe::sus
returns the minimum number of ascending subsequences that can be used to decompose a sequence minus 1.
- Update the average complexity of some sorters to reflect that they adapt to some measure of disorder: for example the average complexity of
mel_sorter
is now documented as$n \log{} Enc(X)$ . - Improve and extend the Original research section about the partial ordering of $Mono$, providing tentative proofs that
$Mono \not \equiv Max$ and$Mono \not \equiv SUS$ . - Remove the section about C++17 in the changelog page: everything that used to be described there is now supported by default.
- Remove diff marks mentioning versions 1.x.y of the library. The plan is to reset such marks after every new major version.
Test suite:
- Don't force the use of specific linkers anymore when compiling the test suite with CMake. Depending on the operating system and compiler, the test suite
CMakeLists.txt
used to force the use of gold or lld, nowadays it uses whatever CMake is configured to use without overwriting user options. - Test all sorters with a "descending plateau" distribution: a strictly descending subsequence followed by a plateau of equivalent elements, followed by another strictly descending subsequence.
- Improve tests for measures of disorder:
- Add a property test to ensure that, for any measure of disorder
$M$ , we have$M(X) \le M.max\_for\_size(|X|)$ - Add a property test for order isomorphism, arguably the most important property of measures of disorder, and the second axiom defined by Mannila that measures of presortedness have to satisfy: given a measure of presortedness
$M$ and two sequences$X$ and$Y$ , if the relative order of all elements of$X$ and$Y$ is the same, then$M(X) = M(Y)$ . The property ensures that only the order of elements is involved in the computation of the disorder. - Add a property test verifying that
$M(subsequence(X)) \le M(X)$ for most of the library's measures of disorder (Mannila axiom 3 for measures of presortedness). - Add a property test for
$M(XY) = M(X) \le M(Y)$ if$X \le Y$ for most of the library's measures of disorder (Mannila axiom 4 for measures of presortedness). - Add a property test for the following stronger bound:
$M(XY) = M(X) + M(Y)$ if$X \le Y$ . The property is satisfied byprobe::exc
,probe::ham
,probe::inv
,probe::rem
,probe::runs
andprobe::spear
. - Add a property test for prefix monotonicity, a property of interest satisfied some measures of disorder, described by Vladimir Estivill-Castro in Sorting and Measures of Disorder: given a measure of disorder
$M$ and three sequences$X$ ,$Y$ and$Z$ , then$M$ satisfies the prefix monotonicity property if$M(XZ) \le M(YZ)$ when$X \le Z$ ,$Y \le Z$ and$M(X) \le M(Y)$ (where$XZ$ is the concatenation of sequences$X$ and$Z$ ). This property is satisfied byprobe::dis
,probe::exc
,probe::enc
,probe::ham
,probe::inv
,probe::max
,probe::rem
,probe::runs
,probe::spear
andprobe::sus
. - Add a property test for monotonicity, another property of interest satisfied some measures of disorder, described in Sorting and Measures of Disorder: given a measure of disorder
$M$ and four sequences$W$ ,$X$ ,$Y$ and$Z$ , then$M$ satisfies the monotonicity property if$M(WXZ) \le M(WYZ)$ when,$W \le X \le Z$ ,$W \le Y \le Z$ , and$M(X) \le M(Y)$ . This property implies prefix monotonicity and is satisfied byprobe::dis
,probe::exc
,probe::ham
,probe::inv
,probe::max
,probe::rem
,probe::runs
,probe::spear
andprobe::sus
. - Add a property test checking that
$M( \langle n, 1, 2, ..., n - 1 \rangle ) \le |X|$ for most probes (Theorem 3.18-1 in Sorting and Measures of Disorder). - Add a property test checking that
$M( \langle 2, 1, 4, 3, 6, 5, ... \rangle ) \le \frac{|X| * M(\langle 2, 1 \rangle)}{2}$ for most probes (Theorem 3.18-2 in Sorting and Measures of Disorder). - Add a property test checking that
$M(Reversed(X)) = M(X)$ forprobe::mono
andprobe::osc
. - Add a property test for
$Dis(XY) = \max{}(Dix(X), Dis(Y))$ .
- Add a property test to ensure that, for any measure of disorder
- Bump downloaded version of Catch2 to v3.10.0.
Miscellaneous:
- New Git version tagging scheme: instead of
1.x.y
, we now use the more conventionalv2.x.y
scheme instead (prefixv
). - Require CMake 3.11 or greater to build the library.
- The CMake option
CPPSORT_BUILD_TESTING
now defaults to the standard optionBUILD_TESTING
, which means that it is nowOFF
by default. - Remove the vendored CMake tool Crascit/DownloadProject now that
FetchContent
is unconditionally available. Thanks for all those years together! - Change the
cmake_minimum_required
version to 3.5.0 intest_package
, making it compatible with CMake 4. - Require C++17 in the Conan recipe that ships with the project.
- CI: use the mold linker where possible.
Known bugs
I didn't manage to fix every bug I could find since the previous release, so you might want to check the list of known bugs.