While working on a new bot for auto-filing performance regressions in the dotnet/runtime repository (sample issue), we have found that quite a few microbenchmarks from the dotnet/performance repository are bimodal, and the modality tends to be stable for a few days before it switches to the other mode.
Example: the timeline very often looks something like this:
_____~~~~~~_____~~~~~~______~~~~~~
A while ago @AndyAyersMS mentioned Stabilizer, which performs full randomization.
.NET does not allow for full control of memory alignment, but we could at least try to make it more random.
In dotnet/runtime#37814, @jkotas provided a small repro that shows "the many modal nature of memory copying":
using System;
using System.Diagnostics;

class Program
{
    static void Work(ReadOnlySpan<string> from, Span<string> to)
    {
        for (int i = 0; i < 1000000; i++) from.CopyTo(to);
    }

    static void Main(string[] args)
    {
        Random r = new Random();
        for (;;)
        {
            var sw = new Stopwatch();
            GC.KeepAlive(new byte[r.Next(32)]); // the trick: a random-size allocation shifts the alignment of the arrays allocated below
            var from = new string[2048];
            GC.KeepAlive(new byte[r.Next(32)]);
            var to = new string[2048];
            if (r.Next(10) == 0) GC.Collect();
            sw.Start();
            Work(from, to);
            sw.Stop();
            Console.WriteLine(sw.ElapsedMilliseconds);
        }
    }
}

So the first step could be to allocate a variable-size byte array between iterations to get a more randomized memory alignment of the objects allocated during benchmarking.
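One way to approximate this with today's public API is to move the allocations into [IterationSetup] and prepend a random-size padding allocation there. This is only a sketch of the idea (the class and field names are made up), and it runs straight into the [IterationSetup] problem mentioned below:

using System;
using BenchmarkDotNet.Attributes;

// Sketch only: randomize the heap offset of the benchmarked arrays by allocating
// a random-size padding object right before them, once per iteration.
public class CopyToWithRandomPadding
{
    private readonly Random random = new Random();
    private object padding;                   // stored in a field so the padding stays alive during the iteration
    private string[] source;
    private string[] destination;

    [IterationSetup]
    public void Setup()
    {
        padding = new byte[random.Next(32)];  // shifts the start offset of the allocations below
        source = new string[2048];
        destination = new string[2048];
    }

    [Benchmark]
    public void CopyTo() => source.AsSpan().CopyTo(destination);
}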
The problem is that very often the input is allocated in the [GlobalSetup] method (to exclude the cost of allocation from the benchmark, which is good), and we promise to call that method only once during benchmarking ;) Perhaps we could add an optional config flag to invoke it once per iteration (while somehow avoiding the [IterationSetup] hell)?
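If such a flag existed, the user-facing shape could be roughly the following. To be clear, the WithRandomizedMemory extension below is purely hypothetical and is stubbed out only so the sketch compiles; ManualConfig, Job, [Config], [GlobalSetup], and [Benchmark] are the real APIs:

using System;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Configs;
using BenchmarkDotNet.Jobs;

// Hypothetical stand-in for the proposed flag: it does not exist in BenchmarkDotNet.
// A real implementation would live in the engine, where it could re-run [GlobalSetup]
// (plus a random-size padding allocation) before every iteration.
public static class HypotheticalJobExtensions
{
    public static Job WithRandomizedMemory(this Job job) => job;
}

public class MemoryRandomizationConfig : ManualConfig
{
    public MemoryRandomizationConfig() => AddJob(Job.Default.WithRandomizedMemory());
}

[Config(typeof(MemoryRandomizationConfig))]
public class SpanCopyBenchmarks
{
    private string[] source;
    private string[] destination;

    [GlobalSetup] // with the flag enabled, the engine would call this once per iteration
    public void Setup()
    {
        source = new string[2048];
        destination = new string[2048];
    }

    [Benchmark]
    public void CopyTo() => source.AsSpan().CopyTo(destination);
}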
Some benchmarks are initialized in constructors, so we might also consider allocating a new instance of the benchmarked type for every iteration.
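For the constructor case, the engine would essentially have to re-create the instance per iteration, along the lines of this bare-bones illustration (the runner and benchmark types here are made up, not engine code):

using System;
using System.Diagnostics;

class FreshInstancePerIteration
{
    // Input allocated by the instance itself (via field initializers / constructor),
    // so a new instance means freshly allocated, differently aligned arrays.
    class SpanCopyBenchmark
    {
        public string[] Source = new string[2048];
        public string[] Destination = new string[2048];

        public void CopyTo() => Source.AsSpan().CopyTo(Destination);
    }

    static readonly Random random = new Random();

    static double RunOneIteration<TBenchmark>(Action<TBenchmark> workload) where TBenchmark : new()
    {
        GC.KeepAlive(new byte[random.Next(32)]);  // same random-padding trick as in the repro above
        var instance = new TBenchmark();          // fresh instance => the constructor re-allocates the input
        var sw = Stopwatch.StartNew();
        workload(instance);
        sw.Stop();
        return sw.Elapsed.TotalMilliseconds;
    }

    static void Main()
    {
        for (int i = 0; i < 20; i++)
        {
            double ms = RunOneIteration<SpanCopyBenchmark>(
                b => { for (int j = 0; j < 1000000; j++) b.CopyTo(); });
            Console.WriteLine(ms);
        }
    }
}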
@AndreyAkinshin what do you think?
@DrewScoggins is there any chance you could provide a list of such benchmarks so we can use them for experimenting?
