What is microbenchmarking?

It means exactly what it says on the tin: measuring the performance of something “small”, like a system call to the kernel of an operating system.

The danger is that people may use whatever results they obtain from microbenchmarking to dictate optimizations. And as we all know:

“We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.” — Donald Knuth

Many factors can skew the results of microbenchmarks. Compiler optimizations are one of them. Measurement overhead is another: if the operation being measured takes so little time that the act of measuring it takes longer than the operation itself, your results will be skewed as well.

For example, someone might write a microbenchmark to measure the overhead of for loops:

#include <stdio.h>
#include <time.h>

void TestForLoop(void)
{
    clock_t start = clock();

    for (int i = 0; i < 1000000000; ++i)
    {
    }

    double elapsed = (double)(clock() - start) / CLOCKS_PER_SEC;
    double elapsedPerIteration = elapsed / 1000000000;
    printf("Time elapsed for each iteration: %g seconds\n", elapsedPerIteration);
}

Obviously, compilers can see that the loop does absolutely nothing and may not generate any code for the loop at all. So the values of elapsed and elapsedPerIteration are pretty much useless.

Even if the loop does something:

#include <stdio.h>
#include <time.h>

void TestForLoop(void)
{
    int sum = 0;
    clock_t start = clock();

    for (int i = 0; i < 1000000000; ++i)
    {
        ++sum;
    }

    double elapsed = (double)(clock() - start) / CLOCKS_PER_SEC;
    double elapsedPerIteration = elapsed / 1000000000;
    printf("Time elapsed for each iteration: %g seconds\n", elapsedPerIteration);
}

The compiler may see that the variable sum isn’t going to be used for anything and optimize it away, and optimize away the for loop as well. But wait! What if we do this:

#include <stdio.h>
#include <time.h>

void TestForLoop(void)
{
    int sum = 0;
    clock_t start = clock();

    for (int i = 0; i < 1000000000; ++i)
    {
        ++sum;
    }

    double elapsed = (double)(clock() - start) / CLOCKS_PER_SEC;
    double elapsedPerIteration = elapsed / 1000000000;
    printf("Time elapsed for each iteration: %g seconds\n", elapsedPerIteration);
    printf("Sum: %d\n", sum); // Added
}

The compiler might be smart enough to realize that sum will always end up with the same value, compute it at compile time, and optimize the loop away as well. Many would be surprised at the optimizing capabilities of compilers these days.
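If you genuinely need such a loop to survive the optimizer, one common trick is to route the result through something the compiler has to treat as opaque. The sketch below is not from the example above; the helper name do_not_optimize is made up, and the empty inline-assembly statement only works on GCC and Clang (it is the same idea as DoNotOptimize in Google Benchmark). Declaring sum as volatile is a more portable, if heavier-handed, alternative.

#include <stdio.h>
#include <time.h>

/* Empty inline-assembly statement that tells GCC/Clang the value is "used",
   so the work that produced it cannot be deleted. The helper name is made up. */
static inline void do_not_optimize(int value)
{
    __asm__ __volatile__("" : : "g"(value) : "memory");
}

void TestForLoopKept(void)
{
    int sum = 0;
    clock_t start = clock();

    for (int i = 0; i < 1000000000; ++i)
    {
        ++sum;
        do_not_optimize(sum); /* keeps the increment alive on each iteration */
    }

    double elapsed = (double)(clock() - start) / CLOCKS_PER_SEC;
    printf("Time elapsed for each iteration: %g seconds\n", elapsed / 1000000000);
    printf("Sum: %d\n", sum);
}

Of course, what you end up measuring is the deliberately un-optimized loop, which may bear little resemblance to how the same code behaves inside a real program.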

But what about things that compilers can’t optimize away?

#include <stdio.h>
#include <time.h>

void TestFileOpenPerformance(void)
{
    FILE* file = NULL;
    clock_t start = clock();

    for (int i = 0; i < 1000000000; ++i)
    {
        file = fopen("testfile.dat", "r"); /* open for reading */
        if (file)
            fclose(file);
    }

    double elapsed = (double)(clock() - start) / CLOCKS_PER_SEC;
    double elapsedPerIteration = elapsed / 1000000000;
    printf("Time elapsed for each file open: %g seconds\n", elapsedPerIteration);
}

Even this is not a useful test! The operating system notices that the same file is being opened very frequently, so it may cache it in memory to improve performance; pretty much all modern operating systems do this. The same thing happens when you launch applications: the operating system may figure out the handful of applications you launch most often and preload their code into memory when the computer boots!
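If you want to see the cache at work, one rough sketch (POSIX-only, since it uses clock_gettime; the file name is the same made-up testfile.dat as above) is to time the very first open separately from the ones that follow:

#include <stdio.h>
#include <time.h>

void TestColdVsWarmOpen(void)
{
    struct timespec t0, t1, t2;

    /* First open+close: the file may not be in the OS cache yet. */
    clock_gettime(CLOCK_MONOTONIC, &t0);
    FILE* first = fopen("testfile.dat", "r");
    if (first)
        fclose(first);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    /* Later opens: almost certainly served from the cache. */
    for (int i = 0; i < 1000; ++i)
    {
        FILE* file = fopen("testfile.dat", "r");
        if (file)
            fclose(file);
    }
    clock_gettime(CLOCK_MONOTONIC, &t2);

    double coldNs = (t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec);
    double warmNs = ((t2.tv_sec - t1.tv_sec) * 1e9 + (t2.tv_nsec - t1.tv_nsec)) / 1000.0;
    printf("First open+close: %.0f ns, average of later open+close: %.0f ns\n", coldNs, warmNs);
}

The exact numbers depend on the filesystem, on what else is running, and on whether the file really was cold to begin with, which is exactly the problem.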

In fact, there are countless variables that come into play: locality of reference (e.g. arrays vs. linked lists), effects of caches and memory bandwidth, compiler inlining, compiler implementation, compiler switches, number of processor cores, optimizations at the processor level, operating system schedulers, operating system background processes, etc.

So in a lot of cases, microbenchmarking isn't a particularly useful measurement. It certainly does not replace whole-program benchmarks with well-defined test cases, or profiling. Write readable code first, then profile to see where optimization is actually needed, if anywhere.

I would like to emphasize that microbenchmarks are not evil per se, but one has to use them carefully (which is true of lots of other things related to computers).
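By way of illustration, here is one shape that “using them carefully” might take: run the measurement many times and look at the spread instead of trusting a single number. This is only a sketch; it uses POSIX clock_gettime, and measure_once is a made-up placeholder for whatever operation you are actually timing.

#include <stdio.h>
#include <time.h>

/* Placeholder: put the operation under test between the two clock reads. */
static double measure_once(void)
{
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    /* ... operation under test ... */
    clock_gettime(CLOCK_MONOTONIC, &t1);
    return (t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec);
}

void RunRepeatedly(void)
{
    double best = 1e300, worst = 0.0;

    for (int run = 0; run < 30; ++run) /* 30 runs is an arbitrary choice */
    {
        double ns = measure_once();
        if (ns < best)  best = ns;
        if (ns > worst) worst = ns;
    }

    printf("Best run: %.0f ns, worst run: %.0f ns\n", best, worst);
}

If the best and worst runs are wildly different, the environment (scheduler, caches, background processes) is telling you something; a single number on its own is not.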
