How to find leap year programmatically in C

Question

Most efficient leap year test:

if ((year & 3) == 0 && ((year % 25) != 0 || (year & 15) == 0))
{
    /* leap year */
}

This code is valid in C, C++, C#, Java, and many other C-like languages. The code utilizes a single TRUE/FALSE expression that consists of three separate tests:

4th year test: year & 3
100th year test: year % 25
400th year test: year & 15
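
As a minimal sketch, the expression can be wrapped in a small function. The name is_leap_year is a hypothetical helper chosen for illustration, and the sketch assumes a non-negative Gregorian year:

```c
#include <stdbool.h>

/* Hypothetical helper (not from the original code): wraps the
 * single-expression test. Assumes year >= 0 in the Gregorian calendar. */
bool is_leap_year(int year)
{
    return (year & 3) == 0                  /* 4th year test   */
        && ((year % 25) != 0                /* 100th year test */
            || (year & 15) == 0);           /* 400th year test */
}
```

With this helper, is_leap_year(2024) and is_leap_year(2000) are true, while is_leap_year(2023) and is_leap_year(1900) are false.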

A complete discussion of how this code works appears below, but first a discussion of Wikipedia’s algorithm is called for:

Wikipedia algorithm is INEFFICIENT/UNRELIABLE

Wikipedia has published a pseudo-code algorithm (See: Wikipedia: Leap year – Algorithm) that has been subjected to constant editing, opinion, and vandalism.

DO NOT IMPLEMENT WIKIPEDIA ALGORITHM!

One of the longest-standing (and inefficient) Wikipedia algorithms appeared as follows:

if year modulo 400 is 0 then
   is_leap_year
else if year modulo 100 is 0 then
   not_leap_year
else if year modulo 4 is 0 then
   is_leap_year
else
   not_leap_year

The above algorithm is inefficient because it always performs the 400th year test, and almost always the 100th year test, even for years that would immediately fail the “4th year test” (the modulo 4 test), which happens 75% of the time. Reordering the algorithm to perform the 4th year test first speeds things up significantly.

“MOST-EFFICIENT” PSEUDO-CODE ALGORITHM

I provided the following algorithm to Wikipedia (more than once):

if year is not divisible by 4 then not leap year
else if year is not divisible by 100 then leap year
else if year is divisible by 400 then leap year
else not leap year

This “most-efficient” pseudo-code simply changes the order of the tests so that the division by 4 takes place first, followed by the less frequently needed tests. Because the year is not divisible by four 75 percent of the time, the algorithm ends after only one test in three out of four cases.
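
The difference between the two orderings can be made concrete by counting tests over one full 400-year cycle. The sketch below is only an illustration: the function names are hypothetical, and it counts divisibility tests rather than measuring real execution time.

```c
/* Number of divisibility tests the "400 first" ordering performs. */
int tests_400_first(int year)
{
    if (year % 400 == 0) return 1;   /* decided by the 400 test */
    if (year % 100 == 0) return 2;   /* decided by the 100 test */
    return 3;                        /* the 4 test decides either way */
}

/* Number of divisibility tests the "4 first" ordering performs. */
int tests_4_first(int year)
{
    if (year % 4 != 0)   return 1;   /* decided by the 4 test   */
    if (year % 100 != 0) return 2;   /* decided by the 100 test */
    return 3;                        /* the 400 test decides either way */
}

/* Sums the test count over years 1..400 (one Gregorian cycle). */
int total_tests(int (*count)(int))
{
    int total = 0;
    for (int year = 1; year <= 400; year++)
        total += count(year);
    return total;
}
```

Under these assumptions, total_tests(tests_400_first) is 1195 and total_tests(tests_4_first) is 504, or roughly 3 tests per year versus about 1.26.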

NOTE: I have fought various Wikipedia editors to improve the algorithm published there, arguing that many novice—and professional—programmers quickly arrive at the Wikipedia page (due to top search engine listings) and implement the Wikipedia pseudo-code without any further research. Wikipedia editors repudiated and deleted every attempt I made to improve, annotate or even merely footnote the published algorithm. Apparently, they feel finding efficiencies is the programmer’s problem. That may be true, but many programmers are too hurried to perform solid research!

DISCUSSION OF “MOST-EFFICIENT” LEAP YEAR TEST

Bitwise-AND in place of modulo:

I have replaced two of the modulo operations in the Wikipedia algorithm with bitwise-AND operations. Why and how?

Performing a modulo calculation requires division. One doesn’t often think twice about this when programming a PC, but on 8-bit microcontrollers embedded in small devices the CPU may have no native divide instruction. On such CPUs, division is a slow, arduous process of repeated looping, bit shifting, and add/subtract operations, so it is well worth avoiding.

It turns out that the modulo of powers of two can be alternately achieved using a bitwise-AND operation (see: Wikipedia: Modulo operation – Performance Issues):

x % 2^n == x & (2^n - 1)

Many optimizing compilers will convert such modulo operations to bitwise-AND for you, but less advanced compilers for smaller and less popular CPUs may not. Bitwise-AND is a single instruction on virtually every CPU.
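
The identity can be checked with a short sketch. The function name mask_matches_modulo is hypothetical, and the check is restricted to unsigned values. That restriction matters: in C99 and later, -1 % 4 is -1, whereas -1 & 3 is 3 on a two's-complement machine, so the two forms only agree for non-negative operands.

```c
#include <stdbool.h>

/* Returns true if x % 2^n equals x & (2^n - 1) for every x in [0, limit).
 * Assumes n is smaller than the bit width of unsigned int. */
bool mask_matches_modulo(unsigned n, unsigned limit)
{
    unsigned m = 1u << n;                 /* m = 2^n */
    for (unsigned x = 0; x < limit; x++) {
        if (x % m != (x & (m - 1u)))
            return false;
    }
    return true;
}
```

For example, mask_matches_modulo(2, 10000) compares % 4 against & 3, and mask_matches_modulo(4, 10000) compares % 16 against & 15.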

By replacing the modulo 4 and modulo 400 tests with & 3 and & 15 (see below: ‘Factoring to simplify the math’) we can ensure that the fastest code results without using a much slower divide operation.

There exists no power of two that equals 100. Thus, we are forced to keep the modulo operation for the 100th year test, although 100 is replaced by 25 (see below).

Factoring to simplify the math:

In addition to using bitwise-AND to replace modulo operations, you may note two additional differences between the Wikipedia algorithm and the optimized expression:

modulo 100 is replaced by modulo 25
modulo 400 is replaced by & 15

The 100th year test utilizes modulo 25 instead of modulo 100. We can do this because 100 factors out to 2 x 2 x 5 x 5. Because the 4th year test already checks for factors of 4 we can eliminate that factor from 100, leaving 25. This optimization is probably insignificant to nearly every CPU implementation (as both 100 and 25 fit in 8-bits).

The 400th year test utilizes & 15, which tests the same thing as modulo 16. Again, we can do this because 400 factors out to 2 x 2 x 2 x 2 x 5 x 5. We can eliminate the factor of 25, which is already tested by the 100th year test, leaving 16. We cannot further reduce 16 because 8 is a factor of 200, so removing another factor of 2 would produce an unwanted positive for a 200th year.

The 400th year optimization is especially important on 8-bit CPUs: first, because it avoids division; but, more importantly, because the value 400 is a 9-bit number, which is much more difficult to deal with on an 8-bit CPU.
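
The factoring argument can be checked by brute force. The sketch below compares the optimized expression with the textbook rule and also shows why the mask cannot shrink to & 7. All function names are hypothetical and only non-negative years are considered.

```c
#include <stdbool.h>

/* Textbook Gregorian rule, used as the reference. */
bool leap_reference(int year)
{
    return (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
}

/* The optimized expression: % 25 in place of % 100, & 15 in place of % 400. */
bool leap_optimized(int year)
{
    return (year & 3) == 0 && ((year % 25) != 0 || (year & 15) == 0);
}

/* Deliberately broken variant: mask reduced from 15 to 7. */
bool leap_with_mask7(int year)
{
    return (year & 3) == 0 && ((year % 25) != 0 || (year & 7) == 0);
}

/* Returns the first year in [1, last] where the two tests disagree,
 * or 0 if they agree everywhere. */
int first_mismatch(int last)
{
    for (int year = 1; year <= last; year++) {
        if (leap_reference(year) != leap_optimized(year))
            return year;
    }
    return 0;
}
```

Here first_mismatch(10000) returns 0, while leap_with_mask7(200) wrongly reports a leap year because 200 is a multiple of 8.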

Short-circuit Logical AND/OR operators:

The final, and most important, optimization is the use of the short-circuit logical AND (‘&&’) and OR (‘||’) operators (see: Wikipedia: Short-circuit evaluation), which are implemented in most C-like languages. Short-circuit operators are so named because they do not bother to evaluate the expression on the right side if the expression on the left side, by itself, dictates the outcome of the operation.

For example: If the year is 2003, then (year & 3) == 0 is false. There is no way that the tests on the right side of the logical AND can make the outcome true, so nothing else gets evaluated.

By performing the 4th year test first, only the 4th year test (a simple bitwise-AND) is evaluated three-quarters (75 percent) of the time. This speeds up program execution greatly, especially since it avoids the division necessary for the 100th year test (the modulo 25 operation).

NOTE ON PARENTHESES PLACEMENT

One commenter felt parentheses were misplaced in my code and suggested the sub-expressions be regrouped around the logical AND operator (instead of around the logical OR), as follows:

if (((year & 3) == 0 && (year % 25) != 0) || (year & 15) == 0) { /* LY */ }

That suggestion is mistaken. The logical AND operator has higher precedence than logical OR and will be evaluated first with or without the new parentheses, so parentheses around the logical AND arguments have no effect. This might lead one to eliminate the sub-groupings entirely:

if ((year & 3) == 0 && (year % 25) != 0 || (year & 15) == 0) { /* LY */ }

But, in both cases above, the right side of the logical OR (the 400th year test) is evaluated whenever the left side is false, which is almost every time: every year not divisible by 4, plus every year divisible by 100. Thus, a useful optimization has been mistakenly eliminated.

The parentheses in my original code implement the most optimized solution:

if ((year & 3) == 0 && ((year % 25) != 0 || (year & 15) == 0)) { /* LY */ }

Here, the logical OR is only evaluated for years divisible by 4 (because of the short-circuit AND). The right side of the logical OR is only evaluated for years divisible by 4 and 100 (because of the short-circuit OR).
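
The effect of the grouping can be illustrated by counting how often the 400th year test actually runs over one 400-year cycle. This is a sketch with hypothetical names. The counter lives in a global variable purely for demonstration. Both groupings return the same answer for every year, since any multiple of 16 is also a multiple of 4; only the amount of work differs.

```c
#include <stdbool.h>

int evals_400;   /* how many times the 400th year test has run */

/* The 400th year test, instrumented with a counter. */
bool test_400(int year)
{
    evals_400++;
    return (year & 15) == 0;
}

/* Original grouping: OR nested inside the AND. */
bool leap_or_inside(int year)
{
    return (year & 3) == 0 && ((year % 25) != 0 || test_400(year));
}

/* Regrouped version: OR applied after the AND. */
bool leap_or_outside(int year)
{
    return ((year & 3) == 0 && (year % 25) != 0) || test_400(year);
}

/* Runs the given test for years 1..400 and reports the counter. */
int count_400_evals(bool (*leap)(int))
{
    evals_400 = 0;
    for (int year = 1; year <= 400; year++)
        (void)leap(year);
    return evals_400;
}
```

Under these assumptions, count_400_evals(leap_or_inside) is 4 and count_400_evals(leap_or_outside) is 304.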

NOTE FOR C/C++ PROGRAMMERS

C/C++ programmers might feel this expression is more optimized:

if (!(year & 3) && ((year % 25) || !(year & 15))) { /* LY */ }

This is not more optimized! While the explicit == 0 and != 0 tests are removed, they become implicit and are still performed. Worse, the code is no longer valid in strongly-typed languages like C# where year & 3 evaluates to an int, but the logical AND (&&), OR (||) and NOT (!) operators require bool arguments.
