A BCrypt hash string looks like:
$2a$10$Ro0CUfOqk6cXEKf3dyaM7OhSCvnwM9s4wIX9JeLapehKK5YdLxKcm
\__/\/ \____________________/\_____________________________/
 |  |           Salt                       Hash
 |  Cost
 Version
Where:
- 2a: Algorithm Identifier (BCrypt, UTF-8 encoded password, null terminated)
- 10: Cost Factor (2^10 = 1,024 rounds)
- Ro0CUfOqk6cXEKf3dyaM7O: OpenBSD-Base64 encoded salt (22 characters, 16 bytes)
- hSCvnwM9s4wIX9JeLapehKK5YdLxKcm: OpenBSD-Base64 encoded hash (31 characters, holding 23 of the 24 hash bytes)
Edit: I just noticed these words fit exactly. I had to share:
$2a$10$TwentytwocharactersaltThirtyonecharacterspasswordhash
$==$==$======================-------------------------------
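The layout above can be split mechanically. The following is a minimal sketch, not a reference implementation; the names `BcryptFields` and `parse_bcrypt` are made up for illustration, and the only things it relies on are the `$` separators and the fixed 22 + 31 character widths described above.

```python
from typing import NamedTuple


class BcryptFields(NamedTuple):  # hypothetical helper type
    version: str  # algorithm identifier, e.g. "2a"
    cost: int     # log2 of the number of key-setup rounds
    salt: str     # 22 characters of OpenBSD-Base64
    digest: str   # 31 characters of OpenBSD-Base64


def parse_bcrypt(hash_string: str) -> BcryptFields:
    # Layout: "$" version "$" cost "$" 22-char salt followed by 31-char hash.
    # Unpacking raises ValueError if the number of "$" separators is wrong.
    empty, version, cost, tail = hash_string.split("$")
    if empty != "" or len(tail) != 22 + 31:
        raise ValueError("not a bcrypt hash string")
    return BcryptFields(version, int(cost), tail[:22], tail[22:])


fields = parse_bcrypt("$2a$10$Ro0CUfOqk6cXEKf3dyaM7OhSCvnwM9s4wIX9JeLapehKK5YdLxKcm")
rounds = 1 << fields.cost  # cost 10 means 2**10 rounds
```

For the example string, `fields.version` is `"2a"`, `fields.cost` is `10`, and `rounds` is `1024`.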
BCrypt does create a 24-byte binary hash, using a 16-byte salt. You’re free to store the binary hash and the salt however you like; nothing says you have to Base64-encode them into a string.
But BCrypt was created by guys who were working on OpenBSD. OpenBSD already defines a format for their password file:
$[HashAlgorithmIdentifier]$[AlgorithmSpecificData]
This means that the “bcrypt specification” is inextricably linked to the OpenBSD password file format. And whenever anyone creates a “bcrypt hash” they always convert it to an ISO-8859-1 string of the format:
$2a$[Cost]$[Base64Salt][Base64Hash]
A few important points:
- 2a is the algorithm identifier
  - 1: MD5
  - 2: early bcrypt, which had confusion over which encoding passwords are in (obsolete)
  - 2a: current bcrypt, which specifies passwords as UTF-8 encoded
- Cost is a cost factor used when computing the hash. The “current” value is 10, meaning the internal key setup goes through 1,024 rounds
  - 10: 2^10 = 1,024 iterations
  - 11: 2^11 = 2,048 iterations
  - 12: 2^12 = 4,096 iterations
- The Base64 algorithm used by the OpenBSD password file is not the same Base64 encoding that everybody else uses; they have their own:
  Regular Base64 Alphabet: ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/
  BSD Base64 Alphabet:     ./ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789
  So no implementation of bcrypt can use a built-in, or standard, Base64 library unmodified
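To make the alphabet difference concrete, here is a rough sketch that decodes the two Base64 fields by translating between the alphabets and then reusing Python's `base64` module. It assumes the OpenBSD variant packs bits the same way as regular Base64 and differs only in its alphabet and its lack of `=` padding; the function names `bsd_b64decode` and `bsd_b64encode` are invented for this example.

```python
import base64

STANDARD = b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
OPENBSD = b"./ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789"

# Character-for-character translation tables between the two alphabets.
TO_STANDARD = bytes.maketrans(OPENBSD, STANDARD)
TO_OPENBSD = bytes.maketrans(STANDARD, OPENBSD)


def bsd_b64decode(text: str) -> bytes:
    translated = text.encode("ascii").translate(TO_STANDARD)
    # bcrypt strings carry no "=" padding, so restore it before decoding.
    padding = b"=" * (-len(translated) % 4)
    return base64.b64decode(translated + padding)


def bsd_b64encode(data: bytes) -> str:
    # Encode normally, drop the padding, then switch alphabets.
    encoded = base64.b64encode(data).rstrip(b"=")
    return encoded.translate(TO_OPENBSD).decode("ascii")


salt = bsd_b64decode("Ro0CUfOqk6cXEKf3dyaM7O")
digest = bsd_b64decode("hSCvnwM9s4wIX9JeLapehKK5YdLxKcm")
```

Here `len(salt)` is 16 and `len(digest)` is 23: the 31 stored characters have room for only 23 bytes, one fewer than the 24-byte hash BCrypt computes.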
Armed with this knowledge, you can now verify a password correctbatteryhorsestapler against the saved hash:
$2a$12$mACnM5lzNigHMaf7O1py1O3vlf6.BA8k8x3IoJ.Tq3IB/2e7g61Km
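The verification steps can be sketched as follows: pull the cost and salt out of the saved string, hash the candidate password with those same parameters, and compare the result with the stored hash. This is only an outline under stated assumptions. The `bcrypt_core` parameter is a hypothetical callable standing in for a real BCrypt implementation (none is included here), and `verify` is a name chosen for illustration.

```python
import hmac
from typing import Callable

# Hypothetical signature: (key_bytes, cost, salt_text) -> 31-character hash text.
BcryptCore = Callable[[bytes, int, str], str]


def verify(password: str, stored: str, bcrypt_core: BcryptCore) -> bool:
    # "$2a$12$<22-char salt><31-char hash>" splits into four parts on "$".
    _, _version, cost, tail = stored.split("$")
    salt, expected = tail[:22], tail[22:]
    # $2a$ convention: UTF-8 bytes followed by the null terminator.
    key = password.encode("utf-8") + b"\x00"
    candidate = bcrypt_core(key, int(cost), salt)
    # Constant-time comparison, so timing does not reveal how much matched.
    return hmac.compare_digest(candidate.encode("ascii"), expected.encode("ascii"))
```

Nothing beyond the stored string is needed to verify a password, because the cost and the salt travel with the hash.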
BCrypt variants
There is a lot of confusion around the bcrypt versions.
$2$
BCrypt was designed by the OpenBSD people. It was designed to hash passwords for storage in the OpenBSD password file. Hashed passwords are stored with a prefix to identify the algorithm used. BCrypt got the prefix $2$.
This was in contrast to the other algorithm prefixes:
- $1$: MD5
- $5$: SHA-256
- $6$: SHA-512
$2a$
The original BCrypt specification did not define how to handle non-ASCII characters, or how to handle a null terminator. The specification was revised to specify that when hashing strings:
- the string must be UTF-8 encoded
- the null terminator must be included
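As a small illustration of those two rules, the bytes that get hashed for a given password could be prepared like this (the function name `bcrypt_key_bytes` is made up for the example):

```python
def bcrypt_key_bytes(password: str) -> bytes:
    # Rule 1: encode the string as UTF-8.
    # Rule 2: include the C-style null terminator.
    return password.encode("utf-8") + b"\x00"


ascii_key = bcrypt_key_bytes("abc")
accented_key = bcrypt_key_bytes("naïve")
```

The non-ASCII character in the second password becomes two UTF-8 bytes, which is exactly the kind of input the original specification left undefined.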
$2x$, $2y$ (June 2011)
A bug was discovered in crypt_blowfish, the BCrypt implementation bundled with PHP. It was mishandling characters with the 8th bit set.
They suggested that system administrators update their existing password database, replacing $2a$ with $2x$, to indicate that those hashes are bad (and need to use the old broken algorithm). They also suggested the idea of having crypt_blowfish emit $2y$ for hashes generated by the fixed algorithm. Nobody else, including canonical OpenBSD, adopted the idea of 2x/2y. This version marker was limited to crypt_blowfish.
The versions $2x$ and $2y$ are not “better” or “stronger” than $2a$. They are remnants of one particular buggy implementation of BCrypt.
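The text above only says that characters with the 8th bit set were mishandled. One way a bug of that kind can arise in C is sign extension of a signed `char` while key bytes are packed into 32-bit words. The toy model below illustrates that mechanism as an assumption; it is not taken from the crypt_blowfish source, and `pack_word` is a hypothetical helper.

```python
MASK32 = 0xFFFFFFFF


def pack_word(key: bytes, signed_char: bool) -> int:
    """Pack the first four key bytes into one 32-bit word, big-endian."""
    word = 0
    for byte in key[:4]:
        # A signed char reads 0x80..0xFF as a negative number.
        value = byte - 256 if (signed_char and byte >= 0x80) else byte
        # OR-ing a negative value sets every higher bit of the word,
        # wiping out the bytes that were packed before it.
        word = ((word << 8) | value) & MASK32
    return word


correct = pack_word(b"ab\xa3d", signed_char=False)
buggy = pack_word(b"ab\xa3d", signed_char=True)
collision = pack_word(b"xy\xa3d", signed_char=True)
```

In this model `correct` is `0x6162A364`, while `buggy` and `collision` are both `0xFFFFA364`: two different inputs collapse to the same word once the high-bit byte overwrites what came before it. Pure ASCII input is unaffected.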
$2b$ (February 2014)
A bug was discovered in the OpenBSD implementation of BCrypt. They wrote their implementation in a language that doesn’t support strings, so they were faking it with a length-prefix, a pointer to a character, and then indexing that pointer with []. Unfortunately they were storing the length of their strings in an unsigned char. If a password was longer than 255 characters, the length would overflow and wrap around, because an unsigned char can hold at most 255. BCrypt was created for OpenBSD. When they found a bug in their library, they decided it was OK to bump the version. This means that everyone else needs to follow suit if they want to remain current with “their” specification.
- http://undeadly.org/cgi?action=article&sid=20140224132743
- http://marc.info/?l=openbsd-misc&m=139320023202696
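The wrap-around is easy to model. The sketch below only shows what happens to a length kept in an 8-bit unsigned value; it does not reproduce the OpenBSD code, and `stored_length` is a name invented for the example.

```python
def stored_length(actual_length: int) -> int:
    # An unsigned char keeps only the low 8 bits of the value.
    return actual_length & 0xFF


lengths = {n: stored_length(n) for n in (72, 255, 256, 260)}
```

Lengths up to 255 survive unchanged, but 256 is stored as 0 and 260 as 4, so a very long password would be treated as a much shorter one.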
There is no difference between 2a, 2x, 2y, and 2b. If you wrote your implementation correctly, they all output the same result.
- If you were doing the right thing from the beginning (storing strings in UTF-8 and also hashing the null terminator), then there is no difference between 2, 2a, 2x, 2y, and 2b. If you wrote your implementation correctly, they all output the same result.
- The version $2b$ is not “better” or “stronger” than $2a$. It is a remnant of one particular buggy implementation of BCrypt. But since BCrypt canonically belongs to OpenBSD, they get to change the version marker to whatever they want.
- The versions $2x$ and $2y$ are not better, or even preferable, to anything. They are remnants of a buggy implementation, and should be summarily forgotten.
The only people who need to care about 2x and 2y are those who may have been using crypt_blowfish back in 2011. And the only people who need to care about 2b are those who may have been running OpenBSD.
All other correct implementations are identical and correct.