How should I ethically approach user password storage for later plaintext retrieval?

How about taking another angle on this problem? Ask why the password is required to be in plaintext: if it’s so that the user can retrieve the password, then strictly speaking you don’t need to retrieve the password they set (they don’t remember what it is anyway); you need to be able to give them a password they can use.

Think about it: if the user needs to retrieve the password, it’s because they’ve forgotten it, in which case a new password is just as good as the old one. One drawback of the password reset mechanisms in common use today, though, is that the passwords they generate are usually a bunch of random characters, so they’re difficult for the user to type in correctly unless they copy and paste. That can be a problem for less savvy computer users.

One way around that problem is to provide auto-generated passwords that are more or less natural language text. While a natural language string might not have the entropy of a random character string of the same length, nothing says your auto-generated password may have only 8 (or 10 or 12) characters. Get a high-entropy auto-generated passphrase by stringing together several random words (leave a space between them, so they’re still recognizable and typeable by anyone who can read). Six random words of varying length are probably easier to type correctly and with confidence than 10 random characters, and they can have higher entropy as well. For example, a 10-character password drawn randomly from uppercase letters, lowercase letters, digits and 10 punctuation symbols (72 valid symbols in total) has an entropy of 10 × log2(72) ≈ 61.7 bits. A six-word passphrase whose words are selected randomly from a dictionary of 7776 words (as Diceware uses) has an entropy of 6 × log2(7776) ≈ 77.5 bits. See the Diceware FAQ for more info.
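As a rough illustration of the idea, here is a minimal Python sketch of a passphrase generator. The function name `generate_passphrase` and the tiny `SAMPLE_WORDS` list are made up for this example; a real system would load a full word list (such as the 7776-word Diceware list) instead. It uses the standard library `secrets` module, which draws from a cryptographically secure random source.

```python
import secrets

# Tiny stand-in word list, for illustration only (an assumption of this sketch).
# A real deployment would load a large list, e.g. the 7776-word Diceware list.
SAMPLE_WORDS = ["admit", "prose", "flare", "table", "acute", "flair", "cloud", "mango"]


def generate_passphrase(words, n_words=6, separator=" "):
    """Return n_words words chosen independently and uniformly at random."""
    if not words:
        raise ValueError("word list must not be empty")
    if n_words < 1:
        raise ValueError("n_words must be at least 1")
    # secrets.choice uses the OS's secure random source, unlike random.choice.
    return separator.join(secrets.choice(words) for _ in range(n_words))


if __name__ == "__main__":
    print(generate_passphrase(SAMPLE_WORDS))
```

Words are drawn with replacement, so the same word may appear twice; that is what keeps each word worth the full log2(list size) bits.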

  • a passphrase with about 77 bits of entropy: “admit prose flare table acute flair”

  • a password with about 74 bits of entropy: “K:&$R^tt~qkD”
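The entropy figures for both examples can be checked with a few lines of Python. This is only a sketch: `entropy_bits` is a name invented here, and it assumes every symbol (or word) is picked independently and uniformly at random, which is the only case where the simple length × log2(pool size) formula applies.

```python
import math


def entropy_bits(pool_size, length):
    """Entropy in bits of `length` independent uniform picks from `pool_size` options."""
    if pool_size < 1 or length < 0:
        raise ValueError("pool_size must be >= 1 and length must be >= 0")
    return length * math.log2(pool_size)


if __name__ == "__main__":
    # 72 symbols: 26 uppercase + 26 lowercase + 10 digits + 10 punctuation.
    print(f"10 random characters: {entropy_bits(72, 10):.1f} bits")
    print(f"12 random characters: {entropy_bits(72, 12):.1f} bits")
    # 7776 words is the size of the Diceware list.
    print(f"6 random words:       {entropy_bits(7776, 6):.1f} bits")
```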

I know I’d prefer typing the phrase, and with copy and paste the phrase is no harder to use than the password, so there’s no loss there. Of course, if your website (or whatever the protected asset is) doesn’t need 77 bits of entropy for an auto-generated passphrase, generate fewer words (which I’m sure your users would appreciate).
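To pick the number of words for a given strength, one can work backwards from a target entropy. The helper below is a hypothetical sketch (the name `words_needed` and the default list size of 7776 are assumptions taken from the Diceware example), again assuming words are chosen uniformly at random.

```python
import math


def words_needed(target_bits, dictionary_size=7776):
    """Smallest number of random words whose combined entropy reaches target_bits."""
    if target_bits <= 0:
        raise ValueError("target_bits must be positive")
    if dictionary_size < 2:
        raise ValueError("dictionary_size must be at least 2")
    bits_per_word = math.log2(dictionary_size)  # about 12.9 bits for 7776 words
    return math.ceil(target_bits / bits_per_word)


if __name__ == "__main__":
    for target in (50, 64, 77):
        print(f"{target} bits -> {words_needed(target)} words")
```

With the 7776-word list this gives 4 words for a 50-bit target, 5 words for 64 bits, and 6 words for 77 bits.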

I understand the argument that some password-protected assets really don’t have a high level of value, so the breach of a password might not be the end of the world. For example, I probably wouldn’t care if 80% of the passwords I use on various websites were breached: all that could happen is someone spamming or posting under my name for a while. That wouldn’t be great, but it’s not like they’d be breaking into my bank account. However, given that many people use the same password for their web forum sites as they do for their bank accounts (and probably national security databases), I think it would be best to handle even those ‘low-value’ passwords as non-recoverable.
