Where is the PEM file format specified?

For quite a long time, there was no formal specification of the PEM format with regards to cryptographic exchange of information. PEM is the textual encoding, but what is actually being encoded depends on the context. In April 2015, the IETF approved RFC 7468, which finally documents how various implementations exchange data using PEM textual encoding. The following list, taken directly from the RFC, describes the PEM format used for the following scenarios:

  1. Certificates, Certificate Revocation Lists (CRLs), and Subject
    Public Key Info structures in the Internet X.509 Public Key
    Infrastructure Certificate and Certificate Revocation List (CRL)
    Profile [RFC5280].
  2. PKCS #10: Certification Request Syntax [RFC2986].
  3. PKCS #7: Cryptographic Message Syntax [RFC2315].
  4. Cryptographic Message Syntax [RFC5652].
  5. PKCS #8: Private-Key Information Syntax [RFC5208], renamed to One
    Asymmetric Key in Asymmetric Key Package [RFC5958], and Encrypted
    Private-Key Information Syntax in the same documents.
  6. Attribute Certificates in An Internet Attribute Certificate
    Profile for Authorization [RFC5755].

According to this RFC, for the above scenarios you can expect the following labels to be within the BEGIN header and END footer. Figure 4 of the RFC has more detail, including corresponding ASN.1 types.

That’s not the full story, though. The RFC was written by looking at existing implementations and documenting what they did. The RFC wasn’t written first, nor was it written based on some existing authoritative documentation. So if you end up in a situation where you want to inter-operate with some implementation, you may have to look into the implementation’s source code to figure out what they support.

For example, OpenSSL defines these BEGIN and END markers in crypto/pem/pem.h. Here is an excerpt from the header file with all the BEGIN and END labels that they support.

# define PEM_STRING_X509_OLD     "X509 CERTIFICATE"
# define PEM_STRING_X509         "CERTIFICATE"
# define PEM_STRING_X509_TRUSTED "TRUSTED CERTIFICATE"
# define PEM_STRING_X509_REQ_OLD "NEW CERTIFICATE REQUEST"
# define PEM_STRING_X509_REQ     "CERTIFICATE REQUEST"
# define PEM_STRING_X509_CRL     "X509 CRL"
# define PEM_STRING_EVP_PKEY     "ANY PRIVATE KEY"
# define PEM_STRING_PUBLIC       "PUBLIC KEY"
# define PEM_STRING_RSA          "RSA PRIVATE KEY"
# define PEM_STRING_RSA_PUBLIC   "RSA PUBLIC KEY"
# define PEM_STRING_DSA          "DSA PRIVATE KEY"
# define PEM_STRING_DSA_PUBLIC   "DSA PUBLIC KEY"
# define PEM_STRING_PKCS7        "PKCS7"
# define PEM_STRING_PKCS7_SIGNED "PKCS #7 SIGNED DATA"
# define PEM_STRING_PKCS8        "ENCRYPTED PRIVATE KEY"
# define PEM_STRING_PKCS8INF     "PRIVATE KEY"
# define PEM_STRING_DHPARAMS     "DH PARAMETERS"
# define PEM_STRING_DHXPARAMS    "X9.42 DH PARAMETERS"
# define PEM_STRING_SSL_SESSION  "SSL SESSION PARAMETERS"
# define PEM_STRING_DSAPARAMS    "DSA PARAMETERS"
# define PEM_STRING_ECDSA_PUBLIC "ECDSA PUBLIC KEY"
# define PEM_STRING_ECPARAMETERS "EC PARAMETERS"
# define PEM_STRING_ECPRIVATEKEY "EC PRIVATE KEY"
# define PEM_STRING_PARAMETERS   "PARAMETERS"
# define PEM_STRING_CMS          "CMS"

These labels are a start, but you still have to look into how the implementation encodes the data between the labels. There’s not one correct answer for everything.

Leave a Comment