Important Data Formats - X.509, PKCS, ASN.1 & PEM explained

Discover essential cryptography and certificate formats including X.509, PKCS#7, PKCS#10, PKCS#12, ASN.1, and PEM.

X.509

X.509 is the standard for digital certificates. There are competitors like OpenPGP, but they are used much less frequently nowadays. But wait ... X.509 isn't quite the right reference. X.509 is an ITU-T standard, while the definitions that really matter are part of RFC 5280, i.e. a standard from the IETF and not the ITU-T. The RFC defines the "Internet X.509 Public Key Infrastructure Certificate and Certificate Revocation List (CRL) Profile", a profile based on X.509, and it is the only profile that really matters in practice.

A digital certificate makes use of asymmetric cryptography and especially digital signatures. The most common use case for digital certificates today is server authentication. Everybody has used them already, probably mostly without knowing that it was an X.509 certificate: every time you access an HTTPS web site, the browser establishes a TLS connection to the web server. The web server uses an X.509 certificate to prove who it is. For example, when people access their bank's website to transfer some money from their bank account, they want to be sure that it is actually their bank they are communicating with. Attackers might try to impersonate the bank website, read the user's password and TAN, and then transfer the money to a bank account they control. HTTPS helps prevent this, because the attackers do not have the private key that belongs to the X.509 certificate of the bank's web server. When the browser says you are accessing a specific domain over an HTTPS connection, the certificate makes sure you are really connected to a web server of somebody controlling that domain.

How does the certificate do that? The certificate contains some metadata, the public key of a cryptographic key pair, and a signature by a Certification Authority. Let's go through the three parts:

Contents of an X.509 Certificate

Metadata

The metadata of the X.509 certificate states for which time it is valid, what the certificate shall be used for, and to whom the certificate was issued, among other things. In particular, a TLS server certificate contains the domain of the web server. The browser compares the domain it tried to access with the one in the certificate. If they do not match, the browser will not continue, but display a warning. If the certificate has expired, it will also display a warning.
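
The two metadata checks described above can be sketched in a few lines of Python. This is only an illustration, not how a browser implements it: the function names are made up for this example, the certificate is assumed to be a dictionary in the layout returned by Python's ssl.SSLSocket.getpeercert(), and the wildcard handling is deliberately simplified. In real code, the TLS library performs these checks for you.

```python
import ssl
from datetime import datetime, timezone


def hostname_matches(pattern: str, hostname: str) -> bool:
    """Compare one DNS name from the certificate with the requested host name.

    Simplified assumption: either an exact match, or a leading "*." wildcard
    that stands for exactly one label.
    """
    pattern, hostname = pattern.lower(), hostname.lower()
    if pattern.startswith("*."):
        first_label, _, rest = hostname.partition(".")
        return bool(first_label) and rest == pattern[2:]
    return pattern == hostname


def metadata_problems(cert: dict, hostname: str, now: datetime) -> list:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    timestamp = now.timestamp()
    # ssl.cert_time_to_seconds parses the "Jan  1 00:00:00 2024 GMT" format.
    if timestamp < ssl.cert_time_to_seconds(cert["notBefore"]):
        problems.append("not yet valid")
    if timestamp > ssl.cert_time_to_seconds(cert["notAfter"]):
        problems.append("expired")
    dns_names = [value for kind, value in cert.get("subjectAltName", ())
                 if kind == "DNS"]
    if not any(hostname_matches(name, hostname) for name in dns_names):
        problems.append("host name mismatch")
    return problems


# Hypothetical certificate metadata for illustration only.
sample_cert = {
    "notBefore": "Jan  1 00:00:00 2024 GMT",
    "notAfter": "Jan  1 00:00:00 2025 GMT",
    "subjectAltName": (("DNS", "example.com"), ("DNS", "*.example.com")),
}
```

Calling metadata_problems(sample_cert, "www.example.com", ...) with a date inside 2024 yields no problems, while a different host name or a later date produces the corresponding warning text.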

Strangely, browsers often did not display a warning if you accessed an HTTP site without TLS, although this is even less secure. If you accessed a web server with an invalid X.509 certificate, it may just be a misconfiguration and you might actually be secure from attackers spying upon your connection, while the HTTP connection provides no protection at all.

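There has been some progress on this issue, though: current browsers mark plain HTTP pages as "Not secure", and mechanisms such as HTTP Strict Transport Security (HSTS) let a site insist on HTTPS. But that's an advanced topic, so let's look at the other two things present in an X.509 certificate.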

Public Key

The certificate contains the public part of an asymmetric key pair. The cryptography allows the web server (or generally, the certificate holder if it is a different use case) to prove that it also possesses the private part of the key pair while not exposing it to the connecting client or anybody else. Thus, the client can be sure that the web server is actually the owner of the certificate and not just somebody who copied it. Copying the certificate itself is fairly easy, as the web server sends a copy of it to everybody who attempts to connect to the server. Stealing the private key is much harder, as it never needs to leave the web server. There are various methods to make stealing the private key even more difficult, such as Hardware Security Modules (HSMs) and Trusted Platform Modules (TPMs).
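
The idea of proving possession of the private key without revealing it can be illustrated with a toy challenge-response exchange. The sketch below is an assumption-laden simplification: it uses textbook RSA without padding and a tiny key built from two well-known Mersenne primes, so it is insecure and only meant to show the principle. Real TLS handshakes sign handshake data with proper signature schemes, and all names here are invented for the example.

```python
import hashlib
import secrets

# Toy key pair for illustration only; far too small and unpadded for real use.
P, Q = 2**61 - 1, 2**89 - 1          # two known Mersenne primes
N = P * Q                            # public modulus (would be in the certificate)
E = 65537                            # public exponent (would be in the certificate)
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent; never leaves the server


def digest_as_int(message: bytes) -> int:
    """Hash the message and reduce it into the range of the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def server_sign(challenge: bytes) -> int:
    """Server side: needs the private exponent D."""
    return pow(digest_as_int(challenge), D, N)


def client_verify(challenge: bytes, signature: int) -> bool:
    """Client side: needs only the public values N and E."""
    return pow(signature, E, N) == digest_as_int(challenge)


# The client picks a fresh random challenge, so old signatures cannot be replayed.
challenge = secrets.token_bytes(32)
signature = server_sign(challenge)
```

Somebody who only copied the certificate knows N and E but not D, so they cannot produce a signature that client_verify accepts for a fresh challenge.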

Signature by a Certification Authority

The metadata allows the client to check that the certificate is suitable for the specific use case. The public key shows that the other party is actually the owner of the certificate. But an attacker might just create a new key pair and a new certificate that also fulfil these two criteria. How would a client know that the certificate is trustworthy?

The certificate contains a cryptographic signature made with another key pair, one that belongs to a Certification Authority (CA) certificate. The CA certificate can be checked in the same way and is itself signed by a further CA certificate. The client can follow this "trust chain" from the so-called leaf certificate to the Root CA certificate, which it can recognize by the fact that it is self-signed, i.e. the certificate's signature is from its own key pair. Usually, this takes only one or two steps, so there are only one or two CA certificates involved.

CAs must carefully check that the metadata is correct when they issue a certificate. For example, if you want to have a TLS server certificate for your own domain, you must prove to the CA that you really control it. Depending on how thorough this check is, you get a basic or an Extended Validation (EV) certificate.

Each browser and operating system comes with a list of pre-defined Trusted Root CAs. Users and/or admins can add additional Trusted Root CAs. Some Root CAs are trusted only for specific purposes, others have a more general trust. The client checks whether the trust chain terminates in a Trusted Root CA. If it does, and all certificates in the trust chain are still valid and the metadata says that they are used according to their purpose, the leaf certificate of the web server is trusted and the connection is established.
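
Following the trust chain can be sketched as a simple loop. The data model below is hypothetical and heavily reduced: a certificate is just a subject and an issuer name, and the issuer name stands in for the cryptographic signature check that a real implementation must perform at every step, together with the validity and purpose checks.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set


@dataclass(frozen=True)
class Cert:
    subject: str
    issuer: str  # subject of the certificate whose key signed this one

    @property
    def self_signed(self) -> bool:
        return self.subject == self.issuer


def build_chain(leaf: Cert, known: Iterable[Cert], trusted_roots: Set[Cert],
                max_depth: int = 8) -> Optional[List[Cert]]:
    """Return the chain from leaf to a Trusted Root CA, or None if there is none.

    Assumption: matching issuer and subject names replace signature checks.
    """
    by_subject = {cert.subject: cert for cert in known}
    chain = [leaf]
    current = leaf
    while len(chain) <= max_depth:
        if current.self_signed:
            # A self-signed certificate ends the chain; it must be trusted.
            return chain if current in trusted_roots else None
        issuer = by_subject.get(current.issuer)
        if issuer is None:
            return None  # chain is incomplete
        chain.append(issuer)
        current = issuer
    return None  # chain is suspiciously long


# Hypothetical certificates for illustration only.
root = Cert("Example Root CA", "Example Root CA")
intermediate = Cert("Example Issuing CA", "Example Root CA")
leaf = Cert("www.example.com", "Example Issuing CA")
rogue_root = Cert("Rogue CA", "Rogue CA")
rogue_leaf = Cert("www.example.com", "Rogue CA")
```

With {root} as the set of Trusted Root CAs, the chain for leaf is found, while rogue_leaf is rejected: its chain ends in a self-signed certificate that nobody has chosen to trust.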

Validity of X.509 Certificates

X.509 certificates are all about trust that must be established. As explained, one criterion for trust is that a certificate must chain up to a Trusted Root CA. Another one is that it is within its validity period. That usually means it has not expired, but -- usually due to technical errors -- a certificate might also not be valid yet. But there is one more criterion: the CA issuing the certificate must not have revoked it. Because this is a topic in itself, we have a separate article about it.

PKCS#7

PKCS#7 is the Swiss Army Knife among cryptographic data formats and it may contain virtually anything -- encrypted messages, signed messages, signed and encrypted messages, certificates, and certificate revocation lists.

This is also the major disadvantage of this format. If an application or user gets a PKCS#7, it is not by itself clear what to do with it. Here are some important use cases:

S/MIME messages are basically emails with PKCS#7 bodies or attachments.
SCEP requests and replies are both actually PKCS#7 signed messages.
EST responses are CMS messages; the Cryptographic Message Syntax (CMS) is the IETF's successor to PKCS#7.

Encoding

Common file endings are .p7b (typically a bundle of certificates, usually DER-encoded), .p7s (a signed message or message signature), and .p7m (a signed and/or encrypted message). The PEM encoding with the label "PKCS7" is also defined, but seldom used.

Tools

In Windows, you can open PKCS#7 files with a double click and the Crypto Shell Extensions will display them for you. However, you can usually only extract certificates and revocation lists out of them, and not message contents.

You can convert these files into other formats with tools like OpenSSL.

PKCS#10

A Certificate Signing Request (CSR) as defined in PKCS#10 is a file containing a description of a certificate that you would like to acquire from a Certification Authority (CA). It has a similar structure to an X.509 certificate, but it is missing the signature from a CA. Instead, it contains the signature of the certificate requester. It is still a different format, so it is not the same as a self-signed certificate.

It can be binary DER-encoded or PEM-encoded using the "CERTIFICATE REQUEST" label.

PKCS#12

PKCS#12 is also known as PFX, especially in Windows environments. Common file endings are therefore .pfx and .p12. It contains X.509 certificates and almost always corresponding private keys, although that is actually not technically enforced.

Data in a PKCS#12 file is usually encrypted with a password. Often, only the private key is encrypted, so you could extract the certificates without knowing the password if your application allows it (most of them do not). PKCS#12 is the most common way in Windows environments to store a certificate and its private key in a file, while in Linux environments, PEM-encoded PKCS#8 files are more common.

Because the standard provides many options for how to store certificates and private keys in nested "SafeBags", PKCS#12 files have some compatibility problems, like:

Windows is known for associating private keys in a PKCS#12 file with all certificates extracted from the file, not just the one they are meant for. If the PKCS#12 file contains a certificate chain, Windows might display that it has the private key for the CA certificate.
On macOS, you cannot import PKCS#12 files if the cryptographic algorithms are too new.
It might be necessary to encrypt the certificates in a PKCS#12 file in order for receiving applications to extract them. Some applications support only very old and weak algorithms for this, which is usually not a problem, since the certificates are public anyway. But OpenSSL 3.x no longer enables these old and vulnerable algorithms by default and refuses to open such a PKCS#12 file unless its legacy provider is loaded, for example with the -legacy option of openssl pkcs12.

ASN.1 and PEM

The Abstract Syntax Notation One (ASN.1) is a language used to describe data structures. There are some pre-defined base data types like integers or sequences and an author of a protocol or file format can then define custom data types using the basic ones.

DER Encoding

The ASN.1 notation itself is defined in the ITU-T standard X.680, while X.690 defines different encodings (BER, CER, and DER) for data specified in ASN.1. For X.509-related data, the most important encoding is DER, because there is only one way to encode a given value; therefore, a hash of the binary DER representation will always have the same value, which is important for example when signing ASN.1-encoded data.
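
To make the "only one way to encode" property more tangible, here is a minimal sketch of a DER encoder for two base types, INTEGER and SEQUENCE. It is an illustration under stated assumptions, not a complete encoder: the function names are invented, only non-negative integers are supported, and all other ASN.1 types are left out.

```python
import hashlib


def der_length(length: int) -> bytes:
    """Encode a length: short form below 128, otherwise minimal long form."""
    if length < 0x80:
        return bytes([length])
    body = length.to_bytes((length.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(body)]) + body


def der_integer(value: int) -> bytes:
    """Encode a non-negative INTEGER (tag 0x02) with the minimal byte count."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    # One extra leading bit is needed so the value is not read as negative.
    size = value.bit_length() // 8 + 1
    body = value.to_bytes(size, "big")
    return b"\x02" + der_length(len(body)) + body


def der_sequence(*encoded_items: bytes) -> bytes:
    """Wrap already-encoded items in a SEQUENCE (tag 0x30)."""
    body = b"".join(encoded_items)
    return b"\x30" + der_length(len(body)) + body


# The same abstract value always yields the same bytes, hence the same hash.
first = der_sequence(der_integer(1), der_integer(128))
second = der_sequence(der_integer(1), der_integer(128))
same_hash = hashlib.sha256(first).digest() == hashlib.sha256(second).digest()
```

Because every length and every integer is written in its shortest possible form, two parties encoding the same value end up with identical bytes, which is what makes signatures over DER data verifiable.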

PEM Encoding

For many, but not all X.509-related file types, you can either store the file in binary DER encoding or apply an additional PEM encoding on top of the DER encoding. PEM uses only ASCII characters and can therefore be copied and pasted easily via the clipboard or, some decades ago when this was still relevant, sent via email.
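
In practice, PEM is the Base64 form of the DER bytes, wrapped between a BEGIN and an END line that carry a label such as "CERTIFICATE" or "CERTIFICATE REQUEST". The following sketch shows the conversion in both directions; the function names and the "EXAMPLE DATA" label are made up for this illustration, and real-world parsers are more tolerant than this one.

```python
import base64


def der_to_pem(der: bytes, label: str) -> str:
    """Wrap DER bytes in a PEM block with the given label."""
    b64 = base64.b64encode(der).decode("ascii")
    # PEM bodies are conventionally broken into lines of 64 characters.
    lines = [b64[i:i + 64] for i in range(0, len(b64), 64)]
    return "\n".join([f"-----BEGIN {label}-----", *lines,
                      f"-----END {label}-----"]) + "\n"


def pem_to_der(pem: str) -> tuple:
    """Return (label, der_bytes) for a text that holds exactly one PEM block."""
    lines = [line.strip() for line in pem.strip().splitlines()]
    begin, end = lines[0], lines[-1]
    if not (begin.startswith("-----BEGIN ") and begin.endswith("-----")):
        raise ValueError("missing BEGIN line")
    label = begin[len("-----BEGIN "):-len("-----")]
    if end != f"-----END {label}-----":
        raise ValueError("END line does not match BEGIN line")
    return label, base64.b64decode("".join(lines[1:-1]))


# Arbitrary example bytes, not a real certificate.
example_der = bytes.fromhex("300702010102020080")
example_pem = der_to_pem(example_der, "EXAMPLE DATA")
```

The label is the only hint about the content type; the Base64 body itself does not say whether it holds a certificate, a CSR, or a key.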

Tools

If you have a DER-encoded ASN.1 file and either you do not know which type it is or you have no application handling this specific type, you can still decode the raw ASN.1 structure and see what it shows. On Windows, the built-in tool certutil can do this with the command certutil -asn (certutil -decode, by contrast, only converts a Base64-encoded file back to binary).
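
What such a raw decoder does can be sketched in a few lines: DER data is a sequence of tag-length-value triples, and constructed types like SEQUENCE contain further triples. The functions below are a minimal illustration with invented names, assuming single-byte tags and well-formed input; they are not a replacement for a real ASN.1 tool.

```python
def read_tlv(data: bytes, offset: int = 0) -> tuple:
    """Read one tag-length-value triple; return (tag, value, next_offset)."""
    tag = data[offset]
    length = data[offset + 1]
    offset += 2
    if length & 0x80:
        # Long form: the low 7 bits say how many bytes hold the real length.
        count = length & 0x7F
        length = int.from_bytes(data[offset:offset + count], "big")
        offset += count
    value = data[offset:offset + length]
    return tag, value, offset + length


def dump(data: bytes, depth: int = 0) -> list:
    """Return an indented text outline of the DER structure."""
    lines = []
    offset = 0
    while offset < len(data):
        tag, value, offset = read_tlv(data, offset)
        constructed = bool(tag & 0x20)  # bit 6 marks nested content
        line = f"{'  ' * depth}tag 0x{tag:02x}, length {len(value)}"
        if constructed:
            lines.append(line)
            lines.extend(dump(value, depth + 1))
        else:
            lines.append(f"{line}: {value.hex()}")
    return lines


# A SEQUENCE holding the INTEGERs 1 and 128.
outline = dump(bytes.fromhex("300702010102020080"))
```

Printing "\n".join(outline) shows one line for the outer SEQUENCE (tag 0x30) and two indented lines for the INTEGERs (tag 0x02) inside it.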