Introduction
Recently I needed to extract the following information from certificates and PKCS7 messages:
- A certificate's validity period (notBefore, notAfter attributes)
- A PKCS7 digital signature's author and signing time
Some basics
Digital certificates are ASN.1 (Abstract Syntax Notation One) structures encoded using DER (Distinguished Encoding Rules).
ASN.1 is something like Backus-Naur Form used for describing data structures, e.g.:
MyType ::= SEQUENCE {
    myObjectIdentifier OBJECT IDENTIFIER,
    myNumbers          SEQUENCE OF MyNumber,
    myMessage          VisibleString
}
MyNumber ::= INTEGER (0..255)
Although it's nearly 30 years old (being originally part of the CCITT X.409:1984 spec), it's still widely used in the Public Key Infrastructure world. For example, standards such as X.509 (digital certificates) and PKCS (Public Key Cryptography Standards) make use of ASN.1. Put simply, it's a common way to define data structures.
Besides native data types like booleans, integer numbers, real numbers, date-times, strings and null, ASN.1 includes keywords to build complex data types. In the above example, SEQUENCE was used to build a C-like struct data structure and SEQUENCE OF to build a list of numbers between 0 and 255.
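For readers more at home in Python than in C, the MyType description above can be loosely mirrored with a dataclass. This is only an analogy to show what SEQUENCE, SEQUENCE OF and the (0..255) constraint express; it is not how PyASN1 models types, and the Python field names are made up for this sketch.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MyType:
    # OBJECT IDENTIFIER, kept here as a dotted string such as "2.5.4.3"
    my_object_identifier: str
    # SEQUENCE OF MyNumber, where MyNumber ::= INTEGER (0..255)
    my_numbers: List[int] = field(default_factory=list)
    # VisibleString
    my_message: str = ""

    def __post_init__(self):
        # Enforce the (0..255) value constraint of MyNumber.
        for number in self.my_numbers:
            if not 0 <= number <= 255:
                raise ValueError("MyNumber out of range 0..255: %r" % number)


example = MyType("2.5.4.3", [0, 128, 255], "hello")
```

Constructing MyType("2.5.4.3", [256]) raises a ValueError, which is roughly what a constraint-aware ASN.1 library would do for an out-of-range MyNumber.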
The CHOICE keyword acts much like C's union keyword. It's used to pack several alternative data structures into the same space. One very special primitive type is Object Identifier. It's used to reference a globally registered data type or semantic interpretation. For example, the commonName attribute used in the Subject field (of type Name) of digital certificates has the globally registered ID 2.5.4.3.
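To make Object Identifiers a little less abstract, here is a minimal sketch of how a dotted OID ends up as bytes in DER. The function name encode_oid is made up for this example, and the sketch only handles encodings shorter than 128 bytes.

```python
def encode_oid(dotted):
    """Return the DER encoding (tag, length, content) of a dotted OID string."""
    arcs = [int(part) for part in dotted.split(".")]
    if len(arcs) < 2:
        raise ValueError("an OID needs at least two arcs")
    # The first two arcs are packed into a single value: 40 * first + second.
    values = [40 * arcs[0] + arcs[1]] + arcs[2:]
    content = bytearray()
    for value in values:
        # Base-128 groups, most significant first; every byte except the
        # last one of a group has its high bit set.
        chunk = [value & 0x7F]
        value >>= 7
        while value:
            chunk.append((value & 0x7F) | 0x80)
            value >>= 7
        content.extend(reversed(chunk))
    if len(content) > 127:
        raise ValueError("long-form lengths are not handled in this sketch")
    # 0x06 is the universal tag for OBJECT IDENTIFIER.
    return bytes([0x06, len(content)]) + bytes(content)


print(encode_oid("2.5.4.3").hex())  # 0603550403
```

So commonName (2.5.4.3) occupies only five bytes on the wire, which is one reason OIDs are preferred over spelled-out names inside certificates.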
On the other hand, DER is a way to digitally encode ASN.1 data structures with the goal to transfer this information to some other party.
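DER stores every element as a tag, a length and a value (TLV), and constructed types such as SEQUENCE simply carry further TLV elements in their value. The following sketch walks such a structure by hand. The helper names read_tlv and walk are made up for this example, and it assumes single-byte tags, which covers the common universal types.

```python
def read_tlv(data, offset=0):
    """Read one DER element; return (tag, value_bytes, next_offset)."""
    tag = data[offset]
    length = data[offset + 1]
    offset += 2
    if length & 0x80:
        # Long form: the low 7 bits give the number of length bytes that follow.
        num_bytes = length & 0x7F
        length = int.from_bytes(data[offset:offset + num_bytes], "big")
        offset += num_bytes
    value = data[offset:offset + length]
    return tag, value, offset + length


def walk(data):
    """Yield (tag, value) for every element at one nesting level."""
    offset = 0
    while offset < len(data):
        tag, value, offset = read_tlv(data, offset)
        yield tag, value


# SEQUENCE { INTEGER 5, INTEGER 7 }, encoded by hand for this example
encoded = bytes.fromhex("3006020105020107")
outer_tag, body, _ = read_tlv(encoded)
children = list(walk(body))
```

Here outer_tag is 0x30 (SEQUENCE) and children holds two elements with tag 0x02 (INTEGER). Note that nothing in the bytes says what the two integers mean; that information lives only in the ASN.1 description.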
If you have a certificate in PEM format, it's easy to convert it to DER with OpenSSL:
openssl x509 -in cert.pem -out cert.der -outform DER
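PEM is essentially the Base64 text form of the DER bytes, wrapped in BEGIN/END marker lines, so the same conversion can be sketched in a few lines of Python. The function names below are made up for this example, and the sketch assumes the input contains exactly one PEM block without extra header lines.

```python
import base64


def pem_to_der(pem_text):
    """Strip the BEGIN/END marker lines and Base64-decode what is in between."""
    lines = [line.strip() for line in pem_text.strip().splitlines()]
    if (len(lines) < 2
            or not lines[0].startswith("-----BEGIN ")
            or not lines[-1].startswith("-----END ")):
        raise ValueError("not a PEM block")
    return base64.b64decode("".join(lines[1:-1]))


def der_to_pem(der_bytes, label="CERTIFICATE"):
    """Base64-encode DER bytes and wrap them in 64-character lines."""
    b64 = base64.b64encode(der_bytes).decode("ascii")
    body = "\n".join(b64[i:i + 64] for i in range(0, len(b64), 64))
    return "-----BEGIN %s-----\n%s\n-----END %s-----\n" % (label, body, label)
```

A typical use would be something like pem_to_der(open('cert.pem').read()), with the result written to cert.der in binary mode.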
The digital certificate data structure
The X.509v3 digital certificate data structure is quite complex. The IETF has published its format as used on the Internet, which has evolved over time: RFC 2459 -> RFC 3280 -> RFC 5280.
Here is the ASN.1 description of the first two hierarchy levels:
Certificate ::= SEQUENCE {
    tbsCertificate       TBSCertificate,
    signatureAlgorithm   AlgorithmIdentifier,
    signatureValue       BIT STRING }
TBSCertificate ::= SEQUENCE {
    version         [0]  EXPLICIT Version DEFAULT v1,
    serialNumber         CertificateSerialNumber,
    signature            AlgorithmIdentifier,
    issuer               Name,
    validity             Validity,
    subject              Name,
    subjectPublicKeyInfo SubjectPublicKeyInfo,
    issuerUniqueID  [1]  IMPLICIT UniqueIdentifier OPTIONAL,
                         -- If present, version shall be v2 or v3
    subjectUniqueID [2]  IMPLICIT UniqueIdentifier OPTIONAL,
                         -- If present, version shall be v2 or v3
    extensions      [3]  EXPLICIT Extensions OPTIONAL
                         -- If present, version shall be v3
    }
Well, at this level it doesn't look all that complex.
Name is basically a collection of tuples (Object Identifier, Value), where:
- Object Identifier is a globally registered identifier which you can look up on the Internet, e.g. at oid-info.com. One example is "commonName", which is 2.5.4.3.
- Value is normally a string. (There are several types of strings in ASN.1.)
The "not so trivial" part of this data structure is the extensions part, which may only be present in X.509 certificates of version 3 or later. The original RFC states:
The extensions defined for X.509 v3 certificates provide methods for
associating additional attributes with users or public keys and for
managing the certification hierarchy.
One of the more interesting standard extensions is the Subject Alternative Names (aka SubjectAltName) extension:
The subject alternative names extension allows additional identities
to be bound to the subject of the certificate. Defined options
include an Internet electronic mail address, a DNS name, an IP
address, and a uniform resource identifier (URI). Other options
exist, including completely local definitions. Multiple name forms,
and multiple instances of each name form, may be included. Whenever
such identities are to be bound into a certificate, the subject
alternative name (or issuer alternative name) extension MUST be used.
Because the subject alternative name is considered to be
definitively bound to the public key, all parts of the subject
alternative name MUST be verified by the CA.
As it happens, some of our officially recognized Spanish Certificate Authorities pack non-standard attributes into SubjectAltNames. The extension data is itself a DER-encoded ASN.1 data package, so you have to feed it through the appropriate parser.
About reading X.509 digital certificates with Python
Now, we already know that X.509 certificates are ASN.1, DER-encoded data structures. Thanks to the excellent PyASN1 library we can read those data structures. But something is still missing. DER-encoded ASN.1 data packages are not self-describing, i.e. we must have a data structure description, just like a C typedef struct or a Python class definition.
It would be great to have an ASN.1 to PyASN1 compiler. Then we could pick up the X509v3 ASN.1 description and translate it to Python. Until recently there was none, but now there is an attempt to fill this gap: asn1ate. Before that, most existing data model descriptions for PyASN1 were translated by hand. The separate PyASN1-modules package includes common data structures like PKCS12, X509v3 (RFC2459), etc.
Here is one example: the PKCS12 data structure translated to PyASN1:
#
# PKCS#12 syntax
#
# ASN.1 source from:
# ftp://ftp.rsasecurity.com/pub/pkcs/pkcs-12/pkcs-12.asn
#
# Sample captures could be obtained with "openssl pkcs12" command
#
from pyasn1.type import tag, namedtype, namedval, univ, constraint
from pyasn1_modules.rfc2459 import *
from pyasn1_modules import rfc2251

class Attributes(univ.SetOf):
    componentType = rfc2251.Attribute()

class Version(univ.Integer): pass

class CertificationRequestInfo(univ.Sequence):
    componentType = namedtype.NamedTypes(
        namedtype.NamedType('version', Version()),
        namedtype.NamedType('subject', Name()),
        namedtype.NamedType('subjectPublicKeyInfo', SubjectPublicKeyInfo()),
        namedtype.NamedType('attributes',
            Attributes().subtype(implicitTag=tag.Tag(
                tag.tagClassContext, tag.tagFormatConstructed, 0)))
    )
Hands on with PyASN1
Now, let's try to parse a certificate. You can find the test certificate used in this example in the pyx509 package described below. You can also generate your own certificate with OpenSSL:
openssl req -x509 -newkey rsa:2048 -keyout key.pem -out cert.der -days 1000 -outform DER -nodes
First, we'll install pyasn1 and pyasn1-modules:
$ sudo pip install pyasn1
Downloading/unpacking pyasn1
Downloading pyasn1-0.1.7.tar.gz (68kB): 68kB downloaded
Running setup.py egg_info for package pyasn1
Installing collected packages: pyasn1
Running setup.py install for pyasn1
Successfully installed pyasn1
Cleaning up...
$ sudo pip install pyasn1-modules
Downloading/unpacking pyasn1-modules
Downloading pyasn1-modules-0.0.5.tar.gz
Running setup.py egg_info for package pyasn1-modules
Requirement already satisfied (use --upgrade to upgrade): pyasn1>=0.1.4 in /Library/Python/2.7/site-packages (from pyasn1-modules)
Installing collected packages: pyasn1-modules
Running setup.py install for pyasn1-modules
Successfully installed pyasn1-modules
Cleaning up...
Now we'll go ahead and read a certificate:
$ python
>>> from pyasn1.codec.der.decoder import decode
>>> from pyasn1_modules import rfc2459
>>> derData = file('cert.der', 'rb').read()
>>> cert, rest = decode(derData, asn1Spec=rfc2459.Certificate())
>>> print cert.prettyPrint()
Certificate:
tbsCertificate=TBSCertificate:
version='v3'
serialNumber=1019333950
signature=AlgorithmIdentifier:
algorithm=1.2.840.113549.1.1.5
parameters=0x0500
issuer=Name:
=RDNSequence:
RelativeDistinguishedName:
AttributeTypeAndValue:
type=2.5.4.6
value=0x13024553
RelativeDistinguishedName:
AttributeTypeAndValue:
type=2.5.4.10
value=0x1304464e4d54
RelativeDistinguishedName:
AttributeTypeAndValue:
type=2.5.4.11
value=0x130f464e4d5420436c6173652032204341
validity=Validity:
notBefore=Time:
utcTime=100903074356Z
notAfter=Time:
utcTime=130903074356Z
subject=Name:
=RDNSequence:
RelativeDistinguishedName:
AttributeTypeAndValue:
type=2.5.4.6
value=0x13024553
RelativeDistinguishedName:
AttributeTypeAndValue:
type=2.5.4.10
value=0x1304464e4d54
RelativeDistinguishedName:
...
Not bad for the first attempt. But nearly all attributes seem to be encoded. Let's get the subject and see if we can transform it to a readable string.
>>> cert = cert['tbsCertificate']  # just get the core part of the certificate
>>> subject = cert['subject']
>>> rdnsequence = subject[0]  # the subject consists of only one component
>>> for rdn in rdnsequence:
...     oid, value = rdn[0]  # rdn only has 1 component: (object id, value) tuple
...     print oid, ':', str(value)
...
2.5.4.6 : ES
2.5.4.10 : FNMT
2.5.4.11 : FNMT Clase 2 CA
2.5.4.11 : 703002474
2.5.4.3 : 8NOMBRE REVILLA DERKSEN ALEJANDRO ERNESTO - NIF ...
Now we have some readable output. The Object Identifiers have the following meaning:
- 2.5.4.6: countryName, abbreviated: C
- 2.5.4.10: organizationName, abbreviated: O
- 2.5.4.11: organizationalUnitName, abbreviated: OU
- 2.5.4.3: commonName, abbreviated: CN
With OpenSSL, it is normally displayed like this:
$ openssl x509 -in ub1204/svn/pyx509/exampledata/cert.der -inform DER -subject -noout
subject= /C=ES/O=FNMT/OU=FNMT Clase 2 CA/OU=703002474/CN=NOMBRE REVILLA DERKSEN ALEJANDRO ERNESTO - NIF ...
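The hex values in the earlier dump (for example 0x13024553) are themselves small DER elements: a tag byte (0x13, PrintableString), a length byte and the text. That is probably also why a stray "8" shows up before NOMBRE in the session above: it looks like the length byte being printed as a character. The following sketch strips that header and renders the pairs in the OpenSSL style. The helper names are made up for this example, and it assumes ASCII string values with short-form lengths (under 128 bytes).

```python
# Abbreviations for the attribute types listed above.
SHORT_NAMES = {
    "2.5.4.6": "C",
    "2.5.4.10": "O",
    "2.5.4.11": "OU",
    "2.5.4.3": "CN",
}


def strip_der_header(raw):
    """Drop the tag and short-form length bytes from a DER-encoded string."""
    length = raw[1]
    if length & 0x80:
        raise ValueError("long-form length not handled in this sketch")
    return raw[2:2 + length].decode("ascii")


def format_subject(pairs):
    """Render (oid, raw DER value) pairs as /C=../O=../OU=.. text."""
    parts = []
    for oid, raw in pairs:
        name = SHORT_NAMES.get(oid, oid)  # fall back to the dotted OID
        parts.append("%s=%s" % (name, strip_der_header(raw)))
    return "/" + "/".join(parts)


# Issuer values copied from the prettyPrint() dump above
issuer = [
    ("2.5.4.6", bytes.fromhex("13024553")),
    ("2.5.4.10", bytes.fromhex("1304464e4d54")),
    ("2.5.4.11", bytes.fromhex("130f464e4d5420436c6173652032204341")),
]
print(format_subject(issuer))  # /C=ES/O=FNMT/OU=FNMT Clase 2 CA
```

In real code it would be more robust to decode the inner value with the DER decoder instead of slicing bytes, since names may also use other string types such as UTF8String.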
Using pyx509 to parse X.509 certificates
The pyx509 library is an attempt to offer a more Pythonic data structure. It brings its own model of X.509 for PyASN1. My fork of pyx509 adds the ability to parse and display SubjectAltName directory name (dirName) name parts. Sorry, there is still no PyPI package or setup.py, so you have to download the zip/tarball and unpack it.
Here is an example:
./x509_parse exampledata/cert.der
=== X509 Certificate ===
X.509 version: 3 (0x2)
Serial no: 0x3cc1cd3e
Signature algorithm: SHA1/RSA
Issuer: C=ES, O=FNMT, OU=FNMT Clase 2 CA
Validity:
Not Before: 2010-09-03 07:43:56
Not After: 2013-09-03 07:43:56
Subject: C=ES, CN=NOMBRE REVILLA DERKSEN ALEJANDRO ERNESTO - NIF ..., O=FNMT, OU=703002474, OU=FNMT Clase 2 CA
Subject Public Key Info:
Public Key Algorithm: RSA
Modulus: (b64)
...
Exponent: 65537
Extensions:
...
Subject Alternative Name: is_critical: False
email: ernesto.revilla@gmail.com
dirName: Apellido1=REVILLA, Apellido2=DERKSEN, DNI=..., Nombre=ALEJANDRO ERNESTO
...
=== EOF X509 Certificate ===
This seems to give us a much more usable output and may be a good alternative to parsing OpenSSL output.
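The Not Before / Not After values above correspond to the raw utcTime strings seen in the PyASN1 dump earlier (100903074356Z and 130903074356Z). If all you need is the validity period, a small helper is enough to turn them into datetime objects. This is a sketch with made-up function names; it only accepts the YYMMDDHHMMSSZ form and applies the two-digit year rule from RFC 5280 (50 to 99 means 19YY, 00 to 49 means 20YY).

```python
from datetime import datetime, timezone


def parse_utctime(text):
    """Parse a 'YYMMDDHHMMSSZ' UTCTime string into an aware datetime."""
    if len(text) != 13 or not text.endswith("Z"):
        raise ValueError("unsupported UTCTime form: %r" % text)
    yy = int(text[0:2])
    # RFC 5280 rule for two-digit years.
    year = 1900 + yy if yy >= 50 else 2000 + yy
    return datetime(year, int(text[2:4]), int(text[4:6]),
                    int(text[6:8]), int(text[8:10]), int(text[10:12]),
                    tzinfo=timezone.utc)


def is_valid_at(not_before, not_after, moment):
    """Check whether an aware datetime falls inside the validity period."""
    return parse_utctime(not_before) <= moment <= parse_utctime(not_after)


print(parse_utctime("100903074356Z"))  # 2010-09-03 07:43:56+00:00
```

Certificates valid beyond 2049 use GeneralizedTime with a four-digit year instead, which this helper deliberately rejects.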
Displaying digital signatures / timestamps with pyx509
With pyx509 we can also display some data from digital signatures that comply with PKCS7:
= PKCS7 signature block =
PKCS7 Version: 1
== Encapsulated content Info ==
ContentType: data
Content: None
== Signer info ==
Certificate serial number: 0x89bbba0749918db3
Issuer: C=es, ...
Digest Algorithm: SHA-1
Signature: (b64)
gXpU5jadSY+FVBoeCdvn1/m5bzEMzN3ZKuiN9sPk79iJgX+DDDOMH6K5Scnh
wLL7nHRT983GlhTY1A2QE1VryWTbuBGK08oalKIM8QZs3UfZa5dXsx83eS4b
/M/icfIf6CHu1fWZ4VBJ4mva2N3nh2r0FV09bvuj1bodl4kXJAs=
Attributes:
contentType: data
serialNumber: 0x89bbba0749918db3
signingTime: 2011-10-04 14:36:51
messageDigest: y8OX3qoZBY4/Cc6/w0xuRqzzQzU=
signingCertificate: 0x89bbba0749918db3
== EOF Signer info ==
=== X509 Certificate ===
X.509 version: 3 (0x2)
Serial no: 0x89bbba0749918db3
Signature algorithm: SHA1/RSA
Issuer: C=es, ...
Validity:
Not Before: 2011-09-17 00:00:00
Not After: 2031-09-12 11:09:54
Subject: C=es, CN=REVILLA DERKSEN, ALEJANDRO ERNESTO...
Subject Public Key Info:
Public Key Algorithm: RSA
Modulus: (b64)
AOs2/Pip46F5BJPBQd/5bwS1HO97lJ74ZjJfGtvEH831d6Ld4bsF9jdFOjlx
mv+kxYNFryZZFWM109+zng/PiU8NZPRZt4XlTO7qb3r2g5AR17EQWJNokQto
s3w3cXSEDPxxFmTHEhGarTLddEg2o1v9/UIlMS8mzHej0Q9uBuuh
Exponent: 65537
Extensions:
Authority Key Id Ext: is_critical: False
key id: (b64)
AiuDvGb4bxWnCsZJ9/RHNrRhSxk=
Basic Constraints Ext: is_critical: False
CA: False
max_path_len: None
Subject Alternative Name: is_critical: False
email: tramitacion.electronica@telefonica.es
dirName: Apellido1=REVILLA, Apellido2=DERKSEN, DNI=..., Nombre=ALEJANDRO ERNESTO
Subject Key Id: is_critical: False
key id: (b64)
Zh0L6JJSz+GgiCimE4U7s5PHH+g=
Signature: (b64)
k1OVoQyNZv0ASor/bitI6JgJm37piIheIzwdKSgEtKeQuIXfA5V5rclPVUg7
PW71JTQyY8iDbvJB4sb4FH5XyjOXUmf3CXiG7ppS48cQXSf1k3wHWZB0neTE
V3XxZnPjqWvv0x0ScsOGKxpHjyy8SFZMKR6tnfQ4TXfHMxid7dw=
=== EOF X509 Certificate ===
We can clearly see that there is one signature block (Signer Info) which specifies the original message's digest, the digest algorithm used (SHA-1), the signature, a reference to the certificate, the certificate itself and the signing time.
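The messageDigest attribute shown above is the Base64 form of a 20-byte SHA-1 hash. Assuming it is the digest of the original message (the signature is detached here, since Content is None), a first sanity check can be done with the standard library alone. This sketch uses made-up function names and only compares digests; it does not verify the signature itself.

```python
import base64
import hashlib


def sha1_b64(content):
    """Return the Base64-encoded SHA-1 digest of some bytes."""
    return base64.b64encode(hashlib.sha1(content).digest()).decode("ascii")


def digest_matches(content, message_digest_b64):
    """Compare the SHA-1 of content with a Base64 messageDigest value."""
    return hashlib.sha1(content).digest() == base64.b64decode(message_digest_b64)
```

With the original signed file at hand, digest_matches(data, 'y8OX3qoZBY4/Cc6/w0xuRqzzQzU=') would tell whether it is the document this signature block refers to.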
Here is an example of a timestamp token obtained from a public Time Stamp Authority (TSA):
./pkcs7_parse.py exampledata/timestamp.tst
= PKCS7 signature block =
PKCS7 Version: 3
== Encapsulated content Info ==
ContentType: TimeStampToken
=== Timestamp Info ===
Version: 1
Policy: 1.3.4.6.1.3.4.6
msgImprint:
Algorithm Id: 1.3.14.3.2.26
Value: (b64)
rnLdD3molzRsebPvq7oOSG9n8fU=
Serial number: 134059559
Time: 20131011084712Z
==== Accuracy ====
Seconds: 1
Milis: 1
Micros 2
==== EOF Accuracy ====
TSA:
=== EOF Timestamp Info ===
== Signer info ==
Certificate serial number: 0x5079e
Issuer: C=ES, CN=MINISDEF-EC-WPG, O=MDEF, OU=PKI
Digest Algorithm: SHA-1
Signature: (b64)
...
Attributes:
contentType: TimeStampToken
messageDigest: KpRSk0vbBke+8G40MIII9NNb51E=
signingCertificate: 0x5079e
== EOF Signer info ==
=== X509 Certificate ===
X.509 version: 3 (0x2)
Serial no: 0x5079e
Signature algorithm: SHA1/RSA
Issuer: C=ES, CN=MINISDEF-EC-WPG, O=MDEF, OU=PKI
Validity:
Not Before: 2011-08-17 09:50:22
Not After: 2021-08-17 09:50:22
Subject: C=ES, CN=Sello de tiempo TS@ - @firma - desarrollo, O=MDEF, OU=PKI, serialNumber=S2833002E
Subject Public Key Info:
Public Key Algorithm: RSA
Modulus: (b64)
...
Exponent: 65537
Extensions:
...
Extended Key Usage: is_critical: True
timeStamping
Key Usage: is_critical: True
digitalSignature,nonRepudiation
Subject Alternative Name: is_critical: False
email: soporte.afirma5@mpt.es
dirName: CN=TS@- Autoridad Sellado de tiempo-desarrollo, O=Ministerio de la Política Territorial y Administración Pública, certType=sello de tiempo, serialNumber=S2833002E
...
=== EOF X509 Certificate ===
= EOF PKsCS7 signature block =
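Unlike the certificate validity dates, the Time field of the timestamp info is a GeneralizedTime with a four-digit year, and the Accuracy block tells how far the real time may deviate from it. A minimal sketch that turns both into a time window follows; the function names are made up for this example, and fractional seconds in the time string are not handled.

```python
from datetime import datetime, timedelta, timezone


def parse_generalized_time(text):
    """Parse 'YYYYMMDDHHMMSSZ' (no fractional seconds) into an aware datetime."""
    return datetime.strptime(text, "%Y%m%d%H%M%SZ").replace(tzinfo=timezone.utc)


def accuracy_window(gen_time, seconds=0, millis=0, micros=0):
    """Return (earliest, latest) by subtracting and adding the accuracy."""
    moment = parse_generalized_time(gen_time)
    delta = timedelta(seconds=seconds, milliseconds=millis, microseconds=micros)
    return moment - delta, moment + delta


# Values taken from the timestamp token output above
earliest, latest = accuracy_window("20131011084712Z", seconds=1, millis=1, micros=2)
```

For the token above, the window spans a little over two seconds around 2013-10-11 08:47:12 UTC.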
Conclusions
Although pyx509 is rather incomplete, it may fulfill your needs and may be an alternative to parsing certificates, digital signatures and timestamps with OpenSSL.