Digital Certificates: TLS and more

Introduction

A digital certificate serves a very simple purpose. Its job is to certify the ownership of a public key. This gives the user of the certificate the confidence that the public key has not been tampered with. Certificates are issued by a Certificate Authority (CA).

Certificates find many uses. They are crucial to the whole concept of TLS, which is used both to encrypt messages and to authenticate them. Another area of use is email encryption.

What does a certificate look like?

Certificates usually conform to the X.509 structure. Here is a sample certificate picked up from Wikipedia:

Certificate:
   Data:
       Version: 1 (0x0)
       Serial Number: 7829 (0x1e95)
       Signature Algorithm: md5WithRSAEncryption
       Issuer: C=ZA, ST=Western Cape, L=Cape Town, O=Thawte Consulting cc,
               OU=Certification Services Division,
               CN=Thawte Server CA/emailAddress=server-certs@thawte.com
       Validity
           Not Before: Jul  9 16:04:02 1998 GMT
           Not After : Jul  9 16:04:02 1999 GMT
       Subject: C=US, ST=Maryland, L=Pasadena, O=Brent Baccala,
                OU=FreeSoft, CN=www.freesoft.org/emailAddress=baccala@freesoft.org
       Subject Public Key Info:
           Public Key Algorithm: rsaEncryption
           RSA Public Key: (1024 bit)
               Modulus (1024 bit):
                   00:b4:31:98:0a:c4:bc:62:c1:88:aa:dc:b0:c8:bb:
                   33:35:19:d5:0c:64:b9:3d:41:b2:96:fc:f3:31:e1:
                   66:36:d0:8e:56:12:44:ba:75:eb:e8:1c:9c:5b:66:
                   70:33:52:14:c9:ec:4f:91:51:70:39:de:53:85:17:
                   16:94:6e:ee:f4:d5:6f:d5:ca:b3:47:5e:1b:0c:7b:
                   c5:cc:2b:6b:c1:90:c3:16:31:0d:bf:7a:c7:47:77:
                   8f:a0:21:c7:4c:d0:16:65:00:c1:0f:d7:b8:80:e3:
                   d2:75:6b:c1:ea:9e:5c:5c:ea:7d:c1:a1:10:bc:b8:
                   e8:35:1c:9e:27:52:7e:41:8f
               Exponent: 65537 (0x10001)
   Signature Algorithm: md5WithRSAEncryption
       93:5f:8f:5f:c5:af:bf:0a:ab:a5:6d:fb:24:5f:b6:59:5d:9d:
       92:2e:4a:1b:8b:ac:7d:99:17:5d:cd:19:f6:ad:ef:63:2f:92:
       ab:2f:4b:cf:0a:13:90:ee:2c:0e:43:03:be:f6:ea:8e:9c:67:
       d0:a2:40:03:f7:ef:6a:15:09:79:a9:46:ed:b7:16:1b:41:72:
       0d:19:aa:ad:dd:9a:df:ab:97:50:65:f5:5e:85:a6:ef:19:d1:
       5a:de:9d:ea:63:cd:cb:cc:6d:5d:01:85:b5:6d:c8:f3:d9:f7:
       8f:0e:fc:ba:1f:34:e9:96:6e:6c:cf:f2:ef:9b:bf:de:b5:22:
       68:9f

When you open a website (using HTTPS), your browser gets a similar certificate from the server (the website).

The first check the browser does is on the validity period of the certificate (the Not Before and the Not After fields). If the current time does not fall between these two timestamps, your browser will complain.
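Here is a minimal sketch of that validity check in Python. It uses the standard library's ssl.cert_time_to_seconds to parse the timestamps from the sample certificate; the function name is_within_validity is my own, not something a browser exposes.

```python
import ssl
from datetime import datetime, timezone

# Validity timestamps copied from the sample certificate above.
NOT_BEFORE = "Jul  9 16:04:02 1998 GMT"
NOT_AFTER = "Jul  9 16:04:02 1999 GMT"


def is_within_validity(not_before, not_after, now=None):
    """Return True if `now` falls inside the certificate's validity window."""
    if now is None:
        now = datetime.now(timezone.utc)
    start = ssl.cert_time_to_seconds(not_before)
    end = ssl.cert_time_to_seconds(not_after)
    return start <= now.timestamp() <= end


# The sample certificate expired in 1999, so this prints False today.
print(is_within_validity(NOT_BEFORE, NOT_AFTER))
```

Passing an explicit `now` (a timezone-aware datetime) lets you see how the same certificate would have fared on some other date.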

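The second check is for the Common Name (CN). It is the FQDN of the owner of the certificate. If the website you are accessing does not match this CN (“www.freesoft.org” in this case), your browser will complain. (Modern browsers read the name from the Subject Alternative Name extension rather than the CN, but the idea is the same.)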

Note: If you want the same certificate to support multiple subdomains, you can use wildcards. Here is an example of Google’s certificate using a wildcard (it will apply to “anything.google.com”):

[Image: Google’s partial certificate, showing the wildcard name and the chain of trust]
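The name check, wildcard included, can be sketched in a few lines of Python. This is a simplification under my own assumptions (the function name hostname_matches is made up, and the wildcard is only allowed as the entire leftmost label, standing in for exactly one label); real browsers apply more rules than this.

```python
def hostname_matches(cert_name, hostname):
    """Compare a certificate name (possibly a wildcard) with the site's hostname."""
    cert_labels = cert_name.lower().split(".")
    host_labels = hostname.lower().split(".")
    if len(cert_labels) != len(host_labels):
        return False
    if cert_labels[0] == "*":
        # The wildcard stands in for exactly one non-empty leftmost label.
        return host_labels[0] != "" and cert_labels[1:] == host_labels[1:]
    return cert_labels == host_labels


print(hostname_matches("*.google.com", "anything.google.com"))  # True
print(hostname_matches("*.google.com", "google.com"))           # False
print(hostname_matches("www.freesoft.org", "www.freesoft.org")) # True
```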

Why would you (your browser) trust this certificate?

Because it has been signed by another CA! We will come to why you’ll trust this CA later, but for now let’s assume that you do trust this second CA. This CA will have its own certificate, which we will use to validate the first certificate. This second certificate will look similar to the one shown at the top.

Let’s authenticate this certificate!

Now comes the fun stuff. Our job is to verify only one thing – that the signature (mentioned at the bottom of the certificate) is actually genuine. Let’s verify this!

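In the first certificate, the Signature Algorithm is mentioned as md5WithRSAEncryption. This signifies that the second CA took the MD5 hash of the first certificate (strictly, of its Data section, since the signature itself cannot be part of what is signed), and encrypted it using RSA (which is an asymmetric algorithm). The encryption was done using their (the second CA’s) private key, which is why this step is usually called signing. The result of this MD5 followed by RSA is what is called the certificate’s signature (which, again, you can see at the bottom of the certificate). (MD5 is no longer considered safe and certificates today typically use SHA-256, but the mechanics are the same.)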

Now, all your browser needs to do is decrypt the signature (using the second CA’s public key from their certificate), extract the MD5 hash of the (first) certificate from it, and finally match this hash against an independently computed MD5 hash of the (first) certificate.

If the hashes match, it proves that the certificate has not been tampered with. Or, in other words, it has been properly signed! This is why I said earlier that our only job is to prove the authenticity of this signature (since it implicitly guarantees that the certificate has not been modified, maliciously or otherwise).
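To make the hash-then-RSA idea concrete, here is a toy sketch in Python. Everything in it is an assumption for illustration: the key is built from two small, well-known Mersenne primes (far too small for real use), the certificate data is a made-up byte string, and the padding that real RSA signatures add around the hash is left out. It needs Python 3.8+ for the modular inverse.

```python
import hashlib

# Toy RSA key pair for the "second CA" (illustrative only, not secure).
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q                           # public modulus
E = 65537                           # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent


def md5_as_int(data):
    """MD5 digest of `data`, as an integer (128 bits, so it fits below N)."""
    return int.from_bytes(hashlib.md5(data).digest(), "big")


def sign(data):
    """What the CA does: hash the data, then apply its private key."""
    return pow(md5_as_int(data), D, N)


def verify(data, signature):
    """What the browser does: undo the RSA step with the public key and compare hashes."""
    return pow(signature, E, N) == md5_as_int(data)


cert_data = b"Subject: CN=www.freesoft.org; Public Key: (toy value)"
signature = sign(cert_data)

print(verify(cert_data, signature))                                     # True
print(verify(cert_data.replace(b"freesoft", b"evilsoft"), signature))   # False
```

Changing even one byte of the certificate data changes its hash, so the old signature no longer matches.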

Easy, wasn’t it? 🙂

Chain of Trust

Now, let’s revisit our question – why would you trust the second CA? It is all about the chain of trust.

It’s quite simple actually. You take the certificate you receive (from the website) and verify its authenticity (signature) using the issuer’s (second CA’s) certificate. Now, you need to verify the authenticity of the second CA’s certificate. For that, you fetch its issuer’s (third CA’s) certificate, and authenticate it the same way. This chain goes on all the way to the top CA. The top CA’s certificate is called the Root Certificate.

Now, imagine that you trust the root CA’s certificate. This implicitly authenticates the certificates of all the CAs in the chain of trust, doesn’t it? This leads us to (the final) question.
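The walk up the chain can be sketched as a simple loop. The certificates below are hypothetical and reduced to a subject name and an issuer name; the signature check at each step is deliberately left out. The sketch also relies on the fact that a root certificate is self-signed, i.e. its issuer is itself.

```python
# Hypothetical certificates, keyed by subject, reduced to the one field the walk needs.
CERTS = {
    "www.example.org": {"issuer": "Example Intermediate CA"},
    "Example Intermediate CA": {"issuer": "Example Root CA"},
    "Example Root CA": {"issuer": "Example Root CA"},  # self-signed root
}


def build_chain(subject, certs, max_depth=10):
    """Follow issuer links from `subject` up to a self-signed root."""
    chain = [subject]
    while True:
        issuer = certs[chain[-1]]["issuer"]
        if issuer == chain[-1]:  # self-signed: we have reached the root
            return chain
        if issuer not in certs or len(chain) >= max_depth:
            raise ValueError("cannot complete the chain from " + subject)
        chain.append(issuer)


def is_trusted(subject, certs, trusted_roots):
    """The chain is trusted only if it ends in a root we already trust."""
    return build_chain(subject, certs)[-1] in trusted_roots


print(build_chain("www.example.org", CERTS))
print(is_trusted("www.example.org", CERTS, {"Example Root CA"}))  # True
```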

Why trust the Root Certificate?

In the second image (Google’s partial certificate), you can see the chain of trust. At the root is the GeoTrust Global CA. Your browser blindly trusts this CA’s certificate. This is because root-level certificates are already part of your browser/OS: they are shipped with it!
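If you are curious which root certificates your own machine ships with, Python's ssl module can give you a peek. This is only a sketch: what it prints depends entirely on your OS and Python build, and on systems that load certificates lazily from a directory the list may well be empty.

```python
import ssl

# A default context loads the root certificates that the OS / Python build trusts.
context = ssl.create_default_context()

stats = context.cert_store_stats()
print(stats)  # counts of loaded certificates, CA certificates and CRLs

# Show a few of the loaded root certificates, if any are visible.
for cert in context.get_ca_certs()[:5]:
    subject = dict(pair for rdn in cert["subject"] for pair in rdn)
    print(subject.get("organizationName"), "|", subject.get("commonName"))
```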

Epilogue

Just one line here: certificates provide (among other things) a very simple mechanism to authenticate someone’s public key.

Why do you need to authenticate someone’s public key? Read the “Why would you trust my public key?” part in my previous blog.

 

How HTTPS works

Introduction

In this article, I am going to talk about what happens behind the scenes of an HTTPS session. This post will not be a deep dive; rather, it aims to give the reader a high-level understanding, so they are better able to grasp how the protocol works.

The job of HTTPS is chiefly two-fold:

  • Prove the authenticity of a server (website) to a client (browser).
  • Provide a secure channel of communication between a client and a server.

At the heart of this protocol lies the encryption/decryption of the data being transferred.

Why encrypt data?

Because if not encrypted, it will be susceptible to eavesdropping by intruders. All your private information (credit card details, login credentials, chat conversations, etc.) will not be private anymore, if transported unencrypted (plaintext).

How is HTTP different from HTTPS?

HTTP sits at Layer 7 (the application layer) of the OSI model. Simply put, it provides a mechanism for clients and servers to interact with each other over the World Wide Web.

HTTPS is nothing but HTTP running over TLS (which is an evolution of SSL – Secure Sockets Layer). TLS is what differentiates HTTPS from HTTP.
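You can see this layering quite literally in code. In the sketch below (function names are my own), the HTTP request is the very same bytes you would send over plain HTTP; the only difference is that the TCP socket gets wrapped in TLS first. The default context from Python's ssl module verifies the server's certificate and hostname for you.

```python
import socket
import ssl


def build_request(host, path="/"):
    """A plain HTTP/1.1 request; nothing in it knows about TLS."""
    return f"GET {path} HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n".encode("ascii")


def https_get(host, path="/"):
    """Send the same HTTP request, but through a TLS-wrapped socket on port 443."""
    context = ssl.create_default_context()  # checks the certificate chain and the hostname
    chunks = []
    with socket.create_connection((host, 443)) as raw_sock:
        with context.wrap_socket(raw_sock, server_hostname=host) as tls_sock:
            tls_sock.sendall(build_request(host, path))
            while True:
                data = tls_sock.recv(4096)
                if not data:
                    break
                chunks.append(data)
    return b"".join(chunks)
```

Calling https_get with a hostname of your choice (it needs network access) returns the raw HTTP response, headers and all.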

Encryption (a slight detour)

The basic premise of the working of HTTPS is that the data exchanged between applications over the internet will be encrypted using some encryption algorithm. Encryption can broadly be categorised into two groups:

  • Symmetric encryption – Using the same key for encryption/decryption.
  • Asymmetric encryption – Using a public/private key-pair (essentially two keys). The public key is (in most cases) used for encryption of the plaintext, and the private key is used to decrypt the generated ciphertext.

Asymmetric algorithms are more complex, more computationally expensive and much slower compared to symmetric ones. (Also, the key sizes differ by quite a margin – a 256 bit symmetric AES vs a 2048 bit asymmetric RSA).

Therefore, in an HTTPS session, a combination of the two is used. A symmetric algorithm is used for the actual encryption/decryption of the messages. An asymmetric algorithm is used to transfer the symmetric key (well, not the exact key, as you’ll see below) between the two communicating systems.
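Here is a toy sketch of that combination in Python. Both ciphers are stand-ins of my own choosing so that the example runs with the standard library alone: the "symmetric algorithm" is a simple XOR with a SHA-256 based keystream (not AES), and the "asymmetric algorithm" is unpadded RSA with a tiny key. Neither is safe for real use. It needs Python 3.8+.

```python
import hashlib
import secrets

# Toy RSA key pair for the receiver (illustrative only, not secure).
P, Q = 2**61 - 1, 2**89 - 1
N, E = P * Q, 65537
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent


def toy_symmetric(key, data):
    """XOR `data` with a SHA-256 based keystream; the same call encrypts and decrypts."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        pad = hashlib.sha256(key + counter).digest()
        out.extend(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)


# Sender: a fresh symmetric key encrypts the message, the public key protects that key.
session_key = secrets.token_bytes(16)
wrapped_key = pow(int.from_bytes(session_key, "big"), E, N)
ciphertext = toy_symmetric(session_key, b"card number 1234")

# Receiver: the private key recovers the symmetric key, which decrypts the message.
recovered_key = pow(wrapped_key, D, N).to_bytes(16, "big")
print(toy_symmetric(recovered_key, ciphertext))  # b'card number 1234'
```

The slow asymmetric step touches only 16 bytes of key material; the message itself, however long, goes through the fast symmetric step.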

What prevents me from carrying out a man-in-the-middle attack by issuing my own public key?

Or,

Why would you trust my public key?

Let’s bring the Certificate Authority into the picture. The job of a CA is to provide digital certificates to entities to prove the ownership of (among other things) a public key, so that nobody else can fake it (and claim to be that entity).

Using the public key (which is part of the digital certificate), I’ll encrypt whatever needs encryption and send it over the communication channel. The beauty of asymmetric algorithms is that data encrypted using the public key can only* be decrypted using the corresponding private key (which belongs to the actual owner of the public/private key-pair). So, I will remain assured that only the intended recipient will be able to decrypt the message.

*NOTE: It isn’t that the ciphertext cannot be decrypted without the private key, just that doing so would be computationally very hard.
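A tiny experiment shows what "computationally very hard" means. With a toy RSA modulus, an attacker who only knows the public key can factor it by trial division in no time and rebuild the private key. The numbers below are a classic textbook toy key, picked by me for illustration; the same loop against a real 2048-bit modulus would not finish in any realistic amount of time. It needs Python 3.8+.

```python
def factor(n):
    """Trial division: fine for a toy modulus, hopeless for a 2048-bit one."""
    candidate = 2
    while candidate * candidate <= n:
        if n % candidate == 0:
            return candidate, n // candidate
        candidate += 1
    raise ValueError("n is prime")


# All the attacker sees: a toy public key and a ciphertext.
N, E = 3233, 17
ciphertext = pow(65, E, N)  # someone encrypted the message 65

# Break it: factor N, rebuild the private exponent, decrypt.
p, q = factor(N)
d = pow(E, -1, (p - 1) * (q - 1))
print(pow(ciphertext, d, N))  # 65
```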

Why would you encrypt a symmetric key using asymmetric encryption?

Same reason as the above – only the intended recipient will be able to retrieve the encrypted symmetric key by decrypting it with the private asymmetric key.

Now that we seem to be heading somewhere, let’s take a look at how an actual HTTPS session begins.

TLS Handshake

Before a TLS session actually begins, a handshake is performed between the client and the server. A lot of things happen behind the scenes during this handshake. (Do give this awesome article a read if you are interested in going under the hood of a typical TLS handshake.)

In brief, the following things occur (with the RSA key exchange of TLS 1.2 and earlier):

  • Client informs the server of its supported cipher suites and sends a random value; the server chooses one of the cipher suites.
  • Server sends over its certificate (issued by a CA) and a random value of its own (both random values will be used later).
  • Client authenticates the server’s certificate.
  • Client generates a Pre-Master Secret, encrypts it using the server’s public key (from the certificate), and sends it to the server.
  • Using the two random values and the Pre-Master Secret, both the client and the server generate the same Master Secret.
  • The Master Secret is used (by both client and server) to generate the necessary session keys (for encrypting messages and for computing MACs).
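The last two steps can be sketched in Python. This follows my reading of the TLS 1.2 key derivation (RFC 5246), which feeds both the client's and the server's random values into an HMAC-based pseudo-random function. The SHA-256 variant and the key sizes at the end (an AES-128 plus HMAC-SHA256 style cipher suite) are assumptions; the actual choices depend on the negotiated cipher suite.

```python
import hashlib
import hmac
import secrets


def p_sha256(secret, seed, length):
    """HMAC-based expansion: keep chaining HMACs until enough bytes are produced."""
    output, a = b"", seed
    while len(output) < length:
        a = hmac.new(secret, a, hashlib.sha256).digest()
        output += hmac.new(secret, a + seed, hashlib.sha256).digest()
    return output[:length]


def prf(secret, label, seed, length):
    return p_sha256(secret, label + seed, length)


def derive_master_secret(pre_master, client_random, server_random):
    return prf(pre_master, b"master secret", client_random + server_random, 48)


# Values exchanged during the handshake (randomly generated here).
client_random = secrets.token_bytes(32)
server_random = secrets.token_bytes(32)
pre_master_secret = b"\x03\x03" + secrets.token_bytes(46)  # version bytes + 46 random bytes

# Both sides run the same derivation and end up with the same Master Secret.
client_master = derive_master_secret(pre_master_secret, client_random, server_random)
server_master = derive_master_secret(pre_master_secret, client_random, server_random)

# Session keys are then cut out of a longer key block (sizes are an assumption).
key_block = prf(client_master, b"key expansion", server_random + client_random, 96)
client_mac_key, server_mac_key = key_block[:32], key_block[32:64]
client_enc_key, server_enc_key = key_block[64:80], key_block[80:96]
print(client_master == server_master)  # True
```

Note that the Pre-Master Secret is the only value that travels encrypted; the two random values are sent in the clear, which is fine because they are useless without it.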

Now that the TLS handshake is complete, the client and the server can begin exchanging messages using the generated session keys. Each time an encrypted message is exchanged between the client and the server, a message authentication code (MAC) computed over it is sent along as well (typically using HMAC), so the receiver can detect tampering.
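The MAC part is easy to try out with Python's hmac module. This is a simplified sketch with made-up function names: a real TLS record also mixes a sequence number and some header fields into the MAC, and newer cipher suites fold the integrity check into the encryption itself.

```python
import hashlib
import hmac
import secrets


def protect(mac_key, message):
    """Sender: compute a tag over the message with the shared MAC key."""
    tag = hmac.new(mac_key, message, hashlib.sha256).digest()
    return message, tag


def check(mac_key, message, tag):
    """Receiver: recompute the tag and compare in constant time."""
    expected = hmac.new(mac_key, message, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)


mac_key = secrets.token_bytes(32)  # stands in for a session key from the handshake
message, tag = protect(mac_key, b"transfer 10 to alice")

print(check(mac_key, message, tag))                   # True
print(check(mac_key, b"transfer 99 to mallory", tag)) # False
```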

Epilogue

Simply put, HTTPS is nothing but an encrypted/authenticated form of HTTP, and the encryption of messages is performed using symmetric algorithms (the symmetric key is exchanged using an asymmetric algorithm, whose public key is part of the digital certificate).

One question still remains. How do digital certificates work and why should you trust a CA? Seems like another blog post is in the offing!