HTTPS stands for Hypertext Transfer Protocol Secure, also commonly referred to as HTTP over TLS or HTTP over SSL. HTTPS is not a separate protocol from HTTP; it simply means using SSL/TLS to encrypt HTTP requests and responses.
SSL/TLS
SSL (Secure Sockets Layer) and TLS (Transport Layer Security) are essentially the same thing: TLS is the modern successor of the now-deprecated SSL. In most contexts, the two terms are used interchangeably.
TLS is a security protocol which mainly performs three tasks:
- Privacy - encrypting data between the client and server using Encryption Algorithms.
- Authentication - ensuring that the server is who it claims to be using Certificates.
- Integrity - verifying that data has not been forged or tampered with using a Message Authentication Code (MAC).
In addition to the use case of HTTPS, TLS can also be used to encrypt other communications such as Email or VoIP.
ref:
https://www.cloudflare.com/learning/ssl/transport-layer-security-tls/
Encryption Algorithms
There are 2 types of encryption algorithms:
- Symmetric Encryption
  - There is only one key: the client and server use the same key to encrypt and decrypt.
  - Fast and cheap (nanoseconds per operation).
  - A common algorithm is AES.
- Asymmetric Encryption (also known as Public-key Encryption)
  - There is a pair of keys: the public key encrypts the message, and only the corresponding private key can decrypt it.
  - Slow and expensive (microseconds to milliseconds per operation).
  - Some common algorithms are RSA and Diffie-Hellman (DH). Strictly speaking, DH is a key exchange algorithm rather than an encryption algorithm.
TLS actually uses both Asymmetric Encryption and Symmetric Encryption, a so-called hybrid cryptosystem. Simply speaking, TLS first uses an asymmetric algorithm to exchange shared secrets between both sides, then generates a symmetric key (the session key) from the shared secrets, and finally uses the session key to encrypt application data (HTTP request/response). A system that involves certificates and public-key encryption is often called a Public Key Infrastructure (PKI).
I'm not an expert in Cryptography or Information Security, but I'm going to talk about what RSA and Diffie-Hellman are a little bit since they are crucial to TLS.
RSA
RSA uses a pair of keys: the public key can be shared publicly, and the private key, as the name suggests, must be kept secret. Data encrypted with the public key can only be decrypted with the private key, and vice versa. Since RSA is a relatively slow algorithm (generating a key pair is especially expensive), we usually generate one key pair and reuse it for every connection. This is why RSA keys are considered static.
RSA is also often used for Digital Signatures. In this case, the message is hashed first, since RSA operations can't handle messages longer than the key size. The sender generates a signature by signing (encrypting) the hash with its own private key, and sends both the message and the signature to the receiver. The receiver also hashes the message, verifies (decrypts) the signature with the corresponding public key, and checks whether the decrypted hash equals the hash of the message. If they are equal, the message was indeed sent by the sender, because no one else has the private key.
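To make the encrypt/decrypt and sign/verify flow more concrete, here is a minimal sketch of "textbook" RSA in Python. The key pair is an assumption chosen for illustration (tiny primes 61 and 53), the function names are made up for this post, and there is no padding, so this is nowhere near what a real library does. It only shows the math behind the description above.

```python
import hashlib

# Toy key pair built from tiny primes (p=61, q=53). Real keys are 2048+ bits.
N = 61 * 53   # modulus, shared by the public and the private key (3233)
E = 17        # public exponent
D = 2753      # private exponent, chosen so that (E * D) % lcm(60, 52) == 1


def encrypt(m: int) -> int:
    """Encrypt a small integer (m < N) with the public key."""
    return pow(m, E, N)


def decrypt(c: int) -> int:
    """Decrypt with the private key."""
    return pow(c, D, N)


def digest(message: bytes) -> int:
    """Hash the message, then shrink the hash into the range of the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """'Encrypt' the hash with the private key."""
    return pow(digest(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """'Decrypt' the signature with the public key and compare with our own hash."""
    return pow(signature, E, N) == digest(message)


message = b"hello"
signature = sign(message)
print(verify(message, signature))   # True
print(decrypt(encrypt(65)) == 65)   # True
```

Real implementations add padding schemes (for example OAEP for encryption and PSS for signatures) on top of this, which the sketch deliberately leaves out.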
ref:
https://www.comparitech.com/blog/information-security/digital-signatures/
Diffie-Hellman (DH)
There are many variants of Diffie-Hellman, for instance, Diffie-Hellman Ephemeral (DHE), Elliptic Curve Diffie–Hellman (ECDH), and Elliptic Curve Diffie-Hellman Ephemeral (ECDHE).
Let's talk about how DHE works first:
1. Both client and server agree on a set of DH parameters: `g` (generator) and `p` (prime).
   - Instead of being exchanged, `g` and `p` are usually predefined in the software that client and server use.
   - These values are public so it's ok that an attacker knows them.
2. Each of client and server generates a random number as the private key, and calculates the public key from DH formula 1: `(g^own_private_key) mod p`.
   - There are 2 key pairs:
     - Client private key
     - Client public key
     - Server private key
     - Server public key
   - Since the 2 private keys are randomly generated for every connection, this is the "Ephemeral" part of DHE.
3. Each of client and server sends their public keys to the other side.
4. Each of client and server calculates the same shared secret from DH formula 2: `(the_other's_public_key^own_private_key) mod p`.
   - Client's shared secret = `(server_public_key^client_private_key) mod p`.
   - Server's shared secret = `(client_public_key^server_private_key) mod p`.
   - Magically, client's shared secret == server's shared secret.
Then we can use the shared secret for Symmetric Encryption.
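The DHE steps can be sketched in a few lines of Python. The parameter values below (`P = 23`, `G = 5`) and the helper names are assumptions picked for readability. Real deployments use standardized groups with primes of 2048 bits or more.

```python
import secrets

# Step 1: public DH parameters (toy values, far too small for real use).
P = 23  # prime
G = 5   # generator


def generate_key_pair():
    """Step 2: random private key, public key = (G^private_key) mod P."""
    private_key = secrets.randbelow(P - 2) + 1  # random integer in [1, P-2]
    public_key = pow(G, private_key, P)
    return private_key, public_key


def compute_shared_secret(own_private_key: int, other_public_key: int) -> int:
    """Step 4: (the_other's_public_key^own_private_key) mod P."""
    return pow(other_public_key, own_private_key, P)


client_private_key, client_public_key = generate_key_pair()
server_private_key, server_public_key = generate_key_pair()

# Step 3: only the public keys travel over the network.
client_secret = compute_shared_secret(client_private_key, server_public_key)
server_secret = compute_shared_secret(server_private_key, client_public_key)

print(client_secret == server_secret)  # True
```

The "magic" is just exponent arithmetic: both sides end up computing `G^(client_private_key * server_private_key) mod P`, while an eavesdropper only ever sees the two public keys.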
If, in step 2, both client and server always use the same private keys for every connection, that is Static DH. With DHE, the key pairs are temporary, so a compromise of one connection's private keys does not jeopardize the privacy of other connections. This property is known as Perfect Forward Secrecy (PFS). Moreover, if we replace the DH formulas in steps 2 and 4 with elliptic curve formulas, that is ECDH.
ref:
https://www.wst.space/ssl-part-2-diffie-hellman-key-exchange/
https://crypto.stackexchange.com/questions/67797/in-diffie-hellman-are-g-and-p-universal-constants-or-are-they-chosen-by-one
Certificates
To obtain a valid SSL certificate, the server first needs to create a Certificate Signing Request (CSR) file with its private key (an RSA key, for example) and submit it to a Certificate Authority (CA). A CA is an organization, a trusted third party, that generates and gives out SSL certificates. The CA signs the certificate with its own private key, allowing clients to verify it with the CA's public key. Operating systems and browsers come with the public keys (root certificates) of all the major CAs pre-installed.
An SSL certificate contains:
- The domain name and associated subdomains
- The issuer (CA)
- The CA's digital signature
- The expiration date of the certificate
- The server's public key
A certificate is actually part of a chain of multiple certificates (Chain of Trust), usually three or more: the server's certificate, the intermediate CA's certificate, and the root CA's certificate. In a TLS communication, the client first verifies the signature of the server's certificate with the intermediate CA's public key, then checks the signature of the intermediate CA's certificate with the root CA's public key. Finally, the root CA's certificate is inherently trusted by OSs or browsers. If any verification fails or the root certificate is not trusted (as with a self-signed certificate), the TLS communication terminates.
Clients also check whether the certificate was issued for the correct domain.
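As a rough illustration of these checks from the client side, here is a sketch using Python's standard `ssl` module. The helper names `make_client_context` and `fetch_server_certificate` are made up for this post. The default context loads the system's trusted root certificates, requires a valid chain, and checks the domain name.

```python
import socket
import ssl


def make_client_context() -> ssl.SSLContext:
    """Build a context that trusts the OS root CAs and verifies chain + hostname."""
    return ssl.create_default_context()


def fetch_server_certificate(host: str, port: int = 443) -> dict:
    """Connect to host and return its verified certificate as a dict.

    Raises ssl.SSLCertVerificationError if the chain or the domain check fails.
    """
    context = make_client_context()
    with socket.create_connection((host, port), timeout=5) as sock:
        # server_hostname is the name the certificate is checked against.
        with context.wrap_socket(sock, server_hostname=host) as tls_sock:
            return tls_sock.getpeercert()


context = make_client_context()
print(context.verify_mode == ssl.CERT_REQUIRED)  # True: an untrusted chain aborts the handshake
print(context.check_hostname)                    # True: the certificate must match the domain
```

Calling `fetch_server_certificate("example.com")` (it needs network access) should return a dict with fields such as `subject`, `issuer`, `notAfter`, and `subjectAltName`, which map to the certificate contents listed earlier.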
ref:
https://www.cloudflare.com/learning/ssl/what-is-an-ssl-certificate/
https://security.stackexchange.com/questions/56389/ssl-certificate-framework-101-how-does-the-browser-actually-verify-the-validity
Message Authentication Code (MAC) Algorithms
Simply speaking, the sender calculates a MAC code by doing `mac_code = mac_function(key, message)`, then sends the message along with the MAC code to the receiver. The receiver also calculates a MAC code of the message using the same MAC algorithm and the shared secret key, and checks whether both MAC values are equal. If they are equal, the integrity of the message is confirmed. A MAC code is sometimes called a tag or a cryptographic checksum.
A common MAC algorithm is HMAC.
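Here is a small sketch of that flow with HMAC-SHA256 from Python's standard library. The key and messages are made-up placeholders, and `verify_mac` is a helper name invented for this example.

```python
import hashlib
import hmac


def mac_function(key: bytes, message: bytes) -> bytes:
    """Compute an HMAC-SHA256 code of the message under the shared key."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify_mac(key: bytes, message: bytes, mac_code: bytes) -> bool:
    """Recompute the MAC on the receiver side and compare in constant time."""
    return hmac.compare_digest(mac_function(key, message), mac_code)


key = b"shared-secret-key"       # known to both sender and receiver
message = b"GET /index.html"

mac_code = mac_function(key, message)                 # sender side
print(verify_mac(key, message, mac_code))             # True
print(verify_mac(key, b"GET /admin", mac_code))       # False: the message was altered
```

`hmac.compare_digest` is used instead of `==` so that the comparison time does not leak how many leading bytes matched.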
TLS Handshake
The TLS handshake is the foundational part of an HTTPS communication. It happens after the TCP connection is established and before the HTTP request/response cycle. The purpose of the TLS handshake is to negotiate a session key used to encrypt/decrypt HTTP data.
Most major browsers dropped support for TLS 1.0 and 1.1 in 2020, and TLS 1.2 and 1.3 are currently the most widely used versions of TLS, so we will focus on those two. TLS 1.2 supports multiple algorithms for key exchange, whereas TLS 1.3 only uses Diffie-Hellman (RSA key exchange has been completely removed). We are therefore going to talk about the TLS handshake with RSA in TLS 1.2 and the TLS handshake with Diffie-Hellman in TLS 1.3.
ref:
https://www.cloudflare.com/learning/ssl/what-happens-in-a-tls-handshake/
https://www.thesslstore.com/blog/explaining-ssl-handshake/
https://www.thesslstore.com/blog/cipher-suites-algorithms-security-settings/
TLS Handshake with RSA in TLS 1.2
Before the TLS handshake, the client and server need to establish a TCP connection if HTTP/1.1 or HTTP/2 is used (HTTP/3 would be another story). TCP is bi-directional, and multiple packets can be sent in one trip.
The TLS handshake in TLS 1.2 takes 2 roundtrips:
| Client | Server |
|---------------------------------------------------|---------------------------------------------------|
| -> TCP: SYN | |
| | <- TCP: SYN ACK |
| -> TCP: ACK | |
| -> TLS: Client Hello (plaintext) | |
| | <- TLS: Server Hello (plaintext) |
| | <- TLS: Server Certificate (plaintext) |
| | <- TLS: Server Hello Done (plaintext) |
| -> TLS: Client Key Exchange (plaintext) | |
| -> TLS: Client Change Cipher Spec (plaintext) | |
| -> TLS: Client Handshake Finished (encrypted) | |
| | <- TLS: Server Change Cipher Spec (plaintext) |
| | <- TLS: Server Handshake Finished (encrypted) |
| -> HTTP: Request (encrypted) | |
| | <- HTTP: Response (encrypted) |
- Client Hello
  - Sending the following data:
    - A random "Client Random" string.
    - A list of supported Cipher Suites (a list of cryptographic algorithms).
- Server Hello
  - Sending the following data:
    - A random "Server Random" string.
    - The selected Cipher Suite.
- Server Certificate
  - Sending the SSL certificate, which contains the server's public key.
- Server Hello Done
  - Telling the client that the server has finished sending the messages above.
- Client Key Exchange
  - Sending a random "Pre-master Secret" encrypted with the server's public key.
- Client Change Cipher Spec
  - Generating the session key from Client Random, Server Random, and Pre-master Secret.
  - Telling the server that the client is ready for encrypted communication.
- Client Handshake Finished
  - Sending verification data encrypted with the session key.
- Server Change Cipher Spec
  - Generating the session key from Client Random, Server Random, and Pre-master Secret.
  - Telling the client that the server is ready for encrypted communication.
- Server Handshake Finished
  - Sending verification data encrypted with the session key.
After the TLS handshake, both client and server start to encrypt/decrypt application data with the session key.
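To show how both sides can arrive at the same session key from Client Random, Server Random, and Pre-master Secret, here is a sketch of the key derivation function (PRF) that TLS 1.2 defines in RFC 5246, in its SHA-256 variant. The function names and the random inputs are assumptions for this example. In a real handshake the Pre-master Secret reaches the server encrypted with its RSA public key, which is skipped here.

```python
import hashlib
import hmac
import os


def p_sha256(secret: bytes, seed: bytes, length: int) -> bytes:
    """P_hash from RFC 5246: chain HMACs until enough bytes are produced."""
    output = b""
    a = seed  # A(0) = seed
    while len(output) < length:
        a = hmac.new(secret, a, hashlib.sha256).digest()  # A(i) = HMAC(secret, A(i-1))
        output += hmac.new(secret, a + seed, hashlib.sha256).digest()
    return output[:length]


def prf(secret: bytes, label: bytes, seed: bytes, length: int) -> bytes:
    """PRF(secret, label, seed) = P_SHA256(secret, label + seed)."""
    return p_sha256(secret, label + seed, length)


def derive_master_secret(pre_master_secret: bytes, client_random: bytes,
                         server_random: bytes) -> bytes:
    """48-byte master secret from which the session keys are expanded."""
    return prf(pre_master_secret, b"master secret", client_random + server_random, 48)


def derive_key_block(master_secret: bytes, client_random: bytes,
                     server_random: bytes, length: int) -> bytes:
    """Key material that is split into encryption/MAC keys (note the swapped randoms)."""
    return prf(master_secret, b"key expansion", server_random + client_random, length)


client_random = os.urandom(32)
server_random = os.urandom(32)
pre_master_secret = b"\x03\x03" + os.urandom(46)  # protocol version + 46 random bytes

# Each side runs the same derivation on the same three inputs.
client_master = derive_master_secret(pre_master_secret, client_random, server_random)
server_master = derive_master_secret(pre_master_secret, client_random, server_random)
print(client_master == server_master, len(client_master))  # True 48
```

What this post calls "the session key" is, more precisely, a set of keys cut out of the key block returned by `derive_key_block`. How long that block is and how it is split depends on the selected Cipher Suite.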
For a complete byte-by-byte illustration of TLS 1.2 Handshake, see:
https://tls.ulfheim.net/
ref:
https://medium.com/kuranda-labs-engineering/tls-6d9f75adba9f
TLS Handshake with Elliptic Curve Diffie-Hellman Ephemeral (ECDHE) in TLS 1.3
TLS, like any encrypted communication, has always added overhead when it comes to performance. One of the significant changes in TLS 1.3 is that the handshake only requires one roundtrip instead of two, which makes TLS 1.3 connections faster to establish than older versions.
TLS 1.3 has also reduced the number of supported Cipher Suites from 37 to 5 by removing weak and less-used cryptographic algorithms, and only supports ephemeral Diffie-Hellman (typically ECDHE) as the key exchange algorithm, which means that the client can send its key share information right away at the beginning of the handshake. In other words, TLS 1.3 merges Client Key Exchange into Client Hello.
The TLS handshake in TLS 1.3 takes 1 roundtrip (one and a half, actually):
| Client | Server |
|---------------------------------------------------|---------------------------------------------------|
| -> TCP: SYN | |
| | <- TCP: SYN ACK |
| -> TCP: ACK | |
| -> TLS: Client Hello (plaintext) | |
| | <- TLS: Server Hello (plaintext) |
| | <- TLS: Wrapper (encrypted) |
| | Server Certificate |
| | Server Handshake Finished |
| -> TLS: Wrapper (encrypted) | |
| Client Handshake Finished | |
| -> HTTP: Request (encrypted) | |
| | <- HTTP: Response (encrypted) |
- Client Hello
  - Before Client Hello, calculating an ephemeral key pair for the ECDHE key share based on the selected curve.
  - Sending the following data:
    - A random "Client Random" string.
    - A list of supported Cipher Suites.
    - The selected curve (which plays the role of the DH parameters `g` and `p`) for ECDHE.
    - The client's ECDHE public key.
- Server Hello
  - Before Server Hello, calculating an ephemeral key pair for the ECDHE key share based on the client's selected curve.
  - Sending the following data:
    - A random "Server Random" string.
    - The selected Cipher Suite.
    - The server's ECDHE public key.
- Server Wrapper
  - Before Server Wrapper, generating the handshake key from its own ECDHE private key and the client's ECDHE public key.
  - Containing the following records:
    - Server Certificate
    - Server Handshake Finished
  - They are encrypted with the handshake key.
  - After Server Wrapper, generating the session key from Client Random, Server Random, and the handshake key.
- Client Wrapper
  - Before Client Wrapper, generating the handshake key from its own ECDHE private key and the server's ECDHE public key.
  - Containing the following records:
    - Client Handshake Finished
  - They are encrypted with the handshake key.
  - After Client Wrapper, generating the session key from Client Random, Server Random, and the handshake key.
After the TLS handshake, both client and server start to encrypt/decrypt application data with the session key.
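The sketch below imitates the idea of this handshake: each side combines its own ephemeral private key with the other side's public key, then feeds the shared secret into HKDF (the key derivation function TLS 1.3 is built on) to get a handshake key and a session key. Several things here are simplifying assumptions: it uses classic finite-field DH with a toy prime instead of an elliptic curve (Python's standard library has no curve support), plain HKDF instead of the real TLS 1.3 key schedule, and made-up labels and function names.

```python
import hashlib
import hmac
import secrets

P = 2**127 - 1  # toy prime standing in for the curve, far too small for real use
G = 3


def generate_key_pair():
    """Ephemeral key pair, regenerated for every connection."""
    private_key = secrets.randbelow(P - 2) + 1
    return private_key, pow(G, private_key, P)


def hkdf_extract(salt: bytes, input_key_material: bytes) -> bytes:
    """HKDF-Extract (RFC 5869): turn the shared secret into a fixed-size key."""
    return hmac.new(salt, input_key_material, hashlib.sha256).digest()


def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """HKDF-Expand (RFC 5869): stretch the key into `length` bytes bound to `info`."""
    output, block, counter = b"", b"", 1
    while len(output) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        output += block
        counter += 1
    return output[:length]


def derive_keys(own_private_key: int, other_public_key: int,
                client_random: bytes, server_random: bytes):
    """Return (handshake_key, session_key) derived from the DH shared secret."""
    shared_secret = pow(other_public_key, own_private_key, P).to_bytes(16, "big")
    prk = hkdf_extract(b"\x00" * 32, shared_secret)
    context = client_random + server_random
    handshake_key = hkdf_expand(prk, b"handshake" + context, 32)
    session_key = hkdf_expand(prk, b"application" + context, 32)
    return handshake_key, session_key


client_random, server_random = secrets.token_bytes(32), secrets.token_bytes(32)
client_private_key, client_public_key = generate_key_pair()  # sent in Client Hello
server_private_key, server_public_key = generate_key_pair()  # sent in Server Hello

client_keys = derive_keys(client_private_key, server_public_key, client_random, server_random)
server_keys = derive_keys(server_private_key, client_public_key, client_random, server_random)
print(client_keys == server_keys)  # True
```

Because the client's public key is already in Client Hello, the server can derive the handshake key as soon as it has generated its own key pair, which is why everything after Server Hello can be encrypted.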
For a complete byte-by-byte illustration of TLS 1.3 Handshake, see:
https://tls13.ulfheim.net/
ref:
https://www.thesslstore.com/blog/tls-1-3-everything-possibly-needed-know/