WWW FAQs: What is SSL?

2006-09-11: SSL (Secure Sockets Layer), whose newer versions are known as TLS (Transport Layer Security), is a protocol that allows two programs to communicate with each other in a secure way. Like TCP/IP, SSL allows programs to create "sockets," endpoints for communication, and make connections between those sockets. But SSL, which is built on top of TCP, adds encryption. The HTTPS protocol spoken by web browsers when communicating with secure sites is simply the usual World Wide Web HTTP protocol, "spoken" over SSL instead of directly over TCP.
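To make the "HTTP over SSL over TCP" layering concrete, here is a minimal sketch using Python's standard ssl module. The function names make_context and fetch_status_line are my own, chosen for illustration. The code opens an ordinary TCP socket, wraps it in an encrypted layer, and then speaks plain HTTP through it.

```python
import socket
import ssl

def make_context():
    # create_default_context() loads the system's trusted certificate
    # authorities and turns on certificate and hostname verification.
    return ssl.create_default_context()

def fetch_status_line(host, port=443):
    """Open a TCP connection, wrap it in SSL/TLS, and send a plain HTTP request."""
    context = make_context()
    with socket.create_connection((host, port), timeout=10) as tcp_sock:
        # server_hostname is the name the server's certificate is checked against.
        with context.wrap_socket(tcp_sock, server_hostname=host) as secure_sock:
            request = f"HEAD / HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
            secure_sock.sendall(request.encode("ascii"))
            reply = secure_sock.recv(4096)
            # Return only the first line of the response, e.g. the HTTP status.
            return reply.split(b"\r\n", 1)[0].decode("ascii", "replace")
```

Calling fetch_status_line("www.example.com") requires a network connection. Everything the browser and server say to each other passes through the wrapped socket, so the HTTP text itself is unchanged.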

In addition to providing privacy, SSL encryption also allows us to verify the identity of the party we are talking to. This can be very important if we don't trust the Internet. While it is unlikely in practice that the root DNS servers of the Internet will be subverted, a "man in the middle" attack elsewhere on the network could substitute the address of one Internet site for another. SSL prevents this scenario by providing a mathematically sound way to verify the other program's identity. When you log on to your bank's website, you want to be very, very sure you are talking to your bank!

How SSL Works

SSL provides both privacy and identity verification using a technique called "public/private key encryption" (often called "asymmetric encryption" or simply "public key encryption").

A "public key" is a string of letters and numbers that anyone can use to encrypt a message so that only the owner of that key can read it. This is possible because every public key has a corresponding private key that is kept secret by the owner, and only the private key can decrypt what the public key encrypted.

How exactly are the public and private keys related? That depends on the algorithm (mathematical method) used. SSL allows several algorithms, of which the most famous is the RSA algorithm invented by Ron Rivest, Adi Shamir and Len Adleman of MIT in 1977.

Several algorithms, including RSA, depend on properties of very large prime numbers. For instance, it is very difficult to factor a number that is the product of two large primes, unless you already know one of the primes.
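Here is a toy sketch of how an RSA key pair comes out of two primes. The primes are deliberately tiny so the numbers stay readable, while real keys use primes hundreds of digits long. The variable and function names are illustrative, and this is "textbook" RSA without the padding that real software adds. It assumes Python 3.8 or later for the three-argument pow with a negative exponent.

```python
# Toy RSA with tiny primes. Anyone could factor n = 3233 in moments;
# the security of real RSA comes from n being far too large to factor.
p, q = 61, 53                 # the two secret primes
n = p * q                     # 3233, the public modulus
phi = (p - 1) * (q - 1)       # 3120, easy to compute only if you know p and q
e = 17                        # public exponent: (e, n) is the public key
d = pow(e, -1, phi)           # private exponent: (d, n) is the private key

def encrypt(message, public_exponent=e, modulus=n):
    """Anyone holding the public key can do this."""
    return pow(message, public_exponent, modulus)

def decrypt(ciphertext, private_exponent=d, modulus=n):
    """Only the holder of the private key can do this."""
    return pow(ciphertext, private_exponent, modulus)

message = 65                  # a message must be a number smaller than n
ciphertext = encrypt(message)     # 2790
recovered = decrypt(ciphertext)   # 65 again
```

Computing d required phi, and phi required knowing p and q. Someone who sees only n and e has to factor n to get there.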

Public and private keys can also be used in the opposite way: a message encrypted with the private key can only be decrypted (read) with the public key. This comes in handy at the beginning of the conversation, as a way of verifying the other program's identity.

The SSL Handshake: Identity and Privacy

Let's suppose Jane wants to log into www.examplebank.com. When Jane's web browser makes an HTTPS connection to www.examplebank.com, her browser sends the bank's server a string of randomly generated data, which we'll call the "greeting."

The web server responds with two things: its own public key encoded in an SSL certificate, which we'll examine more closely later, and the "greeting" encrypted with its private key.

Jane's web browser then decrypts the greeting with the bank's public key. If the decrypted greeting matches the original greeting sent by the browser, then Jane's browser can be sure it is really talking to the owner of the private key - because only the holder of the private key can encrypt a message in such a way that the corresponding public key will decrypt it.

Now, let's suppose Bob is monitoring this traffic on the Internet. He has the bank's public key, and Jane's greeting. But he doesn't have the bank's private key. So he can't encrypt the greeting and send it back. That means Jane can't be fooled by Bob.
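The greeting exchange can be sketched with the same textbook RSA arithmetic. This is a simplified model of the idea described here, not the actual SSL wire protocol, and the function names are my own. The primes are well-known Mersenne primes, picked only because they are easy to write down, and the keys are still far too small and too predictable to be secure.

```python
import secrets

def make_keypair(p, q, e=65537):
    """Return (public_key, private_key) for textbook RSA from two primes."""
    n = p * q
    d = pow(e, -1, (p - 1) * (q - 1))
    return (e, n), (d, n)

def answer_greeting(greeting, private_key):
    """Server side: transform the greeting with the private key."""
    d, n = private_key
    return pow(greeting, d, n)

def greeting_matches(greeting, response, public_key):
    """Browser side: undo the transformation with the public key and compare."""
    e, n = public_key
    return pow(response, e, n) == greeting

bank_public, bank_private = make_keypair(2**61 - 1, 2**89 - 1)
# Bob can generate a key pair of his own, but it is not the bank's.
_, bob_private = make_keypair(2**107 - 1, 2**127 - 1)

# Jane's browser picks a random greeting in the range [2, n - 1].
greeting = secrets.randbelow(bank_public[1] - 2) + 2

bank_reply = answer_greeting(greeting, bank_private)
bob_reply = answer_greeting(greeting, bob_private)

bank_ok = greeting_matches(greeting, bank_reply, bank_public)  # True
bob_ok = greeting_matches(greeting, bob_reply, bank_public)    # False
```

Because the greeting is random and different every time, Bob also can't simply record one of the bank's old replies and play it back later.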

The Identity Problem

But what if Bob inserts himself into the picture even before Jane's browser connects to the bank? What if Jane's browser is actually talking to Bob's server from the very beginning? Then Bob can substitute his own public and private keys, encrypt the greeting successfully, and convince Jane's browser that his computer is the bank's. Not good!

That's why the complete SSL handshake includes more than just the bank's public key. The public key is part of an SSL certificate issued by a certificate authority that Jane's browser already trusts.

How does this work? When web browser software is installed on a computer, it already contains the public keys of several certificate authorities, such as GoDaddy, VeriSign and Thawte. Companies that want their secure sites to be "trusted" by web browsers must purchase an SSL certificate from one of these authorities.

But what is the certificate, exactly? The SSL certificate consists essentially of the bank's public key and a statement identifying the bank, together with a digital signature over both, produced with the certificate authority's private key.

When the bank's web server sends its certificate to Jane's browser, Jane's browser checks the signature with the public key of the certificate authority. If the certificate is fake or has been altered, the check fails. If the certificate is valid, the browser can trust the bank's public key and the identifying statement. And if that statement doesn't include, among other information, the same hostname that Jane connected to, Jane receives an appropriate warning message and can decide not to continue the connection.

Now, let's return to Bob. Can he substitute himself convincingly for the bank? No, he can't, because he doesn't have the certificate authority's private key. That means he can't sign a certificate claiming that he is the bank.
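A rough sketch of the certificate check follows. Here the certificate authority signs a hash of the hostname and public key with its private key, and the browser verifies that signature with the authority's public key. The dictionary layout, the function names and the hostname www.bob.example are invented for illustration. Real certificates use the X.509 format and padded signatures, and the keys here are toy-sized.

```python
import hashlib

def make_keypair(p, q, e=65537):
    """Return (public_key, private_key) for textbook RSA from two primes."""
    n = p * q
    return (e, n), (pow(e, -1, (p - 1) * (q - 1)), n)

def digest(hostname, subject_public_key, modulus):
    """Hash the certificate contents down to a number smaller than the modulus."""
    data = f"{hostname}|{subject_public_key[0]}|{subject_public_key[1]}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % modulus

def issue_certificate(hostname, subject_public_key, ca_private):
    """Certificate authority side: sign the hostname and key together."""
    d, n = ca_private
    signature = pow(digest(hostname, subject_public_key, n), d, n)
    return {"hostname": hostname, "public_key": subject_public_key,
            "signature": signature}

def certificate_is_valid(cert, requested_hostname, ca_public):
    """Browser side: check the CA's signature, then check the hostname."""
    e, n = ca_public
    expected = digest(cert["hostname"], cert["public_key"], n)
    signed_by_ca = pow(cert["signature"], e, n) == expected
    return signed_by_ca and cert["hostname"] == requested_hostname

ca_public, ca_private = make_keypair(2**107 - 1, 2**127 - 1)
bank_public, _ = make_keypair(2**61 - 1, 2**89 - 1)
bob_public, _ = make_keypair(2**19 - 1, 2**31 - 1)

bank_cert = issue_certificate("www.examplebank.com", bank_public, ca_private)

# Bob's first try: copy the bank's certificate and swap in his own key.
forged_cert = dict(bank_cert, public_key=bob_public)

# Bob's second try: present a genuine certificate for his own site.
bob_cert = issue_certificate("www.bob.example", bob_public, ca_private)
```

With these definitions, certificate_is_valid accepts bank_cert for www.examplebank.com and rejects both of Bob's attempts. The forged certificate fails the signature check, and Bob's genuine certificate fails the hostname check.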

Now that Jane's browser is thoroughly convinced that the bank is what it appears to be, the conversation can continue.

After the Handshake: Symmetric Key Encryption

Jane's browser and the bank could continue to communicate with public key encryption. But public key encryption is very processor-intensive - it makes both computers work hard. And that slows down both systems. Jane's browser might not matter, since Jane's computer is probably only talking to one site at a time. But the bank's server is communicating with hundreds of customers and can't afford to do the math!

Fortunately, now that Jane's browser trusts the bank's server, there's an easier way. Jane's browser simply tells the bank's server that the rest of the conversation should be carried out using a "symmetric" cipher - a method of encryption that is much less work for the computer than public/private key, or "asymmetric," encryption. Symmetric ciphers use a single key that is shared by both sides. Jane's browser picks a cipher (an "algorithm," or mathematical method, of encryption, such as AES, the Advanced Encryption Standard) and randomly generates the key to be used. Finally, Jane's browser tells the bank's server what the cipher and key will be, encrypting this information with the bank's public key, and the conversation continues using symmetric encryption.

But what if Bob is still listening? Bob might receive the symmetric key from Jane, but that information is itself encrypted with the bank's public key... and can only be decrypted with the bank's private key. Which Bob doesn't have.

So Jane and the bank now share a symmetric key, also known as a "master secret," that no one else can know. And this allows them to continue communicating secretly.
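The hand-off from public key encryption to a shared symmetric key might be sketched like this. Python's standard library has no AES, so the symmetric cipher below is a stand-in of my own: a simple XOR keystream built from SHA-256. It is for illustration only and is not secure. The RSA step is again the textbook version with toy-sized keys, and all of the names are assumptions.

```python
import hashlib
import secrets

def make_keypair(p, q, e=65537):
    """Return (public_key, private_key) for textbook RSA from two primes."""
    n = p * q
    return (e, n), (pow(e, -1, (p - 1) * (q - 1)), n)

def keystream_xor(key, data):
    """Toy symmetric cipher: XOR data with a SHA-256-derived keystream.
    The same call both encrypts and decrypts. A stand-in for AES, not secure."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ k for b, k in zip(chunk, block))
    return bytes(out)

bank_public, bank_private = make_keypair(2**61 - 1, 2**89 - 1)

# Browser: pick a random 128-bit key and encrypt it with the bank's public key.
session_key = secrets.token_bytes(16)
e, n = bank_public
wrapped_key = pow(int.from_bytes(session_key, "big"), e, n)

# Bank: recover the key with its private key. Bob sees only wrapped_key.
d, _ = bank_private
unwrapped = pow(wrapped_key, d, n).to_bytes(16, "big")

# Both sides now use the cheap symmetric cipher for everything else.
ciphertext = keystream_xor(session_key, b"transfer $100 to savings")
plaintext = keystream_xor(unwrapped, ciphertext)
```

The expensive public key arithmetic happens once, on 16 bytes. Everything after that uses only the fast symmetric cipher.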

Additional Reading

Here I've discussed what a typical SSL conversation looks like and addressed the essential features of public key cryptography. I've tried to cover the important features while keeping things understandable. But for simplicity's sake, I've glossed over quite a bit.

If you're interested in understanding the mathematical details and the many encryption algorithms that can be employed, you can find a more technical discussion on Wikipedia.

Legal Note: yes, you may use sample HTML, Javascript, PHP and other code presented above in your own projects. You may not reproduce large portions of the text of the article without our express permission.
