We’ve seen a lot of confusion regarding what SSL certificates are all about – what they are, what they do, how you use them to secure a Web site, what the “gotchas” are when you’re trying to set up mobile devices to synchronize with an Exchange server, etc. So we’re going to attempt, over a few posts, to explain in layman’s terms (OK, a fairly technical layman) what it’s all about. However, before you can really understand what SSL is all about, you need to understand a little bit about cryptography.

When we were all kids, we probably all played around at one time or another with a simple substitution cipher – where each letter of the alphabet was substituted for another letter, and the same substitution was used for the entire message. It may have been done by simply reversing the alphabet (e.g., Z=A, Y=B, etc.), by shifting all the letters “x” letters to the right or left, or by using your Little Orphan Annie Decoder Ring. (The one-letter-to-the-left substitution cypher was famously used by Arthur C. Clarke in *2001: A Space Odyssey* to turn “IBM” into “HAL” – the computer that ran the spaceship.)

The problem with such a simple cipher is that it may fool your average six-year-old, but that’s about it – because (among other things) it does nothing to conceal frequency patterns. The letter “e” is, by far, the most frequently used letter in the English language, followed by “t,” “a,” “o,” etc. (If you want the full list, you can find it at http://en.wikipedia.org/wiki/Letter_frequency.) So whichever letter shows up most frequently in your encoded message is likely to represent the letter “e,” and so forth…and the longer the message is, the more obvious these patterns become. It would be nice to have a system that used a different substitution method for each letter of the message so that the frequency patterns are also concealed.

One approach to this is the so-called “one-time pad,” which is nearly impossible to break if it is properly implemented. This is constructed by selecting letters at random, for example, drawing them from a hopper similar to that used for a bingo game. A letter is drawn, it’s written down, then it goes back into the hopper which is again shuffled, and another letter is drawn. This process is continued until you have enough random letters written down to encode the longest message you might care about. Two copies are then made: one which will be used to encode a message, and the other which will be used to decode it. After they are used once, they are destroyed (hence the “one-time” portion of the name). One-time pads were commonly used in World War II to encrypt the most sensitive messages.

To use a one-time pad, you take the first letter of your message and assign it a numerical value of 1 to 26 (1=A, 26=Z). Then you add to that numerical value the numerical value of the first letter of the pad. That gives you the numerical value of the first letter of your cyphertext. If the sum is greater than 26, you subtract 26 from it. This kind of arithmetic is called “modulo 26,” and while you may not have heard that term, we do these kinds of calculations all the time: If it’s 10:00 am, and you’re asked what time it will be in five hours, you know without even thinking hard that it will be 3:00 pm. Effectively, you’re doing modulo 12 arithmetic: 10 + 5 = 15, but 15 is more than 12, so we have to subtract 12 from it to yield 3:00. (Unless you’re in the military, in which case 15:00 is a perfectly legitimate time.) So as we work through the following example, it might be helpful to visualize a clock that, instead of having the numbers 1 – 12 on the face, has the letters A – Z…and when the hand comes around to “Z,” it then starts over at “A.”

Let’s say that your message is, “Hello world.” Let’s further assume that the first ten characters of your one-time pad are: DKZII MIAVR. (By the way, I came up with these by going to www.random.org, and using their on-line random number generator to generate ten random numbers between 1 and 26.) So let’s write out our message – I’ll put the numerical value of each letter next to it in parentheses – then write the characters from the one-time pad below them, and then do the math:

H(8) E(5) L(12) L(12) O(15) W(23) O(15) R(18) L(12) D(4)

+ D(4) K(11) Z(26) I(9) I(9) M(13) I(9) A(1) V(22) R(18)

= L(12) P(16) L(12) U(21) X(24) J(10) X(24) S(19) H(8) V(22)

So our cyphertext is: LPLUX JXSHV. Note that, in the addition above, there were three times (L + Z, W + M, and L + V) when the sum exceeded 26, so we had to subtract 26 from that sum to come up with a number that we could actually map to a letter. Our recipient, who presumably has a copy of the pad, simply reverses the calculation by subtracting the pad from the cyphertext to yield the original message.

While one-time pads are very secure, you do have the logistical problem of getting a copy of the pad to the intended recipient of the message. So this approach doesn’t help us much when we’re trying to secure computer communications – where often you don’t know in advance exactly who you will need to communicate with, e.g., a banking site or a typical Internet e-commerce site. Instead, we need something that lends itself to automated coding and decoding.

During World War II, the Germans had a machine that the Allies referred to by the code name “Enigma.” This machine had a series of wheels and gears that operated in such a way that each time a letter was typed, the wheels would rotate into a new position, which would determine how the next letter would be encoded. The first Enigma machine had spaces for three wheels; a later model had spaces for four. All the recipient needed to know was which wheels to use (they generally had more wheels to choose from than the machine had spaces for) and how to set the initial positions of the wheels, and the message could be decoded. In modern terms, we would call this information the “key.”

One of the major turning points in the war occurred when the British were able to come up with a mathematical model (or “algorithm”) of how the Enigma machine worked. Alan Turing (yes, *that *Alan Turing) was a key player in that effort, and the roots of modern digital computing trace back to Bletchley Park and that code-breaking effort. (For a very entertaining read, I highly recommend *Cryptonomicon* by Neal Stephenson, in which Bletchley Park and the code breakers play a leading role.)

Today, we have computers that can perform complex mathematical algorithms very quickly, and the commonly used encryption algorithms are generally made public, specifically so that researchers will attack and attempt to break them. That way, the weak ones get weeded out pretty quickly. But they all work by performing some kind of mathematical manipulation of the numbers that represent the text (and to a computer, all text consists of numbers anyway), and they all require some kind of key, or “seed value,” to get the computation going. Therefore, since the encryption algorithm itself is public knowledge, the security of the system depends entirely on the key.

One such system is the “Advanced Encryption Standard” (“AES”), which happens to be the one adopted by the U. S. government. AES allows for keys that are 128 bits, 192 bits, or 256 bits long. Assuming there isn’t some kind of structural weakness in the AES algorithm – in which case it would presumably have been weeded out before anyone who was serious about security started using it – the logical way to attack it is to sequentially use all possible keys until you find the one that decodes the message. This is called a “brute force” attack. Of course, with a key length of *n* bits, there are 2^{n} possible keys. So every bit that’s added to the length of the key *doubles *the number of possible keys.

It is generally accepted that the computing power required to try all possible 128-bit keys will be out of reach for the foreseeable future, unless some unanticipated breakthrough in technology occurs that dramatically increases processing power. Of course, such a breakthrough is entirely possible, which is why AES also allows for 192-bit and 256-bit keys – and remember, a 256-bit key isn’t just *twice *as hard to break as a 128-bit key, it’s *2*^{128} times as hard. (And 2^{128} is roughly equal to the digit “3” followed by 38 zeros.) Therefore the government requires 192- or 256-bit keys for “highly sensitive” data.

AES uses a *symmetrical *key, meaning that the same key is used both to encrypt and decrypt the message, just as was the case with the old Enigma machine. In the next post of this series, we’ll talk about *asymmetrical *encryption systems, and try to work our way around to talking about SSL certificates.