Since the rise of the internet, we’ve moved a huge amount of our communication onto it. Because the internet as a communication medium relies on millions of computers that handle the transfer of the bits of information we exchange, we leave traces of the communication and ourselves along the way.
A single exchange might not reveal much to outsiders, but because we use the internet so much, the sheer amount of information left behind, however fragmented, can (and eventually will) be combined to identify you. Data exposure might also leak your personal information online.
Keeping communication private is essential to limit the information you leave behind. When you secure your communication, you essentially hide the information from public view. Most ways of keeping communication private rely on encryption. Essentially, encryption hides your communication from all the other computers the information travels through. The encryption algorithm determines how secure the encryption is, but since the data passes through untrusted computers (remember, anyone can set up a node on the internet), the way encryption keys are exchanged can also be the weak point.
A brief introduction to cryptography
You do not need to understand all the details of encryption, but here’s a very limited primer on simple encryption methods – people tend to find this information interesting. After all, some form of cryptography has been used for thousands of years. The first known users of encryption might have been the Egyptians, although simple substitution of message contents to hide the original message might have been in use even before that.
Think back to the first chapter where you learned about the CIA triad. Cryptography is a common way to ensure the Confidentiality side of the triad is handled well.
The Caesar cipher
The Romans used cryptography quite often. Julius Caesar was known to have used a form of substitution cipher nowadays known as the Caesar cipher. A cipher is an algorithm used to encrypt and decrypt data. A substitution cipher is a cipher where each character is replaced by another according to some rule. For example, Caesar used a cipher where each letter was replaced by the letter three positions after it in the alphabet, i.e. an A would become a D, a B would become an E, and so forth, with the end of the alphabet wrapping around to the start.
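The shift described above can be sketched in a few lines of Python. This is a minimal illustration, not a historical reconstruction: the function name caesar_shift and the sample message are our own choices, and we assume the modern 26-letter English alphabet.

```python
import string

ALPHABET = string.ascii_uppercase  # assumption: the 26-letter English alphabet


def caesar_shift(text, shift):
    """Replace each letter with the one `shift` places later, wrapping around."""
    result = []
    for ch in text.upper():
        if ch in ALPHABET:
            result.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        else:
            result.append(ch)  # spaces and punctuation are left unchanged
    return "".join(result)


ciphertext = caesar_shift("ATTACK AT DAWN", 3)
print(ciphertext)                    # DWWDFN DW GDZQ
print(caesar_shift(ciphertext, -3))  # ATTACK AT DAWN
```

Decryption is simply the same shift in the opposite direction, which is why anyone who knows the system only needs to learn one small number to read the message.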
As you can see from the picture, the key for this “encryption” is “move three steps forward in the alphabet”. Ciphers like this depend on keeping the system secret rather than the key, so once you know the system you can usually deduce the key. In the Caesar cipher example, the key could be any shift and you’d still be able to guess it pretty easily, either by trying all the possible shifts (brute forcing; there are only 25 of them) or by using the frequency of letters in a given language. For example, if the letter P is the most frequent letter in the encoded message and the language is known to be English, it is a reasonable guess that P in the secret message is E in the plain text message, as E is the most common letter in English. Different variations of substitution ciphers were used for at least a thousand years until mathematics provided better ways of encryption.
Another well-known substitution cipher often mentioned is the so-called “Kamasutra cipher”. The cipher is first mentioned in the Kamasutra as the art of hiding messages, especially by women. The Kamasutra cipher suggested that the alphabet is split into two halves and the letters of the halves are paired together. Each letter is then replaced by its pair, so the substitution is not just a shift of a certain length.
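Both attacks mentioned above are short enough to sketch in Python. The function names and the sample message are hypothetical, chosen for illustration, and the frequency guess simply assumes that the most common ciphertext letter stands for E, which only works reliably on English text that is long enough.

```python
from collections import Counter
import string

ALPHABET = string.ascii_uppercase


def caesar_shift(text, shift):
    """Shift every letter by `shift` places, leaving other characters alone."""
    return "".join(
        ALPHABET[(ALPHABET.index(ch) + shift) % 26] if ch in ALPHABET else ch
        for ch in text.upper()
    )


def brute_force(ciphertext):
    """Try every possible shift and return all candidate plaintexts."""
    return {shift: caesar_shift(ciphertext, -shift) for shift in range(26)}


def guess_shift_by_frequency(ciphertext):
    """Assume the most frequent ciphertext letter is an encrypted E."""
    letters = [ch for ch in ciphertext.upper() if ch in ALPHABET]
    most_common = Counter(letters).most_common(1)[0][0]
    return (ALPHABET.index(most_common) - ALPHABET.index("E")) % 26


secret = caesar_shift("MEET ME HERE AT SEVEN", 11)  # with shift 11, E becomes P
guess = guess_shift_by_frequency(secret)
print(guess)                       # 11
print(caesar_shift(secret, -guess))  # MEET ME HERE AT SEVEN
```

With brute force a human (or a dictionary check) still has to pick the readable candidate out of the 26 results, while the frequency guess jumps straight to the most likely shift.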
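A pairing cipher of this kind can be sketched as follows. This is only an illustration under our own assumptions: the pairing is produced by shuffling the alphabet with a seed (the seed stands in for the shared secret), and the names make_pairing and substitute are hypothetical. Python's random module is not suitable for real cryptography.

```python
import random
import string


def make_pairing(seed):
    """Shuffle the alphabet, split it into two halves and pair the halves up."""
    letters = list(string.ascii_uppercase)
    random.Random(seed).shuffle(letters)  # illustrative only, not secure
    first_half, second_half = letters[:13], letters[13:]
    pairing = {}
    for a, b in zip(first_half, second_half):
        pairing[a] = b  # each letter is replaced by its partner...
        pairing[b] = a  # ...and the partner is replaced by the letter
    return pairing


def substitute(text, pairing):
    """Swap every letter for its partner; other characters are unchanged."""
    return "".join(pairing.get(ch, ch) for ch in text.upper())


pairing = make_pairing(seed=2024)
ciphertext = substitute("HELLO WORLD", pairing)
print(ciphertext)
print(substitute(ciphertext, pairing))  # applying the pairing again decrypts
```

Because every letter is swapped with its partner, the same operation both encrypts and decrypts. The key is the whole pairing rather than a single shift, but each letter still always maps to the same substitute, so frequency analysis works against it too.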
As a side note, encryption doesn’t have to be perfect. In conflicts such as war, just making sure the message is not read in minutes or hours can be essential. However, in cases where the encrypted data is confidential and can cause damage later as well, you should make sure the method you use is not too easily compromised.
During the 16th century, a new and better way of encryption was developed that protected against frequency attacks. The Vigenère cipher uses a repeating key, which defeats simple frequency analysis because a P in the ciphertext no longer always stands for an E in the plain text.
The Vigenère cipher uses a key of several letters, which makes the simple substitution cipher more robust: each letter of the key defines a different shift. The key is repeated to cover the full length of the message. The length of the key is the most important part of the Vigenère cipher, as with a key length of 1 the cipher is basically a Caesar cipher. Let's go through an example.
An example of a Vigenère square. You can see that this version of the Vigenère cipher is basically a cyclically shifted Caesar cipher.
As you can see from the table above, using a key of just the character “D” the cipher is identical to the cipher Caesar used in his messages. Using the table, find the D row and follow it to the A column: the letter A is encrypted as the letter D, like in the Caesar cipher.
To understand the power of the Vigenère cipher, let’s go through a more complete example with a longer key. Using the Vigenère square above (also called a Vigenère table), we can encrypt the plaintext message “CYBERSECURITY” with the key “SECRET” as follows.
Firstly, we need to find the column matching the letter C in row S of the table. Using the table we find that the result is the letter U. Insert it as the first letter of the encrypted text.
The next letter of the message is Y, which we look up on row E (the second letter of the encryption key). The result is the letter C. Adding that to the encryption shows that the first two letters of the encrypted text (ciphertext) are UC.
Following the same pattern, with the key repeated as SECRETSECRETS to cover all 13 letters of the message, we end up with the final ciphertext: UCDVVLWGWIMMQ
Congratulations, you’ve encrypted your first message with the Vigenère cipher! The cipher was used successfully for about three centuries until a general decryption method was discovered. As you might have guessed, some messages were most likely decrypted earlier as the secrecy of the message depends on the secrecy and quality of the key.
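The table lookup is equivalent to adding the alphabet positions of the message letter and the key letter and wrapping around at 26, which makes the cipher easy to sketch in Python. The function name vigenere and its decrypt flag are our own illustrative choices; the message and key are the ones from the example.

```python
from itertools import cycle
import string

ALPHABET = string.ascii_uppercase


def vigenere(text, key, decrypt=False):
    """Shift each letter by the matching key letter; the key repeats as needed."""
    sign = -1 if decrypt else 1
    key_stream = cycle(key.upper())
    result = []
    for ch in text.upper():
        if ch not in ALPHABET:
            result.append(ch)  # non-letters are copied and use no key letter
            continue
        shift = ALPHABET.index(next(key_stream))
        result.append(ALPHABET[(ALPHABET.index(ch) + sign * shift) % 26])
    return "".join(result)


ciphertext = vigenere("CYBERSECURITY", "SECRET")
print(ciphertext)                                    # UCDVVLWGWIMMQ
print(vigenere(ciphertext, "SECRET", decrypt=True))  # CYBERSECURITY
print(vigenere("ABC", "D"))                          # DEF, same as Caesar
```

Note how the two E letters in CYBERSECURITY become V and W, and how both V letters in the ciphertext come from different plaintext letters. That is exactly what makes simple frequency counting fail.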
The Vigenère cipher remained secure for a long time. However, in the 19th century general weaknesses were found and published. Using these methods, the encryption could generally be broken regardless of the key used. The attacks exploit the repetition of the key to find out the key length (or a few possible key lengths), and with that information a key elimination attack can be used to find out the key itself. Several different methods can be used to guess the key length with a high degree of certainty.
A form of the Vigenère cipher was also used by the Confederate forces in the US Civil War. By that time the Union forces regularly decrypted their messages, as several weaknesses had been found in the cipher.
A Vigenère cipher using a truly random key that is as long as the plain text message, and that is never reused, is generally considered unbreakable and is called a one-time pad. A one-time pad doesn’t have the weaknesses associated with a repeating key. Since dictionary attacks do not apply to truly random keys, the secrecy of the message relies on the key and not the cipher. However, using a one-time pad is hard in practice, since the weak point becomes how the key is exchanged between the parties.
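One of the published 19th-century methods for guessing the key length is known as the Kasiski examination: if the same piece of plain text happens to line up with the same part of the key twice, the ciphertext repeats too, and the distance between the repeats is a multiple of the key length. The sketch below illustrates the idea. The function names, the sample message and the key “KEY” are our own, and real attacks need much longer ciphertexts than this.

```python
from collections import defaultdict
from itertools import cycle
import string

ALPHABET = string.ascii_uppercase


def vigenere_encrypt(plaintext, key):
    """Encrypt uppercase letters only (a simplification for this sketch)."""
    return "".join(
        ALPHABET[(ALPHABET.index(p) + ALPHABET.index(k)) % 26]
        for p, k in zip(plaintext, cycle(key))
    )


def repeat_distances(ciphertext, length=3):
    """Distances between repeated letter groups of the given length."""
    positions = defaultdict(list)
    for i in range(len(ciphertext) - length + 1):
        positions[ciphertext[i:i + length]].append(i)
    distances = []
    for found in positions.values():
        distances.extend(b - a for a, b in zip(found, found[1:]))
    return distances


def candidate_key_lengths(distances, max_length=12):
    """Key lengths that divide the distances, most frequent divisors first."""
    scores = {
        n: sum(1 for d in distances if d % n == 0)
        for n in range(2, max_length + 1)
    }
    ranked = sorted(scores.items(), key=lambda item: -item[1])
    return [n for n, score in ranked if score > 0]


ciphertext = vigenere_encrypt("ATTACKATDAWNATTACK", "KEY")
distances = repeat_distances(ciphertext)
print(ciphertext)                        # KXRKGIKXBKALKXRKGI
print(distances)                         # [12, 12, 12, 12]
print(candidate_key_lengths(distances))  # [2, 3, 4, 6, 12]
```

The real key length, 3, is among the candidates. Once the length is known, every third letter was shifted by the same key letter, so each group can be attacked like a Caesar cipher.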
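A letter-based one-time pad can be sketched in Python as a Vigenère-style shift with a random key of the same length as the message. The function names are hypothetical, and we assume that the operating system's random source, reached through the standard secrets module, is good enough to stand in for “truly random”.

```python
import secrets
import string

ALPHABET = string.ascii_uppercase


def generate_pad(length):
    """Create a random key with one letter for every letter of the message."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def apply_pad(text, pad, decrypt=False):
    """Shift each letter by the matching pad letter (uppercase letters only)."""
    if len(pad) != len(text):
        raise ValueError("the pad must be exactly as long as the message")
    sign = -1 if decrypt else 1
    return "".join(
        ALPHABET[(ALPHABET.index(t) + sign * ALPHABET.index(p)) % 26]
        for t, p in zip(text, pad)
    )


message = "CYBERSECURITY"
pad = generate_pad(len(message))  # must be shared secretly and used only once
ciphertext = apply_pad(message, pad)
print(ciphertext)                              # different on every run
print(apply_pad(ciphertext, pad, decrypt=True))  # CYBERSECURITY
```

The code is trivial; the hard part is everything around it. Both parties need the same pad, nobody else may see it, and it has to be thrown away after one message.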
A fun interactive site with lots of examples of different historical cryptographic functions is Simon Singh’s The Black Chamber.
With advances in mechanical engineering, new and better methods were invented in the field of encryption. Machines were created where a keypress would light up the encrypted letter. These first mechanical encryption machines also extended the effective key length by placing rotating rotors next to each other. When one rotor with 26 characters had turned all the way from A to Z, the next rotor moved one step further, thus obscuring the repeating pattern of the key. A single rotor would thus provide a key of length 26. Adding a second moving rotor would then expand the available positions to 26 x 26 = 676. To set up the shared key, you would only need to communicate the initial state of both rotors.
Assuming we have three rotors and the initial setting for all is A, we’ll get the following positions for each subsequent key press:
A A A
B A A
C A A
D A A
...
Z A A
A B A
B B A
...
Y Z Z
Z Z Z
A A A
The actual setup, wiring and the way the machines work is more involved than the example, but it shows the power of automating the generation of the actual encryption key from an initial setup of the mechanical machines.
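The stepping in the listing works like an odometer whose leftmost wheel turns fastest, and it can be sketched with a small Python generator. This models only the positions, not the wiring or the actual encryption, and the name rotor_states is our own.

```python
from itertools import islice
import string

ALPHABET = string.ascii_uppercase


def rotor_states(rotor_count=3):
    """Yield the rotor positions for each key press, starting from all A."""
    positions = [0] * rotor_count
    while True:
        yield " ".join(ALPHABET[p] for p in positions)
        # Step the leftmost rotor; when a rotor wraps from Z back to A,
        # the next rotor to the right moves one step as well.
        for i in range(rotor_count):
            positions[i] = (positions[i] + 1) % 26
            if positions[i] != 0:
                break


states = list(islice(rotor_states(3), 26 ** 3 + 1))
print(states[:4])              # ['A A A', 'B A A', 'C A A', 'D A A']
print(states[25], states[26])  # Z A A, then A B A
print(states[-2], states[-1])  # Z Z Z, then back to A A A
print(26 ** 3)                 # 17576 positions before the pattern repeats
```

With three rotors the sequence only returns to its starting position after 26 x 26 x 26 = 17576 key presses, compared to 26 for a single rotor.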
The most well-known of these rotor machines was the Enigma, which was used extensively by the Germans in WWII. Enigma machines during the war used three or four rotors at a time, selected from a set of up to eight. However, the way the Germans used these machines introduced weaknesses that Polish and British cryptologists exploited to decrypt German messages.
More information on perhaps the best-known encryption machine ever to have existed can be read on Wikipedia.
The ciphers we went through are symmetric ciphers, meaning the same key that was used in encryption can be used for decryption. Over the years, methods that do not require the secret key to be transmitted to the message recipient were developed; this is called asymmetric encryption. In asymmetric encryption, the message is encrypted with a public key that is mathematically linked to the secret key of the message recipient. For example, if you wanted your colleagues to encrypt messages to you, you’d send your public key to them and they would use that key to encrypt their messages to you. After the messages were encrypted, only someone (hopefully you) who has your secret key can decrypt the message. This is called public key cryptography as it uses a public key for the encryption.
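To make the idea concrete, here is a toy sketch using textbook RSA, one well-known public key method (the text above does not name a specific algorithm, so this choice is ours). The primes are tiny and there is no padding, so this is for illustration only and must never be used to protect real data. The modular inverse via pow requires Python 3.8 or newer.

```python
# Toy key generation with tiny primes (trivially breakable, illustration only).
p, q = 61, 53
n = p * q                # 3233, part of both keys
phi = (p - 1) * (q - 1)  # 3120, must stay secret
e = 17                   # public exponent, shares no factor with phi
d = pow(e, -1, phi)      # 2753, the secret exponent (modular inverse of e)

public_key = (e, n)      # safe to hand out to your colleagues
private_key = (d, n)     # never leaves your hands


def encrypt(number, key):
    """Anyone holding the public key can encrypt a number smaller than n."""
    exponent, modulus = key
    return pow(number, exponent, modulus)


def decrypt(number, key):
    """Only the holder of the private key can reverse the encryption."""
    exponent, modulus = key
    return pow(number, exponent, modulus)


message = 65
ciphertext = encrypt(message, public_key)
print(ciphertext)
print(decrypt(ciphertext, private_key))  # 65
```

The secret exponent never has to be sent anywhere, which removes the key exchange problem that symmetric ciphers and one-time pads suffer from. The security rests on how hard it is to recover p and q from n, which is only true when the primes are hundreds of digits long.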
Hashing to prove Integrity
Hash functions are widely used for verifying the integrity of a message. As you may recall, integrity is one side of the CIA triad (Confidentiality, Integrity and Availability). A hash function can be used for verifying integrity in much the same way as it can be used for protecting a password: it calculates a fixed-size value from the whole content of the message, and that value (the so-called “message digest”) is sent along with the original message. The recipient of the message can then recalculate the hash and verify that it matches the digest that was sent. Some constructions also mix a shared secret key into the hash calculation, which lets the receiver verify who sent the message as well. These functions create what is called a Message Authentication Code (MAC).
Modern cryptography depends on the secrecy of the key and uses named and widely studied ciphers. The ciphers themselves can be public knowledge, because the secrecy depends on the key rather than on how the cipher works. For example, when you are visiting your online bank, your browser and the bank web server securely exchange a one-time symmetric (and long) secret key that is used to encrypt the traffic between your browser and the bank server. Additionally, public key cryptography is usually used to verify the other party’s identity. Each message is also verifiable by a message authentication code (MAC). This process of exchanging keys, verifying identity and authenticating messages between a browser and a server is called TLS, or Transport Layer Security, which we will learn more about in the next chapter. Previous versions used a now deprecated method called SSL, or Secure Sockets Layer.
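Both ideas can be tried out with Python's standard hashlib and hmac modules. In this sketch SHA-256 is our choice of hash function, HMAC is one common way of building a MAC, and the message, the shared key and the function names are made up for the example.

```python
import hashlib
import hmac


def digest(message: bytes) -> str:
    """Plain hash: detects changes, but anyone can recompute it."""
    return hashlib.sha256(message).hexdigest()


def make_mac(key: bytes, message: bytes) -> str:
    """Keyed hash (HMAC): only holders of the shared key can produce it."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify_mac(key: bytes, message: bytes, received_mac: str) -> bool:
    """Recompute the MAC and compare it in a timing-safe way."""
    return hmac.compare_digest(make_mac(key, message), received_mac)


message = b"Transfer 100 EUR to Alice"
tampered = b"Transfer 900 EUR to Alice"
shared_key = b"example-shared-secret"  # hypothetical key known to both parties

# Integrity: even a one-character change gives a completely different digest.
print(digest(message) == digest(tampered))  # False

# Authenticity: the receiver recomputes the MAC with the shared key.
mac = make_mac(shared_key, message)
print(verify_mac(shared_key, message, mac))        # True
print(verify_mac(shared_key, tampered, mac))       # False
print(verify_mac(b"some-other-key", message, mac))  # False
```

A plain digest only helps if the attacker cannot replace the digest along with the message. The MAC closes that gap, because producing a valid value requires the shared key.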
Key learnings of cryptography
The secrecy of your message should always depend on the secrecy of the key, and not on the secrecy of the encryption system (this is known as Kerckhoffs's principle).
Always use ciphers which have been publicly reviewed and have been established as a standard. Using "secret crypto" or inventing your own is a bad idea because, just as with the Caesar cipher, once the system is known and understood, all messages can be decrypted.