Cryptography Part 1 - Getting Started - Security Series #16
Wow, 20+ days since my last post. :( It has been a busy few weeks getting ready for cf.Objective() 2010, and I have been slack in my blogging. But no more! Back to it.
Today I am going to continue my security series with a discussion of cryptography. This is a HUGE subject about which I am no expert, but I am learning and, as always, I feel the need to share this knowledge.
Recently I started graduate school and my first class required a research paper. I chose to do an "Introduction to Cryptography". I also turned it into a presentation for cf.Objective(). Now I am going to continue that and incorporate it into my security series. Repetition makes it stick, right?
So let's get started. And be sure to stick with me, because somewhere in this post, I will have a contest.
What is cryptography
Cryptography is a field within the area of study called Cryptology. Cryptology is the combined fields of cryptography and cryptanalysis.
Cryptography is the study and practice of hiding information (Wikipedia, n.d.). Information hiding can be done in many ways, even to the point of hiding the fact that anything is being hidden at all (steganography). It may seem like this practice of information hiding has limited uses, like protecting data from prying eyes, but in fact, it has quite a few additional uses.
Uses of cryptography
Confidentiality is the use of cryptography that everyone is familiar with. You can use cryptography to stop prying eyes from viewing your data. Hence you are keeping it confidential. We'll look at that in a moment.
Integrity is another use of cryptography. Cryptographic techniques can be used to ensure that a message that has been sent has not been modified from the original. We can use tools like hashes and secret keys to create message authentication codes (MACs) to verify the integrity of a message.
Authentication can extend well beyond the domain of usernames and passwords and into the area of cryptography. We can use certificates and digital signatures to prove our identities and verify the identities of others.
Non-repudiation is fun to say. Additionally, it is the concept that the sender of a message cannot later deny that they sent the message or that the message is authentic.
Over the course of this series, I will try to discuss each of these uses. Some of them we are already using without even realizing it.
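To make the integrity use a little more concrete, here is a minimal Python sketch of a message authentication code. It assumes HMAC with SHA-256 from the standard library, which is just one common way to build a MAC; the key and the messages are made up for illustration.

```python
import hashlib
import hmac


def make_mac(key: bytes, message: bytes) -> str:
    """Return a hex MAC for the message, computed with HMAC-SHA256."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify_mac(key: bytes, message: bytes, tag: str) -> bool:
    """Recompute the MAC and compare it to the received tag in constant time."""
    return hmac.compare_digest(make_mac(key, message), tag)


# Hypothetical shared secret and message, for illustration only.
shared_key = b"not-a-real-key"
original = b"Transfer 10 dollars to Ben"
tag = make_mac(shared_key, original)

print(verify_mac(shared_key, original, tag))                        # untouched message
print(verify_mac(shared_key, b"Transfer 99 dollars to Ben", tag))   # modified message
```

The receiver needs the same secret key as the sender. Anyone without the key can change the message, but they cannot produce a matching tag for the changed message.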
Terminology
As with any area of scientific study, there are new terms to go along with it. Some of these will be familiar to you, some may not be. I will take a moment to define each.
- Senders and Receivers (Originators and Recipients) - The sender of a message is just what it sounds like. The sender is the one who encrypts a message (or signs it, hashes it, etc.) and the receiver is the one who decrypts it (verifies it, authenticates it, etc.).
- Encrypt and Decrypt (Encipher and Decipher) - Encryption and decryption are the processes that a message goes through to be hidden and revealed. Encryption is the hiding process and decryption is the process of revealing. These processes are done through cipher algorithms.
- Ciphers - The algorithms that messages go through to be encrypted and decrypted.
- Encryption and Decryption Keys - Keys are input parameters to cipher algorithms that determine the output. The key is the primary secret of a well implemented algorithm.
- Plaintext and Ciphertext - Plaintext is the state of a message prior to encryption and after decryption. Ciphertext is its state after encryption and before decryption.
Basic Encryption
As with most subjects, we should start with the basics. And so we are going to look at the simplest form of encryption, the basic substitution cipher.
A simple substitution cipher uses an algorithm that replaces each letter of the plaintext message with another letter or combination of letters, or even with a symbol. The result is the ciphertext of the message. Let's look at a very basic example.
The Caesar Cipher
Also known as ROT ciphers, or shift ciphers, the Caesar cipher uses a simple method to create a substitution alphabet from which to work. By shifting each letter of the alphabet by N characters we can then line up the plaintext alphabet with the ciphertext alphabet to find which letters to use for substitution. Here the alphabet is shifted by three, with the plaintext alphabet on top and the ciphertext alphabet below it:
| A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z |
| D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z | A | B | C |
As you can, hopefully, see from this table, in our ciphertext, A will be replaced by D, X will be replaced by A, Z by C, etc.
So to encrypt the plaintext message "Simon and Ben are ColdFusion Experts" we get the ciphertext:
VLPRQ DQG EHQ DUH FROGIXVLRQ HASHUWV
As you can see, this would be very easy to decrypt using some simple cryptanalysis.
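If you would rather let code do the shifting, here is a minimal Python sketch of a shift cipher. The function name `caesar` and the choice to uppercase everything and leave non-letters alone are my own assumptions for the example. A shift of 3 reproduces the ciphertext above, and shifting back by 3 undoes it.

```python
import string

ALPHABET = string.ascii_uppercase


def caesar(text: str, shift: int) -> str:
    """Shift every letter by `shift` places, wrapping around at Z.

    Letters are uppercased; spaces and punctuation pass through unchanged.
    """
    out = []
    for ch in text.upper():
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        else:
            out.append(ch)
    return "".join(out)


ciphertext = caesar("Simon and Ben are ColdFusion Experts", 3)
plaintext = caesar(ciphertext, -3)  # decrypting is just shifting the other way

print(ciphertext)
print(plaintext)
```

Notice that the shift amount is the key here. The algorithm itself is no secret at all, which is a big part of why this cipher is so weak.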
Cryptanalysis
Cryptanalysis is the field of code-breaking in cryptology. Cryptanalysts are the extremely smart, math-oriented geeks who try to compromise encryption algorithms. But note that cryptanalysts are not evil. Many of them use their powers for good by finding vulnerabilities in ciphers and reporting them so that the algorithms can be replaced by stronger ones.
Let's try a little cryptanalysis.
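As a warm-up, the simplest attack on a shift cipher is to try every possible shift and look for the one that reads as English. There are only 26 of them. Here is a minimal Python sketch of that idea, run against part of the sample ciphertext from the Caesar section. The function names are my own.

```python
import string

ALPHABET = string.ascii_uppercase


def unshift(ciphertext: str, shift: int) -> str:
    """Undo a shift cipher by moving each letter back `shift` places."""
    out = []
    for ch in ciphertext.upper():
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) - shift) % 26])
        else:
            out.append(ch)
    return "".join(out)


def brute_force(ciphertext: str) -> list:
    """Return (shift, candidate plaintext) for all 26 possible shifts."""
    return [(shift, unshift(ciphertext, shift)) for shift in range(26)]


for shift, candidate in brute_force("VLPRQ DQG EHQ DUH"):
    print(shift, candidate)
```

A human picks the readable line out of the 26 candidates. This only works when the cipher really is a plain shift, so it will not get you far on every ciphertext.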
Time for a contest
Below I am going to list five pieces of ciphertext, each encrypted using a different algorithm. I will challenge you, my readers, to decrypt each one. The first five people to post a solution to a unique problem will win a copy of Simon Singh's The Code Book. Please also explain how you came to the solution.
NOTE: Please only decrypt ONE ciphertext. Don't get ambitious and greedy, you can only win once. If you post more than one solution, I will delete all of your comments and you'll forever be on my shitlist.
Ciphertext #1 (easy) GJQP MW E JERXEWXMG PERKYEKI. M VIEPPC PSZI MX.
Ciphertext #2 (still easy) CLXVGYPLIVQX SM I HOD MOAKJCG. S LJITTX TSUJ GQSM MGOHH.
Ciphertext #3 (moderate) WRTRPLADSV DP RCPU RZ RJMPUEM CRZRBOBM.
Ciphertext #4 (moderate to hard) BZ LJ FD HF SQ LJ FD OM WU TR LJ GE LJ FD HF VT IG TR IG LJ FD KI NL VT VT GE RP MK TR HF
Ciphertext #5 (hard) OMTHTU UMYECN RUIBOI CNSEMT OITSMY
Conclusion
That's all I will cover today. I will give you some time to work on the ciphers. If they don't get solved in a few days I'll start throwing out some clues on Twitter. If you are not following me already, I am @jasonpdean.
Post solutions in the comments. Also feel free to use the comments for comments if you have something to add or have a question. If you are submitting a solution, be sure to use a real email address so I can contact you about your winnings.



I figured this out using the Caesar Cipher, figuring that the one letter words had to be either "A" or "I".
Of course that won't work for #4 and should not work for #5 (oh! There's a hint).
I figured it out similar to Nathan - using language patterns. For example, one and two letter words, double letters, etc. From there I just worked the substitutions.
BTW, it was great meeting you last week.
I used letter frequency to get started and then the rest fell into place. It was a little confusing at first because your last ciphertext word should actually be CRZBORBM. As written it decodes to LANAGUGE.
Well done!
And actually, the typo was intentional. Misspellings and mistakes make ciphertext harder to decrypt because they make it more difficult to run dictionary checks against it.
It is believed that the Zodiac Killer in the 70's used misspellings and bad grammar intentionally to make his ciphers harder to decrypt. :)
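For anyone curious what the letter-frequency approach mentioned above looks like in code, here is a minimal Python sketch that counts the letters in ciphertext #3. The function name is my own. In a simple substitution cipher, the most common ciphertext letters are good candidates for the most common English letters (E, T, A, O and so on), though a short message will not follow the averages exactly.

```python
from collections import Counter


def letter_frequencies(ciphertext: str) -> list:
    """Count each letter, ignoring spaces and punctuation, most common first."""
    letters = [ch for ch in ciphertext.upper() if ch.isalpha()]
    return Counter(letters).most_common()


ciphertext_3 = "WRTRPLADSV DP RCPU RZ RJMPUEM CRZRBOBM."

for letter, count in letter_frequencies(ciphertext_3):
    print(letter, count)
```

From there it is the pattern work described in the comments: short words, repeated letters, and a lot of trial substitutions.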
Good one.
I tried frequency analysis first and got nowhere with it. It wasn't until much later I realized that the six letter groupings were not to hide the word boundaries, but to help make a uniform grid size. D'oh! Like so:
OMTHTU
UMYECN
RUIBOI
CNSEMT
OITSMY
Once you have the grid, read down the first column to the end, then down the second column, etc. Very crafty, Jason. Ever read Cryptonomicon?
Awesome!!! Nice work. I wondered if anyone would get that one since it is not a substitution cipher. It is actually called a transposition cipher and I have not covered it yet. I will in the next post. Well done.
I am listening to the Cryptonomicon audio book right now. I am still near the beginning though.
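The grid trick described above for ciphertext #5 can be sketched in a few lines of Python. This assumes the layout from the comment: each six-letter group is one row of the grid, and the message is read down the columns from left to right. The function name is my own.

```python
def read_columns(ciphertext: str) -> str:
    """Stack the space-separated groups as rows, then read down each column."""
    rows = ciphertext.split()
    # zip(*rows) walks the grid column by column.
    return "".join("".join(column) for column in zip(*rows))


ciphertext_5 = "OMTHTU UMYECN RUIBOI CNSEMT OITSMY"
print(read_columns(ciphertext_5))
```

No letters are replaced here, they are only moved around. That is why frequency analysis looks like plain English and still tells you nothing about the order.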
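<cfset string = 'WRTRPLADSV DP RCPU RZ RJMPUEM CRZRBOBM'>
<cfoutput>
<!--- Try every Caesar shift; 0 through 25 covers all distinct rotations --->
<cfloop index="shift" from="0" to="25">
<cfloop index="inner" from="1" to="#Len(string)#">
<cfset ascii = Asc(Mid(string,inner,1))>
<!--- Keep spaces as they are; rotate letters within A-Z (65-90) so they wrap around --->
<cfif ascii EQ 32> <cfelse>#Chr(((ascii - 65 + shift) MOD 26) + 65)#</cfif>
</cfloop><br />
</cfloop>
</cfoutput>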
My second try was to build a tabula recta like a Vigenere cipher. This produced better results, as I can see a pattern, but can't make progress from there. It doesn't appear to be an actual Vigenere, as there doesn't appear to be a key. (But maybe I'm missing something crucial or overthinking it -- my copy of The Code Book has been gathering dust for the better part of the last decade.)
@rick, sorry to disappoint. I actually did not think to do a Playfair cipher. That would have been good too. I actually just made up this simple substitution myself. You may be over-thinking it. Look for patterns.
That one was tougher than I expected. I knew moving the word boundaries would make it harder but I thought the patterns would make it easy. I guess I was wrong. When I was putting it together I was trying to think about things that people might notice. Here are some things that I hoped people would notice. I'm sure most did, but I guess it was harder to put together than I thought it would be.
1. Patterns like LJ, VT, GE and HF were all repeated. If each character represented a single character then this would be really weird, so each pair should represent a character. VT was also repeated two times in a row, so I was hoping someone would point out that that was likely one of the common double letters EE, TT, DD, OO, FF, etc.
2. Each pair of letters is separated by one single letter. G and E are separated by F, L and J are separated by K, and so on. This would be another clue that each pair represented a single letter.
3. Since the patterns always matched, then we could actually strip off the first letter or last letter of every pair and make it easier. The pairs were there to obfuscate. The second character actually did not increase the technical difficulty.
So ultimately the answer is that this is a reversed Caesar cipher. The key alphabet is reversed to make the first character, and that first character is then shifted back two letters (wrapping around from A to Y) to create the second character.
A = ZX
B = YW
C = XV
D = WU
E = VT
F = US
G = TR
H = SQ
I = RP
J = QO
K = PN
L = OM
M = NL
N = MK
O = LJ
P = KI
Q = JH
R = IG
S = HF
T = GE
U = FD
V = EC
W = DB
X = CA
Y = BZ
Z = AY
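The table above can also be generated instead of typed out. Here is a minimal Python sketch, with function names of my own choosing: the first letter of each pair comes from the reversed alphabet, and the second is that letter moved back two places with wrap-around. Inverting the table gives a decoder for ciphertext #4.

```python
import string

ALPHABET = string.ascii_uppercase


def pair_for(letter: str) -> str:
    """Build the two-letter code for one plaintext letter (A -> ZX, B -> YW, ...)."""
    reversed_index = 25 - ALPHABET.index(letter)
    first = ALPHABET[reversed_index]
    second = ALPHABET[(reversed_index - 2) % 26]  # two letters back, wrapping around
    return first + second


def decrypt_pairs(ciphertext: str) -> str:
    """Look up each space-separated pair in the inverted table."""
    table = {pair_for(letter): letter for letter in ALPHABET}
    return "".join(table[pair] for pair in ciphertext.split())


ciphertext_4 = ("BZ LJ FD HF SQ LJ FD OM WU TR LJ GE LJ FD HF "
                "VT IG TR IG LJ FD KI NL VT VT GE RP MK TR HF")
print(decrypt_pairs(ciphertext_4))
```

Since the second letter of each pair is fully determined by the first, dropping it loses nothing, which is the point made in item 3 above.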
I hope people don't hate me now for burning up their time ;)