Cryptography Part 1 - Getting Started - Security Series #16

Wow, 20+ days since my last post. :( It has been a busy few weeks getting ready for cf.Objective() 2010, and I have been slack in my blogging. But no more! Back to it.

Today I am going to continue my security series with a discussion of cryptography. This is a HUGE subject about which I am no expert, but I am learning and, as always, I feel the need to share this knowledge.

Recently I started graduate school and my first class required a research paper. I chose to do an "Introduction to Cryptography". I also turned it into a presentation for cf.Objective(). Now I am going to continue that and incorporate it into my security series. Repetition makes it stick, right?

So let's get started. And be sure to stick with me, cause somewhere in this post, I will have a contest.

What is cryptography

Cryptography is a field within the area of study called Cryptology. Cryptology is the combined fields of cryptography and cryptanalysis.

Cryptography is the study and practice of hiding information (Wikipedia, n.d.). Information hiding can be done in many ways, even to the point of hiding the fact that information is being hidden at all (steganography). It may seem like this practice of information hiding has limited uses, like protecting data from prying eyes, but in fact, it has quite a few additional uses.

Uses of cryptography

Confidentiality is the use of cryptography that everyone is familiar with. You can use cryptography to stop prying eyes from viewing your data. Hence you are keeping it confidential. We'll look at that in a moment.

Integrity is another use of cryptography. Cryptographic techniques can be used to ensure that a message that has been sent has not been modified from the original. We can use tools like hashes and secret keys to create message authentication codes (MACs) to verify the integrity of a message.
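
To make the integrity idea a little more concrete, here is a minimal Python sketch using the standard library's hmac and hashlib modules. A MAC combines a hash function with a secret key that the sender and receiver share. The key, the message, and the helper names make_tag and verify are all made up for illustration; this is a sketch, not a recommendation for production use.

```python
import hashlib
import hmac


def make_tag(key: bytes, message: bytes) -> str:
    """Return a hex MAC for the message, computed with HMAC-SHA256."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify(key: bytes, message: bytes, tag: str) -> bool:
    """Recompute the MAC and compare it to the received tag in constant time."""
    return hmac.compare_digest(make_tag(key, message), tag)


# Hypothetical shared secret and message, for illustration only.
key = b"shared-secret-key"
message = b"Transfer 100 dollars to Ben"
tag = make_tag(key, message)

print(verify(key, message, tag))                         # True: message unchanged
print(verify(key, b"Transfer 900 dollars to Ben", tag))  # False: message was modified
```

The receiver only needs the shared key and the tag that arrived with the message. If even one character of the message changes, the recomputed tag no longer matches.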

Authentication can extend well beyond the domain of usernames and passwords and into the area of cryptography. We can use certificates and digital signatures to prove our identities and verify the identities of others.

Non-repudiation is fun to say. Additionally, it is the concept that the sender of a message cannot later deny that they sent the message or that the message is authentic.

Over the course of this series, I will try to discuss each of these uses. Some of them we are already using without even realizing it.

Terminology

As with any area of scientific study, there are new terms to go along with it. Some of these will be familiar to you, some may not be. I will take a moment to define each.

  • Senders and Receivers (Originators and Recipients) - The sender of a message is just what it sounds like. The sender is the one who encrypts a message (or signs it, hashes it, etc.) and the receiver is the one who decrypts it (verifies it, authenticates it, etc.).
  • Encrypt and Decrypt (Encipher and Decipher) - Encryption and decryption are the processes that a message goes through to be hidden and revealed. Encryption is the hiding process and decryption is the process of revealing. These processes are done through cipher algorithms.
  • Ciphers - The algorithms that messages go through to be encrypted and decrypted.
  • Encryption and Decryption Keys - Keys are input parameters to cipher algorithms that determine the output. The key is the primary secret of a well-implemented algorithm.
  • Plaintext and Ciphertext - Plaintext is the state of a message prior to encryption and after decryption. Ciphertext is its state after encryption and before decryption.

Basic Encryption

As with most subjects, we should start with the basics. And so we are going to look at the simplest form of encryption, the basic substitution cipher.

A simple substitution cipher uses an algorithm that replaces each letter of the plaintext message with another letter or combination of letters, or even with a symbol. The result is the ciphertext of the message. Let's look at a very basic example.

The Caesar Cipher

Also known as a ROT cipher or shift cipher, the Caesar cipher uses a simple method to create a substitution alphabet from which to work. By shifting each letter of the alphabet by N characters we can then line up the plaintext alphabet with the ciphertext alphabet to find which letters to use for substitution.

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
X Y Z A B C D E F G H I J K L M N O P Q R S T U V W

As you can, hopefully, see from this table, in our ciphertext, A will be replaced by X, D will be replaced by A, Z by W, etc.
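
The same table can be built in a few lines of code. This is a minimal Python sketch, not the only way to do it; the function names cipher_alphabet, encrypt, and decrypt are my own, and the shift of -3 is simply the value that matches the table above (A lines up with X).

```python
import string

ALPHABET = string.ascii_uppercase


def cipher_alphabet(shift: int) -> str:
    """Rotate the alphabet so plaintext letter i lines up with letter i + shift."""
    shift %= 26
    return ALPHABET[shift:] + ALPHABET[:shift]


def encrypt(plaintext: str, shift: int) -> str:
    """Substitute each letter using the shifted alphabet; other characters pass through."""
    table = str.maketrans(ALPHABET, cipher_alphabet(shift))
    return plaintext.upper().translate(table)


def decrypt(ciphertext: str, shift: int) -> str:
    """Decryption is just encryption with the opposite shift."""
    return encrypt(ciphertext, -shift)


print(cipher_alphabet(-3))   # XYZABCDEFGHIJKLMNOPQRSTUVW
print(encrypt("A D Z", -3))  # X A W
print(decrypt("X A W", -3))  # A D Z
```

Here the shift value plays the role of the key: the algorithm is public, and only the number of positions is secret.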

So to encrypt the plaintext message "Simon and Ben are ColdFusion Experts" we get the ciphertext:

PFJLK XKA YBK XOB ZLIACRPFLK BUMBOQP

As you can see, this would be very easy to decrypt using some simple cryptanalysis.
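
One reason it is so easy: a shift cipher has only 25 useful keys, so an attacker can simply try them all and look for the line that reads as English. Here is a minimal Python sketch of that brute-force approach. The sample ciphertext "EBIIL TLOIA" and the function names are made up for illustration.

```python
import string

ALPHABET = string.ascii_uppercase


def shift_text(text: str, shift: int) -> str:
    """Shift every letter forward by `shift` positions, wrapping around at Z."""
    shift %= 26
    table = str.maketrans(ALPHABET, ALPHABET[shift:] + ALPHABET[:shift])
    return text.upper().translate(table)


def brute_force(ciphertext: str) -> list:
    """Return (shift, candidate plaintext) for every non-trivial shift."""
    return [(shift, shift_text(ciphertext, shift)) for shift in range(1, 26)]


# Hypothetical ciphertext; one of the 25 printed lines will be readable English.
for shift, candidate in brute_force("EBIIL TLOIA"):
    print(f"{shift:2d}  {candidate}")
```

Scanning 25 lines by eye takes seconds, which is why a shifted alphabet offers no real protection.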

Cryptanalysis

Cryptanalysis is the field of code-breaking in cryptology. Cryptanalysts are the extremely smart, math-oriented geeks who try to compromise encryption algorithms. But note that cryptanalysts are not evil. Many of them use their powers for good by finding vulnerabilities in ciphers and reporting them so that the algorithms can be replaced by stronger ones.

Let's try a little cryptanalysis.

Time for a contest

Below I am going to list five pieces of ciphertext, each encrypted using a different algorithm. I will challenge you, my readers, to decrypt each one. The first five people to post a solution to a unique problem will win a copy of Simon Singh's The Code Book. Please also explain how you came to the solution.

NOTE: Please only decrypt ONE ciphertext. Don't get ambitious and greedy, you can only win once. If you post more than one solution, I will delete all of your comments and you'll forever be on my shitlist.

Ciphertext #1 (easy): GJQP MW E JERXEWXMG PERKYEKI. M VIEPPC PSZI MX.

Ciphertext #2 (still easy): CLXVGYPLIVQX SM I HOD MOAKJCG. S LJITTX TSUJ GQSM MGOHH.

Ciphertext #3 (moderate): WRTRPLADSV DP RCPU RZ RJMPUEM CRZRBOBM.

Ciphertext #4 (moderate to hard): BZ LJ FD HF SQ LJ FD OM WU TR LJ GE LJ FD HF VT IG TR IG LJ FD KI NL VT VT GE RP MK TR HF

Ciphertext #5 (hard): OMTHTU UMYECN RUIBOI CNSEMT OITSMY

Conclusion

That's all I will cover today. I will give you some time to work on the ciphers. If they don't get solved in a few days I'll start throwing out some clues on Twitter. If you are not following me already, I am @jasonpdean.

Post solutions in the comments. Also feel free to use the comments for comments if you have something to add or have a question. If you are submitting a solution, be sure to use a real email address so I can contact you about your winnings.

Comments
The answer to #1 is: "CFML IS A FANTASTIC LANGUAGE. I REALLY LOVE IT."

I figured this out using the Caesar cipher, figuring that the one-letter words had to be either "A" or "I".
# Posted By Nathan Mische | 4/27/10 11:39 AM
@Nathan! Fantastic! Good use of language pattern analysis. Hopefully that can help others figure out #2 and #3.

Of course that won't work for #4 and should not work for #5 (oh! There's a hint).
# Posted By Jason Dean | 4/27/10 11:52 AM
#2: CRYPTOGRAPHY IS A FUN SUBJECT. I REALLY LIKE THIS STUFF.

I figured it out similar to Nathan - using language patterns. For example, one and two letter words, double letters, etc. From there I just worked the substitutions.
# Posted By Rob Brooks-Bilson | 4/27/10 1:32 PM
@Rob, awesome! Nice work.

BTW, it was great meeting you last week.
# Posted By Jason Dean | 4/27/10 5:12 PM
#3: JAVASCRIPT IS ALSO AN AWESOME LANGUAGE.

I used letter frequency to get started and then the rest fell into place. It was a little confusing at first because your last ciphertext word should actually be CRZBORBM. As written it decodes to LANAGUGE.
# Posted By Michael Mongeau | 4/27/10 6:45 PM
@Michael,

Well done!

And actually, the typo was intentional. Misspellings and mistakes make ciphertext harder to decrypt because they make it more difficult to run dictionary checks against it.

It is believed that the Zodiac Killer in the '70s used misspellings and bad grammar intentionally to make his ciphers harder to decrypt. :)

Good one.
# Posted By Jason Dean | 4/27/10 6:59 PM
Number 5: Our Community is the best community.

I tried frequency analysis first and got nowhere with it. It wasn't until much later I realized that the six letter groupings were not to hide the word boundaries, but to help make a uniform grid size. D'oh! Like so:

OMTHTU
UMYECN
RUIBOI
CNSEMT
OITSMY

Once you have the grid, read down the first column to the end, then down the second column, etc. Very crafty, Jason. Ever read Cryptonomicon?
# Posted By Jonah Chanticleer | 4/27/10 9:57 PM
@jonah,

Awesome!!! Nice work. I wondered if anyone would get that one since it is not a substitution cipher. It is actually called a transposition cipher and I have not covered it yet. I will in the next post. Well done.
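
For anyone who wants to try the grid approach in code, here is a minimal Python sketch, assuming the ciphertext groups are the rows of the grid and the message is read down each column in turn. The function names read_columns and write_columns are made up for illustration.

```python
def read_columns(ciphertext: str) -> str:
    """Treat each space-separated group as a grid row and read the grid column by column."""
    rows = ciphertext.split()
    return "".join("".join(column) for column in zip(*rows))


def write_columns(plaintext: str, n_rows: int) -> str:
    """Inverse step: write the message down the columns, then emit the rows.

    Assumes the message length is an exact multiple of n_rows.
    """
    letters = [c for c in plaintext.upper() if c.isalpha()]
    return " ".join("".join(letters[r::n_rows]) for r in range(n_rows))


print(read_columns("OMTHTU UMYECN RUIBOI CNSEMT OITSMY"))
# OURCOMMUNITYISTHEBESTCOMMUNITY
print(write_columns("Our community is the best community", 5))
# OMTHTU UMYECN RUIBOI CNSEMT OITSMY
```

No letter is replaced here; the letters are only rearranged, which is why frequency analysis does not help.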

I am listening to the Cryptonomicon audio book right now. I am still near the beginning though.
# Posted By Jason Dean | 4/27/10 10:38 PM
That explains why my approach didn't work then. I wrote some quick code to loop over the alphabet, then with each iteration, shift the ascii value of each letter in the cipher by one. I tested it against the first two correct answers and it worked fine...not so much with the other ones.

<cfset string = 'WRTRPLADSV DP RCPU RZ RJMPUEM CRZRBOBM'>
<cfoutput>
<cfloop index="shift" from="-26" to="26">
   <cfloop index="inner" from="1" to="#Len(string)#">
      <cfset ascii = Asc(Mid(string,inner,1))>
      <cfif ascii EQ 32> &nbsp; &nbsp; &nbsp; <cfelse>#Chr(ascii+shift)#</cfif>
   </cfloop><br />
</cfloop>
</cfoutput>
# Posted By andy matthews | 4/28/10 6:40 AM
My first shot at #4 was that it was Playfair, owing to the pairs of letters. But, you have I, J, and Q, which would be unusual for a Playfair.

My second try was to build a tabula recta like a Vigenere cipher. This produced better results, as I can see a pattern, but can't make progress from there. It doesn't appear to be an actual Vigenere, as there doesn't appear to be a key. (But maybe I'm missing something crucial or overthinking it -- my copy of The Code Book has been gathering dust for the better part of the last decade.)
# Posted By Rick O | 4/28/10 10:48 AM
@andy, Your code should only work with the first one. The second one has the key letters in a random order. But nice work. I will make another post that shows some interesting modulus mathematics that makes the output a little nicer. It takes out all of the special chars and only works with the 26 character alphabet.

@rick, sorry to disappoint. I actually did not think to do a Playfair cipher. That would have been good too. I actually just made up this simple substitution myself. You may be over-thinking it. Look for patterns.
# Posted By Jason Dean | 4/28/10 1:15 PM
Ha! YOU SHOULD GO TO USER GROUP MEETINGS
# Posted By Joseph Lamoree | 4/28/10 8:15 PM
I split the string into symbols and made a sorted frequency array. I tried plugging in "ETAOIN SHRDLU" but got nowhere without the word separation. I used a little Java application called Decrypto with an English dictionary once I had the word boundaries. It solved it in less than a second.
# Posted By Joseph Lamoree | 4/28/10 8:20 PM
@joseph, w000t!!!! Nice work.

That one was tougher than I expected. I knew moving the word boundaries would make it harder but I thought the patterns would make it easy. I guess I was wrong. When I was putting it together I was trying to think about things that people might notice. Here are some things that I hoped people would notice. I'm sure most did, but I guess it was harder to put together than I thought it would be.

1. Patterns like LJ, VT, GE and HF were all repeated. If each ciphertext character represented a single plaintext character, this would be really weird, so each pair should represent a character. VT was also repeated two times in a row, so I was hoping someone would point out that that was likely one of the common paired letters EE, TT, DD, OO, FF, etc.

2. The two letters in each pair are always separated by exactly one letter of the alphabet. G and E are separated by F, L and J are separated by K, and so on. This would be another clue that each pair represented a single letter.

3. Since the patterns always matched, then we could actually strip off the first letter or last letter of every pair and make it easier. The pairs were there to obfuscate. The second character actually did not increase the technical difficulty.

So ultimately the answer is that this is a reversed Caesar cipher. The key alphabet is reversed to make the first character and then shifted to create the second character.

A = ZX
B = YW
C = XV
D = WU
E = VT
F = US
G = TR
H = SQ
I = RP
J = QO
K = PN
L = OM
M = NL
N = MK
O = LJ
P = KI
Q = JH
R = IG
S = HF
T = GE
U = FD
V = EC
W = DB
X = CA
Y = BZ
Z = AY
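
For the curious, the table above can be generated and reversed with a few lines of code. This is a minimal Python sketch of the scheme as I described it; the function names are made up for illustration, and the decoder only looks at the first letter of each pair since the second letter adds nothing.

```python
import string

ALPHABET = string.ascii_uppercase


def encode_letter(letter: str) -> str:
    """Map one plaintext letter to its two-letter pair."""
    i = ALPHABET.index(letter)
    first = ALPHABET[25 - i]               # reversed alphabet: A -> Z, B -> Y, ...
    second = ALPHABET[(25 - i - 2) % 26]   # the first letter shifted back by two
    return first + second


def encode(plaintext: str) -> str:
    """Encode every letter and drop everything else, which hides the word boundaries."""
    return " ".join(encode_letter(c) for c in plaintext.upper() if c in ALPHABET)


def decode(ciphertext: str) -> str:
    """Undo the reversal using only the first letter of each pair."""
    return "".join(ALPHABET[25 - ALPHABET.index(pair[0])] for pair in ciphertext.split())


print(encode("You"))            # BZ LJ FD
print(decode("BZ LJ FD HF"))    # YOUS
```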

I hope people don't hate me now for burning up their time ;)
# Posted By Jason Dean | 4/28/10 8:40 PM