By about 1300 CE, Arabic cryptographers had determined that you can decipher messages in which one letter has been replaced by another letter, number, or symbol by exploiting statistical characteristics of the underlying language. Here are some especially useful patterns in English.
- E is by far the most common letter – representing about 1/8th of normal text.
- If you list the alphabet from most to least commonly used, it divides into four groups.
- The highest frequency group includes: e, t, a, o, n, i, r, s, and h.
- The middle frequency group includes: d, l, u, c, and m.
- Less common are p, f, y, w, g, b, and v.
- The lowest frequency group includes: j, k, q, x, and z.
- E associates most widely with other letters: appearing before or after virtually all of them, in different circumstances.
- Among combinations of a, i, and o, io is the most common. Ia is the second most common. Ae is rarest.
- 80% of the time, n is preceded by a vowel.
- 90% of the time, h appears before vowels.
- R tends to appear with vowels; s tends to appear with consonants.
- The most common repeated letters are ss, ee, tt, ff, ll, mm and oo.
Naturally, there are thousands more such patterns. Even understanding a few can help in deciphering messages that have had a basic substitution cipher applied.
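Contact patterns like the ones above can be measured on any sample of English. Here is a rough sketch of one way to do it; the function name `preceded_by_vowel_rate` and the choice to skip letters at the start of a word (which have no predecessor) are assumptions made for illustration.

```python
import re

VOWELS = set("aeiou")

def preceded_by_vowel_rate(text, letter):
    """Fraction of non-word-initial occurrences of `letter` that follow a vowel."""
    hits = total = 0
    for word in re.findall(r"[a-z]+", text.lower()):
        # Start at index 1: a word-initial letter has nothing before it.
        for i in range(1, len(word)):
            if word[i] == letter:
                total += 1
                hits += word[i - 1] in VOWELS
    return hits / total if total else 0.0

if __name__ == "__main__":
    sample = "Even understanding a few can help in deciphering messages"
    print(f"{preceded_by_vowel_rate(sample, 'n'):.0%}")
```

Run on a long stretch of English with `"n"`, the result can be compared against the 80% figure quoted above; the exact value will vary with the sample.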
Here’s one to try out:
LKCLHQBCKDRCPQQBDKAPZULSQUCDK
AZRDTDGPCOTZKQDPQBZQDQZHHLOIP
XLSVDQBZAOCZQICZGLHQDJCQLOCZI
QBDKAPQBZQDKQCOCPQXLSDKXLSOPM
ZOCQDJCSKHLOQSKZQCGXLQQZVZDPO
CGZQDTCGXMLLOGXMOLTDICIHLOVDQ
BRZHCPQLLJZKXLHQBCJRGLPCNSDQC
CZOGXDKQBCCTCKDKA
One hint is that cipher alphabets are not always entirely random. The tools on this page are useful for cracking monoalphabetic substitution ciphers.
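A natural first step with the ciphertext above is to tally how often each cipher letter appears and which letters occur doubled, then compare that against the English patterns listed earlier. The following is a minimal sketch, not one of the tools mentioned above; the names `CIPHERTEXT`, `letter_counts`, and `doubled_letters` are made up for illustration, and it assumes the line breaks in the ciphertext carry no meaning.

```python
from collections import Counter

# The ciphertext from above, joined into one string.
CIPHERTEXT = (
    "LKCLHQBCKDRCPQQBDKAPZULSQUCDK"
    "AZRDTDGPCOTZKQDPQBZQDQZHHLOIP"
    "XLSVDQBZAOCZQICZGLHQDJCQLOCZI"
    "QBDKAPQBZQDKQCOCPQXLSDKXLSOPM"
    "ZOCQDJCSKHLOQSKZQCGXLQQZVZDPO"
    "CGZQDTCGXMLLOGXMOLTDICIHLOVDQ"
    "BRZHCPQLLJZKXLHQBCJRGLPCNSDQC"
    "CZOGXDKQBCCTCKDKA"
)

def letter_counts(text):
    """Count each letter, ignoring anything that is not a letter."""
    return Counter(ch for ch in text.upper() if ch.isalpha())

def doubled_letters(text):
    """Count adjacent pairs of identical letters, such as 'QQ'."""
    return Counter(a + b for a, b in zip(text, text[1:]) if a == b)

if __name__ == "__main__":
    counts = letter_counts(CIPHERTEXT)
    total = sum(counts.values())
    # Show the five most frequent cipher letters with their share of the text.
    for letter, n in counts.most_common(5):
        print(f"{letter}: {n} ({n / total:.1%})")
    print(doubled_letters(CIPHERTEXT).most_common())
```

With a message this short, the ranking will not match the English frequency table exactly, so the counts are best treated as a source of candidate guesses rather than answers.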
Here is a much easier-to-solve ciphertext from Simon Singh’s website.
What a difference spaces make…
Crushing by elephant
In your puzzle, I am betting C is ‘e’ and Q is ‘t.’
If Q is ‘t’, QQZVZ might be ‘Ottawa.’
LQQZVZ, rather
oKeoHtBeKDRePttBDKAPaUoStUeDK
AaRDTDGPeOTaKtDPtBatDtaHHoOIP
XoSwDtBaAOeatIeaGoHtDJetoOeaI
tBDKAPtBatDKteOePtXoSDKXoSOPM
aOetDJeSKHoOtSKateGXottawaDPO
eGatDTeGXMooOGXMOoTDIeIHoOwDt
BRaHePtooJaKXoHtBeJRGoPeNSDte
eaOGXDKtBeeTeKDKA
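For anyone following along, a partial decryption like the one above can be produced mechanically. This is a small sketch under the convention used in this thread: guessed plaintext letters are written in lowercase and unresolved cipher letters stay uppercase. The names `GUESSES` and `apply_guesses` are my own.

```python
CIPHERTEXT_LINES = [
    "LKCLHQBCKDRCPQQBDKAPZULSQUCDK",
    "AZRDTDGPCOTZKQDPQBZQDQZHHLOIP",
    "XLSVDQBZAOCZQICZGLHQDJCQLOCZI",
    "QBDKAPQBZQDKQCOCPQXLSDKXLSOPM",
    "ZOCQDJCSKHLOQSKZQCGXLQQZVZDPO",
    "CGZQDTCGXMLLOGXMOLTDICIHLOVDQ",
    "BRZHCPQLLJZKXLHQBCJRGLPCNSDQC",
    "CZOGXDKQBCCTCKDKA",
]

# Guesses so far, cipher letter -> plaintext letter (from the comments above).
GUESSES = {"C": "e", "Q": "t", "L": "o", "Z": "a", "V": "w"}

def apply_guesses(line, guesses):
    """Swap in lowercase plaintext for guessed letters; leave the rest as-is."""
    return "".join(guesses.get(ch, ch) for ch in line)

for line in CIPHERTEXT_LINES:
    print(apply_guesses(line, GUESSES))
```

Adding a new guess is just another entry in `GUESSES`, which makes it easy to try a letter and back it out again if the result looks wrong.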
‘tBe’ occurs three times, so B may be ‘h.’
oKeoHtheKDRePtthDKAPaUoStUeDK
AaRDTDGPeOTaKtDPthatDtaHHoOIP
XoSwDthaAOeatIeaGoHtDJetoOeaI
thDKAPthatDKteOePtXoSDKXoSOPM
aOetDJeSKHoOtSKateGXottawaDPO
eGatDTeGXMooOGXMOoTDIeIHoOwDt
hRaHePtooJaKXoHtheJRGoPeNSDte
eaOGXDKtheeTeKDKA
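The "occurs three times" observation can be checked by counting three-letter runs in the raw ciphertext, where ‘tBe’ appears as QBC. A quick sketch, with `ngram_counts` being a name I made up and the ciphertext lines joined on the assumption that the line breaks are arbitrary:

```python
from collections import Counter

CIPHERTEXT = (
    "LKCLHQBCKDRCPQQBDKAPZULSQUCDK"
    "AZRDTDGPCOTZKQDPQBZQDQZHHLOIP"
    "XLSVDQBZAOCZQICZGLHQDJCQLOCZI"
    "QBDKAPQBZQDKQCOCPQXLSDKXLSOPM"
    "ZOCQDJCSKHLOQSKZQCGXLQQZVZDPO"
    "CGZQDTCGXMLLOGXMOLTDICIHLOVDQ"
    "BRZHCPQLLJZKXLHQBCJRGLPCNSDQC"
    "CZOGXDKQBCCTCKDKA"
)

def ngram_counts(text, n):
    """Count every overlapping run of n consecutive letters."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

trigrams = ngram_counts(CIPHERTEXT, 3)
print(trigrams["QBC"])  # prints 3
```

The same function with `n=2` gives digram counts, which is handy for testing guesses against the contact patterns in the post.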
abcdefghijklmnopqrstuvwxyz
Z---C--B------L----Q--V---
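The two rows above line up each plaintext letter with its guessed cipher letter. A row like that can be generated from the guesses so far, which avoids miscounting the gaps by hand. This is only a sketch; `KNOWN` and `partial_key_row` are illustrative names.

```python
import string

# Guesses from the thread so far: plaintext letter -> cipher letter.
KNOWN = {"a": "Z", "e": "C", "h": "B", "o": "L", "t": "Q", "w": "V"}

def partial_key_row(known):
    """Cipher letter under each plaintext letter, '-' where still unknown."""
    return "".join(known.get(p, "-") for p in string.ascii_lowercase)

print(string.ascii_lowercase)
print(partial_key_row(KNOWN))
```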
I am betting the cipher alphabet is in the form:
abcdefghijklmnopqrstuvwxyz
WORDABCEFGHIJKLMNPQSTUVXYZ
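To make that form concrete, here is a sketch of how such a keyword-based cipher alphabet can be built, assuming that is the construction meant: the keyword comes first with repeated letters dropped, followed by the unused letters in alphabetical order. `keyword_alphabet` is a hypothetical helper, and "WORD" is only the placeholder from the comment, not the key to the puzzle.

```python
import string

def keyword_alphabet(keyword):
    """Keyword (duplicates dropped) followed by the unused letters A-Z."""
    seen = []
    for ch in keyword.upper() + string.ascii_uppercase:
        if ch.isalpha() and ch not in seen:
            seen.append(ch)
    return "".join(seen)

print(string.ascii_lowercase)
print(keyword_alphabet("WORD"))
```

Note that in this simple form the keyword sits under a, b, c and so on. An alphabet that is shifted or otherwise rearranged after the keyword is applied would need an extra step.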