Random numbers

Truly random numbers are hard to find, as patterns tend to abound everywhere. This is problematic, because there are times when a completely random string of digits is necessary: whether you are choosing the winner of a raffle or generating the one-time pad that secures the line from the White House to the Kremlin.

Using random radio crackle, random.org promises to deliver random data in a number of convenient formats (though one should be naturally skeptical about the security of such services). Another page, by Jon Callas, provides further information on why random numbers are both necessary and surprisingly tricky to get.
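For everyday purposes, a minimal sketch of drawing unpredictable values might look like the following. It assumes Python's standard `secrets` module, which reads from the operating system's cryptographic random source; that is not the same thing as the atmospheric noise mentioned above. The function names are made up for illustration.

```python
import secrets


def random_digits(n: int) -> str:
    """Return a string of n decimal digits from the OS random source."""
    return "".join(str(secrets.randbelow(10)) for _ in range(n))


def pick_winner(entrants: list[str]) -> str:
    """Choose one raffle entrant, each with equal probability."""
    return secrets.choice(entrants)


def xor_bytes(data: bytes, pad: bytes) -> bytes:
    """XOR data with a pad of the same length (one-time pad step)."""
    if len(data) != len(pad):
        raise ValueError("pad must be exactly as long as the data")
    return bytes(d ^ p for d, p in zip(data, pad))


def one_time_pad_encrypt(message: bytes) -> tuple[bytes, bytes]:
    """Generate a fresh pad and return (pad, ciphertext)."""
    pad = secrets.token_bytes(len(message))
    return pad, xor_bytes(message, pad)
```

Decryption is the same XOR applied again: `xor_bytes(ciphertext, pad)` returns the original message. The pad must be as long as the message, kept secret, and never reused.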

This comic amusingly highlights another aspect of the issue.

Encrypting personal communication

Statue outside the National Archives

Personal use of encrypted communication is yet another example of so-called ‘network effects.’ (These have been mentioned previously: 1, 2, 3.) The basic idea is that the more widespread certain technologies become, the more useful they are to everyone using them. The most commonly cited examples are telephones and fax machines; back when only a few people had them, they had limited utility. You would need alternative channels of communication and you would waste time deciding which one to use and exchanging instructions about that with other parties. Once telephones became ubiquitous, each one was a lot more powerful and convenient. The same can be said for email addresses.

Good free software exists that allows the encryption of emails at a level where it would challenge major organizations to read them. While this may not protect an individual message that falls under scrutiny, it changes the dynamic of the whole system. It is no longer possible to filter every email passing along a fibre-optic cable for certain keywords, for instance. You would need to crack every one of them first.

Making the transition to the routine use of encryption, however, requires more effort than the adoption of telephones or email. While those technologies were more convenient than their predecessors, encryption adds a layer of difficulty to communication. You need the required software, a generated key pair, and a passphrase. It is possible to make mistakes and encrypt things such that you can never access them again.

As such, there is a double barrier to the adoption of widespread communication encryption: people must deal with the added difficulties involved in communicating in this way and with the problem that hardly anyone uses such systems now. If there is nobody out there with whom you can exchange PGP encrypted messages, you aren’t too likely to bother with acquiring and using the software. It is entirely possible that those two constraints will prevent widespread adoption for the foreseeable future.

One nice exception to this rule is Skype. Users may not know it, but calls made over Skype are transmitted in encrypted form, which considerably increases the difficulty of intercepting them. The fact that users do not know this is happening greatly increases the level of usage (you cannot avoid using it). While such systems may well not be as secure as explicit encryption efforts undertaken by senders and recipients, they may be a useful way to increase overall adoption of privacy technology. Such ‘invisible encryption’ could also be usefully incorporated into stores of personal data, such as the contents of GMail accounts.

PS. For anyone who decides to give PGP a try, my public key is available here.

Footprints all over the web – Google Web History

Red brick facade and fire escapes

When I am online, I usually have at least one Google service open. At home, I usually have a Google Mail window open at all times, as well as Google Calendar. At work, it is only the latter. What I didn’t know until today is that whenever you are logged into your Google account, Google is tracking your web usage through a system called Web History. Accessing the system allows you to ‘pause’ the recording and even delete what is already there. While the listings disappear from your screen, there is good reason to doubt whether they vanish from Google’s records.

It is common knowledge that Google saves every search query that gets input into it, and does so in a way that can be linked to an individual computer. The web history service, however, has more troubling implications. Whether you are at work, at home, or at an internet cafe, you just need to be logged into any Google service for it to be operating. Since more than one computer can be logged into a Google account at once, and there is no indication on either machine that this is happening, anybody who gets your password can monitor your web usage, as well as your email and any other Google services you use. Given how common keyloggers have become, this should worry people.

One very helpful feature Google could implement would be the option to show when and where you last logged into your account. That way, if someone has been peeking at your email from London while you have been in Seattle, you know that it may be time to change your password. Also desirable, but much less likely to happen, would be a requirement that services like GMail store your information as an encrypted archive. Even if the encryption were based on your password and a relatively weak cipher, it would make it impractical for either Google or malicious agents with access to their information storage systems to undertake the wholesale mining of the information therein.

The final reason for which this is concerning has to do with cooperation between companies and governments. It is widely rumoured that companies including Microsoft and Yahoo have helped the Chinese government to track down and prosecute dissidents, by turning over electronic records held outside China. Given the increasingly bold snooping of both democratic and authoritarian governments, a few more layers of durable protection built into the system would be prudent and encouraging.

The Code Book

Simon Singh’s The Code Book proves, once again, that he is a superlatively skilled writer on technical and scientific subjects. Thanks to his book, I now actually understand how Enigma worked and how it was broken; likewise, the Vigenère cipher that has been built into this site for so long. This book manages to capture both major reasons for which cryptography is so fascinating: the technical aspects, centred around the ingenuity of the methods themselves, and the historical dramas connected with them, from the execution of Mary Queen of Scots to the use of ULTRA intelligence during the Second World War.

Anybody who has any interest in code-making or code-breaking should read this book, unless they already know so much about the subject as to make Singh’s clear and comprehensible explanations superfluous. Even then, it may arm them with valuable tools for explaining interesting concepts to the less well initiated.

At the end of the book is a series of ten ciphers for the reader to break. Originally, there was a £15,000 prize for the first person to crack the lot. Now, they exist for the amusement of amateur cryptologists. I doubt very much I will get through all ten, but I am giving it a try. The first ciphertext is on his website and is helpfully labeled ‘Simple Monoalphabetic Substitution Cipher.’ I expect to crack it quickly.


M’s PL, XII

(220), (210), (241), (310), (250)
(350), (380), (317), (271), (346)
(222+1), (212), (302), (258), (280)
(127), (100), (556), (452+1), (599)
(621), (633), (590), (392), (387)
(414), (423), (539), (572), (157)
(142), (128), (189), (529+2), (412)
(361+1), (351), (200), (229), (174)
(409), (440), (594), (532), (539)
(608), (259+1), (310), (271), (100)
(143), (98), (478), (530), (599)
(369+1), (343), (321), (370), (375)
(389), (413), (530), (58), (79)
(33), (87), (211+1), (251), (346)
(556), (608), (631), (640), (546)
(579), (549), (492), (481), (429)
(336), (387), (442+1), (219), (213)
(439), (450), (551), (632), (245)
(396+1), (589), (539), (418), (499)
(422), (460+2)

Hint: second Beale cipher.
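For readers unfamiliar with the hint: the second Beale cipher is a book cipher, in which each number points to a word in an agreed key text and stands for that word's first letter. The following is a minimal sketch of that general technique only. The key text here is a toy sentence, the function names are invented, and the sketch makes no attempt to interpret the ‘+1’ and ‘+2’ annotations above, whose meaning the post does not state.

```python
import re


def first_letters(key_text: str) -> list[str]:
    """Return the first letter of every word in the key text, in order."""
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)*", key_text)
    return [word[0].lower() for word in words]


def decode(numbers: list[int], key_text: str) -> str:
    """Map each 1-based word number to that word's first letter."""
    letters = first_letters(key_text)
    return "".join(letters[n - 1] for n in numbers)


def encode(message: str, key_text: str) -> list[int]:
    """Encode letters as word numbers, using the first matching word.

    A real user would vary which matching word is chosen; always taking
    the first one is a simplification. Raises ValueError if no word in
    the key text starts with a needed letter.
    """
    letters = first_letters(key_text)
    return [letters.index(ch) + 1 for ch in message.lower() if ch.isalpha()]


# Toy key text (an assumption for illustration, not the puzzle's key).
TOY_KEY = "the quick brown fox jumps over the lazy dog"
```

With this toy key, `decode([5, 6, 3], TOY_KEY)` picks the first letters of the fifth, sixth, and third words, and `encode` reverses the process.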

Strengthening substitution ciphers

Fountain in Gatineau

The biggest problem with substitution ciphers (those that replace each letter with a particular other letter or symbol) is that they are vulnerable to frequency analysis. In any language, some letters are more common than others. By matching up the most common symbols with what you know the most common letters are, you can begin deciphering the message. Likewise, you can use rules like ‘a rare letter that almost always appears to the left of one specific more common letter is probably a Q.’ What is needed to strengthen such ciphers is a language in which words have no such ‘personality.’ Here is how to do it:

First, take all the short words (fewer than three letters) and assign each a random three-letter code. Lengthening very short words further strengthens this approach because short words are the most vulnerable to frequency analysis; a single letter sitting with spaces on either side is probably ‘a’ or ‘i.’ Using three-letter groups and 26 letters, you can represent 17,576 words. Now, take as many words from the whole language as you want to be able to use. For the sake of completeness, let’s use the entire Oxford English Dictionary. The 456,976 possible four-letter groups more than suffice to cover every word in it, leaving some space for technical terms that we may want to encrypt but which might not be included. If we need even more possibilities, there are 11,881,376 five-letter combinations.
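The counts above are just powers of 26, and the codebook itself is easy to sketch. The following is one possible reading of the scheme, not a definitive implementation: the function names, the fixed group length, and the optional seed are all assumptions made for illustration.

```python
import random
import string


def group_count(length: int, alphabet_size: int = 26) -> int:
    """Number of distinct letter groups of the given length."""
    return alphabet_size ** length


def build_codebook(words, group_length=4, seed=None):
    """Assign each distinct word its own random group of letters.

    The ordinary random module is used because the dictionary is meant
    to be public; the seed parameter exists only to make runs repeatable.
    """
    rng = random.Random(seed)
    vocabulary = sorted(set(words))
    if len(vocabulary) > group_count(group_length):
        raise ValueError("not enough groups of this length for every word")
    codes = set()
    while len(codes) < len(vocabulary):
        codes.add("".join(rng.choice(string.ascii_lowercase)
                          for _ in range(group_length)))
    # Sort before shuffling so the result depends only on the seed.
    code_list = sorted(codes)
    rng.shuffle(code_list)
    return dict(zip(vocabulary, code_list))
```

Here `group_count(3)`, `group_count(4)`, and `group_count(5)` give 17,576, 456,976, and 11,881,376 respectively, matching the figures in the text.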

This approach is cryptographically valuable for a number of reasons. Since the codes representing words consist of random letters, the letter frequency in a ‘translated’ message is also random. You no longer need to worry that some English letters are more common than others. Just as important, there are none of the ‘Q’ type rules by which to later attack the substitution cipher. The dictionary of equivalencies would not need to be secret; indeed, it should be widely available. Having the dictionary does not make encrypted messages more vulnerable, since they will have passed through a substitution cipher before being distributed and are fundamentally more robust to the cryptanalysis of substitution ciphers than a message enciphered from standard English would be.
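The whole pipeline (public dictionary first, secret substitution cipher second) can be sketched roughly as follows. This is an illustration under assumptions, not a tested design: it uses fixed four-letter groups for every word, keeps the spaces between groups, and all names are invented.

```python
import random
import string
from collections import Counter

ALPHABET = string.ascii_lowercase


def make_codebook(words, rng, length=4):
    """Public dictionary: each distinct word gets a random letter group."""
    vocabulary = sorted(set(words))
    codes = set()
    while len(codes) < len(vocabulary):
        codes.add("".join(rng.choice(ALPHABET) for _ in range(length)))
    shuffled = sorted(codes)
    rng.shuffle(shuffled)
    return dict(zip(vocabulary, shuffled))


def make_substitution_key(rng):
    """Secret key: a random one-to-one remapping of the alphabet."""
    shuffled = list(ALPHABET)
    rng.shuffle(shuffled)
    return str.maketrans(ALPHABET, "".join(shuffled))


def encrypt(text, codebook, key):
    """Replace words with their groups, then apply the substitution."""
    groups = [codebook[word] for word in text.lower().split()]
    return " ".join(groups).translate(key)


def decrypt(ciphertext, codebook, key):
    """Undo the substitution, then look each group up in the dictionary."""
    inverse_key = {cipher: plain for plain, cipher in key.items()}
    reverse_book = {code: word for word, code in codebook.items()}
    groups = ciphertext.translate(inverse_key).split()
    return " ".join(reverse_book[group] for group in groups)


def top_letters(text, n=3):
    """Most common letters in a text, for a rough frequency comparison."""
    return Counter(c for c in text if c.isalpha()).most_common(n)
```

Calling `top_letters` on a plaintext and on its ciphertext gives a rough comparison of their letter frequencies. One thing the sketch makes visible is that a word which repeats still produces the same group each time, so patterns at the level of whole words survive even where letter-level patterns are disguised.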

In the era of modern algorithms like AES, I doubt there is any need for the above system. Still, I wonder if there are any historical examples of this approach being used. If you have a computer to do the code-for-word and word-for-code substitutions, it would be quite a low effort mechanism to increase security.

Botnets

The rise of the botnet is an interesting feature of contemporary computing. Essentially, it is a network of compromised computers belonging to individuals and businesses, now under the control of some other individual or group without the knowledge or permission of their owners. These networks are used to spread spam, defraud people, and otherwise exploit the internet system.

A combination of factors has contributed to the present situation. The first is that virtually all computers are now networked. Using a laptop on a plane is a disconcerting experience, because you just expect to be able to check the BBC headlines or access some notes you put online. The second is the relative insecurity of operating systems. Some seem to be more secure than others, namely Linux and Mac OS X, but that may be more because fewer people use them than because they are fundamentally more secure. In a population of 95% sheep, sheep diseases will spread a lot faster than diseases that affect the goats who are the other 5%. The last important factor is the degree to which both individuals and businesses are relatively unconcerned (and not particularly liable) when it comes to what their hijacked computers might be up to.

Botnets potentially affect international peace and security, as well. Witness the recent cyberattacks unleashed against Estonia. While some evidence suggests they were undertaken by the Russian government, it is very hard to know with certainty. The difficulty of defending against such attacks also reveals certain worrisome problems with the present internet architecture.

The FBI is apparently on the case now, though the task will be difficult, given the economics of information security.