No sooner did I mention SQL database problems than I found myself in the midst of one.
Category: Geek stuff
If it involves physics, math, computer science, electronics or has general geek cachet, this is the place for it.
Protecting your computer
At least once or twice a month, someone I know endures a computing disaster. This could be anything from a glass of wine spilled on a laptop to some kind of complex SQL database problem. In the spirit of Bruce Schneier, I thought I would offer some simple suggestions that anyone should be able to employ.
The most important thing is simply this: if it is important, back it up. Burn it to a CD, put it on a flash memory stick, email it to yourself or to a friend. The last thing you want is to have your laptop hard drive fail when it contains the only copy of the project you’ve spent the last month working on.
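For those comfortable with a bit of scripting, even a few lines of Python can make backups a habit rather than a chore. This is a minimal sketch, under the assumption that your irreplaceable files live in a single folder (the paths below are hypothetical placeholders):

```python
# Minimal backup sketch: zip an important folder into a dated archive.
# The source and destination paths are hypothetical examples.
import shutil
from datetime import date
from pathlib import Path

source = Path.home() / "Documents" / "thesis"   # files you cannot afford to lose
dest = Path.home() / "Backups" / f"thesis-{date.today().isoformat()}"
dest.parent.mkdir(parents=True, exist_ok=True)

# Writes e.g. ~/Backups/thesis-2006-10-01.zip, which you can then burn
# to a CD, copy to a flash memory stick, or email to yourself.
shutil.make_archive(str(dest), "zip", root_dir=source)
print(f"Backup written to {dest}.zip")
```

Run something like this weekly, or schedule it, and the worst case becomes losing a week of work rather than a month.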
Now, for a quick list of tips. These are geared towards university students, not those with access to sensitive information or large amounts of money:
- Do not trust anything you see online. If you get an email from ‘PayPal’ or your bank, assume it is from someone trying to defraud you. It probably is. Likewise, just because a website looks reputable, do not give it any sensitive information. This includes passwords you use for things like your bank.
- Never address email messages to dozens of friends. Lots of viruses search through your computer for email addresses to sell to spammers or use for attacks. If anyone in that fifty-person party invitation gets a virus, it could cause problems for all the rest. If you want to send email to many people, use the Blind Carbon Copy (BCC) feature that exists in almost all email programs and web-based email systems.
- If you run Windows, you must run a virus scanner. All the time. Without exception. If you run a Mac, run one in order to be sure you don’t pass along viruses to your friends. Both Oxford and UBC offer free copies of Sophos Antivirus. Install it and keep it updated.
- Run a spyware and adware scanner like AdAware often. If you are not doing advanced things with your computer, be proactive and use something like Spyware Blaster. (Note, some of the patches it installs can cause problems in rare circumstances.)
- No matter what operating system you run, make sure to apply security updates as soon as they come out. An unpatched Windows XP home machine is basically a sitting duck as soon as it is connected to the internet. See this BBC article.
- Only install software you really need. Lots of free software is riddled with spyware and adware that may not be removed when you uninstall it. Some file-sharing programs are especially bad for this. If you do any kind of file sharing, running a virus scanner becomes all the more imperative.
- Never use secret questions. If you are forced to, fill the box with a long string of random letters and numbers. If you cannot remember your passwords, write them down and guard them like hundred dollar bills.
- For your web browser, use Firefox. Safari is fine, but you should never use Internet Explorer. If a website forces you to (especially something like a bank), complain.
- If there is something you really want to keep secret, either keep it on a device not connected to any network or encrypt it strongly (see the sketch after this list). A user-friendly option for encryption is PGP. Whether it is some kind of classified research source or a photo of yourself you never want to see on the cover of the Daily Mail (once you are Prime Minister), it is best to encrypt it.
- Avoid buying compact discs that include Digital Rights Management (DRM). Many of the systems that are used to prevent copying can be easily hijacked by those with malicious ends. See one of my earlier posts on this.
- If you have a laptop, especially in Oxford or another high-theft area, insure it. It can be stolen in a minute, by someone breaking a window, picking a lock, or simply distracting you in a coffee shop. Aren’t you glad you made a backup of everything crucial before that happened?
- If your internet connection is on all the time (broadband), turn your computer off when you aren’t using it.
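As promised in the encryption tip above, here is a minimal sketch of encrypting a single file. It does not use PGP; it relies on the third-party Python ‘cryptography’ package, chosen purely for illustration, and the file names are invented:

```python
# Minimal file-encryption sketch using the third-party 'cryptography' package
# (pip install cryptography). File names are invented examples.
from cryptography.fernet import Fernet

# Generate a key once and guard it like the passwords mentioned above;
# without it, the encrypted file is unrecoverable.
key = Fernet.generate_key()
with open("secret.key", "wb") as key_file:
    key_file.write(key)

cipher = Fernet(key)
with open("embarrassing_photo.jpg", "rb") as plaintext_file:
    ciphertext = cipher.encrypt(plaintext_file.read())
with open("embarrassing_photo.jpg.enc", "wb") as encrypted_file:
    encrypted_file.write(ciphertext)

# To recover the original later: Fernet(key).decrypt(ciphertext)
```

PGP remains the friendlier option for email and for sharing encrypted material with other people; the point is simply that strong encryption is within easy reach.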
Basically, there are three big kinds of risks out there. The first is data loss, which should be prevented through frequent backups and vigilance against viruses. The second is data theft. Anyone sufficiently determined can break into your computer and steal anything on it, whether it is a Mac or a PC; that is true for everyone from your local police force to a clever fourteen-year-old. Some of the suggestions above help limit that risk, especially the ones about security updates and turning off your computer when it is not in use. The third risk is physical loss or destruction of hardware. That is where caution and insurance play their part.
If everyone followed more or less this set of protocols, I would get fewer panicked emails about hard drives clicking and computers booting to the infamous Blue Screen of Death.
[Update: 6 January 2007] The recent GMail bug has had me thinking about GMail security. Here are a few questions people using GMail might want to ask themselves:
- If I search for “credit card” while logged in, do any emails come up that contain a valid credit card belonging to me or to someone else? I only ask because that is just about the first thing that someone malicious who gets into your account will look for. “Account number” and similar queries are also worth thinking about.
- Can someone who gets the password to my Facebook account, or some other account on a trivial site, use it to get into my GMail account?
- Have I changed the password to my GMail account in the last few weeks or months?
If the answer to either of the first two is ‘yes,’ or to the last is ‘no,’ I would recommend taking some precautionary action.
Great circles and airline routes
When flying between western Canada and England, it sometimes seems surprising that such a northward trajectory is followed. On my way back to Vancouver, for instance, we were treated to an aerial view of Iceland’s unique landscape. Of course, the reason for the path is that the spherical character of the earth is not well reflected in standard map projections. The most famous – the Mercator projection – is arranged so that a straight line drawn on the map corresponds to a course of constant compass bearing on the earth. Maps that preserve angles in this way are called ‘conformal.’ The notorious distortion (enlarging the apparent size of polar regions while shrinking that of equatorial ones) is a direct consequence of that design.
That said, the shortest course between any two points on the globe is generally not the straight line that connects them on a Mercator projection. Mathematically, the most direct course follows what is called a ‘great circle.’ Imagine marking your present location and your destination on an orange: the single circle you could draw all the way around, intersecting both, is the great circle. The shorter of the two arcs between the points is the shortest path that can be traced between them on a sphere (or near-sphere, in the case of the earth).
Unless you are going due north, due south, or straight around the equator, actually following a great circle path requires constantly changing your heading, because the path does not maintain a constant bearing with respect to either magnetic or true north. In the days before computers and long-haul air travel, few navigators would have bothered to calculate great circle courses; the more venerable option was the rhumb line, which holds a constant bearing. Now, GPS and autopilot systems have made following a great circle all but automatic. Hence the genesis of those gracefully arcing lines printed in your in-flight magazine.
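For the curious, the standard way to compute a great circle distance is the haversine formula, and calculating the bearing at each end of the route shows why the heading has to keep changing. This is a rough sketch: the airport coordinates are approximate, and the earth is treated as a perfect sphere.

```python
# Great circle distance (haversine) and initial bearing between two points.
# Coordinates are approximate; the earth is treated as a perfect sphere.
from math import radians, degrees, sin, cos, asin, atan2, sqrt

EARTH_RADIUS_KM = 6371.0

def haversine_km(lat1, lon1, lat2, lon2):
    """Shortest (great circle) distance between two lat/lon points, in km."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))

def initial_bearing(lat1, lon1, lat2, lon2):
    """Compass bearing to fly at the *start* of the great circle route."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dlam = radians(lon2 - lon1)
    y = sin(dlam) * cos(phi2)
    x = cos(phi1) * sin(phi2) - sin(phi1) * cos(phi2) * cos(dlam)
    return (degrees(atan2(y, x)) + 360) % 360

vancouver = (49.19, -123.18)   # YVR, roughly
london = (51.47, -0.45)        # Heathrow, roughly

print(haversine_km(*vancouver, *london))      # about 7,580 km
print(initial_bearing(*vancouver, *london))   # about 34 degrees (north-east)
print(initial_bearing(*london, *vancouver))   # about 324 degrees, not simply 34 + 180
```

The fact that the outbound and return headings are not simple reciprocals of one another is exactly why following a great circle demands constant course corrections.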
On a separate note, the precision of modern location and navigation systems in aircraft can sometimes cause problems. (Via Philip Greenspun)
The awesome power of hardware
Is anyone else surprised by how emulated versions of games originally written for a machine with a 16-bit, 3.58 MHz processor can strain the capabilities of a computer with a 1300 MHz processor? The second machine runs roughly 363 times as many processor cycles per unit time (1300 ÷ 3.58 ≈ 363) and has a staggering 10,000 times more RAM. Dedicated chips are awesome; hence the superiority of digital cameras that use hardware interpolation (all Canon cameras, for instance) over those that interpolate in software on generic chips.
That said, it cannot really be denied that Super Metroid is the best of the Metroid series, Super Mario World is the best Mario game, and A Link to the Past is the best Zelda. Final Fantasy VII may narrowly beat Chrono Trigger as the best console RPG.
On electronic voting
There is some controversy in The Netherlands right now about electronic voting. A group has gotten hold of a voting machine, discovered that the physical and software security therein is very weak, and otherwise established the possibility that determined individuals could significantly impact election results through electronic tinkering.
The advantages of electronic voting are fairly numerous. Firstly, results could be tallied and announced more quickly. This may benefit the media more than anyone else, but it may as well be listed. Secondly, electronic devices could be made easier to use for people with physical disabilities and the like. Another advantage is increased standardization between voting districts: skullduggery involving dated or problematic machines in districts likely to vote a certain way has been noted in a number of recent elections. Also, having an electronic record in addition to a paper one allows cross-verification in disputed districts. In cases where the results starkly fail to match, it should be possible to repeat the vote with greater scrutiny.
The answer to the whole issue is exceptionally simple:
- You are presented with a screen where you select from among clearly labeled candidates, with an option to write in a name if that is part of your electoral system.
- The vote is then registered electronically, by whatever means, and a piece of paper is printed with the person’s choice of candidate, ideally in large bold letters.
- For an election involving multiple choices, each is likewise spelled out clearly. For instance, “I vote NO on Proposition X (flags for orphans).”
- The voter then checks the slip to make sure it is correct, before dropping it in a ballot box.
- These are treated in the standard fashion: locked, tracked, and observed before counting.
- The votes are tallied electronically, with a decent proportion (say, 20%) chosen at random for verification by hand (see the sketch after this list).
- If there is any serious discrepancy between the paper and electronic votes, all the paper ballots should be counted. Likewise, if there is a court ordered recount on the basis of other allegations of electoral irregularity.
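As a sketch of the hand-verification step above – with invented box identifiers and tallies, not data from any real system – the audit amounts to comparing a random sample of hand-counted boxes against the electronic record for the same boxes:

```python
# Sketch of the proposed audit: hand-count a random sample of ballot boxes
# and compare the results against the electronic tallies for the same boxes.
# All identifiers and numbers here are invented for illustration.
import random

electronic_tallies = {            # box id -> votes recorded electronically
    "box-001": {"YES": 412, "NO": 388},
    "box-002": {"YES": 301, "NO": 455},
    "box-003": {"YES": 290, "NO": 310},
    "box-004": {"YES": 512, "NO": 488},
    "box-005": {"YES": 123, "NO": 130},
}

def audit(electronic, hand_count, sample_fraction=0.2):
    """Hand-count a random sample of boxes; any mismatch triggers a full recount."""
    boxes = list(electronic)
    sample = random.sample(boxes, max(1, round(len(boxes) * sample_fraction)))
    mismatches = [box for box in sample if hand_count(box) != electronic[box]]
    return sample, mismatches

# In reality, hand_count would be the manual tally of the printed slips in a box;
# here we simply pretend the paper and electronic records agree.
sample, mismatches = audit(electronic_tallies, lambda box: electronic_tallies[box])
print(f"Audited {sample}; discrepancies: {mismatches or 'none'}")
```

The exact sampling rule matters less than the principle: the electronic count is never trusted without a physical check behind it.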
Electronic systems are vulnerable to hacked polling stations, interception and modification of results in transit, and server-side attacks where the data are amalgamated. Paper systems are vulnerable to physical tampering. Maintaining both systems, as independently as possible, helps to mitigate the risks of each and improves the credibility of the process. It is like having both your bank and your credit card company keep separate records of your transactions: if they do not match, you have a good leg to stand on when alleging some kind of wrongdoing.
This system could use relatively simple electronic machines, and may therefore actually cost less in the long run than all-paper balloting. Critically, it would maintain an unambiguous paper trail for verifying people’s voting intentions. Companies that deny the importance of such a trail are either not thinking seriously about the integrity of the voting process or have self-interested reasons for holding that position.
[Update: 14 October 2006] The Economist has a leader on electronic voting machines and the US midterm elections. They assert, in part:
The solutions are not hard to find: a wholesale switch to paper ballots and optical scanners; more training for election officials; and open access to machine software. But it is too late for any of that this time—and that is a scandal.
Quite right.
Threading aspirations
Warning: Blogging about Blogging
Oxford from above
As a recent comment proves, there is at least one thing Microsoft does better than Google: display aerial views of Oxford.
Compare Google Maps, centred on Wadham College, with the Windows Live equivalent: enormously superior.
Here, you can see:
- The Department of Politics and International Relations
- 2 Church Walk, my present abode
- Christ Church, main quad
- Radcliffe Square
Those pointed out, I should return to the overly loud MCR freshers party, and stop worrying about my ongoing student loan appeal dialogue. People should feel encouraged to list more nice Oxford locations in the comments (with links to Live Local photos).
March of the iPods
Today, iPod the Fifth arrived. They are packed much more compactly now than in earlier days. I suspect Apple is cutting costs in anticipation of having to compete with Microsoft’s Zune player, though, as always, it remains to be seen how successful that product will be. Everyone remembers the spectacular failure of the ROKR iTunes phone.
As regards iPod the Fifth, I hope it lasts as long as the previous four put together did.
In other news, the heating in our flat has suddenly been turned on. It hit me like a tropical blast as soon as I opened my door. I probably will no longer be sleeping in the woolen toque that Sarah P gave me.
PS. For some reason, iTunes 7.0.1 lacks the option to “only update checked songs” on the iPod. Since I was using that feature to keep a collection of songs small enough for the 20GB version updated, it will now not update at all, because the overall library is too big. I have come up with a crude hack (creating a smart playlist that includes all checked songs and having the iPod only update that), but doing so causes the device to list only that playlist, with none of my other smart or normal listings visible. Trying to add them all (even though they are the same songs as the ‘checked only’ list) causes a ‘not enough space’ error. Any ideas?
Seeking new Oxford bloggers
Oxford is positively laden with newly arriving students. At least some of them must be bloggers. If you are among them, please leave a comment with a link back to your site (if you want it added to my listing of Oxford blogs). Likewise, if anyone has found such a fresher blog, please leave a comment that links back to it.
I will not link blogs immediately. Rather, I will wait to see that they:
- have at least some real content
- have been around for at least a few weeks
Otherwise, maintaining the list would take far too long, and too many of its entries would have little value.
All Oxford bloggers should remember that the fourth OxBloggers gathering is happening on Wednesday of 4th week, November 1st.
PS. Making a link in a blog comment is easy. Just use the following format, replacing the square brackets with pointy ones (the ones that look like this shape ^ turned on either side):
[a href="http://www.thesiteyouarelinking.com"]the text you want for the link[/a]
That will make a string of blue text that says: “the text you want for the link.” When clicked, it will take the browser to www.thesiteyouarelinking.com. Every bit of the formatting is important, including the quotation marks, so be careful.
Basic problems with biometric security
You have to wonder whether anything other than having watched too many James Bond films feeds the idea that biometrics are a good means of achieving security. Nowadays, Canadians are not allowed to smile when they are having their passport photos taken, in hopes that computers will be able to read the images more easily. Of course, any computer matching system foiled by something as simple as smiling is not exactly likely to be useful for much.
Identification v. authentication
Biometrics can be used in two very distinct ways: as a means of authentication, and as a means of identification. Using a biometric (say, a fingerprint) to authenticate is akin to using a password in combination with a username: the username tells the system who you claim to be, and the second factor attempts to verify that claim using something you have (like a keycard), something you know (like a password), or something you are (like a fingerprint scan). Using a biometric for identification, by contrast, attempts to determine who you are from within a whole database of possibilities.
Using a fingerprint scan for identification is much more problematic than using it for authentication. It is a bit like letting people enter a password and, if it matches any password in the system, admitting them to that person’s account. It isn’t quite that bad, because a fingerprint is harder to guess and more distinctive than a typical password, but the problem remains: as the size of the database grows, so does the probability of a false match.
For another example, imagine you are trying to identify the victim of a car wreck using dental records. If person X is the registered owner and hasn’t been heard from since the crash, we can use dental records to authenticate that a badly damaged body almost certainly belongs to person X. This is like using biometrics for authentication. Likewise, if we know the driver could be one of three people, we can ascertain with a high degree of certainty which it is, by comparing dental x-rays from the body with records for the three possible matches. The trouble arises when we have no idea who person X is, so we try running the x-rays against the whole collection that we have. Not only is this likely to be resource intensive, it is likely to generate lots of mistakes, for reasons I will detail shortly.
The big database problem in security settings
The problem of a big matching database is especially relevant when you are considering the implementation of wholesale surveillance. Ethical issues aside, imagine a database of the faces of thousands of known terrorists. You could then scan the face of everyone coming into an airport or other public place against that set. Both false positive and false negative matches are potentially problematic. With a false negative, a terrorist in the database could walk through undetected; for any scanning system, some probability (which statisticians call Beta, or the Type II Error Rate) attaches to that outcome. Conversely, there is the possibility of identifying someone not on the list as being one of the listed terrorists: a false positive. The probability of this is Alpha (the Type I Error Rate), and it is in setting the matching threshold that the relative danger of false positives and false negatives is balanced.
A further danger is somewhat akin to ‘mission creep’ – the logic that, since the system is already in place for one purpose, we may as well use it for others too. This is a very frequent security issue. For example, think of driver’s licenses. Originally, they were meant to certify to a police officer that someone driving a car is licensed to do so, and some people would try to attack that system by making fake credentials. But once having a driver’s license lets you get credit cards, rent expensive equipment, secure other government documents, and the like, a system that existed for one purpose is vulnerable to attacks from people trying to do all sorts of other things. When that broadening of purpose is not anticipated, a serious danger exists that the security applied to the original task will prove inadequate.
A similar problem exists with potential terrorist matching databases. Once we have a system for finding terrorists, why not throw in the faces of teenage runaways, escaped convicts, people with outstanding warrants, etc, etc? Again, putting ethical issues aside, think about the effect of enlarging the match database on the likelihood of false positive results. If we could count on security personnel to behave sensibly when such a result occurs, there might not be too much to worry about. Unfortunately, numerous cases of arbitrary detention, and even the use of lethal force, demonstrate that this is a serious issue indeed.
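To make the effect of enlarging the match database concrete, here is a back-of-the-envelope sketch. The per-comparison false match rate and the passenger numbers are invented purely for illustration, and everyone scanned is assumed to be innocent:

```python
# Rough sketch: how expected false positives grow with the size of the watchlist.
# The false match rate and traffic figures are invented; nobody scanned is
# actually on the list, so every flag is a false alarm.
per_comparison_fpr = 1e-5      # alpha: chance one innocent face 'matches' one listed face
travellers_per_day = 100_000   # people scanned at the airport each day

for watchlist_size in (100, 1_000, 10_000, 100_000):
    # Probability that a given innocent traveller matches at least one listed face
    p_flagged = 1 - (1 - per_comparison_fpr) ** watchlist_size
    expected_false_alarms = p_flagged * travellers_per_day
    print(f"watchlist of {watchlist_size:>7,}: about {expected_false_alarms:,.0f} innocent people flagged per day")
```

With those invented numbers, enlarging the list from a hundred faces to a hundred thousand turns roughly a hundred false alarms a day into tens of thousands – each of which someone has to handle sensibly.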
The problem of rare properties
In closing, I want to address a fallacy related to this issue: when an imperfect test is applied to a rare condition, a positive result is almost always more likely to be false than genuine. It seems counterintuitive, but it makes perfect sense. Consider this example:
I have developed a test for a hypothetical rare disease. Let’s call it Panicky Student Syndrome (PSS). In the whole population of students, one in a million is afflicted. My test has an accuracy of 99.99%: whichever category a student falls into, the test gives the correct answer 99.99% of the time. That means that if the test is administered to a random collection of students, there is a one in 10,000 chance that a particular student will test positive without having PSS. Remember that the odds of actually having PSS are only one in a million, so there will be roughly 100 false positives for every real one – a situation that arises whenever the probability of the person actually having the trait in question (whether a rare disease or being a terrorist) is very low.
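A short calculation makes the ratio explicit; this just applies Bayes’ Theorem to the numbers in the example above:

```python
# Bayes' Theorem applied to the Panicky Student Syndrome example.
prevalence = 1 / 1_000_000       # one student in a million has PSS
sensitivity = 0.9999             # P(positive test | has PSS)
false_positive_rate = 0.0001     # P(positive test | does not have PSS)

p_positive = (sensitivity * prevalence
              + false_positive_rate * (1 - prevalence))
p_pss_given_positive = sensitivity * prevalence / p_positive

print(f"P(PSS | positive test) = {p_pss_given_positive:.4f}")   # about 0.0099, i.e. roughly 1%
# In other words, about 100 out of every 101 positive results are false alarms.
```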
Given that the reliability of even very expensive biometrics is far below that of my hypothetical PSS test, the ratio of false positives to real ones is likely to be even worse. This is something to consider when governments start collecting fingerprints, iris scans, and the like in the name of increased security.
PS. Those amazed by Bond’s ability to circumvent high-tech seeming security systems using gadgets of his own should watch this MythBusters clip, in which an expensive biometric lock is opened using a licked black and white photocopy of the correct fingerprint.
PPS. I did my first Wikipedia edit today, removing someone’s childish announcement from the bottom of the biometrics entry.
[Update: 3 October 2006] For a more mathematical examination of the disease testing example, using Bayes’ Theorem, look here.