Two questions for data management

1) How bad would it be to lose it?

If the answer is ‘bad’ then you absolutely must have at least two copies.

If the answer is ‘terrible’ then at least three copies is wise. At least one should be in a safe place off site. Incremental backups are better than basic ones, since files do get corrupted and vanish.

Ideally, you want both up-to-date incremental backups and complete snapshots, taken at regular intervals and checked periodically for integrity.
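As a rough illustration of what 'checked periodically for integrity' could mean in practice, here is a minimal sketch that compares SHA-256 checksums between an original folder and a backup copy. The function names are my own inventions for the example, and it assumes the backup is a plain file-for-file copy rather than a compressed or incremental archive.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_backup(original: Path, backup: Path) -> list[str]:
    """Return relative paths that are missing from, or differ in, the backup."""
    source = build_manifest(original)
    copy = build_manifest(backup)
    return [name for name, digest in source.items() if copy.get(name) != digest]
```

An empty list from `verify_backup` means every file in the original has an identical twin in the backup. It says nothing about whether the original itself was already corrupted, which is the case that older snapshots are there to cover.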

2) How bad would it be if it ended up all over the internet?

If this would be a problem at all, there is a whole universe of precautions to consider:

Hardware: firewalls, built-in encryption, air-gapped systems and storage, networking hardware (including WiFi), etc.

Software: cryptography, operating systems, malware risks (including pirated software), intrusion detection systems, etc.

Behaviour: physical access control, data retention and destruction, passwords and secret questions, backup processes

Digital data is the sort of thing where not only can the cat get out of the bag, but it can get out, get copied a billion times, and become your name’s top Google hit for the rest of your life.

Evil and non-evil Facebook buttons

Many websites now include Facebook buttons and widgets of various sorts. As a user, it is worth knowing that if you are logged into Facebook, many of those buttons and widgets can be used by Facebook to track your web use and link it to your real identity.

This site has a Facebook button, as well, but it is a graphic that loads from my own server. It does not allow Facebook to add to their trove of data.

That said, Google has its own massive data pile, which this site contributes to in obvious ways like content being indexed and less obvious ways like Google Analytics visitor tracking.

Flex your rights: anonymity

Being able to speak anonymously on the internet is an important right, in this age of increasingly constant surveillance. Because of organizations like the NSA, GCHQ, and Canada’s CSE, we can never know when our private conversations are actually being intercepted.

One tiny way to push back is to continue to be bold in asserting the importance of freedom of speech, even when circumstances compel that right to be used anonymously.

To leave anonymous comments on this site, just use whatever made-up name you like, including ‘anonymous’. If you use anon@sindark.com as your email address, you will get an anonymous logo beside your comment.

None of this is intended as an endorsement of the amorphous group ‘Anonymous’.

2010 blog finances

For most people who put content on the internet, the deal provided by one company or another is this: you provide the content, we will put ads beside it, and we will pay for the servers and bandwidth necessary for hosting a website. More sneakily, sites like Facebook make their money by selling the personal information of users, in addition to selling targeted advertising (which is increasingly the same thing).

Some sites do all this earning and paying indirectly, with the people running the site outside the advertising/hosting cost loop. Alternatively, it is possible to do both yourself: sell ads and pay for hosting.

Lately, this site has followed the latter model. I pay for hosting and I have revenue from automatically-generated Google AdSense ads. The costs largely balance out. Between 1 January 2010 and 1 January 2011, this site and BuryCoal.com collectively received C$296.38 in advertising revenue. During the same span, I paid US$249.70 collectively to DreamHost and Flickr.

Would people feel more comfortable if this site were hosted by a third party that kept the advertising revenue, rather than self-funding in this way? One consideration is scaling hosting to demand. A third party would handle that for me, but I couldn’t choose to pay for performance improvements. For instance, moving to a private VPS account on DreamHost would cost US$15 per month, but would probably make the site quicker and more reliable.

[Update: 11:36pm] I have always encouraged readers who disliked the ads to use Firefox with the AdBlock Plus plugin.

On sindark.com

sindark.com might seem like a rather random URL for this site, which consists of a mixture of posts on climate change, photography, Ottawa, and other general subjects of interest to me. The genesis of the name is a long one. Back when I was an undergraduate at UBC, a friend of mine exposed me to the James Joyce poem “Nightpiece” which contains the sonorous line: “Night’s sindark nave.” I chose that as the title for my blog at the time, which was still produced and hosted using Google’s Blogger service.

The site underwent several evolutions – moving to a private hosting company and eventually to being managed through WordPress. It also got a major update after I finished at UBC. Along with that update came the new name: “a sibilant intake of breath”. As such, the current name has nothing to do with the current URL, except insofar as both are taken from literature.

The address of the site is potentially problematic, insofar as it contains misleading theological overtones. It may communicate something a bit useful, in that this site is pretty anti-religious, but that is hardly the most important thing to highlight. As such, it is probably a good idea to eventually migrate to a new address, probably leaving all the old content where it is now.

The new address should ideally be something short and memorable, which is certainly challenging in a crowded internet landscape. I would strongly prefer for it to be .com, rather than .org or .net or anything like that. That preference isn’t driven by the view that .com sites are commercial. Rather, I just see .com as the default and easier for users to remember and use than any of the alternatives. It also offers the most flexibility, since a .com address does not tie the site to any particular kind of content.

Something like milanilnyckyj.com or ilnyckyj.com would be possible, but both are impossible to spell and less memorable than a more common word or combination of words. Perhaps I should dig back through some of my favourite pieces of writing to find a snippet of text that passes the tests of being concise, sticking in the mind of the reader, and being available with a ‘.com’ appended to the end.

Legal chess positions versus IPv6 addresses

Based on some recent and minimal research, it seems like there are probably more legal chess positions than there are addresses in Internet Protocol version 6 (IPv6). Wikipedia explains that there are 3.4 x 10^38 IPv6 addresses (that is, 2^128). Claude Shannon estimated the number of chess positions at roughly 10^43; his better-known figure of 10^120 counts possible games rather than positions. Other estimates exist.
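The comparison is easy to check with a few lines of Python. This is only a sketch: the chess figure is an assumed order-of-magnitude estimate of about 10^43 positions, not an exact count, and any larger estimate only strengthens the conclusion. The helper `address_to_index` is a name made up for this example.

```python
import ipaddress

IPV6_ADDRESSES = 2 ** 128             # every IPv6 address is a 128-bit number
CHESS_POSITIONS_ESTIMATE = 10 ** 43   # assumption: rough estimate, not an exact count

print(f"IPv6 addresses:  {IPV6_ADDRESSES:.2e}")
print(f"Chess positions: {CHESS_POSITIONS_ESTIMATE:.2e} (estimated)")
print("Enough positions for every address:",
      CHESS_POSITIONS_ESTIMATE >= IPV6_ADDRESSES)


def address_to_index(address: str) -> int:
    """Treat an IPv6 address as a plain integer between 0 and 2**128 - 1."""
    return int(ipaddress.IPv6Address(address))


# The integer could then serve as an index into a numbered list of positions.
print(address_to_index("2001:db8::1"))
```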

If there are more chess positions than IPv6 addresses, it means you could devise an algorithm to represent the address of an internet-connected machine using IPv6 as a legal chess position, and that there would be enough chess positions to represent every possible IPv6 address. For instance, you could devise a set of rules that would produce an exhaustive set of chess positions, then generate the whole set and start numbering them using IPv6 addresses. You would start with a legally set up board, then assign IPv6 addresses to the positions that can be achieved through every possible move. Then, keep going until your rules have produced the gigantic complete set of possible legal chess positions. It would be like a rainbow table.
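The numbering scheme described above amounts to a breadth-first walk outward from the starting position, assigning the next free number to each new position as it is discovered. Here is a minimal sketch of that idea. Writing a real chess move generator is well beyond a blog post, so the example uses a stand-in game (a lone king on an empty 3 x 3 board); `king_moves` and `number_positions` are hypothetical names, and a real version would swap in a function that returns all legal chess moves.

```python
from collections import deque


def number_positions(start, successors, limit):
    """Number positions in breadth-first discovery order, up to `limit` of them."""
    index_of = {start: 0}
    queue = deque([start])
    while queue and len(index_of) < limit:
        position = queue.popleft()
        for nxt in successors(position):
            if nxt not in index_of and len(index_of) < limit:
                index_of[nxt] = len(index_of)
                queue.append(nxt)
    return index_of


def king_moves(square):
    """Stand-in for a chess move generator: a king on an empty 3 x 3 board."""
    row, col = square
    for d_row in (-1, 0, 1):
        for d_col in (-1, 0, 1):
            target = (row + d_row, col + d_col)
            if target != square and all(0 <= v < 3 for v in target):
                yield target


table = number_positions((0, 0), king_moves, limit=100)
print(len(table), "positions numbered")
```

With real chess, `limit` would be 2^128 and the table could never actually be stored, so a practical scheme would need a way to compute the position for a given number directly rather than looking it up.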

That would be a neat way to express the addresses in a human-readable form. It also means that you could translate the address of any device into a playable chess game, though a lot of them would be very lopsided, in terms of which colour has the advantage.

‘Track changes’ in calendars

One neat thing about software like MediaWiki (which powers Wikipedia, among other sites) is that it keeps a record of every change that is made to a document. That way, it is easy to see what the history of changes has been and respond when information changes.

It seems to me like it would be very useful to have the same technology in my calendar. So often these days, things get moved around and re-scheduled. It would be useful if I could annotate my calendar to know what is certain and what is uncertain, which appointments have already been rescheduled, and so on.

It would also be useful for situations where something accidentally gets deleted. If I delete my only record of an event, the chances of me remembering and showing up are virtually nil. That is one reason why I maintain a paper copy of my calendar in a page-a-day Moleskine, in addition to the Google Calendar I update from computers and my phone.
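To make the idea concrete, here is a minimal sketch of a calendar entry that keeps its whole history, wiki-style. It is not how Google Calendar or any real calendar works; the `Event` and `Revision` classes are invented for illustration. The key design choice is that nothing is ever overwritten: rescheduling and even deleting simply append another revision.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class Revision:
    when: Optional[datetime]  # scheduled time; None means "deleted"
    note: str                 # e.g. "tentative", "confirmed", "moved again"


@dataclass
class Event:
    title: str
    history: list[Revision] = field(default_factory=list)

    def reschedule(self, when: datetime, note: str = "") -> None:
        self.history.append(Revision(when, note))

    def delete(self, note: str = "deleted") -> None:
        # Deleting is just another revision, so the record survives.
        self.history.append(Revision(None, note))

    def restore(self) -> None:
        """Undo a deletion by re-adding the most recent revision that had a time."""
        for revision in reversed(self.history):
            if revision.when is not None:
                self.history.append(Revision(revision.when, "restored"))
                return
        raise ValueError("nothing to restore")

    @property
    def current(self) -> Optional[datetime]:
        return self.history[-1].when if self.history else None
```

The `note` field covers the 'certain versus uncertain' annotation, and the length of `history` shows at a glance how many times an appointment has been moved.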

Radio frequency ID security

Contact-free cards and authentication tokens have become common. These are the sort of things that you put close to a reader on the wall in order to open a door or perform a similar function. People use them to get into parking garages and offices, and even credit cards now allow you to pay without swiping or inserting your card. Of course, all this creates new security risks. All of these cards can be read at a moderately long distance with inexpensive hardware, which is one reason why it is a bit crazy that these chips are being put into passports. Furthermore, cloning these radio frequency identification (RFID) tags is often quite easy.

Your standard RFID tag is just a little chip with an antenna. When it receives a signal on a particular frequency, it chirps out its name. The card reader says: “Any RFID tags out there?” and the tag says: “12345678abc” or whatever string it contains. The string is transmitted in clear text, and it is always the same. Anyone with a device that can program RFID tags can easily copy it. These sorts of tags exist all over the place. An office tower might have a database listing the code inside the RFID tag used by each employee. It would then check the database each time someone used a card, to make sure the number was on the list.

This system can easily be attacked. Just stand outside a building with an appropriate antenna and recording equipment and you can capture the code from each person’s tag as they go in. You can then copy whichever you like to make your own access card.
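The weakness is easy to see in a toy simulation. The classes below are invented for illustration and do not model any real radio protocol; the point is only that a tag which always says the same thing can be copied by anyone who hears it once.

```python
class StaticTag:
    """A simple tag: it answers every query with the same fixed string."""

    def __init__(self, tag_id: str):
        self.tag_id = tag_id

    def respond(self) -> str:
        return self.tag_id  # sent in clear text, identical every time


class DoorReader:
    """A reader that checks the tag's string against a list of employees."""

    def __init__(self, authorised_ids):
        self.authorised_ids = set(authorised_ids)

    def admits(self, tag: StaticTag) -> bool:
        return tag.respond() in self.authorised_ids


def eavesdrop(tag: StaticTag) -> str:
    """The attacker's antenna hears exactly what the reader hears."""
    return tag.respond()


reader = DoorReader(["12345678abc"])
employee_tag = StaticTag("12345678abc")
cloned_tag = StaticTag(eavesdrop(employee_tag))

print(reader.admits(employee_tag))  # the real card opens the door
print(reader.admits(cloned_tag))    # and so does the copy
```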

More sophisticated tags use a challenge-response authentication protocol. That means they take an input value, perform a mathematical operation on it, and generate a response which they transmit. For instance, an absurdly simple rule would be something like ‘multiply input by two’. Then, the reader would say: “3” and any card that replied “6” would be accepted as valid. These tags need more computing hardware (and in some cases a battery), so they are more expensive and less common than the simple kind.
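A slightly less absurd version of the rule can be sketched with a keyed hash. This is an illustration only: it assumes the tag and the reader share a secret key and uses HMAC-SHA256 from Python's standard library as the 'mathematical operation', which is not necessarily what any particular card uses. The class names are made up.

```python
import hashlib
import hmac
import secrets


class ChallengeResponseTag:
    """A tag that never reveals its key, only a keyed hash of each challenge."""

    def __init__(self, key: bytes):
        self._key = key

    def respond(self, challenge: bytes) -> bytes:
        return hmac.new(self._key, challenge, hashlib.sha256).digest()


class ReplayTag:
    """An attacker's tag that can only repeat one previously recorded response."""

    def __init__(self, recorded_response: bytes):
        self._recorded = recorded_response

    def respond(self, challenge: bytes) -> bytes:
        return self._recorded


class Reader:
    def __init__(self, key: bytes):
        self._key = key

    def admits(self, tag) -> bool:
        challenge = secrets.token_bytes(16)  # fresh and random on every attempt
        expected = hmac.new(self._key, challenge, hashlib.sha256).digest()
        return hmac.compare_digest(tag.respond(challenge), expected)


key = secrets.token_bytes(32)
reader = Reader(key)
genuine = ChallengeResponseTag(key)
recorded = genuine.respond(b"an old challenge")

print(reader.admits(genuine))              # the real tag passes
print(reader.admits(ReplayTag(recorded)))  # a recorded answer does not
```

Because the challenge changes every time, recording one exchange outside the building is no longer enough to make a working copy.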

This is harder to attack. You need to figure out what the rule is, and they are often cryptographic. That being said, the cryptography used is often either proprietary (which usually means ‘bad’) or out of date. With access to a few tags and some knowledge, it may well still be possible to reverse-engineer the algorithm being used and clone tags.

In addition, this kind of system can be attacked in real time, using a man-in-the-middle attack (more precisely, a relay attack). Suppose I am in line at the grocery store, about to pay. I take out a dummy wireless credit card, while I have an antenna concealed in my jacket sleeve. The clerk’s RFID reader sends a challenge request, which my antenna picks up. I then re-broadcast that request with more power, so that all the tags nearby chirp up. Suddenly, everyone in line who has a wireless card is offering to pay for my groceries. I re-broadcast one of those responses back to the clerk’s card reader and suddenly have free groceries. I suspect something similar would work with the higher-security access cards used by some offices.
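The grocery store scenario can be sketched in the same toy style. Again, this is an assumption-laden illustration rather than a model of any real payment protocol: the cards are assumed to answer challenges with HMAC-SHA256, and `Card`, `Relay`, and `Till` are invented names. What it shows is that the attacker never needs the key. Passing messages back and forth is enough.

```python
import hashlib
import hmac
import secrets


class Card:
    """A bystander's card: answers any challenge using its secret key."""

    def __init__(self, key: bytes):
        self._key = key

    def respond(self, challenge: bytes) -> bytes:
        return hmac.new(self._key, challenge, hashlib.sha256).digest()


class Relay:
    """The attacker's dummy card. It holds no key at all."""

    def __init__(self, victim: Card):
        self.victim = victim

    def respond(self, challenge: bytes) -> bytes:
        # Re-broadcast the till's challenge and pass the victim's answer back.
        return self.victim.respond(challenge)


class Till:
    """The clerk's reader, which knows the keys of valid cards."""

    def __init__(self, valid_keys):
        self._keys = list(valid_keys)

    def accepts(self, card) -> bool:
        challenge = secrets.token_bytes(16)
        answer = card.respond(challenge)
        return any(
            hmac.compare_digest(
                answer, hmac.new(key, challenge, hashlib.sha256).digest()
            )
            for key in self._keys
        )


bystander_key = secrets.token_bytes(32)
till = Till([bystander_key])
bystander_card = Card(bystander_key)

print(till.accepts(Relay(bystander_card)))  # the bystander pays
```

The fresh random challenge that defeats a simple replay does nothing here, since the relay forwards each new challenge to a genuine card.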

Not all cloning is necessarily malicious. Phones are increasingly sophisticated radio transmitters and receivers. They can transmit voice calls on various frequencies, as well as access WiFi networks and interface with Bluetooth devices. Somebody should make a phone that can transmit and receive on the common frequencies used by RFID cards. Software could then be used to record the contents of a person’s existing cards. Instead of carrying one fob for your car, one card for work, one embedded in your transit pass, and a credit card, you could just program the functionality of all those RFID tags into one device.

Of course, doing such a thing would reveal how easy it is to copy RFID cards in the first place. That’s all it would be doing, however – making it obvious. Anybody who is malicious and capable can already copy these cards, though consumers often assume that they are secure (like they assume their cell phone calls cannot easily be intercepted by moderately resourceful crackers). By revealing how insecure most wireless authentication technologies are, this cell phone software could play an important role in raising awareness, and maybe even lead people to pressure politicians to get rid of those stupid wireless passports.

I mean really, do wireless passports have any non-evil uses at all? A passport clerk can easily scan a barcode or swipe a magnetic stripe. Making passports readable at a distance only helps spies and criminals. How easy would it be to build a bomb and connect it to a machine that constantly scans the vicinity for wireless-equipped passports? You could program it to explode when more than a set number of nationals of any country you dislike are within a particular distance. Alternatively, criminals could take advantage of chatty radio passports to identify promising targets for mugging.

Republican speculation, via psychic powers

The other night, talking with my friend Jessica, it occurred to me that it could be possible to set up a kind of internet sensation based around the upcoming American presidential election (how early they become ‘upcoming’!) and ‘psychic’ claims of the sort that made an octopus famous during the World Cup. All you would need is pictures of all the plausible Republican contenders and some mechanism for deciding who among them will win on the basis of supposed supernatural powers. An octopus could work. Another idea would be a very young baby, the cuter the better.

In order to draw things out and give advertisers time to start hawking their wares alongside your videos, you could follow a process of elimination, in which candidates are rejected rather than selected. Naturally, you would want to rig the selections so as to produce the most total viewership. A good idea would be to do something a bit controversial at the outset, like rejecting Sarah Palin. Then, start working through the no-hope candidates as you are building momentum. Rigging the outcomes would be incredibly easy: just keep making videos until you get one where your preferred selection is made.

By the end of the Republican primary competition, when there are only a few plausible candidates left in the race, there would be a reasonable chance that you could simply guess correctly, cementing the reputation of your chosen psychic vessel as the real deal, at least in the eyes of a credulous few. Naturally, you would then want to make a prediction on the actual election. Chances are, you will be able to guess correctly on the basis of sophisticated polling of the Nate Silver variety, along with an assessment of key economic indicators.

If you wanted to keep exploiting the gullibility that seems widespread within the general public, you could use your advertising earnings as seed money to start a cult.