Security vulnerabilities in computer hardware

Why is trustworthy computer security impossible for ordinary users? In part because the system has multiple levels at which failure can occur, from hardware to operating systems and software.

Spectre and Meltdown show that no matter how careful you are about the operating system and software you run, you can still be attacked through the underlying hardware. Another bug, present in at least some VIA C3 x86 processors, has similar ramifications.

These kinds of problems will be much worse with the “Internet of Things”, since bugs like Heartbleed will go unpatched, or even be unpatchable, in a lot of embedded computing applications for consumers.

How this site broke and got back online

The world is now full of technology that needs regular software updates to fix security vulnerabilities as they are publicly reported. This includes all of your computers (including cell phones, smart devices like TVs and sensors, and network equipment like routers). It definitely applies to website content management systems (CMS) like WordPress.

That’s why when WordPress 4.9.4 was released in February, my hosting provider DreamHost implemented my ongoing instruction to automatically update the software.

How WordPress works

For those who don’t know, WordPress stores posts, comments, and all sorts of other things inside a database based on open-source software called MySQL. The other big piece of WordPress is the programming language PHP.

You can think of the MySQL database as where WordPress stores everything it knows, and PHP as the machinery that lets WordPress operate and serve up what you ask for. You might think of websites as being like newspapers: all set up and formatted before you have anything to do with them. Actually, modern websites are created dynamically as your web browser talks to the server and the software on the server makes decisions about what to send you.

For example, consider the web address:

https://www.sindark.com/page/50/

WordPress is set up to show a certain number of posts per page, and then to allow users to scroll back through older pages if they wish.

When your web browser visits https://www.sindark.com/page/50/ some pretty complicated stuff happens on the server side. It works out how many posts there should be on each page, works out what should be on the 50th page, goes into the MySQL database to find the titles and contents of posts, as well as their authors and the number of comments on them, and then it puts together an HTML page which your browser displays.
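The steps above can be sketched in a few lines. This is only an illustration: WordPress itself is written in PHP and talks to MySQL, whereas this sketch uses Python with an in-memory SQLite database as a stand-in. The table layout, the ten-posts-per-page setting, and the function names are all assumptions made for the example.

```python
import sqlite3
from html import escape

POSTS_PER_PAGE = 10  # assumed setting; the real value is configurable


def posts_for_page(conn, page, per_page=POSTS_PER_PAGE):
    """Return (title, author) rows for one page, newest posts first."""
    offset = (page - 1) * per_page  # page 1 starts at the newest post
    return conn.execute(
        "SELECT title, author FROM posts ORDER BY id DESC LIMIT ? OFFSET ?",
        (per_page, offset),
    ).fetchall()


def render_html(rows):
    """Assemble a very bare HTML page from the database rows."""
    items = "".join(
        f"<li>{escape(title)} by {escape(author)}</li>" for title, author in rows
    )
    return f"<html><body><ul>{items}</ul></body></html>"


# Stand-in database holding 600 made-up posts.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT, author TEXT)")
conn.executemany(
    "INSERT INTO posts (id, title, author) VALUES (?, ?, ?)",
    [(i, f"Post {i}", "example author") for i in range(1, 601)],
)

page_50 = posts_for_page(conn, 50)
print(render_html(page_50))
```

The point is that nothing for page 50 exists until someone asks for it: the server works out the offset, queries the database, and builds the HTML on the spot.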

Exactly how everything looks visually in WordPress depends a lot on themes. These are collections of files that tell WordPress what typefaces to use, where to locate design elements, what to show on each page, and more.

For years this site used a premium paid theme called Thesis. Specifically, it used the latest version of Thesis 1. Sometime around 2012, Thesis 2 was released but, whereas Thesis 1 allowed non-expert website operators to set up the look they wanted with simple menus, in my opinion Thesis 2 doesn’t help all that much in designing a look from scratch and requires essentially a web designer’s capabilities to use.

So, the site was using what was arguably an antiquated theme before the WordPress 4.9.4 update was installed.

How the site broke

For someone as non-expert as me, a big piece of software like WordPress is like a space station. It’s complicated and I don’t begin to understand how most of it works, but I can see when things have gone badly wrong, such as when the station modules are full of smoke or, worse, literally nothing.

WordPress themes store information in the MySQL database, such as the location and content of menus.

It’s not that Thesis 1.8.9 (the latest version of Thesis 1) is incompatible with WordPress 4.9.4. My old climate-focused site BuryCoal still uses the theme and upgraded just fine, as did my professional photography site durablepigments.com.

Computers make a lot fewer mistakes than humans, but they do happen. A file you download can have some of its contents incorrectly transmitted, and a processor can perform an operation incorrectly on data. Of course, bugs in software can produce errors too.

First, some kind of error broke the back-end system that allows a WordPress site operator to create new posts, manage comments, change how the site looks, and so on. At that stage, all the visitor-facing parts of the site still looked normal. I just couldn’t manage the site from my end as usual.

I put in some effort trying to fix the site, eventually leading to it going down completely. This highlighted the importance of not allowing my ignorance and DreamHost’s limits to permanently wreck the old database. It had problems that kept the site from working, but it was still a good copy of all my posts and all your comments.

Fixing the site

Job number one was to avoid destroying all the years’ worth of content on the site. Tinkering with the MySQL database, undertaken by an absolute non-expert, carried a considerable risk.

This site is hosted using the least expensive plan DreamHost provides, which is called shared hosting. The name is a little misleading, because even sites on more expensive plans “share” the computer server where they operate with other sites. Those higher-end plans, however, promise you a certain amount of resources like RAM. On shared hosting, an unknown number of sites are all sharing those resources which, among other things, makes it possible for a big jump in popularity on some totally unrelated site to slow down yours.

Shared hosting has other limitations. Crucially, in this case, DreamHost limits which tools you can use to work with your MySQL databases. Through their website they provide a tool called phpMyAdmin which theoretically lets you do things like modify the content of databases, export their contents, and import contents into a new blank database.

Unfortunately, phpMyAdmin suffers from one huge limitation that crops up commonly in shared hosting. If you ask the server to handle too much data, it gets overwhelmed and gives up. This happens to me constantly when I try to upload photos to the site (indeed, that frustration is a big reason I have been considering leaving shared hosting and/or DreamHost). For a site with as many posts and comments as this one, a lot of what I read online suggested that this could be a problem. One major alternative, copying the database using Secure Shell (SSH), isn’t allowed for shared hosting users.

At the beginning of March, I was struggling with efforts to make a copy of my MySQL database to tinker with without risk of breaking the original.

There’s actually a bigger problem, though. Think for a moment about a typewriter. It has all the letters of the alphabet, punctuation, and probably some special symbols like & and ^. With computers, there are different character encodings which similarly include letters and symbols. A basic one, ASCII from 1963, doesn’t handle much more than the typewriter. It basically includes Arabic numerals 0–9, upper and lowercase letters from the English alphabet, and standard punctuation.

But people use computers in languages other than English which include diacritical marks and characters not used in the English alphabet. People also use special punctuation marks like en dashes and em dashes. Partly for these reasons, work on Unicode began in the late 1980s, eventually allowing people to use all sorts of characters. WordPress, like many computer systems, now uses a UTF-8 character encoding.
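To make the difference between ASCII and UTF-8 concrete, here is a small Python sketch. The sample characters and the helper names are my own choices for illustration.

```python
def utf8_length(ch):
    """Number of bytes UTF-8 uses to store one character."""
    return len(ch.encode("utf-8"))


def fits_in_ascii(text):
    """True if every character in the text exists in ASCII."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


# An English letter, an accented letter, Greek delta, and an em dash.
for ch in ["A", "\u00e9", "\u0394", "\u2014"]:
    print(f"U+{ord(ch):04X} takes {utf8_length(ch)} byte(s) in UTF-8")

print(fits_in_ascii("plain text"))  # True
print(fits_in_ascii("caf\u00e9"))   # False
```

Plain English text takes one byte per character in UTF-8, exactly as in ASCII. Anything beyond that takes two or more bytes, which is where the trouble described further on comes from.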

To summarize: WordPress is software that helps you turn content like the text of blog posts into a website people can access. It stores that content in a MySQL database, and the content of that database is encoded using UTF-8.

This next bit is a little tricky and probably won’t have occurred to most web users. Using a system like UTF-8 can be risky in a variety of ways. For example, it contains characters from foreign alphabets which look indistinguishable from English letters but which are known to be different by computers. This could allow somebody, for instance, to register a website that looks visually like google.com but which is actually run by the person who made the site with the non-English characters.
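Here is a rough Python sketch of that lookalike problem. The spoofed name is a made-up example in which the two letters “o” are Cyrillic rather than Latin. The sketch relies on Python’s built-in `idna` codec, which implements an older version of the internationalized domain name rules, so treat the exact encoded form it prints as illustrative only.

```python
import unicodedata

real = "google.com"
# Hypothetical lookalike: both "o" letters are Cyrillic U+043E, not Latin U+006F.
lookalike = "g\u043e\u043egle.com"


def non_ascii_names(text):
    """List the official Unicode names of any non-ASCII characters."""
    return [unicodedata.name(ch) for ch in text if ord(ch) > 127]


def to_wire_form(domain):
    """Convert a domain to the ASCII form actually used for DNS lookups."""
    return domain.encode("idna").decode("ascii")


print(real == lookalike)           # False: a computer sees different strings
print(non_ascii_names(lookalike))  # the Cyrillic letters hiding in the name
print(to_wire_form(real))          # unchanged
print(to_wire_form(lookalike))     # an "xn--" name pointing somewhere else
```

Many browsers show the “xn--” form in the address bar when a name mixes alphabets like this, which is one defence against the trick.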

Even when it comes to importing new content into a MySQL database, UTF-8 could cause problems, so phpMyAdmin will take certain non-standard characters and replace them with what looks like gibberish on import. So, the Greek letter delta imported into phpMyAdmin becomes Î” and ‘smart’ quotes, which I hate because of these kinds of problems, but which the Thesis theme uses, turn into â€œ and â€.
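Gibberish of this kind is typically what you get when text stored as UTF-8 bytes is read back as if it were in an older single-byte encoding. A minimal Python sketch follows, assuming the older encoding is Windows-1252 (Python’s `cp1252`, which is close to what MySQL calls latin1). I can’t confirm that this is exactly what phpMyAdmin did internally.

```python
def garble(text):
    """Store text as UTF-8, then misread the bytes as Windows-1252."""
    return text.encode("utf-8").decode("cp1252")


def ungarble(text):
    """Reverse the mistake: recover the bytes, then read them as UTF-8."""
    return text.encode("cp1252").decode("utf-8")


delta = "\u0394"       # Greek capital delta
open_quote = "\u201c"  # left 'smart' double quote

print(garble(delta))            # two characters where there was one
print(garble(open_quote))       # three characters where there was one
print(ungarble(garble(delta)))  # back to the original delta
```

The encouraging part is that the damage is reversible as long as the underlying bytes survive, which is why a properly made copy of the database was worth waiting for.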

So, even when I succeeded in importing my old database into a new one (to be able to fix the site without risk of breaking the original), the new version contained many thousands of errors. I didn’t want to keep adding to a site full of errors, since I realized it should eventually be possible to get a properly copied database.


The fix

Anyway, it turned out that the pretty basic steps I had been asking DreamHost to use all along worked fine as soon as I found a customer service representative willing to read through and implement them.

I’m not the first person who had this problem with character encoding and phpMyAdmin. Early on I found a website called Orthogonal Thought which describes the problem and some ways to fix it.

Unfortunately, the fixes are done via SSH, which DreamHost doesn’t allow with MySQL for those on shared hosting. I had to get someone on DreamHost’s side to run these commands.

And so began an agonizing process of submitting customer service ‘tickets’, as requests for help are often called in the world of information technology. In each I tried to explain what the problem was, and in each I directed the tech support person to the post on Orthogonal Thought along with a request that they make a copy of my database with characters intact.

DreamHost tech support person after tech support person then did one of three things: refused to help because they thought this problem was something I should fix (despite how the necessary tools are denied to those on shared hosting), made a copy of the database where the character encoding was still broken, or made a copy of the database that somehow didn’t work with WordPress at all. In March, “John R” gave me the “not our problem” treatment, while the efforts of other tech support personnel yielded a set of unfixed databases through April and May.

I sought help from other forums and expressed my frustration on Twitter, leading to many messages from other web hosting companies explaining how bad DreamHost shared hosting is. In many cases, the people operating Twitter accounts for other hosting companies provided me with tech support via Twitter, trying to find ways to copy the database properly myself.

After months with the site down, in desperation I started tweeting at all the people who describe themselves as DreamHost employees in their Twitter bios like @DreamHostBrett whose Twitter handle is in their newsletters, “WordPress Core Developer” Mike Schroder, and “Product Marketing Manager” Jennifer Kay. None of them responded to me, but this prompted another round of exchanges with the DreamHost tech support Twitter account @DreamHostCare.

Finally, a day ago, one of their tech support people emailed me to say they had made a good copy of the database. Indeed, they finally had.

Aware that other people have had and will have this problem, I asked for the solution they used and was told by email: “Per our manager ‘I made sure to include --default-character-set=latin1’ and changed it to ‘changed latin1 to utf8’”. They had used one of the fixes from the blog post which I had been sending them all along.
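The email doesn’t say exactly where latin1 was changed to utf8. One common reading is that the database is exported with `mysqldump --default-character-set=latin1`, so the stored bytes come out untouched, and then the character set declarations inside the dump file are relabelled before it is imported again. The Python sketch below shows only that relabelling step, under that assumption. The function name and the sample dump text are hypothetical.

```python
import re


def relabel_charset(dump_sql):
    """Relabel latin1 charset declarations in a SQL dump as utf8.

    Only the declarations are rewritten. The word 'latin1' appearing
    inside post or comment text is left alone.
    """
    dump_sql = re.sub(r"\bSET NAMES latin1\b", "SET NAMES utf8", dump_sql)
    return re.sub(r"\bCHARSET=latin1\b", "CHARSET=utf8", dump_sql)


# Made-up fragment in the general shape of a mysqldump file.
sample_dump = (
    "/*!40101 SET NAMES latin1 */;\n"
    "CREATE TABLE t (body TEXT) DEFAULT CHARSET=latin1;\n"
    "INSERT INTO t VALUES ('a post that mentions latin1');\n"
)

print(relabel_charset(sample_dump))
```

The bytes in the dump were valid UTF-8 all along. Relabelling simply tells MySQL the truth about them when the file is imported into the new database.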

There doesn’t seem to be much appetite at DreamHost for looking into and fixing problems with their customer service. That, plus all the site reliability problems that have cropped up due to shared hosting over the years, has me still searching for alternatives. Probably, I will test out another hosting provider with a site dedicated to my PhD research and move everything over there once I am confident it’s better.

I hope some people from DreamHost will read this and reflect on what it says about the effectiveness of their tech support. One huge problem is how every time a new ticket is created it seems to get randomly assigned to someone new who doesn’t understand the background to the problem. I have been told there is also no way to elevate the problem to the attention of a manager when it proves beyond the capabilities of the first-line tech support people. Unwillingness or inability to follow simple instructions has been the problem all along here, and I would like to hear that they have some intention of making things better.

If they want to credit me back for the nearly four months my site was down, I would be open to that too.

Urban mesh networking

One fascinating dimension of software-defined radio is the ability to establish mesh networks: distributed data sharing systems where each computer involved is a node which can carry traffic on behalf of others. That means that as long as you have solid radio links you can establish a network that can transmit information independently from the commercial internet, run by the kind of telecom companies that provide home internet connections. If you then connect some parts of the mesh to high-quality internet connections, you can share internet access over the mesh network.
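The idea of nodes carrying traffic for one another can be sketched as a route search over a map of radio links. The layout below is invented, and real mesh routing software is far more sophisticated than this breadth-first search, so this is a toy model and not a description of how any actual mesh network works.

```python
from collections import deque

# Hypothetical mesh: each node is listed with the nodes it has a radio link to.
LINKS = {
    "house_a": ["house_b"],
    "house_b": ["house_a", "house_c", "house_d"],
    "house_c": ["house_b", "gateway"],
    "house_d": ["house_b"],
    "gateway": ["house_c"],  # the one node with a wired internet connection
}


def find_route(links, start, goal):
    """Breadth-first search: return a path with the fewest hops, or None."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbour in links.get(node, []):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(path + [neighbour])
    return None  # no chain of radio links connects the two nodes


print(find_route(LINKS, "house_a", "gateway"))
```

Here house_a has no direct link to the gateway, but it can still reach the internet because house_b and house_c relay its traffic.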

This is all part of the plans of Toronto Mesh, a group that meets at Robarts Library and is planning to set up such a network in Toronto. NYC Mesh is much farther along, with Manhattan and Brooklyn ‘supernodes’ in place which provide internet access through the mesh.

There are numerous advantages to a mesh network. It can free people of all the bad behaviour from local telcos: charging monopoly prices, slowing down traffic from some sites, engaging in surveillance themselves or supporting government surveillance, etc. It also holds the promise to create more resilient networks which are better able to cope with societal disruption. Building infrastructure of that sort will be important as climate change continues to destabilize human and natural systems.

I’m collaborating with Toronto Mesh to propose a hardware and software development partnership with the Campus Co-Operative Residence. They have a large number of houses within 1.5 km of each other, share many of the values of Toronto Mesh, and would likely value the ability to control and enhance their internet access in the ways mesh networking would allow. The proposal is circulating for comments and for people to start getting familiar with it now. Soon we will develop a formal version with cost estimates to go to the Co-Op board.

AI + social networks + unscrupulous actors

Charlie Stross’s talk at the 34th Chaos Communication Congress highlights risks associated with artificial intelligence technologies in combination with factors like geolocation, the engineering of content online to produce emotional responses, and people with malicious objectives from manipulating elections to harassing women seeking abortions.

It’s worth watching, and starting to think about what sort of regulatory and technological barriers might be erected to such abuse.

Breaking loops

As an experiment in living and in an effort to protect my sleep I have set my router to disable internet access from all my devices between 2:00am and 7:00am seven days a week.

Especially when I am feeling down and wishing I could avoid things, there is a temptation to just keep clicking through YouTube videos, Wikipedia articles, or news stories.

Contrary to the pervasive idea that being well-informed is all about being apprised of the latest information, there is good reason to think that the newer information is, the more likely it is to be incorrect, incomplete, or useless. Over time, we filter information by quality, put things together, and benefit from additional context. That makes the news from a weekly or monthly magazine more likely to be informative than the news from the current Google News page or a social media feed, and it means reading a book which society has determined to be important almost certainly carries more lifetime value than reading the same number of words from breaking news stories.

There are other self-harming loops I have been working to disrupt in myself and better understand in other people. Despite a lot of anguish and turmoil, the overall experience of the last couple of months suggests that improvement is possible.

What3Words

In an illustration of combinatorial mathematics, what3words.com will represent any location on Earth as a set of three simple English words.

It’s intended to help in cities that lack formal maps and street names.

The points it distinguishes are close enough together that for a building of any size you get various choices.

The Toronto Reference Library could be journals.nuggets.nipped.

Toronto’s best kite-flying spot: agree.rewarded.lasts.

High Park’s labyrinth? hatched.alarm.riding or drainage.draining.kitchen or playing.training.achieving.
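The combinatorics behind three-word addresses can be checked with some rough arithmetic. The figures here are assumptions on my part: a grid of squares about 3 m on a side, as I understand the service to use, and an approximate surface area for the Earth of 510 million square kilometres.

```python
import math

EARTH_SURFACE_KM2 = 510_000_000  # approximate, land and sea together
SQUARE_SIDE_M = 3                # assumed size of each grid square


def squares_on_earth(side_m=SQUARE_SIDE_M):
    """Roughly how many grid squares are needed to tile the planet."""
    area_m2 = EARTH_SURFACE_KM2 * 1_000_000
    return area_m2 // (side_m ** 2)


def words_needed(n_locations, words_per_address=3):
    """Smallest word list size V such that V ** words_per_address >= n_locations."""
    v = math.ceil(n_locations ** (1 / words_per_address))
    # Correct for floating-point rounding in either direction.
    while v ** words_per_address < n_locations:
        v += 1
    while v > 1 and (v - 1) ** words_per_address >= n_locations:
        v -= 1
    return v


total = squares_on_earth()
print(f"{total:,} squares to label")
print(f"{words_needed(total):,} words are enough for three-word addresses")
```

Under these assumptions a list of fewer than 40,000 words gives every square on the planet its own three-word label, which is why short, ordinary words are enough.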

Concept for a portable computer device: the Triple Pi

I have been curious about picking up a Raspberry Pi single-board computer.

They are the standard hardware for nodes on the Toronto Mesh network, so with a suitable USB radio transceiver I could use it in small areas as a bridge to their network via the IPv6 Hyperboria network.

I could also use it to run Linux-based software-defined radio (SDR) software in combination with my USB radio receiver dongle. I could set up software to locate digital signals and then decode those which are not encrypted, or use it as a portable radio scanning rig.

At the same time, there seems to be awesome game emulation software which can be run on a Pi. With two USB-interface SNES-style controllers, I am told it has enough processing power to make a great SNES emulator.

I don’t have a screen with an HDMI input, so it might be worth getting a small portable display to use with the system. One neat idea would be to make the whole thing capable of running on its own batteries.

To start with, I will try to get a working setup that runs with the Pi and the display plugged into the wall. If it seems useful enough to be made portable, I’ll start thinking of battery hardware and case and transportation options for the whole system.

The hardware to get started will be about $100 plus the cost of the display. ToMesh has an installation party later this week where they will install the operating system and software stack necessary to use your Pi as a node.

Genre experiments

For two years I have been working on an art project.

I’m not sure whether the concept predated when I first heard James Allard’s lecture on Mary Shelley’s Frankenstein, but the lecture is a great demonstration of how labeling does interpretive work when it comes to art.

Presented with a digital file, we may struggle to decide what it is in both a technical and artistic sense.

Perhaps it’s an HTML file with embedded image files being displayed in a web browser, or the raw data from the sensor of a digital camera. In either case, it’s also an object within a software and operating system-defined architecture and also bits physically written to some data storage medium.

From an artistic perspective, it may be a line from a play quoted in a piece of art which has been photographed and posted online (or a screenshot of a cell phone app displaying a tweet of a digital photo posted online of a print of a photograph taken illicitly in an art gallery, on display in that art gallery).

The multiple presentations of the same data are the idea of interest: like all the exposure and white balance modifications that can be applied to a raw file from a digital camera, meaning that every photograph arising from that process is an interpretation.

These experiments are also intriguing insofar as they concern cybernetic relationships between individuals, organizations that archive data (like search engines), algorithms nobody fully understands, and governments. The location of a data file on the internet does everything to establish its visibility and significance.

The idea of the project is that every distinct work within it is presented to the viewer with multiple possible modes of interpretation, whether they are based on data architecture, metadata, or the cultural and political content of the human-readable image.