The promise of digital data is that bits don’t rot. But they do.
Data from the US Census of 1960 were lost because the tapes were obsolete and partly unreadable. They had to be restored from 300,000 rolls of microfilm stored in a refrigerated cave in Kansas.
In the mid-1970s NASA spent a billion dollars sending two Viking landers to Mars to search for life there. The biology data were lost: they were either buried in thousands of pages of microfilm archives, mixed in with engineering data and of too poor quality to be scanned, or stored as long sequences of numbers on CD-ROM without any indication of what the numbers meant. They were rescued only when a retired researcher was found to have kept some paper printouts, which could then be read and typed in by teams of students. (For reference, there is a nice Life on Mars paper here.)
The BBC’s Domesday Project of 1986 was a 20th-century Domesday Book, with text and photographs from all over the country. The data were stored on two big silver laser discs and read by a special BBC-supplied computer which understood their format. There are no laser disc readers today outside museums, and the format was one that only the special computers could read. There were some of those in museums, but would any of them still work? The project was rescued from oblivion by the skin of its teeth and the data are now available on the web. The lessons of that narrow escape have been carefully forgotten: in place of video stills in an unreadable format (unless you had the right hardware and software) the photographs are compressed in JPEG format, which is unreadable unless you have the right software. Moreover, unlike the laser discs, nobody actually has the new Domesday data. We all have to rely on the BBC existing for ever and being for ever willing to keep them available to us. As available as CompuServe, BIX, or AOL? In a format as readily readable as 8″ floppy disks, Amstrad 3″ disks in their special cases, or the Sinclair Microdrive?
Meanwhile the special hardware required to read the Rosetta Stone is widely available, on either side of your nose; and the software is readily learned. One of the oldest photographs in the world dates from spring 1838 and is readable with the same equipment (and no software). But what photographs does this generation have which will be readable in 175 years’ time?
There is an enjoyable article on digital preservation here. At least, there was when I wrote this blog entry. The link may rot at any moment.
I think of a number at random and write you a cheque for that amount. I put it in an envelope.
I write you a cheque for twice the amount and put it into another envelope.
I swirl the envelopes round a bit so you don’t know which is which, and invite you to take one of them. Whichever cheque is in that envelope, it is yours.
You pick an envelope but just before you open it, I offer you the chance to change your mind and take the other one instead.
There is a paradoxical argument which seems to say that you will always profit by changing your mind. Which is nonsense – but where is the hole in the argument?
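If you would rather watch the nonsense fail than hunt for the hole, a quick simulation will do it. This is only a sketch, and it assumes something the puzzle does not say: that the smaller cheque is a whole number of pounds drawn evenly from 1 to 100. The names `play` and `average_winnings` are made up for the illustration.

```python
import random


def play(switch, rng):
    """Play one round and return the value of the cheque you end up with."""
    # Assumption (not in the puzzle): the smaller cheque is 1 to 100 pounds.
    amount = rng.randint(1, 100)
    envelopes = [amount, 2 * amount]
    rng.shuffle(envelopes)  # swirl the envelopes round a bit
    chosen, other = envelopes
    return other if switch else chosen


def average_winnings(switch, rounds=100_000, seed=0):
    """Average cheque value over many rounds for one fixed strategy."""
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(rounds)) / rounds


if __name__ == "__main__":
    print("never switch: ", average_winnings(False))
    print("always switch:", average_winnings(True))
```

Under that assumption both strategies should come out close to 75.75, which is one and a half times the average smaller cheque. The simulation does not say where the argument goes wrong, only that it does.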
I’ve been reading Halfway to Venus, Sarah Anderson’s memoir about having one arm: an experience enriched for me by the fact that when I met her on the train I never even noticed the empty sleeve.
She says at one point that being one-armed is in some ways more disabling than being one-legged, since losing a leg makes locomotion difficult but losing an arm transforms one’s life completely. My first thought was that I hadn’t thought of it like that – and naturally my second thought was: how would having only one arm transform one’s jiu-jitsu?
The answer is that it doesn’t stop one from becoming a black belt. A One-Armed Guide to Jiu Jitsu describes exactly how the black belt Aaron Lapointe deals with the situation, and his opponents. Like any black belt’s description of his own game, it makes one think more deeply about one’s own.
Off to Breakin’ Convention this weekend, and coincidentally ILL-Abilities will be performing again.
You can tell I’ve got some serious work on when I’m on here instead. Here’s another Proof from THE BOOK.
Write down 10 numbers. Any 10 whole numbers. They don’t even all need to be different. Then:
- either one of them is a multiple of 10,
- or a sequence of consecutive ones adds up to a multiple of 10.
For instance, if I write down 3 1 4 1 5 9 2 6 5 3, then 4+1+5=10 and 5+9+2+6+5+3=30.
(Note: the division into case 1 and case 2 is purely artificial. A mathematician would be happy with “a sequence of 1 number adds up to a multiple of 10” but real people aren’t.)
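For anyone who wants to try it on their own ten numbers, here is one way of finding such a run, sketched in Python. It keeps a running total and watches for a remainder it has seen before. That is my own choice of method for the illustration, and the function name `run_summing_to_multiple` is invented.

```python
def run_summing_to_multiple(numbers):
    """Return a run of consecutive entries whose sum is a multiple of len(numbers)."""
    n = len(numbers)
    # Remainder of the running total -> how many numbers had been added
    # when that remainder first appeared. Before adding anything it is 0.
    first_seen = {0: 0}
    total = 0
    for count, value in enumerate(numbers, start=1):
        total += value
        remainder = total % n
        if remainder in first_seen:
            # Two running totals share a remainder, so the numbers added
            # between them sum to a multiple of n.
            return numbers[first_seen[remainder]:count]
        first_seen[remainder] = count
    # With n numbers there are n + 1 running totals but only n remainders.
    raise AssertionError("no run found, which should be impossible")


if __name__ == "__main__":
    print(run_summing_to_multiple([3, 1, 4, 1, 5, 9, 2, 6, 5, 3]))  # [4, 1, 5]
```

The mathematician's single case falls out for free: if one of the numbers is itself a multiple of 10, the run returned may be just that number.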
In the days when I did not suffer from insomnia on trains the way I do now, the nearest railway station to home was Tonbridge, about 45 minutes from London.
The next station after Tonbridge was Ashford, 30 minutes further down the line. This was not theoretical knowledge. I knew it from experience.
It follows that when, one day, I woke up and the train was stationary at Tonbridge, I moved with lightning rapidity. I gathered together bag, coat, books, magazines, whatever, and tumbled with them, in a heap, onto the platform.
I was safe.
I then looked round and realised, one by one, three things.
- I was in London, at Waterloo East station, which looks a lot like Tonbridge.
- But it was all right, I had not leapt off the train 42 minutes too early, because…
- … I was not travelling home from London at all, I was travelling to London from home, for a party.
The good software disasters are the ones where nothing goes wrong and everything works.
Knight Capital’s suicide bid on 1 August 2012 was one of those. They have never admitted what caused them to buy shares at the offer price and immediately sell them back at the (lower) bid price, losing a little money each time, between 40 and 100 times a second, on many different shares, for almost half an hour, but the description at Nanex, although based on inference, seems close to the truth of how software development works:
- They developed some new market-making software: software to offer shares to people who want them and to bid for shares at a lower price from people who don’t.
- They had to test this.
- So they wrote a test program which placed orders to buy at the offer price, to see that the market-making software handled them properly, and which placed orders to sell at the bid price – very fast, to make sure that the software could handle high volumes of transactions.
- The tests worked.
- So they linked the market-making software into the New York Stock Exchange’s systems, and at 9.30am on 1 August 2012, it went live.
- It worked.
- They accidentally included the test program in the software that went live.
- It worked too.
- So, many times a second, Knight Capital asked to buy a share at the offer price and immediately afterwards asked to sell it at the bid price, losing a little money every time and driving the prices wild.
- If the test order went to Knight Capital’s own market-making software, nothing much happened. Knight Capital was losing money to Knight Capital. But when it went to another market-maker, it was a real purchase or a real sale. Knight Capital lost a quarter of a million dollars every second. In half an hour, the rest of Wall Street was 440 million dollars richer at Knight Capital’s expense.
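The quarter of a million a second is easy to check from the other two figures. This is back-of-envelope arithmetic only, taking “half an hour” as exactly 1,800 seconds; the function name is invented for the purpose.

```python
def loss_per_second(total_loss_dollars, duration_seconds):
    """Average loss rate, assuming the money drained away evenly."""
    return total_loss_dollars / duration_seconds


if __name__ == "__main__":
    # 440 million dollars over half an hour (30 * 60 seconds).
    print(round(loss_per_second(440_000_000, 30 * 60)))  # 244444
```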
Because the test program was a test program and not doing real trades, there was no need for it to record the trades it was making, so it didn’t. The only way to tell what was happening while it was happening was to watch the monitors go wild. Here is a series of monitor pictures, at increasing levels of zoom, from the middle of the Knightmare. They are actually rather pretty.
But the thing to remember is that at all times and in all ways, all the software worked exactly as it was designed to work.
Well, almost pure.
I was leafing through Aigner and Ziegler’s Proofs from THE BOOK, which is a compendium of the most beautiful results in mathematics, and I came across a very simple puzzle which Paul Erdős used to use when he wanted to see if someone was really a mathematician. Here it is, without the algebra:
Think of the numbers from 1 to 100. Prove that if you make a collection of any 51 different numbers between 1 and 100, at least one number in your collection will be divisible by some other number in your collection.
I am not a real mathematician. I am not patient enough. Knowing I’d regret it for the rest of my life, I read the solution. It is indeed very beautiful. To atone for my crime and assuage my anguish, here is the proof in a form that I think everyone will be able to follow. Try it. It’s fun.
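If you would like to convince yourself that the claim is true before trying to prove it, a brute-force check gives nothing away. The sketch below is not a proof and is not the proof from THE BOOK. It simply tries every pair. The function name `dividing_pair` is made up.

```python
import random
from itertools import combinations


def dividing_pair(collection):
    """Return (a, b) where a divides b and a != b, or None if no such pair exists."""
    # Sorting guarantees a < b in every pair that combinations() yields.
    for a, b in combinations(sorted(set(collection)), 2):
        if b % a == 0:
            return a, b
    return None


if __name__ == "__main__":
    rng = random.Random(1)
    # A thousand random collections of 51 different numbers from 1 to 100.
    for _ in range(1000):
        pick = rng.sample(range(1, 101), 51)
        assert dividing_pair(pick) is not None
    # 51 really is needed: the 50 numbers from 51 to 100 contain no such pair.
    print(dividing_pair(range(51, 101)))  # None
```

The last line is the interesting one. Doubling any number from 51 upwards overshoots 100, so those fifty numbers are safe from each other, and the puzzle is about why a fifty-first can never be.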
DuckDuckGo is Google for anarchists.
It doesn’t track or remember anything you search for. It doesn’t synthesize an identity for you based on your searches. It doesn’t target advertisements at you based on your searches. And so on. DuckDuckGo is a Google that you don’t pay for with your soul.
This is a Good Thing.
Publishers tell one that each equation one includes in a book halves its sales. Hence the phenomenon of completely incomprehensible books which would have made sense if the author had only been allowed to say what he was talking about: João Magueijo and Marcus du Sautoy are notable victims.
The back page of La Recherche, always a good source of the information that really matters (it once reported a paper which established a negative correlation between designer stubble and life expectancy), now weighs in on the other side. A new paper reports that adding a random line of nonsense mathematics to the abstract of an academic paper on sociology or anthropology made the paper seem more authoritative – though the effect seemed to be confined to non-mathematical readers.