Tools and methods

The other day I received an email from Andrew Gray, a reader of this blog, alerting me to the existence of a new online newspaper archive available at ukpressonline. I've used ukpressonline before for its complete runs of the Daily Express and the Daily Mirror, which were the most popular British dailies for most of the 1930s and 1940s. But it's not a free service. I don't mind paying, but the annual subscription rates are prohibitive for me, and so when I do pay it's only for short-term access with a specific topic in mind. So it's not something I routinely draw upon.

But what Andrew pointed out (thanks Andrew!) was a new 'World War II' subscription package covering just the years 1933 to 1945, i.e. from the rise of Hitler to the end of the Second World War. It's only available by annual subscription, but I think £50.00 is more than reasonable for what it offers: not only the Express and the Mirror, but also the Yorkshire Post (one of the few conservative newspapers to take a stand against appeasement), the Daily Worker (owned by the Communist Party of Great Britain), and Action and Blackshirt (published by the British Union of Fascists and its successors). And it is promised that 'In the coming months, we aim to add major regional newspapers and some of the further-left press' (I would guess that the Yorkshire Post and the Daily Worker are the first of these, actually). This is a really excellent resource for anyone interested in the British press in this period; I've already signed up and started using it.

The Scareship Age, 1892-1946

A couple of months ago, Alun Salt did a very nice thing for me: he unexpectedly assembled some of the posts I've written here about phantom airships into an e-book. Using that as the basis, I've had a go at learning how to do e-books myself. (Alun recommended using Jutoh, an e-book project manager, and I'm glad he did.) So I've tweaked things a bit: added a few of the phantom airship posts I've written recently, played with the cover image, and the result is The Scareship Age, 1892-1946, available in the two most common e-book formats: EPUB, an open format, and MOBI, the format used by Amazon's Kindle. You can download them here, from the Downloads page, or from the sidebar on Airminded's front page. They are of course free, as in Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported.

I have tried this sort of thing before, with my Sudeten crisis posts, but that was as a PDF, which is not really suited for e-books; and with all the images it turned out to be quite bloated at 5.6 MB. The Scareship Age comes in at 0.5 MB for the EPUB and 0.9 MB for the MOBI, which is much better. Now that I have a better idea about how e-books work, I'll have another go at the Sudeten crisis. But not now!

Aerial terminus of the White Moon Line

TRAVELLING OF THE FUTURE: THE BRITISH AERIAL TERMINUS OF THE WHITE MOON LINE -- The old order is passing. Already glimpses of the future of aerial transport, with all its mighty possibilities, are becoming visible. When the stricken nations return to a state of prosperity, great things are in store. As to what economic and commercial revolutions are latent in the development of flying, the most daring of us hesitates to speculate. The picture shows an aerial terminus of the White Moon Line, raised aloft over a seaport. This is no flat aerodrome, but a huge circular structure. Around its topmost circumference platforms swinging on a circular railed bed are carried by two rotating arms, on which the aero liners alight and from which they ascend. The arms are moved round as the wind changes, so that the aero liners descend and ascend facing it. These arms are inclined a little downwards to bring the liners more quickly to rest -- they alight up the slope -- and to assist them to gather speed more rapidly before the final breathless abandonment of the sloping platform and the upward rush into the heavens. On the left is seen a passenger lift with two cars which rise and sink continually, carrying passengers to and from the high embarking level. A mono-railway penetrates to the heart of the terminus; a footway runs between the tracks. An aero liner is seen just ascending, bound on some far journey; another is stationary, loading up. Inside the structure is a huge lift for lowering the aero liners for refitting and repair, and in its mysterious depths we can picture workshops lit by flickering arc lamps, where hundreds of mechanics work busily day and night... Perhaps some of the future aerial termini will be on the ground; but where a man can find no ground near the starting point, he will raise structures such as this. 
The sea-captains will look upwards at the air-captains, beholding the fulfilment of a great dream, dreamt by generations of wise men long passed away, who wondered because they knew that such great things would come to pass. From the original by Roderic Hill.

Source: Flight, 6 January 1921, 10-1.

So, THATCamp Melbourne is over. It was pretty much as I expected, which is to say it was excellent. I'm not going to write a conference report (you should have been following #thatcamp on Twitter for that!) but two sessions did give me ideas for digital history projects I might like to do. One day. If I get the time.

One came out of the unofficial API Tim Sherratt reverse-engineered for Trove Newspapers. (Why the National Library of Australia won't release an official API is a bit mysterious.) He uses that to scrape Trove to do searches and display results which aren't possible with the interface offered by the NLA, such as plotting the frequency of Australian vs British/Briton. Are there any publicly accessible datasets I use which could benefit from the same treatment? Yes, there are. The first one I thought of was the Flight archive, which is a great resource burdened with a limited interface. (But it's fantastic that it exists at all: Flightglobal is a commercial operation and they didn't need to open up their back issues like this if they didn't want to.) I think this is easily doable. A second one is much more ambitious: The National Archives catalogue. It's frustrating that you can't do keyword search across their digitised collections; all you can do is search the descriptions in the catalogue, and these are by their nature limited. A scraper would help here. But the problem there is that you can't download documents directly, even when they are free; you have to add to a 'shopping cart', pay £0.00 for it and wait for an email to arrive. Possibly this could be automated; possibly not.

The other idea I had was to use SahulTime (or its eventual successor, possibly called TemporalEarth) to display the British scareship waves. SahulTime is something like Google Earth, but it allows you to map events/documents/people/objects in time as well as space. Matthew Coller, the developer, originally devised it to represent archaeological data on migration into Australia across the ice-age land bridge, but it is just as useful for historical data. So I could use this to show when and where the scareships were seen, showing how the waves started and evolved, with links to the primary sources. SahulTime is also good at displaying uncertainty in time, which is helpful where I have only vague information about when a sighting happened. The same could be done for uncertainty in space, though that's a bit trickier conceptually.

One day... if I get the time...

Later this week I'm going to THATCamp Melbourne. What's THATCamp, you ask? THATCamp stands for The Humanities and Technology Camp. It's an unconference devoted to exploring the ways in which the humanities and digital technology can work together. It is informal and collegial: attendees vote on the programme on the first morning. It's practical and hands-on: digital projects are often started during the camp, or tools written, or software installed. The first THATCamp was held at the Center for History and New Media at George Mason University in Virginia in 2008; last year there were 17 held around the world, including one in Canberra. Melbourne's is being held at the University of Melbourne, where I work and near where I live, so it would be hard to justify not going!

But the truth is that I did have qualms, because I don't consider myself a digital historian. Sure, there's the blog. But that's about communication, not research; and research comes first. And apart from using digitised sources where possible, my research methods are quite traditional. I find sources, I read them, I compare them, I draw conclusions, and so on. I imagine Gibbon did much the same.

In some ways, this is surprising. In my day job I work in systems administration and IT support, so it's not like I don't know my way around computers. And before history, I studied astrophysics, which has long used digital technology as an integral part of its methods. Indeed, about the first thing you do when you start out learning how to do astrophysical research is to become familiar with the analysis software you'll be using. And my masters project was entirely computational: I wrote, tested and debugged code. (Written in Fortran 77, no less!) So I'm sure that, when I came to do my PhD, I could have handled a project which was much more digital and less traditional in its approach if I'd wanted to.

But that's the thing: I didn't want to. Why leave a career in IT for one in history (and I still hope that will happen) and do the same kind of thing, just for a different end? Fiddle around with Apache installs, write justifications for storage arrays, think about database structures. That's what I want to get away from. What I want to do is read old books, uncover forgotten ideas, meet interesting (albeit usually dead) people. (And tell the world about it, which is where blogging comes in.) I would guess that most historians have similar motivations. And that's the problem for digital history. The types of people who are attracted to doing history are not likely to be attracted to doing digital history. (I have similar reservations about Anthony Grafton's recent call for more collaboration between historians, in emulation of the sciences. We tend to play better alone.)

This is not because digital history has no value: it clearly has vast potential. But at the moment it still belongs to the hackers, those who enjoy creating visualisation tools and XML datasets. It won't realise its potential until every historian is a digital historian, and that won't happen until doing digital history is as natural and painless as... well, as natural and painless as doing traditional history is, anyway. The technology needs to adapt itself to the users, in other words, not the other way around. Well, in reality both will happen; but we aren't there yet.

That said, I'm still excited to be going to THATCamp, and to seeing all the cool ideas and smart people. And I do hope to get more involved in digital history myself, rather than maintaining my current watching brief. But you can understand why I haven't come up with a cool session idea of my own. Or perhaps you can't? Am I being too cautious, too reactionary, too -- dare I say it -- Luddite?

A while back, The National Archives made all Cabinet papers from 1915 to 1980 freely available for download. Now TNA Labs have created a visualisation tool for said papers, allowing you to see clouds of the 25 most frequent words and contributors for any year (month in wartime) or, using the 'flexible querying' mode, any period you specify (up to ten years). Mousing over each result gives the actual count and links to the relevant DocumentsOnline entries. It's something of a toy at the moment (though they encourage you to download the XML dataset it is based upon and play with it yourself). For blogging purposes, it's annoying that there's no export function: I've had to grab some screen shots to show the results. And it's not possible to search for specific words or change the stop word list. But the potential is easy to see.

Cabinet Minutes word frequency, 1931-1940

When looking at the lifetime of the National Government (1931-1940, spanning three prime ministers: Ramsay MacDonald, Stanley Baldwin, and Neville Chamberlain) one word inevitably caught my eye: air. At 1970 mentions over the decade, it's the fourth most common word after war (2537), foreign (2125) and meeting (2059). Air could be used in a number of contexts, of course: the Secretary of State for Air (a Cabinet position at this time) or Air Ministry, Royal Air Force, German air force, air routes, air raids, air raid precautions, air defence, air attack and so on. (I assume the tool is sophisticated enough to match only whole words and not just substrings.) But it suggests that the National Government spent a great deal of its time talking about the air, that it was, so to speak, airminded. (Naval, which admittedly has a somewhat narrower compass, is the only similar term and was used only 1204 times.)
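The difference between whole-word and substring matching is easy to illustrate. What follows is only a minimal Python sketch, not TNA Labs' actual method: the sample sentence, the tiny stop word list and the function names are all made up for illustration.

```python
import re
from collections import Counter

# Hypothetical stop word list; the real tool's list is not known.
STOP_WORDS = {"the", "of", "and", "to", "a", "in", "that", "for", "was"}

# Made-up sample text, not taken from any Cabinet paper.
SAMPLE = ("The Air Ministry discussed air raids, air defence and "
          "repairs to aircraft; the Chairman agreed.")


def tokenize(text):
    """Split text into lowercase alphabetic words (hyphens split words)."""
    return re.findall(r"[a-z]+", text.lower())


def top_words(text, n=25, stop_words=STOP_WORDS):
    """Return the n most frequent whole words, ignoring stop words."""
    counts = Counter(w for w in tokenize(text) if w not in stop_words)
    return counts.most_common(n)


def whole_word_count(text, term):
    """Count occurrences of term as a complete word."""
    return Counter(tokenize(text))[term.lower()]


def substring_count(text, term):
    """Count occurrences of term anywhere, including inside other words."""
    return text.lower().count(term.lower())


if __name__ == "__main__":
    print(top_words(SAMPLE, n=5))
    print(whole_word_count(SAMPLE, "air"), substring_count(SAMPLE, "air"))
```

On the sample sentence, whole-word matching finds air three times, while naive substring matching finds six, because it also counts repairs, aircraft and Chairman. A tool that matched substrings would therefore inflate the figure for a short word like air.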

A tweet from William J. Turkel alerted me to the possibility of using 18th century-style fonts in LaTeX. The most noticeable difference from modern typesetting is the long s, but there are different ligatures too. There are a number of ways to do it but the easiest way is with the inbuilt Kepler Fonts package. (The Fell Types are far prettier, but look difficult, or at least tedious, to install. Font management is one of LaTeX's biggest weaknesses.) Just insert the following in your preamble and you're done:

\usepackage[fullveryoldstyle]{kpfonts}

Well, almost. This simply replaces every s with a long s, which is not right. Most importantly, long s is generally not used at the end of a word, so you need to type each word-final s as 's=' to get the ordinary round s there. Here's what the first paragraph of my thesis looks like when done this way:

I wish I'd known about this before submitting it.
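Making that replacement by hand would be tedious for a whole thesis. Here is a rough Python sketch of how it might be automated. The function name is hypothetical, it only handles a lowercase s at the very end of a word, and it assumes it is run over prose only: it leaves command names such as \ldots alone, but it would still alter words inside command arguments (the kpfonts in the \usepackage line above, for instance).

```python
import re

# Either a LaTeX command name (captured, so it can be left untouched) or a
# lowercase "s" that ends a word and is not already followed by "=".
_FINAL_S = re.compile(r"(\\[A-Za-z]+)|s\b(?!=)")


def round_final_s(text):
    """Rewrite each word-final "s" as "s=", skipping LaTeX command names."""
    def fix(match):
        if match.group(1) is not None:
            return match.group(1)  # a command such as \ldots: keep as-is
        return "s="
    return _FINAL_S.sub(fix, text)


if __name__ == "__main__":
    print(round_final_s(r"This simply replaces sources \ldots"))
```

Because the pattern skips any s already followed by '=', running the function twice over the same text should not double up the markers.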

airminded, 1920-2000

Following Ross' suggestion I've plugged airminded itself into the Google Ngram Viewer (for British English over 1920-2000 with a smoothing of 3). The word wasn't used until c. 1925 and grew in popularity until the end of the Second World War. It then began its long descent. Around 1960 its heyday was definitely over and by the late 1990s it was less popular than almost ever before. There's a noticeable dip in the years around 1940, which makes me wonder if the menace of aviation had temporarily overwhelmed its promise. But that's probably reading too much into it.
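For what the 'smoothing of 3' means in practice: as I understand the Ngram Viewer's own description, each plotted value is the average of the raw value for that year and the values for that many years on either side, so a smoothing of 3 is a seven-year centred moving average. A minimal Python sketch, with a made-up input series; the handling of the ends of the series (averaging whatever years are available) is my assumption:

```python
def smooth(values, smoothing=3):
    """Centred moving average over `smoothing` values either side of each point.

    Near the ends of the series the window is simply truncated, so fewer
    values are averaged there (an assumption, see the text above).
    """
    smoothed = []
    for i in range(len(values)):
        start = max(0, i - smoothing)
        end = min(len(values), i + smoothing + 1)
        window = values[start:end]
        smoothed.append(sum(window) / len(window))
    return smoothed


if __name__ == "__main__":
    # Made-up yearly frequencies with a single one-year spike.
    raw = [0, 0, 0, 7, 0, 0, 0]
    print(smooth(raw, smoothing=3))
```

One consequence is that a sharp one-year spike gets spread across seven years, so a dip like the one around 1940 could come from only one or two low years in the raw data.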

Finally, something to justify the existence of the Internet. The Google Ngram Viewer takes the corpus of words formed by the Google Books dataset (i.e. books, journals, magazines, but not newspapers) and lets you plot the changes in frequency of selected ones over time. There are all sorts of interesting questions you could (in principle) answer with this tool, so let's give it a whirl.

aeroplane, airplane, 1890-2000

Here's a pretty basic one. Blue is aeroplane, red is airplane, the period is 1890-2000. (The smoothing in all these plots is 3 years.) Aeroplane was initially the more popular term, but airplane has predominated since about 1925. Note the peaks during the world wars -- airplane was 5 times more likely to be used in the Second World War than in the 1990s.

But we don't have to use the English corpus: there's also American English and British English. Here's the American version.

[Cross-posted at Cliopatria.]

I know. Writing about Wikipedia is so 2006. And yes, finding errors in Wikipedia articles is not exactly difficult. But I have a bee in my bonnet which needs releasing.

Wikipedia's page on the Blitz has a section entitled 'Commencement on September 6'. This is how it currently reads (sans hyperlinks and superscripts):

There is a misconception that the Blitz started on September 7, 1940. Bombs began dropping the night of September 6 and continued for the full day of the 7th and on into the morning of the 8th. Saturday 7th was the first full day and has officially and erroneously become known as the day the Blitz started. Hermann Göring launched bombers and the first bombs caused damage the night of September 6.

Quoted in the The Manchester Guardian is Göring's communiqué:

Attacks of our Air Force on objectives of special military and economical value in London, which began during the night of September 6, were continued during the day and night of September 7 with exceptionally strong forces using bombs of the heaviest caliber.

A witness recalled the evening of Friday September 6, 1940:

My name is John Davey. I was born on December 27th 1924 in South Moltom [sic - Molton] Road, Custom House, West Ham, and a couple of miles from the Royal Docks. In September 1940, on the Friday evening of the weekend the docks were first blitzed, I was sitting with my friend in his house. At about 7 p.m. there was a series of explosions and the shattering of glass. We ran into the road and saw at the end a flame that shot into the sky, seeming to light up the whole area. My friend and I and lots of others ran towards the fire.
—BBC, WW2 People's War

The first damage to property on September 7 was recorded at eight minutes past midnight, a grocer’s shop at 43 Southwark Park Road, SE16.

It has long been the accepted, but erroneous, view that the London Blitz lasted 57 consecutive nights starting on September 7 1940 and ending November 1. In actuality September 6 makes 57 nights and not September 7. The historian AJP Taylor wrote of such an error:

… it is the fault of previous legends which have been repeated by historians without examination. These legends have a long life.

This is really quite silly. Yes, it's true that the accepted date of 7 September 1940 as the start of the London Blitz is a bit misleading, since there was a non-trivial amount of bombing before that date (e.g. see here). Judging from contemporary press accounts, 7 September certainly seemed to mark an important change in German bombing strategy, but more one of quantity than quality -- almost more an inflection point than a turning point. In retrospect we tend not to see it that way, which is fine. But we could recognise that -- leaving aside the eventual reification involved in the name 'the Blitz' itself -- the 'start of the Blitz' was less clearly defined then than it seems now.