Tools and methods


Looking over the list of Australian mystery aircraft sightings suggests that some generalisations can be made.

Aeroplane vs airship, 1900-1918

In the 1900s, mysterious lights in the sky were usually described as being airship-like; after 1910 they were far more likely to be called aeroplanes. Perhaps not coincidentally, 1910 was when aeroplanes first flew in Australia; certainly a search of Trove Newspapers (using Wraggelabs' QueryPic) shows that 1910 was the first year when the word "aeroplane" appeared markedly more frequently than "airship". So that's easy enough to explain.
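
The comparison behind that claim can be sketched roughly as follows, assuming you have per-year article counts for each word (for example, copied by hand from a QueryPic chart). The function name, the 1.5 ratio standing in for "markedly more frequently", and the counts themselves are all assumptions made up for illustration; they are not Trove figures.

```python
from typing import Dict, Optional


def first_year_dominant(
    counts_a: Dict[int, int],
    counts_b: Dict[int, int],
    ratio: float = 1.5,
) -> Optional[int]:
    """Return the earliest year in which term A appears at least
    `ratio` times as often as term B, or None if that never happens.

    `ratio` is an arbitrary stand-in for "markedly more frequently".
    """
    # Only compare years for which both terms have a count.
    for year in sorted(set(counts_a) & set(counts_b)):
        if counts_a[year] > 0 and counts_a[year] >= ratio * counts_b[year]:
            return year
    return None


# Placeholder counts, invented purely for illustration (NOT real Trove data).
aeroplane = {1908: 10, 1909: 120, 1910: 400}
airship = {1908: 40, 1909: 300, 1910: 200}

print(first_year_dominant(aeroplane, airship))
```

With real counts the answer will depend on the threshold chosen, so it is worth trying a few values of `ratio` before reading much into any single crossover year.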

The same search shows that 1909 was the year that aviation really broke through into public consciousness. That's also the year of the Australian phantom airship wave. As it was the first burst of interest in aircraft, the first time that people started to learn about them, it's perhaps not surprising that people might think they saw them flying around where they weren't. The 1918 mystery aeroplane scare came after several years of increasing press coverage of aviation, obviously due to the war. So again that fits. Aeroplanes were something people were reading (and probably talking) about a lot. But that by itself is evidently not enough to generate a mystery aeroplane scare: there were a few seen in 1914, and a handful in the years after that, but nothing on the scale of 1918. There needs to be a plausible reason for aircraft to be flying about: and the reported visit of the Wolf and its Wölfchen to Australian shores provided that, though the desperate situation of the Allied armies in France was also a factor.

I've updated my list of British newspapers online, 1901-1950 to reflect the new titles available in the British Newspaper Archive (BNA), a pay-site which was launched with some fanfare about a month ago. Although it has been digitised from (and in partnership with) the British Library's newspapers collections, I must admit to not having paid much attention at the time because it sounded like it only covered 1900 and earlier. While that's mostly true, there's actually enough to interest an early 20th-century historian, especially in terms of regional newspapers, and more titles and pages are promised. Having said that, the price structure isn't very appealing for what's on offer, so I haven't subscribed to BNA and probably won't until I have a specific purpose in mind.

Most of the 20th-century titles are available only up to 1903. But the Western Times (Exeter) is available right up until 1950, and the Tamworth Herald until 1944. Four other newspapers have digitised runs of over a decade: Cheltenham Looker-On (1902 to 1913); North Devon Journal (Barnstaple, to 1923); Nottingham Evening Post (1921 to 1944); Western Daily Press (Bristol, 1915 to 1930). You can download whole pages (though apparently not individual articles), though sadly without a text layer. The free samples are good quality -- of course, they would be -- but keyword searches (which you can do for free) suggest that the OCR is generally good. There is also the ability to correct the text where the OCR fails, and you can tag or comment on individual articles. User accounts also come with a 'My Research' section which allows you to bookmark articles as well as view a history of previous searches performed and articles viewed. A potentially handy feature is the ability to perform a keyword search on just the articles you've viewed. Searching in general is fast and powerful; you can quickly narrow a query by period, area, title or section of newspaper. I'm impressed with BNA's user interface overall: it is a lot like (and I'm sure directly inspired by) the National Library of Australia's Trove Digitised Newspapers but with a few more improvements for the dedicated researcher in mind.

Now for the complaints. These all revolve around the non-free nature of BNA. I do have philosophical objections to state institutions handing over their nation's cultural heritage largely preserved at taxpayer expense to free enterprise to make a buck out of, but there are practical problems too. The facilities for tagging, commenting and correcting are great, for example, but I question whether these are going to be used much in a non-open environment like this. Especially corrections: Trove has a community of eager text-correctors who make over a hundred thousand corrections a day; but then Trove is free. Expecting people to pay BNA for the privilege of improving their product is a bit much to ask, it seems to me. Apparently the current commercial arrangement will last for ten years, after which it may become open; but by then the technology will no doubt need updating and probably another commercial arrangement to fund it. I realise that digitisation and hosting costs money and it's not the British Library's fault it had to go down this route if it wanted to make its newspaper collection available to all; but I much prefer the Antipodean ethos on this one. Some of the problems resulting from the non-free, non-open nature of BNA could be fixed, though. As I noted above, given the limited number of titles currently available for the 20th century, subscribing for a whole year is not attractive to me. Why not have a cheaper option for just the 20th century?

[Cross-posted at Cliopatria.]

I have a favour to ask of you. Would you mind please having a look at this and telling me what's wrong with it? Thank you.

To be somewhat less cryptic, it's an article for peer-review which I am having no luck getting accepted anywhere, and I don't really know why. I've had some bad luck. I wrote the first version about a year before I finished my PhD, in the hope that it would be on my CV by the time I entered the job market; in the event the journal I submitted it to took well over a year to reject it. But I've made some bad choices too. In its original form it was too ambitious and far too long; after three rejections I decided to cut it in two and rewrite each piece as a standalone article. As it (or at least the first part) was now shorter and sharper, I was again hopeful that I could find a home for it. But I've now received a second rejection for this version. This last rejection was helpful in that the reviewer provided detailed criticism, but while much of it is well taken, some of it is not, and suggests that the point of my article did not get across. That's my fault as a writer; it might also be that I've been sending it to the wrong journals. But as I say, I'm not really sure why it's so difficult to place; it doesn't seem to me to be any worse than my first or even my second peer-reviewed articles.

So I'm taking a leaf out of Katrina Gulliver's book (though not her actual book!) by putting the article up on Google Docs and requesting feedback from anyone who has the patience to wade through it. You can comment on the article itself, either anonymously (if you don't want to be mentioned in the acknowledgements) or using your Google account; or you can send me an email. (No comments here though, please, unless they're about the crowdsourcing itself.) I'll take it down after a week or so.

How can I improve the article? What am I doing wrong? Where should I send it? Or should I just accept that this one is a dud and forget about it? It's up to you! Well, it's still up to me, but I'll be grateful for any and all suggestions.

[Cross-posted at Cliopatria.]

I've been using the Internet for nearly two decades: in 1992 -- after nervously checking with the physics computer lab manager first -- I sent an email to my future Honours supervisor while she was visiting Toronto. I was quickly hooked by the promise of overcoming the tyranny of distance and transparently communicating with people all across the planet. Of course, it never worked quite like that. Of the many different forms of communication enabled by the Internet I've tried since then, many have fallen by the wayside (who now uses Unix talk? When was the last WAIS server shut down?), others still limp along (Gopher, IRC, Usenet) while others are in surprisingly rude health (you've probably used FTP at some point, though you may not have known it). Sometimes I was an early adopter: I set up my first webserver early in 1994, at a time when there must have been only a few thousand websites in the world. At other times I was very late to the party. But after much enthusiastic (and occasionally obsessive) participation in these and other protocols, I eventually became jaded and turned to passive consumption of content rather than creation in any form. It was only when I took up blogging at the start of my PhD that I rediscovered that early joy in talking to the world.

But the thing about blogging is that it's pretty much all about me, me, me. While I absolutely value and enjoy interacting with commenters, and hope that those who read without commenting find what I post here interesting or valuable, it's my place and I set the agenda. And I'd probably still blog even if nobody read it. So while Airminded is part of the World Wide Web, spending so much time on it could lead me to think that bombing and phantom airships and the knock-out blow are more important than they really are (which is to say, not very). As well, because my authorial voice dominates here it can lead me to think that my opinion is more important than it really is (which is to say, even less).

Which brings me to Twitter. I've blogged about tweeting a couple of times before, first when I began using Twitter in earnest, then when I reached one thousand tweets. I've now added more than 10,000 to that figure, so it's probably safe to say that I'm a Twitter addict -- er, have become accustomed to using it. For link sharing, making contacts, historical musings, friendly banter and just general silliness, for sure; but there's more to it than that.

Tweeting is sometimes called microblogging, but that's a bit of a misnomer. It's true that it's possible to use Twitter just to broadcast your own thoughts or promote your own things, but unless you're already a celebrity nobody is going to listen. The real value comes from listening and (optionally) responding to what others say -- in interacting with others. With other historians, sure, but also with other people who share some interests and with others who don't.

The biggest and best example of this, for me, has been following the Arab Spring, particularly the revolutions in Egypt and Libya. Not just the news (and the rumours), but the commentary coming from those living through them: their experiences, hopes, fears. I confess this was a bit of an eye-opener for me. Intellectually, of course, I knew that people living in autocracies are like people everywhere else, but hearing the diversity of their responses (even within the limitations of 140 characters) I recognised them as individuals at a more basic level. It became impossible for me to discount the revolutions as quarrels in far away countries between people of whom I knew nothing. Twitter helped me humanise an important period in contemporary history. That's something that I don't think any of those older protocols, from email on, could have helped me to do, not in practice. It's not transparent at all, of course, and it is as subject to biases and deceptions as any other form of human communication; but using Twitter is really the closest I've come to entering the global village I glimpsed nearly two decades ago.

Because it's #twitterstorians Day, I really should have said something about the specifically historical uses (and limitations) of Twitter. Luckily there are plenty of others who have done that:

@katrinagulliver (who is responsible) · @jliedl · @jondresner · @kathryntomasek · @kellyhignett · @kelly_j_baker · @lottelydia · @markcheathem · @publichistorian · @raherrmann · @sharon_howard (with a special shout-out for The Broadside) · @wilkohardenberg

PS If you don't already follow me on Twitter, I'm @Airminded!

The other day I received an email from Andrew Gray, a reader of this blog, alerting me to the existence of a new online newspaper archive available at ukpressonline. I've used ukpressonline before for its complete runs of the Daily Express and the Daily Mirror, which were the most popular British dailies for most of the 1930s and 1940s. But it's not a free service. I don't mind paying, but the annual subscription rates are too prohibitive for me, and so when I do pay it's only for short-term access with a specific topic in mind. So it's not something I routinely draw upon.

But what Andrew pointed out (thanks Andrew!) was a new 'World War II' subscription package covering just the years 1933 to 1945, ie from the rise of Hitler to the end of the Second World War. It's only available by annual subscription, but I think £50.00 is more than reasonable for what it offers: not only the Express and the Mirror, but also the Yorkshire Post (one of the few conservative newspapers to take a stand against appeasement), the Daily Worker (owned by the Communist Party of Great Britain), and Action and Blackshirt (published by the British Union of Fascists and its successors). And it is promised that 'In the coming months, we aim to add major regional newspapers and some of the further-left press' (I would guess that the Yorkshire Post and the Daily Worker are the first of these, actually). This is a really excellent resource for anyone interested in the British press in this period; I've already signed up and started using it.

The Scareship Age, 1892-1946

A couple of months ago, Alun Salt did a very nice thing for me: he unexpectedly assembled some of the posts I've written here about phantom airships into an e-book. Using that as the basis, I've had a go at learning how to do e-books myself. (Alun recommended using Jutoh, an e-book project manager, and I'm glad he did.) So I've tweaked things a bit, added a few of the phantom airship posts I've written recently, and played with the cover image; the result is The Scareship Age, 1892-1946, available in the two most common e-book formats: EPUB, an open format, and MOBI, the format used by Amazon's Kindle. You can download them here, from the Downloads page, or from the sidebar on Airminded's front page. They are of course free, as in Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported.

I have tried this sort of thing before, with my Sudeten crisis posts, but that was as a PDF, which is not really suited for e-books; and with all the images it turned out to be quite bloated at 5.6 MB. The Scareship Age comes in at 0.5 MB for the EPUB and 0.9 MB for the MOBI, which is much better. Now that I have a better idea about how e-books work, I'll have another go at the Sudeten crisis. But not now!

Aerial terminus of the White Moon Line

TRAVELLING OF THE FUTURE: THE BRITISH AERIAL TERMINUS OF THE WHITE MOON LINE -- The old order is passing. Already glimpses of the future of aerial transport, with all its mighty possibilities, are becoming visible. When the stricken nations return to a state of prosperity, great things are in store. As to what economic and commercial revolutions are latent in the development of flying, the most daring of us hesitates to speculate. The picture shows an aerial terminus of the White Moon Line, raised aloft over a seaport. This is no flat aerodrome, but a huge circular structure. Around its topmost circumference platforms swinging on a circular railed bed are carried by two rotating arms, on which the aero liners alight and from which they ascend. The arms are moved round as the wind changes, so that the aero liners descend and ascend facing it. These arms are inclined a little downwards to bring the liners more quickly to rest -- they alight up the slope -- and to assist them to gather speed more rapidly before the final breathless abandonment of the sloping platform and the upward rush into the heavens. On the left is seen a passenger lift with two cars which rise and sink continually, carrying passengers to and from the high embarking level. A mono-railway penetrates to the heart of the terminus; a footway runs between the tracks. An aero liner is seen just ascending, bound on some far journey; another is stationary, loading up. Inside the structure is a huge lift for lowering the aero liners for refitting and repair, and in its mysterious depths we can picture workshops lit by flickering arc lamps, where hundreds of mechanics work busily day and night... Perhaps some of the future aerial termini will be on the ground; but where a man can find no ground near the starting point, he will raise structures such as this. 
The sea-captains will look upwards at the air-captains, beholding the fulfilment of a great dream, dreamt by generations of wise men long passed away, who wondered because they knew that such great things would come to pass. From the original by Roderic Hill.

Source: Flight, 6 January 1921, 10-1.

So, THATCamp Melbourne is over. It was pretty much as I expected, which is to say it was excellent. I'm not going to write a conference report (you should have been following #thatcamp on Twitter for that!) but two sessions did give me ideas for digital history projects I might like to do. One day. If I get the time.

One came out of the unofficial API Tim Sherratt reverse-engineered for Trove Newspapers. (Why the National Library of Australia won't release an official API is a bit mysterious.) He uses that to scrape Trove to do searches and display results which aren't possible with the interface offered by the NLA, such as plotting the frequency of Australian vs British/Briton. Are there any publicly accessible datasets which I use which could benefit from the same treatment? Yes, there are. The first one I thought of was the Flight archive, which is a great resource burdened with a limited interface. (But it's fantastic that it exists at all: Flightglobal is a commercial operation and they didn't need to open up their back issues like this at all, if they didn't want to.) I think this is easily doable. A second one is much more ambitious: The National Archives catalogue. It's frustrating that you can't do keyword search across their digitised collections; all you can do is search the descriptions in the catalogue, and these are by their nature limited. A scraper would help here. But the problem there is that you can't download documents directly, even when they are free; you have to add to a 'shopping cart', pay £0.00 for it and wait for an email to arrive. Possibly this could be automated; possibly not.

The other idea I had was to use SahulTime (or its eventual successor, possibly called TemporalEarth) to display the British scareship waves. SahulTime is something like Google Earth, but it allows you to map events/documents/people/objects in time as well as space. Matthew Coller, the developer, originally devised it to represent archaeological data on migration into Australia across the ice-age land bridge, but it is just as useful for historical data. So I could use this to show when and where the scareships were seen, showing how the waves started and evolved, with links to the primary sources. SahulTime is also good at displaying uncertainty in time, which is helpful where I have only vague information about when a sighting happened. The same could be done for uncertainty in space, though that's a bit trickier conceptually.
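
To make the idea of uncertainty in time a little more concrete, here is a minimal sketch of how a sighting with a vague date might be represented. It assumes each sighting is stored with an earliest and a latest possible date; it says nothing about how SahulTime itself stores or displays data. The `Sighting` class, its field names and the example record are all hypothetical placeholders, not entries from any real list of sightings.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Sighting:
    """A reported sighting whose date is only known to lie in a range."""
    place: str
    earliest: date  # earliest date the sighting could have happened
    latest: date    # latest date the sighting could have happened
    source: str     # citation for the primary source

    def may_fall_within(self, start: date, end: date) -> bool:
        """True if the sighting's date range overlaps [start, end]."""
        return self.earliest <= end and self.latest >= start

    def certainly_within(self, start: date, end: date) -> bool:
        """True if the whole date range lies inside [start, end]."""
        return start <= self.earliest and self.latest <= end


# Hypothetical record, invented for illustration only.
example = Sighting(
    place="Place A (hypothetical)",
    earliest=date(1913, 2, 20),
    latest=date(1913, 3, 5),
    source="hypothetical newspaper report",
)

# Possibly, but not certainly, a February 1913 sighting.
print(example.may_fall_within(date(1913, 2, 1), date(1913, 2, 28)))
print(example.certainly_within(date(1913, 2, 1), date(1913, 2, 28)))
```

Keeping "possibly within" and "certainly within" as separate questions means a timeline can show vague sightings differently from firmly dated ones rather than silently dropping them. Uncertainty in space would need something analogous, such as a region instead of a point.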

One day... if I get the time...

Later this week I'm going to THATCamp Melbourne. What's THATCamp, you ask? THATCamp stands for The Humanities and Technology Camp. It's an unconference devoted to exploring the ways in which the humanities and digital technology can work together. It is informal and collegial: attendees vote on the programme on the first morning. It's practical and hands-on: digital projects are often started during the camp, or tools written, or software installed. The first THATCamp was held at the Center for History and New Media at George Mason University in Virginia in 2008; last year there were 17 held around the world, including one in Canberra. Melbourne's is being held at the University of Melbourne, where I work and near where I live, so it would be hard to justify not going!

But the truth is that I did have qualms, because I don't consider myself a digital historian. Sure, there's the blog. But that's about communication, not research; and research comes first. And apart from using digitised sources where possible, my research methods are quite traditional. I find sources, I read them, I compare them, I draw conclusions, and so on. I imagine Gibbon did much the same.

In some ways, this is surprising. In my day job I work in systems administration and IT support, so it's not like I don't know my way around computers. And before history, I studied astrophysics, which has long used digital technology as an integral part of its methods. Indeed, about the first thing you do when you start out learning how to do astrophysical research is to become familiar with the analysis software you'll be using. And my masters project was entirely computational: I wrote, tested and debugged code. (Written in Fortran 77, no less!) So I'm sure that, when I came to do my PhD, I could have handled a project which was much more digital and less traditional in its approach if I'd wanted to.

But that's the thing: I didn't want to. Why leave a career in IT for one in history (and I still hope that will happen) and do the same kind of thing, just for a different end? Fiddle around with Apache installs, write justifications for storage arrays, think about database structures. That's what I want to get away from. What I want to do is read old books, uncover forgotten ideas, meet interesting (albeit usually dead) people. (And tell the world about it, which is where blogging comes in.) I would guess that most historians have similar motivations. And that's the problem for digital history. The types of people who are attracted to doing history are not likely to be attracted to doing digital history. (I have similar reservations about Anthony Grafton's recent call for more collaboration between historians, in emulation of the sciences. We tend to play better alone.)

This is not because digital history has no value: it clearly has vast potential. But at the moment it still belongs to the hackers, those who enjoy creating visualisation tools and XML datasets. It won't realise its potential until every historian is a digital historian, and that won't happen until doing digital history is as natural and painless as... well, as natural and painless as doing traditional history is, anyway. The technology needs to adapt itself to the users, in other words, not the other way around. Well, in reality both will happen; but we aren't there yet.

That said, I'm still excited to be going to THATCamp, and to seeing all the cool ideas and smart people. And I do hope to get more involved in digital history myself, rather than maintaining my current watching brief. But you can understand why I haven't come up with a cool session idea of my own. Or perhaps you can't? Am I being too cautious, too reactionary, too -- dare I say it -- Luddite?

A while back, The National Archives made all Cabinet papers from 1915 to 1980 freely available for download. Now TNA Labs have created a visualisation tool for said papers, allowing you to see clouds of the 25 most frequent words and contributors for any year (month in wartime) or, using the 'flexible querying' mode, any period you specify (up to ten years). Mouse-overing each result gives the actual count and links to the relevant DocumentsOnline entries. It's something of a toy at the moment (though they encourage you to download the XML dataset it is based upon and play with it yourself). For blogging purposes, it's annoying that there's no export function: I've had to grab some screen shots to show the results. And it's not possible to search for specific words or change the stop word list. But the potential is easy to see.

Cabinet Minutes word frequency, 1931-1940

When looking at the lifetime of the National Government (1931-1940, spanning three prime ministers: Ramsay MacDonald, Stanley Baldwin, and Neville Chamberlain) one word inevitably caught my eye: air. At 1970 mentions over the decade, it's the fourth most common word after war (2537), foreign (2125) and meeting (2059). Air could be used in a number of contexts, of course: the Secretary of State for Air (a Cabinet position at this time) or Air Ministry, Royal Air Force, German air force, air routes, air raids, air raid precautions, air defence, air attack and so on. (I assume the tool is sophisticated enough to match only whole words and not just substrings.) But it suggests that the National Government spent a great deal of its time talking about the air, that it was, so to speak, airminded. (Naval, which admittedly has a somewhat narrower compass, is the only similar term and was used only 1204 times.)
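
The whole-word assumption matters more than it might seem. The sketch below is not how the TNA Labs tool works (that isn't documented here); it just shows, on a made-up sentence, how whole-word and substring counts of "air" can diverge, and how a top-N word list with a stop word list might be built. The function names, the tiny stop word list and the sample sentence are all assumptions for illustration.

```python
import re
from collections import Counter

# A deliberately tiny stop word list; a real one would be much longer.
STOP_WORDS = {"the", "of", "and", "a", "to", "in", "for", "was", "on"}


def top_words(text, n=25, stop_words=STOP_WORDS):
    """Return the n most frequent words in text, ignoring stop words."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(w for w in words if w not in stop_words).most_common(n)


def whole_word_count(text, word):
    """Count occurrences of word as a whole word, case-insensitively."""
    pattern = rf"\b{re.escape(word)}\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))


def substring_count(text, word):
    """Count occurrences of word anywhere, even inside other words."""
    return text.lower().count(word.lower())


# Made-up sentence, not taken from any Cabinet paper.
sample = ("The Air Ministry discussed air raids; "
          "the chairman raised repairs to the airfield.")

print(whole_word_count(sample, "air"))  # matches "Air" and "air" only
print(substring_count(sample, "air"))   # also counts chairman, repairs, airfield
print(top_words(sample, n=3))
```

If a tool counted substrings, words like "chairman" or "repairs" would inflate the figure for "air", so it would be worth checking against the downloadable XML dataset before leaning on the exact number.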
