Friday, 25 October 2013

Advice for our donors and depositors


Anyone who knows anything about digital archiving knows that one of the best ways to ensure the longevity of your digital data is to plan for it at the point of creation.

If data is created with long-term archiving in mind, following a few simple, common-sense data management rules, then the resulting files are not only much easier for the digital archivist to manage in the future, but also easier for the creator to work with. How much easier is it to locate and retrieve files that are ordered in a sensible and logical hierarchy of folders and named in a way that is helpful? We are producing more and more data over time, and as the quantity of data increases, so does the size of our problems in managing it.

We do not have many donors and depositors at the Borthwick who regularly put digital archives into our care but this picture will no doubt change over time. For those who do deposit digital archives, it is important that we encourage them to put good data management into practice and the earlier we speak to them about this the better.

'Bitrot', from the Digital Preservation Business Case Toolkit (http://wiki.dpconline.org/)

Last week I was fortunate enough to be invited to speak to a group of people from one of our depositor organisations who are likely to start giving us digital data to archive in the future. They were from a large organisation with no central IT infrastructure and many people working from home on their own computers. Good data management is particularly important in this sort of scenario. This was a great opportunity for me to test out what could become the basis of a standard presentation on digital data management techniques that could be delivered to our donors and depositors.

I started off talking about what digital preservation is and why we really need to do it. It is always handy to throw in a few cautionary tales at this point as to what happens when we don't look after our data. I think these sorts of stories resonate with people more than just hearing the dry facts about obsolescence and corruption. I made a good plug for the 'Sorry for your data loss' cards put together by the Library of Congress earlier this year as this is something that any of us who have experienced data loss can relate to.

I then moved on to my own recent tale of digital rescue, using the 5 1/4 inch floppy disks from the Marks and Gran archive as my example (discussed in a previous blog post). This was partly because this is my current pet project, but also because it is a good way to cement and describe the real issues of hardware and software obsolescence and how we can work around these.

In the last section of the presentation I gave out my top tips on data management. I wanted the audience to go home with a positive sense of what they can start to do immediately in order to help protect their data from corruption, loss or misinterpretation.

Much of what I discussed in this section was common sense stuff really. Topics covered included:
  • how to name files sensibly
  • how to organise files well within a directory structure
  • how to document files
  • the importance of backup
  • the importance of anti-virus software
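
To make the first of these tips a little more concrete, here is a minimal Python sketch of one possible file naming convention. The pattern (ISO date, lower-case hyphenated title, two-digit version number) and the function name suggest_filename are assumptions chosen for illustration, not the specific rules given in the presentation.

```python
import re
from datetime import date


def suggest_filename(title: str, created: date, version: int, extension: str) -> str:
    """Build a file name such as 2013-10-25_meeting-notes_v01.txt.

    The convention used here is an assumption for illustration only.
    """
    # Lower-case the title and collapse anything that is not a letter
    # or digit into a single hyphen, so the name has no spaces or symbols.
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    # Normalise the extension: lower-case, without a leading dot.
    ext = extension.lower().lstrip(".")
    # A date written as YYYY-MM-DD sorts chronologically in a file listing.
    return f"{created.isoformat()}_{slug}_v{version:02d}.{ext}"


if __name__ == "__main__":
    print(suggest_filename("Meeting Notes", date(2013, 10, 25), 1, ".TXT"))
```

Putting the date first in year-month-day order means that files sort chronologically in a folder listing, and avoiding spaces and punctuation helps the names survive a move between operating systems.
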
The presentation went well and sparked lots of interesting questions and debate and it was encouraging to see just how accessible this topic is to a non-specialist audience. Some of the questions raised related to current 'hot topics' in the digital archiving world which I hadn't had time to mention in any depth in my presentation:

  • How do you archive e-mails?
  • Is cloud storage safe?
  • What is wrong with PDF files?
  • What is the life span of a memory stick?
I had an answer to all of these apart from the last one, for which I have since found out the answer is 'it depends'. I have recently been told on Twitter that most digital preservation questions can be answered in this way!

Monday, 7 October 2013

Do sound engineers have more fun?


At the end of last week I was at the British Library on their excellent ‘Understanding and Preserving Audio Collections’ course.

British Library and Newton by Joanna Penn on Flickr
The concept of ‘Preserving audio’ is not a new one to me. Audio needs to be digitised for preservation and access and that pushes it firmly into my domain as digital archivist. I know the very basics such as the recommended file formats for long term preservation, but when faced with a real life physical audio collection on a variety of digital and analogue carriers it is hard to know what the priorities are and where exactly to start. This is where the ‘Understanding audio’ part of the course came to my rescue, filling in some of the gaps in my knowledge.

The course

The first day of the course was fascinating. We were given a run-down of the history of audio media and were introduced to (and in many cases, allowed to handle) many different physical carriers of audio. Hearing a wax cylinder being played on an original phonograph was a highlight for me. Digital archivists don’t normally get to play with the physical artefacts held within archives! Perhaps most useful in this session was learning how to recognise different types of physical media and spot the signs of physical degradation.

In the following two days the emphasis moved on to digital carriers and digital files. Interestingly, digital carriers were seen to be more vulnerable than analogue in many respects. Digitisation workflows were also discussed and we got the chance to look around the digitisation studios, where a wide range of equipment was demonstrated. This was the point at which I started to wish I was a sound engineer!

Not so different after all…

One of the things that struck me throughout the three days was that this really isn't an alien subject to me at all. Familiar concepts were repeated again and again about obsolescence of technology, lack of standards (particularly when a new type of media takes off), the importance of metadata, the idea that future technologies may be able to do a better job of this than us, and the vain hopes that an ‘everlasting’ media carrier may be made available to us and solve all of our problems. Standard topics in any introductory presentation concerning digital archiving!

What was new and interesting to me though was that for audio media a time limit has been internationally agreed for taking action. We need to plan to digitise our audio and preserve it within a digital archive within the next 15 years because there will come a point at which this strategy will not be possible any more. We have a limited window of opportunity to work in. Digitising obsolete analogue and digital carriers is becoming harder to do (as the media degrade in a variety of different ways) and more expensive (as the necessary hardware and parts become harder to source). In fact, whereas digitisation of documents is becoming cheaper over time as new technologies are introduced, the digitisation of audio is moving in the opposite direction.

Has such a time limit ever been discussed for rescuing data from obsolete digital media such as floppies and zip disks? If so, it is not one that has hit my radar.

Putting the learning into context

The Borthwick Institute and the University of York curate some substantial music collections, but we have also been carrying out an audit of all the other bits and pieces of audio that are buried within some of our other collections. Currently we have a list with basic information about each item including the media type, the location in the strongroom and descriptive information taken from the label or packaging. The next step was to work out a digitisation strategy for these items.

This is all well and good, but work on this stalled as it quickly found its way into the ‘too difficult’ box. Following on from the information absorbed on this course, I now have the ability to start the process of prioritising the audio for digitisation based on variables such as the vulnerability of the physical media and the condition of individual items, while also taking into account whether the content is unique or of particular interest.
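
As a rough illustration of how this sort of prioritisation might be recorded, here is a minimal Python sketch. The 1 to 5 scales, the extra weighting for unique content, the field names and the sample items are all assumptions made up for illustration; none of them come from our actual audit or from the course.

```python
from dataclasses import dataclass


@dataclass
class AudioItem:
    # Fields mirroring the basic audit list: media type, location, description.
    media_type: str
    location: str
    description: str
    # Hypothetical scores: 1 = low risk, 5 = high risk.
    media_vulnerability: int
    condition_risk: int
    # Whether the content is unique or of particular interest.
    unique_content: bool


def priority(item: AudioItem) -> int:
    """Combine the variables into one score; higher means digitise sooner."""
    # The bonus of 2 for unique content is an arbitrary, illustrative weight.
    bonus = 2 if item.unique_content else 0
    return item.media_vulnerability + item.condition_risk + bonus


def prioritise(items: list[AudioItem]) -> list[AudioItem]:
    """Return the items ordered from highest to lowest priority."""
    return sorted(items, key=priority, reverse=True)


if __name__ == "__main__":
    # Invented sample data, not real collection items.
    audit = [
        AudioItem("compact cassette", "Shelf A", "Interview, 1985", 3, 2, True),
        AudioItem("vinyl LP", "Shelf B", "Commercial recording", 1, 1, False),
        AudioItem("open reel tape", "Shelf C", "Lecture recording", 4, 4, True),
    ]
    for item in prioritise(audit):
        print(priority(item), item.media_type, "-", item.description)
```

A simple additive score like this is only a starting point for discussion. In practice the weightings would need to be agreed with a sound engineer or conservator who can assess the carriers properly.
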

Another benefit is that I now feel that I could hold a conversation with a sound engineer! This is key to planning a digitisation project. Every discipline has its own particular language or jargon and happily I now have some understanding of waveforms, equalisation curves and sampling rates. At the very least I know what to ask for if writing a specification for an audio digitisation project and have a wealth of references, resources and contacts at my fingertips if I need to find out more.