Anyone who knows anything about digital archiving knows that one of the best ways to ensure the longevity of your digital data is to plan for it at the point of creation.
If data is created with long-term archiving in mind, following a few simple, common-sense data management rules, then the resulting files are not only much easier for the digital archivist to manage in the future, but also easier for the creator to work with. How much easier is it to locate and retrieve files that are ordered in a sensible, logical hierarchy of folders and named in a helpful way? We are producing more and more data over time, and as the quantity of data increases, so does the size of our problems in managing it.
We do not have many donors and depositors at the Borthwick who regularly put digital archives into our care but this picture will no doubt change over time. For those who do deposit digital archives, it is important that we encourage them to put good data management into practice and the earlier we speak to them about this the better.
'Bitrot', from the Digital Preservation Business Case Toolkit (http://wiki.dpconline.org/)
Last week I was fortunate enough to be invited to speak to a group of people from one of our depositor organisations who are likely to start giving us digital data to archive in the future. They were from a large organisation with no central IT infrastructure and many people working from home on their own computers. Good data management is particularly important in this sort of scenario. This was a great opportunity for me to test out what could become the basis of a standard presentation on digital data management techniques that could be delivered to our donors and depositors.
I started off talking about what digital preservation is and why we really need to do it. It is always handy to throw in a few cautionary tales at this point as to what happens when we don't look after our data. I think these sorts of stories resonate with people more than just hearing the dry facts about obsolescence and corruption. I made a good plug for the 'Sorry for your data loss' cards put together by the Library of Congress earlier this year as this is something that any of us who have experienced data loss can relate to.
I then moved on to my own recent tale of digital rescue, using the 5 1/4 inch floppy disks from the Marks and Gran archive as my example (discussed in a previous blog post). This was partly because it is my current pet project, but also because it is a good way to illustrate the real issues of hardware and software obsolescence and how we can work around them.
In the last section of the presentation I gave out my top tips on data management. I wanted the audience to go home with a positive sense of what they can start to do immediately in order to help protect their data from corruption, loss or misinterpretation.
Much of what I discussed in this section was common sense stuff really. Topics covered included:
- how to name files sensibly
- how to organise files well within a directory structure
- how to document files
- the importance of backups
- the importance of anti-virus software
The presentation went well and sparked lots of interesting questions and debate. It was encouraging to see just how accessible this topic is to a non-specialist audience. Some of the questions raised related to current 'hot topics' in the digital archiving world which I hadn't had time to cover in any depth in my presentation:
- How do you archive e-mails?
- Is cloud storage safe?
- What is wrong with PDF files?
- What is the life span of a memory stick?
I had an answer to all of these apart from the last one; I have since found out that the answer is 'it depends'. I have recently been told on Twitter that most digital preservation questions can be answered in this way!