Impressions of the ‘Hadoop-driven digital preservation Hackathon’ in Vienna
More than 20 developers attended the ‘Hadoop-driven digital preservation Hackathon’ in Vienna, which took place in the baroque “Oratorium” room of the Austrian National Library from 2 to 4 December 2013.
Scalable Environments for File Format Identification and Characterisation
This webinar provides an introduction to file format identification and characterisation tools which have been developed or extended as part of the SCAPE Project.
It covers the basic principles of file format identification, and shows how format information drives digital preservation workflows.
Participants will be given an overview of file format registries, and their role in digital preservation, and will see demonstrations of identification and characterisation tools including fido and tika.
We will provide a Virtual Machine image with sample files and step-by-step worksheets so that participants can try out these exercises for themselves, with support, after the webinar.
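As a rough illustration of the basic principle behind signature-based identification, the sketch below matches the leading bytes of a file against a handful of well-known magic numbers. It is a toy under stated assumptions, not how fido or tika are implemented: the `SIGNATURES` table and the function names `identify_bytes` and `identify_file` are hypothetical, and a real tool would consult a full format registry rather than this small subset.

```python
# Toy signature-based identification: compare the first bytes of a file
# with known "magic numbers". The table is a tiny illustrative subset,
# not a real format registry.
SIGNATURES = [
    (b"%PDF-", "PDF"),
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"\xff\xd8\xff", "JPEG"),
    (b"\x00\x00\x00\x0cjP  \r\n\x87\n", "JP2 (JPEG 2000 Part 1)"),
    (b"GIF87a", "GIF"),
    (b"GIF89a", "GIF"),
]


def identify_bytes(header: bytes) -> str:
    """Return a format name for the given leading bytes, or 'unknown'."""
    for magic, name in SIGNATURES:
        if header.startswith(magic):
            return name
    return "unknown"


def identify_file(path: str) -> str:
    """Read the first 16 bytes of a file and identify them."""
    with open(path, "rb") as handle:
        return identify_bytes(handle.read(16))
```

Identification of this kind only says what a file claims to be. Characterisation, which extracts properties and checks validity, needs format-specific tooling on top.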
Learning outcomes (by the end of the webinar and exercises, participants will be able to):
- Distinguish between different file types and identify the requirements for characterising each of them.
- Carry out identification and characterisation experiments on example files.
- Compare characterisation and identification tools and understand their advantages and disadvantages when used in different scenarios.
Session Lead: Carl Wilson, OPF
Date: Friday 25 October
Time: 12 noon BST / 13:00 CET
Duration: 1 hour (please note this includes the presentation and demonstrations. Practical exercises can be carried out after the webinar).
There are 25 places available, which will be allocated on a first come, first served basis.
Software Museums (Archives)
During and around this year’s iPRES a couple of discussions sprang up around the topic of proper software archiving, which was also part of the DP challenges workshop discussions. With services emerging around emulation as e.g. developed in the bwFLA project (see e.g.
ebooks: what do we care (for)?
Last Friday I ran a workshop at the BL trying to identify what I guess we might call significant properties of ebooks.
Identification of PDF preservation risks: the sequel
SCAPE Planning and Watch: Two years and a bit more
SCAPE Training Event – Future Formats First: Building Application Infrastructures for Action Services
Overview
This workshop is the second event in the SCAPE project training programme. It will focus on using tools and workflows to carry out digital preservation actions at scale.
It will begin with an introduction to scalability and will present techniques for using a scalable platform with common preservation tools.
By building on a real use case from the British Library, delegates will gain hands-on experience in migrating a large volume of image files to the JPEG 2000 format, verifying each migration against the original file using tools including ImageMagick, jpylyzer and Matchbox.
Delegates will learn about building workflows to invoke multiple operations, and how to share and discover other workflows. By building a scalable environment using Hadoop and Taverna, delegates will then be able to execute their workflow at scale, performing multiple simultaneous migrations and verifications.
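The migrate-then-verify pattern described above can be sketched in a few lines. This is only a minimal local illustration, assuming a thread pool in place of Hadoop and Taverna: `Result`, `migrate_and_verify` and `run_batch` are hypothetical names, and the `migrate` and `verify` callables are placeholders for whatever conversion and comparison tools a real workflow would invoke.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, NamedTuple


class Result(NamedTuple):
    source: str     # path of the original file
    target: str     # path of the migrated file
    verified: bool  # True if the migration passed the check


def migrate_and_verify(
    source: str,
    migrate: Callable[[str], str],
    verify: Callable[[str, str], bool],
) -> Result:
    """Migrate one file, then check the output against the original."""
    target = migrate(source)
    return Result(source, target, verify(source, target))


def run_batch(
    sources: Iterable[str],
    migrate: Callable[[str], str],
    verify: Callable[[str, str], bool],
    workers: int = 4,
) -> List[Result]:
    """Run several migrations and verifications concurrently.

    Results are returned in the same order as the input sources.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(
            pool.map(lambda s: migrate_and_verify(s, migrate, verify), sources)
        )
```

Keeping migration and verification as separate steps means a failed check can be reported per file without stopping the rest of the batch.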
Learning Outcomes (by the end of the training event the attendees will be able to):
- Understand scalable platforms and evaluate the situations in which such environments are required.
- Apply knowledge of existing tools to solve migration and quality control problems.
- Combine and modify tool chains in order to create automated workflows for migration and quality control.
- Implement best practice for discovering and sharing workflows for use and re-use.
- Make use of a scalable environment and apply a number of workflows to automatically perform migration and quality assurance checks on a large number of objects.
- Identify a number of potential problems when working in a scalable environment and propose solutions.
- Understand the potential to use scalable platforms in digital preservation and synthesise new opportunities within your own environments.
Delegates will receive a certificate of attendance for the training course.
Agenda
The draft agenda is available here: SCAPE Future Formats First Agenda
The event will be conducted in English.
Who should attend?
Practitioners (digital librarians and archivists, digital curators, repository managers, or anyone responsible for managing digital collections) with an interest in building digital preservation workflows using a variety of preservation tools, and then executing them at scale. To get the most out of this training course you will ideally have some knowledge or experience of digital preservation.
Developers who are interested in learning about digital preservation at scale.
Registration
Registration is now open at: http://scape-future-formats-first.eventbrite.co.uk/.
The cost for the two days is £90. Morning and afternoon coffee breaks and lunch will be provided and are included in the registration fee.
*Please ensure you bring your laptop with you so you can participate in the practical exercises.*
Registration will close on Friday 6 September.
Further information
Please visit the event wiki page for details about how to get to the venue, where to stay and how to prepare for the event.
To find out more about the SCAPE project visit: http://www.scape-project.eu/
Photograph © The British Library Board
ICC profiles and resolution in JP2: update on 2011 D-Lib paper
It’s been more than two years now since I wrote my D-Lib paper JPEG 2000 for Long-term Preservation: JP2 as a Preservation Format. From time to time people ask me about the status of the issues that are mentioned in that paper, so here’s a long overdue update.
Open Research Challenges in Digital Preservation: Call for contributions!
Following the community response to our workshop last year, we want to invite you again to contribute your future preservation challenge!
Preservation capabilities: How to assess? How to improve?
Digital preservation is making progress in terms of tool development, the gradual establishment of standards and increasing activity in user communities, but there is still a wide gap when it comes to approaches for systematically assessing, comparing and improving how organizations go about achieving their preservation goals.




