Authors: Martin Schaller, Sven Schlarb, and Kristin Dill
In the SCAPE Project, the memory institutions are working on practical application scenarios for the tools and solutions developed within the project. One of these application scenarios is the migration of a large image collection from one format to another.
Learning to Think Like a Package Maintainer
Lots of great digital preservation applications and services exist; however, very few are actively maintained and thus preserved! This is a big problem! By introducing the steps to develop these and engage the support of the community, this training course looks at what can be done to improve this situation. Specifically, this training course looks at how to prepare packages for submission into the very heart of many digital environments: the operating system and directly associated “app-stores”. Attendees will be given hands-on experience with developing and maintaining packages rather than software, and key differences will be discussed and evaluated. Better preservation of preservation tools means better preservation of our digital history.
Learning Outcomes (by the end of the training event the attendees will be able to):
- Understand the complexities of package management and distinguish between the different practices relating to both package objectives and chosen programming language.
- Carry out advanced package management operations in order to critically appraise current packages and propose changes.
- Understand the importance of clearly defined versioning and licenses and the role of clear documentation and examples.
- Apply best practice techniques in order to create a simple package suitable for long term maintenance.
- Evaluate a number of options for managing package configuration and behavior relating to package installation, removal, upgrade and re-installation.
- Analyse opportunities for automating package management and releases, maintaining a clear focus on the user and not the developer.
- Critically evaluate opportunities to generalise package management to allow the easy building and maintenance of packages on multiple platforms.
- Assess the potential to apply package management techniques in your own environment.
Delegates will receive a certificate of attendance for the training course.
The agenda can be seen here: http://wiki.opf-labs.org/display/SP/Agenda+-+Preserving+Your+Preservation+Tools.
Registration is now open! https://scape-preserving-tools.eventbrite.co.uk
- Travis compiles the projects and executes unit tests whenever a new commit is pushed to GitHub, or when a pull request is submitted to the project.
- Jenkins builds are generally scheduled once per day. After a build, the software has its code quality analysed by Sonar.
It's been more than two years now since I wrote my D-Lib paper JPEG 2000 for Long-term Preservation: JP2 as a Preservation Format. From time to time people ask me about the status of the issues that are mentioned in that paper, so here's a long overdue update.
As part of our work on test-beds for the SCAPE project we have been investigating the various ways in which a large-scale file format migration workflow could be implemented. The underlying technologies chosen for the platform are Hadoop and Taverna. One of the aims of the SCAPE project is to allow the automatic generation and execution of Taverna workflows, which will be executed via Hadoop.
The four methods for implementing a file format migration workflow that we tested were:
- Batch execution of a shell script (no parallelisation)
- A workflow written in/controlled from Java, run on Hadoop
- A workflow written in/controlled from Taverna, run on Hadoop
- A workflow written in Taverna, calling an XML defined unit of execution in Hadoop
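The difference between the first method and the parallel ones can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `migrate_one` is a hypothetical stand-in for the real conversion step, the `.tif` to `.jp2` file names are assumed for the example, and a local thread pool stands in for the distribution that Hadoop would provide. It is not the workflow that was actually tested.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import PurePosixPath


def migrate_one(source):
    """Hypothetical per-file step: only computes the target file name.

    A real workflow would call a converter and a validator here.
    """
    return str(PurePosixPath(source).with_suffix(".jp2"))


def migrate_batch(sources):
    """Sequential baseline: one file after another, like a shell loop."""
    return [migrate_one(source) for source in sources]


def migrate_parallel(sources, workers=4):
    """Same per-file step, fanned out over local worker threads.

    The thread pool is an assumption standing in for Hadoop map tasks.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as the inputs.
        return list(pool.map(migrate_one, sources))
```

Because the per-file step is independent of every other file, the same function can be reused unchanged in both variants; only the way it is scheduled differs.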
I've already written a number of blog posts on format validation of JP2 files. Format validation is only one aspect of a quality assessment workflow. Digitisation guidelines typically impose various constraints on the technical characteristics of preservation and access images. For example, they may state that a preservation master must be losslessly compressed, and that its progression order must be RPCL. A format profile is a set of such technical constraints.
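To make the idea of a format profile concrete, here is a minimal sketch in Python. The property names (`lossless_compression`, `progression_order`) and the dictionary layout are assumptions made for this illustration; they do not reflect the output schema of any particular JP2 tool.

```python
# Hypothetical profile for a preservation master: each key is a technical
# property and each value is the only value the profile accepts.
PRESERVATION_MASTER_PROFILE = {
    "lossless_compression": True,
    "progression_order": "RPCL",
}


def check_profile(properties, profile):
    """Return (name, expected, actual) for every constraint that is violated."""
    failures = []
    for name, expected in profile.items():
        actual = properties.get(name)  # None if the property was not extracted
        if actual != expected:
            failures.append((name, expected, actual))
    return failures


# Example image that is lossless but uses a different progression order.
image_properties = {"lossless_compression": True, "progression_order": "LRCP"}
failures = check_profile(image_properties, PRESERVATION_MASTER_PROFILE)
for name, expected, actual in failures:
    print(f"{name}: expected {expected!r}, found {actual!r}")
```

An empty result means the image conforms to the profile. In a real workflow the properties would first have to be extracted from each file, and a separate profile would be defined for access images.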