Webinar | Open Planets Foundation

Webinar: Tools for uncovering preservation risks in large repositories

Overview

An important part of digital preservation is analysing content to uncover the risks that hinder its preservation. This analysis entails answering many questions, such as: Which file formats do I have? Are there any invalid files? Are there any files violating my defined policies?
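As a rough illustration of these questions, the sketch below tallies file extensions and flags files that fall outside an allowed set. It is not how Scout or C3PO work: the ALLOWED_EXTENSIONS policy, the helper names and the sample file names are all hypothetical, and a file extension is only a crude stand-in for real format identification.

```python
from collections import Counter
from pathlib import Path

# Hypothetical policy: the extensions this collection is allowed to hold.
ALLOWED_EXTENSIONS = {".pdf", ".tif", ".xml"}

def profile_formats(paths):
    """Count files per lower-cased extension (a crude proxy for format)."""
    return Counter(Path(p).suffix.lower() or "(none)" for p in paths)

def policy_violations(paths, allowed=ALLOWED_EXTENSIONS):
    """Return the paths whose extension is not in the allowed set."""
    return [p for p in paths if Path(p).suffix.lower() not in allowed]

# Hypothetical sample collection.
sample = ["report.pdf", "scan.TIF", "notes.doc", "README"]
profile = profile_formats(sample)
violations = policy_violations(sample)
```

Here profile answers "Which file formats do I have?" and violations answers "Are there any files violating my defined policies?". Detecting invalid files needs a real validation tool and is left out.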

The threats to preserving content come from many distinct domains, from technological to organizational, economic and political, and can relate to the content holder, the producers or the target communities for which the content is primarily intended.

Scout, the preservation watch system, centralizes all the necessary knowledge on the same platform, cross-referencing this knowledge to uncover all preservation risks. Scout automatically fetches information from several sources to populate its knowledge base. For example, Scout integrates with C3PO to get large-scale characterization profiles of content. Furthermore, Scout aims to be a knowledge exchange platform, to allow the community to bring together all the necessary information into the system. The sharing of information opens new opportunities for joining forces against common problems.

This webinar demonstrates how to identify preservation risks in your content and, at the same time, share your content profile information with others to open new opportunities.

Learning outcomes

In this webinar you will learn how to:

characterise collections and use C3PO to easily inspect the content characteristics
integrate C3PO with Scout and publish content profiles online
use Scout to automatically monitor your content profile
monitor preservation risks by cross-referencing your content profile with policies, information from the world, and even content profiles from peers

There are 23 places available on a first come, first served basis.

Date: Thursday 26 June

Time: 14:00 BST / 15:00 CET

Duration: 1 hour

Session Lead: Luis Faria, KEEP SOLUTIONS

Date:

26 June 2014

Event Types:

Webinar

SCAPE Digital Preservation Policy webinar

The SCAPE project focuses on large-scale digital preservation, and managing large-scale collections requires working within a policy framework. Developing policy in a new and changing area can be hard, so the SCAPE Policy Representation team has created a three-tier policy framework and a catalogue of policy elements that organisations can use as a starting point for creating their own preservation procedure policy. SCAPE has also produced automated watch and planning tools that can use machine-readable policy to ensure appropriate outcomes; the process SCAPE suggests for this will be outlined.

The webinar will cover the SCAPE three-level policy framework, discuss the catalogue and how it might be used, and describe how machine-readable policy can be derived from the catalogue elements.

Who should attend?
This webinar will be of interest to those responsible for making digital preservation policy, or those who wish to use their existing policy in SCAPE watch and planning tools (SCOUT and Plato). Note that use of these tools will not be discussed.

There are twenty-five places available on a first come, first served basis.

Date: Wednesday 28 May
Time: 14:00 BST / 15:00 CET
Duration: 1 hour
Session Lead: Catherine Jones, Science and Technology Facilities Council & Barbara Sierman, National Library of the Netherlands

Date:

28 May 2014

Event Types:

Webinar

Link:

Registration

SCAPE Webinar: ToMaR – The Tool-to-MapReduce Wrapper: How to Let Your Preservation Tools Scale

Overview

When dealing with large volumes of files, e.g. in the context of file format migration or characterisation tasks, a standalone server often cannot provide sufficient throughput to process the data in a feasible period of time. ToMaR provides a simple and flexible solution to run preservation tools on a Hadoop MapReduce cluster in a scalable fashion.
ToMaR makes it possible to use existing command-line tools and Java applications in Hadoop’s distributed environment in much the same way as on a desktop computer. By utilizing SCAPE tool specification documents, ToMaR allows users to specify complex command-line patterns as simple keywords, which can be executed on a computer cluster or a single machine. ToMaR is a generic MapReduce application which does not require any programming skills.
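The idea of exposing a command-line pattern as a keyword can be sketched in plain Python. This is not ToMaR's tool specification format or its API: the TOOLSPECS table, the keyword names, the build_command and map_records helpers and the file names are all hypothetical, and the list comprehension merely stands in for a map phase running on a single machine.

```python
import shlex
from string import Template

# Hypothetical keyword-to-pattern table (illustrative only, not the SCAPE format).
TOOLSPECS = {
    "identify": Template("file --brief $input"),
    "to-png": Template("convert $input $output"),
}

def build_command(keyword, **params):
    """Expand a keyword and its parameters into an argument list."""
    # Quote each value so that file names with spaces stay one argument.
    quoted = {name: shlex.quote(value) for name, value in params.items()}
    return shlex.split(TOOLSPECS[keyword].substitute(quoted))

def map_records(records):
    """Stand-in for a map phase: build one command per input record."""
    return [build_command(keyword, **params) for keyword, params in records]

# Hypothetical input records: a keyword plus its parameters.
records = [
    ("identify", {"input": "scan 01.tif"}),
    ("to-png", {"input": "scan 01.tif", "output": "scan 01.png"}),
]
commands = map_records(records)
```

Each entry in commands is an argument list that could be passed to subprocess.run on a machine where the tools are installed; on a cluster, the records would be split across many workers instead.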

This webinar will introduce you to the core concepts of Hadoop and ToMaR and show you by example how to apply it to the scenario of file format migration.

Learning outcomes

1. Understand the basic principles of Hadoop
2. Understand the core concepts of ToMaR
3. Apply knowledge of Hadoop and ToMaR to the file format migration scenario

Who should attend?

Practitioners and developers who are:

• dealing with command-line tools (preferably from the digital preservation domain) in their daily work
• interested in Hadoop and how it can be used for binary content and 3rd-party tools

Session Lead: Matthias Rella, Austrian Institute of Technology

Time: 10:00 GMT / 11:00 CET

Duration: 1 hour

Date:

21 March 2014

Event Types:

Webinar

Link:

Registration

OPF Webinar – From the Preservation Toolkit: JHOVE2

This webinar will give an overview of JHOVE2, the free and open-source tool for characterizing digital objects. It will cover the motivation for creating a second-generation version of JHOVE and some of the new features of the tool, including the ability to perform not just format identification, validation, and feature extraction, but also assessment (a policy-based determination of the acceptability of a format instance, regardless of its validation). It will discuss JHOVE2’s more sophisticated data model of a format instance, embracing complex digital objects that can be composed of more than one file, each of a possibly different format. It will provide pointers on JHOVE2’s setup and use. It will briefly introduce the tool’s architecture, and ways in which it has been and can continue to be extended to include more formats, building on existing libraries and tools.

There are twenty-five places available on a first come, first served basis.

Date: Friday 31 January

Time: 09:00 EST / 14:00 GMT / 15:00 CET

Duration: 1 hour

Session Lead: Sheila Morrissey, Portico

Date:

31 January 2014

Event Types:

Webinar

OPF Webinar: Securing funding for your digital preservation, with SPRUCE

Making the case to your organisation’s management, or to external funders, to adequately resource your digital preservation activities is not an easy task. Digital preservation is not always a straightforward sell. In this financial climate the justification for spending money has to be compelling and watertight. In this webinar Paul Wheatley will describe how to make the case for funding your digital preservation, with reference to the SPRUCE Project’s Digital Preservation Business Case Toolkit.

* Making a compelling case to fund digital preservation
* The Digital Preservation Business Case Toolkit from SPRUCE
* Getting started
* Other resources

There are twenty-five places available on a first come, first served basis.

Date: Wednesday 27 November

Time: 14:00 GMT / 15:00 CET

Duration: 1 hour

Session Lead: Paul Wheatley, SPRUCE Project Manager, University of Leeds

Date:

27 November 2013

Event Types:

Webinar

Scalable Environments for File Format Identification and Characterisation

This webinar provides an introduction to file format identification and characterisation tools which have been developed or extended as part of the SCAPE Project.

It covers the basic principles of file format identification, and shows how format information drives digital preservation workflows.

Participants will be given an overview of file format registries, and their role in digital preservation, and will see demonstrations of identification and characterisation tools including fido and tika.

We will provide a virtual machine image with sample files and step-by-step worksheets so that participants can try out these exercises for themselves, with support, after the webinar.

Learning outcomes (by the end of the webinar and exercises, participants will be able to):

Distinguish between different file types and identify the requirements for characterising each of them.
Carry out identification and characterisation experiments on example files.
Compare characterisation and identification tools and understand their advantages and disadvantages when used in different scenarios.

Session Lead: Carl Wilson, OPF
Date: Friday 25 October
Time: 12 noon BST / 13:00 CET
Duration: 1 hour (please note this includes the presentation and demonstrations. Practical exercises can be carried out after the webinar).

There are 25 places available, which will be allocated on a first come, first served basis.

Date:

25 October 2013

Event Types:

Webinar

Link:

Webinar registration

OPF Webinar – Digital library development and practice at the London School of Economics

This webinar will present a case study of digital preservation and digital library development at the London School of Economics. It will cover the nature of digital library collections we are working with now and a bit about our experiments and future directions for other kinds of born-digital material; the high-level architecture and functional components we have in place, and a discussion about our general approach and what we feel we can avoid having an opinion about for now; discussion of our user experience design process and how we are integrating this way of thinking into other areas of the library like our main website; and a bit about how we made the case to fund digital preservation and the development of our core team and how we involve others within the library.

Session lead: Ed Fay, Digital Library Manager, London School of Economics

Time: 14:00 BST / 15:00 CET

There are 25 places available, which will be allocated on a first come, first served basis. Registration will open soon.

Date:

23 September 2013

Event Types:

Webinar

Link:

Webinar registration

OPF Webinar – Capturing and Analyzing Forensic Disk Images with BitCurator

*all places have now been filled*

In this webinar, we’ll be examining the benefits of capturing and preserving forensically-packaged disk images in collecting institutions. Along the way, we’ll get some hands-on experience with the open source BitCurator environment, freely available as a virtual machine download from http://wiki.bitcurator.net/.

Some of the topics we’ll be exploring:
– Forensic disk image formats, and capturing forensic disk images with Guymager
– Extracting and analyzing potentially private and sensitive information from imaged media
– File system analysis using The Sleuth Kit and fiwalk
– Generating reports using custom BitCurator tools

Session lead: Kam Woods, School of Library and Information Science, University of North Carolina

Time: 10:00 EDT / 15:00 BST / 16:00 CET

Duration: 1 hour

There are 25 places available. These will be allocated on a first come, first served basis.

Date:

9 August 2013

Event Types:

Webinar

Link:

Registration

OPF Webinar – bwFLA Part II CD-ROM Ingest and Scientific Processes

Demonstrating Digital Preservation on Demand: Emulation-as-a-Service (EaaS) – CD-ROM Ingest and Scientific Processes.

This webinar will cover:

* bwFLA Digital Preservation Demonstration
* bwFLA and CD-ROM Ingest
* bwFLA and Scientific Processes

Session Lead: Annette Strauch, University of Ulm

Time: 12 noon BST / 13:00 CET

Date:

26 June 2013

Event Types:

Webinar

Link:

Webinar registration

Webinar – C3PO, an introduction to content profiling

This webinar will focus on content profiling and preservation planning. It aims to address the following questions:

Why do we need identification and characterisation?
How can we use the metadata that these processes provide?

It will cover some of the tools that are available, and there will be a demonstration of the C3PO (Clever, Crafty, Content Profiling of Objects) tool and an explanation of how you can analyse the metadata it produces.

Session Lead: Petar Petrov, Creative Pragmatics

Time: 13:00 BST / 14:00 CET

Date:

31 May 2013

Event Types:

Webinar

Link:

Webinar registration