Per Møldrup-Dalum's blog | Open Planets Foundation

A Weekend With Nanite

Well over a year ago I wrote the “A Year of FITS” blog post (http://www.openplanetsfoundation.org/blogs/2013-01-09-year-fits) describing how we, over the course of 15 months, characterised 400 million harvested web documents using the File Information Tool Set (FITS) from Harvard University. I presented the technique and the technical metadata and basically concluded that FITS didn’t fit that kind of heterogeneous data in such large amounts. Since that experiment, FITS has been improved in several areas, including the code base and the organisation of its development, and it could be interesting to see how far it has evolved for big data. Still, FITS is not what I will be writing about today. Today I’ll present how we characterised more than 250 million web documents, not in 9 months, but during a weekend.

Submitted by Per Møldrup-Dalum on 28 May 2014 – 9:30pm

Standing on the Shoulders of Your Peers

Notes on the Hadoop Driven Digital Preservation Hackathon in Vienna 2012 — or — How I Learned to Grunt Magic Spells.

Submitted by Per Møldrup-Dalum on 23 January 2014 – 9:01am

A Year of FITS

From November 2011 until November 2012, we at the State and University Library in Denmark continually ran the FITS tool on harvested web resources. This blog post presents performance data on that job and shows how FITS performs when fed nearly 12 TB of web resources.

Submitted by Per Møldrup-Dalum on 9 January 2013 – 2:23pm
