
NCDD symposium: “The NCDD could be a bit more assertive” (#hvog)


Published: Thu 26 Jan 2012
by Inge Angevaare – I was so closely involved (in all sorts of ways) in last Tuesday’s NCDD symposium that a more or less objective report is out of the question. But not blogging about this event is obviously not an option, so:
NCDD symposium “Build a house for our digital memory” (Bouw een huis voor ons digitaal geheugen)

Photo: KB Optische Technieken, Jos Uljee

The turnout was impressive. More than 160 registrations and some 140 actual attendees – that means the topic is alive. In conversations I did discover that expectations varied widely. People still come to these meetings hoping that the NCDD has a ready-made solution on the shelf, and when we then say that we do not have one, indeed that such solutions simply do not exist, they are disappointed. Understandably, of course. All of us still have to get used to a digital world that changes so fast that – as a public organisation with limited means – you always feel you are chasing after events. Yet that is how it is, and that is how it will remain.

Inge Angevaare

Photo: KB Optische Technieken, Jos Uljee

“You should not take the ‘House for our digital memory’ too literally. I myself think rather of the grand staircase at Hogwarts (Harry Potter): floating stairs that move and keep making new connections; paintings that are alive and join the conversation, a kind of crowd-sourcing avant la lettre. Only we have no magic wands.” (IA)

In the discussions I noticed that small institutions in particular ask for rules, for certainty. “Just tell us what to do.” And: “The NCDD could be a bit more assertive, impose a few more rules.” Among those who would have to impose those rules, say the large institutions that are already well advanced with digital preservation, I sense a reluctance to give advice, because they themselves do not yet know what that advice will be worth in the long run. This is a new profession and there are still many uncertainties. Technology presents us with new challenges every day, and the IT world can hardly think more than five years ahead. That makes the experienced institutions cautious, because they do not want to make mistakes and do not want to face claims if something turns out not to work. That is understandable too.

Chair of the day Karin van der Heiden and NCDD chair Bas Savenije on the mutual exchange of back-ups between the KB and Beeld en Geluid: "More institutions should do that."

During a parallel session of the Storage working group, Martin Berendse of the Nationaal Archief (the Dutch National Archives) set the cat among the pigeons. “Which of us can say right now that his IT management is completely in order?” Of the fifty pairs of hands, two went up …

Parallel session on storage: "Maybe the NCDD should become a kind of storage broker, between those who have space to spare and those who need space."

All of this was perhaps not pleasant, but it was useful. It is very important that we all understand what kind of world we have landed in, so that we adjust our expectations and get used to the fact that there simply are no guarantees.

You can get to work without guarantees too

Once we have all accepted that we are pioneering and that there are no guarantees yet, we can finally really get to work.

The Preservation working group ponders pressing questions from the audience. From left to right: Mette van Essen (NA), Jeanine Tieleman (DEN), Barbara Sierman (KB) and Robert Gillesse (DEN).

Barbara Sierman of the KB made a practical suggestion: maybe we do not have “best practices” yet, but we can show what we do and why we do it. Put a few of those approaches side by side and you have a wealth of information for institutions that have to make choices. Because it is of course not the case that we know nothing. Come on, we already know quite a lot; the Netherlands more than holds its own internationally.

The fact that only two hands went up after Martin Berendse’s question was put into perspective again during the coffee break. “Everything in order? No, we would not dare say that, but we are already well on our way.”

During the discussion: working group chairs Jeffrey van der Hoeven (left, Storage) and Giovanna Fossati (Preservation)

I always like to look at digital preservation as a matter of managing risks. Every step, however basic, can then reduce the risks your collections run a bit further and make a noticeable contribution. Sometimes that contribution can even consist of accepting certain risks, because you have decided that a particular collection is not that important to your organisation, or because in an emergency you could digitise a document again. If you make such a decision deliberately, at least you no longer have to lie awake over it, and you have energy and resources left to tackle the real risks.

Photo: KB Optische Technieken, Jos Uljee

Key function: knowledge

In managing risks, knowledge plays a key role. Knowledge about what is possible and what is not, knowledge about how tasks are divided, knowledge about how to build up a policy for digital material and which choices you can and must make in doing so, knowledge about which suppliers there are and what they offer. If you have that in house, you are already well on your way. But that knowledge is fragmented and hard to find. That is why I had been walking around for quite some time with ideas for an NCDD knowledge centre. Not to redo what has already been collected elsewhere, but to bring it together and make connections.

Photo: KB Optische Technieken, Jos Uljee

But you do hesitate for a moment when you take on something like that. Launching a knowledge centre is no great feat, but maintaining it takes a great deal of energy. And how do you build it so that people really benefit from it? Still, I could not resist. During the Christmas holidays, together with Bert Bulder, my webmaster, I set up a structure using very cheap technical means. This project was really far from ready to be presented. The working groups had seen only a few screenshots and the NCDD board had seen nothing at all. But because there was so much demand for knowledge and connection on Tuesday, I decided to show the prototype to the audience. And the response was very enthusiastic, however unfinished the work is. Petra Links of NIOD tweeted: “It is getting more and more concrete after all: online knowledge base, brokerage for storage, do’s and don’ts, #joepie!”

After I had shown the demo version, chair of the day Karin van der Heiden accidentally said that the site would be live from Wednesday. I then had to step in and explain that it was still a demo version. But the audience would not settle for that. “Don’t wait until everything is finished!” was the mood in the room: just put it live. On the spot, the board members present changed their minds. “Go ahead, put it live.” That was wonderful – and honestly, none of this was planned. This is the result of the creative energy that arises when you put a group of people who all basically want the same thing in one room.

The incubator baby still needs a lot of care

So we now have an online knowledge centre, but it is still an incubator baby; I will leave the birth announcements in the cupboard for now. The great advantage of the structure is that everyone (yes, you too!) can help build the information. Every page has a comment field. And I do so hope that you will do that: share your knowledge. Rap me over the knuckles when I write something that is not right. And do not be afraid to show different perspectives, as Barbara suggested. Not imposing, but showing that there are different opinions. For example: the LOCKSS network chooses to store the bits but not to migrate anything to standards [with an explanation why, and sources]; the KB and, for example, the National Archives of Denmark take a different view [with an explanation why, and sources].

Also very important: do try to contribute links to relevant information elsewhere. We are not going to reinvent the wheel here; we are going to connect existing wheels as much as possible.

Infrastructure à la NCDD: policy, storage, software, guidelines, money, knowledge, and people, people, people

The interim state of the infrastructure

The NCDD is building an infrastructure (knowledge, people, facilities, services, money) for the public sector, and the symposium showed the interim results of two working groups, storage and sustainable management/preservation (see PowerPoint slides here). The parallel sessions gave the NCDD a wealth of information about how we need to adjust certain things. The demarcation of tasks between storage on the one hand and sustainable management/preservation on the other does not yet seem to be good enough. We are going to discuss that in the board. I also sensed a considerable need, especially among small institutions, for ready-made solutions that they can implement with little manpower. The Preservation working group spoke of “help with implementation”, but maybe that should go much further and take the form of ready-made services, in IT terms digital preservation as a service (DPaS?). Beeld en Geluid pioneered such an approach a few years ago with ProArchive. At the time it did not quite succeed in making it viable. But I think the direction was right. We are going to discuss that too.

Finally, on Tuesday I made contact with a small institution that may act as a guinea pig, to make it all very concrete and practical and to tempt the working groups into workable solutions. We are still in talks, so I will not name names just yet, but it could become a great project from which we can all learn.

In short, to borrow the last slide from my own presentation last Tuesday: there is work to be done!

For those who followed the WOB (the Dutch freedom of information act) discussion between Ingmar Koch and the KB: Ingmar and Bas Savenije got into an animated conversation during the NCDD day.

The presentations of the day are here.

Other blogs about this day and/or the national infrastructure:

Tweets: #hvog.


How to measure success – and work on weaknesses (Austin PASIG, 7)


Published: Mon 16 Jan 2012

As David Giaretta (Alliance for Permanent Access) would put it: “Measuring success in digital preservation is easy – if we have a 100 years or more.” But of course we are an impatient bunch, and funders especially want to know that their money is making a real difference. And yet, in an emerging field such as digital preservation it is not easy to develop objective and measurable quality criteria.

David Giaretta himself came forth at the conference to present a European framework project which is building a three-tiered approach to certification of trusted digital repositories for digital preservation: from the light-weight instrument Data Seal of Approval (which was developed separately by the Dutch DANS archive), on to self-assessment and ending in a fully-fledged third-party audit based on ISO standards.

The European certification framework

But the question remains: what can you measure objectively? Giaretta’s approach revolves around metadata: each digital object must contain enough representation information (information about everything you need to use the object: hardware, software and, e.g., vocabularies) to enable the designated community (the clients) to use the information (see OAIS for terminology). In principle, Giaretta said, this is testable. I have described Giaretta’s work on representation information in more depth here and here.
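To make "testable" a little more concrete, here is a minimal sketch of what such a check could look like. The `DigitalObject` class, the three required kinds (taken from the examples named above: hardware, software, vocabularies) and the sample entries are all assumptions for illustration; they are not Giaretta's framework or the OAIS data model, which is far richer.

```python
from dataclasses import dataclass, field

# Hypothetical categories, borrowed from the examples in the post.
REQUIRED_KINDS = {"hardware", "software", "vocabulary"}


@dataclass
class DigitalObject:
    identifier: str
    # Maps a kind of representation information to a description of it.
    representation_info: dict[str, str] = field(default_factory=dict)


def missing_representation_info(obj: DigitalObject) -> set[str]:
    """Return the required kinds for which the object has no non-empty entry."""
    present = {
        kind for kind, text in obj.representation_info.items() if text.strip()
    }
    return REQUIRED_KINDS - present


# A made-up object that documents its software and vocabulary, but no hardware.
survey = DigitalObject(
    identifier="survey-1998",
    representation_info={
        "software": "any spreadsheet application able to read CSV",
        "vocabulary": "codebook explaining the column names",
    },
)

print(missing_representation_info(survey))  # {'hardware'}
```

A check like this only shows that something has been recorded under each heading. Whether the designated community can actually use it is the harder question an auditor would have to answer.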

David "have you read my book" Giaretta

The next question that drew some comments from the audience is: who is doing the testing? How can the system ensure consistency? Giaretta pleaded for the establishment of formal training and national accreditation boards. US consultant Bob “mister” Rogers added from an IT perspective that prospective auditors will need much more than knowledge of OAIS and representation information. They must know a lot about security issues and IT operations.

Bob "mister" Rogers

Need for a holistic approach

Rogers referred to a number of instruments that have been developed in the IT industry to take stock of cyber risks and assess the quality of systems (check out his slides when they become available). But, he warned, IT is only a part of the picture. “We need a holistic approach,” he stressed, that includes people and processes. The website www.datalossdb.org reveals, in Rogers’ words, “a staggering amount of data loss”. And, in case you wonder: intentional or unintentional mistakes by people are the most important cause of data loss:

slide Bob Rogers

Recurring conference theme: it’s people that make the difference

So, how do you audit people, I wonder. Remember Tom Cramer at bootcamp? It is all about the mindset, not about diplomas.

Bob Rogers gave the audience some insights into his consultancy practice during bootcamp. He stressed how important it is to talk with all of the stakeholders. Typically, records managers (and I would imagine librarians) want a thousand classifications; the IT folks will want everything to be nice and simple so it can be efficient; and the lawyers? They won’t tell you what they want – because lawyers are always vague about everything. In Rogers’ view, collaboration is the key – plus (of course!) an excellent consultant who asks the embarrassing questions.

“Ah, but does he get honest answers?” I asked Bob later. “Don’t most consultants write down what the customer wants to hear?” You can imagine what Bob’s answer was: a good consultant will be able to wriggle some change into politically correct language.

Measuring for improvement: the Digital Preservation Capability Maturity Model

At the end of the session, consultant Charles M. Dollar and his colleague Lori J. Ashley brought measuring to the next level: as a means to facilitate improvement. Interestingly, in the context of a conference about IT and digital preservation, Charles Dollar used a well-known IT model (CMMI), combined it with digital preservation standards such as OAIS (post to come) and TRAC (Trusted Repositories Audit and Certification), and built a “Digital Preservation Capability Maturity Model” (DPCMM), in other words: it measures how preservation-ready organizations are.

As a couple of Dutch organizations have been doing some work with the model in the Netherlands, it was a special pleasure for me to hear Charles and Lori talk about their approach during a 7 AM breakfast session. “Records managers,” Lori told me, “are overwhelmed by digital preservation. They simply don’t know where to start.”

Charles Dollar (left) and Lori Ashley of the Digital Preservation Capability Maturity Model

The DPCMM breaks up digital preservation into clearly defined components and scores them on a scale from 0 (nothing) to 5 (excellent):

The Dollar model

The resulting scores can then inspire prioritization and a roadmap for improvement:

“But how do you know where to begin?” I asked Lori, as that is a question that had come up in the Netherlands. Lori explained to me that that is where the consultant comes in. Through interviews he/she gets an acute sense of where something is glaringly missing and where the best opportunities for improvement can be found. A five-year improvement plan can then be drafted.
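As a rough illustration of how 0-to-5 scores could feed a first-pass priority list, here is a small sketch. The component names and scores are hypothetical, not the actual DPCMM components, and the lowest-score-first ordering is only a naive starting point; as Lori explained, the real prioritization comes from the consultant's interviews.

```python
# Hypothetical component names and scores, for illustration only.
scores = {
    "policy": 2,
    "ingest": 0,
    "storage": 3,
    "metadata": 1,
    "access": 4,
}


def validate(component_scores: dict[str, int]) -> None:
    """Reject scores outside the 0 (nothing) to 5 (excellent) scale."""
    for name, score in component_scores.items():
        if not 0 <= score <= 5:
            raise ValueError(f"{name}: score {score} is outside 0..5")


def weakest_first(component_scores: dict[str, int]) -> list[str]:
    """Order components from lowest to highest score (ties by name)."""
    validate(component_scores)
    return sorted(component_scores, key=lambda name: (component_scores[name], name))


priorities = weakest_first(scores)
print(priorities)  # ['ingest', 'metadata', 'policy', 'storage', 'access']
```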

And if an organization scores zeros all around or is overwhelmed by the model itself? “Even that is alright, because many are in the same boat,” says Lori. “The model is about hope, even if it’s baby steps. The important thing is to leverage people’s passion about their organization.”



Bootcamp’s Digital preservation 101 (Austin PASIG, 6)


Published: Sat 14 Jan 2012

Perhaps this should have been blog post #1 from Austin PASIG. Because this is where it all begins. Fortunately, Tom Cramer’s (Stanford University Libraries) introduction to the issues of digital preservation from DP Bootcamp merits reproducing at any time. Thanks, Tom, for making available the slides.

I will give you most of Tom’s slides with hardly any commentary – that’s how good they are.


Tom had some interesting examples of data loss – there will be a separate post about that.

Tom Cramer (left, here with Gordon Bell) was the driving force behind a very successful Austin PASIG 2012. Thanks, Tom!



About everything, everybody, and forever (Austin PASIG, 5)


Published: Sat 14 Jan 2012

Day 3 of Austin PASIG welcomed Gordon Bell, Principal Researcher at Microsoft. “I represent the demand side: I produce what you have to archive”, Gordon started out. He has been digitizing and digitally preserving everything about his personal and business life since 1998, to help/complement a human being’s faltering memory.

Gordon Bell (Microsoft): "Getting rid of a book is a good feeling for me."

Gordon predicts that life logging (which can be private, as opposed to “blogging” which is public) will become significant business for Microsoft. Bell: “With extreme life logging, all of us will have the ability to recall or have recalled everything we’ve ever said, seen and done … just like today’s political candidates.”

Gordon now captures about 1 GB a month (he gave the audience a few-minute preview), and he wrote a book about the experiment, Total Recall. He estimates that every single person’s life can be stored in about 10 terabytes of data. But good self-archiving tools are essential, and that is where Microsoft may see a business opportunity.
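A quick back-of-the-envelope check puts the two figures side by side. This is only a sketch: the 80-year life span and the decimal units (1 TB = 1000 GB) are assumptions, and the talk did not explain how the 10 TB estimate was derived.

```python
GB_PER_TB = 1000            # decimal units (assumption)
capture_gb_per_month = 1    # Gordon's current capture rate, from the talk
lifetime_tb = 10            # his estimate for a whole life, from the talk
lifetime_years = 80         # hypothetical life span

# How long would it take to fill 10 TB at today's 1 GB a month?
years_to_fill = lifetime_tb * GB_PER_TB / capture_gb_per_month / 12

# What capture rate would fill 10 TB in one assumed lifetime?
required_gb_per_month = lifetime_tb * GB_PER_TB / (lifetime_years * 12)

print(f"{years_to_fill:.0f} years at 1 GB/month")               # 833 years
print(f"{required_gb_per_month:.1f} GB/month over 80 years")    # 10.4 GB/month
```

On these assumptions, the 10 TB figure leaves room for roughly ten times the current capture rate, for instance for much richer media.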

Someone in the audience asked Gordon how much time he spends organizing (structuring, adding metadata, etc.) all that material. His answer? He hired an assistant to do that work …

Is this an archivist’s dream or an archivist’s nightmare, one wonders. Dan Stanzione of the Texas Advanced Computing Center (TACC) had few illusions about the billions of files he stores. “More than 90% is write-once-read-never stuff. We call it ‘scratch’ and purge it regularly.”

Dan Stanzione, Texas Advanced Computing Center

Dan also had a down-to-earth definition of forever: “For most people, we make ‘forever’ 5 years with some implied things about free renewal, and some language about providing,” echoing comments by other speakers that “forever” is really not a workable proposition.

Start small, start simple, start with five years, was the general consensus, and take it from there. Because if you try to take on more, you risk getting overwhelmed and not doing anything at all.


"Everybody" at Austin PASIG 2012



“How to avoid getting a 65-pound dog that meows like a cat” – about requirements (Austin PASIG, 4)


Published: Fri 13 Jan 2012

Memory institutions are notoriously bad at specifying (digital preservation) system requirements. Why would that be? Because we lack business acumen? Because digital preservation is still so new to most of our institutions? Because we lack IT knowledge? It is probably a combination of all of those. I was talking to Mike Thuman of Tessella, and he told me that most institutions end up copying some specifications or requirements that somebody else has drafted. And those will be copied with minor alterations by yet other institutions. In the end, you will get the monster in the title of this blog. Or you end up asking for a 4-wheel drive that averages 100 miles per gallon and fits into half a (European) parking space.

Mark Evans: "I could talk for hours about requirements."

So, Mark Evans of the same Tessella gave the Austin PASIG audience some vendor’s advice about requirements. And just in case you wonder, let me assure you: in this case you can trust the vendor, because poor requirements cause problems for everybody - both the vendor and the client!

Mark told the audience that the “requirement gathering process” is especially critical:

In conclusion, Mark offered some general suggestions:

Mark stressed that drawing up requirements for a new system is a real opportunity to take advantage of IT to fundamentally change the way an institution operates. At the same time, he stressed the need for realistic expectations. 4-wheel drives just don’t make 100 miles a gallon!

Many speakers, including Mark, noted that we have to work on a joint vocabulary if the digital preservation institutions and IT are to work together successfully. He recommended the glossary at www.archivists.org/glossary. I’ll be sure to check that out before completing the final reports of the NCDD working groups on storage and preservation.

Texas requirements outsize those of many others ...

Thanks to Mark Evans for making available the slides.

And: to be continued …



A vision of self-healing systems at jam-packed Austin PASIG (3)


Published: Thu 12 Jan 2012

The first full day of Austin PASIG has come to an end and for once I don’t quite know where to begin – or end, for that matter. There is so much to tell, and whereas usually there are some – what shall I say – less inspiring talks that enable me to start reporting to you about the good stuff, I had no such “luck” today. So I will give you some impressions this evening and report in more detail on individual presentations in separate posts in the days to come – probably well into next week ;-) Several speakers are graciously sharing their PowerPoint slides with me, so you won’t have to miss anything, just wait a little longer.

So, did Boot Camp (yesterday’s post) deliver? It sure did – unless of course you had expected every answer to be given to every question in the program. (But you are smarter than that, aren’t you?)

The prize for the most succinct summary of the field of digital preservation goes to Tom Cramer of Stanford (and of PASIG). At a truly boot camp-like pace he ran the audience through digital preservation 101 (separate post), concluding on a note that people tend to forget: the real key is a digital (preservation) mindset:

Slide: Tom Cramer, Stanford University.

Helen Tibbo of the North Carolina School of Library and Information Sciences gave her recipe for “wrangling the chaos”:

Showing a picture of DCC’s lifecycle model she pointed to the many functions in data management, saying: no one person or institution can do it all. So, find your own niche in what is doable in this process. Bridging the gap between IT and records management people is a key task.

Mark Evans of Tessella taught “us”, memory institutions, how to be “good, critical customers”. The very first requirement is … draft sound requirements. That does not mean saying “our system must be user friendly”, because such a statement will only provoke more questions; it means being precise about what a system must do in order to be user friendly (separate post).

slide: Mark Evans, Tessella

Don Post of iMerge Consulting struck a pessimistic note, not because of file format problems and the like, but because preservation starts at creation, and that is where so very very much still goes wrong. Just telling people to play by the rules is not going to solve that: we have got to make it easy for them by providing proper tools. Post is co-founder of the Saving the Digital World initiative. Post also talked about reinventing the wheel, concluding that we tend to do that because we do not know that the wheel is out there – but it is probably somewhere in a community we do not regularly work with or talk with. Open knowledge centers must help address this, and Saving the Digital World is working on that.

Don Post fending off the Digital Dark Age

Raymond Clarke of Oracle covered cloud computing and the many shapes it takes. His assertion was that cloud computing is maturing, that many of the present-day issues with regard to security, ILM, availability, monitoring, governance, sustainability and enterprise management, are in fact being addressed and will be solved (more on the cloud later).

Raymond Clarke of Oracle

A very impressive and visionary presentation came from Michael Petersen of SNIA (Storage Networking Industry Association). To him, trusted digital repositories and cloud computing are only steps on the way, early steps, that is. Because of our “physical” thinking in terms of repositories and platforms, we have a propensity to get lost in the weeds.

slide: Michael Petersen

Our end goal, according to Petersen, must be to make information truly portable, no longer depending on digital repositories (no matter how trusted they are) or any type of special platform. By making information platform-independent we will solve all issues of migration and emulation. And if we are to do this, we will need all the creative power of commercial industry. Both the information and the services must become totally virtual rather than physical. Petersen talked in terms of “self-healing” systems – systems that can repair glitches in their data. Now there is something to dream about!
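For readers who wonder what "repairing glitches in the data" might look like in its very simplest form, here is a toy sketch. It is not what Petersen presented: the replica names, the sample contents and the majority-vote rule are all assumptions for illustration. The idea is to keep several copies, compare their checksums, and overwrite any copy that disagrees with the majority.

```python
import hashlib
from collections import Counter


def digest(data: bytes) -> str:
    """SHA-256 checksum of a copy, as a hex string."""
    return hashlib.sha256(data).hexdigest()


def heal(replicas: dict[str, bytes]) -> list[str]:
    """Overwrite copies that disagree with the strict majority.

    Returns the names of the copies that were repaired.
    """
    counts = Counter(digest(data) for data in replicas.values())
    winner, votes = counts.most_common(1)[0]
    if votes * 2 <= len(replicas):
        raise ValueError("no strict majority; cannot tell which copy is good")
    good = next(data for data in replicas.values() if digest(data) == winner)
    repaired = []
    for name, data in replicas.items():
        if digest(data) != winner:
            replicas[name] = good  # replace the damaged copy
            repaired.append(name)
    return repaired


# Hypothetical copies at three sites; one has a flipped character.
replicas = {
    "site-a": b"annual report 2011",
    "site-b": b"annual rep0rt 2011",
    "site-c": b"annual report 2011",
}
repaired = heal(replicas)
print(repaired)  # ['site-b']
```

With only two copies that disagree, the sketch refuses to guess which one is good. That is one reason why keeping more than two copies matters.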

Michael Petersen (left) with Roger Cummings

However, Chris Wood told the audience, no matter how virtual the services become, “The bits have to live somewhere.” He went on to give a fascinating insight into the world of storage R&D, predicting that amazing change is coming. Moving from perhaps a million to a trillion objects, we run into serious storage problems. “You just cannot back up a trillion files.” This presentation got quite technical for a non-techie like me, but the conclusion was clear: we need much higher density storage media that use less power, and the file systems must be self-correcting and self-protecting. The good news? They will come! The bad news: it may take industry a while. Wood asserted that much of what we need is actually possible, but because demand is too low, the market is not developing fully.
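A back-of-the-envelope calculation shows why a trillion files is a different league. The throughput of 10,000 files per second is a purely hypothetical figure, not one Wood gave; the point is only how the time scales with the number of objects.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def backup_days(n_files: int, files_per_second: float) -> float:
    """Days needed to touch every file once at a constant rate."""
    return n_files / files_per_second / SECONDS_PER_DAY


rate = 10_000  # files per second (hypothetical)

for n_files in (10**6, 10**12):
    print(f"{n_files:.0e} files: {backup_days(n_files, rate):,.3f} days")
```

At that assumed rate a million files take under two minutes, while a trillion take more than three years for a single pass.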

There’s more … but that will have to wait, because tomorrow we have another full day. Time to get some sleep …

While we were working, the weather had finally turned glorious. We got to enjoy it a bit during lunch and coffee breaks. Conference Center (left), UT Main Tower in the distance. (And the lone star flower arrangement at the front in fact slopes down quite a bit.)



Digital preservation boot camp: wrangling digital chaos (Austin PASIG, 2)


Published: Tue 10 Jan 2012
Austin PASIG has scheduled a digital preservation boot camp to kick off the conference. The boot camp is especially targeted at newcomers to the field, but it looks like all of us (including myself and you, honored readers) stand to benefit – if we can keep up with the pace, that is (starts 8.30 am; finishes noon).

Here’s the tantalizing program:

  • Digital Preservation — Things I wish someone had told me before I started
  • Digital Preservation in theory and practice
  • The Same But Different: data storage, business continuity management and preservation
  • OAIS and Emerging Standards: what does success look like?
  • Wrangling Digital Chaos: characterisation and ingest
  • Preservation Metadata: it’s not just for finding things
  • How to Avoid Reinventing the Wheel: procurement and outsourcing

Makes you wish you had come to Austin too, doesn’t it?

There is always the risk that the organizers will not deliver on the program. It happens quite often. The abstract is full of super questions like: How can we make digital preservation really work? How can we control the costs? And then the speaker comes up to the podium and basically repeats the questions but in more elaborate wordings. When he or she is finished repeating the questions, and you think things are about to get interesting, he or she simply says: ‘Thank you for your attention.’ Grrrrrhhhhh.

We have at least five speakers at this boot camp, among others Don Post (IMERGE Consulting), William Kilbride (UK Digital Preservation Coalition), David Giaretta (Alliance for Permanent Access), Tom Cramer (Stanford Univ) and Mark Evans (Tessella). On your behalf, I shall be following them scrupulously.



How can mainstream IT help digital preservation? (Austin PASIG, 1)


Published: Tue 10 Jan 2012

When I entered the field of digital preservation, I was told that mainstream Information Technology (IT) did not have much to offer memory organizations, because a) private industry was not much interested in preserving anything at all (unless obliged by law), and b) if it did preserve something, it did not care about the “look and feel” which matters so much to memory organizations. And, of course, we ourselves kept on telling everybody that digital preservation is much, much more than simply making a back-up.

Austin, TX; former Treasury Office

Early IT at well protected former Treasury Office, Texas State Capitol, Austin

So libraries and archives took it upon themselves to develop preservation projects and do preservation research – with varying results, because in the beginning, those projects were mainly staffed by library and archives people without solid IT backgrounds. Also, R&D projects, especially the international EU projects, tended to live a life by themselves and not reflect the requirements of their own organizations. In many cases, I would assume, the organizations were not yet capable of clearly articulating their requirements. It was all so new to us.

Meanwhile, I have been picking up signals that IT and digital preservation may be moving closer together, and that’s why I have flown to Austin, TX, for a PASIG meeting (Preservation and Archiving Special Interest Group). Judging by the program it is quite a technical meeting, with much attention paid to cloud services, super computing and IT infrastructures. As I am not a technical expert, I may not understand everything that will be going on, but the combination of a focus on digital preservation and the strong presence of such companies as Oracle, Microsoft and DuraSpace, will have me looking out for opportunities to work together with mainstream IT.

Because, let’s face it, none of us memory organizations have IT engineering as our core business or expertise. Wouldn’t it be nice if we could make more use of mainstream IT? And spend more time on being archives and libraries and museums and research data centres?

The backdrop of the Texas Ranger Statue along the Great Walk has changed considerably over the years ...

This does not mean going back to old-fashioned library and archive practices, because IT is so crucial to our business that we must build up expertise – if only to be a good and critical customer of third-party services.

Plus: sound IT practices (durable storage media, multiple copies at different places, regular media refreshment and basic data management) are essential to digital preservation. Without sound IT practices, there is no long-term access.

Texas State Capitol, Austin

Just my luck: I came to Austin hoping for a bit of sunshine, but instead arrived on the coldest and rainiest day in a long time. Austinites are happy with the rain, though; there's been quite a drought recently, and the grass of the Capitol's grounds is still more yellow than green.

By the way: I am a little in awe of Texas. Everything is so BIG here. If you put the map of Texas on Europe and locate the north of Texas on top of Amsterdam, the south would be in Umbria, Italy; the west near Paris, the east beyond the Czech Republic. Streets are big here and buildings are big; my hotel room is the biggest I’ve ever had. At breakfast I got a half-litre coffee mug (fortunately, the coffee was weak as ever, otherwise I might not have survived), the cutlery felt big and heavy, and I sat on a leather chair that seemed to be built for super sized people (of which there are quite a few as well). That’s of course because the portions of food are also … you-know-what.

The State Capitol dome is 15 ft higher than the one in Washington, and the floor space is the biggest of any state capitol.

PS: Two headlines from this morning’s USA Today:

  • “Drought area drenched”
  • “Supersized vehicles sap gains from improved fuel economy.”

What did I tell you??


Dutch National Archives repositioning in the digital world


Published: Fri 30 Dec 2011

Recently, Digital Repository Project leader Ruud Yap of the Dutch National Archives explained at a Digital Deposit event in Estonia (15 Nov 2011) how the digital reality is causing the National Archives to reposition themselves within their network. Here is the link to the video (ca. 45 minutes).

Ruud Yap National Archives
The traditional and rather strict boundaries between the different stakeholders in the public records custody chain are no longer tenable in the digital world, Yap explained. Digital objects simply cannot be neglected for 20 to 75 years before being transferred to the archive. Therefore, the Dutch National Archives are reaching out to records creators within government agencies and helping them manage semi-current digital files long before the Archives take legal custody at the end of that retention period. To this end, a shared digital repository has been established where government agencies can deposit their non-current files. The system is managed by the National Archives, but custody remains with the records creators until the legal retention period has expired and the records flow into the NA’s own digital repository. Management services include a uniform architecture, standardization and preservation watch and services.

In the experience of the National Archives, record producers are not really interested in the fate of non-current files. As the service is not mandatory, the National Archives must make a clear financial case to encourage government agencies to make use of the service.

Yap also described the fundamental impact on the National Archives’ organization. Every staff member and every work process is affected by the transition from paper to digital. Whereas the paper process is rather manual, the digital world is about automation and about formalization. Implicit knowledge and implicit work processes from the analogue era have to be eradicated. Yap stressed that changing the frame of mind of the organization is a huge effort and that the effort is still ongoing – slowly and strenuously. New competencies must be developed as well, such as customer relations management – with regard to record producers. ‘We have to learn to sell our service’, Yap said, and that is an entirely new challenge.

Some of the National Archives’ lessons learned include:

  • physical transfer of records remains necessary, as government agencies’ IT systems are not geared to long-term management
  • even files that were transferred within 10 years after being created contained unsupported file formats and applications
  • record creators have no vested interest in non-current files; you have to “sell” services to them
  • you need shared standards
  • customizing the ingest process is an enormous and time-consuming effort with lots of manual intervention
  • the NA had to lower its metadata requirements to make the system work.

Yap’s final words: “PLEASE PLEASE PLEASE, dare to make mistakes!!”

National Archives and National Library to merge in 2013

On another note: just before Christmas, the Dutch Ministry of Education, Culture and Science announced that the Dutch National Archives are to merge with the Dutch National Library (KB). Whether the digital repositories will also be merged is yet to be decided. The National Archives has implemented Tessella’s Safety Deposit Box, whereas the KB has been working with IBM’s DIAS system since 2003 but is presently developing a new architecture to replace DIAS.


EU causes legal trouble for EU digital preservation projects (#KEEP, 3)


Published: Mon 19 Dec 2011

The KEEP project has organized a Europe-wide road show to promote its emulation framework (I blogged about the project and the workshop in The Hague earlier). If you have not had a chance to attend, there is one last opportunity in Cardiff on 24-25 January; I can certainly recommend the project and the workshop if you are interested in preservation strategies other than migration – which is fine for simple objects but does not work for complex or composite objects.

However, there was a bit of disconcerting news at the workshop as well: legal trouble. Today I am finally keeping my promise to write some more about that. As reported in the earlier blog post, emulation is about emulating, or ‘recreating’ the hardware/software combination on which a digital object was created in order to be able to play it on a newer platform. This means that you need all the old software – not just the application, but the operating system, the browser, the plug-ins, the fonts, and whatever.

The KEEP team realized this would involve making copies of those software applications and that copyright issues might arise. So David Anderson of Portsmouth University took upon himself the thankless task of digging through a tangle of national and EU laws and regulations to find out what was legal and what was not.

David Anderson tried to unravel the tangle of legal regulations

The result of his endeavours is modestly called a “Layman’s Guide to Keep Legal Studies” because David is not a certified lawyer and thus disclaimers apply. But David is to be commended for his painstaking work to unravel piles of tedious literature. The details are complex, but the overall message, according to David, is clear:

David Anderson's 'layman's' conclusions about the legal regime

Janet Delve, also of the KEEP team, told me that the KEEP project itself must disable some of its own deliverables before sending them to the Commission in order to stay within the law.

Crazy, isn’t it?

Once again the conclusion must be that organizational, legal and financial issues are much more difficult to crack than the technical ones. We’ve got our work cut out for us in 2012!
