Published on October 29, 2007
Stephan Heuscher, Swiss Federal Archives
ERPANET Workshop on Trusted Digital Repositories for Cultural Heritage, Rome, 17th – 19th Nov. 2003
Today’s Design of Tomorrow’s Trust in Digital Archives – Considerations from the ARELDA project

Overview:
- ARELDA
- Digital Archives vs. Digital Library
- ARELDA Basics
- Characteristics of Trust
  - Low Level Trust
  - High Level Trust
- Conclusion

ARELDA: Archiving of Electronic Digital Data and Records
- Goal: Finding long-term solutions for the permanent archiving of digital records in the Swiss Federal Archives
- Indispensable for the long-term execution of the Federal Archives Act
- Development costs 2001 – 2008: ~11 million €
- Operational costs from 2005: 2.5 – 3.3 million € per year (expected growth: 20 TB/yr net)
- Today’s project team: 7 people (4 CS engineers)

Digital Archives vs. Digital Library:
- Self-contained (in operation)
  - Must not rely on external infrastructure or services for operation
  - All documentation in-house
  - Disaster recovery built-in
  - Lesser need to cooperate
- Value of archived data
  - Evidentiary value (course of business)
  - Information value
    - Non-reproducible (e.g. climate and oceanographic data)
    - Enormous costs (e.g. space flight missions)
- Better control over the archived data
  - Look & feel less relevant

ARELDA Basics:
- The system will be OAIS (ISO 14721:2003) compliant (really!)
- The preservation strategy is migration in open environments using open standards (ISO, W3C, etc.)
- All potential archivalia are validated against the current format standards
- All holdings remain intelligible at all times
- Appraisal is outside of scope (but very important)

Trust:
Definition (by Merriam-Webster Dictionary, m-w.com):
- assured reliance on the character, ability, strength, or truth of someone or something . . .
- one in which confidence is placed . . .
- a charge or duty imposed in faith or confidence or as a condition of some relationship . . .
- something committed or entrusted to one to be used or cared for in the interest of another.
(emphasis mine)

Characteristics of Trust: Low Level Trust
The archives’ holdings must remain
- Integer: protected from unintended and intended harm
- Authentic: authentication (of authorship and provenance) and reliability (of the contained evidence)
- Intelligible: understandable and comprehensible
- Original: “as close to the original as possible”
- Accessible: technically readable and usable to users
These are competing and conflicting characteristics

Characteristics of Trust: High Level Trust
- The archives must demonstrate an active management of integrity, authenticity, intelligibility, originality, and accessibility over time
- To achieve this, the archives must document their procedures and provide a processing history for each item
- Retraceability (as a consequence) provides a posteriori auditability for any user

Low Level Trust:
- Integrity
- Intelligibility
- Authenticity
- Originality
- Accessibility

Integrity: Technical Infrastructure
Hardware and software solutions must be
- Sustainable
  - Finance
  - Complexity
  - Management
- Open
- Modular (flexibility to dump obsolete HW and SW)
- Migratable (possibility to change to HW and SW)
- Secure (no unauthorized access)
- Reliable (no data loss)
- Available (the data is at hand when you need it)

Integrity / Authenticity (technical):
- Public and open principles and methods (not “security by obscurity”)
- Archival Information Package:
  - Simple
  - Durable
  - Migratable
- Digital Signatures:
  - Limited validity
  - Computing power
  - Long term considerations

Integrity / Authenticity (technical):
Example: Digital signatures are temporary
- Validity is measured in years, not decades
- Can be revoked
- Need a complex system to work
- Can be cracked once we have quantum computing (massive parallelism)
- They are of no use when the byte stream changes (migration)
- Re-institution of integrity and authenticity by the archives
Be very careful with digital signatures!

Original Authenticity Archival Authenticity:
- Migration destroys the original byte stream
  - Renders digital signatures useless
- The archives and stakeholders have to define which significant properties are migrated
- The archival authenticity is guaranteed by the chain of custody, process history, and the descriptions of the migration processes
- Archives become substitutes for the original creator’s authenticity

Intelligibility / Accessibility:
- Strict ingest verification for “total” control of ingested records
- Interpretability
- Context
- Rendering
- Usability
- Retrieval and rendering on demand
- Emulation

Originality:
- Highly coupled to the functionality present at the time of archiving
- Look & Feel for user experience
  - Many layers of complexity
  - Recreation of functionality
- Structure for evidence and data
  - Easier to handle
  - Decoupling of the data from functionality (similar to decoupling of a bit stream from its medium)
- Best match: Emulation

High Level Trust:
- Workflows
- Operation
- Automation
- Audit / Review

Workflows:
- Workflow: Automation of a business process (in the archives)
- Workflow definitions
  - Document archival business processes
  - Allow reviews and comparison between different archives
  - Formally defined semantics
    - Verifiable
    - Processable by workflow engines (automatic execution)
- Enacted workflows generate records
  - Suitable for long term preservation
  - Allow tracing of the workflows
  - Openly and properly defined and accessible

Operation of the Archives:
- Constant change of the infrastructure
  - Keep up with storage technology to minimize costs
- Changes in the services provided by the digital archive
  - New services created / old services stopped
  - New functionality
- Every change will have influence on the workflow
- Need for documentation to remain retraceable
  - Verification
  - Formal language
  - Automation

Automation:
- Huge quantities of data
- Workflows in electronic form allow integration of independent services
- Standardized logging/record creation
- Less human interaction
  - Cheaper
  - Less error prone
- Enables higher level of security and reliability
- Intensive test and verification needed
  - Mistakes are very costly

Audit / Review:
- What information does an audit need?
  - Policies defined by the archives
  - Inner workings of the archives (workflows)
  - Chain of custody of an archivalia (enactment and results of workflows)
- Who will perform the audits?
- Audit information must become part of the archive’s holdings
  - Important for the understanding of the context of the other holdings for future users

Conclusions:
- The archives must actively manage the characteristics of trust over time
- Retrospective auditing of the chain of processing within the archives by the user is much more important for digital records
- The work of the archives will be much more transparent, public, and open than in the past.

Any Questions?
Contact:
Stephan Heuscher
Swiss Federal Archives
Archivstrasse 24
CH-3003 Bern
+41 31 324 10 95
[email protected]
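
Illustration (not part of the original slides): the slides argue that archival authenticity rests on a documented processing history rather than on the creator's digital signature, which stops verifying once migration changes the byte stream. The following is a minimal Python sketch of that idea under stated assumptions. The function names `append_event` and `audit`, the record fields, and the choice of SHA-256 are all hypothetical and are not taken from the ARELDA project. Each processing event stores a hash of the item's content and a hash of the previous event, so a later auditor can retrace the chain.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder for "no previous entry"


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def _entry_hash(body: dict) -> str:
    # Hash a canonical JSON form so the result does not depend on key order.
    return sha256_hex(json.dumps(body, sort_keys=True).encode("utf-8"))


def append_event(history: list, action: str, content: bytes) -> None:
    """Record one processing step, linked to the previous step by its hash."""
    body = {
        "action": action,                      # e.g. "ingest", "migrate"
        "content_hash": sha256_hex(content),   # fixity value after this step
        "prev_hash": history[-1]["entry_hash"] if history else GENESIS,
    }
    history.append({**body, "entry_hash": _entry_hash(body)})


def audit(history: list, current_content: bytes) -> bool:
    """Retrace the chain and check that it ends at the content held today."""
    prev = GENESIS
    for entry in history:
        body = {k: entry[k] for k in ("action", "content_hash", "prev_hash")}
        if entry["prev_hash"] != prev or _entry_hash(body) != entry["entry_hash"]:
            return False
        prev = entry["entry_hash"]
    return bool(history) and history[-1]["content_hash"] == sha256_hex(current_content)


# Hypothetical item: ingested, then migrated to a different byte stream.
history = []
original = b"<record>minutes of 2003-11-17</record>"
append_event(history, "ingest", original)
migrated = original.replace(b"record", b"doc")
append_event(history, "migrate", migrated)
```

Here `audit(history, migrated)` succeeds although the byte stream no longer matches what was ingested, because the migration itself is a recorded step. A bare hash chain like this only detects inconsistencies; it does not stop someone from rewriting the whole history, so it complements rather than replaces documented procedures and workflow definitions.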