[Pasig-discuss] Digital repository storage benchmarking

Schaefer, Sibyl sschaefer at ucsd.edu
Thu May 18 15:24:03 EDT 2017


I’d like to chime in with some information about the Chronopolis Digital Preservation Network. We were originally funded by the Library of Congress NDIIPP program and ingested our first production content in 2008. Chronopolis was designed to preserve hundreds of terabytes of digital data with minimal requirements on the data provider. The single, overriding commitment of the Chronopolis system is to preserve objects in such a way that they can be transmitted back to the original data providers in the exact form in which they were submitted. Chronopolis leverages high-speed networks, mass-scale storage capabilities, and the expertise of the partners to provide a geographically distributed, heterogeneous, and highly redundant archive system. Our partners include the University of California San Diego Library, the National Center for Atmospheric Research, the University of Maryland Institute for Advanced Computer Studies, and our newest partner, the Texas Digital Library.

Features of the project include:
• Three geographically distributed copies of the data
• Curatorial audit reporting
• Development of best practices for data packaging and sharing

We also serve as a founding node in the Digital Preservation Network and partner with DuraSpace to provide our services. We currently preserve over 50 TB (150 TB replicated) of data. Our prices vary depending on the ingest mechanism, but the base rate for storage is $286/TB/year for three geographically distributed copies.

Best,

Sibyl


Sibyl Schaefer
Chronopolis Program Manager // Digital Preservation Analyst
University of California, San Diego



From: Pasig-discuss <pasig-discuss-bounces at asist.org> on behalf of Arthur Pasquinelli <arthurpasquinelli at gmail.com>
Date: Thursday, May 18, 2017 at 11:40 AM
To: "pasig-discuss at mail.asis.org" <pasig-discuss at mail.asis.org>
Subject: Re: [Pasig-discuss] Digital repository storage benchmarking

I was just thinking the same thing since we have had some good discussions now and in the past. Since I have kept a copy of all past PASIG emails, I'll work on it with the other PASIG steering committee members. We are in the middle of some administrative work for PASIG right now, so I'll add this to the things being worked on.

On 5/18/17 11:18 AM, Jeanne Kramer-Smyth wrote:
What would folks think of all of this amazing information being collected in a shared document somewhere?

Jeanne

On Thu, May 18, 2017 at 1:52 PM, Katherine Skinner <katherine at educopia.org> wrote:
I love this thread--thank you for starting it, Tim!

The MetaArchive Cooperative started preserving content with six institutions in 2004; it has grown to encompass more than 60 institutions, including through consortial memberships with several regional consortia (in Barcelona and Ohio) and a library alliance (HBCU).

Our mission is to provide a strong preservation community as well as an affordable preservation solution for distributed digital preservation for a wide variety of memory-oriented organizations. Our members constantly learn from each other as they compare workflows, tools, approaches, and policies.

More details, specific to your questions, Tim:

  *   we are actively preserving 1,200+ collections totaling 85TB of content (and that is slated to almost double in the next year)
  *   content is ingested via bags (BagIt) and can be submitted in a variety of ways
  *   every file is replicated 7 times and stored in 7 secure, geographically distributed locations on infrastructure that includes both physical servers (at some member institutions) and "cloud-based" and VM infrastructures
  *   content is regularly audited using LOCKSS voting and polling mechanisms
  *   when needed, content is repaired and metadata describing that event is created

Other details that may be of interest:

  *   pricing is $500/TB for storage fees, plus an annual membership fee of between $3,000 and $5,500 depending on the selected category
  *   some members host network infrastructure; others pay a small annual fee ($1,000) to waive that responsibility
  *   MetaArchive is entirely run, owned, and controlled by its members--including pricing decisions

Carly Dearborn (Purdue University) is the current Chair of the Steering Committee. If you are interested in learning more, please reach out to me (Katherine at Educopia.org) or Carly (cdearbor at purdue.edu) while the network's facilitator, Sam Meister, is out on paternity leave until early July.




Katherine Skinner, PhD
Executive Director, Educopia Institute
http://educopia.org

Working from Greensboro, NC
katherine at educopia.org | 404 783 2534




----
To subscribe, unsubscribe, or modify your subscription, please visit
http://mail.asis.org/mailman/listinfo/pasig-discuss
_______
PASIG webinars and conference materials are at http://www.preservationandarchivingsig.org/index.html
_______________________________________________
Pasig-discuss mailing list
Pasig-discuss at mail.asis.org
http://mail.asis.org/mailman/listinfo/pasig-discuss






