[Pasig-discuss] Digital repository storage benchmarking
Steve Knight
Steve.Knight at dia.govt.nz
Sun May 14 19:43:36 EDT 2017
Hi Tim
At the National Library of New Zealand, we are storing about 210TB of digital objects in our permanent repository.
We have a 25TB online cache, with an online copy of all the digital objects sitting on disk.
Three tape copies of the objects are made as soon as they enter the disk archive. One copy remains within the tape library (nearline); the other two copies are sent offsite (offline). We use Oracle SAM-QFS to manage the storage policies and automatic tiering.
We have a similar treatment for our 100TB of Test data, which has one fewer offsite tape copy.
We are currently looking at replacing this storage architecture with a mix of Hitachi's HDI and HCP S30 object storage products and our cloud provider's object storage offering. The cloud provider storage includes replication across 3 geographic locations providing both higher availability and higher resilience than we currently have.
By moving to an all-online solution, we hope to increase overall performance and make savings by using object storage and exiting some services related to current backup and restore processes.
Regards
Steve
-----Original Message-----
From: Pasig-discuss [mailto:pasig-discuss-bounces at asis.org] On Behalf Of Sheila Morrissey
Sent: Saturday, 13 May 2017 5:44 a.m.
To: pasig-discuss at asis.org
Subject: [Pasig-discuss] FW: Digital repository storage benchmarking
Hello, Tim,
At Portico (http://www.portico.org/digital-preservation/), we preserve e-journals, e-books, digitized historical collections, and other born-digital scholarly content.
Currently, the Portico archive comprises roughly 77.7 million digital objects (we call them "Archival Units", or AUs), totalling over 400 TB and made up of 1.3 billion files.
We maintain 3 copies of the archive: 2 on disk in geographically distributed data centers, and a 3rd copy in commercial cloud storage. We create and maintain backups (including fixity checks) using our own custom-written software.
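For readers unfamiliar with the term, a fixity check recomputes a checksum for each stored file and compares it with a previously recorded value. The following is only a minimal sketch of that idea, not Portico's software (which is not described in this message). The function names, the choice of SHA-256, and the manifest layout (relative path mapped to expected digest) are all assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files are never loaded whole.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_fixity(manifest, root):
    """Compare recorded digests against the files stored under root.

    manifest: dict mapping relative path -> expected SHA-256 hex digest
              (a hypothetical layout chosen for this sketch).
    Returns a dict mapping relative path -> "ok", "mismatch", or "missing".
    """
    results = {}
    for rel_path, expected in manifest.items():
        target = Path(root) / rel_path
        if not target.is_file():
            results[rel_path] = "missing"
        elif sha256_of(target) == expected:
            results[rel_path] = "ok"
        else:
            results[rel_path] = "mismatch"
    return results
```

Running the same manifest against each copy of an archive (for example, two disk copies and a cloud copy) would show which copy, if any, has drifted from the recorded digests.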
I hope this is helpful.
Best regards,
Sheila
Sheila M. Morrissey
Senior Researcher
ITHAKA
100 Campus Drive
Suite 100
Princeton NJ 08540
609-986-2221
sheila.morrissey at ithaka.org
ITHAKA (www.ithaka.org) is a not-for-profit organization that helps the academic community use digital technologies to preserve the scholarly record and to advance research and teaching in sustainable ways. We provide innovative services that benefit higher education, including Ithaka S+R, JSTOR, and Portico.
-----Original Message-----
From: Pasig-discuss [mailto:pasig-discuss-bounces at asis.org] On Behalf Of Tim Walsh
Sent: Friday, May 12, 2017 10:16 AM
To: pasig-discuss at asis.org
Subject: [Pasig-discuss] Digital repository storage benchmarking
Dear PASIG,
I am currently in the process of benchmarking digital repository storage setups with our Director of IT, and am having trouble finding very much information about other institutions’ configurations online. It’s very possible that this question has been asked before on-list, but I wasn’t able to find anything in the list archives.
For context, we are a research museum with significant born-digital archival holdings preparing to manage about 200 TB of digital objects over the next 3 years, replicated several times on various media. The question is what precisely those “various media” will be. Currently, our plan is to store one copy on disk on-site, one copy on disk in a managed off-site facility, and a third copy on LTO sent to a third facility. Before we commit, we’d like to benchmark our plans against other institutions.
I have been able to find information about the storage configurations for MoMA and the Computer History Museum (who each wrote blog posts or presented on this topic), but not very many others. So my questions are:
* Could you point me to published/available resources outlining other institutions’ digital repository storage configurations?
* Or, if you work at an institution, would you be willing to share the details of your configuration on- or off-list? (any information sent off-list will be kept strictly confidential)
Helpful details would include: the volume of digital objects being stored; how many copies of the data are being stored; which copies are online, nearline, or offline; which media are being used for which copies; and which services/software applications you are using to manage the creation and maintenance of backups.
Thank you!
Tim
- - -
Tim Walsh
Archiviste, Archives numériques
Archivist, Digital Archives
Centre Canadien d’Architecture
Canadian Centre for Architecture
1920, rue Baile, Montréal, Québec H3H 2S6
T 514 939 7001 x 1532
F 514 939 7020
www.cca.qc.ca
Please consider the environment before printing this email.
This email may contain confidential information. If you are not the intended recipient, please advise us immediately and delete this email as well as any other copy. Thank you.
----
To subscribe, unsubscribe, or modify your subscription, please visit http://mail.asis.org/mailman/listinfo/pasig-discuss
_______
PASIG Webinars and conference material is at http://www.preservationandarchivingsig.org/index.html
_______________________________________________
Pasig-discuss mailing list
Pasig-discuss at mail.asis.org
http://mail.asis.org/mailman/listinfo/pasig-discuss