[Pasig-discuss] Arguments for keeping an onsite copy of digitally preserved/stored digital content?
Matthew Addis
matthew.addis at arkivum.com
Fri Jun 23 05:04:36 EDT 2017
+1 from me on worrying about what happens to data in the cloud if payments aren’t maintained. Something to look for in the contract!
Reminds me of the SDSC storage service - I looked at their T&Cs a few years back and they had/have a graceful model (Arkivum provides something similar).
For the first X months of payment being overdue, you still have full access to the service, including read/write of data.
For the next X months of payment being overdue, you still have read access to your data, but you can’t add any new data.
For the next X months of payment being overdue, you are blocked from accessing the data, but it’s still retained in case you want to restart using the service.
This gives a graceful way to deal with budget issues that a customer might have - noting that anything measured in months is still a relatively short period of time in preservation/archiving terms.
More widely, this also raises the issue of other ways to sustain payment that perhaps don’t get enough attention, e.g. endowment/annuity, ring-fenced money that drip feeds PAYG, paid-up data escrow as contingency etc.
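As a rough sketch of the tiered model described above: the tier length X is not stated in this thread, so `tier_months` below is a placeholder, and the names `Access` and `access_for_overdue` are purely illustrative. What happens after the third tier is not described here, so the sketch only flags it as unspecified. Treating the boundary month as the start of the next tier is also an assumption.

```python
from enum import Enum


class Access(Enum):
    FULL = "read/write"
    READ_ONLY = "read only, no new data"
    RETAINED = "no access, data retained"
    UNSPECIFIED = "beyond the tiers described in the thread"


def access_for_overdue(months_overdue: float, tier_months: float) -> Access:
    """Map how long a payment is overdue onto the three tiers.

    tier_months stands in for the unspecified "X months" per tier.
    """
    if months_overdue < 0 or tier_months <= 0:
        raise ValueError("months_overdue must be >= 0 and tier_months > 0")
    tier = int(months_overdue // tier_months)
    if tier == 0:
        return Access.FULL
    if tier == 1:
        return Access.READ_ONLY
    if tier == 2:
        return Access.RETAINED
    return Access.UNSPECIFIED
```

For example, with a hypothetical three-month tier, a payment four months overdue would fall into the read-only tier.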
In other walks of life there are strategies to deal with non-payment risks, e.g. think about payment protection insurance, mortgages, annuities etc.
I’d be very interested to know whether archives use any of these strategies when dealing with the risks of not being able to sustain payments on cloud services or anything else for that matter.
Cheers,
Matthew
Matthew Addis
Chief Technology Officer
tel: +44 1249 405060
mob: +44 7703 393374
email: matthew.addis at arkivum.com
web: www.arkivum.com
twitter: @arkivum
This message is confidential unless otherwise stated.
Arkivum Limited is registered in England and Wales, company number 7530353. Registered Office: 24 Cornhill, London, EC3V 3ND, United Kingdom
From: "Antonio Guillermo Martínez (libnova)" <a.guillermo at libnova.com>
Date: Thursday, 22 June 2017 at 15:42
To: Mark Myers <mmyers at tsl.texas.gov>, Matthew Addis <matthew.addis at arkivum.com>, gail at trumantechnologies.com, pasig-discuss at asis.org
Subject: RE: [Pasig-discuss] Arguments for keeping an onsite copy of digitally preserved/stored digital content?
Hi Gail,
We also lean towards keeping (at least) one local copy of the assets (if you can pay for it). We are not looking at the probability of the cloud operator closing unexpectedly (without time to migrate) or of a massive technology problem, which we think would be unrealistic. Instead, we are thinking about an economic problem for the owner of the assets, who may not be able to pay the cloud operator for the storage (with Microsoft, Amazon, etc., if you don’t pay, data disappears almost immediately), or a security incident that affects the cloud data.
Also, capex vs opex may not be an issue; today, there are companies (like LIBNOVA) offering on-premise storage as a service, using standard mass-storage devices with a very low cost per TB/year, paid as a service and therefore totally opex.
I would say that a combined approach would be the best and I would consider two scenarios:
Cloud approach, with two copies in two providers: for instance, LIBSAFE Cloud uses Microsoft Azure as the main storage, but it is replicated to Amazon Glacier every day (from a Glacier-running instance). You could mimic this architecture easily. This way, even if you have a severe security problem with your main copy, the “cloud backup” remains unaffected. When possible, we usually recommend that the owner of the assets pay for the “cloud backup” storage directly, so that even if the digital preservation provider collapses unexpectedly (and stops paying its bills to, e.g., Azure), you still have access to your information (as long as you continue paying for it).
At least, if you are able to prepay for it, you also minimize the economic risk.
Hybrid approach: if you can pay for it, this minimizes most of the risks. Having your copies both in the cloud and in your local infrastructure greatly decreases the risks involved. I would add that you need to pay attention to the synchronization method between the two copies. Just replicating is not enough. We use home-grown software (LIBNOVA Dark Storage Sync) that synchronizes copies by checking hashes and takes care of retention periods on the customer storage side. A good thing about this model is that we (from the cloud) are unable to delete or overwrite customer content (even if we wanted to, or if we had a security incident with an attacker taking control of our infrastructure). Instead of LIBNOVA Cloud connecting to your internal storage, your internal storage synchronizes with the cloud storage (using a read-only key). You can also easily mimic this approach in your own architecture without using LIBSAFE.
I have a technical paper and some diagrams that explain how this dark cloud sync works, which I can send you if you are interested.
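To illustrate the general idea of a pull-based, hash-checked sync (this is not LIBNOVA Dark Storage Sync, whose internals are not described in this thread), here is a minimal sketch. It assumes the cloud side is only ever read from, which is modelled as a plain dictionary of object name to (content, expected SHA-256), and that the local side never deletes or overwrites what it already holds. The names `pull_sync` and `sha256_hex` are hypothetical, and retention periods are not modelled.

```python
import hashlib
from pathlib import Path
from typing import Dict, List, Tuple


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def pull_sync(remote: Dict[str, Tuple[bytes, str]],
              local_dir: Path) -> Dict[str, List[str]]:
    """Pull objects from a read-only remote view into local_dir.

    remote maps an object name to (content, expected SHA-256 hex digest).
    Existing local files are never deleted or overwritten.
    """
    report: Dict[str, List[str]] = {"copied": [], "rejected": [], "conflicts": []}
    local_dir.mkdir(parents=True, exist_ok=True)
    for name, (content, expected_hash) in sorted(remote.items()):
        target = local_dir / name
        # Refuse anything whose content does not match its advertised hash.
        if sha256_hex(content) != expected_hash:
            report["rejected"].append(name)
            continue
        if target.exists():
            # Keep the local copy untouched; only report a divergence.
            if sha256_hex(target.read_bytes()) != expected_hash:
                report["conflicts"].append(name)
            continue
        target.write_bytes(content)
        report["copied"].append(name)
    return report
```

Because the sync is initiated from the local side and only reads from the remote, a compromise of the remote can at worst offer bad content, which the hash check rejects or the conflict report surfaces, rather than remove or replace what is already held locally.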
There are also models like the one Matthew Addis (Arkivum) proposes (which I really like), with hybrid approaches and escrow services that are really worth investigating.
Best regards, AG.
----
Antonio Guillermo Martínez Largo
libnova – Technology changes. Information prevails.
www.libnova.com
EMEA & LATAM: Paseo de la Castellana, 153 – Madrid [t] +34 91 449 08 94
USA & CANADA: 14 NE First Ave (2nd Floor) - Miami, Florida 33132, USA [t]: +1 855-542-6682
From: Pasig-discuss [mailto:pasig-discuss-bounces at asist.org] On Behalf Of Mark Myers
Sent: Thursday, June 22, 2017 3:42 PM
To: Matthew Addis <matthew.addis at arkivum.com>; gail at trumantechnologies.com; pasig-discuss at asis.org
Subject: Re: [Pasig-discuss] Arguments for keeping an onsite copy of digitally preserved/stored digital content?
In TX we use the cloud (Amazon Gov Cloud) as our primary storage system since that’s where our preservation system (Preservica) is built in. We also keep copies locally on external hard drives and RAIDs. We have the “original” files as we receive them, push the files into the cloud and perform the preservation and normalization actions on them, then (eventually) copy the preservation files back down to another set of hard drives as well. The cloud also serves as our geographically dispersed redundancy.
As a side note, even if we used the State of TX data center, they use the Azure cloud as dark storage as well. So it’s still ultimately cloud storage, whether it’s managed through our preservation vendor under our direct control or through the state data center (which is actually managed and vended by Xerox).
Lots Of Copies Keep Stuff Safe.
Mark J. Myers
Senior Electronic Records Specialist
Texas State Library and Archives Commission
1201 Brazos Street
Austin TX 78711-2516
Phone: 512-463-5434
mmyers at tsl.texas.gov
www.tsl.texas.gov
Texas Digital Archive <https://tsl.access.preservica.com/>
From: Pasig-discuss [mailto:pasig-discuss-bounces at asist.org] On Behalf Of Matthew Addis
Sent: Thursday, June 22, 2017 1:51 AM
To: gail at trumantechnologies.com; pasig-discuss at asis.org
Subject: Re: [Pasig-discuss] Arguments for keeping an onsite copy of digitally preserved/stored digital content?
Hi Gail,
I think this is a case of the perennial problem of how to balance cost, risk and accessibility when storing digital assets. As others have pointed out, cost includes budgeting issues, e.g. capex or opex, risk includes data security and sovereignty as well as risk of data corruption or loss, and access includes how quickly you can retrieve and use assets. It’s easy to find a solution for any two out of three of these factors, but all at once is hard - low cost, low risk and fast access.
Not trusting the cloud is sensible and keeping at least one copy onsite is a practical way for many people to reduce risks and provide fast local access to their data. But it doesn’t necessarily have to be that way. The usual approach to managing risk of data loss is to have multiple copies of data in multiple geographic locations and not all with one vendor. Diversity is your friend and removing vendor dependencies and lock-in is really important. If you use multiple cloud providers or data storage facilities then it is possible to get a good balance of cost and risk without needing an onsite copy. As the trend continues towards cloud providers building in-territory data centres, the sovereignty issue is becoming less of a challenge, but of course can’t be eliminated in some cases where data security is paramount.
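The rule of thumb above (multiple copies, in multiple geographic locations, not all with one vendor) can be expressed as a simple check. This is only an illustrative sketch: the `Copy` record and `diversity_problems` function are hypothetical names, and the threshold of two is an assumed reading of "multiple".

```python
from typing import List, NamedTuple


class Copy(NamedTuple):
    vendor: str    # who holds this copy (a cloud provider, or "in-house")
    location: str  # geographic location of this copy


def diversity_problems(copies: List[Copy]) -> List[str]:
    """List the ways a set of copies falls short of the rule of thumb."""
    problems = []
    if len(copies) < 2:
        problems.append("fewer than two copies")
    if len({c.location for c in copies}) < 2:
        problems.append("copies are not spread across multiple locations")
    if len({c.vendor for c in copies}) < 2:
        problems.append("copies are not spread across multiple vendors")
    return problems
```

An empty result means the set of copies meets all three conditions; it says nothing about cost or access speed, which still have to be weighed separately.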
This is all a long-winded way of saying that the question isn’t necessarily one of having an onsite copy or not, it’s one of how best to address cost, risk and access - with an onsite copy being one of the more common solutions (and often a good one), but it’s not the only one that’s viable. Indeed, in some cases where there isn’t in-house capacity to store data onsite or the capex/opex issues raise their head, then alternatives such as using multiple cloud providers can be more attractive.
BTW, as a ‘cloud provider’ we too don’t ‘trust’ the cloud for all copies of the data that we store - including when we store it ourselves in our own data centres! It might sound odd, but in some respects we don’t trust ourselves, let alone any single third-party cloud provider. Instead, we adopt a ‘rely on nothing’ type approach and we store a complete copy of our customers’ data offline with an independent escrow provider. This is an automatic and built-in part of our service and not something that a customer opts into or has to remember to do. This gives our customers reassurance of no lock-in to us as a provider, but it also gives us our own fallback if there were ever to be issues with our own infrastructure or operations. Some of our customers keep an onsite copy as well as archiving their data with us - but many don’t - which is because their data is being held not just by us but by an independent third party too.
Cheers,
Matthew
From: Pasig-discuss <pasig-discuss-bounces at asist.org> on behalf of gail at trumantechnologies.com
Date: Wednesday, 21 June 2017 at 20:47
To: pasig-discuss at asis.org
Subject: [Pasig-discuss] Arguments for keeping an onsite copy of digitally preserved/stored digital content?
Experts, please share your thoughts.
Are your institutions ready to "trust" the cloud for all copies of data, or is there still an argument for an onsite copy? I usually lean towards keeping one onsite copy, but am I stuck in an old paradigm? From an earlier PASIG thread (started by Tim) it's clear other institutions are keeping at least one copy on site.
But how do you defend this decision?
Gail
Gail Truman
Truman Technologies, LLC
Certified Digital Archives Specialist, Society of American Archivists
Protecting the world's digital heritage for future generations
www.trumantechnologies.com
facebook/TrumanTechnologies
https://www.linkedin.com/in/gtruman
+1 510 502 6497