[Pasig-discuss] Arguments for keeping an onsite copy of digitally preserved/stored digital content?
Antonio Guillermo Martínez (libnova)
a.guillermo at libnova.com
Thu Jun 22 10:42:29 EDT 2017
Hi Gail,
We also lean towards (at least) one local copy of the assets (if you can pay
for it). We are not so much worried about the probability of the cloud operator
closing unexpectedly (without time to migrate) or a massive technology
failure, which we think would be unrealistic; we are thinking about an
economic problem for the owner of the assets, such as not being able to pay the
cloud operator for the storage (Microsoft, Amazon, etc.; if you don't pay,
the data disappears almost immediately), or a security incident that affects the
cloud data.
Also, the capex vs. opex question may not be an issue: today there are companies
(like LIBNOVA) offering on-premise storage as a service, using standard
mass-storage devices with a very low cost per TB/year, paid for as a
service, entirely opex.
I would say that a combined approach would be best, and I would consider
two scenarios:
*Cloud approach,* with two copies at two providers: for instance, LIBSAFE
Cloud uses Microsoft Azure as the main storage, but it is replicated to
Amazon Glacier every day (from an instance running alongside Glacier). You could
mimic this architecture easily. This way, even if you have a severe
security problem with your main copy, the "cloud backup" remains
unaffected. We usually recommend, when possible, that the owner of
the assets pays for the "cloud backup" storage directly, so even if the
digital preservation provider collapses unexpectedly (and stops paying,
e.g., its Azure bills), you still have access to your information (as long
as you continue paying for it).
If you are also able to prepay for it, you minimize the economic risk as well.
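For illustration only, here is a minimal sketch of how such a daily second-copy
job could look in Python, assuming the main copy sits in an Azure Blob container
and the backup goes to an S3 bucket with the GLACIER storage class. The
container/bucket names and credentials are placeholders of mine, not LIBSAFE's
actual implementation:

# Minimal sketch of a daily "second cloud copy" job, assuming the primary copy
# lives in an Azure Blob container and the backup copy goes to an S3 bucket
# with the GLACIER storage class. Names and credentials are hypothetical;
# error handling, paging and scheduling are omitted.
import hashlib

import boto3                                       # AWS SDK for Python
from azure.storage.blob import BlobServiceClient   # Azure Blob SDK

AZURE_CONN_STR = "<azure-connection-string>"   # assumption: injected from a secret store
AZURE_CONTAINER = "preservation-master"        # hypothetical container name
S3_BUCKET = "preservation-backup"              # hypothetical bucket name

def replicate_container_to_glacier():
    azure = BlobServiceClient.from_connection_string(AZURE_CONN_STR)
    container = azure.get_container_client(AZURE_CONTAINER)
    s3 = boto3.client("s3")

    for blob in container.list_blobs():
        data = container.download_blob(blob.name).readall()
        sha256 = hashlib.sha256(data).hexdigest()
        # Store the object under the same key, in Glacier, with its hash as
        # metadata so the second copy can later be verified independently.
        s3.put_object(
            Bucket=S3_BUCKET,
            Key=blob.name,
            Body=data,
            StorageClass="GLACIER",
            Metadata={"sha256": sha256},
        )

if __name__ == "__main__":
    replicate_container_to_glacier()

The important point is that the hash travels with the backup copy, so the
second cloud can be verified without trusting the first.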
*Hybrid approach,* which, if you can pay for it, minimizes most of the risks.
Having your copies both in the cloud and in your local infrastructure greatly
decreases the risks involved. I would add that you need to pay attention to
the synchronization method between the two copies. Just replicating is not
enough. We use home-grown software (LIBNOVA Dark Storage Sync) that
synchronizes copies by checking hashes and takes care of retention periods on
the customer storage side. A good thing about this model is that we
(from the cloud) are unable to delete or overwrite customer content (even
if we wanted to, or if an attacker took control of our infrastructure in a
security incident). Instead of LIBNOVA Cloud connecting to
your internal storage, *your internal storage synchronizes with the cloud
storage* (using a read-only key). You can also easily mimic this approach
in your own architecture without using LIBSAFE.
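Again just a sketch, not our actual Dark Storage Sync code: a pull-based sync
in Python, assuming the cloud side is an S3-compatible bucket reachable with a
read-only key. The bucket name, local paths and retention policy below are
invented for illustration:

# Minimal sketch of a pull-style "dark copy" sync. The on-premise side holds
# only read-only credentials for the cloud bucket, so the cloud side can never
# modify or delete the local copy. Bucket name, paths and retention period are
# illustrative assumptions.
import hashlib
import os
import time

import boto3

BUCKET = "preservation-master"             # hypothetical cloud bucket
LOCAL_ROOT = "/archive/dark-copy"          # hypothetical local storage mount
RETENTION_SECONDS = 10 * 365 * 24 * 3600   # example: keep local files 10 years

def sha256_of(path):
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def pull_sync():
    # The credentials behind this client are read-only: the local side pulls,
    # the cloud side cannot push, overwrite or delete anything here.
    s3 = boto3.client("s3")
    paginator = s3.get_paginator("list_objects_v2")

    for page in paginator.paginate(Bucket=BUCKET):
        for obj in page.get("Contents", []):
            local_path = os.path.join(LOCAL_ROOT, obj["Key"])
            head = s3.head_object(Bucket=BUCKET, Key=obj["Key"])
            expected = head.get("Metadata", {}).get("sha256")

            # Download only if we don't already hold a copy with a matching hash.
            if os.path.exists(local_path) and expected and sha256_of(local_path) == expected:
                continue

            os.makedirs(os.path.dirname(local_path), exist_ok=True)
            s3.download_file(BUCKET, obj["Key"], local_path)
            if expected and sha256_of(local_path) != expected:
                raise RuntimeError("Hash mismatch after download: " + obj["Key"])

    # Retention is enforced locally: objects that disappear from the cloud are
    # NOT deleted here; local files are removed only when their retention expires.
    now = time.time()
    for dirpath, _, filenames in os.walk(LOCAL_ROOT):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if now - os.path.getmtime(path) > RETENTION_SECONDS:
                os.remove(path)

if __name__ == "__main__":
    pull_sync()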
I have a technical paper and some diagrams that explain how this dark
cloud sync works, which I can send you if you are interested.
There are also models like the one Matthew Addis (Arkivum) proposes (which
I really like), with hybrid approaches and escrow services that are really
worth investigating.
Best regards, AG.
----
Antonio Guillermo Martínez Largo
libnova – Technology changes. Information prevails.
www.libnova.com
EMEA & LATAM: Paseo de la Castellana, 153 – Madrid [t] +34 91 449 08 94
USA & CANADA: 14 NE First Ave (2nd Floor) - Miami, Florida 33132, USA
[t]: +1 855-542-6682
*From:* Pasig-discuss [mailto:pasig-discuss-bounces at asist.org] *On Behalf Of* Mark Myers
*Sent:* Thursday, June 22, 2017 3:42 PM
*To:* Matthew Addis <matthew.addis at arkivum.com>; gail at trumantechnologies.com; pasig-discuss at asis.org
*Subject:* Re: [Pasig-discuss] Arguments for keeping an onsite copy of
digitally preserved/stored digital content?
In TX we use the cloud (Amazon GovCloud) as our primary storage system,
since that's where our preservation system (Preservica) is built. We
also keep copies locally on external hard drives and RAIDs. We keep the
"original" files as we receive them, push the files into the cloud and
perform the preservation and normalization actions on them, then
(eventually) copy the preservation files back down to another set of hard
drives as well. The cloud also serves as our geographically dispersed
redundancy.
A side note: even if we used the State of TX data center, they use the Azure
cloud as dark storage as well. So it's still ultimately cloud storage,
whether it's managed through our preservation vendor under our direct
control or through the state data center (which is actually managed and
vended by Xerox).
Lots Of Copies Keep Stuff Safe.
Mark J. Myers
Senior Electronic Records Specialist
Texas State Library and Archives Commission
1201 Brazos Street
Austin TX 78711-2516
Phone: 512-463-5434
mmyers at tsl.texas.gov
www.tsl.texas.gov
Texas Digital Archive <https://tsl.access.preservica.com/>
*From:* Pasig-discuss [mailto:pasig-discuss-bounces at asist.org] *On Behalf Of* Matthew Addis
*Sent:* Thursday, June 22, 2017 1:51 AM
*To:* gail at trumantechnologies.com; pasig-discuss at asis.org
*Subject:* Re: [Pasig-discuss] Arguments for keeping an onsite copy of
digitally preserved/stored digital content?
Hi Gail,
I think this is a case of the perennial problem of how to balance cost,
risk and accessibility when storing digital assets. As others have pointed
out, cost includes budgeting issues, e.g. capex or opex, risk includes data
security and sovereignty as well as risk of data corruption or loss, and
access includes how quickly you can retrieve and use assets. It’s easy to
find a solution for any two out of three of these factors, but all at once
is hard - low cost, low risk and fast access.
Not trusting the cloud is sensible and keeping at least one copy onsite is
a practical way for many people to reduce risks and provide fast local
access to their data. But it doesn’t necessarily have to be that way.
The usual approach to managing risk of data loss is to have multiple copies
of data in multiple geographic locations and not all with one vendor.
Diversity is your friend and removing vendor dependencies and lock-in is
really important. If you use multiple cloud providers or data storage
facilities then it is possible to get a good balance of cost and risk
without needing an onsite copy. As the trend continues towards cloud
providers building in-territory data centres, the sovereignty issue is
becoming less of a challenge, but of course can’t be eliminated in some
cases where data security is paramount.
This is all a long-winded way of saying that the question isn’t necessarily
one of having an onsite copy or not, it’s one of how best to address cost,
risk and access - with an onsite copy being one of the more common
solutions (and often a good one), but it’s not the only one that’s viable.
Indeed, in some cases where there isn’t in-house capacity to store data
onsite or the capex/opex issues raise their head, then alternatives such as
using multiple cloud providers can be more attractive.
BTW, as a 'cloud provider' we too don't 'trust' the cloud for all copies of
the data that we store - including when we store it ourselves in our own
data centres! It might sound odd, but in some respects we don't trust
ourselves, let alone any single third-party cloud provider. Instead, we
adopt a 'rely on nothing' type of approach and store a complete copy of our
customers' data offline with an independent escrow provider. This is an
automatic, built-in part of our service and not something that a
customer opts into or has to remember to do. This gives our customers
reassurance of no lock-in to us as a provider, but it also gives us our own
fallback if there were ever to be issues with our own infrastructure or
operations. Some of our customers keep an onsite copy as well as archiving
their data with us - but many don't - because their data is
held not just by us but by an independent third party too.
Cheers,
Matthew
Matthew Addis
Chief Technology Officer
tel: +44 1249 405060
mob: +44 7703 393374
email: matthew.addis at arkivum.com
web: www.arkivum.com
twitter: @arkivum
This message is confidential unless otherwise stated.
Arkivum Limited is registered in England and Wales, company number 7530353.
Registered Office: 24 Cornhill, London, EC3V 3ND, United Kingdom
*From:* Pasig-discuss <pasig-discuss-bounces at asist.org> on behalf of "gail at trumantechnologies.com" <gail at trumantechnologies.com>
*Date: *Wednesday, 21 June 2017 at 20:47
*To: *"pasig-discuss at asis.org" <pasig-discuss at asis.org>
*Subject: *[Pasig-discuss] Arguments for keeping an onsite copy of
digitally preserved/stored digital content?
Experts, please share your thoughts.
Are your institutions ready to "trust" the cloud for all copies of data, or
is there still an argument for an onsite copy? I usually lean towards keeping
one onsite copy, but am I stuck in an old paradigm? From the earlier PASIG
thread (started by Tim) it's clear other institutions are keeping at least
one copy on site.
But how do you defend this decision?
Gail
Gail Truman
Truman Technologies, LLC
Certified Digital Archives Specialist, Society of American Archivists
*Protecting the world's digital heritage for future generations*
www.trumantechnologies.com
facebook/TrumanTechnologies
https://www.linkedin.com/in/gtruman
+1 510 502 6497