[Pasig-discuss] Risks of encryption & compression built into storage options?

David Rosenthal dshr at stanford.edu
Fri Mar 17 13:44:53 EDT 2017


On 03/17/2017 03:19 AM, BURNHILL Peter wrote:

> 2. David Rosenthal must surely have written on the maths of how many copies for LOCKSS, see http://blog.dshr.org/?m=1

Models claiming to estimate loss probability from replication factor,
whether true replication or erasure coding, are wildly optimistic and
should be treated with great suspicion. There are two reasons:

- The models are built on models of underlying failures. The data
   behind these failure models typically (a) come from manufacturers'
   reliability claims, and (b) ignore failures upstream of the media.
   Much research shows that actual failures in the field are (a)
   vastly more likely than manufacturers claim, and (b) more likely
   to be caused by system components other than the media.

- The models almost always assume that failures are un-correlated,
   because modeling correlated failures is much more difficult and
   requires much more data than modeling un-correlated ones. In
   practice it has been known for decades that failures in storage
   systems are significantly correlated. Correlations among failures
   greatly raise the probability of data loss.
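
To illustrate the second point, here is a minimal sketch in Python. It
is a toy model, not a validated one: it assumes each of n replicas is
lost independently with probability p in some period, and then adds a
hypothetical common-cause event with probability c that destroys every
replica at once. The function names and all the numbers are invented
for illustration.

```python
def loss_independent(p: float, n: int) -> float:
    """Probability that all n replicas are lost when failures are
    assumed independent, each with probability p (toy assumption)."""
    return p ** n


def loss_common_cause(p: float, n: int, c: float) -> float:
    """Same toy model, plus a shared event with probability c that
    destroys every replica at once (hypothetical correlation model)."""
    return c + (1.0 - c) * p ** n


if __name__ == "__main__":
    p, n, c = 0.01, 3, 1e-4  # illustrative values only, not field data
    independent = loss_independent(p, n)
    correlated = loss_common_cause(p, n, c)
    print(f"independent model:  {independent:.3e}")
    print(f"with common cause:  {correlated:.3e}")
    print(f"ratio:              {correlated / independent:.1f}")
```

Even a rare shared event dominates the result in this sketch, which is
why a model that omits correlation looks so optimistic.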

For replicated systems, three replicas is the absolute minimum IF your
threat model excludes all external or internal attacks. Otherwise the
minimum is four (see Byzantine Fault Tolerance).
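
One way to read these two numbers, sketched in Python below, is through
the standard fault-tolerance bounds: 2f + 1 replicas to out-vote f
benign faulty replicas, and 3f + 1 replicas to tolerate f Byzantine
(possibly malicious) ones. Mapping the "three" above onto the 2f + 1
bound is my assumption here, and min_replicas is a hypothetical helper
name.

```python
def min_replicas(tolerated_faults: int, byzantine: bool) -> int:
    """Minimum replica count under the classic bounds:
    2f + 1 for benign faults (majority voting, assumed reading),
    3f + 1 when faulty replicas may behave maliciously (BFT)."""
    if tolerated_faults < 0:
        raise ValueError("tolerated_faults must be non-negative")
    factor = 3 if byzantine else 2
    return factor * tolerated_faults + 1


if __name__ == "__main__":
    # Tolerating a single faulty replica in each threat model.
    print("no attacks:", min_replicas(1, byzantine=False))
    print("attacks:   ", min_replicas(1, byzantine=True))
```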

For (k of n) erasure coded systems the absolute minimum is three sites
arranged so that k shards can be obtained from any two sites. This is
because shards in a single site are subject to correlated failures
(e.g. earthquake).
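
The placement rule above can be checked mechanically. The following is
a minimal sketch in Python, assuming a placement is described simply as
the number of shards held at each site; any_two_sites_hold_k is a
hypothetical helper, not part of any real erasure coding library.

```python
from itertools import combinations
from typing import Sequence


def any_two_sites_hold_k(shards_per_site: Sequence[int], k: int) -> bool:
    """Return True if there are at least three sites and every pair of
    sites together holds at least k shards, so the data can still be
    reconstructed after losing any one of three sites."""
    if len(shards_per_site) < 3:
        return False
    return all(a + b >= k for a, b in combinations(shards_per_site, 2))


if __name__ == "__main__":
    k = 4  # illustrative (4 of 6) code
    print(any_two_sites_hold_k([2, 2, 2], k))  # shards spread evenly
    print(any_two_sites_hold_k([4, 1, 1], k))  # shards piled on one site
```

With a (4 of 6) code, two shards per site satisfies the rule, while
putting four shards at one site does not: losing that site leaves only
two shards.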

	David.

PS - this discussion is based on a misapprehension of how disk
technology works. See:

https://en.wikipedia.org/wiki/Hardware-based_full_disk_encryption

From there:

"The drive except for bootup authentication operates just like any
drive with no degradation in performance."

The encrypted data is never visible outside the drive, so as far as
the systems using such drives are concerned, whether a drive encrypts
or not is irrelevant. Self-encrypting drives have one additional
failure mode over regular drives: they support a crypto erase command
which renders the data inaccessible. The effect as far as the data is
concerned is the same
as a major head crash. Archival systems that fail if a head crashes
are useless, so they must be designed to survive total loss of the
data on a drive. There is thus no reason not to use self-encrypting
drives, and many reasons why one might want to. But note that their
use does not mean there is no reason for the system to encrypt the
data sent to the drive (see my earlier mail).





