So I'm reading various articles about the "post-RAID" era. If you want to see what I'm reading, I'll link them at the bottom with some excerpts.
The 'nightmare' of hard disk implosion is less of an issue with modern RAID NASes, but the bigger, quietly growing problem is data corruption. ZFS is a great solution for this at the block level and I'm a big fan of it. Then of course there is software like PAR, which does a similar job at the file level and is most commonly used with Usenet files (so that a series of parity files can replace any missing files in a set, e.g. if someone only gets 49 out of 51 parts). I've used it on "shelf storage" hard drives to both check integrity and repair damaged or missing files in place, but it's extremely disk-thrash intensive to create anything of any size and totally unsuited to easily repairing entire hard drives.
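For anyone who hasn't tried that workflow, here's roughly what it looks like. This is a hedged sketch driving par2cmdline from Python: the filenames are made up, and while create/verify/repair and the -r (redundancy percent) flag are standard par2cmdline as far as I know, check your version's help output.

```python
# Hedged sketch: wrapping the par2cmdline tool from Python to protect a set
# of files with ~10% parity data. Paths are hypothetical; the create/verify/
# repair subcommands and -r flag are standard par2cmdline to my knowledge.
import subprocess

files = ["disk1.img", "disk2.img", "disk3.img"]  # hypothetical archive files

# create parity: -r10 = enough recovery blocks to rebuild ~10% of the set
subprocess.run(["par2", "create", "-r10", "archive.par2", *files], check=True)

# later, on the shelf drive: verify, and repair in place if anything is bad
# (par2 returns nonzero from verify when the set is damaged, I believe)
result = subprocess.run(["par2", "verify", "archive.par2"])
if result.returncode != 0:
    subprocess.run(["par2", "repair", "archive.par2"], check=True)
```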
What I'm curious about is how NASes specifically are tackling the post-RAID problems. I don't fully understand FreeBSD yet, so beyond knowing that ZFS repairs data in place, I'm not sure whether it uses an erasure-coding method or not. I'm wondering if there is anything else cutting edge, "beyond ZFS", "enhancements to ZFS", and the like that's worth knowing about.
Specifically, I'm after other ways of using erasure codes for OFFLINE STORAGE and/or at the filesystem level (outside of PAR) that can work at the terabyte level. For instance, I would like to use erasure codes on LTO Ultrium tapes for long-term storage, including across a box of tapes, so that if say 2 tapes out of 20 break I can still recover the entire archive those tapes represented. I'm not sure if there is a direct or indirect/workaround way to do this with current solutions. ZFS is great for realtime online stuff; I'm looking for archival-grade equivalents on tape. (I'm aware tape has built-in protection against lost bits on the tape, but that doesn't fix tape BREAKAGE. I'm looking for a RAID equivalent for tape, so a broken, lost, or damaged tape doesn't cause an archive loss, even if it's something like creating volumes that can be block-mirrored to tape and restored to a ZFS archive if there's no filesystem-level method.)
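To make the 2-of-20 idea concrete, here's a rough sketch of how I imagine it could be done in software today, using the zfec library (the systematic erasure codec behind Tahoe-LAFS). Everything here is an assumption on my part: the chunk-to-tape mapping is imagined, and I haven't verified the exact zfec call signatures, so treat it as a sketch of the technique rather than a working recipe.

```python
# Hedged sketch: 18-of-20 erasure coding an archive across tapes with the
# zfec library (pip install zfec). The Encoder/Decoder signatures are my
# best understanding of zfec's API, not verified; tape handling is imagined.
import zfec

K, M = 18, 20   # write M tape-sized chunks; any K of them rebuild the archive

def encode_archive(data: bytes):
    """Split `data` into M chunks (one per tape); any K chunks recover it."""
    padlen = (-len(data)) % K              # pad so it splits evenly into K
    padded = data + b"\x00" * padlen
    size = len(padded) // K
    primary = [padded[i * size:(i + 1) * size] for i in range(K)]
    # ask the encoder for just the M-K parity chunks (systematic code:
    # the first K chunks written to tape are the plain data itself)
    parity = zfec.Encoder(K, M).encode(primary, list(range(K, M)))
    return primary + list(parity), padlen  # chunk i goes to tape i

def decode_archive(chunks, chunk_nums, padlen):
    """Rebuild from any K surviving chunks plus their tape numbers."""
    primary = zfec.Decoder(K, M).decode(list(chunks[:K]), list(chunk_nums[:K]))
    data = b"".join(primary)
    return data[:len(data) - padlen] if padlen else data

# e.g. tapes 3 and 11 snapped: read the other 18 chunks back and decode
# archive = decode_archive(surviving_chunks, surviving_tape_numbers, padlen)
```

If something like this holds up, it's exactly the RAID-equivalent-for-tape I'm describing: the first 18 tapes hold the archive as-is, the last 2 are pure parity, and any 2 tape losses are survivable.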
Post-RAID articles:
http://searchcloudstorage.techtarge...r-if-cloud-providers-use-erasure-codes-or-MCM
With autonomic healing, erasure codes change that equation by protecting against six or more concurrent drive, node, system, site or other failures. It does this by dividing a data object into a number of chunks, the total being referred to as the width. A common width is 16 chunks. The object storage has to read only a subset of those chunks (referred to as the breadth) to reconstitute the data object. A common breadth is 10 chunks. In this example, the object storage system can tolerate six concurrent failures of drives, nodes, sites and so on, and still read the data objects. Autonomic healing enables the lost chunks to be recreated and written elsewhere. This example demonstrates three times the resilience of MCM at one-fifth the overhead.
http://storagemojo.com/2012/07/23/the-post-raid-era-has-begun/
https://code.facebook.com/posts/1433093613662262/-under-the-hood-facebook-s-cold-storage-system-/
Facebook uses 10 data blocks and 4 parity blocks, recovering from up to 4 simultaneous failures of the same piece of data and storing 1 GB of data in 1.4 GB of space, with more protection than even RAID-Z3 or a "RAID 7"/beyond-RAID-6 triple parity provides.
https://www.backblaze.com/blog/vault-cloud-storage-architecture/
Backblaze uses 17 data blocks and 3 parity blocks, making it the equivalent of RAID-Z3 or triple parity. (Quick arithmetic on all three cited schemes below.)
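For my own sanity, here's the arithmetic behind those three schemes, using just the (data, parity) numbers quoted above; the scheme labels are mine:

```python
# Overhead and failure tolerance for the erasure-coding schemes cited above,
# computed from the (data, parity) numbers given in the articles themselves.
schemes = {
    "TechTarget example": (10, 6),   # breadth 10 of width 16
    "Facebook cold storage": (10, 4),
    "Backblaze Vault": (17, 3),
}

for name, (k, p) in schemes.items():
    m = k + p                        # total chunks written
    print(f"{name}: {m} chunks, survives any {p} failures, "
          f"overhead {m / k:.2f}x")

# TechTarget example: 16 chunks, survives any 6 failures, overhead 1.60x
# Facebook cold storage: 14 chunks, survives any 4 failures, overhead 1.40x
# Backblaze Vault: 20 chunks, survives any 3 failures, overhead 1.18x
```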