I was unpleasantly surprised by eclecticlight.co's findings about PAR2, but thankful to have found them. When I learned about PAR2 I immediately wanted to make PAR files for everything, because bit rot scares me. But, from https://eclecticlight.co/2020/04/20/file-integrity-5-how-wel... :
> This has serious implications for the use of Par2 with files much larger than 20 MB, and probably rules it out as a method of ECC for those larger than 1 GB.
I assumed 10% PAR file size == resistance to 10% of the input file being corrupted, but that’s not how it works. The article shows some nonlinear and non-obvious relationships between input file size, par file size, and maximum number of recoverable errors.
Bit rot is handled reasonably well by erasure codes if you simply add CRC32 checksums (or similar) to verify the parts.
If a piece has bit-rotted, you throw away the whole segment and treat it as an erasure.
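A minimal sketch of that model in Python (the segment size and function names here are made up for illustration, not any real format's layout):

```python
import zlib

CHUNK_SIZE = 64 * 1024  # hypothetical segment size; real formats pick their own

def checksum_chunks(data: bytes) -> list[int]:
    """Split data into fixed-size segments and record a CRC32 per segment."""
    return [zlib.crc32(data[i:i + CHUNK_SIZE])
            for i in range(0, len(data), CHUNK_SIZE)]

def find_bad_chunks(data: bytes, stored_crcs: list[int]) -> list[int]:
    """Return indices of segments whose CRC32 no longer matches; those whole
    segments get thrown away and handed to the erasure code as erasures."""
    return [i for i, crc in enumerate(checksum_chunks(data))
            if crc != stored_crcs[i]]
```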
CRC32 is closely related to Reed-Solomon / Galois fields: it's basically repeated division with remainders in a Galois field. And as we all know, division is very good at mixing up bits (true in normal math as well as in Galois fields).
The real benefit of cyclic codes is the guarantee of catching any burst error of 32 bits or fewer (for a CRC32). You only get a chance of an undetected error if the error region is larger than the CRC size.
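If you want to see the division explicitly, here's a bit-at-a-time CRC-32 sketch. It's the standard reflected form with polynomial 0xEDB88320 (so it should agree with zlib.crc32); every inner step is one step of long division over GF(2), where subtraction is just XOR:

```python
import zlib

def crc32_bitwise(data: bytes) -> int:
    """CRC-32 as long division: divide the message by the generator polynomial
    over GF(2) and keep the 32-bit remainder (reflected form, poly 0xEDB88320)."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # One division step: shift, and "subtract" (XOR) the polynomial
            # whenever the bit falling off the end demands it.
            crc = (crc >> 1) ^ (0xEDB88320 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF

assert crc32_bitwise(b"hello") == zlib.crc32(b"hello")
```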
------
Indeed: the whole erasure-code / error-correction-code field uses complex math constructs precisely so that these tight guarantees can be made (be it CRC32, Reed-Solomon, or any old-school bit-error algorithm).
It's my understanding that par2 is designed for missing files (parts of a multi-part archive), not the uniform random bit-rot corruption used in that article. I think it can recover a much larger corrupted or missing block, approaching the size of the parity files.
But yeah if that's your data loss model then par2 isn't the right approach. (Not sure what is.)
"Any sufficiently advanced technology is indistinguishable from magic." -- Arthur C. Clarke
I thought the same thing when using PAR files. They're still useful today if you save things on media that can be damaged (CD, DVD, Blu-ray) or across multiple media.
Eventually, I decided to dig into the math behind it. It is a surprisingly simple principle:
Given X data points, there is one and only one polynomial with X coefficients (degree X-1) whose curve passes through all of those points.
So: stripe the data into bands of arrays, fit the polynomial to each band, compute additional points on the curve, and save them alongside the original data. As long as you still have at least an array's worth of data points (original values and/or parity values) and know each point's position in the list (and thus which data is missing), there is one and only one solution for the polynomial's coefficients. Once you solve for the polynomial again, you can compute any point, including the missing ones. Again, because there is one and only one curve that fits.
The devil is in the math needed to solve the polynomials, which is why it is so computationally intensive.
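Here's a toy version of that interpolation trick in Python, working in the prime field GF(257) just to keep the arithmetic readable (real Reed-Solomon works in GF(2^8) so every value fits in a byte; treat this as an illustration, not how PAR2 actually implements it):

```python
P = 257  # prime field size for this toy; real Reed-Solomon uses GF(2^8)

def interpolate(points: list[tuple[int, int]], x: int) -> int:
    """Lagrange interpolation: evaluate, at position x, the unique polynomial
    of degree < len(points) that passes through the given (position, value) pairs."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P  # pow(..., -1, P) = modular inverse
    return total

def encode(data: list[int], n_parity: int) -> list[int]:
    """Treat data[i] as the point (i, data[i]) and emit extra points on the
    same curve at positions len(data), len(data)+1, ..."""
    points = list(enumerate(data))
    return [interpolate(points, len(data) + k) for k in range(n_parity)]

def recover(survivors: list[tuple[int, int]], missing_pos: int) -> int:
    """Any len(data) surviving points (original or parity, positions known)
    pin down the same curve again, so a missing value can be re-evaluated."""
    return interpolate(survivors, missing_pos)

# Stripe of 4 data values plus 2 parity values: any 4 of the 6 points suffice.
data = [10, 20, 30, 40]
p1, p2 = encode(data, 2)
survivors = [(0, 10), (2, 30), (4, p1), (5, p2)]   # positions 1 and 3 were lost
assert recover(survivors, 1) == 20
assert recover(survivors, 3) == 40
```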
PAR files use Reed-Solomon error correction, which IMO might as well be magic.
Galois fields are really awesome (and are related to CRC codes). The level of effort to learn them is quite high. NASA's guide to Reed-Solomon was amazing, though.
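As a small taste of the field arithmetic: addition in GF(2^8) is plain XOR, and multiplication is carry-less multiplication reduced by a fixed polynomial. A sketch using 0x11D, a reducing polynomial commonly seen in Reed-Solomon implementations (e.g. QR codes):

```python
def gf256_mul(a: int, b: int) -> int:
    """Multiply two elements of GF(2^8): XOR-based (carry-less) multiplication,
    reduced modulo x^8 + x^4 + x^3 + x^2 + 1 (0x11D)."""
    result = 0
    while b:
        if b & 1:
            result ^= a          # "add" (XOR) a shifted copy, no carries
        a <<= 1
        if a & 0x100:
            a ^= 0x11D           # reduce: "subtract" the field polynomial
        b >>= 1
    return result

# x * x^7 = x^8, which reduces to x^4 + x^3 + x^2 + 1 = 0x1D in this field.
assert gf256_mul(0x02, 0x80) == 0x1D
```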
-----------
XOR codes are easier and sloppier, but they're actually what's used today despite being imperfect.
Let's say you have A, B, C... Z as your data.
Parity#1 could be XOR(A, B, Z). If B were missing, Parity#1 XOR A XOR Z can back-calculate B.
Parity#2 can be the XOR of a random subset of all previous entries (including Parity#1). Etc., etc.
Keep adding XOR parity codes until your probability of reconstruction is high enough to please you.
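A minimal sketch of the back-calculation in Python (the block contents are made up):

```python
from functools import reduce

def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-length byte blocks together, byte by byte."""
    return reduce(lambda x, y: bytes(a ^ b for a, b in zip(x, y)), blocks)

# Data blocks A..D (equal length); one parity block covering all of them.
A, B, C, D = b"spam", b"eggs", b"hams", b"tofu"
parity = xor_blocks([A, B, C, D])

# If B is lost, XORing the parity with every surviving block cancels them
# out and leaves B, because X ^ X = 0.
recovered_B = xor_blocks([parity, A, C, D])
assert recovered_B == B
```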
Also, I've found that while most (non-technical) people don't have a concept of XOR, they probably took basic algebra and understand 1 + ? = 3.
Arithmetic wouldn't be a good implementation due to integer overflow (a problem XOR doesn't have), but it's helpful if you ever have to explain it to the less technical business person who needs to sign off on the purchasing decision.
That's how I used to explain what a nonce was, and what all the computers "mining" bitcoin were doing, and then explain: "they're instead trying to get a number that has a certain number of zeros in a specific place."