I made a table because I wanted to test more files, but almost all PDFs I downloaded/had stored locally were already compressed and I couldn't quickly find a way to decompress them.
Brotli seemed to have a very slight edge over zstd, even on the larger pdf, which I did not expect.
EDIT: Something weird is going on here. When compressing with zstd in parallel it produces the garbage results seen here, but when compressing on a single core it produces results competitive with Brotli (37M). See: https://news.ycombinator.com/item?id=46723158
Turns out these numbers are caused by APFS weirdness. I used 'du' to get them, which reports the size on disk, and that is bloated for some reason when compressing in parallel. I should've used 'du -A', which reports the apparent size.
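For anyone who wants to see the difference themselves, it looks roughly like this (filename is just an example):

    du -h learnopengl.pdf.zst      # size on disk, oddly inflated when zstd writes in parallel
    du -A -h learnopengl.pdf.zst   # apparent size, the number I should have reported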
Here's a table with the correct sizes, reported by 'du -A' (which shows the apparent size):
Ran the tests again with some more files, this time decompressing the PDF in advance. I picked some widely available PDFs to make the experiment reproducible.
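In case it helps with reproducing: one way to decompress the streams inside a PDF is qpdf (mutool clean -d should also work), roughly:

    qpdf --stream-data=uncompress input.pdf uncompressed.pdf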
For learnopengl.pdf I also tested the decompression performance, since it is such a large file, and got the following (less surprising) results using 'perf stat -r 5':
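Something along these lines, if anyone wants to repeat it (not my exact command lines; output goes to /dev/null so disk writes don't skew the timing):

    perf stat -r 5 sh -c 'zstd -dc learnopengl.pdf.zst > /dev/null'
    perf stat -r 5 sh -c 'brotli -dc learnopengl.pdf.br > /dev/null'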
The conclusion seems to be consistent with what brotli's authors have said: brotli achieves slightly better compression, at the cost of a little over half the decompression speed.
I wasn't sure. I just went in with the (probably faulty) assumption that if it compresses to less than 90% of the original size, it has enough "non-randomness" to compare compression performance.
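The check itself was nothing fancier than comparing sizes after a quick pass, e.g.:

    zstd -c some.pdf | wc -c    # compare with the original size; under ~90% = worth benchmarking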
Maybe that was too strongly worded, but there was an expectation for zstd to outperform, so the fact that it didn't means the result was unexpected. I generally find it helpful to understand why something performs better than expected.
Isn't zstd primarily designed to provide decent compression ratios at amazing speeds? The reason it's exciting is mainly that you can add compression to places where it didn't necessarily make sense before because it's almost free in terms of CPU and memory consumption. I don't think it has ever had a stated goal of beating compression ratio focused algorithms like brotli on compression ratio.
I actually thought zstd was supposed to be better than Brotli in most cases, but a bit of searching reveals you're right... Brotli, especially at its highest compression levels (10/11), often exceeds zstd at its highest compression levels (20-22). Both are very slow at those levels, although perfectly suitable for "compress once, decompress many" applications, of which the PDF spec is obviously one.
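For reference, those top levels correspond to roughly these invocations (zstd needs --ultra for anything above 19):

    brotli -q 11 big.pdf
    zstd --ultra -22 big.pdf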
It's interesting how early programming languages are mentioned as having proper names. In the late '60s and early '70s we went from BCPL to B to C. B and C are not descriptive at all. Should they have been named "operating systems language" or "portable Unix language" instead?
I also fail to see how AWK is a well-named tool according to these metrics. The initials of the creators don't tell you anything about what it actually is or does.
I used vim for about one year before switching to kakoune.
Using vim motions in IntelliJ or just vim hasn't been a problem for me at all. I don't use very advanced shortcuts (the most complicated motions I ever use are ciw and the like). Even though they are quite different (ciw is <alt>iwc in kakoune), I adjust to them quite easily. Kakoune has also helped me get better at using vim, because there is quite a bit of overlap and it is more discoverable in kakoune (<alt>i summons a clippy that shows all the different objects you can select).
Except that ripgrep isn't actually a drop-in replacement for grep, as it behaves differently. It is a nice program, don't get me wrong, but it is not interchangeable with grep.
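For example, with default settings the "same" search can return different results, because rg is recursive by default and skips ignored, hidden and binary files:

    grep -rn pattern .     # searches everything under the current directory
    rg pattern             # recursive by default, but respects .gitignore and skips hidden/binary files
    rg -uuu pattern        # each -u drops one of those filters, getting closer to grep's behavior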