Problem 2 can be solved by "cryptographic erasure" techniques.
1. Encrypt the data with an ephemeral key.
2. Upon delete, just wipe the key. The data becomes unrecoverable.
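The two steps above can be sketched like this (a toy `ErasableBlob`; the hash-based keystream is illustrative only — real code should use an AEAD like ChaCha20-Poly1305):

```python
import os
from hashlib import sha256

class ErasableBlob:
    """Cryptographic erasure sketch: data is stored encrypted under an
    ephemeral key; wiping the key renders the ciphertext unrecoverable."""

    def __init__(self, plaintext: bytes):
        self.key = os.urandom(32)  # ephemeral key, never written next to the data
        self.ciphertext = self._xor(plaintext, self.key)

    @staticmethod
    def _xor(data: bytes, key: bytes) -> bytes:
        # Derive a keystream by hashing (key, block counter); XOR is its own inverse.
        out = bytearray()
        for block in range((len(data) + 31) // 32):
            pad = sha256(key + block.to_bytes(8, "little")).digest()
            chunk = data[block * 32:(block + 1) * 32]
            out.extend(b ^ p for b, p in zip(chunk, pad))
        return bytes(out)

    def read(self) -> bytes:
        if self.key is None:
            raise PermissionError("data was cryptographically erased")
        return self._xor(self.ciphertext, self.key)

    def erase(self) -> None:
        self.key = None  # wiping the key IS the delete; ciphertext can stay
```

The ciphertext can remain on disk, in backups, or in other replicas; once the key is gone the data is gone.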
I'm not sure about problem 1. A fast seekable stream cipher like ChaCha might help here (don't use e.g. RC4 for this), but a faster hash like BLAKE3 might be more appropriate. I'm happy to noodle over this some more. These are just my unfiltered thoughts.
If you combined the two: Each entry in the graph could have a public salt used for key derivation for the actual data. This salt would not be part of the DAG, but stored outside of it. To delete an entry, zero out the salt and the underlying key becomes unrecoverable, which makes the data unreadable without needing to change the history of the DAG.
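A minimal sketch of that salt-outside-the-DAG idea, assuming the salt is the sole high-entropy input to the KDF (so destroying it destroys the key); all names here are hypothetical:

```python
import os
from hashlib import blake2b

# Public salts live OUTSIDE the DAG, keyed by entry id.
salts: dict[str, bytes] = {}

def create_entry(entry_id: str) -> bytes:
    salts[entry_id] = os.urandom(32)
    return derive_key(entry_id)

def derive_key(entry_id: str) -> bytes:
    # key = KDF(salt). Once the salt is zeroed, the key can never be
    # re-derived, so the entry's ciphertext is permanently unreadable.
    salt = salts[entry_id]
    if salt == bytes(32):
        raise PermissionError(f"entry {entry_id!r} was erased")
    return blake2b(salt, digest_size=32).digest()

def erase_entry(entry_id: str) -> None:
    salts[entry_id] = bytes(32)  # zero the salt; DAG history is untouched
```

Note the DAG itself never changes shape: erasure only mutates the external salt table.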
Yes. But diamond types stores a new entry in the DAG for every keystroke each user makes. The current scheme of IDs being (agent, sequence number) allows subsequent operations by the same user to be trivially run-length encoded. E.g. if I make 1000 changes in order, we can encode that as (“seph”, 1..1000). It takes much less than 1 byte of overhead on disk per character typed.
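For concreteness, that run-length trick can be sketched like this (not diamond types' actual encoder, just the scheme described above):

```python
def rle_encode(ops):
    """Run-length encode a stream of (agent, seq) IDs: consecutive ops by
    the same agent with consecutive sequence numbers collapse to one run."""
    runs = []  # list of (agent, start, next_expected_seq)
    for agent, seq in ops:
        if runs and runs[-1][0] == agent and runs[-1][2] == seq:
            runs[-1] = (agent, runs[-1][1], seq + 1)  # extend current run
        else:
            runs.append((agent, seq, seq + 1))        # start a new run
    return [(agent, range(start, end)) for agent, start, end in runs]
```

1000 in-order keystrokes by one user collapse to a single `("seph", range(1, 1001))` entry, which is what makes the per-keystroke overhead so small.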
Salts per character + cryptographic erasure would solve problem 2. But how do we do it without a massive increase in storage, memory use and network bandwidth? I don’t want to generate and store a cryptographically unique salt per keystroke. (Even though it might run fine on modern hardware).
I apologize for commenting on this with almost no context about what the actual data structure looks like, but would it be possible to put the random salts "higher up in the tree"? Then whenever you want to delete some small leaf element, you actually delete a whole salted subtree and replace it with a new subtree, mostly the same but with a new salt and missing the deleted element? This might make edits more expensive, in exchange for reducing the storage overhead of the salts?
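One way to picture the "salt higher up in the tree" idea: one salt per subtree, with each leaf's key derived from (subtree salt, leaf index). Deleting a leaf re-encrypts the survivors under a fresh salt and discards the old one, so the deleted leaf's key is unrecoverable. A toy sketch (hypothetical `SaltedSubtree`, hash-keystream encryption for brevity, definitely not how diamond types stores its DAG):

```python
import os
from hashlib import blake2b

class SaltedSubtree:
    """One salt amortized over a whole subtree instead of one per leaf."""

    def __init__(self, leaves: dict[int, bytes]):
        self.salt = os.urandom(32)
        self.blobs = {i: self._xor(i, data) for i, data in leaves.items()}

    def _key(self, i: int) -> bytes:
        # Per-leaf key derived from the shared subtree salt and the leaf index.
        return blake2b(self.salt + i.to_bytes(8, "little"), digest_size=32).digest()

    def _xor(self, i: int, data: bytes) -> bytes:
        key, stream, ctr = self._key(i), b"", 0
        while len(stream) < len(data):
            stream += blake2b(key + ctr.to_bytes(8, "little"), digest_size=32).digest()
            ctr += 1
        return bytes(a ^ b for a, b in zip(data, stream))

    def read(self, i: int) -> bytes:
        return self._xor(i, self.blobs[i])

    def delete(self, i: int) -> None:
        # Decrypt survivors, pick a fresh salt, re-encrypt everything else.
        survivors = {j: self.read(j) for j in self.blobs if j != i}
        self.salt = os.urandom(32)
        self.blobs = {j: self._xor(j, d) for j, d in survivors.items()}
```

This trades one salt per keystroke for one salt per subtree, at the cost of re-encrypting a subtree on each delete — the trade-off described above.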