You can use ListenBrainz to discover new music based on your listening activity from your self-hosted library. I've started doing this recently with Navidrome and I'm happy with the results. This is the plugin I've been using: https://github.com/kgarner7/navidrome-listenbrainz-daily-pla....
There is also Troi[1], a tool provided by ListenBrainz to generate playlists and radios from your local music collection.
If you don't mind self-hosting, I've recently started using ListenBrainz in combination with Navidrome. You can upload your Spotify listen history to seed it, then scrobble your ongoing listening to keep it up to date. You can use a plugin[1] to automatically generate daily, weekly, and discovery playlists based on your listen history and what's available in your library. You can generate even more playlists from ListenBrainz data via their tool, Troi[2].
In my org they count both the number of pull requests and the number of comments you add to reviews. Easily gamed, but those are the performance metrics they use to compare every engineer now.
V2 definitely makes it easier, but it's possible to identify files with V1 by iterating through the metadata looking for files with the same length as a local file, then checking each piece against the local file. If they all match, it's the same file. For boundary pieces that span multiple files, I think it's safe enough to ignore them if all of the remaining pieces match the local file, but you could do a more complex search for sets of local files with matching lengths and then compare the boundary piece hashes against them.
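To make that concrete, here's a minimal sketch of the v1 check, assuming the metainfo has already been bencode-decoded into a dict with byte-string keys and that the file's byte offset within the torrent has been computed from the metadata's file list; pieces that cross file boundaries are simply skipped, as described above.

    import hashlib

    def v1_file_matches(info: dict, local_path: str, file_offset: int, file_length: int) -> bool:
        """Check every piece that lies entirely within one file of a v1 torrent.

        info        -- the decoded "info" dict of the metainfo
        file_offset -- the file's byte offset within the torrent's concatenated data
        file_length -- the file's length according to the metadata
        """
        piece_len = info[b"piece length"]
        hashes = info[b"pieces"]                          # concatenated 20-byte SHA-1 digests

        first = -(-file_offset // piece_len)              # first piece fully inside the file
        last = (file_offset + file_length) // piece_len   # one past the last such piece
        if first >= last:
            return False                                  # no whole piece to check; undecidable

        with open(local_path, "rb") as f:
            for idx in range(first, last):
                f.seek(idx * piece_len - file_offset)     # piece offset relative to file start
                data = f.read(piece_len)
                if hashlib.sha1(data).digest() != hashes[idx * 20:(idx + 1) * 20]:
                    return False
        return True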
I wrote a project to do this using libtorrent a while ago, but unfortunately libtorrent crashes when seeding thousands of torrents at once, which is the current blocker. I haven't worked on it since.
I did work on a proof-of-concept program to accomplish this for my own content library. It would scan a directory to find files and compare them with locally stored metadata. For v2 torrents this is trivial to do via a "pieces root" lookup; for v1 torrents it involves checking that each piece matches, and since pieces may not align with file boundaries, it's not possible to guarantee it's the same file without having all of the other files in the torrent.
I built it with libtorrent, and after loading in all of the torrents (multiple TBs of data) it would promptly and routinely crash. I couldn't find the cause of the error; it doesn't seem to have been designed to run with thousands of torrents.
One problem that I've yet to build a solution for is finding the metadata to use for the lookup phase. I haven't been able to find a publicly available database of torrent metadata. If you have an info hash then itorrents.org will give you the metadata, if it exists. I started scraping metadata via DHT announcements, but it's not exactly fast, and each client would have to do this unless they can share the database of metadata between them (I have an idea on how to accomplish this via BEP 46).
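For reference, a minimal sketch of that itorrents.org lookup; the URL pattern (/torrent/<INFOHASH>.torrent with the hex-encoded info hash) is my assumption about how that cache is typically queried, so treat it as illustrative rather than a documented API.

    import urllib.request

    def fetch_metainfo(info_hash_hex: str) -> bytes | None:
        # Assumed cache URL pattern; returns the bencoded .torrent if it's cached.
        url = f"https://itorrents.org/torrent/{info_hash_hex.upper()}.torrent"
        try:
            with urllib.request.urlopen(url, timeout=30) as resp:
                return resp.read()
        except OSError:                # covers urllib.error.URLError / HTTPError
            return None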
Sampling the DHT is how I was originally achieving this, and as I said, it's very slow. I don't think it would be a good solution on its own because it would require every client to constantly sample all DHT nodes, downloading all metadata and indexing it for a potential future lookup. It's a huge amount of additional load on the DHT.
I think a better solution would be some way for clients to query the DHT for a specific "pieces root", but I'm not sure that having every client publish the "pieces root" entries for the torrents it knows about would be a good idea either. Some kind of distributed metadata database that clients can query would be ideal.
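Locally, the "index it for a future lookup" half of that could be as simple as the sketch below: a table mapping each known "pieces root" (and file length) to the info hashes that contain it. The table and column names are made up for illustration.

    import sqlite3

    def open_index(path: str = "metadata_index.db") -> sqlite3.Connection:
        db = sqlite3.connect(path)
        db.execute(
            """CREATE TABLE IF NOT EXISTS torrent_files (
                   pieces_root BLOB,     -- 32-byte SHA-256 root (v2); NULL for v1-only files
                   file_length INTEGER,  -- length of the file in bytes
                   info_hash   BLOB,     -- torrent the file appears in
                   file_path   TEXT      -- path of the file inside the torrent
               )"""
        )
        db.execute("CREATE INDEX IF NOT EXISTS idx_root ON torrent_files (pieces_root)")
        db.execute("CREATE INDEX IF NOT EXISTS idx_length ON torrent_files (file_length)")
        return db

    def torrents_containing(db: sqlite3.Connection, pieces_root: bytes) -> list[bytes]:
        rows = db.execute(
            "SELECT DISTINCT info_hash FROM torrent_files WHERE pieces_root = ?",
            (pieces_root,),
        ).fetchall()
        return [row[0] for row in rows]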
A v2 torrent allows clients to identify duplicate files across torrents via the "pieces root" entry in the metadata, so if they're downloading torrents A and B, and both contain file C, they can use peers from either swarm.
But there's no way for other clients to know that another torrent containing the file they're interested in exists if they only have the metadata for torrent A. In other words, there's no mechanism to look up a "pieces root" and learn that torrent B exists and contains file C.
If you were to make a v2 torrent of your entire drive, other clients wouldn't know to download from your torrent. They'd need the contents of your metadata to know it contains a file they're interested in, and they have no way of knowing which metadata contains the desired "pieces root" entries without downloading all of it.
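To illustrate, the only way to learn which "pieces root" values a torrent offers is to walk its already-downloaded v2 metadata, something like the sketch below (assuming the metainfo has been bencode-decoded into nested dicts with byte-string keys, following BEP 52's "file tree" layout).

    def collect_pieces_roots(file_tree: dict, prefix: tuple = ()) -> dict:
        """Map each 32-byte "pieces root" to the file path it belongs to."""
        roots = {}
        for name, node in file_tree.items():
            if name == b"":
                # Leaf entry: this dict describes a single file.
                if b"pieces root" in node:          # empty files have no pieces root
                    roots[node[b"pieces root"]] = b"/".join(prefix).decode()
            else:
                roots.update(collect_pieces_roots(node, prefix + (name,)))
        return roots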
I'm very interested in this problem space, so if you are aware of clients or mechanisms that allow for this, I would love to hear about them.
But you already advertise (via the DHT, which offers no secrecy) the torrent file... and that contains all the data needed to get all the files within the torrent.
So this data is already being published, just in a less queryable form.
A recurring problem I've noticed in companies with large, legacy codebases is upgrading dependencies. These projects have typically been around for a long time, no one currently working on them understands them deeply, and test coverage is quite low. The result is that developers can't rely on a successful CI/CD run to know whether an upgrade broke the project. They instead resort to a lot of manual searching and testing to verify that everything still works.
I built Dutest to easily identify where dependencies are used within a project and what the test coverage is like, broken down by test type (unit, integration, or end-to-end). This lets me quickly triage whether an upgrade can be done quickly and safely (high test coverage + low number of usages) or whether it is more risky (low test coverage + high number of usages). It also provides references so you can jump straight to the relevant line of code in an IDE.
I thought I'd build this as my first solo SaaS product and see if it solves a problem that others are experiencing. I'm happy to hear your feedback and answer any questions.
[1] https://troi.readthedocs.io/en/latest/index.html