I used to work on what your paper calls "unsupervised transport", that is, machine translation between two languages without alignment data. You note that this field has existed since ~2016 and you provide a number of references, but you dedicate only ~4 lines of text to this branch of research. There's no discussion of why your technique differs from this prior work or why the prior algorithms can't be applied to the output of modern LLMs.

Naively, I would expect off-the-shelf embedding alignment algorithms (like <https://github.com/artetxem/vecmap> and <https://github.com/facebookresearch/fastText/tree/main/align...>, neither of which are cited or compared against) to work quite well on this problem. So I'm curious if they don't, or why they don't.
I can imagine there is lots of room for improvement around implicit regularization in these algorithms. Specifically, they were designed with word2vec output in mind (typically 300-dimensional vectors with ~200,000 observations), but your problem has higher-dimensional vectors with fewer observations, so it would likely require different hyperparameter tuning. IIRC, there's no explicit regularization in these methods, but hyperparameters like step size/step count can implicitly add L2 regularization, which you probably need for your application.
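To make the implicit-regularization point concrete, here's a rough sketch (my own illustration, not from the paper; the shapes and hyperparameters are made up) of fitting a linear map between two embedding spaces by plain gradient descent, where the step size and step count are the only thing playing the role of a regularizer:

    import numpy as np

    # Hypothetical setup: X (n x d1) and Y (n x d2) are paired embeddings of the
    # same items in two spaces, and we fit Y ~= X @ W with no explicit penalty.
    rng = np.random.default_rng(0)
    n, d1, d2 = 2_000, 768, 768       # few observations, high dimension
    X = rng.standard_normal((n, d1))
    Y = rng.standard_normal((n, d2))

    W = np.zeros((d1, d2))
    lr, steps = 1e-3, 200             # these two knobs are the implicit regularizer
    for _ in range(steps):
        grad = X.T @ (X @ W - Y) / n  # gradient of 0.5 * mean squared error
        W -= lr * grad

    # More steps / a larger step size drives W toward the unregularized
    # least-squares solution; stopping early shrinks W, much like an L2 penalty.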
---
PS.
I *strongly dislike* the name vec2vec. You aren't the first/only algorithm for taking vectors as input and getting vectors as output, and you have no right to claim such a general title.
---
PPS.
I believe there is a minor typo with footnote 1. The note is "Our code is available on GitHub." but it is attached to the sentence "In practice, it is unrealistic to expect that such a database be available."
Hey, I appreciate the perspective. We definitely should cite both those papers, and will do so in the next version of our draft. There are a lot of papers in this area, and they're all a few years old now, so you might understand how we missed two of them.
We tested all of the methods in the Python Optimal Transport package (https://pythonot.github.io/) and reported the max in most of our tables. So some of this is covered. A lot of these methods also require a seed dictionary, which we don't have in our case. That said, you're welcome to take any number of these tools and plug them into our codebase; the results would definitely be interesting, although we'd expect the adversarial methods to still work best, as they do in the problem settings you mention.
As for the name – one of the papers you recommend is called 'vecmap', which seems equally general, doesn't it? Google shows me there are others who have developed their own 'vec2vec'. There is a lot of repetition in AI these days, so collisions happen.
> We tested all of the methods in the Python Optimal Transport package (https://pythonot.github.io/) and reported the max in most of our tables.
Sorry if I'm being obtuse, but I don't see any mention of the POT package in your paper or of what specific algorithms you used from it to compare against. My best guess is that you used the linear map similar to the example at <https://pythonot.github.io/auto_examples/domain-adaptation/p...>. The methods I mentioned are also linear, but contain a number of additional tricks that result in much better performance than a standard L2 loss, and so I would expect those methods to outperform your OT baseline.
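For concreteness, the kind of POT baseline I'm picturing looks roughly like the sketch below -- this is my own guess, not code from your paper, and I'm assuming POT's domain-adaptation API (fit/transform with Xs/Xt) with made-up data:

    import numpy as np
    import ot  # Python Optimal Transport, https://pythonot.github.io/

    # Toy stand-ins for samples from the two embedding spaces (not real data).
    rng = np.random.default_rng(0)
    Xs = rng.standard_normal((500, 768))   # embeddings from model A
    Xt = rng.standard_normal((500, 768))   # embeddings from model B

    # Linear (Gaussian) OT mapping, like the POT domain-adaptation example.
    lin = ot.da.LinearTransport()
    lin.fit(Xs=Xs, Xt=Xt)
    Xs_to_t_linear = lin.transform(Xs=Xs)

    # Entropic OT coupling as an alternative baseline.
    sink = ot.da.SinkhornTransport(reg_e=1.0)
    sink.fit(Xs=Xs, Xt=Xt)
    Xs_to_t_sinkhorn = sink.transform(Xs=Xs)

If I remember right, the vecmap-style methods stack extra steps (normalization, whitening, re-weighting, iterative self-learning) on top of a linear map like this, which is where I'd expect the performance gap to come from.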
> As for the name – one of the papers you recommend is called 'vecmap', which seems equally general, doesn't it? Google shows me there are others who have developed their own 'vec2vec'. There is a lot of repetition in AI these days, so collisions happen.
But both of those papers are about generic vector alignment, so the generality of the name makes sense. Your contribution here seems specifically about the LLM use case, and so a name that implies the LLM use case would be preferable.
I do agree though that in general naming is hard and I don't have a better name to suggest. I also agree that there are lots of related papers and you can't reasonably cite/discuss them all.
And I don't mean to be overly critical... the application to LLMs is definitely cool. I wouldn't have read the paper and written up my critiques if I didn't overall like it :)
Naming things is hard. Note that the two alternative approaches you referenced are called "vecmap" and "alignment"; "you aren't the first/only algorithm for ... and you have no right to claim such a general title" could easily apply to those as well.
Except those papers are 8ish years old; they actually were among the first 2-3 algs for this task; and they studied the fully general vector space alignment problem. But I agree that naming things is hard and don't have a better name.
Imagine having more than a passing understanding of philosophy and then reading almost any major computer science paper. By this "no right to claim" logic, I'd have you all on trial.
The problem solved in this paper is strictly harder than alignment. Alignment works with multiple, unmatched representations of the same inputs (e.g., different embeddings of the same words). The goal is to match them up.
The goal here is harder: given an embedding of an unknown text in one space, generate a vector in another space that's close to the embedding of the same text -- but, unlike in the word alignment problem, the texts are not known in advance.
Neither unsupervised transport nor optimal alignment can solve this problem. Their input sets must be embeddings of the same texts; the input sets here are embeddings of different texts.
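Schematically (my own pseudocode, not the paper's notation), the two problems take different inputs:

    # Alignment / optimal assignment (the older problem):
    #   given A = {f(x_1), ..., f(x_n)} and B = {g(x_pi(1)), ..., g(x_pi(n))},
    #   i.e. two embedding sets of the SAME texts in scrambled order,
    #   recover the correspondence between A and B.
    def align(embeddings_a, embeddings_b):
        ...

    # Translation (this paper's problem):
    #   trained only on unpaired embeddings of DIFFERENT texts; at test time,
    #   given f(x) for a text x never seen before, output a vector close to g(x).
    def translate(f_embedding_of_unseen_text):
        ...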
FWIW, this is all explained in the paper, including even the abstract. The comparisons with optimal assignment explicitly note that it is an idealized pseudo-baseline, and in reality OA cannot be used for embedding translation (as opposed to matching, alignment, correspondence, etc.).