
I must admit reading the abstract made me think to myself that I should read the paper in skeptical mode.

Does this extend to being able to analytically determine which concepts are encodable in one embedding but not another? An embedding from a deft tiny stories LLM presumably cannot encode concepts about RNA replication.

Assuming that is true. If you can detect when you are trying to put a square peg into a round hole, does this mean you have the ability to remove square holes from a system?

Very fair!

> Does this extend to being able to analytically determine which concepts are encodable in one embedding but not another? An embedding from a deft tiny stories LLM presumably cannot encode concepts about RNA replication.

Yeah, this is a great point. We're mostly building off of this prior work on the Platonic Representation Hypothesis (https://arxiv.org/abs/2405.07987). I think our findings only go so far as to apply to large-enough models that are well-enough trained on The Internet. So, text and images. Maybe audio, too, if the audio is scraped from the Internet.

So I don't think your tinystories example qualifies for the PRH, since it's not enough data and it's not representative of the whole Internet. And RNA data is (I would guess) something very different altogether.

> Assuming that is true. If you can detect when you are trying to put a square peg into a round hole, does this mean you have the ability to remove square holes from a system?

Not sure I follow this part.


>So I don't think your tinystories example qualifies for the PRH, since it's not enough data and it's not representative of the whole Internet. And RNA data is (I would guess) something very different altogether.

My thought there was that you'd be comparing tinystories to a model that trained on the entire internet. The RNA-related information would be a subset of the second representation that has no comparable encoding in the tinystories space. Can you detect that? If both models have to be of sufficient scale for this to work, the question becomes "what is the scale, and is it sliding or a threshold?"

>> Assuming that is true. If you can detect when you are trying to put a square peg into a round hole, does this mean you have the ability to remove square holes from a system?

>Not sure I follow this part.

Perhaps the metaphor doesn't work so well. If you can detect that something is encodable in one embedding model but not another, can you then leverage that detection ability to modify an embedding model so that it cannot represent an idea?


As I read the paper, you would be able to detect it in a couple of ways:

1. possibly high loss where the models don't have compatible embedding concepts

2. given a sufficient "sample" of vectors from each space, projecting them to the same backbone would show clusters where they have mismatched concepts
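The second idea could be sketched roughly like this. Everything here is a toy stand-in: the "shared backbone" projections are just synthetic point clouds, and the 1.0 cutoff is an arbitrary illustrative threshold, not anything from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins: two embedding spaces already projected into a
# shared backbone space. Space A covers an extra concept cluster that
# space B cannot represent.
common = rng.normal(size=(100, 8))
space_a = np.vstack([common + 0.01 * rng.normal(size=(100, 8)),
                     rng.normal(loc=5.0, size=(50, 8))])  # A-only concepts
space_b = common + 0.01 * rng.normal(size=(100, 8))

def min_dist_to(points, reference):
    """Distance from each point to its nearest neighbor in `reference`."""
    d = np.linalg.norm(points[:, None, :] - reference[None, :, :], axis=-1)
    return d.min(axis=1)

# Points in A with no nearby counterpart in B look like "mismatched
# concepts". The cutoff of 1.0 is an arbitrary illustrative threshold.
unmatched = min_dist_to(space_a, space_b) > 1.0
print(unmatched.sum())  # 50: exactly the A-only cluster gets flagged
```

With real embeddings you'd presumably need something smarter than a fixed distance cutoff, but the shape of the check is the same: sample, project to the common space, and look for regions with no counterpart.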

It's not obvious to me how you'd use either of those to tweak the vector space of one to not represent some concept, though.

But if you just wanted to make an embedding that is unable to represent some concept, presumably you could already do that by training the model to map all "unrepresentable concepts" to a single point.
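In the trivially linear case that collapse has a closed form: project the embedding matrix onto the null space of the inputs you want to erase. This is purely a toy sketch of the idea, not anything a real model would do directly, and a real model would also need a term preserving its behavior on everything else.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical linear "embedding" W, and a batch of inputs carrying the
# concept we want to make unrepresentable.
dim_in, dim_out = 16, 8
W = rng.normal(scale=0.1, size=(dim_in, dim_out))
forbidden = rng.normal(size=(10, dim_in))

# Projector onto the null space of the forbidden inputs. After applying
# it, forbidden @ W_collapsed == 0: every forbidden input lands on the
# same single point (the origin).
P = np.eye(dim_in) - np.linalg.pinv(forbidden) @ forbidden
W_collapsed = P @ W

spread = np.linalg.norm(forbidden @ W_collapsed, axis=1).max()
print(spread)  # ~0: all forbidden inputs collapse to one point
```

For a nonlinear encoder you'd approximate the same effect with a training loss pulling those inputs to a fixed sink vector, which is the "training to a single point" idea above.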


I read a lot of AI papers on arxiv, and it's been a while since I read one where the first line of the abstract had me scoffing and done.

> We introduce the FIRST method for translating text embeddings from one vector space to another without any paired data

(emphasis mine)

Nope. I'm not gonna do a literature search for you right now and find the references, but this is certainly not the first attempt to do unsupervised alignment of embeddings, text or otherwise. People were doing this back in ~2016.


There has been plenty of work on alignment of embeddings, a lot of it cited in the paper. This paper solves the problem of translation, where (unlike in word alignment) there isn't already a set of candidate vectors in the target embedding space. It's generation, not matching.
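The distinction can be made concrete with a toy example. Here an orthogonal map `rot` is an invented stand-in for whatever relates the two spaces; the point is only that matching is restricted to a fixed candidate set, while translation generates a fresh vector.

```python
import numpy as np

rng = np.random.default_rng(2)

# Pretend the two spaces are related by an unknown orthogonal map `rot`
# (a stand-in for whatever a learned translator approximates); `target`
# is where a source vector "should" land in the other space.
d = 4
rot, _ = np.linalg.qr(rng.normal(size=(d, d)))
source_vec = rng.normal(size=d)
target = source_vec @ rot

# Word-alignment-style MATCHING: choose the closest vector from a fixed
# candidate set that already lives in the target space.
candidates = rng.normal(size=(100, d))
match = candidates[np.argmin(np.linalg.norm(candidates - target, axis=1))]

# TRANSLATION: generate a brand-new target-space vector directly, with
# no candidate set at all (here the oracle map plays the translator).
translated = source_vec @ rot

print(np.linalg.norm(translated - target))  # 0.0: generation can be exact
print(np.linalg.norm(match - target))       # > 0: best candidate misses
```

Matching can never do better than the nearest existing candidate; generation has no such floor, which is why the unsupervised-alignment work from ~2016 doesn't cover this setting.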

