As I understand it, the restriction of LLMs has nothing to do with getting poor ...

bee_rider · 2025-07-05T19:51:40 1751745100

Huh. That’s an interesting additional risk. I don’t think it is what the original commenter meant, because they were talking about catching cheaters. But it is interesting to think about…

I dunno. There generally isn’t super high security around preprint papers (lots of people just toss their own up on arxiv, after all). But, yeah, it is something that you’ve been asked to look after for somebody, which is quite important to them, so it should probably be taken pretty seriously…

I dunno. The extent to which, and the timelines for, the big proprietary LLMs to feed their prompts back into the training set, are hard to know. So, hard to guess whether this is a serious vector for leaks (and in the absence of evidence it is best to be prudent with this sort of thing and not do it). Actually, I wonder if there’s an opening for a journal to provide a review-helper LLM assistant. That way the journal could mark their LLM content however they want, and everything can be clearly spelled out in the terms and conditions.

mbreese · 2025-07-05T20:09:55 1751746195

>I don’t think it is what the original commenter meant, because they were talking about catching cheaters.

That's why I mentioned it. Worrying about training on the submitted paper is not the first thing I'd think of either.

When I've reviewed papers recently (cancer biology), this was the main concern from the journal. Or at least, this was my impression of the journal's concern. I'm sure they want to avoid exclusively AI processed reviews. In fact, that may be the real concern, but it might be easier to get compliance if you pitch this as the reason. Also, authors can get skittish when it comes to new technology that not everyone understands or uses. Having a blanket ban on LLMs could make it more likley to get submissions.

baxtr · 2025-07-05T19:29:27 1751743767

I don’t think that’s how LLMs work. If that was the case anyone could feed them false info eg for propaganda purposes…

bee_rider · 2025-07-05T19:58:47 1751745527

Of course, LLMs have training and inference stages clearly split out. So I don’t think prompts are immediately integrated into the model. And, it would be pretty weird if there was some sort of shared context where that all the prompts got put into, because it would grow to some absurdly massive size.

But, I also expect that eventually every prompt is going to be a candidate for being added into the training set, for some future version of the model (when using a hosted, proprietary model that just sends your prompts off to some company’s servers, that is).

coliveira · 2025-07-05T20:21:41 1751746901

That's nonsense. I can spend the whole day creating false papers on AI, then feeding it back to another AI to check its "quality". Is this making the paper to be "remembered" by AI? If yes, then we have deeper problems and we shouldn't be using AI to do anything related to science.

mbreese · 2025-07-05T20:58:36 1751749116

The key option in ChatGPT is under Data controls.

"Improve the model for everyone - Allow your content to be used to train our models, which makes ChatGPT better for you and everyone who uses it."

It's this option that gives people pause.

convolvatron · 2025-07-05T21:24:51 1751750691

not that fact that a 4 year old on LSD is deciding what qualifies as good science?

bee_rider · 2025-07-05T21:36:35 1751751395

I think he means WRT the leaking issue that we were discussing.

If someone is just, like, working chatGPT up to automatically review papers, or using Grok to automatically review grants with minimal human intervention, that’d obviously be a totally nuts thing to do. But who would do such a thing, right?