> To maintain safety, no operational details are included in this manuscript
What is it with this!? That's the second paper this week that self-censors (this was the other one [1]). What's the point of publishing your findings if others can't reproduce them?
I imagine it's simply a matter of taking the CSV dataset of prompts from here[0] and prompting an LLM to turn each one into a formal poem, then using the converted prompts as the first prompt in whichever LLM you're benchmarking.
Also, arXiv papers appear here too often, imo. It’s a preprint. Why not wait a bit for the paper to be published? (And if it’s never published, it’s not worth it.)
1: https://arxiv.org/abs/2511.12414