Very cool to see the effectiveness of recurrence on ARC. For those interested in recurrence, here are other works that leverage a similar approach for other types of problems:
This third-party browser extension looks like it just hides UI content, whereas with Kagi your preferences shape what is actually returned from the search engine and in what order. I don't think these things are all that similar.
Filter lists can be hosted anywhere and imported with the @ syntax:
# Make these domains stand out in results
+en.wikipedia.org
+stackoverflow.com
+github.com
+api.rubyonrails.org
# SPAM - never show these results
experts-exchange.com
# Pull filters from external source
@https://clobapi.herokuapp.com/default-filters.txt
This default list is the only one I distribute, but users have come up with their own lists.
It would be nice to have a GitHub repo with such lists (or meta lists: the @ syntax works recursively, allowing lists to import other lists).
Your suggestion of having a standard for the list syntax is interesting.
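If a standard did emerge, the format would be easy to parse. Here's a rough Python sketch of a parser for the syntax as shown above ('#' comments, '+' to raise a domain, '@' to import another list recursively, bare domains to block). The function name, return structure, and cycle guard are my own assumptions, not part of any existing implementation:

    import urllib.request

    def parse_filters(lines, raised=None, blocked=None, seen_urls=None):
        """Parse the filter-list syntax sketched above (hypothetical helper)."""
        raised = raised if raised is not None else set()
        blocked = blocked if blocked is not None else set()
        seen_urls = seen_urls if seen_urls is not None else set()
        for raw in lines:
            line = raw.strip()
            if not line or line.startswith("#"):
                continue                      # blank line or comment
            elif line.startswith("+"):
                raised.add(line[1:])          # make this domain stand out
            elif line.startswith("@"):
                url = line[1:]
                if url in seen_urls:          # guard against import cycles
                    continue
                seen_urls.add(url)
                with urllib.request.urlopen(url) as resp:
                    imported = resp.read().decode().splitlines()
                parse_filters(imported, raised, blocked, seen_urls)
            else:
                blocked.add(line)             # never show this domain
        return raised, blocked

    # e.g. raised, blocked = parse_filters(open("my-filters.txt"))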
A related question is: "What is the smallest fixed set of guesses that always solves Wordle, narrowing the possible hidden words down to just one?" So far the answer is 8: MODEL LEVIN TAPPA GRABS DURGY FLYTE CHAWK SPOOR [1].
Bringing this down to 5 would mean that one could always win at Wordle with the same fixed set of 5 guesses, leaving the 6th guess for the answer itself. It seems unlikely that such a solution exists, but it's an interesting question nonetheless.
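For anyone who wants to experiment: checking whether a fixed guess set pins down every hidden word reduces to checking that the combined feedback signature is unique per word. A minimal Python sketch, assuming you supply the hidden-word list yourself (it isn't included here):

    from collections import Counter

    def feedback(guess: str, answer: str) -> str:
        """Wordle feedback as a string of G (green), Y (yellow), - (gray)."""
        result = ["-"] * 5
        remaining = Counter()
        # First pass: mark greens, count unmatched answer letters.
        for i, (g, a) in enumerate(zip(guess, answer)):
            if g == a:
                result[i] = "G"
            else:
                remaining[a] += 1
        # Second pass: mark yellows using leftover letter counts.
        for i, g in enumerate(guess):
            if result[i] == "-" and remaining[g] > 0:
                result[i] = "Y"
                remaining[g] -= 1
        return "".join(result)

    def distinguishes_all(fixed_guesses, hidden_words) -> bool:
        """True if the combined feedback is unique for every hidden word."""
        signatures = set()
        for answer in hidden_words:
            sig = tuple(feedback(g, answer) for g in fixed_guesses)
            if sig in signatures:
                return False
            signatures.add(sig)
        return True

    guesses = ["model", "levin", "tappa", "grabs", "durgy", "flyte", "chawk", "spoor"]
    # distinguishes_all(guesses, hidden_words)  -> True if the set pins down every answer

Finding a minimal such set is the hard part: it's a combinatorial search over guess sets, with this check as the inner loop.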
Very cool idea. How does fine-tuning the SVD initialization compare to training from random initialization with the same architecture? I couldn't find this in the paper.
These older word embedding models (word2vec, GloVe, LexVec, fastText) are being superseded by contextual embeddings ( https://allennlp.org/elmo ) and fine-tuned language models ( https://ai.googleblog.com/2018/11/open-sourcing-bert-state-o... ). These contextual models can infer that "bank" in "I spent two hours at the bank trying to get a loan" means something very different from "bank" in "The ocean bank is where most fish species proliferate."
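As a concrete illustration of that context sensitivity, here's a sketch using the Hugging Face transformers library and bert-base-uncased (my choice of tooling, not something the comment specifies): the vector for "bank" comes out different in the two sentences, and their cosine similarity is noticeably below 1.

    import torch
    from transformers import AutoTokenizer, AutoModel

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModel.from_pretrained("bert-base-uncased")

    def bank_vector(sentence):
        # Return the contextual embedding of the token "bank" in this sentence.
        inputs = tokenizer(sentence, return_tensors="pt")
        with torch.no_grad():
            hidden = model(**inputs).last_hidden_state[0]
        tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist())
        return hidden[tokens.index("bank")]

    v1 = bank_vector("I spent two hours at the bank trying to get a loan")
    v2 = bank_vector("The ocean bank is where most fish species proliferate")
    print(torch.cosine_similarity(v1, v2, dim=0))

A static embedding model would return the same vector for "bank" in both sentences by construction.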
Language modeling:
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach https://arxiv.org/pdf/2502.05171
Puzzle solving:
A Simple Loss Function for Convergent Algorithm Synthesis using RNNs https://openreview.net/pdf?id=WaAJ883AqiY
End-to-end Algorithm Synthesis with Recurrent Networks: Logical Extrapolation Without Overthinking https://arxiv.org/abs/2202.05826
Can You Learn an Algorithm? Generalizing from Easy to Hard Problems with Recurrent Networks https://proceedings.neurips.cc/paper/2021/file/3501672ebc68a...
General:
Think Again Networks and the Delta Loss https://arxiv.org/pdf/1904.11816
Universal Transformers https://arxiv.org/abs/1807.03819
Adaptive Computation Time for Recurrent Neural Networks https://arxiv.org/pdf/1603.08983
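A toy sketch of the idea these works share (not any particular paper's architecture): a single weight-tied block applied a variable number of times, with the input re-injected each step, so harder inputs can be given more iterations at test time. Everything here, including the re-injection and residual update, is my own simplification.

    import torch
    import torch.nn as nn

    class RecurrentDepthNet(nn.Module):
        """Weight-tied recurrent refinement: the same block is applied
        n_iters times, so test-time compute scales by iterating longer."""
        def __init__(self, dim: int):
            super().__init__()
            self.block = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))
            self.readout = nn.Linear(dim, dim)

        def forward(self, x: torch.Tensor, n_iters: int = 8) -> torch.Tensor:
            h = torch.zeros_like(x)
            for _ in range(n_iters):
                # Re-inject the input each step so extra iterations refine
                # the answer instead of drifting away from the problem.
                h = h + self.block(h + x)
            return self.readout(h)

    net = RecurrentDepthNet(dim=64)
    x = torch.randn(2, 64)
    easy = net(x, n_iters=4)    # fewer iterations for easy inputs
    hard = net(x, n_iters=32)   # more test-time compute, same weights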