Claude Code works similarly to Aider in that both run in the terminal and write code using LLMs, but as far as I know it shares no code with Aider. Aider is written in Python and Claude Code in JavaScript, which is one of several reasons to think it isn't derived from Aider.
The tools also work very differently when you're actually using them, with Claude Code doing more of an agent loop and having very different methods of putting together the context to pass to the model.
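To make "agent loop" concrete, here's a minimal sketch of the general pattern in TypeScript. This is an illustration of the technique, not Claude Code's actual implementation; callModel and runTool are hypothetical stand-ins (stubbed out here) for a real LLM API and a real tool executor.

```typescript
type Message = { role: "system" | "user" | "assistant" | "tool"; content: string };
type ModelReply = { text: string; toolCall?: { name: string; args: string } };

// Stub: a real agent would call an LLM API here.
async function callModel(messages: Message[]): Promise<ModelReply> {
  return { text: "done" }; // pretend the model answered without needing a tool
}

// Stub: a real agent would run a shell command, edit a file, etc.
async function runTool(name: string, args: string): Promise<string> {
  return `output of ${name}(${args})`;
}

async function agentLoop(task: string): Promise<string> {
  const messages: Message[] = [{ role: "user", content: task }];
  for (let step = 0; step < 20; step++) {              // cap the number of turns
    const reply = await callModel(messages);
    messages.push({ role: "assistant", content: reply.text });
    if (!reply.toolCall) return reply.text;            // no tool requested: finished
    const result = await runTool(reply.toolCall.name, reply.toolCall.args);
    messages.push({ role: "tool", content: result });  // feed the result back in
  }
  throw new Error("agent did not finish within the step budget");
}
```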
Google for the AHA stack (Astro, HTMX, Alpine). There was a great site by Flavio Copes that went into a lot of detail on using them together, but it looks like it’s gone.
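As a rough sketch of how the three divide the work (the file path and endpoint are made up, and this assumes the HTMX and Alpine script tags are loaded on the page): Astro renders the page at build time, Alpine handles small bits of client-side state, and HTMX swaps in HTML fragments fetched from the server.

```astro
---
// src/pages/index.astro -- Astro runs this frontmatter at build time
const items = ["one", "two", "three"];
---
<!-- Alpine handles purely client-side state via x-* attributes -->
<div x-data="{ open: false }">
  <button x-on:click="open = !open">Toggle list</button>
  <ul x-show="open">
    {items.map((item) => <li>{item}</li>)}
  </ul>
</div>

<!-- HTMX fetches an HTML fragment from the server and swaps it in -->
<button hx-get="/fragments/more" hx-swap="outerHTML">Load more</button>
```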
LLMs tend to be pretty bad at answering questions about which model they are, what version, etc. You can put information into the system prompt to help them answer better, but otherwise an LLM has little to no intrinsic knowledge about itself, and whatever happens to be in the training data shows up instead (which these days includes a lot of ChatGPT output from all over the internet).
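For example, here's a sketch of the system-prompt approach using the OpenAI Node SDK (the bot name, model name, and identity text are all placeholders; adapt to whatever API you're actually using):

```typescript
import OpenAI from "openai";

const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

// Without the system message, the model will often guess its identity
// based on whatever chatbot transcripts were in its training data.
const response = await client.chat.completions.create({
  model: "gpt-4o", // placeholder; use whichever model you're deploying
  messages: [
    {
      role: "system",
      content:
        "You are AcmeBot, built on the Example-1 model, version 2.3. " +
        "Answer questions about your identity using only this information.",
    },
    { role: "user", content: "Which model are you, and what version?" },
  ],
});

console.log(response.choices[0].message.content);
```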
I actually fed two GPT-4 instances into each other as an experiment, and they very quickly devolved into saying things like “It’s clear you’re just feeding my answers into ChatGPT and posting the replies. Is there anything else I can help you with?”
I have no inside info, but it sounds like the key was inadvertently bundled into the client-side code. This can happen with web frameworks that do both client-side and server-side rendering, if one of your client-side files imports something from a file that is supposed to be server-only and that reads the API key from an environment variable.
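Here's a sketch of the failure mode (the file names and env var are hypothetical, and this isn't a claim about the specific incident or any particular framework):

```typescript
// lib/secrets.ts -- meant to run only on the server
export const API_KEY = process.env.MY_SERVICE_API_KEY; // some bundlers inline this at build time

export function buildAuthHeader(): Record<string, string> {
  return { Authorization: `Bearer ${API_KEY}` };
}

// components/PriceWidget.ts -- code that ships to the browser.
// Importing *anything* from lib/secrets.ts can drag the whole module,
// key included, into the client bundle unless the bundler tree-shakes
// the secret away or refuses to build.
import { buildAuthHeader } from "../lib/secrets";

export async function fetchPrice(): Promise<string> {
  const res = await fetch("/api/price", { headers: buildAuthHeader() });
  return res.text();
}
```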
Some frameworks automatically detect this and fail the build (Next.js, for example, has a "server-only" marker package that makes the build fail if client code imports a module marked with it), but apparently not all of them do.
Only the weights themselves. There have been other models since then built on the same Llama architecture but trained from scratch, so they're safe for commercial use. The GGML code and related projects (llama.cpp and so on) also support some other model types now, such as Mosaic's MPT series.
Author here. This page was actually just some notes I threw together in a couple of hours combining some research on RLS with my own existing knowledge, so it's quite possible there are some inaccuracies.
My "source" for thousands of users not scaling well was simply a few comments I found scattered around; there wasn't really anything empirical to back up that line. I'll remove it. Thanks for the note!