If you add a dialectic between Opus 4.5 and GPT 5.2 (not the Codex variant), your workflow - which I use as well, albeit slightly differently [1] - may work even better.
This dialectic also has the happy side-effect of being fairly token efficient.
IME, Claude Code employs much better CLI tooling and sandboxing when implementing, while GPT 5.2 does excellent multifaceted critique even in complex situations.
[1]
- spec requirement / iterate spec until dialectic is exhausted, then markdown
- plan / iterate plan until dialectic is exhausted, then markdown
- implement / curl-test + manual test / code review until dialectic is exhausted
- update previous repo context checkpoint (plus README.md and AGENTS.md) in markdown
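The "iterate until dialectic is exhausted" part of each step above can be sketched roughly as below. This is only an illustration: `run_dialectic`, `critique`, `revise`, and `max_rounds` are hypothetical names, and the two callables stand in for whatever actually calls the critic model and the implementing agent. No real model API is assumed.

```python
from typing import Callable, List, Tuple


def run_dialectic(
    draft: str,
    critique: Callable[[str], List[str]],      # hypothetical: critic model returns a list of objections
    revise: Callable[[str, List[str]], str],   # hypothetical: implementing agent returns a revised draft
    max_rounds: int = 5,                       # assumed safety cap so the loop always terminates
) -> Tuple[str, int]:
    """Alternate critique and revision until the critic has no objections.

    Returns the final draft and the number of revisions that were made.
    """
    rounds = 0
    while rounds < max_rounds:
        objections = critique(draft)
        if not objections:
            # Dialectic is exhausted: the critic has nothing left to raise.
            break
        draft = revise(draft, objections)
        rounds += 1
    return draft, rounds


if __name__ == "__main__":
    # Stub critic and reviser, just to show the control flow.
    def stub_critic(text: str) -> List[str]:
        return [] if "timeout" in text else ["missing timeout handling"]

    def stub_reviser(text: str, objections: List[str]) -> str:
        return text + " [add timeout]"

    final, n = run_dialectic("spec v1", stub_critic, stub_reviser)
    print(final, n)
```

The same loop would apply to the spec, the plan, and the code review. Only the draft and the two callables change, and the final draft is what gets written out as markdown.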
Adding another external model/agent is exactly what I have been planning as the next step. In fact, I already paste the implementation and test summaries into ChatGPT, and it is extremely helpful for hardening requirements, making them more extensible, or picking up gaps between the implementation and the initial specs. It would be very useful to have this in the workflow itself, rather than having the coding agent review its own work, where there is a sense that it is getting tunnel vision.
I agree that CC seems like a better harness, but I think GPT is a better model. So I will keep it all inside the Codex VSCode plugin workflow.
>particularly high profile example would be the plot to kidnap a state governor a few years ago.
iirc that was something more than infiltration. The FBI found an extremist loser who lived in a basement, egged him on, helped him network & gave him resources. Without them, he probably would have done little more than think really hard about it.
I've been using 5.2 a lot lately but hit my quota for the first time (and will probably continue to hit it most weeks) so I shelled out for claude code. What differences do you notice? Any 'metagame' that would be helpful?
I just use Cursor because I can pick any model. The difference is hard to pin down exactly: Opus seems good, but 5.2 seems smarter on the tasks I tried. Or possibly I just "trust" it more. I tend to use high or extra high reasoning.
>I could spend an hour and have a full setup done on physical or VPS to have 1) remote git hosting 2) pipelines running on changes 3) pipelines publishing images or some artifacts 4) automated deployment for these images/artifacts
This sounds like a week or two of work to me (I'm a novice though). You should write a guide.
The federal reserve?