It sounds like you've been extremely lucky and only had GPT "omit the irrelevant...

emporas · on Feb 14, 2024

>For example, GPT will do things like write a class with all the methods as simply stubs with comments describing their function.

The tool needs guidance to be effective, and getting good results is not exactly trivial. I have been using GPT for 3.5 years and the problem you describe never happens to me. I could share 500 to 1000 prompts I used to generate code from last week alone, but the prompts I used to write the replacefn can be found here [1]. Maybe there are some tips that could help.

[1] https://chat.openai.com/share/e0d2ab50-6a6b-4ee9-963a-066e18...

anotherpaulg · on Feb 14, 2024

The chat transcript you linked is full of GPT being lazy and writing "todo" comments instead of providing all the code:

  // Handle struct-specific logic here
  // Add more details about the struct if needed
  // Handle other item types if needed
  ...etc...

It took >200 back-and-forth messages with ChatGPT to get it to ultimately write 84 lines of code? Sounds lazy to me.

emporas · on Feb 14, 2024

OK, it does happen, but not that frequently. You are right. But is this such a big problem?

Like, you parse the response, throw away the "// implementation goes here" comment along with the function/method/class/struct/enum it belongs to, and keep the functional code. I am trying to implement something exactly like aider, but specifically for Rust: parsing the LLM's response, filtering out blank functions, etc.

In Rust, filtering out blank functions is easy; in other languages it might be very hard. I haven't looked into tree-sitter, but making sense of JavaScript, Python, and other languages sounds like a very difficult problem to solve.
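As a rough illustration of the filtering idea, here is a minimal sketch in Rust. It assumes the function body (the text between the braces) has already been extracted as a string, for example by a real parser. `is_stub_body` is a hypothetical helper name, not something taken from aider or from the linked transcript, and the line-based check is only a heuristic (it does not handle `/* ... */` block comments).

```rust
/// Heuristic: returns true when a function body contains no real statements,
/// i.e. every non-blank line is a `//` comment or a `todo!`/`unimplemented!`
/// placeholder. An empty body also counts as a stub.
/// Assumption: `body` is the text between the function's braces.
fn is_stub_body(body: &str) -> bool {
    body.lines()
        .map(str::trim)
        // Blank lines carry no information either way.
        .filter(|line| !line.is_empty())
        // Any line that is not a comment or placeholder counts as real code.
        .all(|line| {
            line.starts_with("//")
                || line.starts_with("todo!(")
                || line.starts_with("unimplemented!(")
        })
}
```

A body such as `// Handle struct-specific logic here` would be flagged as a stub, while a body containing `let total = a + b;` would not. Deciding what to do with a flagged function (drop it, or ask the model again) is a separate choice.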

I like it when GPT compresses the answer and doesn't return a lot of code, but other models like Mixtral 8x7b never compress it the way GPT does, in my experience. If they are not far behind GPT4, maybe they are better for your use case.

>It took >200 back-and-forth messages with ChatGPT to get it to ultimately write 84 lines of code? Sounds lazy to me.

Hey, Rust throws a lot of errors. We do not want humans going around debugging code unless it is absolutely necessary, right?

resters · on Feb 20, 2024

> But is this such a big problem?

It really is. It wastes a ton of time even if the user explicitly requests that code listings be printed in full.

Further, all the extra back-and-forth trying to get it to do what it is supposed to do pollutes the context and makes it generally more confused about the task/goals.

rpmisms · on Feb 14, 2024

Just use Grimoire.

Benjaminsen · on Feb 14, 2024

Really great article. Interestingly, I have found that using the function call output significantly improves the coding quality.

However, I have not yet re-run the tests for every new version. I guess I know what I will be doing today.

This is an area I have spent a lot of time working on, and I would love to compare notes.