I have a legal AI startup, and the quality jump from GPT-3.5 to GPT-4 in this domain is straight-up mind-blowing; GPT-3.5 is useless by comparison. That said, I can see how, in more conversational settings, GPT-3.5 can offer a more appealing performance/price trade-off.
I suggested to my wife that ChatGPT would help with her job, and she has found ChatGPT-4 to be the same as or worse than ChatGPT-3.5. It’s really interesting just how variable the quality can be depending on your particular line of work.
Legal writing is ideal training data: mostly formulaic, based on conventions and rules, well-formed and highly vetted, with much of the best in the public domain.
Medical writing is the opposite: unstated premises, semi-random associations, and rarely a well-formed, meaningful sentence.
> Legal writing is ideal training data: mostly formulaic, based on conventions and rules, well-formed and highly vetted, with much of the best in the public domain.
That makes sense. The labor-impact research suggests that law will be hit almost as hard as education by language models. Almost nothing happens in court that hasn't occurred hundreds of thousands of times before. A model with GPT-4-level capability, specifically trained for legal matters and fine-tuned by jurisdiction, could replace everyone in a courtroom. Well, there's still the bailiff; I figure that's about 18 months behind.