God I wish contracts were encoded semantically rather than as plain text. I just tried to look through Github's terms of service[1]. I'd search for "Github can <verb> with <adjective> code" if I could. Instead I'm giving up.
That looks hard. More politically feasible might be a language I've unfortunately forgotten the name of, ordinary English but with some extra rules designed to eliminate ambiguity -- every noun has to carry a specifier like "some" or "all" or "the" or "a", etc.
Pretty much everything is trained on copyrighted content: machine translation software, TWDNE, DALL-E, and all the GPTs. Software people are bringing this up now because it's their ox being gored. It's the same as when furries got upset about This Fursona Does Not Exist.[1][2]
To expand on your argument, pretty much every person is trained on copyrighted content too. Doesn't make their generated content automatically subject to copyright either.