
sabareesh 8 months ago | on: All You Need Is 4x 4090 GPUs to Train Your Own Mod...
You might find the information here helpful: https://sabareesh.com/posts/llm-intro/ That said, I am still in the process of evaluating the post-training process with RL. RLHF is almost a mirage: it shows what is possible, but not the full capability of what the model can do.