
You might find the information here helpful: https://sabareesh.com/posts/llm-intro/ But I am still in the process of evaluating the post-training process with RL. RLHF is almost a mirage: it shows what is possible, but not the full capability of what the model can do.

