> This is the first open research to validate that reasoning capabilities of LLM... | Hacker News

justinl33 5 months ago | on: DeepSeek-R1

> This is the first open research to validate that reasoning capabilities of LLMs can be incentivized purely through RL, without the need for SFT.

This is a noteworthy achievement.

throwaway314155 5 months ago

Excuse my ignorance. What does SFT refer to here?

josephcsible 5 months ago

Supervised fine-tuning
