Hacker News
justinl33 | 5 months ago | on: DeepSeek-R1
> This is the first open research to validate that reasoning capabilities of LLMs can be incentivized purely through RL, without the need for SFT.
This is a noteworthy achievement.
throwaway314155 | 5 months ago
Excuse my ignorance. What does SFT refer to here?
josephcsible | 5 months ago
Supervised fine-tuning