SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training (arxiv.org) 1 point by fofoz 8 hours ago