alignment
Projects with this topic
- https://github.com/princeton-nlp/SimPO SimPO: Simple Preference Optimization with a Reference-Free Reward