rlhf
Projects with this topic
https://github.com/forhaoliu/chain-of-hindsight Chain-of-Hindsight, A Scalable RLHF Method
https://github.com/THUDM/ImageReward [NeurIPS 2023] ImageReward: Learning and Evaluating Human Preferences for Text-to-image Generation