0.0.11
e41ac42b · just create a new DPO trainer per iteration, so that the scheduler and optimizer are reset · Jan 28, 2024
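The effect described in the commit message can be illustrated with a minimal sketch. It does not use the repository's code or the real DPO trainer API: `ToyTrainer` and `run_iterations` are hypothetical names, and the step counter merely stands in for the optimizer and scheduler state a real trainer would carry. Assuming a linearly decaying learning rate, it compares reusing one trainer across iterations with creating a new one each time.

```python
# Toy stand-ins only (hypothetical, not the real DPO trainer API):
# they model the state that a trainer carries between calls.
class ToyTrainer:
    """Hypothetical trainer owning a step counter and a linear LR schedule."""

    def __init__(self, base_lr, total_steps):
        self.base_lr = base_lr
        self.total_steps = total_steps
        self.step_count = 0  # stands in for optimizer/scheduler state

    def current_lr(self):
        # Linear decay from base_lr down to 0 over total_steps.
        remaining = max(self.total_steps - self.step_count, 0)
        return self.base_lr * remaining / self.total_steps

    def train(self):
        # Run total_steps updates and record the LR used at each one.
        lrs = []
        for _ in range(self.total_steps):
            lrs.append(self.current_lr())
            self.step_count += 1
        return lrs


def run_iterations(num_iterations, fresh_trainer_each_time):
    """Return the learning rate seen at the first step of each iteration."""
    first_lrs = []
    trainer = ToyTrainer(base_lr=1.0, total_steps=4)
    for _ in range(num_iterations):
        if fresh_trainer_each_time:
            # New trainer per iteration: schedule and state start over.
            trainer = ToyTrainer(base_lr=1.0, total_steps=4)
        first_lrs.append(trainer.train()[0])
    return first_lrs


if __name__ == "__main__":
    print("fresh trainer:", run_iterations(3, fresh_trainer_each_time=True))
    print("reused trainer:", run_iterations(3, fresh_trainer_each_time=False))
```

In this toy setup, the reused trainer enters its second iteration with the schedule already exhausted, whereas a fresh trainer starts each iteration at the base learning rate.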