This project is mirrored from https://github.com/meta-llama/llama-recipes.
Pull mirroring failed.
Repository mirroring has been paused due to too many failed attempts. It can be resumed by a project maintainer or owner.
Last successful update.
- Oct 24, 2024
-
-
Sanyam Bhutani authored
-
celestinoalan authored
**Problem** Currently, we call val_loss.append(best_val_loss) in each epoch. This is misleading because train_loss, train_prep, and val_prep all receive the value for the current epoch, not the best value across epochs. It is also inconvenient, since one often wants to plot both train and validation losses as a function of the epoch to look for overfitting. **Solution** Call val_loss.append(eval_epoch_loss) instead.
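
The difference between the two append calls can be seen in a minimal sketch. The names val_loss, best_val_loss, and eval_epoch_loss come from the commit message; the helper track_losses, its append_best flag, and the loss values are hypothetical and exist only for illustration. This is not the project's training loop.

```python
# Hypothetical per-epoch evaluation losses, made up for illustration.
eval_epoch_losses = [2.0, 1.5, 1.7, 1.6]


def track_losses(epoch_losses, append_best):
    """Collect a validation-loss history over epochs.

    append_best=True mimics the old behaviour (append the running best),
    append_best=False mimics the fix (append the current epoch's loss).
    """
    val_loss = []
    best_val_loss = float("inf")
    for eval_epoch_loss in epoch_losses:
        # The best loss is still tracked separately, e.g. for checkpointing.
        if eval_epoch_loss < best_val_loss:
            best_val_loss = eval_epoch_loss
        val_loss.append(best_val_loss if append_best else eval_epoch_loss)
    return val_loss, best_val_loss


old_curve, _ = track_losses(eval_epoch_losses, append_best=True)
new_curve, best = track_losses(eval_epoch_losses, append_best=False)

print(old_curve)  # [2.0, 1.5, 1.5, 1.5]
print(new_curve)  # [2.0, 1.5, 1.7, 1.6]
```

With the running best appended, the history can never rise, so the increase after the second epoch is hidden. With the per-epoch value appended, the curve can be plotted next to train_loss to look for overfitting.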
-
- Oct 21, 2024
- Oct 18, 2024
-
-
Sanyam Bhutani authored
-
Kai Wu authored
-
Kai Wu authored
-
Kai Wu authored
-
Sanyam Bhutani authored
-
Sanyam Bhutani authored
-
Sanyam Bhutani authored
-
Sanyam Bhutani authored
-
Kai Wu authored
-
Sanyam Bhutani authored
-
- Oct 17, 2024
-
-
Sanyam Bhutani authored
-
Sanyam Bhutani authored
-
Sanyam Bhutani authored
-
Sanyam Bhutani authored
-
Sanyam Bhutani authored
-
Sanyam Bhutani authored
-
Sanyam Bhutani authored
-
- Oct 16, 2024
-
-
Kai Wu authored
-
Patrik Lambert authored
-
- Oct 15, 2024
-
-
Hamid Shojanazeri authored
-
Matthias Reso authored
-
Matthias Reso authored
-
Matthias Reso authored
-
Sanyam Bhutani authored
-
Kai Wu authored
-
Ethan authored
-
Ethan authored
-
Patrik Lambert authored
Set the numpy seed in finetuning.py so that it is fixed during finetuning (including in custom_dataset.py) and applies to functions such as Dataset.train_test_split. This avoids different train/test splits on different ranks, which may cause NCCL collective operation timeout errors.
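
The effect of fixing the numpy seed can be sketched without a distributed setup. In the sketch below, split_indices is a hypothetical stand-in for a split that draws from NumPy's global random state, and simulated_rank_split is a hypothetical helper that plays the role of one rank; neither is the project's code, and the seed value 42 is arbitrary.

```python
import numpy as np


def split_indices(n_samples, test_fraction):
    """Hypothetical split that relies on NumPy's global random state."""
    permuted = np.random.permutation(n_samples)
    n_test = int(n_samples * test_fraction)
    return permuted[n_test:], permuted[:n_test]


def simulated_rank_split(seed, n_samples=10, test_fraction=0.2):
    # Every rank runs the same script, so every rank sets the same seed
    # before any split is drawn.
    np.random.seed(seed)
    return split_indices(n_samples, test_fraction)


# Two "ranks" that seed identically end up with identical splits.
train_rank0, test_rank0 = simulated_rank_split(seed=42)
train_rank1, test_rank1 = simulated_rank_split(seed=42)

print(np.array_equal(train_rank0, train_rank1))  # True
print(np.array_equal(test_rank0, test_rank1))    # True
```

Without the np.random.seed call, each process starts from its own random state, so the ranks are not guaranteed to agree on which samples belong to the training split.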
-
celestinoalan authored
-
- Oct 14, 2024
-
-
Matthias Reso authored
-
Matthias Reso authored
-
Matthias Reso authored
-
Matthias Reso authored
-
Matthias Reso authored
-