For an example implementation of `get_custom_dataset`, see the provided datasets in `llama_recipes.datasets` or [examples/custom_dataset.py](../examples/custom_dataset.py).
The `dataset_config` in the above signature will be an instance of `llama_recipes.configs.dataset.custom_dataset` with the modifications made through the command line.
The `split` argument signals whether to return the training or the validation dataset.
The default function name is `get_custom_dataset` but this can be changed as described below.
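
As a rough illustration of the shape such a function might take, here is a minimal, self-contained sketch. It does not use llama_recipes itself: the three-argument signature `(dataset_config, tokenizer, split)`, the split name `"train"`, the sample fields (`input_ids`, `attention_mask`, `labels`), and the helpers `ToyTokenizer` and `TOY_SAMPLES` are all assumptions made for this example, so check them against the provided datasets before relying on them.

```python
from types import SimpleNamespace

# Hypothetical in-memory data standing in for a real corpus.
TOY_SAMPLES = {
    "train": ["hello world", "custom datasets are fun"],
    "validation": ["held out sample"],
}


class ToyTokenizer:
    """Stand-in for a real tokenizer: assigns each new word the next free id."""

    def __init__(self):
        self.vocab = {}

    def encode(self, text):
        return [self.vocab.setdefault(word, len(self.vocab)) for word in text.split()]


def get_custom_dataset(dataset_config, tokenizer, split):
    """Return tokenized samples for the requested split.

    Assumption: split == "train" selects the training data and any other
    value selects the validation data. dataset_config is accepted to match
    the assumed signature but is not used by this toy example.
    """
    key = "train" if split == "train" else "validation"
    samples = []
    for text in TOY_SAMPLES[key]:
        ids = tokenizer.encode(text)
        samples.append(
            {
                "input_ids": ids,
                "attention_mask": [1] * len(ids),
                "labels": list(ids),  # causal LM style: labels mirror the inputs
            }
        )
    return samples


# Stand-in for the config object that would normally be built from the command line.
dataset_config = SimpleNamespace(file="examples/custom_dataset.py")
train_set = get_custom_dataset(dataset_config, ToyTokenizer(), "train")
val_set = get_custom_dataset(dataset_config, ToyTokenizer(), "validation")
print(len(train_set), len(val_set))
```

A real implementation would typically load data from disk or a dataset hub and return a dataset object rather than a plain list; the list is used here only to keep the sketch dependency-free.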
To start a training run with the custom dataset, we need to set both the `--dataset` and the `--custom_dataset.file` parameters.