Commit d5e67a53 authored by bys0318's avatar bys0318

Add ChatGLM3-6B-32k

parent 8a415f25
@@ -126,15 +126,15 @@ Install the requirements with pip: `pip install -r requirements.txt`. For Llama-
First, run [pred.py](pred.py) and select the model you want to evaluate via `--model`. Let's take ChatGLM3-6B-32k as an example (the HuggingFace model weights will be downloaded automatically according to the path in [model2path.json](config/model2path.json); you can change the path in this file to load the model weights from a local directory):
```bash
CUDA_VISIBLE_DEVICES=0 python pred.py --model chatglm3-6b-32k
```
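The name-to-path lookup that `--model` relies on can be sketched as follows. The key `chatglm3-6b-32k` comes from the command above; the HuggingFace repo id and the local override path are assumptions for illustration, not values confirmed by this repository:

```python
import json

# Sketch of resolving a --model name via model2path.json.
# The repo id and the local path below are illustrative assumptions.
model2path = json.loads("""
{
  "chatglm3-6b-32k": "THUDM/chatglm3-6b-32k"
}
""")

# Editing the entry points the loader at a local checkout instead of
# downloading from HuggingFace (hypothetical path):
model2path["chatglm3-6b-32k"] = "/data/models/chatglm3-6b-32k"

def resolve(model_name: str) -> str:
    """Return the weight path that would be loaded for this --model value."""
    return model2path[model_name]

print(resolve("chatglm3-6b-32k"))
```

Because the mapping is plain JSON, switching between remote and local weights is a one-line edit to the config file rather than a code change.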
You can obtain the model's output on all LongBench datasets under the `pred/` folder, in a subfolder named after the model. Similarly, with the `--e` flag:
```bash
CUDA_VISIBLE_DEVICES=0 python pred.py --model chatglm3-6b-32k --e
```
You can obtain the output on LongBench-E under the `pred_e/` folder. After that, run the evaluation code in [eval.py](eval.py):
```bash
python eval.py --model chatglm3-6b-32k
```
You can get the evaluation results on all datasets in `result.json`. With the `--e` flag, you can also obtain the model's average score over different length intervals on all LongBench-E datasets.
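Inspecting `result.json` afterwards is straightforward, since it is plain JSON. The sketch below assumes a flat mapping from dataset names to scores; the dataset names and numbers are illustrative, not actual LongBench results:

```python
import json

# Hypothetical result.json contents; names and scores are illustrative only.
result = json.loads('{"narrativeqa": 21.1, "qasper": 26.2, "hotpotqa": 33.8}')

# Average the per-dataset scores to get a single summary number.
avg = sum(result.values()) / len(result)
print(f"{len(result)} datasets, average score {avg:.2f}")
```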
@@ -123,15 +123,15 @@ for dataset in datasets:
First, run [pred.py](pred.py) in the repository and select the model you want to evaluate via `--model`. Let's take ChatGLM3-6B-32k as an example (the code will automatically download the HuggingFace model according to the path in [model2path.json](config/model2path.json); you can modify the path in this file to load the model weights locally):
```bash
CUDA_VISIBLE_DEVICES=0 python pred.py --model chatglm3-6b-32k
```
You can obtain the model's output on all LongBench datasets under the `pred/` folder, in a subfolder named after the model. Similarly, with the `--e` flag:
```bash
CUDA_VISIBLE_DEVICES=0 python pred.py --model chatglm3-6b-32k --e
```
You can obtain the model's output on all LongBench-E datasets under the `pred_e/` folder, in a subfolder named after the model. After that, run the evaluation code in [eval.py](eval.py):
```bash
python eval.py --model chatglm3-6b-32k
```
You can get the model's evaluation results on each LongBench dataset in `result.json`, under the folder storing the model's outputs. With the `--e` flag, you can also obtain the model's average score over different length intervals on all LongBench-E datasets.