diff --git a/README.md b/README.md
index c358330df62fea17c255fdade044333a84aa66c4..3b38cff610eb212304ba6373ede4a5f11821452d 100644
--- a/README.md
+++ b/README.md
@@ -5,7 +5,7 @@
 阅读[中文版本](README_ZH.md).
-# LongBench: A Multilingual, Multitask Benchmark Tailored for Long Context Understanding
+# LongBench: A Multilingual, Multitask Benchmark for Long Context Understanding
 **LongBench** is the first benchmark for multilingual, multitask, and comprehensive assessment of **long context understanding** capabilities of large language models. LongBench includes different languages (Chinese and English) to provide a more comprehensive evaluation of the large models' multilingual capabilities on long contexts. In addition, LongBench consists of 20 different tasks, covering key long context application scenarios such as single-document QA, multi-document QA, summarization, few-shot learning, code completion, and synthetic tasks.
@@ -191,4 +191,4 @@ The following tables show the Zero-shot evaluation results (%) on all datasets,
 ## Citation
 This is a joint work by **THU-KEG** and **Zhipu AI**. We are currently working on the paper, and the citation information will be updated when it's ready. Please stay tuned~
-When citing our work, please cite all of the original dataset papers. The relevant citation information is listed [here](refs/ref.bib).
\ No newline at end of file
+When citing our work, please cite all of the original dataset papers. The relevant citation information is listed [here](refs/ref.bib).