
Compare revisions

Changes are shown as if the source revision was being merged into the target revision.

Target

  • mirrored_repos/machinelearning/thukeg/LongCite
Commits on Source (3)
*__pycache__/
samples*/
runs/
checkpoints/
logs/
*.DS_Store
.idea
output*
test*
\ No newline at end of file
@@ -6,7 +6,7 @@
# LongCite: Enabling LLMs to Generate Fine-grained Citations in Long-context QA
<p align="center">
-🤗 <a href="https://huggingface.co/datasets/THUDM/LongCite-45k" target="_blank">HF Repo</a> • 📃 <a href="https://arxiv.org/abs/2409.02897" target="_blank">Paper</a> • 🚀 <a href="https://huggingface.co/spaces/THUDM/LongCite" target="_blank">HF Space</a>
+🤗 <a href="https://huggingface.co/datasets/THUDM/LongCite-45k" target="_blank">HF Repo</a> • 📃 <a href="https://arxiv.org/abs/2409.02897" target="_blank">Paper</a> • 🚀 <a href="https://huggingface.co/spaces/THUDM-HF-SPACE/LongCite" target="_blank">HF Space</a>
</p>
[English](./README.md) | [中文](./README_zh.md)
@@ -65,6 +65,7 @@ We are also open-sourcing CoF (Coarse to Fine) under `CoF/`, our automated SFT d
You can download and save the **LongCite-45k** dataset with the Hugging Face `datasets` library ([🤗 HF Repo](https://huggingface.co/datasets/THUDM/LongCite-45k)):
```python
from datasets import load_dataset
dataset = load_dataset('THUDM/LongCite-45k')
for split, split_dataset in dataset.items():
    split_dataset.to_json("train/long.jsonl")
......
@@ -5,7 +5,7 @@
# LongCite: Enabling LLMs to Generate Fine-grained Citations in Long-context QA
<p align="center">
-🤗 <a href="https://huggingface.co/datasets/THUDM/LongCite-45k" target="_blank">HF Repo</a> • 📃 <a href="https://arxiv.org/abs/2409.02897" target="_blank">Paper</a> • 🚀 <a href="https://huggingface.co/spaces/THUDM/LongCite" target="_blank">HF Space</a>
+🤗 <a href="https://huggingface.co/datasets/THUDM/LongCite-45k" target="_blank">HF Repo</a> • 📃 <a href="https://arxiv.org/abs/2409.02897" target="_blank">Paper</a> • 🚀 <a href="https://huggingface.co/spaces/THUDM-HF-SPACE/LongCite" target="_blank">Online Demo</a>
</p>
[English](./README.md) | [中文](./README_zh.md)
@@ -64,6 +64,7 @@ CUDA_VISIBLE_DEVICES=0 streamlit run demo.py --server.fileWatcherType none
You can download and save the **LongCite-45k** dataset with the Hugging Face `datasets` library ([🤗 HF Repo](https://huggingface.co/datasets/THUDM/LongCite-45k)):
```python
from datasets import load_dataset
dataset = load_dataset('THUDM/LongCite-45k')
for split, split_dataset in dataset.items():
    split_dataset.to_json("train/long.jsonl")
......