Implement T5 Attention with attention_fn
Showing 3 changed files
- SwissArmyTransformer/model/cached_autoregressive_model.py: 15 additions, 13 deletions
- SwissArmyTransformer/model/t5_model.py: 12 additions, 57 deletions
- SwissArmyTransformer/mpu/transformer.py: 21 additions, 16 deletions