3. Now you can use `kw_args['output_this_layer']` (within any hook in the transformer layers) to return values in the final outputs, and `kw_args['output_cross_layer']` to pass values into the `kw_args` of the next layer.
Examples:
```python
def attention_fn(...some_args):
    ...
    kw_args['output_this_layer']['mem_kv'] = cache_kv
    ...
```
This makes the key `'mem_kv'` appear in `outputs_per_layers[i]` of `logits, *outputs_per_layers = model(...)`.
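To make the two channels concrete, here is a standalone sketch of the mechanism in plain Python (this simulates the layer loop; it is not the library's actual implementation, and the names `run_layers`, `layer`, and `position_bias` are illustrative):

```python
def run_layers(layer_fns, **kw_args):
    # Simulated layer loop: collect per-layer outputs, and merge each
    # layer's cross-layer dict into the next layer's kw_args.
    outputs_per_layers = []
    cross_layer = {}
    for fn in layer_fns:
        this_layer, next_cross = {}, {}
        fn(**kw_args, **cross_layer,
           output_this_layer=this_layer,
           output_cross_layer=next_cross)
        outputs_per_layers.append(this_layer)  # surfaces in final outputs
        cross_layer = next_cross               # fed to the next layer
    return outputs_per_layers

def layer(x, output_this_layer, output_cross_layer, **kw_args):
    # 'mem_kv' ends up in outputs_per_layers[i] for this layer; we also
    # record whether a cross-layer value reached us.
    output_this_layer['mem_kv'] = ('cache', x, kw_args.get('position_bias'))
    # Hypothetical cross-layer value injected into the next layer's kw_args.
    output_cross_layer['position_bias'] = x + 1

outs = run_layers([layer, layer, layer], x=0)
print(outs)
```

The first layer sees no `position_bias` (nothing was passed across yet), while every later layer receives the value its predecessor wrote to `output_cross_layer`.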