Commit 0a212ca

Mesh TensorFlow Team

committed

Fix bug in shared_kv attention for autoregressive decoding.

PiperOrigin-RevId: 391048755

1 parent 7c09bb9, commit 0a212ca

Lines changed: 1 addition & 1 deletion

Original file line number	Diff line number	Diff line change
`@@ -247,7 +247,7 @@ def call(self, context, x, losses=None):`
`247`	`247`	`    context.position, memory_length, dtype=context.activation_dtype)`
`248`	`248`	`inv_one_hot = 1.0 - one_hot`
`249`	`249`	`if self.shared_kv:`
`250`		`-  old_kv = context.get_states(1)`
	`250`	`+  old_kv, = context.get_states(1)`
`251`	`251`	`  kv = old_kv * inv_one_hot + kv * one_hot`
`252`	`252`	`else:`
`253`	`253`	`  old_k, old_v = context.get_states(2)`
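
The one-character change above can be illustrated with a minimal, pure-Python sketch. It assumes, based on the `old_k, old_v = context.get_states(2)` line, that `get_states(n)` returns a sequence of `n` states. `FakeContext` and `update_cache` are hypothetical stand-ins written for this illustration and are not part of Mesh TensorFlow. Plain lists of floats stand in for tensors.

```python
class FakeContext:
    """Hypothetical stand-in for the decoding context (not the real class)."""

    def __init__(self, states):
        self._states = list(states)

    def get_states(self, n):
        # Assumed behavior: hand back the next n cached states as a list.
        taken, self._states = self._states[:n], self._states[n:]
        return taken


def update_cache(old_kv, new_kv, position):
    """Overwrite slot `position` of the cache, mirroring
    kv = old_kv * inv_one_hot + kv * one_hot on plain Python floats."""
    one_hot = [1.0 if i == position else 0.0 for i in range(len(old_kv))]
    inv_one_hot = [1.0 - h for h in one_hot]
    return [o * inv + n * h
            for o, n, inv, h in zip(old_kv, new_kv, inv_one_hot, one_hot)]


cached = [0.5, 0.25, 0.0, 0.0]  # memory with two positions already filled
fresh = [9.0, 9.0, 0.75, 9.0]   # only the value at the current position is kept

# Before the fix: the whole one-element list is bound to the name.
wrapped = FakeContext([cached]).get_states(1)

# After the fix: the trailing comma unpacks the single element.
old_kv, = FakeContext([cached]).get_states(1)

updated = update_cache(old_kv, fresh, position=2)
print(wrapped is cached, old_kv is cached)
print(updated)
```

Without the trailing comma, the name is bound to the one-element container rather than to the cached state itself, so the arithmetic on the next line would operate on the wrong object. With the comma, the one-hot mask replaces only the slot at the current decoding position and leaves the other slots unchanged.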