Potential bug in timing embedding #1923
Open
Description
Hi,
There might be a small bug here:
tensor2tensor/tensor2tensor/layers/common_attention.py
Lines 445 to 449 in ef1fcce

```python
log_timescale_increment = (
    math.log(float(max_timescale) / float(min_timescale)) /
    tf.maximum(tf.to_float(num_timescales) - 1, 1))
inv_timescales = min_timescale * tf.exp(
    tf.to_float(tf.range(num_timescales)) * -log_timescale_increment)
```
I think in the last line the `exp` should be divided by `min_timescale` rather than multiplied, since these are inverse timescales. Usually `min_timescale` is 1, so it doesn't matter. But if, for example, you fix `max_timescale` and change `min_timescale`, the resulting inverse timescale corresponding to `max_timescale` changes: it becomes `min_timescale**2 / max_timescale` instead of the expected `1 / max_timescale`.
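Concretely, the last two quoted lines would become something like this (just a sketch of the proposed fix, keeping the surrounding TF1-style code as-is):

```python
# Divide by min_timescale instead of multiplying, so that
# inv_timescales runs from 1/min_timescale down to 1/max_timescale.
inv_timescales = tf.exp(
    tf.to_float(tf.range(num_timescales)) * -log_timescale_increment) / min_timescale
```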
A simpler implementation could be roughly something like this:

```python
inv_timescales = exp(-linspace(log(min_timescale), log(max_timescale), num_timescales))
```

and from this you can derive the current implementation, except with division instead of multiplication. It could be even simpler with `logspace`, but tf seems to have this function only as experimental.
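To make the discrepancy easy to check, here is a minimal NumPy sketch (the function names are just for illustration) comparing the current computation against the proposed one. The two agree when `min_timescale == 1` but diverge at the `max_timescale` end otherwise:

```python
import numpy as np

def inv_timescales_current(min_timescale, max_timescale, num_timescales):
    # Mirrors the existing tensor2tensor code: multiply by min_timescale.
    log_inc = np.log(max_timescale / min_timescale) / max(num_timescales - 1, 1)
    return min_timescale * np.exp(np.arange(num_timescales) * -log_inc)

def inv_timescales_proposed(min_timescale, max_timescale, num_timescales):
    # Proposed fix, written via the simpler linspace-in-log-space form.
    return np.exp(-np.linspace(np.log(min_timescale),
                               np.log(max_timescale), num_timescales))

# With min_timescale == 1 the two agree ...
print(np.allclose(inv_timescales_current(1.0, 1e4, 8),
                  inv_timescales_proposed(1.0, 1e4, 8)))    # True

# ... but with min_timescale != 1, the inverse timescale at the
# max_timescale end differs:
print(inv_timescales_current(10.0, 1e4, 8)[-1])   # 0.01   == min**2 / max
print(inv_timescales_proposed(10.0, 1e4, 8)[-1])  # 0.0001 == 1 / max
```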
Let me know if this makes sense.
Thanks a lot!