Intuition about projecting input embedding tokens to queries, keys, and values #919

krishnan-duraisamy started this conversation in General

In Section 3.4.1 we have this definition: "These three matrices are used to project the embedded input tokens, x(i), into query, key, and value vectors, respectively, as illustrated in figure 3.14."

These weight matrices are later initialized to random tensors like so:

```python
torch.manual_seed(123)
W_query = torch.nn.Parameter(torch.rand(d_in, d_out), requires_grad=False)
W_key   = torch.nn.Parameter(torch.rand(d_in, d_out), requires_grad=False)
W_value = torch.nn.Parameter(torch.rand(d_in, d_out), requires_grad=False)
```
  • Where, and how, is the input then projected through these respective weight matrices?
  • Is part of the intuition behind the split also to reduce dimensionality?
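For reference, the projection the question asks about is a plain matrix multiplication of the embedded token with each weight matrix. A minimal sketch (the dimensions `d_in = 3` and `d_out = 2` here are made up for illustration, not taken from the book's listing):

```python
import torch

torch.manual_seed(123)

# Hypothetical dimensions for illustration
d_in, d_out = 3, 2

# The three trainable projection matrices from the question
W_query = torch.nn.Parameter(torch.rand(d_in, d_out), requires_grad=False)
W_key   = torch.nn.Parameter(torch.rand(d_in, d_out), requires_grad=False)
W_value = torch.nn.Parameter(torch.rand(d_in, d_out), requires_grad=False)

# A single embedded input token x(i), a d_in-dimensional vector
x_i = torch.rand(d_in)

# The "projection" is matrix multiplication: each weight matrix maps
# the d_in-dimensional embedding to a d_out-dimensional vector
query = x_i @ W_query
key   = x_i @ W_key
value = x_i @ W_value

print(query.shape)  # torch.Size([2])
```

Note that because `d_out < d_in` in this sketch, the projection does reduce dimensionality; whether that is part of the intent depends on how the author chooses the dimensions.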
Replies: 0 comments
