where we used the fact that the Kronecker delta $\delta_{ip}$ ensures that only a single term contributes to the sum.
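As a concrete check of how the Kronecker delta collapses a sum to a single term, here is a minimal NumPy sketch (the array values and size are illustrative):

```python
import numpy as np

# the Kronecker delta delta_{ip} is just the identity matrix:
# delta[i, p] = 1 if i == p, else 0
delta = np.eye(4)

a = np.array([2.0, 3.0, 5.0, 7.0])

# sum_i delta_{ip} a_i: every term with i != p vanishes,
# so the sum reduces to the single term a_p
p = 2
collapsed = np.sum(delta[:, p] * a)

assert collapsed == a[p]
```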
```{note}
Observe that:
Now ${\bf z}$ and ${\bf y}^k$ are both vectors of size $N_\mathrm{out} \times 1$ and ${\bf x}^k$ is a vector of size $N_\mathrm{in} \times 1$, so we can write this expression for the matrix as a whole as:
where the operator $\circ$ represents _element-by-element_ multiplication (the [Hadamard product](https://en.wikipedia.org/wiki/Hadamard_product_(matrices))).
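In NumPy, the `*` operator on two arrays of the same shape is exactly this Hadamard product; a small sketch with illustrative vectors:

```python
import numpy as np

# two vectors of the same size (values are illustrative)
z = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 5.0, 6.0])

# Hadamard product: matching entries are multiplied,
# (z o y)_i = z_i * y_i
hadamard = z * y  # NumPy's * on arrays is already element-by-element

assert np.allclose(hadamard, [4.0, 10.0, 18.0])
```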