Commit 7501ee0

fix a typo
fix dark mode

1 parent 3d8f6d9 · commit 7501ee0

1 file changed: +12 −13

content/11-machine-learning/neural-net-derivation.md

Lines changed: 12 additions & 13 deletions
@@ -10,7 +10,7 @@ and activation function, $g(\xi)$.

 Let's start with our cost function:

-$$\mathcal{L}(A_{ij}) = \sum_{i=1}^{N_\mathrm{out}} (z_i - y_i^k)^2 = \sum_{i=1}^{N_\mathrm{out}}
+$$\mathcal{L}(A_{ij}) = \sum_{i=1}^{N_\mathrm{out}} (z_i - y_i^k)^2 = \sum_{i=1}^{N_\mathrm{out}}
 \Biggl [ g\biggl (\underbrace{\sum_{j=1}^{N_\mathrm{in}} A_{ij} x^k_j}_{\equiv \alpha_i} \biggr ) - y^k_i \Biggr ]^2$$

 where we'll refer to the product ${\boldsymbol \alpha} \equiv {\bf
@@ -21,16 +21,16 @@ element, $A_{pq}$ by applying the chain rule:

 $$\frac{\partial \mathcal{L}}{\partial A_{pq}} =
 2 \sum_{i=1}^{N_\mathrm{out}} (z_i - y^k_i) \left . \frac{\partial g}{\partial \xi} \right |_{\xi=\alpha_i} \frac{\partial \alpha_i}{\partial A_{pq}}$$
-
+

 with

 $$\frac{\partial \alpha_i}{\partial A_{pq}} = \sum_{j=1}^{N_\mathrm{in}} \frac{\partial A_{ij}}{\partial A_{pq}} x^k_j = \sum_{j=1}^{N_\mathrm{in}} \delta_{ip} \delta_{jq} x^k_j = \delta_{ip} x^k_q$$

 and for $g(\xi)$, we will assume the sigmoid function, so

-$$\frac{\partial g}{\partial \xi}
-= \frac{\partial}{\partial \xi} \frac{1}{1 + e^{-\xi}}
+$$\frac{\partial g}{\partial \xi}
+= \frac{\partial}{\partial \xi} \frac{1}{1 + e^{-\xi}}
 = - (1 + e^{-\xi})^{-2} (- e^{-\xi})
 = g(\xi) \frac{e^{-\xi}}{1+ e^{-\xi}} = g(\xi) (1 - g(\xi))$$

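The identity derived in this hunk, $\partial g/\partial \xi = g(\xi)(1 - g(\xi))$, is easy to spot-check numerically. A minimal sketch, not part of the commit, under the same sigmoid assumption (the names here are illustrative placeholders):

```python
import numpy as np

def sigmoid(xi):
    # the activation function g(xi) assumed in the notes
    return 1.0 / (1.0 + np.exp(-xi))

xi = np.linspace(-5.0, 5.0, 11)
eps = 1.0e-6

# centered finite difference vs. the closed form g(xi) * (1 - g(xi))
numerical = (sigmoid(xi + eps) - sigmoid(xi - eps)) / (2.0 * eps)
analytic = sigmoid(xi) * (1.0 - sigmoid(xi))
print(np.max(np.abs(numerical - analytic)))   # should be tiny (roundoff level)
```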
@@ -41,7 +41,7 @@ which gives us:
 (z_i - y^k_i) z_i (1 - z_i) \delta_{ip} x^k_q\\
 &= 2 (z_p - y^k_p) z_p (1- z_p) x^k_q
 \end{align*}
-
+
 where we used the fact that $\delta_{ip}$ means that only a single term contributes to the sum.

 ```{note}
@@ -57,7 +57,7 @@ Observe that:

 Now ${\bf z}$ and ${\bf y}^k$ are all vectors of size $N_\mathrm{out} \times 1$ and ${\bf x}^k$ is a vector of size $N_\mathrm{in} \times 1$, so we can write this expression for the matrix as a whole as:

-$$\frac{\partial f}{\partial {\bf A}} = 2 ({\bf z} - {\bf y}^k) \circ {\bf z} \circ (1 - {\bf z}) \cdot ({\bf x}^k)^\intercal$$
+$$\frac{\partial \mathcal{L}}{\partial {\bf A}} = 2 ({\bf z} - {\bf y}^k) \circ {\bf z} \circ (1 - {\bf z}) \cdot ({\bf x}^k)^\intercal$$

 where the operator $\circ$ represents _element-by-element_ multiplication (the [Hadamard product](https://en.wikipedia.org/wiki/Hadamard_product_(matrices))).

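The corrected matrix expression in this hunk maps onto an elementwise (Hadamard) product followed by an outer product. A sketch with a finite-difference spot check, not part of the commit (all function names, sizes, and the random data are hypothetical):

```python
import numpy as np

def sigmoid(xi):
    return 1.0 / (1.0 + np.exp(-xi))

def loss(A, x, y):
    # L = sum_i [ g((A x)_i) - y_i ]^2 for one training pair (x, y)
    return np.sum((sigmoid(A @ x) - y) ** 2)

def grad_loss(A, x, y):
    # dL/dA = 2 (z - y) o z o (1 - z) . x^T : Hadamard products, then an outer product
    z = sigmoid(A @ x)
    return np.outer(2.0 * (z - y) * z * (1.0 - z), x)

rng = np.random.default_rng(12345)
A = rng.normal(size=(3, 4))
x = rng.normal(size=4)
y = rng.normal(size=3)

# compare one element of the analytic gradient to a centered difference
eps = 1.0e-6
Ap, Am = A.copy(), A.copy()
Ap[1, 2] += eps
Am[1, 2] -= eps
fd = (loss(Ap, x, y) - loss(Am, x, y)) / (2.0 * eps)
print(fd, grad_loss(A, x, y)[1, 2])   # the two values should agree to several digits
```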
@@ -76,19 +76,18 @@ descent suggests, scaled by a _learning rate_ $\eta$.

 The overall minimization appears as:

-<div style="border:solid;padding:10px;width:80%;margin:0 auto;background:#eeeeee">
+```{card} Minimization
 * Loop over epochs

   * Loop over the training data, $\{ ({\bf x}^0, {\bf y}^0), ({\bf x}^1, {\bf y}^1), \ldots \}$. We'll refer to the current training
     pair as $({\bf x}^k, {\bf y}^k)$
-
+
     * Propagate ${\bf x}^k$ through the network, getting the output
       ${\bf z} = g({\bf A x}^k)$
-
+
     * Compute the error on the output layer, ${\bf e}^k = {\bf z} - {\bf y}^k$
-
+
     * Update the matrix ${\bf A}$ according to:
-
-      $${\bf A} \leftarrow {\bf A} - 2 \,\eta\, {\bf e}^k \circ {\bf z} \circ (1 - {\bf z}) \cdot ({\bf x}^k)^\intercal$$
-
-</div>
+
+      $${\bf A} \leftarrow {\bf A} - 2 \,\eta\, {\bf e}^k \circ {\bf z} \circ (1 - {\bf z}) \cdot ({\bf x}^k)^\intercal$$
+```
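The minimization loop in the new `{card}` block translates directly into a short training routine. A minimal sketch under the notes' single-matrix, sigmoid-activation assumptions, not part of the commit (the training pairs, learning rate, and all names are illustrative placeholders):

```python
import numpy as np

def sigmoid(xi):
    return 1.0 / (1.0 + np.exp(-xi))

def train(pairs, n_in, n_out, n_epochs=100, eta=0.1, seed=12345):
    """Minimize L by gradient descent on a single weight matrix A.

    pairs is a list of (x, y) training vectors; all names here are
    illustrative placeholders.
    """
    rng = np.random.default_rng(seed)
    A = rng.normal(size=(n_out, n_in))
    for _ in range(n_epochs):                 # loop over epochs
        for x, y in pairs:                    # loop over the training data (x^k, y^k)
            z = sigmoid(A @ x)                # propagate x^k through the network
            e = z - y                         # error on the output layer
            A -= 2.0 * eta * np.outer(e * z * (1.0 - z), x)   # update A
    return A
```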

