This is a toy MLP with one hidden layer, built to demonstrate backpropagation from scratch.
The target vector y is fixed to the arbitrary value [10, -2] for simplicity. It can be modified to depend on the input vector x in order to produce useful predictions.
File | Desc |
---|---|
Notebook | Download and modify the code! :) |
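Before stepping through the gradient code, here is a minimal setup sketch. The input size, the number of hidden units, and the random initialization are assumptions for illustration; only the fixed target y = [10, -2] comes from the description above.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
x = rng.standard_normal((3, 1))   # input column vector (size assumed)
y = np.array([[10.0], [-2.0]])    # fixed target from the text
W1 = rng.standard_normal((4, 3))  # hidden-layer weights (4 units assumed)
W2 = rng.standard_normal((2, 4))  # output-layer weights
```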
```python
# forward pass
z = np.dot(W1, x)                   # hidden pre-activation
h = sigmoid(z)                      # hidden activation
out = np.dot(W2, h)                 # network output

# backward pass (the gradients below correspond to L = 0.5 * ||out - y||^2)
a = (out - y).T                     # dL/dout, transposed
b = np.dot(a, W2)                   # backpropagate through W2
c = sigmoid(z) * (1 - sigmoid(z))   # sigmoid'(z)
d = b.T * c                         # dL/dz at the hidden layer
dL_dW1 = np.dot(d, x.T)             # gradient w.r.t. W1
dL_dW2 = np.dot(out - y, h.T)       # gradient w.r.t. W2
```
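With the gradients in hand, a plain gradient-descent step updates both weight matrices. The following loop is a minimal sketch; the learning rate and iteration count are assumed values, not taken from the original notebook.

```python
lr = 0.01  # learning rate (assumed value)
for step in range(100):
    # forward pass
    z = np.dot(W1, x)
    h = sigmoid(z)
    out = np.dot(W2, h)
    loss = 0.5 * np.sum((out - y) ** 2)

    # gradients (same expressions as above)
    d = np.dot((out - y).T, W2).T * sigmoid(z) * (1 - sigmoid(z))
    dL_dW1 = np.dot(d, x.T)
    dL_dW2 = np.dot(out - y, h.T)

    # gradient-descent update
    W1 -= lr * dL_dW1
    W2 -= lr * dL_dW2
    if step % 20 == 0:
        print(f"step {step}: loss = {loss:.4f}")
```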
We can see that backpropagation is working and the correct gradients are being computed. The network learns, and its loss decreases: