We can find dL/dw by multiplying the transpose of the Jacobian of y with respect to w by dL/dy: dL/dw = (dy/dw)^T dL/dy.
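As a minimal sketch of this rule, consider a linear layer y = W x with loss L = 0.5·||y||². The shapes, the loss, and the closed-form check below are illustrative assumptions, not from the source; the point is only that building the Jacobian of y with respect to the flattened weights and multiplying its transpose by dL/dy recovers the weight gradient.

```python
import numpy as np

# Hypothetical setup (not from the source): a linear layer y = W @ x
# with loss L = 0.5 * ||y||^2, to illustrate dL/dw = (dy/dw)^T dL/dy.
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 4))
x = rng.normal(size=4)

y = W @ x        # forward pass, shape (3,)
dL_dy = y        # gradient of L = 0.5*||y||^2 with respect to y

# Jacobian of y with respect to the flattened weights w = W.reshape(-1):
# dy_i / dW_jk = (i == j) * x_k, so J has shape (3, 12) with x in block i
# of row i. np.kron builds exactly this block structure.
J = np.kron(np.eye(3), x)

# dL/dw = J^T @ dL/dy, reshaped back to W's shape.
dL_dW = (J.T @ dL_dy).reshape(W.shape)

# Sanity check against the closed form for a linear layer: dL/dW = dL/dy x^T.
assert np.allclose(dL_dW, np.outer(dL_dy, x))
```

In practice frameworks never materialize J; they compute the product Jᵀ dL/dy directly (a vector-Jacobian product), which is what makes back-propagation cheap.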
In a nutshell, back-propagation is the process through which each neuron in the neural network is assigned what is sometimes called an "error value," which indicates its contribution to the loss.
These error values can then be used to calculate the gradient of the loss function with respect to each neuron's weights, and the weights can be updated accordingly.
Back-propagation is the mechanism by which the weights are, over time, set correctly.
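The steps above can be sketched for a tiny two-layer network. Everything here (the sigmoid hidden layer, the squared-error loss, the learning rate) is an illustrative assumption, not from the source; it shows each layer receiving its error value (often written delta), the weight gradients derived from those deltas, and the update.

```python
import numpy as np

# Illustrative two-layer network (all shapes and choices are assumptions):
# hidden layer with a sigmoid, linear output, squared-error loss.
rng = np.random.default_rng(1)
W1 = rng.normal(size=(5, 3)) * 0.1   # hidden layer weights
W2 = rng.normal(size=(2, 5)) * 0.1   # output layer weights
x = rng.normal(size=3)               # one input example
t = rng.normal(size=2)               # its target

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Forward pass
h = sigmoid(W1 @ x)                  # hidden activations
y = W2 @ h                           # network output
L = 0.5 * np.sum((y - t) ** 2)       # squared-error loss

# Backward pass: assign each layer its error value (delta)
delta2 = y - t                          # output error, dL/dy
delta1 = (W2.T @ delta2) * h * (1 - h)  # hidden error via the chain rule

# Gradients of the loss with respect to each layer's weights
dW2 = np.outer(delta2, h)
dW1 = np.outer(delta1, x)

# Gradient-descent update: this is how the weights get set over time
lr = 0.01
W2 -= lr * dW2
W1 -= lr * dW1
```

Note how the output error delta2 is propagated backwards through W2ᵀ to produce delta1; that backwards flow of error values is what gives the algorithm its name.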