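Unlike during network evaluation, you must hold on to all of these outputs until backpropagation comes back through the network, because the backward pass needs each layer's stored output to compute its gradients. This is why there is a limit to the size of the network you can train with a given amount of memory.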
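To put rough numbers on that, here is a back-of-the-envelope sketch. The layer shapes are made up for illustration (they are not from the lecture), the network is assumed to be a plain chain of layers, and activations are assumed to be float32. Real frameworks add workspace and other overhead, so treat this as a lower bound rather than a measurement.

```python
import math

BYTES_PER_FLOAT32 = 4

# Hypothetical per-layer output shapes (height, width, channels).
layer_output_shapes = [(224, 224, 64), (112, 112, 128), (56, 56, 256)]

def training_activation_bytes(shapes, batch_size=1):
    # Training: every layer's output stays in memory until backprop reaches it.
    total = sum(math.prod(s) for s in shapes)
    return batch_size * BYTES_PER_FLOAT32 * total

def evaluation_activation_bytes(shapes, batch_size=1):
    # Evaluation (simple chain assumed): only one layer's input and output
    # need to be alive at a time, so take the largest adjacent pair.
    sizes = [math.prod(s) for s in shapes]
    if len(sizes) == 1:
        peak = sizes[0]
    else:
        peak = max(a + b for a, b in zip(sizes, sizes[1:]))
    return batch_size * BYTES_PER_FLOAT32 * peak

print(training_activation_bytes(layer_output_shapes, batch_size=32))
print(evaluation_activation_bytes(layer_output_shapes, batch_size=32))
```

The training figure grows with every layer you add, while the evaluation figure only depends on the largest neighboring pair, which is the asymmetry described above.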
BigPapaChu
Right now it's a 3D representation of a training grid (like 3x3x384). Are there possibly 4D representations of neural nets, like 3x3x3x384? Or is that pointless?
crow
@bigpapachu Yes, this comes into play when working with video data or 3D data (such as CT scans).
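A small sketch of what the extra dimension costs, reading 3x3x384 as a 3x3 filter over 384 input channels and 3x3x3x384 as its 3D counterpart. The helper names and the 16-frame 112x112 clip are my own illustrative choices, not anything from the lecture.

```python
import math

def filter_weight_count(kernel_dims, in_channels):
    # Weights in one filter: product of spatial extents times input channels.
    return math.prod(kernel_dims) * in_channels

def conv_output_dims(input_dims, kernel_dims, stride=1, padding=0):
    # Standard convolution size formula, applied per spatial dimension.
    return tuple(
        (i + 2 * padding - k) // stride + 1
        for i, k in zip(input_dims, kernel_dims)
    )

print(filter_weight_count((3, 3), 384))     # 2D filter: 3x3x384
print(filter_weight_count((3, 3, 3), 384))  # 3D filter: 3x3x3x384

# Hypothetical video clip: 16 frames of 112x112, convolved with a 3x3x3 filter.
print(conv_output_dims((16, 112, 112), (3, 3, 3)))
```

The 3D filter has three times the weights of the 2D one, and its output keeps a depth (or time) axis, so the stored activations grow as well.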
rsvaidya
If you are using momentum, with a velocity term that accumulates the changing gradients, you would need to hold on to those velocities as well while training.
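For example, a minimal sketch of classical SGD with momentum on plain Python lists. The function name, learning rate, and momentum value are illustrative assumptions; the point is only that `velocity` has one entry per parameter and has to persist between steps.

```python
def sgd_momentum_step(params, grads, velocity, lr=0.01, momentum=0.9):
    # Classical momentum: the velocity is a decaying sum of past gradients,
    # so it must be kept in memory from one step to the next.
    for i in range(len(params)):
        velocity[i] = momentum * velocity[i] - lr * grads[i]
        params[i] += velocity[i]

params = [1.0, 2.0]
grads = [0.5, -0.5]       # hypothetical gradients for one step
velocity = [0.0, 0.0]     # same length as params: the extra state to store

sgd_momentum_step(params, grads, velocity)
print(params, velocity)
```

So on top of the weights and the stored layer outputs, momentum adds roughly one more copy of the parameters to the training memory budget.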