Levy

SGD for many neural networks works asynchronously, i.e., there is no synchronized reduction for each sum. Instead, each worker simply hands its update over to a parameter server, which applies it.
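To make that concrete, here is a minimal sketch of the idea, assuming a toy least-squares model and Python threads standing in for worker machines. The names `ParameterServer`, `worker`, and `train_async` are made up for illustration and are not taken from the lecture or from any particular framework.

```python
import threading
import numpy as np

class ParameterServer:
    """Hypothetical in-process stand-in for a parameter server."""
    def __init__(self, dim, lr):
        self.w = np.zeros(dim)
        self.lr = lr
        # The lock only makes each individual read or update atomic.
        # It does not synchronize workers with one another.
        self._lock = threading.Lock()

    def pull(self):
        with self._lock:
            return self.w.copy()

    def push(self, grad):
        with self._lock:
            self.w -= self.lr * grad

def worker(server, X, y, steps, batch, seed):
    rng = np.random.default_rng(seed)
    for _ in range(steps):
        w = server.pull()                  # may be stale by the time we push
        idx = rng.integers(0, len(X), size=batch)
        err = X[idx] @ w - y[idx]
        grad = X[idx].T @ err / batch      # gradient of 0.5 * mean squared error
        server.push(grad)                  # no barrier, no reduction across workers

def train_async(X, y, n_workers=4, steps=500, batch=16, lr=0.05):
    server = ParameterServer(X.shape[1], lr)
    threads = [
        threading.Thread(target=worker, args=(server, X, y, steps, batch, seed))
        for seed in range(n_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return server.pull()
```

Each worker computes its gradient from whatever weights it last pulled, so some updates are based on slightly old parameters. That staleness is the price paid for dropping the synchronized reduction.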

mak

What criterion is used to decide "loss too high"? Is it based on domain-specific or application knowledge, or on an empirical value? Or is it specified as a requirement?

themj

Generally, you check whether the loss is too high by computing the difference between the current solution (what the model outputs now) and the desired solution. If this difference is above a predetermined threshold, the loss is considered too high.