chenh1

What if, in the max function, we changed 12 to 17? Then it would become the larger input and contribute to the output instead.

bochet

Something interesting: if x and y have the same value, then the gradient of max should arguably be propagated back to both inputs.

williamx

@chenh1 True, this could be an issue. But when we do gradient descent, we usually use step sizes small enough that such a scenario won't occur frequently.

Master

@bochet: In max pooling, if x and y happen to have the same value, only one of their gradients is propagated, chosen deterministically by the implementation's tie-breaking rule. In practice this seldom happens, since small step sizes make exact ties unlikely.
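
For concreteness, here is a minimal sketch of how such a deterministic tie-break typically looks (this assumes a NumPy-style masked backward pass; the x >= y convention is an assumption for illustration, not the lecture's code):

    import numpy as np

    def max_forward(x, y):
        return np.maximum(x, y)

    def max_backward(x, y, grad_out):
        # Route the upstream gradient with a fixed comparison; on an
        # exact tie (x == y) the mask favors x, so only x's gradient
        # is nonzero -- deterministic, and never both inputs at once.
        mask = x >= y
        return grad_out * mask, grad_out * ~mask

    gx, gy = max_backward(np.float64(2.0), np.float64(2.0), np.float64(1.0))
    print(gx, gy)  # 1.0 0.0 -- the tie sends the whole gradient to x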

POTUS

How the blue numbers were obtained (a worked sketch follows the list):

  • + passes the incoming gradient backwards unchanged along each of its input links

  • max() routes the incoming gradient to only the input link that achieved the maximum; the other input gets zero

  • * multiplies the incoming gradient by the value on the opposite input link
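
Putting the three rules together on a toy graph f = (a + b) * max(c, d) -- made-up values, not the slide's exact numbers:

    # forward pass
    a, b, c, d = 3.0, -1.0, 12.0, 9.0
    s = a + b        # + gate
    m = max(c, d)    # max gate
    f = s * m        # * gate

    # backward pass, starting from df/df = 1
    df = 1.0
    ds = df * m                   # * gate: multiply by the opposite link's value
    dm = df * s
    da = ds                       # + gate: pass the gradient through unchanged
    db = ds
    dc = dm if c >= d else 0.0    # max gate: only the winning input gets it
    dd = dm if d > c else 0.0

    print(da, db, dc, dd)  # 12.0 12.0 2.0 0.0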