Slide View : Parallel Computer Architecture and Programming : 15-418/618 Fall 2016

Slide 64 of 69

ferozenaina

Is the internode communication needed because an NxN convolution makes each output pixel depend on neighboring input pixels? For example, if a 6x6 image is split into two 3x6 blocks, a 5x5 convolution lets an input pixel affect outputs up to two pixels away, so pixels near the split boundary influence outputs in the other block. Would we then need internode communication to share the output of the first convolution with worker node 1 before the next layer?
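To make the boundary effect concrete, here is a minimal sketch in plain Python. It assumes the 6x6 image is split by rows (worker 0 owns rows 0-2, worker 1 owns rows 3-5), zero padding at the image border, and a 5x5 box filter; `conv2d_same` and the variable names are hypothetical helpers, not anything from the lecture. It compares worker 0's result with and without two "halo" rows borrowed from worker 1.

```python
def conv2d_same(img, k):
    """Zero-padded 'same' 2D correlation with an odd-sized square kernel."""
    h, w, r = len(img), len(img[0]), len(k) // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in range(-r, r + 1):
                for dx in range(-r, r + 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += k[dy + r][dx + r] * img[yy][xx]
            out[y][x] = acc
    return out


# Hypothetical 6x6 image and a 5x5 box filter (radius 2).
image = [[float(6 * y + x) for x in range(6)] for y in range(6)]
kernel = [[1.0] * 5 for _ in range(5)]

# Reference result computed on the whole image by a single worker.
full = conv2d_same(image, kernel)

# Worker 0 owns rows 0-2. Without neighbor data it treats row 3+ as zeros.
top_no_halo = conv2d_same(image[0:3], kernel)

# With a halo of `radius` rows copied from worker 1, then cropped back.
halo = len(kernel) // 2
top_with_halo = conv2d_same(image[0:3 + halo], kernel)[0:3]

matches_without_halo = top_no_halo == full[0:3]
matches_with_halo = top_with_halo == full[0:3]
print(matches_without_halo, matches_with_halo)
```

In this setup only the rows within two pixels of the split depend on the other worker's data, so the exchange is limited to a two-row halo per layer rather than the whole block.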

What exactly is a 1x1 convolution? I understand the intuition that a 1x1 kernel does not mix neighboring pixels, but is it even useful?

yey1

I would say the purpose of reducing the convolution kernel size is to allow deeper networks. VGG was an early attempt in this direction, and later architectures such as GoogLeNet and ResNet follow the same design pattern.

Empirical studies find that deeper networks perform better, as long as they don't suffer from the vanishing gradient problem. With a deeper network, you also don't need a large convolution kernel in each layer to get a large receptive field, so you can use smaller kernels such as 1x1 and 3x3 to save parameters and computation.
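The trade-off above can be checked with some quick arithmetic. The sketch below assumes stride-1 convolutions with no dilation, equal input and output channel counts, and no bias terms; the function names and the channel count `C = 64` are hypothetical choices for illustration.

```python
def receptive_field(kernel_sizes):
    """Receptive field of stacked stride-1 convolutions: 1 + sum(k - 1)."""
    return 1 + sum(k - 1 for k in kernel_sizes)


def conv_weights(kernel_size, channels):
    """Weight count of one conv layer with `channels` inputs and outputs, no bias."""
    return kernel_size * kernel_size * channels * channels


C = 64  # hypothetical channel count

rf_two_3x3 = receptive_field([3, 3])  # two stacked 3x3 layers
rf_one_5x5 = receptive_field([5])     # one 5x5 layer

weights_two_3x3 = 2 * conv_weights(3, C)
weights_one_5x5 = conv_weights(5, C)
weights_one_1x1 = conv_weights(1, C)

print(rf_two_3x3, rf_one_5x5)            # 5 5
print(weights_two_3x3, weights_one_5x5)  # 73728 102400
print(weights_one_1x1)                   # 4096
```

Under these assumptions, two 3x3 layers cover the same 5x5 receptive field as a single 5x5 layer with 18C^2 weights instead of 25C^2. A 1x1 layer adds no spatial extent but mixes channels at each pixel for only C^2 weights.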