Slide 62 of 69
acortes

Wouldn't this decrease accuracy significantly? We are already partitioning the updates by using worker nodes and then partitioning the updates even further by adding servers.

ferozenaina

This wouldn't decrease accuracy at all compared with the previous implementation.

Here, we are simply trying to reduce network traffic on a single interconnect (between the worker nodes and the parameter server) by having two parameter servers. This is similar to how RAID 0 works: the data (parameters) are sharded across two servers. Each parameter still lives on exactly one server and receives the same updates as before, so the result is unchanged. If network communication or parameter I/O is the bottleneck, this will actually improve performance.
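
To make the sharding point concrete, here is a minimal sketch (not the lecture's code). The `ParameterServer` class, the `run` function, the contiguous split, and the 8-bytes-per-value traffic estimate are all assumptions made up for illustration. It applies the same worker gradients with one server and with two, so you can compare the final parameters and the traffic each server sees.

```python
class ParameterServer:
    """Holds one shard of the parameter vector (hypothetical class)."""

    def __init__(self, values):
        self.values = list(values)
        self.bytes_received = 0  # crude stand-in for traffic on this server's link

    def apply_update(self, grads, lr):
        # Assumption: each gradient value costs 8 bytes on the wire.
        self.bytes_received += 8 * len(grads)
        for i, g in enumerate(grads):
            self.values[i] -= lr * g


def run(params, worker_grads, num_servers, lr=0.1):
    """Apply every worker's gradient with params split into num_servers shards."""
    n = len(params)
    # Contiguous shard boundaries, e.g. [0, 2, 4] for 4 parameters on 2 servers.
    bounds = [n * s // num_servers for s in range(num_servers + 1)]
    servers = [
        ParameterServer(params[bounds[s]:bounds[s + 1]])
        for s in range(num_servers)
    ]
    for grads in worker_grads:
        # Each worker sends every server only the slice that server owns.
        for s, server in enumerate(servers):
            server.apply_update(grads[bounds[s]:bounds[s + 1]], lr)
    final = [v for server in servers for v in server.values]
    traffic = [server.bytes_received for server in servers]
    return final, traffic
```

Calling `run` with `num_servers=1` and `num_servers=2` on the same inputs should return the same parameter values, with the second call splitting the bytes received evenly between the two servers.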

gogogo

Is the frequency of communication of the most up to date parameter values up to the implementer of the system? Is the optimal frequency determined empirically?

ferozenaina

The parameter values on the parameter server are updated only when a worker node completes its computation. Likewise, the other worker nodes acquire the updated parameters from the server only on their next iteration.

So the frequency varies and depends on each worker node's workload.
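
A rough way to see this is a small event-driven simulation. This is only a sketch and assumes a fully asynchronous scheme in which a worker pushes its update as soon as it finishes and then immediately pulls fresh parameters. The `simulate` function and its arguments are hypothetical names, and "time" is in arbitrary units.

```python
import heapq


def simulate(iteration_times, horizon):
    """Worker w needs iteration_times[w] time units per iteration (assumed fixed)."""
    version = 0  # number of updates the server has applied so far
    # Each event is (finish_time, worker_id, server_version_when_worker_pulled).
    events = [(t, w, 0) for w, t in enumerate(iteration_times)]
    heapq.heapify(events)
    pushes = [0] * len(iteration_times)
    staleness = []
    while events:
        now, w, pulled = heapq.heappop(events)
        if now > horizon:
            break
        # How many updates the server applied since this worker last pulled.
        staleness.append(version - pulled)
        # Push: the server applies this worker's update.
        version += 1
        pushes[w] += 1
        # Pull: the worker fetches the latest parameters and starts again.
        heapq.heappush(events, (now + iteration_times[w], w, version))
    return pushes, staleness
```

With `simulate([1, 2], horizon=4)`, the faster worker pushes more often than the slower one, and the slower worker's updates are computed from older parameters. Nobody sets the communication frequency directly in this scheme; it falls out of how long each worker's iteration takes.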