Slide 13 of 56
jhhardin

Something related that we talked about a lot in Distributed Systems is task migration. We try to statically assign work so that it is balanced, but it can be hard to know in advance how long each task will take. In some cases it is best to reassign a task after the initial assignment, which means migrating it to another worker. Migration is expensive, but it can make the overall job finish sooner, and it may even be necessary (if a worker dies, for example).
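To make the trade-off concrete, here is a minimal sketch of static assignment followed by one migration step. Everything in it is an assumption made for illustration: the `Worker` class, the round-robin policy in `assign_static`, the `migrate_once` helper, and the idea of modeling migration as a fixed extra cost added to the moved task. It is not how any particular system does it.

```python
from dataclasses import dataclass, field


@dataclass
class Worker:
    """Hypothetical worker holding the (now known) costs of its tasks."""
    name: str
    tasks: list = field(default_factory=list)

    def load(self):
        return sum(self.tasks)


def assign_static(costs, workers):
    """Round-robin assignment, done before the real task costs are known."""
    for i, cost in enumerate(costs):
        workers[i % len(workers)].tasks.append(cost)


def migrate_once(workers, migration_cost):
    """Move one task from the busiest to the idlest worker, but only if the
    move still shortens the finish time after paying migration_cost."""
    src = max(workers, key=Worker.load)
    dst = min(workers, key=Worker.load)
    if src is dst or not src.tasks:
        return False

    def finish_time_if_moved(task):
        # Finish time of the slower of the two workers after the move.
        return max(src.load() - task, dst.load() + task + migration_cost)

    task = min(src.tasks, key=finish_time_if_moved)
    if finish_time_if_moved(task) >= src.load():
        return False  # migration is too expensive to pay off
    src.tasks.remove(task)
    dst.tasks.append(task + migration_cost)
    return True


workers = [Worker("w0"), Worker("w1")]
assign_static([8, 1, 8, 1], workers)   # w0 gets both long tasks
migrated = migrate_once(workers, migration_cost=1)
print(migrated, [w.load() for w in workers])
```

With a small migration cost the long task moves and the slower worker finishes earlier; with a very large cost `migrate_once` declines to move anything, which is the "expensive" side of the trade-off.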

nrchu

Yes, but I would point out that if a worker dies you cannot migrate the process; you have to restart it on another node. I believe "migrating" implies some way of transferring the work already done. If the process was writing to a distributed file system, you could in theory try to find where it left off, but I am not sure it would be possible to tell that no data was corrupted.
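One common way to approach the "find where it left off" idea is to write periodic checkpoints together with a checksum, and to fall back to a full restart when the checkpoint does not verify. The sketch below assumes a made-up record format and hypothetical helper names (`make_checkpoint`, `load_checkpoint`, `run_task`). Note that the checksum only detects damage to the checkpoint record itself; it says nothing about other output the dead worker may have partially written.

```python
import hashlib
import json


def make_checkpoint(state):
    """Serialize state together with a SHA-256 digest of the payload."""
    payload = json.dumps(state, sort_keys=True)
    digest = hashlib.sha256(payload.encode()).hexdigest()
    return json.dumps({"payload": payload, "sha256": digest})


def load_checkpoint(record):
    """Return the saved state, or None if the record is missing,
    truncated, or fails its checksum."""
    if record is None:
        return None
    try:
        outer = json.loads(record)
        payload = outer["payload"]
        if hashlib.sha256(payload.encode()).hexdigest() != outer["sha256"]:
            return None
        return json.loads(payload)
    except (ValueError, KeyError, TypeError, AttributeError):
        return None


def run_task(items, record=None):
    """Sum items, resuming from a valid checkpoint or restarting from zero."""
    state = load_checkpoint(record) or {"next": 0, "total": 0}
    for i in range(state["next"], len(items)):
        state["total"] += items[i]
        state["next"] = i + 1
    return state["total"]


items = [1, 2, 3, 4]
good = make_checkpoint({"next": 2, "total": 3})
print(run_task(items, good))        # resumes after the first two items
print(run_task(items, good[:-5]))   # truncated record: restarts from scratch
```

Resuming here plays the role of "migration" (work already done is transferred), while the fallback path is the plain restart on another node.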