Previous | Next --- Slide 19 of 45
Back to Lecture Thumbnails
jmnash

At first I found child stealing/continuation first vs. continuation stealing/child first to be confusing, so I'm going to try to explain the differences between them.

I thought child stealing was easier to understand. I think the way it works is the system keeps executing the continuation and puts the child on thread 0's work queue. Then the child is available to be stolen by other threads in the system.

With continuation stealing, it seems like what happens is that thread 0 immediately starts executing the child and then puts the continuation (from the current point onward) on its work queue. Then the continuation is available to be stolen by other threads. If a thread steals the continuation and it spawns more children, those children will be immediately executed by the thread and the continuation from the new current point will be put on the queue again and be available to be stolen.

tomshen

I'm going to try to explain continuation first and child first a little bit differently.

When a system runs continuation first, a thread adds all of the child calls to its work queue before attempting to execute any of them. Anything in the body of the continuation up cilk_sync will be executed sequentially, and the work in the cilk_spawns will be added to the queue in order. Then, once the thread reaches cilk_sync, it will run any children still in the queue in order. This allows other threads to steal any of the child calls.

When a system runs child first, each time a thread reaches a call to cilk_spawn, it will execute that work, and put the rest of the calling function (the continuation) into its queue. This allows other threads to steal the (potentially very large) continuation while this thread is executing the first child.

mofarrel

Fork-join parallelism (here the kind expressed by cilk), creates a dependency tree. Certain information must be computed prior to a specific cilk_sync call. We can compute answers to this dependency tree in different orders. Here computing the continuation first, and putting the child on the work queue, corresponds to a BFS traversal. On the other hand, computing the child first, and putting the continuation on the work queue, corresponds to a DFS traversal. One could imagine coming up with many other ways to traverse the dependency tree that could lead to performance improvements.