Slide View : Parallel Computer Architecture and Programming : 15-418/618 Spring 2017

In-Memory Distributed Computing using Spark

Slide 31 of 43

chenh1

Narrow dependencies are similar to 'map'-style transformations, while wide dependencies are similar to 'reduce'-style transformations.
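To make the map/reduce analogy concrete, here is a minimal sketch in plain Python (not Spark itself). It assumes a toy "RDD" modeled as a list of partitions; the names `narrow_map` and `wide_reduce_by_key` and the sample data are hypothetical and only for illustration.

```python
from collections import defaultdict

# A toy "RDD": a list of partitions, each a list of (key, value) records.
partitions = [
    [("a", 1), ("b", 2)],
    [("a", 3), ("c", 4)],
    [("b", 5), ("c", 6)],
]

def narrow_map(parts, fn):
    """Map-style: output partition i is computed from input partition i only."""
    return [[fn(rec) for rec in part] for part in parts]

def wide_reduce_by_key(parts, num_out):
    """Reduce-style: each output partition may read from every input partition."""
    out = [defaultdict(int) for _ in range(num_out)]
    for part in parts:
        for key, value in part:
            # Toy partitioner (an assumption): route by the key's first character.
            out[ord(key[0]) % num_out][key] += value
    return [dict(sorted(d.items())) for d in out]

doubled = narrow_map(partitions, lambda kv: (kv[0], kv[1] * 2))
summed = wide_reduce_by_key(partitions, num_out=2)
```

In `narrow_map` the partition structure is untouched, so each partition could be processed on its own node. In `wide_reduce_by_key` the values for key "a" start out in two different input partitions and have to be brought together.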

blah329

Wide dependencies can become bottlenecks in a computation because they require the previous stage of the pipeline to complete before the next one can start. Narrow dependencies, by contrast, let each node continue with its computation regardless of the status of the other nodes.
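The barrier behavior can be sketched with a small readiness check. This is only an illustration under assumed names (`narrow_ready`, `wide_ready`) and an assumed completion order for the parent partitions; it is not how Spark's scheduler is implemented.

```python
def narrow_ready(finished, i):
    """A narrow child partition i needs only parent partition i."""
    return i in finished

def wide_ready(finished, num_parents):
    """A wide child partition needs every parent partition to be finished."""
    return all(p in finished for p in range(num_parents))

num_parents = 4
finished = set()
timeline = []
for p in [2, 0, 3, 1]:  # hypothetical order in which parent partitions complete
    finished.add(p)
    # Record: (partition that just finished,
    #          can narrow child 0 start?, can a wide child start?)
    timeline.append((p, narrow_ready(finished, 0), wide_ready(finished, num_parents)))
```

The narrow child for partition 0 can start as soon as parent partition 0 is done, while the wide child stays blocked until the slowest parent partition finishes.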

sadkins

Going back to the example from the beginning of lecture, counting all of the requests from each mobile provider would be an example of a wide dependency. Each node processes only its own partition of the data, so the nodes must communicate to get all of the records for a given mobile client to the same node.
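A rough sketch of that communication step, again in plain Python rather than Spark. The log contents, the `partition_for` helper, and the choice to pre-count locally before exchanging data are all assumptions made for illustration.

```python
from collections import Counter
import zlib

# Hypothetical per-node log partitions: one client name per request.
node_logs = [
    ["iPhone", "Android", "iPhone"],
    ["Android", "Android", "Windows"],
    ["iPhone", "Windows", "Android"],
]

def partition_for(key, num_nodes):
    # Deterministic hash so a given key is always owned by the same node.
    return zlib.crc32(key.encode("utf-8")) % num_nodes

def count_by_key(logs):
    num_nodes = len(logs)
    # Step 1 (local): each node counts within its own partition.
    partial = [Counter(part) for part in logs]
    # Step 2 (shuffle): send each (key, partial count) to the node that owns the key.
    inbox = [[] for _ in range(num_nodes)]
    for counts in partial:
        for key, n in counts.items():
            inbox[partition_for(key, num_nodes)].append((key, n))
    # Step 3 (local): each node sums the partial counts it received.
    result = []
    for msgs in inbox:
        totals = Counter()
        for key, n in msgs:
            totals[key] += n
        result.append(dict(totals))
    return result

per_node = count_by_key(node_logs)
totals = {key: n for node in per_node for key, n in node.items()}
```

Step 2 is the wide dependency: every node may send data to every other node, and no final count is correct until all of the partial counts for that key have arrived.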

rrp123

In this case, all of RDD A must be materialized in memory, since building RDD B requires all of RDD A.