Narrow dependencies are similar to 'map'-style transformations, while wide dependencies are similar to 'reduce'-style transformations.
blah329
Wide dependencies can act as bottlenecks during a computation because they require the previous stage in the pipeline to complete before the next one can proceed. Narrow dependencies, by contrast, let each node continue its computation independent of the status of the other nodes.
sadkins
Going back to the example from the beginning of lecture, counting all of the requests from each mobile provider would be an example of a wide dependency. The data is partitioned across nodes and each node processes only its own partition, so the nodes must communicate to get the records for each mobile client to the correct node.
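To make the distinction concrete, here is a rough sketch in plain Python (not Spark itself) of what counting requests per provider might look like. The provider names, the request records, and the helpers `narrow_map`, `shuffle`, and `reduce_counts` are all made up for illustration, and `zlib.crc32` is just a deterministic stand-in for whatever hash partitioner a real system would use.

```python
from collections import Counter
import zlib

# Hypothetical request log, already split into partitions (one list per node).
partitions = [
    [("verizon", "/home"), ("att", "/search"), ("verizon", "/cart")],
    [("tmobile", "/home"), ("att", "/home")],
    [("verizon", "/search"), ("tmobile", "/cart"), ("att", "/cart")],
]

def narrow_map(partition):
    """Narrow dependency: the output partition reads only its own input partition."""
    return [(provider, 1) for provider, _path in partition]

def target_partition(key, num_partitions):
    """Deterministic stand-in for a hash partitioner (an assumption, not Spark's)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions

def shuffle(mapped_partitions, num_partitions):
    """Wide dependency: each output partition may read from every input partition."""
    shuffled = [[] for _ in range(num_partitions)]
    for partition in mapped_partitions:
        for key, value in partition:
            shuffled[target_partition(key, num_partitions)].append((key, value))
    return shuffled

def reduce_counts(partition):
    """Sum the counts for every key that landed in this partition."""
    counts = Counter()
    for key, value in partition:
        counts[key] += value
    return dict(counts)

# Map stage: each partition is processed independently, no communication.
mapped = [narrow_map(p) for p in partitions]

# Shuffle + reduce stage: cannot start until all map outputs exist.
shuffled = shuffle(mapped, num_partitions=2)
per_partition_counts = [reduce_counts(p) for p in shuffled]

# Each provider ends up in exactly one partition, so merging is a plain union.
total_counts = {}
for counts in per_partition_counts:
    total_counts.update(counts)
print(total_counts)
```

The map step touches one partition at a time, so a slow node delays only its own output. The shuffle has to see every mapped partition before any reducer has complete data, which is where the stage barrier comes from.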
rrp123
In this case, we will need all of RDD A to be materialized in memory, since RDD B needs all of RDD A to be built.