Slide View : Parallel Computer Architecture and Programming : 15-418/618 Spring 2017

In-Memory Distributed Computing using Spark

Slide 36 of 43

unparalleled

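Narrow dependencies are good because they need less communication: each partition of the parent RDD is used by at most one partition of the child RDD, so no shuffle across nodes is required. Also, if something fails, redoing the operation is cheaper when the dependency is narrow, since only the parent partitions that feed the lost partition have to be recomputed.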
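
To make the recovery-cost point concrete, here is a minimal pure-Python sketch (not Spark code). It models an RDD as a list of partitions and compares which parent partitions are needed to rebuild one lost child partition. The names `bucket`, `narrow_parents`, `wide_parents`, and the sample data are all hypothetical, and `bucket` is only a deterministic stand-in for a hash partitioner.

```python
# Toy model: an RDD is a list of partitions, each partition a list of records.
parent = [
    [("a", 1), ("b", 2)],
    [("a", 3), ("c", 4)],
    [("b", 5), ("c", 6)],
]
NUM_CHILD_PARTITIONS = 3


def bucket(key):
    # Deterministic stand-in for a hash partitioner (assumption, not Spark's).
    return ord(key) % NUM_CHILD_PARTITIONS


def narrow_parents(child_index):
    # map/filter style: child partition i depends only on parent partition i.
    return {child_index}


def wide_parents(child_index):
    # groupByKey style: child partition i needs every parent partition that
    # holds at least one key assigned to bucket i.
    return {
        p for p, part in enumerate(parent)
        if any(bucket(k) == child_index for k, _ in part)
    }


def rebuild_map_partition(child_index):
    # Recompute one lost partition of a "double the value" map:
    # reads exactly one parent partition.
    return [(k, v * 2) for k, v in parent[child_index]]


def rebuild_group_partition(child_index):
    # Recompute one lost partition of a group-by-key:
    # has to scan records from several parent partitions.
    groups = {}
    for part in parent:
        for k, v in part:
            if bucket(k) == child_index:
                groups.setdefault(k, []).append(v)
    return sorted(groups.items())


print(narrow_parents(1), rebuild_map_partition(1))
print(wide_parents(1), rebuild_group_partition(1))
```

In this toy data, the narrow case touches one parent partition while the wide case touches two. On a real cluster those partitions may live on different nodes, which is where the extra communication comes from.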

aperiwal

In case of node failure, the lost partitions can be reconstructed by replaying the recorded sequence of transformations (the lineage) on the original RDDs. Thus the data is not lost permanently: partitions lost in a crash can be recomputed.
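
A rough sketch of that recovery idea in plain Python, assuming a simplified model where every transformation is narrow and the base input is stable. `ToyRDD` and its methods are hypothetical illustrations, not the Spark API.

```python
class ToyRDD:
    """Hypothetical stand-in for an RDD: cached partitions plus lineage."""

    def __init__(self, source=None, parent=None, fn=None):
        self.source = source  # stable input partitions (base RDD only)
        self.parent = parent  # lineage: the RDD this one was derived from
        self.fn = fn          # per-partition transformation (narrow)
        self.cache = {}       # partition index -> materialized records

    def map(self, f):
        return ToyRDD(parent=self, fn=lambda part: [f(x) for x in part])

    def filter(self, pred):
        return ToyRDD(parent=self, fn=lambda part: [x for x in part if pred(x)])

    def partition(self, i):
        # Return partition i, recomputing it from lineage if it is missing.
        if i not in self.cache:
            if self.parent is None:
                self.cache[i] = list(self.source[i])
            else:
                self.cache[i] = self.fn(self.parent.partition(i))
        return self.cache[i]

    def lose(self, i):
        # Simulate a node failure that drops the materialized partition.
        self.cache.pop(i, None)


base = ToyRDD(source=[[1, 2, 3], [4, 5, 6]])
result = base.map(lambda x: x * 10).filter(lambda x: x > 20)

before = result.partition(1)  # materialize partition 1
result.lose(1)                # "crash": drop it here...
result.parent.lose(1)         # ...and in the intermediate RDD
after = result.partition(1)   # rebuilt by replaying map, then filter

print(before, after)
```

Only partition 1 is replayed; partition 0 is never touched by the recovery, which is the benefit of narrow dependencies in this model.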