Slide View : Parallel Computer Architecture and Programming : 15-418/618 Spring 2016

In-Memory Distributed Computing using Spark

Slide 17 of 44

Funky9000

Loading from disk incurs a heavy latency cost on every iteration. Perhaps we can reuse the intermediate graphs to avoid these frequent loads.

cyl

Why do we need to stick to the map -> reduce process instead of writing a specialized program for our needs?

lol

You don't have to use map-reduce. The slide is just saying that in this context, i.e. the PageRank graph processing algorithm, using the map-reduce framework would entail loading from disk on each iteration.
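
To make the per-iteration cost concrete, here is a rough sketch in plain Python (not Spark or Hadoop code). It assumes a toy graph in which every page has at least one out-link, a damping factor of 0.85, and a local JSON file standing in for the distributed file system. The names `pagerank_step`, `run_with_disk`, and `run_in_memory` are made up for illustration.

```python
import json
import os
import tempfile
from collections import defaultdict

DAMPING = 0.85  # assumed damping factor; not taken from the slide


def pagerank_step(links, ranks):
    """One PageRank iteration written as a map phase and a reduce phase.

    Assumes every page has at least one out-link (no dangling pages).
    """
    # Map: each page emits (neighbor, share of its rank) pairs.
    contribs = []
    for page, neighbors in links.items():
        for neighbor in neighbors:
            contribs.append((neighbor, ranks[page] / len(neighbors)))
    # Reduce: sum the contributions received by each page.
    sums = defaultdict(float)
    for page, contrib in contribs:
        sums[page] += contrib
    n_pages = len(links)
    return {p: (1 - DAMPING) / n_pages + DAMPING * sums[p] for p in links}


def run_with_disk(links, iterations, workdir):
    """Map-reduce style: state is written out and reloaded every iteration."""
    path = os.path.join(workdir, "state.json")  # stand-in for a DFS file
    ranks = {p: 1 / len(links) for p in links}
    with open(path, "w") as f:
        json.dump({"links": links, "ranks": ranks}, f)
    loads = 0
    for _ in range(iterations):
        with open(path) as f:  # reload graph and ranks from disk
            state = json.load(f)
        loads += 1
        ranks = pagerank_step(state["links"], state["ranks"])
        with open(path, "w") as f:  # write the result back for the next job
            json.dump({"links": state["links"], "ranks": ranks}, f)
    return ranks, loads


def run_in_memory(links, iterations):
    """Same computation, but the graph and ranks stay in memory throughout."""
    ranks = {p: 1 / len(links) for p in links}
    for _ in range(iterations):
        ranks = pagerank_step(links, ranks)
    return ranks


if __name__ == "__main__":
    toy_links = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
    with tempfile.TemporaryDirectory() as workdir:
        disk_ranks, loads = run_with_disk(toy_links, 10, workdir)
    print("disk loads:", loads)
    print("in-memory ranks:", run_in_memory(toy_links, 10))
    print("disk ranks:", disk_ranks)
```

Both versions compute the same ranks. The only difference is that `run_with_disk` reads and writes its state once per iteration, which is the cost the slide is pointing at, while `run_in_memory` reuses the intermediate result directly.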