Loading from disk incurs a heavy latency cost on each iteration. Perhaps we could reuse the intermediate graphs in memory to avoid frequent loads.
Why do we need to stick to the map -> reduce process instead of writing a specialized program for our needs?
You don't have to use map-reduce. The point is just that in this context, i.e. the PageRank graph processing algorithm, using the map-reduce framework would entail loading the graph from disk on each iteration.
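To make the per-iteration structure concrete, here is a minimal in-memory sketch of PageRank expressed as a map step (each node emits rank shares to its out-neighbors) and a reduce step (summing shares and applying damping). The toy graph, the damping constant, and the function names are hypothetical; in a disk-based MapReduce framework, each call to the iteration would re-read the graph from disk, which is exactly the cost discussed above.

```python
from collections import defaultdict

# Hypothetical toy graph: node -> list of out-links.
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks = {n: 1.0 / len(graph) for n in graph}
DAMPING = 0.85  # standard PageRank damping factor

def pagerank_iteration(graph, ranks):
    # "Map": each node emits an equal share of its rank to each out-neighbor.
    contribs = defaultdict(float)
    for node, links in graph.items():
        share = ranks[node] / len(links)
        for dest in links:
            contribs[dest] += share
    # "Reduce": sum the shares per node and apply damping.
    return {n: (1 - DAMPING) / len(graph) + DAMPING * contribs[n]
            for n in graph}

for _ in range(10):
    # In a disk-based framework, `graph` would be reloaded here every pass.
    ranks = pagerank_iteration(graph, ranks)
```

Because the graph itself never changes between iterations (only the ranks do), caching it in memory across iterations avoids the repeated loads without abandoning the map/reduce formulation.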