
Spark was a huge improvement over MapReduce. However, for ML training on big data, Spark doesn't perform as well as frameworks that introduce asynchrony. The reason is the Bulk Synchronous Parallel (BSP) model adopted by MapReduce-like systems such as Hadoop and Spark: workers must wait for stragglers at the end of each iteration, so every iteration runs only as fast as its slowest worker.
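To make the straggler cost concrete, here is a rough timing sketch in plain Python. It is not Spark code, and the numbers are made up for illustration: `step_times`, `bsp_time`, and `async_time` are hypothetical names, and the model looks only at wall-clock time, ignoring how stale updates might affect convergence in an asynchronous system.

```python
def bsp_time(step_times):
    """Total wall-clock time when every iteration ends with a barrier.

    step_times[w][i] is the time worker w needs for iteration i.
    With a barrier, each iteration costs as much as its slowest worker.
    """
    per_iteration = zip(*step_times)  # regroup the times by iteration
    return sum(max(times) for times in per_iteration)


def async_time(step_times):
    """Wall-clock time with no barrier: each worker runs its own
    iterations back to back, so the job ends when the slowest worker
    (summed over all of its iterations) finishes."""
    return max(sum(worker) for worker in step_times)


# Hypothetical per-iteration times (seconds) for 3 workers, 4 iterations.
step_times = [
    [1.0, 1.0, 1.0, 1.0],  # worker 0 is never slow
    [1.0, 5.0, 1.0, 1.0],  # worker 1 straggles in iteration 1
    [1.0, 1.0, 1.0, 5.0],  # worker 2 straggles in iteration 3
]

print(bsp_time(step_times))    # 1 + 5 + 1 + 5 = 12.0
print(async_time(step_times))  # max(4, 8, 8) = 8.0
```

In this toy setup each straggler delays everyone under BSP, while without the barrier only the straggling worker pays for its own slow step.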


This is a great article comparing Hadoop and Spark. It covers how they compare in ease of use, in speed, and in combining SQL, streaming, and complex analytics. Finally, it also discusses MapReduce from the perspective of both Spark and Hadoop.