Previous | Next --- Slide 43 of 43
Back to Lecture Thumbnails
dyzz

Spark is a very powerful tool to build on top of because it provides very robust methods of parallelism that can be very powerful for its use cases. I wonder on which workloads these spark versions of libraries perform better than their non-spark counterparts. Specifically, I wonder if they still perform better/the same for smaller datasets that dont necessarily have to be streamed from disk.