This recalls a classic bit of 418 wisdom: don't compare yourself to an unoptimized parallel solution; compare yourself to an optimized sequential solution.
holard
What are some applications for which Spark DOES perform significantly better than sequential solutions?
vasua
One thing this benchmark doesn't take into consideration is Spark's ability to handle failure, which becomes very important in an enterprise setting, not just when running benchmarks. Should a critical computation fail, it is important that it gets rerun, which is exactly what Spark does on a cluster of machines. In this particular benchmark you pay a high price for that reliability and redundancy.
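To make the fault-tolerance point concrete: Spark records each RDD partition's lineage (the chain of deterministic transformations that produced it), so when a worker dies the scheduler simply reruns the lost tasks instead of failing the whole job. Here is a toy Python sketch of that idea, not Spark itself; the `run_with_retry` and `FlakyWorker` names are invented for illustration:

```python
def run_with_retry(task, data, max_attempts=3):
    """Rerun a deterministic task until it succeeds, roughly like a
    scheduler resubmitting a failed Spark task on another worker."""
    for _ in range(max_attempts):
        try:
            return task(data)
        except RuntimeError:
            continue  # pretend a worker died; recompute from lineage
    raise RuntimeError("task failed after all retries")


class FlakyWorker:
    """Simulated worker that dies on its first invocation, then recovers."""

    def __init__(self):
        self.calls = 0

    def __call__(self, data):
        self.calls += 1
        if self.calls == 1:
            raise RuntimeError("worker lost")
        # The "lineage" is just a pure function of the input, so
        # recomputing it on retry yields the same answer.
        return sum(x * x for x in data)


result = run_with_retry(FlakyWorker(), [1, 2, 3])
print(result)  # → 14 (succeeds on the second attempt)
```

The key property this relies on is determinism: because the transformation is a pure function of its input, rerunning it after a failure is guaranteed to produce the same result, which is what lets Spark offer reliability without checkpointing every intermediate value.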