In which category would we put frameworks like Hadoop?
This comment was marked helpful 0 times.
mchoquet
I think of Hadoop as closest to a data-parallel model: you specify some map and reduce operation that should be applied to all of the data in your dataset, and the framework handles the scheduling and communication.
This comment was marked helpful 0 times.
kayvonf
I agree.
This comment was marked helpful 0 times.
yihuaf
In addition, Hadoop is designed to run on a cluster of machines that does not share any disk or memory space. This design choice allows Hadoop to run on out of box commodity hardwares.
Conceptually, Hadoop's abstraction is closer to data-parallel model, but Hadoop is implemented a little differently from the data-parallel described here. If you wish to explore, you can read about how MapReduce is implemented in Google's paper.
In which category would we put frameworks like Hadoop?
This comment was marked helpful 0 times.
I think of Hadoop as closest to a data-parallel model: you specify some map and reduce operation that should be applied to all of the data in your dataset, and the framework handles the scheduling and communication.
This comment was marked helpful 0 times.
I agree.
This comment was marked helpful 0 times.
In addition, Hadoop is designed to run on a cluster of machines that does not share any disk or memory space. This design choice allows Hadoop to run on out of box commodity hardwares. Conceptually, Hadoop's abstraction is closer to data-parallel model, but Hadoop is implemented a little differently from the data-parallel described here. If you wish to explore, you can read about how MapReduce is implemented in Google's paper.
This comment was marked helpful 0 times.