I had a question about the abstraction versus the implementation of RDDs. Professor Kayvon explained it to me, and I thought his reply was helpful, so here it is below.
"The main idea of an RDD is that it is an abstract collection (a sequence). That abstraction can be implemented by being backed by memory (like an array), or the elements can be computed on demand as needed. So I would say that the RDD abstraction allows us to execute a full program -- consisting of a sequence of operations on RDDs -- without ever having to materialize all of the RDDs (or even all parts of a single RDD in memory at once)."
- Kayvon Fatahalian
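To make the two implementation choices in the quote concrete, here is a minimal sketch in plain Python. It is not Spark's API: the class names `SketchRDD`, `ArrayBackedRDD`, and `ComputedRDD` are hypothetical, made up for illustration. One implementation is backed by memory, the other computes its elements on demand from its parent, and both sit behind the same abstract-sequence interface.

```python
from typing import Any, Callable, Iterable, Iterator


class SketchRDD:
    """Hypothetical abstract sequence; subclasses decide how elements are produced."""

    def elements(self) -> Iterator[Any]:
        raise NotImplementedError

    def map(self, fn: Callable[[Any], Any]) -> "SketchRDD":
        # Builds a new lazy sequence; nothing is computed yet.
        return ComputedRDD(self, lambda it: (fn(x) for x in it))

    def filter(self, pred: Callable[[Any], bool]) -> "SketchRDD":
        # Also lazy: only records how to derive elements from the parent.
        return ComputedRDD(self, lambda it: (x for x in it if pred(x)))

    def count(self) -> int:
        # Streams through the elements one at a time without storing them.
        return sum(1 for _ in self.elements())


class ArrayBackedRDD(SketchRDD):
    """Implementation 1: elements are materialized in memory (like an array)."""

    def __init__(self, data: Iterable[Any]) -> None:
        self._data = list(data)

    def elements(self) -> Iterator[Any]:
        return iter(self._data)


class ComputedRDD(SketchRDD):
    """Implementation 2: elements are computed on demand from a parent sequence."""

    def __init__(
        self,
        parent: SketchRDD,
        transform: Callable[[Iterator[Any]], Iterator[Any]],
    ) -> None:
        self._parent = parent
        self._transform = transform

    def elements(self) -> Iterator[Any]:
        return self._transform(self._parent.elements())


# A small "program": a sequence of operations on these sketch RDDs.
source = ArrayBackedRDD(range(10))
evens_squared = source.map(lambda x: x * x).filter(lambda x: x % 2 == 0)
result = evens_squared.count()
```

In this sketch only `source` is ever stored. The intermediate sequences produced by `map` and `filter` are never materialized; `count` pulls one element at a time through the whole chain.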
This comment is super helpful. It clearly highlights the benefits of RDDs. So are there a variety of RDD implementations? Maybe even multiple implementations used side by side, chosen depending on the sequence of operations, for further optimization?