Slide 10 of 43

Which process (and at what stage) gathers all the values associated with a key and finds the total number of unique keys? Does the gathering happen after the mapper finishes its computation?


@mak Yes. After the mapper completes its job, the framework groups the results by user_agent and then calls the reducer to compute the count for each unique user_agent. This grouping happens internally and is generally not the user's concern: the user writes only the mapper and the reducer, and the framework takes care of the rest.
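
To make the grouping step concrete, here is a minimal single-machine sketch in plain Python. It is not the framework's actual implementation, and the tab-separated log format, the sample lines, and the names `mapper`, `reducer`, and `run_map_reduce` are all assumptions made for illustration. The point is only that the "gather values by key" stage sits between the two functions the user writes.

```python
from collections import defaultdict

def mapper(log_line):
    # User-written. Assumes a hypothetical tab-separated log line
    # whose last field is the user agent.
    user_agent = log_line.rsplit("\t", 1)[-1]
    return (user_agent, 1)

def reducer(key, values):
    # User-written. Receives one key and all values gathered for it.
    return (key, sum(values))

def run_map_reduce(lines):
    # Map phase: one (key, value) pair per input line.
    pairs = [mapper(line) for line in lines]

    # Grouping phase: in a real system the framework does this step,
    # collecting every value that shares a key.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)

    # Reduce phase: the reducer is called once per unique key.
    return dict(reducer(key, values) for key, values in groups.items())

# Made-up sample input for illustration.
lines = [
    "10.0.0.1\t/index.html\tFirefox",
    "10.0.0.2\t/about.html\tSafari",
    "10.0.0.3\t/index.html\tFirefox",
]
counts = run_map_reduce(lines)
print(counts)
```

The number of unique keys falls out of the grouping step as well: it is simply the number of groups, `len(counts)` in this sketch.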


@mak The programming abstraction used here is implemented as RDDs.


@yes I'm pretty sure it's not RDDs; it's a distributed system like HDFS, which was mentioned two slides before this. RDDs are a data structure that exists exclusively in Spark.