Slide 11 of 46
safari

Another way to think about the cost of communication: we can count the number of communications required by the task we want to parallelize. By this measure, our scheme in class was not very effective - the maximum number of communications happening at any one time was limited to the number of rows in the lecture room.

Of course, this argument has to take into consideration the cost of adding numbers vs. the cost of communication. A change in the relative size of these two costs would change the best algorithm greatly.
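To make this concrete, here's a rough cost model (my own sketch, with made-up constants - not from the lecture): a serial sum does n - 1 additions with no messages, while a tree reduction over n "processors" takes about log2(n) rounds, each costing one addition plus one communication.

```python
import math

def serial_cost(n, t_add, t_comm):
    # one processor adds all n values; no messages needed
    return (n - 1) * t_add

def tree_cost(n, t_add, t_comm):
    # each of the ~log2(n) rounds does one add after one message
    rounds = math.ceil(math.log2(n))
    return rounds * (t_add + t_comm)

n = 64
print(serial_cost(n, t_add=1, t_comm=0))   # 63
print(tree_cost(n, t_add=1, t_comm=1))     # 12: wins when messages are cheap
print(tree_cost(n, t_add=1, t_comm=50))    # 306: loses when messages are costly
```

With cheap communication the tree reduction is a big win; once a message costs much more than an addition, the parallel scheme can be slower than just adding serially - which is exactly the trade-off above.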

l8b

Our strategy in class wasn't very efficient in that many "processors" were idle at any one point in time. Although the rows computed their sums in parallel, there was no clear strategy within each individual row. Some rows may have tried to compute their sum in parallel, which introduced more communication cost, while other rows just did a serial sum down the row. Really, the rows computing their sums in parallel was the only significant parallel portion of the computation; a strategy that used parallelism more heavily within each row would likely have been both clearer and more efficient.
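A more heavily parallel strategy would be a pairwise tree reduction, where each round halves the number of values still in play. This is a minimal sketch of the idea (my own, not what we actually did in class) - the inner loop's pair-sums are what would happen simultaneously on real parallel hardware:

```python
def tree_reduce(values):
    """Pairwise tree reduction: about log2(n) rounds of combining,
    instead of the n - 1 sequential hand-offs of a serial sum."""
    vals = list(values)
    while len(vals) > 1:
        paired = []
        # each round, adjacent pairs combine (conceptually in parallel)
        for i in range(0, len(vals) - 1, 2):
            paired.append(vals[i] + vals[i + 1])
        if len(vals) % 2:          # odd element carries over to next round
            paired.append(vals[-1])
        vals = paired
    return vals[0]

print(tree_reduce([3, 1, 4, 1, 5, 9, 2, 6]))  # 31
```

With n people this takes about log2(n) communication rounds, and in every round roughly half of the remaining "processors" are doing useful work, rather than one person per row.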

srb

Another thing to note about our in-class demo was the added constraint that each processor (person) held one piece of necessary information (how many classes they were taking), and they were the only one with access to it. This constraint doesn't usually exist - I can't think of a situation in which supposedly identical processors have exclusive access to certain variables. The constraint here forced every processor to be used, demonstrating that sometimes the communication cost far outweighs the computation cost - but I think it's worth noting that if we really wanted to compute the total number of classes quickly, and everyone had access to all the information, we could have done it much faster.

pht

@srb - That's a good point that each person had access to a specific variable that no one else had access to. However, if everyone had access to all the information, how could the process be optimized? I still think there's going to be communication and synchronization latency.

themj

If everyone has access to all of the information, processors spend less time waiting around for data they lack. For example, if Processor B needed information that Processor A had, Processor B would have to wait idly until it received that information. With shared memory among all of the processors, Processor B could proceed with its computations immediately, without needing to wait for Processor A to finish.
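A toy illustration of the difference (my own sketch using Python threads, not anything from the lecture): with message passing, B blocks on a queue until A sends its value; with shared memory, both values are already visible, so B just reads and adds.

```python
import threading
import queue

def message_passing():
    """B must block on q.get() until A's message arrives."""
    q = queue.Queue()
    result = []
    def a():
        q.put(5)                    # A's private value, sent as a message
    def b():
        result.append(q.get() + 7)  # B blocks here waiting for A
    tb = threading.Thread(target=b); tb.start()
    ta = threading.Thread(target=a); ta.start()
    ta.join(); tb.join()
    return result[0]

def shared_memory():
    """All values are visible to every 'processor'; no waiting on messages."""
    shared = {"a": 5, "b": 7}
    return shared["a"] + shared["b"]

print(message_passing())  # 12
print(shared_memory())    # 12
```

Both compute the same answer, but in the first version B's thread is idle until A runs; in the second there is nothing to wait for - which is the point about idle time above. (Real shared-memory programs still need synchronization when values are being written concurrently, so the latency @pht mentions doesn't vanish entirely.)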