meatie

Example: machine learning practitioners design models to minimize communication between the model's different components, which speeds up training. See more: http://www.cs.toronto.edu/~fritz/absps/imagenet.pdf (Figure 2 on page 5 describes the model)
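To make that idea concrete, here is a minimal sketch (my own illustration, not the actual AlexNet scheme from the paper): each thread computes on a private chunk of the data, and the threads communicate only once, at the end, instead of exchanging intermediate values at every step.

```cpp
#include <numeric>
#include <thread>
#include <vector>

double parallel_sum(const std::vector<double>& data, int num_threads) {
    std::vector<double> partial(num_threads, 0.0);  // one private slot per thread
    std::vector<std::thread> workers;
    size_t chunk = data.size() / num_threads;
    for (int t = 0; t < num_threads; t++) {
        size_t begin = t * chunk;
        size_t end = (t == num_threads - 1) ? data.size() : begin + chunk;
        workers.emplace_back([&, t, begin, end] {
            // All work here touches only this thread's chunk and slot:
            // no inter-thread communication during the computation.
            partial[t] = std::accumulate(data.begin() + begin,
                                         data.begin() + end, 0.0);
        });
    }
    for (auto& w : workers) w.join();  // the only synchronization point
    // Communicate once at the end to combine the partial results.
    return std::accumulate(partial.begin(), partial.end(), 0.0);
}
```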

srw

I don't quite understand "distance" as a concept that applies to multi-core machines. In distributed systems, distance could refer to actual physical distance, latency, bandwidth, connection speed, etc. In that setting, different machines can genuinely be at different distances (i.e., have different communication times). How do these factors differ when using a single machine?

For instance, in class our different "processors" (people) could only talk to people close to them. For machines, do processors have substantially different communication times depending on which of the other processors they communicate with? Or are all processors approximately equivalent? If so, how would we "minimize the cost of communication"?

arjunh

Anyone want to take a stab at answering @srw's question? There are a few ways of looking at this; one has to do with the hardware and the nature of the interconnects between the processors (think about the scale of the machine that you're working with; we'll be working with machines with anything from several cores to several hundred/thousand cores).

Another has to do with software and how we as programmers can solve a problem differently to reduce the amount of inter-task communication. @meatie gave a good example of this, but if anyone could provide another such example, that would be great too.

Again, this is something we'll take a look at in more detail later on in the class.

Simba

In my opinion, "communication" could refer to any cost that decreases the speedup of a parallel computation. On multi-core machines, different parts of a task may run on different cores, and under most circumstances those cores still need to exchange information, e.g., core A passes intermediate calculation results to core B. Here the cost could be contention for the bus or for access to the same memory unit.
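A small sketch of that "core A passes results to core B" pattern (a generic illustration, not course-provided code): on a multi-core machine the "message" is just a write to shared memory plus the synchronization needed for B to see it, and that synchronization is where the communication cost shows up.

```cpp
#include <condition_variable>
#include <cstdio>
#include <mutex>
#include <thread>

int intermediate = 0;
bool ready = false;
std::mutex m;
std::condition_variable cv;

void core_a() {                          // producer: computes a partial result
    int result = 6 * 7;                  // stand-in for real work
    {
        std::lock_guard<std::mutex> lk(m);
        intermediate = result;           // the "message" travels through the
        ready = true;                    // cache/memory hierarchy
    }
    cv.notify_one();
}

void core_b() {                          // consumer: waits for A's result
    std::unique_lock<std::mutex> lk(m);
    cv.wait(lk, [] { return ready; });   // this wait is the communication cost
    std::printf("B received %d\n", intermediate);
}

int main() {
    std::thread a(core_a), b(core_b);
    a.join();
    b.join();
}
```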

srw

@Simba That makes sense. So then how would one go about minimizing this cost? Perhaps careful scheduling of tasks? Still, it seems like the communication cost is determined mostly at the hardware level, which isn't something we can improve within the scope of this class.

mangocourage

@srw, following Simba's definition of "communication cost": trying to maintain consistency among the individual caches of each CPU in a multicore machine can cause certain programs to experience cache misses beyond the types we learned about in 213. These misses result from maintaining "cache coherence". We can rewrite certain programs to avoid these types of cache misses.
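A hedged example of one such rewrite (assuming a 64-byte cache line, which is typical but machine-dependent): two threads increment different counters, but if the counters sit on the same cache line, that line ping-pongs between the cores' caches ("false sharing"). Aligning each counter to its own line removes the coherence traffic without changing the program's logic.

```cpp
#include <thread>

// Each counter gets its own 64-byte cache line (an assumption about the
// line size; typical on x86, but machine-dependent).
struct alignas(64) PaddedCounter {
    long value = 0;
};

PaddedCounter counters[2];  // without alignas(64), both could share one line

void work(int id) {
    for (int i = 0; i < 10000000; i++)
        counters[id].value++;  // each thread touches only its own counter, yet
}                              // an unpadded version still causes coherence misses

int main() {
    std::thread t0(work, 0), t1(work, 1);
    t0.join();
    t1.join();
}
```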

hohoho

Though not strictly a computer architecture point, my understanding is that even though different cores seem "close" to each other, differences in distance do exist, and communication does take time. Communication may not be the bottleneck it can be in a distributed system, but we still care about "travel time", because people are less tolerant of latency on a local machine than in a distributed system. And since we are talking about performance improvement, if the communication latency outweighs the computational speedup that multiple cores provide, users probably would not think that parallel computing is a good idea in practice.
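A toy measurement along these lines (illustrative only; exact numbers depend entirely on the machine): when the work is tiny, merely creating and joining one thread can cost more than doing the work serially.

```cpp
#include <chrono>
#include <cstdio>
#include <thread>

long long sink = 0;  // printed at the end so the loops aren't optimized away

void tiny_work() {
    for (int i = 0; i < 1000; i++) sink += i;  // trivial amount of computation
}

int main() {
    using clock = std::chrono::steady_clock;

    auto t0 = clock::now();
    tiny_work();                  // serial: no coordination at all
    auto t1 = clock::now();

    auto t2 = clock::now();
    std::thread t(tiny_work);     // "parallel": one spawn plus one join,
    t.join();                     // which is pure coordination overhead
    auto t3 = clock::now();

    auto ns = [](auto d) {
        return (long long)std::chrono::duration_cast<std::chrono::nanoseconds>(d).count();
    };
    std::printf("serial: %lld ns, threaded: %lld ns (sink=%lld)\n",
                ns(t1 - t0), ns(t3 - t2), sink);
}
```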