Slide 27 of 56
benchoi

It seems to me that the blocked assignment is preferable because of locality, and an interleaved assignment would be more appropriate for situations where the blocked assignment would lead to a poor distribution of work.
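To make the two schemes concrete, here is a minimal sketch of how grid rows might be divided among processors under each one. The function names and the row-granularity split are assumptions for illustration, not something taken from the lecture code.

```python
def blocked_assignment(n_rows, n_procs):
    """Give each processor one contiguous chunk of rows (hypothetical helper)."""
    chunk = (n_rows + n_procs - 1) // n_procs  # ceiling division
    return [list(range(p * chunk, min((p + 1) * chunk, n_rows)))
            for p in range(n_procs)]


def interleaved_assignment(n_rows, n_procs):
    """Deal rows out round-robin: row r goes to processor r % n_procs."""
    return [list(range(p, n_rows, n_procs)) for p in range(n_procs)]


if __name__ == "__main__":
    print(blocked_assignment(8, 4))
    print(interleaved_assignment(8, 4))
```

With 8 rows and 4 processors, the blocked scheme keeps each processor's rows adjacent, while the interleaved scheme spreads each processor's rows across the whole grid.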

kayvonf

@benchoi: and for the solver application example used in this lecture, are you particularly concerned about poor distribution of work in the case where the grid is equally partitioned?

aaguilet

In this particular application, the blocked assignment is better, as we have a balanced work distribution (the same amount of work per pixel). Moreover, there's less data that needs to be communicated between processors (at most two rows need to be sent), as we can see in Slide 29.

kkz

@benchoi The work performed per pixel is rather straightforward and static: take the average of itself and 4 neighboring pixels. Because of this, the blocked assignment would not result in a poor distribution of work like it would with some other problems (cough).
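As a rough sketch of why the per-pixel cost is uniform, the update described above could look like the following. The name `update_cell` and the list-of-lists grid are assumptions for illustration, and the sketch only handles interior cells.

```python
def update_cell(grid, i, j):
    """Return the average of cell (i, j) and its four neighbors.

    Assumes (i, j) is an interior cell, so all four neighbors exist.
    Every call does the same fixed amount of work: five reads, one divide.
    """
    total = (grid[i][j]
             + grid[i - 1][j] + grid[i + 1][j]
             + grid[i][j - 1] + grid[i][j + 1])
    return total / 5.0


if __name__ == "__main__":
    grid = [[0.0, 1.0, 0.0],
            [1.0, 1.0, 1.0],
            [0.0, 1.0, 0.0]]
    print(update_cell(grid, 1, 1))
```

Because the cost does not depend on the data, equal-sized partitions get equal amounts of work.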

rokhinip

I guess I'd like to clarify what kind of parallelism we're talking about here - multithreading on different processors? If so, then yes, blocked assignment will be better since we will not have as many cache coherence issues and we will take advantage of spatial locality. However, if we are using something like ISPC where each program instance writes to a specific cell, then the data being read is stored in a shared cache in that core (assuming it fits) and we will probably want interleaved assignment.

Can someone verify/comment on this? I'm not 100% sure if I'm thinking about this in the right way.

shabnam

@rokhinip If by cache coherence you just mean that there will be fewer cache misses, then I think you are thinking about this correctly. I don't think there is any coherence involved here. Everything else is fine.

nrchu

Aside from cache issues, blocked assignment is better in this case because it requires less communication. Note that you have to communicate every element that lies on a "border", and in the interleaved case every element lies on two borders of a partition.
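A small sketch can illustrate the border argument by counting rows that have a vertical neighbor owned by a different processor. It assumes work is assigned at row granularity and that the row count divides evenly among processors; `boundary_rows` and the two owner functions are hypothetical names for this illustration.

```python
def boundary_rows(n_rows, owner):
    """Count rows with at least one adjacent row owned by another processor."""
    count = 0
    for r in range(n_rows):
        neighbors = [x for x in (r - 1, r + 1) if 0 <= x < n_rows]
        if any(owner(x) != owner(r) for x in neighbors):
            count += 1
    return count


if __name__ == "__main__":
    n_rows, n_procs = 12, 4
    rows_per_proc = n_rows // n_procs  # assumes an even split

    def blocked(r):
        return r // rows_per_proc  # contiguous chunks of rows

    def interleaved(r):
        return r % n_procs  # round-robin rows

    print("blocked:", boundary_rows(n_rows, blocked))
    print("interleaved:", boundary_rows(n_rows, interleaved))
```

Under these assumptions, only the rows at the edges of each chunk are border rows in the blocked case, whereas every row is a border row in the interleaved case.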