Slide View : Parallel Computer Architecture and Programming : 15-418/618 Spring 2015

Programming for Performance: Work Assignment

Slide 54 of 60

kayvonf

Question: Notice at this point thread blocks 1, 3, 4, and 5 are running on the GPU concurrently. (We've started running block 5 before blocks 3 and 4 have completed.) Is this correct? Why?

BigFish

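It is correct. The threads in block 1 may suffer more from divergent execution, so that block takes longer to finish than the others. In addition, the block scheduler is free to start blocks in any order, so it can start block 5 as soon as resources free up, without waiting for blocks 3 and 4.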
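To make the timing argument concrete, here is a minimal toy model, not a description of the real hardware scheduler. It assumes a GPU with room for four resident blocks, hypothetical per-block run times (`durations`), and a scheduler that starts blocks in index order on whichever slot frees up first. Real hardware does not guarantee that order. The names `simulate` and `running_at` are made up for this sketch.

```python
import heapq

def simulate(durations, num_slots):
    """Start blocks in index order, each on whichever slot frees up first.

    Returns a list of (start, finish) times, one entry per block.
    """
    free_at = [0] * num_slots  # min-heap: time at which each slot becomes free
    heapq.heapify(free_at)
    spans = []
    for d in durations:
        start = heapq.heappop(free_at)       # earliest available slot
        spans.append((start, start + d))
        heapq.heappush(free_at, start + d)   # slot is busy until the block ends
    return spans

def running_at(spans, t):
    """Indices of the blocks that are executing at time t."""
    return [b for b, (start, finish) in enumerate(spans) if start <= t < finish]

if __name__ == "__main__":
    # Hypothetical run times: block 1 is slow (for example, heavy divergence).
    durations = [2, 10, 3, 8, 6, 4]
    spans = simulate(durations, num_slots=4)
    print(running_at(spans, 3))
```

With these assumed run times, blocks 0 and 2 finish early, so blocks 4 and 5 take over their slots while blocks 1 and 3 are still running. At time 3 the running set is blocks 1, 3, 4, and 5, the same situation as on the slide.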

yuel1

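This is correct because, in the CUDA abstraction, thread blocks must be independent of each other. Since no block may depend on the result of another block, the scheduler does not need to wait for blocks 3 and 4 to finish before starting block 5.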
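A small sketch of why independence makes any ordering safe. This is a sequential Python emulation, not CUDA code: `run_block` and `launch` are hypothetical helpers, and the doubling "kernel" is just an example. Each emulated thread writes only its own output element, so no block reads anything another block produced.

```python
import random

def run_block(block_idx, block_dim, data, out):
    """Emulate one block: each 'thread' writes only its own output element."""
    for thread_idx in range(block_dim):
        i = block_idx * block_dim + thread_idx
        if i < len(data):            # guard the partially filled last block
            out[i] = data[i] * 2

def launch(data, block_dim, order):
    """Run the blocks one after another in the given order."""
    out = [None] * len(data)
    for block_idx in order:
        run_block(block_idx, block_dim, data, out)
    return out

if __name__ == "__main__":
    data = list(range(10))
    block_dim = 4
    num_blocks = (len(data) + block_dim - 1) // block_dim
    in_order = list(range(num_blocks))
    shuffled = in_order[:]
    random.Random(0).shuffle(shuffled)
    # Same result no matter which block runs first.
    assert launch(data, block_dim, in_order) == launch(data, block_dim, shuffled)
```

If a block instead read a value that another block was supposed to write first, the output would depend on the order, which is exactly what the independence requirement rules out.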