Slide View : Parallel Computer Architecture and Programming : 15-418/618 Fall 2017

Slide 46 of 78

star013

Explaining the first red rectangle on this slide: my understanding is that when the GPU allocates memory in per-block shared memory, it interprets the assignment statements as a whole rather than running a separate load instruction in each thread. This makes sense to me because the shared memory allocation itself does not need to be carried out by any individual thread, and handling the assignments together reduces the total number of instructions and can exploit spatial locality to speed up the loads.
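
To make this concrete, here is a minimal sketch of the loading pattern I am describing. It is a plain C++ emulation of a single thread block, not real CUDA, and it assumes the rectangle highlights a cooperative copy from global memory into a per-block `__shared__` array. The names `cooperative_load`, `kThreadsPerBlock`, and `kHalo`, and the sizes chosen, are hypothetical and not taken from the slide.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical sizes for illustration only; not taken from the slide.
constexpr std::size_t kThreadsPerBlock = 8;
constexpr std::size_t kHalo = 2;

// Emulates one thread block filling its per-block buffer.
// Each loop iteration stands in for one thread (t plays the role of threadIdx.x).
// `loads` is set to the number of reads from `input` issued by the whole block.
// Precondition: input.size() >= block_start + kThreadsPerBlock + kHalo.
std::vector<float> cooperative_load(const std::vector<float>& input,
                                    std::size_t block_start,
                                    std::size_t& loads) {
    // One buffer for the whole block, allocated once, not once per thread.
    std::vector<float> shared(kThreadsPerBlock + kHalo);
    loads = 0;
    for (std::size_t t = 0; t < kThreadsPerBlock; ++t) {
        // Every "thread" copies the single element at its own index.
        shared[t] = input[block_start + t];
        ++loads;
        // The first kHalo "threads" also copy the elements just past the block.
        if (t < kHalo) {
            shared[kThreadsPerBlock + t] =
                input[block_start + kThreadsPerBlock + t];
            ++loads;
        }
    }
    return shared;
}
```

With these sizes the block issues 10 reads in total, one per element of the buffer, and the buffer exists once per block. In source terms each thread still executes its own assignment. As far as I understand, on real hardware the threads of a warp execute that load as one instruction, and loads to adjacent addresses can be coalesced into fewer memory transactions, which is the effect I mean by treating the assignments as a whole.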