In the SPMD model, the program logic is the same for every thread, so each thread appears to execute the same instructions. I was previously confused about this: if the instructions are the same for every thread, why does one warp context contain 32 separate parts, one per thread (see the figure in the previous slide)? One possible reason, I guess, is that different threads may take different control-flow paths. For example, if there is a "for" loop with an "if" statement inside it, the number of iterations (that is, how many times the instructions inside the loop run) may depend on the data. The loop in some threads may finish early while the loop in other threads is still running. The threads that finished early then sit idle, presumably because the hardware masks off those instructions for them. Therefore, it is necessary to keep separate state for each thread in a warp. I do not know whether this explanation is correct. Feel free to verify my thoughts!
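
To make the masking idea concrete, here is a small Python simulation of one warp running a data-dependent loop in lockstep. It is only a toy model under my own assumptions, not a description of how real GPU hardware is built: the names `WARP_SIZE`, `run_warp`, and `trip_counts` are made up for illustration, and the "loop body" is just `acc += i`. It shows that each lane needs its own loop counter and accumulator even though all lanes share one instruction stream, and that lanes whose loop ended early are masked off while the rest keep going.

```python
WARP_SIZE = 32  # lanes per warp (assumed, matching the 32 parts in the slide)


def run_warp(trip_counts):
    """Simulate one warp executing `for i in range(n): acc += i` in lockstep.

    trip_counts[lane] is the data-dependent loop bound n for that lane.
    Returns (acc, steps, idle_slots).
    """
    assert len(trip_counts) == WARP_SIZE

    # Per-lane private state: in this sketch, this is the "32 parts" of
    # the warp context. The instruction stream itself is shared.
    i = [0] * WARP_SIZE
    acc = [0] * WARP_SIZE

    steps = 0       # times the loop body was issued for the whole warp
    idle_slots = 0  # lane slots that were masked off during those issues

    while True:
        # A lane is active only while its own loop condition still holds.
        mask = [i[lane] < trip_counts[lane] for lane in range(WARP_SIZE)]
        if not any(mask):
            break  # every lane has left the loop
        steps += 1
        for lane in range(WARP_SIZE):
            if mask[lane]:
                acc[lane] += i[lane]
                i[lane] += 1
            else:
                idle_slots += 1  # lane waits while others still iterate
    return acc, steps, idle_slots


# Hypothetical input: loop bounds 1, 2, 3, 4 repeating across the lanes.
trip_counts = [lane % 4 + 1 for lane in range(WARP_SIZE)]
acc, steps, idle_slots = run_warp(trip_counts)
print(steps, idle_slots)  # 4 48
```

With these inputs the warp issues the loop body 4 times (the longest lane's count), and 48 of the 128 lane slots are spent idle. That wasted fraction is the cost of divergence in this toy model.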