Slide View : 15-418/618 Spring 2014

Previous | Next --- Slide 17 of 30

yanzhan2

I did not understand why O(P^2) traffic. When test-and-set lock for each processor generates invalidation message, suppose all P processors gets to the test-and-set part, then at most P invalidations. But when the first invalidation occurs(from the first test-and-set), other processor cache would go to I state, for the next test-and-set operation, only one processor cache (in M state) would be invalidated.

This comment was marked helpful 0 times.

yetianx

@yanzhan2: I think the assumption here is the critical section is long enough, so that all other processors are doing "test". Then all processors waiting for the lock is in the S state. So in each round, O(P) traffic occur and for P rounds there are at most O(P^2) traffic.

This comment was marked helpful 1 times.

tcz

Why do we look at P rounds of lock releases? Specifically, on slide 19, one comment by @sbly implies that the $O(P)$ traffic cited is from one round of lock releases.

Of course, assuming we look at P rounds of lock releases makes the two slides consistent, but that makes me wonder why, on slide 19, traffic is not O(P) per lock release (see my comment there).

Also, why look at P rounds in the first place? Because each processor wants the lock and thus must at some point release it?

This comment was marked helpful 0 times.

LilWaynesFather

I think the slide makes the assumption that the lock is contested for by P processors, which is what is shown on slides 12 and 16. It says that if there are O(P) invalidations (which means P rounds, each round has 2 invalidations) then there would be O(P^2) traffic total. Assuming constant/O(1) invalidations, then yes from that one round there would only be O(P) traffic.

This comment was marked helpful 0 times.

yanzhan2

@tcz, @LilWaynesFather,the comparison is for one round, so for TTS lock, worst case scenario one release would cause O(p^2) traffic (refer to later slides for explanation), but ticket lock would cause O(p) traffic.

This comment was marked helpful 0 times.

mofarrel

@tcz @yanzhan2 @yetianx @LilWaynesFather

During one lock release the number of invalidations is $O(P)$, and the amount of traffic is $O(P^2)$.

Here the lock is held by a processor p1, the other $P-1$ processors are waiting for the lock. When p1 releases the lock, it will invalidate all the other $P-1$ processor's cache entries (1st invalidation). All the $P-1$ processors, that were waiting for the lock, pass the outer test of the test-and-test-and-set. They are now all trying to get the lock using test-and-set. Each processor will need exclusive access to the cache line in order to be able to run the test-and-set. This means $P-1$ more invalidations must occur, allowing each processor to have exclusive access to the line, while it runs test-and-set. In total we had about $P$ invalidations, so $O(P)$ invalidations. Each invalidation causes $O(P)$ bus traffic (communicating with each of the other processors), giving a total of $O(P^2)$ bus traffic.

This comment was marked helpful 0 times.

yanzhan2

@mofarrel, that depends on how you define traffic, if you define snoop based protocol, and traffic is snoop traffic, then it would be fine. Other wise, if you define traffic as messages, then it is not that easy to construct O(p^2).

This comment was marked helpful 0 times.

Yuyang

I think this slide is slightly confusing with its wording. By $O(P)$ invalidations, it meant, $O(P)$ BusRdx were issued. And by $O(P^2)$ traffic, it meant there are worst case $O(P^2)$ occurrences where a processor invalidates its own cache.

This comment was marked helpful 1 times.