This large amount of invalidation is the motivation behind the idea of "test and test and set". If the locked out processors merely attempted to BusRd, not BusRdX, then there would be significantly less interconnect traffic from coherence issues.
This comment was marked helpful 0 times.
pradeep
In test and set all the threads first try to read a shared variable and then write to it. Hence, while loading that variable we broadcast a BusRdX.
Here in this case proc 1 invalidates every one's cache, reads the test variable and since it passes it holds the lock. During this phase proc 2 and proc 3 also try to get this lock and try in a loop and hence for each try we have p-1 invalidations happening, this also takes the cache line away from proc 1 and hence when it tries to do the set to 0, it has to load the variable into memory again and p-1 invalidations happen.
To reduce invalidation traffic, test and test and set is used.
This comment was marked helpful 0 times.
RICEric22
Is there a reason why 'test-and-set' is implemented as a straightaway BusRdX? I was confused for a long time until I realized that test and set is not a 'test' (BusRd) followed by a set (BusRdx).
This comment was marked helpful 0 times.
ycp
@RICEric22, I'm not quite sure, but I would say because you want test and set to be atomic and having two different bus requests could make that more difficult. Also, when you are reading for the test, you are doing so with the intent to write, so that could be another reason why you have a BusRdx.
This comment was marked helpful 0 times.
mchoquet
@RICEric22: it's because of what @ycp said; test-and-set consists of a read followed by an optional write, with the guarantee that nobody else can have that address in the modified state between the read and write. The only way to ensure that this occurs is to get exclusive access to the line up-front, which means a BusRdX.
This large amount of invalidation is the motivation behind the idea of "test and test and set". If the locked out processors merely attempted to BusRd, not BusRdX, then there would be significantly less interconnect traffic from coherence issues.
This comment was marked helpful 0 times.
In test and set all the threads first try to read a shared variable and then write to it. Hence, while loading that variable we broadcast a BusRdX.
Here in this case proc 1 invalidates every one's cache, reads the test variable and since it passes it holds the lock. During this phase proc 2 and proc 3 also try to get this lock and try in a loop and hence for each try we have p-1 invalidations happening, this also takes the cache line away from proc 1 and hence when it tries to do the set to 0, it has to load the variable into memory again and p-1 invalidations happen.
To reduce invalidation traffic, test and test and set is used.
This comment was marked helpful 0 times.
Is there a reason why 'test-and-set' is implemented as a straightaway BusRdX? I was confused for a long time until I realized that test and set is not a 'test' (BusRd) followed by a set (BusRdx).
This comment was marked helpful 0 times.
@RICEric22, I'm not quite sure, but I would say because you want test and set to be atomic and having two different bus requests could make that more difficult. Also, when you are reading for the test, you are doing so with the intent to write, so that could be another reason why you have a BusRdx.
This comment was marked helpful 0 times.
@RICEric22: it's because of what @ycp said; test-and-set consists of a read followed by an optional write, with the guarantee that nobody else can have that address in the modified state between the read and write. The only way to ensure that this occurs is to get exclusive access to the line up-front, which means a BusRdX.
This comment was marked helpful 0 times.