Slide View : Parallel Computer Architecture and Programming : 15-418/618 Spring 2015

kayvonf

Question: give a conflict management policy that avoids livelock in Case 4.

gryffolyon

Can we bring in some kind of an ordering here: like if 2 threads are doing a write, thread with a lower(or higher) thread id is allowed to go first and allowed to commit. Only after that we allow thread 1 to do its write and commit. In between, if thread1 tries to abort thread0, given our priority scheming, we allow thread 0 to continue and abort thread1 till thread0 completes.

Of course there can be many other ways of doing it too.

iZac

That would lead to starvation for T1. OS might schedule T0 again and again and starve T1 to run.

ericwang

My idea is that a transaction won't send write check until the very last step before commit. After that, it commits.

So if there is any transaction commits before this one, it will receive check and restart given any conflict exists. And the transaction who sent the checks will commit anyway.

mangocourage

How is it guaranteed in software that the two threads can't be checking each other at the same time? For example if t0 is checking t1, what's the logic to guarantee that t1 is not simultaneously checking t0?

ChandlerBing

What would happen if we make the other thread stall (instead of abort) until the current thread commits its write in case 4?

cgjdemo

I think if a write-write conflict is detected, the thread which receives the check should stall until the thread which issues the check commits. Now the two transactions are executed in serial order so there is no livelock.

BigFish

@cgjdemo I agree. Actually I cannot see any point why in case 4, T1's wrA should cause T0 to restart. It is more reasonable for T1 to be stalled until T0 commit.

sam

Yes stalling T0 on a write check can cause starvation to T0. I think better way would be to stall the thread which is checking and not which is being checked.

subliminal

I think stalling based on some global numbering plus randomized exponential backoff would fix this. This would solve cases with more than two conflicting write-write conflicts as well.

russt17

Can someone explain why the stalling is done in case 2? I don't see why it is necessary.

landuo

To avoid live lock in case4, I think we can apply timestamp of the transactions. When conflict occurs, transaction with latest timestamp gets aborted and scheduled again with the same timestamp. Thus, the system will schedule conflicting transactions based on a first come, first serve strategy.

jezimmer

What about this solution to livelocking:

instead of immediately aborting, we wait to see if the transaction commits, and the time we wait before restarting increases by a factor of two each time?

amaliujia

@landuo. Timestamp is a good idea. I think the idea essentially is to maintain a steady order, like a FIFO queue.

gryffolyon

@russt17, the reason why we stall T1 is because it has read a value which is not state. Remember T0 has done a write earlier than T1 but T0 has not committed it yet. Hence by the semantics of transactional memory, T1 has read the value which does not reflect the change done by T0 and goes about its calculations. The idea of transactions is that we should be able to represent the transactions of T0 and T1 sequentially in a timeline. If we did not stall the read of T1, T1 breaks the order of the timeline and it cannot fit alongside T0. Hence we stall T1 until T0 commits after which T1 reads the value and goes about its commit. Now by doing this, we are ensuring T1 occurs after T0 on the timeline

abist

I think stalling the thread which checks until the other thread would be a good way, because that way the transaction is serialized, which is one of the requirements for transactional memory.

kamladi

What determines whether a thread is stalled vs. restarted? Is this decided by the convention manager's policy? When is one chosen over the other?

Berry

@kayvonf We could change the policy from "aggressive" to "polite" and use exponential backoff to use the number of aborts as a basis for choosing how long to stall for. This way we can keep stalling until T1 commits. After some number of stalls if T1 still isn't able to commit (maybe it's aborting due to a third thread) we can start stalling T1 instead.

parallelfifths

@kamladi, I have the same question. Everyone here has proposed some variation of stalling instead of restarting in Case 4, the Write-Write conflict, to avoid livelock. It seems to me that we could then also use a stall initially in Case 3, the Read-Write conflict. Couldn't we simply have T0 wait to complete its read until T1 has committed its write the first time, instead of doing the initial abort? I think this achieves an identical result in the end to the current Case 3 implementation.

Is there ever a case when we need a restart (and abort) instead of a stall?

ankit1990

@landuo I agree with the idea of time stamping. In general, any sort of scheme which always prioritizes one transaction over the other (maybe a transaction id, or the id of the thread executing the transaction) should work fine.