Slide View : Parallel Computer Architecture and Programming : 15-418/618 Spring 2015

mingf

It is possible to perform insertion at two different places of the list if the two places are not influence each other. In the above implementation, an insertion at a deeper position can not be done if there are other ongoing insertion at a shallower position because the deeper one has to acquire a shallower lock in order to acquire a deeper one. So, rather than grabbing lock along the way, is it possible that we do not grab any locks until reaching the desired position.

cgjdemo

I think there may be some problem if not grabbing locks along the way. For example, if Thread 2 wants to delete a node at a deeper position when Thread 1 is deleting a node at shallower position. To reach the target node, Thread 2 must go pass all the nodes which are shallower than the target node. This means that Thread 2 may access the node which Thread 1 is deleting. This may cause a segmentation fault. The fine-grained locking ensures that no thread can access the nodes which are currently in operation(insertion or deletion).

kayvonf

Question: On the slide it mentions there's a way to optimize insert() even further. I want to see someone do it.

funkysenior15

With the current insert(), if we want to insert an element, we need to lock two elements, one greater and one lesser than it (for simplicity, all elements are distinct).

Instead, will it work if we lock every alternate element as we traverse the linked list? For example, each thread will try to lock the 1st, 3rd, 5th... elements. Once it obtains the lock for an element, it checks its "previous" and "next" elements to see if the element to be inserted fits in between "cur" and "prev", or "cur" and "next". To me, it seems like this should work, and it'll use half the amount of locks as before. (Also, we'll be holding only one lock at a time)

ak47

@funkysenior15 I think this is the n-sized block method for n=2.

gryffolyon

@kayvonf: How about if we try somethings like this:

 void insert(List* list, int value) {

  Node* n = new Node;
  n->value = value;

  //assume case of insertions before head
  //is handled here

  Node *prev, *curr;
  lock(list->lock);
  curr = list->head;
  lock(curr->lock);
  unlock(list->lock);
  prev = NULL;

  while(curr) {
     prev = curr;
     curr = curr->next;
     if(curr) {
     lock(curr->lock);
        if(curr->value > value) {
           //we no longer need the lock on the current node
           unlock(curr->lock);
           break;
        }else 
           unlock(prev->lock);
  
     }
  }

   n->next = prev->next;
   prev->next = n;
   unlock(prev->lock); 
}

The code for deletion will remain the same. I can see that we hold 2 locks only till we do the check "if(curr->value > value)" otherwise the insert code will just have 1 lock and can potentially lead to lesser contention.

ankit1990

@gryffolyon I think that the answer is correct.

dsaksena

A good thing to notice is this, we take the list lock only once and rest of the time we don't need the list lock but use only the curr and prev locks. This not only fine grains the locks used and amount of the list locked but also saves us from TOCTOU.

TOCTOU = time of check to time of use

when we use:

if (curr) 
   lock(curr->lock);

we check curr and then lock it, between the check and lock there could be a context switch and we could lose the element. Maybe it is deleted. But here we are safe as we also hold the prev->lock and a deletion also asks for the lock and we therefore won't have a TOCTOU bug.

All in all its a nice piece of code.

gryffolyon

@dsaksena Glad that you realized there is no deadlock. And yup aware of TOCTOU and our paradise is not lost here as far as I can see. Nice observation. Thanks to 15-410 for telling us that :)

iZac

I think even delete could be improved like this:



void delete (List* list, int value) {
 Node* prev, *cur;
 lock (list->lock);
 prev = list->head;
 cur = list->head->next;

 lock(prev->lock);
 unlock(list->lock);
 if (cur) lock(cur->lock);


 while (cur) {
  if (cur->value == value) {
   prev->next? = cur->next,
   unlock(prev->lock);
   unlock(cur->lock);
   delete cur;
   return ;
  }
  unlock(prev->lock);
  prev = cur;
  cur = cur->next
  if (cur) lock(cur->lock);
 }
 unlock(prev->lock);
}

gryffolyon

@iZac, I don't see any reason why it shouldn't work. We could release the lock earlier and I don't seem to understand why we need another pointer old_prev on the stack in the code given in the slide as we have not released the lock on curr and we still have curr. By virtue of releasing the lock early, we bring in more concurrency though only by a few instructions :P. Unless someone else thinks otherwise.

cgjdemo

@iZac I think there may be some problem if we unlock(prev->lock) before prev = cur. Let's say thread 1 performs unlock(prev->lock), thread 2 immediately cuts in and acquire lock on prev. Now thread 1 cannot perform prev = cur since thread 2 is holding lock on prev. There is a deadlock now since both threads cannot proceed.

iZac

@cgjdemo, prev is a local pointer, you can do prev = cur in the scenario you stated.

gryffolyon

Yup and hence no deadlock as its not shared across threads.

skywalker

@kayvonf, @gryffolyon (Do we need to lock the next node before unlocking the current? I feel like this works...)


void insert(List* list, int value) {

  Node* n = new Node;
  n->value = value;

  //assume case of insertions before head
  //is handled here

  Node *curr;
  lock(list->lock);
  curr = list->head;
  lock(curr->lock);
  unlock(list->lock);

  Node *prev = NULL;

  while(curr) {
     Node* next = curr->next;
     //at the last node or found the place to insert
     if (next == NULL or next->value > value) break;
     unlock(curr->lock);
     curr = curr->next;
     lock(curr->lock) //curr won't ever be null
  }
  //we have a lock on the node to insert at
  n->next = curr->next;
  curr->next = n;
  unlock(curr->lock);

}

gryffolyon

@skywalker, the snag lies in the line where you do unlock(curr->lock); inside the while(curr) loop. The moment you unlock curr and I see you are just holding 1 lock, this thread can get context switched and another thread which is doing a delete on this curr node can be switched in. Now curr can be deleted and freed. Now you get switched back in and you try to do curr = curr->next; So you are trying to dereference a pointer which is deleted and freed. Hence the problem I feel.

kayvonf

@skywalker. great question. @gryffolyon. You are correct, great response.

grose

I can see why these solutions in the comments are better, but how much are they better by? It looks like they just release one lock on one element slightly earlier than it otherwise would be, allowing a thread that's been tailing you to get ahead a little earlier.

But these optimizations only take place during the part when you find the element you were looking for, which seems like a small fraction of the time.

toothless

In this section on gryffolyon's code:

lock(list->lock);

curr = list->head;

lock(curr->lock);

unlock(list->lock);

is it true that the only reason to lock/unlock list->lock is because of potential insertions before the head?

If so, can't we have a dummy node in the list that is always at the front, so we can remove the list->lock, and also avoid special casing on inserting at the front?

Kapteyn

@gryffolyon's solution definitely allows us to take one less lock at the very end when we've found curr-> val > my_val. But I do not believe it will lead to more concurrency.

Since all threads must walk the linked list from head to tail searching for the element they need, since we still have a lock on prev, no other thread can advance to the nodes in front of prev until we release prev. Therefore, I believe the solution provided where we avoid taking the lock on curr when curr->val > my_val will not allow for more concurrency. It will, however, prevent us from the overhead of taking another lock.

The only situation where I can see this solution allowing for more concurrency is when a thread can walk the list from tail to head, i.e. we have a doubly linked list.

landuo

Can we apply similar idea of optimistic detection of transaction memory here? That is, we do not acquire locks when traversing the list to find the node. When the target is found (similar to when a transaction is about to commit), the current running thread acquires locks of prev and curr as before and then checks if those pointers have been changed (similar to checking if there is conflict between two transactions). If none of the pointers were changed, do the operation and unlock the nodes. Otherwise the thread rolls back and restarts the operation.

zhiyuany

@mingf, although what you said is true, the problem is, we've no idea about which node to operate on because we insert/delete by value not by reference or index. If we operate by reference or index, we can know our working set and definitely be able to do the optimization you supposed.