pajamajama

I think the site's throughput won't change under heavy load: the site is still processing requests as fast as it can, at R requests per second. Latency, however, will increase. As more and more requests come flooding in, the queue length L grows, and with it the time spent waiting in the queue, L/R. This relates to a DoS (denial-of-service) attack, as mentioned in class: such an attack floods the site with requests, so the valid requests (the ones that actually want to use the site) have to sit in the queue behind the "fake" ones.
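
Plugging in some made-up numbers makes the latency effect concrete (a quick sketch; the values of R and L here are hypothetical, not from the slide):

```python
# Hypothetical numbers, purely for illustration.
R = 1000   # service capacity: requests processed per second
L = 5000   # queue length that has built up under heavy load

# Time spent waiting in the queue is L / R.
print(L / R, "seconds in queue")  # 5.0 seconds

# Throughput stays capped at R; if arrivals keep exceeding R,
# L grows, and so does the L / R wait time.
```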

Master

When the load exceeds the server's capacity, requests begin to queue up. Once the queue is full, the server starts dropping requests, and those unlucky users effectively receive a denial of service.
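
A tiny sketch of that drop behavior, assuming a bounded request queue (the capacity of 4 and the arrival count are arbitrary choices for illustration):

```python
import queue

requests = queue.Queue(maxsize=4)  # bounded request queue

dropped = 0
for req_id in range(10):           # 10 arrivals, capacity for only 4
    try:
        requests.put_nowait(req_id)
    except queue.Full:
        dropped += 1               # queue full: request is dropped

print(f"queued: {requests.qsize()}, dropped: {dropped}")
# queued: 4, dropped: 6 -- the dropped users see a denial of service
```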

axiao

It seems like the strategies we talked about in this lecture for using one queue per worker thread instead of a single global request queue might also apply here. Using multiple queues would reduce contention, especially with many workers and many requests.
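
A minimal sketch of the per-worker-queue idea (the names and the random-assignment policy here are my own illustrative choices, not from the lecture):

```python
import random
from collections import deque

NUM_WORKERS = 4
worker_queues = [deque() for _ in range(NUM_WORKERS)]

def dispatch(request):
    # Assign each incoming request to a random worker's queue,
    # so workers don't all contend on one global queue.
    worker_queues[random.randrange(NUM_WORKERS)].append(request)

def next_request(worker_id):
    q = worker_queues[worker_id]
    return q.popleft() if q else None  # no stealing yet; see the later sketch
```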

apadwekar

If we, as axiao suggested, use one queue per worker with stealing, we will no longer need a load balancer!

Abandon

@axiao I think your approach may not work. The workers' queues still need a global request queue to steal from. Even if the workers can steal faster than requests arrive at the global queue, once the load exceeds the workers' total processing capacity, the queues will always grow faster than the workers can drain them. All of those worker queues would still grow without bound and eventually lead to dropped requests, as @Master mentioned.

axiao

@Abandon I agree that there will still be dropped requests. I was thinking that work stealing would perform better in certain situations by reducing the contention of retrieving the next request from a global queue: if a worker's queue is empty, it can just pick a random queue to steal from. If we have many workers and many requests, and each request takes very little time, the contention from all the workers grabbing the next request off the global queue could be significant enough to visibly push the server's throughput below its theoretical capacity. Maybe in this scenario dividing the queue up among the workers, with each incoming job randomly assigned to a worker's queue (or some other assignment strategy), would be better.

It might be the case, though, that these "super short request" scenarios are unrealistic, since I believe the time a worker spends on a request usually includes performing I/O to send the response back to the client. That makes a global queue feasible: two workers will rarely try to grab the next request at the same time if each request needs a decent amount of processing time.
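
For concreteness, here is one way the random-victim stealing described above could look. This is a standalone, single-threaded sketch with made-up names; a real server would need per-queue locks or lock-free deques, and real work-stealing runtimes steal from the opposite end of the victim's deque:

```python
import random
from collections import deque

NUM_WORKERS = 4
worker_queues = [deque() for _ in range(NUM_WORKERS)]

def dispatch(request):
    # Incoming requests go to a random worker's queue.
    worker_queues[random.randrange(NUM_WORKERS)].append(request)

def next_request(worker_id):
    q = worker_queues[worker_id]
    if q:
        return q.popleft()   # common case: no cross-worker contention
    # Local queue is empty: pick a random victim and steal from it.
    victim = worker_queues[random.randrange(NUM_WORKERS)]
    if victim:
        return victim.pop()  # steal from the other end of the victim's deque
    return None              # nothing to do right now
```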