In general, if we seek to maximize performance, we would prefer to have longer queues. However, given that longer queues need more memory, we need to make some tradeoffs. If we have a fairly predictable, balanced workload, then a shorter queue would be fine, since we wouldn't need to queue up more than a few requests at a time. However, if we have a highly unpredictable workload, longer queues would be worth the extra memory because of the time saved by avoiding stalls.
This comment was marked helpful 0 times.
For those interested, the probability that a request finds the queue busy is given by the Erlang-B formula. The formula is a little messy, but it decreases exponentially with the length of the queue. This is assuming that request times follow a Poisson process, though, and I don't know how realistic that is for this situation. Maybe Kayvon has some data on that?
We had an interesting discussion about how work rates translate into different size of queues. If we want maximum performance and assuming we can make some assumptions that the rest of the parameters are standard and assuming we don't know if work steps are even or not, we will chose the biggest queue possible.
I think this was the conclusion?
I think we need smaller queues if workload is assumed to be around balanced whereas we need bigger queues if workloads aren't that balanced.
In this case, I think the tradeoff is between bus lines and queue length. For a longer queue, the bus needs to be wider to have more contain more tag bits.