BryceToTheCore

I believe that from the website's perspective performance is a matter of throughput, since the site wants to handle as many requests as possible, whereas from the user's perspective latency is of paramount importance. Even if a website has high throughput that amortizes the cost of all the users' requests, the individual users who get the short end of the stick and are forced to wait for long periods will not be impressed by the theoretical asymptotic performance of the web server.

Zarathustra

I would say that site performance is pretty much always a question of latency. Throughput might help some engineer appeal to his boss for a raise, but it's not going to feed my Facebook addiction any faster, and if I get fed up with waiting on the site, Facebook loses ad revenue. So I would say, in nearly every case, latency is of prime importance.

Elias

@Zarathustra: I'm going to have to disagree pretty strongly here. From the perspective of the user, latency is the most important thing. However, much of the processing behind a successful application occurs offline, without the user actually waiting on its completion. Take, for example, Google's search. Google doesn't index the web in response to a single query (or even a host of queries). Instead, there's an asynchronous model, where Google is always indexing (with some throughput), and searches make use of the indexed content.

In this case, latency isn't as important as throughput: the more pages Google indexes per unit time, the higher quality their search results will be (even though serving those search results is going to be more focused on latency than throughput). Which of {throughput, latency} is most important really depends on the application.
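To make the decoupling concrete, here's a minimal Python sketch (the names and the in-memory dict are my own illustration, not anything from Google's actual system): a background worker whose only metric is pages indexed per second, feeding a query path whose only metric is response time.

```python
import queue
import threading

index = {}                   # term -> list of doc ids (illustrative only)
index_lock = threading.Lock()
crawl_queue = queue.Queue()  # filled elsewhere with (doc_id, text) pairs

def index_forever():
    """Background worker: drain the crawl queue as fast as possible.
    The metric that matters here is pages indexed per second (throughput)."""
    while True:
        doc_id, text = crawl_queue.get()
        with index_lock:
            for term in text.split():
                index.setdefault(term, []).append(doc_id)

def search(term):
    """User-facing query path: reads whatever has been indexed so far.
    The metric that matters here is time to first result (latency)."""
    with index_lock:
        return list(index.get(term, []))

threading.Thread(target=index_forever, daemon=True).start()
```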

byeongcp

I agree with the points BryceToTheCore has made, but I also think site performance isn't necessarily a question of either throughput or latency, but of both. For example, Kayvon mentioned a scenario (when he was going over this slide) where, if throughput does not keep up with the rate of incoming requests, the request queue builds up and latency starts to grow.

BryceToTheCore

I agree with @byeongcp. I believe that bad throughput can certainly affect latency, and given a finite amount of time, latency can affect throughput as well.

Throughput --> Latency:

The request queue will be unstable, and requests will see longer and longer waiting times as they sit behind earlier requests still waiting to be serviced.
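Here's a tiny deterministic queueing sketch of that instability (the numbers are made up): once the per-request service time exceeds the arrival interval, each request waits longer than the one before it, without bound.

```python
def waiting_times(arrival_interval, service_time, n_requests):
    """Deterministic arrivals every `arrival_interval` seconds, one server
    taking `service_time` seconds per request; returns each request's wait."""
    server_free_at = 0.0
    waits = []
    for i in range(n_requests):
        arrival = i * arrival_interval
        start = max(arrival, server_free_at)   # queue behind earlier requests
        waits.append(round(start - arrival, 2))
        server_free_at = start + service_time
    return waits

# Stable: the server (0.9 s/request) keeps up with arrivals (1 every 1.0 s).
print(waiting_times(1.0, 0.9, 5))   # [0.0, 0.0, 0.0, 0.0, 0.0]
# Unstable: the server (1.1 s/request) falls behind, so waits grow forever.
print(waiting_times(1.0, 1.1, 5))   # [0.0, 0.1, 0.2, 0.3, 0.4]
```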

Latency --> Throughput:

With bad enough latency, even with a stellar processor, interactive programs may not generate enough requests to fully utilize it, because some clients send only one message at a time and wait for a response before issuing any more queries. One example would be the most abstract conceptual model of Google Search, where the client sends a search request and then twiddles its thumbs waiting for the results, with no useful work to do in the meantime.
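A back-of-the-envelope sketch of that cap (the function names and numbers are invented for illustration): a client with one outstanding request can never offer more than 1/latency requests per second, while overlapping requests hide the latency.

```python
# With one outstanding request, the client can't offer more than
# 1/latency requests per second, no matter how fast the server is.
def serial_client_throughput(round_trip_latency_s):
    return 1.0 / round_trip_latency_s

# Keeping several requests in flight hides latency (up to whatever
# request rate the server itself can actually sustain).
def pipelined_client_throughput(round_trip_latency_s, in_flight):
    return in_flight / round_trip_latency_s

print(serial_client_throughput(0.2))          # 5.0 requests/s offered
print(pipelined_client_throughput(0.2, 10))   # 50.0 requests/s offered
```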

Latency --> Throughput (Aside):

If latency is bad, then non-committed people will get bored and stop using the webpage. Because of this loss of customers, fewer and fewer users will use the program, so fewer requests will be sent; again the stellar processor will be underutilized, and throughput will be bound by the amount of work actually available.

Olorin

@Elias brings up an important point, but I'd argue that when we're talking about Google's search indexing, that's not really the same case -- the indexing servers aren't going to directly handle user requests, most likely. I'd argue that in the case of a web server, latency is the metric you want to minimize. Now, sometimes you need to improve throughput in order to improve latency, but it's all in the service of minimizing the time the user has to wait to see their page.

Certainly, when we're talking about services the user isn't directly interacting with, the game changes a bit, as @Elias mentioned. Backend indexing / machine learning servers are more likely to have throughput be the more important metric.

VP7

Information retrieval aided by indexing is not the only scenario to deal with when scaling a website. Hence, in a more general sense, I strongly agree with BryceToTheCore's opinion.

pmassey

I also have to disagree with @Elias and agree with @BryceToTheCore. The latency of the web server delivering the content is what is going to affect end-user happiness. Most of the optimizations that call for high throughput are not done on the serving web server itself, but on other machines attached to the same filesystem, so that the web server can access the results when they are updated. Search indexing is a prime example, because partially updated indexes are barely useful. (They don't contain the total corpus statistics, and would have a harder time predicting spam scores, etc.)

Elias

@pmassey: Be precise with what you're disagreeing with! In fact, I'm in agreement with @BryceToTheCore, and we're actually saying the same thing: from the perspective of the user, latency is paramount. The point I go on to make is a rebuttal of @Zarathustra's assertion that latency is (in nearly every case) of prime importance for performance. Latency is certainly important, but it's worth recognizing that there are other factors behind the scenes.

pmassey

@Elias -- fair point! I suppose I was immediately caught up on your example of Google's indexing (which I'm aware is just one of many possible examples). It is my impression that systems requiring continuous (but not real-time) updates (such as updating search indices) are run separately from the servers that serve users. Granted, throughput is still important to those systems, but it is very disconnected from the throughput of serving web content.