When an elastic web server experiences a burst of requests, it can scale up the number of available servers so that each request can be serviced and the request queue does not pile up. I think Kayvon mentioned that in order to scale up and down, the servers need to be stateless. On the next slide, after the burst, we see that the elastic web server can scale back down to a number of servers that can sustain the average request load.
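A minimal sketch of the kind of scaling decision described above, driven by the length of the pending-request queue. All names and thresholds here (QUEUE_HIGH, MIN_SERVERS, etc.) are invented for illustration, not taken from any real autoscaler:

```python
# Hypothetical elastic-scaling policy: grow when the request queue piles up,
# shrink back toward a size that sustains the average load. Thresholds are
# made-up illustration values.
QUEUE_HIGH = 100   # scale up when pending requests exceed this
QUEUE_LOW = 10     # scale down when the queue has drained below this
MIN_SERVERS = 2
MAX_SERVERS = 16

def desired_servers(current: int, queue_len: int) -> int:
    """Return the new server count for one scaling decision."""
    if queue_len > QUEUE_HIGH and current < MAX_SERVERS:
        return current + 1   # burst: add a server
    if queue_len < QUEUE_LOW and current > MIN_SERVERS:
        return current - 1   # load subsided: remove a server
    return current           # steady state: current count sustains the load

print(desired_servers(4, 150))  # burst -> 5
print(desired_servers(4, 5))    # quiet -> 3
print(desired_servers(4, 50))   # steady -> 4
```

Because each decision only looks at the queue length and the current count, the policy itself carries no per-request state, which is consistent with the point that the servers being scaled must be stateless.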
Like how Apache has a preset number of worker threads in its worker pool (which may be idle but on standby in case a heavy load needs to be processed), there are a number of web servers here that were on standby and have been activated after a high workload was detected.
It is better to have a set of web servers initialized but in standby/sleep mode while the workload is low, and activate them during times of high workload. Otherwise, if the system is suddenly confronted with a high workload, the latency of initializing these web server instances at runtime would be very high.
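A toy sketch of why the warm standby pool helps: if full server startup is expensive, doing it ahead of time makes activation nearly free. The WebServer class and the 0.1 s startup cost are invented stand-ins, not real numbers:

```python
# Illustration of warm-standby vs. cold-start latency. The startup delay is a
# made-up stand-in for real initialization work (loading config, warming caches).
import time

class WebServer:
    def __init__(self):
        time.sleep(0.1)      # expensive one-time startup (~100 ms here)
        self.active = False

# Pre-initialize a standby pool while the workload is low.
standby = [WebServer() for _ in range(4)]

def activate_one(pool):
    """Activating a pre-warmed server is just a flag flip, not a full startup."""
    server = pool.pop()
    server.active = True
    return server

start = time.perf_counter()
s = activate_one(standby)
elapsed_ms = (time.perf_counter() - start) * 1000
print(f"activation took ~{elapsed_ms:.3f} ms")  # orders of magnitude below the 100 ms cold start
```

The trade-off is that the standby instances consume some resources (memory, at minimum) even when idle, which is exactly the cost being paid to avoid cold-start latency during a burst.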
@kipper I think it is more accurate to use "scale out" here. "Scale up" means using larger resources, while "scale out" means provisioning additional resources.
@xiaoguaz, oh okay, thanks for the clarification! I wasn't aware of the distinction.
And these extra web servers are spun up from hardware resources we already have, since the hardware was provisioned to accommodate peak load?