Slide 20 of 59
wcrichto

Another way to approach the problem is to use an nginx-style architecture. This means having a static number of processes (so you don't have to spawn a new process for every request, which is memory-intensive), with each process handling multiple requests concurrently using event-based code.
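A minimal sketch of the event-based idea: instead of one process per request, a single process multiplexes many connections through a readiness-notification API (Python's `selectors` here). The function and variable names are illustrative, not taken from nginx.

```python
# One process handles many connections: it only touches a connection
# when the OS reports it is ready, so no call blocks on a single client.
import selectors
import socket

sel = selectors.DefaultSelector()

def serve_ready_events(sel, max_events):
    """Handle whichever registered connections are ready to read."""
    handled = 0
    while handled < max_events:
        for key, _ in sel.select(timeout=1):
            conn = key.fileobj
            data = conn.recv(1024)
            if data:
                conn.sendall(data.upper())  # trivial stand-in for "handle request"
            handled += 1
    return handled

# Demo: an in-process socket pair stands in for a client connection.
client, server_side = socket.socketpair()
server_side.setblocking(False)
sel.register(server_side, selectors.EVENT_READ)

client.sendall(b"hello")
serve_ready_events(sel, max_events=1)
result = client.recv(1024)
print(result)  # b'HELLO'
```

In a real deployment each of the fixed pool of worker processes would run a loop like this over thousands of client sockets.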

squidrice

Maintaining several idle workers in the pool could also hide latency (setup cost) in a system with a dynamic number of workers. Compared to a system without idle workers, this system can send requests to the idle workers the moment a burst arrives, while simultaneously setting up more workers to handle the flood of requests; part of the setup cost is thus hidden behind the idle workers. A drawback of this strategy is the energy cost of keeping workers idle, but that cost is insignificant in a system that already has many running workers.
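A toy model of that trade-off: requests arriving in a burst are served immediately by pre-spawned idle workers, while requests beyond the idle capacity pay the worker setup cost. The `Worker`/`Pool` classes and the `setup_cost` value are illustrative assumptions, not from any real system.

```python
# Idle workers absorb the front of a burst; overflow requests must wait
# for a fresh worker to be set up.

class Worker:
    def __init__(self, setup_cost=5):
        self.setup_cost = setup_cost  # abstract time units to become ready

class Pool:
    def __init__(self, idle=2):
        # Idle workers were set up earlier, so their cost is already paid.
        self.idle = [Worker(setup_cost=0) for _ in range(idle)]

    def dispatch(self, n_requests):
        """Return the start latency each request in a burst observes."""
        latencies = []
        for _ in range(n_requests):
            if self.idle:
                self.idle.pop()
                latencies.append(0)                    # served immediately
            else:
                latencies.append(Worker().setup_cost)  # waits for a spawn
        return latencies

burst = Pool(idle=2).dispatch(4)
print(burst)  # [0, 0, 5, 5]
```

The first two requests see zero latency; only the overflow pays the setup cost, which is the "hiding" squidrice describes.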

traderyw

Error checking could easily be implemented in this model. If the application requires reliability, the parent process could send the same request to two workers and compare their results. The parent could check the results itself or forward them to a worker process; the latter is more desirable because it would not delay handling new client requests.

idl

@wcrichto it sounds like the nginx architecture would then not be able to take as much advantage of hardware parallelism, since you're limiting the number of worker processes? The event-based system would then cause the static set of workers to alternate between serving requests, lowering throughput.

pinkertonpg

@idl Without reading the page that Will linked to, I'm assuming that the number of processes that nginx spawns is directly correlated to the number of processors on the machine that is running it. Either 1:1, or 2:1 to take advantage of hyper-threading like we are used to seeing. This way hardware parallelism is taken advantage of, at the core level (ignoring SIMD). I think the key point here is that yes, a static number of processes can be less than optimal, but you have to remember that hardware is also static, even more so.

I did, actually, read the page, so I can safely assert that it is up to the administrator running nginx to determine the number of processes (nginx calls them workers) to create initially. Typically this is 1:1 with the number of cores, but if most of the expected work is going to be I/O, the ratio might be 1.5:1 or 2:1.
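The sizing heuristic above can be sketched in a few lines. The ratios are the ones mentioned in this thread (1:1 for CPU-bound work, ~2:1 when the workload is I/O-heavy), not an official nginx recommendation, and `worker_count` is a hypothetical helper.

```python
# Pick a worker count from the core count, oversubscribing for I/O-bound work.
import os

def worker_count(cores, io_bound=False):
    # CPU-bound: one worker per core keeps every core busy without
    # oversubscription; I/O-bound: extra workers cover time spent blocked.
    return cores * 2 if io_bound else cores

cores = os.cpu_count() or 1       # whatever this machine has
print(worker_count(8))                 # 8
print(worker_count(8, io_bound=True))  # 16
```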

nrchu

@traderyw if you're error checking by doing the work multiple times, I would think it would be easiest to simply compare a hash-based signature of the results, which takes almost no time and avoids the other pitfalls of extra communication (latency and general request overhead).
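A quick sketch of that suggestion: each worker hashes its result, and the parent compares the fixed-size digests instead of the full payloads. The payload strings are made up for the example; SHA-256 is one reasonable choice of hash.

```python
# Compare fixed-size digests of redundant results rather than the
# full results themselves.
import hashlib

def digest(result: bytes) -> bytes:
    return hashlib.sha256(result).digest()  # 32 bytes regardless of input size

r1 = b"response payload from worker A"
r2 = b"response payload from worker A"   # worker B computed the same answer
agree = digest(r1) == digest(r2)
print(agree)  # True

r3 = b"corrupted response from worker B"
print(digest(r1) == digest(r3))  # False
```

Comparing 32-byte digests is cheap no matter how large the responses are, which is the communication saving nrchu points out.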