As you lean towards more specialization, the responsibility of designing a better performing system gets punted towards the hardware engineers.
srb
This has also been pointed out in the context of our final projects - we need to build a high quality, naive implementation running on a CPU as a benchmark, or else speedup may not mean anything and a parallel algorithm may be slower (because of overhead, etc) than a completely serial algorithm.
koala
Because we have these specialized pieces in the architecture, it is often better to use built-in libraries for complicated computations (e.g. video decoding) than to try to write highly optimized low-level code that cannot take advantage of these specialized pieces.
As you lean towards more specialization, the responsibility of designing a better performing system gets punted towards the hardware engineers.
This has also been pointed out in the context of our final projects - we need to build a high quality, naive implementation running on a CPU as a benchmark, or else speedup may not mean anything and a parallel algorithm may be slower (because of overhead, etc) than a completely serial algorithm.
Because we have these specialized pieces in the architecture, it is often better to use built-in libraries for complicated computations (e.g. video decoding) than to try to write highly optimized low-level code that cannot take advantage of these specialized pieces.