This is the wrong way to read a 512-bit cache line: all of the values we need sit on a single chip, so the other seven chips are essentially feeding irrelevant data to the memory controller on every load. By storing data contiguously on one chip, we miss out on the advantages of having multiple chips.
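To make the cost concrete, here is a minimal sketch in Python that compares the two layouts. It is a toy model, not a description of a real memory controller. It assumes eight chips that each supply an equal 8-byte share of every 64-byte burst, and it ignores ranks, banks, and row/column addressing. The function names and the `chip_capacity` parameter are hypothetical and exist only for illustration.

```python
from collections import Counter

NUM_CHIPS = 8            # from the text: one useful chip plus seven others
CACHE_LINE_BYTES = 64    # 512 bits
# Assumption: every chip contributes an equal share of each burst.
BYTES_PER_CHIP_PER_BURST = CACHE_LINE_BYTES // NUM_CHIPS


def chip_contiguous(addr, chip_capacity=1024):
    """Hypothetical layout: fill one chip completely before using the next."""
    return addr // chip_capacity


def chip_interleaved(addr):
    """Hypothetical layout: consecutive bytes rotate across the chips."""
    return addr % NUM_CHIPS


def bursts_needed(chips_touched):
    """All chips respond in parallel, so the busiest chip sets the burst count."""
    busiest = max(Counter(chips_touched).values())
    return -(-busiest // BYTES_PER_CHIP_PER_BURST)  # ceiling division


line = range(CACHE_LINE_BYTES)  # the 64 byte addresses of one cache line
contiguous_bursts = bursts_needed([chip_contiguous(a) for a in line])
interleaved_bursts = bursts_needed([chip_interleaved(a) for a in line])

# Fraction of the transferred bytes that we actually wanted.
contiguous_useful = CACHE_LINE_BYTES / (contiguous_bursts * CACHE_LINE_BYTES)
interleaved_useful = CACHE_LINE_BYTES / (interleaved_bursts * CACHE_LINE_BYTES)

print(contiguous_bursts, contiguous_useful)
print(interleaved_bursts, interleaved_useful)
```

Under these assumptions the contiguous layout needs eight bursts to collect one cache line, and only one eighth of the transferred bytes is useful. The interleaved layout gets the same line in a single burst because every chip contributes a piece of it.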