Couldn't we also connect all 8 banks to a single shared 64-bit data line, so that we still get the same throughput of 64 bits per cycle while keeping all of a cache line's data inside one bank's row buffer? This would also save power, since we wouldn't need to do PRE + RAS + CAS on the other banks to access the same cache line. If memory systems were designed to optimize for cache line accesses, wouldn't this approach be better? Are the 8 pins per chip fixed?
@mrrobot I think it's possible, but it would require adding functionality to each bank, since a single bank would then have to supply all 64 bits per cycle on its own. By reading from all banks in parallel, we can get higher throughput out of lower-performance banks. For completely random accesses to cache lines, your approach could potentially reduce power. However, if we read multiple sequential cache lines, the power benefit is negligible, because a read that long would require PRE + RAS on the other banks anyway.
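To make the random vs. sequential comparison concrete, here is a rough toy model, not a real DRAM simulator. It only counts how many PRE + RAS pairs (row activations) each layout needs. Everything in it is an assumption for illustration: the 2048-byte row buffer per bank, the 64-byte cache line, the address mappings, and the names `activations`, `"striped"`, and `"single"`. It also ignores timing and treats every activation as costing the same.

```python
NUM_BANKS = 8
LINE_BYTES = 64    # assumed cache line size
ROW_BYTES = 2048   # hypothetical row-buffer size per bank

def activations(layout, lines):
    """Count PRE + RAS pairs needed to read the given cache line indices.

    layout "striped": every cache line is spread over all 8 banks
                      (8 bytes of each line per bank).
    layout "single":  every cache line lives entirely in one bank; consecutive
                      row-sized chunks of lines rotate across the banks.
    All banks start with no useful row open, so the first touch of a bank
    always costs one activation.
    """
    open_row = [None] * NUM_BANKS
    count = 0
    for line in lines:
        if layout == "striped":
            lines_per_row = ROW_BYTES // (LINE_BYTES // NUM_BANKS)  # 256
            row = line // lines_per_row
            targets = [(bank, row) for bank in range(NUM_BANKS)]
        elif layout == "single":
            lines_per_row = ROW_BYTES // LINE_BYTES  # 32
            chunk = line // lines_per_row
            targets = [(chunk % NUM_BANKS, chunk // NUM_BANKS)]
        else:
            raise ValueError(f"unknown layout: {layout}")
        for bank, row in targets:
            if open_row[bank] != row:   # row miss: PRE + RAS needed
                open_row[bank] = row
                count += 1
    return count

# One random cache line: the single-bank layout opens far fewer rows.
print(activations("striped", [5]))         # 8
print(activations("single", [5]))          # 1

# 256 sequential cache lines: both layouts end up opening a row in every bank.
print(activations("striped", range(256)))  # 8
print(activations("single", range(256)))   # 8
```

Under these assumptions, the single-bank layout only wins when accesses are scattered. Once a read is long enough to run past one row, it has to open rows in the other banks as well, so the activation count evens out.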