Question: How does the optimization discussed on this slide relate to the idea of arithmetic intensity from lecture 2?
yulunt
Arithmetic intensity is the ratio of math operations to data access operations. When foo() and bar() are streamed together, tmp is not written back to memory but passed directly to bar(). The total number of math operations stays the same while the number of memory accesses decreases, so arithmetic intensity increases.
pdp
I think the arithmetic intensity of the optimization here is very high because once the different cores load their data chunks into local memory, each core applies its kernel functions in parallel and outputs the processed data.
cluo1
If we can pipeline these executions, there is no need to go to memory to fetch the intermediate results, since all the values needed by the next kernel are already on the chip. This increases arithmetic intensity, which in turn increases overall throughput.
boba
This optimization increases arithmetic intensity by removing the data access operations for storing and retrieving tmp.
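To make the counting in the comments above concrete, here is a minimal C sketch. The bodies of foo() and bar() are not given on the slide, so the ones below are hypothetical stand-ins that each perform one math operation per element; run_unfused, run_fused, and intensity are likewise names made up for illustration.

```c
#include <stddef.h>

/* Hypothetical kernels: the slide does not show what foo() and bar() compute.
   Each one is assumed to do a single math operation per element. */
static float foo(float x) { return 2.0f * x; }
static float bar(float x) { return x + 1.0f; }

/* Unfused: tmp makes a round trip through memory.
   Per element: load in, store tmp, load tmp, store out = 4 accesses. */
void run_unfused(const float *in, float *tmp, float *out, size_t n) {
    for (size_t i = 0; i < n; i++) tmp[i] = foo(in[i]);
    for (size_t i = 0; i < n; i++) out[i] = bar(tmp[i]);
}

/* Fused (streamed): foo's result is handed straight to bar,
   so it can stay in a register and is never stored.
   Per element: load in, store out = 2 accesses. */
void run_fused(const float *in, float *out, size_t n) {
    for (size_t i = 0; i < n; i++) out[i] = bar(foo(in[i]));
}

/* Arithmetic intensity as math operations per memory access. */
double intensity(double math_ops, double mem_accesses) {
    return math_ops / mem_accesses;
}
```

Under these assumptions both versions do 2 math operations per element, so intensity(2, 4) gives 0.5 for the unfused version and intensity(2, 2) gives 1.0 for the fused one: same math, half the memory traffic. On real hardware a cache may absorb part of the tmp traffic when the arrays are small, so treat the 2x figure as an idealized count rather than a measured speedup.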