Just to clarify: cores with SIMD execution units don't require us to use all of the units, right? So if we only need a single computation such as a * b = c, rather than multiplying the elements of two whole arrays, we can use just as many ALUs as we need? In the a * b = c case, we would use only one ALU and the rest would sit idle.
That is correct. We don't have to use every execution unit, but it is generally preferable to keep as many of them busy as we can at one time: the more units doing useful work in parallel, the better our programs will perform.
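To make the difference concrete, here is a minimal C sketch of the two cases. The function names `multiply_scalar` and `multiply_arrays` are made up for illustration, and whether the loop is actually compiled to SIMD instructions is an assumption that depends on the compiler, its optimization flags, and the target CPU.

```c
#include <stddef.h>

/* Scalar case: a single multiply, so there is only one lane's worth of work. */
float multiply_scalar(float a, float b) {
    return a * b;
}

/* Array case: the same multiply applied to n independent elements.
   An optimizing compiler may turn this loop into SIMD instructions that
   handle several elements per instruction, keeping more ALUs busy. */
void multiply_arrays(const float *a, const float *b, float *c, size_t n) {
    for (size_t i = 0; i < n; i++) {
        c[i] = a[i] * b[i];
    }
}
```

Both functions give correct results either way; the second one simply exposes enough independent work for the extra execution units to be useful.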