Slide View : Parallel Computer Architecture and Programming : 15-418/618 Spring 2017

Slide 52 of 55

rohany

In systems where we don't have control over memory layout, such as higher-level languages, what can we do to avoid false sharing?

muchanon

My question is similar to the one above, but from the opposite direction. Which low-level languages let us avoid this? Even in a systems language, how can we exert any control over it? The problem seems to be that two variables have addresses that map to the same cache line, but we have no control over which memory addresses our variables get. For example, I could malloc two different arrays that end up mapping to the same cache line, and I don't have any control over the addresses malloc returns.

pht

The goal is to ensure that variables causing false sharing are spaced far enough apart in memory that they cannot reside on the same cache line. Methods to fix false sharing include using compiler directives to force individual variable alignment, padding the structure to the end of a cache line, and using thread-local copies of data.

See https://software.intel.com/en-us/articles/avoiding-and-identifying-false-sharing-among-threads for more information.
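Two of the methods listed above, an alignment directive and thread-local copies, can be sketched in C++17 roughly as follows. This is only an illustration under stated assumptions: the 64-byte line size is assumed (it is common on x86-64 but not universal), and the names kCacheLineSize, AlignedCounter, cache_line_of, and sum_with_local_copies are hypothetical, not taken from the lecture or the Intel article.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Assumption: 64-byte cache lines. Check the value for your target machine.
constexpr std::size_t kCacheLineSize = 64;

// Method 1: alignment directive. alignas forces every AlignedCounter to start
// on a cache-line boundary and rounds its size up to a full line, so two
// elements of an array can never share a line.
struct alignas(kCacheLineSize) AlignedCounter {
    long value = 0;
};

// Hypothetical helper: index of the cache line that contains address p.
inline std::uintptr_t cache_line_of(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) / kCacheLineSize;
}

// Method 2: thread-local copies. Each thread accumulates into a local
// variable and writes its shared slot exactly once, at the end. The slots in
// `results` are adjacent and may share a line, but each is written only once
// per thread instead of once per loop iteration.
inline long sum_with_local_copies(int num_threads, long iters) {
    assert(num_threads > 0);
    std::vector<long> results(static_cast<std::size_t>(num_threads), 0L);
    std::vector<std::thread> threads;
    for (int t = 0; t < num_threads; ++t) {
        threads.emplace_back([&results, t, iters] {
            long local = 0;  // private to this thread
            for (long i = 0; i < iters; ++i) {
                local += 1;
            }
            results[static_cast<std::size_t>(t)] = local;  // single shared write
        });
    }
    for (auto& th : threads) {
        th.join();
    }
    long total = 0;
    for (long r : results) {
        total += r;
    }
    return total;
}
```

With this layout, an array such as AlignedCounter counters[8] gives each thread its own line to update, at the cost of 64 bytes per counter instead of 8.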

M12

@rohany I'll direct you to this article that I found: Click me. It is about multi-core issues in Python, a high-level language. The article does not provide a solution to the issue. Maybe Python requires programmers to deal with these issues themselves. In functional languages, I doubt the designers would want to break that abstraction, so it would have to be up to the compiler to deal with it. I'm curious what the actual answer is as well, though.

@muchanon To solve false sharing in a low-level language such as C or C++, Professor Kayvon showed us an example where he kept the variable counter from being falsely shared. This was done by wrapping it in a struct with a lot of padding (in particular, CACHE_LINE_SIZE - sizeof(int) bytes). The padded program ran much faster. See the previous slide (slide 51).
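A minimal sketch of the padded struct described above might look like the following. It is not the code from slide 51: the value 64 for CACHE_LINE_SIZE is an assumption, and PaddedCounter, bytes_between, and bump are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Assumption: 64-byte cache lines. The value used on the slide is not shown here.
constexpr std::size_t CACHE_LINE_SIZE = 64;

// One counter plus enough padding to fill out a cache-line-sized slot.
struct PaddedCounter {
    int counter;
    char padding[CACHE_LINE_SIZE - sizeof(int)];
};

// Consecutive array elements are exactly one line's worth of bytes apart, so
// two counters cannot land on the same line even if the array itself does not
// start on a line boundary.
static_assert(sizeof(PaddedCounter) == CACHE_LINE_SIZE,
              "each counter should occupy one cache-line-sized slot");

// Hypothetical helper: byte distance from a's counter to b's counter.
inline std::ptrdiff_t bytes_between(const PaddedCounter& a, const PaddedCounter& b) {
    return reinterpret_cast<const char*>(&b.counter) -
           reinterpret_cast<const char*>(&a.counter);
}

// Thread t is meant to update only counters[t].counter.
inline void bump(PaddedCounter* counters, std::size_t num, std::size_t t) {
    assert(t < num);
    ++counters[t].counter;
}
```

The padding bytes are never written, so it does not matter that one element's padding may share a line with the next element's counter: only writes to a shared line trigger the invalidations that make false sharing expensive.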

Tiresias

I'm still not sure I understand the concept of false sharing. It seems to me that it happens when a cache line straddles the boundary between one processor's work and another's, so the line moves back and forth between their caches and creates a lot of unnecessary communication. But I'm still not sure what is meant by artifactual vs. inherent communication. Can someone elaborate?

Also, can false sharing cause errors, or is it just costly?

woohoo

@Tiresias the explanations of inherent and artifactual communication are here: http://15418.courses.cs.cmu.edu/spring2017/lecture/progperf2/slide_039