amolakn

What's the overhead of partitioning graph vertices into intervals and then determining which data is relevant to each interval, such as its incoming vertices? I understand that this increases locality, but I wonder what the overhead trade-off of this sharded distribution is.

acortes

The overhead is likely large, but if you reuse the data frequently or run multiple iterations of your algorithm over the nodes, then the preprocessing cost is amortized and leads to a significant speedup.
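
For concreteness, here is a minimal sketch of that kind of preprocessing, assuming the edges arrive as a flat edge list; the interval scheme and all names are illustrative, not taken from the slide:

    #include <algorithm>
    #include <cstdint>
    #include <vector>

    struct Edge { uint32_t src, dst; };

    // Illustrative sketch: bucket edges into shards by destination interval,
    // then sort each shard by source so a sequential pass over sources stays local.
    std::vector<std::vector<Edge>> buildShards(const std::vector<Edge>& edges,
                                               uint32_t numVertices,
                                               uint32_t numIntervals) {
        uint32_t intervalSize = (numVertices + numIntervals - 1) / numIntervals;
        std::vector<std::vector<Edge>> shards(numIntervals);
        for (const Edge& e : edges)            // one pass to bucket all edges
            shards[e.dst / intervalSize].push_back(e);
        for (auto& shard : shards)             // sorting is O(|E| log |E|) in total
            std::sort(shard.begin(), shard.end(),
                      [](const Edge& a, const Edge& b) { return a.src < b.src; });
        return shards;
    }

The cost is roughly one pass to bucket the edges plus a sort per shard, O(|E| log |E|) overall, which is why it pays off mainly when the graph is processed for many iterations.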

Split_Personality_Computer

Would these edges be stored as an edge list to be most space-efficient? Or would there be an attempt to build a local adjacency list to get the best of both worlds (memory and computation)?
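
For reference, a hedged sketch of the second option: converting an edge list into CSR (compressed sparse row), a compact adjacency-list layout. An edge list stores 2|E| vertex ids, while CSR stores |E| neighbor ids plus |V|+1 offsets and gives direct access to each vertex's out-neighbors. The code below is illustrative only, not what the slide's system necessarily does:

    #include <cstdint>
    #include <utility>
    #include <vector>

    // CSR: offsets[v]..offsets[v+1] index v's neighbors in one packed array.
    struct CSR {
        std::vector<uint32_t> offsets;
        std::vector<uint32_t> neighbors;
    };

    CSR edgeListToCSR(const std::vector<std::pair<uint32_t, uint32_t>>& edges,
                      uint32_t numVertices) {
        CSR g;
        g.offsets.assign(numVertices + 1, 0);
        for (const auto& e : edges) g.offsets[e.first + 1]++;  // count out-degrees
        for (uint32_t v = 0; v < numVertices; v++)             // prefix sum -> offsets
            g.offsets[v + 1] += g.offsets[v];
        g.neighbors.resize(edges.size());
        std::vector<uint32_t> cursor(g.offsets.begin(), g.offsets.end() - 1);
        for (const auto& e : edges)                            // scatter neighbors
            g.neighbors[cursor[e.first]++] = e.second;
        return g;
    }

The construction is two passes over the edges plus a prefix sum, so the extra memory over a raw edge list is only the |V|+1 offset array, while neighbor lookups become O(1).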