Another workload characteristic of real-world graphs is that a few vertices usually account for a significant percentage of all the edges (e.g. Mark Zuckerberg's followers on Facebook). If we simply partition the graph by placing whole vertices on different machines, these super popular vertices can cause workload imbalance, high communication overhead, and reduced opportunity for parallelism.
PowerGraph, yet another graph-processing framework developed at CMU, aims to solve this problem by allowing vertices to be replicated across different machines. This lets PowerGraph divide the edges of a super popular vertex among its replicas, so no single machine has to process all of them.
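To make the idea concrete, here is a minimal Python sketch of my own, not PowerGraph's actual code. It compares placing whole vertices on machines against splitting a vertex's edges across machines. The function names and the round-robin edge placement are assumptions for illustration only; the real system uses smarter placement heuristics and also has to keep the replicas in sync.

```python
from collections import Counter, defaultdict

def vertex_placement_load(edges, num_machines):
    """Baseline: each vertex lives on one machine (vertex id mod num_machines),
    and every edge is processed on the machine that owns its destination."""
    return Counter(dst % num_machines for _, dst in edges)

def vertex_cut(edges, num_machines):
    """Vertex-cut sketch: every edge is assigned to exactly one machine, and a
    vertex gets a replica on each machine that holds one of its edges.
    Round-robin placement is a hypothetical stand-in for a real heuristic."""
    edges_on = defaultdict(list)   # machine -> edges stored there
    replicas = defaultdict(set)    # vertex  -> machines holding a replica
    for i, (src, dst) in enumerate(edges):
        machine = i % num_machines
        edges_on[machine].append((src, dst))
        replicas[src].add(machine)
        replicas[dst].add(machine)
    return edges_on, replicas

# A tiny "celebrity" graph: vertices 1..8 all follow vertex 0.
edges = [(follower, 0) for follower in range(1, 9)]

print(vertex_placement_load(edges, 4))   # all 8 edges land on machine 0
edges_on, replicas = vertex_cut(edges, 4)
print({m: len(e) for m, e in edges_on.items()})  # 2 edges on each machine
print(sorted(replicas[0]))               # vertex 0 is replicated on all 4 machines
```

The work is now balanced, but the price is that vertex 0 exists on four machines, so its value has to be combined and synchronized across the replicas after each step.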
Read more here:
http://www.select.cs.cmu.edu/publications/paperdir/osdi2012-gonzalez-low-gu-bickson-guestrin.pdf
http://www.cl.cam.ac.uk/~ey204/teaching/ACS/R212_2015_2016/presentation/S3/James_POWERGRAPH_P.pdf
This is also a good read comparing different graph-processing frameworks: https://code.facebook.com/posts/319004238457019/a-comparison-of-state-of-the-art-graph-processing-systems/