In lecture we mentioned that this layout is harder to achieve on chip, since we observe that there are several paths of the network that must physically cross each other.

maxdecmeridius

Yes, that's correct. This is very difficult to do on a 2D chip, so if we implement this layout we'd have to deal with multiple metal layers and the like.

In fact, IBM's Blue Gene/Q uses a 6D torus interconnect called Tofu. (https://en.wikipedia.org/wiki/Blue_Gene#Blue_Gene.2FQ)

haibinl

One good use case for Torus, as mentioned in class, is the grid-solver example. For a grid solver, each processor need to access to its neighbours' data, and the torus topology makes the latency to O(1).

grizt

While I understand why it's difficult to put on the chip, what does this actually mean for chip designers? Is it just very expensive? If so, how expensive is it? Is the cost worth the payoff? How can I, as the 418 student, reason about these tradeoffs?

PandaX

I think it should be 'lower cost that 2d grid right', is it a typo?

kayvonf

@PandaX. It is higher cost to build a torus, as compared to a mesh, because a torus has more links, as well as higher design and layout complexity.

PandaX

@kayvonf Oh I see! I thought cost means average latency, turns out it means total link length.

kayvonf

@PandX. Just to clarify -- I meant physical cost to create the network.

randomized

So is "cost" the combined cost for both links and switches?

mrrobot

Is the average latency for Torus still the same as mesh - O(sqrt(n))?

grizt

@mrrobot, yes. Though I believe the longest shortest path in a torus actually half as long as one in the mesh, with length sqrt(n)-1 instead of 2 (sqrt(n)-1). (Consider the longest path in a mesh is from corner to corner, whereas in a torus it's from center to corner.)

grizt

@randomized, since any connected network must have at least n-1 links (i.e., a tree), we cannot have fewer than O(n) links. So it's only necessary to consider the cost of the links, which can be between O(n) (e.g., a ring or a tree) and O(n^2) (e.g., a fully connected network).

cyl

It looks like the torus is like 2D ring, or the mesh with connection at farthest node.

ojd

I have very little understanding of electronic circuitry, but is the Torus topology becoming more feasible with more recent advances in 3d chip tech like what was used for HBM/Stacked memory?

In lecture we mentioned that this layout is harder to achieve on chip, since we observe that there are several paths of the network that must physically cross each other.

Yes, that's correct. This is very difficult to do on a 2D chip, so if we implement this layout we'd have to deal with multiple metal layers and the like.

In fact, IBM's Blue Gene/Q uses a 6D torus interconnect called Tofu. (https://en.wikipedia.org/wiki/Blue_Gene#Blue_Gene.2FQ)

One good use case for Torus, as mentioned in class, is the grid-solver example. For a grid solver, each processor need to access to its neighbours' data, and the torus topology makes the latency to O(1).

While I understand why it's difficult to put on the chip, what does this actually mean for chip designers? Is it just very expensive? If so, how expensive is it? Is the cost worth the payoff? How can I, as the 418 student, reason about these tradeoffs?

I think it should be 'lower cost that 2d grid right', is it a typo?

@PandaX. It is

higher cost to builda torus, as compared to a mesh, because a torus has more links, as well as higher design and layout complexity.@kayvonf Oh I see! I thought cost means average latency, turns out it means total link length.

@PandX. Just to clarify -- I meant physical cost to create the network.

So is "cost" the combined cost for both links and switches?

Is the average latency for Torus still the same as mesh - O(sqrt(n))?

@mrrobot, yes. Though I believe the longest shortest path in a torus actually half as long as one in the mesh, with length sqrt(n)-1 instead of 2 (sqrt(n)-1). (Consider the longest path in a mesh is from corner to corner, whereas in a torus it's from center to corner.)

@randomized, since any connected network must have at least n-1 links (i.e., a tree), we cannot have fewer than O(n) links. So it's only necessary to consider the cost of the links, which can be between O(n) (e.g., a ring or a tree) and O(n^2) (e.g., a fully connected network).

It looks like the torus is like 2D ring, or the mesh with connection at farthest node.

I have very little understanding of electronic circuitry, but is the Torus topology becoming more feasible with more recent advances in 3d chip tech like what was used for HBM/Stacked memory?