Routing is basically deciding which path to take. A message originating from one node may have several paths at its disposal to reach the destination node. Routing determines which one of those paths the message will take. The routing algorithm we decide for a network is very critical. It should be such that the whole network topology is utilized evenly, there is minimum contention, minimum latency and no deadlocks. If all these goals are met, the network will have the best throughput and that is always desired.
We can classify the routing algorithms in broadly three categories: Deterministic routing, Oblivious routing and Adaptive routing.
Deterministic Routing
Deterministic routing is the simplest one – all messages will follow the same pre-defined path. To expand a bit, if a message must go from A to B, it only has one path available to do it. It is easy to see how that can be a problem. If A and B are talking a lot, the messages will have to wait but the other paths are not being utilized. This will increase latency and decrease throughput. Because of the pre-decided paths for any message, this routing is deadlock-free, but it lacks path diversity.
Oblivious Routing
Oblivious means not realizing something happening around you. That’s basically the gist of what this routing does. A message from A to B can take any path at its disposal, but it will do so without considering the traffic in those paths. This is slightly better than DOR, but still the situation like DOR can occur here as well. The main tradeoff in oblivious routing is between locality and load balance. Load can be balanced for any traffic pattern on almost any topology using Valiant’s algorithm, which is an example of oblivious routing algorithm. In Valiant’s algorithm, a packet that is to be sent from 's' to 'd', is first sent from 's' to a randomly chosen intermediate terminal node 'd’' and then from 'd’' to 'd'. This load balance comes at a cost - it destroys any locality in the traffic pattern. Even if the source and destination nodes are neighbors, the latency can still be as worse as when the two farthest nodes are communicating, because the intermediate terminal node 'd’' can by any node within the network, basically increasing the hop count. To reduce hop count and preserve locality, we can use something called Minimal Oblivious Routing. In this, the 'd’' can only be selected such that it does not exceed the worst-case hop count between s and d. Please refer the picture below.
The hop count for minimum latency between 's' and 'd' is 3 in this picture. In (a), the hop count is more than 3 for the 'd’' selected. Using minimal oblivious routing, we can restrict the 'd’' to be in the smaller quadrant as you can see in (b). The hop count will not exceed 3 in that case.
Valiant’s routing algorithm and minimal oblivious routing is not deadlock free by design, but it can be made deadlock free if we restrict certain turns in a path. For example, a message traveling east or west is allowed to turn north and south, but a message traveling north or south is not allowed any turns. Basically, we must avoid forming a cycle in the network.
Adaptive Routing
Adaptive routing adapts to the situation around it. It can change paths if it sees congestion in the current path. To do this, it needs network state information. Network state information can be anything like queue occupancy, FIFO almost full, almost empty kind of status signals. Since routing is dependent on the network state, adaptive routing is closely coupled with the flow control mechanism. All these requirements increase the logic for the network and hence increasing the real state used in chip floorplan. The network state information can local or global. Let us understand with an example.
Consider the ring topology in above picture with 8 nodes. Assume node 5 is sending packets continuously to node 6 using all the available bandwidth. At the same time, node 3 also wants to send packets to node 7. It can either choose to go in the clockwise direction, where it will see contention when it reaches node 5, or it choose to go in the anticlockwise direction where it will see no contention and can transmit the packet easily to node 7. It is obvious to us that the anticlockwise direction suits the best because that global information is available to us, but not to node 3. Node 3 can only have the local information, which is of node 2 or node 4. Nodes can have global information indirectly through backpressure. Backpressure is nothing but the FIFO full signal. Node 5 can send backpressure to node 4, which in turn will send backpressure to node 3 and node 3 will not send the packet in the clockwise direction. This is but a simple example of how adaptive routing can be implemented. To learn about it in detail, please refer [4].
Let us look at some other ways to implement routing.
Source routing – In source routing, the routing information is embedded in the packet header itself. Say, a packet is of 32-bits, the top 5 bits has routing information, and the rest is the actual message. In this routing, the entire route is decided at the source itself. The nodes or routers can read the MSB of the packet and easily understand in which direction this message must go. There is no need to implement any complex decoders in the router which saves logic and latency is improved as well since no decoding is happening.
Node table-based routing – This is better explained in the following paragraph which I am quoting from [3]. “More sophisticated algorithms are realized using routing tables at each hop which store the outgoing link a packet should take to reach a particular destination. By accessing routing information at each hop (rather than all at the source), adaptive algorithms can be implemented, and per-hop network congestion information can be leveraged in making decisions. The most significant downside to node routing tables is the increase in packet delay. Source routing requires a single look-up to acquire the entire routing path for a packet. With node-based routing, the latency of a look-up must be expended at each hop in the network.”
Combinational circuits – One can simply use comparators at each router to decide the path. Assume all routers have an x, y co-ordinate associated with them. We can easily come up with four equations using comparators, each for east, west, north and south, which will decide from which direction the packet will go out of the router. These equations will be specific to the topology you have created it for.
New routing algorithms are being developed every day in hopes to create an ideal network. You can look for research papers in this topic to learn more.
So, this is a good place to stop and I will see you in the next one!
References:
[1] Dally, William & Towles, Brian. (2001). Route packets, not wires: On-chip interconnection networks. Proceedings - Design Automation Conference. 684 - 689. 10.1109/DAC.2001.156225.
[2] Networks-on-Chip: From Implementations to Programming Paradigms by Sheng Ma, Libo Huang, Mingche Lai, Wei Shi, Zhiying Wang
[3] On-Chip Networks, Second Edition by Natalie Enright Jerger, Tushar Krishna, Li-Shiuan Peh
[4] Principles and Practices of Interconnection Networks by William James Dally, Brian Patrick Towles
[5] Wu, Chang & Li, Yubai & Peng, Qicong & Chai, Song & Yang, Zhongming. (2009). Construction of a multidimensional plane network-on-chip architecture based on the hypercube structure. Progress in Natural Science - PROG NAT SCI. 19. 635-641. 10.1016/j.pnsc.2008.10.003.
[7] Computer Architecture - Lecture 20: Interconnects (Fall 2022) by Onur Mutlu (Link)
[8] Computer Architecture - Lecture 21: On-Chip Networks & Efficient Router Design (Fall 2022) by Onur Mutlu (Link)
[9] W. J. Dally, "Virtual-channel flow control," in IEEE Transactions on Parallel and Distributed Systems, vol. 3, no. 2, pp. 194-205, March 1992, doi: 10.1109/71.127260.
[10] Next-Gen Computer Architecture: Till the End of Silicon by Smruti R. Sarangi
Comments