The research paper published by IJSER journal is about Performance Analysis of Routing Algorithms 1

ISSN 2229-5518

Performance Analysis of Routing Algorithms

Mr. Lokesh M. Heda

Shri Ramdeobaba, College of Engineering and Management, Nagpur, India

Dr. N P. Narkhede

Asst. Professor

Shri Ramdeobaba College of Engineering and Management, Nagpur, India

ABSTRACT

As an emerging and advanced technology, Network-on-Chip (NoC) may become the alternative of the
traditional bus-based System-on-Chip (SoC). In an interconnection network structure, interconnected routers are the core of the whole system and the network's performance mainly depends on their performance. There are many significant factors which determine the working mechanism of a router and its performance, such as topology, switching strategy, routing algorithm, and flow control mechanism. The switching mechanism plays a vital role to move the data from an input channel and place it on an output channel. Virtual cut through (VCT) and wormhole (WH) switching techniques are widely used in NoC architecture. In this paper, Wormhole and VCT technique has been proposed for Network on chip architecture and its performance is analyzed using the parameters such as area and speed.
In this paper the designing and implementation of Wormhole and Virtual Cut through router have been discussed. The simulation of Wormhole system and VCT is done in Modelsim-SE as a simulation & debugging tool. The design is synthesized in Xilinx ISE 9.1i as well as in Quertus 8.1 for the packet size of 16 bits (0-15) on the platform of family automotive spartan2 for device-XC2S200, PQG208 package and speed -5.

GENERAL TERMS

NoC architecture, Switching technique, Wormhole Routing algorithm and Virtual Cut through

(VCT)

1 INTRODUCTION

Wormhole switching was introduced as a suitable switching technique for the design of compact and fast routers. Wormhole switching pipelines message transmission across multiple routers and requires very small buffers, leading to single chip routers even with the technology available in 1985 [5]. When the channels requested by a message are busy, the message is blocked in place. Flow control signals exchanged between routers prevent buffer overflow. Therefore, buffers only need to provide storage for a few flow control digits, or flits.
Virtual cut-through switching is proposed for NoC. It also pipelines message transmission across multiple routers, behaving in the same way as wormhole switching in the absence of contention. However, message flow control is performed at the level of packets, therefore requiring buffers with capacity for one or more packets to store blocked packets. Thus,
when a packet blocks, it is removed from the network and stored in a single buffer. Although removing blocked packets from the network reduces contention considerably, the large storage capacity required for virtual cut-through routers led manufacturers to prefer wormhole switching. Most current routers implement this switching technique.
Network on Chip architecture is composed of multiple
resources or intellectual property (IP) cores, connected by channels intersecting at various switches. The switches have buffers for each interface which queue message packets waiting for a given destination. On- chip network contains four fundamental components. These are IP-core, network adapter, router and links. For comparison of different NoC architectures a standard set of performance metrics are used.

2 MOTIVATION

Routing algorithms used in interconnection

IJSER © 2012

http://www.ijser.org

The research paper published by IJSER journal is about Performance Analysis of Routing Algorithms 2

ISSN 2229-5518

networks must be deadlock and live lock free. In
designing algorithm, speed and area are challenges. Therefore, the algorithms used in FPGA must be efficient in case of buffer, speed, area and routing logic. We are in search for routing algorithm which can give best performance for FPGA, so we need to commence with comparison of the existing routing algorithms for FPGA such that the one having suitable characteristics for FPGA can be recognized by using the VHDL environment.
In this paper, we show that wormhole routers can be efficient than virtual cut-through routers in terms of area, Moreover WH algorithm provide better speed than the VCT algorithm. We evaluate the performance of WH and VCT algorithm in a HDL environment. Efficient algorithms were designed, which were having same IO capabilities so as to enable possibility of comparison between the designed algorithms.
The rest of the paper is organized as follows. In Section 3 we review the Switching Techniques. A wormhole switching is proposed in Section 4. Also, a Virtual Cut through routing algorithm is proposed in Section 5. In this section scheduler and packet patter has also discuss. In Section 6 software utilization for the WH and VCT is summarized. In section 7 Hardware Implementation & Synthesis has been discuss. The performance of the wormhole and VCT is evaluated in Section 8. Finally, some conclusions are drawn.

3 SWITCHING TECHNIQUES

NOC architectures are based on packet-switched networks In the design of communication network, the selection of appropriate switching technique is one of the challenging task. Switching is the actual mechanism that removes data from an input channel and places it on an output channel. [2] Which output channel is to be chosen depends on routing algorithm. It determines how network resources are allocated for data
transmission; the routing algorithm decides how and when the input channel is connected to the
output channel. Switching techniques can be classified based on network characteristics and it is shown in figure1.

Figure 1. Switching techniques

3.1 PACKET SWITCHING

In packet switching data is partitioned into fixed length blocks called packets. Each packet has its own addressing information. These causes extra overhead hence buffer requirement is high in these cases. However by dividing messages into packets, a message can use a number of links on a path simultaneously. Characteristic of this switching are more complete resource sharing, higher channel utilization & lower network delay. In message switching and packet switching extra delay is occurred since a message is not permitted to be transmitted out before it is received completely. Figure2 shows the concept of packet switching.

Figure 2: Packet Switching

Typical packet switching techniques include store-and- forward, virtual cut- through, and wormhole switching.

4 WORMHOLE SWITCHING

IJSER © 2012

http://www.ijser.org

The research paper published by IJSER journal is about Performance Analysis of Routing Algorithms 3

ISSN 2229-5518

Wormhole routing is one of the most popular switching techniques in NoC and it’s more suitable for implementing NoCs compared to store-and-forward and virtual cut-through switching[4,5]. In wormhole switching, a data packet (message) is divided into small flits for transmission and flow control. The header flit governs the route of the packet and the reaming data flits follow it in a pipeline fashion. When the header flit is blocked due to contention for output channels or due to insufficient buffer space, all other data flits wait at their current nodes forming a chain of flits that spans over multiple nodes. Wormhole switching makes the end-to-end delay insensitive to the packet destination due to the pipelining of flits, and routers require only small amount of buffer space.
The general format of a message and the units of packets and flits are shown in Figure 3(a). Figure3(b) shows the routing mechanism using wormhole switching, where the header flit H contains the destination address and the data flits D follow H contiguously in a pipelined fashion.

Figure3: Message format and routing in wormhole switching.
The main advantage of wormhole switching derives from the pipelined message flow, since transmission latency is insensitive to the distance between the source and destination.

4.1 WORMHOLE ALGORITHM IMPLEMENTATION

The Wormhole router is designed and implemented for the four nodes (SoC‟s). The design is synthesized in Xilinx ISE 9.1i as well as in Quartus II 8.1 for the packet size of 16 bits (0-15) on the platform of family automotive spartan2 for device-XC2S200 and PQG208 package& speed -5.

4.2 WORMHOLE ROUTER

The Wormhole router is designed for four nodes. Each
node is connected to all the four nodes by the link. This cross bar switching fabrics has total 16 links. All these nodes transfer the data when clock signal is high or the event is occurred on it. Router has four inputs in terms of packets of size 15 down to 0 & outputs of size 7
down to 0. Each packet has source and destination addresses each of two bits (1 down to 0). RTL view of Wormhole router is obtained using Xilinx synthesis tool. Wormhole router has been tested for both the Area and the speed. Round Robin (RR) scheduler plays a vital role to decide the priority to transfer the packet.
4.3 WORMHOLE SYSTEM
The RTL view of Wormhole system is shown in figure4.

In which clock is provided to both router as well WH system. The signals are allotted for handling the internal flow of packets.
Figure 4: RTL view of Wormhole system

5.Virtual Cut through (VCT)

In VCT When a message comes in an intermediate node and if its outgoing channel is free
then it is transmitted out immediately. Instead of waiting for the whole packet buffered, the incoming header flit is cut through into the next router as soon as the decision was made and the output channel is free. Every further flit is buffered whenever it reaches the router, but it is also immediately cut-through to the next router if the output channel is free. In case of no resource conflicts along the route, the packet is effectively pipelined through successive routers as a loose chain of flits. All the buffers along the routing path are blocked for other communication requirements. Buffers are required when a busy channel is encountered. VCT routers have buffers for whole packet. Each node must provide sufficient buffer space for all the messages passing through it, and because multiple messages may be blocked at any node, a very large buffer space is required at each node. VCT switching of a packet is shown in figure 5.

Figure 5: Virtual Cut through Switching

5.1 VIRTUAL CUT THROUGH SYSTEM

IJSER © 2012

http://www.ijser.org

The research paper published by IJSER journal is about Performance Analysis of Routing Algorithms 4

ISSN 2229-5518



The RTL view of Wormhole system is shown in figure6. In which clock is provided to both router as well WH system. The signals are allotted for handling the internal flow of packets.

Figure 6: RTL view of VCT system

5.2 SCHEDULER

Round-Robin scheduling algorithm is used in our
Wormhole and VCT system as a scheduler which assigns time slices to each process in equal portions and in circular order. Round-Robin scheduling algorithm is simple and easy to implement, and starvation-free. If two or more source addresses are demanding the same destination address then contention occurs, then as per Round- Robin scheduling algorithm, priority is given to node 1 first then to node 2, then to node 3 & so on in a cyclic manner.

5.3 PACKET PATTERN FOR WORMHOLE AND VCT

Source & destination address of each node is same. Each packet consists of first four bit as a header to identify starting of the packet, two bit source address followed by two bit destination address and after that eight bit random data. Figure 7 shows the arrangement of the packet which is use for wormhole routing process.

Figure 7: Packet Format

7 HARDWARE IMPLEMENTATION & SYNTHESIS

Implementation of Hardware is done on Cyclone II
(EP2C35 Family) FPGA by using Quartus II Software of version 8.1. Figure 8 shows the top view of Cyclone II – EP2C35F672C6. Red symbol represent the assigned I/O. Pin assigned summary is shown in table 1.



Figure 8: TOP VIEW – WIRE BOND CYCLONE II – EP2C35F672C6

Symbol

Pin Type

User I/O

.

User Assigned I/O

Filtered Assign I/O

Unbonded Pad

Reserved Pin

CLK_n

CLK_p



Table 1: Pin type and assigned pin summary

8 SIMULATIONS OF WORMHOLE AND VCT

The simulation of the algorithm is carried out on Modelsim-SE as a simulation & debugging tool. The algorithm is validated for the fixed number of packets. If any one of the node is not used then „Z‟ is appears at the output of that node. Simulation waveform is shown in figure 8.

IJSER © 2012

http://www.ijser.org

The research paper published by IJSER journal is about Performance Analysis of Routing Algorithms 5

ISSN 2229-5518

Figure9: Simulation waveform of WH

Figure 10: Simulation Waveform of VCT

8.1 PERFORMANCE OF WORMHOLE AND VCT

TIMING SUMMARY (for WH)

Minimum period: 10.857ns

TIMING SUMMARY FOR VCT

Minimum period: 26.714ns

9 CONCLUSION

Networks-on-Chip are emerging as a viable interconnects architecture for multiprocessor SoC platform. In our work, we designed and implemented Wormhole and VCT routing algorithm on HDL platform. The designs were implemented on Cyclone family of Altera FPGA boards. Efficient algorithms were designed, which were having same IO capabilities so as to enable possibility of comparison between the designed algorithms.
Performance of Wormhole and VCT is analyzed on basis of area and speed
As per the results of simulations
and statistical analysis based on the simulations, it was observed that the wormhole routing algorithm is efficient as compared to the VCT algorithm in terms of parameters of Area requirement and switching speed. The analysis is done on the platform of family automotive Spartan 2 for device- XC2S200 and PQG208 package & speed -5.Implementation of Hardware is done on Cyclone II (EP2C35 Family) FPGA by using Quartus II Software of version 8.1. Figure 7 shows the top view of Cyclone II – EP2C35F672C6.

REFERENCES

[1] Taghavi T., Ghiasi S. and Sarrafzadeh M., “Routing Algorithms: Architecture-Driven Rerouting Enhancement for FPGAs,” In Proc. Of IEEE International Symposium on Circuits and Systems (ISCAS), 2006.

[2] E. Rijpkema, K. Goossens, A. Radulescu, J.
Dielissen, J. van Meerbergen, P. Wielage, and E. Waterlander,“Trade-offs in the design of a router with both guaranteed and best-effort services for networks on chip”, IEE Proc. on Computers and Digital Techniques, vol. 150, Issue 5, pp. 294-302, September 2003.

IJSER © 2012

http://www.ijser.org

The research paper published by IJSER journal is about Performance Analysis of Routing Algorithms 6

ISSN 2229-5518

[3] Partha Pratim Pande, Michael Jones,,Andre Lanov, “Performance evaluation and Design Trade–Offs for Network –on Chip InterconnectArchitectures” Published by Computer Society, 15 June 2005.
[4] Y.A.Sadawarte, M.A.Gaikwad and Rajendra
M.Patrikar “Review of Switching Techniques for Network-on-Chip Architectures”, International Journal on Computer Engineering & Information Technology,Vol 17, No.22,Special edition 2010, pp. 52-
57.
[5] Erno Salminen,Ari Kulmala, and Timo
D.Hamalainen”Survey of Network-on-chip
Proposals”,WHITE PAPER, MARCH 2008.
[6] Y.A.Sadawarte, M.A.Gaikwad and Rajendra M.Patrikar “Comparative study of switching techniques for Network on chip Architectures” ACM Digital Library http://dl.acm.org/citation.cfm?id=1947940
[7] T.T.Ye, L. Benini, G. D. Micheli “Analysis of Power Consumption on Switch Fabrics in Network routers”. In Proc. Design Automation Conference, 2002.
[8] Ankur Agarwal, Cyril Iskander & Ravi Shankar “Survey of Network on Chip (NoC) Architectures & Contributions”, Journal of Engineering, Computing and Architecture Volume 3, Issue1, 2009.
[9] Parviz Kermani and Leonard Kleinrock “Virtual Cut– Though: A New Computer Communication Switching Technique”, North Holland Publishing Company, Computer Networks 3 (1970) pp. 267-269. [10] E. Rijpkema, K. Goossens, A. Radulescu, J. Dielissen, J. van Meerbergen, P. Wielage, and E. Waterlander, “Trade-offs in the design of a router with both guaranteed and best-effort services for networks on chip”, IEE Proc. on Computers and Digital Techniques, vol. 150, Issue 5, pp. 294-302, September 2003.
[11] A. Kumar, D. Manjunath, and J. Kuri,
Communication Networking: An Analytical Approach, Morgan Kaufmann, 2004.
[12] P. T. Wolkotte, G. J. M. Smit, G. K. Rauwerda, and L. T. Smit, “An energy-efficient reconfigurable circuit switched network-on-chip”, Proc. 19th IEEE International Conference on Parallel and Distributed Processing Symposium, pp. 155-163, 2005.
[13] W. J. Dally and B. Towles. Route packets, not
wires: On-chip interconnection networks. In
Proceedings of the 38th Design Automation Conference,
2001.
[14] C. Albenes, Zeferino Frederico G. M. E. Santo,
Altarniro Amadeu Susin, “ParlS: A parameterizable interconnect switch for Networks-on-Chips”, Proc. ACM Conference, pp. 204-209, 2004.
[15] J. W. Dally and B. Towles, “Route packets, not
wires: On-Chip interconnection networks”, Proc. IEEE International Conference on Design and Automation, pp. 684-689, June 2001.
[16] L. M. Ni and P. K. McKinley. A survey of
wormhole routing techniques in direct networks. IEEE Computer, 26(2):62–76, February 1993.

IJSER © 2012

http://www.ijser.org