Networks-on-Chip, International Symposium on
Download PDF

Abstract

Scaling of interconnects exacerbates the already challenging reliability of on-chip networks. Although many researchers have provided various fault handling techniques in chip multi-processors (CMPs), the fault-tolerance of the interconnection network is yet to adequately evolve. As an end-to-end recovery approach delays fault detection and complicates recovery to a consistent global state in such a system, a link-level retransmission is endorsed for recovery, making a higher-level protocol simple. In this paper, we introduce a fault-tolerant flow control scheme for soft error handling in on-chip networks. The fault-tolerant flow control recovers errors at a link-level by requesting retransmission and ensures an error-free transmission on a flit-basis with incorporation of dynamic packet fragmentation. Dynamic packet fragmentation is adopted as a part of fault-tolerant flow control to disengage flits from the fault-containment and recover the faulty flit transmission. Thus, the proposed router provides a high level of dependability at the link-level for both datapath and control planes. In simulation with injected faults, the proposed router is observed to perform well, gracefully degrading while exhibiting 97% error coverage in datapath elements. The proposed router has been implemented using a TSMC 45nm standard cell library. As compared to a router which employs triple modular redundancy (TMR) in datapath elements, the proposed router takes 58% less area and consumes 40% less energy per packet on average.
Like what you’re reading?
Already a member?
Get this article FREE with a new membership!

Related Articles