Abstract
The authors propose a novel integer linear programming (ILP) model for the scheduling problem in self-recovering microarchitecture synthesis. A self-recovering microarchitecture, on detecting a (transient) fault, rolls back to a previously known correct state - the checkpoint - and retries the computation. The maximum distance between adjacent checkpoints - the retry period - is determined by the transient fault rate as well as the average lifetime of a transient fault. At a checkpoint, the results of intermediate computations are compared (using voters) and, if correct, saved in registers. Consequently, each checkpoint carries a time overhead due to the comparison and an area overhead due to the fault-tolerant nature of the voters. The authors formulate time-constrained scheduling as minimizing either the number of voters or the overall hardware, subject to constraints on the number of clock cycles, the retry period, and the number of checkpoints. Moreover, they develop a model for resource-constrained scheduling in which both the overall system performance and the recovery time overhead are optimized subject to hardware constraints.
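To make the retry-period constraint concrete, the following minimal sketch places checkpoints along a schedule so that no two adjacent checkpoints are further apart than the retry period. It is an illustration only, not the authors' ILP model: the function names `place_checkpoints` and `max_gap`, the implicit initial checkpoint at step 0, and the mandatory checkpoint at the final step are all assumptions made for this example.

```python
def place_checkpoints(num_steps, retry_period):
    """Return checkpoint control steps so that no gap exceeds retry_period.

    Assumptions (not from the paper): step 0 acts as an implicit initial
    checkpoint, and the schedule always ends with a checkpoint.
    """
    if num_steps <= 0 or retry_period <= 0:
        raise ValueError("num_steps and retry_period must be positive")
    # Place a checkpoint every retry_period steps, strictly before the end.
    checkpoints = list(range(retry_period, num_steps, retry_period))
    # Close the schedule with a final checkpoint.
    checkpoints.append(num_steps)
    return checkpoints


def max_gap(checkpoints):
    """Largest distance between adjacent checkpoints, starting from step 0."""
    previous = 0
    widest = 0
    for step in checkpoints:
        widest = max(widest, step - previous)
        previous = step
    return widest


if __name__ == "__main__":
    schedule = place_checkpoints(num_steps=10, retry_period=4)
    print(schedule, max_gap(schedule))
```

This spacing uses the fewest checkpoints that satisfy the retry period for a fixed schedule length. It ignores what the abstract's formulation trades off, namely the voter and register cost incurred at each checkpoint and the interaction with operation scheduling.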