Abstract
Presents a method for parametrised partitioning of multidimensional programs for acceleration using a hardware coprocessor. The method involves a divide-and-conquer structure, with the "divide" and "merge" phases carried out by a general-purpose processor, while the "conquer" phase is handled by application-specific hardware. The partitioning strategy has been captured in a simple functional language, and we have automated the production of partitioned programs in this language. Our approach has been tested on an FPGA-based system using a number of computer vision algorithms, including the Canny edge detector, and the performance is compared against executing the programs on the PC host.<>