For embedded applications with data-level parallelism, a vector processor offers high performance with low power consumption and low design complexity. Unlike superscalar and VLIW designs, a vector processor scales well and can be configured to match the requirements of a specific application.