Abstract
Activation patterns across recurrent units in recurrent neural networks (RNNs) can be thought of as spatial codes of the history of inputs seen so far. When trained on symbolic sequences to perform next-symbol prediction, RNNs tend to organize their state space so that "close" recurrent activation vectors correspond to histories of symbols yielding similar next-symbol distributions [1]. This makes it possible to build simple finite-context predictive models on top of the recurrent activations by grouping close activation patterns via vector quantization. In this paper, we investigate an unsupervised alternative for organizing the state space. In particular, we use a recurrent version of the Bienenstock, Cooper and Munro (BCM) network with lateral inhibition [2] to map histories of symbols into activations of the recurrent layer. Recurrent BCM networks perform a kind of time-conditional projection pursuit. We compare the finite-context models built on top of BCM recurrent activations with those constructed on top of RNN recurrent activation vectors. As a test bed, we use two complex symbolic sequences with rather deep memory structures. Surprisingly, the BCM-based model achieves performance comparable to or better than that of its RNN-based counterpart. This can be explained by the familiar information-latching problem that arises in recurrent networks when longer time spans must be latched [3, 4].
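To make the construction concrete, the following is a minimal sketch of how a finite-context predictive model could be built on top of recurrent activations: activation vectors are vector-quantized and a next-symbol distribution is estimated for each codebook region. The function names, the use of k-means as the quantizer, and the Laplace smoothing are illustrative assumptions, not the exact procedure used in the paper.

```python
# Sketch: finite-context predictive model via vector quantization of
# recurrent activation vectors. The quantizer (k-means) and smoothing
# scheme are assumptions for illustration only.
import numpy as np
from sklearn.cluster import KMeans


def build_quantized_predictor(activations, next_symbols, n_clusters, n_symbols):
    """Cluster recurrent activation vectors and estimate a next-symbol
    distribution for each cluster (Laplace-smoothed counts)."""
    kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(activations)
    counts = np.ones((n_clusters, n_symbols))  # Laplace smoothing
    for cluster_id, sym in zip(kmeans.labels_, next_symbols):
        counts[cluster_id, sym] += 1.0
    probs = counts / counts.sum(axis=1, keepdims=True)
    return kmeans, probs


def predict_next_symbol_distribution(kmeans, probs, activation):
    """Map a new activation vector to its codebook region and return the
    estimated next-symbol distribution for that region."""
    cluster_id = kmeans.predict(activation.reshape(1, -1))[0]
    return probs[cluster_id]
```

The same construction applies whether the activation vectors come from a trained RNN or from the recurrent BCM network; only the source of the activations differs.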