Abstract
Handshape plays an important role in sign languages. It would be inconceivable to try to understand sign language without recognising the handshapes. Over the years, numerous approaches have been proposed for extracting hand configuration information. The existing approaches to handshape recognition have problems, especially with the huge sizes of modern linguistic corpora: computationally expensive methods easily become infeasible with such large amounts of data. In this paper we examine the straightforward and efficient approach of recognising handshapes with our existing image category detection methodology, which involves state-of-the-art local image descriptors. In the experiments the approach produces promising results. On the image feature side, we find that surprisingly complex hierarchical descriptors of shape primitive statistics provide the best overall performance in handshape recognition. The accuracy of feature-wise detections can be improved by fusing several features together. Considering the temporal succession of the hand blobs markedly improves the accuracy over detecting the handshape in each video frame in isolation.