Abstract
In this study we compare human and machine acceptability judgments for extreme variations in sign productions. We gathered acceptability judgments from 26 signers and scores from three different automatic gesture recognition (AGR) algorithms that could potentially be used for automatic acceptability judgments; in that case, the correlation between human ratings and AGR scores may serve as an 'acceptability performance' measure. We found high human-human correlations and high AGR-AGR correlations, but low human-AGR correlations. Furthermore, in a comparison between the acceptability and classification performance of the different AGR methods, classification performance was found to be an unreliable predictor of acceptability performance.