Abstract
We propose a hybrid framework for tracking multiple articulated humans from a single camera. Our method combines an offline-learned category-level detector with an online-learned instance-specific detector. To handle humans with large pose articulation, who cannot be reliably detected by offline-trained detectors, we propose an online-learned, instance-specific, patch-based detector consisting of layered patch classifiers. Given tracklets extrapolated by the online-learned detectors, we use discriminative color filters, also learned online, to compute the appearance affinity score for further global association. Experimental evaluation on both standard pedestrian datasets and articulated human datasets shows significant improvement over state-of-the-art multi-human tracking methods.