2007 IEEE International Parallel and Distributed Processing Symposium

Abstract

Feature Pyramid Network (FPN) is a widely used feature extractor in computer vision that combines multi-level features to produce discriminative pyramidal representations. However, integrating multi-scale information with a simple Sum or Concatenate operation on features is not sufficient to obtain discriminative semantic representations. In this paper, we propose a dynamic feature pyramid network (DyFPN) that merges multi-scale information in both features and weights. DyFPN uses high-level context features together with low-level spatial structural features to generate a dynamic convolution kernel that contains multi-scale information. In this manner, each resolution in the pyramid performs its own adaptive convolution directly, which also strengthens the information flow. In particular, DyFPN can be regarded as a complementary enhancement to existing feature pyramid networks. We analyze the effective receptive field and attention map of DyFPN; the analysis indicates that our method captures more local and global information than merging multi-scale information at the feature level only. Benefiting from these multiple ways of integrating multi-scale information, our method outperforms existing feature pyramid methods on COCO detection tasks by a large margin.
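
To make the idea of merging in both features and weights concrete, the following is a minimal toy sketch in plain Python. It is not the DyFPN architecture from the paper. It assumes single-channel feature maps, a 2x resolution gap between levels, global average pooling as the context summary, and two hypothetical projection tables `w_high` and `w_low` that turn the pooled statistics into a 3x3 kernel. All function names are illustrative. The sketch only shows the general pattern: the features are summed as in a standard FPN, and the kernel applied to the merged map depends on both the high-level and the low-level input.

```python
import math

def global_mean(fmap):
    """Average of all values in a 2D feature map (list of rows)."""
    return sum(sum(row) for row in fmap) / (len(fmap) * len(fmap[0]))

def dynamic_kernel(high, low, w_high, w_low):
    """Build a 3x3 kernel from both inputs (hypothetical generator).

    Each kernel logit mixes the pooled high-level context with the pooled
    low-level statistic; a softmax normalises the nine weights to sum to 1.
    """
    g_high, g_low = global_mean(high), global_mean(low)
    logits = [w_high[i][j] * g_high + w_low[i][j] * g_low
              for i in range(3) for j in range(3)]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [[exps[3 * i + j] / total for j in range(3)] for i in range(3)]

def conv3x3(fmap, kernel):
    """3x3 cross-correlation with zero padding; output keeps the input size."""
    h, w = len(fmap), len(fmap[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for i in range(3):
                for j in range(3):
                    yy, xx = y + i - 1, x + j - 1
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += kernel[i][j] * fmap[yy][xx]
            out[y][x] = acc
    return out

def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a 2D feature map."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out

def dynamic_merge(high, low, w_high, w_low):
    """Sum-style merge of two levels, then an input-dependent convolution."""
    up = upsample2x(high)
    merged = [[a + b for a, b in zip(r_up, r_low)]
              for r_up, r_low in zip(up, low)]
    kernel = dynamic_kernel(high, low, w_high, w_low)
    return conv3x3(merged, kernel), kernel

# Toy inputs: a 2x2 high-level map and a 4x4 low-level map.
high = [[1.0, 2.0], [3.0, 4.0]]
low = [[0.5] * 4 for _ in range(4)]
# Hypothetical projection tables; in a trained network these would be learned.
w_high = [[0.1 * (i + j) for j in range(3)] for i in range(3)]
w_low = [[0.2 * (i - j) for j in range(3)] for i in range(3)]
out, kernel = dynamic_merge(high, low, w_high, w_low)
```

In this sketch, changing either input changes the kernel as well as the merged features, which is the contrast with a static convolution applied after a plain Sum or Concatenate merge.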