Long-tailed Recognition by Routing Diverse Distribution-Aware Experts
Title | Long-tailed Recognition by Routing Diverse Distribution-Aware Experts |
Publication Type | Conference Paper |
Year of Publication | 2021 |
Authors | Wang, X., Lian, L., Miao, Z., Liu, Z., & Yu, S. X. |
Published in | Proceedings of International Conference on Learning Representations |
Date Published | 05/2021 |
Other Numbers | arXiv:2010.01809 |
Keywords | bias and variance, ensemble model, long-tail distribution |
Abstract | Natural data are often long-tail distributed over semantic classes. Existing recognition methods tackle this imbalanced classification by placing more emphasis on the tail data, through class re-balancing/re-weighting or ensembling over different data groups, resulting in increased tail accuracies but reduced head accuracies. We take a dynamic view of the training data and provide a principled model bias and variance analysis as the training data fluctuates: existing long-tail classifiers invariably increase the model variance, and the head-tail model bias gap remains large due to more and larger confusions with hard negatives for the tail. We propose a new long-tailed classifier called RoutIng Diverse Experts (RIDE). It reduces the model variance with multiple experts, reduces the model bias with a distribution-aware diversity loss, and reduces the computational cost with a dynamic expert routing module. RIDE outperforms the state-of-the-art by 5% to 7% on the CIFAR100-LT, ImageNet-LT, and iNaturalist 2018 benchmarks. It is also a universal framework that is applicable to various backbone networks, long-tailed algorithms, and training mechanisms for consistent performance gains. Our code is available at: https://github.com/frank-xwang/RIDE-LongTailRecognition. |
URL | http://www1.icsi.berkeley.edu/~stellayu/publication/doc/2021rideICLR.pdf |
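The dynamic expert routing described in the abstract can be sketched roughly as follows: experts are consulted one at a time, their logits are averaged, and routing stops early once the prediction is confident enough, saving computation on easy samples. This is a minimal illustrative sketch only; the random "expert" weights, the confidence threshold, and the logit-averaging rule are hypothetical stand-ins, not the paper's actual trained networks or router.

```python
import numpy as np

rng = np.random.default_rng(0)

NUM_CLASSES = 4
NUM_EXPERTS = 3
FEAT_DIM = 8

# Hypothetical "experts": random linear classifiers standing in
# for the paper's trained expert branches (illustration only).
experts = [rng.normal(size=(FEAT_DIM, NUM_CLASSES)) for _ in range(NUM_EXPERTS)]

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def ride_predict(x, threshold=0.9):
    """Consult experts sequentially; stop once the averaged prediction
    is confident enough (the dynamic-routing idea, simplified)."""
    logits_sum = np.zeros(NUM_CLASSES)
    for k, w in enumerate(experts, start=1):
        logits_sum += x @ w
        probs = softmax(logits_sum / k)   # average logits over experts used so far
        if probs.max() >= threshold:      # confident: skip the remaining experts
            return int(probs.argmax()), k
    return int(probs.argmax()), NUM_EXPERTS

x = rng.normal(size=FEAT_DIM)
pred, experts_used = ride_predict(x)
```

In the actual method the router is a small learned module and the experts share early backbone layers; here a fixed confidence threshold plays the router's role purely to show the early-exit control flow.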