Dual-Dependency Attention Transformer for Fine-Grained Visual Classification
Vision transformers (ViTs) are widely used in various visual tasks, such as fine-grained visual classification (FGVC). However, the self-attention mechanism, which is the core module of vision transformers, incurs quadratic computational and memory complexity. Current sparse-attention and local-attention approaches use