FMMformer: Efficient and Flexible Transformer via Decomposed Near-field and Far-field Attention

Tan M. Nguyen, Vai Suliafu, Stanley J. Osher, Long Chen, Bao Wang

NeurIPS 2021

pdf   bibtex


We propose FMMformers, a class of efficient and flexible
transformers inspired by the celebrated fast multipole method (FMM)
for accelerating interacting-particle simulations. FMM decomposes
particle-particle interactions into near-field and far-field components,
computing the former directly and the latter in a coarse-grained
fashion. Similarly, FMMformers decompose the attention into
near-field and far-field attention, modeling the near-field attention
by a banded matrix and the far-field attention by a low-rank
matrix. Computing the FMMformer attention requires time and memory
that grow only linearly with the sequence length, whereas the standard
transformer's cost grows quadratically. We analyze and validate the advantage of
FMMformers over the standard transformer on the Long Range Arena and
language modeling benchmarks. FMMformers can even outperform the
standard transformer in terms of accuracy by a significant margin. For
instance, FMMformers achieve an average classification accuracy of
60.74% over the five Long Range Arena tasks, which is significantly
better than the standard transformer's average accuracy of 58.70%.
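The banded-plus-low-rank decomposition described above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the band width, the equal weighting of the two components, and the `elu(x)+1` feature map (borrowed from linear transformers) are illustrative assumptions standing in for whatever near-field window and far-field low-rank factorization a concrete FMMformer uses.

```python
import numpy as np

def fmm_style_attention(Q, K, V, band=2):
    """Sketch of FMMformer-style attention on (n, d) arrays.

    Near-field: exact softmax attention restricted to a band of
    half-width `band` around each position -> O(n * band * d).
    Far-field: low-rank (linearized) attention with the positive
    feature map phi(x) = elu(x) + 1 -> O(n * d^2).
    """
    n, d = Q.shape

    # --- near-field: banded softmax attention (the banded matrix) ---
    near = np.zeros_like(V)
    for i in range(n):
        lo, hi = max(0, i - band), min(n, i + band + 1)
        scores = Q[i] @ K[lo:hi].T / np.sqrt(d)
        w = np.exp(scores - scores.max())
        w /= w.sum()
        near[i] = w @ V[lo:hi]

    # --- far-field: low-rank attention, never forming the n x n matrix ---
    phi = lambda x: np.where(x > 0, x + 1.0, np.exp(x))  # elu(x) + 1 > 0
    Qf, Kf = phi(Q), phi(K)
    KV = Kf.T @ V               # (d, d) global summary of keys/values
    norm = Qf @ Kf.sum(axis=0)  # per-query normalizer, shape (n,)
    far = (Qf @ KV) / norm[:, None]

    # combine the two components (equal weights here, for illustration)
    return 0.5 * near + 0.5 * far
```

Both terms avoid materializing the full n-by-n attention matrix: the near-field loop touches only a fixed-width window per query, and the far-field term contracts keys and values into a d-by-d summary first, which is where the linear scaling in sequence length comes from.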