Transformers for Tabular Data: A Training Perspective of Self-Attention via Optimal Transport