Transformers for Tabular Data: A Training Perspective of Self-Attention via Optimal Transport