A Retraction-free Method for Nonsmooth Minimax Optimization over a Compact Manifold
Authors
Necdet Serhat Aybat
Jiang Hu
Zhanwang Deng
Abstract
We study the minimax problem $\min_{x\in M} \max_y f_r(x,y):=f(x,y)-h(y)$, where $M$ is a compact submanifold, $f$ is continuously differentiable in $(x, y)$, $h$ is a closed, weakly convex (possibly nonsmooth) function, and we assume that the regularized coupling function $-f_r(x,\cdot)$ is either $\mu$-PL (Polyak-Lojasiewicz) for some $\mu>0$ or convex ($\mu = 0$) for any fixed $x$ in the vicinity of $M$. To address the nonconvexity due to the manifold constraint, we adopt an exact penalty for the constraint $x \in M$ and enforce a convex constraint $x\in X$ for some $X \supset M$ onto which projections can be computed efficiently. Building on this reformulation of the manifold minimax problem, we propose a single-loop smoothed manifold gradient descent-ascent (sm-MGDA) algorithm. Theoretically, any limit point of the sm-MGDA iterate sequence is a stationary point of the manifold minimax problem, and sm-MGDA generates an $O(\epsilon)$-stationary point of the original problem with $O(1/\epsilon^2)$ and $\tilde{O}(1/\epsilon^4)$ complexity for the $\mu > 0$ and $\mu = 0$ scenarios, respectively. Moreover, for the $\mu = 0$ setting, adopting Tikhonov regularization of the dual improves the complexity to $O(1/\epsilon^3)$ at the expense of asymptotic stationarity. The key component, common to the analysis of all cases, is connecting $\epsilon$-stationary points of the penalized problem to those of the original problem by showing that the constraint $x \in X$ becomes inactive and the penalty term tends to $0$ along any convergent subsequence. To our knowledge, sm-MGDA is the first retraction-free algorithm for minimax problems over compact submanifolds; this is a highly desirable algorithmic property since avoiding retractions dispenses with the matrix orthogonalization subroutines required to compute retractions onto manifolds arising in practice, which are not GPU-friendly.
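As a rough illustration of the retraction-free mechanism described above (not the paper's exact sm-MGDA updates), the sketch below runs a smoothed projected gradient descent-ascent loop on a toy instance: $M$ is the unit sphere, $X$ is the unit ball (so the projection is a cheap rescaling with no orthogonalization), $f(x,y) = y^\top A x - \tfrac12\|y\|^2$ (strongly concave in $y$, so a PL-type condition holds), and $h(y) = \lambda\|y\|_1$. The quadratic penalty $\beta(1-\|x\|^2)$, the smoothing center $z$, and all step sizes are illustrative assumptions, not the paper's construction.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 20, 15
A = rng.standard_normal((m, n)) / np.sqrt(n)   # coupling matrix of the toy f
lam, beta = 0.1, 5.0                           # l1 weight, penalty parameter (illustrative)
eta, tau, gamma, p = 0.05, 0.5, 0.1, 1.0       # step sizes and smoothing weight (illustrative)

def proj_ball(x):
    """Euclidean projection onto X = {x : ||x|| <= 1}; replaces any retraction onto M."""
    nrm = np.linalg.norm(x)
    return x if nrm <= 1.0 else x / nrm

def prox_l1(v, t):
    """Proximal map of t*||.||_1 (soft-thresholding), handling the nonsmooth h."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

x = proj_ball(rng.standard_normal(n))
y = np.zeros(m)
z = x.copy()                                   # proximal (smoothing) center for the x-block

for _ in range(2000):
    # x-step: projected gradient descent on f(x,y) + beta*(1 - ||x||^2) + (p/2)*||x - z||^2;
    # grad of f in x is A^T y, grad of the penalty is -2*beta*x.
    grad_x = A.T @ y - 2.0 * beta * x + p * (x - z)
    x = proj_ball(x - eta * grad_x)
    # y-step: proximal gradient ascent on f_r(x, .) = f(x, .) - lam*||.||_1.
    y = prox_l1(y + tau * (A @ x - y), tau * lam)
    # slow update of the smoothing center, as in smoothed-GDA schemes
    z = z + gamma * (x - z)

print(f"||x|| = {np.linalg.norm(x):.4f}  (the penalty drives x onto M = S^{{n-1}})")
```

Every step above is a gradient evaluation, a soft-threshold, or a rescaling, all of which map directly onto GPU-friendly dense linear algebra; this is the practical point of trading retractions for a penalty plus a projection onto a convex superset $X$.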