MULTILEVEL AUGMENTED LAGRANGIAN SOLVERS FOR OVERCONSTRAINED CONTACT FORMULATIONS ∗

. Multigrid methods for two-body contact problems are mostly based on special mortar discretizations, nonlinear Gauss-Seidel solvers, and solution-adapted coarse grid spaces. Their high computational eﬃciency comes at the cost of a complex implementation and a nonsymmetric master-slave discretization of the nonpenetration condition. Here we investigate an alternative symmetric and overconstrained segment-to-segment contact formulation that allows for a simple implementation based on standard multigrid and a symmetric treatment of contact boundaries, but leads to nonunique multipliers. For the solution of the arising quadratic programs, we propose augmented Lagrangian multigrid with overlapping block Gauss-Seidel smoothers. Approximation and convergence properties are studied numerically at standard test problems.


Introduction
Contact problems are ubiquituous. They show up in many applications such as biomechanics, civil engineering, material sciences, mechanical engineering, and manufacturing just to name a few. For the mathematical formulation and numerical solution of contact problems, clearly the formulation of nonpenetration conditions is an essential ingredient. While in the abstract, continuous setting, the admissible set can be characterized in a straightforward way, the numerical discretization is by no means obvious. Consequently, various strategies have been proposed and investigated.
Nonpenetration conditions can be posed either as inequality constraints to the elastic energy minimization, or as penalties integrating the contact constraints into the energy. Popular examples of the first class are point-to-point, point-to-segment, and segment-to-segment approaches [15,21] enforcing nonpenetration in normal direction at discrete points. The by now established mortar methods, in contrast, are based on a weak formulation of the nonpenetration constraints and enforce them using well-chosen discrete multiplier spaces. Due to the weak formulation, mortar based approaches show superior behaviour in patch tests and surface locking, and allow the derivation of a priori error estimates for displacement and contact stress. However, mortar methods are based on a nonsymmetric master-slave formulation and require the computation of a discrete L 2 projection. Whereas the computation of the L 2 projection for practical application is possible but technically demanding [13], the asymmetry of the approach necessitates a tedious assignment of master and slave sides, which is particularly demanding for self-contact. Pressure smoothing [9] or G 1 -continuous displaced contact surfaces obtained by surface smoothing [14,17] or in isogeometric analysis [15] can recover the otherwise lacking stability of point-to-segment approaches.
The second class of contact formulation includes exterior penalty methods as well as barrier type approaches [22]. Their obvious advantage is the structural simplicity of the resulting unconstrained minimization problems. The drawback is the resulting ill-conditioning of the arising system matrices and the additional error introduced by the penalization at the contact interface.
Obviously, the problem formulation determines the types of solution algorithms that can be applied. For the simplest case, stationary contact problems in linear elasticity, constraint formulations lead to quadratic programs to be solved. Both standard algorithms from optimization, such as projected gradient or active set/SQP methods, and algorithms adapted to the PDE structure at hand, such as multigrid and domain decomposition methods, are in use. The latter ones exploit the structural similarity of contact problems to linear elliptic equations of solid mechanics, for which they are optimal. The adaptation to the constraints, however, is technically demanding, leading to complex implementations. For penalty formulations, a nonlinear but unconstrained problem has to be solved, for which again standard NLP algorithms based on gradient or Newton methods can be employed, or nonlinear multigrid methods.
Of course, the constraint and penalty formulations do not define strictly distinct algorithmic classes. A prominent example of a hybrid is the augmented Lagrangian method, which converges to the constrained solution but solves only unconstrained subproblems with finite penalty parameter. In passing we note that Nitsche's method, which can be interpreted as a direct elimination of the multiplier in an augmented Lagrangian formulation by the first order multiplier update formula, provides an intermediate contact formulation between constraints and penalties.
Desirable properties of contact algorithms are symmetry of nonpenetration conditions (removing the necessity of a priori assignment of master and slave sides, in particular in self-contact), optimal complexity of the solver, and simplicity of implementation. We emphasize, that existing approaches fail to satisfy all these requiremens simultaneously. E.g., the by now established mortar methods are capable of solving contact problems, including self-contact, efficiently and with high accuracy. However, based on our experience, they require time-consuming and sophisticated implementation, in particular for curved surfaces.
In order to obtain a simpler method, we allow for non-uniqueness of the multiplier, since contact stresses can be obtained in several ways by postprocessing. We therefore base our approach on a symmetric overconstrained formulation, with arbitrarily placed pointwise constraints. For solving the resulting quadratic programs, we use an augmented Lagrangian strategy together with a nonlinear multigrid using local QP solvers as smoother. The resulting solution method is investigated numerically using standard test examples.

Continuous contact formulation
We will start with defining linear contact problems in a continuous setting. For small strains, they are interesting in their own right, but they also arise as subproblems in SQP methods for nonlinear, finite strain contact problems. Subsequently, we will discuss their discretization.

Problem setting.
Let Ω be a finite union of bounded domains Ω i ⊂ R d , d ∈ {2, 3}, with Lipschitz boundaries and disjoint closure. The elastic bodies to be considered are assumed to occupy the domains Ω i in reference configuration, which we assume to be void of self-contact. We denote by H 1 (Ω) the usual Sobolev space of weakly differentiable functions on Ω with weak derivatives in L 2 (Ω). The sought displacement u ∈ H 1 (Ω) is a minimizer of the quadratic and elliptic total energy subject to the Dirichlet boundary conditions u = u D on Γ D ⊂ ∂Ω with meas(Γ D ) > 0. Furthermore, we impose linearized nonpenetration conditions in terms of the usual gap function g Ω : ∂Ω → R ∪ {∞}, as an approximation to general nonpenetration [7]. For a union Ω of finitely many disjoint bounded domains with Lipschitz boundaries, it is defined almost everywere by in terms of the outer unit normal n(x), which exists for almost all x ∈ ∂Ω according to Rademacher's theorem. If g(x) is finite, we denote the opposing point, i.e. the potential contact partner, by x = x + g(x)n(x).
Writing J(u) = a(u, u) + f, u with a H 1 (Ω)-elliptic bilinear form a, the displacement u is a minimizer of subject to boundary conditions u| Γ D = 0 and linearized nonpenetration conditions We denote the continuously admissible set by

Discretization
In order to actually compute an approximate correction u, the continuous problem (2) has to be discretized. For the displacement u, we use a standard finite element ansatz space V K ⊂ H 1 (Ω) of order P and dimension n = n K on a conforming triangulation T of Ω, i.e. V K = {u ∈ C(Ω) | ∀T ∈ T : u| T ∈ P P }. By that discretization, the continuous energy (2) is replaced by its finite dimensional approximation For discretizing the infinitely many constraints (3), we rely on sampling on a sufficiently large set X c = {x i ∈ ∂Ω β | i = 1, . . . , m} of finitely many points, and define the discretely admissible set which results in a true outer approximation of the feasible set in the form Of course, sample points x i with no potential contact partner x i found, i.e. g(x i ) = ∞, can be omitted. This setting includes two-pass point-to-segment and segment-to-segment contact formulations depending on the distribution of sample points x i , but is different from integrated constraints used in mortar methods.
Remark 2.1. A quadrature-based mortar approximation can easily be derived from (6) by forming appropriate linear combinations of the rows, corresponding to integrating the gap function with mortar basis functions.
This setup has both structural advantages and disadvantages, which we discuss now. On one hand, the ansatz is symmetric, which eliminates the need to distinguish between master and slave sides of the contact boundaries. This is particularly convenient in multi-body contact situations and for self-contact. It is also simple to implement since only ray-surface intersections have to be computed instead of proper integration over nonmatching surface grids as necessary for mortar methods, and no multiplier spaces need to be constructed.
On the other hand, the overconstrained contact formulation in general leads to non-unique multipliers, i.e. the active constraints need not be linearly independent. This violation of the inf-sup condition renders the solution of the Karush-Kuhn-Tucker system unstable if all contact constraints are treated as equality constraints [3].
Since there is usually no one-to-one relationship between displacement degrees of freedom and constraints in the overconstrained contact discretization, efficient yet involved nonlinear Gauss-Seidel smoothers [11,20] cannot be used in a multigrid context, such that other approaches are required. A straightforward penaltybased method is worked out in Sec. 3. It is also more general and allows for including, e.g., incompressibility constraints.

Multilevel solution with overlapping block smoothers
Multilevel approaches, such as geometric multigrid, algebraic multigrid, or H 2 -matrices, provide optimal solution methods for symmetric positive definite linear systems. They rely on a hierarchy of -usually nested -approximation spaces V 0 ⊂ · · · ⊂ V K−1 ⊂ V K , transfer operators between these spaces, and approximate solvers, which are called smoothers. The optimiality of multigrid methods depends on the interplay between the approximation quality of the hierarchy of spaces and of the properties of the smoothers, cf. [2,8]. In the case of constrained problems, it is necessary to modify the smoothers, and possibly also the hierarchy of approximation spaces, maintaining optimality.
Here, we present a multilevel method particularly designed for quadratic problems of the form min x∈R n subject to the inequality constraints arising from the discretization of (2)-(3). The symmetric positive definite matrix A ∈ R n,n is the stiffness matrix corresponding to a(·, ·), cf. (2). The matrix B ∈ R m,n and the vector b ∈ R m represent the discretized pointwise nonpenetration constraint (3).

Multigrid hierarchy
As multilevel hierarchy we choose the conforming finite element spaces V 0 ⊂ · · · ⊂ V K−1 ⊂ V K of Lagrangian finite elements on a sequence of nested conforming meshes. We set dim(V k ) = n k . The interpolation I l k : R n k → R n l , k ≤ l ≤ K is realized by natural injection, and the restriction as R l k = (I l k ) T . The stiffness matrix on level k is taken as A k = (I K k ) T AI K k . We note that I l k = I l k+1 I k+1 k for 0 ≤ k < l and therefore A k = (I k+1 We furthermore assume that there exists a constant γ > 1 such that n k+1 ≥ γn k . For geometric multigrid, usually γ = 2 d . This ensures the optimal complexity O(n) of one step of a multigrid V -cycle.
In our approach, the constraints are exclusively taken care of by the smoothers on the different levels. However, no coarsening of the constraints is performed. We correspondingly denote the constraints on the This ensures that the coarse level corrections lead always to feasible iterates.

Block Gauss-Seidel smoother
In contrast to unconstrained linear problems, a nonlinear Gauss-Seidel smoother has to solve nonlinear or constrained local scalar problems. It is well known that constrained Gauss-Seidel methods are only guaranteed to converge for box constraints [4], which is not the case for two-body contact. As a remedy, a transformation of the Lagrange basis into joint or glued motion on a linear subspace and the jump, which has to fullfill box contraints, can be used. This, however, requires a special structure of the constraint discretization achieved by dual mortar formulations [19].
Standard Gauss-Seidel can be generalized to block Gauss-Seidel, which solves local, but multi-dimensional, nonlinear or constrained sub-problems. It is known to converge, if the block structure separates the constraints, i.e. if B in (8) is block-diagonal [4,10]. Interestingly, block Gauss-Seidel can often be observed in practice to converge, even if B is not block-diagonal. In this case, usually overlapping decompositions are employed.
Following this approach, we construct an overlapping block Gauss-Seidel method as a smoother for the multigrid method. The block construction is motivated by the following two goals: (1) Coverage: every degree of freedom (dof) i has to be covered (2) Row consistency: degrees of freedom coupled to a dof i need to be treated jointly with i Thus we define a family of index sets on level k by and the set of blocks as We note that often |S k | < n k holds, since index sets of different degrees of freedom can coincide. This is particularly prominent for higher order finite element discretizations. We also note that for interior degrees of freedom i, i.e. those associated to Lagrangian basis functions vanishing on the boundary, the corresponding index set is S k,i = {i}. Thus, on the interior degrees of freedom, a standard pointwise Gauss-Seidel smoother results. The smoother application for a QP of the form on level k is then given by solving sequentially the local QPs for each s ∈ S k : Algorithm 1: Smoother k (r, q).
Here, (A k ) ss = ((A k ) ij ) i,j∈s denotes the submatrix corresponding to row and column indices contained in the index set s. (A k ) s and (B k ) s are defined by the corresponding column selection. The advantage of this decomposition is that it can easily be constructed in a purely algebraic way and that it leads to local QPs of moderate size with high arithmetic density.

Multigrid V-cycle
In a multigrid setting, the block Gauss-Seidel smoother, cf. Alg. 1, is applied on each level 0 ≤ k ≤ K, starting from the finest level k = K down to the coarse grid k = 0 and up again to the fine level, forming the standard multigrid V-cycle of Alg. 2, see [2,8].
In contrast to pointwise smoothers of fixed size, here the shape of B k depends on the level k, see (9). While the number |S k,i | of degrees of freedom for all local QPs is in general about the same, by design the number of constraints grows for coarser levels as O(2 (d−1)(K−k) ) for red refinement. With a fixed number of, e.g., projected gradient or projected Newton steps, the complexity of a local smoother application involves a computational effort of O(2 (d−1)(K−k) ). Since there are O(2 (d−1)k ) boundary degrees of freedom, but n K = O(2 dK ) total degrees of freedom in quasi-uniform mesh hierarchies, the complete smoother effort during a V-cycle is of order Thus, one V -cycle with the block Gauss-Seidel smoother from Alg. 1 has an optimal computational complexity of O(n K ) despite the growing size of local QPs.
As a result of the smoother design, the coarse grid corrections are always feasible. A drawback might be that the intersection U h ∩ V k of feasible set and coarse space can be too small to allow for a good approximation of the error [10,12], leading to little progress in each iteration. In fact, this can be observed in the convergence of the block Gauss-Seidel multigrid, see Fig. 3 below, rendering this approach inefficient.
A highly effective, but rather complex remedy is a solution-dependent choice of the multigrid hierarchy [12]. In the next section, we consider a much simpler approach based on a relaxation of the constraints.

Augmented Lagrangian relaxation
Let us consider an augmented Lagrangian approach, where a moderately large penalization of interpenetration relaxes the strict feasibility requirement and allows for larger correction steps. The augmented Lagrange functional for the linearized problem (7)-(8) is Here, · denotes an arbitrary norm on R n (we use a diagonal weighting based on quadrature formulas such that the penalty can be interpreted as continuous L 2 penalty independent of the mesh width), γ > 0 a penalty factor, and v + = max(0, v) the plus function. The augmented Lagrangian method alternatingly minimizes L with respect to x for given λ, i.e.
x j+1 = arg min and performes an update of the multiplier λ for given x, e.g., the first order multiplier update λ j+1 = min 0, λ j + γ(Bx j+1 − b) realizing a gradient method for the Lagrangian dual problem. For sufficiently large, but finite penalty factor γ, this iteration in (x, λ) is known to converge linearly with a rate O( 1 γ ), see, e.g., [1,16]. The penalty parameters are usually much smaller than those required for reasonable accuracy without augmentation, which provides a beneficial relaxation of the feasible set.
The minimization problem (13) in x can be solved exploiting the standard multilevel hierarchy presented in Sec. 3.1. Again, the smoother is taken as overlapping block Gauss-Seidel iteration. In contrast to Sec. 3.2, local penalized subproblems are to be solved instead of local QPs, changing line 3 of Alg. 1 to solve min In contrast to the QP smoother in Sec. 3.2, a block Jacobi method can be used instead of block Gauss-Seidel. This allows for easier parallelization, exploiting the high arithmetic density of solving the small, dense subproblems. In this case, descent can be enforced by a global linesearch.

Numerical examples
The implementation of the method requires only a few standard components to be glued together: ray-line (2D) or ray-triangle (3D) intersection tests, for which we use the boost.geometry R-tree implementation [18], (penalized) QP solvers for small to mid-sized problems, where we use a semismooth Newton method for penalized relaxations and a quickly convergent augmented Lagrangian iteration, and finally an finite element solver with geometric multigrid, where we rely on Kaskade 7 [5,6]. With this setup, we present a few illustrative numerical examples.

Patch test
Patch tests are simple test problems intended to show that spatially constant normal tractions are transmitted correctly across the nonmatchingly discretized contact interface. Though neither necessary nor sufficient for stability and covergence, passing the patch test is generally considered a requirement for reasonable contact discretizations [3].
We The analytical solution exhibits a constant displacement derivative, and is therefore contained in all conforming piecewise polynomial finite element spaces. If the nonpenetration discretization (5) is not only an outer approximation, but also guarantees that any infeasible configuration violates (5), the minimization of the discretized problem (4) subject to (6) yields the exact solution. This is the case for P1 and Q1 elements if the boundary vertices are included in the contact sampling set X c , since any (infinitesimal) interpenetration must result in at least one vertex crossing the opposite boundary due to polygonal deformed configurations. Consequently, the computational results are exact down to the algebraic QP solver accuracy.
For higher ansatz order, preventing interpenetration at m e equidistant sample points on each boundary edge cannot guarantee the continuous feasibility of the deformed configuration. Consequently, the computed stress distribution will no longer be constant. Increasing the number of contact sample points can, however, be expected to increase the accuracy. This is indeed the case, as shown in Fig. 1 right. It is also apparent that the local accuracy of stresses does not improve on refinement. This can be attributed to the facts that first, the exact solution is already attainable on the coarse grid and second, that a finer discretization further enlarges the discretely feasible set, which is just counterbalanced by the correspondingly increasing number of contact constraints. Accordingly, the second order results are better than for third order FE.
For approximating two-pass mortar formulations, integration of the gap function weighted by p + 1 Bezier functions of the FE ansatz order has been used for m e ≥ p + 1, with the trapezoidal rule as quadrature. For growing m e , the mortar-like contact discretization improves due to better quadrature, but is not as accurate as the pointwise constraints.

Hertzian contact
As a standard test case, we consider Hertzian contact of two parallel cylinders of radius 0.4 and 0.6, respectively, with a linear St. Venant-Kirchhoff material with Lamé parameters λ = µ = 1 just for testing purposes. Due to symmetry, the problem is solved on a cross section consisting of two hemi-circles, with perimeter discretized by t points. The contact pressure reconstructed pointwise from the computed displacement is shown in Fig. 2, left, for different discretizations, and exhibits the expected convergence and reaches, after a few Newton steps, an adjoint residual at machine precision and a complementarity error around 10 −10 . Note that due to the pointwise reconstruction, the volume finite element discretization error shows up in the pressure reconstruction, e.g., as nonvanishing pressure outside the contact region.
The augmented Lagrangian approach allows for both, a Gauss-Seidel and a Jacobi block smoother in the multigrid V-cycle. The contraction, i.e. the ratio δx K,i+1 / δx K,i of consecutive V-cycle corrections δx K,· averaged over several iterations, of both smoothers is depicted in Fig. 2, right, for different mesh refinement levels. While both show mesh-independent convergence rates, as opposed to a single-level Gauss-Seidel iteration, the Jacobi smoother is less effective than the Gauss-Seidel smoother -probably due to blocks overlapping to a different extent.
As mentioned in Sec. 3.2, enforcing strict feasibility of corrections leads to unsatisfactory contraction rates, see Fig. 3. In contrast, the penalty relaxation leads to rapid convergence. Compared to a pointwise Gauss-Seidel smoother, the overlapping block Gauss-Seidel smoother improves the robustness of multigrid contraction with respect to the penalty parameter γ considerably. Penalty parameters larger by two to three orders of magnitude can be used, see Fig. 4. This is due to the joint displacement of contact boundaries, which incurs a condition number growing linearly with γ and dominates convergence for large γ. Since the contact penalty has the structure of a mass matrix, the growing total number of smoother iterations performed on deeper mesh hierarchies leads to faster convergence on finer meshes in this situation.

Conclusions
The combination of overconstrained segment-to-segment contact, augmented Lagrangian relaxation, and multigrid solvers shows promising results with competitive accuracy, while being particularly simple to implement. Convergence theory, currently missing, and refined control of algorithmic details will be subject of future research.