Prioritized optimization by Nash games : towards an adaptive multi-objective strategy

. This work is part of the development of a two-phase multi-objective diﬀerentiable optimization method. The ﬁrst phase is classical: it corresponds to the optimization of a set of primary cost functions, subject to nonlinear equality constraints, and it yields at least one known Pareto-optimal solution x (cid:63)A . This study focuses on the second phase, which is introduced to permit to reduce another set of cost functions, considered as secondary, by the determination of a continuum of Nash equilibria, { x ε } ( ε ≥ 0), in a way such that: ﬁrstly, x 0 = x (cid:63)A (compatibility), and secondly, for ε suﬃciently small, the Pareto-optimality condition of the primary cost functions remains O ( ε 2 ), whereas the secondary cost functions are linearly decreasing functions of ε . The theoretical results are recalled and the method is applied numerically to a Super-Sonic Business Jet (SSBJ) sizing problem to optimize the ﬂight performance.


Introduction
In the engineering environment, the process of numerical optimization is complex due to several coupled elements that we outline in the example of the shape optimization of a civil aircraft wing or configuration.The process is • multi-disciplinary, since aerodynamic forces, structural and thermal loads, acoustics, in particular, should be assessed and, or optimized; • multi-objective, since, e.g. in the aerodynamic design alone, it requires the control of cost and constraint functions related to forces (drag and lift), and moments (pitching moment in particular); 1 Université Côte d'Azur, Inria, CNRS, LJAD, 2004 Route des Lucioles, BP 93, 06902 Sophia Antipolis Cedex (France); e-mail: jean-antoine.desideri@inria.frregis.duvigneau@inria.fr • multi-point, since the design must validated over the entire flight envelope, from the subsonic regime of take-off and landing to the transonic cruise regime, and also for certain extreme conditions in view of certification.Additionally most of the functions of interest are not directly expressed in terms of the optimization variables, such as geometrical parameters, but are instead functionals of state variables through the partial-differential equations of fluid dynamics, elasticity and of all other physical disciplines to be accounted for, via a complex modeler-mesher-solver coupled sequence [7].As a result, these functions are evaluated at high cost, and the calculation of the gradient is usually not trivial.
Consequently, many objective functions are potentially significant in the design optimization process and some may reveal, at the stage of an a posteriori validation, more critical than anticipated.Hence, if a preliminary phase of optimization accounted for the most essential design objectives, a second phase may turn out to be necessary to correct the design in view of reducing additional cost functions.The second phase should be conducted in a way such that the optimality of the first design should be the least possible degraded.Our approach is precisely meant to introduce this type of adaptivity in the multi-objective optimization process.
In Section 1, the problematics is stated in mathematical terms, a theoretical result formulated, and a numerical method proposed.In Section 2 the method is applied to a Super-Sonic Business Jet (SSBJ) sizing problem devised for optimal flight performance.

Mathematical problem definition
One considers a multi-objective differentiable optimization problem involving a set of M cost functions {f j (x)} (j = 1, . . ., M ; x ∈ Ω a ⊆ R n ), the first m (1 ≤ m < M ) of which are thought of preponderant, or primary, and the remaining ones (j = m + 1, . . ., M ), secondary, and all are subject to K equality constraints, c k (x) = 0 (k = 1, . . ., K). Objective and cost functions are all smooth, say C 2 (Ω a ).
A first phase of optimization (omitted here) has led to the knowledge of a point x A in the open interior of the admissible domain Ω a that is Pareto-optimal for the minimization of the sole primary cost functions under the constraints.
The second phase of optimization is aimed to construct a continuum of Nash equilibria {x ε } parameterized by ε (ε ≥ 0) in way such that: firstly, x 0 = x A (compatibility), and secondly, for ε sufficiently small, the Pareto-optimality condition of the primary cost functions remains O(ε 2 ), whereas the secondary cost functions are decreasing functions of ε (linearly, or faster).

Preliminary calculations
The preliminary calculations have been defined in full in [4].They are made of three main steps: • Definition of a convex "Primary Steering Function" f + A (x) • Definition of a "Territory Splitting" • Definition of a "Secondary Steering Function" f B (x).The cost functions are all assumed to be strictly positive.This may require the preliminary application of an adequate exponential transform.This permits us to consider gradients in the logarithmic form, and examine the variations of these functions in terms of proportions.
The Pareto-stationarity condition satisfied at x = x A is expressed as follows: where the supercript on any symbol denotes throughout an evaluation at x = x A , ∇ is the gradient operator w.r.t.x, {α j } (j = 1, . . ., m) are the known coefficients of a convex combination, P is the projection operator onto the linear subspace orthogonal to all constraint gradients, {∇c k } (k = 1, . . ., K).Then the primary steering function is defined as follows: where c is a nonnegative constant adjusted to enforce the local convexity of f + A (x) and the associated Lagrangian.Then, by virtue of (1), the primary steering function is stationary and convex at x A , and therefore achieves a local minimum.
The territory splitting is based on the orthogonal decomposition of the following n × n reduced Hessian matrix [4,5]: where A is strictly-positive definite by convexity fix.The eigenvectors, column-vectors of matrix Ω, are conventionally arranged in a special way: the first K are associated with the directions of the constraint gradients (null space of P), and the remaining ones are ordered by decreasing eigenvalue.In this way, the tail eigenmodes are associated with the smaller sensitivities of the primary steering function.The splitting is associated with the following change of variables: where u ∈ R n−p , v ∈ R p , and p is an integer free parameter to be chosen such that and 1 ≤ p < n − K.In this setting, the following notations are used: and the following scaled logarithmic gradients are associated with the secondary cost functions: where the scaling matrix S = (∇ 2 vv f + A ) is the lower p × p diagonal block of matrix H of (3), positive-definite by convexity fix.Hence these gradients reflect the sensitivities of the secondary cost functions when v varies, and a scaling by elements related to the primary steering function as well.Following the strategy employed in the Multiple Gradient Descent Algorithm (MGDA) [4], one identifies the minimum-norm element ω B in the convex hull of the gradients g j : and the coefficients {α j } of this convex combination are used to define the secondary steering function: and this completes the preliminaries.

Nash game formulation and theoretical results
One defines a continuation parameter ε (0 ≤ ε ≤ 1) and the auxiliary cost function: For every fixed value of ε, two virtual players, P layer A and P layer B, respectively controling the sub-vectors u and v in (4), are involved in a Nash game defined by the following strategies: -Strategy A : P layer A attempts to minimize f + A X(u, v) by the strategy u, subject to the equality constraint c X(u, v) = 0, and by accounting for P layer B's fixed strategy v, whereas: -Strategy B : P layer B attempts to minimize f AB (X(u, v)) by the strategy v, subject to no constraints, but by accounting for P layer A's fixed strategy u.The following results have been established [4]: • The Nash equilibrium point x ε exists for all sufficiently small ε, and an essential property referred to as compatibility or consistency.
which reflects the fact that, as ε varies, the Pareto-optimlity of the primary cost functions is degraded by O(ε 2 ) only.
Hence, the secondary steering function f B (x) decreases linearly as ε increases, and by virtue of properties of the MGDA construction, all the secondary cost functions do as well.Thus the objectives of our construction are achieved.

Numerical implementation
The present method can be applied formally to simple cases defined analytically.In the more general case in which the formal derivation is not at hand, we propose to resort to models of substitution, and formulate the Nash game in terms of these meta-models.Precisely, quadratic models can be constructed once for all for the two steering functions based on local differential elements at x = x A (typically approximated by central differencing), and function values over a database to account on second derivatives more globally.For constraints, quadratic meta-models can also be constructed, but it is recommended to upgrade these models, as the determination of the continuum of Nash equilibria proceeds, because constraints participate critically to the definition of Paretooptimality, and this notion cannot be altered legitimately.Such techniques have been implemented over the software platform http://mgda.inria.fraccording to procedures fully defined in [6].

Application to a SSBJ sizing problem for optimal flight performance
The test-case is inspired from Multi-Disciplinary Optimization (MDO) problematics raised by Dassault Aviation within the ANR (French Research Agency) "Optimisation Multidisciplinaire" Project (2006-2009) documented in [2,3].Dassault Aviation had provided partners with a software permitting to exercise numerical procedures for multidisciplinary or multi-objective problems in flight-mechanics design [7].This software has been used again here for purpose of illustration of our method for prioritized multi-objective optimization.
The software is made of several procedures that calculate certain performance quantities according to classical laws of flight-mechanics, the Breguet laws in particular, as functions of 15 design variables {X i } (i = 1, . . ., 15), all subject to prescribed interval constraints X i,min ≤ X i ≤ X i,max .For the design of a SuperSonic Business Jet (SSBJ) these variables and their bounds are described in Table 1 1.Physical design variables in the flight-mechanics test-case and their specified bounds In order to respect the prescribed interval constraints, we have used, for each design variable X i , a transformation of the following type and used the dimensionless vector x = {x i } as the optimization variable.
The software permits to compute in particular • mass at take-off, M (to be minimized), • range, R (to be maximized), • approach speed, V (to be minimized), • take-off distance, D (to be minimized, or maintained below a specified upper bound).
In the present test-case, the take-off distance D was subject to the following inequality constraint: This led us to introduce an additional variable, x 16 , as a slack variable, and to replace the above inequality by the following equality constraint: thus increasing by one the dimension of the vector x.
Three cost functions were defined in relation to mass, range and approach speed: where starred quantities correspond to initial values at x = x A and are case-dependent as well as the positive numerical parameter γ.In this setting, the parameter γ appears as a scaling factor in the expression of the initial gradients of the cost functions, since it permits to render their magnitudes close to 1.The form given to (15) and ( 17) reflects the objective to minimize mass and approach speed, and to (16) to maximize range.The three functions f 1 , f 2 , f 3 are uniformly positive and equal to 1 initially: 2.1.Problematics, Pareto front and prioritized optimization.
The cost functions f 1 and f 2 representing mass and range have been prioritized, and f 3 representing approach speed considered as a secondary cost function, all three functions being subject to the constraint c 1 (x) = 0 related to take-off distance.
In a first experiment, not involving the slack-variable x 16 , the mass-range Pareto front associated with the pair (f 1 , f 2 ) subject to the equality constraint c 1 (x) = 0 handled by penalization was determined approximately by applying the Pareto Archived Evolution Strategy (PAES) [1].This task revealed more laborious than we had anticipated due to the small scales affecting the design variables.After some 100,000 evaluations of the functions, eliminating all the points violating the constraint (13) and all the dominated points, resulted in the approximate Pareto front represented by the symbols in black of Figure 1.
Five points A, B, C, D and E associated with mass-range Pareto-optimal solutions subject to the distance constraint were selected from the right-most point of the approximate discrete Pareto front (largest mass and range), to the left-most point (smallest mass and range).Each of these points was used to conduct an independent numerical experiment, in which the continuum of Nash equilibria originating from it has been calculated with the objective to diminish the approach speed.In this way, we obtained five different paths initiated from the front and represented on Figure 1 in different colors.
We refer to [6] for the precise description of this experiment.Here, the discussion is limited to the numerical results.
Let us first examine the most standard case corresponding to the green path initiated at B. In this experiment, the parameter γ was set to 10 to better visualize the variations of the functions.Evidently, it is confirmed that the path is tangent to the Pareto front and constitutes a satisfactory approximation of it over a significant segment.
A great part of the path remains relatively close to the front.For example, the point B' at which ε/ε max = 0.8, where ε max is the theoretical upper bound for convexity defined in [6] is indicated by the symbol •.Thus, the portion of the path beyond this point only represents the 20% tail of the continuum.
The variation with ε of the three cost functions, and two steering functions f + A and fB (meta-model of f B ) is given on Figure 2 on which the values at point B' are indicated by the symbol •.
Several observations can be made from this plot: • The general pattern of the function plot is horn-shaped; here bounded from above by the increasing primary steering function and from below by the decreasing secondary steering function.
-  • The primary steering function f + A has indeed a null initial derivative.The two primary cost functions vary inversely: f 1 diminishes, and f 2 increases.For example, at B': Hence mass was reduced of 1.4%, a gain, and range diminished 2.7%, a loss.The function f + A has a more rapid increase due to the convexity-fix term.
• The initial derivative of the secondary steering function fB is well given by the theoretical prediction indicated by a dashed tangent line of slope −σ B .The function f 3 , calculated a posteriori, follows accurately its meta-model fB .At B': Hence the approach speed was reduced of some 5.6% This is confirmed by the plot of V as a function of mass along the continuum given by Figure 3, over which the value at B' is indicated by the symbol •.Correspondingly, the variations with ε of the optimization variables {x i } (i = 1, . . ., 16) are given by Figure 4 over which the values at B' are indicated by the symbols •.Only the first four of these variables have a definite variation, the other ones remain nearly constant.Also note that the angular point in the plot of x 3 (ε) does not reflect a derivative discontinuity, but a convention adopted in the graphics representation [6].
In Figure 5, we present the variation with ε of the constraint c 1 related to the take-off distance.Over a broad portion of the interval, the constraint is satisfied with very high accuracy thanks to the local constraint meta-model, upgraded prior to the calculation of each new Nash equilibrium point.For example, at point B', the value is c 1 ≈ −3.7×10 −5 , while T OL = 10 −4 .Beyond this point, the constraint value ceases to be negligible rapidly.These values are firstly negative which implies that an unnecessarily severe strict inequality is satisfied, that is, some significant distance away from the Pareto front, the approach speed is overly reduced at the cost of an unduly mass increase and range reduction.Further away, when approaching ε = ε max , an instability is triggered.As a result, the tail of the continuum is meaningless.Perhaps a more sophisticated constraint meta-model could permit to prolongate the continuum further.This point is currently being examined.We finish this description of Case B by some information relative to the number λ of outer iterations necessary to coordinate the two subvectors u and v at a given ε for the specified tolerance T OL = 10 −4 , and the number µ of Newton's method inner iterations to solve for u for fixed v with accuracy tolerance of T OL/100 = 10 −6 .Note that over the hundred ε steps, 94 were successful to determine the Nash equilibrium.In all these cases, µ never exceeded 3, while the termination value of λ is given by Figure 6 in terms of ε.Over more than 60% of the continuum, λ remains at most equal to 2. Then when approaching the limit of convexity it increases, and finally the process diverges, or more rigorously speaking, is interrupted.These results reflect some arbitrariness related to the setting of numerical parameters; they would change with a different setting of the accuracy tolerances, or with the specification of a different maximum allowable number of iterations in the inner or outer loop.
We now discuss the results related to the continuum originating from point A, at the extreme right of Figure 1 corresponding to the case of largest mass and range.These results are similar to those of Case B except that the portion of the continuum close to the Pareto front extends somewhat less.Again, a lighter aircraft could permit a smaller approach speed, but at the cost of a smaller range.
Thirdly, the continuum initiated at point C remains very close to the discrete Pareto front over a large segment, over which it is an excellent approximation to it.
Fourthly, the continuum initiated at point D is very small and obtained with difficulty using the very large value of 1000 for the parameter γ.In this area, there seems to occur an inversion of trend, more visible from the continuum initiated at point E, using γ = 40, and a very low upper bound on the condition number (κ = 1.5) in which results a large convexity-fix constant.The continuum there makes a loop, indicating a form of instability.
Evidently, the results in Cases D and E do not bring improved practical designs.
In conclusion of this case study, we emphasize the following points: • the test-case is of actual technical interest; • it involves 15 design variables subject to interval constraints that were handled by a change of variables, and one functional constraint handled by the introduction of a slack variable; many of these variables however remained almost constant along the continuum; • the scales in the design variables were found very small and made the establishment of the Pareto front by the evolutionary strategy laborious.Nevertheless, we were able to identify a continuum of Nash equilibria originating from a large portion of the Pareto front.This continuum complies with the theoretical findings, in particular w.r.t.tangency to the Pareto front.Along the continuum, the cost functions exhibit a horn-shaped pattern indicating a primary steering function that increases with the continuation parameter ε and a secondary steering function that decreases, initially linearly, in compliance with the theoretical results.The constraint is very well satisfied over a large part of the interval [0, ε max ], where ε max is the known theoretical limit of convexity of the meta-model-based problem.Some extreme cases required an unusual adjustment of the parameter κ (which controls the added convexity term), and γ which controls the magnitudes of initial gradients.
In these experiments, we have determined a priori an approximate but entire Pareto front by PAES.This was for purpose of numerical illustration and verification of the theoretical relationship between the front and the continuum.However, only a few Pareto-optimal points were actually necessary to initiate the continuums, and the first phase of the experiment could have been limited to the determination of these few points.Also note that this determination could have been conducted by other means, perhaps more efficiently, and the Pareto front roughly sketched by the continuums themselves alone.This would have been an alternative to the evolutionary strategy.
Lastly, to establish one such continuum typically required about 45 sec of run time on a standard laptop for program assembly and compilation, execution (mostly in the 15 phases of preparation of the Nash games), and the production of plots using the gnuplot software.In this test-case that involved 16 variables, the database contained 1920 points about each x A .Consequently, most of the computational work was devoted to the establishment of the meta-models and the evaluation of the primary agglomerated function f A , and relatively less to the calculation of the Nash equilibria per se.For example, in Case B, 94 equilibrium points were calculated.The numerical experiment required a total of 2048 calls to the evaluation procedures of the primary and secondary cost functions, and 5055 calls to the procedure of constraint evaluation.
The authors wish to acknowledge the very fruitful technical discussions with M. Ravachol about the supersonic business jet problematics, and are indebted to Dassault Aviation that provided the flight mechanics software.

Figure 2 .
Figure 2. Variation with ε of the cost functions f 1 , f 2 , f 3 and the two steering functions f + A and fB in Case B. The values at point B' are indicated by the symbol •.

Figure 3 .
Figure 3. Approach speed as a function of mass along the continuum.The value at point B' is indicated by the symbol •.

Figure 5 .
Figure 5. Variation with ε of the landing distance constraint along the continuum.The value at B' is indicated by the symbol •.

Figure 6 .
Figure 6.Number λ of outer iterations necessary to coordinate the subvectors u and v at a given ε as a function of ε.(The process was interrupted whenever the convergence test still was not met after the 15th iteration.) .