## Abstract

Parametric optimization is the process of solving an optimization problem as a function of currently unknown or changing variables known as parameters. Parametric optimization methods have been shown to benefit engineering design and optimal morphing applications through their specialized problem formulations and algorithms. Despite these benefits, existing parametric optimization algorithms (like many optimization algorithms) can be inefficient when design analyses are expensive. Since many engineering design problems involve some level of expensive analysis, a more efficient parametric optimization algorithm is needed. Therefore, the multi-objective efficient parametric optimization (MO-EPO) algorithm is developed to allow for the efficient optimization of problems with multiple parameters and objectives. This technique relies on the parametric hypervolume indicator (pHVI), which measures the space dominated by a given set of data involving both objectives and parameters. The novel MO-EPO algorithm is demonstrated on a number of analytical benchmarking problems with various numbers of objectives and parameters. Additionally, a morphing airfoil case study is examined. In each case, MO-EPO is shown to find solutions that are as good as or better than those found by the existing P3GA (i.e., equal or greater pHVI value) when the number of design evaluations is limited.

## 1 Introduction

Parametric optimization (or multiparametric programming [1,2]), introduced in the 1970s [3], examines how an *optimal solution changes as parameters change*. Parametric optimization problems differ from traditional optimization problems because they directly consider the effect of exogenous variables on the optimal solution. Since some of the earlier work on its mathematical theory [4,5], parametric optimization has been demonstrated for design and control in engineering [6–11] as well as in other disciplines (e.g., economics [12]). The benefit provided by parametric optimization lies in its ability to directly consider parameters (i.e., variables that affect the design, yet are not controlled by the designer) in the optimization process. In this way, parametric optimization solves a “family” of optimization problems related by these parameters. Figure 1 illustrates a notional Pareto frontier in two dimensions as well as a parameterized Pareto frontier [13] in three dimensions (two objectives and one parameter). The two-dimensional Pareto frontier is simply a subset of the three-dimensional parameterized Pareto frontier at a specific parameter value. Choosing a different parameter value would result in a different Pareto frontier.

For example, consider an aircraft wing capable of shape morphing. An objective could be to maximize the lift-to-drag ratio of the aircraft which depends on the wing shape and flight conditions (e.g., angle of attack). Although one could choose to treat all variables (defining the wing shape and flight conditions) as optimization variables, the result would be a single shape at specific flight conditions. However, a plane cannot stay at these specific flight conditions at all times. The plane must take off, cruise, land, and make any other flight maneuvers deemed necessary by a control tower or pilot. So, the optimal shape is needed as a function of flight conditions such that the wing can reconfigure to adapt to changing conditions. Such a solution is found through parametric optimization by treating variables defining flight conditions as parameters [6,14,15].

Parametric optimization is particularly useful when considering adaptive systems design when parameter values will be known with certainty in the future. In the morphing wing example, flight conditions are constantly changing, but known at any given point in time. With the ability to shape morph, the aircraft could potentially reconfigure in real time as the parameters (i.e., flight conditions) are sensed. In the absence of morphing capabilities, a framework such as robust design optimization makes more sense [16]. In this framework, flight conditions could be treated as uncertain. Rather than find the optimal shape for any potential flight condition, an engineer could search for an optimal *fixed* geometry whose performance is robust to changing flight conditions. This is valuable when a system is not adaptive, but parametric optimization is a better formulation when adaptivity is present.

Additionally, parametric optimization is also suited for preliminary design of fixed systems when constant parameters are not yet determined. For example, prior work demonstrates the optimization of structural components prior to a strict determination of the applied loads [7,17]. Another work demonstrates a trade study of a cooling concept performed under unknown subsystem requirements due to lacking security clearance regarding these other subsystems [8].

Despite the benefits of parametric optimization, existing algorithms do not necessarily handle problems involving expensive analyses (with respect to time, computational resources, or monetary cost). The efficient parametric optimization (EPO) algorithm is the only existing method for parametric optimization involving expensive analyses [17]. This single-objective algorithm is built on the seminal efficient global optimization (EGO) algorithm [18] for nonparametric problems. Both EPO and EGO use at least one response surface model to reduce optimization cost by reducing the number of expensive analyses needed compared to other methods. In the morphing wing example, a multi-objective problem formulation may make more sense for finding the optimal tradeoff between maximum lift and minimum drag at any potential flight condition. Therefore, an efficient multi-objective parametric algorithm is needed.

The focus of this paper is to present a multi-objective EPO (MO-EPO) algorithm that combines strategies from the parametric EPO algorithm and multi-objective expected improvement formulations [19]. Figure 2 relates MO-EPO to other efficient metamodel-based optimization algorithms by the type of problems they can solve. Included in this figure are nonparametric single-objective methods (the EGO algorithm [18], Gutmann’s radial basis function (GRBF) method [20], and sequential approximation optimization (SAO) [21]) and nonparametric multi-objective methods (the Pareto EGO (ParEGO) algorithm [22], Keane’s statistically based improvement (KSI) method [23], metamodel-assisted evolution strategies (MAES) [19], predictive entropy search (PES) [24], and the parallelized multi-objective EGO (ParMOEGO) [25]).

## 2 Background

### 2.1 Parametric Optimization.

The general multi-objective parametric optimization problem can be stated as

$$
\begin{aligned}
\min_{\mathbf{x}} \quad & \mathbf{J}(\mathbf{x}, \boldsymbol{\theta}) \\
\text{s.t.} \quad & \mathbf{g}(\mathbf{x}, \boldsymbol{\theta}) \le \mathbf{0} \\
& \mathbf{h}(\mathbf{x}, \boldsymbol{\theta}) = \mathbf{0} \\
& \mathbf{x}_{lb} \le \mathbf{x} \le \mathbf{x}_{ub}, \quad \boldsymbol{\theta}_{lb} \le \boldsymbol{\theta} \le \boldsymbol{\theta}_{ub}
\end{aligned} \tag{1}
$$

where **J** is the set of objective functions to be minimized, **x** is the vector of optimization variables bounded by **x**_{lb} and **x**_{ub}, **θ** is the vector of parameters bounded by **θ**_{lb} and **θ**_{ub}, and **g**(**x**, **θ**) and **h**(**x**, **θ**) represent inequality and equality constraints, respectively, as functions of both optimization variables and parameters. The solution to a multi-objective parametric optimization problem is a description of the Pareto frontier **J*** as it depends on the parameters **θ** (i.e., a parameterized Pareto frontier [13]—cf. Fig. 1). In this formulation, a designer has full control of the optimization variables **x**. However, the designer must also consider the effects of the parameters **θ**. These parameters could represent important design considerations outside of the engineer’s control that may be currently unknown or constantly changing (e.g., operational conditions [6,14,10], design requirements [8,17], state variables [9], and so on).

In general, parameters may be dependent on or independent of the optimization variables. Equation (1) shows the more general formulation where parameters are dependent. Summers et al. [7] present a single case study that contains both one dependent parameter and one independent parameter. The case study is to optimize an actuator (to enable aircraft geometry changes) under a currently unknown loading magnitude and a currently unknown stroke requirement. The loading magnitude represents an independent parameter because it is agnostic to the actuator design. Instead, the loading magnitude depends entirely on the aircraft geometry and flight conditions. On the other hand, the unknown requirement on actuation stroke represents a dependent parameter because the actuator design consists of a shape memory alloy (SMA) torque tube. The amount of stroke possible from this actuator is directly dependent on the torque tube geometry (i.e., the optimization/design variables).

There are many methods capable of solving parametric problems with independent parameters [26], but only the predictive parametric Pareto genetic algorithm (P3GA) [27] and EPO [17] are capable of solving problems including dependent parameters (in addition to independent parameters). The newly presented MO-EPO algorithm can also handle both dependent and independent parameters. Furthermore, P3GA, EPO, and MO-EPO can solve problems with nonlinear objective, constraint, and parameter functions whereas other parametric algorithms may be specialized for specific problem types [4,28–32].

### 2.2 Metamodel-Based Efficient Optimization.

In engineering, it is common to have design optimization problems that involve expensive analyses. Whether those analyses involve physical experiments or complex numerical simulations such as FEA and CFD, engineers must balance design quality with their available time, budget, and other resources. Therefore, there is a need for methods that can produce better designs with fewer required analyses. One common approach to optimize designs more efficiently is to use metamodels (or response surface models). Given some initial sampling of designs evaluated via expensive analyses, an engineer can train a relatively inexpensive metamodel capable of predicting the relevant attributes of other designs. In this way, an engineer can predict which designs may be optimal without further expensive analysis. These predictions are then typically confirmed by evaluating the new design(s) with the expensive analyses. When a prediction is not good enough, the process can be repeated by updating the metamodel with the new information and predicting new promising design(s).
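The sample–fit–predict–confirm–update loop described above can be sketched in a few lines. This is an illustrative toy only: the 1-D `expensive` function, the zero-mean Gaussian-process surrogate, and all settings are assumptions made for the sketch, not the implementation used in this work.

```python
import numpy as np
from math import erf, sqrt, pi

norm_cdf = np.vectorize(lambda z: 0.5 * (1 + erf(z / sqrt(2))))
norm_pdf = lambda z: np.exp(-0.5 * z**2) / sqrt(2 * pi)

def expensive(x):                         # stand-in for a costly analysis
    return (x - 0.3) ** 2 + 0.05 * np.sin(15 * x)

def gp(X, y, Xq, ell=0.15, nug=1e-8):
    """Minimal zero-mean GP regression (squared-exponential kernel):
    returns the predicted mean and standard error at the query points Xq."""
    k = lambda a, b: np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ell**2)
    K = k(X, X) + nug * np.eye(len(X))
    Ks = k(Xq, X)
    mu = Ks @ np.linalg.solve(K, y)
    var = 1.0 - np.sum(Ks * np.linalg.solve(K, Ks.T).T, axis=1)
    return mu, np.sqrt(np.maximum(var, 0.0))

rng = np.random.default_rng(1)
X = rng.uniform(0, 1, 4)                  # initial sampling (design of experiments)
y = expensive(X)
grid = np.linspace(0, 1, 401)
for _ in range(8):                        # fit -> predict -> confirm -> update
    mu, s = gp(X, y, grid)
    z = (y.min() - mu) / np.maximum(s, 1e-12)
    ei = (y.min() - mu) * norm_cdf(z) + s * norm_pdf(z)   # expected improvement
    x_new = grid[np.argmax(ei)]           # most promising candidate
    X, y = np.append(X, x_new), np.append(y, expensive(x_new))

print(float(X[np.argmin(y)]))             # best sampled design so far
```

Each pass of the loop spends exactly one expensive evaluation on the design the surrogate deems most promising, which is the efficiency mechanism exploited by EGO, EPO, and MO-EPO alike.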

The EPO algorithm maximizes a parametric expected improvement of the form

$$
EI(\mathbf{x}, \boldsymbol{\theta}) = \left(J_{min}(\boldsymbol{\theta}) - \hat{J}(\mathbf{x})\right)\Phi\!\left(\frac{J_{min}(\boldsymbol{\theta}) - \hat{J}(\mathbf{x})}{s}\right) + s\,\phi\!\left(\frac{J_{min}(\boldsymbol{\theta}) - \hat{J}(\mathbf{x})}{s}\right) \tag{2}
$$

where *J*_{min} is the minimum expected single-objective value over all of parameter space given the current expensive data set. $\hat{J}$ and *s* are the response surface model predictions of the single-objective value and its corresponding standard error, respectively. Φ represents the standard normal cumulative distribution function, and *ϕ* is the standard normal probability distribution function. In the case where parameter functions are also expensive, **θ** is replaced by the corresponding metamodel $\hat{\theta}$ as a function of the optimization variables **x**.

A compact pseudocode for EPO is illustrated by Algorithm 1. There are two issues with the EPO algorithm, and both trace back to the expected improvement formula in Eq. (2). First, the equation is inherently single objective because it was derived from that of the single-objective EGO. Second, the *J*_{min} variable as a function of the parameters is not easily defined without data at all possible parameter values (which would not be efficient). Both of these issues are addressed by defining a new expected improvement formula based on the work by Emmerich et al. [19] as discussed in the next section.
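For a concrete feel of the building blocks in Eq. (2), the EGO-style expected improvement at a fixed parameter value can be evaluated in closed form. A minimal sketch, assuming the incumbent `j_min`, the prediction `j_hat`, and the standard error `s` are already available from a surrogate:

```python
from math import erf, exp, pi, sqrt

def expected_improvement(j_min, j_hat, s):
    """Closed-form expected improvement for minimization: j_min is the best
    (expected) value at the parameter value of interest; j_hat and s are the
    surrogate's prediction and standard error at the candidate design."""
    if s <= 0.0:
        return max(j_min - j_hat, 0.0)            # no model uncertainty
    z = (j_min - j_hat) / s
    Phi = 0.5 * (1.0 + erf(z / sqrt(2.0)))        # standard normal CDF
    phi = exp(-0.5 * z * z) / sqrt(2.0 * pi)      # standard normal PDF
    return (j_min - j_hat) * Phi + s * phi

# A mean prediction below the incumbent, plus some uncertainty, gives EI > 0.
print(expected_improvement(j_min=1.0, j_hat=0.8, s=0.1))  # ≈ 0.2008
```

Note how both terms matter: a confident prediction below the incumbent drives the first term, while large model uncertainty drives the second, so unexplored regions can still score well.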

#### Algorithm 1: EPO (dependent parameters) [17]

1: $F(x)=[J(x),\theta(x)]$ ← Expensive model

2: $\hat{F}(x)=[\hat{J}(x),\hat{\theta}(x)]$ ← Initial response surface

3: $N$ ← Number of expensive evaluations performed

4: **input**: $N_{max}$ ← Maximum allowable expensive evaluations

5: **while** $N < N_{max}$ **do**

6: $J_{min}(\hat{\theta}) \leftarrow \min_x \hat{J}(x)\ \forall\,\hat{\theta}$

7: $x^*(\hat{\theta}) \leftarrow \arg\max_x EI(x, \hat{\theta})$

8: **if** all($EI(x^*(\hat{\theta}), \hat{\theta}) < tol$) **then**

9: **break while**

10: **else**

11: Define $n$ new tuples $x_{new}$ such that: $x_{new} \subseteq x^*(\hat{\theta})$

12: $J_{new}, \theta_{new} \leftarrow F(x_{new})$

13: $\hat{F}(x)$ ← Update response surface via $x_{new}$, $\theta_{new}$, and $J_{new}$

14: $N \leftarrow N + n$

15: **end if**

16: **end while**

17: **return** all high-fidelity samples

## 3 Preliminaries

The single-objective EPO algorithm [17] provided an important first step into efficient parametric optimization. However, its limitations are significant. Its expected improvement formulation is extended from that of EGO [18], with the most notable difference being the change from the scalar *J*_{min} in EGO to the parametric function *J*_{min}(**θ**) in EPO. Although this equation does not violate any principles of the original equation, *J*_{min}(**θ**) is calculated in an ad hoc manner that is potentially flawed (using one approximation to make another approximation).

More generally, the expected improvement can be written as an integral [19]

$$
EI(\mathbf{x}) = \int I(\mathbf{J})\,\phi_x(\mathbf{J})\,d\mathbf{J} \tag{4}
$$

where *EI*(**x**) is the expected improvement given the optimization variables **x**, *I*(**J**) is a function that outputs a scalar improvement value given a point in objective space, and *ϕ*_{x}(**J**) is the probability density function for any point in objective space given the optimization variables **x**. Therefore, this equation states that the expected improvement is the integral of the product of the improvement of a point in objective space and the probability of achieving that point in objective space given the optimization variables **x** and current beliefs. The integral is taken with respect to the objectives over the feasible objective space.

Define **F** as the attribute space of both parameters and objectives (i.e., **F** = [**J**, **θ**]). Then, it is easy to write a *parametric* improvement function based on the parametric hypervolume indicator (pHVI) [33] accounting for multiple objectives *and* parameters

$$
I(\mathbf{F}(\mathbf{x})) = H_p\big(\mathbf{S} \cup \{\mathbf{F}(\mathbf{x})\}\big) - H_p(\mathbf{S}) \tag{6}
$$

where **S** denotes the set of existing data and $H_p(\cdot)$ is the pHVI. Equation (6) then defines improvement as the amount of increase of the parametric hypervolume indicator value. Recall that the hypervolume indicator should be strictly monotonic [34,35], so a positive value from Eq. (6) indicates that **F**(**x**) is not dominated by any data in **S** whereas a value of exactly zero indicates that **F**(**x**) is dominated by at least one point in **S**. A negative improvement value is not possible so long as the set **S** retains all existing nondominated data.
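To make the improvement of Eq. (6) concrete, the sketch below computes the improvement of a candidate as the increase of dominated hypervolume. For brevity it uses a plain two-objective (nonparametric, minimization) hypervolume as a stand-in for the parametric indicator $H_p$; the sample set `S`, the candidate points, and the reference point are made-up values.

```python
import numpy as np

def hypervolume_2d(points, ref):
    """Hypervolume dominated by a set of 2-objective (minimization) points
    relative to a reference point; a simple stand-in for H_p in Eq. (6)."""
    pts = np.asarray(points, dtype=float)
    pts = pts[np.all(pts < ref, axis=1)]      # keep points inside the reference box
    if len(pts) == 0:
        return 0.0
    pts = pts[np.argsort(pts[:, 0])]          # sweep by the first objective
    hv, best_f2 = 0.0, ref[1]
    for f1, f2 in pts:
        if f2 < best_f2:                      # nondominated point: add its slab
            hv += (ref[0] - f1) * (best_f2 - f2)
            best_f2 = f2
    return hv

def improvement(S, F_x, ref):
    """I(F(x)) = H(S ∪ {F(x)}) - H(S), cf. Eq. (6)."""
    return hypervolume_2d(list(S) + [F_x], ref) - hypervolume_2d(S, ref)

S = [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0)]
print(improvement(S, (1.5, 1.5), (4.0, 4.0)))  # nondominated candidate -> positive
print(improvement(S, (3.5, 3.5), (4.0, 4.0)))  # dominated candidate -> 0.0
```

The two prints illustrate the monotonicity property quoted above: a nondominated candidate strictly increases the indicator, while a dominated one leaves it unchanged.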

If *N* samples in the attribute space are produced from the probability distribution *ϕ*_{x}, the expected improvement can be approximated via Monte-Carlo integration [19]

$$
EI(\mathbf{x}) \approx \frac{1}{N}\sum_{i=1}^{N} I(\mathbf{q}_i) \tag{7}
$$

where *q*_{i} represents the *i*th Monte-Carlo sample selected from the probability distribution of the attribute space given the optimization variables. Alternatively, if we assume that the probability distribution *ϕ*_{x} follows a Gaussian distribution, Gauss quadrature rules can be used to approximate the numerical integral from Eq. (4) as [36]

$$
EI(\mathbf{x}) \approx \sum_{i=1}^{N} w_i\, I(\mathbf{q}_i) \tag{8}
$$

where *w*_{i} is a weight corresponding to sample *q*_{i} dependent on the probability density function *ϕ*_{x}. Unlike a Monte-Carlo sampling, Gauss quadrature techniques prescribe the weights and samples in a structured manner. In this work, the Gauss–Legendre quadrature is used to determine the samples and weights for Eq. (8) [37].
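Equations (7) and (8) can be compared on a one-dimensional toy attribute with a made-up improvement function. One substitution to note: for a Gaussian $\phi_x$ this sketch uses Gauss–Hermite nodes (which natively match a Gaussian weight), whereas this work maps Gauss–Legendre nodes through the distribution; the idea of prescribed, structured samples and weights is the same.

```python
import numpy as np

rng = np.random.default_rng(0)

mu, sigma = 0.2, 0.5                      # surrogate belief: phi_x ~ N(mu, sigma^2)

def improvement(q, best=0.0):             # toy scalar improvement I(q)
    return np.maximum(best - q, 0.0)      # improvement only below the current best

# Eq. (7): Monte-Carlo estimate with q_i drawn from phi_x.
q_mc = rng.normal(mu, sigma, size=100_000)
ei_mc = improvement(q_mc).mean()

# Eq. (8): quadrature estimate with prescribed nodes and weights.
nodes, weights = np.polynomial.hermite_e.hermegauss(64)
q_gq = mu + sigma * nodes                 # map standard nodes through N(mu, sigma)
ei_gq = np.sum(weights * improvement(q_gq)) / np.sqrt(2 * np.pi)

print(ei_mc, ei_gq)                       # both near the analytic value of ~0.115
```

The quadrature estimate reaches a comparable accuracy with 64 structured evaluations of the improvement function versus 100,000 random ones, which is why a structured rule is preferred when the attribute dimension is low.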

## 4 The Algorithm

Building upon the structure of the single-objective EPO and Eqs. (6)–(8), the multi-objective efficient parametric optimization (MO-EPO) algorithm can be defined. The basic procedure of MO-EPO after initialization is to:

1. train a response surface model for each expensive model,
2. evaluate parametric Pareto dominance (PPD) [13] via the hypercone heuristic (HCH) [33] to determine which data are parametrically nondominated and compute the pHVI value [33] of the data,
3. find new samples of **x** that parametrically maximize the expected improvement function from Eq. (7) or Eq. (8),
4. evaluate some of these samples using the expensive models and add them to the full data set,
5. repeat steps 1–4 until a stopping criterion is met, and
6. output information for all expensive samples including values of optimization variables, parameters, and objectives as well as the ranks of all expensive samples.

A more detailed pseudocode for MO-EPO is presented in Algorithm 2.

### Algorithm 2: MO-EPO

1: **input:** $F(x)=[J(x),\theta(x)]$ ← Expensive models

2: **input:** $N_{eval}^{max}$ ← Maximum allowable number of expensive evaluations

3: **input:** $n$ ← Maximum number of expensive evaluations taken per iteration

4: **initialize:** $i=1$ ← Iteration counter

5: $x_i$ ← Define some initial samples of $x$ via design of experiments

6: $F_i = F(x_i)$ ← Evaluate initial samples using expensive models

7: $N_{eval} = |x_i|$ ← Count the number of expensive evaluations performed

8: **while** $N_{eval} < N_{eval}^{max}$ **do**

9: $\hat{F}(x), \varphi_x = GPR(x_i, F_i)$

10: $H_p^i, ranks = Update(F_i)$

11: $x^*(\hat{\theta}) = \arg\max_x EI(x) \,|\, H_p^i, \varphi_x$

12: **if** all($EI(x^*(\hat{\theta})) \,|\, H_p^i < tol$) **then**

13: **break while**

14: **else**

15: Define $x_{new}$ as $n$ tuples such that: $x_{new} \subseteq x^*(\hat{\theta})$

16: $F_{new} = F(x_{new})$

17: $x_{i+1} = x_i \cup x_{new}$

18: $F_{i+1} = F_i \cup F_{new}$

19: $N_{eval} = N_{eval} + n$

20: $i = i + 1$

21: **end if**

22: **end while**

23: $H_p^i, ranks = Update(F_i)$

24: **return** $x_i, F_i, ranks$

25:

26: **definition** $Update(F)$

27: $pHVI = H_p(F)$ ← Calculate pHVI value

28: $ranks$ ← Rank all points in $F$ via HCH

29: **return** $pHVI, ranks$

30: **end definition**

31:

32: **definition** $GPR(x, F)$

33: $\hat{F}(x), \varphi_x$ ← Train a Gaussian process regression model (including the mean prediction and uncertainty) of each expensive model given the known data in $\{x, F\}$

34: **return** $\hat{F}(x), \varphi_x$

35: **end definition**

### 4.1 Necessary Assumptions

#### 4.1.1 Smooth and Continuous Functions.

The first assumption made by MO-EPO is that the expensive models can be approximated by a *smooth* and *continuous* response surface model. The most important of these is the need for the function to be continuous. This assumption is made implicitly when fitting a response surface model to the data. For most engineering applications where expensive functions are representative of or dependent on some physical phenomenon (i.e., mass, energy, etc.), this assumption is valid. In cases where the expensive models are continuous but not smooth (e.g., a piecewise linear relationship where *C*^{0} continuity is maintained), the response surface approximation will be disadvantaged. Despite this disadvantage, the model will still approach the true solution given enough training data.

#### 4.1.2 Predictive Parametric Pareto Dominance.

The second assumption is that parametric Pareto dominance (PPD) can be *predicted* via a support vector domain description (SVDD) [13,27] or the HCH [33]. Although these heuristic methods are not theoretically proven, empirical evidence supports the use of these predictions of PPD under the right circumstances, discussed below.

The SVDD method could be flawed if any numerical inconsistencies are experienced or if the width hyperparameter is too “loose” or too “tight”. Numerical inconsistencies are not common and could be reflective of the specific implementation used here instead of a pitfall for the method as a whole [26]. Tax and Duin [38] illustrate and describe the effect of the width hyperparameter on the resulting SVDD. If a small value is used, the SVDD is trained to be a tight description around the data. For its application to parametric Pareto dominance (PPD), this means that there is little generalization from the training data and dominance will look similar to the strict PPD rules whereby much data will be considered as nondominated. On the other hand, a large width hyperparameter results in a loose description of the SVDD. This means that the description over-generalizes the data. For PPD in this case, the danger is that much of the nondominated data will be incorrectly classified as dominated. Despite this potential pitfall, Galvan et al. [39] demonstrate the method’s effectiveness using a moderate value for the width hyperparameter.

The HCH method also has hyperparameters that factor into its trustworthiness. Similar to the width hyperparameter of the SVDD, the HCH requires the value for the hypercone angle. In similar fashion to its analogous hyperparameter in the SVDD method, too small of a cone angle will result in little generalization where much data are considered as nondominated. For a large cone angle (with 180 deg being its upper bound), the parameters become less important in determining dominance as objective values become more and more important. As the cone angle nears 180 deg, the predictive parametric Pareto dominance begins to look more and more like traditional Pareto dominance (which negates the consideration of parameters altogether). An additional limitation to the HCH method is the need for a user-defined reference space (similar to the reference point chosen for multi-objective HVI calculations). Both the reference space and the hypercone angle are needed to fully define the HCH to predict PPD. A user may avoid these pitfalls by choosing a hypercone angle and reference space consistent with the intuition discussed in Ref. [26].

### 4.2 Implementation-Specific Details.

Although Algorithm 2 provides a general overview of MO-EPO, there are some details specific to the current implementation worth mentioning. First of all, the algorithm allows for the input of both expensive *and* inexpensive models (cf. line 1 in Algorithm 2). Inexpensive models are evaluated exactly when necessary (i.e., there are no response surface models for inexpensive functions). The sample initialization (cf. line 5) is either input by a user or generated via Latin hypercube sampling (LHS), though this could be updated to any number of sampling techniques. As expressed in line 9, Gaussian process regression (GPR) is needed to predict the value of expensive models in addition to the predicted error of those models. In this implementation, a Kriging interpolation via the DACE toolbox [40] is used. Then, the Kriging predictions of both value and standard error are used to define a Gaussian probability distribution *ϕ* throughout the attribute space **F** for a given set of optimization variables **x**.

The maximization of expected improvement over the response surface model (line 11 in Algorithm 2) is performed via P3GA [27] to generate a number of design configurations with high expected improvement across the parameter space. Depending on the number of expensive functions, the expected improvement is calculated via Eq. (7) or Eq. (8). When the number of expensive functions is less than or equal to four, the Gauss–Legendre quadrature is used to define a grid with a resolution of five samples in each dimension (of the expensive functions) along with their corresponding weights using the implementation of Greg von Winckel [41] to evaluate Eq. (8). If the number of expensive functions is greater than four, a random sampling of 1000 points (i.e., *N* = 1000) is taken from the probability density function to evaluate Eq. (7). In either case, the improvement function used is the improvement of the parametric hypervolume indicator presented in Eq. (6).
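The quadrature grid described above (five nodes per dimension when there are at most four expensive functions) can be generated with standard tools; this numpy sketch plays the role of the Legendre–Gauss implementation [41] cited in the text, with the dimension count and resolution as example inputs.

```python
import numpy as np
from itertools import product

def legendre_grid(dims, n=5):
    """Tensor-product Gauss-Legendre nodes and weights on [-1, 1]^dims,
    with n points per dimension (five per dimension in the text)."""
    x1, w1 = np.polynomial.legendre.leggauss(n)   # 1-D nodes and weights
    nodes = np.array(list(product(x1, repeat=dims)))
    weights = np.array([np.prod(w) for w in product(w1, repeat=dims)])
    return nodes, weights

nodes, weights = legendre_grid(dims=2, n=5)
print(nodes.shape, weights.sum())   # 25 nodes; weights sum to 2^dims = 4
```

In practice the nodes would be affinely mapped from $[-1, 1]$ onto the predictive distribution of each expensive function before evaluating the improvement at each grid point, as in Eq. (8).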

The *n* tuples chosen from the P3GA solution are selected on the basis of a score calculated as the product of the expected improvement of a point and its Euclidean distance to its nearest already-chosen neighbor

$$
score(\mathbf{x}) = EI(\mathbf{x}) \cdot d_{NN}(\mathbf{x}, \mathbf{X}) \tag{9}
$$

where *EI*(**x**) represents the expected improvement of design **x**, **X** represents the set of new designs that have already been chosen, and *d*_{NN}( · ) represents the Euclidean distance between **x** and its nearest neighbor in **X**. Therefore, this crowding metric penalizes similar data and promotes a more diverse set of new samples. If *n* = 1 (cf. line 15 in Algorithm 2), the point with the greatest expected improvement is chosen since the set **X** is empty. If *n* > 1, each chosen point is added to **X** and scores are updated before choosing a new point. Once all *n* points are chosen in this way, *x*_{new} is simply the set of points in **X**.
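The greedy selection that Eq. (9) describes can be sketched directly; the candidate set and EI values below are made-up stand-ins for a P3GA population and its pHVI-based expected improvements.

```python
import numpy as np

def select_new_samples(cands, ei, n):
    """Greedy selection of n points: the first pick is the EI maximizer
    (X starts empty); each later pick maximizes EI(x) * d_NN(x, X),
    cf. Eq. (9), which keeps the new samples spread apart."""
    cands = np.asarray(cands, dtype=float)
    ei = np.asarray(ei, dtype=float)
    chosen = [int(np.argmax(ei))]
    for _ in range(n - 1):
        # Distance from every candidate to its nearest already-chosen point.
        d_nn = np.min(
            np.linalg.norm(cands[:, None, :] - cands[chosen][None, :, :], axis=2),
            axis=1)
        score = ei * d_nn                 # already-chosen points score zero
        chosen.append(int(np.argmax(score)))
    return cands[chosen]

cands = [[0.0], [0.1], [0.5], [0.9]]
ei = [1.0, 0.9, 0.5, 0.4]
print(select_new_samples(cands, ei, 2))   # picks [0.0], then the distant [0.9]
```

Note that the second pick is not the second-highest EI point ([0.1]) but a lower-EI point far from the first pick, which is exactly the diversity behavior the crowding metric is meant to produce.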

### 4.3 User Options.

In its current implementation, MO-EPO has many options that can be set by the user. Aside from the inputs labeled in Algorithm 2, a user can choose to modify the conditions for training the response surface model. As the algorithm currently accommodates a Kriging model only via the DACE toolbox [40], one could easily change the correlation or regression functions. Default options include using the Gaussian correlation function and a second-order regression function.

Since MO-EPO hinges on the improvement of the parametric hypervolume indicator, options such as the reference space and hypercone angle can also affect the algorithm performance. The default value of the hypercone angle is 90 deg. There is an option to automatically define the reference space based on the initial population, but this is not recommended since the bounds of the initial population are not always representative of the entire feasible space. Instead, a user should input reference bounds based on the parameter ranges of interest and objective goals. In many cases, this is easy to define using the problem formulation and intuition about the underlying physics and logic. For example, in the morphing airfoil case study in Ref. [6] the parameter ranges are set in the problem formulation and can be used directly for the reference space. For the lift-to-drag ratio objective, the lower bound of the reference space can be zero (since negative lift-to-drag ratios are not meaningful in this context) and the upper bound can be placed using some intuition based on engineering knowledge. For example, if the maximum lift-to-drag ratio for the *fixed* airfoil is about 100, the reference space will need to sufficiently extend beyond 100 to account for the maximum performance of the *morphing* airfoil (which is currently unknown). Knowing the true solution, one could say that an upper bound of 200 on the lift-to-drag reference space is sufficient, but a larger upper bound would still be valid [26]. The upper bound only becomes invalid when the true solution exceeds the bounds of the reference space. For example, if the morphing airfoil is capable of achieving lift-to-drag ratios up to 180 under certain conditions, having the upper bound of lift-to-drag be a number less than 180 would result in misleading pHVI values, though dominance calculations would remain intact.

Another user input is related to stopping criteria. By default, the MO-EPO algorithm will run until the full budget of expensive evaluations has been spent. However, one may choose to detect convergence by modifying the maximum amount of expected improvement necessary to continue (cf. *tol* on line 12 of Algorithm 2). The default value of this tolerance is zero to promote spending the whole computational budget, but a user could input a small positive tolerance that could allow MO-EPO to stop when the expectation of improvement falls below the user-defined threshold.

Finally, the last user input affects the transfer of information from one iteration to the next (cf. line 8). When P3GA is used to find data with high expected improvement (line 11), only *n* of these points are used to improve the solution and/or response surface model (lines 15–18). Some or all of the leftover data from P3GA at iteration *i* can be used to seed the maximization of expected improvement at iteration *i* + 1. In some cases, the expected improvement of a design ** x** will drop due to the new information. However, there may be other points in design space that will continue to have high expected improvement until there is a new expensive sample taken in its vicinity. By transferring this information from one iteration to the next, MO-EPO can consider promising designs in one iteration when the designs were unable to be sampled in the previous iteration. The tradeoff of this is increased computational time of the MO-EPO algorithm. The default value of this is zero meaning that no information is shared from one iteration to the next, but a user could increase this number to be any positive integer.

## 5 Benchmarking

To demonstrate the MO-EPO algorithm performance, the three test problems from Ref. [26] are considered. Of interest is the solution accuracy given by the parametric hypervolume indicator (pHVI) [33] as a function of both the number of design evaluations and the computational time assuming an evaluation time for the analytic test problems.^{2} Since the true pHVI value can be calculated for these analytic problems, the pHVI value is presented as a percentage of the true value. For each problem, all combinations of the number of objectives and parameters are considered up to a four-dimensional attribute space. Purely multi-objective formulations without consideration of parameters are ignored since the MO-EPO method is built to be used for multi-objective parametric problems. Although MO-EPO will solve nonparametric problems, there are more efficient methods that should be used when parameters are not present (cf. Fig. 2).

In the following results, both methods are initialized by a Latin hypercube sampling. P3GA is run with 50 generations of 50 population members for a total of 2500 design evaluations along with the options listed in Table 1. MO-EPO, on the other hand, is allowed to take ten expensive samples per iteration with a maximum budget of 1000 design evaluations (including the initial Latin hypercube samples). Other specifics for each test problem are given in the following subsections.

### 5.1 Scalable Test Problem #1.

The first test problem considered is given by Eq. (10) with five optimization variables where *N*_{var}, *N*_{o}, and *N*_{p} denote the number of optimization variables, objectives, and parameters, respectively [26]. This test problem proposed by Galvan et al. [39] is characterized by a discontinuous and nonconvex solution. Furthermore, the analytical problem has a biased solution density. This means that it is very unlikely to find the true solution by a random sampling. These three characteristics provide a meaningful benchmarking example.

Figure 3 presents the results from five replications of both MO-EPO and the existing P3GA [27] using the SVDD for PPD. The figure plots the solution pHVI as a function of the number of design evaluations. Note that a single design evaluation consists of evaluating all objective and parameter functions for a single vector of optimization variables **x**. Each row of Fig. 3 denotes the dimensionality of the attribute space (i.e., the number of objectives plus the number of parameters) whereas each column denotes the number of parameters. Since larger pHVI values are preferred, the trend shows that MO-EPO consistently finds a better solution with fewer design evaluations compared to P3GA for all problem formulations here.

These results are compelling, but there is one factor not considered in these figures: the computational “cost” of algorithmic overhead. P3GA requires the training of an SVDD in each iteration whereas MO-EPO requires the training of surrogate models for each expensive function and the optimization of the expected improvement using these models in each iteration. To account for this overhead cost, Fig. 4 displays the pHVI value as a function of computational time (i.e., wall-clock time) assuming that each design evaluation requires 1 min (serial computation). Results in Fig. 4 are very similar to those of Fig. 3. This shows that the overhead cost of MO-EPO does not diminish the results on this problem.

As discussed previously, parameters can be dependent or independent. When parameters are defined by expensive functions, they are dependent. However, it is also possible for parameters to be independent and therefore cheap.^{3} Therefore, the previous study is repeated for both P3GA and MO-EPO considering all parameters as independent and cheap. In this case, MO-EPO allows 1000 randomly chosen designs of interest (i.e., *EI*(** x**) > 0) to be carried over from one iteration to the next. Figure 5 illustrates the superiority of MO-EPO over P3GA on this problem with independent parameters. This is not surprising since P3GA relies on a naive search while MO-EPO is able to use cheap functions and models of expensive functions to predict optimal solutions more quickly. Of particular note here is that MO-EPO performs increasingly better as more parameters are considered (since these independent parameters are evaluated quickly) whereas P3GA pales in comparison.

### 5.2 Scalable Test Problem #2.

This problem has a *variable* solution density on the parametric Pareto frontier, making certain parts of the true solution more difficult to find than others. It also exhibits high multimodality, with many local optima. These characteristics provide a meaningful and different benchmarking example.

Similar to the previous test problem, all objectives and parameters are considered expensive, and MO-EPO is employed to solve the problem. For both P3GA and MO-EPO, the reference space is bounded by 0 and 40 for the final objective; all other functions are bounded from 0 to 1. Additionally, both methods use a hypercone angle of 90 deg. For MO-EPO, zero points of interest carry over from one iteration to the next (cf. Sec. 4.3), and the algorithm runs until either the maximum number of design evaluations has been met or no point with positive expected improvement can be found.

Figure 6 presents the pHVI versus the number of design evaluations. The pHVI versus computational time (assuming 1 min per design evaluation) shows the same trend [26]. Similar to the performance of EPO on the single-objective version of this problem [17], MO-EPO is unable to outperform the existing P3GA with respect to either computational time or the number of design evaluations. When comparing the characteristics of this test problem to the assumptions made within MO-EPO, these results are not surprising. In Sec. 4.1, the assumption is made that functions are “smooth and continuous.” Although this test problem is continuous, it is highly multimodal, leading to nonsmooth objective/parameter functions overall. To generate a trustworthy response surface model, a large amount of data is needed to capture the multimodality of this problem; consequently, solution improvement is relatively slow while the response surface model remains untrustworthy. Despite this assumption violation, MO-EPO and P3GA perform largely the same on this difficult problem for low numbers of design evaluations.
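The effect of multimodality on a smooth surrogate can be seen in a toy experiment: fit the same low-order polynomial surrogate to a smooth function and to a highly multimodal one, and compare prediction error on a dense grid. The functions and model below are illustrative choices, not those of the test problem or of MO-EPO's actual response surface models.

```python
import numpy as np

def surrogate_rmse(target, n_train=10, degree=5, seed=0):
    """RMSE of a degree-`degree` polynomial surrogate of `target` on [0, 1],
    trained on n_train random samples and scored on a dense grid."""
    rng = np.random.default_rng(seed)
    x_train = rng.uniform(0.0, 1.0, n_train)
    coeffs = np.polyfit(x_train, target(x_train), degree)
    x_test = np.linspace(0.0, 1.0, 400)
    return float(np.sqrt(np.mean((np.polyval(coeffs, x_test) - target(x_test)) ** 2)))

smooth = lambda x: x**2 - 0.3 * x                  # smooth, unimodal
multimodal = lambda x: np.sin(12.0 * x) + 0.5 * x  # many local optima
```

With the same small training budget, the surrogate of the smooth function is essentially exact, while the surrogate of the multimodal function retains large error; far more data (and a richer model) would be needed before its improvement predictions could be trusted.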

### 5.3 Scalable Test Problem #3.

For this problem, the reference space is bounded by 0 and 20 in every attribute, and a hypercone angle of 50 deg is employed for both MO-EPO and pHVI calculations. For MO-EPO, 100 points of interest carry over from one iteration to the next (cf. Sec. 4.3), and the algorithm runs until either the maximum number of design evaluations has been met or no point with positive expected improvement can be found.

Figure 7 presents the pHVI versus the number of design evaluations. Due to the shape of the parameterized Pareto frontier, P3GA could have a slight advantage over MO-EPO. However, the results relative to both the number of design evaluations and the computational time show MO-EPO as the more efficient method. Discussion in Ref. [26] reveals that the overhead time of MO-EPO is increased on this problem. This is a result of the smaller hypercone angle that, although necessary to correctly identify the true parametric Pareto frontier, increases the computational overhead. Reference [26] briefly discusses two ways that the HCH-based pHVI method can be implemented efficiently by reusing dominance information from one iteration of an evolutionary algorithm to the next. In this case, a smaller hypercone angle results in fewer dominated points, ultimately increasing the cost of computing the pHVI value used for expected improvement estimation. In this example, relatively cheap high-fidelity functions (<1 min per evaluation) could be more easily optimized using a method like P3GA instead of MO-EPO. As the design evaluations become more expensive, MO-EPO becomes the better choice.
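The general idea of reusing dominance information between iterations can be sketched with an incrementally maintained nondominated archive, in which only newly evaluated points are compared against the archive rather than re-sorting the full population. Plain Pareto dominance (minimization) is used below as a stand-in for the parameterized dominance actually employed by the HCH-based pHVI.

```python
import numpy as np

def update_archive(archive, new_points):
    """Incrementally maintain a nondominated archive (minimization).

    Dominance conclusions from earlier iterations are reused: each new
    point is checked against the current archive only, instead of
    recomputing dominance over all points ever evaluated.
    """
    for p in new_points:
        p = np.asarray(p, float)
        # discard p if any archived point dominates it
        if any(np.all(a <= p) and np.any(a < p) for a in archive):
            continue
        # drop archived points that p dominates, then keep p
        archive = [a for a in archive if not (np.all(p <= a) and np.any(p < a))]
        archive.append(p)
    return archive
```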

## 6 Engineering Case Study

This section demonstrates the use of the novel MO-EPO on the optimization of a morphing airfoil from prior work [6]. Results from MO-EPO are compared to those of P3GA considering solution quality (via pHVI), computational effort, and the total number of expensive analyses. The computational effort considered is the combination of algorithmic overhead and time spent on expensive analyses. To allow for a side-by-side comparison with previous studies, the single-objective problem is solved with MO-EPO and compared to the solution from P3GA [6]. However, since MO-EPO is fundamentally a *multi-objective* algorithm, an additional objective is also considered. This multi-objective problem is solved using both P3GA and MO-EPO. Both problem formulations and algorithmic information are presented in detail in Ref. [26].

Unlike the previous section, the pHVI of the true solution is not known for this engineering case study. Therefore, absolute pHVI values are presented instead of fractions of a true value. Note that the pHVI value denotes the fraction of the reference space dominated by the data (cf. Refs. [26,33]); a value of exactly 1 is not attainable.

For the problem with a single objective and three parameters, MO-EPO converges after only 518 function evaluations across 47 iterations, requiring about 2 h of computational time. Figure 8 shows the pHVI as a function of time and design evaluations for both P3GA and MO-EPO. It is clear that MO-EPO is capable of finding a better solution than P3GA given a short time constraint. However, given more time for analysis, P3GA is capable of finding a better solution (at a larger computational cost). Figure 9 illustrates lower dimensional subspaces of the full, four-dimensional attribute space (one objective and three parameters). The parametric frontiers are generated from the full 50,000 designs evaluated by P3GA compared with the 518 designs evaluated by MO-EPO. It is clear that MO-EPO is able to gather much information about the solution to the problem with only a fraction of the time and analysis required by P3GA.

For the problem with two objectives and three parameters, P3GA is run with a smaller population size to improve results. Figure 10 presents the solution quality for both methods as a function of time and the number of design evaluations. These results show the same trend as the single-objective formulation: whether considering computational time or the number of design evaluations, MO-EPO finds a higher quality solution faster than P3GA. In this figure, the overhead time of MO-EPO is visible because the computational cost of the objective functions is on the order of seconds. Although MO-EPO performs only half as many design evaluations as P3GA, the computational time of MO-EPO is nearly three times that of P3GA. In problems with more computationally expensive functions, MO-EPO's overhead would account for a smaller percentage of the overall time. The results indicate, however, that this algorithmic overhead is worth the expense, as MO-EPO is still able to outperform P3GA on this problem.

Figure 11 illustrates three subspaces of the five-dimensional parameterized Pareto frontier (two objectives plus three parameters).^{4} In each of these parameter spaces, MO-EPO finds designs with larger lift-to-drag ratios than P3GA. In the short time that each algorithm was allowed to run, P3GA was only able to find designs with low moment coefficient, whereas MO-EPO found a larger variety of designs, giving a designer tradeoff information between maximizing lift-to-drag ratio and minimizing moment coefficient.

## 7 Conclusions

Parametric optimization has been shown to benefit engineering design in addition to other contexts. For design problems with expensive analyses, efficient parametric optimization algorithms are shown to find better solutions within a limited analysis budget. Despite the promise of the single-objective EPO algorithm, a parametric extension of EGO is limited because some assumptions that are valid for single-objective, nonparametric optimization do not lend themselves to general parametric optimization.

Therefore, the MO-EPO algorithm is developed to be more general in handling both parameters and objectives. This method hinges upon the relatively new parametric hypervolume indicator (pHVI). MO-EPO follows the same overarching procedure as EPO and EGO, with some important internal differences. The overall procedure of training a response surface model of the expensive functions, finding and evaluating a design that is expected to improve the solution and/or model, and repeating is common to all three algorithms. Where MO-EPO differs is in how it finds a design with a high expectation of improving the solution and/or model. By using the pHVI metric as the basis for “improvement,” MO-EPO can accommodate any number of objectives and parameters, providing a more meaningful metric than the ad hoc approach from EPO. Therefore, MO-EPO seeks the parametric solution maximizing the pHVI value.
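The shared outer loop might be sketched as follows, with `evaluate`, `fit_surrogates`, and `maximize_ei` as placeholder callables supplied by the user; in MO-EPO itself, `maximize_ei` would seek the design maximizing the expected pHVI improvement (cf. Refs. [17,26]). This is a structural sketch, not the published implementation.

```python
import numpy as np

def mo_epo_loop(evaluate, fit_surrogates, maximize_ei, x_init, max_evals):
    """Skeleton of the EGO/EPO/MO-EPO outer loop (illustrative only).

    evaluate(x)            -> runs all expensive objective/parameter functions
    fit_surrogates(X, Y)   -> returns surrogate models of those functions
    maximize_ei(models, X, Y) -> returns (x_next, ei), the design with the
                                 highest expected improvement and its value

    Stops when the evaluation budget is exhausted or no point with
    positive expected improvement can be found.
    """
    X = [np.asarray(x, float) for x in x_init]
    Y = [evaluate(x) for x in X]
    while len(X) < max_evals:
        models = fit_surrogates(X, Y)
        x_next, ei = maximize_ei(models, X, Y)
        if ei <= 0.0:          # no expected improvement: stop early
            break
        X.append(np.asarray(x_next, float))
        Y.append(evaluate(x_next))
    return X, Y
```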

The novel MO-EPO algorithm is demonstrated on a number of analytical benchmarking problems and two variants of an engineering case study. In all cases, MO-EPO is shown to be as good as or better than the existing P3GA when the number of design evaluations is limited (based on computational resources, time, and/or monetary funds). This finding is promising for engineering design where analysis can become costly in terms of both money and time.

There are multiple avenues for future work. The scalability of the MO-EPO algorithm with respect to the number of optimization variables should be investigated. The scalability of the algorithm with respect to the number of expensive objectives and parameters should also be considered to a higher degree. More parametric test problems should be developed with new or different properties to further evaluate MO-EPO, P3GA, and any future parametric algorithms. Finally, metrics other than the pHVI should be developed and considered to improve parametric algorithms and the detection of solution quality. Although the pHVI has some desirable characteristics (cf. Sec. 3), there are many existing nonparametric metrics for multi-objective performance—each with characteristics that may be desirable in different contexts [43]. Candidates for new parametric metrics include extensions of the $\epsilon$-indicator [44], the G-metric [45], and the performance comparison indicator (PCI) [46].

## Footnotes

^{2} All calculations are performed on the same machine: an Intel(R) Core(TM) i7-9700 CPU at 3.00 GHz with 32 GB of RAM.

^{3} Independent parameters are always cheap by definition, but dependent parameters could be cheap or expensive.

^{4} To plot scattered data in each subspace, data within ±1 deg of the printed *α* value, ±10 m/s of *V*, and ±5000 ft of *H* are projected into the given space.

## Acknowledgment

This work is supported by the NASA *University Leadership Initiative* (ULI) program under federal award number NNX17AJ96A, titled Adaptive Aerostructures for Revolutionary Civil Supersonic Transportation.

## Conflict of Interest

There are no conflicts of interest.

## Data Availability Statement

The datasets generated and supporting the findings of this article are available from the corresponding author upon reasonable request.

## Nomenclature

**x** = optimization variable vector

**J** = objective function vector

*θ* = parameter vector

- EGO = efficient global optimization
- EPO = efficient parametric optimization
- HCH = hypercone heuristic
- HVI = hypervolume indicator
- MO-EPO = multi-objective efficient parametric optimization
- pHVI = parametric hypervolume indicator
- PPD = parameterized Pareto dominance
- P3GA = predictive parameterized Pareto genetic algorithm
- SVDD = support vector domain description

## References

Pistikopoulos, E. N., Georgiadis, M. C., and Dua, V., eds., *Multi-Parametric Programming: Theory, Algorithms, and Applications*, Process Systems Engineering, Vol. 1, Wiley-VCH, Weinheim.
