Shapley effects for sensitivity analysis with dependent inputs: bootstrap and kriging-based algorithms

In global sensitivity analysis, the well known Sobol' sensitivity indices aim to quantify how the variance in the output of a mathematical model can be apportioned to the different variances of its input random variables. These indices are based on the functional variance decomposition and their interpretation become difficult in the presence of statistical dependence between the inputs. However, as there is dependence in many application studies, that enhances the development of interpretable sensitivity indices. Recently, the Shapley values developed in the field of cooperative games theory have been connected to global sensitivity analysis and present good properties in the presence of dependencies. Nevertheless, the available estimation methods don't always provide confidence intervals and require a large number of model evaluation. In this paper, we implement a bootstrap sampling in the existing algorithms to estimate confidence intervals of the indice estimations. We also proposed to consider a metamodel in substitution of a costly numerical model. The estimation error from the Monte-Carlo sampling is combined with the metamodel error in order to have confidence intervals on the Shapley effects. Besides, we compare for different examples with dependent random variables the results of the Shapley effects with existing extensions of the Sobol' indices.


Introduction
In the last decades, computational models have been increasingly used to approximate physical phenomenons. The steady improvement of computational means, lead to the use of very complex numerical codes involving an increasing number of parameters. In many situations, the model inputs are uncertain resulting in uncertain outputs. In this case it is necessary to understand the global impact of inputs uncertainties on the output to validate the computer code and use it properly. Sensitivity Analysis methods aim at solving this range of issues by characterizing input-output relationships of computer codes.
Within Sensitivity Analysis, three kinds of methods can be distinguished. First, Screening methods aim to discriminate influential inputs from non influential ones, especially when the inputs are numerous and the problem should be simplified. Secondly, local methods, based on partial derivatives, are used to assess the influence of input variables for small perturbations. Finally, Global Sensitivity Analysis (GSA) methods aim at ranking input random variables according to their importance in the output uncertainty, or even quantify the global influence of a particular input on the output. In this paper we are specifically interested in global sensitivity analysis. One can refer to Iooss and Lemaître (2015) for a comprehensive review of sensitivity analysis methods.
Among GSA methods, variance-based approaches are a class of probabilistic ones that measure the part of variance of the model output which is due to the variance of a particular input. Theses methods were popularized by Sobol (1993) who introduced the well known first order Sobol' indices. Shortly after, the total Sobol' indices have been introduced by Homma and Saltelli (1996) also taking advantage of Jansen et al. (1994). These sensitivity indices are based on the functional ANalyse Of VAriance (ANOVA) which is unique only making the assumption of independence between the input random variables. However, this hypothesis is sometimes not verified in practice making their interpretation much harder. Several works have been carried to deal with this difficulty and extend Sobol' indices to the case of a stochastic dependence between the input variables, as Chastaing et al. (2012); Mara and Tarantola (2012); Mara et al. (2015); Kucherenko et al. (2012). But the practical estimation of these sensitivity measures and their interpretation remain difficult.
Recently, Owen (2014) established a relation between the Shapley values (Shapley and Shubik, 1954) coming from the field of game theory and Sobol' indices. Song et al. (2016) proposed an algorithm to estimate these indices. Some studies also highlighted the potential of this kind of index in the case of correlated input, as Owen and Prieur (2017); Iooss and Prieur (2017). In this last case the Shapley effects can be a good alternative to the existing extensions of Sobol' indices mentioned above. Indeed, Shapley effects allows an apportionment of the interaction and dependences contributions between the input involved, making them condensed and easy-to-interpret indices.
Most estimation procedures of the Sobol' indices and Shapley effects are based on Monte-Carlo sampling. These methods require large sample sizes in order to have a sufficiently low estimation error. When dealing with costly computational models, a precise estimation of these indices can be difficult to achieve or even unfeasible. Therefore, the use of a surrogate model (or metamodel) instead of the actual one can be a good alternative and dramatically decrease the computational cost of the estimation. Various kinds of surrogate model exists in the literature, such as Fang et al. (2005). In this paper we get interested in the use of kriging as metamodels (see for example Martin and Simpson (2004)). The approach developed here is based on Le Gratiet et al. (2014) who provides an estimation algorithm of Sobol' indices using kriging models which allows computing the meta-model and Monte-Carlo errors.
In this paper, we draw a comparison between the Shapley effects and the independent and full Sobol' indices defined in Mara et al. (2015). We also establish an extension of the Shapley estimation algorithm proposed in Song et al. (2016) by implementing a bootstrap sampling to catch the Monte-Carlo error. Inspired by the work of Le Gratiet et al. (2014), we used a kriging model in substitution of the true model for the estimation of these indices. Thus, the kriging model error is associated to the Monte-Carlo error in order to correctly catch the overall estimation error.
The paper's outline is as follow: Section 2 recalls the basic concept of Sobol' indices in the independent and dependent configuration; Section 3 introduces the Shapley values and their links with sensitivity analysis; Section 4 theoretically compares the Sobol' indices and the Shapley effects for two toy examples; Section 5 studies the quality of the estimated Shapley effects and their confidence intervals; Section 6 introduces the kriging model and how the kriging and Monte-Carlo errors can be separated from the overall error; Section 7 compares the indice performances using a kriging model on two toy examples; finally, Section 8 synthesizes this work and suggests some perspectives.

Sobol' sensitivity indices 2.1 Sobol' indices with independent inputs
Consider a model Y = η(X) with d random inputs denoted by X D = {X 1 , X 2 , . . . , X d }, where D = {1, 2, . . . , d}, and X J indicates the vector of inputs corresponding to the index set J ⊆ D. η : R d → R is a deterministic squared integrable function and Y ∈ R the model output random variable. The random vector X follows a distribution p X and we suppose, in this section, that p X follows a d-dimensional uniform distribution U([0, 1] d ). However, these results can be extended to any marginal distributions. In particular, all inputs are independent and the distribution of X is only defined by its margins. (1948), also known as high dimensional model representation (HDMR) (Li et al., 2001), allows writing η(X) in the following way:

The Hoeffding decomposition introduced in Hoeffding
for some η ∅ , η i , . . . , η 1,...,d set of functions. In this formula, η is decomposed into 2 d terms such as η ∅ is a constant and the other terms are square integrable functions.
The decomposition (1) is not unique due to the infinite possible choice for these terms. The unicity condition is granted by the following orthogonality constraint: where 1 ≤ i 1 < i 2 < · · · < i s ≤ d and i w ∈ {i 1 , i 2 , . . . , i s }. The consequence of this condition is that the terms of (1) are orthogonal to one another. This property implies the independence of the random variables X i in the stochastic configuration and allow to obtain the following expressions for the functions η i 1 ,i 2 ,...,is of (1) : where X ∼i = X D\{i} and similarly for higher orders. Thus, the functions {η i } d i=1 are the main effects, the η i,j for i < j = 1, . . . , d are the second-order interaction effects, and so on.
The representation (1) leads to the functional ANAlyse Of VAriance (ANOVA) which decompose the global variance into a sum of partial variances such as The so-called Sobol' indices (Sobol, 1993) can be derived from (6) by dividing both sides with Var(Y ). This operation results in the following property: where S i is a first-order sensitivity index, S ij is a second-order sensitivity index and so on. Thus, sensitivity indices are defined as The first-order index S i measures the part of variance of the model output that is due to the variable X i , the second-order S ij measure the part of variance of the model output that is due to the interaction of X i and X j and so on for higher interaction orders.
Another popular variance based coefficient, called total Sobol' index by Homma and Saltelli (1996), gathers the first-order effect of a variable with all its interactions. This index is defined by The property (7) does not always hold for the total indices as summing total indices for all variables introduces redundant interactions terms appearing only once in (7). Thus, in most cases d i ST i ≥ 1. Note that both the first order and total Sobol' indices are normalized measures.
As mentioned in the introduction (6) holds only if the random variables are independent. Different approaches exist to treat the case of dependent input and one of them is explained in Section 2.2.

Sobol' indices with dependent inputs
In this section, we suppose that X ∼ p X with correlations between the components. Thanks to the Rosenblatt Transformation (RT) (Rosenblatt, 1952), it is possible to transform X ∼ p X into an uniform and independent random vector U ∼ U d (0, 1). However, due to the possible permutations of elements of X, this transformation is not unique and has d! possibilities. Note that in this procedure, only the d RT obtained after left circularly reordering the elements (X 1 , . . . , X d ) are considered. We denote as It is important to note that this RT corresponds to a particular ordering i. Changing the ordering results in another RT. Such a mapping is bijective and we can consider a function g i such as Y = η(X) = g i (U i ). Because the elements of U i are independent, the ANOVA decomposition is unique and can be established to compute sensitivity indices. Thus, we can write where g ∅ = E[g i (U i )]. Because the summands in (11) are orthogonal, the variance based decomposition can be derived, such that and so on for higher orders. The so-called Sobol' indices are defined by dividing (12) with the total variance Var(Y ) such that, We also consider the total sensitivity indice introduced by Saltelli (2002) which is the overall contribution of U i k on the model output including the marginal and interaction effects. They can be written as where U i ∼k is the vector U i not containing U i k . We refer to Iooss and Lemaître (2015) for a review on the sensitivity indices and their properties. Mara et al. (2015) derived equations (13) and (14), to introduce two types of indices which deal with correlated variables: • the full Sobol' indices which describe the influence of a variable including its dependencies with other variables, • the independent Sobol' indices which describe the influence of variables without its dependencies with other variables.
For a given ordering (X i , X i+1 , . . . , X d , X 1 , . . . , X i−1 ), the joint density of X can be written as From this ordering and a given RT, the full and independent Sobol' indices can be described by with the convention that U d+1 = U 1 .
Thanks to RT, we can also define the sensitivity indices of (X i |X u ), i = 1, . . . , d and u ⊂ D\{i}, u = ∅ via U i u which represent the effect of X i without its mutual dependent contribution with X u . These indices can be estimated with a Monte Carlo algorithm and the procedure is describe in the next section.

Estimation
The estimation of (S i , ST i , S ind i−1 , ST ind i−1 ) can be done with four samples using the "pick and freeze" strategy (see Saltelli et al. (2010)). The procedure is divided in two steps: • -Two independent sampling matrices A and B of size N × d are created from U(0, 1) d -B (1) : all columns from B except the 1-th (d-th) column which is from A • Compute the indices with a given estimator: where g i 0 is the estimate of the mean and V = 1 This procedure considers the estimator from Janon et al. (2014) and the overall cost is 4dN with N the number of samples. However, another estimator can be used to estimate the indices. See Saltelli et al. (2010) for a review of various estimators of sensitivity indices.

Shapley effects
The purpose of the Sobol' indices is to decompose Var(Y ) and allocate it to each subset J whereas the Shapley effects decompose Var(Y ) and allocate it to each input X i . This difference allows to consider any variables regardless of their dependence with other inputs.

Definition
One of the main issues in cooperative games theory is to define a relevant way to allocate the earnings between players. A fair share of earnings of a d players coalition has been proposed in Shapley (1953). Formally, in Song et al. (2016) a d-player game with the set of players D = {1, 2, . . . , d} is defined as a real-valued function that maps a subset of D to its corresponding cost, i.e., c : 2 D → R with c(∅) = 0. Hence, c(J ) represents the cost that arises when the players in the subset J of D participate in the game. The Shapley value of player i with respect to c(·) is defined as where |J | indicates the size of J . In other words, v i is the incremental cost of including player i in set J averaged over all sets J ⊆ D {i}.
This formula can be transposed to the field of global sensitivity analysis (Owen, 2014) if we consider the set of inputs of η(·) as the set of players D. We then need to define a c(·) function such that for J ⊆ D, c(J ) measures the part of variance of Y caused by the uncertainty of the inputs in J . To this aim, we want a cost function that verifies c(∅) = 0 and c(D) = Var(Y ). However, for some reasons described at the end of the section 3.1 of the article Song et al. (2016), about the estimation of these two cost functions, it is better to define the Shapley effect of the i-th input, Sh i , as the Shapley value obtained by applying the cost function c instead ofc. We denote in the sequel the Shapley effect by Sh i and a generic Shapley value by v i . A valuable property of the Shapley effects defined in this way is that they are non-negative and they sum to one. Each one can therefore be interpreted as a measure of the part of the variance of Y related to the i-th input of η.

Estimation of the Shapley effects
An issue with the Shapley value is its computational complexity as all possible subsets of the players need to be considered. Castro et al. (2009) proposed an estimation method based on an alternative definition of the Shapley value.
Indeed, the Shapley value can also be expressed in terms of all possible permutations of the players. Let us denote by Π(D) the set of all possible permutations with player set D. Given a permutation π ∈ Π(D), define the set P i (π) as the players that precede player i in π. Thus, the Shapley value can be rewritten in the following way : From this formula, Castro et al. (2009) proposed to estimate v i with v i by drawing randomly m permutations in Π(D) and thus we have : ) and c(·) the cost function.
where m refers to the number of permutations. Song et al. (2016) proposed the following two algorithms whose we just give the main features: • The exact permutation method if d is small, one does all possible permutations between the inputs (i.e. m = d!); • The random permutation method which consists in randomly sampling m permutations of the inputs in Π(D).
For each iteration of this loop on the inputs' permutations, a conditional variance expectation must be computed. The cost C of these algorithms is the following Note that the full first-order Sobol' indices and the independent total Sobol' indices can be also estimated by applying these algorithms, each one during only one loop iteration.
Based on theoretical results, Song et al. (2016) recommends to fix parameters at the following values to obtain an accurate approximation of Shapley effects computationally affordable: • The exact permutation method: N o as large as possible and N i = 3; • The random permutation method: N o = 1, N i = 3 and m as large as possible.
The choice of N v is independent from these values and Iooss and Prieur (2017) have also illustrated the convergence of two numerical algorithms for estimating Shapley effects.

Confidence interval for the Shapley effects
In this part, we propose a methodology to compute confidence interval for the Shapley effects, which will allow us to quantify the Monte-Carlo error (sampling error).

Exact permutation method: bootstrap
Concerning this algorithm, we'll use the bias-corrected percentile method of the Bootstrap (Efron, 1981).
Let be θ(X 1 , . . . , X n ) an estimator of a unknown parameter θ, function of n independent and identically distributed observations of law F. In non-parametric Bootstrap, from a n-sample (x 1 , . . . , x n ), we compute θ(x 1 , . . . , x n ). After, we draw with replacement a bootstrap sample (x * 1 , . . . , x * n ) from the original sample (x 1 , . . . , x n ) and compute θ * = θ(x * 1 , . . . , x * n ). We repeat this procedure B times and obtain B bootstrap replications θ * 1 , . . . , θ * B which allows the estimate of the following confidence interval of level 1 − α for θ: where • Φ is the cdf of a standard normal distribution; • z α/2 percentile of level α/2 of N (0, 1); • G is the cdf of the bootstrap distribution for the estimator θ; This confidence interval has been justified in Efron (1981) when there exists an increasing transfor- for some constants z 0 ∈ R and σ > 0. In the sequel, we'll see in our examples that g(.) can be considered as identity.
Thus, we need independent observations to obtain this interval but in our case as there is conditioning in the Shapley effects (more exactly in the cost function), it's not possible. To overcome this problem and estimate correctly the cdf G(.), we make a bootstrap by bloc (on the N o blocs) in order to use independent observations and preserve the correlation within each one. This strategy allowed to develop the algorithm (1) in order to obtain the distribution of Sh i to calcule the confidence interval for Sh i . It's suitable to remark that confidence intervals for the Shapley effects can also be calculated from the Central Limit Theorem (CLT) on the outer loop (Monte Carlo sample of size No) as Iooss and Prieur (2017) did it. But, it was necessary to establish a method based on the Bootstrap in order to be able to design in the sequel an algorithm allowing to distinguish correctly the metamodel and Monte-Carlo errors.

Random permutation method: CLT
For the random permutation method, we have two options to calculate confidence intervals.
• The first one is to use the CLT like Iooss and Prieur (2017). Indeed, in Castro et al. (2009) the CLT gives us: Thus, by estimating σ by σ we have the following 1 − α asymptotic confidence interval for the Shapley effects : • The second one is we can estimate the confidence interval doing a bootstrap on the permutations. We describe in the algorithm (2) the procedure allowing to do that.  (27) ; Sample with replacement m permutations from the original sample and retrieve in y (2) those corresponding to drawn bootstrap permutations ;

10
Compute Sh i b thanks to the equation (27). ; 11 end 12 Compute confidence intervals for Sh i with 28.

Examples in Gaussian framework: analytical results and relations between indices
In this section, we compare and interpret the analytic results of the studied indices for two different Gaussian models: an interactive and a linear model. We study the variation of the indices by varying the correlation between the input random variables.

Interactive model with two inputs
Let us consider a purely interactive model with X ∼ N (0, Σ). We consider two cases: a model with independent variables and another with dependent variables. So we have the two following covariance matrices: Table 1: Sensitivity indices of independent and dependent Gaussian models

Shapley effects
From the definition of sensitivity indices, for j = 1, 2, we get for these models the results presented in Table 1. In the independent model, the independent and full first-order Sobol indices are null because there is no dependence and the inputs have not marginal contribution. Thus, the independent and full total Sobol indices represent the variability in the model which is due to interactions only. These ones are each equal to the variance model, i.e. each input is fully responsible of the model uncertainty, due to its interaction with the other variable. In contrast, the Shapley effects award fairly the interaction effect to each input, which is more logical. About the dependent model, S ind j = 0, j = 1, 2 are still null because the inputs have not uncorrolated marginal contribution. But now, S f ull j = 0, j = 1, 2 and represent marginal contribution due to the dependence. We see in these terms that the dependence effect (ρ 2 β 2 1 β 2 2 σ 2 1 σ 2 2 ) is counted two times in comparison with the total variance. Concerning the independent and full total Sobol indices, the interaction effect (β 2 1 β 2 2 σ 2 1 σ 2 2 ) of these indices is still allocated half in Shapley effects. Besides, for the full total Sobol indices, each term is equal to the variance model whereas the interaction and dependence effects are fairly distributed for the Shapley effects which sum to the total variance. This example supports the idea mentioned in Iooss and Prieur (2017) whereby a full Sobol index of an input comprises the effect of another input on which it is dependent. We can add that the model is independent or not, the phenomenon is similar for the interaction effect about the independent and full total Sobol indices of an input, i.e. these indices comprise the effect of another input on which the input is interacting.
In their article, Iooss and Prieur (2017) tell which goals of the SA settings defined in Saltelli and Tarantola (2002) and Saltelli et al. (2004) the four Sobol indices as well as the Shapley effects apply. According to them, a combined interpretation of the four Sobol indices would just allow to do the FP (Factor prioritization) setting. But we can add that these indices allow also to do the FF (Factor Fixing) setting only if a factor has both indices ST ind Var(Y ) = 0 and as the variance is always a positive function, that implies Var(Y |X ∼i ) = 0 and Var (Y | (X ∼i |X i )) = 0. Thus, Y can be expressed only as a function of X ∼i or X ∼i |X i ,i.e. X i has not impact on Y taking account the dependence or not.
About the Shapley effects, they would allow to do the VC (Variance Cutting) setting as the sum is equal to Var(Y ) and the FF setting. Sure enough, if Sh i = 0, then we have ∀J ⊆ D\{i}, Var Y |X −(J ∪{i}) = Var [Y |X −J ] and so express Y as a function of X −(J ∪{i}) equates to express Y as a function of X −J . Hence, X i is not an influential input in the model and can be fixed. The FP setting is not achieved according to them because of the fair distribution of the interaction and dependence effects in the indice. However, this share allocation makes the Shapley effects easier to interpret than the Sobol' indices and might be a great alternative to the four Sobol' indices. Thus, in the sequel, we'll compare the Sobol indices' and the Shapley effects on an basic examples to see if they make correctly the factor prioritization.

Linear model with three inputs
Let us consider with the constants β 0 ∈ R, β ∈ R 3 and X ∼ N (0, Σ) with the following covariance matrix : We obtained the following analytical results.
• For j = 1, 2, 3, from the definition of independent Sobol indices, we have: • For j = 1, 2, 3, from the definition of full Sobol indices, we have: In both cases, full and independent Sobol indices, the first order index is equal to the total order index because the model is linear, i.e., there is no interaction between the inputs.
• For j = 1, 2, 3, in this example we obtain the following decomposition for the Shapley effects : So, for the linear Gaussian model we found a relation between the Shapley effects and the sensitivity indices obtained with the RT method. For more details about the calculation of Shapley effects, we refer the readers to the Appendix A.1. About the results, as the formula is similar regardless the input, we analyse it with the first input. We observe that the Shapley effect Sh 1 is in some way the average of all possible contributions of the input X 1 in the model. Indeed, S f ull 1 represents the full marginal contribution of X 1 . Then, we have the total contributions of X 1 without its correlative contribution with each element of the set D = {1, 2, 3}\{1} = {2, 3}. Sure enough, ST 1|2 is the total contribution of X 1 without its correlative contribution with X 2 , i.e. ones just look at the total effect with its dependence with X 3 ; ST 1|3 is the total contribution of X 1 without its correlative contribution with X 3 , i.e. ones just look at the total effect with its dependence with X 2 and finally the uncorrelated total contribution of X 1 via ST ind 1 = ST 1|2,3 . As in {2, 3}, there are two elements of size one, we find the coefficients 1/2 before ST 1|2 and ST 1|3 and 1 for ST ind 1 . We then find the fair allocation of the Shapley effects.

Particular cases
Now, we'll consider several particular cases of correlation in order to compare the prioritization obtained with the Sobol' indices and the Shapley effects. We'll take in the following examples β 0 = 0 ; β 1 = β 2 = β 3 = 1 and σ 1 = 0.2, σ 2 = 0.6, σ 3 = 1. By making this choice, we define implicitly the most influential variables and we want to observe how the correlation affects the indices. Besides, for each considered case, we verify that the covariance matrix is positive definite.   Iooss and Prieur (2017) and thus, all the indices carry out to the same ranking of the inputs.
In the second configuration with the symmetric case, we remark a decrease of the independent Sobol indices and an increase of the full Sobol indices with respect to the independent model (α = ρ = γ = 0). As regards of the Shapley effects, it reduces for the third input, raises slightly for the second input and significantly for the first input. All these changes are due to the mutualization of uncertainties within the inputs because of the correlation but the individual contributions of the inputs are still well captured for all the indices. Indeed, in spite of the correlation, all the indices indicate the same ranking for the inputs.
In this last configuration, we have strongly correlated a non-influential variable (X 1 has a low variance) in the model with two very influential variables. The independent Sobol' indices give us as ranking: X 3 , X 2 , X 1 . However, as the values of these indices are close to zero, we can suppose they are not significant and implicitly the ranking too. We obtain with the full indices the following ranking X 1 , X 3 , X 2 . X 1 is supposed to be a non-influential variable and turns out to explain 95% of the model variance. Which is logical because being highly correlated with X 2 and X 3 , X 1 has a strong impact on these variables. Then, X 2 and X 3 are correlated in the same way with X 1 and weakly between them. Regardless of the correlations, X 3 is more influential than X 2 in the model, hence this second position taking account the correlation. Lastly, we obtain the same ranking as the full Sobol' indices with the Shapley effects. FP (Factors Prioritization) setting aims to find which factors would allow to have the largest expected reduction in the variance of the model output. Thus, if we follow the previous ranking, we should reduce the uncertainty on the first input. But we'll make several tests by reducing the uncertainty of 20% one by one on each input and we get: Model variance σ 1 = 0.2, σ 2 = 0.6, σ 3 = 1 2.06 σ 1 = 0.16, σ 2 = 0.6, σ 3 = 1 1.95 σ 1 = 0.2, σ 2 = 0.48, σ 3 = 1 1.86 σ 1 = 0.2, σ 2 = 0.6, σ 3 = 0.8 1.60 Table 3: Model variance by reducing the uncertainty on each input one by one It is clearly observed that the largest expected reduction in the variance is obtained with the third input. These results conflict the obtained ranking with the full Sobol indices and the Shapley effects. Indeed, X 1 is an influential input only because of the strong correlation with X 2 and X 3 , and these indices capture this trend. However, without this correlation X 1 is non-influential input and the independent Sobol indices are supposed to highlight meaningfully that these are the inputs X 2 and X 3 which are the most influential without take account the correlation. Nevertheless, these indices struggle to emphasize the uncorrelated marginal contributions of these inputs due to the small values we obtain.
Thus, on this basic example we can see that the combined interpretation of the four Sobol indices as well as the Shapley effects doesn't allow to answer correctly to the purpose of the Factor Prioritization (FP) setting, i.e. on which inputs the reduction of uncertainty leads to the largest reduction of the output uncertainty. We can make a factor prioritization with these indices but not for the goal defined at the outset.

Numerical studies
Optimal values for the parameters of the exact and random permutation methods were given by Song et al. (2016). Using a toy example, we empirically study how the algorithm settings can influence the estimation of the indices. We compare the accuracy of the estimations of the Sobol' indices from the Shapley algorithm and the RT method.

Parametrization of the Shapley algorithms
As defined in Section 3.2, three parameters of the Shapley algorithm govern the estimation accuracy: N v , N o and N i . The first one, is the sample-size for the output variance estimation of Y . The second, is the number of outer loop for the sample-size of the expectation and the third one is the number of inner loop which controls the sample-size for the variance estimation of each conditioned distribution. Therefore, we empirically show the influence of N o and N i on the estimation error and the coverage probability. The Probability Of Coverage (POC) is defined as the probability to have the true indice value inside the confidence intervals of the estimation. We consider the three dimensional linear Gaussian model of Section 4.2 as a toy example with independent inputs, β 1 = β 2 = β 3 = 1, σ 1 = σ 2 = 1 and σ 3 = 2. The POC is estimated with 100 independent algorithm runs and for a 90 % confidence interval. When the bootstrap procedure is considered, the confidence intervals are estimated with 500 bootstrap sampling. We also set a large value of N v = 10000 for all the experiments.

Theses variances are estimated through Monte-Carlo procedures. The output variance Var[Y ] is computed from a sample
First experiments aim to show the influence of N o on the estimation accuracy and the POC for the exact permutation algorithm. The Figure 1 shows the variation of the POC (solid lines) and the absolute error (dashed lines), averaged over the three indices, in function of the product N o N i m, where only N o is varying and for three values of N i at 3, 9 and 18. Because the errors are computed for 100 independent runs, we show in color areas the 95% quantiles.
We observe that the estimation error is similar for the three different values of N i and decrease to 0 at the same rate. The true difference is for the POC which tends, at different rates, to the true probability: 90 %. For a same computation budget N o N i m, the smaller the value of N i and the larger the value of N o . Thus, these results show that, in order to have a correct confidence interval it is more important to have a large value of N o instead of N i . Indeed, exploring multiple conditioned variances with a lower precision (large N o and low N i ) is more important than having less conditioned variances with a good precision (low N o and large N i ). As for the exact permutation algorithm, we can see that the estimation errors are similar for the three values of N i and the difference is shown for the POC. We observe that the lower N i and the faster the POC converges to the true probability. Indeed, for a same computational cost, the lower N i and the larger the number of permutations m can be.
To show the influence of N o with the random permutation algorithm, the Figure 3 is the same as Figure 2 but with N o = 3. We observe that the convergence rates of the POC are slower than the ones for N o = 1. Thus, it shows that having a lower value of N o and a large value of m is more important to have consistent confidence intervals.
From these experiments, we can conclude that the parametrization does not significantly influence the estimation error but has a strong influence on the POC. Moreover, these experiments were established on different toy examples (Ishigami model defined in Section 7.2 and interactive model) and the same conclusion arises. Therefore, in order to have consistent confidence intervals, we can suggest: • for the exact algorithm to consider N i = 3 and to take N o as large as possible, • for the random permutation algorithm to consider N i = 3, N o = 1 and take m as large as possible.
This conclusion confirms the proposed parametrization of Song et al. (2016) explained in 3.2 and the suggestion analyzed in Iooss and Prieur (2017).

Minor bias observed
At the start of this section, we chose to establish these experiments for independent random variables. This choice was justified by unexpected results obtained for correlated variables. The Figure  4 illustrates the same experiment as Figure 1 but with a correlation of γ = 0.9 between X 2 and X 3 . We observed that the POC of the total Sobol' indice starts to tend to the true probability (at 90%) before slowly decreasing. Thus, it seems that the confidence intervals are underestimated or the indice estimation is biased. To verify this observation, Figure 5 shows the estimation of the total Sobol' indice for N v = 20000, N o = 10000, N i = 3 with the histogram from the bootstrap sampling in blue, the estimated indice ST i in red line and the true indice in green line. It is clear that the true value for X 2 and X 3 is outside of estimated distribution. This explains why the coverage probability is decreasing in 4. Moreover, this phenomenon only happens to the indices of X 2 and X 3 , which are correlated and it seems that this bias increases with the correlation strength for this example. Therefore, the reasons of this slight bias should be investigated in future works

Comparing Sobol' index estimation using Shapley algorithm and RT method
An interesting result of the Shapley algorithm, is that it gives the full first-order Sobol' indices and the independent total Sobol' indices in addition to the Shapley effects. We compare the estimation accuracy of the Sobol' indices obtained from the Shapley algorithm and the ones from the RT method. We consider the same example as Section 5.1 but with dependent random variables. In this section, only the pair X 2 -X 3 is correlated with parameter γ.
A first experiment aims to validate the confidence intervals estimated from the bootstrap sampling of the RT method by doing the same experiments as in Section 5.1 by increasing the sample-size N . The Figure 6 shows the absolute error and the POC with the computational budget (4 × N × d) for the full first-order Sobol' indices and the independent total Sobol' indices for γ = 0.5. As we can see the error tends to zero and the POC converges quickly to the true probability. Thus, we can see that the confidence intervals correctly catch the Monte-Carlo error.
We recall from Section 3 that the full first-order Sobol' indices are equivalent to the classical first-order Sobol' indices and the independent total indices are the classical total indices. The Figure  7 shows the estimated indices with γ = 0.5 from the Shapley algorithm and the RT method for similar computational costs. We observe that both algorithms seem to correctly estimate the Sobol' indices for a low computational cost. However, in this example, the estimation errors from the RT method is much larger than the ones from the Shapley algorithm.
We recall in Section 2.3 that RT method used the Janon estimator from Janon et al. (2014). The accuracy of the Sobol' estimator depends on the values of the target indices and the Janon estimator is less accurate for low value indices. Changing with another estimator, such as the one from Mara et al. (2015), can lead to another estimation variance as shown in Figure 8. We observed that the estimation errors from the RT method depends of the used estimator and this error is lower using estimator from Figure  The Figure 9 shows the Sobol' indices for the exact Shapley algorithm and the RT method in function of the correlation γ between X 2 and X 3 . The lines shows the true values of the indices and the areas are the 95% confidence intervals of the indices.
This experiment shows that the estimation of the Sobol' indices from the Shapley algorithm gives satisfying estimations of the first full and total ind Sobol' indices. Note that the error of estimation is similar for both the exact or random permutation algorithm if we consider the same computational budget.

Kriging metamodel with inclusion of errors
Shapley effects are a suitable tool for performing global sensitivity analysis. However, their estimates require an important number of simulations of the costly function η(x) and often cannot be processed under reasonable time constraint. To handle this problem, we useη(x) an approximating function of the numerical model under study η(x) (Fang et al., 2005). Its main advantage is obvioulsy to be much faster-to-calculate than the original one. In addition, if one uses a kriging method (Sacks et al., 1989) to build thisη(x) surrogate model, a quantification of the approximation uncertainty can be easily produced. The Shapley effects can then be calculated using the metamodelη(x) instead of η(x) with a control on the estimation error.
We present in this section a methodology for estimating the Shapley effects through a kriging surrogate model taking into account both the Monte Carlo error and the surrogate model error.

Introduction to the kriging model
Kriging, also called metamodeling by Gaussian process, is a method consisting in the use of an emulator of a costly computer code for which the interpolated values are modeled by a Gaussian process. More precisely, it is based on the assumption that the η(x) function is the realization of a Gaussian random process. The data is then used to infer characteristics of this process, allowing a joint modelization of the code itself and the uncertainty about the interpolation on the domain. In general, one assumes a particular parametric model for the mean function of the process and for its covariance. The parameters of these two functions are called "hyperparameters" and are estimated using the data. The Gaussian hypothesis then provides an explicit formula for the law of the process conditionaly to the value taken by η on a design of experiments D.
Thus, we consider that our expensive function η(x) can be modeled by a Gaussian process H(x) which's mean and variance are such that E[H(x)] = f (x)β and Cov (H(x), H(x)) = σ 2 r(x,x), where r(x,x) is the covariance kernel (or the correlation function) of the process.
Then, η(x) can be easily approximated by the conditional Gaussian process H n (x) having the predictive distribution [H(x)|H(D) = η n , σ 2 ] where η n are the known values of η(x) at points in the experimental design set D = {x 1 , . . . , x n } and σ 2 is the variance parameter. Therefore, we have where the mean m n (x) is given by ..,n , and The variance s 2 n (x,x) is given by The variance parameter σ 2 can be estimated with a restricted maximum likelihood method.

Kriging based Shapley effects and estimation
Inspired by the idea used in Le Gratiet et al. (2014) for the Sobol indices, we substitute the true function η(x) with H n (x) in (25) which leads to where the exact function Y = η(X) is replaced by the Gaussian process H n (X) in the cost function such as c n Therefore, if we denote by (Ω H , F H , P H ) the probability space where the Gaussian process H(x) lies, then the index Sh i n lies in (Ω H , H, P H ) (it is hence random). Then, for estimating Sh i n , we use the same estimator (27) developed by Song et al. (2016) in which we remplace Y by the Gaussian process H n (X) in the cost function to obtain : where c n is the Monte-Carlo estimator of c n .

Estimation of errors : Monte Carlo and surrogate model
The estimator (34) above integrates two sources of uncertainty : the first one is related to the metamodel approximation, and the second one is related to the Monte Carlo integration. So, in this part, we quantify both by decomposing the variance of Sh i n as follows : where Var H E X Sh i n |H n (x) is the contribution of the metamodel on the variability of Sh i n and is that of the Monte Carlo integration.
In section 4 of the article Le Gratiet et al. (2014), they proposed the algorithm (3) we adapted here to estimate each of these contributions. From this algorithm and some theoretical results, Le Gratiet et al. (2014) proposed estimators in section 4.2 to estimate each of these contributions.

Numerical simulations with kriging model
This section aims at estimating the studied indices using a surrogate model in substitution of the true and costly computational code. The previous section explained the theory behind the Gaussian processes to emulate a function. The Section 6.3 explained that the kriging error can be estimating through a large number of realization of the Gaussian Process in addition to the Monte-Carlo error estimated through a bootstrap sampling. In this section, we illustrate the decomposition of the overall error from the estimation of the indices and we consider as examples the additive Gaussian framework and the Ishigami function. We also consider the industrial application of introduced in Rupin et al. (2014) and also used in Iooss and Prieur (2017).

Gaussian framework
We use the same configuration as in the Section 5.3 with a correlation coefficient ρ = 0.7. To illustrate the influence of the kriging model in the estimation of the indices, we show in Figure 10 the distribution of the estimators of the indices with the procedure using the true function (top figure) and using the surrogate function (bottom figure). We took N v = 1000, N o = 100 and N i = 3 for the two graphics.
The kriging model is built with 10 points using a LHS sampling (at independence) and a Matern kernel with a linear basis, leading to a Q 2 of 0.90 and the kriging error is estimated with N H = 300 realizations. We intentionally took low values for the algorithm parameters in order to have a relatively high variance. If we compare the violinplots of the two figures, we observe that the variance of the estimation is larger for the kriging configuration. This is due to the additional error from the kriging model. The Figure 11 allows to distinguish which part the overall error is due to the kriging. We see immediately what the kriging error is larger than the Monte-Carlo error and it is normal that this error feeds through to the quality of the estimations as observed in Figure 10.

Ishigami Function
Introduced in Ishigami and Homma (1990), the Ishigami function is typically used as a benchmarking function for uncertainty and sensitivity analysis. It is interesting because it exhibits a strong nonlinearity and has interactions between variables. For any variable x = (x 1 , x 2 , x 3 ) ∈ [−π, π] 3 , the model function can be written as η(x) = sin(x 1 ) + 7 sin 2 (x 2 ) + 0.1x 4 3 sin(x 1 ). In this example, we consider that the random variable X follows a distribution p X with uniform margins U[−π, π] and a multivariate Gaussian copula c ρ with parameter ρ = (ρ 12 , ρ 13 , ρ 23 ). Thanks to the Sklar Theorem (Sklar, 1959), the multivariate cumulative distribution function F of X can be written as where F 1 , F 2 , F 3 are the marginal cumulative distribution functions of X. In the independent case, analytical full first order and independent total Sobol' indices are derived as well as the Shapley effects. Unfortunately, no analytical results are available for the other indices. Thus, we place in the sequel in the independent framework.
Remind that the main advantage of the metamodel is to be much faster-to-calculate than the original function. Thus, we can use this characteristic in order to decrease the Monte-Carlo error during the estimation of the indices by increasing the calculation budget.
In this example, the kriging model is built with 200 points using an optimized LHS sampling (at independence) and a Matern kernel with a linear basis, leading to a Q 2 of 0.98 and the kriging error will be estimated subsequently with N H = 300 realizations. To illustrate the influence of the kriging model in the estimation of the indices, we show in Figure 12  If we compare the violinplots of the two figures, we observe that the variance of the estimations is higher with the true function. For the true function, the uncertainty is only due to the Monte-Carlo estimation. For the surrogate function, as observed in Figure 13, in spite of a slight metamodel error, this same Monte-Carlo is obviously lower owing to a higher calculation budget. Hence, if the metamodel approximates correctly the true function, it is better to use it to estimate the sensitivity indices to gain accuracy on the distribution of the estimators.

Conclusion
Throughout this article, we studied the Shapley effects and the independent and full Sobol' indices defined in Mara and Tarantola (2012) for the models with a dependence structure on the input variables. The comparison between these indices revealed that • the full Sobol' index of an input includes the effect of another input on which it is dependent, • the independent and full total Sobol' indices of an input includes the effect of another input on which it is interacting, • the Shapley effects rationally allocate these different contributions for each input.
Each of these indices allows to answer certain objectives of the SA settings defined in Saltelli and Tarantola (2002) and Saltelli et al. (2004). But, it's important to pay attention about the FP setting. This one can be made with the Shapley effects but not for the goal defined at the outset, i.e. prioritize the input variables taking account the dependence but not to find which would allow to have the largest expected reduction in the variance of the model output. Always about the FP setting, it was declared in conclusion of our example that the combined interpretation of the four Sobol' indices doesn't allow to answer correctly to the purpose of the FP setting due to the small values that have been obtained for the independent Sobol' indices. However, although these values were close to zero, the ranking that they had provided was correct to make FP setting. Hence, it could be investigated whether these values are significant or not.
A relation between the Shapley effects and the Sobol' indices obtained from the RT method was found for the linear Gaussian model. It would be interesting to see if this relation could be extended to a general linear model in the first instance and subsequently if a overall relation can be established between these indices for a global model.
About the estimation procedure of the Shapley effects, a major contribution of this article is the implementation of a bootstrap sampling to estimate the Monte-Carlo error. The CLT can give confidence intervals but require large sample size in order to be consistent, which is rarely possible in practice for expensive computer codes. We confirmed that the parametrization of the Shapley algorithms proposed by Song et al. (2016) and analyzed by Iooss and Prieur (2017) is correct and optimal in order to have consistent confidence intervals. The numerical comparison of the Sobol' indices estimated from the Shapley algorithm and the RT method for a toy example showed that the estimations from the Shapley algorithm are a bit less accurate than the ones from the RT method, but are very satisfying for an algorithm that is not design for their estimation. A second contribution is the splitting of the metamodel and Monte-Carlo errors when using a kriging model to substitute the true model. The numerical results showed that for a reasonable number of evaluations of a kriging model, one can estimate the Shapley effects, as well as the Sobol' indices and still correctly catch estimation error due to the metamodel or the Monte-Carlo sampling. Unfortunately, the computational cost to generate a sample from a Gaussian Process realization increases significantly with the sample-size. Thus, because the Shapley algorithm becomes extremely costly in high dimension, the estimation of indices using this technique can be computationally difficult.
The Shapley algorithm from Song et al. (2016) is efficient, but is extremely costly in high dimension. The cost is mainly due to the estimation of the conditional variances. A valuable improvement of the algorithm would be the use of a Kernel estimation procedure in order to significantly reduce the number of evaluation. The Polynomial Chaos Expension are good to compute the Sobol' indices analytically from the polynomial coefficients (Crestaux et al., 2009). It would be interesting to have such a decomposition for the Shapley effects.