Front shape similarity measure for shape-oriented sensitivity analysis and data assimilation for Eikonal equation

. We present a shape-oriented data assimilation strategy suitable for front-tracking problems through the example of wildﬁre. The concept of “front” is used to model, at regional scales, the burning area delimitation that moves, undergoes shape and topological changes under heterogeneous orography, biomass fuel and micrometeorology. The simulation-observation discrepancies are represented using a front shape similarity measure deriving from image processing and based on the Chan-Vese contour ﬁtting functional. We show that consistent corrections of the front location and uncertain physical parameters can be obtained using this measure applied on a level-set ﬁre growth model solving for an eikonal equation. This study involves a Luenberger observer for state estimation, including a topological gradient term to track multiple fronts, and of a reduced-order Kalman ﬁlter for joint parameter estimation. We also highlight the need – prior to parameter estimation – for sensitivity analysis based on the same discrepancy measure, and for instance using polynomial chaos metamodels, to ensure a meaningful inverse solution is achieved. The performance of the shape-oriented data assimilation strategy is assessed on a synthetic conﬁguration subject to uncertainties in front initial position, near-surface wind magnitude and direction. The use of a robust front shape similarity measure paves the way toward the direct assimilation of infrared images and is a valuable asset in the perspective of data-driven wildﬁre modeling.


Introduction
Front-tracking problems arise in many applications, e.g. cardiac electrophysiology [15], tumor growth [36], seismic history matching [62], oil spill [34], precipitation forecasting [4], flaming combustion [44], wildfire behavior [52]. Such problems aim at tracking the contour (or "front") and the motion of an object, which is the indirect shape of, for instance, an electric field, a water saturation, a cloud, a temperature isovalue, etc. The object contour may be subject to strong shape deformations, its motion may be unsteady and present irregularities and topological changes can occur. Tracking these changes remains a challenging task from both modeling and observation viewpoint [42,64]. Simulating the time-evolving front shape requires as inputs many uncertain quantities such as the initial front position and physical parameters that are usually difficult to measure. The model formulation itself cannot perfectly describe a physical system and its evolution, because of uncertainties on the physical and numerical parameters as well as uncertainties related to the simplification and discretization of the physics and to the numerical methods. Observations are also subject to instrumental and representativeness errors that are enhanced when the target object becomes partially occluded.
To overcome these limitations, data assimilation is considered as a powerful approach. Data assimilation aims at integrating available observations into the front-tracking model, while accounting for their respective uncertainties, in order to infer more accurate front position estimates and to produce more reliable forecasts [48]. The estimation process can target the model input parameters and/or the model prognostic variables [39,57]; these estimated targets are referred to as the "control variables". The success of data assimilation lies in particular in the measure used to represent what we call the "innovation", i.e. the differences between the observations and the simulated counterparts from which the correction in the control space is derived. This measure shall be consistent with the physics of the problem to ensure optimal estimation. Classical data assimilation methods assume that errors are only of amplitude type by representing errors as additive Gaussian noise. In front-tracking problems, they perform poorly when the simulated contour is displaced from its observation. They compensate position errors by adjusting amplitudes, which can produce non-physical solutions and do not preserve coherent structures [4,7,21,22,46]. Our objective is therefore to design a shape-oriented data assimilation approach that addresses position errors, in which the innovation measures the observation-simulation distance in terms of shape discrepancies. This is particularly relevant in the context of data-driven wildfire spread modeling. In previous works [51][52][53][54]65] the observed fronts were treated as a discretized contour with a finite set of equally spaced markers. There, a distance to the simulated fronts was computed by pairing each observed marker with its closest neighbor along the simulated front. Through this selection procedure, the discrepancies between the observations and the simulated fronts (i.e. the innovations) were formulated as Euclidean distances that have the advantage of featuring Gaussian error statistics. However, one limitation of such distance computation is that the selection procedure may be difficult to operate for large scale wildfires with complex front topology with significant temporal changes due to heterogeneous environment and fire behavior. That is why, we introduce in this paper a new distance metric to design a more robust data assimilation algorithm. This avoids specific treatment of the front markers [52]. This also avoids the use of morphing that involves a complex registration mapping [7] and provides valuable spatial information compared to usual similarity scores [6,23]. To this end, our goal is to rely on a front shape similarity measure introduced in the context of data assimilation for cardiac electrophysiology [14][15][16]. This measure -initially envisioned for object detection in image processing theory -formulates a shape similarity measure inspired from the Mumford-Shah functional, in particular its Chan-Vese version [10,40]. In the present study, our first objective is to show that this front shape similarity measure can be generalized and successfully applied for level-set front-tracking simulator in the framework of data-driven wildfire spread modeling for both parameter estimation -using an Unscented Kalman Filter (UKF) -and state estimation -using a Luenberger observer (LO) [37,38]. Our second objective is to highlight how this similarity measure can be used in an uncertainty quantification process -using a Polynomial Chaos (PC) strategy [33,54,55,63] -to study the measurement sensitivity with respect to parameter changes, hence determine the influence of each control parameter on the variability of the front shape similarity measure. This is useful to ensure a meaningful feedback is achieved on the control parameters when performing joint state-parameter estimation.
Even though different types of surrogates are reported in the literature [32], we focus here our attention on PCbased surrogate that is built accounting for the error statistics on the uncertain input variables and that is thus well adapted to the data assimilation framework associated with parameter estimation and already envisioned for interface variability studies [45].

1.
Fire front-tracking problem: models and observations At regional scales (ranging from a few tens of meters up to several hectares), a wildfire is usually described as a 2-D front that self-propagates along its normal into the unburnt vegetation. The (fire) front separating the active flame areas from the unburnt vegetation propagates and deforms according to the environmental conditions such as local terrain orography, micrometeorology and biomass fuel conditions. The main quantity of interest is the propagating speed of the front commonly known as the "rate of spread" (ROS). This regionalscale viewpoint is adopted in current operational wildfire spread simulators, e.g. FARSITE [25]. These simulators require a parametric ROS formulation and a 2-D numerical scheme in order to account for wind and slope effects on fire growth. There is a wide variety of 2-D schemes, which can be sorted into two main types [9,27,47]: (1) Lagrangian type (i.e. tracking marker trajectories over time) using for instance asynchronous advection method based on discrete-event simulation [24] or Huygens-based ellipse expansion [49]; and (2) Eulerian type (i.e. tracking the burning area object over time) using for instance level-set methods [35,41,58,59]. In the present work, we focus on Eulerian front-tracking simulators. While beyond the scope of this work, our data assimilation strategy could be easily extended to Lagrangian fire growth models.

Fire growth model
Several ROS submodels are reported in the literature [61]. The most widely known is the model due to Rothermel [3,56] that evaluates the ROS in a pointwise manner given local orography, biomass and meteorological conditions: To represent the time-evolving burning active areas over the computational domain Ω ⊂ R 2 , we introduce a regular progress variable c ≡ c(x, t) as the fire front marker. The fire front is identified as the contour line c fr = 0.5. Therefore c < c fr corresponds to the unburnt vegetation, c > c fr corresponds to the burnt vegetation. We thus denote the active burning areas B c = {x = (x 1 , x 2 ) ∈ Ω | c(x, t) > c fr } and, when regular enough, the front line The progress variable c is used as the prognostic variable and is calculated as a solution of the eikonal equation with c 0 (x) the initial condition at time t 0 . The quantity ROS 2D is an adequate extension [1,12,35] on the whole domain Ω of the Rothermel-based ROS 1D (Eq. 1) calculated on the front Γ c . ROS 1D corresponds to a 1-D parametric function that is applied in 2-D cases by accounting for the local normal vector n fr ≡ n fr (x). Note that for solving Eq. (2), we follow the choices made by [47] using a second-order Runge-Kutta scheme for time-integration and a second-order total variation diminishing scheme with a Superbee slope limiter for spatial discretization [51].

Sources of model uncertainties
Despite the development of many wildfire-spread models, their use has been relatively limited operationally to respond to actively burning fires as they contain numerous uncertainties. Typically, the ROS submodel introduces epistemic uncertainties due to its limited domain of validity resulting from a calibration procedure based on laboratory-scale experiments [61]. The input parameters, especially the biomass moisture content M v and the near-surface wind velocity U w (see Eq. 1), are usually available with limited spatio-temporal resolution; they are subject to significant uncertainties that translate into uncertainties in the ROS predictions and thereby in the simulated front positions [17,18,31]. There is in general not much information on the fire ignition c 0 (x), so the initial position and shape of the fire are also part of the uncertainties in the wildfire problem.
Since this paper is methodology-oriented, we focus our attention on a test case with idealized conditions. We only consider the initial front condition c 0 and the near-surface wind velocity vector U w as sources of uncertainties; they are known to have the most significant impact on the fire front shape and position. Note that in the present work, the wind velocity vector is specified using two scalar parameters, the wind magnitude u w (u w ≡ U w ) and the wind direction angle d w such that U w = (u w sin d w , u w cos d w ) T . In the following, the front position uncertainties due to c 0 are addressed through state estimation. In practice, the uncertainties in the initial condition will be formulated in a parametric form with respect to the "center of mass" of the initial burning area, whose position is denoted by x ign = (x 1,ign , x 2,ign ). The front position uncertainties due to (u w , d w ) are addressed through parameter estimation. Control parameters are assumed uniform here. Our method extends easily to regular spatially distributed parameters -see for instance [65] -as soon as the spatial discretization of the parameters can be formulated with a reasonable number of degrees of freedom [11,37]. The parameters x ign and (u w , d w ) are considered as independent random variables, each random variable being characterized by a marginal probability density function (PDF) that is either Gaussian (Kalman-based data assimilation, Sec. 2.2.3) or uniform (sensitivity analysis, Sec. 3). A wide inventory of wildfire observations is presented in [28]; the most valuable data for data-driven wildfire spread modeling are the time-evolving fire front locations that can be derived from infrared remote sensing images [13,43,50]. Several data structures can be envisioned. The first data structure is a map z obs ≡ z obs (x, t) of the burnt and unburnt regions (see Fig. 1). The second data structure is a close-contour delimiting the burnt regions. In this case, a binary map z obs ≡ z obs (x, t) corresponding to the interior points of this 2-D front could be reconstructed. The third structure consists of a time-map t ign ≡ t ign (x) of the fire propagation. Then, a binary map can be defined by z obs ≡ z obs (x, t) = 1 t>tign (x).

Observations and Observing System Simulation Experiment (OSSE)
Moreover, we point out that our data z obs (·, t) could also be defined as a continuous map C([0, 1]) in order to represent the uncertain location of the front by a fuzzy set [8].
In the present observing system simulation experiment (OSSE) framework, observations are synthetically generated by integrating the fire growth model in Eq. (2) using reference (or "true") values of the uncertain parameters; the error in the observations is assumed to be small and the observation is then considered as the "target". The prior (or "background") values of the control variables are obtained by perturbing the true values. OSSE thus provides a verification framework to check the ability of the data assimilation algorithm to retrieve the target for varying uncertainty in the control variables and in the observations. In the present work, data assimilation aims at tracking the observed fire front when starting from a poor prior estimate by inferring wind parameters (u w , d w ) that are more consistent with the reference values and/or by directly correcting the simulated field c(x, t) to reduce uncertainties due to a wrong initial front position x ign .

2.
Front shape similarity measure adapted to front data assimilation

Principles of sequential data assimilation
In essence, sequential data assimilation approaches aim at correcting a model dynamics by the use of the current available data. Let us consider that our forward model noted M is given by Eq. (2). The target trajectory cannot be simulated precisely since the initial condition c 0 and some parameters θ entering in ROS 2D are uncertain. We aim at estimating a trajectory characterized by the dynamicsẏ = M(y, t), y(0) = y 0 , with y : t → (c(·, t), θ(t)) the "extended state" at a function of the time t. The observations are cast in the observation vector z obs ∈ Z, with Z the observation space equipped with its norm · Z . The observation operator H is defined as the map between the control space and the observation space such that z = H(y) denotes the simulated counterpart of the observations corresponding to the model state y, itself depending on the initial condition c 0 and on the parameters θ.
To begin with, let us consider a classical data assimilation approach. In this context, we typically assume that a similarity measure between the given observations z obs and the simulated counterparts z can be computed by a "least square" type data fitting functional of the form with a discrepancy operator D of the form D(z obs , y) = z obs − z = z obs − H(y). Then a sequential estimator y uses modified dynamics of the form where K is the sequential gain, in order to converge asymptotically to the target trajectory when assimilating noiseless data, namely y t→+∞ −−−−→ y; robustness property to noisy data is then obtained in a second step. Note that, as y(t) = ( c(·, t), θ(t)) introduces dynamics for the estimated parameters θ(t), the property θ t→+∞ −−−−→ θ allows to identify asymptotically the time invariant target parameters θ. The gain is in general of the form K = P dH * where dH denotes the differential of H and P is a positive operator [11], hence the estimator correction of the dynamics is of the form K(z obs − H(y)) = P dH * (z obs − H(y)) = −P ∇J data (y).
This formulation is natural as the term −∇J data (y) gives a direction of maximum decrease in the dissimilarity between the given observations and the simulation.

The Chan-Vese contour fitting functional
The classical approach presented in Sec. 2.1 suffers from a strong limitation for front-tracking problems. Indeed, it is particularly difficult to define a data fitting term based on a simple Euclidean distance. A first approach was performed in [52] using a Lagrangian front description based on marker position comparison. This Euclidean computation was efficient but limited to a given front topology and suffers from potential mismatches during marker registration. To be more general, we investigate a more complex data fitting functional derived from image active contour segmentation approaches. Following the work of [15], an adequate similarity between the target front and the simulated one can be computed from the Chan-Vese contour fitting functional [10,40,41].
The Chan-Vese contour fitting functional can be written as where φ is a level set associated with c, for instance φ = c − c fr , and where C min and C max are scalar coefficients defined by with The constant C 1 then corresponds to the mean of z obs on the region where c > c fr (i.e. φ > 0), namely the simulated burnt region. Respectively C 0 corresponds to the mean of z obs on the region where c < c fr (i.e. φ < 0), namely the simulated unburnt region.
In the following, we provide some elements of understanding about this criteria through some illustrated examples displayed in Figure 2. For instance, we consider the case where the burnt region -corresponding to a progress variable c > 0 -is included in the observed burnt region z obs = 1, see Figure 2-C. The constant C 1 is the mean value of z obs in the burnt region. Considering that z obs ≡ 1 in this region, we thus have C 1 = 1. By contrast, we have 0 ≤ C 0 ≤ 1 which is minimum when the contours Γ = {c = c fr } delimit exactly the burnt and unburnt regions observed through z obs , see Figure 2-A. In this case, C min = C 0 , C max = C 1 and J is null when the observed and simulated contours coincide. A second example -see Figure 2-E -comes when we consider a progress variable c with a burnt region completely disjoint from the observed burnt region. Then C 1 = 0 and 0 ≤ C 0 ≤ 1 with a maximum value of 1 when the observed and simulated contours coincide. In this last case, φ and z obs are in complete phase shift, hence the interest of introducing C min = C 1 and C max = C 0 in order for J to be maximum. We emphasize that introducing C min and C max is original here with respect to what is classically done with the Chan-Vese functional in image processing. Indeed, in its classical image segmentation use, there is no need to penalize a phase shift.
Having defined the discrepancy functional J , our objective is now to propose a sequential estimator based on a correction of the form (6) for the model dynamics and compatible with the data fitting functional of Eqs. (7)-(8)-(9).

A state observer
Let us first assume that all the parameters in θ are perfectly known. Only the initial condition c 0 is uncertain -due to uncertainty in x ign -and we thus focus on state estimation only. In [15], the bidomain eikonal equation was corrected based on the Chan-Vese contour fitting functional in Eq. (7). To do so, as envisioned in Eq. (6), [15] computed the shape gradient of the similarity measure, namely with δ the Dirac distribution playing the role of a localization operator, meaning that the gradient is concentrated on the front. This means that the variations of the criterion (7) can be diminished by acting on the front contour only. From the gradient computation in Eq. (10), [15] then proposed a correction of the form −λ sh ∇ sh J data with a spatially distributed scalar gain λ sh ≡ λ sh (x) (the subscript "sh" stands for "shape"). Following the same strategy, we propose as Luenberger state observer starting from a given inexact initial condition c 0 . The gain λ sh is a spatially distributed scalar parameter accounting for the local confidence in the observation z obs ; typically λ sh is proportional to the inverse of the local observation error covariance. Figure 2 shows -through some illustrated examples -how the simulated front contour will evolve -see the blue arrows -to minimize the discrepancy. The state observer given by Eq. (10) can deal with unobserved regions. Let us consider that z obs is only defined on Ω obs that is a restriction of Ω. We can replace Eq. (9) by and λ sh (x) = λ sh (x) 1 Ω obs (x) is then restricted to the observation region Ω obs .
The state observer is devoted to pursue the target front from the discrepancy sensitivity ∇ sh J data . However, this observer can only capture a number of fronts that is bounded by the number of initial fronts. However, we know that new ignitions can occur in wildfire problems, for instance by fire spotting [20], hence our need to capture new fronts during the estimation procedure. To address this difficulty, we propose to follow [16] and also compute the topological gradient of Eq. (7). A topological gradient, here given by typically computes the sensitivity of the discrepancy measure to new appearing fronts [29] (the subscript top stands for "topology"). From the sensitivity (13), we deduce -as in [16] -a data correction adapting the eikonal equation to new appearing fronts, see Figure 2-E for example. The resulting complete observer then reads where λ sh ≡ λ sh (x) is the gain associated with the shape gradient and where λ top ≡ λ top (x) is the gain associated with the topological gradient. The impact of this additional topological feedback is typically illustrated in Figure 2-E. Indeed in this case, the simulated front tends to collapse due to the shape-based feedback term (see the blue arrows). However, the topological observer can introduce a new front in the domain {z obs = 1}. Note that the theoretical study -even for unnoisy data -of the asymptotic convergence y t→+∞ −−−−→ y is beyond the scope of this methodology-oriented article. However, based on the results of [15] in a more complex context, we believe it is possible to demonstrate a weaker property, namely that the error y − y decreases for small initial errors.

Kalman-based filters and parameter estimation
An alternative to propose a sequential estimator correction is to rely on a Kalman-based update. For simplicity purpose, we shall describe the Kalman correction after space and time discretization of the initial model and observations. We denote by Y h and Z h the corresponding finite dimensional state space and observation space. In the following, we use bold letters to refer to the vectors of degrees of freedom associated with the discretized solutions; n refers to the time index and [n, n + 1] refers to a given assimilation time window. Over the window [n, n + 1], the control augmented state vector is noted y n+1 = (c n+1 , θ n+1 ). When considering a prior estimate of the state vector y b n+1 (the superscript b stands for "background") and when the similarity measure has a least square structure as in Eq. (4), the posterior estimate y a n+1 (the superscript a stands for "analysis") is obtained using the Kalman filter update equation K al n+1 is the Kalman-type gain matrix such that K al n+1 = P b yy dH T dH P b yy dH T + P o zz −1 , where P b yy is the augmented state background error covariance matrix containing covariances between the errors in the state variable and the cross-covariances between errors in the parameters and in the state variable, where P o zz is the observation error covariance matrix, and where dH is the derivative of the observation operator H. Note that Eq. (15) is consistent with Eq. (6) as the time-discrete Kalman gain integrates the time step when accounting for discrete white noise [11].
In order to be compatible with a Kalman filter update computation of the form of (15), we thus need a discrepancy J = 1 2 D 2 Z h of least square type associated with our front shape similarity measure. The Chan-Vese functional in Eq. (7) involves Heaviside distribution cannot be multiplied univocally [26], hence our functional is not a least square criterion. However, a pseudo least square strategy can be envisioned by decomposing the Chan-Vese functional into two functionals with (17) Indeed these two functionals can be approximated -see for instance [2] -by the least square functionals where is here a small parameter defined with respect to the contour sharpness [19]. Integrating these two functionals into a Kalman-type filter algorithm is straightforward as it only consists in aggregating two sources of information, hence D z obs n+1 , H(y b n+1 ) = (D ,+ , D ,− ) T with D ,+ and D ,− discretizing the quantities    D ,+ = 1 + 2 π arctan φ z obs − C max (z obs , φ) , As the state dynamics is already controlled by the LO (Eq. 14), we can restrict the definition of the Kalman filtering approach to the parameter space [37,54]. This is usually referred to as reduced-order Kalman filtering -ROKF -methods. Hence, our estimation strategy combines a Luenberger-type update that directly integrates the gradient of the Chan-Vese functional and a Kalman-type update based on a least square reformulation of the approximated Chan-Vese functional. Note that the PDF associated with the control parameters θ are assumed to be Gaussian in Kalman filtering. Note also that we consider an ensemble-based Kalman filter to account for some of the nonlinearities in the forward model. The key idea is to define a set of particles based on the prior PDF to estimate the Kalman gain K and more precisely, the error covariance statistics. Each particle represents one forward model integration and thus one realization of the state vector y. In the present study, we use the reduced-order UKF algorithm noted ROUKF, meaning that the prior PDF is approximated by a deterministic sampling of particles (or "sigma-points") [37].

Joint state-parameter estimation algorithm
Eventually at each model time step, our joint state Luenberger Observer and parameter Reduced Order Unscented Kalman Filter (LO-ROUKF) data assimilation algorithm reads: (1) sample sigma-points around the mean parameter based on the estimated parameter covariance and define corresponding state particles using the estimated parameter-state sensitivity; (2) propagate the Luenberger state observer (LO) defined by the dynamics of Eq. (14); (3) estimate the resulting model sensitivity with respect to the control parameters; (4) compute independently the two data fitting terms D ,+ and D ,− in Eq. (19); (5) estimate parameter mean value and covariance based on sensitivity and data fitting terms; (6) correct the state prediction by a ROUKF analysis step based on sensitivity and data fitting terms using Eq. (15).

Polynomial chaos (PC) framework
To make parameter estimation efficient, the choice of the control parameters θ is of prime importance. They have to be actual sources of uncertainties, and the observed quantities have to be sensitive to them. We thus need to design a cost-effective approach that can measure this sensitivity, while being adapted to the Kalmantype front shape similarity measure presented in Eq. (19). For this purpose, we introduce a numerical strategy based on a PC expansion [54,63]. The key idea is to replace the front shape similarity measure D ,± obtained with the fire growth model by a surrogate (or surface response) D ± pc of the form θ = (θ 1 , θ 2 ) is defined in the input physical space and its counterpart in the standard probabilistic space is noted ζ = (ζ 1 , ζ 2 ) such that ζ i = θi−µi σi with µ i the mean value and σ i the standard deviation (STD) associated with the ith uncertain parameter θ i in θ. θ is thus rescaled in the standard probabilistic space to which the PC framework applies. It can be projected onto a stochastic space spanned by orthonormal polynomial functions {Ψ α (ζ)} α∈A P . Ψ α is the αth multivariate basis function chosen in adequacy with the prior PDF ρ θ associated with the parameters θ = (u w , d w ) ∈ R 2 . α = (α 1 , α 2 ) ∈ {0, 1, · · · , P } 2 is a multi-index of absolute value |α| = α 1 + α 2 , which identifies the components of the multivariate polynomial Ψ α that shall be of total degree less or equal to P , i.e. |α| ≤ P ; the set of selected multi-indices is noted A P . γ α is the corresponding coefficient to determine to build the surrogate.
Since the fire growth model is of finite variance, the front shape similarity measure defined in Eq. (19) can be considered as a spatially distributed random vector for which there exists a convergence polynomial expansion of the form (20) representing how D ,± varies according to changes in wind speed (θ 1 = u w ) and direction (θ 2 = d w ). Details of the PC surrogate construction are given in the following for a 2-D random vector θ but could be easily generalized.

Choice and truncation of polynomial basis
The basis functions are orthonormal with respect to the joint PDF ρ θ (ζ), namely with δ αβ the Kronecker delta-function and Z ⊆ R 2 the space in which ζ evolves. In practice, the orthonormal basis is built using the tensor product of 1-D polynomial functions such that the αth multivariate basis function can be expressed as Ψ α = φ α1 φ α2 with φ αi the 1-D polynomial function and its index α i ∈ {0, 1, · · · , P }. The choice for the basis functions depends on the probability measure of the random variables. For the present sensitivity analysis, we consider uniform PDFs: {Ψ α } α∈A P correspond to the Legendre polynomials [63].
There are several ways of choosing the number of terms r pc in the expansion of form (20). Using a standard truncation strategy, r pc is constrained by the number of random variables (the size of θ is 2) and by the total polynomial degree P as r pc = (2 + P )!/(2! P !), meaning that all polynomials involving the two random variables of total degree less or equal to P are retained in the PC expansion. Thus, the set of selected multi-indices for the multi-variate polynomials A P is defined as

Computation of expansion coefficients
We focus on non-intrusive approaches to numerically compute the coefficients {γ α } α∈A P using a training set made of N e simulations. Since we consider a 2-D random vector θ, Galerkin spectral projection is a suitable approach [54]. It relies on the orthonormality property of the polynomial basis. The αth coefficient γ α is computed using a tensor-based Gauss-quadrature as where z (k) = H(y (k) ) with y (k) = (c (k) , θ (k) ) is the kth integration of the forward model corresponding to the kth quadrature root θ (k) (in the physical space) with its weight w (k) , k = 1, ..., N e , and with (P + 1) the number of quadrature roots required per uncertain input random variable. In the present study, building the surrogate model D ± pc will require N e = (P + 1) 2 fire growth simulations for a given total polynomial degree P , which is a lower bound that insures the orthonormality of the polynomial approximation basis. The N e realizations form the training set.

Polynomial Chaos (PC) algorithm
Our algorithm to build the PC surrogate proceeds as follows: (1) choose the polynomial basis functions {Ψ α (ζ)} α∈A P assuming uniform marginal PDFs (ρ θ1 , ρ θ2 ) on the 2-D random vector θ = (θ 1 , θ 2 ) = (u w , d w ) with ρ θ = ρ θ1 ρ θ2 ; (2) choose the total polynomial degree P according to the complexity of the physical processes (e.g. nonlinearity in the forward model and in the front similarity measure D ,± ); (3) truncate the expansion to r pc terms using standard truncation strategy; (4) apply Gauss-Legendre quadrature in Eq. (23) to compute the coefficients {γ α } α∈A P using N e = (P +1) 2 realizations of θ and their corresponding N e fire growth model snapshots (the front shape similarity D ,± in Eq. (19) is computed at a given time for each snapshot); (5) form the surrogate D ± pc in Eq. (20) that can be evaluated for any new set of parameters θ * = (u * w , d * w ).

Estimation of statistical moments and Sobol' sensitivity indices
Once the surrogate D ± pc is built, we can directly derive from the coefficients {γ α } α∈A P the main statistical moments of the front shape similarity measure such as the mean µ pc and the STD σ pc . The coefficients also contain information on how the discrepancies between the observation and the simulated fire front position change according to variability in each uncertain input parameter. We use Sobol' indices [60] to measure this sensitivity between uncertain inputs and the front shape similarity measure. For the ith variable in the random vector θ, the first-order Sobol' index S pc,i specifically evaluates the ratio of the front dissimilarity variance that is explained by the variance of the standalone parameter θ i . The mean, variance and Sobol' indices are obtained with E[.] the expectation operator, V[.] the variance operator and A P i the set of multi-indices selected in A P such that the computation of S pc,i only includes terms that depends on the input variable θ i , namely

Numerical configurations
We present results from an OSSE test case in which synthetic observations are generated using specified (or "true") values of ignition location x t ign and near-surface wind parameters (u t w , d t w ) and in which there is the possibility to have a "spot" fire.   We consider here two test cases: the case with the standalone main fire (OMAIN) and the case with both main and spot fires (OSPOT). The two corresponding series of fire front positions are used as observations; confidence in the observations is high with the scalar shape weight equal to λ sh = 0.04 to evaluate the capability of the data assimilation algorithm to track observations in the "best-case" scenario (i.e. when observations are perfect). In practice, observations are not a time-continuous sequence of images, but a sequence of images acquired at distant times. In order to handle the data time-sampling, there are two possible strategies. We can either use the data only when they are available, or we can rely on time-interpolation. In the latter, as interpolating Heaviside functions is not recommended, we linearly interpolate shape observer corrections (first correction term in Eq. 14) and not the observations. Concerning the topological observer (second correction term in Eq. 14), as it is a Heaviside-type correction, we only apply it only when observations become available. In the present study, for both OMAIN and OSPOT test cases, the target fire front is observed every 10 s. Therefore, the data time-sampling is ∆T = 100 ∆t when compared to the model time-step ∆t.
In the following, the main fire ignition location x ign , the presence of a spot fire x * ign , the wind speed and direction (u w , d w ) are considered as sources of uncertainties. We present different experiments with changing control variables to highlight the features of standalone LO-based state estimation (with or without activating the topological observer, step 2 of the algorithm presented in Sec. 2.2.4); standalone ROUKF-based parameter estimation (go directly from step 1 to step 4 in the algorithm presented in Sec. 2.2.4); and joint state-parameter estimation (combining LO and ROUKF through steps 1 to 5 in the aforementioned algorithm LO-ROUKF). The objective is to show that the new front shape similarity measure is applicable to any type of control variable and can retrieve the observed fire fronts considered here as the "targets".

Standalone state estimation
First, we consider uncertain fire ignition in terms of the main fire ignition location and the presence of a spot fire. The prior estimate of the front position (or "background") is obtained considering a single fire ignited at x b ign = (94 m, 94 m) (4 % error). We obtain a new estimate of the front position using the LO state estimator given in Eq. (11), with uniform gains λ sh and λ top . Figure 4 shows the result at times 10, 40, 60 and 100 s, when the topological gradient is not used (λ top = 0 -left panel) and when it is used (λ top = 10 -right panel). The main fire is accurately retrieved for both cases. However, the spot fire is only detected and tracked in the posterior estimate (or "analysis") obtained through the topological gradient. When λ top = 0, the observer is limited in the sense that it can only correct the front shape that is already present in the simulation. On the contrary, when λ top = 10, we show that the analysis accurately retrieves the spot fire rapidly after ignition at time 40 s.
If the prior wind (u b w , d b w ) = (4 m s −1 , 225 • ) is subject to uncertainties but not corrected through parameter estimation, the LO state estimator is not able to track the observations since a wrong wind induces significant changes in the front shape as highlighted in Fig. 8 (left panel). This highlights the need for parameter estimation due to the parametric formulation of the fire growth model (Eq. 2) and its strong dependency on the quality of the inputs of the Rothermel-based ROS submodel.

Sensitivity analysis and standalone parameter estimation
Second, we consider only the wind speed and direction as sources of uncertainties; the ignition position is assumed to be known. The prior estimate of the front position is obtained considering a single fire (OMAIN test case) ignited at the true position x t ign = (90 m, 90 m) but with prior wind values (u w , d w ) b = (4 m s −1 , 225 • ) that are significantly different from the true wind (25 % and 20 % errors, respectively).
Prior to parameter estimation, we carry out a sensitivity analysis of the discrepancy operator (D ,+ , D ,− ) with respect to the uncertain wind magnitude and direction, θ = (u w , d w ). To limit the computational cost of this process, we build a PC-surrogate considering a total polynomial order P = 4 and thus using a training set of N e = (P + 1)   . It is found that the term D ,+ highlights the "hits" area (orange area where the observation and the simulated progress variable are equal) and penalizes the "false alarms" area (blue area where the simulated ensemble propagates and that is outside of the observed burnt area); the STD field is consistent with the patterns of the mean progress variable field. Moreover, the term D ,− highlights the "misses" area (yellow area inside the observed burnt area but missed by the simulated ensemble members). The complete distinction between hits, false alarms and misses is not possible with the standard measure used to compute the observation-simulation differences (Eq. 4). This is one of the motivations to develop this new front shape similarity measure.
Let us focus the discussion on the terms D ,+ . They feature a high sensitivity to the wind direction d w (S pc,2 > 0.8) on the flanks of the mean fire front. They also feature a significant sensitivity to the wind speed u w (S pc,1 > 0.5) at the head of the fire, in the main direction of front propagation. These results are consistent with the fire growth model and the underlying ROS submodel (Eq. 2) since a change in the wind direction significantly modifies the main direction of front propagation, while a change in the wind speed mainly impacts the speed of the front propagation in the upwind direction. They highlight that only 25 fire growth simulations are sufficient to evaluate the contribution of each parameter to the behavior of the front shape similarity measure (via the PC surrogate) and therefore quantify the input-discrepancy sensitivity. The effects of the wind direction and wind speed on the front shape similarity measure are found to significantly differ, implying that no equifinality issue shall be encountered when performing parameter estimation.
Based on the sensitivity analysis results, we then carry out parameter estimation using the ROUKF. direction and the correct ROS at the head of the fire. Still, there is a remaining gap between the analysis front estimates and the observations, especially on the flanks. There is no state estimation here and thereby not enough constraints from the wind parameters to perfectly track the observed fire fronts. We could extend the methodology to spatially-distributed parameter estimation to give the data assimilation algorithm more degrees of freedom to match the observations [65]. Note that in reality correcting parameters spatially does not ensure the retrieval of accurate parameter values. They shall then be viewed as effective values that incorporate multiple sources of uncertainties and that tend to be modified to reduce simulation-observation discrepancies, while they may only be partially responsible for them. This issue motivates the use of joint state-parameter estimation to obtain more realistic values of the parameters and thereby improve forecast performance.

Joint state-parameter estimation
We finally test the front shape similarity measure on the joint state-parameter estimation algorithm LO-ROUKF including a topological gradient for the state estimator [16] (Fig. 4 shows the influence of including a topological gradient in the LO formulation). We consider the ignition locations (x ign , x * ign ) as well as the wind parameters (u w , d w ) as sources of uncertainties. The prior values of these parameters are . Due to these combined uncertainties, the prior front estimate significantly deviates from the truth as shown in Fig. 10. The joint state-parameter estimator is found to accurately track the observed fronts (including the secondary fire) with a good balance between state estimation (λ sh = 0.04, λ top = 10) and parameter estimation: the true wind magnitude and direction are accurately identified as shown in Fig. 11. Note that the convergence to the true wind magnitude is slightly longer than in the standalone parameter estimation approach, while the convergence to the true wind direction takes less time (see Fig. 9). Moreover, we emphasize that here we focus on one iteration of the estimation procedure to illustrate the already very good performance of our method. Further studies could include multiple iterations of the procedure to help the parameter identification convergence in such a non-linear parameter dependency. Such strategies are known as iterative extended or unscented Kalman approaches [5,30].

Conclusions and Perspectives
We presented a front shape similarity measure adapted to an eikonal equation solved using an Eulerian level-set front-tracking model. This measure derived from the Chan-Vese contour fitting functional in image segmentation theory is well suited to handle position errors that go beyond the problem of amplitude errors addressed by classical data assimilation framework. The present work shows that this formalism is valuable for wildfire applications (1) where uncertainties are both due to limited knowledge on the active fire location (location, spotting fires) and on the physical parameters required as inputs to the fire problem (e.g. near-surface wind speed and direction), (2) where observations typically correspond to mid-infrared images from which the position, shape and topology of the active burning areas shall be extracted to be compared to the fronts simulated by fire growth models. We showed on synthetic cases, the benefits of using a joint state-parameter estimation approach to address shape and topological errors: the Luenberger-type state estimation was useful to track the (potentially multiple) fire front shapes; and the Kalman-type reduced-order parameter estimation was useful to reduce model bias. We also provided a polynomial chaos framework to balance the weight of each approach through input-measure sensitivity analysis. Comparing Sobol' indices derived from the polynomial chaos surrogate is a robust method to classify by order of importance the parameters to estimate.
Future work includes the extensive evaluation of the front shape similarity measure in the context of realworld wildfire hazards. To address these problems, the framework could include more sources of uncertainties in  the control parameter vector (e.g. biomass fuel moisture content) and could be able to operate on Lagrangiantype fire growth models (e.g. FARSITE). We also need to evaluate, in practice, a spatially distributed gain in both state and parameter estimation, which may be hampered by partially observed fire fronts -as incomplete  wildfire images may be the only source of information available due to the opacity of the fire-induced thermal plume and/or due to a limited monitoring. We may also require to define error covariance statistics that are well suited for the discrepancy operator, and to add a texture criterion -and not only shape and topological criteria -to provide more information on the active burning areas in the data assimilation process.  Figure 11. Same caption as in Fig. 9 but for OSPOT case and joint LO-ROUKF stateparameter estimation.

Acknowledgment
The