Cortical-inspired image reconstruction via sub-Riemannian geometry and hypoelliptic diffusion

In this paper we review several algorithms for image inpainting based on the hypoelliptic diffusion naturally associated with a mathematical model of the primary visual cortex. In particular, we present one algorithm that does not exploit the information of where the image is corrupted, and others that do it. While the first algorithm is able to reconstruct only images that our visual system is still capable of recognize, we show that those of the second type completely transcend such limitation providing reconstructions at the state-of-the-art in image inpainting. This can be interpreted as a validation of the fact that our visual cortex actually encodes the first type of algorithm.


Introduction
Since the founding works of Petitot, Citti and Sarti [14,32], where the foundations of the mathematical model (known as Citti-Petitot-Sarti model, CPS model in short) of the human primary visual cortex V1 have been posed, many authors worked on develop and apply these ideas to image processing and computer vision [9,18,19,22,34]. Indeed, the main concept at the basis of the CPS model, i.e., that images in our brains are processed via a redundant representation taking into account local features as local orientation, presents a somewhat new framework in which many tasks can be strongly simplified.
In view of the many well-studied examples of contour completion operated by the human brain (see Figure 1), one of the main problems that was attacked from this point of view has been that of image reconstruction, also known as image inpainting [3,12,15,28]. In this survey, we present the results we obtained in a series of works [4][5][6][7]33] in this context. The paper is divided in three parts, corresponding to different methods and purposes of image reconstruction.
In the first part, Section 2, we introduce the geometry of the problem by describing the Citti-Petitot-Sarti model for reconstruction of curves, with some new ingredients introduced in [7]. Although this simplified model is not very effective to reconstruct images partially corrupted (reconstructing an image level curve by level curve is a hopeless problem), it permits to understand easily the sub-Riemannian structure of V1, which will allow to introduce the hypoelliptic diffusion.  Indeed, in Section 3, we present an image inpainting algorithm obtained by building on the curve reconstruction methods of the first part. In particular, we show that, via stochastic considerations, the latter can be extended to image reconstruction by considering the hypoelliptic diffusion associated with the a sub-Riemannian structure of V1. Although not very efficient, this method allows to reconstruct images with small corruptions. The main interest of this technique is to simulate the reconstruction of images by the visual cortex and, as we will show in the next section, to produce very effective image inpainting algorithms when coupled with other techniques. Moreover, thanks to the exploitation of non-commutative harmonic analysis techniques, the final algorithm that we present has the twofold advantage of, on one side, being very fast and easy to parallelize, and on the other, of not needing nor exploiting any information on the location and shape of the corruption.
Finally, in the third and last part, Section 4, we collect some results detailing how to ameliorate the above hypoelliptic diffusion method by taking into account the location of the corruption. In particular, we show how these methods allow to obtain image reconstructions of highly corrupted images (e.g. with up to 97% of pixel missing), placing themselves at the same level as state-of-the-art inpainting methods.
We conclude this brief introduction by observing that, experimentally, the threshold of the maximal amount of corruption that the pure hypoelliptic diffusion method can manage looks very close to the threshold of corruption above which our brain can recognize the underlying image. This fact could be seen as a validation of the CPS model with pure hypoelliptic diffusion. On the other hand, the methods presented in the last part completely transcend such problem, thus suggesting that, although based on neurophysiological ideas, they do not reflect any real mechanism of V1.

The sub-Riemannian model for curve reconstructions
In this section, we recall a model describing how the human visual cortex V1 reconstructs curves which are partially hidden or corrupted. The model we present here was initially due to Petitot [31,32]. It is based on previous work by Hubel-Wiesel [24] and Hoffman [23], then it was refined by Citti et al. [13,14], Duits et al. [17][18][19][20]. and by the authors of the present paper in [4][5][6][7]. It was also studied by Hladky and Pauls in [22]. In a simplified model 1 (see [31, p. 79]), neurons of V1 are grouped into orientation columns, each of them being sensitive to visual stimuli at a given point of the retina and for a given direction 2 on it. The retina is modeled by the real plane, i.e. each point is represented by (x, y) ∈ R 2 , while the directions at a given point are modeled by the projective line, i.e. θ ∈ P 1 . Hence, the primary visual cortex V1 is modeled by the so called projective tangent bundle P T R 2 := R 2 × P 1 . From a neurological point of view, orientation columns are in turn grouped into hypercolumns, each of them being sensitive to stimuli at a given point (x, y) with any direction. In the same hypercolumn, relative to a point (x, y) of the plane, we also find neurons that are sensitive to other stimuli properties, like colors, displacement directions, etc... In this paper, we focus only on directions and therefore each hypercolumn is represented by a fiber P 1 of the bundle P T R 2 .
Orientation columns are connected between them in two different ways. The first kind of connections are the vertical (inhibitory) ones, which connect orientation columns belonging to the same hypercolumn and sensible to similar directions. The second kind of connections are the horizontal (excitatory) connections, which connect neurons belonging to different (but not too far) hypercolumns and sensible to the same directions. (See Figure 2.) These two types of connections are represented by the following vector fields on P T R 2 : The parameter β > 0 is introduced here to fix the relative weight of the horizontal and vertical connections, which have different physical dimensions. In other words, when V1 detects a (regular enough) planar curve γ : [0, T ] → R 2 , γ(t) = (x(t), y(t)), it computes a "lifted curve" Γ : [0, T ] → P T R 2 , Γ(t) = (x(t), y(t), θ(t)), by including a new variable θ(·) : [0, T ] → P 1 which satisfies: The new variable θ(.) plays the role of the direction in P 1 of the tangent vector to the curve. Here it is natural to require that u, v ∈ L 1 ([0, T ]), i.e., that Γ ∈ W 1,1 ([0, T ]).

The curve reconstruction problem
Consider an "interrupted" liftable curve, that is, assume that γ : [0, a]∪[b, T ] → R 2 where 0 < a < b < T . Let us call γ in = γ(a), (x in , y in ) := (x(a), y(a)), γ fin = γ(b), and (x fin , y fin ) := (x(b), y(b)). Notice that, according to (3) the limits θ in := lim τ ↑a θ(τ ) and θ fin := lim τ ↓b θ(τ ) are well defined. In the following, we describe a method to reconstruct the missing part, based on the model presented above. That is, we are looking for a curveγ : [a, b] → R 2 whose support is a "reasonable" completion of the support of γ.
It has been proposed by Petitot [30] that the visual cortex reconstructs the curve by minimizing the energy necessary to activate orientation columns which are not activated by the curve itself. This is modeled by the minimization of the following cost functional, Here, a, b are fixed, and the minimum is taken on the set of liftable curvesγ : [a, b] → R 2 which are solution of (2) for some u, v ∈ L 1 ([a, b]) and satisfying boundary conditions Hereθ(·) is the lift associated withγ(·) via (3). Observe that in the cost (4), u(τ ) 2 (resp. v(τ ) 2 ) represents the (infinitesimal) energy necessary to activate horizontal (resp. vertical) connections. As pointed out in [7], a solution to the problem (2), (4), (5) always exists. An example of curve reconstruction obtained via this technique is given in Figure 3.

Remark 2. Minimizers of the cost (4) are minimizers of the reparametrization-invariant cost
where κ(·) is the curvature of the planar curve γ and β is the dimensional parameter appearing in the vector field Θ. For more details about the relations between minimizers of costs (4) and (6), see [29].
Remark 3. Observe that here θ ∈ P 1 , i.e., angles are considered modulo π. Notice that the vector field X is not continuous on P T R 2 . Indeed, a correct definition of problem (2), (4), (5) needs two charts, as explained in detail in [7,Remark 12]. In this paper, the use of two charts is implicit, since it plays no crucial role.
Remark 4. From the theoretical point of view, the weight parameter β is irrelevant: for any β > 0 there exists a homothety of the (x, y)-plane that maps minimizers of problem (2), (4), (5) to those of the same problem with β = 1. However its role will be important in our inpainting algorithms (see Section 7).
The minimization problem (2), (4), (5) is a particular case of a minimization problem called a sub-Riemannian problem. For more details on this interpretation see the original papers [14,30,32], the book [31], or [2,6,7] for a language more consistent with the one of this paper.
One the main interests of the sub-Riemannian problem (2), (4), (5) is the possibility of associating to it a hypoelliptic diffusion equation which can be used to reconstruct images (and not just curves), and for contour enhancement. This point of view was developed in [7,14,[18][19][20], and is the subject of the next section.

Pure hypoelliptic diffusion for image inpainting
The sub-Riemannian problem (2), (4), (5) described above was used to reconstruct images whose level sets are smooth curves by Ardentov, Mashtakov and Sachkov [27]. The technique they developed consists of reconstructing as the missing parts of the level sets of the image by applying the above curve-reconstruction algorithm. Obviously, beside the fact that level sets in general need not be smooth nor curves, when applying this method to reconstruct images with large corrupted parts, one is faced with the problem of how to put in correspondence the non-corrupted parts of the same level set.
To avoid this problem, it is natural to model the cortical activation induced by an image f : R 2 → [0, 1] as a function Lf : P T R 2 → R, where Lf (x, y, θ) represents the strength of activation of the neuron (x, y, θ) ∈ P T R 2 . It is also known that this cortical activation will evolve with time, due to the presence of the horizontal and vertical cortical connections already introduced in Section 2. Then, a natural way to model this evolution is to assume that the activation of a single neuron propagates as a stochastic process Z t on P T R 2 , which solves the stochastic differential equation associated with system (2). That is, letting u t and v t be two one-dimensional independent Wiener processes, we have Under this assumption, the evolution of a cortical activation Lf is obtained via the diffusion naturally associated with (7), that is, Observe that the above diffusion is highly anisotropic, since the operator ∆ only takes into account two of the three possible directions in P T R 2 . Indeed, ∆ is not elliptic. However, due to the fact that the family of vector fields {X, Θ} satisfy Hörmander condition, the operator ∆ is hypoelliptic. The use of equation (8) to model the spontaneous evolution of cortical activations was first proposed by Citti and Sarti in [14] (although they posed the problem in SE (2), a double covering of P T R 2 ).
To be be able to use equation (8) to reconstruct a corrupted image, one has to specify two things: i) the lift operator L, which maps images on R 2 to cortical activations on P T R 2 ii) how to project the result of the diffusion on R 2 to get the reconstructed image. (See Figure 4.) We remark that in this pipeline we do not exploit any knowledge on the location and/or shape of the corruption.
In the remainder of this section, we will discuss these two points and present an idea of the efficient numerical scheme we exploit to solve (8).

Lifting procedure
Concerning how the visual cortex lifts an image to P T R 2 , it seems likely that this is done through multiple convolutions with orientation sensitive filters (like Gabor filters), see e.g. [16]. This idea is strictly connected with wavelet transforms and orientation scores, which have been implemented, e.g., in [19,20]. See also [33]. The main mathematical advantage of this kind of lifts is that, for any (x, y) ∈ R 2 , the vertical components Lf (x, y, ·) are essentially 1D gaussians centered around the orientation θ(x, y) of the gradient ∇f (x, y).
In this paper, we will consider only a primitive version of this lift. (See Figure 5.) Indeed, we will consider the lift Lf : P T R 2 → R of a function f : R 2 → R to be the distribution Lf = f δ Sf , where δ Sf is the Dirac delta measure concentrated on the surface Sf = (x, y, θ) | (x, y) ∈ R 2 , θ = θ(x, y) ⊂ P T R 2 , θ(x, y) = arctan ∂yf (x,y) ∂xf (x,y) , if ∇f (x, y) = 0, In order to insure that Sf be a well-defined hypersurface of P T R 2 , we actually compute it on a smoothed version of f , see [7]. This is in accordance with neurophysiological observations that the retina itself operates a Gaussian smoothing. It is interesting to notice that this smoothing renders Sf particularly regular. More precisely, in [7] it has been shown that the convolution of an L 2 (R 2 ) function with a Gaussian is generically a Morse function and that the following holds.

Proposition 6 ( [7]
). If f : R 2 → R is a Morse function, then Sf is an embedded 2D submanifold of P T R 2 .
We remark that this distributional version of the lift can be interpreted as a limit of the wavelet transform lifts, where the variance of the gaussians obtained by the latter for Lf (x, y, ·) goes to zero. Variations on the above lifting procedure have been proposed in [9,14,34].

Projection
For our choice of lift, as well as for most of the wavelet transform lifts, solutions to the evolution equation (8) for any time t > 0 do not belong to the range of L. Thus, it is not possible to exploit the fact that the lifting procedure is injective in order to define a projection. The most natural way to project functions ψ : P T R 2 → R 2 to πψ : R 2 → R is then to simply choose πψ(x, y) = P 1 ψ(x, y, θ) dθ.

Numerical integration
Observe that, since the double covering of P T R 2 is the group of roto-translations SE(2) = R 2 S 1 , and (8) is covariant w.r.t. the canonical projection SE(2) → P T R 2 , it is more practical to solve (8) on the latter. Indeed, this allows to exploit the group structure of SE(2). As already remarked, the numerical integration of (8) is subtle, since multiscale sub-Riemannian effects are hidden inside 3 and has been approached in different way. For example, in [14,34] the authors use a finite difference discretisation of all derivatives while in [18][19][20] an almost explicit expression for the heat kernel is exploited.
In [4,5,7] we presented a sophisticated and highly parallelizable numerical scheme, based on the noncommutative Fourier transform on a suitable semidiscretization of the group SE(2), i.e., the semidiscrete group of roto-translations SE(2, N ) for N ∈ N. This is the semi-direct product SE(2, N ) = R 2 Z N , where Z N is the cyclic group of order N and the action of n ∈ Z N on R 2 is given by the rotation R n of angle 2πn/N . As pointed out in [5], considering a discrete number of orientations seems to be in accordance with experimental evidence. Although still an open problem, this is probably connected with topological restrictions given by the the fact that neurons in V1 encode a 3D space while V1 itself, physically, is essentially 2D.
A way of interpreting the algorithm presented in [5], see also [8,33], is the following: For a given finite subset K of R 2 , let SE(2, N, K) be the set of C N -valued trigonometric polynomials Q on SE(2, N ) that read where Q n (x, y) = (λ k ,µ l )∈K c n k,l e i(λ k x+µ l y) , r = 0, . . . , N − 1.
Here, c r k,l ∈ C. Then, if we assume that (standard) Fourier transforms of images are compactly supported on K, their lifts will belong to SE(2, N, K). Equation (8) can be naturally (semi)discretized on SE(2, N ), essentially replacing the operator Θ 2 in ∆ with its discretized version Λ N ∈ R N × R N , and then restricted to SE(2, N, K). This yields the completely uncoupled systems of linear ordinary differential equations Here, c k,l (t) = (c 0 k,l (t), . . . , c N −1 k,l (t)) T . These systems are equipped with initial conditions c k,l (0) = c k,l , where the latter are the coefficients in (12) corresponding to Lf .
These discretized equations can then be solved through any numerical scheme. We chose the Crank-Nicolson method, for its good convergence and stability properties. Let us remark that the operators appearing on the r.h.s. of (13) are periodic tridiagonal matrices, i.e. tridiagonal matrices with non-zero (1, N ) and (N, 1) elements. Thus, the linear system appearing at each step of the Crank-Nicolson method can be solved through the Thomas algorithm for periodic tridiagonal matrices, of computational cost O(N ).
Remark 7. By [5, Theorem 2], solving (13) for some couple (λ k , µ l ) is equivalent to solve it for any rotated couple R n (λ k , µ l ), associated with n ∈ Z N . Thus, if the set K is invariant with respect to rotations R n , n ∈ Z N , it is indeed sufficient to solve (13) for a slice of K whose orbit under the rotations covers the whole K.

Numerical experiments
When implementing the pure hypoelliptic diffusion, there are essentially 3 parameters to be tuned: the number of angles N for the semi-discretization of the equation on SE(2, N ), the weight parameter β appearing in the operator ∆, and the total time of diffusion T . Clearly, for performance reasons, one would like to have N as small as possible. Indeed, numerical experiments suggests that it suffices to choose N = 30, as increasing N beyond this threshold does not affect the resulting image in any visible way. On the other hand, the parameters β and T have to be tuned by hand and the optimal choice seems to be deeply sensitive to the image to be treated (and of course to the desired result).
In Figure 6, we present a sequence of images showing the effect of the pure hypoelliptic diffusion with two different choices for β at different times. In particular, this shows how for large values of β, the effect of the pure hypoelliptic diffusion on P T R 2 becomes, after the projection, close to an isotropic diffusion on R 2 . In Figure 7, we present the results of the pure hypoelliptic diffusion when the parameter β is chosen to be 0. Notice that, due to the absence of diffusion in the angular part, we observe the formation of straight lines tangent to the level lines.
In order to show the strong anisotropicity of the hypoelliptic diffusion, we present also Figures 8, 7 9. In the first one, we show the effects of the evolution when the surface Sf on which the image is usually lifted is replaced with S θ0 = {(x, y, θ 0 ) | (x, y) ∈ R 2 }, for different choices of θ 0 ∈ P 1 . In accordance with the mathematical formulation, this yields a diffusion restricted to the direction θ 0 . Finally, in Figure 9, we show the hypoelliptic evolution when the lift is replaced withL f g = δ Sf g for two different images f, g. Namely, we use the values of the first image only to compute Sf and on this surface we consider the values given by the second image. Here,   In particular observe that only the white stripe perpendicular to θ 0 is filled, while the other is preserved.
although the information on the values of the first image is lost, the information on the directions of the level lines alone allows to recover some of the contours.

Heuristic complements: Exploiting information on the location of the corruption
The method explained in the previous section does not use any information of where the image is corrupted. In this section we show how the suitable use of this information permits to obtain better reconstructions. In particular we will present two extensions to the previous algorithm, the Dynamic Restoration (DR) procedure [5] and the varying coefficients hypoelliptic diffusion, and briefly introduce a sort of synthesis of the two, the Averaging and Hypoelliptic Evolution (AHE) algorithm [4]. The results obtained by these methods are comparable with the current state-of-the-art for PDE based image inpainting algorithms [11,21]. Actually, in our opinion, in order to go beyond this state-of-the-art it is necessary to introduce additional external information on the corrupted part as it is done, for instance, in exemplar-based methods [11].

Dynamic Restoration
In this section we present a technique to exploit the information on the location of the corruption in the inpainting algorithm. Assume that a partition of the set of pixels of the image is given I = G ∪ B , where points in G are "good", i.e., non-corrupted, while those in B are "bad", i.e., corrupted. The idea is now to periodically "mix" the solution ψ t of the diffusion on SE(2, N ) with the initial function Lf on G, while keeping tabs on the "evolution" of the set of good points.
Namely, fix n ∈ N and split the segment [0, T ] into n intervals t r = rτ , r = 0, . . . , n, τ = T /n. Let G(0) = G, B(0) = B and iteratively solve the hypoelliptic diffusion equation on each [t r , t r+1 ] with initial condition Here, the function ψ − is the solution of the diffusion on the previous interval (or the starting lifted function if r = 0), and the coefficient σ is given by  Some reconstruction results via the Dynamic Restoration procedure are presented in Figures 10 and 11. Notice that restorations obtained via this method are much better than those obtained via pure hypoelliptic diffusion. In particular, the DR procedure gives reasonable results even on highly corrupted images, as shown in Figure 11.

Varying coefficients hypoelliptic diffusion
A different approach w.r.t. the Dynamic Restoration procedure is to directly modify the evolution equation (8), in order to enforce a stronger diffusion on the parts of the image that are known to be corrupted. Namely, we can replace the operator ∆ in (8) by ∆ H := bX 2 + a β 2 Θ 2 for some non-negative continuous coefficients a, b : R 2 → R, i.e., The above equation with varying coefficients a, b tries to implement the natural idea of reducing the eect of diusion at non-corrupted points. Indeed, when using (8), this can be done only by decreasing of the total time of diffusion T , but this acts in an homogeneous way on all points, including the corrupted ones. On the other hand, when using equation (16) one can tune the coefficients a, b depending on the point (x, y), and consequently, weaken the diffusion effect only where it is necessary. Unfortunately, from the point of view of numerical integration, the essential decoupling effect that allowed to reduce (8) to (13) does not take place anymore. In order to overcome this problem, in [4], we approximate the varying coefficient version of (8) by a similar equation with constant coefficients. Although the constant coefficients operator used in this approximation presents a drift term, this allows to recover the decoupling effect and the Crank-Nicolson method is still pertinent applied to each of the decoupled ODE's. When choosing the varying coefficients a and b, the idea is to make them larger at bad points and their neighbors (especially the coefficient b, which has the most influence to the velocity of the diffusion). Thus, they will be chosen to be a smooth approximation of the indicator function of the the set B of bad (corrupted) points and c 0 , c 1 are positive constants. In particular, our choice is: where (x, y) = exp − f 2 (x, y) σ .
Here, a 0 , b 0 ∈ R, a 1 , b 1 , σ > 0, and * ∈ (0, 1) are constant parameters chosen experimentally. Some numerical experiments are presented in Figure 12 We observe that for low levels of corruption, reconstruction of images with this method yields results which are comparable to, if not better than, the ones obtained through the DR procedure. However, when the corrupted part becomes larger this method fails. This suggests that, in order to obtain a good inpainting algorithm for highly corrupted images, one has still to use the DR procedure, combining it with the varying coefficients. This is an essential component of the algorithm presented in the next section.

AHE algorithm
In order to improve on the results for high corruption rates, in [4] we proposed the Averaging and Hypoelliptic Evolution algorithm. The main idea behind the AHE algorithm is to try to provide the anisotropic diffusion with better initial conditions. More precisely, it is divided in the following 4 steps (see Figure 13): 1. Preprocessing phase (Simple averaging): We apply a simple iterative procedure that fills in the corrupted area by assigning to each corrupted pixel the average value of the non-corrupted neighboring pixels.  2. Main diffusion (Strong smoothing): By using the result of the previous procedure as an input, we apply the varying coefficients hypoelliptic diffusion discussed in Section 4.2. 3. Advanced averaging: In order to remove the blur introduced by the hypoelliptic diffusion of the previous step, we mix the results of step 1 and step 2. 4. Weak smoothing: We perform a last hypoelliptic evolution, in order to smooth some of the edges we obtained in step 3.
In Figure 14 we present the results obtained via the AHE algorithm on highly corrupted images. In particular, a comparison with Figure 11 shows that the synthesis between the DR procedure and the varying coefficients, coupled with the averaging steps, yields much better results w.r.t. the simple application of the DR procedure.

Conclusions
In this paper we presented several image inpainting algorithms based on hypoelliptic diffusion. Although pure hypoelliptic diffusion allows to obtain reasonable inpaints of images with small corrupted regions, we showed that coupling such diffusion with heuristic methods exploiting the full knowledge of the location of the corruption allows to obtain very efficient reconstructions. It is interesting to notice that when the image is so corrupted that our visual system is not able to recognize it, as it happens, for instance, for the corrupted image presented in Figure 15, the use of pure hypoelliptic diffusion does not help it. This fact can be interpreted as a validation of the CPS model with pure hypoelliptic diffusion. On the contrary, as already pointed out, both the DR procedure and the AHE algorithm produce reconstructions that go beyond the capabilities of our visual system. Figure 15. A comparison of the different algorithms presented in this paper. From left to right: the original image, the corrupted image used as input, the reconstructions obtained via pure hypoelliptic diffusion (β 2 = 1 4 , T = 1), the DR procedure (n = 120), and the AHE algorithm.