SENSITIVITY OF THE “INTERMEDIATE POINT” IN THE MEAN VALUE THEOREM: AN APPROACH VIA THE LEGENDRE-FENCHEL TRANSFORMATION

We study the sensitivity, essentially the differentiability, of the so-called “intermediate point” $c$ in the classical mean value theorem $\frac{f(b) - f(a)}{b - a} = f'(c)$; we provide the expression of its gradient $\nabla c(d, d)$, thus giving the asymptotic behavior of $c(a, b)$ when both $a$ and $b$ tend to the same point $d$. Under appropriate mild conditions on $f$, this result is “universal” in the sense that it does not depend on the point $d$ or the function $f$. The key tool to get at this result turns out to be the Legendre-Fenchel transformation for convex functions.


Introduction
The basic mean value theorem (MVT for short), also called Lagrange's MVT, is certainly one of the best known results in Calculus. Let us recall it. Given a differentiable function $f : I \to \mathbb{R}$ on some open interval $I$, for any $a < b$ in $I$, there exists $c \in (a, b)$ such that
$$\frac{f(b) - f(a)}{b - a} = f'(c). \tag{1}$$
We call (1) the MVT relation. The MVT is mainly an existence result: it says nothing more about this "intermediate point" $c$, in particular whether it is unique or not. In this note, we consider precisely the situation where such a $c$ is unique for any $a < b$, and we study the sensitivity of $c$ as a function of both variables $a$ and $b$ (its continuity and differentiability).
Since little is known about such a $c$, it is sometimes qualified as "mysterious". Here are some preliminary comments about it:
• The intermediate point(s) $c$ could be anywhere between $a$ and $b$. We can say more about their location if we know more about the generating function $f$; a typical example is when $f$ is a polynomial function of degree at most $n$ (see [10, Section 6.5]).
• When $f'$ is strictly monotone on $(0, +\infty)$, $c(a, b) = (f')^{-1}\left(\frac{f(b) - f(a)}{b - a}\right)$ is a way of defining "means" of $a > 0$ and $b > 0$, but not all means are defined in such a way (cf. [8]). For example, with the help of the generating function [...]
• One could expect some homogeneity properties for $c$, such as: for the same function $f$, the intermediate point $c(ta, tb)$ on the dilated interval $(ta, tb)$, with $t > 0$, is the dilated version of the intermediate point on $(a, b)$, that is $t\,c(a, b)$. This is not true in general; it is even exceptional. For example, $c(a, b)$ just above does not satisfy this property. According to a deep and nice result by S. Golab and S. Lojasiewicz [3], this happens only for three classes of functions (we restrict ourselves to strictly convex or strictly concave functions $f$ on $(0, +\infty)$): with $\alpha \neq 0$, [...]
This is the occasion to propose various positively homogeneous means of the type $c(a, b)$: if $f(x) = x^2$, then $c(a, b) = \frac{a+b}{2}$ (arithmetic mean of $a$ and $b$); if $f(x) = 1/x$, then $c(a, b) = \sqrt{ab}$ (geometric mean of $a$ and $b$). But there is no function $f$ for which the corresponding $c(a, b)$ would be the harmonic mean $\frac{2ab}{a+b}$ of $a$ and $b$. Let us give a hint for that (from [8, pages 682-683]). According to what has been seen above, the (positively homogeneous) harmonic mean could only come from a function of the type $f : x > 0 \mapsto f(x) = x^p$ (where $p \neq 0$ and $p \neq 1$). But this is impossible. Indeed, if that were the case for some power $p$, one would have: [...] where $H(a, b)$ stands for the harmonic mean of $a$ and $b$. The first equation, that is 2 p −1 [...]
For what we have to explore in this note, there is no loss of generality in assuming that $f$ is defined on the whole of $\mathbb{R}$, that is to say $I = \mathbb{R}$.
Let us list here some qualitative properties related to this intermediate point $c$ (see [7] for proofs):
• Such a point $c$ is unique for any $a < b$ if and only if either $f$ or $-f$ is strictly convex on $\mathbb{R}$. This motivates our choice hereafter: $f : \mathbb{R} \to \mathbb{R}$ is a differentiable strictly convex function.
• Such a point $c$ can sometimes be expressed as a function of $a$ and $b$ in closed form; it depends on how easily the strictly increasing function $f'$ can be inverted (see some examples above). There is however a simple basic situation in that respect: if $f(x) = \frac{1}{2}\alpha x^2 + \beta x + \gamma$, with $\alpha \neq 0$, then $c = \frac{a+b}{2}$ for all $a < b$. It is interesting to note that this property of $c$ characterizes quadratic functions [7]. In the same vein, given $p \in (0, 1/2)$, it is impossible to have $c = pa + (1-p)b$ for all $a < b$ [7].
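The closed-form means quoted above can be checked numerically. The following is only a minimal sketch, assuming that $f'$ is continuous and strictly monotone on $[a, b]$ so that the MVT relation has a unique solution; the function name `intermediate_point` and the bisection strategy are illustrative choices, not taken from the paper.

```python
def intermediate_point(f, fprime, a, b, tol=1e-12):
    """Solve f'(c) = (f(b) - f(a)) / (b - a) for c in (a, b) by bisection.

    Assumes a < b and that f' is continuous and strictly monotone on [a, b]
    (f strictly convex or strictly concave), so the solution is unique.
    """
    slope = (f(b) - f(a)) / (b - a)

    def g(x):
        # g changes sign exactly once on (a, b) under the assumptions above.
        return fprime(x) - slope

    lo, hi = a, b
    g_lo = g(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        g_mid = g(mid)
        if (g_mid > 0) == (g_lo > 0):
            lo, g_lo = mid, g_mid   # root lies in the right half
        else:
            hi = mid                # root lies in the left half
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # f(x) = x^2: the intermediate point is the arithmetic mean.
    print(intermediate_point(lambda x: x * x, lambda x: 2 * x, 1.0, 4.0))
    # f(x) = 1/x on (0, +inf): the intermediate point is the geometric mean.
    print(intermediate_point(lambda x: 1 / x, lambda x: -1 / x**2, 1.0, 4.0))
```

With $a = 1$ and $b = 4$, the two calls should return values close to $(a+b)/2$ and $\sqrt{ab}$ respectively, up to the bisection tolerance.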
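From now on, we define the "intermediate point function" $c : \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ as follows:
$$c(a, b) = \begin{cases} \text{the unique } c \text{ between } a \text{ and } b \text{ such that } f'(c) = \dfrac{f(b) - f(a)}{b - a} & \text{if } a \neq b, \\ d & \text{if } a = b = d. \end{cases} \tag{2}$$
The questions we address in this paper are:
• (The easiest one). Is $c$ a continuous function of $a$ and $b$?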
• (A more elaborate one). Is $c$ a differentiable function of $a$ and $b$? If so, what is the gradient vector of $c$ at any point $(d, d)$ lying on the critical diagonal line of $\mathbb{R} \times \mathbb{R}$?
We answer both questions, delineating the appropriate (minimal) general assumptions on $f$ for that.

The extended difference quotient
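Given a differentiable strictly convex function $f : \mathbb{R} \to \mathbb{R}$, we define the extended difference quotient function $q : \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ as follows:
$$q(a, b) = \begin{cases} \dfrac{f(b) - f(a)}{b - a} & \text{if } a \neq b, \\ f'(d) & \text{if } a = b = d. \end{cases}$$
Property: $q$ is a continuous function on $\mathbb{R} \times \mathbb{R}$.
This is a standard result in differential calculus. Let us still provide a short proof for the convenience of the reader.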
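The only nontrivial point to check is that $q$ is continuous at a diagonal point $(d, d)$. By the MVT relation (1), for $a \neq b$ we have $q(a, b) = f'(c)$ for some $c$ between $a$ and $b$; since $f'$ is continuous (as the derivative of a differentiable convex function), $q(a, b)$ tends to $f'(d) = q(d, d)$ when $(a, b)$ tends to $(d, d)$.
We can even go further on the properties of $q$, assuming more about the function $f$.
Property: Assume that $f$ is $C^2$. Then $q$ is a continuously differentiable function on $\mathbb{R} \times \mathbb{R}$, with
$$\nabla q(d, d) = \left(\tfrac{1}{2} f''(d), \tfrac{1}{2} f''(d)\right) \quad \text{for all } d \in \mathbb{R}. \tag{3}$$
This is again an exercise in differential calculus. Let us prove it quickly. At a point $(a, b)$ with $a \neq b$, we clearly have:
$$\frac{\partial q}{\partial a}(a, b) = \frac{f(b) - f(a) - f'(a)(b - a)}{(b - a)^2}, \qquad \frac{\partial q}{\partial b}(a, b) = \frac{f(a) - f(b) - f'(b)(a - b)}{(b - a)^2}.$$
Following second-order Taylor-Lagrange expansions of $f$ at $a$ and at $b$, there exist $\xi_1$ and $\xi_2$ between $a$ and $b$ such that $\frac{\partial q}{\partial a}(a, b) = \frac{1}{2} f''(\xi_1)$ and $\frac{\partial q}{\partial b}(a, b) = \frac{1}{2} f''(\xi_2)$. By the assumed continuity of $f''$, we therefore get:
$$\lim_{\substack{(a, b) \to (d, d) \\ a \neq b}} \frac{\partial q}{\partial a}(a, b) = \lim_{\substack{(a, b) \to (d, d) \\ a \neq b}} \frac{\partial q}{\partial b}(a, b) = \tfrac{1}{2} f''(d).$$
Thus, $q$ is differentiable at $(d, d)$ and (3) holds.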
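The value of the gradient of $q$ on the diagonal can be sanity-checked by finite differences. The sketch below is an illustration only: the helper names `q` and `grad_q_diagonal`, the step size `h`, and the choice $f = \exp$ (for which $f'' = \exp$) are assumptions made for the example.

```python
import math


def q(f, fprime, a, b):
    """Extended difference quotient: chord slope off the diagonal, f'(d) at (d, d)."""
    if a == b:
        return fprime(a)
    return (f(b) - f(a)) / (b - a)


def grad_q_diagonal(f, fprime, d, h=1e-4):
    """Central finite-difference estimate of the gradient of q at (d, d)."""
    dq_da = (q(f, fprime, d + h, d) - q(f, fprime, d - h, d)) / (2 * h)
    dq_db = (q(f, fprime, d, d + h) - q(f, fprime, d, d - h)) / (2 * h)
    return dq_da, dq_db


if __name__ == "__main__":
    d = 0.3
    # For f = exp, both components should be close to exp(d) / 2.
    print(grad_q_diagonal(math.exp, math.exp, d), 0.5 * math.exp(d))
```

Both components of the estimate should agree with $\frac{1}{2} f''(d)$ up to the discretization and rounding errors of the finite-difference scheme.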

The Legendre-Fenchel transformation
Since the given function $f$ is (strictly) convex, it is natural to invoke the Legendre-Fenchel transform when studying properties of $f$. Indeed, for our purposes its use turns out to be incredibly efficient.
The Legendre-Fenchel transform (or conjugate) $f^*$ of $f$ is defined as follows:
$$f^*(s) = \sup_{x \in \mathbb{R}} \, [\, s x - f(x) \,]. \tag{4}$$
We use $s$ to denote the variable because, in the geometrical interpretation of definition (4), $s$ is a slope. We denote by $I^*$ the interval on which the supremum in (4) is finite. When $f'$ is invertible, which is our case (the strict convexity of $f$ implies that $f'$ is strictly increasing), another way of looking at $f^*$ is:
$$f^*(s) = s \, (f')^{-1}(s) - f\big((f')^{-1}(s)\big). \tag{5}$$
This is the usual form of the so-called Legendre transform of $f$, whether $f$ is convex or not. In terms of derivatives, the relation between $f$ and $f^*$ is simple: the derivative function of $f^*$ is the inverse of the derivative function of $f$:
$$(f^*)' = (f')^{-1}. \tag{6}$$
This property could even help obtain an antiderivative (or primitive function) of $(f')^{-1}$; see an example below (Example 2.4). To get familiar with the transformation $f \mapsto f^*$, it is best to consider examples, which we do below.
Here $I^* = \mathbb{R}$ and $f^*$ is of the same kind (i.e., quadratic) as $f$.
Since $f' = \sinh$ and its inverse $(f')^{-1}$ is the derivative of $f^*$, the above calculation is an alternate way of determining an antiderivative (or primitive) function of $\sinh^{-1}$: it is the function given in (7).
All these examples illustrate formulas (5) and (6); we will even go further, in due time, with a relationship between second derivatives of f and f * .
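To make formulas (4), (5) and (6) concrete, here is a small numerical sketch for $f = \cosh$, so that $f' = \sinh$. The closed form $s \sinh^{-1}(s) - \sqrt{1 + s^2}$ used below is what formula (5) gives for this $f$; we do not claim it is written in exactly this way in the original. The function names, the search interval $[-20, 20]$ and the ternary search are assumptions made for the illustration.

```python
import math


def conjugate_numeric(f, s, lo=-20.0, hi=20.0, iters=200):
    """Approximate f*(s) = sup_x [s*x - f(x)] by ternary search on [lo, hi].

    Assumes x -> s*x - f(x) is strictly concave (f strictly convex) and that
    its maximizer lies inside [lo, hi]; both are assumptions of this sketch.
    """
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if s * m1 - f(m1) < s * m2 - f(m2):
            lo = m1   # the maximizer is to the right of m1
        else:
            hi = m2   # the maximizer is to the left of m2
    x = 0.5 * (lo + hi)
    return s * x - f(x)


def conjugate_cosh(s):
    """Formula (5) for f = cosh, using (f')^{-1} = asinh."""
    x = math.asinh(s)
    return s * x - math.sqrt(1.0 + s * s)   # cosh(asinh(s)) = sqrt(1 + s^2)


def derivative(g, s, h=1e-5):
    """Central finite-difference derivative, used to illustrate (6)."""
    return (g(s + h) - g(s - h)) / (2 * h)


if __name__ == "__main__":
    s = 1.5
    print(conjugate_numeric(math.cosh, s), conjugate_cosh(s))
    print(derivative(conjugate_cosh, s), math.asinh(s))
```

The first line printed compares the supremum in (4) with the closed form from (5); the second compares a numerical derivative of $f^*$ with $(f')^{-1}(s) = \sinh^{-1}(s)$, as in (6).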
The properties of the Legendre-Fenchel transformation are well known and used in convex analysis (see for example [11, Section 26] or [5, Chapter E]). We list below those we need for our purposes. Recall that we have assumed that $f : \mathbb{R} \to \mathbb{R}$ is a strictly convex differentiable function. We then have:
(1) $I^*$ is an interval with a nonempty interior, and $f^*$ is differentiable on $\mathrm{int}(I^*)$.
Note that [...]
(2) [...] Then so is $f^*$ on the interior of $I^*$, with:
$$(f^*)''(s) = \frac{1}{f''(x)}, \tag{8}$$
where $x$ is the unique solution of $s = f'(x)$.
Let us quickly illustrate the property above with the functions displayed in Example 2.3. We have [...] is a way of parameterizing the interior of $I^*$ ($= (-1, 1)$), and, for all $s =$ [...]

The main results
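We begin with the following key observation on the function $c$ defined in (2):
$$c(a, b) = (f^*)'\big(q(a, b)\big) \quad \text{for all } (a, b) \in \mathbb{R} \times \mathbb{R}. \tag{9}$$
This is indeed true for $a \neq b$, since by definition $f'(c(a, b)) = q(a, b)$ and $(f^*)' = (f')^{-1}$, but also at diagonal points $(d, d)$ since $(f^*)'(q(d, d)) = (f^*)'(f'(d)) = d = c(d, d)$. It now remains to state the results; everything has been prepared for that.
Theorem 3.1. Let $c(a, b)$ be the intermediate point in the MVT, as defined in (2). Then $c$ is a continuous function on $\mathbb{R} \times \mathbb{R}$.
Proof. We have seen in Section 2.1 that the extended difference quotient $q$ is a continuous function from $\mathbb{R} \times \mathbb{R}$ into $\mathrm{int}(I^*)$. Moreover, $(f^*)' : \mathrm{int}(I^*) \to \mathbb{R}$, the derivative function of a differentiable convex function, is continuous. Hence $c$, which is nothing but $(f^*)' \circ q$ (see (9)), is a continuous function.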
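The result of Theorem 3.1 is not completely new; it appears in the following alternative form in the literature (in [3, 4, 9] for example): assuming that $f'$ is strictly monotone, $\theta(a, h)$ is defined as the unique parameter in $(0, 1)$ for which $(f(a + h) - f(a))/h = f'(a + \theta(a, h) h)$; then $\theta$ is a continuous function of $a$ and $h$.
Theorem 3.2. [...] Let $c(a, b)$ be the intermediate point in the MVT, as defined in (2). Then $c$ is a continuously differentiable function on $\mathbb{R} \times \mathbb{R}$, with
$$\nabla c(d, d) = \left(\tfrac{1}{2}, \tfrac{1}{2}\right) \quad \text{for all } d \in \mathbb{R}.$$
Proof. Again, $c = (f^*)' \circ q$, resulting from the composition of two continuously differentiable functions, is itself continuously differentiable. As for $\nabla c(d, d)$, it suffices to apply the chain rule, with (3) and (8) as ingredients:
$$\nabla c(d, d) = (f^*)''\big(q(d, d)\big) \, \nabla q(d, d) = \frac{1}{f''(d)} \left(\tfrac{1}{2} f''(d), \tfrac{1}{2} f''(d)\right) = \left(\tfrac{1}{2}, \tfrac{1}{2}\right).$$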
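As an illustrative numerical check of Theorem 3.2 (not part of the original argument), take $f = \exp$, for which the MVT relation can be solved explicitly: $c(a, b) = \log\big((e^b - e^a)/(b - a)\big)$ for $a \neq b$. The function names, the sample points $d$ and the step size `h` below are assumptions made for the example.

```python
import math


def c_exp(a, b):
    """Intermediate point for f = exp: solve exp(c) = (exp(b) - exp(a)) / (b - a)."""
    if a == b:
        return a   # value on the diagonal
    return math.log((math.exp(b) - math.exp(a)) / (b - a))


def grad_c_diagonal(c, d, h=1e-4):
    """Central finite-difference estimate of the gradient of c at (d, d)."""
    dc_da = (c(d + h, d) - c(d - h, d)) / (2 * h)
    dc_db = (c(d, d + h) - c(d, d - h)) / (2 * h)
    return dc_da, dc_db


if __name__ == "__main__":
    for d in (-1.0, 0.3, 2.0):
        # Each estimate should be close to (1/2, 1/2), whatever the point d.
        print(d, grad_c_diagonal(c_exp, d))
```

The estimates should stay close to $(1/2, 1/2)$ for every $d$ tried, which is the "universal" behavior the theorem describes; replacing `c_exp` by another closed-form intermediate point is a simple way to experiment further.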

Remarks.
• By applying Theorem 3.2 partially, we recover the following result on the behavior of the intermediate point in the MVT (see [1] for example, but it was known long before): assume that $f$ is twice continuously differentiable in a neighborhood of a fixed $a$, with $f''(a) \neq 0$; then the intermediate point $c$ in the MVT property has the following asymptotic behavior:
$$\lim_{b \to a} \frac{c - a}{b - a} = \frac{1}{2}.$$
Indeed, one can choose $f''(a) > 0$ without loss of generality, and extend $f$ to the whole of $\mathbb{R}$ so as to satisfy the assumptions in Theorem 3.2. Then, use the fact that $\frac{\partial c}{\partial b}(a, a) = \frac{1}{2}$.
• The "perfect" or asymptotic case is when $f$ is quadratic, $f(x) = \frac{1}{2}\alpha x^2 + \beta x + \gamma$, with $\alpha > 0$ (see Example 2.1). In that case, $c(a, b) = \frac{a+b}{2}$ for all $(a, b)$ in $\mathbb{R} \times \mathbb{R}$. What Theorem 3.2 entails is that, under appropriate assumptions on $f$, say $f''(d) > 0$, $c(a, b) \sim \frac{a+b}{2}$ when $a$ and $b$ are close to the same point $d$.
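The asymptotic behavior in the first remark can be observed on the geometric mean, that is, with $f(x) = 1/x$ on $(0, +\infty)$, for which $c(a, b) = \sqrt{ab}$. This is a hedged illustration only; the function name `theta_ratio` and the sample values of $a$ and $h$ are assumptions made for the example.

```python
import math


def theta_ratio(a, h):
    """Return (c - a) / (b - a) for f(x) = 1/x, with b = a + h and c = sqrt(a * b).

    Assumes a > 0 and h > 0 so that the interval stays inside (0, +inf).
    """
    b = a + h
    c = math.sqrt(a * b)   # intermediate point for f(x) = 1/x
    return (c - a) / (b - a)


if __name__ == "__main__":
    a = 2.0
    for h in (1e-1, 1e-2, 1e-3, 1e-4):
        # The ratio should approach 1/2 as b = a + h tends to a.
        print(h, theta_ratio(a, h))
```

As $h$ decreases, the printed ratios should move toward $1/2$, in line with $\frac{\partial c}{\partial b}(a, a) = \frac{1}{2}$.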

Final comments.
• Like any differentiability result, Theorem 3.2 measures the "sensitivity" of $c(a, b)$ with respect to changes around the point $(d, d)$. The fact that $\frac{\partial c}{\partial a}(d, d) = \frac{\partial c}{\partial b}(d, d)$ just expresses the symmetry of this sensitivity of $c$ with respect to the two variables $a$ and $b$; this is indeed a reasonable and expected result.
• Note that the value of $\nabla c(d, d)$ does not change with $d$, and is independent of the function $f$ (satisfying the assumptions in Theorem 3.2)! It is a kind of "universal result", as sometimes happens in mathematics. As a by-product of this result, we note that all the means of the form $c(a, b)$ (see the beginning of the paper) have the same asymptotic behavior, $c(a, b) \sim \frac{a+b}{2}$, when $a$ and $b$ are close to the same point $d$.
We would like to thank the two referees for their careful reading of the submitted manuscript and the subsequent remarks.