Probabilistic and Piecewise Deterministic models in Biology

We present recent results on Piecewise Deterministic Markov Processes (PDMPs) involved in biological modeling. PDMPs, first introduced in the probabilistic literature by Davis (1984), form a very general class of Markov processes and are increasingly popular in biological applications. They also raise interesting new challenges from the theoretical point of view. We give here several examples: the long time behavior of switching Markov models applied to population dynamics, uniform sampling in general branching models applied to structured population dynamics, time scale separation in integrate-and-fire models used in neuroscience, and, finally, moment calculus in stochastic models of gene expression.


Introduction
The piecewise deterministic Markov processes (denoted PDMPs) were first introduced in the literature by Davis [30,31]. The key point of his work was to endow PDMPs with rather general tools, similar to those that already exist for diffusion processes. Indeed, PDMPs form a family of non-diffusive càdlàg Markov processes, involving a deterministic motion punctuated by random jumps. The motion of the PDMP {X(t)}_{t≥0} depends on three local characteristics, namely the jump rate λ, the deterministic flow ϕ and the transition measure Q according to which the location of the process at a jump time is chosen. The process starts from x and follows the flow ϕ(x, t) until the first jump time T_1, which occurs either spontaneously in a Poisson-like fashion with rate λ(ϕ(x, t)) or when the flow ϕ(x, t) hits the boundary of the state space. In both cases, the location of the process at the jump time T_1, denoted by Z_1 = X(T_1), is selected by the transition measure Q(ϕ(x, T_1), ·), and the motion restarts from this new point as before. This fully describes a piecewise continuous trajectory for {X(t)}_{t≥0} with jump times {T_k} and post-jump locations {Z_k}, which evolves according to the flow ϕ between two jumps.
Since the seminal work of Davis, PDMPs have been extensively studied from the theoretical perspective; we refer the reader, for instance, to [27,2,1,57,7,9] among many others.
From the applied point of view, and more precisely in biological applications, these processes are sometimes referred to as hybrid, and are especially appealing for their ability to capture both continuous (deterministic) dynamics and discrete (probabilistic or deterministic) events. The first applications date back at least to studies of the cell cycle model [53,52]; more recent works include, to name just a few, neurobiology [63,35], cell population and branching models [4,22,60,36,51,20], gene expression models [72,56,33], food contaminants [14,12] and multiscale chemical reaction network models [28,3,50,46]. See also [47,16,66] for selected reviews on hybrid models in biology.
The paper is organized as follows. In Section 1, we present long time behavior results for switched flows in population dynamics. In Section 2, we present many-to-one formulas and a description of the trait of an individual uniformly sampled in a general branching model applied to structured population dynamics. In Section 3, we present limit theorems and time scale separation in integrate-and-fire models used in neuroscience. Finally, in Section 4, we derive asymptotic moment formulas in a stochastic model of gene expression.

Long time behavior of some PDMP for population dynamics
In this section, we present recent results on the long time behavior and exponential convergence of some PDMPs used in population dynamics, in particular for modeling population growth in a varying environment.
We consider processes (X_t, I_t)_{t≥0}, where the first component X_t has continuous paths in a continuous space E, namely a subset of R^d for some integer d, and the second component I_t is a pure jump process on a finite state space F or on N. The continuous variable represents the number of individuals (or, more precisely, the population density or the relative abundance) of a population, and the discrete variable models the population environment. In a fixed environment I_t = i ∈ F, we assume that (X_t)_{t≥0} evolves as the solution of an ordinary differential equation, that is,

∂_t X_t = F^{(i)}(X_t),     (1.1)

where F^{(i)} is some smooth function. For instance, the oldest and maybe the most famous model to represent the evolution of a population is the choice E = R_+ and

F^{(i)}(x) = a_i x,     (1.2)

for some growth rates (a_i)_{i∈F}. When there is no variability in the environment, solutions are given by exponential functions and there is either explosion or extinction according to the sign of a_i. We develop in Subsection 1.1 this toy model when the environment varies. Almost all properties (almost-sure convergence, moments...) can be derived, and it gives a first hint to understand more complicated models such as multi-dimensional linear models or non-linear models. These generalizations are described in Subsections 1.2 and 1.3, respectively. For non-linear population models, maybe one of the most famous is given by the Lotka-Volterra equations (also known as the predator-prey equations), with state space E = R_+^2 and

F^{(i)}(x, y) = (α_i x − β_i x y, δ_i x y − γ_i y),

for some nonnegative coefficients α_i, β_i, γ_i, δ_i. To end the introduction of this section, let us mention that all of these models are particular instances of a class of switching Markov models; see for instance [9,71] for general references. Some other examples of application are developed in [57,38,61].
1.1. Malthus model. Let us assume, in this subsection, that (I_t)_{t≥0} is an irreducible continuous time Markov chain on a finite state space F. We denote by ν its unique invariant probability measure. The process (X_t)_{t≥0} will be the solution of

∂_t X_t = a_{I_t} X_t.     (1.3)

This means that F^{(i)} is given by (1.2). Thus, we have X_t = X_0 exp(∫_0^t a_{I_s} ds).
The almost-sure behavior of this process is easy to understand. Indeed, by the ergodic theorem, we have

lim_{t→∞} (1/t) ∫_0^t a_{I_s} ds = ν(a) := Σ_{i∈F} ν(i) a_i, almost surely.

Then, we have the following dichotomy: either ν(a) > 0 and (X_t)_{t≥0} tends almost surely to infinity at an exponential rate, or ν(a) < 0 and (X_t)_{t≥0} tends to 0. The second statement can be understood as the extinction of the population, even if, in contrast with classical birth and death processes, (X_t)_{t≥0} never hits 0; that is, the extinction time (or the hitting time of 0) is almost surely infinite. The convergence of the moments of (X_t)_{t≥0} is trickier than the almost-sure convergence. Indeed, one cannot use a dominated convergence theorem or a Jensen inequality (which goes in the wrong direction). Nevertheless, we have the following dichotomy: if ν(a) > 0 then E[X_t^p] tends to infinity for every p > 0, while if ν(a) < 0 then there exists p > 0 such that E[X_t^p] tends to 0. The proof which follows comes from [6]; see also [32].
Proof. If we denote by A the generator of the process (I_t)_{t≥0} and by µ_0 the row vector corresponding to its initial law, then the Feynman-Kac formula states that, for every p > 0,

E[X_t^p] = X_0^p µ_0 e^{t(A + pD)} 1,

where e^{t(A + pD)} is the classical matrix exponential, D is the diagonal matrix such that D_{i,i} = a_i, and 1 is the column vector made of ones. Now, classical results on matrices (including the Perron-Frobenius theorem) give the existence of two constants c_p, C_p > 0 such that

c_p e^{λ_p t} ≤ µ_0 e^{t(A + pD)} 1 ≤ C_p e^{λ_p t},

where λ_p is the (unique) eigenvalue with largest real part of the matrix A + pD. It then remains to understand the sign of λ_p to understand the moment behavior. By definition of λ_p, there exists a left eigenvector v_p such that

v_p (A + pD) = λ_p v_p.

Moreover, for p = 0, we have λ_0 = 0 and v_0 = ν. Again, classical results on matrices ensure that p ↦ λ_p and p ↦ v_p are smooth, and then, differentiating in p, we obtain

(∂_p v_p)(A + pD) + v_p D = (∂_p λ_p) v_p + λ_p (∂_p v_p).

In particular, for p = 0 and multiplying by 1 on the right (using A1 = 0), we have ν(a) = (∂_p λ_p)_{|p=0}. As a consequence, if ν(a) > 0, then p ↦ λ_p is increasing in a neighborhood of 0, and for all p > 0 small enough we have λ_p > 0.
Thus, E[X_t^p] tends to infinity for p small enough, and then for all p > 0 by the Jensen inequality. On the contrary, if ν(a) < 0, then there exists p > 0 such that lim_{t→+∞} E[X_t^p] = 0, and the same holds for all q < p, again by the Jensen inequality.
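As an illustration of this proof, the following sketch (with hypothetical jump rates and growth parameters, chosen for the example only) computes λ_p for a two-state environment and checks numerically that the derivative of p ↦ λ_p at p = 0 equals ν(a):

```python
import math

# Hypothetical two-state environment: I jumps 0 -> 1 at rate q0 and 1 -> 0 at rate q1.
q0, q1 = 2.0, 1.0
a = (-1.0, 0.4)                        # growth rates a_0, a_1 (illustrative values)
nu = (q1 / (q0 + q1), q0 / (q0 + q1))  # invariant law of the chain
nu_a = nu[0] * a[0] + nu[1] * a[1]

def lambda_p(p):
    """Largest eigenvalue of A + pD, with A the 2x2 generator and D = diag(a)."""
    # A + pD = [[-q0 + p*a0, q0], [q1, -q1 + p*a1]]; spectrum is real here.
    t = (-q0 + p * a[0]) + (-q1 + p * a[1])            # trace
    d = (-q0 + p * a[0]) * (-q1 + p * a[1]) - q0 * q1  # determinant
    return (t + math.sqrt(t * t - 4 * d)) / 2

# Finite-difference derivative of p -> lambda_p at p = 0, to compare with nu(a).
h = 1e-6
deriv = (lambda_p(h) - lambda_p(-h)) / (2 * h)
print(deriv, nu_a)
```

Since λ_0 = 0 and (∂_p λ_p)_{|p=0} = ν(a) < 0 here, λ_p is negative for small p > 0, matching the moment dichotomy.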
Understanding the long time behavior of moments is essential when we are interested in the convergence to some invariant law more general than δ_0. Indeed, moments provide Lyapunov-type functionals ensuring that the process returns frequently to a compact set.
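The almost-sure dichotomy of the Malthus model can also be observed by direct simulation; the sketch below (with the same illustrative rates as above, not taken from the references) estimates the empirical Lyapunov exponent (1/t) log(X_t/X_0) and compares it with ν(a):

```python
import random

random.seed(1)

# Illustrative two-state environment with growth rates a_i.
a = (-1.0, 0.4)
q0, q1 = 2.0, 1.0                            # jump rates 0 -> 1 and 1 -> 0
nu_a = (q1 * a[0] + q0 * a[1]) / (q0 + q1)   # nu(a), negative here

def lyapunov_exponent(T):
    """Simulate (I_t) on [0, T] and return (1/T) * int_0^T a_{I_s} ds."""
    t, i, integral = 0.0, 0, 0.0
    while t < T:
        rate = q0 if i == 0 else q1
        dt = min(random.expovariate(rate), T - t)
        integral += a[i] * dt                # contribution to int a_{I_s} ds
        t += dt
        i = 1 - i                            # environment switches
    return integral / T

exponent = lyapunov_exponent(200_000.0)
print(exponent, nu_a)   # the empirical exponent is close to nu(a) < 0: extinction
```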
1.2. Other linear models. From a modeling point of view, the results of the previous subsection are easily understandable. Indeed, it is enough that the environment has, on average, a positive effect to ensure exponential growth, while if the environment is, on average, not favorable, then the population goes extinct.
However, note that this interpretation only holds because the growth parameter is well chosen. Indeed, let us first consider a discrete time analogue of the Malthus model: the sequence (Y_n)_{n≥0} satisfying

Y_{n+1} = Θ_{n+1} Y_n,

where (Θ_n)_n is a sequence of i.i.d. non-negative random variables. Here Θ_n represents the mean number of children per individual at generation n (all individuals have the same number of children, which depends on the environment). If (Θ_n)_{n≥0} is a constant sequence, then Y_n = Θ_1^n Y_0 and a dichotomy occurs with respect to the critical value Θ_1 = 1. When (Θ_n)_{n≥0} is a sequence of i.i.d. random variables, there is also a dichotomy, but the critical value (to be smaller or larger than 1) is no longer E[Θ_1] but exp(E[ln(Θ_1)]). Indeed, we have

Y_n = Y_0 exp( Σ_{k=1}^n ln(Θ_k) ),

and, by the law of large numbers, n^{-1} Σ_{k=1}^n ln(Θ_k) converges almost surely to E[ln(Θ_1)]. This is exactly the same phenomenon as for Galton-Watson chains in random environment; see for instance [5] and [42, Section 2.9.2].
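The gap between E[Θ_1] and exp(E[ln Θ_1]) is easy to observe numerically. The sketch below uses an illustrative law for Θ (values chosen for the example only) with E[Θ_1] > 1 but E[ln Θ_1] < 0, so the theoretical mean explodes while almost every trajectory dies out:

```python
import math
import random

random.seed(0)

# Illustrative environment: Theta = 3 or 0.2 with probability 1/2 each.
values = (3.0, 0.2)
mean_theta = sum(values) / 2                              # 1.6 > 1: the mean grows
crit = math.exp((math.log(3.0) + math.log(0.2)) / 2)      # ~0.775 < 1: a.s. decay

n_steps, n_paths = 200, 1000
finals = []
for _ in range(n_paths):
    y = 1.0
    for _ in range(n_steps):
        y *= random.choice(values)                        # one random generation
    finals.append(y)

extinct = sum(1 for y in finals if y < 1e-6) / n_paths
print(mean_theta ** n_steps)   # theoretical E[Y_n]: astronomically large
print(extinct)                 # yet the vast majority of paths are near 0
```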
There is a similar (but perhaps more complex) problem in higher dimension. Let us consider E = R^2, F = {0, 1}, and (I_t)_{t≥0} jumping from 0 to 1 and from 1 to 0 with the same rate λ > 0, the two vector fields being linear. Each of the two matrices is stable: the solutions of ∂_t (x, y) = F^{(1)}(x, y) satisfy, for some constants C, c > 0 and all t ≥ 0,

‖(x(t), y(t))‖ ≤ C e^{−ct} ‖(x(0), y(0))‖,

and this property is also satisfied by the solutions of ∂_t (x, y) = F^{(0)}(x, y). However, for suitable vector fields and values of the jump rate λ, the switched process (X_t)_{t≥0} nevertheless tends almost surely to infinity (Theorem 1.2). For this type of result for more general vector fields, see for instance [8,57,54]. Let us briefly explain this phenomenon. One can see that, whatever the initial conditions of the solutions (x, y) of (1.4), the map t ↦ ‖(x(t), y(t))‖ is not necessarily decreasing; see for instance Figure 2.
Then, heuristically, we see that if I jumps sufficiently fast (before the decay of the distance), then the distance may grow. However, if I is too slow, then (X_t)_{t≥0} essentially follows one of the two flows and goes rapidly to 0.
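Since the matrices of (1.4) are not reproduced here, the sketch below uses a classical hypothetical pair: each matrix is Hurwitz stable (both eigenvalues equal to −1), yet under fast switching the process essentially follows the averaged matrix (A_0 + A_1)/2, which has a positive eigenvalue:

```python
import math

def eig2(m):
    """Eigenvalues of a real 2x2 matrix with real spectrum."""
    (a, b), (c, d) = m
    tr, det = a + d, a * d - b * c
    s = math.sqrt(tr * tr - 4 * det)
    return ((tr - s) / 2, (tr + s) / 2)

# Hypothetical stable matrices: strong shearing in two different directions.
A0 = ((-1.0, 4.0), (0.0, -1.0))
A1 = ((-1.0, 0.0), (4.0, -1.0))
avg = tuple(tuple((A0[i][j] + A1[i][j]) / 2 for j in range(2)) for i in range(2))

print(eig2(A0))   # (-1.0, -1.0): this flow alone contracts to 0
print(eig2(A1))   # (-1.0, -1.0): so does this one
print(eig2(avg))  # the averaged matrix has a positive eigenvalue: growth
```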

A general theorem for convergence.
In this subsection, we give a condition, similar to that of Subsection 1.1, ensuring convergence to an equilibrium (not necessarily δ_0). As pointed out in the previous subsection, we have to measure precisely the growth of each underlying flow. So we assume that, for all i ∈ F, there exists ρ(i) ∈ R such that, for any two solutions u_1 and u_2 of ∂_t u = F^{(i)}(u),

‖u_1(t) − u_2(t)‖ ≤ e^{ρ(i) t} ‖u_1(0) − u_2(0)‖.     (1.5)

The parameter ρ(i) measures the contraction of the solutions of ∂_t u = F^{(i)}(u) with respect to the norm ‖·‖. Note that the next result does not rely on a specific norm, but it has to be the same for all flows. A sufficient condition for (1.5) is given by the next lemma.
Lemma 1.1. For the Euclidean norm, condition (1.5) holds as soon as, for all x ∈ E,

Jac F^{(i)}(x) + (Jac F^{(i)}(x))^T ≤ 2 ρ(i) Id,

in the sense of quadratic forms.
Proof. It is classical to derive the previous bound by applying the mean value theorem to the map t ↦ ‖u_1(t) − u_2(t)‖². We can now state the main result of this subsection.
Theorem 1.4. If (1.5) holds for every i ∈ F with ν(ρ) = Σ_{i∈F} ν(i) ρ(i) < 0, then (X_t, I_t) converges in law to a unique invariant law.
The proof of this theorem is given in [24,7]. The proof of [24] is based on a generalization (developed in [44]) of the classical Harris theorem [43], which does not hold in general for PDMPs (see for instance the introduction of [7]). One of the main points of the proof is the construction of a Lyapunov function V, in the sense that there exist C, γ, K > 0 such that

E_x[V(X_t)] ≤ C e^{−γ t} V(x) + K.     (1.7)

The construction of V and the control of its moments are based on Lemma 1.1.
Note that another proof of the previous theorem is given in [7]. Nevertheless, the proof of [24] has the advantage of being generalizable to many contexts (dependent environment, non-deterministic underlying dynamics, asymptotic assumptions...). See [24] for details.
Closely related articles [10,58,59] deal with a fluctuating Lotka-Volterra model. They show that random switching between two environments, both favorable to the same species, can lead to the extinction of this species or to the coexistence of both species.
As for Theorem 1.4, their proofs are based on the construction of a Lyapunov function. Nevertheless, instead of controlling the exit from a compact set of R_+^2 to infinity, this Lyapunov function is used to control the exit from a compact set included in (0, ∞)^2 to the axes {(0, y) | y ≥ 0} or {(x, 0) | x ≥ 0}. A complete description of this model is given in [58]. Note, however, that their counter-intuitive result is similar to that of Theorem 1.2. It is based on the fact that, when I jumps sufficiently fast (namely, λ being large in Theorem 1.2), X closely follows the averaged deterministic dynamics

∂_t u = Σ_{i∈F} ν(i) F^{(i)}(u).

Finally, we end this section by mentioning that such a hybrid framework can also be applied to models where the population is represented by a discrete variable, and/or the environment fluctuates continuously. One can cite for instance [26], where the author studies a model of interaction between a population of insects and a population of trees. Another example is the chemostat model (a type of bioreactor), introduced in [29] and studied in [25,19,18,23]. Lyapunov functions are again a key theoretical tool to study the long time behavior of such models. However, we point out that extinction events may occur in discrete population models, and that the interplay between the discrete and continuous components makes the problem more difficult than standard absorbing time studies in purely discrete population models. Nevertheless, it was proved in [23, Theorem 3.4] that the extinction time is almost surely finite and furthermore has exponential tails. This generalizes part of a result of [25].
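The averaging effect behind this heuristic can be checked on a toy example. The sketch below (illustrative logistic rates, not the model of [58]) compares a fast-switching scalar flow with the closed-form solution of the averaged dynamics:

```python
import math
import random

random.seed(2)

# Two hypothetical environments for a logistic growth x' = a_i x (1 - x).
a = (1.0, 3.0)
lam = 500.0          # fast switching rate in both directions, so nu = (1/2, 1/2)
a_bar = sum(a) / 2   # averaged growth rate

def switched_path(x0, T, dt=1e-3):
    """Euler scheme for the switched flow, with exponential switching times."""
    x, i, t = x0, 0, 0.0
    next_jump = random.expovariate(lam)
    while t < T:
        if t >= next_jump:
            i = 1 - i                          # environment switches
            next_jump = t + random.expovariate(lam)
        x += dt * a[i] * x * (1 - x)           # Euler step for the current flow
        t += dt
    return x

x0, T = 0.1, 2.0
x_switched = switched_path(x0, T)
x_averaged = 1 / (1 + (1 / x0 - 1) * math.exp(-a_bar * T))  # averaged logistic
print(x_switched, x_averaged)   # the switched path stays close to the average
```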

Uniform sampling in a branching structured population
In this section, which is a short version of [60], we give a characterization of the trait of an individual sampled uniformly in the population at time t in a structured branching population.
2.1. Introduction. We consider a branching Markov process where each individual u is characterized by a trait (X_t^u, t ≥ 0) whose dynamic follows an X̄-valued Markov process, where X̄ := X × R_+ and X ⊂ (R_+)^d for some d ≥ 1. For any x ∈ X̄, the last coordinate corresponds to a time coordinate and we denote by x ∈ X the vector of the first d coordinates. We assume that this trait influences the life cycle of each individual (in terms of lifetime, number of descendants and inheritance of the trait). Then, an interesting problem consists in the characterization of the trait of a "typical" individual in the population. This is the question we address here. Let us describe the process in more detail. We consider a strongly continuous contraction semi-group with associated infinitesimal generator G: D(G) ⊂ C_b(X̄) → C_b(X̄), where C_b(X̄) denotes the space of continuous bounded functions from X̄ to R. Then, each individual u has a trait (X_t^u, t ≥ 0) which evolves as a Markov process defined as the unique X̄-valued càdlàg solution of the martingale problem associated with (G, D(G)). An individual with trait x dies at an instantaneous rate B(x), where B is a continuous function from X̄ to R_+.

It is replaced by k ∈ N descendants with probability p_k(x). For convenience, we assume that p_1(x) ≡ 0 for all x ∈ X̄. The trait of the descendants at birth depends on the trait of the mother at death. For all k ∈ N, let P^{(k)}(x, ·) be the probability measure on X̄^k corresponding to the trait distribution at birth of the k descendants of an individual with trait x. We denote by P_j^{(k)}(x, ·) its j-th marginal distribution. Finally, we denote by M_P(X̄) the set of point measures on X̄. Following Fournier and Méléard [39], we work in D(R_+, M_P(X̄)), the space of càdlàg measure-valued processes, and we write

Z_t = Σ_{u∈V_t} δ_{X_t^u}

for the measure-valued process describing the dynamic of the population, where V_t denotes the set of all individuals in the population at time t. We will denote by N_t the cardinality of V_t and by m(x, s, t) the mean number of individuals alive at time t descending from an individual with trait x at time s. An example of such a process is the size-structured population where each individual grows exponentially fast. For more details, we refer the reader to [36] or [13]. Then, at rate B(X_t^u), the individual u splits at time t into two daughter cells of size X_t^u/2. Figure 3 is a realization of such a process. The diameter of each circle corresponds to the size of the individual at division and the length of the branch represents the lifetime of each individual. In particular, we notice the link between the lifetime and the size of each individual: the bigger the cell, the shorter its lifetime.
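A minimal simulation sketch of this size-structured example can be written with a small time-step thinning scheme; the growth rate and the choice B(x) = x below are illustrative, not taken from [36] or [13]:

```python
import math
import random

random.seed(3)

def simulate_population(x0=1.0, T=6.0, dt=1e-3, growth=1.0):
    """Each cell grows exponentially (x' = growth * x) and divides at rate
    B(x) = x into two daughters of size x / 2 (illustrative rates)."""
    sizes = [x0]
    for _ in range(int(T / dt)):
        newborn = []
        for k in range(len(sizes)):
            sizes[k] *= math.exp(growth * dt)      # deterministic growth
            if random.random() < sizes[k] * dt:    # division with prob B(x) dt
                sizes[k] /= 2
                newborn.append(sizes[k])           # second daughter cell
        sizes.extend(newborn)
    return sizes

sizes = simulate_population()
print(len(sizes))   # population size N_T
print(max(sizes))   # big cells divide fast, so sizes stay moderate
```

The scheme makes visible the feature noted above: the division rate grows with the size, so large cells have short lifetimes and the size distribution stays concentrated.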
In Section 2.2, we give the assumptions on the model ensuring that our process is well-defined. Then, in Section 2.3, we describe the process corresponding to the trait of a "typical" individual in the population. Finally, in Section 2.4, we explain why this process corresponds to the trait of a uniformly sampled individual in a large population approximation.

Definition and existence of the structured branching process.
To define the structured branching process rigorously on R_+, we need to ensure that, almost surely, the number of individuals in the population does not blow up in finite time.
As explained in the previous description of the process, the lifetime of each individual depends on the dynamic of the trait through the division rate B. One way of dealing with this dependency is to assume that the division rate is bounded above by some constant B̄. Then, in the case of binary division, the expected number of individuals in the population at time t is bounded above by the expected value at time t of a Yule process with birth rate B̄, which is equal to exp(B̄t) and is bounded on compact time intervals. In the case of an unbounded division rate, the same reasoning applies if we can control the excursions of the dynamic of the trait. Thus, we consider two sets of hypotheses: the first one controls what happens regarding divisions (in terms of division rate and of mass creation), and the second one controls the dynamic of the trait between divisions.

Assumption 1. We consider the following assumptions:
(1) There exist b_1, b_2 ≥ 0 and γ ≥ 0 such that, for all x ∈ X̄, B(x) ≤ b_1 + b_2 ‖x‖^γ.
(2) There exists x̄ ∈ X such that, for all x ∈ X̄ and k ∈ N, P^{(k)}(x, ·) is supported on traits (x_1, …, x_k) satisfying Σ_{j=1}^k x_j ≤ x ∨ x̄ componentwise.

The first point controls the division rate. The second point means that we consider a fragmentation process with a possibility of mass creation at division when the mass is small enough. In particular, clones are allowed in the case of bounded traits and a bounded number of descendants, and any finite-type branching structured process can be considered.
As explained before, we make a second assumption to control the behavior of traits between divisions.

Assumption 2.
There exist c_1, c_2 ≥ 0 such that, for all x ∈ X̄,

G f_γ(x) ≤ c_1 + c_2 f_γ(x),

where γ is defined in Assumption 1 and f_γ(x) = 1 + ‖x‖^γ for x ∈ X̄. This assumption ensures in particular that the trait does not blow up in finite time in the case of a deterministic dynamic for the trait. Assumptions 1(1) and 2 are linked via the parameter γ, which controls the balance between the growth of the population and the dynamic of the trait. In particular, if Assumption 1 is satisfied with γ = 0, the division rate is bounded and Assumption 2 is always satisfied.
Under Assumptions 1 and 2, the measure-valued process described above is well-defined as the strongly unique solution of a stochastic differential equation.
The proof of existence relies on a recursive construction of a solution via the sequence of jump times of the population. Then, we prove that this sequence is unique conditionally on the Poisson point measure determining the jumps in the population and on the stochastic flows corresponding to the dynamic of the trait. Finally, using Assumptions 1 and 2, we prove that the number of individuals in the population does not blow up in finite time. This is detailed in [60, Theorem 2.3].
2.3. The trait of sampled individuals at a fixed time: Many-to-One formula. In order to characterize the trait of a uniformly sampled individual, the spinal approach ([21], [55]) consists in following a "typical" individual in the population, whose behavior summarizes the behavior of the entire population. In this section, we give the dynamic of the trait of a typical individual: it is a Markov process called the auxiliary process.
From now on, we assume that m(x, s, t) > 0 for all x ∈ X̄, t ≥ 0 and s ≤ t. With a slight abuse of notation, we denote by X_s^u the trait of the unique ancestor living at time s of u. Let f be a non-negative measurable function and x ∈ X̄. The idea is to notice that the family of operators

P_{r,s}^{(t)} f(x) = E_{(x,r)}[ Σ_{u∈V_s} m(X_s^u, s, t) f(X_s^u) ] / m(x, r, t),

where the population starts from one individual with trait x at time r, is a conservative (non-homogeneous) semi-group. Then, the auxiliary process is defined as the time-inhomogeneous Markov process with associated family of semi-groups (P_{r,s}^{(t)}, r ≤ s ≤ t). It describes the dynamic of the trait of a "typical" individual in the population, in the sense that it satisfies a so-called "Many-to-One" formula. This formula holds under Assumptions 1 and 2 and under the following technical assumptions.

Assumption 3. There exists a function C such that, for all j ≤ k, j, k ∈ N, and 0 ≤ s ≤ t, the ratios m(y, s, t)/m(x, s, t) appearing in the biased reproduction law are bounded by C(s, t), uniformly in the traits. This assumption tells us that we control, uniformly in x, the benefit or the penalty of a division.

Assumption 4.
For all t ≥ 0 and x ∈ X̄, the semi-group (P_{r,s}^{(t)}) is differentiable at s = r. We can now state the Many-to-One formula.

Theorem 2.2 (Many-to-One formula). Under Assumptions 1-4, for all t ≥ 0, the auxiliary process (Y_s^{(t)}, s ≤ t) is an inhomogeneous Markov process with infinitesimal generator (A_s^{(t)}, D(A)), combining the generator G of the trait with a biased jump rate and a biased reproduction law, and it satisfies, for x ∈ X̄ and suitable path functionals f,

E_{(x,0)}[ Σ_{u∈V_t} f(X_s^u, s ≤ t) ] = m(x, 0, t) E[ f(Y_s^{(t)}, s ≤ t) ].     (2.2)

Comments on the proof. To prove the Many-to-One formula (2.2), we first show that the family of semi-groups (P_{r,s}^{(t)}, r ≤ s ≤ t) is uniquely defined as the unique solution of an integro-differential equation (see Lemma 3.2 in [60]). In particular, we need Assumption 3 to prove the uniqueness. Then, the infinitesimal generator is derived from (P_{r,s}^{(t)}, r ≤ s ≤ t) using Assumption 4.
Unlike previous works on this subject ([41], [45], [22]), the existence of our auxiliary process does not rely on the existence of spectral elements for the mean operator of the branching process. In particular, we can apply this result to models with a varying environment. The dynamic of this auxiliary process heavily depends on the comparison between m(x, s, t) and m(y, s, t), for x, y ∈ X̄. It emphasizes several biases due to the growth of the population. First, the auxiliary process jumps more than the original process if jumping is beneficial in terms of number of descendants. This phenomenon of time-acceleration also appears, for example, in [21], [55] or [45]. Moreover, the reproduction law favors the creation of a large number of descendants, as in [4], and the non-neutrality favors individuals with an "efficient" trait at birth in terms of number of descendants. Finally, a new bias appears on the dynamic of the trait, because of the combination of the random evolution of the trait and non-neutrality; if the dynamic of the trait is deterministic, this last bias vanishes and the trait part of the generator reduces to G.

Ancestral lineage of a uniform sampling at a fixed time in a large population.
The Many-to-One formula (2.2) gives the law of the trait of a typical individual in the population. But so far, we have not proved that it corresponds to the trait of a uniformly sampled individual. In particular, we have to take into account that the number of individuals in the population is stochastic and depends on the dynamic of the trait. Using the law of large numbers, we can approximate the number of individuals in a population starting from n individuals, divided by n, by the mean number of individuals in the population. That is why we now look at the ancestral lineage of a uniform sampling in a large population.
It only makes sense to speak of a uniformly sampled individual at time t if the population does not become extinct before time t. For all t ≥ 0, let Ω_t = {N_t > 0} denote the event of survival of the population. Let ν be a probability measure on X̄ satisfying a suitable moment condition, and set

Z_0^n = Σ_{i=1}^n δ_{X_i},

where the X_i are i.i.d. random variables with distribution ν. For t ≥ 0, we denote by U(t) the random variable with uniform distribution on V_t conditionally on Ω_t, and by (X_s^{U(t)}, s ≤ t) the process describing the trait of a sampled individual along its ancestral lineage. If X is a stochastic process, we denote by X^ν the process with initial distribution ν.

Theorem 2.3. Under Assumptions 1, 2, 3 and 4, for any t ≥ 0, the following convergence in law holds in D([0, t], X̄): conditionally on Ω_t, the process (X_s^{U(t)}, s ≤ t), for the population starting from Z_0^n, converges as n → +∞ to a mixture of auxiliary processes whose initial law is biased by ν.

Remark 2.4. If we start with n individuals with the same trait x, the limit is exactly the auxiliary process (Y_s^{(t)}, s ≤ t) started from x. Therefore, the auxiliary process describes exactly the dynamic of the trait of a uniformly sampled individual in the large population limit if all the starting individuals have the same trait. If the initial individuals have different traits at the beginning, the large population approximation of a uniformly sampled individual is a linear combination of the auxiliary processes.

Averaging for some integrate-and-fire models
3.1. The model. We examine the following slow-fast hybrid version of the integrate-and-fire models [17] used in mathematical neuroscience. In such a setting, X represents the membrane potential of a neural cell, which increases until it reaches some threshold c, corresponding to the time where a nerve impulse is triggered, and then the potential is reset to some lower value. More precisely, the process (X(t), t ∈ [0, T]), with T some time horizon, obeys the following dynamic:

(1) Initial state: At time T_0^* = 0, the process starts at X(T_0^*) = ξ_0, a random variable with support included in (m, c), where {c} is considered as a boundary and m < c is some real number.
(2) Deterministic dynamic: Between jumps, X follows the flow driven by the celerity process Y, a continuous time Markov chain on a finite set Y:

dX(t)/dt = α(Y(t)) F(X(t)),

where α is a positive measurable function such that α(Y) is bounded from above and F is a positive continuous function.

(3) Boundary hitting: The process X increases until the first time T_1^* at which it reaches the boundary {c}.
(4) Jumping measure: Then, at time T_1^{*−}, the process is constrained to stay inside (m, c) by jumping according to the Y-dependent measure µ_{Y(T_1^{*−})}, whose support is included in (m, c): for every Borel subset A of (m, c), P(X(T_1^*) ∈ A | Y(T_1^{*−}) = y) = µ_y(A).

(5) And so on: Go back to step (1), replacing T_0^* by T_1^* and ξ_0 by ξ_1 = X(T_1^*).

It is not hard to see that the process (X, Y) is a piecewise deterministic Markov process. Moreover, due to the presence of the boundary {c}, it is also a constrained Markov process. Our aim is, in this quite simple situation, to describe the behavior of the process X at the boundary c when the dynamic of the underlying process Y is infinitely accelerated.

Acceleration and averaged process.
From now on, we assume that the underlying process of celerities, that is, the continuous time Markov chain Y, has a fast dynamic, obtained by introducing a (small) timescale parameter ε such that Y^ε(t) = Y(t/ε) for all t ≥ 0. At the same time, to ensure a limiting behavior, we assume that Y is positive recurrent with intensity matrix Q = (q_{zy})_{z,y∈Y} and invariant probability measure π, considered as a row vector. For convenience, let us also define diag(α) as the diagonal matrix with diagonal entries {α(y); y ∈ Y}. As ε goes to zero, the process Y^ε converges towards the stationary state associated with Y, in the sense that, by the ergodic theorem, for all t ≥ 0 and y ∈ Y, lim_{ε→0} P(Y^ε(t) = y) = π(y).
Therefore, as ε goes to zero, the process X^ε, defined as X with Y replaced by Y^ε, should have its dynamic averaged with respect to the measure π. The behavior of the limiting process away from the boundary is indeed not hard to describe; see for example [49] and references therein. Here and later on, D([0, τ], R) is the space of càdlàg functions from [0, τ] to R, with τ a time horizon. For technical reasons, we assume that F is bounded above by 1, and we introduce the counting measure at the boundary for the process X^ε, which records its successive hitting times of {c}.

Figure 4. A trajectory of X, in a piecewise linear case, dX(t)/dt = Y(t), with Y switching between 1/2 and 1.

Proposition 3.1. Assume that ξ_0 is deterministic. Then, for any η > 0, the process X^ε converges in law, on D([0, (c − ξ_0)/max α(Y) − η], R), towards a process X̄ defined as the solution of

X̄(t) = ξ_0 + Σ_{y∈Y} π(y) α(y) ∫_0^t F(X̄(s)) ds.
We can read in (3.1) the behavior of the averaged process X̄. Away from the boundary, the dynamic of X̄ is that of the averaged flow, as described in Proposition 3.1. At the boundary, the averaged jump measure is not given by the averaged version of the jumping measure, that is, Σ_{y∈Y} π(y) µ_y; instead, the values of the celerity at the boundary are taken into account through the presence of α. There is a balance between the celerity intensities and the invariant probabilities. For example, consider a piecewise linear motion whose speed switches at rate 1/ε between 1 and 2. When ε goes to zero, the process spends as much time increasing with celerity 1 as with celerity 2, but when increasing with celerity 2, the process obviously covers twice as much distance in the same time window, and hence approaches the boundary twice as fast; it therefore has, in fact, twice the chance to hit the boundary with this celerity. Note that, because of the separation of variables in the form of the flow, the value of the process X̄ at the boundary does not appear in the expression of π*, as could be the case in more general situations. Theorems such as Theorem 3.2 can be derived using Prohorov's program: prove the tightness of the family {X^ε, ε ∈ (0, 1)} and then identify the limit. Tightness for constrained Markov processes, such as the process described in the present section, has an interesting and rich literature; see [48] and references therein. We refer to [40] for more details about the proof of Theorem 3.2.
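The α-weighting of the jump measure can be seen on a Monte Carlo sketch of the two-speed example from the text (speeds 1 and 2, equal switching rates, so π = (1/2, 1/2)); the boundary is hit with celerity 2 about twice as often as with celerity 1:

```python
import random

random.seed(4)

speeds = (1.0, 2.0)   # alpha(y) for the two states of Y
eps = 0.005           # timescale: Y^eps switches at rate 1/eps
c = 1.0               # boundary, starting from x = 0

def hitting_speed():
    """Run X^eps up to the boundary and return the celerity at the hit."""
    x, i = 0.0, random.randint(0, 1)
    while True:
        dt = random.expovariate(1.0 / eps)    # holding time of Y^eps
        if x + speeds[i] * dt >= c:
            return speeds[i]                  # boundary reached in this stretch
        x += speeds[i] * dt
        i = 1 - i                             # switch to the other celerity

runs = 4000
freq2 = sum(1 for _ in range(runs) if hitting_speed() == 2.0) / runs
print(freq2)   # close to (0.5 * 2) / (0.5 * 1 + 0.5 * 2) = 2/3
```

The empirical frequency matches the size-biased weighting π(y)α(y)/Σ_z π(z)α(z) discussed above, not the plain invariant law π.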

Stochastic models of protein production with cell division and gene replication
The production of proteins is the main activity of prokaryotic cells (like bacteria). These functional molecules represent about 3.6 × 10^6 molecules of 2000 different types, compose about half of the dry mass of the cell [15], and this pool needs to be duplicated within the cell cycle time. In total, it is estimated that 67% of the energy of the cell is dedicated to protein production [67]. Protein production is a two-step process by which the genetic information is first transcribed into an intermediate molecule, the messenger RNA (mRNA), and then translated into a protein. Experimental studies have shown that this production is subject to a large variability [37,62,68]. This heterogeneity can be explained either by the stochastic nature of the production mechanism (both transcription and translation are due to random encounters between molecules) or by the effect of cell-scale phenomena like DNA replication, cell division, or fluctuations in the resources.
In particular, the extensive experimental study of Taniguchi et al. (2010) [68] measured the variability of protein concentration for more than 1000 different proteins in E. coli. For each of them, the authors measured the average protein concentration, the average mRNA concentration and their lifetimes. They interpret different sources of protein variability, arising either from the protein production mechanism itself (i.e. the transcription of mRNAs and the translation of proteins) or from external factors, like gene replication, division or the sharing of common resources in the production.
One can interpret the experimental results in the light of classical stochastic models of protein production [11,65,64] that predict the variability of protein production. But these classical models do not consider important phenomena that may have an impact on the production: they usually consider neither the replication of the gene at some point in the cell cycle, nor the random assignment of each protein to either of the two daughter cells at division. Moreover, they only consider the numbers of mRNAs and proteins, without any explicit notion of volume, and therefore the notion of concentration is unclear in this case. Since the experimental study only considers concentrations, it is difficult to make quantitative comparisons between the variances predicted by these models and the ones experimentally measured.
We therefore propose here a more realistic model of protein production that takes into account all the basic features that can be expected for the production of one type of protein during the cell cycle: transcription, translation, gene replication and division. We also explicitly represent the numbers of mRNAs and proteins in the growing volume of the bacterium, so that we can consider the concentration of each of these entities in the cell. We then estimate the theoretical contribution of these features to the protein variance through mathematical analysis. We also give a biological interpretation of these results with respect to experimental measures.

4.1. Model and Results.
Our model is adapted from a simple constitutive gene model without any gene regulation (it is a particular case of the model presented in [64]). Like this classical model, ours is gene centered (it represents the production of a single protein) and has two stages (transcription and translation), with both the number of mRNAs, M, and the number of proteins, P, explicitly represented. Each event (creation or degradation of an mRNA, creation of a protein, etc.) occurs at a random time, with exponentially distributed time intervals. In addition to this classical framework, we introduce a notion of cell cycle represented by cell growth, division and gene replication. Two events occur periodically: the replication, at a deterministic time τ R in the cell cycle, which doubles the mRNA production rate; and the division, after a time τ D, which starts a new cell cycle. We can then follow one type of mRNA and protein through a cell lineage (see Figure 5a). Overall, the model represents five aspects of protein production (see Figure 5b):
mRNAs: Messenger-RNAs are transcribed at rate λ 1 before replication and at rate 2λ 1 after. Each transcription increases the number M of mRNAs by 1. As in the classical model, each mRNA has an exponential lifetime of rate σ 1 (so the global mRNA degradation rate is σ 1 M).
Proteins: Each mRNA is translated into a protein at rate λ 2 (so the global protein production rate is λ 2 M); each translation increases the number of proteins P by 1. As protein lifetimes are usually of the order of several cell cycles, we consider that proteins do not degrade.
Gene replication: At a deterministic time τ R after each division (with τ R < τ D), the gene responsible for mRNA transcription is replicated, hence doubling the mRNA transcription rate until the next division.
Division: Divisions occur periodically, after the deterministic time interval τ D. After each division, we only follow one of the two daughter cells (see Figure 5a), so that each mRNA and each protein has probability 1/2 of being in the followed cell. The result is a binomial sampling: for instance, with M mRNAs just before division, the number of mRNAs after division follows a binomial distribution B(n, p) with parameters n = M and p = 1/2. Moreover, as there is only one copy of the gene in the newborn cell, the mRNA transcription rate is reset to λ 1 until the next gene replication.
Volume growth: The volume V (s) is the entire volume of the cell at time s of the cell cycle; it increases deterministically as the cell grows.
In real-life experiments, a bacterium's volume grows roughly exponentially (see [70]) and approximately doubles by the division time τ D. Consequently, in our model, at time s of the cell cycle the volume is V (s) = V 0 2^{s/τ D}, with V 0 the typical size of a cell at birth.
From this, if M s and P s denote respectively the numbers of mRNAs and proteins at time s, the concentrations are M s /V (s) and P s /V (s).
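The dynamics described above can be sketched as a small Monte Carlo simulation over a cell lineage. The following is a minimal illustration, not the authors' code: the parameter values and the function name `simulate_cycle` are ours, chosen only for illustration. Between the deterministic events (replication at τ R, division at τ D), chemical events occur after exponential waiting times, and division applies the binomial partitioning described above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical parameter values, for illustration only
lam1, sig1, lam2 = 2.0, 0.5, 4.0   # transcription, mRNA degradation, translation rates
tau_R, tau_D = 0.4, 1.0            # replication and division times in the cycle

def simulate_cycle(M, P, rng):
    """Simulate (M, P) over one cell cycle [0, tau_D] and return the state
    of the followed daughter cell just after division."""
    s = 0.0
    while True:
        lam_tx = lam1 if s < tau_R else 2.0 * lam1   # rate doubles after replication
        rates = np.array([lam_tx, sig1 * M, lam2 * M])
        total = rates.sum()
        dt = rng.exponential(1.0 / total)
        # next deterministic event: replication (rate change only) or division
        next_det = tau_R if s < tau_R else tau_D
        if s + dt >= next_det:
            s = next_det
            if s >= tau_D:
                break
            continue   # past tau_R: resample with the doubled rate (memoryless)
        s += dt
        event = rng.choice(3, p=rates / total)
        if event == 0:
            M += 1     # transcription
        elif event == 1:
            M -= 1     # mRNA degradation
        else:
            P += 1     # translation
    # binomial sampling at division: each molecule kept with probability 1/2
    return rng.binomial(M, 0.5), rng.binomial(P, 0.5)

M, P = 0, 0
for _ in range(50):    # iterate cycles toward the lineage equilibrium
    M, P = simulate_cycle(M, P, rng)
print("mRNAs and proteins at birth after 50 cycles:", M, P)
```

Concentrations at time s of a cycle would then be obtained by dividing the counts by V (s) = V 0 2^{s/τ D}.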
Our main results are analytical expressions for the distribution of the number of mRNAs and for the first moments of the number of proteins at any time s of the cell cycle, when the system is at equilibrium (i.e. after many cell divisions).
The proof of Proposition 4.1 uses the framework of marked Poisson point processes to determine the mRNA distribution at any time of the cell cycle, conditionally on the random variable M 0, the number of mRNAs at birth. The distribution of M 0 is then determined by the equilibrium condition that the distribution of M just after division coincides with its distribution at time 0. The proof will be given in [34] (in preparation).
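Although Proposition 4.1 is not reproduced here, the first moment of M 0 at equilibrium can be recovered from the model description alone: within a cycle the mean satisfies the linear ODE dE[M]/ds = λ(s) − σ 1 E[M] (with λ(s) = λ 1 before τ R and 2λ 1 after), and the binomial sampling halves the mean at division. A sketch with illustrative parameter values (not taken from the paper):

```python
import math

# Illustrative parameter values
lam1, sig1 = 2.0, 0.5
tau_R, tau_D = 0.4, 1.0

def mean_M(m0):
    """Mean mRNA number just before division, starting from mean m0 at birth.
    Solves dE[M]/ds = lam(s) - sig1*E[M] piecewise (lam doubles at tau_R)."""
    mR = m0 * math.exp(-sig1 * tau_R) + (lam1 / sig1) * (1 - math.exp(-sig1 * tau_R))
    d = tau_D - tau_R
    return mR * math.exp(-sig1 * d) + (2 * lam1 / sig1) * (1 - math.exp(-sig1 * d))

# mean_M is affine in m0: mean_M(m0) = a*m0 + b
a = math.exp(-sig1 * tau_D)   # slope
b = mean_M(0.0)               # constant part
# equilibrium condition: m0 = mean_M(m0)/2  =>  m0 = b / (2 - a)
m0_eq = b / (2.0 - a)
assert abs(mean_M(m0_eq) / 2.0 - m0_eq) < 1e-12
print(f"equilibrium mean mRNA number at birth: {m0_eq:.4f}")
```

This only checks the mean; the full distribution of M 0 is what Proposition 4.1 and the marked-point-process argument provide.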

Proposition 4.2.
At any time s ∈ [0, τ R [ before replication, the mean and the variance of P s are given as functions of the moments of (P 0 , M 0 ), with x 0 as defined in Proposition 4.1.
Moreover, a similar result holds for s ∈ [τ R , τ D [, after gene replication. At equilibrium, the distribution of the couple (M, P ) just after division is the same as at the beginning of the cell cycle; this allows us to determine explicit values for E [P 0 ], Var [P 0 ] and Cov [P 0 , M 0 ]. These complementary results and the proof of Proposition 4.2 will be available in [34] (in preparation).
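In the same spirit, and without reproducing the variance formulas of Proposition 4.2, the equilibrium mean protein number at birth follows from dE[P]/ds = λ 2 E[M(s)] (no degradation) together with the halving at division: at equilibrium, E[P 0] = (E[P 0] + gain)/2 with gain = λ 2 ∫ 0^{τ D} E[M(s)] ds, hence E[P 0] = gain. A numerical sketch with illustrative parameters (not from the paper):

```python
import math

# Illustrative parameter values
lam1, sig1, lam2 = 2.0, 0.5, 4.0
tau_R, tau_D = 0.4, 1.0

def mean_M(s, m0):
    """Mean mRNA number at cycle time s, given mean m0 at birth
    (transcription rate lam1 before tau_R, 2*lam1 after)."""
    if s <= tau_R:
        return m0 * math.exp(-sig1 * s) + (lam1 / sig1) * (1 - math.exp(-sig1 * s))
    mR = mean_M(tau_R, m0)
    d = s - tau_R
    return mR * math.exp(-sig1 * d) + (2 * lam1 / sig1) * (1 - math.exp(-sig1 * d))

# equilibrium mean mRNA at birth (fixed point of m0 -> mean_M(tau_D, m0)/2)
a = math.exp(-sig1 * tau_D)
m0_eq = mean_M(tau_D, 0.0) / (2.0 - a)

# trapezoidal quadrature of lam2 * integral of E[M(s)] over the cycle
n = 10000
ss = [tau_D * i / n for i in range(n + 1)]
vals = [mean_M(s, m0_eq) for s in ss]
P0_eq = lam2 * (tau_D / n) * (sum(vals) - 0.5 * (vals[0] + vals[-1]))
print(f"equilibrium mean protein number at birth: {P0_eq:.2f}")
```

The second moments (Var [P 0 ] and Cov [P 0 , M 0 ]) require the full equilibrium analysis announced in [34] and are not captured by this first-moment computation.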

4.2. Discussion.
In order to use realistic values for the parameters λ 1 , σ 1 , λ 2 and τ R , we fix them to match the average production of each protein measured in Taniguchi et al. (2010) [68]. This gives groups of parameters representing a large spectrum of different genes. Then, using Propositions 4.1 and 4.2, as well as the additional results, we can compute the concentration moments for every gene at any time of the cell cycle.
An example of the evolution of one protein concentration is shown in Figure 6. The mean concentration E [P s /V (s)] at a given time s is not constant during the cell cycle: it fluctuates within about 2% of the global average protein concentration (denoted by µ p in the graph). Experimental measures of average protein expression during the cell cycle show similar results: for instance, the expression of genes at different positions on the chromosome has been measured in [69]; it shows a similar profile during the cell cycle, with fluctuations also of around 2% of the global average (see Figure 1.d and Figure S6.b of that article).
Overall, for all the genes considered, the effect of the cell cycle (the fluctuation of E [P s /V (s)] around µ p) is small compared to the standard deviation √Var [P s /V (s)] at each moment of the cell cycle (the coloured areas in Figure 6). We have determined the range of parameters where, on the contrary, the fluctuations due to the cell cycle would exceed the inherent variability of the system. Such a regime is possible only for a highly active gene (high λ 1) whose mRNAs are not very actively translated (low λ 2) and are short-lived (high σ 1). Biologically, such a regime seems unrealistic because the cost of mRNA production would be very high.
In Figure 7, we compare the experimental results of Taniguchi et al. (2010) [68] with the analytical expressions obtained in our model, for the global coefficient of variation (CV) of protein concentration as a function of the average concentration. In the experimental measures, two regimes clearly appear for the CV of protein concentration: for lowly expressed proteins (mean protein concentration < 10), the CV roughly scales inversely with the average concentration; for genes with higher protein production (mean protein concentration > 10), the CV becomes independent of the average protein production level, with a plateau around 10^{-1}. In our model, the CV globally scales roughly inversely with the average protein concentration, and there is no lower bound for highly expressed proteins, contrary to the experiments.
[Figure 7: (A) CV of protein concentration (with Var the variance and µ p the mean) experimentally measured in [68]. Two regimes clearly appear: the first (for µ p < 10) shows a CV inversely proportional to the mean; the second (for µ p > 10) shows a CV independent of the average production. (B) Same graph obtained with the proteins of our model. Even if the CV is roughly inversely proportional to the average production, there is no lower bound for highly expressed proteins.]
We have therefore shown with this model that DNA replication and division cannot explain the second regime observed in Figure 7a. The authors of the experimental study propose another possible origin for the noise observed in highly produced proteins: fluctuations in the availability of RNA-polymerases and ribosomes. These two macro-molecules are shared among the productions of all the different proteins, since they participate respectively in the transcription of mRNAs and the translation of proteins. As they are present in limited quantities, fluctuations in their availability could indeed have repercussions on the protein noise. We will present in [34] an extension of the present model that takes into account this sharing of ribosomes and RNA-polymerases among all the proteins of the bacterium, and its impact on the protein noise.