Simulating non-Markovian stochastic processes

arXiv:1310.0926v3 [cond-mat.dis-nn] 30 Jul 2014

We present a simple and general framework to simulate statistically correct realizations of a system of non-Markovian discrete stochastic processes. We give the exact analytical solution and a practical and efficient algorithm akin to the Gillespie algorithm for Markovian processes, with the difference that now the occurrence rates of the events depend on the time elapsed since the event last took place. We use our non-Markovian generalized Gillespie stochastic simulation methodology to investigate the effects of non-exponential inter-event time distributions in the susceptible-infected-susceptible model of epidemic spreading. Strikingly, our results unveil the drastic effects that very subtle differences in the modeling of non-Markovian processes have on the global behavior of complex systems, with important implications for their understanding and prediction. We also assess our generalized Gillespie algorithm on a system of biochemical reactions with time delays. Compared to other existing methods, we find that the generalized Gillespie algorithm is the most general, as it can be implemented very easily in cases, like delays coupled to the evolution of the system, where other algorithms do not work or need adapted, computationally less efficient versions.


I. INTRODUCTION
Discrete stochastic processes are widespread in nature and human-made systems. Chemical reactions and biochemical processes in living cells, epidemic propagation in populations, and diffusion of information in societies and technological networks are all examples of systems whose states change at discrete random intervals, defining a sequence of events that forms a mixture of temporal point processes. In general, these processes are assumed to be memoryless, with future occurrences predictable solely from the present state of the system, and with exponentially distributed inter-event times, so that the dynamics can be described only in terms of the rates of occurrence of each of the processes involved. In this case, there exist stochastic simulation algorithms able to generate statistically exact realizations of the stochastic process, including the seminal method developed by Gillespie for Markovian dynamics modeled by Poisson point processes and its variations [1][2][3].
In this paper, we develop a simple and general framework to simulate statistically correct realizations of discrete stochastic processes, each with an arbitrary inter-event time distribution, that may stochastically create or annihilate other processes and that can depend on the current state of the system. We provide the exact solution to the problem, along with an approximation in the limit of large systems that leads to an efficient and simple stochastic simulation algorithm, in the same spirit as the Gillespie algorithm, for practical applicability. We apply our generalized Gillespie algorithm to two case studies: the susceptible-infected-susceptible epidemic spreading model in contact networks, and a system of coupled biochemical reactions with time delays, for which we compare against existing methods. Our results highlight the important effects that subtle differences in the non-Markovian dynamical rules underlying the stochastic processes have on the global behavior of complex systems.

II. MARKOVIAN STOCHASTIC SIMULATION: THE GILLESPIE ALGORITHM
The Gillespie algorithm [1,2] was originally designed to simulate systems of coupled (bio)chemical reactions within a thermal bath in a well-mixed environment but, more generally, it can be applied to any system of discrete Markovian stochastic processes. The algorithm takes advantage of the theory of superposition of a (fixed) number of renewal processes [28]. Suppose we have a collection of N statistically independent discrete stochastic processes, each occurring at rate λ_i, i = 1, · · · , N. The Gillespie algorithm generates a sequence of events by specifying, at each step of the simulation, the time until the next event, τ, generated from the distribution ϕ(τ), and the next event i, generated from the probability Π(i). It can be proved that for Poisson point processes (constant rates) these are given by

ϕ(τ) = N λ̄ exp(−N λ̄ τ),   Π(i) = λ_i/(N λ̄),   (1)

where λ̄ = N⁻¹ Σ_{k=1}^{N} λ_k is the population mean rate of the set of processes [28]. Notice that, in general, the occurrence of a particular event can modify, besides the state of the system, both the rates and the number of "active" processes (those that can occur given the current state of the system) for the next iteration. For instance, a given reaction taking place modifies the number of molecules of all species involved in that particular reaction which, in turn, modifies the rates of occurrence of all reactions in which these species participate. This makes the algorithm extremely powerful and versatile, as it can simulate reaction-like processes for which the number of processes is stochastically generated by the realization, including non-equilibrium dynamics with absorbing states.
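As an illustration, the two random draws of Eq. (1) can be sketched in a few lines of Python (a minimal sketch; function names are ours):

```python
import math
import random

def gillespie_step(rates, rng=random):
    """One iteration of the classical Gillespie algorithm.

    Given the list of constant rates lambda_k, draw the waiting time tau
    until the next event from phi(tau) = N*lam_bar*exp(-N*lam_bar*tau),
    an exponential with the total rate, and the index i of the firing
    process with probability Pi(i) = lambda_i / sum_k lambda_k.
    """
    total = sum(rates)                      # N * lam_bar
    tau = -math.log(rng.random()) / total   # exponential waiting time
    # choose the process proportionally to its rate
    u, acc = rng.random() * total, 0.0
    for i, lam in enumerate(rates):
        acc += lam
        if u < acc:
            return tau, i
    return tau, len(rates) - 1
```

After each step, the rates and the set of active processes would be updated according to the state change produced by event i.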

III. NON-MARKOVIAN STOCHASTIC SIMULATION
Next, we generalize the Gillespie algorithm to account for non-Markovian inter-event times. As before, we consider a set of N statistically independent discrete stochastic processes, each with an inter-event time distribution ψ_i(τ); i = 1, · · · , N [29]. Suppose now that, for a given process i and a given point in time, we know the time elapsed since the last event, t_i, and ask for the probability that the next event will occur at a time between τ and τ + dτ from that moment (hereafter, we use the Latin symbol t_i to denote elapsed times and the Greek symbol τ to denote the time until a future event). This probability density can be expressed as [30]

ψ_i(τ|t_i) = ψ_i(τ + t_i)/Ψ_i(t_i),   (2)

where Ψ_i(τ) is the survival probability of process i, that is, the probability that the time until the next event is longer than τ, Ψ_i(τ) = ∫_τ^∞ ψ_i(s) ds. Analogously, the conditional survival probability of process i is given by

Ψ_i(τ|t_i) = Ψ_i(τ + t_i)/Ψ_i(t_i).

In a single realization of the dynamics, all processes happen in the same timeline in a random order. Therefore, to generate a statistically correct sequence of events in a simulation, we have to evaluate the joint probability ϕ(τ, i|{t_k}) that, given the times {t_k} elapsed since the last occurrence of each process up to a given point in time t, the next event taking place corresponds to process i and occurs at time t + τ. Since the probability that a process k ≠ i does not occur is Ψ_k(τ|t_k), we have

ϕ(τ, i|{t_k}) = ψ_i(τ|t_i) Π_{k≠i} Ψ_k(τ|t_k),   (3)

where

Φ(τ|{t_k}) = Π_{k=1}^{N} Ψ_k(τ|t_k)   (4)

is the survival probability of τ, i.e., the probability that no event occurs before t + τ. Note that the joint probability Eq. (3) is well normalized.
Given the occurrence time τ, the probability that the next occurring event belongs to process i is

Π(i|τ, {t_k}) = λ_i(t_i + τ) / Σ_{k=1}^{N} λ_k(t_k + τ),   (5)

where we have introduced the instantaneous (hazard) rate of process k as

λ_k(t_k) ≡ ψ_k(t_k)/Ψ_k(t_k).   (6)

Equations (4) and (5) provide us with an algorithm that generates statistically correct sequences of events. Specifically:

1. Initialize the elapsed times t_k of all processes.

2. Draw a random time τ from the cumulative distribution Eq. (4), by solving Φ(τ|{t_k}) = u, where u is a uniform random number in the interval (0, 1), and update the current time as t → t + τ.

3. Choose a process i from the discrete distribution Eq. (5).

4. Update the list of elapsed times as t_k → t_k + τ for all k ≠ i and t_i → 0.

5. Update the state of the system and, if needed, the set of active processes. If a new process, say process k, is activated, set its elapsed time t_k. Go to step 2.
The initialization of elapsed times can be implemented in different ways depending on the particular application. One simple possibility would be to set all initial elapsed times to zero. Another approach is to assume that the system is already in the steady state and, thus, set elapsed times according to the probability density Ψ_i(t)/⟨τ⟩_i, where ⟨τ⟩_i is the average inter-event time of process i [31].
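Step 2 above requires solving Φ(τ|{t_k}) = u numerically. Since Φ decreases monotonically from 1 to 0, bisection suffices; a minimal Python sketch, assuming Weibull inter-event time distributions (function names and tolerances are ours):

```python
import math

def weibull_survival(t, mu, alpha):
    """Psi(t) = exp(-(mu*t)**alpha), the Weibull survival probability."""
    return math.exp(-((mu * t) ** alpha))

def exact_waiting_time(elapsed, mus, alphas, u, tau_max=1e6):
    """Solve Phi(tau | {t_k}) = u for tau (step 2 of the exact algorithm).

    Phi is the product of the conditional survivals
    Psi_k(tau + t_k)/Psi_k(t_k); it decreases monotonically in tau,
    so a simple bisection converges to machine precision.
    """
    def phi(tau):
        p = 1.0
        for t, mu, a in zip(elapsed, mus, alphas):
            p *= weibull_survival(t + tau, mu, a) / weibull_survival(t, mu, a)
        return p

    lo, hi = 0.0, tau_max
    for _ in range(200):  # interval halves each iteration
        mid = 0.5 * (lo + hi)
        if phi(mid) > u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For a single exponential process (α = 1) this reduces to the familiar τ = −ln(u)/μ; the root-finding cost per event is what motivates the large-N approximation of the next subsection.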

A. Generalized Gillespie algorithm
The most frequent applications typically involve a fairly large number of processes N. It is possible to work out a simple approximation that becomes exact in the limit N → ∞ and drastically simplifies the numerical computation of the time τ needed in step 2 of the algorithm. We start by rewriting Eq. (4) as

Φ(τ|{t_k}) = exp[ Σ_{k=1}^{N} ln Ψ_k(τ|t_k) ].   (7)

Each term in the sum within the exponential is a monotonically decreasing function of τ. Therefore, when N ≫ 1, the survival probability Φ(τ|{t_k}) is close to zero everywhere except near τ = 0. Hence we only need to consider Φ(τ|{t_k}) around τ = 0, where an expansion in small τ can be performed:

ln Ψ_k(τ|t_k) = ln Ψ_k(τ + t_k) − ln Ψ_k(t_k) ≈ −λ_k(t_k) τ.

Plugging this expression into Eq. (7), we can write

Φ(τ|{t_k}) ≈ exp[−N λ̄({t_k}) τ],   (8)

where the average rate is λ̄({t_k}) = N⁻¹ Σ_{k=1}^{N} λ_k(t_k). The previous expansion assumes that Ψ_k(τ + t_k) is analytic at t_k = 0, a hypothesis which is sometimes not true. To overcome this singular case, we remove the last event, the one for which t_last = 0, from the sum in λ̄({t_k}). This implies that the probability of choosing the same event twice in a row with our algorithm is zero. While this restriction is in general not present in the real dynamics, the probability of such an event is negligible for large N and, thus, our assumption does not have any noticeable effect while avoiding a potential divergence of the algorithm in cases where lim_{τ→0⁺} ψ_i(τ) = ∞ [32].
Within this approximation, the probability that the next event taking place belongs to process i can be obtained from Eq. (5) by setting τ = 0:

Π(i|{t_k}) = λ_i(t_i) / [N λ̄({t_k})],   (9)

whereas the distribution of the time until the next event is

ϕ(τ|{t_k}) = N λ̄({t_k}) exp[−N λ̄({t_k}) τ].   (10)

In the Markovian case, λ_i(t_i) = λ_i and we recover the classical Gillespie algorithm given in Eq. (1). In fact, quite remarkably, the new algorithm works as the original Gillespie algorithm with the difference that now the individual rates depend on the elapsed times of the processes and, therefore, are stochastic processes themselves. We name this generalized algorithm the "non-Markovian Gillespie algorithm" (nMGA). We test the nMGA with a set of N = 10³ independent renewal processes. The inter-event time survival probability is taken to be the versatile Weibull distribution,

Ψ_i(t) = exp[−(μ_i t)^{α_i}],   (11)

for all processes. However, the scale and shape parameters of each process, μ_i⁻¹ and α_i, are chosen uniformly at random in the intervals μ_i ∈ (0.1, 1) and α_i ∈ (0.5, 1.5), so that processes with temporal scales that differ by many orders of magnitude are mixed in the simulation. The individual instantaneous rates to be used in Eqs. (9) and (10) are given by

λ_i(t_i) = α_i μ_i (μ_i t_i)^{α_i − 1}.   (12)

Note that the rates diverge at t_i = 0 whenever α_i < 1. We generate a single long sequence of mixed events according to the nMGA. Then, we measure the inter-event time survival probability for each process. In Fig. 1a, we show the comparison between the measured survival probability for three such processes and the theoretical distribution Eq. (11) with the corresponding parameters. As can be seen, the agreement is extremely good even when processes with very different time scales are combined. In Fig. 1b, we check the effect of having a limited number of processes. Even though the nMGA is only approximate, our numerical simulations indicate that even for a small number of processes (20 in our simulations), the algorithm reproduces the inter-event times very accurately, with a small deviation for processes with α_i > 1.
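One iteration of the nMGA for Weibull processes can then be sketched as follows; the last-fired process (elapsed time zero) is excluded from the total rate, as described above, to avoid the divergence for α_i < 1 (a minimal sketch; function names are ours):

```python
import math
import random

def weibull_rate(t, mu, alpha):
    """Instantaneous (hazard) rate of a Weibull process, Eq. (12):
    lambda(t) = alpha*mu*(mu*t)**(alpha-1), which diverges at t = 0
    when alpha < 1. The process that fired last (t = 0) is excluded
    from the total rate, following the prescription in the text."""
    if t == 0.0:
        return 0.0
    return alpha * mu * (mu * t) ** (alpha - 1.0)

def nmga_step(elapsed, mus, alphas, rng=random):
    """One nMGA iteration (large-N approximation).

    tau is exponential with the instantaneous total rate
    sum_k lambda_k(t_k) (Eq. (10)); the firing process is chosen
    proportionally to its instantaneous rate (Eq. (9)). Elapsed times
    are updated in place: t_k -> t_k + tau, and t_i -> 0 for the firer.
    """
    rates = [weibull_rate(t, m, a) for t, m, a in zip(elapsed, mus, alphas)]
    total = sum(rates)
    tau = -math.log(rng.random()) / total
    u, acc, i = rng.random() * total, 0.0, len(rates) - 1
    for k, lam in enumerate(rates):
        acc += lam
        if u < acc:
            i = k
            break
    for k in range(len(elapsed)):
        elapsed[k] += tau
    elapsed[i] = 0.0
    return tau, i
```

For α_k = 1 the hazards are constant and the step coincides with the classical Gillespie iteration.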
Next, we present two relevant examples. In the first, we shall see the effects of a non-Markovian dynamics and, as opposed to the Markovian case, the importance of the specific details of the laws governing the dynamics. In the second, we compare the computational efficiency of the nMGA with other existing methods.

IV. EPIDEMIC SPREADING. THE SIS MODEL AS A CASE STUDY
The Susceptible-Infected-Susceptible (SIS) model is one of the simplest and most paradigmatic models of epidemic spreading [33]. In this model, individuals within a contact network can be in two states, either susceptible or infected. Infected individuals remain in this state during a random time and then become susceptible again. Susceptible individuals can become infected if they are in contact with infected neighbors. With few exceptions [27,34,35], epidemic processes are always considered Markovian so that, in the SIS case, infected individuals recover spontaneously at rate β and susceptible ones become infected at rate λ times the number of infected neighbors. This dynamics undergoes a phase transition between an absorbing (healthy) phase, where any infectious outbreak disappears exponentially fast, and an endemic phase with sustained epidemic activity. This transition takes place at a critical value of the effective infection rate λ_eff = λ/β that depends on the topology of the contact network [36][37][38][39][40][41][42]. Here we consider the SIS dynamics on top of the least structured network, the classical Erdős-Rényi random graph [43]. In this simple model, pairs of nodes out of a set of N nodes are connected with probability p = ⟨k⟩/N, where ⟨k⟩ is the average degree of the network. In the limit N ≫ 1, this procedure generates a maximally random graph with a Poisson degree distribution. The critical value of the effective infection rate in these model networks is approximately λ_eff^c = 1/⟨k⟩ [44]. In the subsequent sections we investigate the role of non-Markovian effects.
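For reference, the contact networks used in the simulations can be generated in a few lines; a minimal Python sketch of the G(N, p) construction described above (the function name is ours):

```python
import random

def erdos_renyi(N, mean_degree, rng=random):
    """Classical G(N, p) random graph with p = <k>/N: every pair of
    nodes is connected independently with probability p. Returns an
    adjacency list."""
    p = mean_degree / N
    adj = [[] for _ in range(N)]
    for i in range(N):
        for j in range(i + 1, N):
            if rng.random() < p:
                adj[i].append(j)
                adj[j].append(i)
    return adj
```

For ⟨k⟩ = 5, the Markovian critical point is then expected around λ_eff^c ≈ 1/⟨k⟩ = 0.2.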

A. Independent infections
In the non-Markovian case, the time that individuals remain infected follows the distribution ψ_rec(τ), in general non-exponential. This means that, to apply the nMGA, we have to keep track, for each infected individual, of the time elapsed since infection. The infection process is more involved. In this subsection, we consider that each active link (connecting a susceptible-infected pair) defines a statistically independent infection process following the distribution ψ_inf(τ). That is, a susceptible individual connected to a single infected individual will become infected after a random time distributed according to ψ_inf(τ), measured from the moment the link became active. If the susceptible individual is connected to more than one infected neighbor, each active link is considered statistically independent, so that the infection event takes place at the first firing of any of the current active links. Because the dynamics is non-Markovian, the infection of a susceptible individual depends not only on the number of active links (infected neighbors) but also on the elapsed time of each active link, which is, in general, different for each infected neighbor.
Therefore, the complexity of the infection course is related to the specific process that leads to the generation of an active link. Indeed, an active link connecting infected individual A and susceptible individual B can reach this configuration from two different scenarios, as illustrated in the top panel of Fig. 2. In the first one, both A and B are originally susceptible and individual A becomes infected by one of his neighbors other than B, generating a new active link. In the second scenario, both A and B are infected and individual B recovers, so that an active link is likewise created. In the first scenario, it is clear that the active link is new and, therefore, its elapsed time is set to zero, t_AB = 0; we call this "rule 1". In the second scenario, we can again use rule 1 and set t_AB = 0. However, we could also argue that infected individual A is the one that performs the action of infection and, thus, we could consider that the elapsed time of the active link is, in this case, the elapsed time of infected individual A since he became infected, that is, t_AB = t_A. We call this "rule 2", and it is the point of view taken in [27].

We first consider the case of Poisson statistics for recovery events and a Weibull distribution with parameter α for the infection process. To compare with the Markovian case, we use as a control parameter a generalization of the effective infection rate, defined as the ratio between the average recovery time and the average infection time, λ_eff = ⟨t_rec⟩/⟨t_inf⟩. This definition reduces to the effective infection rate used in the Markovian case. Figure 2 shows the epidemic prevalence (fraction of infected individuals) at the steady state in a network of size N = 10⁴ and average degree ⟨k⟩ = 5 as a function of λ_eff.
As can be seen in the figure (and also reported in [27]), non-Markovian statistics modify the position of the critical point significantly. However, there are also important differences between rule 1 and rule 2 for the same values of α. For α > 1, the prevalence for rule 2 is always above the one for rule 1, and vice versa for α < 1. This difference can be understood by analyzing the average infection time of an active link conditioned on a given elapsed time, that is, the first moment of the probability density Eq. (2). In the case of a Weibull distribution, the average time until the next event is an increasing function of the elapsed time when α < 1, whereas it is decreasing when α > 1. When a new active link is generated, its elapsed time is always above zero with rule 2, whereas it is exactly zero with rule 1. Therefore, the average infection time with rule 2 is longer or shorter than in the case of rule 1 whenever α < 1 or α > 1, respectively.
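The argument above can be checked numerically. The sketch below estimates the mean residual waiting time E[τ|t] = ∫₀^∞ Ψ(τ+t)/Ψ(t) dτ of a Weibull process by simple quadrature (a sketch; the function name, grid, and cutoff are ours):

```python
import math

def weibull_mean_residual(t, mu, alpha, n=100000, tau_max=50.0):
    """Average time until the next event given an elapsed time t,
    E[tau|t] = int_0^inf Psi(tau + t)/Psi(t) dtau, for the Weibull
    survival Psi(t) = exp(-(mu*t)**alpha) (midpoint-rule quadrature,
    truncated at tau_max)."""
    psi_t = math.exp(-((mu * t) ** alpha))
    h = tau_max / n
    total = 0.0
    for k in range(n):
        tau = (k + 0.5) * h
        total += math.exp(-((mu * (t + tau)) ** alpha))
    return total * h / psi_t
```

For α < 1 the mean residual time grows with the elapsed time t, for α > 1 it shrinks, and for α = 1 it stays at 1/μ, which is why rules 1 and 2 order the prevalence curves differently on either side of α = 1.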
The effect of a non-Poisson recovery time distribution is much weaker than that of the non-Markovian infection dynamics. Indeed, when using rule 1, we do not find any noticeable difference with respect to the Markovian case, whereas there are minor differences when using rule 2, but only for very heterogeneous recovery time distributions.

B. Cooperative infections
One of the consequences of the "independent infections" assumption made in the previous subsection is that, for a given individual, the total infection rate at a given time is the sum of the instantaneous rates of all her active links at that time. This is a reasonable assumption when rule 2 is used because, in this case, the infected node is the one associated with the random event of infecting the neighbor, whereas the susceptible node is only a passive actor of the process. However, this is not the case when rule 1 is in use because, in such a situation, the susceptible node is the one actively associated with the random infectious event. A naive picture of the difference between these two cases is as follows. For rule 2, we can imagine the infected node firing infective agents imprecisely at her neighbor, so that the susceptible node only becomes infected once one of these agents hits her. The random infection time is then the random time the infected node takes to hit her neighbor, a process attributed solely to the infected node. For rule 1, we can imagine the infected node firing with perfect precision at her neighbor, who is endowed with a protective shield that is destroyed after some exposure to the infective agent and regenerated once the individual becomes susceptible again. In this case, however, there is no a priori reason to assume independence between different active links, which could act cooperatively and non-linearly to infect the susceptible individual.
To explore this possibility, we consider a simple example where the total infection rate of a given susceptible node i at time t is

λ_i(t) = [ Σ_j a_ij n_j(t) λ_inf(t_ij) ]^σ,   (13)

where a_ij is the adjacency matrix, n_j(t) = 1 if node j is infected at time t and zero otherwise, and t_ij is the time the link i−j has been active. As before, the instantaneous rate λ_inf is given by Eq. (12). For σ = 1 we recover the case of independent infections. This scheme can be readily implemented with the nMGA. Results are shown in Fig. 3. Non-linear infections have an important effect on the prevalence, increasing it when σ > 1 and decreasing it when σ < 1. In this case, however, the position of the critical point is not affected. The reason is that, in the low-prevalence regime close to the critical point, the number of infected neighbors is very small and, thus, we are effectively in the same regime as in the case of independent infections. It is also possible to implement more complex non-linear schemes, such as threshold models for which the instantaneous infection rate is zero below a given value. In all these cases the nMGA can be applied.
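As an illustration, the non-linear total rate can be sketched as follows, assuming the power-law coupling written above (the sum of the active-link hazards raised to the power σ); the data structures and names are ours:

```python
def infection_rate(i, adj, infected, t_active, hazard, sigma):
    """Total instantaneous infection rate of susceptible node i: the
    hazard rates of all its active links are summed and the total is
    raised to the power sigma; sigma = 1 recovers independent infections.

    adj      : dict node -> list of neighbours (adjacency matrix a_ij)
    infected : dict node -> bool (n_j(t))
    t_active : dict (i, j) -> elapsed time t_ij of the active link
    hazard   : instantaneous rate lambda(t) of the infection process
    """
    total = sum(hazard(t_active[(i, j)]) for j in adj[i] if infected[j])
    return total ** sigma
```

Near the critical point, a susceptible node rarely has more than one infected neighbor, so the exponent barely matters there, consistent with the unchanged critical point observed in Fig. 3.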

V. ASSESSMENT OF THE GENERALIZED GILLESPIE ALGORITHM ON A SYSTEM OF BIOCHEMICAL REACTIONS WITH DELAYS
To show the generality and assess the performance of the nMGA as compared to other existing methods, we apply our approach to a stochastic system of reactions with time delays. Time delays account for the non-Markovian nature of many random processes that play a key role in a wealth of problems in molecular biology involving biochemical reactions or transport. For instance, time delays can model slow processes composed of sequential multistage reactions that can induce stochastic oscillations in gene expression [17,23,25]. In neurotransmission, time delays can be related to the trapping of particles in dendritic spines, explaining their anomalous diffusion [45].
This relevance prompted several attempts to adapt Gillespie's algorithm to implement biochemical reactions with time delays for the analysis of gene regulation [17][18][19][20][21]. When time delays are independent of the evolution of the system the proposed nMGA does not necessarily outperform those previously proposed methods based on annotated lists of future events [17,18,20] or Anderson's modified next reaction algorithm for systems with delays (algorithm 7 in [21]). However, when there is a coupling between the distribution of the time delays and the state of the system, none of the previously developed methods can be straightforwardly applied. The reason is that those methods assume that delay times must be chosen at the moment of the initiation of each reaction, which is clearly not possible if delays depend on the changing state of the system. In Appendix C, we show how to modify Anderson's algorithm to deal with this more general case although, as we shall see below, it is slower than the nMGA.
As an example, we consider the following stochastic reaction system, which can serve as a model for gene regulation with delayed auto-inhibition:

∅ → M* at rate g(n_P), with M* → M completed after a random delay τ,
M → M + P at rate β_P,   M → ∅ at rate γ_M,   P → ∅ at rate γ_P,

where M* denotes mRNA whose transcription has been initiated but not yet completed. Here M represents a messenger RNA (mRNA) molecule and P is the corresponding translated protein. The generation of mRNA is initiated as a Poissonian process of rate g(n_P), depending on the instantaneous number of proteins n_P present at generation time, but it is completed only after a delay time τ, drawn from a given probability distribution. Translation of mRNA molecules into proteins and spontaneous degradation of mRNA and proteins are modeled by Poissonian processes with rates β_P, γ_M, and γ_P, respectively. A version of this system was analyzed, for example, in [16,18,[22][23][24][25][26]. In particular, reference [22] considers uniformly distributed delays in the range (10.7, 26.7), β_P = 1, γ_M = γ_P = 0.03, and a Hill function g(n_P) = N α_M /(1 + (n_P/(N P_0))^h), with α_M = 1, P_0 = 10, h = 4.1, and variable size N, to reflect that the presence of protein molecules has a negative feedback on mRNA generation.
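For concreteness, the negative-feedback transcription rate used in [22] can be sketched directly from the parameters quoted above (the function name is ours):

```python
def hill_repression(n_P, N, alpha_M=1.0, P0=10.0, h=4.1):
    """Transcription initiation rate g(n_P) = N*alpha_M / (1 + (n_P/(N*P0))**h):
    the more protein present, the lower the mRNA generation rate
    (negative feedback)."""
    return N * alpha_M / (1.0 + (n_P / (N * P0)) ** h)
```

The rate decreases monotonically with n_P and reaches half its maximum value N α_M at n_P = N P_0.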
As mentioned before, the nMGA is particularly suitable when time delays are coupled to the macroscopic evolution of the system. This is the case, for instance, if we consider a slightly modified gene regulation model where the time required for the transcription of mRNA is affected by the total amount of mRNA being transcribed, n_{M*}, or already present at the time, n_M, a plausible biological assumption based on the fact that resources needed for mRNA transcription, like nucleotides or ATP, are finite.
More specifically, we consider a system-coupled delay model that differs from the previous one only in the distribution of the time delays τ. Instead of the uniform distribution, we choose a Weibull distribution with scale parameter μ⁻¹ = μ_0⁻¹ [1 + v(n_{M*} + n_M)/N], with μ_0 = 0.125 and v = 0.5. This means that the instantaneous rate of an ongoing process is modified every time the system changes, which implies that the time a reaction takes to complete (the delay) is not defined when the reaction starts. We use this non-Markovian version of the model to compare the performance of the nMGA with that of the modified Anderson's algorithm in Appendix C. As shown in Fig. 4 (top), the computational time required by both algorithms scales as N². However, the nMGA is a factor of ≈ 3.7 faster for α = 0.5 and ≈ 9.5 faster for α = 5. We note, furthermore, that the adapted Anderson's algorithm works relatively well in this case because the cumulative distribution of a Weibull distribution can be expressed in terms of elementary functions. When this is not possible, Anderson's algorithm becomes much slower, while the efficiency of the nMGA remains the same. In Fig. 4 (bottom), we also show the autocorrelation function of the temporal evolution of the number of proteins in the stationary state. As α increases from 0.5 to 100, the distribution of time delays becomes markedly more peaked around the mean and, as a consequence, the sequence of oscillations, with decaying amplitude and wave cycle set by the average delay, becomes more pronounced in the autocorrelation function. Interestingly, in the opposite direction, as the distribution of time delays drifts away from the exponential and becomes more bursty, these oscillations damp out and eventually disappear.
To finish this section, let us comment on a recent and innovative approach proposed in [22], which replaces the master equation by an effective non-linear integro-differential Langevin equation amenable to numerical treatment. The running time of the numerical integration is independent of the system size so, in principle, this method should be preferred over other stochastic methods when the system size is very large (also in this limit the error of the method tends to zero). This will not be the case, however, if one is interested in the highest frequencies of the system, which scale linearly with the system size. Probing such high frequency domain would make it necessary to increase the resolution of the discretization of the integro-differential equation accordingly, with the corresponding increase in computing time. In addition, this approach was developed to study well-mixed systems and, thus, it cannot be applied directly to networked populations.

VI. CONCLUSIONS
Models of dynamical processes in complex systems often assume that characteristic random events occur continuously and independently at constant rates. However, this assumption fails for many real systems, which cannot be correctly described unless memory effects like time delays, aging, or bursty dynamics are accounted for using non-Markovian transitions. We have introduced an exact and general framework able to generate statistically correct realizations for systems of non-Markovian discrete stochastic processes. In the limit of a large number of processes, it reduces to a simple and general simulation algorithm which, quite remarkably, works exactly as the original Gillespie method, with the difference that the instantaneous rates of events depend on the time elapsed since each event last took place. Compared with other methods existing in the literature, our algorithm is not always the fastest. However, it is the most general, as it can be implemented very easily in cases (like delays coupled to the evolution of the system) where other algorithms do not work or need adapted versions.
Beyond the proven validity and efficiency of the algorithm, our results unveil the drastic effects that very subtle differences in the modeling of non-Markovian processes have on the global behavior of complex systems, with important implications for their understanding and prediction. This turns out to be a central question in many different fields, since evidence shows that non-Markovian dynamics are the rule rather than the exception. For instance, in the Internet, information is injected in bursts, and packet arrival times can experience propagation delays due to congestion and other effects. In human dynamics, bursty behavior affects the way information is generated and spread. In gene regulatory networks, intrinsic stochastic fluctuations may lead to the occurrence of oscillations and other phenomena not observed in Markovian or deterministic analogs. Beyond these examples, the potential range of applications is countless. For all of them, the correct modeling of non-Markovian events is crucial, and minimal variations can have drastic effects on their global behavior.