Global Stability Properties of the Climate: Melancholia States, Invariant Measures, and Phase Transitions

For a wide range of values of the incoming solar radiation, the Earth features at least two attracting states, which correspond to competing climates. The warm climate is analogous to the present one; the snowball climate features global glaciation and conditions that can hardly support life forms. Paleoclimatic evidences suggest that in past our planet flipped between these two states. The main physical mechanism responsible for such instability is the ice-albedo feedback. In a previous work, we defined the Melancholia states that sit between the two climates. Such states are embedded in the boundaries between the two basins of attraction and feature extensive glaciation down to relatively low latitudes. Here, we explore the global stability properties of the system by introducing random perturbations as modulations to the intensity of the incoming solar radiation. We observe noise-induced transitions between the competing basins of attractions. In the weak noise limit, large deviation laws define the invariant measure and the statistics of escape times. By empirically constructing the instantons, we show that the Melancholia states are the gateways for the noise-induced transitions. In the region of multistability, in the zero-noise limit, the measure is supported only on one of the competing attractors. For low (high) values of the solar irradiance, the limit measure is the snowball (warm) climate. The changeover between the two regimes corresponds to a first order phase transition in the system. The framework we propose seems of general relevance for the study of complex multistable systems. At this regard, we relate our results to the debate around the prominence of contigency vs. convergence in biological evolution. Finally, we propose a new method for constructing Melancholia states from direct numerical simulations, thus bypassing the need to use the edge-tracking algorithm.


I. INTRODUCTION
zone can be in the regime of bistability: if the planet is in the snowball state, it will have very hard time supporting life 1 . Additionally, astronomical parameters such as the obliquity of the planet [12,13] or the length of the year [14] can have a dramatic effect on the properties of multistability of a planet in the habitable zone, up to the point of erasing it altogether. In particular, one expects that tidally locked planets with an active carbon cycle can never be found in a snowball state [15].
The dynamical landscape of the Earth's climate seems to be indeed richer than what is described above. Further investigations have proposed the possibility of the existence of alternative cold states with respect to the snowball one. Such states are characterised by the existence of a thin strip of ice-free region near the equator. Clearly, this possibility has key implications in terms of habitability and the evolution of life on Earth. Different physical mechanisms have been proposed for explaining the existence of such a state, based either on the role of a dynamical ocean [16] or the specific properties of the albedo of sea ice [17]. In fact, in a previous work [18] we have found that, indeed, a third co-existing stable state, intermediate between the snowball and the warm climate, can be found in a climate model featuring a very simplified representation of the oceanic heat transport, so that one might expect that a rich landscape multistability could be a more intrinsic property of the climate system. Additionally, the physics and the chemistry of the climate system features further complexities when one considers warmer conditions. For sufficiently large values of S * , the warm state becomes unstable and the planet performs a transition to either the so-called moist greenhouse state or the so-called runaway greenhouse state (associated to massive evaporation of the oceans), which define the warm boundary of the habitable zone [19]; see discussions in [20,21].
In what follows, we consider the simpler -yet extremely relevant -scenario where the only relevant co-existing climates are the snowball state and the warm state. As discussed in [9], the physics of the system is especially interesting when the critical transitions are approached. As the solar irradiance S * nears the critical value S * W →SB with the system being in the warm state, the climatic engine becomes more efficient, because larger temperature gradients are realised inside the domain. Such an increased efficiency leads to a stronger atmospheric circulation, which is fuelled by temperature gradients and tends to reduce them by transporting heat from warm to cold regions, acting as a non-trivial diffusion process. Such a non-linear equilibration mechanism acts as a negative feedback and, broadly speaking, is a macroscopic manifestation of the second law of thermodynamics. One of the results of the heat transport performed by the atmospheric circulation is the stabilization of the ice-line. When S * = S * W →SB , the ice-albedo feedbacks becomes as strong as the positive feedbacks of the system, and the system flips to snowball state with the ice-line reaching the equator. Similar mechanisms are in place when the system is in the snowball state and S * nears, instead, S * SB→W . When S * = S * SB→W , the ice begins to melt near the equator, leading to a rapid poleward retreat of the ice line.
One can rephrase the previous description by saying that at the critical transitions the climate system is not anymore able to dampen fluctuations due to (infinitesimal) external forcings. Using methods borrowed from the transfer operator theory [22], the investigation of the behaviour of the same model used in [9] has indeed shown that when S * nears S * W →SB , the spectral gap -defined as the distance between the subdominant Ruelle-Pollicott resonance [23,24] and the unit circle -of the transfer operator constructed in a suitably defined reduced phase space vanishes. As a result, exponential decay of correlation is lost and the system experiences what is often referred to as critical slowing down [25]. We remark that, as discussed in [26], it is expected that the (near) closure of the spectral gap is associated to a very enhanced sensitivity of the system's statistics to perturbations. Specifically, a vanishing spectral gap leads to having a vanishing radius of expansion for the response theory describing how the measure of the system changes differentiably with respect to small changes in the dynamics of the system [27].
A looming question is what lies in-between the stable co-existing climates within the region of bistability. Here, in simple models, it is often possible to identify unstable solutions sitting in-between the two stable climates. Such unstable solutions are, roughly speaking, ice-covered up to the mid-latitudes, and small perturbations applied to trajectories initialised on the edge states lead with probability one to the system falling eventually into either of the competing attractors [3,28].
When studying more comprehensive climate models featuring chaotic dynamics, things are, as described in the next section, considerably more complex from a mathematical point of view, and the individuation of the unstable solutions is much harder. In a previous investigation [18], by taking advantage of the edge tracking algorithm presented in [29,30] and there used to study the co-existence of a laminar attractor and long-lived turbulence in fluid mechanical set-ups, we have been able to find, for each value of S * in the region of bistability, the Melancholia (M) states separating the two co-existing realisable warm and snowball climates.
As we shall see below, the M states are the natural extension of the saddle-type fixed points average properties of the third attractor. The warm-to-snowball tipping point is located at µ ∼ 0.965; the snowball-to-warm tipping point is located at µ ∼ 1.055;. See [18]. found in dynamical systems living in energy landscapes to the case where more general dynamics is considered [31][32][33][34]. The M states are embedded in the boundary between the two basins of attraction. The analysis was performed using an intermediate complexity climate model with O(10 4 ) degrees of freedom. Figures 1a)-b) summarize the main properties of the system, by showing the long term averages of the globally averaged surface temperature T S and of the temperature difference between low and high latitudes ∆T S as a function of the relative solar irradiance µ = S * /S * 0 , where S * 0 is the present day value. By focusing on the M states we have been able to show the existence of much richer than previously thought dynamical landscape.
The properties of the M states are non-trivial. Up to µ ∼ 1.01, the M state is chaotic and is characterised by longitudinal symmetry in its statistical properties, just as the boundary conditions of the system, are, indeed, longitudinally symmetric. The chaotic dynamics manifests itself as weather variability in form not too dissimilar from the usual one observed in stable climates.
Nonetheless, on long time scales, orbits initialised near the M states drift to either the warm or the cold snowball state, as a result of the positive ice-albedo feedback. For µ ∼ 1.01, we observed that the symmetric M state becomes transient, evolving very slowly (on a time scale scale much longer than the other ones typical of the system) into a symmetry-broken state, where very cold and very warm conditions co-exist, separated by two regions of very strong longitudinal temperature gradient. The two regions feature rather different dynamical behaviour and the boundary between them rotates very slowly in time. The nontrivial bifurcation associated to such a symmetry break leads to a dynamical regimes that resembles chimera states in extensive systems [35,36]. The third climate state mentioned before exists in a small parametric window near µ ∼ 1.045 [18].
We remark that the dynamical systems viewpoint clarifies that the critical transitions for the warm (snowball) state occurring when S * approaches the critical value S * W →SB (S * SB→W ) are associated to the collision between the edge state and one of the warm (snowball) climates, according to the dynamical scenario of boundary crisis [33]. The system's reduced ability to dampen fluctuations near the tipping points and the associated shrinking of the spectral gap described above can be seen, dynamically, as the result of the fact that the attractor attracts less effectively orbits in its immediate neighbourhood because of the presence of a nearby edge state state. See a discussion of this specific point in [25].

A. This Paper: Goals and Main Results
Studying multistable systems in general, and the climate system in particular, using deterministic autonomous dynamical systems faces two important limitations, both resulting from the fact that the phase space is partitioned into disjoint invariant sets -the various basins of attractions and their respective boundaries. First, it is not possible to account for transitions between the coexisting basins of attraction. Instead, transitions between distinct regimes of motion are observed in many systems of interest. Additionally, one cannot establish an ergodic, physically relevant invariant measure, as the co-existing attractors are disjoint and one cannot assign a weight to each of them. Hence, one cannot answer the question of how likely it is for the system of interest to be found at a given time in a specific regimes of motion. Instead, all the statistical properties of an orbit are conditional on which invariant set its initial conditions belong to. Indeed, building upon the preliminary results reported in a short communication [37] and giving them much broader scope, we want to address these relevant aspects that cannot be treated taking the viewpoint of autonomous deterministic dynamics.
For the benefit of the reader, we report below the main goals of our investigation and the main findings presented in this paper.
We will introduce a stochastic forcing forcing to the model studied in [18] as a Gaussian perturbation with variance σ 2 modulating the intensity of the incoming solar radiation. This impacts the radiative budget of the planet in a very nontrivial way and, thanks to the ice-albedo feedback, acts as a multiplicative noise. Additionally, we conjecture, supported by physical arguments, that the combination noise and deterministic dynamics lead to a hypoelliptic diffusion process, i.e. the noise propagates to all degrees of freedom of the system [38]. The presence of a random forcing allows the system to perform transitions between the neighbourhoods of the deterministic attractors, crossing the basin boundaries that, in the unperturbed case, are impenetrable; see some classical results in [39][40][41]. The consideration of a random forcing will allow us to construct the ergodic invariant measure of the system by performing long integrations, and investigate the properties of noise-induced transitions.
Following [34,[42][43][44], we will propose that, in the weak noise limit, the invariant measure can be written as a large deviation law with the following notable properties. The relevant exponent is proportional to minus a pseudo-potential function divided by σ 2 /2. In the region of bistability, the pseudo-potential has local extrema associated to the attractors (local minima) and the M states (lo-cal maxima) of the deterministic system. In the region where only one stable state is realised in the deterministic case, the pseudo-potential has just one local (and global) minimum, corresponding to the unique attractor.
As a result, in the case of multistability, the logarithm of the average permanence time in a basin of attraction is approximately given, apart from an additive constant, by the product of 2/σ 2 times the difference between the value of the pseudo-potential at the M state through which the transitions takes place and its value at the corresponding attractor. We also show that the stochastically averaged exit trajectories connect the attractor with the M state, which is indeed the most likely exit point from the deterministic basin of attraction. In the weak noise limit, such paths correspond to the instantons. These can be defined as minimizers of a the Freidlin-Wentzell action [40,41,45,46]. In our simulations, we show that such an identification becomes more accurate as the intensity of the noise is decreased. Exploiting this property, we also propose a new general method for constructing the M states via direct numerical integration of stochastic differential equations, thus bypassing the need to resort to the edge tracking algorithm mentioned above.
Evaluating the average permanence times in some selected experiments allows us to reconstruct the properties of the described pseudo-potential in a very high-dimensional system, and to predict how the populations of the regions nearby the two deterministic attractors change as a function of the intensity of the noise. In particular, we discover that, since generally the local minima of the pseudo-potential corresponding to the two attractors have different values, in the zero-noise limit only one attractor -the one corresponding to the global minimum of the pseudo-potential -is populated. Nonetheless, an individual trajectory may in fact persist near the competing metastable state for a very long time, as the permanence time in the corresponding basin of attraction also diverges (but at a slower rate than for the asymptotic state).
As indicated by intuition, for low values of S * , the noise selects as limit measure the snowball state, while for high values of S * , the limit measure is the warm attractor. The changeover takes place for a critical value of S * = µ crit S * 0 , where S * 0 is the present solar irradiance, and µ crit ≈ 1.005. For this value of S * , the equivalent of a first order phase transition for equilibrium systems takes place in the system. The order parameter that more obviously captures the transition is the globally averaged surface temperature.
As discussed later in the conclusions, the specific findings reported here suggest general ideas about how to study multistable complex systems and how to define their global stability proper-ties. Additionally, we show how the framework presented in the paper is potentially suitable for clarifying the role played in complex historical processes by contingency, as discussed by Gould [47] in the context of evolutionary biology.
The paper is structured as follows. In Section II we summarise the main mathematical concepts and ideas we have used to frame our investigation, to perform the data analysis, and to interpret our results. In Section III we report the modelling suite we have used, describe the data we have produced, and give information on how they have been processed. In Section IV we present and discuss our results. In Section V we summarize and interpret our findings, describe the limitations of our work, and propose ideas for future investigations. In Appendix A we propose and test numerically a new method for constructing the M states by looking at the intersections of instantons constructed by taking into account three different noise laws superimposed to the same deterministic dynamics.

A. Geometry of the Phase Space: Attractors and Edge States
The investigation of systems possessing multiple steady states is an extremely active area of interdisciplinary research, encompassing mathematics, natural sciences, and social sciences. The recent review by Feudel et al. [48], introducing a special issue on the topic, gives a rather complete overview of the state-of-the-art of the ongoing activities on this topic and provides several interesting examples. In the context of Earth sciences, it is more common to refer to critical transitions in multistable systems using the expression tipping points, which has received a great popularity following a paper by Lenton et al. [49].
A possible way to introduce the mathematical background for multistable systems is the following. We consider a smooth autonomous continuous-time deterministic dynamical system defined on a smooth finite-dimensional compact manifold M (in what follows, a subset of R N ). The orbit evolves from an initial condition x 0 ∈ M at time t = 0. We define x(t, x 0 )=S t (x 0 ) as the orbit at a generic time t, where S t indicates the evolution operator. The corresponding set of ordinary differential equations are:ẋ where F (x(t)) = d/dτ S τ (x(t))| τ =0 is a smooth N -dimensional vector field. We define such a dynamical system as multistable if it features more than one asymptotic states. Specifically, we mean that one can define two or more attractors Ω j , j = 1, . . . , J with finite Lebesgue measure basin of attraction. Each of the attractors is an invariant set and is the support of an invariant measure of the system. The asymptotic state where the trajectory falls into is determined by its initial conditions, and the phase space is partitioned between the basins of attraction B l of the various attractors Ω j and the boundaries ∂B l , l = 1, . . . , L of such basins. We assume, for simplicity, that within each basin of attraction an ergodic measure can be defined as the limit of the empirical measure constructed by averaging over infinitely long forward trajectories almost everywhere with respect to Lebesgue in the initial conditions. In other terms, all the invariant measures of the system can be written as a convex sum of its j = 1, . . . , J ergodic components, each supported on the corresponding attractor Ω j , j = 1, . . . , J.
Let us for the moment assume that the dynamics is determined by an energy landscape defined by the smooth potential U (x), so that F (x) = −∇U (x). Therefore, one has thatU (x) = −|∇U (x)| 2 (U being a Lyapunov function), which leads to identifying the attractors with the local minima x j , j = 1, . . . , J of the potential U (x), which are invariant sets [50]. Using the classic geographical metaphor responsible for the use of the word basin, the basin of attraction of the lake x j corresponds to its watershed, or precipitation catchment area, where U (x) is the orography. Additionally, the basin boundaries are the mountain crests, which are manifolds of co-dimension one, while the mountain passes are the saddles Π l , l = 1, . . . , L situated between the attractors contained in the basin boundaries. The saddles are invariant sets and are, globally speaking, repellers, but can be thought of as relative attractors because they attract the orbits whose initial conditions are exactly on the basin boundaries (which have zero Lebesgue measure).
The geometry of the phase space becomes much more interesting when the dynamics is more complex than in the case of an energy landscape [31][32][33]51]. One can be in the situation where the asymptotic dynamics on one or more of the competing attractors can be chaotic (meaning here that at least one Lyapunov exponent is positive and unstable periodic orbits are dense). Additionally, in some systems, chaotic dynamics can be realised on one or more of the saddles embedded in the basin boundaries. Therefore, as opposed to the energy landscape case, saddles can be non-trivial geometrical objects being the support of a possibly nontrivial measure. While typically one is interested in the asymptotic states of a complex multistable system, the properties of the basin boundaries and of the possibly chaotic saddles is extremely relevant because it is key for understanding the global stability properties of the system and for interpreting its critical transitions. A very relevant example can be found in the well-known case of boundary crisis, where the critical transition takes place when, for a given value of a parameter, one attractor and one saddle collide, thus leading to the disappearance of one asymptotic state [33].
A fascinating aspect of multistable systems is the following. If the first Lyapunov exponent of a chaotic saddle is larger than the inverse of the life time of the saddle state itself (which measures the rate at which orbits initialised near the saddle state are repelled towards the two nearby asymptotic states), then the basin boundary separating the basin of attraction of the two asymptotic sets has co-dimension strictly smaller than one. In two-dimensional maps, it has been proved that in this case the basin boundary is not a manifold, but rather a rough fractal [31,34,51]. In [18], despite high-dimensionality of the system, one could detect that the co-dimension of the basin boundary is strictly smaller than one. In fact, such co-dimension was found to be very close to zero, as a result of time scale difference between the most relevant instabilities: the climatic instability due to the ice-albedo feedback acting near the M state is much slower than the fast weather-like instability associated to baroclinic processes occurring on the M state. This implies that near the basin boundary there is basically no predictability of the second kind in the sense of Lorenz [52].
The basin boundary is, de facto, a grey zone, and we can hardly assess where orbits initialised near the boundary will asymptotically end up.

B. Impact of Stochastic Perturbations: Invariant Measure and Noise-induced Transitions
The goal of our investigation is to analyse the impact of imposing a random forcing on the deterministic dynamics discussed above. Specifically, for a given choice of noise law and strength, we want to construct the invariant measure of the system, which allows us to understand whether, for a given value of the solar irradiance, the warm climate or the snowball climate is more likely to be realised. We also want to be able to study the statistics of the noise-induced transitions, and to understand the role played by the edge states in channeling the transitions. Processes that can be described as a noise-induced escape from an attractor have long been studied in the natural sciences; see [39][40][41].
Let's consider the following stochastic differential equation in the Itô form: is a vector flow -drift term -with multiple steady states as in Eq. 1, dW is an M -dimensional Brownian motion, and the matrix s(x) ∈ R N ×M -volatility -is such that s(x) T s(x) is the covariance matrix of the noise. Finally, σ ≥ 0 is a bookkeeping parameter controlling the intensity of the noise. In what follows, we draw from the classical Freidlin-Wentzell [42] theory and from the more recent extensions able to address the case of chaotic dynamics in the deterministic case of σ = 0 [34,43,44].
We first need to assume that we are considering a hypoelliptic diffusion process: while the covariance matrix of noise can be singular, a smooth invariant density with respect to Lebesgue is realised because the noise is propagated to all the coordinates through the drift term. Technically, this is realised if the drift term F and the columns of the matrix s satisfy the so-called Hörmander condition, i.e. the Lie algebra generated by them has dimension N everywhere; see for instance [38] 2 . We then assume that either the noise law is additive (s(x) is a constant) or multiplicative in a very mild sense, meaning (see also [53,54]) that s(x) undergoes just a weak modulation in the phase space, so that one observes no new stable states that have no deterministic counterpart as a result of adding noise [55]. Then, in the weak noise limit given by σ → 0, the invariant measure W σ (x) of the system can be written as a large deviation law: where Z(x) is a prefactor and Φ(x) is the pseudo-potential.
For a given deterministic vector flow F (x(t)), different choices of the noise law will lead to different invariant densities: in Eq. 3, both the prefactor Z(x), and more decisively, the pseudo- and the deterministic vector field can be written as F (x) = −∇U (x).
One can establish general properties for the pseudo-potential Φ(x) that apply to all choices of noise laws, and, once a noise law is chosen, derive how the properties of the system change as a function of the parameter σ. In general, one has that, regardless of the noise law, Φ(x) has local minima supported on the attractors Ω j , j = 1, . . . , J of the deterministic dynamics. Additionally, Φ(x) has a constant value on the support of each M state and of each attractor, respectively. We remark that such sets can be strange and, as discussed before, support chaotic dynamics [43,44].
Unless degeneracies are present, the probability that a noisy trajectory with initial condition in the basin of attraction of Ω j does not escape it over a time span of length p decays exponentially: where we have included the possibility of a pre-factor. The expression for the expected escape where is the potential barrier height, constructed as proportional to the difference between the lowest among the values of the pseudo-potential evaluated at the M states surrounding Ω j (let's refer to this M state as Π l ) and the value of the pseudo-potential on the attractor Ω j ) [34]. Note that Eq. 5 is a generalisation of the classic Kramers' formula [56].
In the weak-noise limit, the transition paths are exponentially more likely to follow specific paths in the phase space. Such paths are called instantons and can be constructed as minimizers of the related Freidlin-Wentzell action [40,41,45,46]. In general, different choices of the noise law will lead to different instantons, as the specific form of s(x) enters in the definition of the action.
Nonetheless, under the hypotheses stated above on the noise laws, instantons corresponding to different choices of s(x) share some essential qualitative properties: they connect a point on the attractor Ω j to a point in the neighbouring M state Π l where the pseudo-potential has the lowest value. We remark that if the attractors and the M states are more complex sets than isolated points, the instantons connecting them are not unique, because in principle any point in the attractor can be linked by an instanton to any point in the M state), as a result of the fact that the pseudo-potential is constant on Ω j and Π l . We conclude that the M states are the gateways for the noise-induced escapes from the deterministic basins of attraction, and generalise the saddle points discussed above in the case of dynamics taking place in an energy landscape (which is the case described by the Kramers' formula [56]).
See Appendix A for a discussion of how to construct the M states of a system by studying the intersection of instantons constructed according to different noise laws acting on the same deterministic dynamics, and considering the weak noise limit. Such a method seems novel and is alternative to the edge tracking algorithm mentioned above.
Let's now take the simpler case of bistable systems, where we have two attractors Ω 1 , Ω 2 , and one edge state Π 1 . We can then write transitions times as follows: where we have that which implies that, in the weak noise limit, both escape times diverge, but the escape time out of the attractor corresponding to the lower value of the pseudo-potential diverges faster. Note that one can expect the proportionality constant in Eq. 8 to be O(1). Taking a maximally coarse-grained view on the problem, where we consider the state as represented by the populations P 1,σ , P 2,σ of the neighbourhood of the two attractors, we can write the following master equation: At steady state, we obtain that We remark that one could obtain the same results regarding the ration of the populations by integrating the invariant measure given in Eq. 3 in the neighborouhood of the attractors and and taking a saddle point approximation. The previous equation implies that in the weak-noise limit only one of the two deterministic attractors will be populated, and specifically the one where the pseudopotential has lower value. We remark that two different noise laws, which define two different pseudo-potentials, will feature different properties for the transition rates between the two basins of attraction and might determine a different selection of the dominating population in the weak noise limit. One can easily extend the master equation defined above to the case where multiple states and multiple paths of transitions are present.

III. NUMERICAL MODELLING
The climate model considered here is constructed by coupling the primitive equations atmospheric model PUMA [57] with the Ghil-Sellers energy balance model [3], the latter describing succinctly the meridional oceanic heat transport. It has been already presented in [18] with the name of PUMA-GS, but we report here again its formulation in order to elucidate the role of stochastic forcing, which was absent in the previous version. The stochastic forcing is added as a fluctuating term modulating the value of the incoming radiation determining the energy input into the system.

A. The Atmospheric Component
The atmospheric component of the PUMA-GS model is provided by PUMA [57], which con- divergence of the velocity field D, the (atmospheric) temperature T a =T a + T a (separated into a time-independent arbitrary reference partT a and anomalies T a ), and the logarithmic pressure (normalized by the surface pressure p s ) σ = ln p/p s , read as follows: where , v being respectively the horizontal and vertical wind velocity components, and Φ is the geopotential height. Equations 12,13, and 15 express the conservation of momentum, Eq. 14 expresses the conservation of energy, and Eq. 16 is the equation of state.
A number of simple parametrizations are adopted in order to improve the realism and the stability of the model. Firstly, the hyperdiffusion operator K∇ 8 is added to the equations of vorticity, divergence and temperature, to represent subgrid-scale eddies. Secondly, large-scale dissipation of vorticity and divergence is facilitated by Rayleigh friction of time scale τ f . Thirdly, the physics of diabatic heating due to radiative heat transport is parametrized by Newtonian cooling: the temperature field is relaxed (with a time scale τ c ) towards a reference or restoration temperature field T R , which can be considered a radiative-convective equilibrium solution. We adopt the following simple expression for the restoration temperature [57]: where (T R ) tp and z tp are the temperature and height of the tropopause, respectively, L (L) is the (average) lapse rate, T S is the globally averaged surface temperature, and z(σ) is determined by an iterative procedure [57]. The above expressions indicate that the restoration temperature profile is anchored to the surface temperature T S . However, as Eq. 18 indicates, T R at any one point on the sphere is determined by not only the local (dynamical) surface temperature, but also the global average T S . We note that T a (σ = 1) is obtained by linear extrapolation, according to T a (σ = 1) ≈ T a (σ = 0.9) + η(T S − T R (σ = 0.9)), 0 < η < 1. The surface temperature T S is taken to be governed by the a 2D version of the GS EBM [3,28].
This model includes a simple yet effective representation of the ice-albedo feedback, and basically defines the slow manifold of the coupled atmosphere-ocean system. The partial differential equa-tion describing the evolution of the ocean surface temperature field T S = T S (t, φ, λ) is written as: where S * 0 is the present solar irradiance (the factor 4 emerges looking at the geometry of the Earth-Sun system (see [58]), µ = S * /S * 0 as introduced in Sect. I, while the heat capacity C(φ) and the geometrical factor I(φ) are explicitly dependent on φ only, thus enforcing zonally-symmetric boundary conditions. The albedo α depends on φ and, critically, on T S , with a rapid transition from strong albedo for low values of T S (α max = 0.6) to weak albedo for T S 260 K (α min = 0.2), which fuels the positive ice-albedo feedback. Additionally, O is the outgoing radiation per unit area, expressed as a monotonically increasing function of T S (this is responsible for the negative Boltzmann feedback, taking into account also the greenhouse effect), D is a diffusion (taking place only along the meridional direction) operator parametrizing the meridional heat transport, and χ describes the heat exchange with the atmosphere. See [18,28] for further details.
Finally, the last term represents the stochastic forcing, which is introduced as a random modulation of incoming solar radiation, so that we have: where σ controls the intensity of the noise, s defines the noise law, and dW is the increment of a one-dimensional Wiener process. Since s depends explicitly on T S via the term α(φ, T S ), we are dealing with a multiplicative noise law. We consider the Itô convention for noise. Adding a Gaussian random variable of variance σ at each time step ∆t (1 hour ) of the model amounts to considering that, on the time scale τ = N × ∆t, the relative fluctuation of the solar irradiance is

C. Remarks
We briefly discuss how reasonable it is to expect that the properties of the system under consideration can be explained using the theoretical framework presented in Section II.
Assuming the validity of the hypoellipticity condition, as done in Sect. II B, amounts to saying, in this specific case, that the stochastic forcing acting on the oceanic surface temperature -see Eqs. 20-21 -propagates to all degrees of freedom of the coupled system thanks to the interplay between stochastic and deterministic components of the dynamics. On physical grounds, such an assumption seems convincing if one considers that the ocean surface temperature drives the restoration profile of the atmospheric temperature -see Eq. 17 -and that fluctuations of such a profile modulate the atmospheric instabilities, whose energy cascades down to the smallest scales resolved by the model; see [59,60] for a detailed treatment of the energetics of the climate system.
A second critical point is that we are considering a multiplicative noise law. Nonetheless, our dispersion matrix, encoded by the factor 1 − α(φ, T S ) -see Eq. 21 -has a smooth dependence on the variable T S . Additionally, the factor 1 − α(φ, T S ) is bounded in the phase space between 0.4 (α = α max = 0.6, ice-cover) and 0.8 (α = α min = 0.2, very warm conditions with the absence of ice cover). Therefore, we are confident that our noise law obeys the mathematical conditions indicated in [34,43,44,53,54] for the results contained in Section II to hold.
We can also provide a more physical interpretation of why the multiplicative noise we are considering should allow for the emergence of the large deviation laws described in Section II.
In the phase space region near the cold attractor, we have that 1 − α(φ, T S ) ∼ 0.4 since the temperature T S is extremely low and the planet is fully glaciated (or almost entirely so), so that α(φ, T S ) is virtually constant, with α(φ, T S ) ∼ α min . Near the warm attractor, the properties of the field α(φ, T S ) are slightly more complex, because part of the planet is glaciated and part of it is ice-free. Our working hypothesis is that, to a first approximation, the stochastic forcing we have introduced can be treated as corresponding to perturbing the system with additive noise of different strengths near the cold and warm attractors, plus a transition region between the two attractors, where the effective intensity of the noise decreases with the globally averaged surface temperature T S and the multiplicative nature of the noise is more evident. The ratio of the variance of the noise in the snowball attractor vs. warm attractor is of the order Loosely speaking, the competing warm and snowball climate states have different statistical mechanical, microscopic -as well as thermodynamical, macroscopic -temperatures.

IV. RESULTS
We first treat in detail three cases inside the region of bistability depicted in Fig. 1 µ in the region of bistability. We remark that the results shown in Figs. 2, 3, and 5a) have already been reported in the short communication [37], but we deem extremely useful to present them here as well because they are now part of a much more complex, coherent, and detailed narration.

A. Escapes from Basins of Attraction and Instantons
In the case of µ = 0.98, we first perform a set of simulations with noise of different intensity ranging from σ τ = 0.5% to σ τ = 1.4%, with τ = 100 years (y). to define the difference between the value of the pseudo-potential Φ at the M state and at the warm attractor as the slope of the straight line. For reference, we have that for σ 100y = 0.5% the average escape time is about 5.2 × 10 3 y. We can predict that the escape rate increases to about 1.5 × 10 7 y when σ 100y ∼ 0.3%. We have that the slope of the straight line in Fig. 2 gives ). Therefore, we show that it is indeed possible to estimate quantitatively the properties of the pseudo-potential also in a very high-dimensional dynamical system like the one considered here. The operation can be repeated for all the other values of µ in the range of multistability and for the processes of escape from the snowball attractor, but we do not pursue here a systematic study of this.
We then wish to look at the paths corresponding to the transitions. Following the discussion in [18,28], we choose to consider the reduced phase space spanned by the globally averaged surface ocean temperature T S and by the meridional temperature difference ∆T S , defined as the difference between the spatially averaged ocean temperature field between the Equator and 30 occurring in the climate model: • The average surface temperature T S is directly associated to the positive ice-albedo feedback and the negative Bolzmann radiative feedback; • The meridional temperature difference ∆T controls the meridional heat transport performed by the ocean, as a result of the diffusive law we insert into its evolution equation, • The meridional temperature gradient ∆T also controls the meridional heat transport performed by the atmosphere, as a result of the mechanism of baroclinic instability [62]. To a good approximation, the instanton connects the warm attractor to the M state, and follows a path of decreasing density. We do not find evidence of different paths for the trajectories leading to an escape and the relaxation trajectories, which is, instead, a typical signature of non-equilibrium [63]. This can be explained by considering [28], where it is shown that the ocean model evolve to a good approximation in an energy landscape.
See also Appendix A.

B. Construction of the Invariant Measure
The information contained in Fig. 3 is limited because we are studying only W → SB escape processes, and we do not allow for the establishment of an invariant measure. The problem lies in the fact that the pseudo-potential minimum associated to the SB attractor is much deeper than the one associated to the W attractor, so that the average escape time associated to SB → W transitions is prohibitively long for the range of (rather weak) noise intensities used for constructing Fig. 3. In fact, in order to be able to construct the invariant measure of the system, we need to observe a sufficient number of W → SB and SB → W transitions, in order to be sure that we have collected a satisfactory statistics; see also the master equations for the populations given in Eqs. 9-10. We next increase the noise intensity by setting σ 100y = 1.8%, so that, in an integration lasting about 1.0 × 10 4 y, we observe 10 SB → W and W → SB transitions, The number of transitions is low because it is extremely hard to escape from the SB state.
Our results are shown in Fig. 4. We portray the logarithm of the projection of the invariant measure on the (∆T S , T S ) plane; we refer to this also as the two-dimensional probability distribution function (pdf). We have that the peaks of the pdf are in good agreement with the position of the attractors is relatively small. We will explore this matter in Subsect. IV D, where we will try to deduce where the pseudo-potential reaches its absolute minimum for each value of µ in the bistable region.
It is worth looking more in detail at how the estimate of the instantons is impacted by the intensity of the noise used in the simulation. As instantons are defined in the weak-noise limit, we would expect that one achieves higher precision when weaker noise is used. This is confirmed by the results shown in Fig. 6: we have that the estimates of the instantons obtained using lower noise intensity come closer to the attractors and to the M state. Nonetheless, also the instantons obtained for very strong noise are still relatively accurate.

C. Instantons and Transitions across the Symmetry-broken Melancholia State
We next examine the noise-induced transitions for µ = 1.02. This case is quite interesting because, as discussed in [18] and reported in Fig. 1 Alternatively, this feature might be a subtle signature of dynamical, noise-induced stabilization [55]; this would provide a slight contradiction of our working hypothesis discussed after Eq. 3.
D. Selection of the Limit Measure in the Weak-noise Limit and First-order Phase Transition Equation 11 indicates that, in the weak-noise limit, all of the measure will be concentrated on the attractor featuring the lowest value for the pseudo-potential Φ. Since for µ < S * W →SB /S * 0 only the SB state is realised, physical intuition suggests that for low values of µ within the range of bistability, the snowball attractor should contain all the mass in the limit of weak noise, as one can also anticipate by looking at Fig. 3. Conversely, one expects that for high values of µ within the range of bistability, the W should be dominant in the weak noise limit. It is reasonable (yet far from obvious or rigorous) to expect that there should be a critical value of µ = µ crit separating the two regimes. Following [18], we consider 18 equally spaced values of µ (∆µ = 0.005) within the multistable regime. For each of these values of µ (excluding the case of µ = 1.045, where three stable states are realised) the fraction of the population residing within the basin of attraction of the deterministic warm attractor P W,σ (µ) and its complement, residing in the basin of attraction of the snowball attractor P SB,σ (µ). Related results are shown in Fig. 8. We show how the probability distribution of the variable T S depends on µ for three different noise levels: σ 100y = 1.5% (Panel a), σ 100y = 1.8% (Panel b), and σ 100y = 2.5% (Panel c). In these panels we superimpose the bifurcation diagram reported in Fig. 1a). We observe that as the noise is reduced, for all values of µ ≥ µ crit ≈ 1.005. The absolute minimum of the pseudo-potential Φ is realised in the warm attractor for µ ≥ µ crit and in the snowball attractor for µ ≤ µ crit . The changeover is, curiously, quite close to the reference case µ = 1, for which the weak-noise limit of the measure is given by the SB state.
We add a note on the uncertainty associated to the figures reported in Fig. 8d). We remark that for all values of µ and σ we have used simulations lasting at least 10 4 y. For very low (≤ 0.98) and very large (≥ 1.02) values of µ in the considered range, the simulation length does not allow for observing more than few transitions for the two lowest considered noise level. Therefore, according to the fact that the transitions occur following a Poisson law, one expects in this range an uncertainty on the figures reported in Fig. 8 of the order of the values of the smaller between P SB,σ (µ) and P W,σ (µ). This corresponds, in fact, to a low uncertainty, because most of the mass is concentrated near one of the two deterministic attractors. The uncertainty is quite small for 0.98 ≤ µ ≤ 1.03, because, in all cases, we observe relatively many transitions. The uncertainty in this range can be safely estimated to be below 5%. Summarising, while the values reported in what is shown in Fig. 8, we have that τ SB→W τ σ grows more rapidly than τ W →SBτ σ for µ ≤ µ crit (and viceversa for µ ≥ µ crit ). As a result, in the weak-noise limit an individual trajectory might be trapped for a very long time in the metastable, non-asymptotic state. We remark that we have not performed here a systematic evaluation of the exponential relationship between the escape times and σ, as instead done for µ = 0.98 and shown in Fig. 2, also because, as for the reasons explained before, this would require computational resources that are beyond what has been allocated for this study. We remark that this would allow for evaluating for each value of µ the difference between the value of the pseudo-potential realised at the attractors and at the M state. The algorithm proposed in [61] can be very useful to reduce the computational burden.
We can say that for µ = µ crit our system exhibits a behaviour that is reminiscent of a first-order phase transition for near-equilibrium statistical mechanical systems, like a liquid-gaseous transition. In our case, µ is the control parameter (corresponding to the temperature in the equilibrium case), the pseudo-potential Φ is the equivalent of a thermodynamic potential, the scaling factor for the noise intensity σ is the equivalent of (square root of) the temperature, and the globally averaged surface temperature T S is the natural order parameter, e.g. density. The discontinuous We have studied in detail the noise-induced transitions between the deterministic basins of attraction in the range of multistability, extending the results presented in a short communication [37]. We have shown that by studying how the average escape time depends on the intensity of the noise it is possible to estimate the difference between the value of the pseudo-potential at the M state and at the attractor the trajectories are escaping from. The estimates of the instantons are shown to become more precise as we consider simulations where weaker noise is considered.
The instanton, in a case of special interest where the M state was shown to undergo a symmetrybreak process, selects as optimal point of passage between the snowball and the warm climate the Finally, by studying how the populations of the warm and snowball climate change as a function of the intensity of the noise, and using the large deviation law for the measure predicted by the theory, we find an estimate of a critical value of µ = µ crit ≈ 1.005 such that for µ ≥ µ crit the zero-noise limit of the invariant measure is supported on the warm deterministic attractor, while for µ ≤ µ crit the weak noise limit of the invariant measure is supported on the snowball attractor. The asymptotic state corresponds to the attractor featuring the lowest value of the pseudo-potential.
These results obtained here indicate that, as soon as some form of noise is added into the system, multistability is factually lost in the weak-noise limit, as the noise law is responsible for selecting, for each value of the control parameter (here µ), a specific asymptotic state. Changing the value of the control parameter, one will find one or more abrupt transitions in the statistical properties of the system (here realised at µ crit ), i.e., in other terms discontinuities in the response of the system to changes in the control parameter. What happens in our model at µ = µ crit is mathematically analogous of a first-order phase transition occurring in a near-equilibrium statistical mechanical system. We remark that, since the escape time away from either attractor grows exponentially with the inverse of the parameter controlling the variance of the noise, an individual trajectory might be trapped for very long time in a metastable state.
In collaboration with C. Kilic (Bern) and F. Lunkeit (Hamburg) the authors have started some preliminary simulations where stochastic forcing is added to PLASIM [64], a much more complex climate model than the one used in this study (yet missing some essential ocean dynamical processes). As shown in Fig. 9, the first results we have obtained are encouraging in indicating that the findings of this paper might be relevant for more realistic model configurations. For the future, we aim at obtaining detailed information on the large scale properties of the flow configurations leading to the noise-induced transitions, taking the thermodynamic lens we originally explored in [9]. The overall goal is to advance our knowledge of the global properties of the climate system by linking its statistical mechanical and its thermodynamical descriptions. An important question of specific relevance for paleoclimatic and planetary science studies is to understand whether the third climatic state found in [18] for µ = 1.045 is recovered also for more complex model configurations.
The approach presented here, based upon combining the knowledge of the dynamical landscape of a multistable deterministic dynamical systems with the analysis of the impacts of stochastic perturbatios, seems of more general interested than the specific problem we have studied. We believe that it can be key for addressing the challenge of understanding tipping points in the Earth system [49] as well as providing insights to a large class of multistable systems [48]. We will dedicate future efforts exactly to exploring these research lines, in particularly looking at the Atlantic meridional overturning circulation, an element of the global ocean circulation that is well know to have more than one competing modes of operation. The occurrence of one or of the other state has important implications on the global climate, and rather dramatic ones for the regional climate of the North Atlantic sector; see [65] for a review of this topic. Multistability has been recently reported in the von Karman turbulent flow in [66]: the methods proposed in this paper could elucidate paths and mechanisms underlying the transitions between the asymptotic states.
A. An Interdisciplinary Outlook: Evolutionary Biology We wish to conclude this paper by reaching out to other scientific disciplines and to a more conceptual level. In the influential book discussing the Cambrian fossils from the Burgess shale and their importance for explaining the mechanisms of evolution, Gould [47] presents some ideas on the scientific methodology inherent to historical sciences, and, specifically, to evolutionary biology. He clarifies that historical scientific explanations take the form of a narrative, whereby subsequent phenomena follow in a specific order 4 : the time-ordering and causal links between such phenomena can be discovered and convincingly explained when multiple, independent sources provide indication for the same historical pattern of change. Gould argues that this method is fundamentally different from the classic -which he calls stereotypical -scientific methodá la Galileo whereby one specific experiment can be repeated in idealised conditions, and the final outcomes of the experiment can be predicted using the fundamental laws of nature. The latter is the -indeed too simplified -version he gives of how hard sciences work. An obvious difference between the two methods comes from the fact that certain experiments -like the evolution of the climate and of the biosphere of a specific planet -cannot be repeated. Another difference comes from the fact -Gould argues -that historical processes are dominated by contingency: while we observe a specific path of historical development, many others of such paths are indeed compatible with the laws of nature, and could have been realised had the system been forced in a slightly different way in the past. We are able to understand the mechanisms of historical development, but not to predict accurately the specific path. Gould argues that the evolution in our planet could have led to fundamentally different forms of life, had the contingencies been different in the distant past.
The existence of our own species is the -a priori very unlikely -result of such contingencies. In general, in systems dominated by contingency, if we could run again the movie the outcome would be vastly different. Gould's views on evolution have been criticised by authors, as Conway Morris, proposing that convergence, rather than contingency, is the main mechanism of evolution, so that evolution is seen as a -mostly -deterministic path of change, of which nowadays we see the un-avoidable outcome [68]; see the debate in [69]. Some authors have proposed that both mechanisms are indeed in action, yet dominant at different scales of diversification of the organisms [70].
We argue that Gould's view can be put in the context of the mathematical framework discussed in this paper. Stochastically forced complex systems evolve in a phase space where an individual trajectory corresponds to a historical realisation of the system. The historical realisations, even starting from the same initial condition, can be vastly -even qualitatively -different if the deterministic dynamics supports the existence of multiple attractors, because different realisation can be trapped for very long time in very distant regions of the phase space. As discussed here, stochastic forcing allows for the system to jump across the various basins of attraction, and the mechanism defining the evolution of the trajectory are defined by the differential equations. The predictability of the system, both of the first and second kind in the sense of Lorenz, is finite but non-zero. Finally, we can interpret the M states as the true agents of the contingency discussed by Gould. It is not the presence of a forcing, however strong, that changes radically the future history of the system, but rather the existence of special regions of the phase space -near the M states -where even small perturbations can force two nearby trajectories towards qualitatively different future histories. The scenario proposed by Conway Morris is associated to a scenario where M states are either absent -so that the system is not multistable -or the noise is so weak that the probability of getting close to the an M state is exceedingly low, despite the presence of stochastic perturbations, the system will be (almost) always quite close to a specific deterministic attractor, even if we could run again the movie, i.e., run another simulation.
While contingency and qualitative changes dominate the behaviour of the system near the M states, quantitative properties, stated in probabilistic terms as large deviation laws defining probability measures and escape rate, can still be defined and tested. This is the main positive outcome of the mathematical and physical framework outlined in this paper.

ACKNOWLEDGMENTS
VL and TB acknowledge the support received by the EU Horizon2020 projects Blue-Action As discussed in the paper, the construction of M states for multistable systems is far from being a trivial task, and requires the of the edge tracking algorithm introduced in [29,30] and used also by the authors in [18,28]. We wish to propose here a proof of concept of an alternative procedure for constructing the M state without resorting to such an algorithm, but rather using only direct numerical simulations. The procedure discussed below might be useful in the case the edge tracking algorithm might be hard to implement because of the specifics of the system under investigation, or the presence of similar time scales associated to the instability along and across the basin boundary might make the construction of the M state harder to attain. Alternatively, it can be seen as a way to test the results obtained from the study of the deterministic dynamics.
The idea is to exploit the fact that, as discussed in Section II, under rather general conditions on the noise law, the M state, in the weak noise limit, acts as the gate for noise-induced transitions between the competing attractors. We then propose to proceed as follows. Let's consider the following stochastic differential equations: dx(t) = F (x(t))dt + σs k (x)dW , k = 1, . . . , K. (A1) We consider the possibility of perturbing the deterministic flow with K different noise laws, defined by the K functions s k (x). We assume, for simplicity, that the deterministic system defined byẋ(t) = F (x(t)) is bistable. We choose such K noise laws so that they obey the hypotheses discussed in Sect. II B 5 .
In the weak noise limit defined by σ → 0, the stochastic dynamics of the corresponding SDEs obey Eqs. 3-5. Clearly, for each noise law the prefactor Z k (x) and the pseudo-potential Φ k (x) will be different. Nonetheless, as discussed above, in all cases Φ k (x) has a local minimum (and In order to show that this approach does indeed work, we investigate the properties of K = 3 variants of the Ghil-Sellers diffusive model we studied in [28], differing for the law of the stochastic perturbation impacting the energy balance of the climate system. The Ghil-Sellers diffusive model can be written by removing the atmosphere-ocean coupling term in Eq. 20. In order to conform to Eq. 2 (note that we treat below the numerical discretization of a stochastically perturbed partial different equation), we express the three stochastically perturbed models as follows: where dW is the increment of a one-dimensional Wiener process, and we have: Specifically, we have that s 1 (φ, T S ) = s 1 (φ) and s 2 (φ, T S ) = s 2 (φ) correspond to additive noise laws, which feature different diffusion matrices. Instead, s 3 (φ, T S ) is a multiplicative noise law as a result of the temperature-dependence of the albedo and is closely related to what was studied in the rest of the paper; see Eq. 20. We construct for these three SDEs the invariant measure and, by stochastically averaging, we estimate the instantons connecting the attractors and the M states.
We make sure that these are estimated using very weak noise amplitudes. Results are presented in Fig Considering additional noise laws can be helpful for resolving possible geometrical degeneracies due to the use of a projection. Projecting in dimension higher than 2 could also serve a similar scope.