Preferential attachment in growing spatial networks

We obtain the degree distribution for a class of growing network models on flat and curved spaces. These models evolve by preferential attachment weighted by a function of the distance between nodes. The degree distribution of these models is similar to the one of the fitness model of Bianconi and Barabasi, with a fitness distribution dependent on the metric and the density of nodes. We show that curvature singularities in these spaces can give rise to asymptotic Bose-Einstein condensation, but transient condensation can be observed also in smooth hyperbolic spaces with strong curvature. We provide numerical results for spaces of constant curvature (sphere, flat and hyperbolic space) and we discuss the conditions for the breakdown of this approach and the critical points of the transition to distance-dominated attachment. Finally we discuss the distribution of link lengths.


I. INTRODUCTION
Scale-free networks have attracted a wide interest as models for many systems. Their degree distribution can be explained by the mechanism of preferential attachment in growing networks [1]. Preferential attachment has been proposed as a realistic effective mechanism of network growth for many real-world examples, from Internet and the World Wide Web to Wikipedia and citation networks. However, preferential attachment alone is not enough to explain the structure of scale-free networks, since many of these networks build also on other variables like fitness [2,3] and spatial structure [4][5][6][7][8][9][10][11].
The fitness model of Bianconi and Barabasi [2] is the basic example of a network model where the attachment depends not only on the number of links but also on another variable, namely the quality of the nodes. This model predicts multiple power-law scaling in the degree distribution and a phase with Bose-Einstein condensation on nodes with high fitness [3] and it has been applied to the WWW [12,13], where Google could be an interesting example of an emerging condensate.
Space is another feature that plays an important role in many real-world networks. Examples include the world airport network, which is scale-free [14] and where distances are relevant [15], but physical and abstract spaces appear in many other transportation networks as well as in communication and social networks. Another biological example is given by brain connectivity in the hippocampus, which depends on the distance in physical space and shows a scale-free degree distribution [16] and a short axon length compared to the size of the region.
It is possible to obtain scale-free networks from models based on on hidden variables even without any preferential attachment [17,18]. Interestingly, there are several models with power-law degree distributions induced by the space itself (see [19] for an extensive review). In some models, it is the interplay/competition between spatial and network rules that generates to a scale-free distribution [20][21][22][23][24]. Other models rely only on the spatial structure; as an example, recent models based on static random networks on hidden hyperbolic spaces [25][26][27] have been shown to give rise to scale-free degree distributions and could explain the assortativity and clustering observed for Internet.
An alternative, simpler way to obtain spatial scalefree networks is to embed the nodes in a metric space and modify the preferential attachment rule to includes spatial information. The classic preferential attachment mechanism is based on a connection probability Π j→i that is linear in the degree of the node Π j→i ∝ k i . If each node i is associated to a position x i in a metric space, an immediate modification of preferential attachment is given by a multiplicative weight depending on the distance d ij between nodes, i.e.
The positions x i can be randomly chosen from a distribution ρ(x). If all distance are equal (for example, if the space reduces to a single point) then the spatial structure disappears and the model reduces to the Barabasi-Albert model, otherwise there could be a non-trivial interplay between the spatial and network structure. Growing network models in this class have been proposed several times in the literature and have been studied numerically in Euclidean spaces [4][5][6]28]. These studies are reviewed in [19], section IV.D.2. Furthermore, analytical results have been obtained for some models on the sphere [9][10][11] and in generic spaces under strong approximations [7,8]. However, despite the simplicity of these models, there are no analytical or numerical results for models on generic metric spaces and with generic weights σ(d). Our aim is to provide a general analysis of this class of models.
In this paper we focus on networks on flat and curved spaces, the growth of which follows a preferential attachment rule weighted by a function of the distance between nodes as in equation (1). Our main result is the derivation of a general expression for the asymptotic degree distribution. The form of the degree distribution is essentially equivalent to the fitness model of Bianconi and Barabasi [2], with a fitness distribution that depends on the space and on the connection function σ(d) through a selfconsistency equation. We present some examples where the distribution can be solved explicitly. These examples include homogeneous space and spaces with curvature singularities; in the latter, Bose-Einstein condensation can occur on nodes near a singularity. We also study numerically the distribution for disks in the flat plane, on the sphere and in the hyperbolic plane, showing that in the hyperbolic model there is a long but transient regime of Bose-Einstein condensation. Finally we give a general expression for the distribution of link lengths.

II. DEGREE DISTRIBUTION OF GROWING NETWORKS ON HIDDEN SPACES
We present a general class of models for networks with preferential attachment on a metric space. In these models, each node has a position x randomly chosen from a space M with probability ρ(x). We call x i and k i the position and degree of the ith node. At each time step, a node with m links is added to the network and the m links are randomly attached to nodes in the network with a modified preferential attachment rule.
We are mostly interested in the case where each node is assigned a positions on a Riemannian manifold of dimension D (possibly with boundaries and/or singularities) with finite volume, that is, S = (M, ds 2 ) where M is the manifold and ds 2 is its metric, that is, the squared distance between the points x and x + dx, which is a bilinear form ds 2 = i,j g ij (x)dx i dx j in the infinitesimal displacement dx on the manifold. Specific models in this class were studied in [4-6, 9-11, 28] and a more general model was defined in [7,8]. However the analysis of [8] is valid only for almost-homogeneous manifolds, as discussed in section III A.
In this work we employ the natural requirement that all the information about hidden variables should come from the metric structure of the manifold. More specifically, we require that nodes are assigned random position on the manifold such that the average number of nodes per unit volume is constant. Therefore the resulting density is simply ρ(x) = 1/Vol(M) if expressed in terms of (locally) Euclidean coordinates, or more generally ρ(x) = J −1 (x)/Vol(M) where J (x) is the Jacobian of the change of coordinates to (locally) Euclidean coordinates. However, all the results of this section extend naturally to the case of a generic node density ρ(x) arbitrarily chosen.
Moreover, we require that the preferential attachment rule be modified such that the connection probability from a new node with coordinate y to a node with coordinate x depends on their distance d(x, y), that is, the attachment rule is given by a real positive function σ(d) such that the probability to attach a link to the ith node is where x corresponds to the position of the new node. These models correspond to the heterogenous network models of [7,8], but while these papers focus on the almost-homogeneous case, we allow for all spaces and all levels of heterogeneity. An important quantity related to σ(d) is its typical length scale λ σ ∼ dσ(d) / σ(d) , which depends explicitly on the manifold M (and on its boundary, if it exists). Denoting by R the typical length scale of the manifold, the connection function is "long-range" if λ σ ∼ R and "short-range" if λ σ ≪ R. The networks associated with the two cases can have different properties.
A network model on a hidden space is therefore defined by a manifold M and a connection probability σ(d). We derive the degree distribution through the rate equation approach introduced in [29,30]. The equation for the node density n k (x), which is the average density of nodes with hidden position x and degree k, is plus a term +δ k,m ρ(x) that accounts for the birth of new nodes. We assume a linear scaling with time for the quantity then can rewrite it as where q(x) plays the same role as the (average) fitness of the node [2,3] and is defined as The "fitness" q(x) is determined by solving asymptotically the above equation and substituting in the definition of C(y) and in (6) to obtain a single functional equation for q(x): This consistency equation determines q(x) as a function of the global structure of the manifold S = (M, ds 2 ), the weight function σ(d) and the node density ρ(x). From the point of view of the degree distribution, this class of models is equivalent to the Bianconi-Barabasi fitness model [2,3], but in this case the fitness distribution is dynamically determined by ρ(x) and σ(d) through equation (8). The resulting degree distribution is so the distribution is a sum of power laws similarly to the fitness model, and for a regular distribution of x and q(x) it typically reduces to a power law with logarithmic corrections [2]. Note that if we allow for an arbitrary node density ρ(x), the only relevant information about the manifold is the distance function d(x, y), therefore these results can be applied to any metric space. However from now on we focus on Riemannian manifolds with uniform node density.
We are particularly interested in warped manifolds. A warped manifold is a space M that is locally a product R×H where H is an homogeneous space with metric ds 2 H , and whose metric has the form ds 2 M = dr 2 + f (r)ds 2 H . We are interested in spaces with O(D) symmetry, which means that H is the sphere S D−1 with the natural metric dΩ 2 . In this case, if there is a radius (say, r = 0) such that f (r) → 0 for r → 0, then f (r) ∼ r 2 for r ∼ 0 to avoid singularities in the metric. Flat spaces, hyperbolic spaces and hyperspheres can be seen as particular examples of these spaces with f (r) = r 2 , f (r) = sinh 2 (ζr)/ζ 2 and f (r) = sin 2 (ζr)/ζ 2 respectively. In these spaces the O(D) symmetry implies that q and C depend only on r and that q(r) is determined through the equation where h(r, s) is In the following sections we discuss three analytically solvable cases (homogeneous spaces, balls in flat or hyperbolic spaces with long-range connections, warped throaths) and then the interesting cases of disks on the flat plane, the sphere and the hyperbolic plane, which are the simplest spaces corresponding to constant zero, positive and negative curvature respectively.

A. Homogeneous spaces
We consider first the cases where M is an homogeneous space. This means that there is a symmetry group on M that leaves the metric (and the measure) invariant and moreover that the orbit of each point in M under the action of the group is M itself. The first consequence is that the volumes d D xρ(x) and the connection probability σ(d(x, y)) are invariant under the action of the symmetry group, as are all quantities build from integrals of functions of the position over the whole manifold M. Then, if the assumption (4) is correct and if all the integrals involved in the definition of C(x) and q(x) are convergent, neither C(x) nor q(x) depend on the position x, and therefore integrating both sides of the equation (8) by d D xρ(x) we obtain q = 1/2. This means that the degree distribution of the model on homogeneous spaces remains essentially the same as the one of the Barabasi-Albert model For homogeneous spaces, this result does not depend at all on the form of σ(d), as long as its integral over the volume is convergent.
The result (12) was obtained numerically by Barthelemy [6] for the case of the flat disk in D = 2 with σ(d) = e −d/rc and r c not too small. In the same paper it was shown that for small r c the transient regime dominates the dynamics for long times, inducing a cutoff k max ∼ n 0.1 where n is the number of nodes inside a volume of order r 2 c . In this section we have shown that the result (12) is actually general for homogeneous spaces.
Recently, Jordan [11] proved a slightly more general result: the degree distribution (12) is the same for all spaces where the volume of a ball depends only on its radius and not on its position, provided that σ(d) has a positive minimum and that the integrals of σ(d) and σ(d) 2 over the whole manifold are finite. This result applies to all homogeneous spaces, confirming the above argument. Note that Jordan's result can be directly obtained from equation (8) by substituting q(x) = 1/2 and interpreting all integrals as Lebesgue integrals, with the only requirement M d D x σ(d(x, y)) < ∞.
In practice, equation (12) is also true for open balls in homogeneous spaces if λ σ ≪ R, where R is the radius of the ball. Consider as an example the hyperbolic disk: if the typical radius λ σ of the function σ(d) is much smaller than the radius of the disk, then the disk appears to each node as the (infinite homogeneous) hyperbolic plane. Deviations from q(r) = 1/2 decay exponentially with the distance from the boundary of the ball. More generally, a rough approximation for almost-homogeneous spaces is to consider C independent on the position on the manifold and therefore obtain the approximate equation [8] q(x) = 1 2

B. Balls with long-range connections
We consider balls in flat spaces of high dimension D ≫ 1. In this limit, most of the nodes lie within a distance R/D from the boundary of the ball. This means that for D → ∞ almost all the nodes lie approximately on the boundary, that is, on the (D − 1)-dimensional sphere S D−1 , which is an homogeneous space with the induced metric. Moreover, if the connection probability grows with the distance, most of the nodes will connect to nodes on the boundary. Therefore, these nodes have q = 1/2 and it is possible to calculate q(r) for the few nodes in the inside from equation (10) by considering an effective distribution ρ(r) ∝ δ(r − R) of nodes. If we choose for example σ(d) = d 2 the result is Similar (but more complicated) results can be obtained in the same limit D ≫ 1 with other functions σ(d) or with balls in other spaces (hyperspheres, hyperbolic spaces).

C. Warped throats, condensation and the fitness model
A natural question about these models is: what happens when equation (8) does not admit solutions? The absence of solutions is a sign of the breakdown of the mean-field approximation in equations (3),(4). This breakdown could appear as condensation of the links on a single node as in the fitness model of Bianconi and Barabasi [3]. In order to show that this possibility is realized at least in some cases, we naturally embed the condensate phase of the fitness model in a network on a warped geometry.
The basic mechanism for condensation can be understood by considering the throat-like geometry illustrated in figure 1. Most of the nodes would lie on the upper border of the throat, while only a few nodes would lie near the tip. If the connection function σ(d) increases strongly with the distance, many nodes would connect to the node that is closest to the tip, because it would have maximum distance from the border and almost no competitors. In turn, the degree of this node would grow fast and attract links because of preferential attachment. Then, links would condensate on this node.
For a concrete realization of the mapping between our model and the fitness model, we consider networks on warped throats. These geometries are well-known in the context of string theory and particularly of gauge/gravity correspondence [31]. They can be realized as warped spaces R + × S D−1 with a metric ds 2 = dr 2 + f (r)dΩ 2 and a function f (r) that remains very small for a large interval of r, as illustrated in Figure 1. In particular, we consider r ∈ [0, 1], D ≫ 1 and we choose f (r) = εr k and σ(d) = d. Then ρ(r) = r k(D−1)/2 (kD − k + 2)/2. In the limit ε ≪ 1, kD ≫ 1 almost all the nodes lie at r ≃ 1. Moreover, the size of the transverse spherical sections is given by √ ε, which is practically negligible, therefore the connection probability can be approximated as σ(d 1,2 ) = d 1,2 ≃ |r 1 − r 2 |. From the point of view of the degree distribution, it can be further approximated as σ(d 1,2 ) ≃ 1 − r 1 since r 2 would almost surely lie around 1. If we define η = 1 − r, we see that σ(d 1,2 ) ≃ η 1 and therefore the model corresponds precisely to the Bianconi-Barabasi model with fitness distribution ρ(η) ∝ (1 − η) k(D−1)/2 , that is, in the deep condensate phase [3]. Then the mean-field approximation breaks down and there is a condensation of links (that is, a finite fraction of the links) on a node near the tip of the throat.
The correspondence between the two models captures well the competition between nodes with higher "fitness", that is, the nodes between r = 0 and r ′ ≪ 1. This means that the condensate is not a transient effect but a long-term feature of this geometry. This is related to the presence of a metric singularity on the tip of the throat: in fact a non-singular tip would look locally like R D , therefore it would contain a finite density of nodes increasing linearly in time and competing asymptotically with the condensate. The competition of nodes with the same fitness would make the degree of the most connected node grow sublinearly (q < 1) and then the condensate would disappear.
In the above model with k = 2 it is possible to see explicitly the condensation phase transition by varying the parameter ε . This interpolation is possible because in this case ε has also an easy geometric interpretation: in fact the geometry reduces to a D-dimensional hypercone with a conical singularity at the tip r = 0 with cone angle α and the parameter ε is a function ε = α 2 /4π 2 of this angle. For ε = 1 we have α = 2π, therefore the space is a flat ball without singularities and the model reduces to the one of the previous section III B with a degree distribution p(k) ∼ k −3 . On the other extreme, for ε → 0 the cone is extremely shallow (α → 0) and we recover the model above, which is in the deep condensate phase. From geometrical reasoning, the critical point for this phase transition in the limit D → ∞ lies around ε c ≃ 16 arcsin 2 (1/4)/π 2 ∼ 0.1.
It is not clear if there can be condensation without a singularity in the metric. For smooth manifolds, the manifold looks locally like the hyperspace R D in the inside and has an uniform density of nodes everywhere, therefore the argument above forbids the appearance of a condensate at any position except the boundary. A similar argument should apply to points on a smooth boundary. For this reason, we conjecture that the equation (8) for q(x) always admit a solution for a smooth manifold of finite diameter with smooth boundaries and a smooth function σ(d). The argument can be easily extended to the case of a generic (non-uniform) density function ρ(x), as long as ρ(x) is stricty positive and smooth in all points of the manifold.
Note that condensation with σ(d) an increasing (decreasing) function of the distance d occurs typically in manifolds with singularities of positive (negative) curvature. An example of the latter is the above hypercone model with ε ≫ 1 (a strongly hyperbolic space, the opposite limit with respect to a warped throat) and a decreasing function like σ(d) = e −d .

IV. NUMERICAL RESULTS
Since it is difficult to solve analytically the equation (10), we simulated the above model on some spaces and solved numerically the equations for q(x) to compare it with the result of the simulations. Simulations were performed for two-dimensional warped manifolds with constant curvature: the flat disk (curvature ζ = 0), the elliptic disk (disk on the sphere, ζ = +1) and the hyperbolic disk (disk in the hyperbolic plane, ζ = −1). These choices of ζ are completely general since a rescaling of the curvature is equivalent to a rescaling of the radius r max of the disk and of the typical length λ σ of the connection probability.
We built the initial network with n 0 = 2m nodes with m links each, randomly connected. To every node, both initial and new ones, were assigned random coordinates in the manifold with a uniform distribution with respect to the volume of the manifold. The rest of the network was built as in the model presented above. Data reported are for simulation with total number on nodes N = 2 · 10 5 and are averaged over 50 simulations with differents starting seeds.
Numerical solution of (10) is based on the numerical solution of the system of equations ∂k(r, t) ∂t which is equivalent to a numerical integration of the network using macronodes. These equations are integrated up to a time t when the solution satisfied the conditioñ with a given precision. If the above condition is met, the resultq(r) is a numerical solution of (10). For the simulations, q(r) can be estimated in two ways: where . . . I denotes the average over nodes in the set I. The difference betweenq(r) andq 2 (r) (orq 1 (r, t) and q 2 (r, t)) suggests how far the network is from equilibrium.
Comparison between predicted and simulated q(r) for flat disk of radius 1 and some choices of σ(d).

A. The flat and elliptic disk
The flat space is an homogeneous space, so the flat disk looks almost homogeneous for a short-range connection function σ(d) = e −16d/R and q(x) ≃ 0.5 except for the region near the border, therefore the degree distribution of the network is similar to the one generated by the Barabasi-Albert (BA) model. Functions like e −4d/R and especially θ(R−d) have higher λ σ , therefore produce networks with a q(x) and consequently a p(k) that deviate from the BA model, as it can be seen in Figure 2. There is a very good agreement between the numerical solution of equation (10) and q(r) extracted from the simulations, as shown in Figure 3.
Simulation results for the disk on the sphere are similar to the flat case, but the deviations from the BA case are slightly less pronounced. We compare the distributions of q(r) and p(k) for networks with σ(d) = θ(r max − d) on disks in flat space, sphere and hyperbolic plane in Figure  4.
The case of networks in one-and two-dimensional flat space was actually studied in a series of works [4,5,28,32] with the function σ(d) = d α for negative values of α. This case is interesting because as emphasized in [4,5], the degree distribution resembles a power-law for α above a critical value α c , while below this value it develops a stretched exponential tail. This transition to exponential behaviour is confirmed by our simulations for the flat disk in D = 1 and 2, as shown in Figure 5. The critical point appears to be α c = −1 for D = 1 and α c = −2 for D = 2, different from the values found in [4] for the twodimensional torus but in agreement with later results in [33] and the theoretical suggestions in [11].
Our approach offers a simple explanation for this behaviour. For α ≤ −D, the connection function σ(d) = d α has a divergent integral on a manifold with finite volume in dimension D. The rate equation approach leading to equation (8) breaks down if the integrals involving the connection function σ(d) diverge. In particular, if the divergence is localized at d = 0, new nodes connect mostly to the nearest nodes and the distribution develops a exponential tail. The distribution resembles a stretched exponential, as shown numerically for D = 2 in [33] and also analytically in a similar model where each new node i connects to the existing node with maximum k j d α ij [34]. Therefore the predicted critical point is α c = −D, in agreement with simulations in Figure 5. As an interesting observation, note that the behaviour of p(k) in Figure

B. The hyperbolic disk
For disks in the hyperbolic plane of curvature -1, the results depend strongly on the radius of the disk. The deviation from the homogeneous case (BA distribution) is more pronounced as λ σ and r max (i.e., curvature) increase. For a disk of radius 1.5, the simulated q(r) and p(k) show a pattern slightly more pronounced than the flat case, while results for a disk of radius 4 are shown in Figure 6.
The hyperbolic space is quite interesting because it resembles the scenario opposite to the warped throath discussed above, given the exponential expansion of the volumes as r increase, therefore it is a good place to observe condensation in models with decreasing σ(d). Actually, the absence of singularies forbids the existence of an asymptotic condensate, but transient condensates with q ≈ 1 could last for exponentially long times due to the exponentially small number of competing nodes near the center of the disk. This effect can be clearly seen in Figure 6 with σ(d) = θ(r max − d). The fact that q(r) goes above 1 (that is, superlinear scaling) depends only on the long out-of-equilibrium transient regime, as it can be seen by comparing all estimators of q(r) in Figure 7. For large times, q(0) is slightly below 1.

V. DISTRIBUTION OF LINK LENGTHS
The asymptotic distribution of link lengths follows the equation which depends on the fitness q(x) defined by equation (8). For (almost) homogeneous spaces, the above equation reduces to the simple distribution where ρ d (l) is the density of nodes at distance l from a random node. This result is consistent with previous studies with σ(d) = d α in [4].
Two examples of link length distributions in flat space are shown in Figure 8. The model with short-range interactions σ(d) = e −16d is almost homogeneous and fits well the prediction p(l) = 256le −16l from equation (21), while the one with long-range interactions σ(d) = θ(1−d) deviates from the distribution p(l) = 2lθ(1 − l) expected for an homogeneous space.

VI. CONCLUSIONS
The results of this paper show that the power-law behaviour of the degree distribution of growing networks with preferential attachment is quite robust with respect to the spatial structure of the network. The effect of the interplay of preferential attachment and the spatial nature of the network results in an heterogeneity between nodes in different positions. In spaces with high curvature or singularities, it is possible to observe Bose-Einstein condensation driven by the intrinsic heterogeneity of the space or by the finite size of the network, as it happens in hyperbolic spaces at strong curvature.
These results have been obtained through a rate equation approach, which breaks down if the integrals involved in the selfconsistency equation for q(x) diverge. This is the case for connection functions σ(d) with divergent integral. In the simple case σ(d) = d α , there is a critical point α c = −D for the transition from the scale-free behaviour to an exponential behaviour due to distance-dominated attachment, that is, preferential connection to the nearest neighbours.
In this paper we only considered a spatial embedding of the basic Barabasi-Albert model, which gives rise to a power-law degree distribution with exponent γ = 3. However, most scale-free networks found in natural and technological systems have a degree distribution with an exponent γ < 3. There are several variations on the Barabasi-Albert model (see [35] for a review) that are able to account for a different exponent and that can be included naturally in the spatial models studied in this paper without changing the main results. For example, adding m ′ extra links per unit time would leave equation (9) invariant but change the l.h.s. of equation (8) to q(x)/(1 + 2m ′ /m), which corresponds to an exponent γ = 2 + 1/(1 + 2m ′ /m) in homogeneous spaces, varying between 2 < γ ≤ 3.
Finally, spatial networks have other interesting properties, like clustering and assortativity, that have not been studied in this paper and would deserve further work. In the models considered here, the volume Vol(M) of the manifold is constant and the density of nodes grows linearly in time, going to infinity in the thermodynamic limit. This divergence in the density makes the clustering decrease in the thermodynamic limit. An interesting variation on these models would be a growing network with constant node density on an expanding space.