On the validity of the on–site spin–orbit coupling approximation.

work we thoroughly test its general validity by confronting SOC related properties such as spin splittings, spin textures or magnetic anisotropies calculated under the on–site approximation versus the more general approach where all the contributions to the SOC, including three–center integrals, are explicitly included. After considering a variety of systems with diﬀerent dimensionalities, all presenting a strong SOC, we conclude that although the on–site approximation often provides accurate results, it breaks down in some systems where 5 d electrons are close to the Fermi level due to their strong SOC and moderately large spatial extension. Furthermore, there are a few examples where subtle inaccuracies lead to qualitatively wrong conclusions, the most clear case being the doping of the topological surface state in Bi 2 Se 3 (0001). Finally, magnetic anisotropy energies calculated under this approximation tend to be underestimated.

Spin-orbit coupling (SOC) is generally understood as a highly localized interaction within each atom, whereby core electrons holding large J splittings transfer the SOC to the valence electrons of the same atom, while their direct impact on neighbor valence orbitals is usually small. Seivane and Ferrer [Phys. Rev. Lett. 99, 183401 (2007)] proposed an approach within a tight-binding type ab initio framework assuming that the transfer of SOC from core to valence orbitals only takes place when both are on the same atom, leading to the so-called on-site approximation, which then has been successfully applied to a variety of systems. In this work we thoroughly test its general validity by confronting SOC related properties such as spin splittings, spin textures or magnetic anisotropies calculated under the on-site approximation versus the more general approach where all the contributions to the SOC, including three-center integrals, are explicitly included. After considering a variety of systems with different dimensionalities, all presenting a strong SOC, we conclude that although the on-site approximation often provides accurate results, it breaks down in some systems where 5d electrons are close to the Fermi level due to their strong SOC and moderately large spatial extension. Furthermore, there are a few examples where subtle inaccuracies lead to qualitatively wrong conclusions, the most clear case being the doping of the topological surface state in Bi2Se3(0001). Finally, magnetic anisotropy energies calculated under this approximation tend to be underestimated.

I. INTRODUCTION
The spin-orbit coupling (SOC) is a relativistic effect that arises from the interaction between the intrinsic magnetic moment of the electron and the magnetic field seen in its orbital motion around the nucleus 1,2 . The SOC is of paramount importance in numerous active research areas such as spin textures 3 , topological insulators 4 , spin Hall effects 5,6 , magnetic anisotropy energies (MAEs) [7][8][9][10][11] , Dzyaloshinskii-Moriya interactions [12][13][14] and spin-orbit transfer torques 15,16 , among others. In fact, a new field known as spinorbitronics has emerged in the last years aiming to achieve efficient mechanisms for spin injection accumulation and manipulation 17 with potential use in ultra low power memories and computing and signal processing devices.
Along with the advent of this plethora of SOC-related phenomena, most codes based on Density Functional Theory (DFT) had accomplished the implementation of spin-orbit interactions under different levels of approximation and accuracy. The main drawback in fully self-consistent calculations including SOC is the large increase of the computational cost caused by the spin mixing, whereby an (N × N ) system transforms into a (2N × 2N ) one, where N stands for the total number of basis orbitals. Therefore, accurate all-electron DFT calculations using a fully-relativistic (FR) Hamiltonian in the Schrödinger equation are currently restricted to small systems involving at most a few tens of atoms 18 . The replacement of core electrons by pseudopotentials (PPs) 19,20 has become a standard approximation [21][22][23][24] in order to significantly reduce N , making calculations for large systems more tractable. Although traditionally most PP-based formalisms exploited only the scalarrelativistic (SR) part of the PPs generated from allelectron FR atomic calculations, Hemstreet et al showed almost thirty years ago how the SOC part could be incorporated into the electronic Hamiltonian in an efficient way 25 -in previous works we have referred to it as the fully-relativistic pseudo-potential formalism (FR-PP) 26 . In essence, within the FR-PP, the angular part of the PP of a given atom k, V ps k , is expressed in the |j = l ± 1 2 , m j basis and the SOC felt by the valence states solely arises from their interaction with the SO part of these PPs which induces spin mixing via nonvanishing µ, σ|V so k |ν, σ matrix elements, where µ and ν index the elements of the basis set (plane waves, atomic orbitals, wavelets, ...) and σ = ↑, ↓. In the context of calculations using atomic orbitals as basis sets, where the orbitals are centered on the atoms and are the product of a radial function times a spherical harmonic that describes the angular part (so-called Linear Combination of Atomic Orbitals or LCAO approaches), the orbitals µ and ν may belong to different atoms than k, the only requirement being that their overlaps with V so k are not null. In order to avoid the calculation of these threecenter integrals, the PPs are typically expressed in their fully non-local form 27 . The FR-PP formalism has been successfully implemented and is currently actively used under both plane wave 24 and LCAO 3,21,29,30 DFT codes.
Within the LCAO framework, a further simplification named as the on-site approximation was introduced by Seivane and Ferrer 31 whereby only the µ|V so k |ν matrix elements with |µ and |ν belonging to the same atom k as V so k are considered, while the rest of inter-atomic hopping terms are discarded. The justification relies on the short-ranged character of the radial part of the spinorbit pseudo-potential. The approximation leads to a one-center radial integral while the angular integrals can be analytically solved for the non-vanishing elements. The on-site approximation has been applied to a number of systems, ranging from the SOC induced valence band splittings of semi-conductors to the MAEs of metallic nanoparticles 31,32 .
However, no thorough study of the accuracy of the onsite approximation has yet been performed. Although deviations from the full FR-PP approach are expected to be small, hopping terms can add up to produce effects comparable to those of the on-site terms 36 , and therefore it is important to assess the range of validity of the onsite approximation. The implications are also relevant when considering tight binding (TB) models including SOC 33,34 . For instance, under the on-site approximation, one would expect that the SOC only affects on-site energies with corrections of the form SO,σσ l,m,m , whereas the off-site contributions would additionally lead to SOC dependent transfer integrals, t SO,σσ lm,l m . In this work we present a detailed analysis of the accuracy of the on-site approximation considering several systems with different dimensionalities (from 0-D to 3-D), all of them presenting a strong SOC. We address SOC related properties such as MAEs, Rashba splittings, spin textures and (topological) surface states. All calculations have been performed with the SIESTA 21 LCAO code, which in its recent versions 37 features both the full FR-PP and the on-site approximation. We employ the same calculation parameters for each system in order to ensure that any differences can be solely ascribed to the neglect of off-site terms in the on-site approximation.
The paper is structured as follows. In section II we describe briefly the theoretical formula behind the FR-PP formalism and the on-site approximation, as well as some general remarks concerning the actual implementation in SIESTA. In sections III and IV we present the different SOC systems considered, showing the excellent accuracy of the on-site approximation in most of the cases, but also emphasizing the few failures that we have found. Finally, section V summarizes the main conclusions of this work.

II. THEORY
Within a PP-DFT formalism, the Kohn-Sham Hamiltonian may be expressed as: beingT the kinetic energy operator,V ps the PP contribution andV H andV XC the Hartree and exchangecorrelation potentials, respectively.V XC and the SO part ofV ps (V SO ) are the only spin-dependent operators that couple both spin components. Kleinman 20 showed how the norm-conserving PP formalism could be extended to include all relativistic corrections up to order α 2 , where α is the fine structure constant, by constructing J-dependent PPs from the allelectron solutions of the major component of the Dirac equation for isolated atoms: with J = l ± 1/2. If an LCAO basis set is employed, {|µ }, then the matrix elements of theV ps operator take the form: where we have introduced the subindex k indicating the atom to which eachV ps belongs. The main problem with expression (2) is that it has a semi-local character, in the sense that it is local in the radial part and non-local in the angular part 25 and, hence, it requires the computationally expensive evaluation of three-center integrals since the AOs |µ and |ν need not be located at the k-site.
After taking the following J-weighted sum and difference: (where J ± = l ± 1/2), equation (2) may be rewritten in terms of a scalar-relativistic (SR) and a spin-orbit contribution as: It is common practice in scalar-relativistic DFT calculations to transform the SR part into a local plus a sum of fully non-local operators following Kleinman-Bylander 27 : so that only two-center integrals, µ|v l ; lm , are now involved.
A. The Fully-Relativistic Pseudo-Potential formalism Hemstreet et al 25 deduced fully non-local forms for the SR and SO pseudo-potential operators. Inspired by their work, alternative albeit equivalent expressions were deduced by Cuadrado et al 26 ; here, the fullV ps operator is transformed into a Kleinman-Bylander form in the {|lJm j } basis: thus only requiring two-center integrals between projectors and basis orbitals, µ|v lJ ; lJm J . Although the matrix elements of V ps are sufficient to solve the problem, and the decomposition in SR and SO terms is not necessary, these terms can nevertheless be obtained if needed for further analysis. By computing at the same time thê V SR µν matrix elements via eq. (7) with the same choice of V local (r) as in (8), it is straightforward to extract the SO contribution from the difference: V SO µν = V ps µν − V SR µν . This FR-PP method takes into account all interactions µ|V ps k |ν between the PP at site k and all neighboring AOs and provides a rigorous account of SOC within the context of the underlying pseudo-potential and LCAO approximations.

B. The on-site approximation
In the on-site approximation developed by Seivane and Ferrer 31 , the standard expression (7) for the SR part is retained, whileV SO is approximated as a fully local operator by only considering intra-atomic matrix elements; that is, µ|V ps k |ν terms where both |µ and |ν belong to atom k, and neglecting all others. The justification lies on the fact that the V SO l (r) potentials are short-ranged.

C. General remarks about the implementation of SOC in SIESTA
Both the full FR-PP formalism and the on-site approximation have been implemented in SIESTA 37 as separate modules. Nevertheless, we have tested both implementations by removing in the full FR-PP routine any SOC interactions involving orbitals not belonging to the same site as the PP, yielding results indistinguishable from those obtained with the on-site specific routine.
We also note that SIESTA, as well as most common LCAO codes, employs the spherical harmonics Y lm (r) in their real form, obtained as a linear combination of complex spherical harmonics. Therefore, the Clebsch-Gordan coefficients involved in the angular lm|lJm J integrals require a further unitary transformation 26,31 .
Once the Hamiltonian including the SOC part has been solved self-consistently, the spin-orbit contribution to the energy is given by: Although the imaginary part of the diagonal spin boxes of the density matrix, Im{ρ σσ µν }, do not contribute to the magnetic moment m(r), they cannot be neglected since they do contribute to the SOC energy, E SO , and therefore to the total energy.
Due to the small contribution of E SO to the total energy, the level of precision required to perform an accurate fully-relativistic self-consistent calculation is quite demanding. This is specially true for the calculation of MAEs, where energy differences between two spinquantization axis are typically in the meV (and sub-meV) range. In such calculations, the tolerance in the self-consistent criteria (either related to the Hamiltonian, density matrix or both), the k-point sampling or the size of the real space grid (Mesh Cutoff) must be carefully converged for each specific system to ensure accurate results. In addition, and as shown in Ref. [26], inclusion of non-linear core corrections 38 in the PPs with rather small matching radius is in general quite relevant to achieve accurate MAE values. This, in turn, requires finer real space grids.
Last, we mention that in SIESTA it is possible to either construct the fully-relativistic Kleinman-Bylander projectors from PPs in semilocal form, or directly read them from appropriately generated PSML files, as provided by the Pseudo-Dojo project 39,40 .

D. Details of the calculation parameters
In all SIESTA calculations to be shown in the next sections we employed the GGA 41 for the XC functional, and fully relativistic PPs including nonlinear core corrections.
The convergence tolerance in the self-consistent loop for the density matrix was set to 10 −6 or to 1 meV if the Hamiltonians were mixed instead. The geometric optimisations were performed using the conjugate gradient method until forces on atoms were less than 0.02 eV/Å.  The optimisations have been performed at the SR level -that is, without SOC.
Selected results have been compared with calculations done with the VASP code, which is commonly used for SOC calculations 43 . We have used the same XC functional 41 and geometries as in SIESTA. Plane-wave energy cutoffs were taken between 250 − 400 eV, with additional tests up to 1000 eV.
Furthermore, in order to achieve a precise description of the surface states in several sytems, we considered true semi-infinite surfaces via Green's function matching technique as implemented in the GREEN code 44,45 and its interface to SIESTA. In these cases, instead of the usual band structure, we calculated k-resolved surface projected density of states maps, PDOS(k,E), which allow to resolve the bulk gap regions unambiguously. High resolution maps were computed employing energy-and k-grids of 2 meV and 0.003Å −1 , respectively, while the imaginary part entering the Green's function, which determines the broadening of the surface states, was set to 2 meV.

III. ELECTRONIC BAND STRUCTURES
A. Isolated helical Te chain The first case we consider is a helical isolated chain of tellurium atoms which was recently studied by Han et al 46 . This one-dimensional system was shown to be dynamically stable and to present a giant Rashba splitting 46 . Figure 1 displays the 1D band structure for our optimized geometry which coincides with that of Ref. 46. In the figure we simultaneously present the bands calculated with the full FR-PP method and the on-site approximation. The agreement between the two (as well as with those reported in Ref. [46] is excellent, with deviations of just a few tens of meV. Larger differences are only found for an empty band just below +2 eV. Notice in particular how the giant Rashba splitting of the highest occupied band is perfectly reproduced, both in terms of the spin splitting in energy as well as the k-shift.
B. Zig-zag surface at a Bi bilayer Our next system is a semi-infinite 2D material consisting of a truncated Bi(111) bilayer exposing a zig-zag edge. The structure is constructed by first building a 22atoms thick ribbon with the atoms initially at the positions of the ideal infinite 2D bilayer (a=4.60Å). We then optimize the positions of the six atoms closest to the edge of the ribbon, leaving the rest fixed. The most prominent feature in the relaxed structure is a large ∼ 0.6Å inward shift of the atoms at the edge towards the inner ones in order to strengthen their bonds, which is also accompanied by an increase in the buckling between them from 1.68Å to 2.0Å. Next, and as shown in Figure 2, we have modeled the bilayer edge with a semi-infinite geometry after matching via Green's function techniques the ribbon containing the relaxed edge to an semi-infinite bulk-like bilayer (see section II D for further details).
In Figure 2(a) we present the electronic structure around the Fermi level projected on the surface atoms for a calculation without SOC. A spin degenerate edge state runs across the band gap of the entire BZ crossing the Fermi level four times, in good accordance with the similar calculation of Ref. 47. In panel (b) we present the analogous calculation including SOC at the full FR-PP level. The topologically trivial edge state now appears spin-split while the gap is removed. Indeed, at the equilibrium lattice constant, the bulk 2D Bi(111) bilayer is at the turning point towards a topological state, as it becomes metallic as the gap closes when SOC is included (not shown). This is at contrast with the 1D nano-ribbon case; for instance, Li et al 47 found a sizeable gap for a 73Å wide zig-zag nano-ribbon which we ascribe to the interaction between the two edges, still present even for such a wide nano-ribbon. Upon comparison with the on-site calculation, shown in panel (c), we again find a perfect agreement with only very subtle differences; for instance, the upper edge state inside the conduction band cone (resonance) is clearly more intense.
Surface projected spin textures are presented in panels (d,f) and (e,g), for the full FR-PP and on-site cases, respectively -the S x component is omitted since, due to mirror-symmetry, it vanishes. Remarkably, we find that the on-site approximation accurately reproduces this rich spin texture, presenting inversions of the S y component at the correct k-locations for the edge states.

C. WS2 monolayer: range of the basis orbitals
We illustrate the influence of the localization of the basis set on the performance of FR-PP and on-site approximation by calculating the band structure of a monolayer of a transition metal dichalcogenide, WS 2 (figure 3). As for surfaces, it is known that the extension of the electronic wavefunctions in monolayers towards the vacuum requires longer atomic orbitals to give accurate results. 35 Longer orbitals means larger interactions with neighbour orbitals, and larger off-site terms in the SO operator. We use two alternative basis: one set is automatically generated by SIESTA using different values of the confinement energy (PAO.EnergyShift between 230 and 14 meV, which results in maximum cutoff radii between 3.52 and 4.76Å, respectively); the second basis was carefully tuned for the on-site approximation by Roldán et al 28 . While for relatively short basis both the on-site (top panels) and the FR-PP formalism (red dashed lines in lower panels) give essentially indistinguishable bandstructures, substantial differences are evident for the longer orbital basis. In particular, a band with p-type character crosses from high energies down below the Fermi level as the orbital cutoff radii increases, erroneously driving the system metallic with the on-site description. Mulliken population analysis shows that there is non-negligible overlap between W neighbouring atoms. As the basis radial function increases, this approximation forces an internal redistribution of charges between 6s and 6p orbitals, which also involves 3s and 3p orbitals of near S atoms. Note that this does not affect the S 3p -W 5d hybridization that is responsible for the SO-splitting of the valence band at K. This illustrates that for the on-site approximation to be valid, the range of localization of the support orbitals must comply with the conditions of negligible, or small, off-site contributions.

D. Bi2Se3(0001) surface
In this sub-section we consider the Bi 2 Se 3 dichalcogenide as it stands as a paradigmatic topological 3D insulator, given its large gap. In order to retrieve the topological surface states (TSS) we have employed a (0001) oriented slab containing up to six quintuple layers (QLs), since for this thickness the interaction between the TSSs at each side of the slab is known to be negligible 29 . We do not model the surface as a semi-infinite system since we will compare the SIESTA-derived electronic structure with that obtained with the VASP plane-wave code 22 for the same slab geometry. The comparison between the band structures calculated with VASP (green lines) and SIESTA using the full FR-PP formulation (black) and the on-site approximation (blue) is displayed in Figure 4(a). If we focus on the TSSs crossing the band gap between -0.1 and 0.35 eV, one immediately observes a very good match between VASP and our FR-PP results, with the Dirac point where the upper and lower cones meet located precisely at the Fermi level due to charge neutrality requirements. The on-site approximation also reproduces the TSSs, but with the Dirac point located clearly below the valence band maximum (at around −80 meV) and, hence, not pinned any more at the Fermi level. Although in general a deviation of several tens of meV does not seem too relevant -for instance, similar differences can be seen in the valence band at Γ between the plane-wave and FR-PP cases -in this particular case it has fundamental consequences as the on-site approximation would erroneously predict an n-type doped TSS.

E. (111) surfaces of 5d metals
In this subsection we examine the capability of the on-site approximation to correctly describe the surface states (SSs) hosted by several (111) surfaces of heavy 5d f cc-metals, namely: Au, Ir and Pt. For all of them we have again modeled the surfaces as semi-infinite systems first determining the relaxed surface interlayer spacings via geometry optimizations of 10-11 layers thick (1×1) slabs. Figure 5 shows the k-resolved DOS projected on the first layers of the Au(111) surface around the Γ point, with panels (a) and (b) corresponding to the full FR-PP formalism and the on-site approximation, respectively. The well known Shockley sp surface state 48 is clearly visible in both maps as two Rashba split parabolas crossing the gap region. At first sight, the main difference is the onset of the SS, which appears 0.1 eV towards larger binding energies in the on-site case. Otherwise, the splitting and dispersion of the SSs are very similar between the two formalisms. However, a closer look into the spectra via energy dispersion curves (EDCs) brings in further discrepancies. In the top panel of Fig. 5(c) we present EDCs extracted for the k-point marked by an arrow in panels (a) and (b), as well as that obtained for a calculation without SOC (green line). The effect of the SOC is to shift the SS to lower energies and induce a spinsplitting which for this k-point is around 50 meV (for reference, the splitting at the dashed vertical line in the figure is around 100 meV, which is in agreement with previous theoretical and experimental estimations 49,50 ). Under the on-site approximation the shift is ∼109 meV larger, while the splitting is only marginally larger. On the other hand and as expected from symmetry arguments, the helical (Rashba) character of the SS spin texture is not disrupted by the on-site approximation. This is seen at the bottom of Fig. 5(c), where the in-plane tangential component of the spin (S x ) takes opposite values at each branch while the radial in-plane component vanishes and the out-of-plane is negligible (the latter two not shown).  (d), also reproduce this gap, but clearly shifted towards higher energies with respect to the FR-PP counterparts. In the case of Ir(111) this shift is as large as 0.6 eV, so that the upper SS lies above the Fermi level (becomes empty) while the lower one falls into the continuum of bulk bands (becomes a resonance), thus yielding a highly inaccurate picture of the electronic structure. For Pt(111) the shift is reduced to 0.25 eV but, still, we regard the quality of the on-site bands as rather poor.

F. Bulk GeTe
We end this section considering a bulk 3D system with a broken space inversion symmetry, so that spin degen-  eracy can be removed due to the SOC interaction 51 .
We have chosen as model system the monochalcogenide GeTe insulator as it is known to exhibit a large Rashba effect 52 . GeTe stabilizes in a ferroelectric rhombohedrally distorted rocksalt structure with space group R3m. Figure 7 shows the energy dispersion of the valence and conduction bands around the Fermi level. This time the onsite bands (blue lines) yield an almost perfect agreement with the full FR-PP case (dark), accurately reproducing the strong Rashba splitting for both bands at the Zpoint, while small deviations appear only in the valence band as one moves towards the A-point.

IV. MAGNETIC ANISOTROPY ENERGIES
MAE is defined as the difference in total energy between the easy and hard magnetisation axes of a system. In this section we address the capability of the on-site approximation to obtain MAEs close to those derived from the full FR-PP approach. We note that reproducing energy differences at the meV (or even sub-meV) level is, in general, a more stringent test than the comparison between band structures.

A. Pt dimer
We first analyze the MAE for the Pt dimer. We have optimized the bond distance obtaining a value of 2.27 A. The dimer was located along the X-axis and two SC calculation along X and Z spin quantization axis were performed for the on-site approximation and the full FR-PP formulation. The energy differences, E x −E z , were of 206 meV and 200 meV, respectively. Both calculations give similar values of the MAEs and predict the easy axis along the bond axis as previously reported by Seivane and Ferrer 31 .

B. FePt-L10 bulk alloy
The binary FePt-L1 0 alloy is formed by alternating planes of Fe and Pt with square lattice geometry (see figure 8-left), leading to a structure with sligtly different in-plane, a, and out-of-plane, c, lattice constants. As a result of the lattice parameters optimization we obtained a=3.92Å and c/a=0.96.
The upper part of Table I shows the energy difference between the solutions with magnetization along the X and Z directions, ∆E x−z = E x −E z using SIESTA , VASP, a full-potential (FP) version of the linear-muffin-tin orbital (LMTO) method 53  MM F e =3.2µ B and MM P t =0.2µ B . In agreement, VASP provides 3.03µ B and 0.31 µ B for Fe and Pt, respectively. The OMs are depicted in the table I.

C. FePt-L10 cuboctahedral nanoparticle
Next, we have considered a non-periodic system consisting of a cuboctahedral FePt nanoparticle (NP) composed of 55 atoms following a L1 0 stacking. This kind of NP belongs to the so-called magic cluster sizes where the total number of atoms follows the relation N tot =(10n 3 +15n 2 +11n+3)/3, where n represents the number of geometrical closed shells which, in our case, is n=2 (see right panel in Figure 8). The total number of atoms for each species is then given by N M =(5n 3 +6n 2 +4n)/3 for the magnetic (M), and N N M =(5n 3 +9n 2 +7n+3)/3 for the nonmagnetic (NM) species, i.e., N M =24 and N N M =31. The initial structure of the NP was built using the FePt-L1 0 bulk experimental values 53 (3.86Å and 0.98, respectively), and the geometry was subsequently fully relaxed. The square box where the NP was simulated had 25Å of side avoiding the neighboring interaction with replicas in adjacent cells.
Due to the cuboctahedral shape of the NP we have calculated the total energy along three different spin quantization axes defined by the spherical angles (θ, φ), namely: (0 • , 0 • ), (90 • , 0 • ) and (90 • , 45 • ). In the following, we will label these directions as z, x and xy, respectively, and the energy difference as: ∆E x−z = E x − E z and ∆E xy−z = E xy − E z . The calculated values are shown in the table I for the on-site approximation and the full FR-PP, as well as for VASP. All predict correctly the easy axis of the NP that lies out-of-plane (0 • , 0 • ). Comparing the largest energy difference we observe that, whilst the full FR-PP MAE is ∆E xy−z =37.2 meV, on-site predicts a smaller value by around 30% (25.8 meV) and VASP a 25% larger value (50.0 meV). As in the case of the bulk material, we conclude that on-site approach underestimates the MAE values of this kind of cuboctahedral NP.
We have also obtained the magnetic moments (MMs) of each of the Fe and Pt atoms of the NP for the case in which the magnetisation is along z. We summarise their behaviour taking into account whether the atoms are in the core or at the surface. Whereas the MMs of the core atoms of both Fe and Pt species present similar values and alignment, Fe atoms at the surface present small tilts along x or y directions within the range of 0.1-0.4µ B in the full FP-PP calculation and between 0.04 and 0.27 µ B for the on-site approximation. The induced MMs of surface Pt atoms present similar dispersion for both formalisms between 0.2 and 0.4 µ B along Z direction without tilt. In both SOC formalisms, the average values are MM F e =3.4µ B /at and MM P t =0.38µ B /at. In VASP the average values of the MMs are 3.3µ B /at and 0.5µ B /at for Fe and Pt, respectively. The tilts along x and y direction ranges between 0.0-0.45µ B .

V. CONCLUSIONS
We have performed an in-depth study on the accuracy of the so called on-site approximation for the inclusion of SOC in electronic structure calculations within the DFT-PP formalism. Within a TB spirit, this approximation assumes that all the SOC transferred to the valence electrons occurs within each ion -i.e. equivalent to the renormalization of the on-site energies together with the inclusion of intra-atomic SOC matrix elements-whereas in a more general framework SOC matrix elements between two orbitals centered at different atoms pick up contributions from neighboring atoms via three-center integrals -the full FR-PP formulation.
We have considered a variety of systems with different dimensionalities, all of them presenting strong SOCrelated effects. In most cases the on-site approximation yielded good agreement with the more general full FR-PP formalism, but there were (a few) exceptions. One of them is an erroneous location of the Dirac point of the TSS at the Bi 2 Se 3 (0001) surface, as it ends up below the top of the valence band and, hence, becomes n-doped. Although the magnitude of this energy deviation falls within the error bars associated to DFT itself, we emphasize that it originates solely from the neglect of inter-atomic SOC interactions as the band structures for both approaches have been computed under the same calculation parameters.
A larger and systematic error was however found for 5d transition metals, for which the on-site band structures showed giant energy shifts -specially in the case of Ir(111)-leading to an imprecise description of the projected gaps or the off-.sets of surface states. By noting that most of the systems where the on-site approximation worked correctly involved states of p character around the Fermi level, we may conclude that it breaks down for systems involving 5d-5d interaction. In fact, this is not a surprising result, as these states present a large SOC but are also spatially quite extended, so that their contribution to three-center integrals of the µ|V so k |ν type, with the AOs residing at different sites than k, is not negligible. This could also be the reason why the onsite approximation shows more sensitivity to the basis set, giving worse results when the radial extension of the basis orbitals is increased, as dramatically illustrated for the WS 2 monolayer.
We have also calculated the MAEs for a Pt dimer, a FePt-L1 0 cuboctahedral NP composed of 55 atoms and for FePt-L1 0 bulk comparing the SIESTA MAEs versus those derived with VASP for the last two systems. Although the on-site approximation predicts correctly the easy magnetization axis, the MAE values are for both systems underestimated compared with the full FR-PP formalism and VASP. MAE values for the Pt dimer are similar for both formulations.
Finally, we note that, in DFT-PP calculations, the time consumed in the construction of the SOC Hamiltonian, including all the integrals for the off-site matrix TABLE I. Energy difference (in meV) and orbital magnetic moments (in µB) between different magnetization orientations for the FePt-L10 bulk alloy and the Fe24Pt31 NP obtained using the on-site approximation and the FR-PP formulation as implemented in SIESTA, compared to VASP, FP-LMTO, SPRKKR and WIEN2K(PBE-GGA).

System
Method ∆Ex−z ∆Exy−z µ elements, represents a very small fraction of the total in a self-consistent calculation. Hence, the computational gain in using the on-site approximation is negligible and does not seem to justify its use as it is susceptible to some inaccuracies. We conclude that it is advisable to employ the full FR-PP approach with no approximations.