Quantum transport simulation scheme including strong correlations and its application to organic radicals adsorbed on gold

We present a computational method to quantitatively describe the linear-response conductance of nanoscale devices in the Kondo regime. This method relies on a projection scheme to extract an Anderson impurity model from the results of density functional theory and non-equilibrium Green's functions calculations. The Anderson impurity model is then solved by continuous time quantum Monte Carlo. The developed formalism allows us to separate the different contributions to the transport, including coherent or non-coherent transport channels, and also the quantum interference between impurity and background transmission. We apply the method to a scanning tunneling microscope setup for the 1,3,5-triphenyl-6-oxoverdazyl (TOV) stable radical molecule adsorbed on gold. The TOV molecule has one unpaired electron, which when brought in contact with metal electrodes behaves like a prototypical single Anderson impurity. We evaluate the Kondo temperature, the finite temperature spectral function and transport properties, finding good agreement with published experimental results.


I. INTRODUCTION
In recent years, much research effort has been dedicated to study the electronic transport through magnetic molecules and single atoms in order to combine molecular electronics with spintronics 1-3 . Experiments and theoretical works have demonstrated that molecular and atomic spin states can be inferred and sometimes switched through an electrical current 4-18 . Among the many interesting phenomena arising in devices comprising magnetic molecules, there is the Kondo effect [19][20][21][22] . Below the so called Kondo temperature, θ K , characteristic of each system, the coupling between the electrons from the electrodes and the spin of the molecule promotes the formation of a many-body state with a fully or partially quenched magnetic moment. This results in a new resonant transport channel at the electrodes Fermi level. To date the Kondo effect has been studied in a number of molecular devices 8,9,14,17, , and exotic manifestations, such as the orbital Kondo effect, have also been reported 45 .
Ab-initio computational studies play a prominent role in molecular electronics and spintronics. In particular density functional theory (DFT) combined with the non-equilibrium Green's functions (NEGF) formalism 46 , known as DFT+NEGF for short, has become the dominant method to address electronic transport [47][48][49][50][51][52][53][54] . As initial step in a typical DFT+NEGF study of a molecular device, one optimizes the atomic configuration of the device active region, usually called "scattering region" or "extended molecule". Then the scattering region is joint to two semi-infinite left-and right-hand side leads (electrodes), whose effect on the states in the scattering region is taken into account by the so-called leads' self-energies. This approach allows to treat the extended molecule as an open system, with the leads' self-energies that provide quantitative estimates of the molecule-electrode hy-bridization and of its dynamic character 55 . Finally, the device transmission function and conductance are calculated within the Landauer framework 46 , by considering the Kohn-Sham (KS) eigenvalues as the single-particle excitations. Although this assumption is formally not correct, since even in exact DFT only the highest occupied molecular orbital (HOMO) can be rigorously associated to the negative of the ionization potential [56][57][58][59][60] , it practically works well for metallic point-contacts 61 , nanowires and nanotubes 62,63 , quasi two-dimensional systems 64 , and tunnel junctions 65,66 . In contrast, the calculation of transport properties via the KS eigenvalues encounters some drastic limitations in case of molecules. The fundamental gap between the HOMO and the lowest unoccupied molecular orbital (LUMO) is often severely underestimated by the KS gap computed with standard (semi-)local exchange-correlation density functionals. Furthermore, non-local correlation effects, such as the dynamical response of the electronic system to the addition of an electron or hole [67][68][69][70][71] , are not captured. These shortcomings hinder the ability of DFT+NEGF to predict the correct energy level alignment between a molecule and the electrodes, which often results in overestimated values for the conductance. Consequently, different improvements have been proposed, such as corrections for self-interaction error 72,73 , scissor operator schemes [74][75][76][77][78][79][80] and constrained-DFT [80][81][82][83][84] . Furthermore, in the recent years, there have been several attempts to move beyond the DFT+NEGF method by using the GW approximation of the many-body perturbation theory [85][86][87][88] .
The description of the Kondo effect, even at the qualitative level, still represents a challenge. On the one hand, electron correlations leading to the Kondo effect are beyond the GW perturbative scheme. On the other hand, the DFT KS spectrum fails to display any Kondo-related feature, and so does the conductance computed via the Landauer approach. We note that in principle DFT is arXiv:1701.08405v1 [cond-mat.mes-hall] 29 Jan 2017 able to capture the Kondo effect in one-orbital lattice models [89][90][91] if the exchange-correlation potential has the correct derivative discontinuity at integer number of electrons, and if the conductance is computed from the density through the Friedel sum rule 92,93 and not from the KS states.
In order to describe the Kondo effect in real molecular systems and to overcome the limitations of DFT+NEGF and GW , recent studies have proposed to combine DFT with model calculations, thus extending to molecular electronics theoretical schemes originally proposed for the study of strongly correlated solid state materials. These schemes include the dynamical mean-field theory (DMFT) 94 . The combination of DFT and models is typically achieved by partitioning the system of interest in two coupled subsystems, a weakly correlated one, whose electronic structure is well accounted for by DFT, and a strongly correlated one. Mathematically, this means that part of the system of interest is projected onto the correlated sub-system, and that the rest is integrated out as an effective bath. In case of molecular devices, this approach ultimately leads to the reformulation of the electronic structure and transport problem in terms of an effective Anderson impurity model (AIM), which then has to be solved either exactly or within some approximations. The potential of this approach has been firstly demonstrated by comparing to photoemission experiments 95 the computed spectral properties of single magnetic atoms [96][97][98][99] and molecules 100,101 on metallic surfaces. Then, the linear-response (i.e. zero-bias) transport properties have been addressed for example by Smogunov, Tosatti and co-workers [102][103][104][105] , and Jacob and co-workers [106][107][108][109][110][111] . We have recently contributed to further develop the DFT+NEGF scheme including DMFT to study spin transport in solid state devices such as multilayered heterostructures 112 .
In this article, we present our scheme to project out from DFT+NEGF an AIM, which is then solved numerically using continuous time quantum Monte Carlo (CTQMC) 113 . The developed method allows us to evaluate the temperature-dependent linear-response transport properties of magnetic molecules on metal surfaces. An important class of such molecules is formed by the stable organic radicals, which are are paramagnetic compounds presenting an unpaired electron in a singly occupied molecular orbital (SOMO) [114][115][116][117][118][119][120][121][122] . In particular, here we consider the 1,3,5-triphenyl-6-oxoverdazyl (TOV, Fig. 1). This is the first organic radical for which the Kondo effect was experimentally observed in a scanning tunneling microscope (STM) setup with a gold substrate, and the Kondo temperature was reported to be about 37 K 118 . Since the system is well characterized by STM, the comparison of the calculated density of states and Kondo temperature with the experiment serves as a stringent test for the theory. Additionally, we demonstrate how the different contributions to the conductance that originate from elastic, non-coherent and quantum interference effects can be disentangled. An open issue concerns the calculation of the electron-electron interaction energy for the Anderson impurity. Here we suggest that a partially screened value should be used in order to reproduce the experimental Kondo temperature.
The article is separated in two main parts. In the first part (Sec. II) we describe the theoretical methods, while in the second (Secs. III and IV) we present the results for the TOV molecule on Au. In the first part we initially summarize the projection scheme and outline how the different contributions to the transmission are computed (Secs. II A to II E), and then we describe the employed CTQMC algorithm (Sec. II F) and the analytic continuation of the CTQMC data (Sec. II G). In the second part, we start by listing the computational details of the DFT and NEGF calculations (Sec. III). We present the DFT results for the gas phase TOV, for the TOV on gold (Sec. IV A) and the DFT+NEGF results for the transport properties (Sec. II E). We then introduce the AIM and discuss its solution within the mean-field approximation (Sec. IV C) and by CTQMC (Sec. IV D). Finally, we present the transport properties computed via DFT+NEGF+CTQMC (Sec. IV E) and we conclude (Sec. V).

II. METHOD AND IMPLEMENTATION
The first step of the method is the separation of the Anderson impurity sub-system from the full system calculated in the DFT+NEGF setup. This is done by an appropriate projection scheme, outlined in subsections II A, II B, and II C, and schematically summarized in Appendix A 6. Then, the AIM with an effective interaction term is introduced and solved. This means that the many-body Green's function and self-energy are computed (subsection II D) so that the transport properties can be obtained (subsection II E). The method can, in principle, address both zero-and finite-bias transport problems provided the availability of a computationally efficient out-of-equilibrium solver for the AIM. Nevertheless, in this work we only consider the zero-bias case, and therefore the solution of the AIM is achieved by using CTQMC for quantum systems in equilibrium at finite temperatures, as outlined in subsection II F. The use of CTQMC requires a scheme to carry out the analytic con- tinuation of the many-body Green's function from the discrete imaginary Matsubara frequencies onto real energies This is presented in subsection II G. A schematic summary of the whole method is shown in Fig. 2.

A. NEGF transport setup
The method has been implemented in the NEGF code Smeagol 52 , which uses a linear combination of atomic orbitals (LCAO) basis set {φ µ }, and obtains the KS Hamiltonian from the DFT package Siesta 124 . Note however that the method is general and can be readily used for any code based on the LCAO approach. Each basis orbital |φ µ in Smeagol is characterized by its integer index µ, which is a collective index that includes the atom, I, the orbital n, and the angular momentum (l, m) indices. The orbital index n can run over different radial functions corresponding to the same angular momentum, according to the multiple-zetas scheme 123 . Any operatorÔ can be expressed in this basis by using its matrix form O, with the matrix elements given by (1) As a matter of notation we remark that in general upper case symbols represent matrices, and to distinguish operators from matrices we explicitly add a hat on top of the symbol for operators.
The setup used in the calculations is schematically illustrated in Fig. 3, where the basis set described above is used to expand the charge density 52,124 . Like in standard electron transport simulations, the system is first split into a semi-infinite left lead, a so-called scattering region (or extended molecule (EM)), and the semi-infinite right lead 52 . We denote the number of basis orbitals within the EM by N . We can then introduce the Hamiltonian matrix of the EM as (H) µν = φ µ |Ĥ|φ ν , where the basis orbitals span all orbitals within the EM, and whereĤ is the Hamiltonian operator. Since in general the basis set is non-orthogonal, we also need to introduce the overlap matrix of the EM, S, given by (2) We note here that we shift all energies of leads and EM in such a way to set the Fermi energy, E F , equal to 0. Such a global shift of the spectrum does not affect the properties of the system. The EM is then further subdivided in a total of 3 subsystems. A set of pre-determined wave-functions ψ i (i ∈ [1, N AI ]), defines the Anderson impurity (AI), with N AI being to the number of interacting states. The interacting region (IR) includes all basis orbitals inside the EM that contribute to the AI, which corresponds to the collection of those basis orbitals of the EM where any of the {ψ i } is non-zero. The number of basis orbitals in the IR, N IR , is usually much larger than N AI . We require that the overlap and Hamiltonian matrix elements between orbitals within the IR and the orbitals of the leads outside the EM is zero, so that the EM has to be chosen large enough to ensure this. As last subspace we introduce the extended interacting region (ER), which includes all basis orbitals inside the EM that have a finite overlap or Hamiltonian matrix element with the basis orbitals within the IR. We denote the number of orbitals within the ER as N ER , with N ER ≥ N IR . Ordered by decreasing size, the 4 subsystems are then: EM, ER, IR, and AI. The AI is a subsystem of the IR, the IR is a subsystem of the ER, and the ER is a subsystem of the EM (see Fig. 3).
The first step is therefore to define the set of molecular orbitals, {ψ i }, inside the EM that constitute the AI. The method outlined here is applicable for any arbitrary set of wave-functions, although in general the choice of the {ψ i } is based on physical intuition. In the simplest case one can simply take a combination of d or f orbitals for correlated magnetic atoms in the system, or else the wave-functions of the HOMO or LUMO of a molecule attached to metal electrodes. A practical way to construct such a set of interacting molecular orbitals is to extract from the full EM Hamiltonian, H, and overlap, S, matrices sub-blocks with basis orbitals on the molecule, which we denote as H M and S M , respectively. We can then calculate the eigenvalues i and eigenvectors ψ i of these sub-systems by solving H M ψ i = i S M ψ i . Note that each of the ψ i is a vector of dimension N IR .
Once the set of wave-functions that defines the AI as  . 3. (Color online) Schematic representation of the twoterminal device comprising the left lead, the extended molecule (EM), and the right lead. We denote the Hamiltonian matrix of the semi-infinite left (right) lead by HL (HR), the one of the EM as H, and the coupling Hamiltonian matrix as HLM (HRM). The EM is further subdivided in the interacting region (IR) and the extended interacting region (ER). The IR includes a set of interacting molecular orbitals {ψi}, with i ∈ [1, NAI] (NAI is the number of interacting molecular orbitals; in the schematic figure NAI = 4), forming the Anderson impurity (AI). The ER consists of all orbitals in the EM that have finite overlap with the IR orbitals, so that the IR orbitals are always also part of the ER. In total the system is therefore subdivided into 4 subspaces of decreasing size: EM, ER, IR, and AI.
{ψ 1,IR , ψ 2,IR , . . . , ψ NAI,IR } is chosen (here we have added the subscript "IR" to indicate explicitly that the wavefunctions extend over the IR), the projection matrix onto the AI inside the IR, U IR , is then defined as which is of dimension N IR × N AI . We then construct the ER by adding to the IR all basis orbitals within the EM that have finite overlap with the IR basis orbitals, and construct the set of wave-functions {ψ i,ER }. Each ψ i,ER is a vector of length N ER , and is equal to ψ IR i for its elements within the IR, and 0 for the others. These vectors therefore define the AI in the ER, and lead to the ER projection matrix onto the AI, U ER , given by which is of dimension N ER × N AI . Note that the simulation setup in Smeagol requires that one leads' unit cell is included at both the left and right ends of the EM. For the projection onto the AI we then further require that the orbitals of those cells cannot be part of the IR, while they are allowed to be part of the ER. If the basis orbitals indices are approximately ordered from left to right in the EM, then the overlap matrix of the EM 52,124 can be written in the following general block-matrix structure where S ER is a N ER × N ER matrix. S αα (S ββ ) is a square matrix that includes all the N α (N β ) orbitals within the EM on the left (right) of the ER, so that N = N α + N β + N ER . The dimensions of the off-diagonal matrix blocks are determined from the ones of the diagonal blocks, and are therefore not explicitly given. We subdivide the Hamiltonian matrix in an analogous way. Note that by construction the following important relations are satisfied: S α,ER U ER = 0, S β,ER U ER = 0, H α,ER U ER = 0, and H β,ER U ER = 0. The Green's function (GF) matrix of the EM is then given by the standard form 52 which has the block-matrix structure analogous to the one of the H and S matrices in Eq. (5); z is an arbitrary complex number, and Σ L (z) and Σ R (z) are the left and right leads' self-energies that describe the coupling of the EM to the leads. These are computed according the algorithm in Ref. 53. Importantly, G(z) can be either the retarded GF if z = E +iδ, with E the real energy and δ a vanishingly small positive real number, or the Matsubara GF if z = iω n , with ω n = (2n + 1)π/β, for n ∈ Z and β = 1/kθ the inverse temperature (k is the Boltzmann constant, θ is the temperature). Spin indices are omitted in above equations and in the following subsections to simplify the notation, but they will be explicitly reintroduced when required to emphasize spin-dependent relations.

B. Projection to the Anderson impurity subsystem
Here we outline how to separate explicitly the AI subsystem from the rest of the system, which we refer to as the "bath", and that includes the orthogonal subspace to the AI within the EM as well as the semi-infinite electrodes. To this aim we introduce a basis transformation matrix, W , which transforms the overlap, Hamiltonian, and self-energy matrices as (8) where we denote the transformed matrices with a bar on top of the symbol. This transformation is required to bring any general matrix extending over the orbitals of the EM, M , into a transformed form,M , in which the top left corner describes the AI, the bottom right corner describes the part of the bath included in the EM (B), and the off-diagonal blocks describe the connection terms. We note that the full bath includes orbitals from both the EM and the semi-infinite electrodes, while the N B × N B matrixM B contains only the N B = N − N AI bath orbitals within the EM. In mathematical terms we therefore requireM to have the following block-matrix structure:M We also require the projection to lead to a zero overlap between the AI orbitals and the bath orbitals. The form of W to achieve this for the transport setup is derived in Appendix A 1, and is given by Here and in the following we denote an identity matrix of dimensions m × m as 1 m . We have introduced which is equal to the projection matrix U ER [Eq. (4)], multiplied by the N AI × N AI matrix W 2,AI , which in general is constructed in such a way to orthogonalize both the overlap and Hamiltonian matrices of the AI itself.
In principle this second transformation is arbitrary and can also be omitted, depending on how the AI problem is solved. The N ER × N NI matrix W NI spans the orthogonal space to the AI inside the ER, so that W † AI W NI = 0, and therefore projects onto the non-interacting part of the ER. We denote the number of non-interacting states within the ER as N NI , so that N NI = N ER − N AI . There is some freedom in the construction of the matrices W 2,AI and W NI , since they are not uniquely defined. We construct them in such a way to leave the non-interacting part close to the original system, and the detailed relations to construct W 2,AI and W NI are given in Appendix A 1, Eqs. (A14) and (A18).
The explicit form of the final transformed overlap matrix is then evaluated tō which as required has zero overlap between AI and bath orbitals, and whereS The final general form of the projected Hamiltonian matrix isH where The general structure of the resulting matrices has the required shape given in Eq. (9), with Note that the transformed self-energy matrices extend only over the block of the bath inside the EM, and are zero for the other matrix blocks. This is ensured by the requirement that the orbitals of the left and right leads' unit cells included at the boundaries of the EM are not part of the IR. The transformation for the GF is given bȳ and can also be evaluated directly in the transformed system asḠ It is possible to evaluate the required inverse of W by blocks (Appendix A 2), and the result is where with The form of the N NI × N ER block-matrix W iNI is given in Appendix A 2, Eq. (A26).

C. Hybridization function
By removing the AI orbitals from the system we can introduce the GF of only the bath orbitals within the EM,ḡ(z), as which can be written in a block-matrix structure analogous to the one ofS B [Eq. (22)] as By using also Eqs. (27), (12) and (16) the GF on the AI is then obtained as Here we have introduced the so-called hybridization function,∆ AI (z), which is given bȳ We remark that in general∆ AI (z) is a dense N AI × N AI matrix. Therefore, optionally, as alternative possibility one can modify the transformation matrix W 2,AI to an energy-dependent form that diagonalizes∆ AI (z) rather thanH AI . In Appendix A 3 we derive an equivalent, commonly used expression for the hybridization function. Furthermore, in Appendix A 4 we show that∆ AI (z) exhibits the correct physical decay for large z.
OnceḠ AI (z) andḡ(z) are known, then the remaining block matrices of the GF within the EM [see Eq. (9) for the general matrix structure] can be evaluated tō

D. Effective Coulomb interaction and many-body self-energy
The AI is fully characterized at the KS-level through the "on-site" upper-block AI,D of the Hamiltonian matrix in Eq. (16), which is coupled to the bath via the off-diagonal blockH AI,NI . The AIM Hamiltonian operator can be rewritten in its standard form as an operator in second quantizationĤ AIM =Ĥ AI,D +Ĥ TB +Ĥ AI,NI , withĤ iσ is the annihilation (creation) operator for an electron of spin σ in the orbital i of the AI, whileĉ ( †) pσ is the annihilation (creation) operator for an electron of spin σ in an orbital p of the total bath, whose Hamiltonian matrix is formally written asH TB . We note that H TB includes the elements ofH B obtained from the projection of the EM Hamiltonian with the Eq. (7), as well as the elements of the semi-infinite electrodes. The effect of the leads is accounted for in the bath Green's function [Eq. (31)], and therefore in the hybridization function [Eq. 34] through the projected self-energies Σ L B (z) and Σ R B (z), so that the fullH TB never needs to be explicitly computed.
In order to account for many-body correlation effects, we supplement the AI with an effective Coulomb interaction, expressed by the operator Then the interacting AIM Hamiltonian operator is defined asĤ H C is typically chosen to have the form of a generalized Hubbard-like interaction, whileĤ dc is the doublecounting correction 94 . This double-counting correction is required in order to subtract the correlation effects in the AI that are already included at the KS level. The exact form ofĤ dc is unfortunately not known, and several approximations have been introduced, the most common one being the so-called "fully localized limit" 94 . This is the one that we also employ.
For a single impurity one-orbital Anderson model (SIAM) one hasĤ with the Hubbard U being a real number, andn ↑ (n ↓ ) the occupation operator for up-spin (down-spin) electrons on the AI. The double-counting correction in the fully localized limit has the simple expression where n is the DFT total occupation of the impurity.
The solution of the interacting AIM leads to the manybody GF on the AI,Ḡ MB AI (z). One can then define the many-body self-energy on the AI,Σ MB AI (z), by means of the Dyson equation Since we are considering the case of a single interacting region, the full many-body self-energy matrix of the EM is zero for all elements, except for those of the AĪ Here we use the notation 0 m to denote a m × m matrix with all zeroes; the sizes of the off-diagonal blocks are determined by the ones of the diagonal blocks. IfΣ MB (z) is known, it can be used to directly calculate the manybody GF as In cases when the many-body self-energy is required in the original basis, this can be obtained by applying the inverse transformation as By using Eqs. (28) and (46) the explicit structure of Σ MB (z) becomes The non-zero block of Σ MB extends over the ER, and is given by with W iAI given in Eq. (29). After calculating Σ MB ER (z) one can then obtain the full many-body GF in the original basis as

E. Current and transmission
The current flowing out of the EM into the right electrode, I R , can be written as sum of a component transmitted into the right electrode from the left electrode, I R,L , and a component flowing from the AI into the right electrode, I R,AI , 125,126 A similar expression holds for the current flowing from the left electrode into the EM, I L , as outlined in Appendix A 5. At steady state the current conservation condition implies that I L = I R .
The value of I R,L can be evaluated using the left to right energy dependent many-body transmission coefficient, T R,L (E), which we will simply denote as transmission, T (E), from now on in order to simplify the notation. I R,L is then given by where e is the electron charge, h is the Planck constant, f L (f R ) is the Fermi Dirac distribution at the chemical potential of the left (right) electrode, and T includes all interference effects between the possible transport channels across the system, including the effects of interactions on the AI. The so called coupling matrices Γ L,R (E) are defined as Note that if the Matsubara GF is computed by using Eqs. (6) or (47) with z = iω n , one has to perform the analytic continuation from the complex Matsubara energies to the real axis in order to obtain the retarded GF prior to the calculation of the transmission. The procedure that we have chosen to carry out such operation is described in Sec.II G. Note also that we can equally express the transmission and current in terms of the quantities in the original basis and in the transformed one. In what follows we work in the transformed basis.
The transport properties of a molecular device in DFT+NEGF are usually expressed in terms of the transmission coefficient for the KS system 52 , denoted as T 0 (E), which is given by 52 It has the analogous structure to T (E), but with the many-body GF replaced by the KS one. Therefore T (E) accounts for the renormalization of the coherent transport properties via many-body effects not included at the KS DFT level, and in the following, we refer to I R,L as the coherent, or elastic, component of the current. In contrast, I R,AI represents the incoherent component of the current. The incoherent component of the current flowing from the AI to the right electrode is given by (see Appendix A 5) where we have introduced the occupation matrix of the AI,F MB (E), 126 and which is defined in a similar way to the coupling matrices in Eqs. (55) and (56), but with the many-body self-energy replacing the the leads' self-energies. The general structure of the occupation matrix is analogous to the one of Σ MB in Eq. (46), and can be written as The quantity inside the trace of Eq. (58) can be interpreted as a transmission matrix, given by times the difference in the distribution matrices, given by . It therefore has the analogous structure to I R,L in Eq. (53), with the only difference that sinceF MB is a matrix it cannot be moved outside the trace. The analogous equations for the current from the left lead into the EM, I L , are given in Appendix A 5.
If we assume that the matrices H αβ and S αβ are zero, or more generally that their values are small enough that they can be neglected, and that Γ L (E) (Γ R (E)) extends only over the region α (β), then we can obtain the transmission function from the GF block matrix of the NI as withK = ES −H having the analogous block-matrix structure toH in Eq. (16), and for real energiesK NI,β = K † β,NI andK NI,α =K † α,NI . Note thatK αα = K αα , K ββ = K ββ , and with the assumptions made in this paragraph we have Σ L αα = (Σ L ) αα and Σ R ββ = (Σ R ) ββ . With Eq. (35) the GF block matrix of the NI can be obtained from the one of the AI as with The transmission can then be decomposed in three components where we have introduced the background or bath transmission the transmission through the AI and the interference term of the transmission Often the background and AI transmission do not interfere significantly, so that it is useful to give estimates for their separate values. In that case typically the transmission is composed of a background value, onto which the peaks due to transport through the AI are added. The transmission through the AI can be rewritten as which depends only on the GF of the AI, and on the hybridization matrices of the AI, given bȳ We can perform the analogous transformations also for I R,AI from Eq. (58), and obtain where we have defined In general therefore the calculation of the current requires the knowledge of the impurity occupation matrix F MB (E), which in turn requires the solution of the nonequilibrium problem. In Refs. 127,128 a specific shape is implicitly assumed for this matrix 126 , which allows simplified estimates of the currents. The approach proposed in those references requires the inversion of the coupling matrices Γ L and Γ R , which is however not defined in the general case 53 . We now consider the special case whereγ L,AI (E) = λγ R,AI (E), with λ a constant 125 . This rather strict condition can usually only be fulfilled for a SIAM, so that in the remainder of this section we consider only the SIAM. In this case it is possible to avoid the calculation ofF MB (E) by using the current conservation condition I L = I R 125 , and one obtains with (76) where we have used the fact that for the SIAM all quantities in the equation are just numbers instead of matrices.
Using Eqs. (53) and (75) for the special case of a SIAM, andγ L,AI (E) = λγ R,AI (E), we can write the total current as with the total effective transmission, which includes elastic and incoherent terms, given by We can collect the terms that describe the effective total transmission across the AI, T t,AI (E), as 125 Eqs. (77)(78)(79) extend the results of Ref. 125 to the more general case including background transmission and interference terms, as typically found for STM experiments of molecules on surfaces.
In this article we only consider applications to linearresponse transport, so that the conductance G = dI/dV is given by This equation generalizes the Landauer formula used in DFT+NEGF to the case of an interacting EM, with the only difference that T 0 (E) (used in the Landauer formula) is replaced by T t,AI (E). Therefore, following the standard practice used in DFT+NEGF that consists in analyzing the zero-bias transport properties by means of the transmission at equilibrium T 0 (E), here we will present the results for zero-bias transport in presence of many-body effects by plotting T t,AI (E).

F. CTQMC impurity solver
In the present work the AIM is solved by using CTQMC for quantum systems in thermodynamic equilibrium 113 , since we only address linear-response transport. In this case the method is well-established. We note however that recently there have been a number of developments towards the extension of CTQMC to out-of-equilibrium problems, both in steady-state and time-dependent frameworks [129][130][131][132][133][134] . In this work, two different algorithms have been considered: the weak-coupling approach, called continuoustime auxiliary field (CT-AUX) 135 , and the hybridization expansion, strong coupling approach (CT-HYB) 136 . CT-AUX scales as the product of the interaction U and of the inverse temperature, so that its application turns out too computationally demanding for Kondo systems at temperatures of the order of only a few Kelvin. In contrast, we found CT-HYB able to provide quite accurate results at a reasonable computational cost for the specific Au/TOV system considered in the following. Since this system is described as a SIAM, here we present the method only for this case. However, we remark that our implementation can treat multi-orbital systems as well, although only for density-density interaction terms.
CTQMC is restricted to finite-temperatures, where the partition function is given by Z = Tr e −βĤIAIM . The starting step, which is common to all algorithms, is to separate the interacting AIM Hamiltonian [Eq. (42)] in two parts, a reference HamiltonianĤ 1 and a perturbation HamiltonianĤ 2 , so thatĤ IAIM =Ĥ 1 +Ĥ 2 . Each CTQMC algorithm differs in the exact definition ofĤ 1 andĤ 2 113 . While in CT-AUX and other weak-coupling approachesĤ 2 is set equal to the effective Coulomb in-teraction termĤ C [Eq. (43)], in CT-HYB one imposes 136 withĤ AI,D ,Ĥ TB ,Ĥ I andĤ AI,NI defined in subsection II D. Therefore the perturbation Hamiltonian in CT-HYB is represented by the bath-AI coupling, while the reference Hamiltonian is that of the decoupled atomiclike correlated AI and of the isolated bath.
Having divided the Hamiltonian in two parts, one is able to introduce the interaction picture, where an op-eratorÔ depends on the imaginary time τ asÔ(τ ) = e τĤ1Ô e −τĤ1 , with 0 < τ < β, and the partition function is written as the standard time-ordered exponential with T τ the time-ordering operator, and The partition function has the form of an integral over a configuration space. In such space, any particular configuration is specified by the expansion order n, the times {τ 1 , ..., τ n } and a set of discrete variables, for instance the spin, and it is characterized by the probability distribution p n = w n n k=1 dτ k . It is this integral, which is ultimately evaluated by Monte Carlo techniques. By using the definitions ofĤ 1 andĤ 2 in Eqs (81) and (82), w n in Eq. (84) becomes 113,136 Here Z B is the bath partition function,Ĥ loc =Ĥ AI,D + H I ,d which are the Fourier transforms of the hybridization function∆ evaluated for the imaginary-time interval τ σ i − τ σ j , which separate pairs of operatorsd σ (τ σ i ) andd † σ (τ σ j ) in Eq. (85). In the function w n the trace accounts for the impurity that fluctuates between different quantum states as electrons jump in and out, while the determinants resum the bath evolutions which are compatible with the sequence of quantum fluctuations in the impurity. Note that here we have explicitly re-introduced the spin-index that was dropped from the equations in the previous subsections. However, if the bath is non-magnetic like in most cases, the hybridization function is the same for spin-up and down, and can therefore be obtained from non spin-polarized calculations.
In the specific case of a one-orbital Hubbard interaction, CT-HYB can be efficiently implemented by using the so-called "segment representation" 136 . This means that each configuration is depicted by segments, which represent time intervals τ σ − τ σ during which an electron of a given spin resides on the impurity. Notably, with such representation the trace in w n can be evaluated in polynomial time. New configurations are then obtained by either adding or removing segments. This is enough to ensure ergodicity, although other operations (such as shifting the segments' end-points) and global updates must be implemented to ensure an efficient sampling. Each update is accepted or rejected according to the Metropolis algorithm. The acceptance probability is efficiently computed with standard fast matrix update methods 113 .
As seen in Eq. (86), CT-HYB requires the Fourier transform of the hybridization function, which in principle is calculated through a summation over an infinite number of frequencies. However, in practice only a finite number of frequencies smaller than a certain cutoff N ω can be inevitably summed up, although the high-frequency limit of∆ AI (iω n ) determines∆ f AI (τ ) close to τ = 0. Therefore, in order to accurately calculatẽ ∆ f AI (τ ) for any arbitrary τ , we use a standard approach that consists in adding and removing from Eq. (87) the high frequency limit of∆ AI (iω n ). This limit is given by −iM 1 /ω n , with M 1 given in Eq. (A36) of the Appendix A 4. Hence we evaluate∆ f AI (τ ) as where the summation is restricted to positive frequencies, since ∆ AI (iω n ) = ∆ AI (−iω n ) and ∆ AI (iω n ) = − ∆ AI (−iω n ) . During Monte Carlo sampling the many-body Matsubara GF can be directly estimated. However, following Boehnke et al. 137 , we use an expansion of theḠ MB AI,σ (iω n ) in terms of Legendre polynomials, P l [(2τ /β) − 1], and we estimate the expansion coefficientsḠ MB AI,σ (l). The advantage of this choice is twofold. First, the transfor-mation (89) can be written as a unitary transformation. Second, only few Legendre coefficients are needed to express the Matsubara GF. This is because the Legendre coefficients for a Matsubara GF decay faster than any power of the Legendre expansion index l 137 . A valuable effect of this is that the statistical noise is filtered out due to the cutoff at a certain expansion order. After extensive tests for the specific system studied in this work, we found that an appropriate choice for the cutoff of the Legendre polynomials is around 100. However, such cutoff must generally be determined for each specific case.
The many-body self-energy is obtained either from the Dyson Eq. (45), or by using the expression 138 whereF MB AI,σ (iω n ) is the Matsubara representation of the correlation functionF MB AI,σ (τ − τ ) = − T τd−σ (τ )d † −σ (τ )n σ (τ ) , which can be easily computed in the Legendre polynomial basis at no extra cost 139 . This last approach usually provides much more accurate results than the calculation through the Dyson equation. The inversion of the GF in Eq. (45) amplifies the statistical noise, in particular at high Matsubara frequencies, when the difference between interacting and non-interacting GF is very small. In contrast, such problem is not present when using Eq. (90). Finally, we note that, although in this subsection we have made explicit the dependence of the GF on the spin-index for notation completeness, the spin up and down GFs are equal as long as the substrate is nonmagnetic, and there is no Zeeman-like term in the AIM Hamiltonian. Since this is the case for the application presented in this work, we will once again drop this index in the following.

G. Analytic continuation
In order to compute the transport properties, as outlined in subsection II E, we need the retarded many-body GF on the AI,Ḡ MB AI (E), whose imaginary part defines the spectral function normalized as and which is related to the real part via the Kramers-Kronig relation However, CTQMC returns the Matsubara GFḠ MB AI (iω n ) and notḠ MB AI (E). The relation between the spectral function, Eq. (91), and the Matsubara GF is determined by the Hilbert transformation which must therefore be inverted to find the unknown A MB AI (E) fromḠ MB AI (iω n ). Unfortunately, the inversion of the Hilbert transformation belongs to the class of illposed problems, for which there is no unique solution in a mathematical sense. Despite that, several numerical methods have been proposed during the last few decades to deal with this problem in the context of quantum Monte Carlo methods [140][141][142][143][144] . Here we employ the stochastic optimization (SO), and our implementation is based on the original proposal by Mishchenko et al. 145 , with some modifications. We have verified that there are only negligible quantitative differences between the results obtained with our code and the original version by Mishchenko 146 . Other methods, such as the Pade approximation 140 and the maximum entropy 141 methods, which are routinely employed by the DMFT practitioners, have been tested as well. However, the results are found to depend critically on the precision of the Monte Carlo data, so that they result in overall unsatisfactory performances.
The SO relies on the parametrization of the spectral function A MB AI,t (E) as a sum of K rectangles determined by the height h t , the length l t and the center c t , and which is normalized as in Eq. (92). This means that with Accordingly, the GF obtained by using this A MB AI,t (E) in Eq. 94 reads The optimization algorithm proceeds through a series of global stochastic updates that change the number, the size and the location of the rectangles in order to minimize a given deviation function between G MB AI,t (iω n ) and the original set of GFs obtained by CTQMC. Importantly, the fact thatḠ MB AI,t (iω n ) has an analytic expression [Eq. (97)] largely improves the performances of the method.
Once the many-body retarded GF is computed for real energies, the many-body self-energy on the real energy axis can be obtained by using the Dyson Eq. (45). Alternatively, the analytic continuation of the self-energȳ Σ MB AI,t (iω n ) can be directly carried out 147,148 .

A. DFT
The DFT calculations for the TOV molecule in gas phase are performed by using the Siesta code 124 . Normconserving Troullier-Martin pseudopotentials are used together with a double-ζ plus polarization quality basis set. The Perdew-Burke-Ernzerhof (PBE) generalized gradient approximation (GGA) 149,150 for the exchangecorrelation density functional is considered. Additional all-electron calculations are carried out with the FHI-AIMS package [153][154][155] . In particular, FHI-AIMS is used to compare the results of PBE with those obtained with the PBE0 hybrid functional 151 . Furthermore, the G 0 W 0 method 152 of the many-body perturbation theory is also employed. G 0 W 0 calculations are carried out by using the (generalized)-KS orbitals and eigenvalues obtained by either PBE or PBE0 (the notation G 0 W 0 @PBE and G 0 W 0 @PBE0 is used in order to distinguish between the two cases). The computational details are the same as those described in Ref. 156 . FHI-AIMS is also employed to optimize the geometry of the TOV molecule on Au within a supercell approach. The Au(111) surface is experimentally found to present FCC and HCP domains separated by ridge regions, and with the molecules that are adsorbed on both domains 118 . Here we consider the FCC surface, which is modeled as a 4-layer slab with a (5x5) square unit cell, and each slab is separated by 60Å of vacuum from its periodic image. Only the molecule and the first two Au layers are allowed to relax until forces are smaller than 0.01 eV/Å. The PBE+vdW surf method 157-159 is employed in order to include the effect of the van der Waals (vdW) interactions. PBE+vdW surf has been extensively tested for molecules both physisorbed and chemisorbed on metallic substrates, and the results are generally more accurate than those obtained with other common vdWcorrected GGA functionals 160,161 . The values of the screened C 6 coefficient, the vdW radius and the polarizability for Au are provided by Ruiz et al. in the original article about PBE+vdW surf 158 . The standard numerical atom-centered orbitals basis set "tier 1" and "tier 2" are considered for Au and H, C, N, O, respectively. The used k-points mesh is equal to 4 × 4 × 1. All DFT calculations for the gas phase molecule and for the molecule on Au are spin-polarized in order to account for the magnetism.

B. DFT+NEGF
Smeagol electron transport calculations are performed with the relaxed geometries described in the previous subsection, where further Au layers are added below the molecule, and the top electrode with the Au STM tip is included [ Fig. 6(a)]. The real space mesh is set by an equivalent energy cutoff of 300 Ry, and we use the local density approximation (LDA) for the exchange correlation potential. We verified that the transport properties are essentially the same when using the LDA or GGA functionals. We employ a double-ζ plus polarization basis set for the all atoms of the molecule, a double-ζ basis for the Au atoms, and a double-ζ plus polarization basis with extended orbital cutoffs for the Au atoms of the tip. This ensures that there is an appropriate electronic coupling between tip and molecule also at extended tipmolecule separations. We verified that due to the large size of the supercell in the plane we can evaluate the electronic structure at the Γ-point in the Brillouin zone perpendicular to the transport direction. All calculations of the DFT+NEGF transport properties without the inclusion of the many-body self-energy are spin-polarized. In contrast, non-spin-polarized calculations are used to extract the hybridization func-tion∆ AI (z) and on-site Hamiltonian AI,D of the AIM, since the magnetism of the TOV is accounted for within the AIM. Spin-polarized calculations of the hybridization function are only required if the substrate is magnetic.

A. TOV molecule and adsorption geometry
The TOV molecule has spin 1/2 due to an unpaired electron, delocalized mainly over the nitrogen π orbitals    of the planar verdazyl heterocycle (Fig. 4). The atoms N2 and N4 carry slightly more spin than N1 and N5, and the difference of the Mulliken populations for spin up and down is equal to 0.31 for N2 and N4, and to 0.16 for N1 and N5. The spin delocalization over the verdazyl heterocycle reflects the character of the SOMO, and of the corresponding singly unoccupied molecular orbital (SUMO) (Fig. 4, left panel). In contrast, the highest doubly-occupied molecular orbital (HOMO) and the lowest unoccupied molecular orbital (LUMO) are mainly located on the phenyl rings. The results are overall consistent with the established picture for the TOV electronic structure 162,163 .
Although the four N atoms in the verdazyl heterocycle are in the same plane, the molecule does not have a flat conformation. In fact, we find that the the two phenyl groups attached to N1 and N5 are twisted out of the verdazyl heterocycle plane, either in the same direction by about 40 degrees (geometry 1 in right panel of Fig. 4), or in opposite directions by about 40 and 140 degrees (geometry 2 in the right panel of Fig. 4). Only the latter conformation (geometry 2) has been previously reported in the literature 162 , but we find that that the energy dif-ference with the geometry 1 is below 1 meV, which is the numerical accuracy of our calculations set by the used basis set 153 . In contrast, the energy difference with the geometry obtained by constraining all three phenyl groups in the same plane as the verdazyl heterocycle is about 0.15 eV. The molecule's spin can be fully compensated either by charging or by attaching a hydrogen to the oxygen atom, thus forming the H-TOV molecule. In both cases, we find a considerable geometrical rearrangement involving also the N atoms, so that the verdazyl core appears largely distorted, and the phenyl groups are further moved outof-plane in an asymmetric way. The optimized geometries for the negatively-charged TOV and H-TOV are displayed in Fig. 4 [166][167][168][169] . For this reason, and due to the lack of experimental data, in the following we consider the G 0 W 0 @PBE0 estimate of 4.85 eV as reference value. We note that that no difference is found within our numerical accuracy for the computed SOMO-SUMO gap in case of the two different TOV conformations (geometries 1 and 2 in in the right panel of Fig. 4), so that the results in Tab. I apply to both. The geometry optimization of the TOV/Au system reveals that the molecule can be adsorbed in two different configurations, labelled CFG 1 and CFG 2 . In configuration CFG 1 , TOV is physisorbed and assumes an almost flat conformation on the surface. The distances of N2/N4 and N1/N3 from the top Au layer are 2.94Å and 3.13 A, respectively. The average adsorption height of the phenyl rings attached to N1 and N3 is 3.09Å, while that of the phenyl ring attached to C3 is 2.95Å. The adsorption energy is equal to −2.564 eV, a value that is comparable to that of typical molecules of similar size (such as perylenetetracarboxylic dianhydride, PTCDA) physisorbed on Au 158 . By analyzing the Mulliken populations, we note that a small charge transfer of about 0.2 electrons from the Au to the molecule occurs. This leads to an overall decrease of the SOMO-SUMO exchange splitting and to a partial compensation of the molecule magnetic moment, which reduces to 0.3 µ B . Configuration CFG 2 corresponds to a qualitatively different physisorbed geometry, which has an adsorption energy of −2.706 eV. For CFG 2 one full electron is transferred from the surface to the TOV, so that the spin on the molecule completely disappears. Although the adsorption energy for CFG 1 and CFG 2 differs by only about 150 meV, the energy barrier separating the two states is quite large (≈ 1.4-1.5 eV). Indeed we observe that the increased charge transfer is accompanied by a substantial distortion of the TOV verdazyl heterocycle, with one of the N atoms (N4) being displaced out of the molecular plane towards the Au surface (see Fig. 5). The distance between N4 and the surface decreases to 2.3Å, while the distance between N2 and the surface increases to 3.23Å. This distortion of the verdazyl core is similar to that observed in the gas phase for the negatively-charged molecule (Fig. 4), taking into account that now the phenyl groups are kept parallel to the surface because of the vdW interaction. We note that our calculations for physisorbed H-TOV reveal a similar deformation of the molecule. We point out that the charge transfer and the consequent spin reduction or compensation are likely overestimated by the calculations, because PBE underestimates the SOMO-SUMO gap. This problem is commonly found in DFT-based electronic transport simulations, and it has been cured in practice by using scissor operator schemes 74,76-80 (see the following section). The overestimated charge transfer may have important implications for the geometry optimization, since it may artificially stabilize the configuration CFG 2 over the configuration CFG 1 . The calculation of the correct charge transfer for structural relaxations can currently not be addressed on a fully quantitative level even with state-of-art methods. DFT energy functionals that increase the SOMO-SUMO gap, such as hybrid functionals, do not provide a good description of the metal electronic structure. For configuration CFG 1 the charge transfer is overall rather small, and in cases of small or negligible charge-transfer PBE+vdW surf has been shown to perform remarkably well when compared to experimental data 170 . We therefore expect that the final geometry for CFG 1 , and in particular the moleculesurface average distance, is accurately predicted. In experiments two types of molecules are observed on the Au surface, denoted as type-A and type-B 118 . While type-B molecules display a Kondo resonance in the low bias conductance, such resonance is absent in type-A molecules. Type-A molecules have been identified as H-TOV, while type-B molecules correspond to configuration CFG 1 118 . We note that based on our results the type-A molecules could also correspond to molecules adsorbed with geometry CFG 2 . However, as discussed in the previous paragraph, it is likely that the calculations overestimate the stability of CFG 2 , and in experiments most type-A molecules may indeed be H-TOV. In the remaining part of the manuscript we will address only type-B molecules, where the Kondo state has been measured experimentally.

B. Transport properties from DFT+NEGF
The spin-polarized projected density of states (PDOS) on the molecule in the transport setup for configuration CFG 1 is shown in Fig. 6(b). After adsorption, the former gas-phase SOMO and SUMO appear as Lorentzian-like peaks that extend around E F . The combined effects of the hybridization of the SOMO and SUMO states with the Au substrate, and of the small surface-to-molecule charge transfer, induce a reduction of the exchange splitting compared to the gas phase, and consequently of the magnetic moment. The full width at half maximum of the SOMO and SUMO Lorentzian peaks gives their electronic coupling to the substrate, Γ ≈ −2 ∆ AI (E = E F ), which we calculate to be Γ ≈ 290 meV. This system therefore presents a rather strong electronic coupling to the substrate. In contrast, the electronic coupling to the tip is usually much smaller, and it can be neglected compared to Γ.
The analysis of the transport properties is carried out by looking at the transmission coefficient, T (Sec. II E). This decays exponentially with the tip height and depends sensitively on the in-plane position of the STM tip. The transmission through the SOMO and SUMO is largest when the tip is on top of the N atoms of the molecule, so that we place it above atom N2 in the subsequent evaluation of the transport properties, with a tip height of about 5.7Å above the plane of the molecule.
The spin-polarized transmission as function of energy is shown in Fig. 7(a), and it can be seen that, like the PDOS, also T is only slightly spin-split. We investigate the change of T for reduced coupling by rigidly shifting the molecule off from the surface by a distance, ∆h, of 0.5Å, and 1.0Å (Fig. 7(b-c)). The results demonstrate that if the coupling is reduced by increasing the moleculesurface separation, then the peak of the radical state is less broad, and the spin splitting increases. For ∆h of 0.5Å the radical is fully spin-split, while it is only partly spin-split for ∆h = 0.0Å. This shows that in quantitative predictions of hybridizations of molecules to surfaces the correct evaluation of the contact geometry is of central importance. Experimentally such a change of coupling might be achievable by systematically varying the groups attached to the TOV molecule.
As discussed in the previous section, the LDA gas phase KS SOMO-SUMO gap is underestimated when compared to experiment. On a surface however the gap is reduced due to the image charge effect, which is not captured by the LDA KS eigenvalues 80 . A fortuitous error cancellation between the gap underestimation and the neglect of image-charge effects may sometimes happen, but this is not generally the case. Here we evaluate the effect that a larger gap has on the transport properties by using a scissor operator (SCO) approach 74,76,77,79,80 , where we increase the SOMO-SUMO gap by about 1 eV to illustrate the general trend (Fig. 8). We apply the SCO correction non-self-consistently as a postprocessing step on top of the LDA converged charge density as well as selfconsistently. The opening of the SOMO-SUMO gap leads to a fully spin-split state, where one can clearly identify SOMO and SUMO peaks despite their large broadening. Self-consistency does not qualitatively change the results, demonstrating that applying the correction as a post-processing step to the converged LDA solution is a good approximation. The exact value of the SOMO-SUMO gap, which corresponds to the interaction energy U [Eq. (43)], for molecules on surfaces can only be estimated. We will address this issue for the TOV molecule deposited on Au in Sec. IV D.
Finally, we note that the DFT+NEGF PDOS and transmission are spin-split and a magnetic moment appears due to the unequal occupation of spin up and down states. This is a characteristic broken-symmetry picture, which is only valid for applied magnetic fields 171 . In absence of magnetic field the symmetry is not broken, and the DFT symmetry breaking is an artifact caused by the effective exchange-correlation magnetic field.

C. Mean-field Anderson impurity model
The TOV adsorbed on Au is represented as a SIAM through the projection described in Sec. II. We remind that here non-spin-polarized DFT calculations are used, since magnetism is accounted for in the SIAM calcula-  tion. The hybridization function, calculated with Eq. (34), is shown in Fig. 9, and mainly reflects the Au DOS. Both the real and imaginary parts are rather featureless over a wide energy range, where only the 4s states contribute to the Au DOS, while very sharp peaks are present at energies below about −2 eV, where the fully filled 3d bands are located. Before solving the SIAM by using CTQMC, we address it within the mean-field approximation, where the HamiltonianĤ C in Eq. (43) is replaced byĤ MF C = U σn σ n −σ , and the double-counting correction is the same as in CTQMC [Eq. (44)]. The mean-field solution can be directly compared to the broken-symmetry DFT+NEGF results, whereĤ MF C plays the same role as the KS potential including the SCO, and where U determines the SOMO-SUMO gap.
At U = 0, the total AI occupation n = n ↑ + n ↓ is equal to the LDA occupation of the AI, obtained with the projection described in Sec. II. This is slightly larger than  1, reflecting the small metal-to-molecule charge transfer predicted by DFT, and discussed in Sec. IV A. The magnetic moment m z = n ↑ − n ↓ is equal to zero, because the AIM is mapped from a non-spin-polarized calculation andĤ MF C = 0. For U = 0 eV, we see in Fig.10 (top panel) that the total occupation n initially increases as function of U , until the system undergoes a transition to the magnetic state. At this point (between U = 0.5 and U = 0.75 eV) the formation of the magnetic moment is accompanied by the reduction of n , which then keeps decreasing monotonically for increasing U towards the half-filling limit. This behavior is qualitatively the same that we find when we apply the SCO in the DFT+NEGF (Sec. IV B). The spectral function of the AI is a Lorentzian at U = 0 eV (Fig. 11), and gradually splits into two (almost) Lorentzian functions as U is increased, one for spin up and one for spin down. Importantly, due to the small charge transfer from the Au substrate, these functions are not symmetric around E F . Moreover, the spectral function for spin down is slightly sharper than that for spin up, which is due to the decrease of absolute value of the imaginary part of the hybridization function with increasing energy (Fig. 9). This is overall similar to the DFT+NEGF transmission peaks displayed in Fig. 8.
To conclude this section, we take advantage of the known exact mean-field spectral function for real energies in order to assess the accuracy of the analytic continuation (Sec. II G). This is shown in Fig. 12, which compares the exact spectral function evaluated directly for real energies, to the one "analytically continued" from complex Matsubara frequencies to real energies, and which should in principle be identical to one another. The accuracy is very good for energies around E F , where the two spectral functions are almost indistinguishable, while the agreement becomes progressively worse for much larger (in absolute values) energies. In particular, the analytically continued spectral function appears broader than the exact one the further one moves away from E F . This additional broadening is an artifact of the analytic continuation, and it is caused by the limited accuracy of the SO (Sec. II G).

D. CTQMC simulations and Kondo effect
The CTQMC impurity occupation n and squared local magnetic moment m 2 z for several U -values at θ = 20 K are presented in Fig. 10. We remark that m 2 z and not m z is the appropriate quantity to look at in the analysis  145 . In principle, it can be greatly reduced by averaging the results of many independent optimizations for the same data set, in practice however it cannot be completely removed. The presented results are obtained by averaging over 250 independent runs. of the magnetization for the CTQMC results, since the occupation for spin up and down is equal in absence of any Zeeman-like term in the Hamiltonian, even in the local moment regime. This shows that the exact CTQMC solution does not have the unphysical symmetry breaking found for the DFT and mean field solutions. The CTQMC m 2 z gradually increases as function of U , and there is no evidence of an abrupt magnetic transition at U ≈ 0.5 eV. Nevertheless, when mean-field predicts a stable local magnetic moment for U > 0.5 eV, the CTQMC m 2 z is smaller than the corresponding mean-field value, while n is slightly larger. In this region of parameters the effect of electron correlation is therefore to favor the AI double occupancy compared to the mean-field picture, so that the magnetic moment is partly quenched. This is a typical behavior expected for a Kondo system. For large U , after the transition from the Kondo screened to the local magnetic moment regime, the mean-field and the CTQMC m 2 z converge to the same value, and n approaches the half-filling limit. Important insights into the dominant physics at different energy scales are provided by the AI spectral function A(E). In Fig. 13 we observe that the Lorentzian peak found for U = 0 eV gradually shrinks into a narrow and sharp peak for increasing values of U , thus forming the Kondo resonance centered around E F . At the same time, two broad satellite peaks, separated approximately by U , develop, and they correspond to the SOMO and SUMO states found in the DFT+NEGF and in the mean-field DOS. Note that, unlike in the mean-field case of Fig. 11, here the SOMO and SUMO are not formed by breaking the spin-symmetry through an unequal spin-occupation. Furthermore, they are much broader. Even though the broadening is partly an artifact caused by the analytic continuation (see Sec. IV C), we can still appreciate the real intrinsic broadening due to electron-electron interaction by comparing Fig. 13 to Fig. 12. In Fig. 14 the spectral function is plotted at fixed U = 2 eV, but at different temperatures, ranging from 20 K to 300 K. The SOMO and SUMO-related peaks do not change significantly as function of θ within the stochastic and numerical precision of the results, since U is constant. The Kondo peak on the other hand progressively shrinks with increasing θ, in analogy to what has been reported in experiments. We note that the energy scale for the Kondo effect is set by the ratio between U and the electronic coupling of the molecule to the substrate Γ (see below). Therefore, for a fixed Γ, increasing θ at fixed U has the analogous effect on the Kondo feature as fixing θ and increasing U . The quality of the computed spectral function at E F can be checked by means of the generalized Friedel sum rule 20 We remind that in this work we set E F to 0 as explained in subsection II A. Although in principle the generalized Friedel sum rule is valid only at θ = 0 K, we find that it is satisfied within 2% whenever the temperature is low enough that the system can be considered deep in the Kondo regime. This turns out to be the case for U 1.5 eV at θ = 20 K. Deviations for larger U are mostly due to the finite temperature, since the system gets closer to its Kondo temperature (see below). We have also assessed the large energy behaviour of the spectral functions by computing their moments. The normalization condition defined in Eq. (92) fixes the value of the zero moment, and is enforced in the analytic continuation (Sec. II G). The sum rule involving the first moment 172 is fulfilled within 5% to 10% for every considered value of U and θ [ dc = U (n − 1/2) is the contribution to the onsite energy deriving from the double-counting correction in Eq. (44)]. Much larger errors are found for higher moments. Overall these results and the verification of the generalized Friedel sum rule confirm the observations in the previous sub-section about the expected accuracy of the computed spectral functions at different energies. While the accuracy is good for energies around E F , there is a loss of accuracy at high energies, reflected by the error in the moments beyond the first. In order to quantitatively relate our simulations to the experiment 118 we compute the Kondo temperature θ K . This is defined as 173,174 where Z is the quasiparticle weight, also known as the quasiparticle mass-renormalization factor, Since CTQMC simulations yield the self-energy at the Matsubara frequencies, Z is approximated in the discrete form 175 with ω 0 = πkθ. We note that the definition of θ K in Eq. 100 was originally derived for the symmetric AIM in the Kondo limit, where the occupation is n = 1, and where the charge susceptibility is zero and the magnetic susceptibility becomes equal to that of the sd model, also called Kondo model, with S = 1/2 for θ → 0 20 . Despite that, in recent works 96,99 it has been employed to study general AIMs with the parameters obtained from DFT, and it was shown able to provide reliable estimates of θ K , so that its use has become justified in practice. Moreover, in our case the symmetric AIM is a good approximation to the studied AIM, since the investigated system is close to half-filling and its hybridization function appears rather featureless over a wide energy range around the Fermi energy (Fig. 9). In experiments 118 θ K is extracted from the measurement of the full width at half maximum of the Kondo peak, Γ K (θ), as a function of temperature by means of the approximate relation 176 Therefore, additionally to Eq. 100, we have also considered this equation, with θ equal to our simulation temperature, and Γ K as the Kondo-peak width of the spectral functions in Fig. 13. The estimated θ K as a function of U for simulations at θ = 20 K is plotted in Fig. 15 on a logarithmic scale. The results obtained by using the two alternative definitions in Eqs. (100) and (103) are in good agreement, showing that both methods are largely equivalent for the evaluation of θ K . Note that the numerical and stochastic precision of the computed Z, and consequently of θ K through Eq. 100, degrade for U larger than 2 eV as Z becomes small, although its error bar stays of the order of 0.1. In this case the system is approaching the transition point between the Kondo and the local moment regime, where Z is not a well-defined quantity anymore. A precise evaluation of the Kondo temperature via Eq. 100 for U > 2 would require much lower simulation temperatures than θ = 20 K, where the system would be deep in the Kondo regime, but these simulations turned out too computationally demanding.
The results can then be compared to the Kondo temperature for the asymmetric SIAM, given by (see page 168 in Ref. [20]) which is valid for U/πΓ > 1. Here Γ = −2 ∆ AI (E F = 0), and is the on-site energy of the impurity. The asymmetric SIAM reduces to the symmetric SIAM for = −U/2. Eq. 104 shows that for large U the value of θ K decreases exponentially with U . In our calculations is used a parameter that is obtained by fitting with Eq. 104 the θ K , which is calculated via Eq. (100) in Fig. 15, for small U (πΓ < U 1.5 eV). We then obtain = −1.42 U/2. This value deviates from the symmetric value = −U/2 because, as we pointed out above, there is a small substrate-to-molecule charge-transfer. The fitted approximately agrees with AI,D − dc ≈ − dc ( AI,D dc for πΓ < U 1.5), and accordingly also to the position of the SOMO peak with respect to E F in the spectral functions in Figs. 13 and 14. This agreement confirms the applicability of the model, and shows that for U 1.5 eV our DFT+NEGF+CTQMC calculations provide reliable estimates of the Kondo temperature. Then, by using Eq. (104) with the fitted , we can extrapolate the calculated values of θ K to larger values of U , where the θ K computed directly from the CTQMC data becomes inaccurate. We note that for the special case of the symmetric SIAM with = −U/2 in Eq. (104), we obtain a steeper decay of the predicted Kondo temperature as function U than for the fitted asymmetric case (see Fig. 15).
The TOV/Au system has an experimental θ K of 37 K 118 , and the estimated value of U that gives such θ K in our calculation for the fitted asymmetric SIAM equation is U ≈ 2.2 eV (Fig. 15). Consistently, for this U the value of Z is calculated to be approximately zero, and the squared local magnetic moment m 2 z approaches the mean-field value (see Fig 10). The estimated U that reproduces the experimental Kondo temperature can then be compared to the SOMO-SUMO gap obtained for the gas phase molecule (Tab. I), which is computed to be 4.85 with G 0 W 0 @PBE0. When the molecule is put on the Au surface at a height of about 3.1Å (Sec. IV A), metal-induced image-charge corrections reduce the static gap 70,71 . In order to estimate this static gap reduction, a classical image-charge model can be used, with an image plane placed at 1Å above the Au surface 75,80,82,83 . With this model the reduction of the gap results to be about 3.4 eV, which then leads to an approximate static U for the molecule on the surface that is 1.5 eV. This static U corresponds to the lower limit for U , since it implies full screening of any charge fluctuation on the molecule by the Au surface. The value of U ≈ 2.2 eV that best matches experiment lies somewhat above this lower limit, but well below the upper limit of the gas phase molecule values. We explain this result by observing that, since the fluctuations responsible for the Kondo physics happen on a rather fast time-scale, they will generally not be fully screened. While much effort has been dedicated to compute fully-screened U values in both molecular and solid state systems with several alternative methods, such as constrained Random Phase Approximation (cRPA) [181][182][183] , constrained DFT (cDFT) 80,81,84,[177][178][179][180] , screened hybrid functionals 184,185 and (self-consistent) linear response 186,187 , the intermediate situation that we suggest here has not been analyzed yet. This represents therefore an important aspect for future theoretical studies about Kondo physics.
We point out that an accurate estimate of the Kondo temperature is a great challenge. As seen in Eq. (104), the Kondo temperature is an exponential function of the ratio between U and Γ. This latter quantity is determined by the molecule adsorption distance and position, which means that the quantitative modeling of Kondo

E. Transport properties from DFT+NEGF+CTQMC
The total elastic transmission in presence of manybody correlations, T , of the system is calculated with Eq. (54). Importantly, we have verified that the same result is obtained by using the original or the projected matrices, so that the evaluation of the inverse projection of the many-body self-energy [Eq. (50)] is in principle not required for the calculation of the transmission. The results for the setup outlined in Sec. III B, and for three different values of U , are given in Fig. 16(a). The nonspin-polarized transmission without correlations, T 0 , corresponds to T for U = 0, and has a broad Lorentzian peak around E F . When increasing U this peak gets narrower and becomes the Kondo peak, with two broad satellite peaks separated by about U , following the evolution of the DOS in Fig. 13. The peak below E F is less pronounced than the one above E F , which indicates an energy dependent coupling to the electrodes, and is in qualitative agreement with the results obtained using a scissor operator (Fig. 8). The breakdown of T into its components, as derived in Sec. II E, is shown in Figs. 16(bd). By construction the background transmission in Fig.  16(b) [Eq. (67)] is independent of U , and reflects the DOS and coupling of the tip to the substrate mediated by non-interacting states of the molecule. For the considered values of U the transmission of the AI itself, defined in Eq. (70) and shown in Fig. 16(d), decays to zero for energies 2 eV below and above E F , where the AI DOS vanishes (Fig. 13). The interference terms are shown in Fig. 16(c) [Eq. (69)], and are rather small, so that in this case T largely corresponds to the sum of T B and T AI . In general however T I can be large and lead to a change of shape of the Kondo peak from a symmetric Lorentzian-type to an asymmetric Fano-type, and our method allows to evaluate the contributions individually. For the contribution of the AI we evaluate also the total transmission including incoherent components, T t,AI , and the results are shown in Fig. 16(d). Qualitatively T t,AI and T AI are similar, but T AI becomes significantly smaller than T t,AI as U increases, since it neglects incoherent components of the electron transport. Note that at zero temperature the many-body self-energy at E F vanishes in the Kondo regime, so that one should have T (E F ) = T 0 (E F ) and T t,AI (E F ) = T AI (E F ). However, at finite temperature the many-body self-energy is small, but not exactly 0, which results in the Kondo transmission peak shrinking when compared to the non-interacting case and to T t,AI (E F ) = T AI (E F ) for θ > 0.
We can now evaluate how the transmission depends on the tip shape. Since the tip-molecule coupling is rather weak at our considered tip height, we can assume that the tip does not significantly influence the electron correlation effects in the molecule, so that we use the same many-body self-energy for the different tip shapes. To evaluate the transmission for a blunt tip, we remove the single-tip Au atom, so that we have a tip with 3 Au atoms in the bottom-most plane closest to the molecule. We then shift the height of these atoms from the molecule to be equal to the one of the single-atom tip considered in Fig. 16. The comparison of the transmission for the two tip shapes, and for U = 2.5 eV, is shown in Fig. 17. The main difference is that the 3-atom tip has a stronger coupling to the molecule, which increases all contributions to the transmission. The background transmission becomes less featured, since the increased DOS around E F for the 1-atom tip is generally broadened for a 3atom tip 75 . With the used value of U = 2.5 eV, which is close to the value obtained by comparing the Kondo temperature to experiment, our transport results are in good agreement with experimental conductance measurements in Ref. 118 , which show a Kondo resonance at small bias on top of a background conductance, which is approximately constant in the considered low bias range. In general the relative height of background transmission to Kondo peak depends on tip shape, size and position, and can therefore be one of the tools to infer a nanoscale junction geometry.
FIG. 17. (Color online) Transmission including correlations for U = 2.5 eV for a tip with a single atom at the top (1-atom tip), and a tip with three atoms at the top (3-atom tip). The different sub-figures correspond to the ones described in Fig. 16

V. CONCLUSIONS
We have presented a rigorous way to extract an effective AIM from DFT+NEGF calculations for molecules adsorbed on a substrate. This effective AIM is then solved at finite-temperature with our implementation of CTQMC. With the developed method we can describe the linear-response conductance of systems in the Kondo regime, and evaluate the separate contributions to the overall conductance originating from the transmission through the molecule itself, from the background transmission, and from interference terms. Furthermore, for the SIAM we can compare the elastic contributions to the total conductance including incoherent effects. The results for the specific case of the TOV radical adsorbed on Au demonstrate that the method can potentially return accurate values for the Kondo temperature, provided that DFT gives reliable adsorption geometries, and that one can estimate the value of U for the effective electron-electron interaction. We propose that a partially screened U has to be considered to describe the Kondo physics for such molecules on metallic surfaces. Our results show that the developed method is a promising tool to evaluate the linear-response conductive properties of strongly correlated nanoscale systems. We note that the extension beyond linear response to finite-bias problems is in principle straightforward, and the current can be calculated as outlined in Sec. II E. The main developments required for such finite bias calculations are accurate and computationally efficient solvers for the outof-equilibrium AIM in the steady-state. Such developments are still an open research direction, and important progresses have been recently made 129,[189][190][191][192] .

VI. ACKNOWLEDGMENTS
As first step in the projection scheme we introduce a new basis set {φ µ }, which is identical to the original set of {φ µ }, with the exception that the N AI basis orbitals, which have largest overlap with the set {ψ 1,ER , ψ 2,ER , ψ 3,ER , . . . , ψ NAI,ER } are replaced with the corresponding ψ i,ER of the AI. Note that by construction the replaced basis orbitals are always part of the IR, since ψ i,ER has non-zero elements only within IR basis orbitals. This transformation is achieved with the projection matrix The matrix U NI has the dimension N ER × N NI , and is an identity matrix with the N AI columns removed, which correspond to the ones of the replaced basis functions. We collect all the unity vectors removed when forming U NI into a N ER ×N AI matrix U ⊥ , so that U NI † U NI = 1 NNI and U ⊥ † U NI = 0. The overlap and Hamiltonian matrices in the new basis,S andH, are then obtained as The general structure of the resultingS is whereS AI is a N AI × N AI matrix,S NI is an N NI × N NI matrix, and the dimensions of the off-diagonal blocks are determined from the ones of the diagonal blocks. Importantly, the sub-blocks for the overlap between the AI and the α and β subsystem are 0, since the ER is chosen to be large enough to guarantee this by construction.
The transformedH has an analogous structure. The basis functions of the AI are generally not orthogonal to the other ER orbitals, so thatS AI,NI = 0. If the Anderson impurity solver can deal with such nonorthogonal systems, then one can pass this matrix directly to such a solver. Most AI solvers however require orthogonality between the AI and all other orbitals in the system. We therefore perform a second transformation, which orthogonalizes all the orbitals to the ones of the AI. Importantly, we require such transformation to keep the orbitals of the AI, and of the α and β regions, unchanged. One can achieve this with the following transformation matrix: We therefore now have the AI orthogonal to all bath basis orbitals.
As a further optional step we can diagonalizeS AI and H AI . To this aim we first orthogonalize the orbitals in the AI via a Löwdin transformation by calculating the square root of the inverse of the AI overlap matrix, S Here U AIψ is a N AI × N AI matrix, with each column equal to an eigenvector of the eigenvalues system, and AI,D is a N AI ×N AI diagonal matrix, where the diagonal elements are the corresponding eigenvalues. The resulting transformation matrix to the new basis acts only on the orbitals of the AI, and is with The transformed matrices areH = W † 2H W 2 , andS = W † 2S W 2 , and keep the non-zero structure ofH andS, except that the top-left N AI × N AI elements become diagonal matrices.
The total transformation matrix, W , then is which when multiplied out gives the result in Eq. (10), repeated in the following for convenience Here we have introduced W AI = U ER W 2,AI , (A17) W NI = (1 NNI − W ER,ψ S ER ) U NI , with

Inverse projection
In this section we evaluate the inverse of W [Eq. (10)]. By using Eq. (A15) we obtain its general form as The inverse of W 0 is where A is a N AI × N ER matrix, and B is a N NI × N ER matrix. They are obtained by evaluating If we recollect that all the unity vectors removed when forming U NI are combined into a matrix U ⊥ (U † ⊥ U NI = 0), then we can give analytic expressions for A and B as Note that since for large systems the matrices U NI and U ⊥ consist mainly of zeroes, and the matrix U † ⊥ U ER is only of dimensions N AI × N AI , the calculation of the non-zero elements of W −1 0 by using Eq. (A23) is a fast Order(N ) operation.
The inverse of W 1 is and the inverse of W 2 is Multiplying all components gives the result in Eq. (28), with

Alternative expression for the hybridization function
We write the dense N ×N matrix G in the block matrix form analogous to the one of the overlap matrix [Eq. (5)] as By using Eqs. (26) and (28) we obtain the following expression for the Green's function projected on the AI: which is an equivalent expression to the one given in Eq. (33). If the task is to calculate∆ AI (z), one can therefore also first use Eq. (A28) to obtainḠ AI (z), and then Eq. (33) to evaluate∆ AI (z) as ∆ AI (z) = z − AI,D −Ḡ −1 AI (z).
However, especially for large z, this indirect way of ob-taining∆ AI (z) is significantly less accurate than the direct way through Eq. (34). The reason is that for large z the quantities z andḠ −1 AI (z) in Eq. (A29) are both very large, while the resulting difference is small, with a leading term proportional to 1/z (see Sec. A 4). In a practical calculation we therefore always use Eq. (34) to calculate∆ AI (z).

Asymptotic limits of the hybridization function for large energies
By means of Eq. (34) we can evaluate the limit of ∆ AI (z) as z goes to infinity. In Eq. (34)∆ AI (z) is given by a multiplication between energy independent matrices,H AI,NE andH † AI,NE , with the energy dependent matrixḡ NI (z). We therefore need to evaluate the asymptotic limit ofḡ NI (z), defined in Eq. (31), for large z.
To this aim we first evaluate the limit for large z of Σ L (z) and Σ R (z). The left self-energy satisfies the following recursive relation 53