Neural-Network Quantum State of Transverse-Field Ising Model

Communications in Theoretical Physics, 2019, Issue 11 (2019-11-07)

Han-Qing Shi (石汉青), Xiao-Yue Sun (孙小岳), and Ding-Fang Zeng (曾定方)

Theoretical Physics Division, College of Applied Sciences, Beijing University of Technology, Beijing 100124, China

Abstract Following the approach initiated by Carleo and Troyer [G. Carleo and M. Troyer, Science 355 (2017) 602], we construct the neural-network quantum state of the transverse-field Ising model (TFIM) by an unsupervised machine learning method. Such a wave function is a map from the spin-configuration space to the complex number field, determined by an array of network parameters. To obtain the ground state of the system, the values of the network parameters are calculated by a Stochastic Reconfiguration (SR) method. We provide an understanding of this SR method from the viewpoints of the action principle and information geometry. With this quantum state, we calculate key observables of the system: the energy, correlation function, correlation length, magnetic moment, and susceptibility. As an innovation, we provide a highly efficient method for calculating the entanglement entropy (EE) of the system and obtain results that agree very well with previous work.

Key words: neural-network quantum state, stochastic reconfiguration method, transverse-field Ising model, quantum phase transition

1 Introduction

In a general quantum many-body system, the dimension of the Hilbert space increases exponentially with the system size. Kohn called this “an Exponential Wall problem” in his Nobel Prize lecture.[1] This wall prevents physicists from extracting features and information from the system, and many efforts have been made to bypass it. The most productive and influential methods are the density matrix renormalization group (DMRG)[2] and quantum Monte Carlo (QMC).[3] To this day, however, no universally satisfactory method has been found. Each method has its advantages and disadvantages. For example, DMRG is highly efficient for one-dimensional systems but works less well in higher dimensions, while QMC suffers from the notorious sign problem.[4]

However, machine learning is known to be a powerful method for extracting rules and information from large data sources. In this method, a machine can “learn” from data, “acquire intelligence”, and then analyze newly input data and make decisions intelligently. It is therefore natural to expect that machine learning may also be used to solve problems arising in quantum many-body systems. It has been used in condensed matter physics, statistical physics, quantum chromodynamics, AdS/CFT, black hole physics and so on.[5−10]

With current computing power, the “Exponential Wall problem” cannot be solved through direct diagonalization of the Hamiltonian. In Ref.[11], by writing down a wave function containing enough adjustable parameters, Laughlin provided a successful explanation of the fractional quantum Hall effect. His approach bypasses the exact diagonalization of the Hamiltonian and represents a paradigm shift in the study of many-body systems. After his work, people realized that the direct construction of the wave function is of great value for solving many-body systems. That is, we formally write down a wave function that depends on enough parameters, then adjust the parameters to obtain the target wave function. In this way, the core of the many-body problem becomes dimension reduction and feature extraction. Among the many machine learning algorithms, the artificial neural network is an excellent one for this goal.[12]

In Ref.[13], Carleo and Troyer introduced a variational representation of quantum states for typical spin models in one and two dimensions, which can be considered a combination of Laughlin’s idea and neural networks. This neural-network quantum state (NQS) is a map from the spin configuration space to the complex numbers, i.e. to wave function values. In this framework, the neural-network parameters are adjusted so that, for each input spin configuration, the output number is proportional to the probability amplitude. In the current work, we use this NQS representation and machine learning method to reconstruct the ground state of the TFIM in both one and two dimensions, and calculate its key observables, especially the EE. For the SR method,[14−16] we provide an understanding based on the least action principle and information geometry.

The layout of the paper is as follows. This section covers history and motivation. The next section is a brief introduction to the neural-network quantum state and the TFIM. Section 3 discusses the SR method and its programming implementation. Section 4 presents our results for key observables of the TFIM ground state, calculated with the machine-learned NQS of Sec.3. Section 5 presents our method for calculating the EE of the TFIM ground state. The last section contains our summary and prospects for future work.

2 Neural-Network Quantum State and Transverse-Field Ising Model

The neural network that Carleo and Troyer proposed to describe spin-1/2 quantum systems has only two layers: a visible layer $s=(s_1,s_2,\ldots,s_N)$ corresponding to the real system, and a hidden layer $h=(h_1,h_2,\ldots,h_M)$ corresponding to an auxiliary structure. The lines connecting visible nodes and hidden nodes represent interactions between them, but there are no connecting lines within the visible layer or within the hidden layer. This type of neural network is called a Restricted Boltzmann Machine (RBM). Its schematic diagram is shown in Fig.1. In the following, we do not distinguish between neural network and RBM.

Fig.1 (Color online) Schematic diagram of the Restricted Boltzmann Machine. This is a two-layer structure: the visible layer is on the left and the auxiliary hidden layer is on the right. The dashed lines between nodes within the left layer and within the right layer do not imply interactions; they are drawn only to give a visual impression of a “layer”. The lines between the visible nodes and hidden nodes represent interactions.

The many-body wave function can be understood as a map from the lattice spin configuration space to the complex number field. Explicitly, this can be written as

where $s=\{s_i\}$ denotes the spin configuration and $W=\{a,b,w\}$ are the weight parameters of the neural network. Adjusting $W$ is equivalent to adjusting the rules of the map, and $h_i\in\{1,-1\}$ are the hidden variables. Since there are no interactions within the visible layer or within the hidden layer, the summation over hidden-layer spin configurations can be carried out explicitly. So the wave function can be written more simply as
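The step from the hidden-layer sum to the traced-out form can be checked numerically. The sketch below assumes the standard RBM ansatz of Carleo and Troyer, $\Psi(s;W)=\sum_{\{h\}}\exp\big(\sum_i a_i s_i+\sum_j b_j h_j+\sum_{ij}w_{ij}s_ih_j\big)$ with $h_j=\pm1$, which factorizes into $\Psi(s;W)=e^{\sum_i a_i s_i}\prod_j 2\cosh\big(b_j+\sum_i w_{ij}s_i\big)$. The function names and the random parameter values are illustrative, not taken from the paper.

```python
import itertools
import numpy as np

def psi_hidden_sum(s, a, b, w):
    """Amplitude from the explicit sum over all 2^M hidden configurations."""
    total = 0.0
    for h in itertools.product([1, -1], repeat=len(b)):
        h = np.array(h)
        total += np.exp(a @ s + b @ h + s @ w @ h)
    return total

def psi_traced(s, a, b, w):
    """Same amplitude after summing out the hidden layer analytically."""
    theta = b + s @ w                      # effective field on each hidden unit
    return np.exp(a @ s) * np.prod(2.0 * np.cosh(theta))

rng = np.random.default_rng(0)
N, M = 4, 3                                # visible and hidden unit counts
a = 0.1 * rng.normal(size=N)
b = 0.1 * rng.normal(size=M)
w = 0.1 * rng.normal(size=(N, M))          # w[i, j] couples s_i to h_j
s = np.array([1, -1, -1, 1])
print(psi_hidden_sum(s, a, b, w), psi_traced(s, a, b, w))
```

The explicit sum costs $2^M$ terms per configuration, while the traced-out product costs only $O(NM)$, which is why the second form is the one used in practice.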

Mathematically, this NQS representation can be traced back to the work of Kolmogorov and Arnold.[17−18] It is the Kolmogorov-Arnold representation theorem, as it is now called, that makes it possible to express complicated higher-dimensional functions as superpositions of lower-dimensional functions.[19]

This work focuses on the TFIM, whose Hamiltonian has the form
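For small chains, exact diagonalization gives a useful reference against which sampled results can be compared. The sketch below assumes the common convention $H=-J\sum_{\langle i,j\rangle}\sigma^z_i\sigma^z_j-h\sum_i\sigma^x_i$ on a periodic chain, which is consistent with the $z$-$z$ correlations and the $x$-magnetization discussed later; the sign and normalization used by the authors may differ, and the function names are hypothetical.

```python
import numpy as np

def tfim_chain(n, h, J=1.0):
    """Dense TFIM Hamiltonian on a periodic chain of n >= 3 sites (assumed convention)."""
    dim = 2 ** n
    H = np.zeros((dim, dim))
    for state in range(dim):
        # bit i of `state` set means spin i is up (+1), otherwise down (-1)
        spins = [1 if (state >> i) & 1 else -1 for i in range(n)]
        # diagonal Ising term; the bond (n-1, 0) closes the ring
        H[state, state] = -J * sum(spins[i] * spins[(i + 1) % n] for i in range(n))
        # sigma^x_i flips spin i, connecting `state` to `state ^ (1 << i)`
        for i in range(n):
            H[state ^ (1 << i), state] -= h
    return H

def ground_state(n, h, J=1.0):
    """Lowest eigenvalue and eigenvector of the dense Hamiltonian."""
    energies, vectors = np.linalg.eigh(tfim_chain(n, h, J))
    return energies[0], vectors[:, 0]

E0, psi0 = ground_state(8, 1.0)
print(E0 / 8)                              # per-site ground state energy at h = J = 1
```

The dense matrix has $2^n\times2^n$ entries, so this is only practical for roughly a dozen sites; that limitation is exactly the exponential wall discussed in the Introduction.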

3 Stochastic Reconfiguration Method for the Ground State

The SR method[14−15] was first proposed by Sorella and his collaborators in studies addressing the sign problem. It was then used as an optimization method for finding target functions within a general set of trial functions. It can be regarded as a variant of the steepest descent (SD) method. Considering its key value for the numerical calculation of neural-network quantum states, we provide here a new understanding of it based on the least action principle and information geometry. Information geometry dates back to Rao’s work in 1945.[23] In that work, Rao takes the Fisher information metric as the Riemannian metric of a statistical manifold and regards geodesic distances as the differences between distributions. The discipline matured after the work of Shun’ichi Amari and others.[24] In recent years, it has also attracted attention as a tool for understanding the emergence of gravitation and the AdS/CFT correspondence.[25−26]

The quantum state of our system is a function of the neural-network parameter set $\{W_k\}\equiv\{a_i,b_j,w_{ij}\}$. We start from a trial function $\Psi_T$, which is controlled by the initial parameters $\{W^0_k\}$. Consider a small variation of the parameters, $W_k=W^0_k+\delta W_k$. To first order, the new wave function becomes

Introduce a local operator $O_k$, so that

and set the identity operator $O_0=1$; then $\Psi'_T$ can be rewritten in a more compact form

Our goal is to find the ground state wave function, so that the expectation value of the energy $E=\langle\Psi|H|\Psi\rangle/\langle\Psi|\Psi\rangle$ is minimized. Obviously, $E$ depends on the parameters of the neural network, so looking for the ground state is equivalent to adjusting the network parameters. The key question is the strategy for updating the parameters from $\{W^0\}$ to $\{W\}$. This process resembles motion from an initial point to the target point (the ground state in our problem) in parameter space. The path connecting the initial point to the target point is determined by a “least action principle” in parameter space, as we show below.

In the SR method, the parameters are updated according to

where $s_{ik}$ is the metric of the parameter space, as will become clear from the following discussion. Our task here is to show that this strategy is required by a least action principle. For this purpose, we first introduce the generalized forces $f$

Then variations of the energy $E$ due to changes of $W$ can be written as

i.e.

Now if we define $s_{ik}\Delta W_i\Delta W_k\equiv\Delta s$ as the line element in the parameter space, then

In integral form, this is nothing but

where $S$ is the “action” of the iterative process of seeking the ground state of the system and $L$ is the corresponding “Lagrangian”. The path traced in parameter space as the parameters are updated is determined by the corresponding least action principle. This is the physical meaning of the SR method. The SD method is a special case of the SR method, in which the parameter-space metric is simply Cartesian,

However, in general we have no reason to take the parameter space to be so simple, so we have to introduce a metric such that

This is the reason why $s_{ik}$ appears in Eq.(7).

Obviously, the determination of $s_{ik}$ is the key to the question. On this point, the SR method tells us that

From the perspective of information geometry, this is very natural. Consider a general data distribution $p(x;\theta)$; the Fisher information matrix, or Riemannian metric on the statistical manifold, is defined as

In our neural-network quantum state, the probability reads (our wave function is restricted to be real)

Substituting this into Eq.(16), we obtain

This is exactly the result we wanted to show. The rationale behind this derivation is that, mathematically, a distribution function determined by its parameter set differs little from a quantum-state wave function determined by the corresponding neural-network parameters.
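The link between the SR matrix and the Fisher metric can be verified numerically on a tiny RBM. The sketch assumes a real, positive wave function, the log-derivatives $O_k=\partial\ln\Psi/\partial W_k$, and the standard covariance form $s_{ik}=\langle O_iO_k\rangle-\langle O_i\rangle\langle O_k\rangle$. With $p(s)=\Psi^2(s)/\sum_{s'}\Psi^2(s')$ one has $\partial_k\ln p=2(O_k-\langle O_k\rangle)$, so the Fisher matrix $\sum_s p\,\partial_i\ln p\,\partial_k\ln p$ equals $4s_{ik}$; the constant factor can be absorbed into the step size. All names and sizes below are illustrative.

```python
import itertools
import numpy as np

N, M = 3, 2
rng = np.random.default_rng(1)
params = 0.3 * rng.normal(size=N + M + N * M)        # flattened (a, b, w)
configs = np.array(list(itertools.product([1, -1], repeat=N)))

def unpack(p):
    return p[:N], p[N:N + M], p[N + M:].reshape(N, M)

def log_psi(p):
    """ln Psi(s) for every spin configuration (real, positive amplitudes)."""
    a, b, w = unpack(p)
    return configs @ a + np.sum(np.log(2.0 * np.cosh(b + configs @ w)), axis=1)

def log_derivs(p):
    """O_k(s) = d ln Psi / d W_k, one row per configuration."""
    a, b, w = unpack(p)
    t = np.tanh(b + configs @ w)                     # shape (2^N, M)
    o_w = (configs[:, :, None] * t[:, None, :]).reshape(len(configs), -1)
    return np.hstack([configs, t, o_w])              # columns ordered as (a, b, w)

def probabilities(p):
    weights = np.exp(2.0 * log_psi(p))
    return weights / weights.sum()

prob = probabilities(params)
O = log_derivs(params)
mean_O = prob @ O
S = (O * prob[:, None]).T @ O - np.outer(mean_O, mean_O)   # SR covariance matrix

# Fisher matrix from central finite differences of ln p(s; W)
eps = 1e-6
grad_log_p = np.empty_like(O)
for k in range(len(params)):
    step = np.zeros_like(params)
    step[k] = eps
    grad_log_p[:, k] = (np.log(probabilities(params + step))
                        - np.log(probabilities(params - step))) / (2 * eps)
fisher = (grad_log_p * prob[:, None]).T @ grad_log_p
print(np.max(np.abs(fisher - 4.0 * S)))
```

Expectation values are taken by exact enumeration over the $2^N$ configurations here, so the comparison is free of sampling noise.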

We now turn to the concrete numerical implementation of the ground-state search. The key idea is the iterative execution of Eq.(7), starting from an arbitrary point of the $W\equiv\{a,b,w\}$ parameter space. When the ground state is reached, the generalized force $f=-\partial E/\partial W$ tends to zero and the parameters become stable. Because of the exponential size of the Hilbert space, for arbitrarily chosen parameters $W$ we cannot determine the ground state by exhaustively listing all spin configurations. Instead, we use the Metropolis-Hastings algorithm to sample the important configurations. The detailed steps are as follows.

•Step 1: starting from arbitrary $a,b,w$, we construct $\Psi_T(s,W)$ and generate $N_s=10^3$ to $10^4$ spin-state samples $\{s\}$ through a Markov chain $s\to s'\to\cdots\to s^{(f)}$. The transition probability between two configurations $s$ and $s'$ is

•Step 2: for given $a,b,w$, calculate the corresponding $O_k$,

•Step 3: with $O_k$, we calculate $s_{ik}$ according to (14), where $\langle\cdots\rangle$ means averaging over the $N_s$ samples. We then obtain its inverse $s^{-1}_{ik}$ and update the parameters $a,b,w$ through Eq.(7).

•Step 4: repeat the above steps until the generalized force $f_k$ tends to zero and the parameters become stable under iteration; this gives the desired parameters for the ground state.
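The sampling in Step 1 can be sketched as a single-spin-flip Metropolis chain. The acceptance rule $\min\big(1,|\Psi(s')/\Psi(s)|^2\big)$ used here is the standard choice and is an assumption about the omitted formula; the function names, burn-in length and test parameters are likewise illustrative.

```python
import numpy as np

def log_psi(s, a, b, w):
    """ln of the traced-out RBM amplitude for one spin configuration."""
    return a @ s + np.sum(np.log(2.0 * np.cosh(b + s @ w)))

def metropolis_samples(a, b, w, n_samples, n_burn=200, seed=0):
    """Single-spin-flip Metropolis chain targeting |Psi(s)|^2."""
    rng = np.random.default_rng(seed)
    n = len(a)
    s = rng.choice([-1, 1], size=n)
    current = log_psi(s, a, b, w)
    samples = []
    for sweep in range(n_burn + n_samples):
        for _ in range(n):                      # one sweep = n flip attempts
            i = rng.integers(n)
            s[i] *= -1                          # propose flipping spin i
            proposed = log_psi(s, a, b, w)
            # accept with probability min(1, |Psi(s')/Psi(s)|^2)
            if rng.random() < np.exp(min(0.0, 2.0 * (proposed - current))):
                current = proposed
            else:
                s[i] *= -1                      # reject: undo the flip
        if sweep >= n_burn:                     # discard the burn-in sweeps
            samples.append(s.copy())
    return np.array(samples)

n_visible, n_hidden = 6, 6
a = np.full(n_visible, 2.0)                     # strong bias towards spin up
b = np.zeros(n_hidden)
w = np.zeros((n_visible, n_hidden))
samples = metropolis_samples(a, b, w, n_samples=500)
print(samples.shape, samples.mean())
```

With zero couplings and a large positive $a_i$, the target distribution factorizes and strongly favors spin up, which gives a simple check that the chain samples $|\Psi|^2$ rather than $\Psi$.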

Two points are noteworthy here:

i) In practical calculations $f_k$ takes the form

where $E_{\rm loc}(s)=\langle s|H|\Psi\rangle/\Psi(s)$ is the local energy in Variational Monte Carlo (VMC)[27] for each spin configuration.

ii) Symmetries of the model are used to reduce the number of parameters, as discussed in the supplementary materials of Carleo and Troyer’s paper.[28] In our models we impose periodic boundary conditions on the lattice, so translation symmetry is used in our calculation. Owing to this symmetry, the number of free components in $a_i$ is 0, in $b_j$ is $M/N$, and in $w_{ij}$ is $\alpha\times N=M/N\times N=M$ instead of $M\times N$, where $\alpha$ is the ratio of the number of hidden nodes to the number of visible nodes.
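The full iteration can be sketched on a four-site chain. This is a minimal illustration under several assumptions: the Hamiltonian convention $H=-J\sum_i\sigma^z_i\sigma^z_{i+1}-h\sum_i\sigma^x_i$, the real-wave-function force $f_k=-2\big(\langle E_{\rm loc}O_k\rangle-\langle E_{\rm loc}\rangle\langle O_k\rangle\big)$, an update $W\to W+\lambda\,s^{-1}f$ with a step size $\lambda$ (called `rate` below), and a small diagonal shift added to $s_{ik}$ for numerical stability. Averages are computed by exact enumeration instead of Metropolis samples so that the run is deterministic, and no translation symmetry is imposed.

```python
import itertools
import numpy as np

N, M, J, h = 4, 4, 1.0, 1.0
configs = np.array(list(itertools.product([1, -1], repeat=N)))
index = {tuple(c): k for k, c in enumerate(configs)}

def unpack(p):
    return p[:N], p[N:N + M], p[N + M:].reshape(N, M)

def log_psi(p):
    a, b, w = unpack(p)
    return configs @ a + np.sum(np.log(2.0 * np.cosh(b + configs @ w)), axis=1)

def log_derivs(p):
    """O_k(s) = d ln Psi / d W_k for every configuration."""
    a, b, w = unpack(p)
    t = np.tanh(b + configs @ w)
    o_w = (configs[:, :, None] * t[:, None, :]).reshape(len(configs), -1)
    return np.hstack([configs, t, o_w])

def local_energy(p):
    """E_loc(s) = sum_s' H_{s s'} Psi(s') / Psi(s) on a periodic chain."""
    lp = log_psi(p)
    e_loc = -J * np.sum(configs * np.roll(configs, -1, axis=1), axis=1)
    for i in range(N):
        flipped = configs.copy()
        flipped[:, i] *= -1                      # sigma^x_i connects s to s with spin i flipped
        partner = np.array([index[tuple(c)] for c in flipped])
        e_loc = e_loc - h * np.exp(lp[partner] - lp)
    return e_loc

def sr_step(p, rate=0.05, shift=0.1):
    """One update W <- W + rate * s^{-1} f, averaging over |Psi|^2 exactly."""
    prob = np.exp(2.0 * log_psi(p))
    prob /= prob.sum()
    O, e_loc = log_derivs(p), local_energy(p)
    mean_O, energy = prob @ O, prob @ e_loc
    S = (O * prob[:, None]).T @ O - np.outer(mean_O, mean_O)
    force = -2.0 * ((prob * e_loc) @ O - energy * mean_O)   # f_k = -dE/dW_k
    S += shift * np.eye(len(p))                             # regularization (assumption)
    return p + rate * np.linalg.solve(S, force), energy

rng = np.random.default_rng(0)
params = 0.01 * rng.normal(size=N + M + N * M)
energies = []
for _ in range(300):
    params, energy = sr_step(params)
    energies.append(energy)
print(energies[0] / N, energies[-1] / N)         # per-site energy before and after training
```

Replacing the exact averages with averages over Metropolis samples turns this into the stochastic procedure of Steps 1 to 4; the diagonal shift matters because $s_{ik}$ is singular whenever the parametrization is redundant, as it is here with 24 parameters for a 16-dimensional Hilbert space.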

The following are our numerical results for the TFIM on both one-dimensional chains and two-dimensional square lattices. Our numerical work can be divided into three parts:

i) Training of the ground state wave function.

ii) Measurement of the key observables other than the EE.

iii) Measurement of the EE.

In the one-dimensional model, we perform the ML and measurements for three different network parameters, $\alpha=1$, 2, and 4. Almost no advantage is observed for larger $\alpha$. For the observables other than the EE, our results are compared with the exact solutions of Ref.[22] and agree very well. For the EE, we compare our results with Ref.[29]; probably owing to finite-size effects, our results agree with the literature only qualitatively.

4 Key Observables of the TFIM Ground State

Our first set of observables is the per-site ground state energy $E/N$ of the TFIM for the one- and two-dimensional models, whose dependence on the transverse-field strength is illustrated in Fig.2.

Fig.2 (Color online) The ground state energy E/N of the TFIM on a one-dimensional 32-site spin chain (a) and a two-dimensional 10×10-site lattice (b) as functions of the external field strength. The red, green and blue points in the left panel are for network parameters α=1,2,4. They are artificially displaced from each other; otherwise they coincide almost exactly. The dashed line is the analytic result of Ref.[22]. The two-dimensional result is compared with the real-space renormalization group analysis of Ref.[31].

In the one-dimensional case, our numerical results fit Pfeuty’s exact result[22] very well. Many numerical studies of the 1D TFIM have been done; for example, the Monte Carlo method was used in Ref.[30]. In the two-dimensional model, our results agree with those from real-space renormalization group analysis.[31−32] In Ref.[31], the critical transverse field is 3.28; in Ref.[32], it is $h=3.4351$. From the figure, we easily see that the energy decreases as the field strength increases. This is because part of the energy arises from the interaction between the spins and the external magnetic field. Importantly, in the one-dimensional case we note that enlarging the network parameter $\alpha\equiv M/N$, i.e. the ratio of the number of hidden nodes to the number of visible nodes, has almost no effect on the value and precision of the per-site energy. However, the computation time grows at least linearly with $\alpha$. For this reason, we do not perform the 2D ML and measurements for $\alpha$ greater than 1.

Our second set of observables is the per-site magnetic moment and the corresponding susceptibility of the TFIM ground state

Focusing on the component along the external transverse field, the results are displayed in Fig.3. For the one-dimensional case, our results agree very well with the existing literature. From the figure, we see that the susceptibility has a singularity at $h=1$ (1D case) and $h\approx3$ (2D case), which corresponds to the quantum phase transition point as the external field strength varies. The inset of the figure seems to indicate that ML with larger $\alpha$ gives a curve that agrees better with the analytic result.
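The qualitative behavior of these two observables can be reproduced on a short chain by exact diagonalization. The sketch assumes the definitions $M_x=\frac1N\sum_i\langle\sigma^x_i\rangle$ and $\chi_x=\partial M_x/\partial h$, which may differ in normalization from the authors' formulas, and the Hamiltonian convention $H=-J\sum_i\sigma^z_i\sigma^z_{i+1}-h\sum_i\sigma^x_i$; the exact ground state stands in for the trained NQS.

```python
import numpy as np

def tfim_chain(n, h, J=1.0):
    """Dense TFIM Hamiltonian on a periodic chain (assumed convention)."""
    dim = 2 ** n
    H = np.zeros((dim, dim))
    for state in range(dim):
        spins = [1 if (state >> i) & 1 else -1 for i in range(n)]
        H[state, state] = -J * sum(spins[i] * spins[(i + 1) % n] for i in range(n))
        for i in range(n):
            H[state ^ (1 << i), state] -= h
    return H

def transverse_magnetization(n, h):
    """Per-site <sigma^x> in the exact ground state."""
    _, vectors = np.linalg.eigh(tfim_chain(n, h))
    psi = vectors[:, 0]
    total = 0.0
    for i in range(n):
        flipped = psi[np.arange(2 ** n) ^ (1 << i)]   # components of sigma^x_i |psi>
        total += psi @ flipped
    return total / n

fields = np.linspace(0.2, 2.0, 19)
m_x = np.array([transverse_magnetization(8, h) for h in fields])
chi_x = np.gradient(m_x, fields)       # finite-difference estimate of dM_x/dh
print(fields[np.argmax(chi_x)])        # field at which the susceptibility peaks
```

On eight sites the susceptibility shows a rounded peak rather than a singularity; the peak sharpens and moves towards the critical field only as the chain grows.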

Our third set of observables is the spin-$z$ correlation function $\langle\sigma^z_i\sigma^z_j\rangle$ and the corresponding correlation length $\xi_{zz}$, with the latter defined as $\xi_{zz}=$ in numeric implementations. Our results are displayed in Fig.4. From the figure, we easily see that the system exhibits long-range spin $z$-$z$ correlation at small transverse-field strength, while in the large-$h$ region the correlation function decreases quickly. The behavior of the correlation length $\xi_{zz}$ shows this more directly. In the 1D case, the jump in the correlation length occurs at $h\approx1$, while in the 2D case it occurs at $h\approx3$. Owing to the finite size of our lattice, $\xi_{zz}$ saturates in the small-$h$ region. This saturation will disappear in the thermodynamic limit, and the correlation length will diverge at the critical point.

Physically, the spin-spin interaction $J$ and the external field $h$ in the TFIM are two competing factors. The former tends to preserve order in the lattice, while the latter tends to destroy it. The quantum phase transition occurs at a critical value of $h/J$. In our numerics, we set $J=1$. The critical value $h=3$ in 2D is larger than the value $h=1$ in 1D because each site in 2D has twice as many spin-spin interaction bonds as in 1D.

Fig.3 (Color online) The per-site magnetic moment expectations 〈Mx〉 and susceptibilities χx in the one-dimensional (a) (c) and two-dimensional (b) (d) TFIM as functions of the external field strength. The 1D calculation is done for three different values of α, while the 2D calculation has fixed α=1. The dashed line in the upper-left panel is the analytic result for 〈Mx〉 from Ref.[22]. The inset shows in detail how the numerical results for the three values of α agree with the analytic result.

Fig.4 (Color online) The ground state spin-z correlation function $\langle\sigma^z_i\sigma^z_{i+x}\rangle$ (a) (b) and the corresponding correlation length (c) (d) of the TFIM. The left panels are for the 1D chain with 32 sites, while the right panels are for the 2D lattice of 10×10 sites.

5 EE of the Ground State

Entanglement, a fascinating and spooky quantum feature without a classical counterpart, is attracting more and more attention in many areas of physics.[33] It is believed that much important information about a quantum system can be obtained from its bipartite EE and entanglement spectrum.[34] More importantly, the area-law feature of the EE sheds new light on our understanding of holographic dualities, quantum gravity and spacetime emergence.[35−36] Nevertheless, there are very few ways to calculate this quantity efficiently. Our purpose in this section is to provide a new method for its calculation in both the one- and two-dimensional TFIM.

Let the total system be described by a density matrix $\rho$ and divide the system into two parts, $A$ and $B$. The EE between the two is defined as follows

where $\rho_A$ is the reduced density matrix of $A$, obtained by tracing out the degrees of freedom of part $B$. The behavior of the EE is regarded as a criterion for quantum phase transitions. Using conformal field theory methods, Calabrese and Cardy[29] calculated the EE of the infinitely long 1D Ising spin chain and showed that it tends asymptotically to the value log 2 for $h\to0$ and to 0 for $h\to\infty$. At the quantum critical point $h=1$, it diverges. In Ref.[37], Vidal et al. showed that for a spin chain in the noncritical regime the EE grows monotonically as a function of the subsystem size $L$ and saturates at a certain subsystem size $L_0$. At the critical value of $h$, it diverges logarithmically with the subsystem size $L$.

For a general state $|\Psi\rangle$ of a system consisting of two parts $A$ and $B$,

The matrix coefficient $c_{ij}$ is the probability amplitude of the configuration in which part $A$ is in the $i$-th spin configuration and part $B$ is in the $j$-th spin configuration. With the help of the Singular Value Decomposition (SVD)

where $U$, $V$, and $\Sigma$ are $d_A\times d_A$, $d_B\times d_B$, and $d_A\times d_B$ matrices respectively, we can diagonalize $c_{ij}$ into $\Sigma_{kk'}$ and rewrite $|\Psi\rangle$ in the form
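The SVD route to the entropy can be illustrated on two spins. The sketch assumes the von Neumann form $S_A=-\mathrm{Tr}(\rho_A\ln\rho_A)$, the expansion $|\Psi\rangle=\sum_{ij}c_{ij}|i\rangle_A|j\rangle_B$ and the decomposition $c=U\Sigma V^\dagger$, so that $S_A=-\sum_k\lambda_k^2\ln\lambda_k^2$ with $\lambda_k$ the singular values of the normalized coefficient matrix. The function name is illustrative.

```python
import numpy as np

def entanglement_entropy(psi, n_a, n_b):
    """Von Neumann entropy of part A from the SVD of the coefficient matrix c_ij."""
    c = np.asarray(psi).reshape(2 ** n_a, 2 ** n_b)   # row index: part A, column index: part B
    c = c / np.linalg.norm(c)                         # normalize the state
    singular = np.linalg.svd(c, compute_uv=False)
    p = singular ** 2                                 # eigenvalues of the reduced density matrix
    p = p[p > 1e-12]                                  # drop zeros before taking the logarithm
    return float(-np.sum(p * np.log(p)))

product = np.kron([1.0, 0.0], [0.0, 1.0])             # |up, down>, no entanglement
bell = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)    # (|up, up> + |down, down>) / sqrt(2)
print(entanglement_entropy(product, 1, 1), entanglement_entropy(bell, 1, 1))
```

A product state has a single nonzero singular value and therefore zero entropy, while the Bell state has two equal singular values and entropy $\ln2$, the small-$h$ value quoted above for the Ising chain.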

However, the size of $c_{ij}$ increases exponentially with the number of lattice sites. This exponential growth makes the SVD hard to perform. We therefore seek a reduced coefficient matrix that approximates the original $c_{ij}$ while preserving the key features of the system; we want to, and only can, include the important elements of $c_{ij}$. An approximation method that bypasses the exponential wall problem is needed here. The exposition and schematic diagram of our idea are as follows.

First, we write the general state of the lattice system with $N$ sites as a superposition of spin configurations in descending order of $|c_\ell|$,

Only the first $q$ configurations with the largest $|c_\ell|$ are produced by a Monte Carlo sampling algorithm and saved for subsequent computations. For the 1D TFIM with 32 sites, $q=10^4$ is enough. Here $|c_\ell|=\psi(s_\ell)$ is just the value of the NQS wave function obtained from ML. When we write the subscript of $c_\ell$ as the combination of the part-$A$ configuration $i$ and the part-$B$ configuration $j$, we get a very sparse matrix $c_{ij}\equiv c_\ell$. If we substitute this $c_{ij}$ into Eqs.(26)–(28), we get a very poor approximation of the EE. However, if we fill the blank positions of this $c_{ij}$ matrix with the NQS wave function values $\psi[s_{ij\equiv\ell}]$, our results become much better than those following from the original sparse matrix $c_{ij}$.
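One possible reading of this procedure is sketched below: keep the $q$ configurations with the largest $|c_\ell|$, collect the distinct part-$A$ and part-$B$ configurations that occur among them, fill every entry of the resulting reduced matrix with the wave function value, and apply the SVD. Several ingredients are assumptions made for illustration: the exact ground state of a ten-site chain stands in for the trained NQS, the dominant configurations are found by sorting the full vector rather than by Monte Carlo sampling, and the function names are hypothetical.

```python
import numpy as np

def tfim_ground_state(n, h, J=1.0):
    """Exact ground state of a periodic chain, used here in place of the NQS."""
    dim = 2 ** n
    H = np.zeros((dim, dim))
    for state in range(dim):
        spins = [1 if (state >> i) & 1 else -1 for i in range(n)]
        H[state, state] = -J * sum(spins[i] * spins[(i + 1) % n] for i in range(n))
        for i in range(n):
            H[state ^ (1 << i), state] -= h
    return np.linalg.eigh(H)[1][:, 0]

def entropy_from_matrix(c):
    """Von Neumann entropy from the singular values of a coefficient matrix."""
    p = np.linalg.svd(c / np.linalg.norm(c), compute_uv=False) ** 2
    p = p[p > 1e-12]
    return float(-np.sum(p * np.log(p)))

def reduced_entropy(psi, n, n_a, q):
    """Entropy from the filled sub-matrix spanned by the q dominant configurations."""
    n_b = n - n_a
    top = np.argsort(-np.abs(psi))[:q]            # indices of the q largest |c_l|
    rows = np.unique(top >> n_b)                  # part-A configurations that occur
    cols = np.unique(top & (2 ** n_b - 1))        # part-B configurations that occur
    # fill every (row, col) entry with the amplitude, not only the q selected ones
    filled = psi[(rows[:, None] << n_b) | cols[None, :]]
    return entropy_from_matrix(filled)

n, n_a = 10, 5
psi = tfim_ground_state(n, h=1.0)
exact = entropy_from_matrix(psi.reshape(2 ** n_a, 2 ** (n - n_a)))
approx = reduced_entropy(psi, n, n_a, q=100)
print(exact, approx)
```

The reduced matrix has at most $q\times q$ entries regardless of the system size, which is what makes the SVD affordable; each filled entry costs only one evaluation of the wave function.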

First, we show in Fig.5 the $h$-dependence of the EE when the system (both the 1D chain and the 2D lattice) is divided into two equal parts. For the 1D chain, three different network parameters $\alpha=1$, 2, and 4 are studied, and all of them yield equally good results for $S$, but the computation time increases at least linearly with $\alpha$. For this reason, we do not consider the effect of this parameter on the numerics for the 2D lattice. Our 1D numerical EE is compared with the analytic results of Refs.[29,37]. They have the same small-$h$ limit, approximately $S\to\log 2$ as $h\to0$, and similar decaying trends in the large-$h$ region. They also have the same quantum phase transition point $h\approx1$. For the 2D lattice, our EE indicates that the system may undergo a quantum phase transition at $h=3\sim5$. Combining this with the magnetic susceptibility and correlation length calculations of the previous section, we conclude that this transition occurs at $h\approx3$.

Fig.5 (Color online) The equal-size bipartite EE of the TFIM as a function of the external field strength. The 1D spin chain on the left has 32 sites and the 2D lattice on the right has 10×10 sites. For the 1D chain, three different network parameters α=1,2,4 (red, green and blue) are tried, but the results show little difference. The dashed gray line is the analytic result of Ref.[29].

Then in Fig.6 we study the dependence of the EE on the size of part $A$ when the spin chain is divided at an arbitrary position. From the figure we see that this dependence is symmetric under exchange of the sizes of $A$ and $B$. At the critical value of the transverse-field strength, the EE increases monotonically until the size of part $A$ reaches half the chain length. For transverse-field strengths much smaller or much larger than the critical value, the EE rises quickly to a saturation value before the size of part $A$ reaches half the size of the total system. These results agree with those of Vidal et al.[37]

For all known quantum many-body systems, the calculation of the EE is challenging. References [38] and [39] are two well-known works in this area. The former uses QMC with an improved ratio-trick sampling; its illustrative calculation is done for a 1D spin chain of 32 sites, and only the second Renyi entropy is calculated, with long running times. The latter uses a wave function obtained from RBM + ML and a replica trick; again only the second Renyi entropy of a 32-site 1D spin chain is calculated as an illustration. By comparison, our method can be used to calculate the von Neumann entropy of both the 1D and 2D TFIM directly. Our method adopts a new approximation within the SVD approach. We preserve the most important configurations of the system, which correspond to the important elements of the coefficient matrix, to represent the full wave function. The key to our method is the reduction of the coefficient matrix $c_{ij}$ and the filling of its blank positions with wave function values obtained from the RBM. In the 1D case, our results agree well with the known analytic results of CFT.[29,37] In the 2D case, our EE calculation yields quantum phase transition signals consistent with those from other observables.

Fig.6 (Color online) The EE of the TFIM as a function of the size of part A at some typical transverse-field strengths. The upper panel is for the 1D chain with 32 sites; the lower panel is for the 2D lattice with 10×10 sites. In the 2D lattice, the bipartition is along the 45° line of the lattice square.

6 Conclusion and Discussion

Following the artificial neural-network idea of Carleo and Troyer, we reconstruct the ground state wave function of the one- and two-dimensional TFIM through an unsupervised ML method. Based on the resulting wave function, we first calculate most of the key observables, including the ground state energy, correlation function, correlation length, magnetic moment and susceptibility of the system, and obtain results consistent with previous works. The stochastic reconfiguration method plays a key role in the ML of the neural-network quantum state representation. We provide in Sec.3 of this work an intuitive understanding of this method based on the least action principle and information geometry. As a key innovation, we provide a numerical algorithm for the calculation of the EE in this framework of neural networks and ML methods. With this algorithm, we calculate the entanglement entropies of the system in both one and two dimensions.

For almost all quantum many-body systems, calculating the EE is challenging. Neither DMRG nor QMC solves this problem satisfactorily. The former works well mainly in 1D models, while the latter has difficulty treating large lattice sizes. The method introduced here can be used to calculate the EE directly and applies to both 1D and 2D models. On our MacBook with a two-core 2.9 GHz CPU and 8 GB of RAM, all the illustrative calculations presented in this work take less than five days.

As prospects, we point out that further exploration and revision of our numerical algorithm, so that it gives clearer and more definite EE signals of the quantum phase transition in 2D lattices, and the use of our methods to study time-dependent processes in the spin-lattice model[40] are both valuable directions. On the other hand, exploring the NQS representation and its ML algorithms for other physical models, such as more general spin-lattice models and the Hubbard model, is obviously an interesting direction. For these models, more complicated neural networks, such as deep and convolutional ones, may be more powerful. Ref.[41] shows that deep neural networks have the potential to represent quantum many-body systems more effectively, while Ref.[42] shows that the combination of convolutional neural networks with QMC works even for systems exhibiting severe sign problems.