Soliton, breather, and rogue wave solutions for solving the nonlinear Schrödinger equation using a deep learning method with physical constraints

Chinese Physics B, 2021, Issue 6 (2021-06-26)

Jun-Cai Pu(蒲俊才), Jun Li(李军), and Yong Chen(陈勇)

1 School of Mathematical Sciences, Shanghai Key Laboratory of Pure Mathematics and Mathematical Practice, and Shanghai Key Laboratory of Trustworthy Computing, East China Normal University, Shanghai 200241, China

2 Shanghai Key Laboratory of Trustworthy Computing, East China Normal University, Shanghai 200062, China

3 College of Mathematics and Systems Science, Shandong University of Science and Technology, Qingdao 266590, China

4 Department of Physics, Zhejiang Normal University, Jinhua 321004, China

Keywords: deep learning method, neural network, soliton solutions, breather solution, rogue wave solutions

1. Introduction

In recent decades, increasing attention has been paid to nonlinear problems in fluid mechanics, condensed matter physics, optical fiber communication, plasma physics, and even biology.[1–4] Once nonlinear partial differential equations have been established to describe these phenomena, analyzing their analytical and numerical solutions reveals the essence of the phenomena.[5] The study of these nonlinear problems is therefore essentially the study of the nonlinear partial differential equations that describe them. Because many basic properties of linear differential equations do not carry over to nonlinear ones, nonlinear differential equations, to which the famous nonlinear Schrödinger equation belongs, are more difficult to solve than linear differential equations. It is well known that the Schrödinger equation describes the quantum behavior of microscopic particles in quantum mechanics.[6] Furthermore, various solutions of this equation describe nonlinear phenomena in other physical fields, such as optical fibers, plasmas, Bose–Einstein condensates, fluid mechanics, and Heisenberg ferromagnets.[7–16]

With the explosive growth of available data and computing resources, deep neural networks, i.e., deep learning,[17] have been applied in many areas, including image recognition, video surveillance, natural language processing, medical diagnostics, bioinformatics, and financial data analysis.[18–23] In scientific computing in particular, the neural network method[24–26] provides an ideal representation for the solution of differential equations[27] owing to its universal approximation property.[28] Recently, a physically constrained deep learning method called the physics-informed neural network (PINN)[29] and its improvement[30] have been proposed; they are particularly suitable for solving differential equations and the corresponding inverse problems. The PINN architecture has been found to yield remarkably accurate solutions from very little data. The method also provides a better physical explanation for the predicted solutions, because the underlying physical constraints are usually described explicitly by the differential equations. In this paper, a computationally efficient physics-informed data-driven algorithm for inferring solutions to more general nonlinear partial differential equations, such as the integrable nonlinear Schrödinger equation, is studied.

The study of exact solutions of integrable equations, which describe complex physical phenomena in the real world, has received increasing attention in plasma physics, optical fibers, fluid dynamics, and other fields.[13,31–35] The Hirota bilinear method, the symmetry reduction method, the Darboux transformation, the Bäcklund transformation, the inverse scattering method, and the function expansion method are powerful means of solving nonlinear integrable equations, and many other methods are based on them.[5,36–42] Although the computational cost of some direct numerical solutions of integrable equations is very high, with the revival of neural networks the development of more effective deep learning algorithms for obtaining data-driven solutions of nonlinear integrable equations has aroused great interest.[43–47] Li and Chen constructed abundant numerical solutions of second-order and third-order nonlinear integrable equations with different initial and boundary conditions by a deep learning method based on the PINN model.[43,46,47] Previous works mainly focused on relatively simple solutions (e.g., N-soliton solutions and kink solutions) of a given system or integrable equation. By comparison, results on constructing rogue waves by machine learning are rare. In Ref. [48], the bias function including two backward shock waves and soliton generation, and the generation of rogue waves, are studied using a single-wave-layer feed-forward neural network. As far as we know, the soliton solutions, breather solution, and rogue wave solutions[8,9] of the integrable nonlinear Schrödinger equation have not yet been obtained by a deep learning method based on PINN. Therefore, in this work we introduce the deep learning method with underlying physical constraints to construct the soliton solutions, breather solution, and rogue wave solutions of the integrable nonlinear Schrödinger equation.

This paper is organized as follows. In Section 2, we introduce the physically constrained deep learning method and briefly present the problem setup. In Section 3, the one-soliton and two-soliton solutions of the nonlinear Schrödinger equation are obtained by this approach, and the breather solution is derived and compared with the two-soliton solution. Section 4 provides the rogue wave solutions, comprising the one-order and two-order rogue waves of the nonlinear Schrödinger equation, and reports in detail the relative L2 errors of simulating the one-order rogue wave for different numbers of sampled initial points, sampled collocation points, network layers, and neurons per hidden layer. Conclusions are given in the last section.

2. Method

In this paper, we consider the (1+1)-dimensional nonlinear Schrödinger equation as follows:

Specifically, the complex-valued solution q(x,t) is written as q = u + iv, where u(x,t) and v(x,t) are real-valued functions of x and t, namely the real and imaginary parts of q(x,t), respectively. Then, Eq. (1) can be converted into

Accordingly, we define the physics-informed neural networks f_u(x,t) and f_v(x,t), respectively,

and the solution q(x,t) is trained to satisfy the networks (4) and (5), which are embedded into the mean-squared objective function (also called the loss function)
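For orientation, the formulas referred to as Eqs. (1)–(6) can be sketched as follows. The form of Eq. (1) is an assumption, chosen to be consistent with the parameter values quoted later (α = 1, β = 2 for the focusing case of Section 3; α = 1/2, β = 1 for the rogue wave case of Section 4). The symbols for the sample points and the equal weighting of the four loss terms are likewise assumptions.

```latex
\begin{align*}
  & i q_t + \alpha q_{xx} + \beta |q|^2 q = 0,
      && \text{(assumed form of Eq. (1))} \\
  % substitute q = u + i v:
  %   i(u_t + i v_t) + \alpha(u_{xx} + i v_{xx}) + \beta(u^2 + v^2)(u + i v) = 0
  & u_t + \alpha v_{xx} + \beta (u^2 + v^2)\, v = 0,
      && \text{(imaginary part)} \\
  & v_t - \alpha u_{xx} - \beta (u^2 + v^2)\, u = 0,
      && \text{(real part, multiplied by } -1\text{)} \\
  & f_u := u_t + \alpha v_{xx} + \beta (u^2 + v^2)\, v,
      && \text{(residual network)} \\
  & f_v := v_t - \alpha u_{xx} - \beta (u^2 + v^2)\, u,
      && \text{(residual network)} \\
  & \mathrm{Loss} = \mathrm{MSE}_u + \mathrm{MSE}_v
      + \mathrm{MSE}_{f_u} + \mathrm{MSE}_{f_v},
      && \text{(objective)} \\
  & \mathrm{MSE}_u = \frac{1}{N_q} \sum_{i=1}^{N_q}
      \bigl| u(x_q^i, t_q^i) - u^i \bigr|^2, \quad
    \mathrm{MSE}_{f_u} = \frac{1}{N_f} \sum_{j=1}^{N_f}
      \bigl| f_u(x_f^j, t_f^j) \bigr|^2,
\end{align*}
```

with MSE_v and MSE_{f_v} defined in the same way. The first two terms fit the N_q sampled initial-boundary data, and the last two penalize the equation residuals at the N_f collocation points.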

In this paper, we optimize all loss functions using the L-BFGS algorithm, a full-batch gradient-based optimization algorithm based on a quasi-Newton method.[52] In addition, we use relatively simple multilayer perceptrons (MLPs) with Xavier initialization and the hyperbolic tangent (tanh) activation function.[43] All codes in this article are based on Python 3.7 and TensorFlow 1.15, and all numerical examples reported here were run on a DELL Precision 7920 Tower computer with a 2.10 GHz 8-core Xeon Silver 4110 processor and 64 GB of memory.
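The network described above can be illustrated by a minimal NumPy sketch of a tanh MLP with Xavier initialization that maps (x, t) to the pair (u, v). This is not the authors' TensorFlow code: the function names, the Glorot-normal variant of the initialization, the zero biases, and the linear output layer are all assumptions made for illustration.

```python
import numpy as np

def xavier_init(rng, fan_in, fan_out):
    """Xavier (Glorot) normal initialization: std = sqrt(2 / (fan_in + fan_out))."""
    std = np.sqrt(2.0 / (fan_in + fan_out))
    return rng.normal(0.0, std, size=(fan_in, fan_out))

def init_mlp(layer_sizes, seed=0):
    """Return a list of (W, b) pairs for a fully connected network."""
    rng = np.random.default_rng(seed)
    return [(xavier_init(rng, m, n), np.zeros(n))
            for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]

def mlp_forward(params, x, t):
    """Map points (x, t) to (u, v): tanh on hidden layers, linear output layer."""
    h = np.stack([x, t], axis=1)
    for W, b in params[:-1]:
        h = np.tanh(h @ W + b)
    W, b = params[-1]
    out = h @ W + b
    return out[:, 0], out[:, 1]

# 2 inputs (x, t), 9 hidden layers of 40 neurons, 2 outputs (u, v)
layers = [2] + [40] * 9 + [2]
params = init_mlp(layers)

x = np.linspace(-5.0, 5.0, 7)
t = np.zeros_like(x)
u, v = mlp_forward(params, x, t)
q = u + 1j * v  # untrained network output, interpreted as q = u + i v
```

In a full PINN the derivatives of u and v with respect to x and t would be obtained by automatic differentiation and the parameters updated by L-BFGS; both steps are omitted here.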

3. Soliton solutions and breather solution of the nonlinear Schrödinger equation

The (1+1)-dimensional focusing nonlinear Schrödinger equation is a classical integrable field equation describing quantum mechanical systems, nonlinear wave propagation in optical fibers or waveguides, Bose–Einstein condensates, and plasma waves. In optics, the nonlinear term is generated by the intensity-dependent refractive index of a given material. Similarly, the nonlinear term for Bose–Einstein condensates results from the mean-field interactions of the interacting N-body system. We consider the focusing nonlinear Schrödinger equation along with Dirichlet boundary conditions given by

where q_0(x) is an arbitrary complex-valued function of the space variable x, x_0 and x_1 represent the lower and upper boundaries of x, respectively, and t_0 and t_1 represent the initial and terminal times of t, respectively. This equation corresponds to Eq. (1) with α = 1 and β = 2. Equation (11) is often used to describe the evolution of weakly nonlinear dispersive wave modulation. In view of the character of its solutions, it is called the "self-focusing" nonlinear Schrödinger equation. For water wave modulation, there is usually coupling between the modulation and the wave-induced current, so in some cases water wave modulation can also be described by the nonlinear Schrödinger equation.[2] The N-soliton solutions and breather solution of the above equation have been obtained by many different methods.[36,38,53] Here, we simulate the soliton solutions and breather solution using the physically constrained deep learning method and compare them with the known exact solutions, so as to demonstrate the effectiveness of neural networks in computing the numerical solutions q(x,t). Specifically, the N-soliton solution of the nonlinear Schrödinger equation has been derived by the Riemann–Hilbert method,[53] and the N-soliton solution takes the form

and R is an (N+1)×(N+1) matrix,

with θ_k = −iζ_k x − 2iζ_k^2 t (k = 1,...,N), where ζ_k and c_i (i = 1,...,N) are complex-valued constants. After choosing the positive integer N, one can obtain the corresponding N-soliton solutions and breather solution of the nonlinear Schrödinger Eq. (11).

3.1. One-soliton solution

In this subsection, we numerically construct the one-soliton solution of Eq. (11) using a neural network with 9 hidden layers and 40 neurons per hidden layer. When N = 1, we have

where ξ and η are the real and imaginary parts of ζ_1, respectively, and x_0 and σ_0 are real parameters. Then the above one-soliton solution (16) can be reduced to

One can obtain the exact one-soliton solution of the nonlinear Schrödinger Eq. (11) by substituting η = 1, ξ = 1, x_0 = 0, σ_0 = 1 into Eq. (17):

Then we take [x_0,x_1] and [t_0,t_1] in Eq. (11) as [−5.0,5.0] and [−0.5,0.5], respectively. The corresponding initial condition is obtained by evaluating Eq. (18) at the initial time:
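As a sanity check on this setup, the sketch below evaluates the standard one-soliton of the focusing equation and verifies numerically that it satisfies the equation. Two things are assumed rather than taken from the text: that Eq. (11) reads i q_t + q_xx + 2|q|^2 q = 0 (the case α = 1, β = 2), and that the one-soliton has the textbook form q = 2η sech[2η(x + 4ξt − x_0)] exp[−2iξx − 4i(ξ^2 − η^2)t + iσ_0]. The residual is estimated with central differences, whereas a PINN would use automatic differentiation.

```python
import numpy as np

def one_soliton(x, t, eta=1.0, xi=1.0, x0=0.0, sigma0=1.0):
    """Textbook one-soliton of i q_t + q_xx + 2|q|^2 q = 0 (assumed form)."""
    envelope = 2.0 * eta / np.cosh(2.0 * eta * (x + 4.0 * xi * t - x0))
    phase = -2.0 * xi * x - 4.0 * (xi**2 - eta**2) * t + sigma0
    return envelope * np.exp(1j * phase)

def nls_residual(q, x, t, alpha=1.0, beta=2.0, h=1e-3):
    """Central-difference estimate of i q_t + alpha q_xx + beta |q|^2 q."""
    q0 = q(x, t)
    q_t = (q(x, t + h) - q(x, t - h)) / (2.0 * h)
    q_xx = (q(x + h, t) - 2.0 * q0 + q(x - h, t)) / h**2
    return 1j * q_t + alpha * q_xx + beta * np.abs(q0)**2 * q0

x = np.linspace(-5.0, 5.0, 513)

# Initial profile at t0 = -0.5; the peak sits where x + 4 t = 0, i.e. near x = 2
q_init = one_soliton(x, -0.5)
x_peak = x[np.argmax(np.abs(q_init))]

# Residual of the assumed equation along the line t = 0.1
res = nls_residual(one_soliton, x, np.full_like(x, 0.1))
max_residual = np.max(np.abs(res))
```

With these parameters the envelope depends on x + 4t, so the soliton moves toward negative x as t increases.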

We employ traditional finite difference schemes on uniform grids in MATLAB to simulate Eq. (11) with the initial data (19) and thereby acquire the training data. Specifically, dividing the spatial interval [−5.0,5.0] into 513 points and the time interval [−0.5,0.5] into 401 points, the one-soliton solution q is discretized into 401 snapshots. We sub-sample a smaller training dataset that contains initial-boundary subsets by randomly extracting N_q = 100 points from the original initial-boundary data, together with N_f = 10000 collocation points generated by Latin hypercube sampling (LHS).[50] Given this dataset of initial and boundary points, the latent one-soliton solution q(x,t) is successfully learned by tuning all learnable parameters of the neural network and regulating the loss function (6). The model achieves a relative L2 error of 2.566069×10^−2 in about 726 seconds, with 8324 iterations.
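The sampling step can be sketched as follows. The hand-written Latin hypercube routine, the variable names, and the choice of which grid lines count as initial-boundary data (the line t = t_0 plus the two lines x = x_0 and x = x_1) are assumptions for illustration; the paper does not spell these out.

```python
import numpy as np

def latin_hypercube(n, bounds, rng):
    """n points in a box; each axis is split into n strata with one point per stratum."""
    bounds = np.asarray(bounds, dtype=float)  # shape (d, 2)
    d = bounds.shape[0]
    # One uniform draw inside each of the n strata, for every dimension
    u = (np.arange(n)[:, None] + rng.random((n, d))) / n
    # Shuffle the strata independently in every dimension
    for j in range(d):
        u[:, j] = u[rng.permutation(n), j]
    return bounds[:, 0] + u * (bounds[:, 1] - bounds[:, 0])

rng = np.random.default_rng(0)
x = np.linspace(-5.0, 5.0, 513)
t = np.linspace(-0.5, 0.5, 401)

# Initial line (t = t0) and the two boundary lines (x = x0, x = x1) as (x, t) pairs
initial = np.column_stack([x, np.full_like(x, t[0])])
left = np.column_stack([np.full_like(t, x[0]), t])
right = np.column_stack([np.full_like(t, x[-1]), t])
ib_points = np.vstack([initial, left, right])

N_q, N_f = 100, 10000
idx = rng.choice(ib_points.shape[0], size=N_q, replace=False)
train_ib = ib_points[idx]  # N_q randomly chosen initial-boundary points
collocation = latin_hypercube(N_f, [(-5.0, 5.0), (-0.5, 0.5)], rng)
```

The solution values attached to `train_ib` would come from the finite difference data; only the point locations are generated here.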

Figure 1 shows the density diagrams, the profiles of the latent one-soliton solution q(x,t) at different instants, the error diagram of the difference between the exact and the learned one-soliton solutions, and the loss curve. Panel (a) of Fig. 1 compares the exact solution with the predicted spatiotemporal solution. Together with panel (b), it shows that the error between the numerical solution and the exact solution is very small. The bottom part of panel (a) compares the exact and predicted solutions at the time instants t = −0.25, 0, 0.25. As time t increases, the one-soliton solution propagates along the negative direction of the x axis. The three-dimensional motion of the predicted solution and the loss curve over the iterations are given in panels (c) and (d) of Fig. 1. The loss curve is very smooth, which supports the effectiveness and stability of the integrable deep learning method.

Fig. 1. The one-soliton solution q(x,t): (a) the density diagrams and profiles at three different instants; (b) the error density diagram; (c) the three-dimensional motion; (d) the loss curve.

3.2. Two-soliton solution and breather solution

Now, we numerically construct the two-soliton solution and the breather solution of Eq. (11) using a neural network architecture with 9 hidden layers and 80 neurons per hidden layer. When N = 2, the solution (12) can also be written out explicitly. We have

where

with θ_1 = −iζ_1 x − 2iζ_1^2 t, θ_1^* = iζ_1^* x + 2i(ζ_1^*)^2 t, θ_2 = −iζ_2 x − 2iζ_2^2 t, θ_2^* = iζ_2^* x + 2i(ζ_2^*)^2 t, where ζ_j (j = 1,2) are complex-valued constants, so one can derive the general form of the two-soliton solution as follows:

According to the relationship between the two-soliton solution and the breather solution, when Re(ζ_1) ≠ Re(ζ_2) the solution q(x,t) is a two-soliton solution, and when Re(ζ_1) = Re(ζ_2) the solution q(x,t) degenerates into a bound state, which is also called the breather solution. Given appropriate parameters

we can obtain the exact two-soliton solution from formula (20)

where

On the other hand, given other appropriate parameters

one can obtain the exact breather solution

where

Now we take [x_0,x_1] and [t_0,t_1] in Eq. (11) as [−5.0,5.0] and [−3.0,3.0], respectively. For instance, we consider the initial condition of the two-soliton solution based on Eq. (22)

where

Similarly, the initial condition of the breather solution is given by

where

With the same data generation and sampling method as in Subsection 3.1, we numerically simulate the two-soliton solution and the breather solution of the nonlinear Schrödinger Eq. (11) using the physically constrained deep learning method described above. After training on the two-soliton solution, the neural network achieves a relative L2 error of 5.500792×10^−2 in about 2565 seconds, with 17789 iterations. By contrast, the network model for learning the breather solution achieves a relative L2 error of 9.689267×10^−3 in about 1934 seconds, with 13488 iterations. Since the breather solution is a special form of the two-soliton solution and its structure is accordingly simpler, training on the breather solution takes about a quarter less time (1934 versus 2565 seconds), the relative error is more than five times smaller, and the result is visibly better than that of the two-soliton solution, as Figs. 2 and 3 show.
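The error measure quoted above can be sketched as below. The text does not define it explicitly, so the formula used here (the 2-norm of the difference between the predicted and exact complex solutions over all grid points, divided by the 2-norm of the exact solution) is an assumption, and the test field is synthetic rather than a solution of Eq. (11).

```python
import numpy as np

def relative_l2_error(q_pred, q_exact):
    """||q_pred - q_exact||_2 / ||q_exact||_2 taken over all grid points."""
    q_pred = np.asarray(q_pred)
    q_exact = np.asarray(q_exact)
    return np.linalg.norm(q_pred - q_exact) / np.linalg.norm(q_exact)

# Synthetic check on a 513 x 401 grid: a uniform 1% amplitude error
x = np.linspace(-5.0, 5.0, 513)
t = np.linspace(-3.0, 3.0, 401)
X, T = np.meshgrid(x, t)
q_exact = np.exp(-X**2) * np.exp(1j * T)  # placeholder field, not a soliton
q_pred = 1.01 * q_exact
err = relative_l2_error(q_pred, q_exact)
```

On this scale, the reported values of about 5.5×10^−2 and 9.7×10^−3 correspond to norm-wise deviations of roughly 5.5% and 1%, respectively.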

Figures 2 and 3 show the density diagrams, the profiles at different instants, and the error density diagrams of the two-soliton solution and the breather solution, respectively. From the bottom part of panel (a) in Fig. 2, we can clearly see that the collision of two solitary waves with different widths and amplitudes produces a peak whose amplitude is higher than that of either wave, which satisfies the law of conservation of energy. The profiles at t = −1.50, 0, 1.50 show that the amplitude is largest at t = 0. From soliton theory, we know that the two solitary waves collide elastically. Similarly, the breather solution shown in panel (a) of Fig. 3 is a special bound-state two-soliton solution formed by two solitary waves with the same wave velocity, wave width, and amplitude, and it is periodic in time t. Panel (b) of Figs. 2 and 3 shows the error dynamics of the difference between the exact and predicted solutions for the two-soliton solution and the breather solution, respectively. Figure 4 shows the corresponding three-dimensional motion of the two-soliton solution and the breather solution. It is evident that the breather solution is more symmetric than the general two-soliton solution.

Fig. 2. The two-soliton solution q(x,t): (a) the density diagrams and the profiles at different moments; (b) the error density diagram.

Fig. 3. The breather solution q(x,t): (a) the density diagrams and the profiles at different moments; (b) the error density diagram.

Fig. 4. The three-dimensional motion of q(x,t): (a) the two-soliton solution; (b) the breather solution.

For the numerical simulation of the three-soliton solution, we only need to take N = 3 in Eqs. (12)–(15) to obtain the exact three-soliton solution, then discretize its initial and boundary value data as our original dataset and train the network to simulate the corresponding three-soliton solution numerically. N-soliton solutions can be learned by the same approach. Of course, the higher the order of the soliton solution, the more complex its form and the longer the network takes to train.

4. Rogue wave solutions of the nonlinear Schrödinger equation

Recently, research on rogue waves has been one of the hot topics in many areas, such as optics, ocean dynamics, plasmas, Bose–Einstein condensates, and even finance.[8,9,54–56] In addition to a peak amplitude more than twice that of the background wave, rogue waves are also characterized by instability and unpredictability. Therefore, the study and application of rogue waves play an important role in real life, especially in avoiding the damage to ships caused by ocean rogue waves. As a one-dimensional integrable scalar equation, the nonlinear Schrödinger equation plays a key role in describing rogue waves. In 1983, Peregrine[2] first gave a rational rogue wave solution of the nonlinear Schrödinger equation, whose generation mechanism is identified as the limit of breather waves as the period tends to infinity. At present, studies of rogue waves of this equation by data-driven methods such as machine learning are relatively few. Marcucci et al.[48] studied a computational machine in which nonlinear waves replace the internal layers of neural networks, discussed learning conditions, and demonstrated function interpolation, datasets, and Boolean operations. When considering the solitons, rogue waves, and shock waves of the nonlinear Schrödinger equation, highly nonlinear and even discontinuous regions play a leading role in the network training and solution calculation. In this section, we construct the rogue wave solutions of the nonlinear Schrödinger equation by a neural network with underlying physical constraints. Here, we consider another form of the focusing nonlinear Schrödinger equation along with Dirichlet boundary conditions given by

where q_0(x) is an arbitrary complex-valued function of the space variable x, x_0 and x_1 represent the lower and upper boundaries of x, respectively, and t_0 and t_1 represent the initial and terminal times of t, respectively. This equation corresponds to Eq. (1) with α = 1/2 and β = 1. The rogue wave solutions of Eq. (27) can be obtained by many different tools.[11] We can therefore write the one-order and two-order rogue waves of Eq. (27), respectively, as follows:
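For reference, the first-order rational (Peregrine) solution can be checked numerically as sketched below. The sketch assumes that Eq. (27) reads i q_t + (1/2) q_xx + |q|^2 q = 0 and uses the commonly quoted form q = [1 − 4(1 + 2it)/(4x^2 + 4t^2 + 1)] e^{it}; the gauge and sign conventions of the paper's Eq. (28) may differ. Derivatives are approximated by central differences.

```python
import numpy as np

def peregrine(x, t):
    """First-order rational solution of i q_t + 0.5 q_xx + |q|^2 q = 0 (assumed gauge)."""
    rational = 1.0 - 4.0 * (1.0 + 2.0j * t) / (4.0 * x**2 + 4.0 * t**2 + 1.0)
    return rational * np.exp(1j * t)

def nls_residual(q, x, t, alpha=0.5, beta=1.0, h=1e-3):
    """Central-difference estimate of i q_t + alpha q_xx + beta |q|^2 q."""
    q0 = q(x, t)
    q_t = (q(x, t + h) - q(x, t - h)) / (2.0 * h)
    q_xx = (q(x + h, t) - 2.0 * q0 + q(x - h, t)) / h**2
    return 1j * q_t + alpha * q_xx + beta * np.abs(q0)**2 * q0

x = np.linspace(-2.0, 2.0, 513)
res = nls_residual(peregrine, x, np.full_like(x, 0.3))
max_residual = np.max(np.abs(res))

peak = abs(peregrine(0.0, 0.0))         # amplitude at the centre of the rogue wave
background = abs(peregrine(50.0, 0.0))  # amplitude far from the centre
```

In this form the peak amplitude is three times the unit background, in line with the "more than twice the background" criterion mentioned above.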

In the following two parts, we construct the training datasets from the above two rogue wave solutions and reconstruct the predicted solutions using a neural network with 9 hidden layers and 40 neurons per hidden layer.

4.1. One-order rogue wave

In this subsection, we numerically recover the one-order rogue wave of the nonlinear Schrödinger equation using the neural network method above. We take [x_0,x_1] and [t_0,t_1] in Eq. (27) as [−2.0,2.0] and [−1.5,1.5], respectively. The corresponding initial condition, obtained from Eq. (28), is

Next, we obtain the initial and boundary value dataset by the same data discretization method as in Subsection 3.1 and then simulate the one-order rogue wave solution by feeding the data into the network. By randomly subsampling N_q = 100 points from the original dataset and selecting N_f = 10000 collocation points generated by LHS, a training dataset composed of initial-boundary data and collocation points is generated. After training, the neural network model achieves a relative L2 error of 7.845201×10^−3 in about 871 seconds, with 9584 iterations.

Our experimental results are summarized in Fig. 5, which shows the density diagrams, profiles at different instants, error dynamics diagram, three-dimensional motion, and loss curve of the simulated one-order rogue wave q(x,t). Specifically, the magnitude of the predicted spatiotemporal solution |q(x,t)| is shown in the top part of panel (a) of Fig. 5. The bottom part of Fig. 5(a), which compares the exact and predicted solutions at the time instants t = −0.75, 0, 0.75, shows that the amplitude of the rogue wave solution changes greatly in a very short time. Figure 5(b) reveals that the relative L2 error becomes larger as time increases. From Fig. 5(d), we can observe that once the number of iterations exceeds 2000, the loss exhibits obvious fluctuations, which we call "burrs"; these do not appear during the training of the one-soliton solution of the nonlinear Schrödinger equation. With only a handful of initial-boundary data, one can accurately capture the intricate nonlinear dynamical behavior of the integrable Schrödinger equation by this method.

Fig. 5. The one-order rogue wave solution q(x,t): (a) the density diagram and profiles at three different instants; (b) the error density diagram; (c) the three-dimensional motion; (d) the loss curve.

In addition, based on the same initial and boundary values of the one-order rogue wave with N_q = 100 and N_f = 10000, we employ the control variable method, often used in applied sciences, to study the effects of the number of network layers and the number of neurons per hidden layer on the one-order rogue wave dynamics of the nonlinear Schrödinger equation. The relative L2 errors for different numbers of network layers and neurons per hidden layer are given in Table 1. The data in Table 1 show that, when the number of network layers is fixed, the relative error generally decreases as the number of neurons per layer increases. Owing to randomness, a few cases do not follow this trend. However, when the number of neurons per layer is fixed, the influence of the number of network layers on the relative error is not obvious. To sum up, for the same training dataset, the number of network layers and the number of neurons per layer jointly determine the relative L2 error to some extent. Table 2 shows the relative L2 error with 9 network layers and 40 neurons per hidden layer for different numbers of subsampled initial-boundary points N_q and collocation points N_f. From Table 2, we can see that the influence of N_q on the relative L2 error of the network is not obvious, which also indicates that the network model with physical constraints can recover accurate predicted solutions from a small amount of initial-boundary data and a relatively large number of sampled collocation points.
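To give a sense of how network size changes across such a sweep, the sketch below counts the trainable parameters of a fully connected network as a function of depth and width. It assumes two inputs (x, t), two outputs (u, v), and that "network layers" means hidden layers; the grid of depths and widths in the loop is hypothetical and is not the grid of Table 1.

```python
def count_parameters(hidden_layers, neurons, n_in=2, n_out=2):
    """Weights plus biases of an MLP: n_in -> (neurons x hidden_layers) -> n_out."""
    sizes = [n_in] + [neurons] * hidden_layers + [n_out]
    return sum(m * n + n for m, n in zip(sizes[:-1], sizes[1:]))

# Hypothetical sweep: vary one quantity while holding the other fixed
sweep = {}
for layers in (5, 7, 9):
    for width in (20, 40, 60):
        sweep[(layers, width)] = count_parameters(layers, width)

n_params_9x40 = count_parameters(9, 40)  # architecture used in Sections 3.1 and 4
n_params_9x80 = count_parameters(9, 80)  # architecture used in Section 3.2
```

Because the hidden-to-hidden weight matrices dominate, the parameter count grows roughly quadratically with width but only linearly with depth.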

Table 1. One-order rogue wave of the nonlinear Schrödinger equation: relative final prediction error in the L2 norm for different numbers of network layers and neurons per hidden layer.

Table 2. One-order rogue wave of the nonlinear Schrödinger equation: relative final prediction error in the L2 norm for different numbers of N_q and N_f.

4.2. Two-order rogue wave

In the next example, we consider the two-order rogue wave of the nonlinear Schrödinger equation and take [x_0,x_1] and [t_0,t_1] in Eq. (27) as [−2.0,2.0] and [−0.5,0.5], respectively. Here we consider the corresponding initial condition from Eq. (29) as follows:

Fig. 6. The two-order rogue wave solution q(x,t): (a) the density diagrams and the snapshots at three different instants; (b) the error density diagram; (c) the three-dimensional motion; (d) the loss curve.

We use the same data discretization method as in Subsection 3.1 to collect the initial and boundary data. For training, N_q = 100 initial and boundary points are randomly subsampled from the original initial-boundary data, and N_f = 10000 collocation points are sampled by LHS. Finally, the hidden two-order rogue wave solution of the nonlinear Schrödinger equation is approximated fairly accurately by constraining the loss function with the underlying physical laws. The neural network model achieves a relative L2 error of 1.665401×10^−2 in about 1090 seconds, with 11450 iterations.

The detailed illustration is shown in Fig. 6. The top part of Fig. 6(a) gives the density map of the hidden solution q(x,t), and combining Fig. 6(b) with the bottom part of Fig. 6(a), we can see that the relative error is relatively large at t = 0.25. From Fig. 6(d), in contrast with the one-order rogue wave solution, the fluctuation (burr phenomenon) of the loss function is obvious when the number of iterations is less than 3000.

5. Summary and discussion

In this paper, we introduced a physically constrained deep learning method based on PINN to solve the classical integrable nonlinear Schrödinger equation. Compared with traditional numerical methods, it is not restricted by mesh size. Moreover, owing to the physical constraints, the network can be trained with very little data and has better physical interpretability. This method showcases a series of results for various problems in the interdisciplinary field of applied mathematics and computational science, and it opens a new path for using deep learning to simulate unknown solutions and correspondingly discover parametric equations in scientific computing.

Specifically, we apply the data-driven algorithm to obtain the soliton solutions, breather solution, and rogue wave solutions of the nonlinear Schrödinger equation. We outline how different types of solutions (such as general soliton solutions, the breather solution, and rogue wave solutions) are generated by different choices of initial and boundary value data. These results show that the deep learning method with physical constraints can accurately recover different dynamical behaviors of this integrable equation. Furthermore, the ranges of the space variable x and the time variable t are selected according to the dynamical behaviors of these solutions. For the breathers in particular, the wider the interval of the time variable t, the better we can see the dynamical behavior. However, with a wider time interval, the training effect is not very good. More complex boundary conditions, such as Neumann boundary conditions, Robin boundary conditions, or other mixed boundary conditions, may therefore be considered. Similarly, for the integrable complex modified Korteweg–de Vries (mKdV) equation, the Dirichlet boundary conditions cannot recover the ideal rogue wave solutions.

The influence of noise on our neural network model is not addressed in this paper. Such physical factors in real life should be considered to demonstrate the network's robustness. Compared with static LHS sampling with even mesh sizes, more adaptive sampling techniques should be considered for some special problems, for example, discontinuous fluid flows such as shock waves. In addition, more general nonlinear Schrödinger equations, such as the derivative Schrödinger equation, are not investigated in this work. These new problems and improvements will be considered in future research.