
Analysis of influence of observation operator on sequential data assimilation through soil temperature simulation with common land model

2018-11-15 Xiaolei Fu, Zhongbo Yu, Yongjian Ding, Ying Tang, Haishen Lü, Xiaolei Jiang, Qin Ju

Water Science and Engineering, 2018, Issue 3

Xiao-lei Fu, Zhong-bo Yu*, Yong-jian Ding, Ying Tang, Hai-shen Lü, Xiao-lei Jiang, Qin Ju

a College of Civil Engineering, Fuzhou University, Fuzhou 350116, China

b State Key Laboratory of Hydrology-Water Resources and Hydraulic Engineering, Hohai University, Nanjing 210098, China

c State Key Laboratory of Cryospheric Science, Northwest Institute of Eco-Environment and Resources, Chinese Academy of Sciences, Lanzhou 730000, China

d Department of Geography, Environment, and Spatial Sciences, Michigan State University, East Lansing, MI 48824, USA

Abstract An observation operator is a bridge linking the system state vector and observations in a data assimilation system.Despite its importance,the degree to which an observation operator influences the performance of data assimilation methods is still poorly understood.This study aimed to analyze the influences of linear and nonlinear observation operators on the sequential data assimilation through soil temperature simulation using the unscented particle filter(UPF)and the common land model.The linear observation operator between unprocessed simulations and observations was first established.To improve the correlation between simulations and observations,both were processed based on a series of equations.This processing essentially resulted in a nonlinear observation operator.The linear and nonlinear observation operators were then used along with the UPF in three assimilation experiments:an hourly in situ soil surface temperature assimilation,a daily in situ soil surface temperature assimilation,and a moderate resolution imaging spectroradiometer(MODIS)land surface temperature(LST)assimilation.The results show that the filter improved the soil temperature simulation significantly with the linear and nonlinear observation operators.The nonlinear observation operator improved the UPF's performance more significantly for the hourly and daily in situ observation assimilations than the linear observation operator did,while the situation was opposite for the MODIS LST assimilation.Because of the high assimilation frequency and data quality,the simulation accuracy was significantly improved in all soil layers for hourly in situ soil surface temperature assimilation,while the significant improvements of the simulation accuracy were limited to the lower soil layers for the assimilation experiments with low assimilation frequency or low data quality.

©2018 Hohai University.Production and hosting by Elsevier B.V.This is an open access article under the CC BY-NC-ND license(http://creativecommons.org/licenses/by-nc-nd/4.0/).

Keywords: Observation operator; Unscented particle filter (UPF); Soil temperature; MODIS LST; Data assimilation

1.Introduction

Soil temperature plays a key role in land surface processes, and has a large impact on energy partitioning and the land-atmosphere water cycle (Huang et al., 2008; Yu et al., 2014a). Currently, it is difficult to obtain large-scale soil temperature measurements in three spatial dimensions with high accuracy. Although in situ observations generally have high accuracy, they are limited to sparse observation sites (Owe and de Jeu, 2003; Yu et al., 2014b). Land surface models can predict the three-dimensional soil temperature distribution for regional applications, but the accuracy of their predictions is not high (Mihalakakou, 2002; Chau et al., 2005; Wu et al., 2007). Remote sensing provides data on a large scale, but only for the surface and with relatively low accuracy (Diak and Whipple, 1993; Wan and Dozier, 1996; Njoku and Li, 1999; Owe and de Jeu, 2003). More recently, data assimilation has been increasingly used as an effective way to improve the soil temperature prediction accuracy of land surface models by assimilating in situ observations or remote sensing data (Huang et al., 2008; Yu et al., 2014b).

Sequential data assimilation,an important type of data assimilation,has been developed for several decades for modeling land-atmosphere processes(McLaughlin,2002;Kumar et al.,2008;Reichle,2008).The Kalman filter(Kalman,1960)is the most commonly known algorithm for solving linear problems.In order to account for nonlinearity,the extended Kalman filter(EKF),which uses the first-order term of the Taylor series expansion,was proposed(Miller et al.,1999;Kumar and Kaleita,2003;Han and Li,2008;Lü et al.,2010).Subsequently,computationally efficient forms of EKF were introduced,such as the singular evolutive extended Kalman filter(SEEKF)(Pham et al.,1998)and the reduced rank square root Kalman filter(RRSRKF)(Verlaan, 1998;Verlaan and Heemink,2001).To overcome the difficulties in dealing with nonlinear problems,the ensemble Kalman filter(EnKF),based on the particle ensemble,was proposed and applied efficiently.However,one limitation of the EnKF is its assumption of a Gaussian distribution(Evensen,1994,1997;Burgers et al.,1998;Bengtsson et al.,2003;Fu et al.,2014).The unscented particle filter(UPF)(van der Merwe,2004;Han and Li,2008)was introduced by combining the unscented Kalman filter(UKF)(Julier and Uhlmann,2004;Han and Li,2008)and the particle filter(PF)(Han and Li,2008).The advantage of the UPF is that particles selected are not given a fixed probability distribution,which is a feature of the UKF,and different weights are set to different particles according to their contributions to system state simulation,as in the PF.

To date,the development of data assimilation methods has been focusing on improving the simulation/prediction accuracy by changing the assimilation mechanism.Despite the importance of the observation operator as the link between observations and the system state vector,the influence of the observation operator on the performance of data assimilation methods is seldom discussed.In the data assimilation process,different errors are introduced by different observation operators(e.g.,the linear/nonlinear function and community microwave emission model),which may lead to different data assimilation performances.Accordingly,further insight on observation operators in data assimilation is needed to improve the simulation accuracy in assimilation of in situ observations or remote sensing data.

With this background,this study aimed to assess the effect of the observation operator on the performance of data assimilation methods.The UPF was used in three assimilation experiments on soil temperature simulations with the common land model(CLM)(Oleson et al.,2004;Dai et al.,2003)in the Walnut Gulch Experimental Watershed(WGEW),in the United States.A linear observation operator was used for the original simulations and observations,and a nonlinear one was used for the processed data.

2.Study area and data

The WGEW, located in southeastern Arizona, with an area of approximately 149 km² (Keefer et al., 2008), was selected as the study area (Fig. 1). A large amount of in situ and remote sensing data was available in the watershed. The data used in the study included meteorological forcing data and soil hydrological properties (e.g., air temperature, wind speed, net radiation, soil moisture, soil temperature, and soil heat flux), and moderate resolution imaging spectroradiometer (MODIS) land surface temperature (LST). Other parameters and variables (e.g., leaf area index and day length) required for the study were obtained from the Agricultural Research Service (ARS) of the United States Department of Agriculture (USDA) (http://www.tucson.ars.ag.gov/dap/). Detailed information about the study area, including its climate and other geographical information, can be found in Keefer et al. (2008). Hourly soil temperature observations and MODIS LST at the Lucky Hills meteorological (LHMet) site in the WGEW were used to analyze the influence of the observation operator on data assimilation.

3.Methodology

To achieve the objective of this study,the following three experiments with the linear and nonlinear observation operators in a real-world application of soil temperature simulation were performed:(1)assimilating the hourly in situ soil surface temperature,(2)assimilating the daily in situ soil surface temperature,and(3)assimilating the MODIS LST taken once per day.

3.1.Model operator

Fig. 1. Study area and location of LHMet site.

In this study, the CLM was selected as the model operator. The model treats surface processes in a simplified way while producing the essential land-atmosphere characteristics for climate or water predictions (Oleson et al., 2004; Dai et al., 2003). The soil column was discretized into five layers in this study. The heat diffusion equation for soil temperature in the soil column is as follows:

$$c\frac{\partial T}{\partial t} = -\frac{\partial F}{\partial z} \qquad (1)$$

$$F = -\lambda\frac{\partial T}{\partial z} \qquad (2)$$

where c is the volumetric soil heat capacity (J·m⁻³·K⁻¹), T is the soil temperature (K), t is time (h), F is the heat flux (W·m⁻²), λ is the thermal conductivity (W·m⁻¹·K⁻¹), and z is the depth (m). Discretization of the soil heat diffusion equation and other details can be found in Oleson et al. (2004) and Dai et al. (2003).
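The heat diffusion equation above can be illustrated with a minimal finite-difference sketch. It is not the CLM discretization (for that, see Oleson et al. (2004)); it is an explicit forward-Euler step under assumptions made here for illustration: heat flux is prescribed at the top and bottom of the column, z and F are taken positive downward, and time is in seconds so that W·m⁻² and J·m⁻³·K⁻¹ combine directly. The function name `step_soil_temperature` and all parameter values are hypothetical.

```python
import numpy as np

def step_soil_temperature(T, dz, c, lam, dt, F_top, F_bottom=0.0):
    """Advance layer temperatures T (K) by one explicit time step dt (s).

    dz  : layer thicknesses (m), length n
    c   : volumetric heat capacity of each layer (J m^-3 K^-1), length n
    lam : thermal conductivity at the n-1 interior interfaces (W m^-1 K^-1)
    F_top, F_bottom : prescribed boundary heat fluxes (W m^-2), positive downward
    """
    T = np.asarray(T, dtype=float)
    dz = np.asarray(dz, dtype=float)
    c = np.asarray(c, dtype=float)
    lam = np.asarray(lam, dtype=float)
    # Distance between the centers of adjacent layers.
    dz_centers = 0.5 * (dz[:-1] + dz[1:])
    # F = -lambda * dT/dz at each interior interface.
    F_interior = -lam * (T[1:] - T[:-1]) / dz_centers
    # Fluxes at all n+1 interfaces, boundaries included.
    F = np.concatenate(([F_top], F_interior, [F_bottom]))
    # c * dT/dt = -dF/dz, applied layer by layer.
    return T + dt * (-(F[1:] - F[:-1]) / dz) / c

# Hypothetical five-layer column, warm at the surface and cool at depth.
T0 = np.array([300.0, 295.0, 290.0, 288.0, 287.0])
dz = np.array([0.05, 0.10, 0.20, 0.40, 0.80])
c = np.full(5, 2.0e6)
lam = np.full(4, 1.0)
T1 = step_soil_temperature(T0, dz, c, lam, dt=600.0, F_top=0.0)
```

With both boundary fluxes set to zero the scheme only redistributes heat between layers, so the column's total heat content is unchanged after the step. An explicit step like this is stable only for sufficiently small `dt`.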

3.2.Unscented particle filter

In the data assimilation system, there are two functions: one is the nonlinear state function M, which maps the state vector of soil temperature $X_{i-1}$ in all layers at time step i-1 to the state vector $X_i$ at time step i, and the other is the observation function H, which specifies the deterministic relationship between the system state and observations (Anderson, 2001; Whitaker and Hamill, 2002; Kumar et al., 2008; Xie and Zhang, 2010). These two functions can be expressed as follows:

$$X_i = M(X_{i-1}, V_{i-1}) \qquad (3)$$

$$Y_i = H(X_i, U_{i-1}) \qquad (4)$$

where $Y_i$ is the vector of observed soil temperature in all layers at time step i, and $V_{i-1}$ and $U_{i-1}$ are the model state error vector and measurement error vector, respectively, at time step i-1.

UPF was used in this study,with the mean and covariance for the proposed distribution of each particle obtained using the UKF(van der Merwe,2004;Han and Li,2008).A detailed description of the UPF can be found in Han and Li(2008).
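To make the role of the observation function H in the filter concrete, the following sketch shows one weighting-and-resampling step of a plain bootstrap particle filter for a scalar soil surface temperature. It is deliberately not the UPF: the UKF-based proposal distribution described in Han and Li (2008) is omitted, and only the particle weighting that the UPF shares with the PF is shown. The Gaussian observation error, the function name `pf_update`, and all numbers are assumptions made for illustration.

```python
import numpy as np

def pf_update(particles, y_obs, h, obs_std, rng):
    """Weight particles by the observation likelihood, then resample."""
    particles = np.asarray(particles, dtype=float)
    # Residual between the observation and each particle mapped through h.
    resid = y_obs - h(particles)
    # Gaussian log-likelihood (assumed error model), shifted for stability.
    log_w = -0.5 * (resid / obs_std) ** 2
    w = np.exp(log_w - log_w.max())
    w /= w.sum()
    # Systematic resampling: one random offset, n evenly spaced positions.
    n = particles.size
    positions = (rng.random() + np.arange(n)) / n
    idx = np.searchsorted(np.cumsum(w), positions)
    idx = np.minimum(idx, n - 1)  # guard against round-off at the upper end
    return particles[idx], w

rng = np.random.default_rng(0)
prior = rng.normal(290.0, 3.0, size=500)        # hypothetical prior ensemble (K)
linear_h = lambda x: 1.0 * x + 0.0              # hypothetical linear operator
posterior, weights = pf_update(prior, 295.0, linear_h, 1.0, rng)
```

Swapping `linear_h` for a different callable is all that is needed to change the observation operator, which is why its choice directly shapes the particle weights.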

3.3.Evaluation criterion

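The root mean square error (RMSE) was used to analyze the influence of the observation operator on the sequential data assimilation. RMSE of layer j can be expressed as follows:

$$\mathrm{RMSE}_j = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(T^{\mathrm{sim}}_{i,j} - T^{\mathrm{obs}}_{i,j}\right)^2} \qquad (5)$$

where n is the number of time steps in the evaluation period, and $T^{\mathrm{sim}}_{i,j}$ and $T^{\mathrm{obs}}_{i,j}$ are the simulated (or assimilated) and observed soil temperatures of layer j at time step i, respectively.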
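The RMSE criterion can be computed per soil layer with a few lines of code. The sketch below uses the standard RMSE definition and assumes, for illustration only, that simulated and observed temperatures are stored as arrays of shape (time steps, layers); the function name `rmse_per_layer` and the sample values are hypothetical.

```python
import numpy as np

def rmse_per_layer(simulated, observed):
    """Return the RMSE of each soil layer; inputs have shape (n_times, n_layers)."""
    simulated = np.asarray(simulated, dtype=float)
    observed = np.asarray(observed, dtype=float)
    # Average the squared differences over time (axis 0), one value per layer.
    return np.sqrt(np.mean((simulated - observed) ** 2, axis=0))

# Two time steps, two layers (made-up temperatures in K).
sim = [[290.0, 288.0], [292.0, 290.0]]
obs = [[290.0, 286.0], [292.0, 286.0]]
rmse = rmse_per_layer(sim, obs)
```

The same function applies to assimilated results: passing the assimilated temperatures in place of `simulated` gives the values compared against the open-loop simulation.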

4.Observation operator formulation

4.1.Linear observation operator

Due to the fact that the in situ soil surface temperature observations and MODIS LST were assimilated into the assimilation system,the observation operator between the simulations and observations of the soil surface temperature from October 2(day 276)to October 21(day 295),2004 was constructed after model calibration.The original soil surface observations and simulations were used to construct the linear observation operator for the three experiments,which are shown in Fig.2.

Fig. 2 shows the linear observation operators between hourly in situ observations and simulations, daily in situ observations and simulations, and MODIS LST and simulations of the soil surface temperature, respectively, for the corresponding experiments at the LHMet site, where R² is the coefficient of determination. The results show that only the correlation coefficient (r) between hourly in situ observations and simulations was high, with r > 0.92. As the MODIS sensor passed over the WGEW around 10:40 a.m. each day, in situ observations at 10:40 a.m. were used as daily observations, in order to provide the closest possible point of comparison between in situ and remote sensing measurements. The correlation coefficients between daily in situ observations and simulations and between MODIS LST and simulations were relatively low, with r < 0.65. In addition, the recorded daily soil surface temperature (at 10:40 a.m.) and MODIS LST were higher than the corresponding simulated temperature most of the time, with the independent variable coefficient (the slope of the fitted line) significantly larger than 1 for daily in situ observations and MODIS LST, as shown in Fig. 2(b) and (c).

Since the linear observation operator did not adequately capture the relationships between daily in situ observations and simulations and between MODIS LST and simulations,it might introduce large errors into data assimilation and weaken the filter's performance.Accordingly,to analyze the influence of observation operators on the filter's performance,a nonlinear observation operator was introduced based on the processed simulations and observations.
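A linear observation operator of the kind discussed in this section can be obtained by ordinary least squares. The sketch below assumes that the simulated soil surface temperature is the independent variable and the observation the dependent one, which matches the description of the independent variable coefficient above but is not stated explicitly in the text; the function name `fit_linear_operator` and the sample data are hypothetical and are not the LHMet data.

```python
import numpy as np

def fit_linear_operator(sim, obs):
    """Least-squares fit obs ~ a * sim + b; returns (a, b, r_squared)."""
    sim = np.asarray(sim, dtype=float)
    obs = np.asarray(obs, dtype=float)
    # np.polyfit returns coefficients from the highest degree down: [a, b].
    a, b = np.polyfit(sim, obs, 1)
    pred = a * sim + b
    ss_res = np.sum((obs - pred) ** 2)
    ss_tot = np.sum((obs - obs.mean()) ** 2)
    return a, b, 1.0 - ss_res / ss_tot

# Made-up temperatures (K) lying exactly on a line with slope 1.2.
sim = np.array([280.0, 285.0, 290.0, 295.0])
obs = 1.2 * sim - 50.0
a, b, r2 = fit_linear_operator(sim, obs)
```

A slope `a` well above 1 together with a low R², as reported for the daily and MODIS cases, would mean that the fitted line both amplifies the simulated temperature and leaves much of the observed variance unexplained.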

Fig. 2. Linear observation operators between observations and simulations and between MODIS LST and simulations of soil surface temperature for three assimilation experiments at LHMet site.

4.2.Data processing

To adequately represent the relationship between observations and simulations,in situ observations,MODIS LST,and simulations of soil surface temperature were processed using the following equations:

where $x_{si}$ is the simulated soil surface temperature at assimilation time i, and $y_{si}$ is the in situ observation of soil surface temperature or MODIS LST at assimilation time i. The index i varies from 1 to m, where m is the total number of assimilation times.

Based on Eqs. (6) and (7), the processed results of soil surface temperature simulations, in situ soil surface temperature observations, and MODIS LST are shown in Fig. 3. It can be seen from the figure that the correlation between the simulations and daily in situ observations or MODIS LST was still poor, while that for the hourly in situ observations was good. It is noted that Obs-var, Sim-var, and LST-var were the processed results of observed soil surface temperature, simulated soil surface temperature, and MODIS LST, respectively, according to Eqs. (6) and (7).

The data were further processed using the following equation to improve the correlation between simulations and observations:

where c was 0.5 in this study.

The data further processed according to Eq.(8),here referred to as Sim-Obs-var and Sim-LST-var,corresponding to the variations of observed soil surface temperature and MODIS LST with simulated results,respectively,are shown in Fig.3.It is noted that the correlations between LST-var and Sim-LST-var and between Obs-var and Sim-Obs-var were high for the three assimilation experiments.

4.3.Nonlinear observation operator

To formulate a nonlinear observation operator,the relationships between LST-var and Sim-LST-var and between Obs-var and Sim-Obs-var can be written as

Fig.3.Processed results of in situ observations,MODIS LST,and simulations of soil surface temperature at LHMet site.

where a is the independent variable coefficient and b is a constant.The nonlinear observation operator between in situ observations and simulations,or between MODIS LST and simulations of soil surface temperature,was derived using Eqs.(6)through(9),as below:

After data processing, linear relationships between Obs-var and Sim-Obs-var and between LST-var and Sim-LST-var (i.e., the nonlinear observation operators between original hourly observations and simulations, original daily observations and simulations, and MODIS LST and simulations of soil surface temperature at the LHMet site, respectively) could be obtained, as shown in Fig. 4. The correlation coefficients between Obs-var and Sim-Obs-var of hourly soil surface temperature and daily soil surface temperature, and between LST-var and Sim-LST-var of MODIS LST were all high, with r = 0.992, 0.947, and 0.971, respectively. Additionally, the independent variable coefficient was around 1 for the three experiments. The results suggest that the correlation coefficients between original hourly and daily in situ soil surface temperature observations and simulations, and between MODIS LST and simulations, based on Eq. (10), were higher than those shown in Fig. 2.

Comparing the correlations between the original observations and simulations in Sections 4.1 and 4.3, it can be seen that the correlation was significantly improved after data processing (e.g., R² increased from 0.855, 0.394, and 0.345 before data processing to 0.984, 0.897, and 0.942 after data processing, for the hourly and daily in situ soil surface temperature assimilation experiments and the MODIS LST assimilation experiment, respectively). Accordingly, the nonlinear observation operator better reflected the relationship between the system state vector and observations compared to the linear observation operator. Eq. (10) was used as the nonlinear observation operator in this study.
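The reported R² values can be cross-checked against the correlation coefficients quoted in Sections 4.1 and 4.3. For a simple linear fit with a positive slope, r is the square root of R²; that this relation applies to the fits in Figs. 2 and 4 is an assumption made here, and the dictionary keys are labels chosen for illustration. The numbers themselves are the ones given in the text.

```python
import math

# R² values reported in the text, before and after data processing.
r2_before = {"hourly": 0.855, "daily": 0.394, "modis": 0.345}
r2_after = {"hourly": 0.984, "daily": 0.897, "modis": 0.942}

# r = sqrt(R²) for a simple linear regression with positive slope.
r_before = {name: math.sqrt(value) for name, value in r2_before.items()}
r_after = {name: round(math.sqrt(value), 3) for name, value in r2_after.items()}

# Gain in explained variance for each experiment.
r2_gain = {name: round(r2_after[name] - r2_before[name], 3) for name in r2_before}
```

Under that assumption the values agree with the text: the linear-operator correlations satisfy r > 0.92 for the hourly case and r < 0.65 for the daily and MODIS cases, and the processed data give r = 0.992, 0.947, and 0.971. The gain in explained variance is smallest for the hourly data and largest for MODIS LST.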

5.Results and discussion

The observation operators were used in the hourly and daily in situ soil surface temperature as well as MODIS LST assimilation experiments, as discussed in Section 4. For the purpose of this discussion, the results from the last five days (days 291-295, 2004) were used to demonstrate the influence of the observation operators on data assimilation performance in the three experiments. Yu et al. (2014b) have demonstrated that some data assimilation methods do not perform well in soil temperature simulations of the bottom soil layer through assimilation of the soil surface temperature or MODIS LST, and thus only the soil temperatures of the first four layers are discussed in the following sections.

5.1.Hourly in situ soil surface temperature assimilation

The hourly in situ soil temperature observations versus simulations and assimilations using the linear and nonlinear observation operators are shown in Fig. 5. The simulation accuracy was improved significantly after assimilation of soil surface temperature using the UPF, with both the linear and nonlinear observation operators. The figure also shows that simulations generally overestimated the soil temperature, except in the first soil layer, where simulations underestimated the soil surface temperature around peaks. Overall, for the first layer, the temperature values from assimilations were very close to those observed. For the second and third layers, simulations overestimated the soil temperature significantly, whereas assimilations overestimated it only slightly. In contrast to the results in the second and third layers, assimilations in the fourth layer were close to observations.

It is noted that assimilated results obtained using the UPF with the nonlinear observation operator were closer to the observations than those with the linear observation operator in the first three soil layers,meaning that the UPF with the nonlinear observation operator generally performed better(i.e.,closer to observations)than that with the linear observation operator.In the fourth layer,the UPF with the nonlinear observation operator might perform more poorly than that with the linear observation operator based on the comparison of assimilated results with observations.

Fig.4.Nonlinear observation operators between observations and simulations and between MODIS LST and simulations of soil surface temperature for three assimilation experiments at LHMet site.

Fig.5.Last five-day results of assimilating hourly in situ soil surface temperature using UPF with linear and nonlinear observation operators.

Table 1 shows RMSE values of assimilations and simulations for the last five days. It can be seen that the RMSE values of assimilations, as compared to observations, were much lower than those of simulations, further indicating that using the UPF improved the soil temperature simulation accuracy significantly in the soil layers. The RMSE values of assimilations that used the UPF with the nonlinear observation operator were lower than those with the linear observation operator for the first three soil layers, but that was not the case for the fourth layer, indicating that the UPF performed better in the upper three soil layers when the nonlinear observation operator was used.

5.2.Daily in situ soil surface temperature assimilation

To provide a comparison with the MODIS LST assimilation, the in situ soil surface temperature at 10:40 a.m. each day was assimilated into the model, as the remote sensor passed over the area at around 10:40 a.m. The assimilated results of the last five days are shown in Fig. 6.

For the first layer at the soil surface,it is not clear that the UPF with different observation operators could improve the simulation accuracy significantly by assimilating the observed daily in situ soil surface temperature,as shown in Fig.6.Table 1 shows that the UPF could improve the simulation accuracy,but not significantly in this layer.According to the results from the second to the fourth layer shown in Fig.6,assimilations were closer to observations than simulations.The RMSE values in Table 1 also show that the UPF can improve the simulation accuracy significantly in these three layers.

Fig. 6 also shows that the difference between the assimilated results with different observation operators was not significant in any soil layer. According to Table 1, the UPF performed better with the nonlinear observation operator than with the linear one for most soil layers, even though the improvement was not significant.

5.3.MODIS LST assimilation

Fig. 7 shows the results of the MODIS LST assimilation using the UPF versus observations. Similar to the daily in situ soil surface observation assimilation experiment, the filter also improved the simulation accuracy by assimilating MODIS LST. The UPF improved the simulation accuracy significantly in the second to fourth soil layers, but the improvement was not significant in the first soil layer.

Fig.7 shows that assimilated results obtained using the UPF with the linear observation operator were closer to observations than those with the nonlinear observation operator for the last three layers.Table 1 also shows that the RMSE values of assimilations,with the nonlinear observation operator,were larger than those with the linear observation operator(i.e.,the UPF performance worsened as the observation operator changed from linear to nonlinear for the MODIS LST assimilation).

When Fig. 7 was compared with Fig. 6, the difference between the results of the two assimilation experiments was very small. Table 1 shows that the UPF performed a little better in the daily in situ soil surface temperature assimilation than in the MODIS LST assimilation.

Table 1 RMSE values of last five-day simulations and assimilations using UPF with linear and nonlinear observation operators.

Fig.6.Last five-day results of assimilating daily in situ soil surface temperature using UPF with linear and nonlinear observation operators.

Based on the results of the three assimilation experiments,the UPF improved the simulation accuracy regardless of the low assimilation frequency or the low quality of assimilated data.For the hourly in situ soil surface temperature assimilation,significant improvement could be found for all the four soil layers.For the daily in situ soil surface temperature and MODIS LST assimilations,the significant improvement was limited to the lower soil layers,with only marginal improvement for the first soil layer.

5.4.Discussion

From the results described above, it can be seen that when the assimilation frequency and data quality were high, the filter could improve the simulation accuracy significantly no matter which observation operator was used in the data assimilation system. At low assimilation frequency, however, the filter's performance was poor for the first layer. A possible reason is that high-quality data assimilated frequently into the land surface model correct the simulated results frequently, whereas infrequent assimilation leaves long intervals without correction.

It can be seen that the performance of the data assimilation method was improved significantly when the nonlinear observation operator was used for the hourly in situ soil surface temperature assimilation.This implies that the filter's performance was better and the simulated results were much more optimized when the nonlinear observation operator was used at a higher assimilation frequency,leading to assimilated results much closer to observations.For the daily in situ soil surface temperature assimilation,the difference between the results with the linear and nonlinear observation operators was marginal.The two experiments(hourly and daily assimilation)show that the filter's performance was improved as the observation operator was changed from linear to nonlinear.This was related to the higher correlation between simulations and observations when the nonlinear observation operator was used,introducing lower amounts of error into the assimilation system.

Fig.7.Last five-day results of assimilating MODIS LST using UPF with linear and nonlinear observation operators.

Fig.8.Comparison between MODIS LST and daily in situ soil surface temperature at 10:40 a.m.at LHMet site.

In the MODIS LST assimilation experiment, the UPF performed worse with the nonlinear observation operator than with the linear observation operator. This could be attributed to the lower quality of the MODIS LST data compared to the daily in situ soil surface temperature observations. Fig. 8 shows that MODIS LST overestimated the daily in situ soil surface temperature for most points, with a low correlation coefficient between them. This implies that a higher correlation between simulations and the low-quality MODIS LST would introduce larger errors into the assimilation system and weaken the filter's performance.

6.Conclusions

The aim of this study was to analyze the influence of the observation operator on the sequential data assimilation through soil temperature simulation with the UPF and CLM.To achieve the objective,three assimilation experiments with different assimilation frequencies,assimilated data,and observation operators were conducted at the LHMet site in WGEW in Arizona,in the United States.

The following conclusions are drawn:Both the linear and nonlinear observation operators improved the filter's performance regardless of the low assimilation frequency or the low quality of assimilated data.The nonlinear observation operator improved the filter's performance more significantly in the hourly and daily in situ soil surface temperature assimilations than the linear observation operator.However,for the MODIS LST assimilation,low-quality data and the nonlinear observation operator weakened the filter's performance.For the cases of the daily in situ soil surface temperature and MODIS LST assimilations,significant improvements of the simulation accuracy were limited to the lower soil layers,while for the case of the hourly in situ soil surface temperature assimilation,the simulation accuracy was significantly improved in all the soil layers because of high assimilation frequency and data quality.