APP下载

Distributed online mission planning for multi-player space pursuit and evasion

2016-11-23LiuYunYeDongHoYong

CHINESE JOURNAL OF AERONAUTICS 2016年6期

Liu Yun,Ye Dong,Ho Yong

aCollege of Automation,Harbin Engineering University,Harbin 150001,China

bResearch Center of Satellite Technology,Harbin Institute of Technology,Harbin 150001,China

Distributed online mission planning for multi-player space pursuit and evasion

Liu Yuana,Ye Dongb,*,Hao Yonga

aCollege of Automation,Harbin Engineering University,Harbin 150001,China

bResearch Center of Satellite Technology,Harbin Institute of Technology,Harbin 150001,China

There are three important roles in evasion conflict:pursuer,target and defender.Pursuers’mission is to access targets;targets’mission is to escape from pursuers’capture;defenders’mission is to intercept pursuers who are potentially dangerous to targets.In this paper,a distributed online mission plan(DOMP)algorithm for pursuers is proposed based on fuzzy evaluation and Nash equilibrium.First,an integrated effectiveness evaluation model is given.Then,the details of collaborative mission planning which includes the co-optimization of task distributing,trajectory and corresponding maneuvering scheme are presented.Finally,the convergence and steadiness of DOMP are discussed with simulation results.Compared with centralized mission planning,DOMP is more robust and can greatly improve the effectiveness of pursuing.It can be applied to dynamic scenario due to its distributed architecture.

1.Introduction

In addition to the standard pursuit-evasion game,in which the pursuer tries to catch the target and the target tries to escape,the defender intercepts pursuer to protect target.This conflict has attracted increasing attention in space technology research.1As one type of low-cost spacecraft,small satellites with characteristics of lightness,agility,large coverage,etc.have become the ideal pursuer in the study of pursuitevasion conflict.2–4Some representative projects are US AFRL’s(Air Force Research Laboratory)XSS(Experimental Satellite System),DARPA’s(Defense Advanced Research Project Agency),MiTEx(Mirco-satellite Technology Experiment)and NASA’s DART(Demonstration for Autonomous Rendezvous Technology).3–5

To successfully access targets,pursuers must survive from defenders’interception.6Facing multiple targets and multiple defenders,pursuers must cooperate to obtain the optimal pursuing effectiveness.7–11Therefore,collaborative mission planning is important for pursuers.

For a multi-agent system,a joint strategy is a set of individual policies with one for each agent.An optimal strategy can maximize the expected gains of the system.The process of obtaining the optimal strategy is called mission planning,which is an NP-hard problem.Generally,there are two types of mission planning:centralized and distributed.Typically,a centralized planning system consists of one leader and several followers,where the leader is a unique planner of the system.Because multi-agent models and the planning algorithm lead to huge state spaces,the leader needs to have significant computational resources.Once it is damaged,the whole system would crash.On the other hand,mission planning could run on each agent as well,in which case every agent has the same priority.Therefore,one or a few of agents’failure would not affect the whole system’s operation.In a distributed planning system,each agent calculates its own policy while other agents’policies are considered fixed,and broadcasts their planning results to others.Since any modification of an agent’s policy might induce other agents’reward change,agents iteratively update their policies until a system equilibrium is reached.Compared with centralized planning,distributed planning has advantages of autonomous agents,robust system and real time.12Hence,distributed planning is more applicable in a high dynamic antagonistic application,such as space pursuit and evasion.In the past few years,popular approaches for multi-agent mission planning are contract net protocol,visibility graph and heuristic algorithms such as genetic algorithm(GA).13,14

In a space pursuit and evasion conflict,pursuers’mission planning is a complex problem,which is affected by many factors such as high relative speed,strict fuel limitation,real-time requirement.Besides,the intercepting role of defenders further increases the complexity of pursuers’planning,which creates constraints on well-developed algorithm such as visibility graph.Therefore,previous works mainly focus on the optimization and modeling of single pursuer’s trajectory.15,16Multi-player space pursuit and evasion problem still lacks a thorough and systematic study.In this paper,a distributed online mission plan(DOMP)algorithm for multi-player pursuers is presented,and the dynamics used in this algorithm is explained.Then a multi-objective effectiveness model is used to quantitatively estimate policies.After a set of potential pursuing trajectories is obtained by fuzzy comprehensive evaluation,pursuers can reach the Nash equilibrium point through local optimization,negotiation and iteratively updating information aboutneighbors.Finally,asimulation example demonstrates the effectiveness of DOMP algorithm.

2.Problem description

2.1.Space pursuit and evasion conflict

Normally,to reduce energy consuming and prolong on-orbit life,small satellites that perform as pursuers work in standby mode,in which the satellites only make periodic selfinspection and necessary orbit maintenance.Once they receive wake-up command,they will go into preparation mode and try to approach targets.

The basic concept of space pursuit and evasion conflict is shown in Fig.1.It includes a cluster of pursuers.In the intensive space pursuit and evasion scenario,we assume that there are multiple targets.Hence,cooperative pursuing trajectory optimization and target assigning are two critical factors in mission planning.

Fig.1 Basic conception of space pursuit and evasion.

2.2.Integral effectiveness model

Mission planning algorithm produces an optimal pursuing strategy for pursuers.It is evaluated by the integral pursuing effectiveness,which includes various factors,such as benefit,expending and capture probability.The evaluation of pursuing strategy is affected by the situation(objective)and the preference(subjective).Overall,a mathematical multi-objective effectiveness model for pursuing strategy can be expressed as Eq.(1).

where O(km)=[O1,O2,...,Om]is the current objective vector;m is the number of sub-objectives;kmis the time of current moment;λ is the normalized subjective weight vector with the proper dimension;E is the effectiveness;S and A are the state vectors of pursuers and targets respectively;D is the strategy vector of the pursuers;feis the effectiveness function representing the integral effectiveness according to the current strategy and the current situation;fois the objective function describing the mapping from state space to objective space;fsis the state transferring function;Ssand Ddidentify the state space and decision space respectively;rij(km)indicates the consumption of the jth resource when pursuer i executes current strategy;and Rij(km)represents the amount of the jth resource of pursuer i.

The essence of space pursuit and evasion mission planning is an optimization process with comprehensive constraints.The global optimal solution cannot always be acquired in limited time under the greatly varying circumstances(even if we obtain a global optimal solution with much time,it may be meaningless).Therefore,the key solution to mission planning is to achieve a series of sub-optimal strategies in limited time,and then select an optimal one which can best satisfy the mission’s requirements with the highest effectiveness.

3.Pursuing trajectory

A typical space-pursuing trajectory is shown in Fig.2,which consists of three types of orbits:standby orbit,transfer orbit and capture orbit.In peacetime,pursuers wait on standby orbits,and they can reach the capture orbits through one or several orbit transformations(in the track of transfer orbits).In capture orbit,a pursuer can access a target that is located at a specific space point at a scheduled time.Assuming that a pursuer receives a command at time t0and performs the orbit transfer at t1,we can give the standby time as Tw=t1-t0.If the scheduled pursuing terminal time is denoted by tf,the orbit transfer time can be calculated by Tt=tf-t1.In the space pursuit and evasion,once a pursuer’s orbit is discovered intersecting with any target’s orbit,the pursuer will be viewed as a threat by all targets and defenders.As a result,the threatened target will adjust its orbit to evade the danger and defenders will try to intercept the potential threat to protect the target.If the first orbit intersecting moment is denoted by tc,the preparing time Tpleft for the target and defenders is

where Tais the pre-warning time determined by the scout ability and information processing ability of targets and defenders.Obviously,smaller Tameans larger Tp,which is unfavorable to pursuers.

3.1.Strategy based on Lambert orbit transfer

In most of the time of space pursuit-evasion,pursuers are far from targets and not on the same orbit plane,which means that the C-W equation’s linearization condition cannot be met.17–19Moreover,for tactic purpose,we assume that pursuers are required to capture targets during scheduled time interval.According to the Lambert law,the orbit transferring time is decided by transferring orbit’s semi-major axis,the distance between the initial and end points,and the distance from the initial and end points to gravitational center.In addition,there are two typical impulse transfer methods:single-pulse and multi-pulse.

Fig.2 Typical trajectory of pursuer.

In single-pulse orbit transfer,the onboard rocket engine only generates one thrust impulse.Hence,the pursuer’s transfer orbit coincides with the capture orbit.As Eq.(3)shows,with the planned transferring moment t1,the pursuer’s corresponding ECI coordinate R1,the planned terminal time tf,the pursuer’s corresponding ECI coordinate Rf,the transfer orbit’s major semi-axis a and other five orbit elements can be obtained.In Eq.(3),the parameter kcis the number offlying cycles before the pursuer captures the target.Therefore,kc=0 and kc>0 correspond to single-circle and multi-circle mode,respectively.After transformed to the dimensionless form,the relationship between dimensionless semi-axis a and dimensionless Δt of Eq.(3)can be shown in Fig.3.When kc=0,only one reasonable transfer orbit exists.However,if kc>0,there are two transfer orbits:a long range one and a short range one(specifically,if Δt=tmin(i) (i=1,2,3...)where tmin(i)is the minimum flying time with i complete cycles,the long and short range orbits coincide with each other).According to Eqs.(4)and(5),for single-pulse orbit transfer,there are nspotential transfer orbits totally.

In multi-pulse orbit transfer,the onboard pulse rocket engine generates multiple thrust impulses.The combinations of impulses correspond to different orbit transfer schemes.In space pursuit and evasion conflict,the nonrenewable fuel consumed by the onboard rocket engine is one of the most valuable resources.Therefore,the reduction in characteristic velocity is always preferred in the design of orbit maneuver scheme.Denoting the angle between R1and Rfby ξ and the characteristic velocity by Δv,we have

Fig.3 Relationship between semi-major axis and Δt in a singlepulse orbit transfer.

where v-and v+are velocities of pursuer at R1in the standby orbit and capture orbit,respectively.Calculating the derivative of Δv,we have

According to the existence condition of the extreme value,when Eq.(7)equals zero as shown in Eq.(8),the energy optimized capture orbit can be obtained,of which the semi-axis is denoted by amin.The corresponding flying time tmin,characteristic velocity ΔVminand characteristic speed Δvmincan also be deduced.

If Δt> tmin,the pursuer has to perform one or several orbital maneuvers and fly in one or several special transfer orbits to consume the extra time.Then,it will perform the final orbital maneuver at a particular moment and fly to Rfalong the energy optimization capture orbit.According to Eq.(7),ΔVminis the energy optimized characteristic velocity for the pursuer when it changes state from(t1,R1)to(tf,Rf).By decomposing ΔVmin,the corresponding decomposed impulses can be obtained.Assuming that there are two decomposed impulses(more impulse elements are similar),we can give the first impulse ΔVcand the second impulse ΔVain Eq.(9).

Define Δvcthe speed of ΔVcand Δvathe speed of ΔVa.When ΔVcand ΔVaare in the direction of ΔVmin,the sum of Δvcand Δvaequals to the characteristic speed Δvmin,which means that the rocket fuel is conserved to the most extent.

To make the directions of ΔVc, ΔVaand ΔVminthe same,the flying time in transfer orbit should be integer multiples of the transfer orbit period:

where acis the transfer orbit’s semi-axis and Tcis the transfer orbit’s period.Fig.4 shows the relationship between the dimensionless characteristic velocity and flying time which corresponds to different kc.From this figure,we can find that there are nmpotential multi-pulse orbital maneuver schemes as shown in Eq.(12),where Tminiis the minimal full period flying time oficycle,Tmaxithe maximal full period flying time of the i th cycle,and tclsthe actual flying time.

To access the target,pursuer’s real intention should not be discovered by targets and defenders.Even when discovered,pursuer should still survive from the interceptions by defenders.Hence,the characteristic velocity,the probability of intention being discovered,the preparing time left for defenders and the possibility of being intercepted are four main factors to be fully considered.

Fig.4 Relationship between characteristic velocity and flying time.

Before reaching the capture orbit,pursuer has to go through z-1 transfer orbit,where z is the number of impulses.The closer the transfer orbit is to the capture orbit,the more easily the pursuer maneuvers to the capture orbit,and hence the more dangerous the transfer orbit is,viewed by targets and defenders.Consequently,the minimum distance between transfer orbit and target’s orbit is defined as the sensitive distance,which can be used to analyze the potential threat degree of transfer orbit.According to Weber-Fischna law,the degree of perception is proportional to the degree of stimulus intensity’s logarithm.And the warning index Sdof pursuer’s transfer orbit is described as follows:

where K is the perception coefficient of the targets and defenders,and Dminthe sensitive distance.

When transferring to the capture orbit,the pursuer produces real threat to the target.Defenders will try to intercept and the target will begin to adjust orbit to evade.At this time,the pursuer tends to choose safer capture orbit.The defense area of each defender can be simplified as a moving spherical space which takes the point in its orbit as the center.15,16Then the threat of defender i can be quantitatively evaluated as fwi:

where Wiis the intercepting capability corresponding to the defender i;L1and L2(L1≤L2)are two distance thresholds;Baand Bdiare orbit elements of pursuer and defender i;and fdis the relative distance function.Then,the threat degree of defender to the pursuer Aacan be measured as

Obviously,if there is only one pursuer and one target in space,the profit of pursuing is fixed.Thus,we can only consider the cost,and the orbital maneuvering sub-object set can be defined as G={g1,g2,g3,g4}={Δv,Sd,Va,Aa},where Vais the value of pursuer.

3.2.Synthesized weight of orbital maneuvering sub-objects

We define the pursuer’s orbital maneuver scheme set as Ap={Ap1,Ap2,...,Ap(ns+nm)},and the effectiveness matrix constructed by the property value of scheme Apiunder object set G as E=[eij](ns+nm)×4.Since all the four sub-objects of G are costs(the smaller,the better),the normalized effectiveness matrix E*=[e*ij]can be expressed as

The relative importance degree of different objects can be represented as ordered preference values.The subjective relative preference for sub-object gito gjcan be written as pij.

Denote the optimal values of each sub-objects of E*by B=[b1,b2,b3,b4],whereand the deviation between B and E*is the sub-object’s objective weight,which represents the difference degree between one sub-target and the whole and the in fluence of this sub-object on other sub-targets.Therefore,the objective relative weight vectorcan be obtained with Eqs.(19)and(20).

The fusion of λ′and λ′can take advantage of both known objective and subjective information,so the integrated weight vector λ =[λ1,λ2,λ3,λ4]can be indicated through Eqs.(21)and(22).

After normalization,λ can be transformed into the standard form ¯λ =[¯λ1,¯λ2,¯λ3,¯λ4].

3.3.Fuzzy comprehensive effectiveness evaluation

Fuzzy comprehensive evaluation is an effective way to quantify various factors that are difficult to evaluate.20,21Most evaluation methods only generate an evaluation value,while the fuzzy evaluation can generate a vector,which contains more information.We suppose each sub-object of G has seven grades L={l1,l2,...,l7},which correspond to ‘best,better,good,middle,bad,worse and the worst”.Based on the normal membership function:

we can obtain the three-dimensional fuzzy membership matrixof the orbital maneuvering schemes of pursuers.Perform fuzzy fusion operation onE and λ,and we can obtain the fuzzy evaluation matrix E′.

Different results can be achieved with different fuzzy composition operators.With the main factor determining type fuzzy operator,as

we obtain the evaluation matrixWith the main factor outstanding type fuzzy operator,as

we obtain the relevant evaluation matrixWith the imbalance average type fuzzy operator,as

we obtain the evaluation matrix E′3.With the weighted average type fuzzy operator,as

we obtain the evaluation matrix

Based on each evaluation matrixwe can achievewhich represents the ith advice.Combining those advices with the Borda algorithm,we can achieve the final orbital maneuver scheme setand select the K~th scheme withas the optimal one.

4.Distributed mission planning

4.1.Effectiveness model of mission

We suppose that there are Napursuers and Notargets in space which are denoted by A={A1,A2,...,ANa} and O={O1,O2,...,ONo},respectively.In addition,there are Nddefenders which are denoted by D={D1,D2,...,DNd}.To simplify the calculation,we treat the defense area of each defender as a spherical space with the defense radius of Rsi(i=1,2,...,Nd).The threat to the pursuer from each defender is Wti(i=1,2,...,Nd).We denote the task distribution matrix as S=[sij]Na×No,where sij∈ {1,0}indicates that the target j is or is not assigned to the ith pursuer,and as shown in Eq.(30),each pursuer can only be assigned to no more than one target:

The pursuers should try their best to get away from the defense area.If they cannot do so,they should move away from the center of defense areas as far as possible to reduce the possibility of being intercepted.The judging function dais defined as

When Aiis pursuing Ojvia the capture orbit,its survival probability Psis

where VOjrepresents the value of target Ojand Pijthe success rate of the ith pursuer successfully approaching the jth target.Obviously,in order to maximize the profit,the algorithm should try its best to minimize fe1.

The cost of pursuit can be represented as fe2.

where Ciis the manufacturing and launching expense of the ith pursuer.The capability of pursuer will decay as its on-orbit time increases.The on-orbit attenuation coefficient can be represented as

where TLmaxiis the on-orbit lifetime of ith pursuer,and ϑ is the time attenuation constant.The task distributing algorithm will try to choose low cost or short residual life pursuers to execute the pursuing tasks.

There is a fixed proportional relationship between the fuel consumption and characteristic velocity;therefore,the task distributing algorithm tends to minimize the characteristic speed of pursuers.We can define a function representing the fuel consumption as follows:

where Δvmaxiis the maximal characteristic speed generated by the remaining fuel of the ith pursuer;Δvijis the characteristic velocity requirement when the ith pursuer is pursuing the jth target.

If there are more than one target for pursuers,the pursuers need to capture as much targets as possible under the current resource constraints.In order to describe this purpose,we can define the target coverage function as

Obviously,targets can be covered at the most when fe4reaches its minimum.

In order to achieve a satisfied result,each target’s capture probability should be greater than the desired thresholds(see Table 1).Each target’s scheduled thresholds can be denoted by Q={Q1,Q2,...,QNo}.

A threshold function sig1(x)is defined as

If the function above reaches its minimum,the capture probabilities of all targets are no more than the scheduled threshold.

Finally,combining all the influence factors above,we obtain the integral effectiveness function fE:

where φiis the weight of each sub-object.

4.2.Governing equation

In a multi-player collaborative space pursuit and evasion conflict,an admissible strategy is called Nash equilibrium strategy σ*if it satis fies Eq.(42).

where fEiand Siare the effectiveness evaluation function and strategy space for node i,respectively;the subscript-i repre-sents all other nodes except i.The Nash equilibrium calls for complete information exchange among all the players.

Table 1 Capture probability thresholds.

Table 2 Information of pursuers.

Table 3 Information of defenders.

Based on the idea above,we put forward DOMP algorithm,which runs on each pursuer to calculate the best pursuing trajectory for each target as Section 2 described,allows each pursuer to choose its best target with the highest effectiveness,and broadcasts its choice to other pursuers.After receiving other pursuers’decisions,each pursuer adjusts its scheme based on the global information,and re-broadcasts its updating choice.The negotiation will repeat until the cutoff conditions were satisfied which means that the Nash equilibrium solution given by Eq.(42)is obtained.DOMP will be restarted if the mission< AT,TL > or global situation < IA,ID,IT> changes,where AT and TL are the terminal time and target list,whereas IA,IDand ITare the information about pursuers,defenders and targets,respectively.

DNMP algorithm:

Step 1.Clear iterative number,qi=0.

Table 4 Information of targets.

Step 2.Update local information of mission and global situation.

Step 3.Generate orbital maneuver scheme for each target with the method in Section 2.1.

Step 4.Choose the effectiveness optimized scheme σi(qi)as Section 3.3 described.Set it as the initial scheme and broadcast it to others.

Table 5 Task distribution.

Fig.5 Trajectory planning results.

Step 5.Receive σ-i(qi)from other pursuers,and calculate local optimal scheme σi*(qi)based on Eq.(42).Then,broadcast it to others.

Step 6.Check whether the cutoff condition is met:|fEi(σ*(qi))– fEi(σ(qi))|<ε or iterative number qi> qmax.If all pursuers meet the cutoff conditions,go to Step 7;Otherwise set σi(qi)= σi*(qi),qi=qi+1,and go to Step 5.

Step 7.Output σi*(qi)as the mission planning result.

Step 8.Check the mission and global situation,if one of them is changed,go to Step 1,else stay in Step 8.

In space pursuit and evasion conflict,a pursuer may be intercepted on its way to the target.In order for DOMP to handle this situation,pursuers should periodically broadcast their life signals.A pursuer will be suspected to be damaged if its life signal cannot be received.To check its status,other pursuer will broadcast an inquiring order.Once the pursuer receives the inquiring order,it should restore its life signal and broadcast it immediately,otherwise other pursuers will eliminate it from the resource table.Under this circumstance,the global situation should be updated,which will lead to the restarting of DOMP algorithm.

Fig.6 Effectiveness index of planning.

Table 6 Calculating time of planning.

5.Test cases

We considera case in which there are 5 pursuers Ai(i=1,2,3,4,5),4 defenders Di(i=1,2,3,4)and 3 targets Oi(i=1,2,3). The orbital elements of each agent(semi-major axis a,eccentricity e,inclination id,longitude of ascending node Ω,argument of periapsis ω and time of perigee passing τ)are listed in Tables 2–4.The starting time is t0=706520 s and the end time is tf=735320 s.The simulation was run on an onboard computer simulator with CPU:MPC8245(133 MHz),SRAM 64 MB,Flash 512 MB.

Fig.8 Effectiveness index of re-planning after Pursuer 1 missing.

Table 7 Task re-distribution after Pursuer 1 missing.

Fig.7 Trajectory re-planning results after Pursuer 1 missing.

Table 8 Evasion orbit elements of Targets 1 and 2.

Table 9 Task re-distribution after Targets 1 and 2 evading.

DOMP leads to optimal task distribution(see Table 5)and pursuers’trajectories(see Fig.5).Target 2 is assigned to both Pursuers 3 and 5,and Targets 3 and 1 are assigned to Pursuers 3 and 5 respectively,while Pursuer 2 draws a bye in this planning.In addition,to maximize system efficiency,single-impulse maneuvering is preferred for Pursuers 1 and 3,while Pursuers 4 and 5 tend to implement multi-impulse maneuvering.Fig.5 shows the trajectory of each pursuer,along which the pursuer can finally approach its target.The system cost type effectiveness declines rapidly as the algorithm iteration number increases,which becomes stabilized after the third iteration(see Fig.6).A planning algorithm for the space pursuit and evasion conflict which is based on GA is used as a reference.16The task distribution results generated by GA are exactly the same as Table 5,which verifies the validity of DOMP on the other hand.However,the calculating time of GA is much longer than DOMP(see Table 6).Due to the low computing capability of onboard computer and the high dynamic characteristic of the application,planning algorithm should have low computation cost and good real-time performance.Since DOMP reduces computing time through distributed calculating effectively,it is practical and efficient for solving multiplayer space pursuit and evasion conflict problems.

Fig.9 Trajectory re-planning results after Targets 1 and 2 evading.

Fig.10 Effectiveness index of re-planning after Targets 1 and 2 evading.

After situation change,the ability of automatic re-planning is an important character of distribution planning.This feature of DOMP is verified by two steps.In the first step,we assume that Pursuer 1 is intercepted at the time of 711400 s.According to the previous section,once a pursuer’s life signal broadcasting breaks up for 60 s,other pursuers send the inquiring order with the highest priority to discover Pursuer 1.If there is no valid reply in another 60 s,Pursuer 1 is considered as disappeared by DOMP,which triggers re-planning automatically.The re-planning only involves surviving pursuers,which leads to optimal task distribution and trajectories under current situation(see Table 7 and Fig.7).Compared with previous planning results,Target 2 that was formerly assigned to Pursuer 1 is now assigned to Pursuer 2 which drew a bye in last round’s planning.Target 1 is assigned to Pursuer 3,while Target 3 is assigned to both Pursuers 4 and 5.However,because the remaining time is not enough for multi-pulse maneuvering,all 4 pursuers choose single-pulse maneuvering trajectories(see Fig.7),and the system effectiveness becomes stabilized after the fourth iteration(see Fig.8).

In the second step,we assume Targets 1 and 2 start maneuvering at 716520 s,and their new orbital elements are listed in Table 8.The changes of situation again trigger re-planning automatically.The optimal planning results are shown in Table 9 and Fig.9,from which we find out that Pursuers 3 and 4 exchange their targets to get better system effectiveness.The system effectiveness becomes stabilized after the fifth iteration(Fig.10).

According to the simulation results above,DOMP shows the convergence,efficiency and real-time performance in multi-playerspace pursuitand evasion online mission planning.

6.Conclusions

Taking the characteristics of space pursuit and evasion conflict into account,we proposed DOMP algorithm based on fuzzy evaluation and Nash equilibrium.Firstly,the fuzzy evaluation was used to combine the sub-object(profit,expense,risk)and generate the integral effectiveness model.Secondly,the task distribution and orbitalmaneuveringschemewere co-optimized based on the Nash equilibrium.Finally,simulationresultsverifythatDOMP algorithm can generate effectiveness-optimized task distribution plan,trajectories and orbital maneuvering scheme with good convergence and real-time response.In addition,the distributed algorithm architecture enables the DOMP to handle dynamic changes of global situation.

Acknowledgments

This study was co-supported by the Heilongjiang Postdoctoral Scientific Research Developmental Fund(No.LBH-Q14054)and the Fundamental Research Funds for the Central Universities of China(No.HEUCFD1503).

1.Wu DK,Zhou YP.Optimization design of spreader for space passive countermeasure.Infrared Laser Eng 2012;41(2):457–60[Chinese].

2.Davis TM,Melanson D.XSS-10 micro satellite flight demonstration program results.Proceedings of SPIE the international society for optical engineering;2004 Aug;Orlando.Bellingham:SPIE;2004.p.16–25.

3.Madison RW.Micro-satellite based,on-orbit servicing work at the air force research laboratory.IEEE aerospace conference proceedings;2000 Mar 25;Big Sky.Piscataway(NJ):IEEE Press;2000.p.215–26.

4.Osborn M,Clauss C,Gorin B.Micro-satellite Technology Experiment (MiTEx) upper stage propulsion system development.Reston:AIAA;2007 Jul.Report No.:AIAA-2007-5434.

5.Iannotta B.DART aims at space rendezvous.Aerosp Am 2005;43(3):26.p.26–30.

6.Nourzadeh H,McInroy JE.Planning the visual measurement of n moving objects by moving cameras,given their relative trajectories.Proceedings of the IEEE international conference on control applications;2011 Sep 28–30;Denver.Piscataway(NJ):IEEE Press;2011.p.165–70.

7.Emilio F.Quasi-random algorithms for real time spacecraft motion planning and coordination.Acta Astronaut 2003;53(2):485–94.

8.Udwadia FE,Wanichanon T,Cho H.Methodology for satellite formation-keeping in the presence of system uncertainties.J Guid Control Dynam 2014;37(5):1611–24.

9.Mazal L,Gurfil P.Closed-loop distance-keeping for long-term satellite cluster flight.Acta Astronaut 2014;94(1):73–82.

10.Asgarimehr M,Hossainali MM.Optimization of geosyn-chronous satellite constellation for independent regional navigation and positioning in Middle East region.Acta Astronaut 2014;104(1):147–58.

11.Mascarenas D,Macknelly D,Mullins J.Dynamic characterization of satellite assembly for responsive space applications.Measur Sci Technol 2013;24(7):1084–5.

12.Kim MH,Baik H,Lee S.Response threshold model based UAV search planning and task allocation.J Intell Robot Syst 2014;75(3):625–40.

13.Alexo PC,Griffin PM.Path planning for a mobile robot.IEEE Trans Syst,Man Cybernet 1992;22(2):318–22.

14.Liu CL,Wei XY,Yang JY.A path planning method based on improved genetic algorithm for mobile Robot in static environment.ICIC Exp Lett 2011;5(12):4283–90.

15.Deng H,Zhong WC,Sun ZW.Relative navigation algorithm research of chaser spacecraft.Proceedings of the 32nd IEEE aerospace conference;2011 Mar 5–12;Big Sky.Piscataway(NJ):IEEE Press;2011.p.1–11.

16.Sun ZW,Deng H,Zhong WC.Attacking satellite path planning based on genetic algorithm.J Aerosp Eng 2012;25(1):32–8.

17.Richardson DL,Mitchell JW.A third-order analytical solution for relative motion with a circular reference orbit.J Astronaut Sci 2003;51(1):1–12.

18.CarterT.Clohessy-wiltshireequationsmodified to include quadratic drag.J Guid Control Dynam 2012;25(6):1058–63.

19.Alfriend KT,Vadali SR,Vaddi SS.Formation flying:accommodating nonlinearity and eccentricity perturbations.J Guid Control Dynam 2012;26(2):214–23.

20.Rao JY,Xie T,Liu YM.Fuzzy evaluation model for in-service karst highway tunnel structural safety.Ksce J Civil Eng 2016;20(4):1242–9.

21.Li MZ,Du YN,Wang QY,Sun CM,Ling X,Yu BY,et al.Risk assessment of supply chain for pharmaceutical excipients with AHP-fuzzy comprehensive evaluation.Drug Develop Ind Pharm 2016;42(4):676–84.

Liu Yuan is a lecturer at College of Automation,Harbin Engineering University.He received the Ph.D.degree from Harbin Institute of Technology in 2010.His current research interests are orbit dynamic,planning and control.

Ye Dong is a lecturer at Research Center of Satellite Technology,Harbin Institute of Technology.He received the Ph.D.degree from the same university.His current research interests are spacecraft design and nonlinear control.

Hao Yong is a lecturer at College of Automation,Harbin Engineering University.He received the Ph.D.degree from the same university.His current research interests are spacecraft simulation and attitude control.

24 September 2015;revised 8 November 2015;accepted 18 May 2016 Available online 21 October 2016

Evaluation;

Nash equilibrium solution;

Pursuit-evasion strategy;

Planning;

Space

Ⓒ2016 Chinese Society of Aeronautics and Astronautics.Production and hosting by Elsevier Ltd.This is anopenaccessarticleundertheCCBY-NC-NDlicense(http://creativecommons.org/licenses/by-nc-nd/4.0/).

*Corresponding author.Tel.:+86 451 86412357.

E-mail addresses:undertwilight@foxmail.com(Y.Liu),yed@hit.edu.cn(D.Ye),haoyong@hrbeu.edu.cn(Y.Hao).

Peer review under responsibility of Editorial Committee of CJA.

Production and hosting by Elsevier

http://dx.doi.org/10.1016/j.cja.2016.10.012

1000-9361Ⓒ2016 Chinese Society of Aeronautics and Astronautics.Production and hosting by Elsevier Ltd.

This is an open access article under the CC BY-NC-ND license(http://creativecommons.org/licenses/by-nc-nd/4.0/).