Energy-Delay Tradeoff for Online Offloading Based on Deep Reinforcement Learning in Wireless Powered Mobile-Edge Computing Networks

2020-02-01

Journal of Donghua University (English Edition), 2020, Issue 6


WANG Zhonglin, CAO Hankai, ZHAO Ping (赵萍), RAO Wei (饶为)

1 College of Finance and Information, Ningbo University of Finance and Economics, Ningbo 315000, China
2 College of Information Science and Technology, Donghua University, Shanghai 201620, China
3 Tencent Media Lab, Shenzhen 518000, China

Abstract: Benefiting from wireless power transfer (WPT) and mobile-edge computing (MEC), wireless powered MEC systems have attracted widespread attention. Specifically, we design an online offloading scheme based on deep reinforcement learning that maximizes the computation rate and minimizes the energy consumption of all wireless devices (WDs). Extensive results show that the proposed scheme achieves a better tradeoff between energy consumption and computation delay.

Key words: mobile-edge computing (MEC); wireless power transfer (WPT); computation offloading; energy consumption; deep reinforcement learning

Introduction

Computation latency and energy consumption in wireless powered mobile-edge computing (MEC) systems have attracted growing research interest in both academia and industry[1-3]. Several existing works[4-9] considered MEC systems powered mainly by batteries, and optimized the energy consumption and the computation delay. However, these works either studied wireless power transfer (WPT) and MEC separately, or focused only on MEC, and thereby did not combine the advantages of WPT and MEC. More recent works[10-13] have focused on wireless powered MEC systems. However, You et al.[10] considered a single-user wireless powered MEC system, which is not applicable in practical settings that involve many users. Wang et al.[11] focused on partial offloading, where tasks can be partitioned and a subset of the tasks is offloaded, ignoring binary offloading. Huang et al.[12] optimized only the computation delay in the wireless powered MEC system, without considering the energy consumption. Conversely, Yang et al.[13] optimized only the energy consumption of the wireless powered MEC system, without considering the computation delay.

To address the problems above, in this paper we consider a wireless powered MEC system consisting of one access point (AP) and multiple wireless devices (WDs) that follow binary offloading, and we jointly optimize the energy consumption and the computation delay rather than optimizing the computation delay alone. As shown in Fig. 1, each WD is powered by energy transmit beamforming from the AP, and uses the harvested energy either to compute its task locally or to offload the task to the AP. To jointly optimize the energy consumption and the computation delay in such a system, we propose an online computation offloading scheme based on deep reinforcement learning that maximizes the computation rate and minimizes the energy consumption of all WDs.

Fig. 1 An illustration of wireless powered MEC system consisting of one AP and multiple WDs

We make the following main contributions:

(1) We make the first attempt to jointly optimize the energy consumption and the computation delay in a wireless powered MEC system consisting of one AP and multiple WDs that follow binary offloading.

(2) We formalize task offloading in such a wireless powered MEC system as a maximization problem, and propose an online offloading algorithm based on deep reinforcement learning to solve it, achieving the optimal tradeoff between computation delay and energy consumption.

(3) We implement the proposed online offloading scheme, and extensive numerical results show that it outperforms the existing work[12], providing higher computation rates and lower energy consumption.

The remainder of this paper is organized as follows. Section 1 introduces the system model and problem formulation. Section 2 presents the proposed online offloading scheme. Section 3 evaluates the performance of the proposed scheme. Section 4 concludes the whole paper.

1 System Model and Problem Formulation

1.1 System model

Consider a wireless powered MEC system consisting of one AP and $N$ WDs denoted by $\mathrm{WD}_i$, $i = 1, 2, \dots, N$. The AP can transmit wireless energy to the WDs, receive offloaded tasks from the WDs, and send the corresponding results back to the WDs. As in existing works[11-13], since the computing power of the AP is much higher than that of a WD, the time needed to compute an offloaded task at the AP is ignored. Since the result returned by the AP is, in practice, often much smaller than the task itself, we also ignore the time it takes the AP to return the result. It is further assumed that energy transmission and task offloading operate over the same frequency band, so the two phases must be carried out successively[14]. A WD is composed of an energy-harvesting module, a compute module, and a communication module. These modules are independent, and thus they can work at the same time. The WDs have no external power sources, so they can only use the energy harvested in the energy transmission phase. The system adopts a binary offloading policy, where a task is either computed locally by a WD or offloaded to the AP. Time is divided into frames of fixed length $T$. In each frame, the AP first spends time $aT$ transmitting wireless energy to the WDs, and then assigns time $\tau_i T$ to receive the task offloaded from the $i$th WD, where $a$ and $\tau_i$ are scale factors. For a WD that does not offload its task, the offloading time is 0. The time constraint is then formulated as

$$a + \sum_{i=1}^{N} \tau_i \le 1. \tag{1}$$
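The frame structure above can be checked numerically. The following is a minimal sketch, assuming the time constraint takes the form $a + \sum_i \tau_i \le 1$ with non-negative fractions; the function names `is_feasible_allocation` and `frame_durations` are hypothetical and introduced only for illustration.

```python
from typing import List, Sequence, Tuple


def is_feasible_allocation(a: float, tau: Sequence[float], tol: float = 1e-9) -> bool:
    """Return True if the fractions fit into one time frame.

    Assumed constraint: a >= 0, tau_i >= 0 and a + sum(tau_i) <= 1,
    where a is the energy-transfer fraction and tau_i the offloading
    fraction of the i-th WD (0 for a WD that computes locally).
    """
    if a < 0 or any(t < 0 for t in tau):
        return False
    return a + sum(tau) <= 1.0 + tol


def frame_durations(a: float, tau: Sequence[float], T: float) -> Tuple[float, List[float]]:
    """Convert scale factors into durations for a frame of length T."""
    return a * T, [t * T for t in tau]


if __name__ == "__main__":
    a, tau = 0.4, [0.3, 0.0, 0.2]  # the second WD computes locally
    print(is_feasible_allocation(a, tau))
    print(frame_durations(a, tau, T=1.0))
```

A small tolerance is used in the comparison so that allocations that sum to exactly one frame are not rejected because of floating-point rounding.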

Denote the size of the task to be processed by the $i$th WD in a time frame as $D_i$, which follows a normal distribution with fixed mean and variance.
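For simulation purposes, per-frame task sizes can be drawn as in the sketch below. The helper name `sample_task_sizes` and the example means and standard deviations are hypothetical, and clipping negative draws to zero is an added assumption, since a task size cannot be negative.

```python
import random
from typing import List, Sequence


def sample_task_sizes(means: Sequence[float], stds: Sequence[float],
                      rng: random.Random) -> List[float]:
    """Draw one task size D_i (in bits) per WD from a normal distribution.

    Negative draws are clipped to 0; this clipping is an assumption of
    this sketch, not something stated in the system model.
    """
    return [max(0.0, rng.gauss(m, s)) for m, s in zip(means, stds)]


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed for reproducibility
    # Illustrative values only: three WDs with different mean task sizes.
    print(sample_task_sizes([1e5, 2e5, 5e4], [1e4, 2e4, 5e3], rng))
```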

1.2 Local computing mode

(2)

where $\varphi$ is the number of cycles needed to process one bit of a task. Regardless of whether the WD harvests enough energy in a time frame to finish processing the task of size $D_i$, we assume that it schedules the CPU frequency according to the policy of maximizing the computation rate, and then discards the part of the task that exceeds its maximum computing capability.

(3)

Within a time frame, the CPU frequency of a WD remains constant while the task is processed. Therefore, the instantaneous power a WD consumes can be treated as constant within a time frame. The energy consumed by the $i$th WD in a time frame, $E_{L,i,\mathrm{consumed}}$, can be expressed as

(4)

1.3 Edge computing mode

Assume that in the edge computing mode, the energy consumption results mainly from the task offloading process of the WDs. To maximize the computation capability, a WD in the edge computing mode fully exhausts the harvested energy to offload tasks. Let $h_i$ denote the wireless channel gain between the $i$th WD and the AP, $v_u$ the communication overhead ratio, $B$ the communication bandwidth, and $N_0$ the received noise power. The maximum size of a processable task is

(5)

Thus, the maximum computation rate is

(6)

For edge computing, it is important to keep the connections between the WDs and the AP stable. Therefore, if $D_i \le D_{i,\max}$, the $i$th WD will spend the additional energy to enhance the channel gain. Thus, we have $E_{E,i,\mathrm{consumed}} = E_{E,i,\mathrm{received}}$.
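As a rough illustration of the edge computing mode, the sketch below assumes a Shannon-capacity link model, a common choice in the wireless powered MEC literature; it is not a reproduction of Eqs. (5) and (6). The harvested energy is passed in directly as `energy_j`, and the WD is assumed to spend all of it uniformly over its offloading slot $\tau_i T$. The function names and the numerical values in the usage example are hypothetical.

```python
import math


def max_offloadable_bits(energy_j: float, h: float, tau: float, T: float,
                         B: float, v_u: float, N0: float) -> float:
    """Upper bound on the bits one WD can offload in a frame.

    Assumptions of this sketch: all harvested energy is spent at constant
    power during the slot tau*T, the link follows Shannon capacity, and
    v_u >= 1 accounts for communication overhead.
    """
    if tau <= 0.0 or energy_j <= 0.0:
        return 0.0  # no slot or no energy means nothing can be offloaded
    tx_power = energy_j / (tau * T)   # constant transmit power (W)
    snr = tx_power * h / N0           # received signal-to-noise ratio
    return (B * tau * T / v_u) * math.log2(1.0 + snr)


def max_offload_rate(energy_j: float, h: float, tau: float, T: float,
                     B: float, v_u: float, N0: float) -> float:
    """Maximum computation rate (bit/s) averaged over the whole frame."""
    return max_offloadable_bits(energy_j, h, tau, T, B, v_u, N0) / T


if __name__ == "__main__":
    # Illustrative numbers only; none of them come from the paper.
    print(max_offload_rate(energy_j=1e-4, h=1e-6, tau=0.3, T=1.0,
                           B=2e6, v_u=1.1, N0=1e-10))
```

Passing the harvested energy as an input keeps the sketch independent of any particular energy-harvesting model.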

1.4 Problem formulation

Among all the input parameters, only the channel gain $h_i$ varies with time, while the other parameters remain time-invariant. Let $x_i$ represent the action of the $i$th WD; specifically, $x_i = 0$ for local computation and $x_i = 1$ for edge computation. WDs may have different priorities, denoted by a weight factor $\omega_i$. Because the emphasis of the optimization varies from case to case, we denote the configurable weight of the energy term as $\rho$. Set $\tau = \{\tau_i \mid i = 1, \dots, N\}$ and $x = \{x_i \mid i = 1, \dots, N\}$. We have the computation rate objective $r(h, x, a, \tau)$, the energy consumption objective $e(h, x, a, \tau)$, and the overall optimization objective $Q(h, x, a, \tau)$ as follows:

(7)

(8)

$$Q(h, x, a, \tau) = r(h, x, a, \tau) + \rho\, e(h, x, a, \tau), \quad \text{s.t. } x_i \in \{0, 1\},\ i = 1, 2, \dots, N. \tag{9}$$
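The combined objective can be sketched as follows. Since only the form $Q = r + \rho e$ is fixed here, the sketch assumes that both $r$ and $e$ are $\omega_i$-weighted sums of per-WD rates and energies, and that $\rho$ is chosen negative so that energy consumption is penalized; both points are assumptions of this illustration, and `objective_q` is a hypothetical name.

```python
from typing import Sequence


def objective_q(rates: Sequence[float], energies: Sequence[float],
                weights: Sequence[float], rho: float) -> float:
    """Evaluate Q = r + rho * e for one time frame.

    Assumed forms (not taken from the paper's equations):
      r = sum_i w_i * rate_i     (weighted computation rate)
      e = sum_i w_i * energy_i   (weighted energy consumption)
    A negative rho trades computation rate against energy use.
    """
    r = sum(w * ri for w, ri in zip(weights, rates))
    e = sum(w * ei for w, ei in zip(weights, energies))
    return r + rho * e


if __name__ == "__main__":
    # Two WDs, the second with half the priority of the first.
    print(objective_q(rates=[2.0, 4.0], energies=[1.0, 3.0],
                      weights=[1.0, 0.5], rho=-0.5))  # prints 2.75
```

With $\rho = 0$ the objective reduces to the weighted computation rate alone, which corresponds to optimizing the rate without regard to energy.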

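Thus our goal is

$$\mathrm{P1:}\quad \max_{x,\, a,\, \tau}\ Q(h, x, a, \tau) \quad \text{s.t. } x_i \in \{0, 1\},\ i = 1, 2, \dots, N;\quad a \ge 0,\ \tau_i \ge 0,\ h_i \ge 0, \tag{10}$$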

where P1 is a mixed integer programming problem, which is difficult to solve. However, P1 can be separated into two sub-problems, P2 and P3.

P2: given $h$, find $x$.

P3: given $h$ and $x$, find $a^*$ and $\tau^*$.

$\pi: h \mapsto x^*$.

(11)
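The decomposition can be illustrated with an exhaustive baseline: enumerate every binary action vector $x$ (sub-problem P2) and, for each, call an inner routine that returns the best objective value over $a$ and $\tau$ (sub-problem P3). This is a sketch for small $N$ only, since the search space grows as $2^N$; it is not the learning-based algorithm of this paper. The callable `solve_p3` is a hypothetical placeholder supplied by the caller.

```python
from itertools import product
from typing import Callable, Optional, Tuple

Action = Tuple[int, ...]


def best_action_by_enumeration(
    n: int, solve_p3: Callable[[Action], float]
) -> Tuple[Optional[Action], float]:
    """Brute-force P2 by scoring all 2**n binary offloading decisions.

    solve_p3(x) is assumed to return the optimal Q for a fixed x,
    i.e. the value of P3 after optimizing a and tau.
    """
    best_x: Optional[Action] = None
    best_q = float("-inf")
    for x in product((0, 1), repeat=n):  # x_i = 0 local, x_i = 1 offload
        q = solve_p3(x)
        if q > best_q:
            best_x, best_q = x, q
    return best_x, best_q


if __name__ == "__main__":
    # Toy stand-in for P3: the score peaks at the decision (1, 0, 1).
    target = (1, 0, 1)
    toy_p3 = lambda x: -sum((xi - ti) ** 2 for xi, ti in zip(x, target))
    print(best_action_by_enumeration(3, toy_p3))  # ((1, 0, 1), 0)
```

The exponential cost of this loop is what motivates replacing the enumeration with a learned policy that maps $h$ directly to a small set of candidate actions.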

Set $M_0 = \{i \mid x_i = 0\}$ and $M_1 = \{i \mid x_i = 1\}$. Given $M_0$ and $M_1$, we have

(12)

Lemma 1 $Q(h, a, \tau)$ is a convex function.

Proof The Hessian of $-Q(h, a, \tau)$ is

$\nabla^2 Q(h, a, \tau) =$

(13)

where

(14)

The corresponding dual function is

(15)

and the dual problem is

(16)

Therefore, algorithms with low time complexity can be applied to solve for the parameters $v$, $a$, and $\tau$.

2 Online Offloading Algorithm

Fig. 2 Proposed online offloading algorithm

3 Performance Evaluation

In this section, we investigate the performance of our work in terms of the computation rate and the energy consumption.

3.1 Experimental setup

3.1.1 Parameter settings

3.1.2 Metrics

We use two metrics: the energy consumption (J) and the computation rate (bit/s). We also investigate the impact of the parameter $\rho$ on both the energy consumption and the computation rate, in order to study the energy-delay tradeoff.

3.1.3 Existing work for comparison

We compare the proposed online offloading algorithm (hereafter Our) with the latest work[12] (hereafter DROO). Since DROO did not consider the energy consumption when designing the offloading scheme, we compare only the computation rate of Our and DROO in the following.

3.2 Result analysis

It can be seen from Fig. 3 that the energy consumption drops significantly faster than the computation rate when a larger weight |ρ| is applied to the energy consumption term. Thus, it is reasonable to consider adding an energy consumption term to the optimization goal. Although the energy consumption significantly decreases, the computation rate is not much affected.

Fig. 3 Normalized average computation rate and energy consumption

Fig. 4 Energy consumption and computation rate over time frames 5 030 to 5 040: (a) impact of parameter $\rho$ on energy consumption; (b) impact of parameter $\rho$ on computation rate

4 Conclusions

In this paper, we consider a wireless powered MEC system consisting of one AP and multiple WDs that follow binary offloading, and jointly optimize the energy consumption and the computation delay. Specifically, we first formalize the offloading as an optimization problem, and then design an online computation offloading scheme based on deep reinforcement learning that maximizes the computation rate and minimizes the energy consumption of all WDs. Finally, we evaluate the proposed scheme, and extensive results show that it achieves a better tradeoff between energy consumption and computation delay, providing higher computation rates and lower energy consumption.