
Load Balancing Fat-Tree on Long-Lived Flows: Avoiding Congestion in a Data Center Network

2014-03-22

ZTE Communications, 2014, No. 2



Wen Gao, Xuyan Li, Boyang Zhou, and Chunming Wu

(College of Computer Science, Zhejiang University, Hangzhou 310027, China)

In a data center network (DCN), load balancing is required when servers transfer data on the same path. This is necessary to avoid congestion. Load balancing is challenged by dynamically changing transfer demands and complex routing control. Because of the distributed nature of a traditional network, previous research on load balancing has mostly focused on improving the performance of the local network; thus, the load has not been optimally balanced across the entire network. In this paper, we propose a novel dynamic load-balancing algorithm for fat tree. This algorithm avoids congestion to the greatest possible extent by searching for non-conflicting paths in a centralized way. We implement the algorithm in the popular software-defined networking architecture and evaluate its performance on the Mininet platform. The results show that our algorithm achieves higher bisection bandwidth than the traditional equal-cost multipath load-balancing algorithm and thus more effectively avoids congestion.

data center network; software-defined networking; load balancing; network management

1 Introduction

In a data center network (DCN), a large number of servers are connected together by high-speed links and switches [1]. A traditional DCN architecture typically has a multi-rooted tree topology. Usually, such an architecture has a two-tier or three-tier data-switching structure that can accommodate tens of thousands of servers. However, with the emergence of new services in recent years, the limitations of traditional tree-based DCNs have been exposed. These limitations include poor scalability, low link utilization, and resource slicing. To overcome these limitations, DCN architectures such as fat tree [2], PortLand [3], and BCube [4] have been proposed. Of these, fat tree is the simplest, easiest to deploy, and most widely used.

In the fat tree, routing tables are configured statically [2]. Although the algorithm for managing table configuration distributes flows evenly over multiple links, the network becomes unbalanced when forwarding traffic for an increasing number of applications. When the network becomes unbalanced, several flows compete for the same links while other links remain idle. Also, the traffic in a DCN comprises many small, transactional-type remote procedure call (RPC) flows and only a few long-lived flows, which require the majority of the bandwidth. If long-lived flows collide on some links, network performance is greatly reduced.

This problem is compounded by the unpredictability of data-forwarding demands because multiple DCN applications randomly request paths at runtime. Because of dynamically changing data flows, the routing in a DCN cannot be changed in time to avoid congestion. Current work on this problem shows that performance can be improved by spreading flows onto multiple paths at every node in a distributed way. However, this does not equate to optimal load balancing across the entire network. Software-defined networking (SDN) has a great advantage in that it provides a global network view and can optimize the network by controlling the data plane in a centralized way.

SDN [5] separates the control plane from the data plane of traditional routers. The data plane remains in the network devices and is responsible for forwarding packets at high speed. The control plane is moved to the SDN controller, which controls the underlying network.

Thus, the underlying network infrastructure is abstracted from applications, and network state and intelligence are logically centralized in the SDN controller. This provides researchers, enterprises, and carriers with unprecedented network programmability, automation, and control and enables them to build highly scalable, flexible networks that can rapidly adapt to changing business needs. In other words, we can control the network from a global perspective according to the information maintained by the SDN controller.

SDN creates new opportunities for balancing the load of a DCN. Because of the distributed nature of a traditional network, previous research on DCN load balancing has focused on each network node, and not enough consideration has been given to load balancing across the entire network. However, SDN enables centralized control, and the SDN controller can view several aspects of the entire network, e.g., topology, flows, and link utilization. This is an advantage when balancing the load across the DCN.

This paper describes an improved fat-tree architecture and a dynamic load-balancing algorithm (FTLB). The algorithm regularly collects flow information from edge switches and uses the network view stored in the SDN controller to compute new paths for large flows whose paths conflict with other flows' paths. Then, FTLB tells the switches to modify the relevant flow table entries to route the large flows through the new paths. Our goal is to better balance the load on the fat tree by allocating non-conflicting paths to long-lived flows and preventing these paths from colliding on the same physical links.

We evaluate FTLB by emulating a fat-tree DCN topology on Mininet. This topology comprises 128 hosts. We use three communication patterns to generate synthesized DCN traffic. The experiment results show that FTLB improves bisection bandwidth by up to 50% over the traditional equal-cost multipath (ECMP) approach.

The rest of the paper is organized as follows: In section 2, we describe the fat-tree architecture and an improved fat tree based on SDN; in section 3, we propose the FTLB algorithm for load balancing; in section 4, we evaluate the experimental results; in section 5, we discuss related work; and in section 6, we conclude the paper.

2 Architecture

Fat tree is a DCN architecture proposed by Al-Fares et al. in 2008 [2]. It is based on the Clos topology [6] and provides full bandwidth for communication between any two servers in a DCN. Fig. 1 shows a k-ary fat tree. The architecture comprises an edge switch layer, an aggregate switch layer, and a core switch layer. A k-ary fat tree contains k pods, each of which contains k/2 k-port edge switches and k/2 k-port aggregate switches (Fig. 1). Each edge switch uses k/2 ports to connect to k/2 servers, and the remaining k/2 ports connect to k/2 aggregate switches in the same pod. Similarly, k/2 ports of each aggregate switch connect to k/2 edge switches, and the remaining k/2 ports connect to core switches. Because the total number of aggregate switch uplink ports equals the total number of core switch ports, the number of k-port core switches is (k/2)^2. Each core switch has one port that connects to each pod; that is, the i-th port of each core switch connects to pod i.
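The component counts above can be sketched in a few lines. The helper below is illustrative only (not from the paper's implementation) and simply encodes the k-ary fat-tree arithmetic:

```python
# Sketch: component counts of a k-ary fat tree, as described in Section 2.
# The function name and return structure are illustrative assumptions.

def fat_tree_sizes(k: int) -> dict:
    """Return the number of pods, switches, and hosts in a k-ary fat tree."""
    assert k % 2 == 0, "k must be even"
    return {
        "pods": k,
        "edge_switches": k * (k // 2),       # k/2 edge switches per pod
        "aggregate_switches": k * (k // 2),  # k/2 aggregate switches per pod
        "core_switches": (k // 2) ** 2,      # matches aggregate uplink ports
        "hosts": (k ** 3) // 4,              # k/2 hosts per edge switch
    }

print(fat_tree_sizes(8))
```

For k = 8 this yields 8 pods, 32 edge switches, 32 aggregate switches, 16 core switches, and 128 hosts, matching the topology used in section 4.2.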

In the fat tree, the routing tables distribute flows evenly between aggregate and core switches so that the bisection bandwidth is maximized. To do this, the routing tables use a two-level match: the main routing table uses a prefix match (i.e., left-handed, /m prefix masks of the form 1^m 0^(32-m)), and the secondary routing table uses a suffix match (i.e., right-handed, /m suffix masks of the form 0^(32-m) 1^m). Each entry in the main routing table has a potential pointer to a secondary routing table of (suffix, port) entries. A prefix match terminates if it does not contain a pointer to a suffix table. If the prefix-matching search yields a non-terminating prefix, then the matching suffix in the secondary routing table is found and used. Prefix matching is used by edge and aggregate switches to route flows destined for the pod they belong to. Suffix matching is used by edge and aggregate switches to spread flows between aggregate and core switches according to the current switch ID and destination host ID. Core switches only use prefix matching to route flows to the corresponding aggregate switches in the destination pod.
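The two-level lookup described above can be sketched as follows. The table contents here are hypothetical examples for a small address space, and dotted-string matching stands in for real bitmask prefix/suffix matching:

```python
# Sketch of the two-level routing lookup: a prefix match that either
# terminates with a port or falls through to a suffix table. Table
# contents below are illustrative, not the paper's actual tables.

def two_level_lookup(dst_ip: str, prefix_table, suffix_tables):
    """prefix_table: list of (prefix, action); an action is either an
    output port or a key into suffix_tables (a non-terminating prefix)."""
    for prefix, action in prefix_table:
        if dst_ip.startswith(prefix):
            if action in suffix_tables:            # non-terminating prefix
                for suffix, port in suffix_tables[action]:
                    if dst_ip.endswith(suffix):
                        return port                # suffix match spreads flows
            else:
                return action                      # terminating prefix
    return None
```

For example, with a terminating entry `("10.2.0.", 1)` and a non-terminating entry `("10.", "spread")` whose suffix table maps `.2` to port 2 and `.3` to port 3, a destination in `10.2.0.0/24` exits on port 1, while other `10.x` destinations are spread by their host suffix.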

The advantage of the fat tree is that all the network devices are identical, and cheap commodity parts can be used for all the switches in the architecture. Furthermore, the fat tree is rearrangeably non-blocking, which means link utilization is relatively high. In addition, a k-ary fat tree can accommodate k^3/4 servers. If we use 48-port switches, the fat tree contains 48 pods, each of which contains 24 aggregate switches and 24 edge switches. The whole topology can then contain 27,648 servers, so the fat tree is relatively scalable.

Fat tree provides multiple paths for communication between any two servers. However, because of the limited number of core switches and the static routing tables, performance may deteriorate when several flows compete for the same link resources while some links remain idle. Therefore, we need to make full use of the multiple paths between any two servers and improve the routing algorithm in order to dynamically balance the network load.

▲Figure 1. Fat-tree architecture.

▲Figure 2. New fat-tree architecture based on SDN.

In existing DCNs, load-balancing methods tend to make use of local link information and are therefore only capable of partial load balancing, not overall network load balancing. However, the SDN controller has an overall view of the DCN, so we can combine DCN with SDN to create a fat-tree architecture based on SDN (Fig. 2). The architecture in Fig. 2 differs from the traditional fat-tree architecture in that it has a centralized SDN controller, called the OpenFlow controller (OF controller) [7], and OpenFlow switches are used as network devices. Each OpenFlow switch connects to the OpenFlow controller through a secure channel using transport layer security (TLS). The OpenFlow controller uses a standard interface provided by OpenFlow switches to update the flow tables on those switches and control the forwarding behavior of the underlying network. Furthermore, the OpenFlow controller can obtain statistics and other information, such as topology and port status, from the underlying network to form a view of the whole network.

3 FTLB Algorithm

Several studies have shown that Internet traffic is characterized by a few large, long-lived flows consuming most of the bandwidth, as well as many small, short-lived flows [8], [9]. The method of routing large, long-lived flows can significantly affect network performance and bisection bandwidth; therefore, we need to handle such flows in a special way. Large flows persist in a network for a long time and comprise a relatively large number of packets.

ECMP is currently used to take advantage of the multiple paths between any two servers in a fat-tree architecture. Each ECMP-enabled switch maintains a data structure called a flow mapping table that stores mappings of flows to paths. If a new flow arrives but cannot be found in the flow mapping table, it is forwarded along the path selected by hashing selected fields of the packet's headers modulo the number of paths. Thus, the load is split between multiple paths. In addition, the mapping of the new flow to the selected path is written to the flow mapping table, and subsequent packets belonging to this flow are forwarded along this selected path so that packets do not need to be reordered [10]. However, when using ECMP to select flow paths, several large, long-lived flows can collide on the same links, creating a bottleneck in the network and decreasing bisection bandwidth. Therefore, large flows need to be scheduled in order to avoid collisions and increase throughput.
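The ECMP selection just described can be sketched in a few lines. The hash function and flow key below are illustrative assumptions (real switches hash header fields in hardware), but the logic is the same: hash modulo the number of paths, then cache the mapping so later packets of the flow stay on one path:

```python
# Minimal sketch of ECMP path selection: hash selected header fields
# modulo the number of paths, and remember the mapping so subsequent
# packets of the same flow are not reordered.
import hashlib

flow_mapping_table = {}

def ecmp_select_path(src_ip, dst_ip, src_port, dst_port, num_paths):
    flow = (src_ip, dst_ip, src_port, dst_port)
    if flow not in flow_mapping_table:
        digest = hashlib.md5(repr(flow).encode()).hexdigest()
        flow_mapping_table[flow] = int(digest, 16) % num_paths
    return flow_mapping_table[flow]
```

Note that the selection is static: once two large flows hash onto the same path, ECMP never moves them, which is exactly the collision problem FTLB addresses.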

Fig. 3 shows a network with four large flows. Flows A and B use the same link between aggregate switch 1 and core switch 1 when forwarding packets to the core layer. Flows C and D are aggregated on core switch 3 and use the same link between core switch 3 and aggregate switch 8 when forwarding packets to the destination. In other words, flows A and B collide with each other on some links, and flows C and D collide with each other on some links as well. Assuming that the link bandwidth is 100 Mbps and there are collisions between flows, the rate of each of the four flows is capped at 50 Mbps, and the bisection bandwidth is also halved. If we schedule flow A to go through core switch 2 and flow D to go through core switch 4, all the flows will have full bandwidth, and the network bisection bandwidth will be improved.

The proposed FTLB algorithm has three basic steps: 1) the OF controller regularly detects large flows in the network; 2) the OF controller decides which large flows should be scheduled and computes new paths according to the network view stored in the OF controller; 3) the OF controller sends OFPFC_MODIFY messages to the corresponding switches to deploy the new paths.

There are two methods for detecting large flows: push and pull. With the former, whenever the counters in a flow entry reach a threshold, the switch sends an OpenFlow message to the controller to report a new large flow. With the latter, the OpenFlow controller periodically sends an OpenFlow message to edge switches to retrieve the counters of flow entries. If the value of a counter is greater than the threshold, the flow is marked as large.
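The pull method can be sketched as follows. Here `poll_counters` is a hypothetical stand-in for the OpenFlow statistics request to a switch, and the 10 MB threshold is an assumed value (the paper does not specify one):

```python
# Sketch of pull-based large-flow detection: the controller polls edge-
# switch flow counters and marks flows above a byte threshold as large.
# `poll_counters` and the threshold are illustrative assumptions.

LARGE_FLOW_THRESHOLD = 10 * 1024 * 1024  # 10 MB, assumed

def detect_large_flows(poll_counters, edge_switches):
    """poll_counters(switch) yields (flow_id, byte_count) pairs."""
    large = []
    for switch in edge_switches:
        for flow_id, byte_count in poll_counters(switch):
            if byte_count > LARGE_FLOW_THRESHOLD:
                large.append((switch, flow_id))
    return large
```

Only edge switches need to be polled, since every flow enters the network through its source edge switch.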

▲Figure 3. Different flows compete for the same link resource.

FTLB should compute new paths for large flows once they have been detected. The algorithm makes full use of the multiple paths between any two servers [11] and, to the greatest possible extent, allocates non-conflicting paths to different large flows. The specific method is as follows:

1) All flows are forwarded according to the fat-tree routing algorithm. If a flow does not match any entry in the flow table, it is forwarded to the controller, which computes a path for it. At the same time, new flow entries are inserted into the flow tables. FTLB only acts on large flows, and the paths of new flows are computed with the original routing algorithm.

2) When large flows have been detected, FTLB selects the flows that should be scheduled according to the overlap between the large flows' paths. If the original path of a large flow does not overlap the path of any other, it should not be scheduled. In this case, we only need to mark the links on the flow's path, which means these links have been allocated. For example, if we detect the four large flows in Fig. 3 at once, we should only schedule flows B and D; flows A and C do not need to be scheduled. This decreases the overhead of FTLB.

3) According to the fat-tree architecture, large flows whose source and destination are not in the same pod have k^2/4 possible paths. In addition, each such flow is forwarded by only one core switch. Therefore, to keep large flows from conflicting, we can search the k^2/4 different paths, one per core switch, for every large flow that should be scheduled. A path is allocated to a large flow if none of the links on the path have been allocated.

4) Large flows whose source and destination are in the same pod but do not connect to the same edge switch pass through aggregate switches instead of core switches. Therefore, we only need to search the paths corresponding to the k/2 aggregate switches. If there is a path whose links have not been allocated, it is allocated to the current large flow.

5) If FTLB cannot find an unallocated path for a new large flow, it again searches the corresponding path of every core switch or aggregate switch to find the path that has been allocated to the minimal number of flows.

6) When a large flow disappears, i.e., when the flow entry of the large flow's new path has timed out, the OpenFlow switch sends a flow-removed message to the controller, which then unmarks the links belonging to the large flow.
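Steps 2)–5) amount to a greedy search over candidate paths with per-link allocation marks. The sketch below is an illustrative rendering of that logic (data structures and names are assumptions, not the paper's code); each candidate path is a tuple of link identifiers, one candidate per core or aggregate switch:

```python
# Sketch of FTLB path allocation: prefer a path whose links are all
# unallocated (steps 3/4); otherwise fall back to the path shared by
# the fewest flows (step 5). Path data is illustrative.

def allocate_paths(large_flows, candidate_paths):
    """candidate_paths: flow_id -> list of paths (tuples of link ids)."""
    link_load = {}   # link id -> number of large flows allocated to it
    chosen = {}
    for flow in large_flows:
        paths = candidate_paths[flow]
        free = [p for p in paths
                if all(link_load.get(l, 0) == 0 for l in p)]
        best = free[0] if free else min(
            paths, key=lambda p: sum(link_load.get(l, 0) for l in p))
        chosen[flow] = best
        for l in best:                      # mark links as allocated
            link_load[l] = link_load.get(l, 0) + 1
    return chosen
```

When a flow-removed message arrives (step 6), the corresponding counts in `link_load` would be decremented, freeing those links for later flows.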

Deploying a new path is the last step of FTLB. The OpenFlow controller sends an OFPFC_MODIFY message to the relevant edge and aggregate switches on the new path in order to modify the corresponding flow entries. Packets of the large flows are then transported through these new paths.

FTLB computes new paths for large flows by searching the paths that correspond to aggregate or core switches. The time to decide whether a path meets the requirements is O(1) because a path contains a maximum of four links. For each large flow, the number of paths that FTLB needs to search is O(k^2) at worst and O(1) at best. Thus, for n large flows, the time taken by FTLB is O(nk^2) at worst.

4 Implementation and Evaluation

To determine the feasibility of the proposed FTLB algorithm, we use the Mininet platform to emulate a network with many large flows. Here, we introduce the experimental scenarios and analyze the results.

4.1 Experiment Platform

Building a real physical network for experimentation is not ideal because network devices, such as routers and switches, are expensive, and the platform cannot be easily reused. This is a serious waste of resources. Moreover, the control plane is integrated within the router, making it difficult to develop and test new algorithms. Also, such a platform is relatively small, so traffic on the platform is not realistic.

Mininet [12] is a process-virtualized network experimental platform based on Linux containers and proposed by Nick McKeown of Stanford University. Mininet allows us to build complex networks that compare favorably with real hardware environments and to experiment with new ideas based on OpenFlow. More importantly, the experimental code can be seamlessly migrated to a real network environment. Therefore, we use Mininet as our experiment platform to validate the FTLB algorithm.

4.2 Experiment Scenario

We use an 8-ary fat-tree architecture that contains 128 hosts, 32 edge switches, 32 aggregate switches, and 16 core switches. The switches in the fat tree are OpenFlow switches, and the network is controlled by an OF controller.

The performance metric is bisection bandwidth [13], which is the maximum transmission rate through a section when the network is divided into two halves with an equal number of nodes in each half. Bisection bandwidth gives an indication of overall network performance: the larger the bisection bandwidth, the better the network's transmission.

Because no DCN traces are available, we create a group of communication patterns [2]:

· staggered prob(pEdge, pPod). A host sends packets to 1) another host in the same subnet with probability pEdge; 2) another host in the same pod with probability pPod; and 3) any host not in the same pod with probability 1-pEdge-pPod.

· random. A host sends packets to any other host in the network with equal probability.

· stride(i). A host with index m sends packets to the host with index (i+m) mod n, where n is the total number of hosts in the network.
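The three patterns above can be sketched as a single destination-picking function. Host indexing here assumes hosts are numbered consecutively across edge switches (k/2 hosts per subnet, k^2/4 per pod); the function name and default probabilities are illustrative:

```python
# Sketch of the three synthetic communication patterns: stride(i),
# random, and staggered prob(pEdge, pPod). Host indexing assumes k/2
# hosts per subnet (edge switch) and k^2/4 hosts per pod.
import random

def pick_destination(m, k, pattern, p_edge=0.5, p_pod=0.3, stride_i=1):
    n = k ** 3 // 4                      # total hosts
    subnet, pod = k // 2, k * k // 4     # hosts per subnet / per pod
    if pattern == "stride":
        return (stride_i + m) % n
    if pattern == "random":
        return random.choice([h for h in range(n) if h != m])
    # staggered prob(p_edge, p_pod)
    r = random.random()
    if r < p_edge:                       # same subnet
        base = (m // subnet) * subnet
        return random.choice(
            [h for h in range(base, base + subnet) if h != m])
    if r < p_edge + p_pod:               # same pod, different subnet
        base = (m // pod) * pod
        return random.choice(
            [h for h in range(base, base + pod)
             if h // subnet != m // subnet])
    return random.choice(                # different pod
        [h for h in range(n) if h // pod != m // pod])
```

Each host would call this once per generated flow to choose its destination.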

In a DCN, traffic comprises many small, transactional-type RPC flows, e.g., search results, and a few long-lived flows, e.g., backups and MapReduce tasks. To simulate DCN traffic, every host generates new flows of different lengths at Poisson-distributed start times, and the flow length is exponentially distributed. The send rate of every host is updated continuously according to transmission control protocol (TCP) slow start and additive increase multiplicative decrease (AIMD). In the slow-start stage, the send rate is doubled every tick. When the send rate reaches its threshold, it is halved, and the flow moves to the congestion-avoidance stage, where the send rate is increased additively.
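The per-tick rate update just described can be sketched as follows. The function shape, units, and additive increment are assumptions made for illustration:

```python
# Sketch of the per-host send-rate update: double during slow start,
# halve on reaching the threshold, then increase additively in the
# congestion-avoidance stage. Constants are illustrative.

def update_rate(rate, threshold, slow_start, increment=1.0):
    """Advance one tick; return (new_rate, still_in_slow_start)."""
    if slow_start:
        rate *= 2
        if rate >= threshold:
            return rate / 2, False     # enter congestion avoidance
        return rate, True
    return rate + increment, False     # additive increase
```

Iterating this function per tick reproduces the familiar sawtooth shape of TCP's sending rate in the emulation.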

4.3 Evaluation

In our experiment, every host constantly measures its incoming bandwidth. Each experiment lasts 60 s, and we measure the average bisection bandwidth over the middle 40 s. We performed each experiment three times for the staggered prob, random, and stride communication patterns, and took the average as the result. Figs. 4, 5, and 6 show the performance of the ECMP and FTLB algorithms under the three communication patterns.

Fig. 4 shows that FTLB performs better than ECMP. When staggered prob is (0.8, 0.2), the two algorithms perform similarly because only 20% of flows, whose source and destination are in the same pod but not in the same subnet, need to be scheduled. However, when the number of flows sent to the same subnet decreases and the number of flows that need to be scheduled increases, ECMP performance declines sharply. FTLB is therefore very effective for scheduling large flows.

▲Figure 4. Average bisection bandwidth using the staggered prob communication pattern.

▲Figure 5. Average bisection bandwidth using the random communication pattern.

▲Figure 6. Average bisection bandwidth using the stride communication pattern.

In Fig. 5, the difference in performance between ECMP and FTLB is relatively large because flows whose source and destination are not in the same subnet are in the majority. The first three groups of data are generated using a completely random communication pattern, whereas the last three groups are generated using a bijective random communication pattern, in which pairs of hosts send packets to each other. If the two flows moving in opposite directions between two hosts are both large flows, they can be scheduled onto the same path using FTLB. Thus, link resources can be used more efficiently. Therefore, the performance of FTLB using the bijective random communication pattern is slightly better than that using the completely random communication pattern.

As the stride parameter increases, the number of flows that should be scheduled also increases, and FTLB becomes more efficient than ECMP (Fig. 6).

From the experimental results, we see that FTLB effectively schedules large flows onto different paths to avoid network congestion, whereas ECMP cannot schedule flows as dynamically because it is static and has only local information. Therefore, FTLB provides higher average bisection bandwidth than ECMP.

5 Related Work

The several DCN architectures that have been proposed can be classified into switch-centric architectures and server-centric architectures [14]. In the former, switches are used to interconnect the network of servers and hold the routing intelligence. Such architectures include fat tree [2], Monsoon [15], and PortLand [3]. In the latter, servers with multiple network interface card (NIC) ports hold the routing intelligence. Such architectures include BCube [4], DCell [1], and MDCube [16]. However, these switch-centric and server-centric approaches lack a means of universal control for avoiding link congestion.

Researchers have also explored load-balancing algorithms for fat tree (e.g., packet-based load balancing and ECMP) in a multipath environment. However, these algorithms are implemented on every fat-tree node and have only local features. Thus, they cannot provide universal load balancing for the fat tree. Packet-based load-balancing algorithms can provide improved bisection bandwidth by using round robin or deficit round robin; however, this causes packets to be reordered. ECMP selects a flow path according to the hash of several packet header fields, but the selected path cannot be changed, and several flows can be hashed onto the same path.

Our dynamic traffic-aware algorithm improves the performance of a fat-tree architecture by taking into account the characteristics of DCN flows and using the universal network view stored in the controller.

6 Conclusion

In this paper, we have proposed a novel FTLB algorithm for fat tree. This algorithm schedules large flows in a DCN by applying SDN to the fat-tree architecture. A universal view of active flows and network resources is stored in the SDN controller. By using this information, FTLB is more effective than a static load-balancing algorithm. With FTLB, network resources can be more efficiently utilized, and the performance of the fat tree can be improved. We decrease the overhead of our algorithm by limiting the flows that should be scheduled to large flows, which send the most bytes across the network. Through experimentation, we observe that FTLB consistently outperforms ECMP and provides better bisection bandwidth.

[1] C. Guo, H. Wu, K. Tan, L. Shi, Y. Zhang, and S. Lu, "DCell: a scalable and fault-tolerant network structure for data centers," in Proc. ACM SIGCOMM, Seattle, USA, Aug. 2008, pp. 75-86. doi: 10.1145/1402958.1402968.

[2] M. Al-Fares, A. Loukissas, and A. Vahdat, "A scalable, commodity data center network architecture," in Proc. ACM SIGCOMM, Seattle, USA, Aug. 2008, pp. 63-74. doi: 10.1145/1402958.1402967.

[3] R. N. Mysore, A. Pamboris, N. Farrington, N. Huang, P. Miri, S. Radhakrishnan, V. Subramanya, and A. Vahdat, "PortLand: a scalable fault-tolerant layer 2 data center network fabric," in Proc. ACM SIGCOMM, Barcelona, Spain, Aug. 2009, pp. 39-50. doi: 10.1145/1592568.1592575.

[4] C. Guo, G. Lu, D. Li, H. Wu, X. Zhang, Y. Shi, C. Tian, Y. Zhang, and S. Lu, "BCube: a high performance, server centric network architecture for modular data centers," in Proc. ACM SIGCOMM, Barcelona, Spain, Aug. 2009, pp. 36-74. doi: 10.1145/1592568.1592577.

[5] ONF. (2012, Apr. 13). Software-Defined Networking: The New Norm for Networks [Online]. Available: https://www.opennetworking.org/images/stories/downloads/sdn-resources/white-papers/wp-sdn-newnorm.pdf

[6] C. Clos, "A study of non-blocking switching networks," The Bell System Technical Journal, vol. 32, no. 2, pp. 406-424, 1953.

[7] N. McKeown, T. Anderson, H. Balakrishnan, G. Parulkar, L. Peterson, J. Rexford, S. Shenker, and J. Turner, "OpenFlow: enabling innovation in campus networks," ACM SIGCOMM Computer Communication Review, vol. 38, no. 2, pp. 69-74, Apr. 2008. doi: 10.1145/1355734.1355746.

[8] A. B. Downey, "Evidence for long-tailed distributions in the internet," in Proc. ACM SIGCOMM Internet Measurement Workshop, San Francisco, USA, Nov. 2001, pp. 229-241. doi: 10.1145/505202.505230.

[9] S. Ben Fred, T. Bonald, A. Proutiere, G. Régnié, and J. W. Roberts, "Statistical bandwidth sharing: a study of congestion at flow level," in Proc. ACM SIGCOMM, San Diego, USA, Aug. 2001, pp. 111-122. doi: 10.1145/383059.383068.

[10] Ka-Cheong Leung, V. O. K. Li, and Daiqin Yang, "An overview of packet reordering in transmission control protocol (TCP): problems, solutions, and challenges," IEEE Transactions on Parallel and Distributed Systems, vol. 18, no. 4, pp. 522-535, Apr. 2007. doi: 10.1109/TPDS.2007.1011.

[11] F. P. Tso, G. Hamilton, R. Weber, C. S. Perkins, and D. P. Pezaros, "Longer is better: exploiting path diversity in data center networks," in IEEE 33rd International Conference on Distributed Computing Systems, Philadelphia, USA, Jul. 2013, pp. 430-439. doi: 10.1109/ICDCS.2013.36.

[12] B. Lantz, B. Heller, and N. McKeown, "A network in a laptop: rapid prototyping for software-defined networks," in Proc. ACM HotNets-IX, Monterey, USA, Oct. 2010. doi: 10.1145/1868447.1868466.

[13] T. Hoefler, T. Schneider, and A. Lumsdaine, "Multistage switches are not crossbars: effects of static routing in high-performance networks," in IEEE International Conference on Cluster Computing, Tsukuba, Japan, 2008, pp. 116-125. doi: 10.1109/CLUSTR.2008.4663762.

[14] Y. Zhang and N. Ansari, "On architecture design, congestion notification, TCP incast and power consumption in data centers," IEEE Communications Surveys & Tutorials, vol. 15, no. 1, pp. 39-64, Feb. 2013. doi: 10.1109/SURV.2011.122211.00017.

[15] A. Greenberg, P. Lahiri, D. A. Maltz, P. Patel, and S. Sengupta, "Towards a next generation data center architecture: scalability and commoditization," in Proc. ACM PRESTO '08, Seattle, USA, pp. 57-62. doi: 10.1145/1397718.1397732.

[16] H. Wu, G. Lu, D. Li, C. Guo, and Y. Zhang, "MDCube: a high performance network structure for modular data center interconnection," in Proc. Co-NEXT '09, Rome, Italy, pp. 25-36. doi: 10.1145/1658939.1658943.

Manuscript received: 2014-03-03

Biographies

Wen Gao (gavingao@zju.edu.cn) is a PhD candidate at the New Generation Network Technology Laboratory at Zhejiang University. His current research interests include network management, software-defined networks, and reconfigurable flexible networks.

Xuyan Li (lixuyanzju@163.com) received her BS degree from Zhejiang University in 2014. Her research interests include software-defined networking and data center networking.

Boyang Zhou (zby@zju.edu.cn) is a PhD candidate at the New Generation Network Technology Laboratory at Zhejiang University. His current research interests include future Internet architecture, software-defined networks, and reconfigurable flexible networks.

Chunming Wu (wuchunming@zju.edu.cn) is a professor in the College of Computer Science at Zhejiang University. He is the director of the New Generation Network Technology Laboratory at that university. His research interests include flexible reconfigurable networks, software-defined networks, network and service testbeds, and innovative security technology for active defense networks.

This work is supported by the National Basic Research Program of China (973 Program) (2012CB315903), the Key Science and Technology Innovation Team Project of Zhejiang Province (2011R50010-05), the National Science and Technology Support Program (2014BAH24F01), the 863 Program of China (2012AA01A507), and the National Natural Science Foundation of China (61379118 and 61103200). This work is sponsored by the Research Fund of ZTE Corporation.