APP下载

Hybrid Content Distribution Framework for Large?Scale Vehicular Ad Hoc Networks

2016-10-13HEJianpingandCAILin

ZTE Communications 2016年3期

HE Jianping and CAI Lin

Abstract Content distribution in large?scale vehicular ad hoc networks is difficult due to the scalability issue. A message may need to be carried by several vehicles till it reaches the destination. To select an appropriate next?hop carrier, the current carrier should exchange control messages with a large number of vehicles encountered, and thus the pure ad hoc solution is not scalable. In this paper, we introduce a hybrid?network solution. We first divide the area into regions, and select a hot spot in each region to install a road?side unit (RSU). RSUs can coordinate message exchanges between vehicles, and storage devices are used to temporarily hold a message waiting for the next?hop carrier. The RSUs and the vehicles traveling between them construct an overlay store?carry?and?forward content distribution network. Two types of vehicles exist, one with fixed mobility patterns such as buses, and the other with random patterns such as taxis. Considering one or both types of vehicles, utility?based optimization problems can be formulated to find the optimal routing solutions. Using the bus and taxi traces of Shanghai city, we demonstrate the effectiveness of the hybrid framework in terms of delivery delay, delivery ratio and overhead ratio.

Keywords content distribution; VANETs; utility?based routing; store?carry?and?forward

1 Introduction

Connected vehicles will exceed 400 millions in 2030, according to ABIs forecast on smart cars. Both Google and Apple have developed operating systems for automobiles, and major vehicle manufacturers are developing the vehicles that are readily connected to provide a wide spectrum of services to people on move. In addition to the navigation safety applications, the new driven forces for vehicular ad hoc networks (VANETs) are the media?rich entertainment and location?sensitive information, e.g., video clips of local tourism information, or promotions from hotels, attractions, restaurants and stores can be geocasted to nearby vehicles or to those close to an airport, ferry, or train and bus stations.

There are many existing works considering real?time path planning [1], differentiated reliable routing [2] or infrastructure based routing [3] to solve content distribution problems in VANETs. However, given the high volume and location?sensitivity of the information, the existing infrastructure?based and ad hoc solutions encounter tremendous challenges [4]-[8]. For example, the cellular systems can provide seamless connectivity, while the high spectrum cost makes it very expensive to distribute large?volume content to all vehicles near certain locations. A mobile user can download content from license?free Wi?Fi access points or road?side units (RSUs), but Wi?Fi access points have a limited coverage and capacity. These pure infrastructure?based solutions may not be scalable when the density of connected vehicles becomes large. On the other hand, in VANET, we can rely on vehicle?to?vehicle communications for sharing the information in a peer?to?peer fashion, which is particularly viable for location?sensitive multimedia applications. The pure ad hoc, vehicle?to?vehicle solutions, however, also encounter the scalability issue when the data needs to be delivered in a large?scale VANET with a high node (vehicle) density. This is because each message may need to be delivered over multiple hops to reach the destination, while each message carrier (vehicle) needs to exchange control information with a large number of encountered vehicles to select a good candidate as the next?hop carrier.

To enable low?cost, media?rich contention distributions in a large?scale VANET, this paper introduces a scalable hybrid content distribution framework, combining the vehicle?to?RSU and vehicle?to?vehicle communications. In a nut?shell, we establish an overlay communication network by dividing the whole area into a number of regions, selecting a hot spot in each region to install an RSU for coordinating the vehicle?to?vehicle data exchange, and using a drop box to store messages temporarily for the next suitable vehicle if necessary. The RSU together with the drop?box can be viewed as a router that can store and forward messages for the overlay network. The coverage area of each RSU is called a cluster. A message is carried and delivered by vehicles traveling between clusters, which can be viewed as a link. Using the vehicle traces in Shanghai city, we demonstrate by trace?driven simulation that this hybrid framework is efficient and scalable for content distributions in large?scale VANETs.

2 Network Architecture

We assume that participating vehicles are equipped with on?board units (OBU) and can communicate with each other or with RSUs. The near?future vehicle traveling destination and path are assumed known by the OBU, and they are not affected by the messages the vehicle carries, so the message delivery is transparent to the driver, without any additional vehicle traveling cost. Fig. 1 shows a sample VANET. A message needs to be carried by vehicles from the source to a target region, and then be geocasted there. How to geocast in a small region has been heavily investigated [9], [10]. In this paper, we focus on how to construct the overlay network and find the suitable path and vehicles to carry the message to the target region.

When the traveling path of a vehicle deviates from the direction to the destination of the message it carries, the message should be forwarded to a more appropriate vehicle during the contact time of the two vehicles (i.e., when the two vehicles are within each others transmission range); or the message should be forwarded to a drop box for temporary storage, and then wait for an appropriate vehicle as the next?hop carrier. Note that for large?scale VANETs, a long distance between the source and the destination is possible. Therefore, multiple vehicles may be needed to carry the message, which forms a multi?hop path, and then a large number of vehicles may encounter the message carrier(s). Exchanging control messages with all these vehicles will result in a high overhead.

To minimize the control message overhead and optimize the routing, our approach includes six steps (Fig. 2). First, the whole area covered by the VANET is divided into regions, and we select a hotspot in each region to install an RSU. The coverage of the RSU is named a cluster. Then, the real vehicular traffic data trace is used to obtain the statistics and mobility patterns of the vehicles, including the arrival rate of different types of vehicles traveling between two clusters, their average travel time, etc. Next, using the statistics obtained, we construct a weighted graph to model the overlay network. These three steps are the offline preparation for VANET operation.

The next three steps are to make a routing decision for each message. We consider source routing here, and the solution can be extended for other routing protocols. The fourth step is using the shortest?path algorithm to identify a few suitable paths as the routing candidates. Next, depending on the quality of service (QoS) requirements of the message and the types of vehicles used, a utility?based optimal routing problem can be formulated and then solved by designing the routing algorithm.

3 Regional Clustering

Given a large?scale VANET, there are a large number of road segments and the possible number of paths between two locations increases exponentially with regard to the number of road segments. For instance, considering a two?dimensional [k×k] Manhattan street map, the number of paths between the two diagonal street intersections equals [2kk].

To simplify the problem, we first divide the area into regions. Within a region, we select a hot?spot location (with high density of vehicles), such as a major bus exchange, shopping center, airport, or central business district, to install an RSU which is the center of a cluster. When a vehicle enters a cluster, it will send a registration message to the RSU, reporting its location, destination, and messages that need to be transferred to other vehicles. Based on such information, the RSU can select a suitable next?hop carrier that has sufficient contact time with the current carrier. If no such suitable next?hop carrier is available, the RSU will advise the current carrier to send the message to a drop box to wait for a future carrier. The drop box can be a storage device located in a vehicle currently parked in the cluster, or installed in the RSU. Note that the drop box does typically not need internet access.

Given the clusters, we can simplify the network topology by ignoring the physical roads, and focus on the connectivity of clusters by traveling vehicles. In the overlay network, a cluster (including the RSU and the drop boxes) is viewed as a node, and a link exists between two nodes if there are vehicles traveling between them.

Various approaches can be used for clustering. We apply the k?means clustering algorithm [12], [13] to generate regions with similar sizes. k is the number of regions (clusters). Traditionally, the clustering algorithm is based on the Euclidean distance. With the real?world road structures and traffic speeds, the clustering algorithm can consider the actual reachability between any two clusters using the travel distance of two locations instead. Travel distances can be obtained through online map services, e.g., Google maps, or using the historical measurement data. Using the map of Shanghai city as an example, we obtain 40 regions/clusters by setting[ k=40] (Fig. 3). Comparing it to the city map, it is not difficult to know the location corresponding to each cluster, e.g., cluster 18 is the Hongqiao airport and cluster 37 is the Pudong airport.

4 Trace Processing for Link Properties

With the simplified cluster?based map, we need to obtain the parameters related to the links between clusters from measurements.

VANET measurement results are typically the Global Positioning System (GPS) location traces of vehicles over certain time period. By analyzing the traces, we can understand the geographic distribution of the city traffic (macroscopic mobility) and individual vehicle mobility patterns (microscopic mobility). From the trace processing, we can obtain the statistics of the vehicles that can be used to describe the link properties between clusters, e.g., the travel time between two clusters can describe the propagation delay of the link, and the number of traveling vehicles per time unit can describe the link capacity.

Further, there are different types of vehicles with different mobility models and travel time, which makes the link properties of the overlay VANET more complicated.

Buses and taxis represent two typical types of vehicles with predetermined and random mobility patterns, respectively. The movement of a bus is relatively predicable, and there exists a bus moving along a fixed route during a certain period of time. Buses however have a limited spatial and temporal coverage. The movement of a taxi is less predictable with a much larger spatial and temporal coverage. Intuitively, three factors may impact the mobility of a taxi: customer demands, drivers driving habits, and real?time traffic conditions. For an occupied taxi, an experienced driver usually selects the path with the lowest anticipated delay to reach the destination, considering the anticipated traffic congestion; for an unoccupied one, the mobility depends on the taxi drivers driving habits and “hunting” preferences.

Three parameters are important to describe a link of the overlay network: the travel time between two clusters, the inter?arrival time, and the arrival rate for both buses and taxis traveling between the clusters. The inter?arrival time and the arrival rate are inter?changeable.

Next, we use the real vehicle GPS traces, including both buses and taxis, collected in the city of Shanghai, China, as an example to obtain the statistics and models of the above parameters. The traces are partially available in [14]. We use the data of 2299 taxis from January 31, 2007 to February 27, 2007 and of 2500 buses of 103 routes from February 24, 2007 to March 27, 2007. Each bus recorded its GPS location every minute while each taxi recorded it every 15 seconds if there was no customer on board, and every minute if with customer(s). Each record includes the vehicle ID, the latitude and longitude of the current location, a timestamp, the vehicle moving speed, the heading direction, and for taxi, whether it was hired by customers or not.

From the traces of buses, the values of travel time, inter?arrival time, and arrival rate between two clusters are relatively stable with small variations, so the buses are modeled as constants for simplicity. For the taxis, the travel time during a certain daily period can be simplified as a constant, but the inter?arrival time and instant arrival rate have large variations and they should be modeled by two random variables. It is found that the distribution of taxi inter?arrival time is close to an exponential distribution (Fig. 4), where the simulation results were derived from 2000 runs of Monte Carlo simulation. In other words, taxis traveling between two clusters follow a Poisson process with the average arrival rate equal to the inverse of the average inter?arrival time.

Considering the daily traffic volume variations, we only count the vehicles from 8:00 am to 4:00 pm each day from the traces. During this daily period, 5467 taxis traveled from Cluster 17 to Cluster 2, so the average taxi arrival rate is 5467/(30×8×60) per minute. We further obtain the histograms of taxi inter?arrival time between neighboring clusters. As shown in Fig. 4, the exponential inter?arrival time model (Poisson process) matches well with the histogram from the traces. Other inter?arrival time distributions show a similar tendency and are omitted.

5 Graph Construction and Prescreen Paths

From the trace analysis, we obtain the link parameters and then the VANET can be modeled as a weighted directed graph, denoted by [G = (V, E , W)], where [V] is the set of nodes, [E] is the set of edges, and [W] is the weight matrix. A node denotes a cluster. A directed edge [∈E] denotes a link from node i to node j, which means that a message can be delivered by a vehicle (bus or taxi) from node [i] to node[ j]. Let [Ni = {j| ∈E}] be the neighbor set of node [i]. The weight of a link is associated with the link cost, and the weight depends on the application QoS requirements, the message storage and transmission costs. For example, if the application is concerned on delay, the link cost is the average link delay between two nodes, including the average waiting time for the carrier and the average travel time of the carrier.

Considering the Hongqiao airport and Pudong airport as the source and destination, we build a simplified weighted graph as shown in Fig. 5a, where the clustering results in Fig. 3 are used. The weight of each edge (link) is the link cost in terms of average link delay.

Even with clustering, there exist many possible paths between two nodes in a large?scale VANET. To speed up the optimal route selection process, many paths can be eliminated first. The prescreen process can be done using the generalized Dijkstras shortest?path algorithm to select [K] shortest paths (in terms of smallest sum of the link weights along the paths). [K] is typically a small integer, and is determined by the criteria that the sum of the weights of the [K]?th shortest path is much smaller than that of the [(K+1)]th path. Considering the Hongqiao to Pudong airport case shown in Fig. 5a, we have [K = 3] and the paths selected by the prescreen process are shown in Fig. 5b.

We do not use the shortest path algorithm to select a single shortest path as the optimal routing decision. This is because, in our problem, the deterministic bus arrival process and the random taxi arrival process make the delay component along a path dependent [15]. Consequently, the shortest path algorithm may fail to find the optimal solution since the routing problem does not satisfy Bellmans Principle of Optimality. Nevertheless, even with the dependence, the optimal path is likely being one of the paths with relatively low sum?weights. Thus, we use the shortest path algorithm to do one prescreen. Then, we derive the solution to identify the optimal path within this set of K candidates.

6 Utility?Based Problem Formulation and

Optimal Routing

Utility?based optimization has been widely used for resource allocation and routing [16], [17]. A utility?based optimization problem can be formulated to select the best per?hop forwarding strategy and the optimal path. If only one type of vehicles, e.g., buses or taxis, are considered, and the link costs (e.g., delay components) in a path are independent and additive, we can formulate a simplified utility maximization problem as follows.

[maxpk, k=1,2,…,Ki=1|pk|Ui] (1)

where [pk] is a path from the source node to the destination node, and [K] is the total number of paths in the simplified weighted graph, [|pk |] is the number of hops (links) of path [pk] , and [Ui] is the utility of the link starting at node i.

The utility function is generally convex and twice differentiable over their variables. For different scenarios, the utility function may have a different form. For example, if the application prefers faster delivery without worrying about the cost, the utility function can be a monotonic decreasing function of the end?to?end delay. The utility function can be monotonically decreased with regard to deliver delay and other cost.

On the other hand, when both bus and taxi are considered, each node needs to decide the best strategy in terms of which type of vehicles has a preference and how long the message should wait for a vehicle of the preferred type. Besides, the costs in consecutive links of a path become dependent [15]. In this case, a coupled utility maximization problem is formulated to solve the data forwarding problem:

[maxpk, k=1,2,…,K maxsi∈θi, i∈pkU(s1,s2,…,spk)] (2)

where si is the strategy used by node i selected from the strategy set Θi, and U (s1, s2, ..., s|pk |) is the coupled utility function which is a multi?variables function and the strategy variables of all nodes in the path are considered.

Considering a simple and reasonable assumption that the travel time of a taxi is smaller than that of a bus, and given the taxi arrivals follow a Poisson process, a desirable node strategy is to wait for a taxi if the average inter?arrival time of taxis is smaller than the difference of the link travel time of a bus and that of a taxi; otherwise, the node can select the first vehicle encountered, whichever the type is, to carry the message [15]. In short, by solving the stationary point of the utility function, we can obtain the best strategy of each node, and then the optimal path is obtained by comparing the expected utilities of different paths under the optimal node strategies.

7 Performance Evaluations

Simulation results are presented to demonstrate the effectiveness of the above framework, using the real traces of taxis and buses in Shanghai city. Both the taxis and buses are considered as the possible carriers, and we thus consider Problem (2) to find the optimal solution. The result using the proposed framework is named “Proposed”, and the work in [11] is the state?of?the?art purely ad hoc solution for comparison. We conducted 2000 runs of Monte Carlo simulation for each routing algorithm.

First, the utility function is considered as a monotonically decreasing function of the average deliver delay, i.e., the smaller deliver delay, the higher the utility is. In this case, we apply the theoretical analysis in [15] to obtain the optimal solution.

In the simulation, the traffic arrival rates between two clusters are set according to the traces. Then, we repeat the simulation 104 runs with random seeds to obtain the statistics. We set the average travel time of each link to 15 and 30 minutes for taxis and buses, respectively, and the bus interval is 15 minutes. Fig. 6 shows the cumulative distribution function (CDF) of the path delay for the three paths shown in Fig. 5b. It is clear that Path 1 (the upper path) is the best one, which is also the optimal solution according to the theoretical analysis in [15].

We then compare the proposed solution with the one in [11] in terms of average delay, successful delivery ratio and overhead ratio. The overhead ratio is defined as the average number of control messages exchanged between vehicles and RSUs for a single message being delivered (excluding the registration messages sent from vehicles to the RSUs). The comparison results are given in Table 1, and the proposed approach achieves a much better performance with a much lower per?message overhead than the previous solution in [11]. The additional cost of the proposed solution is the deployment of RSUs at the selected hot?spots, and the registration message overhead when vehicles enter a cluster. Since we only need to deploy a relatively small number of RSUs to cover a large?size city, the introduced cost and overhead are low compared to the above gains.

Further, in addition to the delay, other costs can also be considered in the utility function. For instance, we model the utility function as follows

[U=-10logd2×db-c] (3)

where d is the average deliver delay, [db](= 180 minutes) is the minimum deliver delay if using buses only, and c is the cost proportional to the hop number of the path (since the message may be exchange and may occupy the drop box in each hop). Here, we set c equal to 0.4 times the hop count as an example.

The CDF of the anticipated utility for the three paths are shown in Fig. 7. It is observed that when the cost is considered, the best path is Path 2 instead. In summary, the proposed framework is still effective to find the optimal path with different utility functions.

8 Conclusions and Open Issues

We have investigated the content distribution problem in large?scale VANETs. A hybrid?network framework is proposed to solve the problem. In this framework, an overlay store?carry?and?forward content distribution network is constructed to model a large?scale VANET. We formulate utility?based optimization problems to find the optimal routing solutions by considering different mobility patterns of vehicles. Trace?driven simulations demonstrate that the proposed hybrid framework is efficient in terms of delivery delay, delivery ratio and overhead ratio, and scalable for content distributions in large?scale VANETs.

Given the hybrid?network framework for content distribution in large?scale VANETs, there are many open issues worth further investigation. First, it is possible to combine several wireless technologies together and optimize the data delivery over them jointly. Second, the obtained optimal solution for data forwarding under the proposed framework is for source routing protocols. How to design a hop?by?hop online routing protocol in the hybrid network is an open problem. For each node, it can dynamically select the next?hop link and carrier based on some real?time information. For instance, at a node, finding a vehicle moving directly to a cluster very close to the destination may occur with very low probability, so this path is not considered in source routing. However, if this situation does occur, the hop?by?hop online routing solution may use this rare opportunity to further improve the performance. Third, in this paper, it is assumed that the message handover only happens at hot spots with limited coverage. How to use multi?hop transmissions to enlarge the coverage of each hot spot for further improving the performance deserves further investigation. Finally, in addition to VANET, there are other applications that can take the advantage of our framework, such as mobile social networks where both the deterministic and random mobility patterns exist and multi?modal travel planning.

References

[1] M. Wang, H. Shan, R. Lu, et al., “Real?time path planning based on hybrid?VANET?enhanced transportation system,” IEEE Transactions on Vehicular Technology, vol. 64, no. 5, pp.1664-1678, 2015. doi: 10.1109/TVT.2014.2335201.

[2] R. He, H. Rutagemwa, and X. Shen, “Differentiated reliable routing in hybrid vehicular ad?hoc networks,” in IEEE International Conference on Communications, Beijing, China, May, 2008, pp. 2353-2358.

[3] K. Mershad and H. Artail, “We can deliver to far vehicles,” IEEE Transactions on Intelligent Transportation Systems, vol. 13, no. 3, pp.1099-1115, 2012. doi: 10.1109/TITS.2012.2183124.

[4] Y. Zhuang, J. Pan, Y. Luo, and L. Cai, “Time and location?critical emergency message dissemination for vehicular ad?hoc networks,” IEEE Journal on Selected Areas in Communications, Special Issue on Vehicular and Communications and Networks, vol. 29, no. 1, pp.187-196, 2011.

[5] H. Zhou, B. Liu, T. H. Luan, et al., “ChainCluster: engineering a cooperative content distribution framework for highway vehicular communications,” IEEE Transactions on Intelligent Transportation Systems, vol. 15, no. 6, pp.2644-2657, 2014. doi: 10.1109/TITS.2014.2321293.

[6] H. Zhou, B. Liu, F. Hou, et al., “Spatial coordinated medium sharing: optimal access control management in drive?thru internet,” IEEE Transactions on Intelligent Transportation Systems, vol. 16, no. 5, pp. 2673-2686, 2015. doi: 10.1109/TITS.2015.2416257.

[7] K. Abboud and W. Zhuang, “Impact of microscopic vehicle mobility on cluster?based routing overhead in VANETs,” IEEE Transactions in Vehicular Technology, vol. 64, no. 12, pp. 5493-5502, 2015. doi: 10.1109/TVT.2015.2482904.

[8] H. Zhu, S. Chang, M. Li, S. Naik, and X. Shen, “Exploiting temporal dependency for opportunistic forwarding in urban vehicular networks,” in IEEE International Conference on Computer Communications, Shanghai, China, 2011, pp. 2192-2200.

[9] A. Bachir and A Benslimane, “A multicast protocol in ad hoc networks inter?vehicle geocast,” in IEEE Vehicular Technology Conference, Orlando, USA, 2003, pp. 2456-2460. doi: 10.1109/VETECS.2003.1208832.

[10] F. Li and Y. Wang, “Routing in vehicular ad hoc networks: a survey,” IEEE Vehicular Technology Magazine, vol. 2, no. 2, pp.12-22, 2007.

[11] L. Zhang, B. Yu, and J. Pan, “GeoMob: A mobility?aware geocast scheme in metropolitans via taxicabs and buses,” in IEEE International Conference on Computer Communications, Toronto, Canada, 2014, pp. 1779-1787.

[12] J. A. Hartigan and M. A. Wong, “Algorithm AS 136: a k?means clustering algorithm,” Applied statistics, pp.100-108, 1979.

[13] T. Kanungo, D. Mount, N. Netanyahu, et al., “An efficient k?means clustering algorithm: analysis and implementation,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 7, pp. 881-892, 2002. doi: 10.1109/TPAMI.2002.1017616.

[14] UST. (2015). Vehicle GPS traces, including both buses and taxis, collected in the city of Shanghai, China [Online]. Available: http://www.cse.ust.hk/scrg

[15] J. He, L. Cai, P. Cheng, and J. Pan, “Delay minimization for data dissemination in large?scale VANETs with buses and taxis,” IEEE Transactions on Mobile Computing, vol. 15, no. 8, pp. 1939-1950, 2016. doi: 10.15680/IJIRSET.2015.0403022.

[16] M. Xing, J. He, and L. Cai, “Maximum?utility scheduling for multimedia transmission in drive?thru internet,” IEEE Transactions on Vehicular Technology, vol. 65, no. 4, pp. 2649-2658, 2016. doi: 10.1109/TVT.2015.2424372.

[17] K. Ota, M. Dong, S. Chang, and H. Zhu, “MMCD: cooperative downloading for highway VANETs,” IEEE Transactions on Emerging Topics in Computing, vol. 3, no. 1, pp. 34-43, 2015. doi: 10.1109/TETC.2014.2371245.

Manuscript received: 2015?12?29