Investigate the Feasibility of Traffic Speed Estimation
Using Cell Phones as Probes

     Zhi-Jun Qiu*
     Department of Civil and Environmental Engineering,
     University of Wisconsin Madison, Madison, WI 53706 U.S.A
     Fax: 1-608-262-5199
     *Corresponding author

     Peng Cheng
     Department of Automation,
     Tsinghua University, Beijing 100084 China

     Bin Ran
     Department of Civil and Environmental Engineering,
     University of Wisconsin Madison, Madison, WI 53706 U.S.A

     Abstract: Wireless location technology has developed rapidly in recent years. The
     accuracy of mobile positioning has improved remarkably, which makes it
     possible to use cellular probes to obtain a fairly good estimate of travel speed
     or travel time in an urban region. A probe-based traffic monitoring system using
     cellular technology costs much less than the traditional GPS-based method,
     and it makes better use of existing facilities. This paper
     summarizes several previous studies and field tests, analyzes raw cell phone
     data from one wireless carrier, and tries to address the potential issues and
     feasible additional refinement for the future field deployment and operational

     Keywords: Wireless location technology, Speed estimation, Geographical
     Information Systems.

     Biographical notes: Zhi-Jun Qiu received his Bachelor's and Master's degrees from
     the Department of Automation, Tsinghua University, in 2001 and 2004
     respectively. He is currently a PhD candidate at the Department of Civil and
     Environmental Engineering, University of Wisconsin Madison, U.S.A. His
     research interests include intelligent transportation systems, traffic network
     analysis, dynamic traffic prediction, geographical information systems applied
     to traffic engineering, and wireless communication.
         Z.Qiu,P.Cheng and B.Ran

         Peng Cheng is an Associate Professor in the Department of Automation,
         Tsinghua University, China. He received his PhD from Tsinghua University in
         2000. His current research interests include modelling, scheduling and
         optimization techniques for large-scale, complex systems arising in fields
         such as air transportation systems, manufacturing systems and service
         systems; data warehousing and data mining; and decision support systems.

         Bin Ran is a Professor in the Department of Civil and Environmental
         Engineering, University of Wisconsin Madison, U.S.A. He received his PhD
         from the University of Illinois, Chicago in 1993. His current research interests
         include intelligent transportation systems, dynamic traffic prediction,
         wireless communication, and location commerce.

1   Introduction

In recent years, wireless location technology (WLT) has developed rapidly, and a
number of simulation studies and operational tests have attempted to develop traffic
information systems using real-time cell phone data. INRETS, a French
transportation research organization, developed a discrete event simulation of traffic flow
to determine the sample size and accuracy requirements of a simulated system (Ygnace et
al. 2000). Another study, conducted by the University of Maryland, indicated that cell
phone data could provide a general characterization of flow on a freeway, but that accurate
speed estimates were beyond the capabilities of the simulated system (Lovell 2001). A
third study, by the Berkeley Institute for Transportation Studies, showed
that the factors affecting the utility of a traffic information system based on cell
phone data include location accuracy, the frequency of locations of a single wireless device,
and the total number of locations (Cayford and Johnson 2003).
The first operational test on this topic was the CAPITAL (Cellular Applied to
ITS Tracking And Location) project, which was conducted in the mid-1990s on several
interstates and state routes in Virginia (UMD 1997). The project concluded
that cell phone data had the potential to yield reasonably
accurate location data, but it was unsuccessful in producing traffic information. U.S.
Wireless Corporation took part in another operational test in Oakland, California (Yim
and Cayford 2001). After analyzing 44 hours of wireless location data, researchers
from the University of California, Berkeley found that the position estimates generally had a
60-m accuracy, that call lengths were generally very short, and that 60% of vehicles could not be
matched to a roadway link. The relationship between the design of a WLT-based
monitoring system and the accuracy of its speed estimates has also been explored (Fontaine and
Smith, 2005), using a simulation-based approach to define general guidelines for
different aspects of system design and roadway network characteristics. Since 2005, the
research group from Berkeley has been deploying an operational probe-based traffic
monitoring system in Tampa, Florida using cell phone data (Cayford and Yim, 2006).
The preliminary results reported in that paper show that the system has good geographical
coverage, but validation of the output is not reported and is left as a future
research task.
The most popular probe vehicle technology is based on the Global Positioning System
(GPS), a location system based on a constellation of about 24 satellites
orbiting the earth at altitudes of approximately 11,000 nautical miles (about 20,200 km). The traditional GPS-probe
method has several major disadvantages. First, the accuracy of the estimates relies
heavily on the accuracy of the map matching method, and map matching is
challenging on arterial roadways. Second, real-time data
transfer from individual probe vehicles to the data processing center is a critical issue, because
the required real-time sample size must be traded off against the limited bandwidth of
the wireless network. Further, the installation cost of a GPS-probe
system is very high. All of these factors limit the large-scale application of GPS-probe
systems. Compared with other probe vehicle technologies, cellular location technology
for travel time collection has several distinct advantages:
     No in-vehicle equipment needs to be installed, which dramatically reduces the deployment
     cost. As long as one passenger or driver in the vehicle carries a cell phone,
     the vehicle can be regarded as a probe vehicle.
     Recruited drivers or volunteers are not required. The system draws its samples from
     the existing population of vehicles carrying cell phones.
     Potentially large sample size. Studies have suggested that cell phone use increases as
     congestion increases. As the number of cell phone owners increases, the number of
     potential probe vehicles increases. A large sample size helps to generate
     relatively accurate traffic information.
The aims of this paper are to:
     Investigate the feasibility of estimating traffic speed by using cell phones as probes
     after summarizing the results of the existing simulated and field tests from other
     Propose a simple regression model to estimate traffic speed, and compare the
     estimates with loop detector data from the existing system. The data
     used to validate and verify the proposed model are provided by one wireless
     carrier, and the loop detector data are obtained from the local traffic management
     agency in Shanghai, China;
     Outline feasible research directions to improve the performance of the cellular
     probe-based system and make it more practical for real-world use.

2   Overview of Cellular System

A cellular network uses a series of radio transmitters called Base Transceiver Stations (BTS)
or Base Stations (BS) to connect cell phones to the network. Each
Base Station is also termed a cell, so named because it covers a certain range within a
discrete area (cell). Base Stations are all interconnected, which is why
a user can move from one cell to another without losing the connection. The cell is the
basic geographic unit of a cellular system, and is also the basis for the generic industry
term "cellular". A city or county is divided into smaller cells, each of which is equipped
with a low-powered radio transmitter/receiver. The cells can vary in size depending upon
terrain, capacity demands, etc. By controlling the transmission power, the radio
frequencies assigned to one cell can be limited to the boundary of that cell. These cells
are further divided into sectors, also called subcells, by the use of directional
antennas at each cell.

Figure 1 structure of system principle

Figure 1 shows the structure of the system. When a cell phone moves
from one cell toward another during a call, a controller monitors the movement and transfers, or hands
off, the call to the new cell at the proper time. Handoff is the process by which the
controller passes a cell phone conversation from one cell to another. The handoff is
performed so quickly that users usually do not notice it, and the controller records each
handoff as it occurs. These records are the data source for our proposed model.
Figure 2 studied expressway segments (length = 5.2 kilometres)

If the movement of traffic along a major roadway is considered, the handoff times and
the associated distance between two consecutive cell boundaries can be used to deduce
the average speeds of individual vehicles. Figure 2 shows the studied expressway
segment, which comprises seven roadway links ranging from roughly 500 to 1,100
meters in length; 12 loop detector groups have been deployed along this route.
The distance from freeway node 1 to node 8 is about 5.2 kilometres, and the travel
direction is westbound.
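
The speed deduction described above can be sketched as follows. This is a minimal illustration, not the carrier's or the authors' implementation: the function name `link_speed_kmh` and the timestamps are hypothetical, while the link length (537 m) and travel time (107 s) are borrowed from the node 5 to node 6 row of Table 1.

```python
from datetime import datetime


def link_speed_kmh(t_enter: datetime, t_exit: datetime, link_length_m: float) -> float:
    """Average speed (km/h) of one phone between two consecutive handoff points."""
    travel_time_s = (t_exit - t_enter).total_seconds()
    if travel_time_s <= 0:
        raise ValueError("the exit handoff must occur after the entry handoff")
    # metres per second converted to kilometres per hour
    return link_length_m / travel_time_s * 3.6


# Hypothetical handoff timestamps, 107 seconds apart, on a 537 m link.
t_enter = datetime(2006, 7, 18, 8, 0, 0)
t_exit = datetime(2006, 7, 18, 8, 1, 47)
speed = link_speed_kmh(t_enter, t_exit, 537.0)
print(round(speed, 1))
```

The result is close to the 18 km/h that Table 1 reports for that link.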

3   Traffic volume and Handoff volume

Cellular handoff records are of two types: those based on out-calls, which are the
calls placed by cell phone holders, and those based on in-calls, which are the
calls received by cell phone holders. All records, whether from out-calls
or in-calls, are regarded as valid potential samples for deducing the actual travel speed on
the corresponding roadway link, which is determined by the locations of two consecutive
handoff points. The valid sample size is termed the cellular handoff volume.
The relation between traffic volume and cellular handoff volume for the
corresponding roadway network and wireless network needs to be analyzed, even if the exact
coefficient of the assumed linear relation cannot be obtained. It is known that
wireless carriers began deploying their cellular networks
along the major road network axes, on the assumption that mobile phones would naturally be used
while travelling. Intuitively, handoff volume is likely to be high when traffic volume
is high, and the potential sample size is large in this case. This straightforward
conclusion is confirmed by our analysis of the source data. Figure 3 compares
traffic volume with handoff volume. In Figure 3, the
detected traffic volume is derived from loop detector 8 (see Figure 2), and the cellular probe
handoff volume data are obtained from roadway link 5-6 (see Figure 2), which is
determined by the two consecutive nodes 5 and 6. Loop detector 8 is located on link 5-6,
which makes the comparison meaningful.
Figure 3 Comparison between traffic volume and handoff volume (5 minutes interval)

The data were recorded from 12:00 AM on July 18, 2006 to 12:00 AM on July 19, 2006 for
twenty-four consecutive handoff links, each of which is determined by two consecutive handoff
points. The time interval of the traffic data analysis is 5 minutes. Field
measurements and tests have confirmed that handoff points, where a handoff happens when a cell phone
moves along the roadway at normal travel speed during a call, are relatively stable.
Conceptually, a handoff point as defined here lies
on a link of the roadway network, at the intersection of the roadway link and the boundary between
two adjacent cells.
The detected traffic volume shows a high-volume period lasting from 7:00
AM to 7:00 PM, and the trend of the handoff volume is similar to that of the traffic
volume. Similar curves relating traffic volume and
handoff volume are obtained for the other links of the studied freeway
route. Roughly, during the traffic peak hours, the sampling rate of the cellular probes is around
10% to 20%, and most of the valid sample sizes within one 5-minute interval are
between 20 and 60, as shown in Figure 3.
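
As a rough illustration of the aggregation behind Figure 3, the sketch below bins handoff timestamps into 5-minute intervals and computes a sampling rate. It assumes that the sampling rate is the ratio of handoff volume to loop-detected traffic volume in the same interval, which the text does not define explicitly. The function names, the timestamps, and the vehicle count are hypothetical.

```python
from collections import Counter

INTERVAL_S = 300  # 5-minute aggregation interval


def handoff_volume(handoff_times_s, interval_s=INTERVAL_S):
    """Count handoff records per interval; times are seconds since midnight."""
    return Counter(int(t // interval_s) for t in handoff_times_s)


def sampling_rate(handoffs, vehicles):
    """Assumed definition: handoff volume divided by detected traffic volume."""
    if vehicles <= 0:
        raise ValueError("traffic volume must be positive")
    return handoffs / vehicles


# Hypothetical handoff times around 8:00 AM (28800 s after midnight).
times = [28800, 28850, 28900, 29099, 29100, 29250]
volume = handoff_volume(times)
print(volume[96], volume[97])   # records in the 8:00 and 8:05 intervals
print(sampling_rate(40, 300))   # 40 handoffs against 300 detected vehicles
```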

4   Methodology Description

A linear regression model is used to approximate the relationship between future traffic and
the historical and currently detected data. This paper focuses on the multivariate local linear regression
estimator (Fan et al., 1996).
The traffic estimation problem can be described as follows: given the observed
travel times for one roadway link,
\[ T(m, n), \quad m = 1, 2, \dots, t, \quad n = 1, 2, \dots, d, \tag{1} \]
the model generates an estimate of $T(t + \delta, d)$, where $\delta$ is the prediction
horizon, $m$ is the index of the time interval, and $n$ is the index of the day. In the corresponding
regression model, the inputs are also called covariates or predictors, and the output
variable is called the response.
An array $v(t, d, k)$, with $t \in T$, $d \in D$ and $k \in K$, denotes the travel speed
measured at time t on day d on link k. The travel time $T_{ij}(t, d)$ that a traveller needs
to travel from link i to link j when starting at time t on day d can be approximated once
the travel speeds are known. If $i = j$, the travel time is that of the single link i.
Using information available at time t, the defined travel time can be computed as follows:
\[ v(t, d, k) = \frac{l_k}{T_{kk}(t, d)} \tag{2} \]
\[ T_{ij}^{*}(t, d) = \sum_{k=i}^{j} \frac{l_k}{v(t, d, k)} \tag{3} \]
\[ \bar{T}_{kk}(t) = \frac{1}{|D|} \sum_{d \in D} T_{kk}(t, d) \tag{4} \]
Here $l_k$ denotes the length of link k, and $T^{*}$ is called the instantaneous travel
time, that is, the travel time that would result from a departure from the
beginning node of link i to the end node of link j at time t on day d if no significant
changes in traffic occurred. If $T(t, d)$ has been computed for a collection D of past days,
the average historical travel time is computed as in Equation (4).
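
Equations (2) to (4) can be illustrated with a small numerical sketch. The link lengths and field travel times below are taken from Table 1. The function names are hypothetical, and the three daily travel times used for the historical mean are invented for illustration.

```python
def link_speed(length_m, travel_time_s):
    """Equation (2): v = l_k / T_kk, in metres per second."""
    return length_m / travel_time_s


def instantaneous_travel_time(lengths_m, speeds_ms):
    """Equation (3): sum of l_k / v(t, d, k) over the links of the route."""
    return sum(l / v for l, v in zip(lengths_m, speeds_ms))


def historical_mean(times_by_day):
    """Equation (4): average of T_kk(t, d) over the days d in D."""
    return sum(times_by_day) / len(times_by_day)


# Link lengths (m) and field travel times (s) for links 1-2 ... 7-8 in Table 1.
lengths = [699, 740, 542, 1124, 537, 992, 566]
times = [59, 92, 88, 171, 107, 136, 67]

speeds = [link_speed(l, t) for l, t in zip(lengths, times)]
route_time = instantaneous_travel_time(lengths, speeds)
print(round(route_time))               # route travel time in seconds

# Invented travel times for one link at the same time of day on three days.
print(historical_mean([59, 61, 63]))
```

The route travel time reproduces the 720 s reported in the node 1 to node 8 row of Table 1.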
Taking the instantaneous travel time
$T^{*}(t, d)$ and the historical average $\bar{T}(t + \delta)$ as the two input variables, it is feasible to predict
$T(t + \delta, d)$ for $\delta \ge 0$ on the
basis of the information available at time t on day d. $T^{*}(t, d)$ predicts well
when $\delta$ is small, and $\bar{T}(t + \delta)$ predicts better when $\delta$ is large, especially for day,
month and season intervals. Empirically, there are
linear or approximately linear relationships between $T^{*}(t, d)$ and
$T(t + \delta, d)$ for all t and $\delta$, and the relationship varies with the choice of t and $\delta$.
On the basis of the above discussion, the following model is proposed:
\[ T(t + \delta, d) = \alpha(t, \delta)\, \bar{T}(t + \delta) + \beta(t, \delta)\, T^{*}(t, d) + \varepsilon \tag{5} \]
where $\varepsilon$ is a zero-mean random variable modelling random fluctuations and
measurement errors, and the parameters $\alpha$ and $\beta$ are allowed to vary with t and $\delta$.
Fitting the model to the existing data is a familiar linear regression problem, which
we solve by weighted least squares. The pair $\alpha(t, \delta)$ and $\beta(t, \delta)$ is defined to minimize
\[ \sum_{s,\, d} \left( T(s, d) - \alpha(t, \delta)\, \bar{T}(t + \delta) - \beta(t, \delta)\, T^{*}(t, d) \right)^{2} K(t + \delta - s) \tag{6} \]
where K denotes the Gaussian density with mean zero and a variance $\sigma^{2}$ that
the user needs to specify:
\[ K(x) = \frac{1}{\sigma \sqrt{2\pi}}\, e^{-x^{2} / 2\sigma^{2}} \tag{7} \]
The purpose of this weight function is to impose smoothness on $\alpha$ and $\beta$ as functions of
t and $\delta$, because we expect that the average properties of traffic do not change abruptly and that
daily and weekly patterns hold under normal traffic conditions. The actual prediction of
$T(t + \delta, d)$ becomes
\[ \hat{T}(t + \delta, d) = \hat{\alpha}(t, \delta)\, \bar{T}(t + \delta) + \hat{\beta}(t, \delta)\, T^{*}(t, d) \tag{8} \]
where the hats denote fitted values.
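
The weighted least squares fit of the two coefficients can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are hypothetical, the historical mean is treated as a single constant for the chosen t and δ (following the form of the minimization as written), time is measured in interval indices, and the data are synthetic, generated from known coefficients so that the fit can be checked.

```python
import math


def gaussian_kernel(x, sigma):
    """Gaussian density with mean zero and variance sigma**2."""
    return math.exp(-x * x / (2.0 * sigma * sigma)) / (sigma * math.sqrt(2.0 * math.pi))


def fit_alpha_beta(samples, hist_mean, t_plus_delta, sigma):
    """Weighted least squares for y ~ alpha * hist_mean + beta * x.

    samples: list of (s, T_sd, T_star_td) tuples, one per observation.
    """
    h = hist_mean
    shh = shx = sxx = shy = sxy = 0.0
    for s, y, x in samples:
        w = gaussian_kernel(t_plus_delta - s, sigma)
        shh += w * h * h
        shx += w * h * x
        sxx += w * x * x
        shy += w * h * y
        sxy += w * x * y
    # Solve the 2x2 weighted normal equations in closed form.
    det = shh * sxx - shx * shx
    if abs(det) < 1e-12:
        raise ValueError("degenerate data: instantaneous travel times do not vary")
    alpha = (shy * sxx - sxy * shx) / det
    beta = (shh * sxy - shx * shy) / det
    return alpha, beta


# Synthetic data generated with alpha = 0.4 and beta = 0.6 (illustrative values).
hist_mean = 100.0
samples = [(s, 0.4 * hist_mean + 0.6 * x, x)
           for s, x in [(11, 80.0), (12, 120.0), (12, 95.0), (13, 110.0)]]
alpha, beta = fit_alpha_beta(samples, hist_mean, t_plus_delta=12, sigma=2.0)
prediction = alpha * hist_mean + beta * 105.0  # current instantaneous time of 105 s
print(round(alpha, 3), round(beta, 3), round(prediction, 1))
```

Because the synthetic data contain no noise, the fit recovers the generating coefficients regardless of the kernel width. With real data, σ controls how strongly observations far from t + δ are discounted.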
The formula expresses a future travel time as a linear combination of the historical mean
and the current travel time. To improve the estimation accuracy, different
coefficients can be assigned to the individual historical observations, rather than using
the historical mean alone to represent the impact of the historical traffic state. Hence, the mathematical
expression can be extended to multiple dimensions. In the following discussion, all the
covariates are time series of past and current observations, that is,
\[ x = [\,T(t-m+1, d-n+1),\ T(t-m+2, d-n+1),\ \dots,\ T(t, d-n+1),\ T(t-m+1, d-n+2),\ \dots,\ T(t, d-n+2),\ \dots,\ T(t-1, d),\ T(t, d)\,]^{T} \tag{9} \]
Given a multivariate covariate X and a univariate response Y, the goal of this paper is to
estimate the mean regression function, which is the prediction of the traffic variables,
$m(x) = E(Y \mid X = x)$, where $x^{T} = (x_1, x_2, \dots, x_q)$ is a point in $R^{q}$. Given the
data $\{(X_i^{T}, Y_i) : i = 1, 2, \dots, p\}$ with $X_i^{T} = (X_{i1}, X_{i2}, \dots, X_{iq})$,
the estimator of $\beta = (\beta_0, \beta_1, \dots, \beta_q)^{T}$ is chosen to minimize
\[ \sum_{i=1}^{p} \Big\{ Y_i - \beta_0 - \sum_{j=1}^{q} \beta_j (X_{ij} - x_j) \Big\}^{2} K_B(X_i - x) \tag{10} \]
The solution is
\[ \hat{\beta} = (\hat{\beta}_0, \hat{\beta}_1, \dots, \hat{\beta}_q)^{T} = (X_q^{T} W X_q)^{-1} X_q^{T} W y \tag{11} \]
where
\[ X_q = \begin{pmatrix} 1 & X_{11} - x_1 & \cdots & X_{1q} - x_q \\ 1 & X_{21} - x_1 & \cdots & X_{2q} - x_q \\ \vdots & \vdots & & \vdots \\ 1 & X_{p1} - x_1 & \cdots & X_{pq} - x_q \end{pmatrix} \tag{12} \]
\[ W = \mathrm{diag}\{ K_B(X_i - x) \} \tag{13} \]
\[ y = (Y_1, Y_2, \dots, Y_p)^{T} \tag{14} \]
\[ K_B(u) = \frac{1}{|B|} K(B^{-1} u) \tag{15} \]
As above, K is a multivariate probability density function with mean zero
and a given covariance matrix. B is called the bandwidth matrix, and $|B|$ denotes its determinant.
The weighting kernel K is chosen as the Gaussian function, with $B = h I_q$.

Hence, the following results follow directly:
\[ \hat{m}(x) = \hat{\beta}_0 \tag{16} \]
\[ \Big( \frac{\partial \hat{m}}{\partial x_j} \Big)(x) = \hat{\beta}_j, \quad j = 1, 2, \dots, q \tag{17} \]
$\hat{y} = \hat{\beta}_0$ is the required prediction value (Fan et al., 1996).
The bandwidth h and the dimension q of the covariate vector are the parameters that need to be selected. The
issue of how to select the bandwidth and dimension will be studied in the future.
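
A compact sketch of the local linear estimator in Equations (10) to (16) is given below, using only the Python standard library. It assumes a Gaussian kernel with $B = h I_q$, as stated above. The function names, the small linear solver, and the test data are illustrative and not part of the original study.

```python
import math


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system; try a larger bandwidth h")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def local_linear_fit(X, y, x0, h):
    """Return beta_hat of Equation (11); beta_hat[0] estimates m(x0)."""
    q = len(x0)
    # Rows of the design matrix X_q: (1, X_i1 - x_1, ..., X_iq - x_q).
    rows = [[1.0] + [xi[j] - x0[j] for j in range(q)] for xi in X]
    # Gaussian kernel weights with B = h * I_q; constant factors cancel in (11).
    weights = [math.exp(-sum(d * d for d in r[1:]) / (2.0 * h * h)) for r in rows]
    n = q + 1
    lhs = [[sum(w * r[u] * r[v] for w, r in zip(weights, rows)) for v in range(n)]
           for u in range(n)]
    rhs = [sum(w * r[u] * yi for w, r, yi in zip(weights, rows, y)) for u in range(n)]
    return solve_linear_system(lhs, rhs)


# Illustrative data from an exactly linear function y = 2 + 3*x1 - x2.
X = [[0, 0], [1, 0], [0, 1], [1, 1], [2, 1], [1, 2]]
y = [2 + 3 * a - b for a, b in X]
beta_hat = local_linear_fit(X, y, x0=[1.0, 1.0], h=1.0)
print([round(v, 6) for v in beta_hat])
```

For data that are exactly linear, the local linear fit reproduces the function value and the gradient at the query point for any bandwidth, which makes this a convenient sanity check before applying the estimator to travel time data.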
For real-time traffic information applications, an update interval of 5
minutes is commonly used, because it is long enough to collect enough samples for a
relatively accurate traffic state estimate and short enough that traffic conditions do not change
greatly within the interval. Hence, $\delta$ is set to 5 minutes here, and four
variables are chosen as the covariates:
\[ x = [\,T(t-1, d),\ T(t, d),\ T(t, d-1),\ T(t+1, d-1)\,]^{T} \tag{18} \]
The response variable is
\[ y = T(t+1, d) \tag{19} \]
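
The covariate and response vectors of Equations (18) and (19) can be assembled as sketched below. The layout `T[d][t]` (a list of days, each a list of 5-minute travel times) and the function name `build_sample` are assumptions made for illustration, and the travel times are invented.

```python
def build_sample(T, t, d):
    """Return (x, y) for interval t on day d, following Equations (18) and (19).

    T[d][t] is the travel time in interval t on day d.
    """
    if d < 1 or t < 1 or t + 1 >= len(T[d]) or t + 1 >= len(T[d - 1]):
        raise IndexError("need intervals t-1..t+1 on day d and day d-1")
    x = [T[d][t - 1], T[d][t], T[d - 1][t], T[d - 1][t + 1]]
    y = T[d][t + 1]
    return x, y


# Invented travel times (s): two days, four consecutive 5-minute intervals.
T = [
    [60, 62, 65, 70],  # day 0
    [58, 61, 66, 72],  # day 1
]
x, y = build_sample(T, t=1, d=1)
print(x, y)
```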

5   Experimental Results

With the route known, the speeds of vehicles between two consecutive handoff points
can be measured, and the speed on each individual route section can be extracted. This procedure runs
continuously and produces a speed map that is updated
periodically, for example every 5 minutes. Our analysis is based on data from March 2005 to
July 2006 along the Yan'an Expressway in Shanghai, China. Two kinds of traffic
data have been collected: the estimated travel speed based on real-time cell phone
handoff records, which were delivered by the monitoring server of one wireless carrier in
China, and the travel speed detected by loop detectors that are already deployed in
our test region. The arithmetic mean of the upstream and downstream
loop-detected speeds is taken as an approximation of the time-mean speed for a roadway segment.
Figure 4 compares the cellular probe estimated travel speed with the
loop-detected travel speed for the same roadway segment over one whole day. The
loop-detected speed shows that the actual speed should be the free-flow speed from 12:00 AM to 7:00
AM, but the cellular probe estimated speeds are usually higher than the free-flow speed of 70
km/h. This is because the sample size for the cellular probe system is rather small in this period, and
drivers tend to drive fast. From 8:00 AM to 8:00 PM, the sample sizes for the
cellular probes are usually large enough to estimate traffic conditions, and the absolute
error, defined as the difference between the cellular probe estimated speed and the
loop-detected speed, is within 10 km/h. The relative error between the cellular probe estimated
speed and the loop-detected speed is within 15%. Table 1 shows the result for one time interval.
Figure 5 compares the cellular probe estimated travel time with the
loop-detected travel time for the route shown in Figure 2, whose total length
is 5.2 kilometres. The absolute error, defined as the difference between the cellular
probe estimated travel time and the loop-detected travel time, is within 60 seconds in most
time intervals (262 of 288 time intervals in total), and the relative error
between the cellular probe estimated speed and the loop-detected speed is around 15% in most
cases. On the basis of these experimental results, the proposed method can generate
reasonable traffic speed estimates, especially for urban expressways.
              Table 1 Comparison of loop detector and cellular probe results
                               (One time interval snapshot)

 From      To        Link       Field result            Estimated result
 Node ID   Node ID   length     Travel      Actual      Travel      Estimated
                     (m)        time (s)    speed       time (s)    speed
                                            (km/h)                  (km/h)
 1         2         699        59          42.5        49          50.5
 2         3         740        92          28.8        94          28.1
 3         4         542        88          22.1        75          25.7
 4         5         1124       171         23.6        155         26.1
 5         6         537        107         18          89          21.5
 6         7         992        136         26.2        148         24.1
 7         8         566        67          30.3        51          39.2
 1         8         5200       720         26.0        661         28.3
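
The absolute and relative errors discussed above can be recomputed from Table 1 with a short sketch. The function names are hypothetical, and the relative error is assumed here to be the absolute difference divided by the loop detector (field) value, which the text does not state explicitly.

```python
def absolute_error(estimated, reference):
    """Absolute difference between the probe estimate and the loop detector value."""
    return abs(estimated - reference)


def relative_error(estimated, reference):
    """Absolute error as a fraction of the loop detector value (assumed definition)."""
    return abs(estimated - reference) / reference


# Route-level row of Table 1 (node 1 to node 8).
field_time_s, est_time_s = 720, 661
field_speed_kmh, est_speed_kmh = 26.0, 28.3

time_error_s = absolute_error(est_time_s, field_time_s)
speed_error = relative_error(est_speed_kmh, field_speed_kmh)
print(time_error_s, f"{speed_error:.1%}")

# Per-link speeds (km/h) from Table 1, links 1-2 ... 7-8.
field_speeds = [42.5, 28.8, 22.1, 23.6, 18.0, 26.2, 30.3]
est_speeds = [50.5, 28.1, 25.7, 26.1, 21.5, 24.1, 39.2]
link_errors = [relative_error(e, f) for e, f in zip(est_speeds, field_speeds)]
for link, err in enumerate(link_errors, start=1):
    print(f"link {link}-{link + 1}: {err:.1%}")
```

For the route-level row this gives a travel time error of 59 seconds and a relative speed error of about 8.8%.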

Figure 4. Comparison of travel speed results

Figure 5. Comparison of travel time results

6   Conclusions and Future Work

This paper summarizes several previous studies and field tests, and provides one simple
regression model to analyze real-time cell phone data. One contribution of this paper is
to adopt the linear regression model to estimate travel speed for roadway links of interest
and travel time for a chosen route, using raw data from one operational wireless carrier
in China. We investigate the feasibility of using such cell phone data to estimate
the traffic state, and our results show that using cell phones as probes is a feasible and
promising method to collect real-time traffic data on urban expressways.
To improve the accuracy of cellular probe traffic detection technology, macroscopic
traffic flow theory should be integrated with estimation techniques such as the Kalman
filter and the particle filter. More accurate models, including traffic models and cell
phone user behaviour models, can be built to meet the requirements of deploying an
operational system for real-world traffic applications. Further, how to validate and verify the
estimated cellular probe results is another issue, especially when data from
existing systems, such as loop detectors or video, are lacking. All of the above will be
addressed in our future research.

References

Cayford, R., and T. Johnson. Operational Parameters Affecting the Use of Anonymous Cell Phone
     Tracking for Generating Traffic Information. Presented at the 82nd Annual Meeting of the
     Transportation Research Board, Washington, D.C., 2003.
Fan, J. & Gijbels, I. (1996). Local Polynomial Modelling and its Applications. London: Chapman &
Florida Department of Transportation. Travel Time Estimation Using Cell Phones (TTECP) for
     Highways and Roadways. June, 2004.
Fontaine and Smith, B.L. Probe-based traffic monitoring system using wireless location technology:
     investigation of the relationship between system design and effectiveness, Presented at the
     84th Annual Meeting of the Transportation Research Board, Washington, D.C., 2005.
R. Sankar and L. Civil, Traffic Monitoring and Congestion Prediction Using Handoffs in Wireless
     Cellular Communications, IEEE 47th Annual International Vehicular Technology Conference
     (IEEE VTC), Phoenix, AZ, May 1997, pp.520-524.
Smith, B.L., M.L. Pack, D.J. Lovell, and M.W. Sermons. Transportation Management Applications
     of Anonymous Mobile Call Sampling. Presented at the 80th Annual Meeting of the
     Transportation Research Board, Washington, D.C., 2001.
University of Maryland Transportation Studies Center. Final Evaluation Report for the CAPITAL-
     ITS Operational Test and Demonstration Program. University of Maryland, College Park,
Ygnace, J-L., J-G. Remy, J-L Bosseboeuf, and V. Da Fonseca. Travel Time Estimates on Rhone
     Corridor Network Using Cellular Phones as Probes: Phase 1 Technology Assessment and
     Preliminary Results. French Department of Transportation, Arcueil, France, 2000.
Yim, Y.B.Y., and R. Cayford. Investigation of Vehicles as Probes Using Global Positioning System
     and Cellular Phone Tracking: Field Operational Test. Report UCB-ITS-PWP-2001-9.
     California PATH Program, Institute of Transportation Studies, University of California,
     Berkeley, 2001.
