Characterizing and Visualizing Mobile Networks

Anuradha Vaidyanathan
Human Interface Technology Lab
University of Canterbury
Christchurch, New Zealand
anuradha.vaidyanathan*

Mark Billinghurst
Human Interface Technology Lab
University of Canterbury
Christchurch, New Zealand
mark.billinghurst*

Harsha Sirisena
University of Canterbury
Christchurch, New Zealand

This paper was published in the proceedings of the New Zealand Computer Science Research Student Conference 2008. Copyright is held by the author/owner(s).
NZCSRSC 2008, April 2008, Christchurch, New Zealand.

ABSTRACT
This paper characterizes data obtained from calls made on the mobile phone network of one of New Zealand’s leading telecommunications providers. By understanding the characteristics of the data, we provide insight into the metrics that matter when modelling such data for use in our performance visualization tool, Caviar. The primary characteristic of interest is user mobility while calls are being placed on the network. By examining the correlation between user activity and user mobility, together with the frequency, duration and median number of cell-site changes, we lay the foundation for data characteristics that can be used to visualize the performance of the mobile network. The same characteristics can also support traffic monitoring, resource-usage planning, social networks and proactive error-correction, helping the telecommunications service provider to maintain its clientele and provide other services.

Keywords
Mobile Networks, visualization

1. INTRODUCTION
Understanding how a user moves within a cellular network enables several end results, including congestion control, proactive customer servicing, efficient resource partitioning and error-handling at the various cell sites. With widespread deployment of CDMA 2000 technology, service providers are further able to provide a range of user services such as web-based services, SMS, multi-player gaming and other location-based services.
As the number of subscribers increases, proactive monitoring and error correction become more important. Previous studies have used models of cell-phone data to simulate scenarios, evaluate various metrics and propose improved solutions to aid performance. Accurate models need to be validated by real-time data and should characterize their data-sets along the patterns seen in real-time data.
Using Per Call Measurement Data (PCMD) from a leading service provider in New Zealand, we analyze the mobility exhibited by the subscribers. We present detailed findings that characterize some important features of the data-set and lend themselves to easy adaptation in models that will evaluate further metrics in future. The data characterization forms an integral component of the visualization chassis, Caviar, which is envisioned to assist in proactive, spatial visualization of cell-phone data.

2. RELATED WORK
Telecommunications services and associated networks generate vast amounts of data on a day-to-day basis. This data pertains to a multitude of characteristics such as bandwidth, latency, congestion and loss rates. These characteristics must be monitored in order to provide various Quality of Service (QoS) guarantees and to maintain a loyal customer base through on-time and proactive services. Visualizing massive data-sets pertaining to networks, specifically telecommunication networks, is not a new problem [1]. Several end-uses of such visualization have been proposed, including city planning [2], analyzing commute times and automotive diagnostics by means of mobile sensors [3], Internet services with voice network traffic [1,4], vehicular internet access [5], analyzing social networks [6,7] and real-time survivability metrics [8]. While some of these studies use data from sensors actually deployed on moving entities [3], many still use simulated data.
The present work uses traces collected from a leading mobile network service provider in New Zealand to characterize certain features of mobile networks, and further proposes a visualization framework, Caviar, which will be used to understand and monitor such networks for a variety of end-uses. This work is the first of its kind for the Oceania sector and for New Zealand in particular.

3. METHODOLOGY
In this section we present the methods employed to collect and analyze the data. These are important to acknowledge, as changing the analysis or collection methods might affect the temporal and spatial characteristics obtained.

3.1 Data Collection
For the purposes of this study, traces were collected from CDMA networks in which a Per Call Measurement Data (PCMD) feature is present. PCMD provides access to key network information for every 3G1x (voice, SMS or data) call that is placed via the network. The data recorded pertains to several aspects, including identity (MIN, ESN), service type, number dialled, call length, signal quality, timing from pilots, the sector in which the call was placed, the latitude and longitude of the cell tower where the call commenced and ended, call result and cause of failure.

PCMD records provide an unprecedented, unobtrusive view of customer behaviour and end-user performance. In order to obtain a geographical view of information pertaining to the network, a geo-location algorithm has been used to extract accurate location information from PCMD. The timing and hand-off systems unique to CDMA, alongside accurate information about the network, allow the user location data to be gathered easily. Maximum likelihood methods are used in the geo-location algorithm to refine triangulation estimates to the highest-probability location. Field calibration has a per-call median error of ~140 meters, averaged over all call locations; accuracy increases when calls are placed closer to cell sites. For this study, data from two distinct switches were collected, and a sample of that data was anonymised and used as input for characterizing various aspects of the network.

3.2 Data Analysis
Analysis of the two traces was carried out using histograms and, where necessary, other similar statistical tools. One particular aspect of the data analysis we wish to point out is the calculation of distance based on latitude and longitude information. In certain cases, we needed to measure the distance travelled by the caller while the call was in progress. While most callers stayed stationary, a sub-set of callers moved around while talking on their cell-phones. Because PCMD only collects the latitude and longitude of the cell site where the call started and ended, we computed the distance between these two points using the Haversine formula, which is presented below:

R = earth’s radius (mean radius = 6,371 km)
∆lat = lat2 − lat1
∆long = long2 − long1
a = sin²(∆lat/2) + cos(lat1)·cos(lat2)·sin²(∆long/2)
c = 2·atan2(√a, √(1−a))
d = R·c

Here (lat1, long1) and (lat2, long2) are the coordinates of the cell sites where the call started and ended, expressed in radians, and d is the great-circle distance between them.

4. TRAFFIC DATA ANALYSIS
This section presents the results obtained from mobile phone traffic traces, including distributions over time, and presents the findings on data characteristics. This understanding will enable the modelling component of our visualization chassis, Caviar.

4.1 Overview
Our analysis uses two distinct traces. One of the traces was collected from the Auckland area switch on 1st September, 2007. This trace is a sample of an hour-long trace and has around 65K data-samples in it. The second trace was collected from the Wellington area and is a much shorter trace, comprising data spanning a few minutes and containing around 3K samples. The data-sets used in this study, along with their sample sizes and a brief description, are presented in Table 1.

Data-set            Brief Description                              Sample Size
Abbreviated         This is an abbreviated data-set containing     2424
                    some select fields from the PCMD data
Detailed_OneHour    This is an hour long trace generated on        56118
                    07/09 at Auckland

Table 1 – Data-sets and their descriptions.

The fields of the first data-set, with an explanation of the values they represent, are tabulated in Table 2.

Data Descriptor     Detailed Description
Date                The date on which the data-trace was collected.
Hour                The hour at which the data-trace was collected.
ECP                 ECP ID of the MSP where the PCMD record was created.
Sequence            Sequence number incremented per call
Start Time          Call start time
Duration            Duration of the call
Call Type           Call processing call type
Mobile ID Number    Mobile identification number
ESN                 Electronic Serial Number
Initial Site        Initial cell site number
Last Site           Last cell site number
Call Summary        Call Summary (normal, terminated, dropped, origination failed)
Reason              Reason why the call ended abnormally etc.
Extended            Verbose reason
Other MSC           Other Mobile Switching Center

Pages               Number of CDMA pages?
Access Channel      Access channel number used to setup the call
Paging Channel      Paging channel number used to setup the call
TCSI_Timer          Traffic Channel Supervision Interval Timer

Table 2 – Detailed description of data descriptor fields in the abbreviated trace.

In order to analyze the characteristics, a couple of tests on how mobility is affected during the calls were performed. It is important to understand the mobility characteristics for two end-goals:

•        When modelling mobile phone networks on the basis of real-time data, the mobility aspect is key. The model has to accurately present how the users move within the network.
•        Other metrics such as average call-time, user-activity, user-mobility correlated with user-activity and number of cell changes give further insight needed to model these networks accurately.

To understand how a user moves between cell-sites when placing calls, the originating cell ID for each call in the trace is compared to the cell ID where the call ends. The number of cell-site changes represents the movement of the user between calls, thereby demonstrating user mobility. The distribution of the number of cell-site changes amongst users on the longer trace is presented, normalized to 400, in Fig. 1. The x-axis represents the number of cell-site changes made by the user and the y-axis represents the number of callers who move that many cell-sites. This plot indicates that very few callers actually move large distances while placing calls. 46% of the callers were stationary, 34% moved only one site and 20% moved over two sites.

Fig. 1 – Distribution of number of cell changes amongst users on an hour-long trace.

Fig. 2 presents the user-activity observed on the longer, hour-long trace. The x-axis represents the number of cell changes and the y-axis represents the number of calls. Around 65,000 calls were monitored in this trace: 23% of the calls were made while the user moved within two cell-sites, and 76% were made while the user moved across more than two cell-sites during the call. This differs from the cell changes plotted in Fig. 1: Fig. 1 plots the unique users (unique mobile numbers) and how they move, whereas Fig. 2 tracks the total number of calls made by each unique ID. The percentages given for Fig. 2 therefore pertain to call traffic, as a share of the total volume of calls, as it relates to cell-site movement. Per this plot, the number of calls placed by a unique user varies between 4 and 45, with the average number of calls being 2 and the mean and median being 1.

[Figure: Number of Calls Made (y-axis) plotted against Unique ID (x-axis).]
Fig. 2 – User Activity on an hour-long trace.

Cell Changes        Frequency           Percentages
2                   7592                54.3%
3                   1396                14.7%
4                   426                 4.5%
5                   72                  0.7%
6                   4                   0.04%

Table 3 – Cell-Site changes

Fig. 3 correlates user mobility with user activity, using the shorter trace. The x-axis is the unique phone ID and the y-axis is the number of calls made by that number. User activity is defined as the number of calls made by a unique user, identified by a unique mobile phone number. User mobility is measured in terms of the distance travelled by the user while the call is in progress. This is measured by calculating the distance between the (latitude, longitude) tuples of the cell-site where the call was initiated and the cell-site where the call ended. There are three pieces of information represented in Fig. 3. The number of calls indicates just that for each unique user. The max distance indicates the maximum distance travelled by that user over his or her entire set of calls. The mode indicates the most frequent distance that a particular user travels while placing their set of calls. This is important because, when modelling this data for our visualization tool, we will need to understand what the frequently occurring distances are in order to simulate a similar scenario for future use.
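The per-user quantities described for Fig. 3 can be illustrated with a minimal Python sketch that applies the Haversine formula of Section 3.2 to the start and end cell-site coordinates of each call. This is not the authors' implementation: the record layout (user_id, start_lat, start_lon, end_lat, end_lon), the function names and the 0.5 km bin width used to obtain a meaningful mode over continuous distances are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, as given in Section 3.2


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    a = min(1.0, a)  # guard against floating-point overshoot
    c = 2 * math.atan2(math.sqrt(a), math.sqrt(1 - a))
    return EARTH_RADIUS_KM * c


def summarize_users(calls, bin_km=0.5):
    """Summarize calls per user.

    calls: iterable of (user_id, start_lat, start_lon, end_lat, end_lon)
           (hypothetical layout, not the PCMD field order).
    Returns {user_id: (number_of_calls, max_distance_km, mode_distance_km)}.
    """
    per_user = defaultdict(list)
    for user_id, lat1, lon1, lat2, lon2 in calls:
        per_user[user_id].append(haversine_km(lat1, lon1, lat2, lon2))

    summary = {}
    for user_id, dists in per_user.items():
        # Distances are continuous, so bin them before taking the mode
        # (the bin width is an assumption, not taken from the paper).
        binned = [round(d / bin_km) * bin_km for d in dists]
        mode = Counter(binned).most_common(1)[0][0]
        summary[user_id] = (len(dists), max(dists), mode)
    return summary


if __name__ == "__main__":
    sample_calls = [
        ("A", -36.85, 174.76, -36.85, 174.76),  # stationary call
        ("A", -36.85, 174.76, -36.85, 174.76),  # stationary call
        ("A", 0.0, 0.0, 0.0, 1.0),              # one degree of longitude at the equator
        ("B", 0.0, 0.0, 0.0, 0.0),              # stationary call
    ]
    for user, (n_calls, max_d, mode_d) in summarize_users(sample_calls).items():
        print(user, n_calls, round(max_d, 2), mode_d)
```

Because only the start and end cell sites are recorded, the computed distance is a lower bound on the path actually travelled during a call.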

[Figure; legend: Number of Calls, Max Distance.]
Fig. 3 – User Mobility correlated with user activity level.

[Figure: "User Mobility"; x-axis: Frequency of Changes; y-axis: Number of Cell Changes.]
Fig. 4 – User roaming range

We further see that user mobility is weakly correlated with the user’s activity level, at 0.12. Fig. 4 plots the user roaming range, in which the frequency of cell-site changes is plotted against the number of cell-site changes, using 9000 unique callers from the hour-long trace as a representative sample. The x-axis plots the frequency of changes and the y-axis plots the number of cell changes. Table 3 summarizes the percentages of cell-site changes against frequency.

5. VISUALIZATION
In this section, we outline the components of our visualization tool, Caviar, in which the mobile network and its participants can be visualized for a variety of ends. The data characterization outlined in section 4, alongside its results, is useful for modelling potential input to this visualization tool. Fig. 5 shows the first snapshot of our visualization tool, mapping a subset of cell-sites as distributed across New Zealand. Several network visualization tools have been proposed with a range of end-uses in mind, including systems biology [9], social networks [10], plain APIs [11,12,13,14] and network flow [15]. Caviar was built using OpenSceneGraph [16].

Fig. 5 – CAVIAR, a mobile network visualization tool

6. SUMMARY
This paper characterizes data obtained from two distinct traces from a leading mobile services provider in New Zealand. The goal of such characterization is primarily to understand the nature of the data, which could potentially be modelled for use in a visualization tool. The visualization tool itself could serve a variety of ends, including performance visualization of the mobile provider’s network, proactive customer service, performance monitoring, resource-usage planning and enabling social networks. From our experiments, we find that user mobility is distributed, with the greatest frequency of movement being between 1 or 2 cell-sites. Many users are stationary when they place and participate in calls made on their mobile phones. Further, user mobility is weakly correlated with user activity, so when modelling this data we need to keep in mind that a person making a lot of calls is not necessarily moving around much geographically. The frequency of cell-site changes is most populated at 2 cell-site changes, and the number of calls made by all users of the network, not just the unique users, is cumulated at around 4 cell-site changes. Finally, we propose a visualization tool, Caviar, which lays the foundation for performance visualization of mobile networks.

7. BIBLIOGRAPHY
[1] Visualization Research with Large Displays, Bin Wei, Claudio Silva et al., IEEE Computer Graphics and Applications, 2000.
[2] Mobile Landscapes: Graz in Real Time, Carlo Ratti, Andres Sevtsuk, Sonya Huang, SENSEable City Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA, Rudolf Pailer, Mobilkom Austria AG & Co KG, Vienna,
[3] CarTel: A Distributed Mobile Sensor Computing System, Bret Hull, Vladimir Bychkovsky, Kevin Chen, Michel Goraczko, Allen Miu, Eugene Shih, Yang Zhang, Hari Balakrishnan, and Samuel Madden, in Proc. ACM SenSys,
[4] Visualizing Large-Scale Telecommunication Networks and Services, Eleftherios E. Koutsofios et al., Information Visualization Research, AT&T Laboratories, Florham Park, N.J.
[5] A measurement study of vehicular internet access using in situ wi-fi networks, Vladimir Bychkovsky et al., MobiCom ’06, September 24-29, 2006, Los Angeles, California, USA.
[6] ALVIN: A system for visualizing large networks, Davood Rafiei, Stephen Curial, WWW 2005, May 10-14, 2005, Chiba
[7] Vizster: Visualizing Online Social Networks, Jeffrey Heer, Danah Boyd, Computer Sciences Division, UC Berkeley.
[8] Visualization of real-time survivability metrics for mobile networks, T.A. Dahlberg, K.R. Subramanian, MSWIM 2000, Boston MA.
[9] The GinY interface library
[10] Touch Graph
[11] J-Graph
[12] J-Graph T
[13] GVF – The Graph Visualization Framework
[14] JUNG – The Java Universal Network/Graph Framework
[15] RAVE – Network Flow Visualization