Apollo: Tapping into the Bittorrent Ecosystem

Georgos Siganos, Telefonica Research, georgos@tid.es
Marios Iliofotou, UC Riverside, marios@cs.ucr.edu
Xiaoyuan Yang, Telefonica Research, yxiao@tid.es
Josep M. Pujol, Telefonica Research, jmps@tid.es
Pablo Rodriguez, Telefonica Research, pablorr@tid.es

Abstract

This paper presents Apollo, a system that can tap into the Bittorrent (BT) ecosystem and efficiently and discreetly collect Bittorrent-related performance data, without requiring the consent or cooperation of either the ISPs or the end-users. Apollo is highly efficient and scalable and can monitor tens of thousands of Bittorrent clients in parallel using a single commodity server. The system is designed to inflict the lowest possible overhead on the Bittorrent ecosystem by using all the available gossiping mechanisms of Bittorrent, such as DHT and PEX (tracker-less operation). Additionally, the load on a Bittorrent client is kept to the bare minimum. Apollo sends less than 1 Kbyte and typically receives less than 10 Kbytes every time it connects to a client.

The Apollo system provides a means to capitalize on the millions of existing online Bittorrent users and harvest performance-related data. We present several applications that use this new dataset. In particular, we show how we improve Network Transparency from a BT point of view by: (a) identifying in real time the traffic management policies of ISPs across the globe, (b) providing a means to compare the BT-related QoS that different ISPs offer, and (c) enhancing the download speed of existing BT clients. We hope our system will provide the ground for additional research efforts in collecting and analyzing the rich data available in the BT ecosystem.

1   Introduction

The Internet's architecture is based on the end-to-end argument, where the core remains "dumb" with the bare minimum functionality, and the ends are intelligent. This design decision stripped away necessary functionality such as a performance measurement capability, and has led to a constant struggle to devise new techniques to fill the gap. Additionally, the Internet today consists of competing commercial entities that do not reveal their operational details. Lifting the veil of opaqueness and enhancing the transparency of the network can be of vital importance. Accurate measurements have multiple benefits. From an operational perspective, measurements can help improve both the performance of applications and the reliability of the network. From a regulator's perspective, measurements can be used by policy makers to advocate healthy competition and bust myths and prejudice regarding the policies of ISPs. High stakes are placed on different visions for the "wellness" and future direction of the Internet, thus regulators need to have a clear picture of it. Finally, accurate measurements can help end-users make better-informed decisions regarding their service providers.

Ideally, a measurement system would tap into the end-user perceived performance. To this end, the monitoring system should have access to the end-user application. This approach, however, suffers from a shortcoming that limits its utility: it requires the explicit collaboration of the user or the ISP as third parties in the measurements. Moreover, it is only well suited for opportunistic measurement, since the third parties are not under the control of the measurement system. In order to avoid these shortcomings, Apollo does not rely on the consent of any third parties, neither ISPs nor end-users. Despite the lack of consent, we still want to measure user-perceived experience, e.g. download speeds, rather than network-centric measurements, such as loss rate, which are obviously related to the end-user experience but whose impact is harder to quantify. Finally, the measurement process should minimize the overhead on the network as well as on the ISPs and the users. Apollo is a system aiming towards these objectives.

Measuring end-user perceived performance has gained importance lately with the debated issue of network neutrality. Traditional measuring techniques have failed so far to provide convincing answers regarding whether ISPs perform traffic management or not, and if so, to what extent. There exists no simple and
easily deployable solution that can detect the policies of ISPs and, most importantly, quantify the impact of these policies on the QoS they offer to their customers. The research community has provided a number of techniques and systems [7] [19] [22] that can provide hints of network neutrality violations, but much remains to be done to advance our ability to accurately measure the impact of these policies.

In this paper, we present Apollo, a system that tries to enrich the existing measurement systems with a new application-based measurement system. Our vision for the system is to show how we can capitalize on the millions of Bittorrent (BT) clients to harvest their performance-related data with a negligible resource footprint. Bittorrent has been used before as a measurement platform. Our approach has a slight but fundamental difference compared to previous systems. We do not want to exploit Bittorrent to perform active measurements, as is done by previous systems [9][11]. Instead, we want to infer the performance data of Bittorrent clients from control messages. This different approach leads to a disruptive change in the measurement process and in the scale and resolution of the data that can be collected. Our system can literally scan tens of thousands of Bittorrent peers within a few minutes with a single commodity server. Previous systems would require multiple machines and days to perform active measurements for such a large number of peers.

The most important contribution of our work is the Apollo system itself. Apollo is able to collect the control messages of hundreds of thousands of Bittorrent clients in a very short period of time, and infer accurate end-user measurements from this publicly available information. Apollo's real-time measurements on Bittorrent can shed light on the controversial areas of traffic management and network transparency. Potential applications of this new dataset include a) identifying in real time the traffic management policies of ISPs across the world, b) providing a way to compare the QoS that different competing ISPs offer, and c) boosting the download speed of Bittorrent clients. Additionally, we hope our system will provide the grounds for new research directions and novel ways to process and analyze the data harvested from Bittorrent. Apollo gives information about the users of the BitTorrent application, and about the users of the Internet as a whole. Clearly, if network neutrality existed, the above two statements would be the same. That is, you could infer the behavior of the Internet by observing the behavior of one of its applications. However, since net neutrality does not exist, the two statements are not completely the same.

2     Motivation

In this section, we describe the main motivations behind the design choices of Apollo, which include the need to gather high-resolution end-user perceived experience measurements and the need for a scalable system to carry out this task.

2.1    The Challenges of Measurement

In the introduction, we argued for the need to collect accurate user-perceived experience measurements. Obtaining such measurements, however, is far from trivial. User-perceived experience is application dependent; how a user experiences Bittorrent, Web browsing or online gaming cannot be derived from network-centric measures such as the nominal capacity of the link, loss rates, etc.

Extracting end-user application-specific measurements requires either tapping into the wires via DPI or controlling one of the two ends of the communication: either the end-user or the application server. DPI requires costly specialized hardware to be spread across multiple points of presence, which translates into even higher costs for a widespread deployment. DPI also faces strong opposition from end users due to privacy concerns. Attacking the problem from the application server front is cumbersome, since only horizontal services with a global footprint such as Akamai or Google can collect such measurements. The last remaining front, the end-users, is the only alternative left, but at the same time the most promising area, and the one in which Apollo and other systems are putting their efforts.

The underlying design of the Internet is to blame for the difficulties and limitations in obtaining application-specific end-user perceived experience measurements. There could be, however, a workaround if applications were to collect their own measurements and there were a consensus on the way to share them. For instance, a browser such as Firefox could gather its own measurements and publish this information into a public repository or allow access to it via a secure API. The only problem left would be how to gather the information efficiently, minimizing the network and the application overhead. The system we present in this paper could play this role thanks to its scalable and efficient design.

Unfortunately, such a grand vision of a sort of de facto standardization of application-initiated measurements is just a vision. There are, however, some steps in this direction in the form of application plugins that collect such statistics. One example is the ONO plugin [4] for Azureus, which collects the end-user perceived experience on Bittorrent. A similar approach could be used for Web browsing via Firefox plugins, or for any other application.

The plugin approach suffers from serious shortcomings. First, it requires the proactive participation of the
user and is therefore subject to the pitfalls of collaboration: cold-start effect, not enough quorum, and lack of participation. The final shortcoming of relying on application plugins is the loss of control over the experiment. Measurements depend on the availability and usage of the end user and are therefore opportunistic.

The optimal approach would be for the application itself to provide the measurements. Then, each user of the application would act as a sensor of the monitoring environment. Furthermore, this approach would be more user friendly in terms of security concerns. The need for active measurements would also be reduced because the application might already use this measurement for internal operations.

Figure 1: Measurement Space (x-axis: Number of Machines; y-axis: Scanning Rate per Machine; systems plotted: Apollo, DPI, BitProbe, NANO, iPlane)

The Workaround: Inferring Measures from Control Messages for Bittorrent (P2P)

The bad news is that no application provides this functionality as of today. Therefore, we cannot obtain end-user experience measurements directly from the application. There is, however, a workaround that allows us to collect the end-user experience measurements indirectly. If an application publicly exposes its control messages, one might try to piggyback on them in order to infer the desired measurements. P2P applications, due to their collaborative nature, usually publicly expose their control messages. We focus on the most prevalent P2P application: Bittorrent. Apollo is able to: a) collect the control messages of hundreds of thousands of Bittorrent clients in a very short period of time, and b) infer accurate end-user measurements from this publicly available information.

In this paper, we present how we can infer relevant end-user experience for Bittorrent by tapping only into control messages. Since Bittorrent applications are optimized for file transfer, the application implicitly assesses the upload and download rates that the end-user experiences, although they are not accessible from outside the application. We can, however, derive a reliable approximation of these measurements from control messages that are publicly available.

The advantage of this inference from control messages is that we can perform application-specific measurements and indirectly obtain most of the desired information today. The inference process is extremely unintrusive to the Bittorrent ecosystem. All the Bittorrent clients have become potential sensors for us to poll, with negligible overhead for them. Apollo sends to a BT client a single Bittorrent packet that is less than 1 Kbyte and the client transmits less than 20 Bittorrent control-level packets.

2.2    Crawling on the cheap

The Bittorrent clients who inadvertently have become sensors for Apollo experience a very small overhead. However, the workload of inferring the measurements from the myriad of available sensors is far from trivial. Apollo's architecture and design are optimized to maximize the sampling rate r of a single server.

The server sampling rate r is the number of sensors that a server can poll per second. This rate multiplied by the number of available servers n gives the global sampling rate R. The global rate is a key parameter of any measurement system since it determines the temporal and spatial resolution of the data. For instance, if we wanted to analyze the performance of the top 1000 ISPs, which contain about 25M concurrent users that need to be sampled once per hour, this gives a sampling rate R of about 7000 s⁻¹. Obviously one can limit the resolution by randomly selecting 500 clients per ISP; in this case the global sampling rate R would still be about 140 s⁻¹.

The global sampling rate R gives the lower bound on the rate at which the system needs to pull information out of clients. Note that a sampling operation might not be atomic, requiring more than one request to the client (in the particular case of deriving measurements from control messages, it takes 2 requests per sample). The measurement system has two ways to meet the target sampling rate R: by increasing the number of samples r that a single server can carry out per second, and by increasing the number of sampling servers n. The measurement system can be idealized as R = n × r.

Different measuring systems have a different interplay of the two parameters r and n depending on the desired data resolution. Figure 1 depicts a classification of different state-of-the-art measuring systems in this parameter space. BitProbe [9], NetDiff [12], NANO [18] [19] and iPlane [11] are systems with a limited sampling rate due to their active measurements, which consume a lot of resources and therefore can quickly max out the
server resources including the network bandwidth. For instance, let us assume that a user has an average download rate of 1 Mbps; using active measurements, a server with 100 Mbps can only serve a maximum of 100 concurrent users. Thus, if the measurement time is set to 10 seconds, it will result in an r of 10 samples per second. In other words, there are only 360 measurement slots per hour, each serving up to 100 users, so one server can actively sample only 36000 clients per hour. Following the prior example, we would need roughly 700 servers to gather the information from all Bittorrent clients, or 14 servers if we limited the measurements to 500 clients per ISP.

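The arithmetic in this example can be reproduced with a short script. The following is a minimal sketch in Python, assuming the idealized model R = n × r and the round numbers quoted in the text (25M users across 1000 ISPs, one sample per client per hour, a 100 Mbps server link, 1 Mbps per download, 10-second measurements). The function names are illustrative only and are not part of Apollo.

```python
import math

HOUR_S = 3600  # sampling period used in the example: once per hour


def global_rate(clients: int, period_s: float) -> float:
    """Global sampling rate R (samples/s) to poll every client once per period."""
    return clients / period_s


def active_server_rate(link_mbps: float, client_mbps: float, measure_s: float) -> float:
    """Per-server rate r (samples/s) when every sample is an active download."""
    concurrent_clients = link_mbps / client_mbps  # downloads the link can carry at once
    return concurrent_clients / measure_s


def servers_needed(R: float, r: float) -> int:
    """Smallest integer n such that n * r >= R."""
    return math.ceil(R / r)


R_all = global_rate(25_000_000, HOUR_S)      # all users of the top 1000 ISPs
R_sampled = global_rate(500 * 1000, HOUR_S)  # 500 clients for each of 1000 ISPs
r_active = active_server_rate(100, 1, 10)    # active-measurement server

print(round(R_all), round(R_sampled), r_active)   # 6944 139 10.0
print(servers_needed(R_all, r_active),            # 695
      servers_needed(R_sampled, r_active))        # 14
```

The exact values (about 6944 and 139 samples per second, 695 and 14 servers) agree with the rounded figures of 7000 s⁻¹, 140 s⁻¹, 700 servers and 14 servers used in the text.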
On the other hand, DPI and Apollo can achieve a high sampling rate because they do not need to run active measurements. DPI only observes traffic passively, whereas Apollo polls the value from the application (in the future ideal scenario) or infers the value from control messages in the particular case of Bittorrent.

The higher server sampling rate of Apollo allows us to obtain high-resolution data while minimizing the number of servers, and therefore the cost and the deployment effort. Apollo is able to collect the information of all Bittorrent clients of the top 600 torrents of the most popular tracker in less than one hour. DPI also has a high sampling rate; however, a DPI deployment needs multiple points of presence to have a widespread spatial resolution. Apollo's performance is not due exclusively to the lack of active measurements, which introduce bottlenecks such as bandwidth. Apollo's architecture is also highly optimized to handle thousands of concurrent requests with a single commodity server.

Figure 2: Consumption of Network Resources. (Plot of sockets in use and established connections, from 0 to 60000, against time elapsed, from 0 to 300 seconds.)

3     Design and Implementation of Apollo

Apollo is an instrumented Bittorrent client built from scratch to enable efficient and lightweight monitoring of Bittorrent. Collecting the control-level messages poses serious challenges regarding both the scalability of the solution and the load that Apollo will put on the BT clients.

3.1    Architecture

Apollo's architecture is designed to maximize the server sampling rate. In order to be able to sample as many clients as possible, we resorted to a massively concurrent design that allows us to run up to 80K concurrent processes on a single server. These processes, however, are not threads, but lightweight Erlang processes, which are stateless, have no shared memory and require no context switching, hence minimizing the memory footprint and the CPU overhead.

Apollo is built from scratch with a focus on efficiency and performance. Its architecture relies on the extensive use of the actor model and asynchronous message passing to eliminate state, which reduces memory consumption, and to eliminate context switches at the OS level, which reduces both memory and CPU utilization.

As a result of this massively concurrent architecture, we hit the limits of the network resources of the Linux server. In the next section we describe a stress test of Apollo to demonstrate its high performance.

The high-level design of Apollo is shown in Figure 3. Every Apollo server has a single process (Rate Control) for controlling the number of torrents monitored in parallel, and a single process (DHT Routing) that facilitates the DHT functionality. On a per-torrent basis, the Torrent Control (TC) process controls and synchronizes the monitoring of a torrent. To bootstrap the monitoring, TC spawns a DHT crawler to scan the DHT for BT peers that participate in the torrent. If this fails, TC spawns a tracker process. To communicate with a BT peer, TC spawns a new Peer process that is responsible for communicating with the BT peer and harvesting its performance data. Next, we briefly describe key implementation details that contribute to the efficiency and resiliency of Apollo.

Figure 3: High level interactions of Apollo. (Components shown: Rate Control, which starts new torrents from a static list or a dynamic list; per-torrent Torrent Control, which controls the rate of new peers, keeps statistics, starts DHT and Tracker if necessary, and decides when to stop; Peer processes; DHT Routing, which maintains the routing table for DHT; DHT Crawler; Tracker; Logger.)

DHT Routing: DHT is a critical subsystem of our architecture. The system needs to discover the BT peers that participate in a torrent as fast as possible. Typical DHT implementations maintain a table of a few tens of DHT peers that act as an entry point to the DHT. Querying the DHT successfully can take many seconds due to its exploratory nature. To be able to retrieve the peers as fast as possible, we use a radix tree data structure, which allows for a very efficient discovery process.

Tracker Module: The tracker process is the backup solution when DHT fails. The process contacts the tracker through the TOR network [1]. The operator of a tracker cannot distinguish Apollo from the typical BT peers that use the TOR network. This design choice follows the initial objective of not requiring the consent of third parties. After receiving a successful reply, Apollo notifies the tracker that it stopped downloading the torrent. This saves traffic for the TOR network, because the tracker will not advertise our client, and thus there will be no incoming connections through TOR. Apollo only relies on TOR for communication with the tracker.

Torrent Control: TC internally keeps two queues that store the information {IP, Port, Time} necessary to start the communication with a new BT peer. The first queue holds the new BT peers with which we have not yet initiated a connection. The second queue is for BT peers that we will contact again in the future and that wait for their turn to be activated. The reason we use the first queue is to rate limit the number of Peers we start per torrent so that we can have a more predictable consumption of resources. The popularity of torrents follows a long-tail distribution and so does the arrival rate of new peers. Leaving this factor unchecked seriously hinders the overall performance of Apollo. The rate-limiting mechanism keeps the arrival rate uniform, thus maintaining the CPU load balance between the different Apollo modules. The second queue is used to meet the stricter requirements of when to reconnect with a BT client. We want to be consistent with the time span between samplings of the same BT peer. Additionally, PEX is typically sent exactly once per minute, thus we can arrange for the reconnection to take place just a few seconds before the client sends the PEX message. As we discuss later, after receiving the PEX message we disconnect. Finally, we do not allow incoming connections to Apollo. Incoming connections are not important for the system, because the rate at which we open connections is much larger than the rate of incoming connections. Additionally, we want to be in full control of the measurement process and be selective about whom we talk to.

Peer: The Peer process contacts a remote peer and sends an empty bitfield, then it switches to inactive mode, where it just receives packets from the other side. If the remote peer supports PEX, then we stay connected until
passes. In the case that the remote peer is not sending us PEX¹, we stay connected for a few seconds. The default value is 10 seconds.

For efficiency reasons, a Peer process keeps the processing of the messages to a minimum. The only online processing is to detect whether the remote Peer is a Leecher or a Seeder². After a Peer process disconnects from the remote peer, it notifies the TC and passes information regarding the type of the peer, when it sends a PEX, etc., and then it exits.

3.2     Benchmarking Apollo

To showcase the efficiency of our design, we perform the following stress test. We aim to scan the top-10 most popular torrents from Pirate Bay as fast as possible. The torrents had over 300 thousand BT peers participating. We start all the torrents at once, bootstrap the torrents using DHT, and connect greedily to all the BT peers that we discover.

As we already mentioned, our architecture, optimized for massive concurrency, ends up having the number of sockets as the bottleneck. Figure 2 shows the evolution of the number of used sockets as reported in /proc/net/sockstat, and the number of established connections as reported by netstat. Apollo is able to fully utilize the available sockets in only 100 seconds. This demonstrates the high-performance implementation of the discovery mechanism of Apollo. The ramp-up time of Apollo is linear with respect to the available sockets, which demonstrates the scalability of Apollo. On the other hand, as expected, not all of these sockets can successfully establish a connection with the end user. However, we can see that the ratio of successful connections does not decrease over time, suggesting a constant monitoring efficiency of Apollo.

In total, in less than 5 minutes, we harvested more than 77 thousand BT peers, failed to communicate with 108 thousand BT peers, and were still trying to contact 177 thousand BT peers.

How do we manage to scale so well? The answer lies in the lightweight techniques that we utilize. We stay connected to a BT peer for at most a minute, typically much less, and during this time, we send just 1 BT-related packet, and we receive on average 20, with no other activity in the meantime.

For the stress test and the evaluation part of the paper, we use a dedicated server at a hosting provider in the Netherlands. The hardware specifications for the commodity server are a quad-core Intel processor at 2.13GHz

¹ We only support utorrent and mainline's PEX messages, which are not supported by all clients, for example Azureus.
² Leecher is a peer who is still downloading the torrent, while a

either we receive the Peer Exchange(PEX), or 1 minute             Seeder has finished downloading the file
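The two queues kept by Torrent Control can be illustrated with a minimal Python sketch. This is not Apollo's implementation: the class name `TorrentControl`, the `max_new_per_tick` rate limit, the `reconnect_time` helper and its 5-second `lead` are all hypothetical names and values chosen for illustration. Only the {IP, Port, Time} record, the rate-limited first queue, the time-ordered second queue, and the once-per-minute PEX period come from the description above.

```python
import heapq
import math
from collections import deque


class TorrentControl:
    """Illustrative per-torrent scheduler with two queues (assumed design)."""

    def __init__(self, max_new_per_tick):
        self.max_new_per_tick = max_new_per_tick  # hypothetical rate limit
        self.new_peers = deque()  # first queue: peers never contacted
        self.revisit = []         # second queue: min-heap ordered by due time

    def add_new(self, ip, port, now):
        # Store the {IP, Port, Time} record for a freshly discovered peer.
        self.new_peers.append((ip, port, now))

    def schedule_revisit(self, ip, port, due_time):
        # Peers to be sampled again wait here until their turn comes.
        heapq.heappush(self.revisit, (due_time, ip, port))

    def tick(self, now):
        """Return the (ip, port) pairs to contact at time `now`."""
        due = []
        while self.revisit and self.revisit[0][0] <= now:
            _, ip, port = heapq.heappop(self.revisit)
            due.append((ip, port))
        fresh = []
        # Rate limit: start at most max_new_per_tick new peers per call.
        while self.new_peers and len(fresh) < self.max_new_per_tick:
            ip, port, _ = self.new_peers.popleft()
            fresh.append((ip, port))
        return due + fresh


def reconnect_time(last_pex_at, min_gap, pex_period=60.0, lead=5.0):
    """Earliest time at least `min_gap` seconds after the last observed PEX
    message that falls `lead` seconds before the next expected PEX message."""
    k = math.ceil((min_gap + lead) / pex_period)
    return last_pex_at + k * pex_period - lead
```

For instance, `reconnect_time(0.0, 170.0)` schedules the next visit 175 seconds after the last PEX message, that is, 5 seconds before the third subsequent PEX message would be due.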

and 4 GB of RAM.

Figure 4: Apollo front-end screen shot. The colors correspond to download speeds. The darker the color the slower the average download performance within the country.

Figure 5: Interactions between the Apollo client and a Bittorrent client. Functionality of Peer process. Message sequence shown in the figure:
   First connection (up to 1 minute): Open Socket; Send Bitfield & Capabilities; Bitfield & Capabilities; Have Messages; Close Socket.
   Reconnect after a few minutes if the Bittorrent client is a leecher (lasting seconds): Open Socket; Send Bitfield & Capabilities; Bitfield & Capabilities; Have Messages; Close Socket.

3.3    Live deployment of Apollo

Apollo has been online for almost a year, harvesting hundreds of thousands of BT peers on an hourly basis. The system is resilient and can adjust to sudden problems like closures of Bittorrent trackers. Based on Apollo, we have built an automated system that collects, processes and stores the data in a central database. Managing this process and maintaining a database that grows by a few terabytes per month is a feat in itself. Additionally, we have built a front-end website to Apollo, where we display the processed data at different aggregation levels. Figure 4 shows a snapshot of the live system: a map of the world colored by the download speeds of BT peers on a country basis, averaged over the last 24 hours. In the Figure, we have zoomed into Europe. The system can also reveal other statistics like country-level popularity of ISPs, or traffic management policies. In fact, the system is currently deployed as a commercial solution within a large ISP.

4     Data Collection and Inference Process

One of the contributions of Apollo is its ability to infer end-user perceived experience measurements by piggybacking on Bittorrent control messages. In this section we describe how measurements such as download speed and upload activity can be inferred with negligible overhead and without the collaboration of the end-user or the application provider. Even though the Bittorrent protocol does not provide explicit end-user experience measurements, the information embedded in the control messages allows us to derive very accurate estimates.

4.1      Bitfield and Download speed

From a remote BT peer, we collect the Bitfield, which is an array of 0s and 1s that specifies which pieces the BT peer has finished downloading. Before using the bitfield, we need to make sure it is accurate.
   Lazy Bitfield: Many BT clients implement a feature called Lazy Bitfield, where a client does not send the actual bitfield, but removes a number of pieces from it, and then subsequently notifies the other peer that it has these pieces by using have messages. Before processing the bitfield of a BT peer, we fold these messages into the bitfield, and we don't consider them to be actual have messages.
   Download speed: To estimate the download speed of a BT peer (see Figure 5), we connect more than once to the same BT peer to track its download progress. By comparing the processed Bitfields, we obtain a crude estimate of how fast the BT peer is downloading. The estimate of the download speed for a BT peer is given by the number of new pieces the BT peer downloaded between the two observations, multiplied by the size of the piece and divided by the time between the two observations.

4.1.1       Upload Activity

The same control messages from which we derive the download speed can also be used to estimate the upload activity of a client, although this time the measurement is not specific to Bittorrent but is the aggregate upload from all the applications running on the client. We infer the aggregate upload activity by analyzing the IP-ID field of the IP header. Analyzing the IP-ID has been used before in [2] [3] to count hosts behind NATs, and to study the upload performance.

The technique takes advantage of an implementation feature of the network stack of many popular Operating Systems, like many versions of Microsoft Windows. These Operating Systems use a global counter, the IP identification field (IP-ID; see footnote 3), that they monotonically increase by one for each sent packet. Other Operating Systems like Linux, Mac OS X and Solaris take different approaches and use different implementations. For clients whose Operating System implements the global counter, we can estimate their upload activity if we receive a sufficient number of packets.
   To accurately estimate the upload activity, we first need to decide whether the OS of the client uses a global counter. To detect this, we depend solely on the inter-arrival times of the packets and their IP-ID values, for packets that arrive within at most a minute.
   We pre-process the peers to remove cases that match any of the following criteria:

  1. A peer sends a packet with an IP ID value of 0. This
     is typically done by Linux for MTU discovery.

  2. The counter is not monotonically increasing. This
     can imply a pseudo-random generator. There can
     be false positives though if the counter wraps around.

  3. We receive fewer than 4 packets. We don't have
     enough information to infer the behavior of the OS.

  4. The IP ID counter increases by 4,000 or more. It
     is possible to have more than one machine behind
     a NAT downloading the torrents we monitor. We
     want to avoid such cases, because they can introduce
     an artificially high upload activity. This limit was
     used in [3] to detect multiple machines behind a NAT.

   For the remaining clients, the crude upload activity is computed by first finding the two packets whose inter-arrival time is the longest and by dividing the difference of their counters by their inter-arrival time. We measure the activity in packets per second.
   The upload activity we measure is the aggregate activity of both the Bittorrent application and all the other applications the client is running. The contribution of Bittorrent to the activity is typically dominated by TCP control packets, like TCP Acks, especially in the case of torrents that have many seeders.

4.2     Accuracy and Limitations of the data

Having shown that the end-user experience measurements can be inferred from control messages, we need to validate the inference process. To this end we set up a small test-bed experiment with DSL connections and a dedicated server to gather empirical data that will be compared with the estimates yielded by Apollo.
   The DSL connections are in an ISP with low upload capacity. The dedicated server has an Ethernet 100Mbps uplink/downlink connection. We tested in these setups with the most popular Bittorrent clients, which are Azureus, uTorrent and the Mainline Bittorrent software client. For the tests, we use popular torrent files from the PirateBay website and we join the Swarm. We collect tcpdump traces for all the tests that we run.
   Control messages Validation: First, we make sure that the Bittorrent software clients are sending their Bitfield, and that they don't send fake have messages, apart from those already described as part of the lazy bitfield. We should note here that there exists an extension to Bittorrent called Super Seeding, under which the initial seeder sends a fake Bitfield and have messages to speed up the initial dissemination of the torrent. Super Seeding can potentially be an issue that requires extra processing to be dealt with. For the processing we present in this paper it does not cause a problem, since we focus only on well seeded torrents where this feature is disabled.
   Download speed Accuracy: To validate what Apollo estimates as download speed, we consider the following scenarios: Bittorrent clients that are either fast or slow, downloading torrents that use either a small or a large piece size. The fast clients are emulated by using a dedicated server and by placing limits using the built-in traffic shaping mechanisms of the Linux kernel.
   What we expect to find is that the faster the Bittorrent client, i.e. the more pieces it finishes downloading within the time period we check, the better Apollo can approximate the download performance. Additionally, we expect to estimate the download speed more accurately when the torrent uses a smaller piece size than a larger piece size. In Figures 6(a), 6(b) and 6(c), we show three of the scenarios we run. The dashed line is the actual download speed of the clients, and the solid line is the one estimated by using Apollo and the have messages. The results are as expected. We can very accurately estimate the performance of fast clients or clients that are downloading torrents with a small piece size. On the other hand, Apollo does not perform as well on slow clients that are downloading torrents with a large piece size. The main reason is that a slow client will have many half-finished pieces that will not be reported until later. In the worst case, where we have a very slow client and a very large piece size like 4Mbit, the estimated speed will be zero. This is a limitation of our methodology. We aim to analyze the speed of BT clients sampled over a fixed time interval. This works well for the average and fast BT peers, but introduces a bias for the very slow peers.

   3 The IP-ID field is a two-byte field that is used to reassemble fragmented packets.
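The IP-ID based upload estimate of Section 4.1.1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function name `estimate_upload_pps` and its parameter names are hypothetical, "the two packets whose inter-arrival time is the longest" is read here as the pair of consecutive packets with the largest gap, and the 4,000 limit is applied to the counter increase between consecutive packets. Counter wrap-around is treated as a non-monotonic counter, which is the false positive mentioned in criterion 2.

```python
def estimate_upload_pps(packets, window=60.0, min_packets=4, max_jump=4000):
    """Estimate aggregate upload activity in packets per second.

    packets: list of (arrival_time_seconds, ip_id) tuples in arrival order.
    Returns None when the peer matches one of the exclusion criteria.
    """
    if not packets:
        return None
    # Only consider packets that arrive within the observation window.
    start = packets[0][0]
    packets = [p for p in packets if p[0] - start <= window]

    if len(packets) < min_packets:            # criterion 3: too few packets
        return None
    ids = [ip_id for _, ip_id in packets]
    if any(ip_id == 0 for ip_id in ids):      # criterion 1: IP ID of 0
        return None
    for prev, cur in zip(ids, ids[1:]):
        step = cur - prev
        if step <= 0:                         # criterion 2: not increasing
            return None
        if step >= max_jump:                  # criterion 4: likely NAT
            return None

    # Assumed reading: pick the consecutive pair with the longest gap.
    best_gap, best_delta = 0.0, 0
    for (t0, id0), (t1, id1) in zip(packets, packets[1:]):
        gap = t1 - t0
        if gap > best_gap:
            best_gap, best_delta = gap, id1 - id0
    if best_gap <= 0:
        return None
    return best_delta / best_gap


# Hypothetical trace: the counter advances by 500 during a 10 second gap.
trace = [(0.0, 1000), (0.5, 1010), (10.5, 1510), (11.0, 1520)]
print(estimate_upload_pps(trace))  # 50.0 packets per second
```

Using the longest gap limits the influence of timing jitter on the rate, since the same absolute timing error matters less over a longer interval.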

Figure 6: (a) Slow client - Small piece size (b) Average Client - Big Piece Size (c) Fast Client - Small Piece Size
[Plots omitted: each panel shows the Real and Estimated download speed against Time (min).]
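The download speed estimate validated above (Section 4.1) can be written as a short Python sketch. The function names, the list-of-ints bitfield representation, and the example values are assumptions made for illustration. The formula itself, new pieces times piece size divided by the time between the two observations, and the merging of lazy-bitfield have messages follow the description in the text.

```python
def merge_lazy_bitfield(bitfield, have_indices):
    """Fold the have messages of a lazy bitfield back into the bitfield."""
    merged = list(bitfield)
    for index in have_indices:
        merged[index] = 1
    return merged


def estimate_download_speed(first, second, piece_size_bytes, seconds_between):
    """Crude speed estimate in bytes per second from two processed bitfields."""
    if seconds_between <= 0:
        raise ValueError("observations must be separated in time")
    # Count pieces that are present in the second sample but not in the first.
    new_pieces = sum(1 for a, b in zip(first, second) if b and not a)
    return new_pieces * piece_size_bytes / seconds_between


# Hypothetical peer sampled twice, 300 seconds apart, 256 KiB pieces.
first = merge_lazy_bitfield([1, 0, 0, 0, 0, 0], have_indices=[1])
second = merge_lazy_bitfield([1, 1, 1, 0, 0, 0], have_indices=[3])
speed = estimate_download_speed(first, second, 262144, 300.0)  # 2 new pieces

# A slow peer that completed no piece in the interval yields an estimate of
# zero, which is the bias for very slow peers discussed above.
stalled = estimate_download_speed(first, first, 4 * 1024 * 1024, 300.0)
```

Because only completed pieces are visible, the resolution of the estimate is one piece per sampling interval, which is why larger piece sizes and slower peers give coarser results.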

4.3    Data: Post Processing

Top-600 torrents of Pirate Bay: Apollo can monitor any set of torrents we feed it. For the purposes of our system and paper we use the top-600 torrents of PirateBay, the most popular torrent website. We monitor large and well seeded torrents so that the download performance of Bittorrent will be influenced as much as possible by the characteristics of the access link and the QoS that the ISP provides.
   For the results in the next sections, we analyze data gathered between August 17 and August 24, 2009. The results are qualitatively the same across time. During the last week of August, the trackers of the PirateBay were shut down, but the community has replaced them with others.
   To aggregate the data on an ISP level, we use the BGP routing table from the routeviews [13] project and longest match on the prefixes. To map the IP of the user to a country, we use the IP Allocation data provided by the Regional Internet Registries (RIRs).
   Users from 213 countries download the top 600 torrents of Pirate Bay. The population of clients is biased towards English-speaking and European countries, while countries like Korea and Japan are under-represented in these torrents.

5    Applications

Apollo allows us to collect end-user perceived experience measurements of the Bittorrent ecosystem. So far we have described the system that allows us to sample millions of clients and to infer accurate measurements by piggybacking on the Bittorrent control messages. In this section, we show the benefits of having access to this wealth of realtime end-user perceived experience measurements that Apollo is able to collect. We argued in the introduction for the need for this kind of measurement; we now present some applications that make use of such measurements in two orthogonal domains.
   The first application lies in the context of business intelligence and policy making. Apollo's realtime measurements on Bittorrent can shed light on the controversial areas of traffic management and network transparency. Questions such as "who throttles?", "at what time?" and "which ISP is faster?" are of the utmost interest to ISPs, regulators and consumers. The other application built upon Apollo's measurements does not provide additional information, but it delivers a performance increase for Bittorrent itself. We show that prior realtime information on the client's downloading speed can improve the peer selection algorithm, resulting in a boost of Bittorrent performance.

5.1    Detecting Traffic Management Policies

How we define a BT Traffic Management Policy: To detect traffic management, we make the assumption that the fastest users of an ISP are able to download close to line speed throughout the day. If they show any signs of poor performance then this is a sign of BT traffic management. This corresponds to a policy where an ISP activates and deactivates the policy based on the time of the day. We will use the 99th percentile of the speeds of the BT peers. The 99th percentile reveals the download speeds of the fast BT peers and will not be affected by any misbehaving client implementation.
   British Telecom - Ground Truth: We start with the known traffic management policy of British Telecom, a large ISP in Great Britain. This policy can act as the ground truth and extra validation for our system. British Telecom reports on its website that it performs BT traffic management. Even though they don't provide details, there exist external reports [5] that describe their policy in sufficient detail. SamKnows is a company in Great Britain that is deploying the SamKnows Performance monitoring network [5], which relies upon volunteers to install a hardware-based solution in their broadband line. The hardware continually performs active measurements of a number of metrics, like latency, loss rates and speeds of various applications like Bittorrent, on a 24 hour basis. They monitor 25 British Telecom customers, and the test of interest is the one where they emulate Bittorrent traffic. They

Figure 7: Download Speeds for 2 ISPs in Great Britain. (a) British Telecom 99th perc (b) O2 99th perc
[Plots omitted: each panel shows Down Speed (Kbits) against 24 Hour Offset (GMT 0), with series "std" and "Perc 99.0 th".]
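The hourly 99th-percentile curves of the kind plotted in Figure 7 can be computed along the following lines. This Python sketch is illustrative only: the paper does not state which percentile definition it uses, so the nearest-rank method here is an assumption, as are the function names and the `(hour_of_day, speed_kbits)` sample format.

```python
from collections import defaultdict


def percentile(values, p):
    """Nearest-rank percentile for an integer p in 1..100 (assumed method)."""
    ordered = sorted(values)
    # Integer ceiling of p * n / 100, avoiding floating point rounding.
    rank = max(1, (p * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def hourly_percentile(samples, p=99):
    """Group (hour_of_day, speed_kbits) samples by hour and take the
    p-th percentile of the speeds observed in each hour."""
    by_hour = defaultdict(list)
    for hour, speed in samples:
        by_hour[hour].append(speed)
    return {hour: percentile(speeds, p) for hour, speeds in sorted(by_hour.items())}


# Hypothetical samples: 100 peers seen at 03:00 and 3 peers seen at 20:00.
samples = [(3, speed) for speed in range(1, 101)]
samples += [(20, 100), (20, 200), (20, 300)]
curve = hourly_percentile(samples)
```

With only a handful of samples in an hour, the 99th percentile collapses to the maximum observed speed, which is one reason a minimum sample size per ISP matters for this kind of analysis.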

found that while on non-peak hours the Bittorrent traffic can saturate the line, during peak hours it is consistently around 15% of the nominal capacity, while at the same time, HTTP-based traffic can saturate the line.
   Apollo, as depicted in Figure 7(a), can capture precisely the behavior that SamKnows reports. The download speed of the 99th percentile during non-peak hours has a max value of 2.9Mbits, while at peak hours the min value is 490Kbits, which corresponds to 16% of the max. This is the same result reported by SamKnows using hardware equipment installed at the end users' homes. Additionally, from Figure 7(a), we can also observe that the traffic management policies of British Telecom are active between 7am and 11pm (GMT).
   How widespread is traffic management? To automate the process of detecting traffic management, we use the coefficient of variation of the 99th percentile across the 24 hours. If the coefficient is over some threshold then the ISP is a candidate for enforcing BT traffic management. In our dataset, we have BT peers for over 900 ISPs that can potentially be analyzed to detect their policies. Some of these ISPs have a small sample size, just a few tens of Bittorrent peers, and thus we need to remove them for the subsequent analysis. To understand what the right trade-off is between the sample size and variability, we plot in Figure 8(a) the coefficient of variation versus the sample size. We see that a sample size of 200 or more removes a lot of the noise. Thus, out of these 900 ISPs we consider only the 215 ISPs that have at least 200 measured BT peers. This doesn't mean that with a smaller sample size we cannot detect the policy of an ISP. For example, in Figure 8(b), we show that even with 156 samples we can detect traffic management, but this requires manual intervention. Next, we want to address the assumption that the most popular torrents can saturate the fast clients of an ISP. In Figure 8(c), we have the 99th percentile for an ISP from Romania. We can clearly see that at night the fast clients of this ISP cannot really saturate their download. At peak hours, the 99th download percentile is approximately 15Mbits, while at night they can only receive 5 to 10 Mbits. To address this issue, we require at least 3 hours of consistently worse performance to avoid misclassification due to measurement noise, and we require these hours to be at local peak time. In total, using a threshold of 0.3 for the coefficient of variation, we found 17 ISPs that enforce BT traffic management policies and throttle the capacity across all their customers. Apart from ISPs in Canada, traffic management is widespread in Great Britain with 4 ISPs. Another country is Brazil, where 2 ISPs have active policies. There are 8 more countries where at least a single ISP is performing traffic management.
   Limitations of Apollo; the Case of a Canadian ISP: The Canadian Radio-television and Telecommunications Commission recently held hearings on the practices of traffic management [17]. Most of the ISPs revealed that they have deployed traffic management hardware within their network, without revealing details on their enforced caps. Apollo can detect the traffic management policies for the ISPs that don't have the policy constantly active. On the other hand, it struggles with the policy of Shaw, which seems to be always enabled or more discreet than the typical policies. In Figure 9(a), we compare the 99th percentile of Bell Canada to the 99th percentile of Shaw. We see that the fast users of Shaw download with a consistent download speed compared to fast users in Bell Canada. When we check other percentiles, though (Figures 9(b) and 9(c)), we see that other than the peak hours, the customers of Bell Canada outperform the customers of Shaw. Can this be the sign of traffic management? Bell has testified that they manage the traffic only during peak hours, while Shaw did not make this distinction. There can be many valid objections to whether these plots can classify Shaw as performing traffic management, and we will not characterize it as such.

5.2    Relative Ranking of ISPs per Country

The next application of Apollo is to provide a metric to compare ISPs by the Bittorrent download performance of their customers. This analysis requires preprocessing to

Figure 8: Analyzing the policy in isolation. (a) Coefficient of Variation 99th perc (b) ISP in Brazil - 156 Samples (c) ISP in Romania (Can not be saturated)
[Plots omitted: (a) Coefficient of Variation against Sample Size (Measured BT peers); (b) Down Speed (Kbits) against 24 Hour Offset (GMT -3), series "std" and "Perc 95.0 th"; (c) Down Speed (Kbits) against 24 Hour Offset (GMT +2), series "std" and "Perc 99.0 th".]

[Figure 9: Bell Canada & Shaw Communications. Two different traffic management policies in Canada. Panels: (a) 99th perc; (b) 70th perc; (c) 50th perc (Median). Each panel plots the std and the given percentile of download speed (Kbits) for Bell Canada and Shaw over a 24-hour offset (GMT -5).]

identify which ISPs we can compare without introducing any bias. Bittorrent customers at different ISPs can download different torrents, which can potentially have different download speed characteristics. Thus, there can be cases where comparing different ISPs should be avoided.

Content Similarity: To check which ISPs we can compare, we analyze how the popularity of the torrents we monitor changes within a country and across countries. To measure the similarity of the content we use the cosine similarity metric. Cosine similarity is a measure of similarity between two vectors of n dimensions; in our case the dimensions are the 600 torrents and the values are the number of clients that download each torrent. A similarity of 0 indicates that the two vectors are orthogonal (the two ISPs share no torrents), while a similarity of 1 indicates an exact match in relative popularity. We use the top 5 ISPs for 25 countries that have at least 1,000 Bittorrent clients. For every one of these ISPs, we calculate its cosine similarity with the other 124 ISPs. In Figure 10, we plot the ECDF of the cosine similarity for ISPs that are operating in the same country, and for ISPs that are operating in different countries. The figure shows that the similarity of the content for ISPs within the same country is high, typically over 0.9. Therefore, we can compare ISPs within the same country without introducing a bias.

Comparing using the average median download speed: We compare the download performance per country for the top 3 most popular ISPs in a country. Per ISP, we compute the average median download speed over a period of one month. We are interested in showing, per country, the percentage of improvement when comparing the slowest of the three ISPs to the other two ISPs. In Figure 11, we plot, for a number of countries, the percentage improvement of Bittorrent speeds within the country. Per country we have two bars. The first bar corresponds to comparing the slowest ISP in that country to the fastest of the three in the same country. The second bar corresponds to comparing the slowest ISP in that country to the second fastest in the same country. There exist significant differences within a country. Over 50% improvement is typical for many countries. We also find two countries with over 100% improvement, for example, GB. In some countries, for example Spain, Italy, Germany, France and Philippines, there is virtually no difference among the ISPs.

Zooming into the top-5 ISPs in the US: Finally, we zoom into the differences that exist in the US. In Figure 12, we compare the evolution of the 99th, 70th and 50th percentiles for the top-5 ISPs. We have randomized their order and keep them anonymized. There exist big differences, which are mainly attributed to the underlying technology the ISPs deploy. For example, we see ISP-5 to have the 3 Mbits performance that is typical for ISPs that offer

[Figure 10: Similarity of Content between ISPs. ECDF of the cosine similarity; curves: Between Countries, Inside Country.]

[Figure 11: Percentage Improvement of Bittorrent Downloads when comparing ISPs inside a Country. Countries shown: AR, BR; CA, US; DE, ES, FI, FR, GB, GR, IT, NL, NO, PL, PT, RO, SE; IL, SA, TR; AU, IN, PH.]
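The two bars per country in Figure 11 can be sketched as below. The paper does not spell out the formula, so this sketch assumes the improvement is the relative gain over the slowest ISP, 100 * (x - slowest) / slowest; the function name `improvement_over_slowest` and the sample speeds are hypothetical.

```python
def improvement_over_slowest(avg_median_speeds):
    """Percentage improvement of the faster ISPs relative to the slowest one.

    avg_median_speeds: dict mapping ISP name -> average median download
    speed (Kbits) over the measurement period.
    Returns the improvements ordered from the fastest ISP downwards, so for
    three ISPs the result is [fastest_vs_slowest, second_vs_slowest].
    """
    ranked = sorted(avg_median_speeds.values())
    slowest = ranked[0]
    if slowest <= 0:
        raise ValueError("slowest speed must be positive")
    return [100.0 * (s - slowest) / slowest for s in reversed(ranked[1:])]


# Hypothetical top-3 ISPs of one country.
bars = improvement_over_slowest({"ISP-A": 200, "ISP-B": 300, "ISP-C": 400})
```

With these made-up speeds the first bar is 100% and the second is 50%, that is, the fastest ISP doubles the median download speed of the slowest one.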

[Figure 12: Comparison of the top-5 ISPs in the US. Panels: (a) 99th perc; (b) 70th perc; (c) 50th perc (Median). Each panel plots download speed (Kbits) for ISP 1 to ISP 5 over a 24-hour offset (GMT -5).]
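The per-hour percentile curves shown in Figures 9 and 12 can be approximated from raw samples as sketched below. The paper does not state which percentile estimator it uses, so the nearest-rank method here is an assumption, as are the function names and the `(hour, speed)` sample layout.

```python
import math
from collections import defaultdict


def nearest_rank_percentile(values, p):
    """Nearest-rank percentile: smallest value with at least p% of the data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def hourly_percentiles(samples, percentiles=(50, 70, 99)):
    """Group (hour, speed_kbits) samples by hour and compute percentiles per hour."""
    by_hour = defaultdict(list)
    for hour, speed in samples:
        by_hour[hour].append(speed)
    return {
        hour: {p: nearest_rank_percentile(speeds, p) for p in percentiles}
        for hour, speeds in sorted(by_hour.items())
    }


# Hypothetical samples: speeds 1..100 Kbits at hour 0, two samples at hour 1.
samples = [(0, s) for s in range(1, 101)] + [(1, 10), (1, 20)]
curves = hourly_percentiles(samples)
```

Plotting `curves[hour][99]`, `curves[hour][70]` and `curves[hour][50]` against the hour gives one line per panel for a single ISP.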

DSL connectivity. Another interesting observation is that while ISP-1 is not the fastest ISP when we consider the 99th percentile, it provides the best performance for the 70th and the 50th percentiles. The drop in performance we observe at night is the result of the decline in the number of Bittorrent clients during the night, not of the traffic management policy. This conclusion can be drawn from the Bittorrent demographics.

5.3 Effect of Public Policy on Customer Behavior

PlusNet is the only ISP that we are aware of that publishes its traffic management policy in detail [16] and makes it easily accessible to its customers. The policy is described in Table 1. The policy contains the download caps per hour, and how they change for different categories of customers. PlusNet is not throttling the upload.

Table 1: PlusNet Traffic Management - Download (P2P)

Time             Value    Unlimited   Pro
00:00 - 12:00    256Kb    Line        Line
12:00 - 14:00    164Kb    512Kb       Line
14:00 - 16:00    128Kb    256Kb       Line
16:00 - 18:00    100Kb    164Kb       Line
18:00 - 22:00    50Kb     128Kb       Line
22:00 - 23:00    100Kb    256Kb       Line
23:00 - 00:00    128Kb    512Kb       Line

[Figure 13: Evolution of number of Bittorrent clients within PlusNet and British Telecom. Left axis: PlusNet peers; right axis: BT peers; x-axis: 24-hour offset (GMT 0); curves: PlusNet Clients, BT Clients.]

Studying PlusNet allows us to analyze the potential effect that a public policy has on the customers of an ISP. In Figure 13, we show how the population of Bittorrent peers in PlusNet and British Telecom evolves within the day. The population is the average population of Bittorrent clients calculated over the interval of one week. Regarding PlusNet, at peak hours we have the lowest number, around 800 BT peers, while at off-peak hours it goes to over 1,200. This is the opposite of the behavior in the rest of the ISPs in Great Britain, where there are more BT peers at local peak hours than during off-peak hours, because that's when people return from work and start using their home computer. For example, for British Telecom, the Bittorrent population size changes from 18,000 BT peers at off-peak hours (2am-9am) to

24,000 at peak hours, the opposite of the behavior of the PlusNet customers, and this is consistent across all ISPs in Great Britain, and across different time intervals.

Why do some PlusNet customers demonstrate such a different behavior? We can only conjecture that these customers are influenced by the public policy of PlusNet. This is the main and only difference that we are aware of between PlusNet and the rest of the ISPs in Great Britain. The PlusNet customers know that they will be throttled to 128 Kbits-256 Kbits during peak hours and thus won't be able to download fast, so they back off from using Bittorrent. Why don't they just let Bittorrent run in the background? First, using Bittorrent has a cost for the end-users in the form of consumed upload capacity, which is a scarce resource for DSL users. Second, they expect to receive download speeds at line speed at night, 3 Mbits to 8 Mbits. One hour of downloading at 128 Kbits is equivalent to just 2.5 minutes of downloading at 3 Mbits. The PlusNet case can potentially reveal how important transparency

5.4 Boosting Download Performance of Bittorrent

Finally, we show that data collected by Apollo are not limited to evaluating the performance of Bittorrent clients, but can also be used to improve their download performance by designing more efficient management strategies. Apollo can provide global information about the P2P ecosystem, such as detailed performance characteristics of Bittorrent clients for ISPs across the world at the resolution of an hour. A Bittorrent client can use the Apollo data to quickly find other fast Bittorrent clients, and thus speed up the stratification among Bittorrent peers. To provide an example, consider a popular torrent with 50,000 BT peers. A Bittorrent peer can scan the swarm at a speed of, in the best case, 10 to 20 seconds per remote BT peer. This means that a peer will require roughly 6 to 12 days to test all of them. If we assume that a typical download lasts one hour, then the BT peer would have the opportunity to test in the best case 0.7% of the swarm (360 peers at 10 seconds each). This is certainly not optimal. We want to bootstrap and speed up the stratification effect of Bittorrent by using the detailed performance characteristics of Bittorrent clients for ISPs across the world that we collect at the resolution of an hour.

We show this potential by modifying the mainline BitTorrent client. Specifically, we introduce two changes: a proactive neighborhood manager and a selective choking/unchoking algorithm, without tampering with the Bittorrent algorithm. A similar approach has been studied in [11].

Proactive Neighborhood Manager: We use the upload activity estimation of Apollo as an input to estimate the speed of the current neighbors for each IP in the pool. Every 30 seconds, we select the n inactive neighbors with the lowest estimated speed. From these n neighbors, we close those for which we can find a faster peer in the IP pool. Thus, for each closed peer, we open a new faster peer.

Selective Choking/Unchoking Algorithm: Connecting with faster neighbors is not enough to get a faster download because neighbors may not send any data until they get unchoked. During the discovery phase of fast neighbors, each neighbor has the same probability of being selected, so the discovery process is basically a blind search process. We modify the discovery process by giving a higher priority to those neighbors for which Apollo estimates a higher speed.

To validate the potential of Apollo, we select a torrent with 45,000 Bittorrent peers in the swarm. We download the torrent using an unmodified mainline Bittorrent client as well as our Apollo-assisted client. We first download 50 MBs in order to have data to send to other Bittorrent clients, and then we run the two versions to download the next 50 MBs. We run the measurement 16 times, alternating the two versions of the clients. We used two different scenarios, one with 1.5 Mbps and one with 4 Mbps of upload capacity.

Figure 14 shows the CDF of the download speed in the 16 runs. In the case of 1.5 Mbps, the download speed of the mainline client is less than 100 Kbps 40% of the time. This result demonstrates the inefficiency of the blind unchoking algorithm in discovering good neighbors. With the help of Apollo, our client is able to converge much faster and spends only 4% of its time with a throughput of less than 100 Kbps. Comparing the 2 scenarios, we can conclude that the standard Bittorrent client is not able to take advantage of the higher upload capacity. Averaging the results, in the case of 1.5 Mbps, the mainline needs 860 seconds to download 50 MBs, whereas with Apollo we only need 540 seconds. In the case of 4 Mbps, the times are 700 and 270 seconds, respectively.

[Figure 14: Boosting the performance of a Bittorrent client. CDF of the download speed; x-axis: Speed (Kbps), logarithmic from 1 to 10000; legend entries recovered: Mainline(Slow), Mainline(Fast), Apollo(Slow).]
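The two client changes described in Section 5.4 can be sketched as follows. This is a minimal illustration under stated assumptions, not the modified mainline client: the `Neighbor` class, the function names, and the pool layout are hypothetical, and the paper only says that faster neighbors get "a higher priority", so weighting the random choice proportionally to the estimated speed is our assumption.

```python
import heapq
import random
from dataclasses import dataclass


@dataclass
class Neighbor:
    ip: str
    est_speed_kbps: float  # speed estimate derived from Apollo data (hypothetical field)
    active: bool           # True if we are currently exchanging data with this peer


def replace_slow_neighbors(neighbors, pool, n):
    """One round of the proactive neighborhood manager (run every 30 seconds in the paper).

    neighbors: list of currently connected Neighbor objects.
    pool: dict mapping the IP of a known but unconnected peer to its estimated speed.
    Returns (closed_ips, opened_ips); each closed peer is paired with one faster peer.
    """
    inactive = [nb for nb in neighbors if not nb.active]
    # The n inactive neighbors with the lowest estimated speed, slowest first.
    slowest = heapq.nsmallest(n, inactive, key=lambda nb: nb.est_speed_kbps)
    # Candidate replacements, fastest first.
    candidates = sorted(pool.items(), key=lambda kv: kv[1], reverse=True)
    closed, opened = [], []
    for nb in slowest:
        if not candidates or candidates[0][1] <= nb.est_speed_kbps:
            # No faster peer is left; the remaining neighbors are even faster.
            break
        ip, _speed = candidates.pop(0)
        closed.append(nb.ip)
        opened.append(ip)
    return closed, opened


def pick_optimistic_unchoke(choked, rng=random):
    """Pick one choked neighbor, favoring higher estimated speeds.

    A blind search would use uniform weights; here the weights are the
    estimated speeds (floored at 1 so that unknown peers can still be picked).
    """
    weights = [max(nb.est_speed_kbps, 1.0) for nb in choked]
    return rng.choices(choked, weights=weights, k=1)[0]


# Hypothetical state: three inactive neighbors, one active, three pool candidates.
current = [
    Neighbor("a", 50, False),
    Neighbor("b", 500, False),
    Neighbor("c", 20, True),
    Neighbor("d", 2000, False),
]
closed, opened = replace_slow_neighbors(current, {"x": 1000, "y": 100, "z": 10}, n=2)
```

In this toy run only neighbor "a" is replaced (by "x"): the next slowest inactive neighbor, "b", is already faster than every remaining candidate in the pool, so it is kept. Active neighbors are never closed, which keeps the sketch from interrupting ongoing transfers.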

6 Future Work

In the previous section, we presented two applications that can be built on top of the data collected by Apollo. This is just a fraction of what can be achieved by having access to end-user perceived measurements. In this section we want to hint at another application of Apollo's data, this time aimed at real-time monitoring of the network status.

Due to the decentralized nature of the network infrastructure, only organizations with machines in multiple ISPs can have a holistic understanding of the network infrastructure. Even ISPs have a limited view of the whole system, since they can only gather information from their backbone and transit links. Thus, all network status monitoring is based on local measures and suffers from the limitations of local knowledge. For instance, it is trivial for an ISP to detect congestion of a transit link; however, the explanation of that congestion is far from evident, as the field of network tomography can tell.

The data collected by Apollo can be used to inform methods and algorithms that can detect those links that are experiencing network problems. Bittorrent PEX gives information about the connections among millions of clients distributed across hundreds of ISPs. We can easily tap into the stratification of Bittorrent [10] by observing in real time hundreds of millions of connections across time.

7 Related Work

Improving Network Transparency and Network Neutrality: Evaluating network neutrality and detecting the ISPs that discriminate against a service has attracted a lot of attention recently [7] [18] [21] [19] [22]. The Glasnost project [7] tries to detect the behavior of ISPs by providing a Java applet that allows users to test, through a web browser, whether an ISP sends fake TCP control packets. In [18] [19], the authors try to lay the foundation of a service that will utilize the measurements collected at end-users to discover the ISPs that discriminate against specific services. They propose statistical methods that can be used to accurately distinguish the discrimination from typical performance problems that can appear in the daily operation of a network. In [21] [22], they provide evidence of network neutrality violations from the back-

most related to the following papers [11, 9, 14, 15] that estimate the upload capacity of a Bittorrent client by analyzing the data received from a client during optimistic unchoke. For example, in [11], they use 367 PlanetLab nodes and 120 torrents in 48 hours to process 301,595 distinct IP addresses and collect sufficient data to characterize 70,428 of them. Among them, the most relevant to our work is [9], where they are using a similar technique. The scale of the measurement though is completely different. For example, they measure 500,000 end hosts per week by using 8 different machines. The main reason for this is that they also do exchanges on the data level of the Bittorrent protocol, something that inadvertently impacts the number of clients they can analyze. Finally, in [20], the authors use collected traces to characterize the performance of p2p streaming flows for DSL and Cable customers.

Measuring Differences between ISPs and Characterizing Residential Networks: Netdiff [12] is a system that enables the comparison of ISPs by measuring network metrics like latency. It is based on active measurements and can accurately infer the characteristics of ISPs within 15 minutes. Characterizing residential broadband networks and analyzing differences between ISPs has been studied in [6]. They analyzed DSL and Cable ISPs and measured the performance they provide to their end-users. Additionally, they provide a small evaluation of whether ISPs are throttling.

Using the IP Identification field: Analyzing the IP-ID has been used before in [2] [3] to count hosts behind NATs, and to study the upload performance.

8 Conclusions

In this paper, we presented the Apollo system that can perform large global scale application-based measurements. Based just on the control messages of Bittorrent, we showed that such a measurement system is not only possible but also very efficient. By using a single commodity server, we demonstrated that our system can collect in real-time the user perceived performance of millions of Bittorrent users. In total, our data spans a period of one year, allowing us to design applications that
bone ISPs. They discover these violations by using their            are built on top of this data. Our system can improve
monitoring system NVLens, and performing active mea-                Network Transparency from a Bittorrent point of view by
surement of loss rates per application. An extensive list           (a) identifying in real-time the traffic management poli-
of tools and projects on active measurements to detect              cies of ISPs, (b) proving a means to compare the BT re-
ISP discrimination and improve network transparency,                lated QoS that different ISPs offer, and (c) enhancing the
can be found in the EFF website [8]. These approaches               download speed of existing BT clients. Finally, we also
are very interesting, and complement our own work.                  implemented an API to interact with Apollo to retrieve
   Measuring Bittorrent Performance: Our work is                    real-time performance information.

References

 [1] Tor: Anonymity online. http://www.torproject.org/.
 [2] Steven M. Bellovin. A technique for counting natted hosts. In IMW, pages 267–272, 2002.
 [3] Weifeng Chen, Yong Huang, Bruno F. Ribeiro, Kyoungwon Suh, Honggang Zhang, Edmundo de Souza e Silva, James F. Kurose, and Donald F. Towsley. Exploiting the ipid field to infer network path and end-system characteristics. In PAM, pages 108–120, 2005.
 [4] David R. Choffnes and Fabián E. Bustamante. Taming the torrent: a practical approach to reducing cross-isp traffic in peer-to-peer systems. SIGCOMM, 2008.
 [5] Sam Crawford. Performance monitoring report. http://www.samknows.com/broadband/pm/PM Summer 08.pdf.
 [6] Marcel Dischinger, Andreas Haeberlen, Krishna P. Gummadi, and Stefan Saroiu. Characterizing residential broadband networks. In Proc. of ACM IMC ’07, pages 43–56, 2007.
 [7] Marcel Dischinger, Alan Mislove, Andreas Haeberlen, and Krishna P. Gummadi. Detecting bittorrent blocking. In Proc. of IMC ’08, 2008.
 [8] Electronic Frontier Foundation. Software for keeping isps honest. http:
 [9] Tomas Isdal, Michael Piatek, Arvind Krishnamurthy, and Thomas Anderson. Leveraging bittorrent for end host measurements. In Proc. of PAM ’07, 2007.
[10] Arnaud Legout, Nikitas Liogkas, Eddie Kohler, and Lixia Zhang. Clustering and sharing incentives in bittorrent systems. In Proc. of ACM SIGMETRICS ’07, 2007.
[11] H. V. Madhyastha, T. Isdal, M. Piatek, C. Dixon, T. Anderson, A. Krishnamurthy, and A. Venkataramani. iPlane: an information plane for distributed services. In OSDI, 2006.
[12] Ratul Mahajan, Ming Zhang, Lindsey Poole, and Vivek Pai. Uncovering performance differences among backbone isps with netdiff. In NSDI’08.
[13] University of Oregon. Route view project. http:
[14] Michael Piatek, Tomas Isdal, Thomas Anderson, Arvind Krishnamurthy, and Arun Venkataramani. Do incentives build robustness in BitTorrent? In Proc. of NSDI’07, Cambridge, MA, 2007.
[15] Michael Piatek, Tomas Isdal, Arvind Krishnamurthy, and Thomas Anderson. One hop reputations for peer to peer file sharing workloads. In Proc. of NSDI’08, 2008.
[16] PlusNet. Detailed traffic management policy. http://www.plus.net/support/broadband/quality broadband/speed.shtml.
[17] Canadian Radio-television and Telecommunications Commission. Review of the internet traffic management practices of internet service providers. http://www.crtc.gc.ca/PartVII/eng/2008/8646/c12 200815400.htm.
[18] Mukarram Bin Tariq, Murtaza Motiwala, and Nick Feamster. Nano: Network access neutrality observatory. In ACM HotNets-VII, 2008.
[19] Mukarram Bin Tariq, Murtaza Motiwala, Nick Feamster, and Mostafa Ammar. Detecting network neutrality violations with causal inference. In ACM Conext, 2009.
[20] Chuan Wu, Baochun Li, and Shuqiao Zhao. Characterizing peer-to-peer streaming flows. Selected Areas in Communications, IEEE Journal on, 25(9):1612–1626, December 2007.
[21] Ying Zhang, Zhuoqing Morley Mao, and Ming Zhang. Ascertaining the reality of network neutrality violation in backbone isps. In Proc. of ACM HotNets-VII, October 2008.
[22] Ying Zhang, Zhuoqing Morley Mao, and Ming Zhang. Detecting traffic differentiation in backbone isps with netpolice. In ACM IMC, 2009.

