                   COMPUTING ENVIRONMENT

                                Narottam Chand1, R.C. Joshi2 and Manoj Misra2
                                   Department of Computer Science and Engineering
                                 National Institute of Technology Hamirpur, INDIA
                                               Email: nar@nitham.ac.in
                                Department of Electronics and Computer Engineering
                                   Indian Institute of Technology Roorkee, INDIA
                                      Email: {joshifcc, manojfec}@iitr.ernet.in

Caching is a key technique in mobile computing environment for improving data retrieval performance. Due to cache size limitations, cache replacement algorithms are used to find a suitable subset of items for eviction from the cache. It has been observed that cached items in a client are related to each other, and therefore the replacement of a highly associated data item may lead to a series of misses during the client's subsequent requests. The existing policies for cache replacement in mobile environment do not consider the relationship among data items along with the caching parameters. This paper proposes a novel cache replacement policy, R-LPV, that considers the caching parameters of a data item along with the relationship of this item with the cache set. Association rule based data mining is applied to find the relationship among data items. The simulation experiments show that the R-LPV policy substantially outperforms two other policies, namely I-LRU and Min-SAUD.

               Keywords: Cache replacement, mobile, invalidation, data mining, profit, wireless.

1   INTRODUCTION

     A promising technique to improve the performance in mobile environment is to cache frequently accessed data on the client side [1-4, 11, 19, 20]. Caching at the mobile client can relieve the bandwidth constraints imposed on wireless and mobile computing. Copies of remote data from the server can be kept in the local memory of the mobile client, substantially reducing user requests for retrieval of data from the origin server. This not only reduces the uplink and downlink bandwidth consumption, but also the average query latency. Caching frequently accessed data in the mobile client also saves the energy used to retrieve repeatedly requested data. Due to the limited cache size at a mobile client, it is impossible to hold all accessed data items in the cache. Thus, cache replacement policies are used to find a suitable subset of data items for eviction. The performance of a caching system depends heavily on the replacement algorithm used to dynamically select a subset of items for eviction from the finite cache space.

     Cache replacement algorithms have been extensively studied in the context of operating systems, virtual memory management and database buffer management [5]. In these contexts, cache replacement algorithms usually maximize the cache hit ratio by attempting to cache the items that are most likely to be accessed in the near future. In contrast to the typical use of caching techniques in these areas, client side data caching in mobile data dissemination has the following characteristics [6-8]:
1. Cached data items may have different sizes, and thus a replacement policy needs to be extended to handle items of varying sizes. As to the size factor, the cost to download data items from the server may vary. As a result, the cache hit ratio may not be the best measurement for evaluating the quality of a cache replacement algorithm.
2. Data items may be constantly updated at the server side. Thus the consistency issue shall be considered. That is, data items that tend to become inconsistent earlier should be replaced earlier.
3. Mobile clients may frequently disconnect voluntarily (to save power and/or connection cost) or due to failure.
4. A client may move from one cell to another.
     With the consideration of the above issues, the endeavor to find an efficient replacement algorithm in mobile environment becomes an even more difficult problem compared to traditional cache replacement. To utilize the limited resources at a mobile client, cache replacement policies for mobile on-demand broadcasts with varying sizes have been investigated in the recent

                     Ubiquitous Computing and Communication Journal                                              1
past [7, 8]. These policies consider various caching parameters, namely data access probability, update frequency, retrieval delay from the server, cache invalidation delay, and data size, in developing a benefit function which determines the cached item(s) to be replaced. Incorporating these parameters into their designs, these cache replacement algorithms show significant performance improvement over conventional ones (for example, the LRU and LFU algorithms).

     Data items queried during a period of time are related to each other [9]. Hence, the cached items in a client are associated with each other, and therefore the replacement of a cached item cannot be seen in isolation. Replacement of a data item which is highly related to a cache subset may lead to a series of cache misses in the client's subsequent requests. Data mining research deals with finding associations among data items by analyzing a collection of data [10]. In our replacement algorithm, the access history of a client is mined to obtain association rules. Then the confidence value and the caching parameters (update frequency, retrieval delay from the server, cache invalidation delay, and data size) of the data item in the consequent of the rules are used to compute the benefit function for replacement. In contrast to the use of data access probability, our policy uses the confidence value, which is a conditional probability.

1.1 Motivation
     We have identified some characteristics of the mobile environment that affect client side caching performance:
1. Various parameters of a data item, viz. update frequency, retrieval delay from the server, cache invalidation delay, and data size, contribute towards the benefit gained by caching the item.
2. Data items queried during a period of time are related to each other. The choice of forthcoming items can depend, in general, on a number of previously accessed items.
3. Cached items in a client are related to each other, and therefore replacement of a data item which is highly associated may lead to a series of misses during the client's subsequent requests.
4. Cache misses are not isolated events, and a cache miss is often followed by a series of cache misses.
5. The association among data items can be utilized to make the replacement decision.
     In view of the above, the motivation for our study is to design a novel cache replacement algorithm for the mobile environment. To find the relationship among data items, an association rule based data mining technique is used. Consider the following example to illustrate the caching characteristics in mobile environment:
Example 1. Consider two association rules d1→d2 and d3→d4 with confidences of 90 percent and 40 percent, respectively. Here di denotes the data item i. Assume that the fetching delay of item d2 is 2 and that of item d4 is 4. The expected delay saving from caching d2 is then equal to 1.8 (=0.9×2), whereas the expected delay saving from caching d4 is equal to 1.6 (=0.4×4). In this case, caching d2 (i.e., the item with the lower fetch delay) is in fact more beneficial than caching d4 (i.e., the item with the higher fetch delay). A similar explanation also holds for the other caching parameters. Thus, not only the caching parameters of a data item but also the confidence shall be taken into account when devising the cache replacement algorithm. In conclusion, a caching strategy that considers the relationship among cached items along with the other parameters is a better choice in mobile environment.

1.2 Paper contribution
     To maximize the performance improvement due to caching, we propose a novel cache replacement algorithm that considers not only the caching parameters of a data item but also the relationship of this item with the cache set. Association rule based data mining is applied to find the relationship among data items. We design a profit function to dynamically evaluate the profit from caching an item. Simulation is performed to evaluate the performance of our algorithm under several circumstances. The experimental results show that our algorithm considerably outperforms other algorithms.
     More precisely, this paper makes the following contributions:
1. A data mining algorithm to generate association rules with only one item in the consequent and one or more items in the antecedent. Our motivation is to find several items that are highly related to the item to be replaced.
2. Development of a cache replacement policy that considers data association rules along with caching parameters.
3. Extensive simulation to evaluate the performance of the proposed cache replacement algorithm.

1.3 Organization
     The rest of the paper is organized as follows. The related work is described in Section 2. The system model is presented in Section 3. Section 4 describes data mining techniques to generate the caching rules. Section 5 presents the details of the proposed cache replacement policy. Section 6 is devoted to the performance evaluation and presents simulation results. Concluding remarks are given in Section 7.

2   RELATED WORK

     Caching frequently accessed data on the client

side is an effective technique to improve performance in mobile environment [1, 12]. A lot of research has been done on cache invalidation in the past few years [1-4, 11, 19, 20], with relatively little work on cache replacement methods. In the following, we briefly review related studies on cache replacement in mobile environment.

     Cache replacement policy is widely studied in web proxy caching. Recently, many new deterministic replacement schemes, such as GD-Size [13], LNC-R-W3-U [14], LRV [15], and Hybrid [16], have been studied particularly for the Web proxy cache. W.-G. Teng et al. [17] have proposed a novel proxy caching strategy based on a normalized profit function, called IWCP (integration of Web caching and prefetching). This algorithm not only considers factors such as the size, fetching cost, reference rate, invalidation cost, and invalidation frequency of a Web object, but also exploits the effect caused by various Web prefetching schemes. Since access latency, intermittent connectivity, and the limited power and memory capacity of mobile devices are the characteristics of today's mobile environment, the algorithms deployed in Web proxy caches cannot be adapted directly to manage a mobile cache.

     The cache replacement issues for wireless data dissemination were first studied in the Broadcast Disk (Bdisk) project [12]. Acharya et al. proposed a cache replacement policy, PIX, in which the cached data item with the minimum value of p/x is evicted for cache replacement, where p is the item access probability and x is its broadcast frequency. Thus, a retained cached item either has a high access probability or a long broadcast delay. A simulation based study showed that this strategy could significantly improve the access latency over the traditional LRU and LFU policies.

     Caching and its relation with the broadcast schedule in the Bdisk system was empirically investigated in [18]. Caching and scheduling affect each other. A scheduling scheme determines data retrieval costs, and thus affects the caching policy. On the other hand, caching affects scheduling as well, since it reduces client access requests to the server and thus changes the clients' access patterns. This gives rise to a circular problem. In [18], some interesting results were discovered through simulation. Efficient scheduling can provide performance improvement when caches are small. However, schedules need not be very skewed when large caches are used, and in this case efficient caching algorithms should be favored over refined broadcast schedules. It remains open how to design a cooperative protocol such that the performance is optimized.

     All of the above studies are based on simplifying assumptions, such as fixed data sizes, no updates, and no disconnections, thereby making the proposed schemes impractical for a realistic mobile environment.

     [4] uses a modified LRU, called invalid-LRU (I-LRU), where invalid cache items are replaced first. In case there is no invalid cache item, the client considers the oldest valid item for replacement.

     In [6], Xu et al. developed a cache replacement policy, SAIU, for wireless on-demand broadcast. SAIU took into consideration four factors that affect cache performance, i.e., access probability, update frequency, retrieval delay, and data size. However, an optimal formula for determining the best cached item(s) to replace based on these factors was not given in this study. Also, the influence of the cache consistency requirement was not considered in SAIU. Xu et al. [6, 7] propose an optimal cache replacement policy, called Min-SAUD, which accounts for the cost of ensuring cache consistency before each cached item is used. It is argued that cache consistency must be required since it is crucial for many applications such as financial transactions, and that a cache replacement policy must take into consideration the cache validation delay caused by the underlying cache invalidation scheme. Similar to SAIU, Min-SAUD considers different factors in developing the gain function which determines the cache item(s) to be replaced. Although the authors proved that their gain function is optimal, they did not show how to obtain such an optimal gain function. Since this approach needs an aging daemon to periodically update the estimated information of each data item, the computational complexity and energy consumption of the algorithm are too high for a mobile device.

     Huaping Shen et al. [11] proposed an energy-efficient utility based replacement strategy with O(logN) complexity, called GreedyDual Least Utility (GD-LU), where N is the number of items in the cache.

     Recently, L. Yin et al. [8] proposed a generalized value function for cache replacement in wireless networks under the strong consistency model. The distinctive feature of the value function, in contrast to [7], is that it is generalized and can be used for various performance metrics by making the necessary changes. The disadvantage is that the strategy suffers from high computational complexity.

3   SYSTEM MODEL

     This paper studies cache replacement in mobile environment. Fig. 1 depicts a typical system model used during the study. As illustrated, there is a cache management mechanism in a client. The client employs a rule generating engine to derive caching rules from the client's access log. The derived caching rules are stored in the caching rule depository of the client. Whenever an application issues a request, the cache request processing module first logs this request into the record and checks whether the desired data item is in the cache. If it is a cache

hit, the cache manager still needs to validate the consistency of the cached item with the copy at the origin server. To validate the cached item, the cache manager retrieves the next validation report from the broadcast channel. If the item is verified as being up to date, it is returned to the application immediately. If it is a cache hit but the value is obsolete, the cache manager sends an uplink request to the server and waits for the data broadcast. When the requested data item appears on the broadcast, the cache manager returns it to the requester and retains a copy in the cache. In the case that a cache miss occurs, the client cache manager sends an uplink request to the server for the missed item. When the requested data item arrives on the wireless channel, the cache manager returns it to the requester and retains a copy in the cache. The issue of cache replacement arises when the free cache space is not enough to accommodate a data item to be cached. We develop an optimal cache replacement scheme that incorporates a profit function in determining the cache item(s) to be replaced. Various caching parameters along with replacement rules are considered while computing the profit value.

     To address the cache consistency issue, a strategy based on the update report (UR) [4] has been used in this paper. The strategy periodically broadcasts update reports (URs) to minimize uplink requests and downlink broadcasts. To reduce the query latency, the strategy uses request reports (RRs), where all the recently requested items are broadcast.

Figure 1: System model for cache replacement.

4    GENERATION OF CACHING RULES

     At a mobile client, data items queried during a period of time are related to each other. Observation of the history of data items queried by the client may lead to finding relationships among the items. These relationships can take the form of patterns of behavior that can tell us that if the client has accessed certain items during a period of time, then it is likely that one particular item will be accessed in the near future. An example of such a relationship is "if a client accesses d1 and d2 then it accesses d3 80 percent of the times". The if part of the rule is called the antecedent, while the then part is called the consequent.
     The rule presented above is known as an association rule in the data mining literature [10, 21, 22]. We propose to use a data mining technique to discover the association rules in the access history and apply the rules to make the replacement decision. We will call them the caching rules. The problem of finding association rules among items has been clearly defined by Agrawal et al. [10]. Our context, the mobile environment, imposes different conditions, and thus a direct application of existing association rule mining algorithms, such as those presented in [21], is not applicable to generate the caching rules:
−  We are interested in rules with one or more data items in the antecedent. We restrict the maximum number of items in the antecedent because the computation of rules where, for example, nine items imply a tenth one is expensive. In our opinion, these rules are not more useful than those where three items imply another one, and whose computation requires less effort, thus making the process more suitable for a mobile client. Our motivation is to retain in cache a data item that is highly related to the data items present in the cache. For example, out of the two rules {d1, d2, ..., d9}→d10 and {d1, d2, d3}→d10, the latter will be more appropriate to find the relation of d10 with cached items when d5 and d6 are not present in the cache. Due to cache replacement, all the items in the antecedent of a long rule may not be present in the cache, and hence such rules are rarely used.
−  We want to generate rules that have just a single item in the consequent. This is because, when considering a data item di for replacement, the confidence value of the rules having the item di in the consequent will be used.
     In the following, a formal statement of the problem of mining caching rules is presented.

Problem statement
     Let D = {d1, d2, ..., dN} be the set of data items at the server. Suppose a client's access trace S consists of a set of consecutive parts: {part1, part2, ..., parti, ..., partn}. Let A = {d1, d2, ..., dm} denote the set of data items accessed by the client. Let Si denote the data items contained in part parti. Si is called a session and Si ⊂ A. We say a session Si contains x if Si ⊇ x, where x ⊂ A. A caching rule rx,y is an expression of the form x→dy, where x ⊂ A, dy ∈ A, and x ∩ {dy} = φ. x is called the antecedent of the rule and dy is called the consequent.
     In general, a set of data items is called an itemset. The number of data items in an itemset is called the size of the itemset, and an itemset of size k is called a k-itemset. The support of an itemset x, support(x), is defined as the percentage of sessions that contain x in the client's access trace S. The support of a

caching rule rx,y, support(rx,y), is defined as the support of the itemset that consists of the data items in both the antecedent and the consequent of the rule, i.e., support(rx,y) = support(x ∪ {dy}).
     We define the confidence of a caching rule rx,y, confidence(rx,y), as the support of the rule divided by the support of the antecedent:

     confidence(rx,y) = (support(x ∪ {dy}) / support(x)) × 100%.

     In general, cx,y denotes confidence(rx,y), and rx,y is expressed as x → dy with confidence cx,y.
     The confidence of a caching rule is the conditional probability that a session in S contains the consequent given that it contains the antecedent. Given an access trace S, the problem of mining caching rules is to find all the association rules that have support and confidence greater than the user-defined minimum support (minsup) and minimum confidence (minconf), respectively.
     The problem of mining caching rules can be decomposed into the following subproblems [10]:
1. Find all the itemsets x such that support(x) ≥ minsup. An itemset x that satisfies this condition is called a frequent itemset. See Section 4.1.
2. Use the frequent itemsets to generate association rules with minimum confidence. See Section 4.2.

4.1 Algorithm to generate frequent itemsets
     In this section, we present an algorithm to generate frequent itemsets from the client's access trace. Table 1 shows the notations used in the algorithm. Fig. 2 shows the main steps of the algorithm. It accepts an access trace S, a minimum support (minsup), and the maximum number of items NR to be used in a rule as parameters. In line 1, S is analyzed to generate the frequent 1-itemsets. This is done by calculating the support of each data item and comparing it to the minimum support. Every data item that has minimum support forms one frequent 1-itemset.

Table 1: Notations.

NR          Maximum number of items in a rule
k-itemset   An itemset with k items
Fk          The set of frequent k-itemsets (those with minimum support)
f1, f2      Frequent (k-1)-itemsets within Fk-1
f1[m]       m-th item in itemset f1
f           A new frequent k-itemset obtained by combining a frequent (k-1)-itemset with one item

     The loop from line 3 to line 20 is used to generate all the frequent 2-, 3-, ..., k-itemsets. Each iteration of the loop, say iteration k, generates frequent k-itemsets based on the (k-1)-itemsets generated in the previous iteration. This loop continues until it is not possible to generate new itemsets or the number of items in an itemset exceeds the predefined maximum NR. Lines 3-14 generate all the new candidate frequent k-itemsets out of the frequent (k-1)-itemsets. Lines 15-18 remove those candidate frequent k-itemsets that do not fulfill the minimum support requirement. In line 21 the algorithm returns all the frequent itemsets generated.

1)  F1 = {frequent 1-itemsets}
2)  k = 2
3)  while Fk-1 ≠ φ ∧ k ≤ NR do
4)      Fk = φ
5)      for each itemset f1 ∈ Fk-1 do
6)          for each itemset f2 ∈ Fk-1
7)              if f1[1] = f2[1] ∧ ... ∧ f1[k-2] = f2[k-2] ∧ f1[k-1] < f2[k-1]
8)              then f = f1 ∪ {f2[k-1]}; Fk = Fk ∪ {f}
9)              for each (k-1)-subset s ∈ f do
10)                 if s ∉ Fk-1
11)                 then Fk = Fk − {f}; break
12)             end
13)         end
14)     end
15)     for each itemset f1 ∈ Fk do
16)         if support(f1) < minsup
17)         then Fk = Fk − {f1}
18)     end
19)     k = k + 1
20) end
21) return ∪k Fk

Figure 2: Algorithm to generate frequent itemsets.

4.2 Algorithm to generate caching rules
     We are interested in generating, from a frequent k-itemset fi, rules of the form x→dy, where x is a (k-1)-itemset, dy is a 1-itemset and fi = x ∪ {dy}. Table 2 shows the notation used in the algorithm.
     Fig. 3 illustrates the main idea of the algorithm. The algorithm accepts the frequent itemsets and a minimum confidence (minconf) as parameters. For each frequent itemset, the rules are generated as follows. Of all the data items within the frequent itemset, one item becomes the consequent of the rule, and all the other items become the antecedent. Thus, a frequent k-itemset can generate at most k rules. For example, suppose {d1, d2, d3} is a frequent 3-itemset. It can generate at most three rules: {d1, d2}→d3, {d1, d3}→d2, and {d2, d3}→d1. After the rules have been generated, their confidences are calculated to determine if they have the minimum confidence. Only the rules with at least the minimum confidence are kept in the rule set Z. For example, for the rule {d1, d2}→d3, the confidence conf = support({d1, d2, d3})/support({d1, d2}). If conf ≥ minconf, the rule holds and it will be added to the rule set Z.
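As a concrete illustration of the join-and-prune procedure of Fig. 2, consider the following Python sketch. This is a minimal sketch, not the authors' implementation: representing the trace S as a list of session sets, computing support as a fraction of sessions, and the name frequent_itemsets are our assumptions.

```python
from itertools import combinations

def frequent_itemsets(sessions, minsup, NR):
    """Generate frequent itemsets from an access trace (Fig. 2).

    sessions : list of sets of item ids (the sessions of trace S)
    minsup   : minimum support, as a fraction of sessions
    NR       : maximum number of items allowed in a rule
    """
    n = len(sessions)

    def support(itemset):
        # Fraction of sessions that contain the itemset.
        return sum(1 for s in sessions if itemset <= s) / n

    # Line 1: frequent 1-itemsets.
    items = sorted({d for s in sessions for d in s})
    F = {1: [frozenset([d]) for d in items
             if support(frozenset([d])) >= minsup]}

    k = 2
    # Line 3: loop while new itemsets can be formed and k <= NR.
    while F[k - 1] and k <= NR:
        candidates = set()
        prev = [sorted(f) for f in F[k - 1]]
        # Lines 5-8: join two (k-1)-itemsets sharing their first k-2 items.
        for f1 in prev:
            for f2 in prev:
                if f1[:-1] == f2[:-1] and f1[-1] < f2[-1]:
                    f = frozenset(f1) | {f2[-1]}
                    # Lines 9-11: prune f if any (k-1)-subset is infrequent.
                    if all(frozenset(s) in F[k - 1]
                           for s in combinations(f, k - 1)):
                        candidates.add(f)
        # Lines 15-18: keep only candidates with minimum support.
        F[k] = [f for f in candidates if support(f) >= minsup]
        k += 1

    # Line 21: return the union of all frequent itemsets.
    return [f for level in F.values() for f in level]
```

The quadratic join over F(k-1) mirrors lines 5-8 of Fig. 2 literally; an index on common prefixes would reduce this cost but is omitted to keep the sketch close to the figure.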

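The rule-generation step of Fig. 3 admits a similarly small sketch. Again this is illustrative rather than the authors' code; representing a rule as an (antecedent, consequent, confidence) tuple is our assumption. The confidence tested on line 5 of Fig. 3 is support(fi)/support(fi − {fi[j]}), i.e., the conditional probability defined in Section 4.

```python
def caching_rules(frequent, sessions, minconf):
    """From frequent itemsets of size >= 2, emit rules x -> dy with a
    single-item consequent and confidence >= minconf (Fig. 3)."""
    n = len(sessions)

    def support(itemset):
        # Fraction of sessions that contain the itemset.
        return sum(1 for s in sessions if itemset <= s) / n

    Z = []
    for fi in frequent:
        if len(fi) < 2:
            continue            # a rule needs a non-empty antecedent
        for dy in fi:           # each item in turn is the consequent
            x = fi - {dy}       # the remaining items form the antecedent
            conf = support(fi) / support(x)
            if conf >= minconf:
                Z.append((x, dy, conf))
    return Z
```

Following Example 1 in Section 1.1, the resulting confidence can then be combined with a caching parameter such as the fetching delay, e.g., an expected delay saving of confidence × delay (0.9 × 2 = 1.8 for d2 versus 0.4 × 4 = 1.6 for d4).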
Table 2: Notations.

 Z        The set of caching rules
 Fk       The set of frequent k-itemsets
 fi       A frequent k-itemset within Fk
 fi[m]    The m-th item in itemset fi

 1)  k = 2
 2)  while Fk ≠ φ do
 3)      for each itemset fi ∈ Fk do
 4)          for each item fi[j] ∈ fi
 5)              if support(fi) / support(fi − fi[j]) ≥ minconf
 6)              then Z = Z ∪ {{fi − fi[j]} → fi[j]}
 7)          end
 8)      end
 9)      k = k + 1
 10) end
 11) return Z

Figure 3: Algorithm to generate the caching rules.

4.3 Session formation
    Before applying the data mining algorithm to generate caching rules, we first need to identify sessions in the client access trace. This section describes session formation and related issues. The concept of a user session is defined in all existing web prefetching algorithms. The objective of session formation in such studies is to separate independent accesses made by the same user at distant points in time. This issue is handled in [23], where the accesses of each user are grouped into sessions according to their closeness in time. If the time between successive page requests exceeds a certain limit, it is assumed that the user is starting a new session.
    In this paper, we determine the session boundaries using an approach similar to [23]. We assume that a new session starts when the time delay between two consecutive requests is greater than a pre-specified time session_threshold. The access trace of a client is collected in the form S = <(d1, t1), (d2, t2), ..., (di, ti), ..., (dk, tk)>, where di denotes the ID of the data item which the client accesses at time ti. After the access history of a client has been collected over a predefined time interval in the above format, the trace is partitioned into sessions: if the time gap between two consecutive accesses is larger than session_threshold, a new session starts. For example, if ti+1 − ti > session_threshold, we assume that session <d1, d2, ..., di> ends and at ti+1 a new session starts with first item di+1. In this way the access trace is partitioned into sessions and used in the frequent itemset generation algorithm given in Section 4.1.
    The caching rules that are generated are highly related to the client's access pattern, and the access pattern may change from time to time. For example, at one time the client is interested in stock quotes, and after some time the client may like to browse the cricket score. Therefore, we need to analyze the client's access trace periodically, eliminating obsolete rules and adding new rules if necessary. Mining the access trace very frequently is a waste of client resources; however, using the same set of rules for a long time may affect the caching performance negatively, since the current rule set may no longer reflect the access pattern of the client. In the R-LPV scheme, we re-mine and update the caching rules periodically to keep them fresh. This is done by adding recent accesses to the trace and cutting off the oldest trace.
    Another important issue is whether to perform mining on the mobile support station (MSS) or on the client. Mining of association rules in a mobile environment is application dependent. For example, data broadcasting [24] and mobility prediction [25] use MSS-side mining, whereas in prefetching [9] the mining is performed on the client side. Mining on the MSS is a better choice when the decision is to be taken on the wired side, as in [24] and [25]. For caching and prefetching, mining on the MSS has the following disadvantages:
− Overhead of broadcasting the rule set to the clients.
− The access trace at the MSS corresponds to the cache miss pattern only. If there is a cache hit, the access is not reflected at the MSS. Transmitting the access trace for cache hits to the MSS consumes client energy as well as wireless bandwidth.
− Communication consumes more energy than computation for a mobile client. For example, the energy cost of transmitting 1K bits of information is approximately the same as the energy consumed to execute three million instructions [26].
    Keeping in view the above facts, in our approach each client mines and maintains its own caching rule set independently of the other clients.

5   RULE BASED CACHE REPLACEMENT

    Here we present a replacement strategy called Rule based Least Profit Value (R-LPV), which considers the profit gained due to data caching. To devise a profit function from caching an item, we have to find the replacement rule set for the item, which is a subset of the caching rules generated in Section 4.

5.1 Generating replacement rule set
    The replacement rules to be used in the computation of the profit function are generated from the caching rules. For example, to compute the profit value profity for an item dy, the replacement rule set Zy
contains all the caching rules with consequent dy and whose antecedent is a subset of the items cached at the client. The generation steps for Zy are as follows:
1. Include in Zy all the rules rx,y ∈ Z such that every data item that appears in x is cached by the client. If V is the set of cached data items at the client, then Zy = {rx,y | rx,y ∈ Z ∧ x ⊆ V}. For example, if Z = {{d1, d2, d3}→d4, {d1, d2}→d4, {d1, d5}→d2, {d1, d5}→d4}, then Z4 = {{d1, d2, d3}→d4, {d1, d2}→d4, {d1, d5}→d4}.
2. If a rule rx,y is contained in Zy, then no rule rw,y with w ⊂ x should be used in the replacement set Zy. So we retain only the rules with the maximum number of items in the antecedent. To achieve this, update Zy as Zy = Zy − {rw,y | rw,y ∈ Zy ∧ w ⊂ x ∧ rx,y ∈ Zy}. For example, rule {d1, d2}→d4 will be excluded because {d1, d2} ⊂ {d1, d2, d3}, so Z4 = {{d1, d2, d3}→d4, {d1, d5}→d4}.
    The idea behind retaining the rules with the maximum number of items in the antecedent is to consider the relation of an item with the largest cached subset while performing replacement. An item which is related to a larger subset is more beneficial for the client and hence is assigned a lower priority for replacement.

5.2 Replacement using profit function
    Most cache replacement algorithms employ an eviction function to determine the priorities of the data items to be replaced (or to be cached). We devise a profit function to determine the gain from keeping an item in the cache. To determine the profit due to an item dy, we first calculate the expected number of accesses cy of dy. Using the replacement rule set Zy, the value cy for an item dy can be computed in terms of the confidence of the rules. Due to replacement rule rx,y ∈ Zy, the expected number of accesses of item dy is increased by cx,y. Therefore

    cy = Σ_{rx,y ∈ Zy} cx,y        (1)

The profit profity for a cached item dy is given as

    profity = cy × dsy        (2)

where dsy denotes the delay saving due to caching of item dy. R-LPV associates a profit value with each item in the cache and replaces the item with the least profit. The delay saving can be computed as inspired by [7, 8]. To facilitate the computation, notations are given in Table 3.
    Based on the notations, the delay saving dsy due to caching of item dy is

    dsy = by − v − Puy × vy        (3)

    Eq. (3) can be justified as follows. If the data item dy is not in the cache, it takes by to retrieve it into the cache; otherwise, if dy is in the cache, we save the delay by. However, it also takes (v + Puy × vy) to validate the item and get the updated data if necessary. Thus caching the item dy saves a delay of (by − (v + Puy × vy)) per access.
    Combining Eq. (2) and (3), we get

    profity = cy × (by − v − Puy × vy)        (4)

Table 3: Notations.

 D        The set of all the data items in the database
 N        The number of data items in the database
 dy       A data item with id y
 rx,y     Caching rule x → dy with confidence c
 cx,y     Confidence of the rule rx,y
 cy       Expected number of accesses to item dy
 sy       Size of data item dy
 by       The delay of retrieving data item dy from the server, i.e., the cache miss penalty
 v        The cache validation delay, i.e., the access latency of an effective invalidation report
 vy       The delay in retrieving the updated data item dy from the server
 ay       The mean access rate to data item dy
 uy       The mean update rate of data item dy
 Puy      The probability of invalidating cached item dy
 V        The set of cached data items
 C        Cache size (in bytes)

    The profit function is used to identify the items to be retained (i.e., not to be evicted). Intuitively, the ideal data item in the cache should have a high expected number of accesses, a low update rate, a high retrieval delay and a small data size. These factors are incorporated if we evict the item di with the minimum profiti/si value.
    The objective of the cache replacement is to maximize the total profit value of the cached data items, that is,

    max Σ_{dy ∈ V} profity,  subject to Σ_{dy ∈ V} sy ≤ C        (5)

    Based on the above objective function, we can define our R-LPV policy. We follow the description provided by Yin et al. [8]. To add a new data item to the cache, suppose we need to replace data items of size s. R-LPV finds the set V′ of items to be replaced which satisfies the following conditions:
    Σ_{dy ∈ V′} sy ≥ s,  and
    ∀ Vk (Vk ⊆ V ∧ Σ_{dy ∈ Vk} sy ≥ s):  Σ_{dy ∈ Vk} profity ≥ Σ_{dy ∈ V′} profity        (6)

Here V′ is the least profitable subset of V with a total size of at least s.

5.3 Implementation issues
    The R-LPV algorithm evicts the least profitable cached items during each replacement, as given by the optimization problem defined by Eq. (6). The optimization is essentially the 0/1 knapsack problem, which is known to be NP-hard. When the data size is small compared to the cache size, a sub-optimal solution can be obtained. We define a heuristic that evicts the cached item dy with the minimum profity/sy value until the free space is sufficient to accommodate the incoming data item.
    To implement the R-LPV algorithm, we need to estimate the parameters by, vy, and Puy. To facilitate this estimation, we assume that the arrivals of data accesses and updates for data item dy follow Poisson processes. Specifically, tay and tuy, the interarrival times for data access and update of item dy, follow exponential distributions with rate parameters ay and uy respectively. In other words, the density functions for tay and tuy are f(tay) = ay·e^(−ay·tay) and g(tuy) = uy·e^(−uy·tuy), respectively [7]. Let Tay and Tuy be the time of the next access and the time of the next invalidation of data item dy respectively. The probability that a cache invalidation happens before the next data access is:

    Puy = Pr(Tuy < Tay) = ∫0^∞ ∫0^Tay ay·e^(−ay·Tay) · uy·e^(−uy·Tuy) dTuy dTay = uy / (ay + uy)

    So Eq. (4) becomes

    profity = cy × (by − v − uy/(ay + uy) × vy)        (7)

    We apply the sliding window method used by Shim et al. [27] to estimate ay and uy. The method uses the K most recent samples to estimate ay and uy as follows:

    ay = K / (T − TayK)        (8)

    uy = K / (T − TuyK)        (9)

where T is the current time, and TayK and TuyK are the times of the Kth most recent access and update of dy. When fewer than K samples are available, all the samples are used to estimate the value. Shim et al. [27] showed that K can be as small as 2 or 3 to achieve the best performance. The client access log is used to compute ay; thus, there is no additional spatial overhead to maintain TayK. The parameter uy is maintained and stored on the server side and is piggybacked to the clients when data item dy is delivered.
    To estimate by and vy, we use the well known exponential aging method [27]. It combines both the history and the currently observed value to estimate the parameters. Whenever a query for item dy is answered by the server, by is reevaluated as follows:

    by = α × by(new) + (1 − α) × by(old)        (10)

where by(new) is the currently measured data retrieval delay, and by(old) is the value of by calculated before the last retrieval of item dy. Here α is a constant factor weighing the importance of the most recent measured value. Similarly,

    vy = α × vy(new) + (1 − α) × vy(old)        (11)

    A binary min-heap data structure is used to implement the R-LPV policy. The key field of the heap is the profity/sy value of each cached item dy. When a cache replacement occurs, the root item of the heap is deleted, and the operation is repeated until sufficient space is obtained for the incoming data item. Let Nc denote the number of cached items and Nv the number of items to be deleted during a replacement operation. Every deletion and insertion operation has a complexity of O(log Nc). Thus, the time complexity of every cache replacement operation is O(Nv·log Nc). Most likely, the maximum value of Nv is three. In addition, when an item's profit value is updated, its position in the heap needs to be updated; O(log Nc) time is needed to update its position. Thus, the overall time complexity of R-LPV is O(Nv·log Nc).
    To implement the R-LPV policy, profity given by Eq. (7) needs to be recalculated whenever a replacement is necessary. This computation overhead may be very high. To reduce it, we follow the same idea as proposed in [8]. When a cache replacement is necessary, instead of recalculating the profit value of every data item, only the values of the Nl least profitable items are recalculated. Most likely, the items to be replaced will be among them, because their values are relatively small. It has been observed that Nl = 3 provides satisfactory performance; thus, the computational overhead of the profit function is very low.
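As a concrete illustration, the profit computation of Eq. (7) and the heap-driven eviction heuristic (smallest profity/sy first) might look as follows in Python. This is a simplified sketch: the data layout and all names are our own assumptions, not the paper's code, and the lazy Nl-recalculation optimization is omitted.

```python
import heapq

def profit(c_y, b_y, v, a_y, u_y, v_y):
    """Eq. (7): profit_y = c_y * (b_y - v - u_y / (a_y + u_y) * v_y)."""
    return c_y * (b_y - v - (u_y / (a_y + u_y)) * v_y)

def evict(cache, needed):
    """Pop items with the smallest profit/size until `needed` bytes are free.

    `cache` maps item id -> (profit_value, size_in_bytes); returns the ids
    evicted. Illustrative sketch of the Section 5.3 heuristic only.
    """
    heap = [(p / s, item) for item, (p, s) in cache.items()]
    heapq.heapify(heap)                    # min-heap keyed on profit per byte
    freed, evicted = 0, []
    while heap and freed < needed:
        _, item = heapq.heappop(heap)      # least profitable per byte first
        freed += cache.pop(item)[1]
        evicted.append(item)
    return evicted

cache = {"d1": (9.0, 10), "d2": (2.0, 40), "d3": (30.0, 20)}
victims = evict(cache, needed=40)          # d2 has the lowest profit/size
```

In the example, d2's key is 2.0/40 = 0.05 versus 0.9 for d1 and 1.5 for d3, so a single pop of d2 frees the required 40 bytes.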
6   PERFORMANCE EVALUATION

    In this section, we evaluate the performance of the proposed methodology. We compare the R-LPV cache replacement algorithm with the I-LRU [4] and Min-SAUD [7] algorithms.

6.1 The simulation model
    In the simulation, a single server maintains a collection of N data items, and a number of clients access these data items. The UR [4] cache invalidation model is adopted for data dissemination.

6.1.1 The client model
    The time interval between two consecutive queries generated by each client follows an exponential distribution with mean Tq. Each client generates a single stream of read-only queries. After a query is sent out, the client does not generate a new query until the pending query is served. Each client generates accesses to the data items following the Zipf distribution [28] with a skewness parameter θ. In the Zipf distribution, the access probability of the ith (1 ≤ i ≤ N) data item is:

    Ai = (1/i^θ) / Σ_{k=1}^{N} (1/k^θ),  0 ≤ θ ≤ 1

If θ = 0, clients access the data items uniformly; as θ increases, the access to the data items becomes more skewed. During the simulation we have chosen θ to be 0.8.
    Every client, if active, listens to the URs and RRs to invalidate its cache accordingly. When a new request is generated, the client listens to the next UR or RR to decide whether there is a valid copy of the requested item in the cache. If there is one, the client answers the query immediately. In case of an invalid copy, the client downloads the updated copy of the data item from the broadcast channel and returns it to the application. If a cache miss happens, the client sends an uplink request to the server. After receiving the uplink request, the server broadcasts the requested data during the next report (UR or RR). Then the client can download the data and answer the query. To accommodate a new item, the client follows the R-LPV cache replacement policy.
    The client cache management module maintains an access log, which is used to generate the caching rules. The access log is divided into sessions, and the sessions are mined using the association rule mining algorithms. To keep the caching rules fresh, the client updates the access log whenever a query is generated and re-mines the rules periodically.

6.1.2 The server model
    The server broadcasts URs (or RRs) over the wireless channel with the broadcast interval specified by the parameter L (or L/m). The IR part of the UR report contains the update history of the past w broadcast intervals. There are N data items at the server. Data item sizes vary from smin to smax such that the size si of item di is si = smin + ⌊random()·(smax − smin + 1)⌋, i = 1, 2, ..., N, where random() is a random function uniformly distributed between 0 and 1. The server generates a single stream of updates separated by an exponentially distributed update interarrival time with mean value Tu. The data items in the database are divided into a hot (frequently accessed) data subset and a cold data subset. Within the same subset, the updates are uniformly distributed, with 80% of the updates applied to the hot subset. Most of the system parameters are listed in Table 4.

Table 4: Simulation parameters.

 Parameter                           Default value   Range
 Database size (N)                   10000 items
 smin                                10 KB
 smax                                100 KB
 Number of clients (M)               70
 Client cache size (C)               600 KB          200~1400 KB
 UR broadcast interval (L)           20 sec
 Number of RR broadcasts (m−1)       4
 Broadcast window (w)                10 intervals
 Broadcast bandwidth                 100 Kbps
 Hot subset percentage               20 %
 Hot subset update percentage        80 %
 Mean update arrival time (Tu)       10 sec          1~10000 sec
 Mean query generate time (Tq)       100 sec         5~300 sec
 Skewness parameter (θ)              0.8

6.2 The simulation results
    In the simulation results we show the byte hit ratio (B) and the average query latency (Tavg) as functions of different factors such as cache size (C), mean query generate time (Tq) and mean update arrival time (Tu). The byte hit ratio is defined as the ratio of the number of data bytes retrieved from the cache to the total number of requested data bytes. The average query latency is the total query latency divided by the number of queries.
    Three cache replacement algorithms are compared in our simulations:
• I-LRU [4]: The Invalid-LRU algorithm keeps removing the invalid item that was used least recently until there is enough space in the cache. If there is no invalid item in the cache, the oldest valid item is removed as per the LRU policy.
• Min-SAUD [7]: Min-SAUD considers various factors that affect cache performance, namely access probability, update frequency, data size, retrieval delay, and cache validation cost.
• R-LPV: This is our algorithm. It keeps removing the item di with the least profiti/si value, where the profit function is defined by Eq. (7).
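For reference, the Zipf access probabilities used by the client model of Section 6.1.1 can be computed as in the short sketch below; the function name is our own, not from the paper's simulator.

```python
def zipf_probs(n, theta):
    """Access probability of the i-th item (1 <= i <= n) under the Zipf
    distribution of Section 6.1.1: A_i = (1/i**theta) / sum_k (1/k**theta)."""
    weights = [1.0 / (i ** theta) for i in range(1, n + 1)]
    total = sum(weights)
    return [w / total for w in weights]

# With theta = 0 the access pattern is uniform; theta = 0.8 (the default
# in Table 4) skews accesses toward the low-numbered "hot" items.
probs = zipf_probs(10000, 0.8)
```

A simulated client would then draw each requested item ID from this probability vector.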
6.2.1 The effects of cache size
    In this section, we investigate the performance of the cache replacement algorithms under different cache sizes. The simulation results are shown in Fig. 4. Our algorithm outperforms the other ones in terms of both byte hit ratio and average query latency. This is explained as follows. In contrast to I-LRU, the proposed algorithm R-LPV favors small data items, so a larger number of items can be kept in the cache. As a result, the byte hit ratio is higher and the average query latency is lower. Min-SAUD and I-LRU do not consider the relationships among data items; thus important data items can be replaced by unimportant ones under these algorithms. Since the R-LPV algorithm has mined the relationships among data items, it knows which data items have higher future access probabilities, so it keeps these important data items in the cache longer. In this way, more client requests can be served locally from the cache and the byte hit ratio is improved. On average, the improvement of R-LPV in byte hit ratio over I-LRU and Min-SAUD is 9.3% and 5.2% respectively. The byte hit ratio of all the algorithms improves with increasing cache size, because with a larger cache more data items can be stored.
    Fig. 4b shows that R-LPV incurs the lowest average query latency at all cache sizes. For example, the Tavg value of R-LPV is 33.3% less than that of I-LRU and 26.7% less than that of Min-SAUD when the cache size is 600 KB. The R-LPV algorithm considers the relationships among data items, differentiates the importance of the data items, and keeps the important data items in the cache longer. As a result, this scheme achieves a lower average query latency than the I-LRU and Min-SAUD algorithms. In contrast, the I-LRU and Min-SAUD algorithms do not differentiate the items in the cache based on correlation.

Figure 4: The effects of cache size on byte hit ratio and average query latency. (Panel (a): byte hit ratio vs. client cache size (KB); panel (b): average query latency (sec) vs. client cache size (KB); curves for I-LRU, Min-SAUD and R-LPV.)

6.2.2 The effects of mean update arrival time
    We measure the byte hit ratio and average query latency as functions of the mean update arrival time (Tu). Tu determines how frequently the server updates its data items. As shown in Fig. 5, our algorithm is much better than I-LRU and Min-SAUD. For example, in Fig. 5b, when the update arrival time is 10 seconds, the average query latency of R-LPV is 26.7% less than that of the Min-SAUD algorithm. Our algorithm considers various caching parameters and exploits the relationships among the cached items to determine their importance, thus retaining important items longer. In contrast, the I-LRU and Min-SAUD algorithms do not take into consideration the relationship of an item with the set of cached items, and hence an important item may be replaced by an unimportant one. On average, R-LPV is about 27.9% and 18.5% better than I-LRU and Min-SAUD respectively in terms of average query latency. Similar results can be found in Fig. 5a. For all three algorithms, the average query latency is high at lower Tu because of the higher number of updates at the server. Fig. 5a shows that the byte hit ratio drops with the decrease in mean update arrival time.

6.2.3 The effects of mean query generate time
    Fig. 6 shows the effects of the mean query generate time on the average query latency of the R-LPV, I-LRU and Min-SAUD algorithms. As explained before, each client generates queries according to the mean query generate time. The generated queries are served one by one. If the queried data is in the local cache, the client can serve the query locally; otherwise, the client has to request the data from the server. If the client cannot process the generated queries because it is waiting for the server's reply, it queues them. For all the algorithms, the average query latency drops as Tq increases, since fewer queries are generated and the server can serve the queries more quickly. As we can see from Fig. 6, the average query latency of R-LPV is much less than that of I-LRU and Min-SAUD. This is due
                     Ubiquitous Computing and Communication Journal                                                                         10
to the reason that R-LPV uses the cache space more                                                  7   CONCLUSIONS
effectively and retains the most important items in
the cache based on the caching parameters and                                                            In this paper, we present the Rule based Least
correlation. For example, when Tq is 150 seconds,                                                   Profit Value (R-LPV) replacement policy for mobile
the Tavg of R-LPV is respectively 42.9% and 14.3%                                                   environment that considers various caching
lower than I-LRU and Min-SAUD.                                                                      parameters such as update frequency, retrieval delay
                                                                                                    from the server, cache invalidation delay, and data
size, along with the association among the cached items. We design a profit function to dynamically evaluate the profit from caching an item. To enhance the caching performance, generalized association rules are applied to find the relationships among data items. Simulation experiments demonstrate that the R-LPV policy outperforms the I-LRU and Min-SAUD policies in terms of byte hit ratio and average query latency. In our future work, we will investigate the use of data mining techniques for prefetching, integrated with caching, to further enhance data availability at a mobile client.
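As a rough illustration of the policy summarized above, the sketch below scores each cached item by a profit value that combines its access rate, update rate, retrieval delay, and size, plus a bonus when association rules mined over the cached set predict the item, and evicts the item with the least profit. The exact profit formula and rule-mining procedure are those defined in the paper's earlier sections; the attribute names, weighting, and rule format used here are simplified assumptions for illustration only.

```python
# Illustrative R-LPV-style eviction sketch (not the paper's exact formula).
# Each cached item is a dict of assumed attributes; rules are tuples of
# (antecedent item-id set, consequent item id, confidence).

def profit(item, cache, rules):
    # Base profit: expected retrieval delay saved per unit of cache space.
    base = item["access_rate"] * item["delay"] / item["size"]
    # Penalize frequently updated items: they are invalidated sooner.
    base *= item["access_rate"] / (item["access_rate"] + item["update_rate"])
    # Association bonus: an item implied by other cached items is likely
    # to be queried soon, so it is more valuable to keep.
    cached_ids = {c["id"] for c in cache}
    bonus = sum(conf for ante, cons, conf in rules
                if cons == item["id"] and ante <= cached_ids)
    return base * (1.0 + bonus)

def choose_victim(cache, rules):
    """Evict the cached item with the least profit."""
    return min(cache, key=lambda it: profit(it, cache, rules))

cache = [
    {"id": "a", "access_rate": 2.0, "update_rate": 0.1, "delay": 1.0, "size": 4.0},
    {"id": "b", "access_rate": 2.0, "update_rate": 0.1, "delay": 1.0, "size": 4.0},
]
rules = [({"a"}, "b", 0.8)]   # cached 'a' predicts 'b' with confidence 0.8
victim = choose_victim(cache, rules)  # 'a': equal base profit, but 'b' earns the rule bonus
```

The example shows the policy's distinguishing behaviour: two items with identical caching parameters are separated purely by the association bonus, which pure cost-based policies such as Min-SAUD ignore.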
Figure 5: The effects of mean update arrival time on byte hit ratio and average query latency.

Figure 6: The effects of mean query generate time on average query latency.

8   REFERENCES

[1] D. Barbara and T. Imielinski: Sleepers and Workaholics: Caching Strategies in Mobile Environments, Proceedings of the ACM SIGMOD Conference on Management of Data, pp. 1-12 (1994).
[2] G. Cao: On Improving the Performance of Cache Invalidation in Mobile Environments, ACM/Kluwer Mobile Networks and Applications, Vol. 7, No. 4, pp. 291-303 (2002).
[3] G. Cao: A Scalable Low-Latency Cache Invalidation Strategy for Mobile Environments, IEEE Transactions on Knowledge and Data Engineering, Vol. 15, No. 5, pp. 1251-1265 (2003).
[4] N. Chand, R.C. Joshi and M. Misra: Energy Efficient Cache Invalidation in a Disconnected Wireless Mobile Environment, International Journal of Ad Hoc and Ubiquitous Computing (IJAHUC), Vol. 2, No. 1/2, pp. 83-91 (2007).
[5] E. Coffman and P. Denning: Operating Systems Theory, Prentice-Hall, Englewood Cliffs, NJ (1973).
[6] J. Xu, Q.L. Hu, D.L. Lee and W.-C. Lee: SAIU: An Efficient Cache Replacement Policy for Wireless On-Demand Broadcasts, Ninth ACM International Conference on Information and Knowledge Management, pp. 46-53 (2000).
[7] J. Xu, Q.L. Hu, W.-C. Lee and D.L. Lee: Performance Evaluation of an Optimal Cache Replacement Policy for Wireless Data Dissemination, IEEE Transactions on Knowledge and Data Engineering, Vol. 16, No. 1, pp. 125-139 (2004).
[8] L. Yin, G. Cao and Y. Cai: A Generalized Target-Driven Cache Replacement Policy for Mobile Environments, Journal of Parallel and Distributed Computing, Vol. 65, No. 5, pp. 583-594 (2005).

[9] H. Song and G. Cao: Cache-Miss-Initiated Prefetch in Mobile Environments, Computer Communications, Vol. 28, No. 7, pp. 741-753 (2005).
[10] R. Agrawal, T. Imielinski and A. Swami: Mining Association Rules Between Sets of Items in Large Databases, ACM SIGMOD Conference on Management of Data, pp. 207-216 (1993).
[11] H. Shen, M. Kumar, S.K. Das and Z. Wang: Energy-Efficient Data Caching and Prefetching for Mobile Devices Based on Utility, Mobile Networks and Applications (MONET), Vol. 10, No. 4, pp. 475-486 (2005).
[12] S. Acharya, R. Alonso, M. Franklin and S. Zdonik: Broadcast Disks: Data Management for Asymmetric Communication Environments, ACM SIGMOD Conference on Management of Data, San Jose, USA, pp. 199-210 (1995).
[13] P. Cao and S. Irani: Cost-Aware WWW Proxy Caching Algorithms, USENIX Symposium on Internet Technologies and Systems, pp. 193-206 (1997).
[14] J. Shim, P. Scheuermann and R. Vingralek: Proxy Cache Design: Algorithms, Implementation and Performance, IEEE Transactions on Knowledge and Data Engineering, Vol. 11, No. 4, pp. 549-562 (1999).
[15] L. Rizzo and L. Vicisano: Replacement Policies for a Proxy Cache, IEEE/ACM Transactions on Networking, Vol. 8, No. 2, pp. 158-170 (2000).
[16] R. Wooster and M. Abrams: Proxy Caching That Estimates Page Load Delays, Computer Networks and ISDN Systems, pp. 977-986 (1997).
[17] W.-G. Teng, C.-Y. Chang and M.-S. Chen: Integrating Web Caching and Web Prefetching in Client-Side Proxies, IEEE Transactions on Parallel and Distributed Systems, Vol. 16, No. 5, pp. 444-455 (2005).
[18] V. Liberatore: Caching and Scheduling for Broadcast Disk Systems, University of Maryland, UMIACS Technical Report 98-71 (1998).
[19] G. Cao: Proactive Power-Aware Cache Management for Mobile Computing Systems, IEEE Transactions on Computers, Vol. 51, No. 6, pp. 608-621 (2002).
[20] S. Gitzenis and N. Bambos: Power-Controlled Data Prefetching/Caching in Wireless Packet Networks, IEEE INFOCOM, pp. 1405-1414 (2002).
[21] R. Agrawal and R. Srikant: Fast Algorithms for Mining Association Rules, 20th International Conference on Very Large Data Bases (VLDB), pp. 487-499 (1994).
[22] T.M. Anwar, H.W. Beck and S.B. Navathe: Knowledge Mining by Imprecise Queries: A Classification-Based Approach, IEEE Eighth International Conference on Data Engineering, Arizona, pp. 622-630 (1992).
[23] R. Cooley, B. Mobasher and J. Srivastava: Data Preparation for Mining World Wide Web Browsing Patterns, Knowledge and Information Systems, Vol. 1, No. 1, pp. 5-32 (1999).
[24] Y. Saygin and O. Ulusoy: Exploiting Data Mining Techniques for Broadcasting Data in Mobile Computing Environments, IEEE Transactions on Knowledge and Data Engineering, Vol. 14, No. 6, pp. 1387-1399 (2002).
[25] G. Yavas, D. Katsaros, O. Ulusoy and Y. Manolopoulos: A Data Mining Approach for Location Prediction in Mobile Environments, Elsevier Data and Knowledge Engineering, Vol. 54, No. 2, pp. 121-146 (2005).
[26] G.J. Pottie and W.J. Kaiser: Wireless Integrated Network Sensors, Communications of the ACM, Vol. 43, No. 5, pp. 51-58 (2000).
[27] J. Shim, P. Scheuermann and R. Vingralek: Proxy Cache Design: Algorithms, Implementation and Performance, IEEE Transactions on Knowledge and Data Engineering, Vol. 11, No. 4, pp. 549-562 (1999).
[28] L. Breslau, P. Cao, L. Fan, G. Phillips and S. Shenker: Web Caching and Zipf-Like Distributions: Evidence and Implications, IEEE INFOCOM, pp. 126-134 (1999).

