					       Algorithm for Improving the Aspect Ratio of Treemaps

                                         Xiaochuan Lin
                                        (MACS Project)




1. Introduction
  A treemap is a relatively new way to represent a tree-structured or large-scale hierarchical
data set [1,2,5-7]. Knuth [3] was the first to suggest a space-filling approach to representing
trees. Shneiderman [5] developed the first treemap, and many improvements to his method
followed [1,2,3,6]. His treemap is essentially a 2-dimensional rectangular area divided into
smaller rectangles that correspond to the items of the hierarchical data set, such that the
areas of these rectangles are proportional to the values of the data items.


  Although treemaps have space-saving advantages over the node-link representation, they
suffer from two drawbacks. First, they lack an intuitive hierarchical structure [7]. Second,
they may produce rectangles with high aspect ratios [1,2,6]. Thin rectangles with high aspect
ratios make it difficult to see and compare the sizes of two rectangular areas, and make it
harder to select a rectangle with a location-based input device such as a mouse [1,2,6].
Therefore, finding an approach that produces smaller aspect ratios is a problem of
practical significance.


  Figure 1.1 shows a simple example, with three solutions given in Figures 1.2, 1.3,
and 1.4, each with a different aspect ratio. We explain these solutions briefly below
and present the details of the algorithms underlying them later (in Section 2). The data set
T = (B6, C6, D4, E3, F2, G2, H1) in this example corresponds to the 7 leaf nodes of a
1-level tree. The map for this tree is displayed on a rectangular region of width 6 and
height 4, i.e., with an area of 24, which is the sum of the sizes of all 7 leaves. In the tree
given in Figure 1.1, every node has a name and a size, where the size of an internal node is
defined as the sum of the sizes of its children. Thus, the sizes of the leaves are 6, 6, 4, 3,
2, 2, and 1.



  [Fig.1.1. A one level tree: root A24 with leaves B6, C6, D4, E3, F2, G2, H1.]

  [Fig.1.2. First solution by slice-and-dice algorithm with horizontal slicing
  (h = 4; w = A/h = 6/4, 6/4, 4/4, 3/4, 2/4, ...).]

  [Fig.1.3. Second solution by slice-and-dice algorithm with vertical slicing
  (w = 6; h = A/w = 6/6, 6/6, 4/6, ...).]

  [Fig.1.4. Third solution by squarified algorithm (first row: w = A/h = 6/2 = 3, h = 4/2 = 2).]


  The first layout shows a solution based on the slice-and-dice algorithm [5], which is
obtained by cutting the map vertically into subareas equal to the values of the leaves. This
solution generates 7 long rectangular strips, leading to an average aspect ratio of 6.67.
Note that for a rectangle of size “w by h”, the aspect ratio is max{h/w, w/h}. Thus, the
thinnest rectangle, with area 1, has the worst (i.e., largest) aspect ratio, equal to
16.


 The second layout shows another solution using the slice-and-dice algorithm, where the map
is divided horizontally (instead of vertically). As the figure shows, the rectangle with area
1 has the worst aspect ratio, 36, and the average aspect ratio for the whole map is 15.


 The third layout shows the solution obtained by the squarified algorithm [2]. The slicing of
the map in this case is done partially horizontally and partially vertically (in a manner to be
explained in detail later). This leads to an average aspect ratio of 1.68, and the worst
aspect ratio of any rectangle is 2.78 (corresponding to the rectangle of area 1).


  Comparing the aspect ratios produced by the above solutions, one concludes that the
third solution gives the best result. Notice that, by the definition given before, the aspect
ratio of any rectangle (of dimension w x h) is ≥ 1; moreover, it is 1 only when w = h, i.e.,
when the rectangle is a square. We define the average aspect ratio of the whole map as the
average of the aspect ratios of the rectangles corresponding to the leaf nodes [1,6].


 Thus, our problem can now be defined as follows. We are given a hierarchical data set
corresponding to a tree, and an initial rectangle whose area equals the sum of the sizes of
all leaf nodes in the tree. The size of each leaf node equals the final display area of the
sub-rectangle corresponding to it. We need to fill the nodes of the tree into the initial
rectangle so as to achieve the smallest overall aspect ratio for the whole treemap.


 Figure 1.5 shows another example of a treemap layout generated by the slice-and-dice
algorithm, obtained from a much larger tree whose nodes represent a hierarchical file
system with thousands of files. In this case, the treemap shows the whole hierarchical file
system in a single view. Users can also see directly which document is the largest, i.e.,
which part consumes the most disk space. A node-link diagram of the same hierarchical
structure, however, wastes a lot of display space and is unsuitable for large-scale data; it
would be difficult to display such a large hierarchical file system completely with a
node-link diagram. When size is the most important feature to display, treemaps become a
very effective tool.


 Figure 1.6 shows the treemap layout obtained by applying the squarified algorithm to the
same large tree-structured data set as in Figure 1.5. Comparing the two layouts in Fig.1.5
and Fig.1.6 shows clearly that the squarified algorithm achieves better aspect ratios than
the slice-and-dice algorithm: the rectangles in the treemap layout are far less elongated,
and the black areas with cluttered rectangles have disappeared.




 Fig.1.5. A treemap layout generated by slice-and-dice algorithm for a large
 tree-structure data set corresponding to a hierarchical file system.

 Fig.1.6. A treemap layout generated by squarified algorithm for a large
 tree-structure data set corresponding to a hierarchical file system in Fig.1.5.


  Making the rectangles of a treemap more square-like has many advantages, such as
the following [2].
   1. It uses display space more efficiently. The perimeter of a square is
       minimal among rectangles with the same area, which minimizes the
       number of pixels needed to display the border.
   2. Square-like rectangles are easier to detect, locate, and point at. (The clutter
       and aliasing errors caused by thin rectangles are minimized.)
   3. It is easier to compare the sizes of rectangles with similar aspect ratios.
   4. It improves the accuracy of the presentation.


2. Related work
 Since the invention of the treemap idea and the slice-and-dice algorithm [5], research has
focused on remedying the two deficiencies of treemaps mentioned before. Over the
last decade, many treemap generation algorithms have been proposed, e.g., the cushion method
[7], the squarified algorithm [2], the pivot algorithm [6], and the strip method [1].


2.1 Slice-and-dice treemap algorithm
  As a simple example of a hierarchically structured data set, Figures 2.1 and 2.2 illustrate
the representation of a hierarchical data set as a traditional node-link diagram and
as a treemap. In this example, the tree has four levels, starting from the root, where each
leaf node in the tree has a fixed size, and each internal node x has a size defined recursively
as the sum of the sizes of its children, i.e., the total size of the leaves in the subtree
rooted at x.


  The process of treemap formation is as follows. The entire display space, i.e., the initial
rectangle, is used to represent the entire data set of the tree rooted at A16. Then,
according to the sizes of the data elements B3, C3 and D10, the initial rectangle is cut
into three rectangles corresponding to B3, C3 and D10, respectively, where the
areas of these three rectangles are in the same proportions as the sizes of
B3, C3 and D10. The process continues recursively, where the slicing of the rectangle
area alternates from horizontal to vertical, then back to horizontal, and so forth, as
illustrated in Figure 2.2 below. This is the simplest slice-and-dice method.
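
  The recursive process described above can be sketched in Python as follows. This is a
minimal illustration, not the implementation of [5]: the root A16 and its children B3, C3
and D10 come from the text, but the subtree under D, the 4 x 4 initial rectangle, the choice
of which direction is cut first, and the names size and slice_and_dice are assumptions made
only for this example.

```python
def size(node):
    """Size of a node: its own value for a leaf, the sum of its children otherwise."""
    _, payload = node
    if isinstance(payload, list):
        return sum(size(child) for child in payload)
    return payload


def slice_and_dice(node, x, y, w, h, side_by_side=True, out=None):
    """Map every leaf name to its rectangle (x, y, w, h)."""
    if out is None:
        out = {}
    name, payload = node
    if not isinstance(payload, list):          # leaf: it takes the whole rectangle
        out[name] = (x, y, w, h)
        return out
    total = size(node)
    used = 0.0                                 # fraction of the rectangle already handed out
    for child in payload:
        share = size(child) / total
        if side_by_side:                       # split the width among the children
            slice_and_dice(child, x + used * w, y, share * w, h, False, out)
        else:                                  # split the height among the children
            slice_and_dice(child, x, y + used * h, w, share * h, True, out)
        used += share
    return out


# A16 with children B3, C3, D10 as in the text; the subtree under D is invented.
tree = ("A", [("B", 3), ("C", 3),
              ("D", [("E", 4), ("F", [("G", 2), ("H", 4)])])])
rects = slice_and_dice(tree, 0.0, 0.0, 4.0, 4.0)   # a 4 x 4 rectangle has area 16
for name, (x, y, w, h) in rects.items():
    print(name, round(w * h, 6))               # each leaf's area equals its size
```

  The cut direction flips at every level of the recursion, which is what produces the
alternating horizontal and vertical slices.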




       Figure 2.1. Node and link diagram.

       Figure 2.2. Treemap layout generated by slice-and-dice algorithm.


 Although treemaps generated by the slice-and-dice method have space-saving
advantages over the node-link representation, the aspect ratios of the rectangles produced
can potentially be very large (see Figure 1.5). Another important method for generating
treemaps is the squarified method, explained below for the example of the 1-level tree with
the leaf nodes (B6, C6, D4, E3, F2, G2, H1) and root node A24.


2.2. Squarified algorithm
  We illustrate this method using an example from [2], which shows how to construct a
treemap for the tree given before in Figure 1.1. The steps of this construction are shown
below in Figure 2.3.
  The algorithm first splits the original rectangle. It chooses a horizontal subdivision
because the initial rectangle is wider than it is high. Then the left half is filled, starting
with the first rectangle (Figure 2.3), whose aspect ratio is 8/3. In step 2 a second rectangle
is added on top of the first, and the aspect ratios improve to 3/2. In step 3, the third
rectangle (with area 4) is added on top of the first and second rectangles; however, its
aspect ratio worsens to 4/1. This means that the optimum for the left half was reached in
step 2, so the algorithm starts processing the right half of the initial rectangle.




     Figure 2.3 Constructing a treemap for the tree of Figure 1.1 using the squarified method.




 Continuing from the result of step 2, the algorithm now subdivides vertically, because the
remaining rectangle is higher than it is wide. Step 4 adds the rectangle with area 4,
whose aspect ratio is 9/4. Following this, step 5 adds the rectangle with area 3, and
the aspect ratios improve to 49/27. In step 6, the next rectangle, with area 2, is added.
However, this does not improve the aspect ratio, which is now 9/2. Therefore the result of
step 5 is accepted, and the algorithm starts to fill the right top partition. The algorithm
repeats these steps until it has processed all rectangles.
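
 The walk-through above can be condensed into the following Python sketch. It is an
illustrative reading of the squarified method of [2] and not the author's code: it assumes
that a row is extended as long as its worst aspect ratio does not get worse, and the names
worst and squarify, as well as the coordinate convention, are chosen here for the example.

```python
def worst(row, side):
    """Largest aspect ratio in a strip holding the areas in `row`, laid along `side`."""
    t = sum(row) / side                        # thickness of the strip
    return max(max(a / t**2, t**2 / a) for a in row)


def squarify(areas, width, height):
    """Return one (x, y, w, h) rectangle per area, in the order given."""
    rects = []
    x, y, w, h = 0.0, 0.0, float(width), float(height)
    i = 0
    while i < len(areas):
        side = min(w, h)                       # strips go along the shorter side
        row = [areas[i]]
        i += 1
        # keep adding while the worst aspect ratio in the row does not get worse
        while i < len(areas) and worst(row + [areas[i]], side) <= worst(row, side):
            row.append(areas[i])
            i += 1
        t = sum(row) / side
        pos = 0.0
        for a in row:
            length = a / t
            if w >= h:                         # strip runs along the height
                rects.append((x, y + pos, t, length))
            else:                              # strip runs along the width
                rects.append((x + pos, y, length, t))
            pos += length
        if w >= h:                             # shrink the free rectangle by the strip
            x, w = x + t, w - t
        else:
            y, h = y + t, h - t
    return rects


rects = squarify([6, 6, 4, 3, 2, 2, 1], 6, 4)
ratios = [max(w / h, h / w) for _, _, w, h in rects]
print(round(sum(ratios) / len(ratios), 2), round(max(ratios), 2))
```

 For the tree of Figure 1.1 the first row holds the two rectangles of area 6 and the second
row those of areas 4 and 3, as in the steps described above.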


3. Proposed Algorithms
  None of the methods proposed so far for treemap generation is optimal with respect to all
criteria. The solutions produced by the squarified algorithm have better aspect ratios,
whereas the slice-and-dice algorithm produces solutions with better order and dynamic
stability. Similarly, the pivot and strip algorithms attempt to achieve a balance between
aspect ratio on the one hand, and order and dynamic stability on the other. In short, there
are two directions for further research in treemap generation: either to develop a new
algorithm that performs better under certain criteria, or one that achieves a reasonable
balance among them. Finding a better balance here means achieving broader adaptability.
This project is intended to develop a new algorithm that produces solutions with better
aspect ratios than those obtained by the squarified algorithm [2].


 The better aspect ratio achieved by the squarified algorithm compared to the slice-and-dice
algorithm (see the solutions in Figures 1.2, 1.3 and 1.4) is a result of the following
two ideas: the self-adaptation of the layout based on each rectangle’s aspect ratio, and the
ordering of the data elements by size.


  In addition, the squarified algorithm makes the following two assumptions. First, when
filling in each rectangle, the squarified algorithm assumes that it is better to lay out the
rectangles along the shorter side of the initial rectangle. This way, the initial rectangle of
the next level is more like a square, and thus more likely to lead to a good aspect ratio.
However, there may be cases where filling in the rectangles along the longer side
results in a smaller overall aspect ratio. Second, before filling data elements into the
initial rectangle, it assumes that it is better to sort them in decreasing order and
fill in the largest data element first. On the surface, this appears to be a natural
choice: the closer the algorithm gets to its end, the smaller the rectangle that remains to
be filled. But this choice may not always produce good results.


  Therefore, in this project I propose to develop some modified versions of the squarified
algorithm, and to carry out the necessary experiments to show that these algorithms may
produce treemaps with better aspect ratios. The experiments comparing the original
squarified algorithm with the modified algorithms incorporate the following 2 groups of
new ideas.


a. Comparing different layouts
When we place the rectangles in the treemap, there are three options that we can use.
We can lay out all rectangles along the shorter side (as in the squarified algorithm), or
lay out all rectangles along the longer side, or, for each rectangle to be placed, try both of
these layouts (along the shorter and the longer sides) and select the one that leads to a
smaller aspect ratio.


b. Comparing different orderings
I propose to test the same idea of using recurrence as in the squarified algorithm,
except that we sort the data elements in increasing order, i.e., fill in the smaller elements
first and the larger elements later, and compare this with the original squarified algorithm.
In addition, we compare these two approaches with that of ordering the elements randomly.
Lastly, we compare them with that of ordering the data in the better of decreasing and
increasing order.

  As pointed out in reference [2], finding a solution that is optimal in aspect ratio
is NP-hard. Thus, rather than attempting to find an optimal solution, our goal is to
further improve the good solution achieved by the squarified algorithm.


  We present several examples in more detail so that it becomes easier for the reader to
follow the rest of our simulation experiments.

Example 1: Compare the layout along the longer side and along the shorter side.
   The following example demonstrates that modifying the original squarified algorithm by
laying out the items along the longer (instead of the shorter) side can achieve a better
average aspect ratio. The example consists of an 8-item data set whose values are generated
randomly (based on a log-normal distribution). This set will be placed in an initial square
whose area is equal to the sum of all 8 elements. The elements of the data set are as follows.


        Data set = (0.6702, 0.5025, 0.8937, 0.4331, 0.8275, 4.3615, 0.8979, 0.3188)


 We still use the same idea of recurrence [2], and as in the original algorithm, we sort the
data elements in decreasing order. However, unlike the original algorithm, when we fill in a
new element, we place it along the longer side. The final result is better, as demonstrated
in detail in Figures 3.1 and 3.2, which show the steps of processing all the rectangles.
The modified algorithm achieves an average aspect ratio of 1.3259, whereas the original
algorithm produces a treemap with an average aspect ratio of 1.5991. With layouts along the
shorter side and along the longer side, each rectangle has the same area in both treemaps;
only its location and, correspondingly, its aspect ratio differ.


  It is worth pointing out that this layout is similar to the strip algorithm [1], which lays
out the rectangles always in horizontal (or always in vertical) strips of varying thicknesses.
In our modified algorithm, the layout is along the longer side. When the initial rectangle is
a square (i.e., both sides have equal length), the length of the initially chosen layout side
(either horizontal or vertical) is not reduced by the layout of each row, while the other side
becomes shorter and shorter after each row. Thus the layout side never changes, and the
layout is always in horizontal (or always in vertical) rows, as in the strip algorithm.
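
  The comparison in this example can be sketched in Python as below. The sketch is not the
author's program: it assumes that a row is extended as long as its worst aspect ratio does
not get worse, as in [2], and the names strip_ratios and average_ratio and the along
parameter are hypothetical. Under these assumptions it yields averages of about 1.599 for
the shorter side and about 1.326 for the longer side on the data set above.

```python
from math import sqrt


def strip_ratios(row, side):
    """Aspect ratios of the rectangles in one strip laid along a side of length `side`."""
    t = sum(row) / side                        # strip thickness
    return [max(a / t**2, t**2 / a) for a in row]


def average_ratio(areas, width, height, along="shorter"):
    """Average aspect ratio of the greedy row-by-row layout of `areas`, in the order given."""
    w, h = float(width), float(height)
    ratios, i = [], 0
    while i < len(areas):
        side = min(w, h) if along == "shorter" else max(w, h)
        row = [areas[i]]
        i += 1
        while (i < len(areas)
               and max(strip_ratios(row + [areas[i]], side)) <= max(strip_ratios(row, side))):
            row.append(areas[i])
            i += 1
        ratios += strip_ratios(row, side)
        t = sum(row) / side
        if side == w:                          # the strip uses up thickness t of the other side
            h -= t
        else:
            w -= t
    return sum(ratios) / len(ratios)


data = sorted([0.6702, 0.5025, 0.8937, 0.4331, 0.8275, 4.3615, 0.8979, 0.3188], reverse=True)
s = sqrt(sum(data))                            # side of the initial square
print("shorter side:", average_ratio(data, s, s, "shorter"))
print("longer side: ", average_ratio(data, s, s, "longer"))
```

  With along="longer" and a square start, the layout side stays the same for every row,
which is the strip-like behaviour noted above.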




[Figure 3.1. Layout along the longer side.
 Row 1: L(1) = 4.3615 (adding L(2) = 0.8979 worsens the aspect ratio).
 Row 2: L(2) = 0.8979, L(3) = 0.8937, L(4) = 0.8275 (adding L(5) = 0.6702 worsens the aspect ratio).
 Row 3: L(5) = 0.6702, L(6) = 0.5025, L(7) = 0.4331, L(8) = 0.3188.
 The final average aspect ratio = 1.3259.]




[Figure 3.2. Original squarified algorithm.
 Row 1: L(1) = 4.3615 (adding L(2) = 0.8979 worsens the aspect ratio).
 Row 2: L(2) = 0.8979, L(3) = 0.8937 (adding L(4) = 0.8275 worsens the aspect ratio).
 Row 3: L(4) = 0.8275, L(5) = 0.6702 (adding L(6) = 0.5025 worsens the aspect ratio).
 Row 4: L(6) = 0.5025 (adding L(7) = 0.4331 worsens the aspect ratio).
 Row 5: L(7) = 0.4331 (adding L(8) = 0.3188 worsens the aspect ratio).
 Row 6: L(8) = 0.3188.
 The final average aspect ratio = 1.5991.]
Example 2: Compare the layout of the rectangles along the best of shorter or longer
sides and the layout of the rectangles along the shorter side.
   The following example demonstrates that modifying the original squarified algorithm by
laying out the items along the best of shorter or longer sides can achieve a better average
aspect ratio. The example consists of an 8-item data set whose values are generated
randomly (using a log-normal distribution).


        Data set = (0.9374, 3.7711, 0.8815, 2.1927, 0.3901, 0.8526, 0.8410, 1.2385)


 The algorithm first splits the initial rectangle along the shorter side, and then keeps
adding the next rectangles to fill the subdivision until the aspect ratio of the rectangles
added to this subdivision becomes worse; the result of the previous step is then taken as
the result of the layout along the shorter side. Next, the algorithm goes back to the initial
state and splits the initial rectangle along the longer side, again adding rectangles until
the aspect ratio becomes worse and taking the result of the previous step as the result of
the layout along the longer side. The algorithm then compares the 2 results and accepts the
better one as the result of this subdivision. Continuing from this result, the algorithm
processes the next subdivision of the remaining rectangle in the same manner, choosing the
better of the shorter and longer sides. The algorithm repeats these subdivision steps until
it has processed all rectangles.


 The steps of processing all the rectangles are demonstrated in Figure 3.3. During the
layout of each row, the algorithm tries laying out along the shorter side and along the
longer side, and then selects the one with the smaller aspect ratio. (The details of the
modified algorithm are given in the Appendix.) For the given data set, the modified algorithm
generates a treemap with an average aspect ratio of 1.4505, which is better than the value
(2.0012) obtained by the original squarified algorithm, as shown in Figure 3.4.
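
 A possible reading of this "best of both sides" rule is sketched below in Python. The
details are assumptions rather than the algorithm of the Appendix: each candidate row is
grown with the worst-aspect-ratio test of [2], the two candidates are compared by the mean
aspect ratio of their rectangles, ties keep the shorter side, and the initial rectangle is
taken to be a square. The names greedy_row, average_ratio and along are hypothetical. Under
these assumptions the sketch gives about 1.4505 for the data set above, against about
2.0012 for the shorter side only.

```python
from math import sqrt


def strip_ratios(row, side):
    """Aspect ratios of the rectangles in one strip laid along a side of length `side`."""
    t = sum(row) / side
    return [max(a / t**2, t**2 / a) for a in row]


def greedy_row(areas, start, side):
    """Longest run from areas[start] whose worst aspect ratio never gets worse."""
    row = [areas[start]]
    nxt = start + 1
    while (nxt < len(areas)
           and max(strip_ratios(row + [areas[nxt]], side)) <= max(strip_ratios(row, side))):
        row.append(areas[nxt])
        nxt += 1
    return row


def average_ratio(areas, width, height, along="best"):
    """Average aspect ratio; `along` is "shorter", "longer", or "best" (try both per row)."""
    w, h = float(width), float(height)
    ratios, i = [], 0
    while i < len(areas):
        sides = {"shorter": [min(w, h)], "longer": [max(w, h)],
                 "best": [min(w, h), max(w, h)]}[along]
        best = None
        for side in sides:
            row = greedy_row(areas, i, side)
            r = strip_ratios(row, side)
            mean = sum(r) / len(r)
            if best is None or mean < best[0]:     # ties keep the shorter side
                best = (mean, side, row, r)
        _, side, row, r = best
        ratios += r
        i += len(row)
        t = sum(row) / side
        if side == w:                              # shrink the dimension the strip consumed
            h -= t
        else:
            w -= t
    return sum(ratios) / len(ratios)


data = sorted([0.9374, 3.7711, 0.8815, 2.1927, 0.3901, 0.8526, 0.8410, 1.2385], reverse=True)
s = sqrt(sum(data))                                # assumed: the initial rectangle is a square
print("best of both:", average_ratio(data, s, s, "best"))
print("shorter only:", average_ratio(data, s, s, "shorter"))
```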




[Figure 3.3. Layouts along the best of shorter or longer side.
 Row 1: L(1) = 3.7711, L(2) = 2.1927 (adding L(3) = 1.2385 worsens the aspect ratio).
 Row 2, shorter side: L(3) = 1.2385 (adding L(4) = 0.9374 worsens the aspect ratio).
 Row 2, longer side (better): L(3) = 1.2385, L(4) = 0.9374, L(5) = 0.8815 (adding L(6) = 0.8526
 worsens the aspect ratio).
 Row 3, shorter side: L(6) = 0.8526 (adding L(7) = 0.8410 worsens the aspect ratio).
 Row 3, longer side: L(6) = 0.8526, L(7) = 0.8410, L(8) = 0.3901.
 The final average aspect ratio = 1.4505.]




[Figure 3.4. Original squarified algorithm.
 Row 1: L(1) = 3.7711, L(2) = 2.1927 (adding L(3) = 1.2385 worsens the aspect ratio).
 Row 2: L(3) = 1.2385 (adding L(4) = 0.9374 worsens the aspect ratio).
 Row 3: L(4) = 0.9374, L(5) = 0.8815 (adding L(6) = 0.8526 worsens the aspect ratio).
 Row 4: L(6) = 0.8526, L(7) = 0.8410 (adding L(8) = 0.3901 worsens the aspect ratio).
 Row 5: L(8) = 0.3901.
 The final average aspect ratio = 2.0012.]




Example 3: Compare ordering data elements in increasing and decreasing order
   The following example demonstrates that modifying the original squarified algorithm by
changing the ordering of the data elements to increasing (instead of decreasing) order can
also achieve a better average aspect ratio. The example consists of an 8-item data set whose
values are generated randomly (using a log-normal distribution). This set will be placed in
an initial square whose area is equal to the sum of all 8 elements. The elements of this data
set are as follows.


        Data set = (5.9175, 3.1784, 1.2964, 0.7290, 0.4724, 4.1725, 0.8795, 3.0248)


 We still use the same idea of a recurrence as in the original algorithm [2]; however,
elements will be ordered and filled in increasing order to obtain a treemap with an average
aspect ratio of 1.4637 as shown below in Figure 3.5. This is an improvement over the
original algorithm which orders and fills elements in decreasing order as this leads to a
treemap with an average aspect ratio equal to 1.8368 as shown in Figure 3.6.
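
 The effect of the ordering can be explored with a Python sketch like the one below. It is
only an illustration under assumptions: rows are grown with the worst-aspect-ratio test of
[2] along the shorter side, and the name average_ratio is hypothetical. For decreasing
order it gives about 1.8368 on this data set; its value for increasing order is smaller,
though it need not coincide exactly with the figure reported here, since that depends on
details of the author's implementation.

```python
from math import sqrt


def strip_ratios(row, side):
    """Aspect ratios of the rectangles in one strip laid along a side of length `side`."""
    t = sum(row) / side
    return [max(a / t**2, t**2 / a) for a in row]


def average_ratio(areas, width, height):
    """Average aspect ratio when `areas` are laid out along the shorter side, in the order given."""
    w, h = float(width), float(height)
    ratios, i = [], 0
    while i < len(areas):
        side = min(w, h)
        row = [areas[i]]
        i += 1
        while (i < len(areas)
               and max(strip_ratios(row + [areas[i]], side)) <= max(strip_ratios(row, side))):
            row.append(areas[i])
            i += 1
        ratios += strip_ratios(row, side)
        t = sum(row) / side
        if side == w:                          # the strip uses up thickness t of the other side
            h -= t
        else:
            w -= t
    return sum(ratios) / len(ratios)


data = [5.9175, 3.1784, 1.2964, 0.7290, 0.4724, 4.1725, 0.8795, 3.0248]
s = sqrt(sum(data))                            # side of the initial square
decreasing = average_ratio(sorted(data, reverse=True), s, s)
increasing = average_ratio(sorted(data), s, s)
print("decreasing:", decreasing)
print("increasing:", increasing)
```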




[Figure 3.5. Modified squarified algorithm by ordering elements in increasing order.
 Row 1: L(1) = 0.4724, L(2) = 0.7290, L(3) = 0.8795, L(4) = 1.2964 (adding L(5) = 3.0248 worsens
 the aspect ratio).
 Row 2: L(5) = 3.0248, L(6) = 3.1784 (adding L(7) = 4.1725 worsens the aspect ratio).
 Row 3: L(7) = 4.1725 (adding L(8) = 5.9175 worsens the aspect ratio).
 Row 4: L(8) = 5.9175.
 The final average aspect ratio = 1.4637.]




[Figure 3.6. Original squarified algorithm.
 Row 1: L(1) = 5.9175, L(2) = 4.1725 (adding L(3) = 3.1784 worsens the aspect ratio).
 Row 2: L(3) = 3.1784 (adding L(4) = 3.0248 worsens the aspect ratio).
 Row 3: L(4) = 3.0248 (adding L(5) = 1.2964 worsens the aspect ratio).
 Row 4: L(5) = 1.2964 (adding L(6) = 0.8795 worsens the aspect ratio).
 Row 5: L(6) = 0.8795, L(7) = 0.7290 (adding L(8) = 0.4724 worsens the aspect ratio).
 Row 6: L(8) = 0.4724.
 The final average aspect ratio = 1.8368.]
Example 4: Compare ordering data elements in random and decreasing order
   The following example demonstrates that a better average aspect ratio can also be
achieved by the modified squarified algorithm using random ordering of the data elements
instead of decreasing order. The example consists of an 8-item data set whose values are
generated randomly (using a log-normal distribution). This set will be placed in an initial
square whose area is equal to the sum of all 8 elements. The elements of this data set are as
follows.


       Data set = (0.2748, 0.4382, 0.3095, 0.4082, 1.6329, 0.3436, 0.4356, 1.4120)


  The same idea of a recurrence as in the original algorithm [2] is still used. However,
random ordering of data elements is used when filling in subrectangles to obtain a treemap
with an average aspect ratio of 1.2547 as shown below in Figure 3.7. This is smaller than
that obtained by the original algorithm which orders and fills elements in decreasing order
as this leads to a treemap with an average aspect ratio equal to 1.4843 as shown in Figure
3.8.
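
  The comparison for this example can be sketched in the same way in Python. The sketch
assumes that the "random" order is the order in which the data set is listed above, that
rows are grown with the worst-aspect-ratio test of [2] along the shorter side, and that the
name average_ratio is hypothetical. Under these assumptions it gives about 1.2547 for the
listed order and about 1.4843 for decreasing order. The seeded shuffles at the end are an
addition for illustration only and show how much the result can vary between random orders.

```python
import random
from math import sqrt


def strip_ratios(row, side):
    """Aspect ratios of the rectangles in one strip laid along a side of length `side`."""
    t = sum(row) / side
    return [max(a / t**2, t**2 / a) for a in row]


def average_ratio(areas, width, height):
    """Average aspect ratio when `areas` are laid out along the shorter side, in the order given."""
    w, h = float(width), float(height)
    ratios, i = [], 0
    while i < len(areas):
        side = min(w, h)
        row = [areas[i]]
        i += 1
        while (i < len(areas)
               and max(strip_ratios(row + [areas[i]], side)) <= max(strip_ratios(row, side))):
            row.append(areas[i])
            i += 1
        ratios += strip_ratios(row, side)
        t = sum(row) / side
        if side == w:
            h -= t
        else:
            w -= t
    return sum(ratios) / len(ratios)


data = [0.2748, 0.4382, 0.3095, 0.4082, 1.6329, 0.3436, 0.4356, 1.4120]
s = sqrt(sum(data))                            # side of the initial square
as_listed = average_ratio(data, s, s)
decreasing = average_ratio(sorted(data, reverse=True), s, s)
print("as listed: ", as_listed)
print("decreasing:", decreasing)

# Illustration only: spread of the result over 200 seeded random orders.
rng = random.Random(0)
trials = [average_ratio(rng.sample(data, len(data)), s, s) for _ in range(200)]
print("random orders: best", min(trials), "worst", max(trials))
```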




[Figure 3.7. Modified squarified algorithm by using random ordering of data elements.
 Row 1: L(1) = 0.2748, L(2) = 0.4382, L(3) = 0.3095, L(4) = 0.4082 (adding L(5) = 1.6329 worsens
 the aspect ratio).
 Row 2: L(5) = 1.6329 (adding L(6) = 0.3436 worsens the aspect ratio).
 Row 3: L(6) = 0.3436, L(7) = 0.4356 (adding L(8) = 1.4120 worsens the aspect ratio).
 Row 4: L(8) = 1.4120.
 The final average aspect ratio = 1.2547.]




[Figure 3.8. Original squarified algorithm.
 Row 1: L(1) = 1.6329, L(2) = 1.4120 (adding L(3) = 0.4382 worsens the aspect ratio).
 Row 2: L(3) = 0.4382, L(4) = 0.4356 (adding L(5) = 0.4082 worsens the aspect ratio).
 Row 3: L(5) = 0.4082, L(6) = 0.3436 (adding L(7) = 0.3095 worsens the aspect ratio).
 Row 4: L(7) = 0.3095 (adding L(8) = 0.2748 worsens the aspect ratio).
 Row 5: L(8) = 0.2748.
 The final average aspect ratio = 1.4843.]




Example 5: Compare ordering data elements in the best of decreasing or increasing
order and in decreasing order.
   The following example demonstrates that modifying the original squarified algorithm by
ordering the data elements in the best of increasing or decreasing order achieves a better
average aspect ratio. The example consists of an 8-item data set whose values are generated
randomly (using a log-normal distribution).


        Data set = (1.7476, 2.9099, 1.1319, 0.8325, 2.0535, 0.5614, 0.7500, 1.0990)


 The algorithm first sorts the data elements in decreasing order, and then keeps filling
the next item into the subdivision until the aspect ratio of the rectangles filled into this
subdivision becomes worse; the result of the previous step is then taken as the result for
decreasing order. Next, the algorithm goes back to the initial state, takes the data
elements in increasing order, and again keeps filling the next item into the subdivision
until the aspect ratio becomes worse, taking the result of the previous step as the result
for increasing order. The algorithm then compares the 2 results and accepts the better one
as the result of this subdivision. Continuing from this result, the algorithm processes the
next subdivision, filling in the remaining data elements in the same manner and choosing
the better of the increasing or decreasing order. The algorithm repeats these subdivision
steps until it has processed all data elements.


 The steps of processing all the data elements are demonstrated in Figure 3.9. During the
processing of each row, the algorithm tries ordering the data elements both in decreasing
order and in increasing order, and then selects the ordering that leads to the smaller
aspect ratio. (The details of the modified algorithm are given in the Appendix.) For the
given data set, the modified algorithm generates a treemap with an average aspect ratio of
1.3586, which is better than the value (1.5674) obtained by the original squarified
algorithm, as shown in Figure 3.10.
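
 One way to read this rule is sketched below in Python. It is an assumption-laden
illustration, not the algorithm of the Appendix: the sorted items are kept in a
double-ended queue, each row is tried once from the large end and once from the small end,
rows are grown with the worst-aspect-ratio test of [2] along the shorter side, the two
candidates are compared by the mean aspect ratio of their rectangles, and the initial
rectangle is taken to be a square. The names greedy_row and best_order_average are
hypothetical. Under these assumptions the sketch gives about 1.3586 for the data set above.

```python
from collections import deque
from math import sqrt


def strip_ratios(row, side):
    """Aspect ratios of the rectangles in one strip laid along a side of length `side`."""
    t = sum(row) / side
    return [max(a / t**2, t**2 / a) for a in row]


def greedy_row(seq, side):
    """Longest prefix of `seq` whose worst aspect ratio never gets worse."""
    row = [seq[0]]
    for a in seq[1:]:
        if max(strip_ratios(row + [a], side)) > max(strip_ratios(row, side)):
            break
        row.append(a)
    return row


def best_order_average(areas, width, height):
    """Average aspect ratio when each row takes the better of decreasing or increasing order."""
    items = deque(sorted(areas, reverse=True))
    w, h = float(width), float(height)
    ratios = []
    while items:
        side = min(w, h)
        row_dec = greedy_row(list(items), side)             # largest remaining items first
        row_inc = greedy_row(list(reversed(items)), side)   # smallest remaining items first
        r_dec, r_inc = strip_ratios(row_dec, side), strip_ratios(row_inc, side)
        if sum(r_dec) / len(r_dec) <= sum(r_inc) / len(r_inc):
            row, r = row_dec, r_dec
            for _ in row:
                items.popleft()
        else:
            row, r = row_inc, r_inc
            for _ in row:
                items.pop()
        ratios += r
        t = sum(row) / side
        if side == w:                                       # shrink the consumed dimension
            h -= t
        else:
            w -= t
    return sum(ratios) / len(ratios)


data = [1.7476, 2.9099, 1.1319, 0.8325, 2.0535, 0.5614, 0.7500, 1.0990]
s = sqrt(sum(data))                                         # assumed: a square initial rectangle
print(best_order_average(data, s, s))
```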

[Figure 3.9. Modified squarified algorithm by ordering in the best of decreasing or increasing order.
 Row 1, decreasing order: L(1) = 2.9099, L(2) = 2.0535 (adding L(3) = 1.7476 worsens the aspect ratio).
 Row 1, increasing order: L(8) = 0.5614, L(7) = 0.7500, L(6) = 0.8325, L(5) = 1.0990 (adding
 L(4) = 1.1319 worsens the aspect ratio).
 Row 2, decreasing order: L(3) = 1.7476 (adding L(4) = 1.1319 worsens the aspect ratio).
 Row 2, increasing order: L(8) = 0.5614, L(7) = 0.7500 (adding L(6) = 0.8325 worsens the aspect ratio).
 Row 3, decreasing order: L(3) = 1.7476 (adding L(4) = 1.1319 worsens the aspect ratio).
 Row 3, increasing order: L(6) = 0.8325, L(5) = 1.0990 (adding L(4) = 1.1319 worsens the aspect ratio).
 Row 4, decreasing order: L(3) = 1.7476 (adding L(4) = 1.1319 worsens the aspect ratio).
 Row 4, increasing order: L(4) = 1.1319 (adding L(3) = 1.7476 worsens the aspect ratio).
 Row 5: L(4) = 1.1319.
 The final average aspect ratio = 1.3586.]
[Figure 3.10: step-by-step layouts for the same data set, adding elements L(1) to L(8) in
decreasing order only; "Optimum" marks the row state that is kept and "Worsen" marks the
rejected next step. The final average aspect ratio is 1.5674.]

Figure 3.10 Original squarified algorithm.
4. Statistics experiments
  The experiments in the last section show that the modified algorithms can achieve a
better aspect ratio than the original algorithm, and many other such examples can be cited.
However, how much does the aspect ratio improve on average over a sufficiently large sample
space, and in what percentage of cases does it improve? These ideas and modifications are
of value only if the average improvement is relatively large, or if the aspect ratio
improves in a relatively large percentage of cases. After all, the original squarified
algorithm already achieves a reasonably good solution. If the average improvement is not
significant but the modified squarified algorithms improve the aspect ratio in a
considerable percentage of cases, the original squarified algorithm can be used in
conjunction with the modified ones to improve the aspect ratio. However, this inevitably
costs more time, just as laying out along the best of the shorter and longer side and
ordering in the best of decreasing and increasing order each incur twice the running time.


 In this section we further analyze and experiment with two ideas: comparing layout along
the shorter side, the longer side, and the best of the two; and comparing ordering the data
elements in decreasing, increasing, random, and the best of decreasing and increasing
order.


4.1. Comparing the layout along the shorter, longer and best of shorter and longer
side
  There are circumstances in which each of these layouts leads to better results. In this
section we perform statistical experiments to see which of these layouts leads to an
improved aspect ratio statistically.


  The original squarified algorithm (OSAS) always lays out the rectangles along the shorter
side on each row. We have shown an example in which always laying out along the longer side
(MSAL) gives a better result, although it makes the initial rectangle of the next row less
square-like. This is because a more square-like initial rectangle does not always lead to a
better result. Even if the initial rectangle is long, filling it with the necessary number
of elements may still produce many square-like constituent rectangles. We conduct
statistical experiments on many data sets to compare MSAL with the original algorithm.


 The 2nd modified algorithm lays out along the best of the shorter and the longer side
(MSAB). Like the previous algorithms it lays out row by row, but on each row it performs two
layouts, one along the shorter side and one along the longer side. It then compares the
worst aspect ratio of the rectangles added to the current row under each layout, and lays
out the current row along the side that produces the smaller worst aspect ratio.
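
 The per-row choice made by MSAB can be sketched in Python as follows. This is a minimal
sketch under stated assumptions rather than the algorithm from the Appendix: the function
names are hypothetical, the row is filled greedily until its worst aspect ratio becomes
worse, and a tie is resolved in favor of the shorter side. The input values are made up for
illustration.

```python
def worst_ratio(row, side):
    """Worst aspect ratio of a row of areas laid out along a side of length `side`."""
    total = sum(row)
    return max(
        max(side * side * a / total ** 2, total ** 2 / (side * side * a))
        for a in row
    )


def fill_row(items, side):
    """Add items in the given order until the worst aspect ratio becomes worse."""
    row = [items[0]]
    for area in items[1:]:
        if worst_ratio(row + [area], side) > worst_ratio(row, side):
            break
        row.append(area)
    return row


def choose_side(items, width, height):
    """Lay out one row along both sides and keep the side with the smaller worst ratio."""
    candidates = []
    for name, side in (("shorter", min(width, height)), ("longer", max(width, height))):
        row = fill_row(items, side)
        candidates.append((worst_ratio(row, side), name, row))
    # min() returns the first minimum, so a tie keeps the shorter side (an assumption).
    _, name, row = min(candidates, key=lambda c: c[0])
    return name, row


# Illustrative areas in decreasing order, laid out in a 6 by 4 region.
print(choose_side([8, 4, 4, 3, 3, 2], 6, 4))
```

 For these illustrative values the row along the longer side holds three rectangles with a
smaller worst aspect ratio than the single rectangle that fits along the shorter side, so
the longer side is selected for this row.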


 Layout along the best of the shorter and longer side can also be seen as a heuristic
method driven by information computed from the specific data. To improve the final average
aspect ratio, we evaluate the two layouts (along the longer and the shorter side) locally
in each row and decide along which side to lay out that row. The algorithm uses this
knowledge of the aspect ratio to control the layout of the rectangles in the current row.
Unlike the original squarified algorithm, which always lays out along the shorter side
(OSAS), and the modified squarified algorithm that always lays out along the longer side
(MSAL), MSAB dynamically selects the side for the current row based on which side leads to
the smaller aspect ratio.


Statistics trials
 We use five scales of data set as the simulation input: 8×3, 10×2, 20×1, 50×1, and 100×1,
in which the number of data items per level is 8, 10, 20, 50 and 100, respectively. For
each of the five scales we conduct 100 tests, each consisting of 100 steps. At the
beginning of each test, we generate random data that obey a log-normal distribution (the
exponentiation of a normal distribution with mean value 0 and variance 1). The log-normal
distribution is commonly used to represent naturally occurring positive-valued data
[1, 4, 6] (other experiments have used a similar data distribution [1, 6]). At the
beginning of each step, the elements of the data set are multiplied by a random variable
e^x (where x is drawn from a normal distribution with mean value 0 and variance 0.05) to
simulate noise [1, 6]. The final result is the average over the 100 tests of 100 steps
each. In all the experiments that follow, the simulation data are generated in this way.
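
 The data generation described above can be sketched with Python's standard library as
follows. This is a minimal sketch: the function names and the seed are hypothetical, and it
assumes that each element receives its own independent noise factor at every step, which
the text does not state explicitly. Note that the standard library functions take a
standard deviation, so the variance 0.05 is converted with a square root.

```python
import math
import random


def make_dataset(n, rng):
    """Draw n positive values from a log-normal distribution (mean 0, variance 1)."""
    return [rng.lognormvariate(0.0, 1.0) for _ in range(n)]


def add_noise(values, rng, variance=0.05):
    """Multiply each value by e^x, with x drawn from a normal(0, variance) distribution."""
    sigma = math.sqrt(variance)  # gauss() expects a standard deviation
    return [v * math.exp(rng.gauss(0.0, sigma)) for v in values]


rng = random.Random(2024)        # fixed seed only to make the sketch repeatable
data = make_dataset(8, rng)      # one level of the 8x3 scale
for step in range(100):          # one test consists of 100 steps
    data = add_noise(data, rng)
print(data)
```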


    We compare layout along the longer side, and along the best of the shorter and longer
  side, with layout along the shorter side as in the original squarified algorithm. The
  experimental results are shown in Figure 4.1.




Figure 4.1. Results of 1st group of squarified algorithms. The ordering of data is in decreasing
order (OD). The layouts of rectangles are along the shorter (OSAS), longer (MSAL) and best of
shorter and longer side (MSAB).


    From the figure we can see that, on all five scales of dataset with various numbers of
data items per level, the final average aspect ratio generated by the modified algorithm
that lays out along the longer side (MSAL) is much higher (3.5 ~ 9), and therefore the
aspect ratio improvement is negative (-700% ~ -4000%). The percentage of achieving a better
aspect ratio is the proportion of treemaps for which the corresponding algorithm produces
an improved final average aspect ratio on the datasets with various numbers of items per
level. As we can see, the proportion of treemaps with improved aspect ratio produced by
layout along the longer side is very small (around 4 per cent on the datasets with 20 items
per level and less than 0.3% on the datasets with 8, 10, 50 and 100 items per level).
Although we found an example in which layout along the longer side is better than the
original algorithm, the experimental results show that it rarely produces a smaller final
average aspect ratio.
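
 The two statistics reported in the figure can be sketched in Python as follows. The text
does not give the exact formula for the aspect ratio improvement, so the definition used
here (the relative reduction of the mean final average aspect ratio, in percent) is an
assumption, as are the function name and the illustrative input values. The percentage of
better aspect ratio follows the description above: the share of trials in which the
modified algorithm yields a smaller final average aspect ratio than the original.

```python
def summarize(original, modified):
    """Compare per-trial final average aspect ratios of two algorithms.

    Returns (improvement_percent, percent_better). The improvement formula is an
    assumed definition; the source does not spell it out.
    """
    mean_original = sum(original) / len(original)
    mean_modified = sum(modified) / len(modified)
    improvement = 100.0 * (mean_original - mean_modified) / mean_original
    better = sum(1 for o, m in zip(original, modified) if m < o)
    percent_better = 100.0 * better / len(original)
    return improvement, percent_better


# Illustrative values only, not experimental results.
print(summarize([2.0, 2.0, 2.0, 2.0], [1.0, 3.0, 1.5, 1.5]))
```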


 In the experiments conducted, we also discovered a similarity in layout results between
this modified algorithm (MSAL) and the strip algorithm, although this similarity was not
expected at the beginning. As the statistical experiment now confirms, this modified
algorithm produces a final average aspect ratio similar to that produced by the strip
algorithm [1]. The strip algorithm was not designed to achieve the smallest average aspect
ratio, but to produce a treemap whose layout changes more smoothly as the data changes. It
produces layouts with an aspect ratio that falls somewhere between those of the
slice-and-dice method and the squarified treemap.

 Although always laying out along the longer side does not improve the final average aspect
ratio of treemaps in most cases, the hybrid alternative of laying out along the best of the
shorter and longer side has achieved some promising results.


 As can be seen from Figure 4.1, although the average aspect ratio of the 2nd modified
algorithm (MSAB) is still not improved (-5% ~ -30%) as compared to the original algorithm,
it generates a better aspect ratio than the original algorithm in a significant percentage
of cases (nearly 50% on the datasets with 20, 50, and 100 items per level). This is very
significant given that the average aspect ratio of the original squarified algorithm is
already good and very close to 1. However, the percentage of better aspect ratio on the
datasets with 8 and 10 items per level is still relatively low (4% and 27%).


 In order to further improve the 2nd modified algorithm (MSAB), we need further analysis
of this strategy and improvement. Specifically, we introduce the threshold of aspect ratio
and the threshold of look ahead to further improve MSAB.




4.1.1. Introducing the thresholds of aspect ratio and look ahead to further improve
the layout along best of longer and shorter side
  By comparing the layouts along the shorter side and the longer side, MSAB tries to lower
the aspect ratio of the current row. However, layout along the longer side makes the
initial rectangle of the next row less square-like, which is likely to increase the aspect
ratio of the next row and eventually the final aspect ratio. To minimize this negative
impact, we introduce a threshold of aspect ratio into MSAB so that it does not lay out
along the longer side if the aspect ratio of the initial rectangle of the current row is
already very high.
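
 The effect on the next row's initial rectangle can be checked with a small Python sketch.
The function names are hypothetical and the numbers are made up for illustration: a row of
total area 12 is placed in a 6 by 4 region, once along the shorter side and once along the
longer side, and the aspect ratio of the region left for the next row is compared.

```python
def remaining_region(width, height, row_area, along):
    """Return the sides (long, short) of the region left after placing one row."""
    shorter, longer = min(width, height), max(width, height)
    if along == "shorter":
        # The row spans the shorter side; its thickness is cut from the longer side.
        return (longer - row_area / shorter, shorter)
    # The row spans the longer side; its thickness is cut from the shorter side.
    return (longer, shorter - row_area / longer)


def aspect_ratio(a, b):
    """Aspect ratio of a rectangle, always at least 1."""
    return max(a, b) / min(a, b)


for side in ("shorter", "longer"):
    rest = remaining_region(6, 4, 12, side)
    print(side, rest, aspect_ratio(*rest))
```

 In this illustration the region left after a row along the shorter side is 3 by 4, while
the region left after a row along the longer side is 6 by 2, which is noticeably less
square-like.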


 We also found that at the beginning of the algorithm the layout space is relatively
large. Even if the rectangles are laid out along the longer side at this stage, the
remaining space is still relatively spacious and subsequent layout is not negatively
affected. However, laying out the last few rectangles along the longer side is likely to
increase the aspect ratio, so the last few rectangles should not be laid out along the
longer side. Thus, we add a threshold of look ahead into MSAB so that it does not lay out
along the longer side if the number of elements left is less than the threshold of look
ahead.


The optimal threshold of aspect ratio and threshold of look ahead
 We conduct experiments to verify our hypothesis about the threshold of aspect ratio and
the threshold of look ahead, and to measure the improvement to the treemap's final average
aspect ratio. We want to find out whether any specific range of the two thresholds may be
capable of improving the final average aspect ratio of the treemap on various scales of
datasets.


 We search the 2-dimensional space of the threshold of aspect ratio and the threshold of
look ahead, and observe how the improvement of the final average aspect ratio of the
treemap changes on different scales of data set as the two thresholds vary. We find that an
aspect ratio threshold of 1.6 and a look ahead threshold of 2 is an all-around good choice
that consistently achieves the optimal positive improvement of aspect ratio for all
datasets, whatever the number of data elements and layers, apart from very small
fluctuations caused by randomness in the data.
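
 A search of this kind can be sketched in Python as an exhaustive scan over a regular grid.
This is only a sketch: the function grid_search, the grid step of 0.1 for the aspect ratio
threshold, and in particular toy_improvement are assumptions. toy_improvement is a made-up
placeholder that stands in for running the statistics trials with a given pair of
thresholds; it is not experimental data.

```python
def grid_search(evaluate, ratio_thresholds, lookahead_thresholds):
    """Return (best_score, ratio_threshold, lookahead_threshold) over the grid."""
    best = None
    for ratio in ratio_thresholds:
        for lookahead in lookahead_thresholds:
            score = evaluate(ratio, lookahead)
            if best is None or score > best[0]:
                best = (score, ratio, lookahead)
    return best


# Threshold of aspect ratio from 1.0 to 2.0 (assumed step 0.1), look ahead from 0 to 8.
ratio_grid = [round(1.0 + 0.1 * i, 1) for i in range(11)]
lookahead_grid = list(range(9))


def toy_improvement(ratio, lookahead):
    """Placeholder objective; a real run would return the measured improvement."""
    return 10.0 - 50.0 * (ratio - 1.6) ** 2 - (lookahead - 2) ** 2


print(grid_search(toy_improvement, ratio_grid, lookahead_grid))
```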

 Figures 4.2 and 4.3 show the landscape of the aspect ratio improvement on an 8x3 and a
50x1 dataset with varying values of the threshold of aspect ratio and the threshold of look
ahead. The aspect ratio improvement is drawn as a mesh defined over a regularly spaced grid
of the 2-d search space of the two thresholds. A contour plot based on the minimum and
maximum values of the aspect ratio improvement is also drawn beneath the mesh.




  Figure 4.2. The landscape of the improvement of aspect ratio on 8X3 dataset with varying
  thresholds.

  Figure 4.3. The landscape of the improvement of aspect ratio on 50X1 dataset with varying
  thresholds.


 In the figures, the threshold of aspect ratio varies from 1.0 to 2.0, while the threshold
of look ahead varies from 0 to 8 (since the number of elements per level in the 8x3 dataset
is only 8). The optimal region of the thresholds, which achieves a positive improvement of
aspect ratio, is contained within the search space and centers around the optimal
threshold (1.6, 2). When the threshold of aspect ratio is set to 1.0, or the threshold of
look ahead is set sufficiently large (e.g. 8, the number of elements per level in the 8x3
dataset), the algorithm is the same as the original squarified algorithm that always lays
out along the shorter side, and thus the aspect ratio improvement is 0. When the threshold
of look ahead is set to 0 and the threshold of aspect ratio is set sufficiently large (e.g.
larger than 1.8 on the 8x3 dataset), it is as if the thresholds had not been introduced
into the modified algorithm that lays out along the best of the longer and shorter side,
and thus the aspect ratio improvement is still negative. In addition, when both thresholds
are used and set to their optimal values, the improvement is larger than when only one of
them is used (that is, with the threshold of look ahead set to 0 or the threshold of aspect
ratio set to infinity).


An example illustrating the idea of thresholds
  The modified squarified algorithm that lays out along the best of the shorter and longer
side with thresholds (MSABT) works like MSAB as previously described. The only difference
is that, at the beginning of each row, it decides whether to try both layouts by comparing
the aspect ratio of the initial rectangle of the current row with the threshold of aspect
ratio, and the number of elements left with the threshold of look ahead. When the aspect
ratio of the layout region is less than the threshold of aspect ratio and the number of
elements left is larger than the threshold of look ahead, we try both layout along the
shorter side and layout along the longer side, and adopt the layout along the side with the
better result. (The details of the algorithm are presented in the Appendix.)
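
 The decision made at the start of each row can be written as a small predicate. The sketch
below is in Python; the function name and parameter names are hypothetical, and the default
values are the thresholds (1.6 and 2) reported above as a good all-around choice.

```python
def try_both_sides(width, height, elements_left,
                   ratio_threshold=1.6, lookahead_threshold=2):
    """Return True if the row should be tried along both the shorter and longer side.

    When this returns False, the row is simply laid out along the shorter side,
    as in the original squarified algorithm.
    """
    region_ratio = max(width, height) / min(width, height)
    return region_ratio < ratio_threshold and elements_left > lookahead_threshold


print(try_both_sides(6, 4, elements_left=7))  # region ratio 1.5, many elements left
print(try_both_sides(8, 4, elements_left=7))  # region ratio 2.0 is too high
print(try_both_sides(6, 4, elements_left=2))  # too few elements left
```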


  To give a more intuitive understanding of the role of the optimal thresholds, we present
an example in which MSABT achieves an improved average aspect ratio as compared to the
original algorithm. The example consists of an 8-item data set whose values are generated
randomly (using a log-normal distribution).


        Data set = (1.3954, 0.3736, 1.6940, 3.6211, 0.3106, 1.6804, 0.1658, 0.4058)




[Figure 4.4: step-by-step layouts for the sorted data set L(1) = 3.6211, L(2) = 1.6940,
L(3) = 1.6804, L(4) = 1.3954, L(5) = 0.4058, L(6) = 0.3736, L(7) = 0.3106, L(8) = 0.1658.
Where the threshold conditions are satisfied, two layouts of the row are shown side by
side; "Optimum" marks the row state that is kept and "Worsen" marks the rejected next step.
The final average aspect ratio is 1.5119.]

   Figure 4.4. Layouts along the best of shorter and longer side with thresholds (MSABT).
[Figure 4.5: step-by-step layouts for the same data set, each row laid out along the
shorter side; "Optimum" marks the row state that is kept and "Worsen" marks the rejected
next step. The final average aspect ratio is 1.9762.]

Figure 4.5. Original squarified algorithm.
    As shown in Figure 4.4, the steps of processing all the rectangles are similar to those
in Figure 3.3. However, during the layout of each row, the algorithm tries the layout along
the best of the shorter and longer side only when the threshold conditions are satisfied;
if not, it simply lays out along the shorter side. As shown, it skips the layout along the
best of the shorter and longer side on the 2nd row, where the aspect ratio threshold
condition is not met, and on the last two rows, where the look ahead threshold condition is
not met. For the given data set, the modified algorithm generates a treemap with an average
aspect ratio of 1.5119, which is better than the value (1.9762) obtained by the original
squarified algorithm as shown in Figure 4.5.


  Statistics trials
    We conducted statistical experiments to compare MSABT with the original squarified
  algorithm. The average aspect ratio improvement and the percentage of better aspect ratio
  of the algorithms are shown in Figure 4.6 (the results of MSAL and MSAB are also included
  for ease of comparison).




Figure 4.6. Results of 1st group of squarified algorithms. The thresholds of aspect ratio and look
ahead are introduced into the variant of the algorithm that lays out along the best of shorter and
longer side (MSABT).




  In the experiments, we applied the optimal thresholds introduced and discussed above to
further improve the aspect ratio. Figure 4.6 shows that, after the introduction of the
threshold of aspect ratio and the threshold of look ahead, the improvement of the final
average aspect ratio over the original algorithm is greater than 10% for all data sets. The
percentage of better aspect ratio is greater than 70% on all datasets. The 8×3 data set
(with 8 items per level) is now the data set with the biggest percentage of better aspect
ratio, reaching 99 percent.


4.2. Compare ordering data elements in decreasing, increasing, random and the best
of decreasing or increasing order
  The original squarified algorithm orders the data elements in decreasing order. In
example 3 in section 3, we showed that ordering the data elements in increasing order
results in a treemap with improved aspect ratio, as compared to the original squarified
algorithm. Because the initial rectangle is large at the beginning of the algorithm,
filling it with many small elements may produce many square-like subrectangles at the
beginning.


  We can change the ordering of data elements in all three variants of squarified
algorithms that order data in decreasing order (OD) and lay out along various sides (OSAS,
MSAL, MSABT), as discussed in section 4.1. This gives a new group of algorithms (OI+*) in
parallel to the 1st group of squarified algorithms that order data elements in decreasing
order (OD+*). The input is processed in a different order: instead of sorting and adding
the data items in decreasing order, we sort and add them in increasing order. The
recurrences, however, are the same as those of the corresponding squarified algorithms.
Similarly, we obtain another parallel group of algorithms by changing the ordering of data
elements to random order (OR+*). Lastly, we obtain a group of algorithms by ordering data
in the best of decreasing or increasing order (OB+*). The classification of all twelve
variants is shown in Table 4.1.




Table 4.1. Two ways of classifying the variants of squarified algorithms, by layout side and
by data ordering, and their combinations. Rows give the ordering in which data elements are
filled; columns give the side along which the constituent subrectangles are laid out.

Ordering of data elements              Layout along     Layout along     Layout along best of longer and
                                       shorter side     longer side      shorter side, with thresholds

Decreasing order                       OD+OSAS          OD+MSAL          OD+MSABT
Increasing order                       OI+OSAS          OI+MSAL          OI+MSABT
Random order                           OR+OSAS          OR+MSAL          OR+MSABT
Best of decreasing or increasing       OB+OSAS          OB+MSAL          OB+MSABT
order

OD+OSAS is the original squarified algorithm; every other entry is a modified squarified
algorithm that combines the ordering of its row with the layout side of its column.
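
 The twelve variant labels are simply the combinations of the four orderings with the three
layouts, which can be enumerated with a short Python sketch (the dictionary names are
hypothetical; the abbreviations are those used in the text):

```python
from itertools import product

ORDERINGS = {
    "OD": "decreasing order",
    "OI": "increasing order",
    "OR": "random order",
    "OB": "best of decreasing or increasing order",
}
LAYOUTS = {
    "OSAS": "along the shorter side",
    "MSAL": "along the longer side",
    "MSABT": "along the best of shorter and longer side, with thresholds",
}

# Every ordering combined with every layout side.
variants = [f"{o}+{l}" for o, l in product(ORDERINGS, LAYOUTS)]
for name in variants:
    o, l = name.split("+")
    print(f"{name}: ordering in {ORDERINGS[o]}, layout {LAYOUTS[l]}")
```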


  In this section we compare ordering data elements in decreasing, increasing, random and
best of decreasing or increasing order, to find the improvement to the average aspect ratio
and the percentage of cases in which each of these outperforms the original algorithm. We
conduct statistical experiments to verify which of these ideas works best on a large data
space.


Statistics trials
  Using the same experimental process and method as described before, we compare the
original squarified algorithm with the three new groups of modified squarified algorithms
that order data elements in increasing order (OI+*), random order (OR+*) and the best of
decreasing and increasing order (OB+*). In the tests of the three variants that lay out
along the best of the shorter and longer side with thresholds (OI+MSABT, OR+MSABT,
OB+MSABT), we also applied the optimal thresholds to control and improve the aspect ratio,
as in the experiments in section 4.1.1. Figure 4.7 shows the experimental results.


  As can be seen from Figure 4.7, the 2nd group of three modified algorithms, which order
data elements in increasing order (OI+*), does not improve the aspect ratio as compared to
the original algorithm on the datasets with various numbers of items per level. The aspect
ratio improvement is negative (-20% ~ -100%). The percentage of cases in which it produces
a better aspect ratio than the original algorithm is also relatively small (less than 25%
on the datasets with 10, 20, and 50 items per level, and less than 3% on the datasets with
8 and 100 items per level). Although in some cases ordering data elements in increasing
order is better than in decreasing order, in the majority of cases its aspect ratio is
worse than that of the original squarified algorithm.


  Figure 4.7 also shows that the 3rd group of three modified algorithms, which order data
elements in random order (OR+*), produces a much worse aspect ratio on all five scales of
datasets with various numbers of data items per level. The aspect ratio improvement is
negative and very poor (-150% ~ -1000% or worse). The percentage of cases with a better
aspect ratio than the original algorithm is very small (less than 0.04%) on all datasets
with various numbers of items per level.




Figure 4.7. Results of various squarified algorithms combining different orderings of data and
layouts of rectangles. The orderings of data are in decreasing (OD), increasing (OI), random (OR)
and best of decreasing and increasing order (OB). The layouts of rectangles are along the shorter
(OSAS), longer (MSAL) and best of shorter and longer side (MSABT).


     Although changing the ordering of data elements to increasing or random order does not
improve the aspect ratio on average, our experimental results show that the 4th group of
three algorithms, which order data in the best of decreasing or increasing order (OB+*),
further improves the aspect ratio as compared to the 1st group of algorithms that order
data in decreasing order (OD+*). Also, two of the new modified algorithms that order data
elements in the best of decreasing and increasing order (OB+OSAS, OB+MSABT) both improve
the aspect ratio as compared to the original algorithm on these datasets with various
numbers of items per level.


 We can see that ordering data elements in the best of decreasing and increasing order and
laying out along the shorter side (OB+OSAS) produces a significant improvement of the
aspect ratio as compared to the original algorithm (over 3% on the datasets with 100 items
per level, more than 5% and 7% on the datasets with 50 and 20 items per level, and around
10% on the datasets with 8 and 10 items per level). The percentage of cases in which it
generates a better aspect ratio than the original algorithm is also relatively large
(around 97% on the datasets with 8 and 10 items per level, 88% on the datasets with 20 and
50 items per level, and over 50% on the datasets with 100 items per level).


  For OB+MSABT, the aspect ratio improvement (16% ~ 18%) is the largest among all the
variants we tested. The percentage of cases in which it produces a better aspect ratio than
the original algorithm is also very large (around 80% ~ 100%) on the datasets with various
numbers of items per level. Compared with ordering in decreasing order and laying out along
the best side (OD+MSABT), the improvement of OB+MSABT is significantly larger: it is
approximately equal to the sum of the improvement obtained by ordering data elements in the
best of decreasing and increasing order (OB+OSAS) and the improvement obtained by laying
out along the best side (OD+MSABT).


  Figure 4.7 also shows that the new modified algorithm that orders data elements in the
best of decreasing and increasing order and lays out along the longer side (OB+MSAL) still
produces a very poor aspect ratio on all five scales of datasets. The aspect ratio
improvement is still negative and very large in magnitude (ranging from -600% to -4000%),
though slightly better than that of OD+MSAL. The percentage of cases with a better aspect
ratio than the original algorithm is still very small (1% ~ 4% or less) on all datasets.




4.2.1. Using the best of ordering data elements in increasing and decreasing order
exclusively (ExOB)
  Ordering data elements in increasing order does not improve the average aspect ratio of
the treemaps produced over a large sample space. However, there are still some cases in
which increasing order leads to better results, though their percentage is relatively
small. To further compare different ways of exploiting increasing order, we also tried
using the best of ordering data elements in decreasing order and in increasing order
exclusively (ExOB). We compute the final average aspect ratio of the treemap produced by
ordering all data elements in increasing order and of the treemap produced by ordering all
data elements in decreasing order, and then keep the treemap with the smaller final average
aspect ratio. Such a simple combination of two orderings can never be worse than the
original algorithm. Like ordering data in the best of decreasing and increasing order
nonexclusively (OB), as described in example 5 in section 3, it does approximately twice
the work on each dataset and therefore costs about twice the run time of the original
algorithm.
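
  The ExOB selection described above can be sketched in Python as follows. This is only a
minimal sketch under stated assumptions, not the project's Java implementation: the names
worst, squarify, average_aspect_ratio and exob are illustrative, an empty row is assumed to
have an infinite worst aspect ratio, and the "average aspect ratio" is assumed to be the
unweighted mean of max(w/h, h/w) over all rectangles of a one-level layout.

```python
import math

def worst(row, w):
    """Worst aspect ratio of a row of areas laid along a side of length w."""
    if not row:
        return math.inf  # assumption: an empty row always loses the comparison
    s = sum(row)
    return max(max(w * w * a / (s * s), s * s / (w * w * a)) for a in row)

def squarify(areas, width, height):
    """Lay out areas in the given order along the shorter side (OSAS).

    Returns one (row thickness, length along the row) pair per rectangle.
    """
    rects, children = [], list(areas)
    while children:
        w = min(width, height)
        row = []
        while children and worst(row, w) >= worst(row + [children[0]], w):
            row.append(children.pop(0))
        thickness = sum(row) / w
        rects += [(thickness, a / thickness) for a in row]
        if width >= height:
            width -= thickness
        else:
            height -= thickness
    return rects

def average_aspect_ratio(rects):
    """Unweighted mean aspect ratio (an assumed definition)."""
    return sum(max(a / b, b / a) for a, b in rects) / len(rects)

def exob(areas, width, height):
    """Run OD+OSAS and OI+OSAS on the whole data set and keep the better layout."""
    od = squarify(sorted(areas, reverse=True), width, height)
    oi = squarify(sorted(areas), width, height)
    return min(od, oi, key=average_aspect_ratio)
```

  Because the decreasing-order layout is always one of the two candidates, the result can
never be worse than OD+OSAS, and both layouts are computed in full, which is where the
factor of about two in run time comes from.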


  We begin with ExOB+OSAS, which uses the best of the original squarified algorithm
(OD+OSAS) and ordering data elements in increasing order (OI+OSAS), and measure its
improvement of the average aspect ratio. We conducted an experiment comparing the aspect
ratio of ExOB+OSAS with that of the original squarified algorithm. The result is shown in
figure 4.8.




 Figure 4.8. Comparison of the aspect ratio produced by the original squarified algorithm
 (OD+OSAS) with that of ExOB+OSAS, which uses the best of the original squarified algorithm
 (OD+OSAS) and ordering data elements in increasing order (OI+OSAS). Panels (a) and (b);
 plotted series: OD+OSAS, OI+OSAS, ExOB+OSAS.


  The improvement of the average aspect ratio of the treemaps produced by ExOB+OSAS is
quite small (less than 3.3% on the datasets with 20 items per level, less than 1.5% on the
datasets with 10 and 50 items per level, and less than 0.2% on the datasets with 8 and 100
items per level). This is much smaller than the improvement obtained by ordering in the
best of decreasing and increasing order (OB+OSAS) shown in figure 4.7, for a similar cost.
Thus, we did not follow up on this idea (for example, by using the best of OD+MSABT and
OI+MSABT exclusively). The original squarified algorithm could certainly be supplemented
with all the variants of modified algorithms if a low running time is not required while a
particularly low aspect ratio is.


  Although using the best of ordering data elements in increasing and decreasing order
exclusively (ExOB) cannot be worse than ordering data in decreasing order (OD), the
experimental improvement of the aspect ratio (ExOB+OSAS, i.e. the best of OD+OSAS and
OI+OSAS used exclusively) is smaller than that of ordering data in the best of decreasing
or increasing order non-exclusively (OB+OSAS), even though both take approximately twice
the running time of the original algorithm (OD+OSAS). Thus, it is better to order data in
the best of decreasing or increasing order non-exclusively (OB+*), as presented in example
5 in section 3 and figure 4.7.

4.3. Metrics for treemap layout: aspect ratio distribution
  To have a more comprehensive understanding of the performance, we also compared the
general distribution of the aspect ratio of treemaps produced by the twelve algorithms.


  Figure 4.9 shows, for the twelve algorithms and for various numbers of elements per
level, the distribution of the aspect ratios over three categories: low, medium and high.
The heights of the white, gray and black portions of each bar indicate the percentages of
low, medium and high aspect ratios respectively. We separate low from medium aspect ratios
at the aspect ratio of the standard computer display (4/3, approximately 1.333, e.g. a
resolution of 1024x768), and medium from high aspect ratios at the aspect ratio of the
widescreen computer display (16/9, approximately 1.778, e.g. resolutions of 1280x720 and
1920x1080). A low aspect ratio denotes a rectangle whose visual impression is very close to
a square, so the user is in effect operating on an approximate square. A medium aspect
ratio lies between those of the standard and the widescreen display; the rectangle is
neither square-like nor too slender, and is still acceptable to the user. A high aspect
ratio exceeds that of the widescreen display; the rectangle is obviously not square-like,
is hard to operate on, and makes it hard to compare the areas of two rectangles. We should
therefore minimize the percentage of high aspect ratios as far as possible. In short, an
algorithm that improves the aspect ratio should increase the share of low aspect ratios and
reduce the share of high aspect ratios.
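
  The three categories can be expressed as a small Python sketch. The thresholds 4/3 and
16/9 come from the text; the function names and the choice that a ratio exactly on a
boundary belongs to the lower category are assumptions made for illustration.

```python
from collections import Counter

LOW_LIMIT = 4 / 3     # aspect ratio of the standard display
HIGH_LIMIT = 16 / 9   # aspect ratio of the widescreen display

def category(width, height):
    """Classify one rectangle as 'low', 'medium' or 'high'."""
    ratio = max(width / height, height / width)
    if ratio <= LOW_LIMIT:       # assumption: boundary values count as low
        return "low"
    if ratio <= HIGH_LIMIT:      # assumption: boundary values count as medium
        return "medium"
    return "high"

def distribution(rects):
    """Percentage of rectangles in each category, for a list of (width, height)."""
    counts = Counter(category(w, h) for w, h in rects)
    return {name: 100.0 * counts[name] / len(rects)
            for name in ("low", "medium", "high")}
```

  For instance, a 3 by 2 rectangle has aspect ratio 1.5 and is classified as medium, while
a 2 by 1 rectangle is classified as high.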




Figure 4.9. Distribution of the aspect ratios of various squarified algorithms that combine different
ordering of data and layout of rectangles. The orderings of data are in decreasing (OD), increasing (OI),
random (OR) and best of decreasing and increasing order (OB). The layouts of rectangles are along the
shorter (OSAS), longer (MSAL), and best of shorter and longer side (MSABT).



  In general, the aspect ratio distribution of treemaps produced by the squarified
algorithms is better on datasets with more elements per level (except for the 3rd group of
modified algorithms, which order data randomly, i.e. OR+*). As can be seen in figure 4.9,
for the three groups of algorithms OD+*, OI+* and OB+*, the share of low (white) aspect
ratios increases steadily from roughly 50% to over 80% and begins to saturate as the number
of data items per level increases from 8 to 100. The reason is that more items per level
give the algorithm more flexibility in computing the layout of the rectangles. For example,
when the number of elements per level is 20, 50 or 100 (the 20×1, 50×1 and 100×1 datasets),
the majority (over 50%) of the aspect ratios fall in the low (white) category for all three
groups of algorithms, while the share of high (black) aspect ratios is small (20% or less).
Fewer items per level offer less flexibility and result in more high aspect ratios. For
example, when the number of elements per level is 8 or 10 (the 8×3 and 10×2 datasets), the
share of high (black) aspect ratios is nearly 20% or more, while the share of low (white)
aspect ratios is less than 55% for these three groups of algorithms.


  As figure 4.9 shows, the modified squarified algorithms that lay out along the longer
side (*+MSAL) have a larger share of high (black) aspect ratios and a smaller share of low
(white) aspect ratios than the algorithms that lay out along the shorter side (*+OSAS) on
all datasets. In contrast, the modified squarified algorithms that lay out along the best
of the longer and shorter side with thresholds (*+MSABT) have a smaller share of high
(black) aspect ratios, which display poorly, and a larger share of low (white) aspect
ratios, which have the best visual effect, than the algorithms that lay out along the
shorter side (*+OSAS) on all datasets with varying numbers of items per level. This is
consistent with our observation that MSABT produces relatively low aspect ratios even in
cases where OSAS produces a treemap with high aspect ratios and poor visibility. MSABT thus
improves not only the average aspect ratio but also the overall aspect ratio distribution
compared with OSAS. With the introduction of the thresholds and a careful choice of their
values, it keeps the majority of the aspect ratios within the low (white) category.




  Compared with the 1st group of algorithms, which order data in decreasing order (OD+*),
the corresponding variants of the 2nd group, which order data elements in increasing order
(OI+*), have a larger share of high (black) aspect ratios, which display poorly, and a
similar or smaller share of low (white) aspect ratios on all datasets with varying numbers
of items per level. Moreover, compared with the original squarified algorithm (OD+OSAS),
all algorithms of the 2nd group (OI+*) have a larger share of high (black) aspect ratios
and a smaller share of low (white) aspect ratios. Thus, ordering in increasing order
improves neither the average aspect ratio nor the overall aspect ratio distribution
relative to the original squarified algorithm.


  The aspect ratios of the 3rd group of modified algorithms, which order data elements
randomly (OR+*), are much more concentrated in the high (black) category and much less in
the low (white) category on all datasets than those obtained by ordering data in either
decreasing or increasing order. Ordering data elements randomly is therefore much worse
than ordering them in increasing or decreasing order, in terms of both the average aspect
ratio and the overall aspect ratio distribution.


  Compared with the corresponding variants that order data in decreasing order (OD+*), the
last (4th) group of modified algorithms, which order data elements in the best of
decreasing or increasing order (OB+*), have a slightly smaller share of high (black) aspect
ratios and a slightly larger share of low (white) aspect ratios on all datasets with
varying numbers of items per level.


  Moreover, two variants of the 4th group (OB+OSAS, OB+MSABT) both have a smaller share of
high (black) aspect ratios than the original squarified algorithm (OD+OSAS), and a slightly
larger share of low (white) aspect ratios, on all datasets with varying numbers of items
per level. Thus, these two new variants improve not only the average aspect ratio but also
the overall aspect ratio distribution relative to the original squarified algorithm.


  We can conclude that ordering data elements in the best of decreasing and increasing
order (OB) is better than ordering them in decreasing, increasing or random order (OD, OI,
OR), in terms of both the average aspect ratio and the overall aspect ratio distribution.


4.4. Performance time comparison
  We implemented the algorithms described above and measured their average run-time
performance. As before, we used 100 tests (each consisting of 100 steps) to measure the
average time on various scales of dataset. We consider only the time difference between the
variants of the squarified algorithm, excluding the time for rendering the 2D graphics and
other common overhead, such as generating random data and collecting the statistics on the
percentage of better aspect ratios and on the distribution, which would otherwise take much
more time. The algorithms were implemented in Java 1.4.2 and run on a 1.80 GHz AMD Sempron
computer running Windows XP. We used four scales of dataset: 20×1, 50×1, 10×2 and 8×3,
containing 20, 50, 100 and 512 data elements in total respectively. Both the total number
of data items and the running time are plotted on a log scale. The results are shown in
Figure 4.10. We have included only four of the twelve algorithms (OD+OSAS, OD+MSABT,
OB+OSAS, OB+MSABT), since the other variants did not improve the aspect ratio. (In
addition, the run times of the 2nd and 3rd groups of modified algorithms, i.e. OI+* and
OR+*, are similar to those of the 1st group, i.e. OD+*, and the run times of the modified
algorithms that lay out along the longer side, i.e. *+MSAL, are similar to those of layout
along the shorter side, i.e. *+OSAS.)


  As Figure 4.10 shows, the running time increases approximately linearly with the total
number of data items in a dataset. The running times of the original algorithm (OD+OSAS) on
the four datasets are around 40, 100, 200 and 1000 microseconds respectively.

Figure 4.10. The comparison of performance time of four of the algorithms (OD+OSAS,
OD+MSABT, OB+OSAS, OB+MSABT).


  The running time of OD+MSABT is only a little more (30% ~ 40%) than that of the original
squarified algorithm. This is as expected: MSABT tries the layout along the longer side in
addition to the layout along the shorter side on a row, and thus costs more time. By
limiting the conditions under which MSABT tries the layout along the longer side, the
threshold of aspect ratio and the threshold of lookahead improve the aspect ratio and, as a
side effect, reduce the computation time, since they eliminate unnecessary layout attempts
on some rows. If neither threshold condition is met on any row, MSABT never tries the
layout along the longer side and the running time is the same as that of the original
algorithm. If both threshold conditions are met on every row, the computation time is
approximately twice that of the original algorithm. Thus the running time of MSABT lies
between that of the original algorithm (OSAS) and twice that amount.


  The running time of OB+OSAS is approximately twice that of the original squarified
algorithm. Because it always tries both the decreasing and the increasing ordering on every
row, without any thresholds, and then selects the better ordering for the current row, it
does the equivalent of twice the work of the original algorithm. However, ordering the data
elements in the best of decreasing and increasing order does not change the time complexity
of the algorithm; it only increases the run time by approximately a factor of 2.
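
  The per-row selection can be illustrated with the following Python sketch. The precise
definition of OB is given in example 5 in section 3; the code below is only one possible
reading of the sentence above, and its details are assumptions: the data are kept sorted in
decreasing order, the decreasing candidate row is grown from the largest remaining items,
the increasing candidate row is grown from the smallest remaining items, and the row with
the lower worst aspect ratio is kept. The names grow_row and squarify_ob are hypothetical.

```python
import math

def worst(row, w):
    """Worst aspect ratio of a row of areas laid along a side of length w."""
    if not row:
        return math.inf  # assumption: an empty row always loses the comparison
    s = sum(row)
    return max(max(w * w * a / (s * s), s * s / (w * w * a)) for a in row)

def grow_row(candidates, w):
    """Take items from the front while the worst ratio does not get worse."""
    row = []
    for a in candidates:
        if worst(row, w) < worst(row + [a], w):
            break
        row.append(a)
    return row

def squarify_ob(areas, width, height):
    """Hypothetical OB+OSAS: on every row keep the better of the two orderings."""
    rects = []
    children = sorted(areas, reverse=True)
    while children:
        w = min(width, height)                 # OSAS: lay out along the shorter side
        row_d = grow_row(children, w)          # decreasing: largest remaining first
        row_i = grow_row(children[::-1], w)    # increasing: smallest remaining first
        if worst(row_i, w) < worst(row_d, w):
            row = row_i
            children = children[:len(children) - len(row)]
        else:
            row = row_d
            children = children[len(row):]
        thickness = sum(row) / w
        rects += [(thickness, a / thickness) for a in row]
        if width >= height:
            width -= thickness
        else:
            height -= thickness
    return rects
```

  Both candidate rows are built on every row, which corresponds to the doubling of the run
time; the loop structure, and hence the time complexity, is that of the original algorithm.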


  The running time of OB+MSABT is around 2 to 3 times that of the original squarified
algorithm. It can be viewed as a little more (30% ~ 40%) than that of OB+OSAS, in the same
way that the run time of OD+MSABT is a little more (30% ~ 40%) than that of OD+OSAS. It can
also be viewed as approximately twice that of OD+MSABT, since it tries both orderings and
does the equivalent of twice the work of MSABT. Considering that it achieves the largest
improvement of the aspect ratio (almost 50% more improvement than OD+MSABT), twice the run
time of OD+MSABT is still efficient and worth the user's time. Besides, even for the
largest dataset with 512 elements (8×3), the difference is within a few milliseconds, which
is only a small fraction of the time for graphics rendering and other common overhead such
as data generation and data access.
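
  As a rough consistency check of these factors (an illustration built only from the
approximate percentages quoted above, not an additional measurement), write T for the
running time of the original algorithm OD+OSAS:

```latex
\begin{align*}
T_{\mathrm{OD+MSABT}} &\approx (1.3 \text{ to } 1.4)\,T \\
T_{\mathrm{OB+OSAS}}  &\approx 2\,T \\
T_{\mathrm{OB+MSABT}} &\approx 2 \times (1.3 \text{ to } 1.4)\,T = (2.6 \text{ to } 2.8)\,T
\end{align*}
```

  The last line falls within the range of 2 to 3 times T reported for OB+MSABT.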


5. Conclusion
  The treemap is a 2D space-filling approach to representing hierarchical structure. The
first treemap method (slice-and-dice) has the problem of producing rectangles of high
aspect ratio. The squarified algorithm is an extension of the treemap method that focuses
on lowering the aspect ratio. The pivot treemap and the strip treemap try to balance aspect
ratio, order and dynamic stability. This project aims at a smaller aspect ratio, and to
this end proposes several modified algorithms based on the squarified algorithm. The
problem of subdividing a rectangle so as to achieve the minimum overall aspect ratio of the
treemap is NP-hard [2], i.e. no polynomial-time algorithm for finding the optimal solution
is known. It is therefore not practical to search the problem space exhaustively for the
optimal solution. What we are looking for is a heuristic method that achieves a
near-optimal solution in a reasonable time.


  In this project, two proposals were analyzed: changing the side along which the
subrectangles are laid out, and changing the ordering of the data elements. A dozen
variants of the squarified algorithm were examined with the aim of improving the aspect
ratio.

  The modified algorithm that lays out along the longer side (MSAL) is found to be similar
to the strip algorithm; that is, it does not improve the aspect ratio over the original
squarified algorithm. The modified algorithm that lays out along the best of the shorter
and longer side (MSAB) produces a significant percentage of treemaps with an improved
aspect ratio over the original squarified algorithm (OSAS), although on average the aspect
ratio is still not improved. To improve on this idea, we analyzed the properties of this
modified algorithm and introduced the threshold of aspect ratio and the threshold of
lookahead to remedy its shortcomings. The further modified squarified algorithm (MSABT)
lays out along the best of the longer and shorter side with thresholds: it selects the
better of the two layouts when the aspect ratio of the initial rectangle for the current
row and the number of elements left meet the threshold conditions. Tests show that,
compared with the original squarified algorithm, the modified squarified algorithm with
thresholds significantly improves the average aspect ratio (by more than 10% on all
datasets) while spending a little more time. Our results also show that using both
thresholds achieves a larger improvement than using either threshold alone.


  Ordering the data elements in increasing order (OI+*) does not improve the aspect ratio
of the treemaps over ordering data in decreasing order, and random ordering of the data
elements (OR+*) is much worse than ordering data in either decreasing or increasing order.
However, ordering data in the best of decreasing and increasing order (OB+*) further
improves the aspect ratio compared with the 1st group of algorithms that order data in
decreasing order (OD+*). Specifically, two new modified algorithms (OB+OSAS, OB+MSABT) both
improve the aspect ratio compared with the original algorithm. The improvement by OB+OSAS
is quite significant (3% ~ 10%), almost half of that by OD+MSABT. The percentage of cases
with an improved aspect ratio is also large (50% ~ 97%). The improvement by OB+MSABT is the
largest (16% ~ 18%) among all the variants we tested, significantly larger than the
improvement (around 10%) by OD+MSABT. The percentage of cases with an improved aspect ratio
is also the largest (80% ~ 100%). The improvement by OB+MSABT can be viewed as the
superposition of the improvements by OB+OSAS and by OD+MSABT. The idea of using the best of
ordering the data elements in increasing and decreasing order exclusively (ExOB+*) was
dropped, since its improvement of the aspect ratio is much smaller than that of ordering
the data elements in the best of increasing and decreasing order nonexclusively (OB+*) at a
similar running time.


  Our contribution is to propose various ideas for modifying the original squarified
algorithm to improve the aspect ratio, and to compare the variants of the modified
algorithms thoroughly with the original algorithm through analysis and statistical
experiments. We further improved the modified squarified algorithm that lays out along the
best of the shorter and longer side by introducing two thresholds, of aspect ratio and of
lookahead. This yields a relatively large improvement of the aspect ratio at the cost of a
relatively small amount of time. We also achieved a relatively significant improvement of
the aspect ratio by ordering data elements in the best of decreasing and increasing order.
Finally, we obtained the best modified squarified algorithm (OB+MSABT) by combining
ordering data in the best of decreasing and increasing order with layout along the best of
the shorter and longer side with thresholds. Both methods (best ordering, best side) can be
regarded as heuristics that indicate which ordering of data elements and which layout of
rectangles to proceed with. To facilitate our comparison and experiments, we also devised a
classification of all twelve variants of the squarified algorithm by layout side and data
ordering.


  Further improvement on top of OB+MSABT (ordering data elements in the best of decreasing
and increasing order, with layout along the best of the shorter and longer side with
thresholds) would not be time-efficient. Searching for the best among random orderings of
the data elements might improve the aspect ratio. However, the average aspect ratio of
OB+MSABT is already very close to 1, and choosing the better of decreasing and increasing
order is the most efficient way to find a good ordering in a reasonably short running time.
Further adjustment of the ordering of the data elements is therefore not very promising.
Nor would it have much effect to use OB+MSABT in conjunction with the original squarified
algorithm (OD+OSAS), since the percentage of cases in which OB+MSABT improves the aspect
ratio is already very near 100%.


  Future work may include implementing treemap application software with practical
features, such as representing multi-level directories and files to improve the utilization
and management of hard drives, and building a friendlier user interface for experiments to
make studying treemap algorithms more convenient. Future work may also include a comparison
with other treemap algorithms, such as the ordered treemap, pivot treemap and strip treemap
[1,6], on the metrics of aspect ratio, order and dynamic stability.


  The treemap has great application potential. Depending on the characteristics of
practical problems and on user needs, different improvements of treemaps can be tailored to
real-life problems. Quantum treemaps and PhotoMesa [1] are good examples of new directions
in specialized treemap algorithms and their practical application.




Appendix
Algorithms.
A. The original squarified algorithm (OSAS) in section 2.2
  The original squarified algorithm was given in [2]. We include it here for easier reference,
where the list notation (++ is concatenation of lists, [x] is the list containing element x, and
[] is the empty list), the functions worst(), layoutrow() and width() etc. are as described in
[2].


          procedure squarify(list of real children, list of real row, real w)
          begin
                real c = head(children);
                if worst(row, w) >= worst(row++[c], w) then
                        squarify(tail(children), row++[c], w);
                else
                        layoutrow(row);
                        squarify(children, [], width());
                fi
          end


  In order to accommodate the modification (MSAB) in appendix C, we have implemented an
equivalent representation of the original squarified algorithm by replacing the recursive
call of squarify() in the first if branch with a while loop:


          procedure squarify(list of real children, list of real row, real w)
          begin
                real c = head(children);
                //extend the current row while the worst aspect ratio does not get worse
                while worst(row, w) >= worst(row++[c], w) do
                        children = tail(children);
                        row = row++[c];
                        c = head(children);
                od

                layoutrow(row);
                squarify(children, [], width());
          end
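
  For readers who want to run the loop form, the following is a minimal Python sketch of
it. It is not the project's Java code and it makes a few assumptions that the pseudocode
leaves implicit: worst() of an empty row is taken to be infinite, the procedure stops when
no children are left, the areas sum to the area of the rectangle, and each rectangle is
returned as a tuple (x, y, width, height).

```python
import math

def worst(row, w):
    """Worst aspect ratio of a row of areas laid along a side of length w."""
    if not row:
        return math.inf  # assumption: an empty row always loses the comparison
    s = sum(row)
    return max(max(w * w * a / (s * s), s * s / (w * w * a)) for a in row)

def squarify(areas, x, y, width, height):
    """Lay out areas (in the given order) along the shorter side of the free area."""
    rects = []
    children = list(areas)
    while children:                        # plays the role of the tail recursion
        w = min(width, height)             # width() in the pseudocode
        row = []
        while children and worst(row, w) >= worst(row + [children[0]], w):
            row.append(children.pop(0))
        thickness = sum(row) / w           # layoutrow(row)
        offset = 0.0
        for a in row:
            side = a / thickness
            if width >= height:            # vertical strip on the left
                rects.append((x, y + offset, thickness, side))
            else:                          # horizontal strip on the top
                rects.append((x + offset, y, side, thickness))
            offset += side
        if width >= height:
            x, width = x + thickness, width - thickness
        else:
            y, height = y + thickness, height - thickness
    return rects
```

  For the data set of Figure 1.1 sorted in decreasing order, squarify([6, 6, 4, 3, 2, 2, 1],
0, 0, 6, 4) places the two items of area 6 in the first row, each 3 wide and 2 high.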




B. The modified algorithm (MSAL) in example 1 in section 3.
 The original algorithm can be modified as described in example 1 in section 3. The function width() in the
original algorithm gives the length of the shorter side of the remaining subrectangle in
which the current row is placed. Here we replace it by another function length() that gives
the length of the longer side of the remaining subrectangle in which the current row is
placed. We also need to replace the function layoutrow() that adds a new row of children to
the rectangle along the shorter side by another function layoutrowL() that adds a new row
of children to the rectangle along the longer side.


          procedure squarifyL(list of real children, list of real row, real l )
          begin
                real c = head(children);
                if worst(row, l ) >= worst(row++[c], l ) then
                        squarifyL(tail(children), row++[c], l );
                else
                        layoutrowL(row);
                        squarifyL(children, [], length());
                fi
          end

 Similarly, to make it easier to accommodate the modification (MSAB) in example 2 in
section 3, we also show here an equivalent representation of this modified squarified
algorithm, obtained by replacing the recursive call of squarifyL() in the first if branch
with a while loop:


          procedure squarifyL(list of real children, list of real row, real l )
          begin
                real c = head(children);
                //extend the current row while the worst aspect ratio does not get worse
                while worst(row, l ) >= worst(row++[c], l ) do
                        children = tail(children);
                        row = row++[c];
                        c = head(children);
                od

                layoutrowL(row);
                squarifyL(children, [], length());
          end
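
  A corresponding Python sketch of MSAL differs from the shorter-side version only in the
side that is used. As before, this is an illustration under assumptions rather than the
project's implementation: worst() of an empty row is infinite, the loop stops when no
children are left, and each rectangle is returned as a pair (row thickness, length along
the row). The name squarify_longer is hypothetical.

```python
import math

def worst(row, l):
    """Worst aspect ratio of a row of areas laid along a side of length l."""
    if not row:
        return math.inf  # assumption: an empty row always loses the comparison
    s = sum(row)
    return max(max(l * l * a / (s * s), s * s / (l * l * a)) for a in row)

def squarify_longer(areas, width, height):
    """Lay out areas (in the given order) along the longer side of the free area."""
    rects, children = [], list(areas)
    while children:
        l = max(width, height)             # length() in the pseudocode
        row = []
        while children and worst(row, l) >= worst(row + [children[0]], l):
            row.append(children.pop(0))
        thickness = sum(row) / l           # layoutrowL(row)
        rects += [(thickness, a / thickness) for a in row]
        if width >= height:
            height -= thickness            # the strip spans the (longer) width
        else:
            width -= thickness             # the strip spans the (longer) height
    return rects
```

  On the data set of Figure 1.1 (areas 6, 6, 4, 3, 2, 2, 1 in a 6 by 4 rectangle) this
sketch leaves the last item of area 1 as a strip of aspect ratio 36, which illustrates why
laying out along the longer side alone tends to produce slender rectangles.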




C. The modified algorithm (MSAB) in example 2 in section 3.
 We give the details of the modified algorithm described in example 2 in section 3. In
every subdivision [2] of the layout of the rectangles, we try two layout schemes and adopt
the better of the two. The algorithm proceeds as follows:


        procedure squarifyBest(list of real children, real w, real l )
        begin
              list of real childrenL = clone(children);     //duplicate list of children
              list of real rowW = [], rowL = [];

               //try layout along shorter side in current subdivision
               real c = head(children);
               while worst(rowW, w) >= worst(rowW++[c], w) do
                        children = tail(children);
                        rowW = rowW++[c];
                        c = head(children);
               od

               //try layout along longer side in current subdivision
               c = head(childrenL);
               while worst(rowL, l ) >= worst(rowL++[c], l ) do
                        childrenL = tail(childrenL);
                        rowL = rowL++[c];
                        c = head(childrenL);
               od

               //select best side of layout to actually layout the row in current subdivision
               if worst(rowW, w) >= worst(rowL, l ) then
                        //actually layout along longer side in current subdivision
                        layoutrowL(rowL);
                        children = childrenL;
               else
                        //actually layout along shorter side in current subdivision
                        layoutrowW(rowW);
               fi

               //recursion on the next subdivision
               squarifyBest(children, width(), length());
        end

where the functions width() and length() give the lengths of the shorter and the longer
side, respectively, of the remaining subrectangle in which the current row is placed. The
functions layoutrowW() and layoutrowL() add a new row of children to the rectangle along
the shorter and the longer side, respectively.
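
 The procedure above (without thresholds) can be sketched in Python as follows. The sketch
is an illustration under assumptions, not the project's implementation: worst() of an empty
row is infinite, the loop stops when no children are left, ties are resolved in favour of
the longer side as in the if condition of the pseudocode, and each rectangle is returned as
a pair (row thickness, length along the row). The names grow_row and squarify_best are
hypothetical.

```python
import math

def worst(row, side):
    """Worst aspect ratio of a row of areas laid along a side of the given length."""
    if not row:
        return math.inf  # assumption: an empty row always loses the comparison
    s = sum(row)
    return max(max(side * side * a / (s * s), s * s / (side * side * a)) for a in row)

def grow_row(children, side):
    """Trial layout: extend the row while its worst aspect ratio does not get worse."""
    row = []
    for a in children:
        if worst(row, side) < worst(row + [a], side):
            break
        row.append(a)
    return row

def squarify_best(areas, width, height):
    """Per row, try the shorter and the longer side and keep the better row (MSAB)."""
    rects, children = [], list(areas)
    while children:
        w, l = min(width, height), max(width, height)
        row_w = grow_row(children, w)          # trial along the shorter side
        row_l = grow_row(children, l)          # trial along the longer side
        use_longer = worst(row_w, w) >= worst(row_l, l)
        row, side = (row_l, l) if use_longer else (row_w, w)
        children = children[len(row):]         # both trials started from the same state
        thickness = sum(row) / side
        rects += [(thickness, a / thickness) for a in row]
        # shrink the dimension perpendicular to the side that the row spans
        spans_width = (width >= height) if use_longer else (width < height)
        if spans_width:
            height -= thickness
        else:
            width -= thickness
    return rects
```

 Both trial rows are grown from the same list of remaining children, and only the selected
row is removed from it, so the next row again starts from a common initial state.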


Implementation details
 As illustrated in Figure 3.3, we apply the algorithm to a simple input: a one-level tree
containing 8 items. The left of each row shows the layout steps that place rectangles along
the shorter side of the current row (this 1st layout scheme is shown in blue). The right of
each row shows the layout steps that place rectangles along the longer side (this 2nd layout
scheme is shown in red, to distinguish it from the 1st layout scheme).


  It must be emphasized that in Figure 3.3 the 2 layout schemes start every row from a
common initial state. On every row the algorithm tries both layout schemes step by step
before moving on to the next row. The 2 layout schemes each produce a result for the row,
and the better one is selected as the common initial state of the next row for both layout
schemes. Therefore the 2 layout schemes in Figure 3.3 are not 2 completely independent
layouts (such as 2 of the monolithic multi-row blocks of Figure 3.1) simply put side by
side. Rather, the 2 layout schemes depend on each other at every row.


 A few more minor details of Figure 3.3 are explained below. In the 1st row we only need to
try 1 layout scheme (which saves unnecessary computation time), since the 2 sides of the
initial rectangle (a square) are equal.


 Another small point about Figure 3.3: in the 3rd row, the 2nd layout scheme completes the
layout of all the remaining elements of the input. Moreover, the last rectangle added in
the 2nd layout scheme actually lowers the worst aspect ratio of all rectangles added on this
row. After the layout of this last rectangle, the 2nd layout scheme has a lower worst aspect
ratio than the 1st layout scheme. Therefore the algorithm selects the result of the 2nd
layout scheme on the 3rd row, and the process ends there.




D. The algorithm (OI+OSAS) in example 3 in section 3.
 The algorithm described in example 3 in section 3 sorts the list of data elements
(children) in increasing order before applying the same recurrence of the original
squarified algorithm squarify() as presented in appendix A. Here the function sort(list of
real children, real dataordering) sorts the list of children in the specified dataordering
(‘increasing’ here).


          procedure squarifyOI(list of real children)
          begin
                //ordering data elements in increasing order
                sort(children, increasing);

                 //apply recurrence of the original squarified algorithm
                 squarify(children, [], width());
          end


E. The algorithm (OR+OSAS) in example 4 in section 3.
 The algorithm described in example 4 in section 3 scrambles the list of data elements
(children) into random order (or does nothing if the initial ordering is already random)
before applying the same recurrence of the original squarified algorithm squarify() as
presented in appendix A.


          procedure squarifyOR(list of real children)
          begin
                //ordering data elements in random order
                scramble(children);   //skipped if the initial ordering is already random

                 // apply recurrence of the original squarified algorithm
                 squarify(children, [], width());
          end




F. The modified algorithm (OB+OSAS) in example 5 in section 3.
 We give the details of the modified algorithm described in example 5 in section 3. In every
subdivision [2] of processing the data elements, we try two ordering schemes and adopt the
better of the decreasing and increasing orderings. The algorithm proceeds as follows:


        procedure squarifyBestOrdering(list of real childrenD, real w)
        begin
               //stop the recurrence once every child has been laid out
               if empty(childrenD) then return; fi

               list of real childrenI = cloneReverse(childrenD); //increasing order children
               list of real rowD = [], rowI = [];

               //try processing in decreasing order the items in current subdivision
               while not empty(childrenD) && worst(rowD, w) >= worst(rowD++[head(childrenD)], w) do
                        rowD = rowD++[head(childrenD)];
                        childrenD = tail(childrenD);
               od

               //try processing in increasing order the items in current subdivision
               while not empty(childrenI) && worst(rowI, w) >= worst(rowI++[head(childrenI)], w) do
                        rowI = rowI++[head(childrenI)];
                        childrenI = tail(childrenI);
               od

               //select best ordering of data to actually process in current subdivision
               if worst(rowD, w) >= worst(rowI, w) then
                        //actually process in increasing order in current subdivision
                        layoutrow(rowI);
                        childrenD = cloneReverse(childrenI);
               else
                        //actually process in decreasing order in current subdivision
                        layoutrow(rowD);
               fi

               //recurrence of next subdivision
               squarifyBestOrdering(childrenD, width());
        end


where the function cloneReverse() duplicates the list of children in reverse order, i.e. from
decreasing to increasing order or vice versa. The functions width() and layoutrow() are as
described in [2].
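
A minimal Python sketch of the ordering choice in one subdivision is given below. It assumes
the worst() formula from [2] with an infinite value for an empty row, and it returns the
chosen row rather than calling layoutrow(). The names greedy_row() and best_ordering_row()
are hypothetical.

```python
import math

def worst(row, side):
    """Highest aspect ratio among the areas in `row` laid in one strip along `side` [2]."""
    if not row:
        return math.inf  # assumption: an empty row is always worth extending
    s = sum(row)
    return max(side * side * max(row) / (s * s), s * s / (side * side * min(row)))

def greedy_row(children, side):
    """Extend the row while the worst aspect ratio does not get worse."""
    row, rest = [], list(children)
    while rest and worst(row, side) >= worst(row + [rest[0]], side):
        row.append(rest.pop(0))
    return row, rest

def best_ordering_row(children_desc, w):
    """`children_desc` is in decreasing order; the remainder is returned in that order too."""
    row_d, rest_d = greedy_row(children_desc, w)
    row_i, rest_i = greedy_row(children_desc[::-1], w)  # plays the role of cloneReverse()
    if worst(row_d, w) >= worst(row_i, w):              # same tie rule as the pseudocode
        return "increasing", row_i, rest_i[::-1]
    return "decreasing", row_d, rest_d

# Areas 6, 1, 1 in a region whose shorter side is 2.
print(best_ordering_row([6, 1, 1], 2))
```

Reversing the remainder after an increasing row keeps the list in decreasing order for the
next subdivision, as the assignment childrenD = cloneReverse(childrenI) does in the pseudocode.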


G. The modified squarified algorithm (MSABT) in section 4.1.1.
 The modified squarified algorithm that lays out along the better of the longer and shorter
sides, with thresholds, as described in section 4.1.1, is given below. If the aspect ratio
of the work region is greater than the threshold AR, or if the number of elements left is
less than the threshold N, it skips trying the layout along the longer side and simply lays
out along the shorter side.

       procedure squarifyBest(list of real children, real w, real l )
       begin
               //stop the recurrence once every child has been laid out
               if empty(children) then return; fi

               list of real childrenL = clone(children);     //duplicate list of children
               list of real rowW = [], rowL = [];

               //try layout along shorter side in current subdivision
               while not empty(children) && worst(rowW, w) >= worst(rowW++[head(children)], w) do
                        rowW = rowW++[head(children)];
                        children = tail(children);
               od

               //judge the threshold condition whether to skip trying layout along longer side
               if aspectRatio() < AR && number(childrenL) > N then
                       //try layout along longer side in current subdivision
                       while not empty(childrenL) && worst(rowL, l ) >= worst(rowL++[head(childrenL)], l ) do
                                rowL = rowL++[head(childrenL)];
                                childrenL = tail(childrenL);
                       od

                       //select best side of layout to actually layout the row in current subdivision
                       if worst(rowW, w) >= worst(rowL, l ) then
                                //actually layout along longer side in current subdivision
                                layoutrowL(rowL);
                                children = childrenL;
                       else
                                //actually layout along shorter side in current subdivision
                                layoutrowW(rowW);
                       fi
               else
                       //actually layout along shorter side in current subdivision
                       layoutrowW(rowW);
               fi

               //recurrence of next subdivision
               squarifyBest(children, width(), length());
       end
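
The threshold test itself can be sketched in Python as follows. The sketch assumes that
aspectRatio() is the longer side of the work region divided by its shorter side, which this
appendix does not spell out. The function name try_longer_side() and the sample threshold
values are hypothetical.

```python
def try_longer_side(w, l, remaining, ar_threshold, n_threshold):
    """Mirror of the condition `aspectRatio() < AR && number(childrenL) > N`."""
    aspect_ratio = max(w, l) / min(w, l)  # assumption: longer side over shorter side
    return aspect_ratio < ar_threshold and remaining > n_threshold

# Hypothetical thresholds AR = 2.0 and N = 3.
print(try_longer_side(3, 4, 5, 2.0, 3))   # nearly square region, enough elements left
print(try_longer_side(1, 4, 5, 2.0, 3))   # elongated region: longer-side trial is skipped
```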
H. The first 3 groups of algorithms (OD+*, OI+*, OR+*) in section 4.2.
 The first 3 groups of variants of the squarified algorithm, which order the data in
decreasing, increasing or random order and lay out along various sides as described in
section 4.2, can be presented in the following unified form. The constants ‘short’, ‘long’
and ‘best’ (defined for the variable ‘layoutside’) select the side along which the
subrectangles are laid out, and the constants ‘decreasing’, ‘increasing’ and ‘random’
(defined for the variable ‘dataordering’) select the order of the data elements. The
functions squarify(), squarifyL() and squarifyBest() are as presented in appendices A, B
and C. The functions sort() and scramble() are as presented in appendices D and E.


         procedure squarifyVariants(list of real children, real layoutside, real dataordering)
         begin
               if dataordering == decreasing then
                        sort(children, decreasing);
               else if dataordering == increasing then
                        sort(children, increasing);
               else // dataordering == random
                        //scramble, or do nothing if the initial ordering is already random
                        scramble(children);
               fi

                 if layoutside == short then
                          squarify(children, [], width());
                 else if layoutside == long then
                          squarifyL(children, [], length());
                 else if layoutside == best then
                          squarifyBest(children, width(), length());
                 fi
         end
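
The unified form can be sketched in Python as shown below. This is a sketch under stated
assumptions rather than the code used in the experiments: areas are positive and sum to the
area of the region, worst() follows [2], only the side lengths of the remaining subrectangle
are tracked (nothing is drawn), and the rows are returned instead of being passed to the
layout functions. The seed parameter and the helper names are hypothetical.

```python
import math
import random

def worst(row, side):
    """Highest aspect ratio among the areas in `row` laid in one strip along `side` [2]."""
    if not row:
        return math.inf  # assumption: an empty row is always worth extending
    s = sum(row)
    return max(side * side * max(row) / (s * s), s * s / (side * side * min(row)))

def greedy_row(children, side):
    """Extend the row while the worst aspect ratio does not get worse."""
    row, rest = [], list(children)
    while rest and worst(row, side) >= worst(row + [rest[0]], side):
        row.append(rest.pop(0))
    return row, rest

def squarify_variants(children, width, height, layoutside, dataordering, seed=0):
    """Return the rows as (side name, areas) pairs, in the order they are laid out."""
    items = list(children)
    if dataordering == "decreasing":
        items.sort(reverse=True)
    elif dataordering == "increasing":
        items.sort()
    else:  # "random"
        random.Random(seed).shuffle(items)
    w, l = sorted((width, height))  # shorter and longer side of the work region
    rows = []
    while items:
        trials = []
        if layoutside in ("short", "best"):
            trials.append(("short", w) + greedy_row(items, w))
        if layoutside in ("long", "best"):
            trials.append(("long", l) + greedy_row(items, l))
        # min() keeps the first minimum, so a tie goes to the longer side as in appendix C
        name, side, row, items = min(reversed(trials), key=lambda t: worst(t[2], t[1]))
        rows.append((name, row))
        # the row takes a strip of thickness sum(row) / side off the other side
        other = (l if name == "short" else w) - sum(row) / side
        w, l = sorted((side, other))
    return rows

for name, row in squarify_variants([6, 6, 4, 3, 2, 2, 1], 6, 4, "short", "decreasing"):
    print(name, row)
```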




I. The modified algorithm (OB+MSAL) in section 4.2.
 We give the details of the modified algorithm (OB+MSAL) described in section 4.2. In every
subdivision [2] of processing the data elements, we try two ordering schemes and adopt the
better of the decreasing and increasing orderings. The difference from OB+OSAS in example
5 in section 3 and appendix F is that the rectangles are laid out along the longer side. The
algorithm proceeds as follows:


       procedure squarifyLBestOrdering(list of real childrenD, real l )
       begin
               //stop the recurrence once every child has been laid out
               if empty(childrenD) then return; fi

               list of real childrenI = cloneReverse(childrenD); //increasing order children
               list of real rowD = [], rowI = [];

               //try processing the items in decreasing order in current subdivision
               while not empty(childrenD) && worst(rowD, l ) >= worst(rowD++[head(childrenD)], l ) do
                        rowD = rowD++[head(childrenD)];
                        childrenD = tail(childrenD);
               od

               //try processing the items in increasing order in current subdivision
               while not empty(childrenI) && worst(rowI, l ) >= worst(rowI++[head(childrenI)], l ) do
                        rowI = rowI++[head(childrenI)];
                        childrenI = tail(childrenI);
               od

               //select best ordering of data to actually process in current subdivision
               if worst(rowD, l ) >= worst(rowI, l ) then
                        //actually process in increasing order in current subdivision
                        layoutrowL(rowI);
                        childrenD = cloneReverse(childrenI);
               else
                        //actually process in decreasing order in current subdivision
                        layoutrowL(rowD);
               fi

               //recurrence of next subdivision
               squarifyLBestOrdering(childrenD, length());
       end




where the function cloneReverse() duplicates the list of children in reverse order, i.e. from
decreasing to increasing order or vice versa. The function length() gives the length of the
longer side of the remaining subrectangle in which the current row is placed, and
layoutrowL() adds a new row of children to the subdivision along the longer side, as
described in appendix B.


J. The modified algorithm (OB+MSABT) in section 4.2.
 We give the details of the modified algorithm (OB+MSABT) described in section 4.2. In
every subdivision [2] of processing the data elements, we try four combinations of the two
ordering schemes (ordering the data in decreasing and increasing order) and the two layout
side schemes (laying out the rectangles along the shorter and longer side), and adopt the
best combination of ordering and layout side. As in appendix G, the two trials along the
longer side are skipped when the threshold condition on AR and N is not met. The algorithm
proceeds as follows. The function cloneReverse() duplicates the list of children in reverse
order, i.e. from decreasing to increasing order or vice versa. The functions width(),
length(), layoutrowW() and layoutrowL() are as described in appendix C.




procedure squarifyBestSideBestOrdering(list of real childrenWD, real w, real l )
begin
       //stop the recurrence once every child has been laid out
       if empty(childrenWD) then return; fi

       list of real childrenWI = cloneReverse(childrenWD); //increasing order children
       list of real childrenLD = clone(childrenWD);        //decreasing order children
       list of real childrenLI = cloneReverse(childrenWD); //increasing order children
       list of real rowWD = [], rowWI = [], rowLD = [], rowLI = [];

       //try layout along shorter side and in decreasing order in current subdivision
       while not empty(childrenWD) && worst(rowWD, w) >= worst(rowWD++[head(childrenWD)], w) do
                rowWD = rowWD++[head(childrenWD)];
                childrenWD = tail(childrenWD);
       od

       //try layout along shorter side and in increasing order in current subdivision
       while not empty(childrenWI) && worst(rowWI, w) >= worst(rowWI++[head(childrenWI)], w) do
                rowWI = rowWI++[head(childrenWI)];
                childrenWI = tail(childrenWI);
       od

       //judge the threshold condition whether to skip trying layout along longer side
       if aspectRatio() < AR && number(childrenLD) > N then
              //try layout along longer side and in decreasing order in current subdivision
              while not empty(childrenLD) && worst(rowLD, l ) >= worst(rowLD++[head(childrenLD)], l ) do
                       rowLD = rowLD++[head(childrenLD)];
                       childrenLD = tail(childrenLD);
              od

              //try layout along longer side and in increasing order in current subdivision
              while not empty(childrenLI) && worst(rowLI, l ) >= worst(rowLI++[head(childrenLI)], l ) do
                       rowLI = rowLI++[head(childrenLI)];
                       childrenLI = tail(childrenLI);
              od

              //select best side of layout and best ordering of data to actually
              //layout the row in current subdivision
              if worst(rowWD, w) <= worst(rowWI, w) &&
                 worst(rowWD, w) <= worst(rowLD, l ) &&
                 worst(rowWD, w) <= worst(rowLI, l ) then
                       //actually layout along shorter side and in decreasing order
                       layoutrowW(rowWD);
              elseif
                 worst(rowWI, w) <= worst(rowWD, w) &&
                 worst(rowWI, w) <= worst(rowLD, l ) &&
                 worst(rowWI, w) <= worst(rowLI, l ) then
                       //actually layout along shorter side and in increasing order
                       layoutrowW(rowWI);
                       childrenWD = cloneReverse(childrenWI);
              elseif
                 worst(rowLD, l ) <= worst(rowWD, w) &&
                 worst(rowLD, l ) <= worst(rowWI, w) &&
                 worst(rowLD, l ) <= worst(rowLI, l ) then
                       //actually layout along longer side and in decreasing order
                       layoutrowL(rowLD);
                       childrenWD = clone(childrenLD);
              elseif
                 worst(rowLI, l ) <= worst(rowWD, w) &&
                 worst(rowLI, l ) <= worst(rowWI, w) &&
                 worst(rowLI, l ) <= worst(rowLD, l ) then
                       //actually layout along longer side and in increasing order
                       layoutrowL(rowLI);
                       childrenWD = cloneReverse(childrenLI);
              fi
       else
              //select shorter side of layout and best ordering of data to actually
              //layout the row in current subdivision
              if worst(rowWD, w) <= worst(rowWI, w) then
                       //actually layout along shorter side and in decreasing order
                       layoutrowW(rowWD);
              elseif
                 worst(rowWI, w) <= worst(rowWD, w) then
                       //actually layout along shorter side and in increasing order
                       layoutrowW(rowWI);
                       childrenWD = cloneReverse(childrenWI);
              fi
       fi

       //recurrence of next subdivision
       squarifyBestSideBestOrdering(childrenWD, width(), length());
end
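
The four-way selection can be written more compactly by collecting the trial rows and
taking the one with the smallest worst aspect ratio. The Python sketch below assumes the
worst() formula from [2], replaces the threshold test by a hypothetical boolean parameter
try_long, and returns the chosen row instead of drawing it. The helper names are
hypothetical.

```python
import math

def worst(row, side):
    """Highest aspect ratio among the areas in `row` laid in one strip along `side` [2]."""
    if not row:
        return math.inf  # assumption: an empty row is always worth extending
    s = sum(row)
    return max(side * side * max(row) / (s * s), s * s / (side * side * min(row)))

def greedy_row(children, side):
    """Extend the row while the worst aspect ratio does not get worse."""
    row, rest = [], list(children)
    while rest and worst(row, side) >= worst(row + [rest[0]], side):
        row.append(rest.pop(0))
    return row, rest

def best_side_best_ordering_row(children_desc, w, l, try_long=True):
    """`children_desc` is in decreasing order; the remainder is returned in that order too."""
    asc = children_desc[::-1]
    # candidate order WD, WI, LD, LI matches the if/elseif chain of the pseudocode
    candidates = [("short", "decreasing", w) + greedy_row(children_desc, w),
                  ("short", "increasing", w) + greedy_row(asc, w)]
    if try_long:  # stands in for the threshold condition on AR and N
        candidates += [("long", "decreasing", l) + greedy_row(children_desc, l),
                       ("long", "increasing", l) + greedy_row(asc, l)]
    # min() returns the first minimum, so ties resolve in the candidate order above
    side, ordering, _, row, rest = min(candidates, key=lambda c: worst(c[3], c[2]))
    if ordering == "increasing":
        rest = rest[::-1]  # back to decreasing order for the next subdivision
    return side, ordering, row, rest

print(best_side_best_ordering_row([4, 3, 2, 2, 1], 3, 4))
```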


K. The algorithm (ExOB+OSAS) in section 4.2.1.
 The algorithm described in section 4.2.1, which orders the data elements exclusively in
decreasing or exclusively in increasing order and keeps the better of the two results, is
as follows. The function squarifyVariants() is as presented in appendix H. The global
variable ‘finalaverageaspectratio’ stores the final average aspect ratio of the treemap
produced by a specified variant of the squarified algorithm.


         procedure squarifyExBestOrdering(list of real children, real layoutside)
         begin
               //run the algorithm variant by ordering data elements in decreasing order
               squarifyVariants(children, layoutside, decreasing);
               //record the final average aspect ratio of the produced treemap
               finalaverageaspectratioOD = finalaverageaspectratio;

               //run the algorithm variant by ordering data elements in increasing order
               squarifyVariants(children, layoutside, increasing);
               //record the final average aspect ratio of the produced treemap
               finalaverageaspectratioOI = finalaverageaspectratio;

                 // Compare the two aspect ratios and select the smaller aspect ratio as result
                 if finalaverageaspectratioOD < finalaverageaspectratioOI then
                          finalaverageaspectratio = finalaverageaspectratioOD;
                 else
                          finalaverageaspectratio = finalaverageaspectratioOI;
                 fi
         end
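
A minimal Python sketch of this comparison is given below for the layout along the shorter
side only. It assumes positive areas that sum to the area of the region and the worst()
formula from [2], and it returns the average aspect ratio instead of storing it in a global
variable. The names average_aspect_ratio() and ex_best_ordering() are hypothetical.

```python
import math

def worst(row, side):
    """Highest aspect ratio among the areas in `row` laid in one strip along `side` [2]."""
    if not row:
        return math.inf  # assumption: an empty row is always worth extending
    s = sum(row)
    return max(side * side * max(row) / (s * s), s * s / (side * side * min(row)))

def greedy_row(children, side):
    """Extend the row while the worst aspect ratio does not get worse."""
    row, rest = [], list(children)
    while rest and worst(row, side) >= worst(row + [rest[0]], side):
        row.append(rest.pop(0))
    return row, rest

def average_aspect_ratio(children, width, height):
    """Lay every row along the shorter side and average the aspect ratios."""
    items = list(children)
    w, l = sorted((width, height))
    ratios = []
    while items:
        row, items = greedy_row(items, w)
        s = sum(row)
        for area in row:
            r = area * w * w / (s * s)   # the rectangle measures area*w/s by s/w
            ratios.append(max(r, 1 / r))
        w, l = sorted((w, l - s / w))    # the row removes a strip of thickness s/w
    return sum(ratios) / len(ratios)

def ex_best_ordering(children, width, height):
    """Run both orderings on the whole data set and keep the smaller average."""
    ratio_od = average_aspect_ratio(sorted(children, reverse=True), width, height)
    ratio_oi = average_aspect_ratio(sorted(children), width, height)
    if ratio_od < ratio_oi:              # same comparison as the pseudocode
        return "decreasing", ratio_od
    return "increasing", ratio_oi

print(ex_best_ordering([6, 6, 4, 3, 2, 2, 1], 6, 4))
```

Unlike appendix F, the ordering is fixed once for the whole data set, so the two runs are
fully independent and only their final averages are compared.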




7. References
[1] Bederson, B. B. and Shneiderman, B. Ordered and quantum treemaps: making
   effective use of 2d space to display hierarchies. ACM Transactions on Graphics, 2002,
   21(4):833-854
[2] Bruls, M., Huizing, K. and Van Wijk, J. J. Squarified treemaps. In Eurographics /
   IEEE TCVG 2000 Symposium, 2000, 112-116
[3] Knuth, D. E. The Art of Computer Programming: Volume 1 / Fundamental Algorithms.
   Addison-Wesley Publishing Co., Reading, MA, 1968.
[4] Sheldon, R. A. 1997. A First Course in Probability. Englewood Cliffs, NJ: Prentice
   Hall.
[5] Shneiderman, B. Tree visualization with tree-maps: 2-d space filling approach. ACM
   Transactions on Graphics, 1992, 11(1):92-99
[6] Shneiderman, B. and Wattenberg, M. Ordered treemap layouts. In IEEE Symposium on
   Information Visualization, 2001, 73-78
[7] Van Wijk, J. J. and Van de Wetering, H. Cushion treemaps: Visualization of
   hierarchical information. In IEEE Symposium on Information Visualization, San
   Francisco, 1999, 78-83



