Algorithm for Improving the Aspect Ratio of Treemaps
Xiaochuan Lin (MACS Project)

1. Introduction

A treemap is a new way to represent a tree-structured or large-scale hierarchical data set [1,2,5-7]. Knuth [3] was the first to suggest a space-filling approach to representing trees. Shneiderman [5] developed the first treemap, and many improvements to his method followed [1,2,3,6]. His treemap is essentially a 2-dimensional rectangular area divided into smaller rectangles that correspond to the items of the hierarchical data set, such that the areas of these rectangles are proportional to the values of the data.

Although treemaps save space compared with node-link representations, they suffer from two drawbacks. First, they lack an intuitive hierarchical structure [7]. Second, they may produce rectangles with a high aspect ratio [1,2,6]. Thin rectangles with a high aspect ratio make it difficult to see and compare the sizes of two rectangular areas, and they make it harder to select a rectangle with a location-based input device such as a mouse [1,2,6]. Therefore, finding an approach that produces smaller aspect ratios is a problem of practical significance.

Figure 1.1 shows a simple example, with three solutions given in Figures 1.2, 1.3 and 1.4, each with a different aspect ratio. We explain these solutions briefly below and present the details of the underlying algorithms later (in Section 2). The data set T = (B6, C6, D4, E3, F2, G2, H1) in this example corresponds to the 7 leaf nodes of a 1-level tree. The map for this tree is displayed on a rectangular region of width 6 and height 4, i.e., with an area of 24, which is the sum of the sizes of all 7 leaves. In the tree of Figure 1.1, every node has a name and a size, where the size of an internal node is defined as the sum of the sizes of its children. Thus, the sizes of the leaves are 6, 6, 4, 3, 2, 2, and 1.
[Fig. 1.1. A one-level tree: root A24 with leaves B6, C6, D4, E3, F2, G2, H1.]
[Fig. 1.2. First solution by slice-and-dice algorithm with horizontal slicing (h = 4; widths w = A/h = 6/4, 6/4, 4/4, 3/4, 2/4, ...).]
[Fig. 1.3. Second solution by slice-and-dice algorithm with vertical slicing (w = 6; heights h = A/w = 6/6, 6/6, 4/6, ...).]
[Fig. 1.4. Third solution by squarified algorithm (first rectangle: w = A/h = 6/2 = 3, h = 4/2 = 2).]

The first layout shows a solution based on the slice-and-dice algorithm [5], obtained by cutting the map vertically into subareas equal to the values of the leaves. This solution generates 7 long rectangular strips, leading to an average aspect ratio of 6.67. Notice that for a rectangle of size w by h, the aspect ratio is max{h/w, w/h}. Thus, the thinnest rectangle, with area 1, has the worst (i.e., largest) aspect ratio, equal to 16.

The second layout shows another solution using the slice-and-dice algorithm, where the map is divided horizontally (instead of vertically). As the figure shows, the rectangle with area 1 has the worst aspect ratio, 36, and the average aspect ratio for the whole map is 15.

The third layout shows the solution obtained with the squarified algorithm [2]. The slicing of the map in this case is done partially horizontally and partially vertically (in a manner to be explained in detail later). This leads to an average aspect ratio of 1.68, and the worst aspect ratio of any rectangle is 2.78 (corresponding to the rectangle of area 1). Comparing the aspect ratios produced by the three solutions, one concludes that the third solution gives the best result.

Notice that, by the definition given above, the aspect ratio of any rectangle (of dimension w x h) is ≥ 1, and it is 1 only when w = h, i.e., when the rectangle is a square. We define the average aspect ratio of the whole map as the average of the aspect ratios of the rectangles corresponding to the leaf nodes [1,6].
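The two slice-and-dice figures quoted above can be checked with a few lines of Python. This is a minimal sketch, not the report's own code: the helper names `aspect_ratio`, `strips` and `summarize` are hypothetical, and the flag `full_height` is used instead of "horizontal"/"vertical" so that the code does not depend on which slicing convention the captions follow.

```python
def aspect_ratio(w, h):
    """Aspect ratio as defined in the text: max(h/w, w/h), always >= 1."""
    return max(w / h, h / w)


def strips(sizes, width, height, full_height):
    """Cut a width-by-height map into one strip per size.

    full_height=True  -> every strip spans the full height (widths vary)
    full_height=False -> every strip spans the full width (heights vary)
    """
    total = sum(sizes)
    if full_height:
        return [(width * s / total, height) for s in sizes]
    return [(width, height * s / total) for s in sizes]


def summarize(rects):
    """Return (average aspect ratio, worst aspect ratio) for (w, h) pairs."""
    ratios = [aspect_ratio(w, h) for w, h in rects]
    return sum(ratios) / len(ratios), max(ratios)


sizes = [6, 6, 4, 3, 2, 2, 1]  # leaves B6 ... H1 of Figure 1.1
first = summarize(strips(sizes, 6, 4, full_height=True))
second = summarize(strips(sizes, 6, 4, full_height=False))
print(first)   # roughly (6.67, 16.0), the first solution
print(second)  # roughly (15.0, 36.0), the second solution
```

Strips of full height 4 give the first solution (average 6.67, worst 16), and strips of full width 6 give the second (average 15, worst 36).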
Thus, our problem can now be defined as follows: given a hierarchical data set corresponding to a tree, and an initial rectangle whose area equals the sum of the sizes of all leaf nodes in the tree, where the size of each leaf node equals the display area of its sub-rectangle, lay out the nodes of the tree within the initial rectangle so as to achieve the smallest overall aspect ratio for the whole treemap.

Figure 1.5 shows another example of a treemap layout generated by the slice-and-dice algorithm, obtained by converting a much larger tree whose nodes represent a hierarchical file system with thousands of files. In this case, the treemap shows the whole hierarchical file system at once. Users can also see directly which document is the largest, i.e. which part consumes the most disk space. A node-link diagram of the same hierarchy, however, wastes a lot of display space and is unsuitable for large-scale data; it would be difficult to display such a large hierarchical file system completely with a node-link diagram. When size is the most important feature to display, treemaps are a very effective tool.

Figure 1.6 shows the treemap layout obtained by applying the squarified algorithm to the same large tree-structured data set as in Figure 1.5. The layout generated by the squarified algorithm has a better aspect ratio than the layout generated by the slice-and-dice algorithm for the same data set.
The comparison of the two treemap layouts in Fig. 1.5 and Fig. 1.6 shows clearly that the squarified algorithm is better than the slice-and-dice algorithm: the rectangles in the layout are far less elongated, and the black areas with cluttered rectangles have disappeared.

[Fig. 1.5. A treemap layout generated by the slice-and-dice algorithm for a large tree-structured data set corresponding to a hierarchical file system.]
[Fig. 1.6. A treemap layout generated by the squarified algorithm for the large tree-structured data set of Fig. 1.5.]

Making the rectangles of a treemap more square-like has many advantages, such as the following [2].
1. It uses display space more efficiently. The perimeter of a square is minimal compared with that of any rectangle of the same area, so the number of pixels needed to display the border is minimal.
2. Square-like rectangles are easier to detect, locate and point at. (Clutter and aliasing errors caused by thin rectangles are minimized.)
3. It is easier to compare the sizes of rectangles with similar aspect ratios.
4. It improves the accuracy of the presentation.

2. Related work

Since the invention of the treemap idea and the slice-and-dice algorithm [5], research has focused on remedying the two deficiencies of treemaps mentioned above. Over the last decade, many treemap generation algorithms have been proposed, e.g., the cushion method [7], the squarified algorithm [2], the pivot algorithm [6], and the strip method [1].

2.1 Slice-and-dice treemap algorithm

As a simple example of a hierarchically structured data set, Figures 2.1 and 2.2 illustrate the representation of a hierarchical data set by the traditional node-link diagram and as a treemap.
In this example, the tree has four levels, starting from the root, where each leaf node has a fixed size and each internal node x has a size defined recursively as the sum of the sizes of all nodes in the subtree rooted at x. The treemap is formed as follows: the entire display space, i.e. the initial rectangle, represents the entire data set of the tree rooted at A16. Then, according to the sizes of the data elements B3, C3 and D10, the initial rectangle is cut into three rectangles corresponding to B3, C3 and D10, respectively, where the areas of these three rectangles are in the same proportions as the sizes of B3, C3 and D10. The process continues recursively, with the slicing of the rectangle area alternating from horizontal to vertical, then back to horizontal, and so forth, as illustrated in Figure 2.2. This is the simplest slice-and-dice method.

[Figure 2.1. Node and link diagram.]
[Figure 2.2. Treemap layout generated by the slice-and-dice algorithm.]

Although treemaps generated by the slice-and-dice method save space compared with the node-link representation, the aspect ratio of the resulting treemap can be very large (see Figure 1.5). Another important method for generating treemaps is the squarified method, explained below for the example of the 1-level tree with the leaf nodes (B6, C6, D4, E3, F2, G2, H1) under the root node A24.

2.2. Squarified algorithm

We illustrate this method using an example from [2], which shows how to construct a treemap for the tree given in Figure 1.1. The steps of this construction are shown in Figure 2.3. The algorithm first splits the original rectangle. It subdivides horizontally, because the initial rectangle is wider than it is high. Then the left half is filled by adding the first rectangle (Figure 2.3), whose aspect ratio is 8/3.
In step 2 a second rectangle is added on top of the first, and the aspect ratios improve to 3/2. In step 3, the third rectangle (with area 4) is added on top of the first and second rectangles; however, the aspect ratio worsens to 4/1. This means that the optimum for the left half was reached in step 2, so the algorithm starts processing the right half of the initial rectangle.

[Figure 2.3. Constructing a treemap for the tree of Figure 1.1 using the squarified method.]

Continuing from the result of step 2, the algorithm first subdivides vertically, because the remaining rectangle is now higher than it is wide. Step 4 adds the rectangle with area 4, whose aspect ratio is 9/4. Following this, step 5 adds the rectangle with area 3, and the aspect ratios improve to 49/27. In step 6, the next rectangle, with area 2, is added. However, this does not improve the aspect ratio, which is now 9/2. Therefore the result of step 5 is accepted, and the algorithm starts to fill the top right partition. The algorithm repeats these steps until it has processed all rectangles.

3. Proposed Algorithms

None of the methods previously proposed for treemap generation is optimal in all parameters. The solutions produced by the squarified algorithm have better aspect ratios, whereas the slice-and-dice algorithm produces solutions with better order and dynamic stability. The pivot and strip algorithms attempt to achieve a balance between the aspect ratio on the one hand and order and dynamic stability on the other. In short, there are two directions for further research in treemap generation: to develop a new algorithm that performs better under certain criteria, or one that achieves a reasonable balance. Finding a better balance here means achieving broader adaptability.
This project is intended to develop a new algorithm that produces solutions with better aspect ratios than those obtained by the squarified algorithm [2].

The better aspect ratio achieved by the squarified algorithm compared with the slice-and-dice algorithm (see the solutions in Figures 1.2, 1.3 and 1.4) results from two ideas: the self-adaptation of the layout based on each rectangle's aspect ratio, and the ordering of the data elements by size. In addition, the squarified algorithm makes the following two assumptions.

First, when filling in each rectangle, the squarified algorithm assumes that it is better to lay out the rectangles along the shorter side of the initial rectangle. This way, the initial rectangle of the next level is more like a square and thus more likely to lead to a good aspect ratio. However, there may be cases where filling in the rectangles along the longer side results in a smaller overall aspect ratio.

Second, before filling data elements into the initial rectangle, the algorithm assumes that it is better to sort them in decreasing order and to fill in the largest data element first. On the surface, this appears to be a natural choice: the closer the algorithm is to its end, the smaller the initial rectangle that remains to be filled. But this choice may not always produce good results.

Therefore, in this project I propose to develop some modified versions of the squarified algorithm and to carry out the necessary experiments to show that these algorithms may produce treemaps with better aspect ratios. Our experiments comparing the original squarified algorithm with the modified algorithms incorporate the following 2 groups of new ideas.

a. Comparing different layouts

When we place the rectangles in the treemap, there are three options that we can use.
We can lay out all rectangles along the shorter side (as in the squarified algorithm), or lay out all rectangles along the longer side, or, for each rectangle to be placed, try both layouts (along the shorter and the longer side) and select the one that leads to the smaller aspect ratio.

b. Comparing different orderings

I propose to test the same idea of recurrence as in the squarified algorithm, except that the data elements are sorted in increasing order, i.e. smaller elements are filled in first and larger elements later, and to compare this with the original squarified algorithm. In addition, we compare these two approaches with ordering the elements randomly. Lastly, we compare them with ordering the data in the better of decreasing and increasing order.

As pointed out in reference [2], finding a solution that is optimal in aspect ratio is NP-hard. Thus, rather than attempting to find the optimal solution, our goal is to further improve the good solution achieved by the squarified algorithm. We present several examples in more detail to make it easier for the reader to follow the rest of our simulation experiments.

Example 1: Compare the layout along the longer side and along the shorter side.

The following example demonstrates that modifying the original squarified algorithm by laying out the items along the longer (instead of the shorter) side can achieve a better average aspect ratio. The example consists of an 8-item data set whose values are generated randomly (based on a log-normal distribution). This set is placed in an initial square whose area is equal to the sum of all 8 elements. The elements of the data set are as follows.

Data set = (0.6702, 0.5025, 0.8937, 0.4331, 0.8275, 4.3615, 0.8979, 0.3188)

We still use the same idea of recurrence [2], and as in the original algorithm, we sort the data elements in decreasing order.
However, unlike the original algorithm, when we fill in a new element, we place it along the longer side. The final result is better, as shown in detail in Figures 3.1 and 3.2, which demonstrate the steps of processing all the rectangles. The modified algorithm achieves an average aspect ratio of 1.3259, whereas the original algorithm produces a treemap with an average aspect ratio of 1.5991. In both layouts each rectangle has the same area; only its location and, correspondingly, its aspect ratio differ.

It is worth pointing out that this layout is similar to the strip algorithm [1], which always lays out the rectangles in horizontal (or always in vertical) strips of varying thickness. In our modified algorithm, the layout is along the longer side. When the initial rectangle is a square (i.e. both sides have equal length), each row reduces only the length of the side perpendicular to the chosen layout side (either horizontal or vertical), which becomes shorter and shorter, while the length of the layout side itself is unchanged. The layout side therefore remains the longer side, and the layout always proceeds in horizontal (or always in vertical) rows, as in the strip algorithm.

[Figure 3.1. Layout along the longer side. Step panels labelled "Optimum" and "Worsen" show rows built from L(1) = 4.3615 through L(8) = 0.3188; final average aspect ratio = 1.3259.]

[Figure 3.2, step panels labelled "Optimum" and "Worsen": rows built from L(1) = 4.3615 through L(8) = 0.3188; final average aspect ratio = 1.5991.]
Figure 3.2. Original squarified algorithm.

Example 2: Compare the layout of the rectangles along the better of the shorter or longer side with the layout along the shorter side.

The following example demonstrates that modifying the original squarified algorithm by laying out the items along the better of the shorter or longer side can achieve a better average aspect ratio. The example consists of an 8-item data set whose values are generated randomly (using a log-normal distribution).

Data set = (0.9374, 3.7711, 0.8815, 2.1927, 0.3901, 0.8526, 0.8410, 1.2385)

The algorithm first splits the initial rectangle along the shorter side and keeps adding the next rectangles to fill the subdivision until the aspect ratio of the rectangles added to this subdivision becomes worse; the result of the previous step is then taken as the result of the layout along the shorter side for this subdivision. Next, the algorithm goes back to the initial state, splits the initial rectangle along the longer side, and again keeps adding rectangles until the aspect ratio becomes worse; the result of the previous step is taken as the result of the layout along the longer side for this subdivision. The algorithm then compares the two results and accepts the better one as the result for this subdivision. Continuing from this result, the algorithm processes the next subdivision of the remaining rectangle in the same manner, choosing the better of the shorter and longer side. It repeats these subdivision steps until it has processed all rectangles. The steps of processing all the rectangles are demonstrated in Figure 3.3.
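The row-by-row procedure described above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the implementation from the Appendix: the function names and the `mode` parameter are hypothetical, the candidate rows are compared by their worst aspect ratio, ties between the two sides go to the shorter side, and the sizes are assumed to sum to width × height.

```python
def worst_ratio(row, side):
    """Worst aspect ratio in a row of areas laid along a side of this length."""
    t2 = (sum(row) / side) ** 2  # squared thickness of the row
    return max(max(t2 / a, a / t2) for a in row)


def grow_row(items, side):
    """Extend the row until adding the next item would worsen it."""
    row = [items[0]]
    for a in items[1:]:
        if worst_ratio(row + [a], side) > worst_ratio(row, side):
            break
        row.append(a)
    return row


def squarify(sizes, width, height, mode="shorter"):
    """Return (x, y, w, h) rectangles; mode: 'shorter', 'longer' or 'best'."""
    items = sorted(sizes, reverse=True)  # decreasing order, as in [2]
    x, y, w, h = 0.0, 0.0, float(width), float(height)
    rects = []
    while items:
        height_is_shorter = h <= w
        if mode == "shorter":
            options = [height_is_shorter]
        elif mode == "longer":
            options = [not height_is_shorter]
        else:  # 'best': try both sides, shorter side first
            options = [height_is_shorter, not height_is_shorter]
        trials = []
        for along_height in options:  # True: row runs along the vertical side
            side = h if along_height else w
            row = grow_row(items, side)
            trials.append((worst_ratio(row, side), along_height, row))
        _, along_height, row = min(trials, key=lambda t: t[0])
        side = h if along_height else w
        thickness = sum(row) / side
        offset = 0.0
        for a in row:
            length = a / thickness
            if along_height:  # vertical strip at the left edge
                rects.append((x, y + offset, thickness, length))
            else:  # horizontal strip at the top edge
                rects.append((x + offset, y, length, thickness))
            offset += length
        if along_height:
            x, w = x + thickness, w - thickness
        else:
            y, h = y + thickness, h - thickness
        items = items[len(row):]
    return rects


def average_ratio(rects):
    return sum(max(w / h, h / w) for _, _, w, h in rects) / len(rects)


fig_1_1 = [6, 6, 4, 3, 2, 2, 1]
print(round(average_ratio(squarify(fig_1_1, 6, 4)), 2))  # 1.68
```

With `mode="shorter"` the sketch reproduces the average aspect ratio of 1.68 quoted for the tree of Figure 1.1. The averages reported for the random data sets in the examples depend on details (such as tie-breaking on a square) that are assumed here, so the sketch is not claimed to reproduce them.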
During the layout of each row, the algorithm tries laying out along the shorter side and along the longer side, and then selects the one with the smaller aspect ratio. (The details of the modified algorithm are given in the Appendix.) For the given data set, the modified algorithm generates a treemap with an average aspect ratio of 1.4505, which is better than the value (2.0012) obtained by the original squarified algorithm, as shown in Figure 3.4.

[Figure 3.3. Layouts along the best of shorter or longer side. Step panels labelled "Optimum", "Worsen" and "Better" show rows built from L(1) = 3.7711 through L(8) = 0.3901; final average aspect ratio = 1.4505.]

[Figure 3.4. Original squarified algorithm. Step panels labelled "Optimum" and "Worsen" show rows built from L(1) = 3.7711 through L(8) = 0.3901; final average aspect ratio = 2.0012.]

Example 3: Compare ordering data elements in increasing and decreasing order

The following example demonstrates that modifying the original squarified algorithm by changing the ordering of the data elements to increasing (instead of decreasing) order can also achieve a better average aspect ratio. The example consists of an 8-item data set whose values are generated randomly (using a log-normal distribution). This set is placed in an initial square whose area is equal to the sum of all 8 elements. The elements of this data set are as follows.
Data set = (5.9175, 3.1784, 1.2964, 0.7290, 0.4724, 4.1725, 0.8795, 3.0248)

We still use the same idea of recurrence as in the original algorithm [2]; however, the elements are sorted and filled in increasing order, giving a treemap with an average aspect ratio of 1.4637, as shown in Figure 3.5. This is an improvement over the original algorithm, which sorts and fills the elements in decreasing order and yields a treemap with an average aspect ratio of 1.8368, as shown in Figure 3.6.

[Figure 3.5. Modified squarified algorithm by ordering elements in increasing order. Step panels labelled "Optimum" and "Worsen" show rows built from L(1) = 0.4724 through L(8) = 5.9175; final average aspect ratio = 1.4637.]

[Figure 3.6. Original squarified algorithm. Step panels labelled "Optimum" and "Worsen" show rows built from L(1) = 5.9175 through L(8) = 0.4724; final average aspect ratio = 1.8368.]

Example 4: Compare ordering data elements in random and decreasing order

The following example demonstrates that a better average aspect ratio can also be achieved by the modified squarified algorithm using a random ordering of the data elements instead of decreasing order. The example consists of an 8-item data set whose values are generated randomly (using a log-normal distribution). This set is placed in an initial square whose area is equal to the sum of all 8 elements. The elements of this data set are as follows.

Data set = (0.2748, 0.4382, 0.3095, 0.4082, 1.6329, 0.3436, 0.4356, 1.4120)

The same idea of recurrence as in the original algorithm [2] is still used.
However, the data elements are filled into the subrectangles in random order, giving a treemap with an average aspect ratio of 1.2547, as shown in Figure 3.7. This is smaller than the value obtained by the original algorithm, which sorts and fills the elements in decreasing order and yields a treemap with an average aspect ratio of 1.4843, as shown in Figure 3.8.

[Figure 3.7. Modified squarified algorithm by using random ordering of data elements. Step panels labelled "Optimum" and "Worsen" show rows built in the order L(1) = 0.2748, 0.4382, 0.3095, 0.4082, 1.6329, 0.3436, 0.4356, L(8) = 1.4120; final average aspect ratio = 1.2547.]

[Figure 3.8. Original squarified algorithm. Step panels labelled "Optimum" and "Worsen" show rows built from L(1) = 1.6329 through L(8) = 0.2748; final average aspect ratio = 1.4843.]

Example 5: Compare ordering data elements in the better of decreasing or increasing order with ordering in decreasing order.

The following example demonstrates that modifying the original squarified algorithm by ordering the data elements in the better of increasing or decreasing order achieves a better average aspect ratio. The example consists of an 8-item data set whose values are generated randomly (using a log-normal distribution).
Data set = (1.7476, 2.9099, 1.1319, 0.8325, 2.0535, 0.5614, 0.7500, 1.0990)

The algorithm first sorts the data elements in decreasing order and keeps filling the next item into the subdivision until the aspect ratio of the rectangles filled into this subdivision becomes worse; the result of the previous step is then taken as the result of the decreasing ordering for this subdivision. Next, the algorithm goes back to the initial state, sorts the data elements in increasing order, and again keeps filling items into the subdivision until the aspect ratio becomes worse; the result of the previous step is taken as the result of the increasing ordering for this subdivision. The algorithm then compares the two results and accepts the better one as the result for this subdivision. Continuing from this result, the algorithm processes the next subdivision, filling in the remaining data elements in the same manner and again choosing the better of the increasing or decreasing order. It repeats these subdivision steps until it has processed all data elements. The steps of processing all the data elements are demonstrated in Figure 3.9.

During the processing of each row, the algorithm tries ordering the data elements both in decreasing and in increasing order, and then selects the ordering that leads to the smaller aspect ratio. (The details of the modified algorithm are given in the Appendix.) For the given data set, the modified algorithm generates a treemap with an average aspect ratio of 1.3586, which is better than the value (1.5674) obtained by the original squarified algorithm, as shown in Figure 3.10.
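The ordering variants can be sketched in the same style. The code below is a hedged illustration rather than the Appendix algorithm: `layout`, `order` and `seed` are hypothetical names, rows are always laid along the shorter side, the "best" variant is read as taking each row either from the large end or from the small end of the sorted remaining items (whichever row has the smaller worst aspect ratio, ties going to decreasing order), and the sizes are assumed to sum to width × height.

```python
import math
import random


def worst_ratio(row, side):
    """Worst aspect ratio in a row of areas laid along a side of this length."""
    t2 = (sum(row) / side) ** 2
    return max(max(t2 / a, a / t2) for a in row)


def grow_row(items, side):
    """Extend the row until adding the next item would worsen it."""
    row = [items[0]]
    for a in items[1:]:
        if worst_ratio(row + [a], side) > worst_ratio(row, side):
            break
        row.append(a)
    return row


def layout(sizes, width, height, order="decreasing", seed=0):
    """order: 'decreasing', 'increasing', 'random' or 'best'."""
    items = sorted(sizes, reverse=(order != "increasing"))
    if order == "random":
        random.Random(seed).shuffle(items)
    x, y, w, h = 0.0, 0.0, float(width), float(height)
    rects = []
    while items:
        along_height = h <= w  # always lay the row along the shorter side
        side = h if along_height else w
        row, from_small_end = grow_row(items, side), False
        if order == "best":
            # Candidate row in increasing order, taken from the small end.
            alt = grow_row(items[::-1], side)
            if worst_ratio(alt, side) < worst_ratio(row, side):
                row, from_small_end = alt, True
        thickness = sum(row) / side
        offset = 0.0
        for a in row:
            length = a / thickness
            if along_height:
                rects.append((x, y + offset, thickness, length))
            else:
                rects.append((x + offset, y, length, thickness))
            offset += length
        if along_height:
            x, w = x + thickness, w - thickness
        else:
            y, h = y + thickness, h - thickness
        items = items[:-len(row)] if from_small_end else items[len(row):]
    return rects


def average_ratio(rects):
    return sum(max(w / h, h / w) for _, _, w, h in rects) / len(rects)


# Example 5 data in a square of matching area (the square is an assumption).
data = [1.7476, 2.9099, 1.1319, 0.8325, 2.0535, 0.5614, 0.7500, 1.0990]
edge = math.sqrt(sum(data))
for order in ("decreasing", "increasing", "random", "best"):
    print(order, round(average_ratio(layout(data, edge, edge, order)), 4))
```

Because the choice is made greedily per row, the "best" ordering is not guaranteed to beat the decreasing ordering on every data set, which is why the statistical comparison matters.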
[Figure 3.9. Modified squarified algorithm by ordering in the best of decreasing or increasing order. Each step shows the decreasing-order candidate (starting from L(1) = 2.9099) beside the increasing-order candidate (starting from L(8) = 0.5614), with panels labelled "Optimum" and "Worsen"; final average aspect ratio = 1.3586.]

[Figure 3.10. Original squarified algorithm. Step panels labelled "Optimum" and "Worsen" show rows built from L(1) = 2.9099 through L(8) = 0.5614; final average aspect ratio = 1.5674.]

4. Statistics experiments

The examples in the last section show that the modified algorithms can achieve a better aspect ratio than the original algorithm. We could cite many other examples; however, how much improvement of the aspect ratio can be achieved on average over a sufficiently large sample space? In what percentage of cases is the aspect ratio improved? These ideas and modifications are of value only if there is a relatively large improvement of the aspect ratio on average, or a relatively large percentage of cases with an improved aspect ratio. After all, the original squarified algorithm already achieves a reasonably good solution.
If there is no significant improvement of the aspect ratio on average, but the modified squarified algorithms improve the aspect ratio over the original squarified algorithm in a considerable percentage of cases, it is possible to use the original squarified algorithm in conjunction with the modified algorithms to improve the aspect ratio. However, this inevitably costs more time, just like the idea of laying out along the better of the shorter and longer side and the idea of ordering in the better of decreasing and increasing order, both of which incur twice the running time.

In this section we give further analysis and experiments on the idea of comparing layouts along the shorter side, the longer side, and the better of the two, and on the idea of comparing orderings of the data elements in decreasing, increasing, random, and the better of decreasing and increasing order.

4.1. Comparing the layout along the shorter, longer and best of shorter and longer side

There are circumstances that lead to better results for each of these cases. In this section we perform the statistics experiment to see which of these layouts leads to an improved aspect ratio statistically.

The original squarified algorithm (OSAS) always lays out the rectangles along the shorter side on each row. We have shown an example in which always laying out the rectangles along the longer side (MSAL) gives a better result, although it makes the initial rectangle of the next row less square-like. This is because it is not always true that the more square-like the initial rectangle, the better the result. Even if the initial rectangle is long, filling it with the necessary number of elements may still produce many square-like constituent rectangles. We conduct a statistics experiment on many data sets to compare MSAL with the original algorithm.

The 2nd modified algorithm lays out along the better of the shorter and the longer side (MSAB).
Like the two algorithms above, it performs layouts along the shorter side and along the longer side, but it does both on each row. The difference is that it then compares the worst aspect ratio of the rectangles added to the current row under the two layouts, and lays out the current row along the side that produces the smaller worst aspect ratio.

We can also think of laying out along the better of the shorter and longer side as a method based on heuristic information computed from the specific data. In order to improve the final average aspect ratio, we adopt a local evaluation of the two layouts, along the longer and the shorter side, in each row to decide along which side to lay out that row. The algorithm uses this heuristic knowledge of the aspect ratio to control the layout of the rectangles in the current row so as to improve the aspect ratio. Unlike the original squarified algorithm, which always lays out along the shorter side (OSAS), and the modified squarified algorithm that always lays out along the longer side (MSAL), MSAB dynamically selects the side along which to lay out the current row, based on which side leads to a smaller aspect ratio.

Statistics trials

We use five scales of data set as the simulation input: 8×3, 10×2, 20×1, 50×1, and 100×1, in which the numbers of data items per level are 8, 10, 20, 50 and 100. For each of the five scales we conduct 100 tests, each consisting of 100 steps. At the beginning of each test, we generate random data that obey a log-normal distribution (the exponential of a normal distribution with mean 0 and variance 1). The log-normal distribution is commonly used to represent naturally occurring positive-valued data [1, 4, 6]. (Other experiments have used a similar data distribution [1, 6].)
At the beginning of each step, the elements of the data set are multiplied by a random variable e^x (where x is drawn from a normal distribution with mean 0 and variance 0.05) to simulate noise [1, 6]. The final result is the average over 100 tests of 100 steps each. In all the experiments that follow, the simulation data are generated in this way.

We compare the layout along the longer side, and along the better of the shorter and longer side, with the layout along the shorter side as in the original squarified algorithm. The experiment results are shown in Figure 4.1.

[Figure 4.1. Results of 1st group of squarified algorithms. The ordering of data is in decreasing order (OD). The layouts of rectangles are along the shorter (OSAS), longer (MSAL) and best of shorter and longer side (MSAB).]

From the figure we can see that, for all five scales of data set with various numbers of data items per level, the final average aspect ratios generated by the modified algorithm that lays out along the longer side (MSAL) are much higher (3.5 to 9), and therefore the aspect ratio improvement is negative (-700% to -4000%). The percentage of achieving a better aspect ratio is the proportion of treemaps with an improved final average aspect ratio generated by the corresponding algorithm on data sets with various numbers of items per level. As we can see, the proportion of treemaps with an improved aspect ratio produced by the layout along the longer side is very small (around 4 per cent on the data sets with 20 items per level, and less than 0.3% on the data sets with 8, 10, 50 and 100 items per level). Although we found an example in which the layout along the longer side is better than the original algorithm, the experiment results show that it rarely produces a smaller final average aspect ratio.
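The data generation used in the statistics trials can be sketched with Python's standard `random` module. This is a minimal sketch under assumptions that the text does not settle: noise is drawn independently for each element, the noise accumulates from step to step, only a single flat level is generated, and `measure` is a hypothetical callback standing in for "lay out the data and return the final average aspect ratio".

```python
import math
import random


def initial_data(n, rng):
    """n values from a log-normal distribution: exp of N(mean 0, variance 1)."""
    return [rng.lognormvariate(0.0, 1.0) for _ in range(n)]


def perturb(data, rng, variance=0.05):
    """Multiply each element by e^x with x ~ N(0, variance).

    Assumption: x is drawn independently for every element.
    """
    sigma = math.sqrt(variance)  # gauss() takes a standard deviation
    return [v * math.exp(rng.gauss(0.0, sigma)) for v in data]


def run_test(n, steps, rng, measure):
    """One test: generate data once, then perturb and measure at each step."""
    data = initial_data(n, rng)
    results = []
    for _ in range(steps):
        data = perturb(data, rng)  # assumption: noise accumulates over steps
        results.append(measure(data))
    return sum(results) / len(results)


def run_experiment(n, measure, tests=100, steps=100, seed=0):
    """Average of `tests` tests of `steps` steps each, as in the text."""
    rng = random.Random(seed)
    return sum(run_test(n, steps, rng, measure) for _ in range(tests)) / tests


# Stand-in measure (share of the largest item), only to show the call shape.
print(run_experiment(8, measure=lambda d: max(d) / sum(d), tests=5, steps=10))
```

In an actual comparison, `measure` would run one of the layout algorithms on the perturbed data, and the same seed would be used for each algorithm so that all of them see identical inputs.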
In the experiments we also discovered a similarity in layout results between this modified algorithm (MSAL) and the strip algorithm, although this similarity was not expected at the beginning. As the statistical experiment now confirms, MSAL produces a final average aspect ratio similar to that produced by the strip algorithm [1]. The strip algorithm was not designed to achieve the smallest average aspect ratio, but to produce a treemap whose layout changes more smoothly as the data change. It produces layouts whose aspect ratio falls somewhere between those of the slice-and-dice method and the squarified treemap.

Although always laying out along the longer side does not improve the final average aspect ratio in most cases, the hybrid alternative of laying out along the best of the shorter and longer side achieves some promising results. As can be seen from Figure 4.1, although the average aspect ratio of the 2nd modified algorithm (MSAB) is still not improved (-5% ~ -30%) compared to the original algorithm, it generates a better aspect ratio than the original algorithm in a significant percentage of cases (nearly 50% on the datasets with 20, 50, and 100 items per level). This is notable given that the average aspect ratio of the original squarified algorithm is already good and very close to 1. However, the percentage of better aspect ratio on the datasets with 8 and 10 items per level is still relatively low (4% and 27%).

To improve the 2nd modified algorithm (MSAB) further, we need to analyze this strategy more closely. Specifically, we introduce the threshold of aspect ratio and the threshold of look ahead.

4.1.1. Introducing the thresholds of aspect ratio and look ahead to further improve the layout along best of longer and shorter side

By comparing the layouts along the shorter side and the longer side, MSAB tries to lower the aspect ratio of the current row. However, laying out along the longer side makes the initial rectangle of the next row less square-like, which is likely to increase the aspect ratio of the next row and eventually the final aspect ratio. To minimize this negative impact, we introduce a threshold of aspect ratio into MSAB so that it does not lay out along the longer side if the aspect ratio of the initial rectangle of the current row is already very high.

We also found that at the beginning of the algorithm the layout space is relatively large. Even if rectangles are laid out along the longer side at this stage, the remaining space is still spacious, and subsequent layout is not harmed. Laying out the last few rectangles along the longer side, however, is likely to increase the aspect ratio, so it should be avoided. Thus, we add a threshold of look ahead into MSAB so that it does not lay out along the longer side if the number of elements left is less than the threshold of look ahead.

The optimal threshold of aspect ratio and threshold of look ahead

We conduct experiments to verify our hypothesis about the threshold of aspect ratio and the threshold of look ahead, and to measure the improvement to the treemap's final average aspect ratio. We want to find out whether any specific range of the two thresholds is capable of improving the final average aspect ratio of the treemap on various scales of datasets.
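The two thresholds amount to a gate that is evaluated before each row is laid out. The sketch below is only an illustration under stated assumptions: the function and parameter names are hypothetical, and both comparisons are taken to be strict (aspect ratio below its threshold, number of remaining elements above the look ahead threshold), which is one reading of the conditions. The threshold values in the usage example are illustrative only.

```python
def aspect_ratio(width, height):
    """Aspect ratio of a w-by-h rectangle, always >= 1."""
    return max(width / height, height / width)


def should_try_longer_side(width, height, items_left,
                           ratio_threshold, look_ahead_threshold):
    """Decide whether the current row may also be tried along the longer side.

    width, height: dimensions of the free region in which the row starts.
    items_left:    number of elements not yet placed.
    Assumption: both comparisons are strict.
    """
    region_is_square_enough = aspect_ratio(width, height) < ratio_threshold
    enough_items_remain = items_left > look_ahead_threshold
    return region_is_square_enough and enough_items_remain


# Illustrative threshold values only.
print(should_try_longer_side(6, 4, 7, ratio_threshold=1.6, look_ahead_threshold=2))
print(should_try_longer_side(6, 2, 7, ratio_threshold=1.6, look_ahead_threshold=2))
```

When the gate returns False, the row is simply laid out along the shorter side, as in the original squarified algorithm.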
We search the two-dimensional space of the threshold of aspect ratio and the threshold of look ahead, and observe how the final average aspect ratio of the treemap improves on data sets of different scales as the two thresholds vary. We find that an aspect ratio threshold of 1.6 together with a look ahead threshold of 2 is a good all-around choice: it consistently achieves the optimal positive improvement of aspect ratio on all datasets, whatever the number of data elements and layers, apart from very small fluctuations caused by randomness in the data.

Figures 4.2 and 4.3 show the landscape of the aspect ratio improvement on an 8×3 and a 50×1 dataset for varying values of the threshold of aspect ratio and the threshold of look ahead. The aspect ratio improvement is drawn as a mesh defined over a regularly spaced grid of the 2-d search space of the two thresholds. A contour plot based on the minimum and maximum values of the aspect ratio improvement is also drawn beneath the mesh.

Figure 4.2. The landscape of the improvement of aspect ratio on 8×3 dataset with varying thresholds.

Figure 4.3. The landscape of the improvement of aspect ratio on 50×1 dataset with varying thresholds.

In the figures, the threshold of aspect ratio varies from 1.0 to 2.0, while the threshold of look ahead varies from 0 to 8 (since the number of elements per level in the 8×3 dataset is only 8). We can see that the region of thresholds that achieves a positive improvement of aspect ratio lies within the search space and is centered around the optimal threshold (1.6, 2). Also, when the threshold of aspect ratio is set to 1.0, or the threshold of look ahead is set sufficiently large (e.g. 8, the number of elements per level in the 8×3 dataset), the algorithm is the same as the original squarified algorithm that always lays out along the shorter side, and thus the aspect ratio improvement is 0.
When the threshold of look ahead is set to 0 and the threshold of aspect ratio is set sufficiently large (e.g. larger than 1.8 on the 8×3 dataset), the algorithm behaves as if the thresholds had not been introduced into the modified algorithm that lays out along the best of the longer and shorter side, and thus the aspect ratio improvement is still negative. In addition, when both thresholds are used and set to their optimal values, the improvement is larger than when only one of them is used (that is, with the threshold of look ahead set to 0 or the threshold of aspect ratio set to infinity).

An example illustrating the idea of thresholds

The process of the modified squarified algorithm that lays out along the best of the shorter and longer side with thresholds (MSABT) is similar to the process of MSAB described previously. The only difference is that, at the beginning of processing each row, it decides whether to try both layouts by comparing the aspect ratio of the initial rectangle of the current row with the threshold of aspect ratio, and the number of elements left with the threshold of look ahead. When the aspect ratio of the layout region is less than the threshold of aspect ratio and the number of elements left is larger than the threshold of look ahead, we try both the layout along the shorter side and the layout along the longer side, and adopt the one with the better result (the details of the algorithm are presented in the appendix).

To give a more intuitive understanding of the role of the optimal thresholds, we also present an example in which MSABT achieves an improved average aspect ratio compared to the original algorithm. The example consists of an 8-item data set whose values are generated randomly (using a log-normal distribution).
Data set = (1.3954, 0.3736, 1.6940, 3.6211, 0.3106, 1.6804, 0.1658, 0.4058)

[Figure 4.4: step-by-step layout panels, each marked "Optimum" or "Worsen", in which the sorted items L(1) = 3.6211, L(2) = 1.6940, L(3) = 1.6804, L(4) = 1.3954, L(5) = 0.4058, L(6) = 0.3736, L(7) = 0.3106, and L(8) = 0.1658 are added row by row. The final average aspect ratio = 1.5119.]

Figure 4.4. Layouts along the best of shorter and longer side with thresholds (MSABT).

[Figure 4.5: step-by-step layout panels, each marked "Optimum" or "Worsen", for the same sorted items L(1) to L(8). The final average aspect ratio = 1.9762.]

Figure 4.5. Original squarified algorithm.

As shown in Figure 4.4, the steps for processing the rectangles are similar to those in Figure 3.3. However, during the layout of each row, the algorithm tries the best of the shorter and longer side only when the threshold conditions are satisfied; otherwise it simply lays out along the shorter side. As shown, it skips the comparison on the 2nd row, where the aspect ratio threshold condition is not met, and on the last two rows, where the look ahead threshold condition is not met. For the given data set, the modified algorithm generates a treemap with an average aspect ratio of 1.5119, which is better than the value (1.9762) obtained by the original squarified algorithm, as shown in Figure 4.5.

Statistics trials

We conducted statistical experiments to compare MSABT with the original squarified algorithm.
The average aspect ratio improvement and the percentage of better aspect ratio of the algorithms are shown in Figure 4.6 (the results of MSAL and MSAB are included to make comparison easier).

Figure 4.6. Results of 1st group of squarified algorithms. The thresholds of aspect ratio and look ahead are introduced into the variant of algorithm that layout along the best of shorter and longer side (MSABT).

In these experiments we applied the optimal thresholds discussed above to further improve the aspect ratio. Figure 4.6 shows that, after the introduction of the threshold of aspect ratio and the threshold of look ahead, the improvement of the final average aspect ratio over the original algorithm is greater than 10% for all data sets. The percentages of better aspect ratio are greater than 70% on all datasets. The 8×3 data set (with 8 items per level) is now the data set with the largest percentage of better aspect ratio, reaching 99%.

4.2. Compare ordering data elements in decreasing, increasing, random and the best of decreasing or increasing order

The original squarified algorithm orders the data elements in decreasing order. In example 3 in section 3, we showed that ordering the data elements in increasing order can result in a treemap with an improved aspect ratio compared to the original squarified algorithm. The initial rectangle is large at the beginning of the algorithm, so filling it with many small elements may produce many square-like subrectangles early on. We can change the ordering of data elements in all three variants of the squarified algorithm that order data in decreasing order (OD) and lay out along the various sides (OSAS, MSAL, MSABT), as discussed in section 4.1. This gives a new group of algorithms (OI+*) parallel to the 1st group of squarified algorithms that order data elements in decreasing order (OD+*) as already discussed.
The input is processed in a different order: instead of sorting and adding the list of data items in decreasing order, we sort and add them in increasing order. The recurrences, however, are the same as those of the corresponding squarified algorithms. Similarly, we obtain another parallel group of algorithms by changing the ordering of data elements to random order (OR+*). Lastly, we obtain a group of algorithms by ordering data in the best of decreasing or increasing order (OB+*). The classification of all twelve variants is shown in Table 4.1.

Table 4.1. Two ways of classifying the variants of squarified algorithms, by layout side and by data ordering, and their combinations.

Ordering in which data              Side along which the constituent subrectangles are laid out
elements are filled                 Along shorter side     Along longer side     Along best of longer and shorter side, with thresholds
Decreasing order                    OD+OSAS (original)     OD+MSAL               OD+MSABT
Increasing order                    OI+OSAS                OI+MSAL               OI+MSABT
Random order                        OR+OSAS                OR+MSAL               OR+MSABT
Best of decreasing or increasing    OB+OSAS                OB+MSAL               OB+MSABT

In this section we compare ordering data elements in decreasing, increasing, random, and best of decreasing or increasing order, to find the improvement to the average aspect ratio and the percentage of cases in which each of these outperforms the original algorithm. We conduct statistical experiments to verify which of these ideas works best over a large data space.

Statistics trials

Using the experiment process and method described before, we compare the original squarified algorithm with the three new groups of modified squarified algorithms that order data elements in increasing order (OI+*), random order (OR+*), and the best of decreasing and increasing order (OB+*). In the tests of the three variants that lay out along the best of the shorter and longer side with thresholds (OI+MSABT, OR+MSABT, OB+MSABT), we again applied the optimal thresholds to control and improve the aspect ratio, as in the experiments in section 4.1.1. Figure 4.7 shows the experiment results.

As can be seen from Figure 4.7, the 2nd group of three modified algorithms, which order data elements in increasing order (OI+*), does not improve the aspect ratio compared to the original algorithm on the datasets with various numbers of items per level. The aspect ratio improvement is negative (-20% ~ -100%). The percentage of cases in which it produces a better aspect ratio than the original algorithm is also relatively small (less than 25% on the datasets with 10, 20, and 50 items per level, and less than 3% on the datasets with 8 and 100 items per level). Although in some cases ordering data elements in increasing order is better than ordering in decreasing order, in the majority of cases its aspect ratio is worse than that of the original squarified algorithm.
Figure 4.7 also shows that the 3rd group of three modified algorithms, which order data elements in random order (OR+*), produces a much worse aspect ratio on all five scales of dataset. The aspect ratio improvement is strongly negative (-150% ~ -1000% or worse). The percentage of cases with a better aspect ratio than the original algorithm is very small (less than 0.04%) on all datasets.

Figure 4.7. Results of various squarified algorithms combining different orderings of data and layouts of rectangles. The orderings of data are in decreasing (OD), increasing (OI), random (OR) and best of decreasing and increasing order (OB). The layouts of rectangles are along the shorter (OSAS), longer (MSAL) and best of shorter and longer side (MSABT).

Although changing the ordering of data elements to increasing or random order does not improve the aspect ratio on average, our experiment results show that the 4th group of three algorithms, which order data in the best of decreasing or increasing order (OB+*), further improves the aspect ratio compared to the 1st group of algorithms that order data in decreasing order (OD+*). Moreover, two of the new modified algorithms in this group (OB+OSAS, OB+MSABT) improve the aspect ratio compared to the original algorithm on the datasets with various numbers of items per level.

Ordering data elements in the best of decreasing and increasing order and laying out along the shorter side (OB+OSAS) produces a significant improvement in aspect ratio compared to the original algorithm (over 3% on the datasets with 100 items per level, more than 5% and 7% on the datasets with 50 and 20 items per level, and around 10% on the datasets with 8 and 10 items per level).
The percentage of cases in which it generates a better aspect ratio than the original algorithm is also relatively large (around 97% on the datasets with 8 and 10 items per level, 88% on the datasets with 20 and 50 items per level, and over 50% on the datasets with 100 items per level).

In the case of OB+MSABT, the aspect ratio improvement (16% ~ 18%) is the largest among all the variants we experimented with. The percentage of cases in which it generates a better aspect ratio than the original algorithm is also very large (around 80% ~ 100%) on the datasets with various numbers of items per level. Compared to ordering in decreasing order and laying out along the best side (OD+MSABT), the improvement obtained by OB+MSABT is also significantly larger, approximately equal to the sum of the improvement from ordering data elements in the best of decreasing and increasing order (OB+OSAS) and the improvement from laying out along the best side (OD+MSABT).

Figure 4.7 also shows that the new modified algorithm that orders data elements in the best of decreasing and increasing order and lays out along the longer side (OB+MSAL) still produces a very poor aspect ratio on all five scales of dataset. The aspect ratio improvement is still negative and very large in magnitude (-600% ~ -4000%), though slightly better than that of OD+MSAL. The percentage of cases with a better aspect ratio than the original algorithm is still very small (1% ~ 4% or less) on all datasets.

4.2.1. Using the best of ordering data elements in increasing and decreasing order exclusively (ExOB)

Ordering data elements in increasing order does not improve the average aspect ratio of the treemap over a large sample space. However, there are still some circumstances in which ordering data elements in increasing order leads to better results, though the percentage is relatively small.
To compare different ways of using the idea of ordering data elements in increasing order, we also tried using the best of ordering in decreasing order and ordering in increasing order exclusively (ExOB). We calculate the final average aspect ratio of the treemap produced by ordering data elements exclusively in increasing order and of the treemap produced by ordering exclusively in decreasing order, and then take the treemap with the smaller final average aspect ratio as the result. Such a simple combination of the two orderings is never worse than the original algorithm. Its similarity to ordering data in the best of decreasing and increasing order non-exclusively (OB), as described in example 5 in section 3, is that both do approximately twice the work on one dataset, and so both cost about twice the run-time of the original algorithm.

We begin with ExOB+OSAS, which uses the best of the original squarified algorithm (OD+OSAS) and the variant that orders data elements in increasing order (OI+OSAS), to find the improvement to the average aspect ratio. We conduct experiments to compare the aspect ratio of ExOB+OSAS with that of the original squarified algorithm. The result is shown in Figure 4.8.

[Figure 4.8: panels (a) and (b) comparing OD+OSAS, OI+OSAS, and ExOB+OSAS.]

Figure 4.8. Comparison of the aspect ratio produced by the original squarified algorithm (OD+OSAS) with that of ExOB+OSAS, which uses the best of the original squarified algorithm (OD+OSAS) and ordering data elements in increasing order (OI+OSAS).

The improvement in the average aspect ratio of the treemap produced by ExOB+OSAS is quite small (less than 3.3% on the datasets with 20 items per level, less than 1.5% on the datasets with 10 and 50 items per level, and less than 0.2% on the datasets with 8 and 100 items per level). This is much smaller, and much less efficient, than the improvement obtained by ordering in the best of decreasing and increasing order (OB+OSAS) shown in Figure 4.7.
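The exclusive selection described above can be sketched as follows. This is a hedged illustration rather than the implementation used in the experiments: the function names are hypothetical, the layout is a single-level squarified layout along the shorter side written from the standard description of that algorithm, a row keeps growing while its worst aspect ratio does not get worse, and the areas are assumed to be positive and to sum to the area of the region.

```python
def worst_ratio(row, side):
    """Worst aspect ratio when the areas in `row` share a strip along `side`."""
    thickness = sum(row) / side
    return max(max(thickness ** 2 / a, a / thickness ** 2) for a in row)


def squarified_ratios(areas, width, height):
    """Aspect ratios produced by laying out `areas` in the given order."""
    ratios, i = [], 0
    while i < len(areas):
        side = min(width, height)  # always lay out along the shorter side
        row = [areas[i]]
        i += 1
        # Grow the row while its worst aspect ratio does not get worse.
        while (i < len(areas)
               and worst_ratio(row + [areas[i]], side) <= worst_ratio(row, side)):
            row.append(areas[i])
            i += 1
        thickness = sum(row) / side
        ratios.extend(max(thickness ** 2 / a, a / thickness ** 2) for a in row)
        # Remove the finished strip from the free region.
        if width <= height:
            height -= thickness
        else:
            width -= thickness
    return ratios


def average(values):
    return sum(values) / len(values)


def exob_osas(areas, width, height):
    """Run both orderings separately and keep the better whole treemap."""
    avg_od = average(squarified_ratios(sorted(areas, reverse=True), width, height))
    avg_oi = average(squarified_ratios(sorted(areas), width, height))
    return ("OD", avg_od) if avg_od <= avg_oi else ("OI", avg_oi)


label, avg = exob_osas([6, 6, 4, 3, 2, 2, 1], 6, 4)
print(label, round(avg, 3))
```

Because the decreasing ordering is always one of the two candidates, the selected average can never exceed that of the decreasing ordering alone, at the price of running the layout twice.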
Thus, we did not follow up on this idea (for example, by exclusively using the best ordering of OD+MSABT and OI+MSABT). We certainly could supplement the original squarified algorithm with all the variants of the modified algorithms if a low running time is not required but a particularly low aspect ratio is. Although using the best of ordering data elements in increasing and decreasing order exclusively (ExOB) is never worse than ordering data in decreasing order (OD), the experiment results for ExOB+OSAS show that its improvement in aspect ratio is not as efficient as that of ordering data in the best of decreasing or increasing order non-exclusively (OB+OSAS), given that both cost approximately twice the running time of the original algorithm (OD+OSAS). Thus, it is better to order the data in the best of decreasing or increasing order non-exclusively (OB+*), as presented in example 5 in section 3 and in Figure 4.7.

4.3. Metrics for treemap layout: aspect ratio distribution

To gain a more comprehensive understanding of the performance, we also compared the overall distribution of the aspect ratios of the treemaps produced by the twelve algorithms. Figure 4.9 shows, for each of the twelve algorithms and for various numbers of elements per level, the distribution of the aspect ratios over three categories: low, medium, and high. The heights of the white, gray, and black portions of each bar indicate the percentages of low (white), medium (gray), and high (black) aspect ratios respectively. We distinguish between low and medium aspect ratios using the representative aspect ratio (4/3, or approximately 1.333) of the standard computer display (e.g. found in the resolution 1024x768).
High and medium aspect ratios are distinguished by the representative aspect ratio (16/9, or approximately 1.778) of the widescreen computer display (e.g. found in the resolutions 1280x720 and 1920x1080). In our definition, a low aspect ratio describes a rectangle whose visual impression is very close to a square, which is good; in this range the users are operating on an approximate square. A medium aspect ratio lies between the aspect ratio of the standard computer display and that of the widescreen computer display. In this range the rectangle is neither square-like nor too slender, and is thus still acceptable to the users. A high aspect ratio lies beyond the aspect ratio of the widescreen computer display. In this range the rectangle is obviously not square-like, and it is hard for the users to operate on it or to compare the areas of two rectangles. Therefore we should, if at all possible, minimize the percentage of high aspect ratio results, which have a bad visual effect. In essence, to best meet the requirement on aspect ratio, we want an algorithm that puts more of the distribution in the low aspect ratio category and as little as possible in the high aspect ratio category.

Figure 4.9. Distribution of the aspect ratios of various squarified algorithms that combine different ordering of data and layout of rectangles. The orderings of data are in decreasing (OD), increasing (OI), random (OR) and best of decreasing and increasing order (OB). The layouts of rectangles are along the shorter (OSAS), longer (MSAL), and best of shorter and longer side (MSABT).

In general, the aspect ratio distribution of treemaps produced by squarified algorithms is better on datasets with more elements per level (except for the 3rd group of modified algorithms, which order data in random order, i.e. OR+*).
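The three categories can be computed directly from a list of aspect ratios. The sketch below is illustrative only: the function names are hypothetical, and the handling of values that fall exactly on 4/3 or 16/9 (here assigned to the lower category) is an assumption, since the text does not say which side the boundaries belong to.

```python
LOW_LIMIT = 4 / 3    # representative ratio of the standard display
HIGH_LIMIT = 16 / 9  # representative ratio of the widescreen display


def classify(ratio):
    """Map an aspect ratio (>= 1) to 'low', 'medium' or 'high'.

    Assumption: boundary values belong to the lower category.
    """
    if ratio <= LOW_LIMIT:
        return "low"
    if ratio <= HIGH_LIMIT:
        return "medium"
    return "high"


def distribution(ratios):
    """Percentage of aspect ratios falling in each category."""
    counts = {"low": 0, "medium": 0, "high": 0}
    for ratio in ratios:
        counts[classify(ratio)] += 1
    return {name: 100.0 * count / len(ratios) for name, count in counts.items()}


print(distribution([1.0, 1.5, 1.5, 2.0]))
```

Each bar of a chart such as Figure 4.9 would correspond to one such dictionary, computed over all rectangles produced by one algorithm on one scale of dataset.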
As can be seen in Figure 4.9, for the three groups of algorithms (OD+*, OI+*, OB+*), the share of the low (white) aspect ratio increases steadily from roughly 50% to over 80% and begins to saturate as the number of data items per level increases from 8 to 100. The reason is that more items per level give the algorithm more flexibility in computing the layout of the rectangles. For example, when the number of elements per level is 20, 50, or 100 (corresponding to the 20×1, 50×1, and 100×1 data sets), the majority (over 50%) of the aspect ratio distribution is within the low (white) category for all three groups of algorithms, while the share of the high (black) category is very small (20% or less). A smaller number of items per level offers less flexibility, resulting in a larger share of high aspect ratios. For example, when the number of elements per level is 8 or 10 (corresponding to the 8×3 and 10×2 datasets), the share of the high (black) category is nearly 20% or more, while the share of the low (white) category is less than 55% for these three groups of algorithms (OD+*, OI+*, OB+*).

As Figure 4.9 shows, the modified squarified algorithms that lay out along the longer side (*+MSAL) have a larger share in the high (black) category and a smaller share in the low (white) category than the algorithms that lay out along the shorter side (*+OSAS) on all datasets. In comparison, the modified squarified algorithms that lay out along the best of the longer and shorter side with thresholds (*+MSABT) have a smaller share in the high (black) category, which has a rather poor display effect, and a larger share in the low (white) category, which has the best visual effect, than the algorithms that lay out along the shorter side (*+OSAS) on all datasets with varying numbers of items per level.
This is consistent with our observation that MSABT produces a relatively low (white) aspect ratio even in cases where OSAS produces a treemap with a high (black) aspect ratio and poor visibility. Compared to OSAS, MSABT not only significantly improves the average aspect ratio but also improves the overall aspect ratio distribution. With the introduction of the thresholds and a careful choice of their optimal values, it keeps the majority of the aspect ratio distribution within the low (white) category. In this category, which has a good visual effect and is more acceptable to the user, it has a larger share than OSAS.

Comparing the 2nd group of modified algorithms, which order data elements in increasing order (OI+*), with the corresponding variants of the 1st group, which order in decreasing order (OD+*), the 2nd group has a larger share in the high (black) category with its rather poor display effect. In the low (white) category with the best visual effect, its share is similar or smaller on all datasets with varying numbers of items per level. Moreover, compared to the original squarified algorithm (OD+OSAS), the algorithms of the 2nd group (OI+*) all have a larger share in the high (black) category and a smaller share in the low (white) category. Thus, this group not only fails to improve the aspect ratio; its overall aspect ratio distribution is also worse than that of the original squarified algorithm.

As shown, the aspect ratio distribution of the 3rd group of modified algorithms, which order data elements in random order (OR+*), is much more concentrated in the high (black) category and has a much smaller share in the low (white) category on all datasets than that of ordering data in either decreasing or increasing order.
Therefore the results of ordering data elements in random order are much worse than those of ordering in either increasing or decreasing order, in terms of both the average aspect ratio and the overall aspect ratio distribution.

Comparing the last group of modified algorithms, which order data elements in the best of decreasing or increasing order (OB+*), with the corresponding variants that order in decreasing order (OD+*), the 4th group has a slightly smaller share in the high (black) category with its rather poor display effect, and a slightly larger share in the low (white) category with the best visual effect, on all datasets with varying numbers of items per level. Moreover, two variants of the 4th group (OB+OSAS, OB+MSABT) have a smaller share in the high (black) category and a slightly larger share in the low (white) category than the original squarified algorithm (OD+OSAS) on all datasets. Thus, these two new variants not only improve the aspect ratio; their overall aspect ratio distribution is also better than that of the original squarified algorithm. We can conclude that ordering data elements in the best of decreasing and increasing order (OB) is better than ordering in decreasing, increasing, or random order (OD, OI, OR) in terms of both the average aspect ratio and the overall aspect ratio distribution.

4.4. Performance time comparison

We implemented the algorithms described above and conducted experiments to measure and compare their average run-time performance. As before, we used 100 tests (each consisting of 100 steps) to measure the average time on various scales of dataset.
Here we consider only the time difference between the variations of the squarified algorithm. We exclude the time for rendering the 2d graphics and other common overhead, such as generating random data and collecting the statistics on the percentage of better aspect ratio and the distribution, which usually take much more time. We ran the tests with the algorithms implemented in Java 1.4.2 on a 1.80 GHz AMD Sempron computer running Windows XP. We used four scales of dataset: 20×1, 50×1, 10×2, and 8×3. The total numbers of data elements in these datasets are 20, 50, 100, and 512 respectively. We plotted both the total number of data items and the running time on a log scale. The results are shown in Figure 4.10. We have included only four of the twelve algorithms (OD+OSAS, OD+MSABT, OB+OSAS, OB+MSABT), since the other variants did not improve the aspect ratio. (Also, the run-times of the 2nd and 3rd groups of modified algorithms, i.e. OI+* and OR+*, are similar to those of the 1st group, i.e. OD+*. The run-times of the modified algorithms that lay out along the longer side, i.e. *+MSAL, are similar to those of layout along the shorter side, i.e. *+OSAS.)

Figure 4.10 shows that the running time increases approximately linearly with the total number of data items in a data set. The running times of the original algorithm (OD+OSAS) on the four datasets are around 40, 100, 200, and 1000 microseconds respectively.

Figure 4.10. The comparison of performance time of four of the algorithms (OD+OSAS, OD+MSABT, OB+OSAS, OB+MSABT).

The running time of OD+MSABT is only a little higher (30% ~ 40%) than that of the original squarified algorithm. This is as expected: MSABT tries the layout along the longer side in addition to the layout along the shorter side, and thus costs more time.
The threshold of aspect ratio and the threshold of look-ahead limit the conditions under which MSABT tries the layout along the longer side. Besides improving the aspect ratio, they also reduce the computation time, since they eliminate unnecessary trial layouts on certain rows. If the threshold conditions fail on every row, MSABT never tries the layout along the longer side, and its running time is the same as that of the original algorithm. If both threshold conditions are met on every row, the computation time is approximately twice that of the original algorithm. Thus the running time of MSABT is at least that of OSAS but less than about twice that of the original algorithm.

We can see that the running time of OB+OSAS is approximately twice that of the original squarified algorithm. Since it always tries both the decreasing and the increasing ordering on every row, without any thresholds, and then selects the better ordering for the current row, it does the equivalent of twice the work of the original algorithm. However, ordering the data elements in the best of decreasing and increasing order does not change the time complexity of the algorithm; it only increases the run-time by approximately a factor of 2.

As can be seen, the running time of OB+MSABT is around 2~3 times that of the original squarified algorithm. Alternatively, we can think of it as a little more (30% ~ 40%) than that of OB+OSAS, in the same way that the run-time of OD+MSABT is a little more (30% ~ 40%) than that of OD+OSAS. We can also think of it as approximately twice that of OD+MSABT, since it tries both orderings and does the equivalent of twice the work of MSABT. Considering that it achieves the largest improvement of aspect ratio (almost 50% more improvement than OD+MSABT), twice the run-time of OD+MSABT is still efficient and worth the user's time.
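As a rough consistency check, the relative run-times quoted above can be chained together. Here $T$ denotes the run-time of the original algorithm (OD+OSAS); the symbol is introduced only for this illustration, and the factors are the approximate values reported in the text, not exact measurements.

```latex
\begin{align*}
T_{\text{OD+MSABT}} &\approx (1.3 \text{ to } 1.4)\,T, \qquad T \le T_{\text{OD+MSABT}} \le 2T,\\
T_{\text{OB+OSAS}}  &\approx 2\,T,\\
T_{\text{OB+MSABT}} &\approx 2\,T_{\text{OD+MSABT}} \approx (1.3 \text{ to } 1.4)\,T_{\text{OB+OSAS}} \approx (2.6 \text{ to } 2.8)\,T.
\end{align*}
```

For the 8×3 dataset (512 elements), where $T$ is around 1000 microseconds, this estimate puts OB+MSABT at roughly 2600 to 2800 microseconds, which agrees with the observed factor of 2~3.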
Moreover, even for the largest dataset with 512 elements (8×3), the difference is within a few milliseconds, which is only a small fraction of the time needed for graphics rendering and other common overhead such as data generation and data access.

5. Conclusion

A treemap is a 2D space-filling approach to representing hierarchical structure. The first treemap method (slice-and-dice) has the problem of producing rectangles of high aspect ratio. The squarified algorithm is an extension of the treemap method that focuses on lowering the aspect ratio. The pivot treemap and the strip treemap try to find a balance among aspect ratio, order, and dynamic stability. This project aims to obtain a smaller aspect ratio and, with this objective, proposes several modified algorithms based on the squarified algorithm.

The problem of subdividing a rectangle so as to achieve the minimum overall aspect ratio of the treemap is NP-hard [2], i.e. no algorithm is known that finds the optimal solution in polynomial time. Therefore, it is not practical to search the problem space exhaustively for the optimal solution. What we are looking for is a heuristic method that achieves a near-optimal solution in a reasonable time.

In this project, two proposals were analyzed: changing the side along which the subrectangles are laid out, and changing the ordering of the data elements. A dozen modified squarified algorithms were proposed to improve the aspect ratio.

The modified algorithm that lays out along the longer side (MSAL) is found to be similar to the strip algorithm; that is, it does not improve the aspect ratio over the original squarified algorithm. The modified algorithm that lays out along the best of the shorter and longer side (MSAB) produces a significant percentage of treemaps with improved aspect ratio over the original squarified algorithm (OSAS), although on average the aspect ratio is still not improved.
To improve this idea further, we analyzed the properties of this modified algorithm and introduced the threshold of aspect ratio and the threshold of look-ahead to remedy its shortcomings. The further modified squarified algorithm (MSABT) lays out along the best of the longer and shorter side with thresholds: it selects the better of the two layouts when the aspect ratio of the initial rectangle for the current row and the number of elements left meet the threshold conditions. Tests show that, compared to the original squarified algorithm, the modified squarified algorithm with thresholds significantly improves the average aspect ratio (by more than 10% on all datasets) while spending a little more time. Our results also show that using both thresholds achieves a larger improvement than using either threshold alone.

The modified squarified algorithms that order the data elements in increasing order (OI+*) do not improve the aspect ratio over treemaps produced by ordering data in decreasing order. The modified squarified algorithms that use random ordering of the data elements (OR+*) are much worse than those that order data in either decreasing or increasing order. However, the modified squarified algorithms that order data in the best of decreasing and increasing order (OB+*) further improve the aspect ratio compared to the 1st group of algorithms, which order data in decreasing order (OD+*). Specifically, two new modified algorithms (OB+OSAS, OB+MSABT) both improve the aspect ratio compared to the original algorithm. The improvement by OB+OSAS is quite significant (3% ~ 10%), almost half of that by OD+MSABT. The percentage of treemaps with improved aspect ratio is also large (50% ~ 97%). The improvement by OB+MSABT is the largest (16% ~ 18%) among all the variants we tested, significantly larger than the improvement (around 10%) by OD+MSABT. The percentage of treemaps with improved aspect ratio is also the largest (80% ~ 100%). We can think of the improvement by OB+MSABT as the superposition of the improvements by OB+OSAS and by OD+MSABT.

The idea of using the best of the increasing and decreasing orderings exclusively (ExOB+*), that is, choosing one ordering for the whole treemap, was dropped, since its improvement of the aspect ratio is much smaller and less efficient than that of choosing between the two orderings nonexclusively on every row (OB+*).

Our contribution is to propose various ideas for modifying the original squarified algorithm to improve the aspect ratio, and to compare the variants of the modified algorithm thoroughly with the original algorithm through analysis and statistical experiments. We further improved the modified squarified algorithm that lays out along the best of the shorter and longer side by introducing two thresholds, on aspect ratio and on look-ahead. This results in a relatively large improvement of the aspect ratio at the cost of a relatively small amount of time. We also achieved a relatively significant improvement of the aspect ratio by ordering the data elements in the best of decreasing and increasing order. Finally, we obtained the best modified squarified algorithm (OB+MSABT) by combining the two ideas: ordering data in the best of decreasing and increasing order, and laying out along the best of the shorter and longer side with thresholds. Both methods (best ordering, best side) can be regarded as heuristics that indicate which ordering of data elements and which layout of rectangles to proceed with. To facilitate our comparison and experiments, we also devised a classification of all twelve variants of the modified squarified algorithm by layout side and data ordering.
Further improvement on top of OB+MSABT, which orders the data elements in the best of decreasing and increasing order and lays out along the best of the shorter and longer side with thresholds, would not be time-efficient. Searching for the best among random orderings of the data elements may improve the aspect ratio. However, the average aspect ratio of OB+MSABT is already very close to 1, and choosing between the decreasing and increasing orderings is the most efficient way to find a good ordering in a reasonably short running time. Thus further adjusting the ordering of the data elements is not very promising. Using OB+MSABT in conjunction with the original squarified algorithm (OD+OSAS) would not have much effect either, since the percentage of cases in which OB+MSABT achieves an improved aspect ratio is very near 100%.

Future work may include implementing treemap application software that integrates practical features, such as representing multi-level directories and files to improve the utilization and management of a hard drive, and building a friendlier user interface for experiments to make studying treemap algorithms more convenient. Future work may also include comparison with other treemap algorithms, such as the ordered treemap, pivot treemap, and strip treemap [1,6], on the metrics of aspect ratio, order, and dynamic stability.

The treemap has great application potential. Depending on the characteristics of practical problems and on user needs, treemaps can be customized in different ways to fit real-life problems. Treemaps are most useful when combined with specific problems. Quantum treemaps and PhotoMesa [1] are good examples of new directions in specialized treemap algorithms and their practical application.

Appendix Algorithms.

A. The original squarified algorithm (OSAS) in section 2.2

The original squarified algorithm was given in [2].
We include it here for easier reference. The list notation (++ is concatenation of lists, [x] is the list containing element x, and [] is the empty list) and the functions worst(), layoutrow() and width() are as described in [2].

procedure squarify(list of real children, list of real row, real w)
begin
    real c = head(children);
    if worst(row, w) >= worst(row++[c], w) then
        squarify(tail(children), row++[c], w);
    else
        layoutrow(row);
        squarify(children, [], width());
    fi
end

In order to adapt to the modification (MSAB) in appendix C, we have implemented an equivalent representation of the original squarified algorithm by replacing the recursion of squarify() in the first if branch with a while loop:

procedure squarify(list of real children, list of real row, real w)
begin
    real c = head(children);
    while worst(row, w) >= worst(row++[c], w) do
        children = tail(children);
        row = row++[c];
        c = head(children);
    od
    layoutrow(row);
    squarify(children, [], width());
end

B. The modified algorithm (MSAL) in example 1 in section 3.

The original algorithm can be modified as described in example 1 in section 3. The function width() in the original algorithm gives the length of the shorter side of the remaining subrectangle in which the current row is placed. Here we replace it by another function, length(), that gives the length of the longer side of the remaining subrectangle in which the current row is placed. We also need to replace the function layoutrow(), which adds a new row of children to the rectangle along the shorter side, by another function, layoutrowL(), which adds a new row of children to the rectangle along the longer side.
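The row-building test is the same in squarify() and in the longer-side variant described here; only the side length passed in differs. The following Python fragment is a minimal sketch of that test, not the Java implementation used in the experiments. The function worst() follows the formula given in [2]; the helper name build_row, the guard against an exhausted list, and the convention that an empty row has an infinite worst ratio (so that the first element is always accepted) are assumptions added for illustration.

```python
import math


def worst(row, side):
    """Highest aspect ratio among the rectangles of a row placed against
    a side of the given length, using the formula of [2]."""
    if not row:
        return math.inf  # assumption: an empty row always accepts its first element
    s = sum(row)
    return max(side * side * max(row) / (s * s),
               (s * s) / (side * side * min(row)))


def build_row(children, side):
    """Greedily move elements from children into the current row while the
    worst aspect ratio of the row does not get worse (the while loop above)."""
    row = []
    rest = list(children)
    # 'rest and ...' stops cleanly when the input runs out (illustrative guard).
    while rest and worst(row, side) >= worst(row + [rest[0]], side):
        row.append(rest.pop(0))
    return row, rest


# Example from the introduction: areas 6, 6, 4, 3, 2, 2, 1 in a 6 x 4 rectangle.
# Passing the shorter side (4) corresponds to OSAS, the longer side (6) to MSAL.
first_row, remaining = build_row([6, 6, 4, 3, 2, 2, 1], 4)
print(first_row, remaining)  # [6, 6] [4, 3, 2, 2, 1]
```

The sketch stops after one row; the recursive call with the updated side length, and the actual placement done by layoutrow() or layoutrowL(), are left out. The pseudocode of the longer-side variant squarifyL() follows.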
procedure squarifyL(list of real children, list of real row, real l)
begin
    real c = head(children);
    if worst(row, l) >= worst(row++[c], l) then
        squarifyL(tail(children), row++[c], l);
    else
        layoutrowL(row);
        squarifyL(children, [], length());
    fi
end

Similarly, to make it easier to accommodate the modification (MSAB) in example 2 in section 3, we also show here an equivalent representation of this modified squarified algorithm, obtained by replacing the recursion of squarifyL() in the first if branch with a while loop:

procedure squarifyL(list of real children, list of real row, real l)
begin
    real c = head(children);
    while worst(row, l) >= worst(row++[c], l) do
        children = tail(children);
        row = row++[c];
        c = head(children);
    od
    layoutrowL(row);
    squarifyL(children, [], length());
end

C. The modified algorithm (MSAB) in example 2 in section 3.

We give in detail the modified algorithm described in example 2 in section 3. In every subdivision [2] of the layout of the rectangles, we try two layout schemes and adopt the better layout of the two.
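To make the comparison of the two schemes concrete, one subdivision step can be sketched in Python as follows. This is an illustrative sketch under stated assumptions rather than the implementation used in the experiments: worst() follows the formula of [2], while build_row and best_side_row are hypothetical helper names, an empty row is assumed to have an infinite worst ratio, and ties go to the longer side only because the comparison is written with >= as in the pseudocode.

```python
import math


def worst(row, side):
    """Highest aspect ratio in a row placed against a side of the given length [2]."""
    if not row:
        return math.inf  # assumption: the first element is always accepted
    s = sum(row)
    return max(side**2 * max(row) / s**2, s**2 / (side**2 * min(row)))


def build_row(children, side):
    """Build one candidate row greedily against the given side."""
    row, rest = [], list(children)
    while rest and worst(row, side) >= worst(row + [rest[0]], side):
        row.append(rest.pop(0))
    return row, rest


def best_side_row(children, w, l):
    """Try the shorter side (w) and the longer side (l) from the same
    starting state and keep the candidate row with the lower worst ratio."""
    row_w, rest_w = build_row(children, w)
    row_l, rest_l = build_row(children, l)
    if worst(row_w, w) >= worst(row_l, l):
        return "long", row_l, rest_l
    return "short", row_w, rest_w


# Remaining 3 x 4 rectangle with areas 4, 3, 2, 2, 1 still to place.
side, row, rest = best_side_row([4, 3, 2, 2, 1], 3, 4)
print(side, row, rest)
```

Both candidate rows start from the same list of children, so the two schemes share a common initial state on every row, and only the winning row is committed before the next subdivision.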
The full process of the algorithm in pseudocode is as follows:

procedure squarifyBest(list of real children, real w, real l)
begin
    list of real childrenL = clone(children); //duplicate list of children
    list of real rowW = [], rowL = [];
    //try layout along shorter side in current subdivision
    real c = head(children);
    while worst(rowW, w) >= worst(rowW++[c], w) do
        children = tail(children);
        rowW = rowW++[c];
        c = head(children);
    od
    //try layout along longer side in current subdivision
    c = head(childrenL);
    while worst(rowL, l) >= worst(rowL++[c], l) do
        childrenL = tail(childrenL);
        rowL = rowL++[c];
        c = head(childrenL);
    od
    //select best side of layout to actually lay out the row in current subdivision
    if worst(rowW, w) >= worst(rowL, l) then
        //actually lay out along longer side in current subdivision
        layoutrowL(rowL);
        children = childrenL;
    else
        //actually lay out along shorter side in current subdivision
        layoutrowW(rowW);
    fi
    //recursion into next subdivision
    squarifyBest(children, width(), length());
end

where the functions width() and length() give the lengths of the shorter and longer sides of the remaining subrectangle in which the current row is placed. The functions layoutrowW() and layoutrowL() add a new row of children to the rectangle along the shorter and longer side respectively.

Implementation details

As illustrated in Figure 3.3, we apply the algorithm to a simple input: a one-level tree containing 8 items. On the left of each row are the layout steps that keep rectangles along the shorter side of the current row (this 1st layout scheme is shown in blue). On the right of each row are the layout steps that keep rectangles along the longer side (this 2nd layout scheme is shown in a different color, red, to distinguish it from the 1st layout scheme). It must be emphasized that on every row the 2 layout schemes in Figure 3.3 start from a common initial state.
Then, on every row, the algorithm tries both layout schemes step by step before repeating the process on the next row. On each row the 2 layout schemes produce 2 results, and the better one is selected as the common initial state of the next row for both layout schemes. Therefore the 2 layout schemes in Figure 3.3 are not 2 completely independent layouts (such as 2 of the monolithic multi-row blocks of Figure 3.1) simply put side by side. Indeed, the 2 layout schemes in Figure 3.3 depend on each other on every row.

A few more minor details of Figure 3.3 are explained below. In the 1st row we only needed to try 1 layout scheme (to avoid unnecessary computation time), since the 2 sides of the initial rectangle (a square) are the same. In the 3rd row, the 2nd layout scheme completes the layout of all the remaining elements of the input. Moreover, the last rectangle added in the 2nd layout scheme actually lowers the worst aspect ratio of all rectangles added on this row. Indeed, after the layout of the last rectangle, the 2nd layout scheme has a lower worst aspect ratio than the 1st layout scheme. Therefore the algorithm selects the result of the 2nd layout scheme on the 3rd row, and the process ends on the 3rd row.

D. The algorithm (OI+OSAS) in example 3 in section 3.

The original algorithm can be modified as described in example 3 in section 3 by sorting the list of data elements (children) in increasing order before applying the same recursion of the original squarified algorithm squarify() as presented in appendix A. Here the function sort(list of real children, real dataordering) sorts the list of children in the specified dataordering ('increasing' here).
procedure squarifyOI(list of real children)
begin
    //ordering data elements in increasing order
    sort(children, increasing);
    //apply recursion of the original squarified algorithm
    squarify(children, [], width());
end

E. The algorithm (OR+OSAS) in example 4 in section 3.

The original algorithm can be modified as described in example 4 in section 3 by scrambling the ordering of the list of data elements (children) if the data ordering is not already random (or doing nothing if the initial ordering is already random) before applying the same recursion of the original squarified algorithm squarify() as presented in appendix A.

procedure squarifyOR(list of real children)
begin
    //ordering data elements in random order:
    //scramble if the data ordering is not random,
    //or do nothing if the initial ordering is already random
    scramble(children);
    //apply recursion of the original squarified algorithm
    squarify(children, [], width());
end

F. The modified algorithm (OB+OSAS) in example 5 in section 3.

We give in detail the modified algorithm described in example 5 in section 3. In every subdivision [2] of processing the data elements, we try two ordering schemes and adopt the better of the decreasing and increasing orderings.
The process of the algorithm is as follows:

procedure squarifyBestOrdering(list of real childrenD, real w)
begin
    list of real childrenI = cloneReverse(childrenD); //increasing order children
    list of real rowD = [], rowI = [];
    //try processing in decreasing order the items in current subdivision
    real c = head(childrenD);
    while worst(rowD, w) >= worst(rowD++[c], w) do
        childrenD = tail(childrenD);
        rowD = rowD++[c];
        c = head(childrenD);
    od
    //try processing in increasing order the items in current subdivision
    c = head(childrenI);
    while worst(rowI, w) >= worst(rowI++[c], w) do
        childrenI = tail(childrenI);
        rowI = rowI++[c];
        c = head(childrenI);
    od
    //select best ordering of data to actually process in current subdivision
    if worst(rowD, w) >= worst(rowI, w) then
        //actually process in increasing order in current subdivision
        layoutrow(rowI);
        childrenD = cloneReverse(childrenI);
    else
        //actually process in decreasing order in current subdivision
        layoutrow(rowD);
    fi
    //recursion into next subdivision
    squarifyBestOrdering(childrenD, width());
end

where the function cloneReverse() duplicates the list of children in reverse order, i.e. from decreasing to increasing order or vice versa. The functions width() and layoutrow() are as described in [2].

G. The modified squarified algorithm (MSABT) in section 4.1.1.

The modified squarified algorithm that lays out along the best of the longer and shorter sides with thresholds, described in section 4.1.1, is given below. If the aspect ratio of the work region is greater than the threshold AR, or if the number of elements left is less than the threshold N, it skips trying the layout along the longer side and simply lays out along the shorter side.
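The threshold test on its own can be written as a small predicate. The Python fragment below is only a sketch: the function and parameter names are hypothetical, the aspect ratio of the work region is assumed to be max(w/l, l/w) of the remaining rectangle, and the threshold values in the example calls are placeholders, not the values chosen in section 4.1.1.

```python
def should_try_longer_side(w, l, remaining, ar_threshold, n_threshold):
    """Return True when the trial layout along the longer side is worth running.

    w, l         -- shorter and longer side of the remaining rectangle
    remaining    -- number of data elements not yet laid out
    ar_threshold -- threshold AR on the aspect ratio of the work region
    n_threshold  -- threshold N on the number of elements left (look-ahead)
    """
    aspect_ratio = max(w / l, l / w)  # assumed definition of aspectRatio()
    return aspect_ratio < ar_threshold and remaining > n_threshold


# Placeholder thresholds, chosen only to exercise both outcomes.
print(should_try_longer_side(3, 4, 5, ar_threshold=2.0, n_threshold=3))
print(should_try_longer_side(1, 4, 5, ar_threshold=2.0, n_threshold=3))
```

When the predicate is false, only the shorter-side row is built, which is why the run-time falls back to that of the original algorithm on such rows. The complete procedure with this test in place is given next in pseudocode.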
procedure squarifyBest(list of real children, real w, real l)
begin
    list of real childrenL = clone(children); //duplicate list of children
    list of real rowW = [], rowL = [];
    //try layout along shorter side in current subdivision
    real c = head(children);
    while worst(rowW, w) >= worst(rowW++[c], w) do
        children = tail(children);
        rowW = rowW++[c];
        c = head(children);
    od
    //check the threshold condition to decide whether to try layout along longer side
    if aspectRatio() < AR && number(childrenL) > N then
        //try layout along longer side in current subdivision
        c = head(childrenL);
        while worst(rowL, l) >= worst(rowL++[c], l) do
            childrenL = tail(childrenL);
            rowL = rowL++[c];
            c = head(childrenL);
        od
        //select best side of layout to actually lay out the row in current subdivision
        if worst(rowW, w) >= worst(rowL, l) then
            //actually lay out along longer side in current subdivision
            layoutrowL(rowL);
            children = childrenL;
        else
            //actually lay out along shorter side in current subdivision
            layoutrowW(rowW);
        fi
    else
        //actually lay out along shorter side in current subdivision
        layoutrowW(rowW);
    fi
    //recursion into next subdivision
    squarifyBest(children, width(), length());
end

H. The first 3 groups of algorithms (OD+*, OI+*, OR+*) in section 4.2.

The first 3 groups of variants of the squarified algorithm, which order data in decreasing, increasing and random order and lay out along the various sides as described in section 4.2, can be presented in the unified form below. The constants 'short', 'long', and 'best' (defined for the variable 'layoutside') and the constants 'decreasing', 'increasing', and 'random' (defined for the variable 'dataordering') represent laying out the subrectangles along the specified side and ordering the data elements in the specified order. The functions squarify(), squarifyL() and squarifyBest() are as presented in appendices A, B, and C. The functions sort() and scramble() are as presented in appendices D and E.
procedure squarifyVariants(list of real children, real layoutside, real dataordering)
begin
    if dataordering == decreasing then
        sort(children, decreasing);
    else if dataordering == increasing then
        sort(children, increasing);
    else
        //dataordering == random:
        //scramble if the data ordering is not random,
        //or do nothing if the initial ordering is already random
        scramble(children);
    fi
    if layoutside == short then
        squarify(children, [], width());
    else if layoutside == long then
        squarifyL(children, [], length());
    else if layoutside == best then
        squarifyBest(children, width(), length());
    fi
end

I. The modified algorithm (OB+MSAL) in section 4.2.

We give in detail the modified algorithm (OB+MSAL) described in section 4.2. In every subdivision [2] of processing the data elements, we try two ordering schemes and adopt the better of the decreasing and increasing orderings. The difference from OB+OSAS in example 5 in section 3 and appendix F is that the rectangles are laid out along the longer side. The process of the algorithm is as follows:

procedure squarifyLBestOrdering(list of real childrenD, real l)
begin
    list of real childrenI = cloneReverse(childrenD); //increasing order children
    list of real rowD = [], rowI = [];
    //try processing the items in decreasing order in current subdivision
    real c = head(childrenD);
    while worst(rowD, l) >= worst(rowD++[c], l) do
        childrenD = tail(childrenD);
        rowD = rowD++[c];
        c = head(childrenD);
    od
    //try processing the items in increasing order in current subdivision
    c = head(childrenI);
    while worst(rowI, l) >= worst(rowI++[c], l) do
        childrenI = tail(childrenI);
        rowI = rowI++[c];
        c = head(childrenI);
    od
    //select best ordering of data to actually process in current subdivision
    if worst(rowD, l) >= worst(rowI, l) then
        //actually process in increasing order in current subdivision
        layoutrowL(rowI);
        childrenD = cloneReverse(childrenI);
    else
        //actually process in decreasing order in current subdivision
        layoutrowL(rowD);
    fi
    //recursion into next subdivision
    squarifyLBestOrdering(childrenD, length());
end

where the function cloneReverse() duplicates the list of children in reverse order, i.e. from decreasing to increasing order or vice versa. The function length() gives the length of the longer side of the remaining subrectangle in which the current row is placed, and layoutrowL() adds a new row of children to the subdivision along the longer side, as described in appendix B.

J. The modified algorithm (OB+MSABT) in section 4.2.

We give in detail the modified algorithm (OB+MSABT) described in section 4.2. In every subdivision [2] of processing the data elements, we try the four combinations of the two ordering schemes (ordering data in decreasing and increasing order) and the two layout side schemes (laying out the rectangles along the shorter and longer side), and adopt the best combination of ordering and layout side. The process of the algorithm is as follows. The function cloneReverse() duplicates the list of children in reverse order, i.e. from decreasing to increasing order or vice versa. The functions width() and layoutrow() are as described in [2].
procedure squarifyBestSideBestOrdering(list of real childrenWD, real w, real l)
begin
    list of real childrenWI = cloneReverse(childrenWD); //increasing order children
    list of real childrenLD = clone(childrenWD); //decreasing order children
    list of real childrenLI = cloneReverse(childrenWD); //increasing order children
    list of real rowWD = [], rowWI = [], rowLD = [], rowLI = [];
    //try layout along shorter side and in decreasing order in current subdivision
    real c = head(childrenWD);
    while worst(rowWD, w) >= worst(rowWD++[c], w) do
        childrenWD = tail(childrenWD);
        rowWD = rowWD++[c];
        c = head(childrenWD);
    od
    //try layout along shorter side and in increasing order in current subdivision
    c = head(childrenWI);
    while worst(rowWI, w) >= worst(rowWI++[c], w) do
        childrenWI = tail(childrenWI);
        rowWI = rowWI++[c];
        c = head(childrenWI);
    od
    //check the threshold condition to decide whether to try layout along longer side
    if aspectRatio() < AR && number(childrenLD) > N then
        //try layout along longer side and in decreasing order in current subdivision
        c = head(childrenLD);
        while worst(rowLD, l) >= worst(rowLD++[c], l) do
            childrenLD = tail(childrenLD);
            rowLD = rowLD++[c];
            c = head(childrenLD);
        od
        //try layout along longer side and in increasing order in current subdivision
        c = head(childrenLI);
        while worst(rowLI, l) >= worst(rowLI++[c], l) do
            childrenLI = tail(childrenLI);
            rowLI = rowLI++[c];
            c = head(childrenLI);
        od
        //select best side of layout and best ordering of data to actually
        //lay out the row in current subdivision
        if worst(rowWD, w) <= worst(rowWI, w) && worst(rowWD, w) <= worst(rowLD, l)
                && worst(rowWD, w) <= worst(rowLI, l) then
            //actually lay out along shorter side and in decreasing order
            layoutrowW(rowWD);
        elseif worst(rowWI, w) <= worst(rowWD, w) && worst(rowWI, w) <= worst(rowLD, l)
                && worst(rowWI, w) <= worst(rowLI, l) then
            //actually lay out along shorter side and in increasing order
            layoutrowW(rowWI);
            childrenWD = cloneReverse(childrenWI);
        elseif worst(rowLD, l) <= worst(rowWD, w) && worst(rowLD, l) <= worst(rowWI, w)
                && worst(rowLD, l) <= worst(rowLI, l) then
            //actually lay out along longer side and in decreasing order
            layoutrowL(rowLD);
            childrenWD = clone(childrenLD);
        elseif worst(rowLI, l) <= worst(rowWD, w) && worst(rowLI, l) <= worst(rowWI, w)
                && worst(rowLI, l) <= worst(rowLD, l) then
            //actually lay out along longer side and in increasing order
            layoutrowL(rowLI);
            childrenWD = cloneReverse(childrenLI);
        fi
    else
        //select shorter side of layout and best ordering of data to actually
        //lay out the row in current subdivision
        if worst(rowWD, w) <= worst(rowWI, w) then
            //actually lay out along shorter side and in decreasing order
            layoutrowW(rowWD);
        elseif worst(rowWI, w) <= worst(rowWD, w) then
            //actually lay out along shorter side and in increasing order
            layoutrowW(rowWI);
            childrenWD = cloneReverse(childrenWI);
        fi
    fi
    //recursion into next subdivision
    squarifyBestSideBestOrdering(childrenWD, width(), length());
end

K. The algorithm (ExOB+OSAS) in section 4.2.1.

The algorithm that uses the best of the decreasing and increasing orderings of data elements exclusively, described in section 4.2.1, is as follows, where the function squarifyVariants() is as presented in appendix H. The global variable 'finalaverageaspectratio' stores the final average aspect ratio of the treemap produced by a specified variant of the squarified algorithm.
procedure squarifyExBestOrdering(list of real children, real layoutside)
begin
    //run the algorithm variant by ordering data elements in decreasing order
    squarifyVariants(children, layoutside, decreasing);
    //record the final average aspect ratio of the produced treemap
    finalaverageaspectratioOD = finalaverageaspectratio;
    //run the algorithm variant by ordering data elements in increasing order
    squarifyVariants(children, layoutside, increasing);
    //record the final average aspect ratio of the produced treemap
    finalaverageaspectratioOI = finalaverageaspectratio;
    //compare the two aspect ratios and select the smaller one as the result
    if finalaverageaspectratioOD < finalaverageaspectratioOI then
        finalaverageaspectratio = finalaverageaspectratioOD;
    else
        finalaverageaspectratio = finalaverageaspectratioOI;
    fi
end

7. References

[1] Bederson, B. B. and Shneiderman, B. Ordered and quantum treemaps: making effective use of 2d space to display hierarchies. ACM Transactions on Graphics, 2002, 21(4):833-854.
[2] Bruls, M., Huizing, K. and Van Wijk, J. J. Squarified treemaps. In: Eurographics / IEEE TVCG 2000 Symposium, 2000, 112-116.
[3] Knuth, D. E. The Art of Computer Programming: Volume 1 / Fundamental Algorithms. Addison-Wesley Publishing Co., Reading, MA, 1968.
[4] Sheldon, R. A. A First Course in Probability. Prentice Hall, Englewood Cliffs, NJ, 1997.
[5] Shneiderman, B. Tree visualization with tree-maps: 2-d space filling approach. ACM Transactions on Graphics, 1992, 11(1):92-99.
[6] Shneiderman, B. and Wattenberg, M. Ordered treemap layouts. In IEEE Symposium on Information Visualization, 2001, 73-78.
[7] Van Wijk, J. J. and Van de Wetering, H. Cushion treemaps: Visualization of hierarchical information. In IEEE Symposium on Information Visualization, San Francisco, 1999, 78-83.