Introduction

Summary


Data in real world graph drawing applications often change frequently but incrementally. Any drastic change in the graph layout can disrupt a user's "mental map." Furthermore, real world applications such as enterprise process or e-commerce graphing, where data grows rapidly, demand fast rendering of the graph layout in multi-user environments and in real time. Most standard static graph drawing algorithms apply global changes and redraw the entire graph layout whenever the data changes. The new layout may differ greatly from the previous one, and the time taken to redraw the entire graph grows quickly with the amount of graph data. The dynamic behavior and data volume of real world applications thus pose challenges for existing graph drawing algorithms in terms of incremental stability and scalability.

Dynamic graph drawing algorithms have been proposed to accommodate the dynamic behavior of real world applications. These algorithms also define several dynamic aesthetic criteria on graph layouts. When followed, these criteria improve the incremental stability of the layout, but they also complicate crossing reduction by imposing additional constraints on the layout. There has been little research on the constrained crossing reduction problem. Thus, minimizing the number of crossings while adhering to a set of dynamic aesthetic criteria in dynamic graph layout algorithms still needs further research.

The research of this dissertation will develop a heuristic for solving the constrained one-sided crossing reduction problem based on the work by Forster (2004). The goal of the heuristic is to find a balance between the aesthetic criteria and minimizing the edge crossings. A modified version of the on-line dynamic graph drawing framework proposed by North and Woodhull (2001) will also be developed to support the experiment.

The remainder of this thesis is organized as follows. Chapter 2 reviews literature in the graph drawing area that has direct or indirect influence on this research. Chapter 3 describes the methodology of the proposed constrained graph drawing and visualization framework that is based on the work of North (1995) and the modified algorithm for the one-sided two-layered crossing reduction problem based on the work of Forster (2004). Chapter 4 presents the research results, and chapter 5 presents the conclusion of the research and potential future direction if applicable.



    Chapter 2

Review of Literature

Introduction


The work of this research is influenced by two areas of graph drawing: (1) general algorithms for drawing hierarchical graph layouts and (2) dynamic graph drawing frameworks. The literature review is therefore divided into two sections: the first reviews the Sugiyama et al. heuristic and the second reviews incremental graph drawing frameworks. The Sugiyama heuristic has four steps, each with its own domain of research; hence the first section is divided into four sub-sections, each reviewing the key literature of one step. At the end of each sub-section and section, a table summarizes the characteristics of the different algorithms.

The Sugiyama Algorithm


A well-known heuristic for drawing standard hierarchical graph layouts is proposed by Sugiyama, Tagawa, and Toda (1981). This algorithm has four steps as follows:

  1. cycle removal

  2. layer assignment

  3. crossing reduction

  4. coordinate assignment

Cycle Removal


This step applies when the input graph has cycles and ensures that the directed graph is acyclic, which is required by the layer assignment step. To make a cyclic directed graph acyclic, a set of edges is reversed temporarily so that all the edges flow in the same direction. The main problem is to choose the smallest possible set of edges. Figure 12 shows two possible sets of edges that can be reversed to make the given directed graph acyclic: set A = {(9, 4), (11, 5)} and set B, which contains all other edges. The optimal solution for the graph in Figure 12 is to reverse set A, which has fewer edges. Figure 13 shows that the directed graph is acyclic after reversing the directions of edges (9, 4) and (11, 5).



Figure 12. A directed graph with cycles



Figure 13. An acyclic directed graph after reversing the set of edges {(9, 4), (11, 5)}

A set of reversed edges in a directed graph is called a feedback set. This problem relates to the well-known feedback arc set problem, in which a set of edges is removed to make the directed graph acyclic (Battista et al., 1999). Although the feedback set reverses a set of edges while the feedback arc set removes them, both share the goal of finding the minimum such set. Hence, algorithms and heuristics for the feedback arc set problem can be applied to the feedback set problem. Unfortunately, finding a minimum set of edges is NP-complete (Garey & Johnson, 1979), and the common technique for this type of problem is to use approximation algorithms. There are three well-known algorithmic approaches to finding approximate solutions: local search, greedy algorithms, and random cuts.

A simple heuristic is to choose an arbitrary ordering and then reverse the edges that create cycles using either breadth-first search (BFS) or depth-first search (DFS). This heuristic is simple to implement but offers no performance guarantee and may yield poor results (Stedile, 2001).

A well-known heuristic for solving the feedback set problem is Greedy-Cycle-Removal, introduced by Eades, Lin, and Smyth (1993). Unlike previous research on the feedback arc set problem, which could provide an optimal solution but with a runtime of O(|V| * |E|), Greedy-Cycle-Removal (GR) simply finds a "good" vertex sequence with a small feedback set by repeatedly removing sinks and sources and, when neither exists, removing the vertex whose out-degree exceeds its in-degree by the largest amount. GR runs in linear time and space: formally, its runtime is O(|V| + |E|), where V is the set of vertices and E is the set of edges.
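The Greedy-Cycle-Removal procedure can be sketched as follows. This is a minimal illustration, not the authors' implementation: vertices and edges are plain Python values, edges are (tail, head) pairs, and the naive scans make this version quadratic rather than the published linear time.

```python
from collections import defaultdict

def greedy_cycle_removal(vertices, edges):
    """Sketch of Greedy-Cycle-Removal: peel off sinks (appended to the
    right end of a vertex sequence) and sources (appended to the left
    end); when neither exists, remove the vertex whose out-degree
    exceeds its in-degree by the most. Edges pointing leftward in the
    final sequence form the feedback set to reverse."""
    out_adj, in_adj = defaultdict(set), defaultdict(set)
    for u, v in edges:
        if u != v:                       # ignore self-loops
            out_adj[u].add(v)
            in_adj[v].add(u)
    remaining = set(vertices)
    left, right = [], []

    def find(pred):
        return next((v for v in remaining if pred(v)), None)

    while remaining:
        # peel current sinks (no outgoing edge inside `remaining`)
        while (v := find(lambda x: not (out_adj[x] & remaining))) is not None:
            remaining.discard(v)
            right.append(v)
        # peel current sources (no incoming edge inside `remaining`)
        while (v := find(lambda x: not (in_adj[x] & remaining))) is not None:
            remaining.discard(v)
            left.append(v)
        if remaining:
            # tie-break: largest out-degree minus in-degree
            v = max(remaining, key=lambda x: len(out_adj[x] & remaining)
                                           - len(in_adj[x] & remaining))
            remaining.discard(v)
            left.append(v)
    order = {v: i for i, v in enumerate(left + right[::-1])}
    return {(u, v) for u, v in edges if order[u] > order[v]}
```

For a simple 3-cycle the returned feedback set contains exactly one edge, reversing which makes the graph acyclic.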

Demetrescu and Finocchi (2003) present an approximation algorithm based on the Local-Ratio technique, which provides approximation algorithms for covering problems. The algorithm consists of two phases. The first phase searches for simple cycles C in the directed graph. If such a cycle exists, the algorithm identifies the minimum edge weight in C, reduces the weight of every edge in C by that amount, and removes the edges whose weight becomes zero. When no more cycles are found, the first phase terminates. The second phase adds some of the deleted edges back to the graph without re-creating cycles. The approximation ratio of the algorithm is bounded by the length of the longest simple cycle of the directed graph. However, the algorithm's worst-case runtime is O(|V| * |E|) on a directed graph with |V| vertices and |E| edges.

One of the aesthetic criteria in a layered graph is consistency: the edges in the layered graph always point from layer Li to layer Li+1. Utilizing this property together with the information captured in the proposed incremental graph drawing system, the set of reversed edges affected by an update operation can be determined automatically once the initial graph is stored in the proposed graph data structure. For the initial input graph data, the Greedy-Cycle-Removal algorithm will be employed to reverse the cycles temporarily. A summary of the aforementioned algorithms is shown in Table 1.



Table 1. Summary of algorithms for solving the cycle removal step in the Sugiyama heuristic

Name                 | Approach     | Performance  | Note
BFS/DFS              | Random       | No guarantee | Result may be poor
Greedy-Cycle-Removal | Greedy       | O(|V| + |E|) | Result is good and the runtime is linear
Penalty-graph        | Local search | O(|V| * |E|) | Approximation ratio ~ the longest simple cycle; the runtime is not linear



Layer Assignment


The layer assignment step turns a given acyclic directed graph into a layered graph layout by assigning vertices to layers. Due to the limitations of computer screen real estate, the goal of this step is not only to assign vertices to layers but also to ensure that the final layout has as little width and height as possible. In other words, the layer assignment problem is a bi-objective optimization problem. Unfortunately, minimizing both the width and the height of the graph layout is NP-complete (Battista et al., 1999). Thus, heuristics for the layer assignment problem aim at either reducing width or reducing height.

A simple algorithm called Longest Path Layering runs in linear time and produces a layout with minimum height. The heuristic comprises two steps: (1) place all the vertices that have no incoming edges in layer L1, and (2) place each remaining vertex v in layer Lp+1, where p is the length of the longest path from the vertices on layer L1 to vertex v. The drawback of this algorithm is that the layout can be very wide.
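The two steps of Longest Path Layering can be sketched as below. The function name and the recursive formulation are illustrative assumptions; layer 1 holds the sources (vertices with no incoming edges) and the input is assumed to be acyclic.

```python
def longest_path_layering(vertices, edges):
    """Sketch of Longest-Path-Layering: sources go to layer 1; every
    other vertex v goes to layer p + 1, where p is the length of the
    longest path from a source to v. Edges are (tail, head) pairs."""
    preds = {v: [] for v in vertices}
    for u, v in edges:
        preds[v].append(u)

    layer = {}

    def assign(v):
        if v not in layer:
            # v sits one layer below its deepest predecessor
            layer[v] = 1 + max((assign(u) for u in preds[v]), default=0)
        return layer[v]

    for v in vertices:
        assign(v)
    return layer
```

On a diamond-shaped DAG the vertex reached by the longest chain ends up on the deepest layer, which is exactly why this layering can become very wide: every source is forced into layer 1.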

Assigning vertices to layers with minimum width also relates to the multiprocessor scheduling problem, and the Coffman-Graham layering algorithm (Coffman & Graham, 1972), originally developed for multiprocessor scheduling, has been applied to it. This algorithm bounds the width of the graph layout by accepting an upper bound value W as input. It assigns vertices to layers in two steps: (1) order the vertices lexicographically, and (2) assign the vertices to layers such that no layer has a width greater than W (Battista et al., 1999). Though the Coffman-Graham algorithm may produce layouts of greater height than the Longest-Path-Layering algorithm, Lam and Sethi (1979) showed that in the worst case the height of the layout produced by the Coffman-Graham algorithm is at most twice the optimal height when w >= 2, as indicated by the inequality h <= (2 - 2/w) * hmin, where w is the width of the layout and hmin is the minimum height.
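A much-simplified sketch in the spirit of the Coffman-Graham algorithm is shown below. The real algorithm orders vertices lexicographically by the labels of their predecessors; here a plain topological order stands in for that step, so this version only illustrates the width-bounded layer filling, numbering layers from the bottom (layer 1 holds the sinks).

```python
from collections import defaultdict, deque

def coffman_graham_layering(vertices, edges, width):
    """Simplified width-bounded layering: process vertices sinks-first
    and put each one in the lowest layer that lies above all of its
    successors and still has room under the width bound."""
    succ = defaultdict(list)
    indeg = defaultdict(int)
    for u, v in edges:
        succ[u].append(v)
        indeg[v] += 1
    # Kahn's algorithm for a topological order (assumes an acyclic input)
    queue = deque(v for v in vertices if indeg[v] == 0)
    topo = []
    while queue:
        u = queue.popleft()
        topo.append(u)
        for v in succ[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    layer, fill = {}, defaultdict(int)
    for v in reversed(topo):                 # sinks first
        k = 1 + max((layer[s] for s in succ[v]), default=0)
        while fill[k] >= width:              # respect the width bound W
            k += 1
        layer[v] = k
        fill[k] += 1
    return layer
```

With width 1, even two independent vertices are forced onto separate layers, illustrating the height-for-width trade-off the paragraph above describes.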

Another aspect of the layering problem is minimizing the number of dummy vertices. The number of dummy vertices created to make a directed graph proper impacts the width of the layout, but most layer assignment algorithms, for instance the Coffman-Graham algorithm (Coffman & Graham, 1972), fail to take dummy vertices into account when computing the width of the layout. The actual width may therefore be larger than expected. However, combining the goals of minimizing the height of the drawing and minimizing the number of dummy vertices is NP-complete (Lin, 1992).

To deal with dummy vertices, Gansner et al. (1993) propose a heuristic for the layer assignment problem that minimizes the number of dummy vertices by translating the problem into an integer linear program and solving it with a network simplex algorithm. Gansner et al. (1993) note that although the time complexity of the network simplex algorithm has not been proven polynomial, in practice it runs very quickly, requiring few iterations.

Battista et al. (1999) note that in real world graph drawings, vertices are not simple points, but are rectangles or other geometric shapes. Thus, the spacing between the vertices horizontally is often larger than the spacing vertically. In other words, minimizing the width is more important than minimizing the height and the Coffman-Graham algorithm is effective for drawings that are drawn from top to bottom. On the other hand, the Longest Path algorithm is more effective for drawings that are drawn from left to right.

Hierarchical graph layouts tend to be drawn from top to bottom. The incremental graph drawing algorithm proposed in this thesis will therefore employ the Coffman-Graham layering algorithm to assign vertices to layers. Table 2 summarizes the characteristics of each algorithm for the layer assignment step.

Table 2. Summary of algorithms for solving the layer assignment step in the Sugiyama heuristic

Name                  | Approach                              | Performance | Note
Longest Path Layering | Produces layout with minimum height   | Linear      | Good for drawings drawn left to right; does not take dummy vertices into account
Coffman-Graham        | Provides upper bound for layout width | Linear      | Good for drawings drawn top to bottom; does not take dummy vertices into account
Gansner et al. (1993) | Produces minimum dummy vertices       | Not linear  | Though its runtime is not linear, it can find a solution in few iterations in real world applications



Crossing Reduction


This step reduces the number of edge crossings on a proper k-layered hierarchical graph layout and improves the readability aesthetic criterion. As mentioned in chapter 1, a proper k-layered hierarchical graph layout is a special k-partite graph where the vertices are assigned into horizontal layers, edges are straight and pointing in the same direction, and no edges span more than one layer.

A well-known heuristic for solving the crossing reduction problem for proper layered hierarchical graph layouts is the layer-by-layer sweep algorithm proposed by Sugiyama et al. (1981). This algorithm can be described as follows:

Let G (V, E) be a proper k-layered hierarchical graph with edges pointing downward. The layer-by-layer sweep algorithm starts from the top layer and sweeps downward through each layer. At each layer, the ordering of one layer is held fixed and the one-sided crossing reduction algorithm is performed to find the minimum number of crossings in each layer. Once the algorithm reaches the bottom layer it sweeps upward layer by layer until it reaches the top layer.  The algorithm continues to sweep downward then upward until the number of crossings stops decreasing.

The proper k-layered hierarchical graph layout can be reduced to a series of two-layered hierarchical graph layouts. It is observed that the number of crossings of a proper k-layered hierarchical graph layout is the sum of the number of crossings of all the two-layered layouts. Hence the crossing reduction problem of a proper k-layered graph can be reduced to a crossing reduction problem for a two-layered graph.

There are two possible approaches to finding the minimum number of crossings for each layer in the layer-by-layer sweep algorithm. One approach, called the two-sided crossing reduction problem, allows an algorithm to permute the vertices on both layers to find the minimum number of crossings. The other approach, called the one-sided crossing reduction problem, holds one layer fixed and permutes the vertices on the other layer. Though in theory the two-sided approach may produce a better result, it is practical only for graphs with few vertices (Junger & Mutzel, 1997), so in practice the one-sided approach is the one employed in the layer-by-layer sweep algorithm. As mentioned in chapter 1, the crossing reduction problem for one-sided two-layered graphs can be defined as follows: given G (L1, L2, E) where an ordering pos2 of layer L2 is fixed, find the ordering pos1 of layer L1 such that the number of crossings is minimum.

A brute force computation for the one-sided crossing reduction problem is to permute all vertices on layer L1 and compute the number of edge crossings while the ordering of vertices on L2 is held fixed. Once the permutation is complete, the ordering of vertices on layer L1 that has the fewest crossings is the optimal solution. Figure 14 shows a two-layered hierarchical graph layout before the one-sided crossing reduction algorithm is performed. There are 6 edge crossings, represented by the gray dots. Figure 15 shows the same two-layered hierarchical graph layout after the one-sided crossing reduction algorithm is performed and the crossing number is reduced to 1.





Figure 14. A two-layered hierarchical graph layout



Figure 15. A two-layered hierarchical graph layout after crossing reduction is performed
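The brute-force search described above can be written directly: count the crossings for every permutation of L1 while L2 stays fixed. The crossing test uses the standard observation that edges (a, x) and (b, y) cross exactly when their endpoints appear in opposite relative orders on the two layers; the function names are illustrative.

```python
from itertools import permutations

def count_crossings(order1, order2, edges):
    """Count pairwise edge crossings in a two-layer drawing, given the
    left-to-right orderings of both layers and edges as (L1, L2) pairs."""
    p1 = {v: i for i, v in enumerate(order1)}
    p2 = {v: i for i, v in enumerate(order2)}
    return sum(1
               for i, (a, x) in enumerate(edges)
               for b, y in edges[i + 1:]
               if (p1[a] - p1[b]) * (p2[x] - p2[y]) < 0)

def brute_force_one_sided(layer1, layer2, edges):
    """Try every permutation of layer1 with layer2 held fixed and return
    the ordering with the fewest crossings (feasible only for tiny layers)."""
    return min(permutations(layer1),
               key=lambda order: count_crossings(order, layer2, edges))
```

The factorial number of permutations is what makes this approach usable only for very small layers, which motivates the heuristics reviewed next.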

Unfortunately, the one-sided crossing reduction problem for two-layered hierarchical graphs is NP-complete (Garey & Johnson, 1983). The brute force approach works only for small two-layered graphs with few vertices; larger hierarchical graphs require heuristics that find good, though not necessarily optimal, solutions.

The barycenter heuristic (Sugiyama et al., 1981) is well known for its simplicity and effectiveness. The algorithm reduces the number of crossings in two basic steps: (1) compute a barycenter value for each vertex on layer Li, and (2) sort the vertices according to their barycenter values. Although in theory the approximation ratio of the barycenter heuristic is O(sqrt(|V|)) (Li & Stallman, 2002), in practice this heuristic produces a very good layout and outperforms most algorithms (Junger & Mutzel, 1997).
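One pass of the barycenter step can be sketched as follows, assuming a bipartite two-layer input with edges given as vertex pairs; the function name is an illustrative choice, not part of the original algorithm's presentation.

```python
def barycenter_order(free_layer, fixed_layer, edges):
    """One pass of the barycenter heuristic: sort the free layer by the
    average position of each vertex's neighbours on the fixed layer."""
    pos = {v: i for i, v in enumerate(fixed_layer)}

    def barycenter(v):
        # neighbours of v that lie on the fixed layer, in either direction
        nbrs = [pos[w] for u, w in edges if u == v and w in pos]
        nbrs += [pos[u] for u, w in edges if w == v and u in pos]
        # a vertex with no neighbours keeps its current position
        return sum(nbrs) / len(nbrs) if nbrs else float(free_layer.index(v))

    return sorted(free_layer, key=barycenter)
```

For the crossed pair of edges in Figure 14's style (a-y and b-x), sorting by barycenter values swaps a and b and removes the crossing.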

Eades and Kelly (1986) propose a split algorithm. The algorithm chooses a vertex as a pivot and then places other vertices to the left or right of the pivot vertex depending on which way would produce fewer crossings. The step is applied recursively for all the vertices on the same layer. In practice, the split algorithm is implemented in two steps: (1) Create a crossing matrix, and (2) Perform the crossing reduction. The asymptotic performance of this algorithm is O (|V| *|E| + |V| log|V|).

Eades and Kelly (1986) also propose another heuristic, called greedy-switch. The algorithm scans all consecutive pairs of vertices and switches their positions if it reduces the number of crossings. The scan continues until no switching can produce fewer crossings. The asymptotic performance of this algorithm is O (|V| log|V|2). Junger and Mutzel's (1997) experimental result shows that these recursive heuristics are outperformed by the barycenter and median heuristics.
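A sketch of the greedy-switch idea follows, with a small crossing counter included to keep the example self-contained; it illustrates the adjacent-swap loop, not the authors' code, and makes no attempt to match their stated asymptotic bound.

```python
def count_crossings(order1, order2, edges):
    """Pairwise edge crossings for the given two-layer orderings."""
    p1 = {v: i for i, v in enumerate(order1)}
    p2 = {v: i for i, v in enumerate(order2)}
    return sum(1
               for i, (a, x) in enumerate(edges)
               for b, y in edges[i + 1:]
               if (p1[a] - p1[b]) * (p2[x] - p2[y]) < 0)

def greedy_switch(free_layer, fixed_layer, edges):
    """Repeatedly swap adjacent vertices on the free layer whenever the
    swap reduces the crossing count; stop when a full pass over all
    consecutive pairs yields no improvement."""
    order = list(free_layer)
    improved = True
    while improved:
        improved = False
        for i in range(len(order) - 1):
            swapped = order[:]
            swapped[i], swapped[i + 1] = swapped[i + 1], swapped[i]
            if (count_crossings(swapped, fixed_layer, edges)
                    < count_crossings(order, fixed_layer, edges)):
                order = swapped
                improved = True
    return order
```

Because it only ever considers adjacent swaps, greedy-switch can get stuck in local minima that the barycenter and median heuristics escape, consistent with the experimental comparison cited above.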

Eades and Kelly (1986) also propose a heuristic similar to the barycenter heuristic, called the median heuristic. Both algorithms sort the vertices of a layer by a representative value: the barycenter heuristic uses the average position of a vertex's neighbors, while the median heuristic uses the median position. In theory the median heuristic has the better bound, with a constant approximation ratio of 3, i.e., at most 3 times the optimal solution, whereas the barycenter heuristic has an approximation ratio of O(sqrt(|V|)), where |V| is the number of vertices. In practice, however, the barycenter heuristic outperforms the median heuristic (Marti & Laguna, 2003; Junger & Mutzel, 1997).

Catarci (1988) proposes the assignment heuristic. The assignment problem is to find the best assignment of tasks to workers using an adjacency matrix. The author reduces the one-sided crossing reduction problem to an assignment problem by converting the bipartite graph data into a four-dimensional matrix. The algorithm performs well for graphs with a density greater than 30%. The quality of the assignment heuristic is measured as the ratio CN / LB, where CN denotes the crossing number of the computed layout and LB is a trivial lower bound on the crossing number. An experiment performed by Junger and Mutzel (1997) indicates that the barycenter heuristic outperforms the assignment heuristic on many instances of graphs with different densities, although the assignment heuristic consistently produces an attractive graph layout.

Junger and Mutzel (1997) present an algorithm called branch and cut. The authors define minimizing the number of crossings as an objective function with respect to a set of constraints. The one-sided two-layered crossing reduction problem can be expressed as linear programming (LP). The authors determine that the branch and cut heuristic can find a true optimal solution for a small graph with fewer than 60 vertices and layers with fewer than 15 vertices.  For larger graphs the authors suggest using the barycenter heuristic.

Matuszewski, Schönfeld, and Molitor (1999) present a heuristic for solving the one-sided two-layered crossing reduction problem based on a technique called sifting (Rudell, 1993), which reduces the number of vertices in a reduced ordered binary decision diagram (ROBDD). The sifting algorithm moves each vertex to the left and then to the right, and finally places it in the position that produces the minimum number of crossings. The authors show that the sifting heuristic performs slightly better than the barycenter heuristic for small, sparse, two-layer graphs. For larger graphs, however, the barycenter heuristic outperforms the sifting heuristic, whose run time is O(|V|^2), where |V| is the number of vertices.

Based on local search, a common approach for improving solutions to optimization problems, Stallman et al. (2001) propose a heuristic called adaptive insertion. The basic operation of the adaptive insertion heuristic is to move vertices around their neighbors in the same layer. This operation is performed iteratively until no better result is found. The overall asymptotic performance of each iteration is O(|V| * |E| * log(|E| / |V|)). The experimental results indicate that the adaptive insertion heuristic is not scalable to large graphs.

Demetrescu and Finocchi (2003) identify a strong relationship between the crossing reduction problem and the problem of finding minimum feedback arc sets in directed graphs. The authors show that the crossings in a two-layered graph can be represented by a graph called a penalty graph, and prove that the crossing reduction problem is equivalent to the feedback arc set problem. The final penalty graph, after cycle removal, represents an ordering of the vertices on the layer such that the number of crossings is minimal. The authors perform several experiments with different data sets; the results show that the proposed algorithm produces fewer crossings than the barycenter method. The drawback of this approach is its time complexity of O(|V|^4 + |E|^2), so it is not scalable to large graphs.

Marti and Laguna (2003) perform extensive experiments comparing 12 well-known heuristics and two meta-heuristics.  The authors conclude that for dense graphs Tabu search is an appropriate choice for solving the crossing reduction problem, and for sparse graphs the GRASP meta-heuristic produces better results than other heuristics.  However, the authors also suggest that if the performance is critical the hybrid barycenter or splitting heuristic is a good candidate.

Most of the research cited so far focuses on the crossing reduction problem without constraints. Finocchi (2002) formalizes the notion of a constrained crossing reduction problem for constrained hierarchical graph layouts, classifying this as hierarchical realizability.  According to Finocchi (2002), there are two variants of the constrained crossing reduction problem. The first problem is proving whether any solution at all exists that satisfies given non-crossing constraints.  The problem can be stated as follows:

Given a proper layered graph G, a set of constrained edge pairs C, and an integer k > 0, does a drawing of G exist in which at least k pairs of edges do not cross (Finocchi, 2002)?

Finocchi proves that this problem is NP-complete. The other variant is to find the minimum number of crossings with respect to a set of crossing constraints; that is, the optimal solution minimizes the number of crossings while satisfying all the given constraints. Finocchi (2002) proposes a heuristic for solving this problem using a constrained penalty graph. The penalty graph approach produces good results, but its performance is not as good as that of the barycenter heuristic (Forster, 2004).



Forster (2004) presents a heuristic for the constrained crossing reduction problem.  The proposed heuristic is a variant of the barycenter heuristic.  This algorithm extends the barycenter heuristic by including constraints as a factor in the computation of barycenter values. The author shows that the proposed algorithm produces a good quality graph layout and is as fast as the standard barycenter algorithm.  Table 3 summarizes the characteristics of each crossing reduction algorithm.
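A simplified sketch in the spirit of Forster's constrained barycenter is given below. It is an assumption-laden illustration, not Forster's algorithm: it supports only pairwise precedence constraints, and whenever sorting by barycenter values violates a constraint (u must precede v), it merges the two offending clusters into one whose barycenter is the degree-weighted average of its members, then re-sorts.

```python
def constrained_barycenter(free_layer, fixed_layer, edges, constraints):
    """Constraint-aware barycenter sketch. `constraints` is a list of
    (u, v) pairs meaning u must precede v on the free layer."""
    pos = {v: i for i, v in enumerate(fixed_layer)}

    def neighbours(v):
        return ([pos[w] for u, w in edges if u == v and w in pos]
                + [pos[u] for u, w in edges if w == v and u in pos])

    # each cluster: (vertices in order, sum of neighbour positions, degree)
    clusters = []
    for v in free_layer:
        ns = neighbours(v)
        clusters.append(([v], sum(ns), len(ns)))

    def bary(c):
        _, total, deg = c
        return total / deg if deg else 0.0

    while True:
        clusters.sort(key=bary)
        index = {}
        for i, c in enumerate(clusters):
            for v in c[0]:
                index[v] = i
        violated = next(((u, v) for u, v in constraints
                         if index[u] > index[v]), None)
        if violated is None:
            break
        u, v = violated
        cu, cv = clusters[index[u]], clusters[index[v]]
        # merge with the u-part first so the constraint is satisfied
        merged = (cu[0] + cv[0], cu[1] + cv[1], cu[2] + cv[2])
        clusters = [c for c in clusters if c is not cu and c is not cv]
        clusters.append(merged)
    return [v for c in clusters for v in c[0]]
```

The weighted merge mirrors the intuition the paragraph describes: constrained vertices move together, settling at a compromise position, so the heuristic trades a few extra crossings for constraint satisfaction.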

Table 3. Summary of algorithms for solving the crossing reduction step in the Sugiyama heuristic

Name                                               | Approach                                      | Performance                     | Note
Barycenter (Sugiyama et al., 1981)                 | Sorting vertices                              | Near linear                     | Outperforms most algorithms in real world applications
Median (Eades & Kelly, 1986)                       | Sorting vertices                              | Near linear                     | Outperforms barycenter in theory but is outperformed by barycenter in real world experiments
Split (Eades & Kelly, 1986)                        | Reorder vertices through a pivot point        | O(|V| log |V|)                  | Good performance compared to barycenter and median
Greedy-Switch (Eades & Kelly, 1986)                | Scan vertices and compare the crossing numbers | O(|V| log|V|2)                 | Runs effectively in real world applications
Assignment (Catarci, 1988)                         | Assignment                                    |                                 | Efficient for layouts whose edge density is larger than 30%, but not as efficient as barycenter in real world applications
Branch and Cut (Junger & Mutzel, 1997)             | Linear programming                            | Not linear                      | Finds a true optimal solution for graphs with fewer than 60 vertices
Sifting (Matuszewski, Schönfeld, & Molitor, 1999)  | Reduced ordered binary decision diagram       | O(|V|^2)                        | Outperformed by barycenter heuristic
Adaptive Insertion (Stallman et al., 2001)         | Local search                                  | O(|V| * |E| * log(|E| / |V|))   | Not scalable for large graphs
Penalty graph (Demetrescu & Finocchi, 2003)        | Reduce to a feedback arc set problem          | O(|V|^4 + |E|^2)                | Produces drawings with fewer crossings but is not scalable for large graphs
Modified Barycenter (Forster, 2004)                | Sorting vertices                              | O(|V| log |V| + |E| + |C|^2)    | Provides layouts as good as those of more complicated algorithms but with a better run time


Coordinate Assignment


According to the aesthetic criteria, graph edges should be short and straight (Gansner et al., 1993). A common approach for solving this problem is the quadratic programming layout method proposed by Sugiyama et al. (1981), in which the problem is defined as an objective function with respect to a set of constraints. Unfortunately, solving this formulation directly is computationally expensive due to the size of the matrix. Gansner et al. (1993) present a heuristic for solving this problem; its performance is good, but it is hard to program and the layout is sometimes not pleasing (Gansner et al., 1993).