CS503: Sixteenth Lecture, Fall 2008 Graph Algorithms

					CS503: Sixteenth Lecture, Fall 2008
        Graph Algorithms
         Michael Barnathan
       Here’s what we’ll be learning:
• Data Structures:
   – Graphs (Adjacency Matrix Representation).
• Theory:
   –   Dijkstra’s algorithm.
   –   Floyd’s algorithm.
   –   Traveling salesman problem (TSP).
   –   Dynamic programming.

• And then we’re done.
   – With all of the basic topics covered in Algorithms I and II at
   – With every algorithm that is discussed in the book (except
     red-black trees). You should recognize all of them by now.
   – Of course, we still have a month left in the course.
   – We’ll probably use it for programming.
Traditional Graph Representation
[Figure: graph G drawn as vertices and edges, with vertex set V = {1, 2, 3, 4, 5}.]

              Adjacency Matrices
• A graph G can also be represented as an adjacency matrix.
• Let v be the number of vertices in G.
• The adjacency matrix A is then a v x v binary matrix
  indicating the presence of edges between nodes:
   – Ai,j = 1 if an edge exists between vertices i and j.
   – Ai,j = 0 otherwise.
• If the graph is undirected, A will be symmetric.
   – That is, an edge between vertices i and j means an edge
     also exists between vertices j and i.
• This is not necessarily true if the graph is directed.
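As a minimal sketch of the definition above, the following builds the 0/1 matrix of an undirected graph from an edge list and checks that it is symmetric. The class and method names (AdjacencyMatrixDemo, fromUndirectedEdges, isSymmetric) are illustrative and not from the lecture. The edge list is taken from the lecture's five-vertex example, with vertices 1 through 5 stored at indices 0 through 4.

```java
import java.util.Arrays;

final class AdjacencyMatrixDemo {
    // Build a v x v 0/1 matrix from undirected edges given as {i, j} pairs (1-based labels).
    static int[][] fromUndirectedEdges(int v, int[][] edges) {
        int[][] a = new int[v][v];
        for (int[] e : edges) {
            a[e[0] - 1][e[1] - 1] = 1;
            a[e[1] - 1][e[0] - 1] = 1; // undirected: mirror the entry across the diagonal
        }
        return a;
    }

    // True when a[i][j] == a[j][i] for every pair, as expected for an undirected graph.
    static boolean isSymmetric(int[][] a) {
        for (int i = 0; i < a.length; i++)
            for (int j = 0; j < i; j++)
                if (a[i][j] != a[j][i]) return false;
        return true;
    }

    public static void main(String[] args) {
        int[][] edges = {{1, 2}, {1, 5}, {2, 4}, {2, 5}, {3, 5}, {4, 5}};
        int[][] a = fromUndirectedEdges(5, edges);
        for (int[] row : a) System.out.println(Arrays.toString(row));
        System.out.println("symmetric: " + isSymmetric(a));
    }
}
```

For a directed graph, dropping the mirrored assignment would be enough, and isSymmetric would then generally return false.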
Adjacency Matrix Example
[Figure: the graph G from the previous slide.]

       0 1 0 0 1
       1 0 0 1 1
  A =  0 0 0 0 1
       0 1 0 0 1
       1 1 1 1 0
      Adjacency Matrix Tradeoffs
• O(n^2) space required to store the matrix, but each
  entry can be represented with one bit.
   – More efficient representation in very dense graphs, but
     wasteful in sparse ones, where most entries are 0.
• More natural for use in certain algorithms.
• O(n) time required to retrieve all edges of a vertex,
  even if the vertex has only one edge.
   – Because it also stores the 0s, scanning a vertex’s edges
     requires scanning a whole row.
   – The traditional (adjacency list) representation takes O(e)
     time, where e is the number of edges adjacent to the
     vertex you are examining. In sparse graphs, e is much smaller than n.
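The one-bit-per-entry idea and the row scan can be sketched with java.util.BitSet. This is an illustration under assumed names (BitMatrixGraph and its methods are not from the lecture), not a tuned implementation.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

final class BitMatrixGraph {
    private final BitSet[] rows; // rows[i].get(j) is true when an edge i-j exists

    BitMatrixGraph(int n) {
        rows = new BitSet[n];
        for (int i = 0; i < n; i++) rows[i] = new BitSet(n);
    }

    void addUndirectedEdge(int i, int j) {
        rows[i].set(j);
        rows[j].set(i);
    }

    // Edge lookup is constant time: a single bit test.
    boolean hasEdge(int i, int j) {
        return rows[i].get(j);
    }

    // Listing neighbors walks row i, so the cost grows with n even for a vertex with one edge.
    List<Integer> neighbors(int i) {
        List<Integer> out = new ArrayList<>();
        for (int j = rows[i].nextSetBit(0); j >= 0; j = rows[i].nextSetBit(j + 1)) out.add(j);
        return out;
    }
}
```

The matrix answers "is there an edge between i and j?" in constant time, which is the kind of query where it tends to beat an adjacency list.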
                  Adjacency Lists
• The traditional method of representing edges is
  called an adjacency list.
• Simple: Every vertex contains a linked list of edges it
  is adjacent to.
• The edge typically stores the vertices on both ends as
  well, to allow the traversal across that edge to be
  constant time.
   – If edges did not keep track of their vertices, traversing one
     would require linearly scanning all vertices for adjacency.
Vertex[] mat2list(boolean[][] adjmatrix) {
    Vertex[] ret = new Vertex[adjmatrix.length];
    for (int vidx = 0; vidx < adjmatrix.length; vidx++)
        ret[vidx] = new Vertex();               //Initialize vertices.

    for (int vidx = 0; vidx < adjmatrix.length; vidx++)
        for (int eidx = 0; eidx < adjmatrix[vidx].length; eidx++)
            if (adjmatrix[vidx][eidx])          //If the matrix has a 1, add an edge to the list.
                ret[vidx].edges().add(new Edge(ret[vidx], ret[eidx]));

    return ret;
}

boolean[][] list2mat(Vertex[] adjlist) {
    boolean[][] ret = new boolean[adjlist.length][adjlist.length];    //Default value is false.
    for (int vidx = 0; vidx < adjlist.length; vidx++) {
        for (Edge eadj : adjlist[vidx].edges())                       //Scan this vertex's own edge list.
            ret[vidx][eadj.getOtherVertexIndex()] = true;             //Edges in the list are true.
    }

    return ret;
}
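The two conversion routines rely on Vertex and Edge classes that the slides do not show. The sketch below is one hypothetical way to define them so that the conversions run end to end. Unlike the slide code, Vertex here takes its array index in the constructor so that getOtherVertexIndex() has something to return; getIndex() and getSourceVertex() are likewise assumptions.

```java
import java.util.LinkedList;
import java.util.List;

// Hypothetical helper classes: the lecture code uses them but never defines them.
class Vertex {
    private final int index;                              // position in the Vertex[] array
    private final List<Edge> edges = new LinkedList<>();  // this vertex's adjacency list

    Vertex(int index) { this.index = index; }
    int getIndex() { return index; }
    List<Edge> edges() { return edges; }
}

class Edge {
    private final Vertex from, to;
    private final double weight;

    Edge(Vertex from, Vertex to) { this(from, to, 1.0); }
    Edge(Vertex from, Vertex to, double weight) {
        this.from = from;
        this.to = to;
        this.weight = weight;
    }

    Vertex getSourceVertex() { return from; }
    Vertex getOtherVertex() { return to; }                // far end, seen from the owning vertex
    int getOtherVertexIndex() { return to.getIndex(); }
    double getWeight() { return weight; }
}

final class GraphConversions {
    static Vertex[] mat2list(boolean[][] m) {
        Vertex[] ret = new Vertex[m.length];
        for (int v = 0; v < m.length; v++) ret[v] = new Vertex(v);
        for (int v = 0; v < m.length; v++)
            for (int e = 0; e < m[v].length; e++)
                if (m[v][e]) ret[v].edges().add(new Edge(ret[v], ret[e]));
        return ret;
    }

    static boolean[][] list2mat(Vertex[] list) {
        boolean[][] ret = new boolean[list.length][list.length];
        for (int v = 0; v < list.length; v++)
            for (Edge e : list[v].edges()) ret[v][e.getOtherVertexIndex()] = true;
        return ret;
    }
}
```

Converting a matrix to lists and back should return the original matrix, which makes a convenient sanity check for both routines.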
   Weighted Adjacency Matrices
• Weights can be added to adjacency matrices
  as well.
• Rather than using 1s to represent edges, use
  the edge weights.
• Missing edges are represented by infinity.
  – Infinity in Java is Double.POSITIVE_INFINITY.
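A small sketch of a weighted adjacency matrix along these lines follows. The names (WeightedMatrixDemo, emptyGraph, addUndirectedEdge, hasEdge) are illustrative, and putting 0 on the diagonal is an assumption that the slides do not state.

```java
import java.util.Arrays;

final class WeightedMatrixDemo {
    // Create an n x n matrix with no edges: infinity everywhere, 0 on the diagonal.
    static double[][] emptyGraph(int n) {
        double[][] w = new double[n][n];
        for (int i = 0; i < n; i++) {
            Arrays.fill(w[i], Double.POSITIVE_INFINITY);
            w[i][i] = 0.0; // assumption: a vertex is at distance 0 from itself
        }
        return w;
    }

    static void addUndirectedEdge(double[][] w, int i, int j, double weight) {
        w[i][j] = weight;
        w[j][i] = weight;
    }

    // An edge is present when the stored weight is finite.
    static boolean hasEdge(double[][] w, int i, int j) {
        return i != j && !Double.isInfinite(w[i][j]);
    }
}
```

One convenience of using Double.POSITIVE_INFINITY is that infinity plus any finite weight is still infinity, so comparisons such as dist[u] + w < dist[v] need no special case for missing edges.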
        Shortest Path Problem
• You are working for a new software company
  called 10100, Inc.
• They just received access to a road database
  for their new product, 10100 Maps.
• You are asked to develop the algorithm that
  computes the fastest route from point A to
  point B.
• For example, from Monmouth University to
  Carnegie Hall.
                Example Graph
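[Figure: road graph with vertices Monmouth University, Freehold, Old Bridge, Newark, Jersey City, and Carnegie Hall. Edge labels shown: 27, 34, 20, 54, 43, 44, 12, 27, and 24 min., plus a direct edge to Carnegie Hall labeled "10 years" (“Practice, practice, practice”).]
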
      What is the quickest way to Carnegie Hall?
                Dijkstra’s Algorithm
• Named after Edsger Dijkstra, who published it in 1959.
• Also called the shortest-path algorithm, which should tell you
  what it does.
• Of course, it finds the shortest path from one node to another
  (or to all others) in a graph.
• Key insight: if you have found the shortest path from Old
  Bridge to Carnegie and Freehold to Carnegie, you will not
  need to calculate the path from Freehold to Old Bridge. Going
  directly to Old Bridge is faster.
   – Caveat: this is not true if negative weights exist in the graph. In this
     case, maybe going from Freehold to Old Bridge saves you time and the
     link must still be checked!
   – Dijkstra’s algorithm only works when all weights are non-negative.
      Dijkstra’s Algorithm Overview:
1.    Declare an array, dist, of shortest path lengths to each vertex. Initialize
      the distance of the start vertex to 0 and every other vertex to infinity.
2.    Create a priority queue and fill it with all nodes in the graph.
3.    While the queue is not empty:
     1.   Remove the vertex u with the smallest distance from the start.
     2.   Compute the minimum distance md between u and each neighboring
          vertex v (scan u’s edges and choose the one with the smallest weight).
     3.   For each neighbor v, if dist[u] + md < dist[v],
          1.   Set dist[v] = dist[u] + md
          2.   Set v’s “predecessor vertex” to u. This is used to retrace the path.
          3.   (We have found a shorter path to v than our current best).
4.    Trace the path back from the target to the source by traversing the
      predecessor node of each vertex from the target. Reverse it and you
      have the shortest path from source to target.
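The steps above can be sketched on a weighted adjacency matrix, using a linear search in place of the priority queue. The class name, the vertex numbering (0 = Monmouth University, 1 = Freehold, 2 = Old Bridge, 3 = Newark, 4 = Jersey City, 5 = Carnegie Hall), and the assignment of weights to edges are assumptions read off the lecture's example slides, so treat the graph data as illustrative.

```java
import java.util.Arrays;

final class DijkstraMatrix {
    static final double INF = Double.POSITIVE_INFINITY;

    // Returns dist[] and fills the caller-allocated pred[] (-1 means "no predecessor").
    // w is a weighted adjacency matrix with INF for missing edges; weights must be non-negative.
    static double[] shortestPaths(double[][] w, int source, int[] pred) {
        int n = w.length;
        double[] dist = new double[n];
        boolean[] done = new boolean[n];
        Arrays.fill(dist, INF);
        Arrays.fill(pred, -1);
        dist[source] = 0.0;

        for (int iter = 0; iter < n; iter++) {
            // Step 3.1: linear search for the closest vertex not yet removed.
            int u = -1;
            for (int v = 0; v < n; v++)
                if (!done[v] && (u == -1 || dist[v] < dist[u])) u = v;
            if (dist[u] == INF) break; // everything left is unreachable
            done[u] = true;

            // Step 3.3: relax every edge leaving u.
            for (int v = 0; v < n; v++) {
                if (w[u][v] != INF && dist[u] + w[u][v] < dist[v]) {
                    dist[v] = dist[u] + w[u][v];
                    pred[v] = u;
                }
            }
        }
        return dist;
    }

    // The lecture's road graph, in minutes (edge placement is an assumption where the figure is unclear).
    static double[][] exampleGraph() {
        int[][] edges = {
            {0, 1, 27}, {0, 2, 34}, {1, 2, 20}, {1, 3, 54}, {2, 3, 43},
            {2, 4, 44}, {3, 4, 12}, {3, 5, 27}, {4, 5, 24},
            {0, 5, 10 * 365 * 24 * 60} // "10 years" of practice, in minutes
        };
        double[][] w = new double[6][6];
        for (double[] row : w) Arrays.fill(row, INF);
        for (int[] e : edges) {
            w[e[0]][e[1]] = e[2];
            w[e[1]][e[0]] = e[2];
        }
        return w;
    }

    public static void main(String[] args) {
        int[] pred = new int[6];
        double[] dist = shortestPaths(exampleGraph(), 0, pred);
        System.out.println("Minutes to Carnegie Hall: " + dist[5]);
    }
}
```

With the linear search, each of the n iterations scans all n vertices, so this version takes O(n^2) time regardless of how many edges the graph has.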
Dijkstra’s Algorithm – Starting State
[Figure: the example graph before the first iteration. Vertices other than Monmouth University, the source, are labeled ∞.]
Dijkstra’s Algorithm – First Iteration
[Figure: the example graph with distance labels Freehold 27, Newark ∞, and Carnegie Hall 10 years.]
So far practicing is winning. Maybe your piano teacher was right…
Dijkstra’s Algorithm – Second Iteration
[Figure: the example graph with distance labels Freehold 27, Newark 77, and Carnegie Hall 10 years.]
34 < 47, so Old Bridge keeps its current predecessor.
77 < 81, so Newark’s predecessor is Old Bridge.
Dijkstra’s Algorithm – Third Iteration
[Figure: the example graph with distance labels Freehold 27 and Newark 77.]
102 < 104 and 102 < 10 years, so Carnegie Hall changes its predecessor to Jersey City.
 Dijkstra’s Algorithm – Results:
[Figure: the shortest path drawn alone: Monmouth University, 34 min. to Old Bridge (distance 34), 44 min. to Jersey City (distance 78), 24 min. to Carnegie Hall.]
The shortest time to Carnegie Hall is thus 102 minutes.
Starting at Carnegie Hall and traversing its predecessor list, we see that we passed through Jersey City and Old Bridge.
Therefore, the optimal route is Monmouth -> Old Bridge -> Jersey City -> Carnegie Hall.
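Tracing the predecessor list back from the target can be sketched as follows. The predecessor array is hard-coded from the result described above, with an assumed numbering of 0 = Monmouth University, 1 = Freehold, 2 = Old Bridge, 3 = Newark, 4 = Jersey City, 5 = Carnegie Hall. The class and method names are illustrative.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class PathTrace {
    // pred[v] is the vertex before v on the shortest path, or -1 for the source.
    static List<Integer> pathTo(int[] pred, int target) {
        List<Integer> path = new ArrayList<>();
        for (int v = target; v != -1; v = pred[v]) path.add(v); // walk target -> source
        Collections.reverse(path);                              // flip to source -> target
        return path;
    }

    public static void main(String[] args) {
        String[] names = {"Monmouth", "Freehold", "Old Bridge", "Newark", "Jersey City", "Carnegie Hall"};
        int[] pred = {-1, 0, 0, 2, 2, 4}; // predecessors as found in the worked example
        StringBuilder route = new StringBuilder();
        for (int v : pathTo(pred, 5)) {
            if (route.length() > 0) route.append(" -> ");
            route.append(names[v]);
        }
        System.out.println(route);
    }
}
```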
    Dijkstra’s Algorithm – Pseudocode:
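Vertex[] Dijkstra(Vertex[] graph, int sourceidx) {
     double[] dist = new double[graph.length];
     Vertex[] predecessor = new Vertex[graph.length];
     for (int vidx = 0; vidx < graph.length; vidx++)
                dist[vidx] = Double.POSITIVE_INFINITY;
     dist[sourceidx] = 0;

     //Order vertices by current distance. getIndex() is assumed to return a vertex's position in graph.
     PriorityQueue<Vertex> vertq = new PriorityQueue<Vertex>(
                (a, b) -> Double.compare(dist[a.getIndex()], dist[b.getIndex()]));
     for (Vertex v : graph)
                vertq.add(v);

     while (!vertq.isEmpty()) {
                Vertex cur = vertq.poll();
                for (Edge adjedge : cur.edges()) {
                               Vertex other = adjedge.getOtherVertex();
                               double newdist = dist[cur.getIndex()] + adjedge.getWeight();
                               if (newdist < dist[other.getIndex()]) {
                                                vertq.remove(other);      //Take it out before its distance changes...
                                                dist[other.getIndex()] = newdist;
                                                predecessor[other.getIndex()] = cur;
                                                vertq.add(other);         //...and re-insert it at its new position.
                               }
                }
     }

     return predecessor;      //All shortest paths from the source node are contained here.
}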
   Dijkstra’s Algorithm – Analysis:
• What is the time complexity of this algorithm?
  – Assuming a linear search is performed on the
    priority queue when removing the element?
  – Assuming the traditional heap implementation of
    a priority queue (which is tricky in this case
    because the distance changes throughout the
    algorithm)?
  – Hint: it will depend on both V and E.
• How much space is being used?
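One common way around the "distance changes" difficulty with a heap is lazy deletion: push a fresh queue entry whenever a distance improves, and skip entries that have gone stale when they are removed. The sketch below assumes integer weights and an adjacency list of {vertex, weight} pairs; the names are illustrative and not from the lecture. Under these assumptions the running time is roughly O((V + E) log V), and the space used is O(V + E).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

final class DijkstraHeap {
    // Build adjacency lists for an undirected graph from {u, v, weight} triples.
    static List<List<int[]>> undirected(int n, int[][] edges) {
        List<List<int[]>> adj = new ArrayList<List<int[]>>();
        for (int i = 0; i < n; i++) adj.add(new ArrayList<int[]>());
        for (int[] e : edges) {
            adj.get(e[0]).add(new int[] {e[1], e[2]});
            adj.get(e[1]).add(new int[] {e[0], e[2]});
        }
        return adj;
    }

    // Returns shortest distances from source; unreachable vertices keep Long.MAX_VALUE.
    // Weights must be non-negative.
    static long[] shortestPaths(List<List<int[]>> adj, int source) {
        long[] dist = new long[adj.size()];
        Arrays.fill(dist, Long.MAX_VALUE);
        dist[source] = 0;

        // Queue entries are {distance, vertex}, ordered by distance.
        PriorityQueue<long[]> queue = new PriorityQueue<>((a, b) -> Long.compare(a[0], b[0]));
        queue.add(new long[] {0, source});

        while (!queue.isEmpty()) {
            long[] top = queue.poll();
            int u = (int) top[1];
            if (top[0] > dist[u]) continue; // stale entry: u was already reached more cheaply

            for (int[] edge : adj.get(u)) {
                int v = edge[0];
                long candidate = dist[u] + edge[1];
                if (candidate < dist[v]) {
                    dist[v] = candidate;
                    queue.add(new long[] {candidate, v}); // push a new entry instead of decreasing a key
                }
            }
        }
        return dist;
    }
}
```

The trade-off is that the queue can hold up to one entry per edge rather than one per vertex, in exchange for never having to locate and update an entry in the middle of the heap.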
 Dijkstra’s Algorithm – Discussion:
• This algorithm always chooses the path of
  shortest distance to record at each step.
• What did we call those algorithms again?
• When it finishes, the recorded path will be the
  absolute shortest from the source.

• Dijkstra’s algorithm will fail if given a graph
  with negative weights. Use the Bellman-Ford
  algorithm (which we won’t discuss) for this.
                 Floyd’s Algorithm
• Also called the Floyd-Warshall algorithm.
• This algorithm reports shortest paths between ALL pairs of
  nodes in the graph.
• Unlike Dijkstra’s algorithm, it tolerates negative edge weights,
  as long as no cycle in the graph has a negative total weight.
• You could also run Dijkstra’s algorithm once for each vertex in the
  graph (if no weights are negative). With a linear-search queue, each
  run costs O(v^2), so all v runs cost O(v^3).
• Floyd’s algorithm also runs in O(v^3), but with a much simpler triple
  loop over an adjacency matrix.
• It uses a technique called dynamic programming to do this.
                  Floyd’s Algorithm
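double[][] floyd(double[][] graph) {
  //Weighted adjacency matrix representation.
  double[][] pathlen = new double[graph.length][];
  for (int row = 0; row < graph.length; row++)
        pathlen[row] = graph[row].clone();      //Copy row by row; clone() on the outer array alone is shallow.

    for (int start1 = 0; start1 < graph.length; start1++)       //start1 acts as the intermediate vertex.
          for (int start2 = 0; start2 < graph.length; start2++)
                   for (int end = 0; end < graph.length; end++)
                            pathlen[start2][end] =
    Math.min(pathlen[start2][end], pathlen[start2][start1] + pathlen[start1][end]);

    return pathlen;
}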
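The routine above returns only path lengths. A frequently used extension, sketched here under assumed names (FloydPaths, next, path), also keeps a next-hop table so the routes themselves can be read back. The input is a weighted adjacency matrix with infinity for missing edges and 0 on the diagonal.

```java
import java.util.ArrayList;
import java.util.List;

final class FloydPaths {
    static final double INF = Double.POSITIVE_INFINITY;
    final double[][] dist; // dist[i][j] = length of the shortest i -> j path
    final int[][] next;    // next[i][j] = vertex after i on that path, or -1 if none

    FloydPaths(double[][] w) {
        int n = w.length;
        dist = new double[n][];
        next = new int[n][n];
        for (int i = 0; i < n; i++) {
            dist[i] = w[i].clone();
            for (int j = 0; j < n; j++) next[i][j] = (w[i][j] != INF) ? j : -1;
        }
        // Allow vertex k as an intermediate stop, one k at a time.
        for (int k = 0; k < n; k++)
            for (int i = 0; i < n; i++)
                for (int j = 0; j < n; j++)
                    if (dist[i][k] + dist[k][j] < dist[i][j]) {
                        dist[i][j] = dist[i][k] + dist[k][j];
                        next[i][j] = next[i][k]; // the route to j now starts like the route to k
                    }
    }

    // Follow next-hop entries from 'from' until 'to' is reached; empty list if unreachable.
    List<Integer> path(int from, int to) {
        List<Integer> p = new ArrayList<>();
        if (next[from][to] == -1) return p;
        p.add(from);
        for (int v = from; v != to; ) {
            v = next[v][to];
            p.add(v);
        }
        return p;
    }
}
```

The extra table costs another O(v^2) space but leaves the O(v^3) running time unchanged.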
     Recall: Divide and Conquer
• Divide and Conquer is an algorithm design
  paradigm that splits large problems up into
  smaller instances of the same problem, solves
  the smaller problems, then merges them to
  get a solution to the original problem.
• When a table of solutions to subproblems is
  kept to avoid redoing work, this is called
  memoization.
              Dynamic Programming
• Floyd’s algorithm is a dynamic programming algorithm.
• The idea behind dynamic programming is similar to memoization.
• Whereas memoization begins with large problems and breaks them down,
  dynamic programming builds large solutions from smaller problems.
• Dynamic programming is used in problems with overlapping substructure:
  when a problem can be split “horizontally” into overlapping subproblems,
  which can be merged back later:
    – For example, computing path[1][5] would involve computing path[1][2] +
      path[2][5], path[1][3] + path[3][5], and path[1][4] + path[4][5].
    – The problem space can be partitioned into subsets of itself and those subsets
      can be merged together to solve the full problem.
• A table is still required to store the solutions to the subproblems.
• While memoization has a naturally recursive structure, dynamic
  programming algorithms often involve computations within a loop.
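To make the contrast concrete, here is the same computation written both ways. Fibonacci numbers are used only because they are the smallest example with overlapping subproblems; the choice of example and the names (FibDemo, fibMemo, fibTable) are assumptions, not part of the lecture.

```java
import java.util.HashMap;
import java.util.Map;

final class FibDemo {
    private static final Map<Integer, Long> memo = new HashMap<>();

    // Memoization (top-down): recurse from the large problem, caching each answer.
    static long fibMemo(int n) {
        if (n < 2) return n;
        Long cached = memo.get(n);
        if (cached != null) return cached;
        long value = fibMemo(n - 1) + fibMemo(n - 2);
        memo.put(n, value);
        return value;
    }

    // Dynamic programming (bottom-up): fill the table in a loop, smallest subproblems first.
    static long fibTable(int n) {
        if (n < 2) return n;
        long[] table = new long[n + 1];
        table[1] = 1;
        for (int i = 2; i <= n; i++) table[i] = table[i - 1] + table[i - 2];
        return table[n];
    }
}
```

Both versions do a linear amount of work in n. Floyd's triple loop has the bottom-up shape: each pass of the outer loop extends the table using only entries that are already filled in.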
  The Traveling Salesman Problem
• Let’s say you’re in charge of planning FedEx’s
  delivery route.
• You have packages to deliver in New York,
  Denver, Chicago, and Boston.
• Gas is expensive for the company, so you’d like
  to find the route with the shortest distance
  required to deliver all of the packages.
            Graph Representation

[Figure: map graph of the four cities. Edge lengths shown: 982 mi., 1001 mi., 791 mi., 215 mi., and 1777 mi.]

Starting from New York, which route minimizes the total distance?
            The Naïve Algorithm
• Compute every ordering (permutation) of the cities, sum the
  length of each route, and select the smallest.
• There are v! orderings to try, so this takes O(v!) time.
• Dynamic programming can get this down to O(v^2 2^v),
  but it’s still exponential.
• The million dollar question: Is there any way to solve
  this problem in polynomial time?
• Literally. Find one or prove one can’t exist and you’ll
  win $1 million.
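Both approaches can be sketched in a few dozen lines. The code below assumes the route must return to its starting city (city 0) and that there are at least two cities; the subset-based dynamic program is the one usually called Held-Karp. The distance matrix in main is a made-up four-city example, not the mileage from the lecture's map.

```java
import java.util.Arrays;

final class TspDemo {
    // Brute force: try every ordering of cities 1..n-1, starting and ending at city 0. O(v!) time.
    static int bruteForce(int[][] d) {
        int[] order = new int[d.length - 1];
        for (int i = 0; i < order.length; i++) order[i] = i + 1;
        return permute(d, order, 0);
    }

    private static int permute(int[][] d, int[] order, int k) {
        if (k == order.length) return tourLength(d, order);
        int best = Integer.MAX_VALUE;
        for (int i = k; i < order.length; i++) {
            swap(order, k, i);
            best = Math.min(best, permute(d, order, k + 1));
            swap(order, k, i); // undo, so the next choice starts from the same state
        }
        return best;
    }

    private static int tourLength(int[][] d, int[] order) {
        int total = 0, prev = 0;
        for (int city : order) { total += d[prev][city]; prev = city; }
        return total + d[prev][0]; // drive back to the start
    }

    private static void swap(int[] a, int i, int j) { int t = a[i]; a[i] = a[j]; a[j] = t; }

    // Held-Karp dynamic programming over subsets: O(v^2 * 2^v) time.
    // best[mask][j] = shortest route that starts at 0, visits exactly the cities in mask, ends at j.
    static int heldKarp(int[][] d) {
        int n = d.length, full = 1 << n;
        int[][] best = new int[full][n];
        for (int[] row : best) Arrays.fill(row, Integer.MAX_VALUE);
        best[1][0] = 0;
        for (int mask = 1; mask < full; mask++) {
            if ((mask & 1) == 0) continue; // every route includes the start city
            for (int j = 0; j < n; j++) {
                if ((mask & (1 << j)) == 0 || best[mask][j] == Integer.MAX_VALUE) continue;
                for (int next = 0; next < n; next++) {
                    if ((mask & (1 << next)) != 0) continue; // already visited
                    int m2 = mask | (1 << next);
                    best[m2][next] = Math.min(best[m2][next], best[mask][j] + d[j][next]);
                }
            }
        }
        int answer = Integer.MAX_VALUE;
        for (int j = 1; j < n; j++)
            if (best[full - 1][j] != Integer.MAX_VALUE)
                answer = Math.min(answer, best[full - 1][j] + d[j][0]);
        return answer;
    }

    public static void main(String[] args) {
        int[][] d = {{0, 10, 15, 20}, {10, 0, 35, 25}, {15, 35, 0, 30}, {20, 25, 30, 0}}; // hypothetical
        System.out.println(bruteForce(d) + " " + heldKarp(d));
    }
}
```

The table in heldKarp has 2^v rows, so memory runs out long before time does; in practice this kind of table is only workable for a couple of dozen cities.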
               NP Completeness
• TSP is an example of an NP-complete problem (strictly, its
  yes/no form: “is there a route no longer than k?”).
• These are problems whose solutions can be verified
  in polynomial time, but (probably) can’t be computed
  in polynomial time.
• All NP-complete problems can be reduced to each
  other in polynomial time; they form a complexity class.
• The open (million dollar) question is whether the
  complexity classes P and NP are equal.
   – Finding a polynomial-time algorithm for even one of
     these problems, or proving that no such algorithm exists,
     is sufficient to prove P = NP or P != NP.
• So is FedEx out of luck?
• Not entirely… it turns out that there are many
  approximation algorithms or heuristics for NP-
  complete problems that will run in polynomial time.
• Some of these give very good estimates. Certainly
  good enough when the question is one of driving distance.
• Continuing this discussion is likely outside of this
  course’s scope.
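As one illustration of a polynomial-time heuristic, the sketch below uses the nearest-neighbor rule: from the current city, always drive to the closest unvisited one. It is a simple example of the idea, not a method the lecture prescribes, and it carries no guarantee of finding the best route. The second method shows why checking a proposed route is cheap. All names and the distance matrix are hypothetical.

```java
final class TspHeuristic {
    // Nearest-neighbor heuristic: O(v^2) time, result not guaranteed to be optimal.
    static int[] nearestNeighborTour(int[][] d, int start) {
        int n = d.length;
        boolean[] visited = new boolean[n];
        int[] tour = new int[n];
        int current = start;
        visited[start] = true;
        tour[0] = start;
        for (int step = 1; step < n; step++) {
            int nearest = -1;
            for (int c = 0; c < n; c++)
                if (!visited[c] && (nearest == -1 || d[current][c] < d[current][nearest])) nearest = c;
            visited[nearest] = true;
            tour[step] = nearest;
            current = nearest;
        }
        return tour;
    }

    // Verifying a proposed tour takes O(v): confirm each city appears once, then sum the legs
    // (including the final leg back to the first city).
    static int tourLength(int[][] d, int[] tour) {
        if (tour.length != d.length) throw new IllegalArgumentException("tour must visit every city");
        boolean[] seen = new boolean[d.length];
        int total = 0;
        for (int i = 0; i < tour.length; i++) {
            if (seen[tour[i]]) throw new IllegalArgumentException("city visited twice: " + tour[i]);
            seen[tour[i]] = true;
            total += d[tour[i]][tour[(i + 1) % tour.length]];
        }
        return total;
    }

    public static void main(String[] args) {
        int[][] d = {{0, 10, 15, 20}, {10, 0, 35, 25}, {15, 35, 0, 30}, {20, 25, 30, 0}}; // hypothetical
        int[] tour = nearestNeighborTour(d, 0);
        System.out.println(java.util.Arrays.toString(tour) + " length " + tourLength(d, tour));
    }
}
```

On this small matrix the heuristic happens to find a route as short as the optimum, but on other inputs it can do noticeably worse.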
            Other Graph Problems
• There are many other well-studied problems
  in graph theory.
   –   Vertex covers.
   –   Cliques.
   –   Flow.
   –   Graph coloring.
   –   Knight’s tours.
• With the rise of social networks,
  this is becoming a more and
  more relevant field.
       Not the Shortest Lecture
• We rounded out the topics usually taught in
  an algorithms course with Dijkstra’s and
  Floyd’s shortest-path algorithms and briefly
  discussed the notion of NP completeness in
  the Traveling Salesman Problem.
• The lesson:
  – Slight variations on problems may not seem to
    make them harder, but may in fact make them
    intractable. It isn’t always apparent.
