2015-10-16

Graph - Minimum Spanning Tree

Spanning Tree

    a spanning tree of an undirected graph is a connected subgraph with no cycles that includes all the vertices

Minimum Spanning Tree

    for undirected edge-weighted graph, MST is a spanning tree with minimum weight.

Two Strategies For Finding MST

  • grow the MST from a single vertex, then find a new edge to attach to a single growing tree (Prim's Algorithm)
  • maintain a minimum spanning forest, then merge the forest (Kruskal's Algorithm)
  • both are greedy algorithm. (pickup a min-cost edge from a single vertex or from the whole graph)

Prim's Algorithm (O(elge) = O(elgn))

  • pickup an arbitrary vertex s
  • make s as visited
  • for all edges (s, u), put (s, u) into min-heap H based on edge cost
  • while H is not empty
    • e(u, v) = extractMin(H)        // do it e times with O(lge) each time
    • mark v as visited                   // u already is marked at this time
    • for all edges (v, w)                // do it e times totally
    •        if w is unvisited, put (v, w) into H

Kruskal's Algorithm (O(elgn))

  • sort all the edges based on their weight. (O(elge) = O(elgn))
  • make n different sets. (make_set() * n = O(1 * n))
  • for each edges(u, v) in the sorted order
    • if (find(u) != find(v))                         // without anything clever, it will take O(n)
      • union(find(u), find(v))             // edge(u, v) in the spanning tree. O(1) if found

Union-Find

  1. without optimization, running time of Kruskal is: O(elgn) + O(n) + e * O(n)
  2. we need to take down the running time of union-find.
  3. as a rule of thumb, we should reduce the height of the tree when doing union. 
  4. so if we can track the height of a subset, it will be helpful since we can connect the smaller tree to the higher one. it is called "weighted union".
  5. with this trick in step 4, find operation could be done in O(lgn). Proof could be done by induction.
  6. the cool stuff but hard to proof: if we change a node's parent direct to the root of the set when we do union-find operation each time, the amortized running time could be reduced significantly, but not in constant time. this is called "path compression".
  7. path compression does not cost anything, but will give you more benefit in the long run.
  8. with the trick like "weighted union" and "path compression", the dominant running time of Kruskal's algorithm is to sort edges, which is O(elgn).
  9. Union-Find is easy to implement with data structures like array and others.

Reference

  1. https://www.youtube.com/watch?v=ufj5_bppBsA

No comments:

Post a Comment