2015-10-31

Randomized Algorithms


  • Make random choices during execution
  • Traditional algorithms (deterministic algorithms) analyze the behavior of an algorithm on an "average" input, rather than a worst-case input.
  • Randomized algorithms do not require new assumptions about the nature of the input. It's purely internal to the algorithm. So it makes the underlying model more powerful.
  • 2 broad categories of randomized algorithms
    • Las Vegas randomized algorithms
      • use randomness to achieve good expected running times, e.g. randomized-quicksort
      • guarantee the output is always correct
    • Monte Carlo randomized algorithms
      • have a deterministic running time
      • the output is correct with a certain probability.

Randomized Divide-and-Conquer

the "divide" step is performed using randomization and use expectations of random variables to analyze the time.

Randomized-Selection

  • A more general description of finding-the-median problem is selection. Sometimes called Quick-Selection.
  • Given a set of n numbers S and a number k between 1 and n, Select(s, k) returns the kth largest element in S.
  • Basic Ideas of Select(s, k)
    • choose a splitter from s
    • partition S into S1 and S2 where {S1} < k and {S2} > k
    • if |S1| == k - 1 then splitter is the answer
    • if |S1| >= k then answer lies in S1, recursively call Select(s1, k)
    • if |S1| < k then answer lies in S2, recursively call Select(s2, k - |S1| - 1)
  • Good splitter is very important to make this algorithm running in O(n) time. And should produce each subset into approximately equal in size.
  • The deterministic algorithm
    • partition the number into n/5 group, so each group has 5 numbers
    • find the median of each group, then we have n/5 medians
    • recursively find the median from n/5 medians
    • use the median as the pivot to partition S into S1 and S2
    • recursively search either in S1 or S2
    • T(n) = n + T(n/5) + T{max(|S1|, |S2|)} = n + T(n/5) + T(7/10n) => O(n)
  • Random Splitters
    • choose a "well-centered" splitter = from S uniformly at random
    • running time is O(n), the analysis is a little complicated...
    • randomized selection (aka quick-selection) is Las Vegas randomized algorithm
  • There is a randomized find-approximate-middle algorithm
    • pick an element from A uniformly at random (what a simple algorithm!)
    • this algorithm returns a correct answer with probability 8/10. It is Monte Carlo randomized algorithm.

Randomized-Quicksort

  • improve the Quicksort performance against the worst case instance
  • the only difference between randomized-quicksort and deterministic quicksort is in picking on pivot randomness or not to partition the input sequence.
  • with randomized picking on pivot, it is expected to have better performance in worst case.
  • For instance, with n = 10,000, the probability of X < nln(n) or X > 3nln(n) is less than 0.5%. (data from my instructor's lecture notes.)

Min-cut problem

  • Problem description: Given an undirected graph G=(V, E), we define a cut of G to be a partition of V into 2 non-empty sets A and B.
  • Deterministic Algorithm for this problem:
    • use min-cost s-t cut solution for every pair of vertices
    • running time: O(V^5) = (Ford-Fulkerson Algorithm for min-cost s-t cut needs O(V.E) = O(V^3), there are possible V^2 pairs)
    • so we can see the power of randomized algorithm.
  • Introduce a concept: edge contraction
    • pickup any edge (u, v), combine u and v into u'
    • remove all edges between u and v
    • connect all remaining edges incident to u and v
  • Randomized Algorithm of Min-cut (Karger's Algorithm)
    • while the graph has more than 2 edges
      • randomly pick an edge to contract
    • return the remaining edges as min-cut
    • running time O(|V|)
  • We can test that the algorithm above is not always correct. It is a Monte Carlo randomized algorithm
  • Let's look at the probability of correct answer. 
    • Say the min-cut edges number is k, the question is how many odd the k edges will survive at last? 
    • first, we know the probability of min-cut survive at the first contraction is Pr(min-cut survive) = 1 - k/e   (e is the total number of edges of G)
    • observation-1: k <= degrees of every node
    • observation-2: (nk/2) <= e    ===> 1 - k/e >= 1 - 2/n = (n-2)/n
    • the odds that min-cut survive all n-2 contraction is: <= (n-2)/2 * ((n-1) - 2)/(n-1) * ... * 1/3 = 2/(n(n-1)) = 1/C(n, 2)
    • if we execute C(n, 2) runs, the probability of incorrect answer would be: (1 - C(n, 2))^(C(n, 2)) = 1/e = 1/2.7
    • as we can see, C(n, 2) runs still seem to result pretty high probability of incorrect answer.
    • if we execute ln(n).C(n, 2) runs, the probability of resulting incorrect answer will be 1/n.

Reference


No comments:

Post a Comment