Randomized Algorithms
- Make random choices during execution
- Traditional algorithms (deterministic algorithms) analyze the behavior of an algorithm on an "average" input, rather than a worst-case input.
- Randomized algorithms do not require new assumptions about the nature of the input. It's purely internal to the algorithm. So it makes the underlying model more powerful.
- 2 broad categories of randomized algorithms
- Las Vegas randomized algorithms
- use randomness to achieve good expected running times, e.g. randomized-quicksort
- guarantee the output is always correct
- Monte Carlo randomized algorithms
- have a deterministic running time
- the output is correct with a certain probability.
Randomized Divide-and-Conquer
the "divide" step is performed using randomization and use expectations of random variables to analyze the time.
Randomized-Selection
- A more general description of finding-the-median problem is selection. Sometimes called Quick-Selection.
- Given a set of n numbers S and a number k between 1 and n, Select(s, k) returns the kth largest element in S.
- Basic Ideas of Select(s, k)
- choose a splitter from s
- partition S into S1 and S2 where {S1} < k and {S2} > k
- if |S1| == k - 1 then splitter is the answer
- if |S1| >= k then answer lies in S1, recursively call Select(s1, k)
- if |S1| < k then answer lies in S2, recursively call Select(s2, k - |S1| - 1)
- Good splitter is very important to make this algorithm running in O(n) time. And should produce each subset into approximately equal in size.
- The deterministic algorithm
- partition the number into n/5 group, so each group has 5 numbers
- find the median of each group, then we have n/5 medians
- recursively find the median from n/5 medians
- use the median as the pivot to partition S into S1 and S2
- recursively search either in S1 or S2
- T(n) = n + T(n/5) + T{max(|S1|, |S2|)} = n + T(n/5) + T(7/10n) => O(n)
- Random Splitters
- choose a "well-centered" splitter = from S uniformly at random
- running time is O(n), the analysis is a little complicated...
- randomized selection (aka quick-selection) is Las Vegas randomized algorithm
- There is a randomized find-approximate-middle algorithm
- pick an element from A uniformly at random (what a simple algorithm!)
- this algorithm returns a correct answer with probability 8/10. It is Monte Carlo randomized algorithm.
Randomized-Quicksort
- improve the Quicksort performance against the worst case instance
- the only difference between randomized-quicksort and deterministic quicksort is in picking on pivot randomness or not to partition the input sequence.
- with randomized picking on pivot, it is expected to have better performance in worst case.
- For instance, with n = 10,000, the probability of X < nln(n) or X > 3nln(n) is less than 0.5%. (data from my instructor's lecture notes.)
Min-cut problem
- Problem description: Given an undirected graph G=(V, E), we define a cut of G to be a partition of V into 2 non-empty sets A and B.
- Deterministic Algorithm for this problem:
- use min-cost s-t cut solution for every pair of vertices
- running time: O(V^5) = (Ford-Fulkerson Algorithm for min-cost s-t cut needs O(V.E) = O(V^3), there are possible V^2 pairs)
- so we can see the power of randomized algorithm.
- Introduce a concept: edge contraction
- pickup any edge (u, v), combine u and v into u'
- remove all edges between u and v
- connect all remaining edges incident to u and v
- Randomized Algorithm of Min-cut (Karger's Algorithm)
- while the graph has more than 2 edges
- randomly pick an edge to contract
- return the remaining edges as min-cut
- running time O(|V|)
- We can test that the algorithm above is not always correct. It is a Monte Carlo randomized algorithm
- Let's look at the probability of correct answer.
- Say the min-cut edges number is k, the question is how many odd the k edges will survive at last?
- first, we know the probability of min-cut survive at the first contraction is Pr(min-cut survive) = 1 - k/e (e is the total number of edges of G)
- observation-1: k <= degrees of every node
- observation-2: (nk/2) <= e ===> 1 - k/e >= 1 - 2/n = (n-2)/n
- the odds that min-cut survive all n-2 contraction is: <= (n-2)/2 * ((n-1) - 2)/(n-1) * ... * 1/3 = 2/(n(n-1)) = 1/C(n, 2)
- if we execute C(n, 2) runs, the probability of incorrect answer would be: (1 - C(n, 2))^(C(n, 2)) = 1/e = 1/2.7
- as we can see, C(n, 2) runs still seem to result pretty high probability of incorrect answer.
- if we execute ln(n).C(n, 2) runs, the probability of resulting incorrect answer will be 1/n.
Reference
No comments:
Post a Comment