2015-11-30

Some Probability Distributions

Expectation and Variance

  • expectation is the mean value of a random variable.
    • for a discrete X: E(X) = sum over i of x_i * P(X = x_i); for a sample of n values, the mean is (x_1 + ... + x_n) / n
  • variance is the expected squared deviation from the mean.
    • Var(X) = E[(X - E(X))^2]
  • standard deviation is the square root of the variance.
    • sigma = sqrt(Var(X))
  • using the standard deviation, we can tell which values are typical and which are unusually large or small.
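
A minimal sketch of the three quantities for a sample of values (using the population variance, i.e. dividing by n):

    import math

    def mean(xs):
        return sum(xs) / len(xs)

    def variance(xs):
        # population variance: average squared deviation from the mean
        m = mean(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    def std_dev(xs):
        return math.sqrt(variance(xs))

    xs = [2, 4, 4, 4, 5, 5, 7, 9]
    print(mean(xs), variance(xs), std_dev(xs))   # 5.0 4.0 2.0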

Uniform Distribution



  • for X ~ Uniform(a, b): E(X) = (a + b) / 2 and Var(X) = (b - a)^2 / 12
  • a and b are solvable given E(X) and Var(X), as shown below
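
Solving those two equations for the endpoints (a standard derivation):

    a + b = 2 * E(X)
    b - a = sqrt(12 * Var(X))
    => a = E(X) - sqrt(3 * Var(X))
       b = E(X) + sqrt(3 * Var(X))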

Normal (Gaussian) Distribution
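
  • density of N(mu, sigma^2): f(x) = 1 / (sigma * sqrt(2 * pi)) * exp(-(x - mu)^2 / (2 * sigma^2))
  • roughly 68% / 95% / 99.7% of the probability mass lies within 1 / 2 / 3 standard deviations of the mean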



Binomial Distribution (Bernoulli Trials)

  • a sequence of n independent yes/no experiments, each called a Bernoulli trial
  • when n = 1, the binomial distribution is the Bernoulli distribution
  • P(X = k) = C(n, k) * p^k * (1 - p)^(n - k), the probability of exactly k successes in n trials
  • e.g.
    • toss a fair coin 100 times; what is the probability that heads occurs exactly 30 times? (n = 100, k = 30, p = 1/2)
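
A quick check of the example (math.comb requires Python 3.8+):

    from math import comb

    def binom_pmf(n, k, p):
        # probability of exactly k successes in n independent trials
        return comb(n, k) * p**k * (1 - p)**(n - k)

    print(binom_pmf(100, 30, 0.5))   # ~2.3e-05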


Poisson Distribution
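
  • P(X = k) = lambda^k * e^(-lambda) / k!, for k = 0, 1, 2, ..., modelling the number of events in a fixed interval when events occur independently at a constant average rate lambda
  • E(X) = Var(X) = lambda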



Geometric Distribution



  • modelling the number of trials up to and including the first success (k = 1, 2, 3, ...): P(X = k) = (1 - p)^(k - 1) * p
  • e.g.
    • toss a fair coin until the first heads: E(X) = 1/p = 2, which means that on average we need 2 tosses to get heads.


  • modelling the number of failures before the first success (k = 0, 1, 2, 3, ...): P(X = k) = (1 - p)^k * p
  • e.g.
    • if the success probability of each trial is p, the mean number of failures before the first success is (1 - p)/p.
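
A small simulation to sanity-check both conventions (seed and sample count are arbitrary):

    import random

    def trials_until_success(p):
        # count trials up to and including the first success
        n = 1
        while random.random() >= p:
            n += 1
        return n

    random.seed(0)
    N = 100_000
    samples = [trials_until_success(0.5) for _ in range(N)]
    print(sum(samples) / N)                  # ~2.0, i.e. 1/p
    print(sum(s - 1 for s in samples) / N)   # ~1.0, i.e. (1-p)/p failures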


2015-11-15

String Matching

Naive Algorithm


  • shift the pattern by 1 position on each mismatch
  • running time O(n*m)
  • n - the length of the text
  • m - the length of the pattern
  • the goal of the following 3 string-matching algorithms is to achieve O(n + m) running time.
  • obviously, if m is small, there isn't much difference among these algorithms. (It's kind of like asking when dynamic programming is less efficient than brute force for the knapsack problem: O(nk) >= O(2^k))
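
A minimal sketch of the naive algorithm (names are my own):

    def naive_search(text, pattern):
        # try every alignment; shift by 1 on each mismatch -> O(n*m) worst case
        n, m = len(text), len(pattern)
        matches = []
        for i in range(n - m + 1):
            if text[i:i + m] == pattern:
                matches.append(i)
        return matches

    print(naive_search("abracadabra", "abra"))   # [0, 7]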

KMP Algorithm


  • idea: we have to scan the text, which takes O(n), but can we shift the pattern by more than 1 position on each mismatch?
  • if s[i] != p[j], that means s[i-j ... i-1] == p[0 ... j-1]; if p[0 ... j-1] has a proper prefix that is also a suffix (a border), we can reuse it. So we do a little pre-processing of the pattern to compute, for each prefix, the length of its longest border.
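
A sketch of the border (failure-function) pre-processing and the search:

    def build_failure(p):
        # fail[j] = length of the longest proper prefix of p[0..j] that is also a suffix
        fail = [0] * len(p)
        k = 0
        for j in range(1, len(p)):
            while k > 0 and p[j] != p[k]:
                k = fail[k - 1]
            if p[j] == p[k]:
                k += 1
            fail[j] = k
        return fail

    def kmp_search(s, p):
        fail, k, matches = build_failure(p), 0, []
        for i, c in enumerate(s):
            while k > 0 and c != p[k]:
                k = fail[k - 1]          # fall back in the pattern instead of rescanning s
            if c == p[k]:
                k += 1
            if k == len(p):
                matches.append(i - k + 1)
                k = fail[k - 1]
        return matches

    print(kmp_search("abracadabra", "abra"))   # [0, 7]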

Boyer-Moore-Horspool Algorithm


  • compare letters from the rightmost position of the pattern to the left (instead of comparing from left to right like the KMP algorithm does)
  • what is the difference when comparing from RIGHT to LEFT?
    • let's assume s[i] != p[j] (j = 0 ... m-1) and s[i] does not even occur in the pattern; then we can shift (j + 1) positions directly.
    • if j = m - 1, we continue by comparing s[i + m] with p[m - 1] without missing any match.
    • the above case is the best one, which enables us to shift m positions, the maximum possible value; but what if s[i] does occur in the pattern? how far can we shift?
    • let's assume s[i] occurs in the pattern and s[i] = p[k] (k < j, since k must be on the left side of j), so we can shift (j - k) positions, aligning s[i] with p[k].
    • from the above analysis, we decide the shift by checking whether (and where) s[i] occurs in the pattern. KMP works differently: it uses the longest border of p[0...j-1], aligns that border with the end of s[... i-1], and continues comparing s[i] with p[length-of-longest-border(p[0...j-1])].
  • like the KMP algorithm, we pre-process the pattern into a "bad match table"; each value is how far we can shift based on s[i], where i is the starting comparison point, not necessarily the mismatch point. A sketch follows this list.
    • to calculate the "bad match table":
      • possible shift = pattern length - 1 - index
      • the possible shifts get smaller when a character appears more than once (its last occurrence wins)
      • any character not in the table (including the last letter of the pattern, if not already defined) gets a shift of the full pattern length
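
A sketch of the Horspool variant (characters missing from the table default to the full pattern length):

    def build_bad_match(p):
        # shift per character, from its last occurrence in p[0..m-2]
        m = len(p)
        return {p[i]: m - 1 - i for i in range(m - 1)}

    def bmh_search(s, p):
        m, table, matches = len(p), build_bad_match(p), []
        i = 0
        while i + m <= len(s):
            j = m - 1
            while j >= 0 and s[i + j] == p[j]:   # compare right to left
                j -= 1
            if j < 0:
                matches.append(i)
            # shift by the entry for the text character aligned with the last pattern cell
            i += table.get(s[i + m - 1], m)
        return matches

    print(bmh_search("abracadabra", "abra"))   # [0, 7]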

Rabin-Karp Algorithm


  • this algorithm compares the hash value of the pattern with the hash value of each substring of the text, instead of comparing the substrings themselves with the pattern.
  • if we compute the hash value of each substring from scratch, it is no better than the naive algorithm (even worse), since the hash function takes O(m) running time.
  • the key idea is to achieve O(1) running time when calculating the hash value of each substring of the text: a rolling hash updates the previous hash value instead of recomputing it.
  • then how does it work?
    • let x = s[i]*b^(m-1) + s[i+1]*b^(m-2) + ... + s[i+m-1]*b^0
    • let y = s[i+1]*b^(m-1) + ... + s[i+m]*b^0
    • then y = (x - s[i]*b^(m-1))*b + s[i+m], where b = |alphabet|
    • h(y) = (h(x)*b - s[i]*(b^m % q) + s[i+m]) % q
    • (b % q) and (b^m % q) are constants that can be pre-computed, which means h(y) can be updated in O(1) time.
  • single pattern searching vs. multiple pattern searching
    • single pattern searching: KMP, BMH
    • multiple pattern searching: a rolling hash is a good fit (e.g. LeetCode - Repeated DNA Sequences)
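
A sketch of the rolling-hash search (b and q here are arbitrary small choices; a hash match is only a candidate, so the code confirms it directly to rule out collisions):

    def rabin_karp(s, p, b=256, q=101):
        n, m = len(s), len(p)
        if m > n:
            return []
        bm = pow(b, m - 1, q)                 # b^(m-1) % q, pre-computed
        hp = hs = 0
        for i in range(m):                    # initial hashes, O(m) once
            hp = (hp * b + ord(p[i])) % q
            hs = (hs * b + ord(s[i])) % q
        matches = []
        for i in range(n - m + 1):
            if hs == hp and s[i:i + m] == p:  # confirm candidate match
                matches.append(i)
            if i + m < n:                     # roll: drop s[i], add s[i+m]
                hs = ((hs - ord(s[i]) * bm) * b + ord(s[i + m])) % q
        return matches

    print(rabin_karp("abracadabra", "abra"))   # [0, 7]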


2015-11-01

Graph - Network Flow Problems

Network Flow 

  • in a finite directed graph (also called a network), each edge has a capacity and receives a flow. A flow satisfies 2 constraints:
    • the amount of flow on an edge cannot exceed the capacity of the edge.
    • the amount of flow into a node equals the amount of flow out of it, except for the source, which has only outgoing flow, and the sink, which has only incoming flow.
  • flow networks can be used to model
    • traffic in a road system
    • circulation with demands
    • fluid in pipes
    • currents in an electrical circuit
    • etc.
  • Max-Flow Problem: the following network (from CLRS) has a maximum possible flow of 23 from source 0 to sink 5. How do we find the max possible flow in a network?
  • Bipartite Matching Problem
    • bipartite graph: the nodes split into two layers, and no edge connects two nodes in the same layer
  • Min-cost Cut Problem
    • remove some edges from graph G such that there is no path from s to t after removing the edges (imagine destroying bridges with minimum cost)

Finding Max-Flow (Ford-Fulkerson Algorithm)

Ford-Fulkerson algorithm:

Start with 0 flow.
While there exists an augmenting path:
    - find an augmenting path
    - compute bottleneck capacity
    - increase flow on that path by bottleneck capacity
* this FF algorithm description is very much simplified. To implement the algorithm, one has to understand concepts like: residual graph, forward edge, backward edge.
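
A compact sketch of FF using BFS to find augmenting paths (the Edmonds-Karp variant), keeping residual capacities in a dict of dicts. The capacities below are my reconstruction of the CLRS network mentioned above (its min cut 12 + 7 + 4 matches the numbers quoted in the next section):

    from collections import deque

    def max_flow(cap, s, t):
        # cap[u][v] = residual capacity of edge u -> v; mutated in place
        flow = 0
        while True:
            # BFS for an augmenting path in the residual graph
            parent, queue = {s: None}, deque([s])
            while queue and t not in parent:
                u = queue.popleft()
                for v, c in cap[u].items():
                    if c > 0 and v not in parent:
                        parent[v] = u
                        queue.append(v)
            if t not in parent:
                return flow                   # no augmenting path left
            # collect the path and compute its bottleneck capacity
            path, v = [], t
            while parent[v] is not None:
                path.append((parent[v], v))
                v = parent[v]
            bottleneck = min(cap[u][v] for u, v in path)
            # increase flow: forward edges lose capacity, backward edges gain it
            for u, v in path:
                cap[u][v] -= bottleneck
                cap[v][u] = cap[v].get(u, 0) + bottleneck
            flow += bottleneck

    cap = {0: {1: 16, 2: 13}, 1: {3: 12}, 2: {1: 4, 4: 14},
           3: {2: 9, 5: 20}, 4: {3: 7, 5: 4}, 5: {}}
    print(max_flow(cap, 0, 5))   # 23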

Questions:

  • how to find an augmenting path?
  • if FF terminates, does it always compute a max-flow?
  • Does FF always terminate? If so, after how many augmentations?

Min-cost Cut

This is a variant of the max-flow problem. Basically, the problem asks how to remove some edges with minimum total cost such that there is no path from s to t. For the network from the CLRS book, the min-cost cut is as follows:

  • solution
    • the max-flow min-cut theorem says: (max flow) = (min cut).
      • from the above example, we know the max flow is 23, and the min cut is 12 + 7 + 4 = 23
      • the CLRS book has a proof of this theorem
      • sometimes the min-cost cut is also called the minimum s-t cut (s-t: source-terminal/sink)
      • similar in spirit to: (min vertex cover) = (max matching) for a bipartite graph (König's theorem)
    • one can use the FF algorithm to find the max flow, but how do we find the min-cost cut?
    • one answer is to use the residual graph (see the sketch after this list)
      • run the FF algorithm and consider the final residual graph
      • collect all vertices reachable from the source (obviously, this set does not include t)
      • the min-cost cut edges are those edges from reachable vertices to unreachable vertices
  • the min-cost cut problem is different from the min-cut problem mentioned in Randomized Algorithms
    • the randomized min-cut problem is about undirected graphs; this one is about directed graphs
    • the randomized min-cut problem cuts the minimum number of edges of G to disconnect the graph; min-cost cut is about disconnecting s from t, and minimizes the total cost of the cut edges, not their number
  • the min-cut problem is much harder than the min-cost cut problem discussed here!
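
A sketch of that residual-graph procedure, reusing the max_flow function from the previous entry (which leaves the final residual capacities in its cap argument):

    import copy

    def min_cut_edges(cap_original, cap_residual, s):
        # vertices reachable from s in the final residual graph
        reachable, stack = {s}, [s]
        while stack:
            u = stack.pop()
            for v, c in cap_residual[u].items():
                if c > 0 and v not in reachable:
                    reachable.add(v)
                    stack.append(v)
        # cut edges go from reachable to unreachable vertices in the ORIGINAL graph
        return sorted((u, v) for u in reachable
                      for v in cap_original[u] if v not in reachable)

    cap = {0: {1: 16, 2: 13}, 1: {3: 12}, 2: {1: 4, 4: 14},
           3: {2: 9, 5: 20}, 4: {3: 7, 5: 4}, 5: {}}
    residual = copy.deepcopy(cap)
    max_flow(residual, 0, 5)                 # 23
    print(min_cut_edges(cap, residual, 0))   # [(1, 3), (4, 3), (4, 5)] -> 12 + 7 + 4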

[TODO] Min-cost Max-flow 

finding the cheapest possible way of sending a given amount of flow through a network.

Bipartite Matching
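
The standard reduction to max-flow (a sketch reusing the max_flow function above): add a source s with a unit-capacity edge to every left node and a sink t with a unit-capacity edge from every right node; the max flow then equals the size of the maximum matching.

    def max_bipartite_matching(left, right, edges):
        # build a unit-capacity flow network: s -> left nodes -> right nodes -> t
        # (node names are assumed distinct from "s" and "t")
        s, t = "s", "t"
        cap = {s: {}, t: {}}
        for u in left:
            cap[s][u] = 1
            cap[u] = {}
        for v in right:
            cap[v] = {t: 1}
        for u, v in edges:
            cap[u][v] = 1
        return max_flow(cap, s, t)

    print(max_bipartite_matching(["a", "b"], ["x", "y"],
                                 [("a", "x"), ("a", "y"), ("b", "x")]))   # 2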

