Let $A_P = (Q,\Sigma,\delta,0,\{m\})$ the string matching automaton for pattern $P \in \Sigma^m$, that is
- $Q = \{0,1,\dots,m\}$
- $\delta(q,a) = \sigma_P(P_{0,q}\cdot a)$ for all $q\in Q$ and $a\in \Sigma$
with $\sigma_P(w)$ the length of the longest prefix of $P$ that is a Suffix of $w$, that is
$\qquad \displaystyle \sigma_P(w) = \max \left\{k \in \mathbb{N}_0 \mid P_{0,k} \sqsupset w \right\}$.
Now, let $\pi$ the prefix function from the Knuth-Morris-Pratt algorithm, that is
$\qquad \displaystyle \pi_P(q)= \max \{k \mid k < q \wedge P_{0,k} \sqsupset P_{0,q}\}$.
As it turns out, one can use $\pi_P$ to compute $\delta$ quickly; the central observation is:
Assume above notions and $a \in \Sigma$. For $q \in \{0,\dots,m\}$ with $q = m$ or $P_{q+1} \neq a$, it holds that
$\qquad \displaystyle \delta(q,a) = \delta(\pi_P(q),a)$
But how can I prove this?
For reference, this is how you compute $\pi_P$:
m ← length[P ] π[0] ← 0 k ← 0 for q ← 1 to m − 1 do while k > 0 and P [k + 1] =6 P [q] do k ← π[k] if P [k + 1] = P [q] then k ← k + 1 end if π[q] ← k end while end for return π
Asked By : Bob
Answered By : Raphael
First of all, note that by definition
- $\delta(q,a) = \sigma_P(P_{0,q}\cdot a) =: s_1$ and
- $\delta(\pi_P(q),a) = \sigma_P(P_{0,\pi_P(q)}\cdot a) =: s_2$.
Let us investigate $s_1$ and $s_2$ in a sketch:
[source]
Now assume $s_2 > s_1$; this contradicts the maximal choice of $s_1$ directly. If we assume $s_1 > s_2$ we contradict the fact that both $s_2$ and $\pi_P(q)$ are chosen maximally, in particular because $\pi_P(q) \geq s_1 - 1$. As both cases cases lead to contradictions $s_1=s_2$ holds, q.e.d.
As requested, a more elaborate version of the proof:
Now we have to show $s_1=s_2$; we do this by showing that the opposite leads to contradictions.
- Assume $s_2 > s_1$. Note that $P_{0,s_2} \sqsupset P_{0,q}\cdot a$ because $P_{0,s_2} \sqsupset P_{0,\pi_P(q)}\cdot a$ and $P_{0,\pi_P(q)} \sqsupset P_{0,q}$ by definition of $s_2$. Therefore, $P_{0,s_2}$ -- a prefix of $P$ and a suffix of $P_{0,q}\cdot a$ -- is longer than $P_{0,s_1}$, which is by definition of $s_1$ the longest prefix of $P$ that is a suffix of $P_{0,q}\cdot a$. This is a contradiction.
Before we continue with the other case, let us see that $\pi_P(q) \geq s_1 - 1$. Observe that because $P_{0,s_1} \sqsupset P_{0,q}\cdot a$, we have $P_{0,s-1} \sqsupset P_{0,q}$. Assuming that $\pi_P(q) < s_1 - 1$ immediately contradicts the maximal choice of $\pi_P(q)$ ($s_1 - 1$ is in the set $\pi_P(q)$ is chosen from).
- Assume $s_1 > s_2$. We have just shown $|P_{0,\pi_P(q)}\cdot a| \geq s_1$, and remember that $P_{0,\pi_P(q)}\cdot a \sqsupset P_{0,q} \cdot a$. Therefore, $s_1 > s_2$ contradicts the maximal choice of $s_2$ ($s_1$ is in the set $s_2$ is chosen from).
As neither $s_1 > s_2$ nor $s_2 > s_1$ can hold, we have proven that $s_1 = s_2$, q.e.d.
Best Answer from StackOverflow
Question Source : http://cs.stackexchange.com/questions/1669
0 comments:
Post a Comment
Let us know your responses and feedback