The string matching problem has two inputs, as follows:
- An array T[1, 2, ...n] of length n
- An array P[1, 2, ...m] of length m (<= n)
The elements of T and P are characters from the same finite alphabet (usually called ∑).
For instance, we may be searching in binary strings, in which case our alphabet is {0, 1}, or we may be searching in strings of lowercase letters, in which case our alphabet is {a, b… z}.
The following diagram represents this terminology:
Figure 5.1: Representation of text array T, pattern array P, and finite alphabet ∑
The character arrays P and T are usually called "strings of characters". We're interested in finding occurrences of pattern P in text T.
We say that pattern P occurs in text T if we can align the pattern P with text T so that all characters in P match the ones in T. When aligning...