A theorem in formal language theory that concerns the nature of context-free languages when order of letters is disregarded.
Let the alphabet Σ be the set {a1,…,an}. The letter distribution, ϕ(w), of a Σ-word w is the n-tuple
with
Ni the number of occurrences of
ai in
w. The
Parikh image, ϕ(
L), of a Σ-language
L is
i.e. the set of all letter-distributions of words in
L.
L1 and
L2 are
letter-equivalent if
Letter distributions may be added component-wise as vectors. This leads to the following: a set
S of letter distributions is
linear if, for some distributions
d and
d1,…,
dk, S is the set of all sums formed from
d and multiples of
di.
S is
semilinear if it is a finite union of linear sets.
Parikh’s theorem now states that if L is context-free ϕ(L) is semilinear. It can also be shown that ϕ(L) is semilinear if and only if L is letter-equivalent to a regular language. Hence any context-free language is letter-equivalent to a regular language—although not all such languages are context-free.