Is the complement of { ww | … } context-free?

Define the language $L$ as $L = \{a, b\}^* – \{ww\mid w \in \{a, b\}^*\}$. In other words, $L$ contains the words that cannot be expressed as some word repeated twice. Is $L$ context-free or not?

I’ve tried to intersect $L$ with $a^*b^*a^*b^*$, but I still can’t prove anything. I also looked at Parikh’s theorem, but it doesn’t help.

Answer

It’s context-free. Here’s the grammar:

$S \to A | B|AB|BA$

$A \to a|aAa|aAb|bAb|bAa$

$B \to b|aBa|aBb|bBb|bBa$

$A$ generates words of odd length with $a$ in the center. Same for $B$ and $b$.

I’ll present a proof that this grammar is correct. Let $L = \{a,b\}^* \setminus \{ww \mid w \in \{a,b\}^*\}$ (the language in the question).

Theorem. $L = L(S)$. In other words, this grammar generates the language in the question.

Proof. This certainly holds for all odd-length words, since this grammar generates all odd-lengths words, as does $L$. So let’s focus on even-length words.

Suppose $x \in L$ has even length. I’ll show that $x \in L(G)$. In particular, I claim that $x$ can be written in the form $x=uv$, where both $u$ and $v$ have odd length and have different central letters. Thus $x$ can be derived from either $AB$ or $BA$ (according to whether $u$’s central letter is $a$ or $b$). Justification of claim: Let the $i$th letter of $x$ be denoted $x_i$, so that $x = x_1 x_2 \cdots x_n$. Then since $x$ is not in $\{ww \mid w \in \{a,b\}^{n/2}\}$, there must exist some index $i$ such that $x_i \ne x_{i+n/2}$. Consequently we can take $u = x_1 \cdots x_{2i-1}$ and $v = x_{2i} \cdots x_n$; the central letter of $u$ will be $x_i$, and the central letter of $v$ will be $x_{i+n/2}$, so by construction $u,v$ have different central letters.

Next suppose $x \in L(G)$ has even length. I’ll show that we must have $x \in L$. If $x$ has even length, it must be derivable from either $AB$ or $BA$; without loss of generality, suppose it is derivable from $AB$, and $x=uv$ where $u$ is derivable from $A$ and $v$ is derivable from $B$. If $u,v$ have the same lengths, then we must have $u\ne v$ (since they have different central letters), so $x \notin \{ww \mid w \in \{a,b\}^*\}$. So suppose $u,v$ have different lengths, say length $\ell$ and $n-\ell$ respectively. Then their central letters are $u_{(\ell+1)/2}$ and $v_{(n-\ell+1)/2}$. The fact that $u,v$ have different central letters means that $u_{(\ell+1)/2} \ne v_{(n-\ell+1)/2}$. Since $x=uv$, this means that $x_{(\ell+1)/2} \ne x_{(n+\ell+1)/2}$. If we attempt to decompose $x$ as $x=ww’$ where $w,w’$ have the same length, then we’ll discover that $w_{(\ell+1)/2} = x_{(\ell+1)/2} \ne x_{(n+\ell+1)/2} = w’_{(\ell+1)/2}$, i.e., $w\ne w’$, so $x \notin \{ww \mid w \in \{a,b\}^*\}$. In particular, it follows that $x \in L$.

Attribution
Source : Link , Question Author : Evgeny Eltishev , Answer Author : D.W.

Leave a Comment