# Why are NP-complete problems so different in terms of their approximation?

I’d like to begin the question by saying I’m a programmer, and I don’t have a lot of background in complexity theory.

One thing that I’ve noticed is that while many problems are NP-complete, when extended to optimization problems, some are far more difficult to approximate than others.

A good example is TSP. Although all kinds of TSP are NP-complete, the corresponding optimization problems get easier and easier to approximate with successive simplifications. The general case is NPO-complete, the metric case is APX-complete, and the Euclidean case actually has a PTAS.

This seems counter-intuitive to me, and I’m wondering whether there is a reason for this.

One reason that we see different approximation complexities for NP-complete problems is that the conditions for NP-completeness constitute a very coarse-grained measure of a problem’s complexity. You may be familiar with the basic definition of a problem $\Pi$ being NP-complete:

1. $\Pi$ is in NP, and
2. For every other problem $\Xi$ in NP, we can turn an instance $x$ of $\Xi$ into an instance $y$ of $\Pi$ in polynomial time such that $y$ is a yes-instance of $\Pi$ if and only if $x$ is a yes-instance of $\Xi$.

Consider condition 2: all that is required is that we can take $x$ and turn it into some $y$ that preserves the “single-bit” yes/no answer. There are no conditions about, for example, the relative sizes of the witnesses to the yes or no (that is, the size of the solution in the optimization context). So the only measure that’s used is the total size of the input, which gives only a very weak condition on the size of the solution. So it’s pretty “easy” to turn an instance of $\Xi$ into an instance of $\Pi$.
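To make condition 2 concrete, here is a minimal sketch (not from the original answer) of one classic Karp reduction: a graph $G$ on $n$ vertices has a vertex cover of size at most $k$ if and only if it has an independent set of size at least $n-k$. Note that all the reduction does is flip the budget; it preserves the yes/no bit and nothing else is demanded of it.

```python
def vc_to_is(n, edges, k):
    """Karp reduction from k-Vertex Cover to Independent Set.

    (G, k) is a yes-instance of Vertex Cover iff (G, n - k) is a
    yes-instance of Independent Set: the complement of a vertex cover
    is an independent set, and vice versa. The graph is unchanged;
    only the budget is complemented, and only the answer bit matters.
    """
    return (n, edges, n - k)
```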

We can see the difference between various NP-complete problems by looking at the complexity of some simple algorithms. $k$-Coloring has a brute-force $O(k^{n})$ algorithm (where $n$ is the number of vertices). For $k$-Dominating Set, a brute-force approach takes $O(n^{k})$. These are, in essence, the best exact algorithms we have. $k$-Vertex Cover, however, has a very simple $O(2^{k}n^{c})$ algorithm (pick an uncovered edge, branch on which endpoint to include in the cover, mark the edges it covers, and keep going until no edges are unmarked, or until you hit your budget of $k$ and backtrack). Under polynomial-time many-one reductions (Karp reductions, i.e. what we’re doing in condition 2 above), these problems are equivalent.
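The $O(2^{k}n^{c})$ branching algorithm described above can be sketched as follows (a minimal illustration, not production code; the function name and representation of edges as vertex pairs are my own choices):

```python
def vertex_cover(edges, k):
    """Return a vertex cover of size <= k, or None if none exists.

    Branching: pick any uncovered edge (u, v). Any valid cover must
    contain u or v, so try both choices with budget k - 1. The search
    tree has depth at most k, hence O(2^k * poly(n)) time overall.
    """
    if not edges:
        return set()           # nothing left to cover: empty cover works
    if k == 0:
        return None            # edges remain but budget is exhausted: backtrack
    u, v = next(iter(edges))   # pick an arbitrary uncovered edge
    for w in (u, v):           # branch on which endpoint joins the cover
        remaining = {e for e in edges if w not in e}  # drop edges w covers
        sub = vertex_cover(remaining, k - 1)
        if sub is not None:
            return sub | {w}
    return None
```

The key point is that the exponential part depends only on the parameter $k$, not on the input size $n$, which is exactly the structural difference that Karp reductions fail to see.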

When we start to approach complexity with even slightly more delicate tools (approximation complexity, parameterized complexity, and others I can’t think of right now), the reductions we use become more strict, or rather, more sensitive to the structure of the solution, and the differences start to appear: $k$-Vertex Cover (as Yuval mentioned) has a simple 2-approximation (but doesn’t have an FPTAS unless some complexity classes collapse), $k$-Dominating Set has a $(1+\log n)$-approximation algorithm (but no $(c\log n)$-approximation for some $c>0$ under standard complexity assumptions), and approximation doesn’t really make sense at all for the straightforward version of $k$-Coloring.
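The 2-approximation for Vertex Cover mentioned above is simple enough to sketch in a few lines (again, a minimal illustration with my own naming): greedily build a maximal matching and take both endpoints of every matched edge.

```python
def two_approx_vertex_cover(edges):
    """2-approximation for Vertex Cover via a greedy maximal matching.

    Take both endpoints of each edge whose endpoints are both still
    uncovered. The chosen edges form a matching, and any optimal cover
    must contain at least one endpoint of each of them, so the result
    has at most twice the optimal size.
    """
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:
            cover.update((u, v))   # take BOTH endpoints of the matched edge
    return cover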