# Measuring the difficulty of SAT instances

Given an instance of SAT, I would like to be able to estimate how difficult it will be to solve the instance.

One way is to run existing solvers, but that somewhat defeats the purpose of estimating difficulty. A second way might be to look at the ratio of clauses to variables, as is done for phase transitions in random SAT, but I am sure better methods exist.
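As a minimal sketch of that ratio heuristic (assuming the instance is given in DIMACS CNF with one clause per line; the function name is my own):

```python
def clause_variable_ratio(dimacs_text):
    """Parse a DIMACS CNF string and return (num_clauses, num_vars, ratio).

    Counts the clauses actually present rather than trusting the header.
    """
    num_vars = 0
    clauses = 0
    for line in dimacs_text.splitlines():
        line = line.strip()
        if not line or line.startswith("c"):
            continue  # skip comments and blank lines
        if line.startswith("p"):
            num_vars = int(line.split()[2])  # header: "p cnf <vars> <clauses>"
            continue
        clauses += 1  # assumption: one 0-terminated clause per line
    return clauses, num_vars, clauses / num_vars

# For uniform random 3-SAT, the empirically hardest instances cluster
# near a clauses/variables ratio of roughly 4.27 (the phase transition).
cnf = "p cnf 4 17\n1 2 3 0\n-1 2 4 0\n" + "\n".join("1 -2 -3 0" for _ in range(15))
m, n, r = clause_variable_ratio(cnf)
```

Note this is only meaningful for random instances; structured industrial instances can be easy or hard at any ratio.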

Given an instance of SAT, are there some fast heuristics to measure the difficulty? The only condition is that these heuristics be faster than actually running existing SAT solvers on the instance.

### Related question

Which SAT problems are easy? on cstheory.SE. That question asks about tractable sets of instances. This is a similar question, but not exactly the same. I am really interested in a heuristic that, given a single instance, makes some sort of semi-intelligent guess as to whether the instance will be a hard one to solve.

In general, this is a very relevant and interesting research question. Regarding "One way is to run existing solvers…": what would that actually tell us? We could observe empirically that an instance seems hard for a specific solver or a specific algorithm/heuristic, but what does that really say about the hardness of the instance itself?

One way that has been pursued is the identification of various structural properties of instances that lead to efficient algorithms. These properties should ideally be "easily" identifiable. An example is the topology of the underlying constraint graph, measured using various graph width parameters. For example, it is known that an instance is solvable in polynomial time if the treewidth of the underlying constraint graph is bounded by a constant.
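Computing treewidth exactly is itself NP-hard, but cheap elimination heuristics give usable upper bounds. A sketch under my own naming, using min-degree elimination on the primal graph (one vertex per variable, an edge between variables that co-occur in a clause):

```python
from itertools import combinations

def treewidth_upper_bound(clauses):
    """Upper-bound the treewidth of a CNF's primal graph via the
    min-degree elimination heuristic.

    clauses: iterable of clauses, each a collection of non-zero integer
    literals. Repeatedly eliminate a minimum-degree vertex, turning its
    neighbourhood into a clique; the largest degree observed at
    elimination time is an upper bound on the treewidth.
    """
    adj = {}
    for clause in clauses:
        vs = {abs(l) for l in clause}
        for v in vs:
            adj.setdefault(v, set())
        for u, v in combinations(vs, 2):
            adj[u].add(v)
            adj[v].add(u)
    width = 0
    while adj:
        v = min(adj, key=lambda x: len(adj[x]))  # min-degree vertex
        nbrs = adj.pop(v)
        width = max(width, len(nbrs))
        for u, w in combinations(nbrs, 2):       # fill-in edges
            adj[u].add(w)
            adj[w].add(u)
        for u in nbrs:
            adj[u].discard(v)
    return width

# A chain-like formula has a path-shaped primal graph, so treewidth 1.
chain = [[1, -2], [2, -3], [3, -4]]
```

A small bound suggests the instance may be easy for structure-exploiting algorithms; a large bound is inconclusive, since the heuristic only bounds from above.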

Another approach has focused on the role of hidden structure of instances. One example is the backdoor set, meaning the set of variables such that when they are instantiated, the remaining problem simplifies to a tractable class. For example, Williams et al., 2003 [1] show that even when taking into account the cost of searching for backdoor variables, one can still obtain an overall computational advantage by focusing on a backdoor set, provided the set is sufficiently small. Furthermore, Dilkina et al., 2007 [2] note that a solver called Satz-Rand is remarkably good at finding small strong backdoors on a range of experimental domains.
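To make the backdoor idea concrete, here is a brute-force sketch (not the algorithm of [1] or [2]; names and the choice of unit propagation as the sub-solver are my own) that searches for a smallest strong backdoor with respect to unit propagation:

```python
from itertools import combinations, product

def unit_propagate(clauses, assignment):
    """Simplify clauses under the assignment, then run unit propagation.
    Returns True (satisfied), False (falsified), or None (undecided)."""
    assignment = dict(assignment)
    while True:
        simplified, unit = [], None
        for clause in clauses:
            lits, satisfied = [], False
            for l in clause:
                v = assignment.get(abs(l))
                if v is None:
                    lits.append(l)
                elif (l > 0) == v:
                    satisfied = True
                    break
            if satisfied:
                continue
            if not lits:
                return False          # empty clause: falsified
            if len(lits) == 1 and unit is None:
                unit = lits[0]
            simplified.append(lits)
        if not simplified:
            return True               # every clause satisfied
        if unit is None:
            return None               # propagation alone cannot decide
        assignment[abs(unit)] = unit > 0
        clauses = simplified

def smallest_strong_backdoor(clauses, max_size=3):
    """Find a smallest strong backdoor w.r.t. unit propagation: a set of
    variables such that EVERY assignment to them lets propagation decide
    the rest. Exponential brute force; a sketch for tiny formulas only."""
    variables = sorted({abs(l) for c in clauses for l in c})
    for k in range(max_size + 1):
        for subset in combinations(variables, k):
            if all(unit_propagate(clauses, zip(subset, bits)) is not None
                   for bits in product([False, True], repeat=k)):
                return list(subset)
    return None
```

A small backdoor hints that the instance is easier than its size suggests; of course, a practical estimator would use the structured search of [1] rather than enumeration.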

More recently, Ansótegui et al., 2008 [3] propose tree-like space complexity as a hardness measure for DPLL-based solvers. They prove that constant-bounded space also implies the existence of a polynomial-time decision algorithm, with the space being the degree of the polynomial (Theorem 6 in the paper). Moreover, they show that the space is smaller than the size of cycle-cutsets; in fact, under certain assumptions, the space is also smaller than the size of backdoors.

They also formalize what I think you are after, that is:

> Find a measure $\psi$, and an algorithm that given a formula $\Gamma$ decides satisfiability in time $O(n^{\psi(\Gamma)})$. The smaller the measure is, the better it characterizes the hardness of a formula.

[1] Williams, Ryan, Carla P. Gomes, and Bart Selman. “Backdoors to typical case complexity.” International Joint Conference on Artificial Intelligence. Vol. 18, 2003.

[2] Dilkina, Bistra, Carla Gomes, and Ashish Sabharwal. “Tradeoffs in the Complexity of Backdoor Detection.” Principles and Practice of Constraint Programming (CP 2007), pp. 256-270, 2007.

[3] Ansótegui, Carlos, Maria Luisa Bonet, Jordi Levy, and Felip Manya. “Measuring the Hardness of SAT Instances.” In Proceedings of the 23rd National Conference on Artificial Intelligence (AAAI’08), pp. 222-228, 2008.