Jan Elffers, Jesús Giráldez-Cru, Jakob Nordström, Marc Vinyals

Using Combinatorial Benchmarks to Probe the Reasoning Power of Pseudo-Boolean Solvers

SAT 2018

We study cdcl-cuttingplanes, Open-WBO, and Sat4j, three successful solvers from the Pseudo-Boolean Competition 2016, and evaluate them by performing experiments on crafted benchmarks designed to be trivial for the cutting planes (CP) proof system underlying pseudo-Boolean (PB) proof search, but yet potentially tricky for PB solvers.

Our results demonstrate severe shortcomings in state-of-the-art PB solving techniques. Despite the fact that our benchmarks have linear-size tree-like CP proofs, the solvers often perform quite badly even for very small instances. Our analysis is that this shows that solvers need to explore stronger methods of pseudo-Boolean reasoning within cutting planes.

We make an empirical observation from the competition data that many of the easy crafted instances are also infeasible over the rational numbers, or have small strong backdoors to PB instances without rational solutions. This raises the intriguing question whether the existence of such backdoors can be correlated with easiness/hardness. However, for some of our constructed benchmark families even rationally infeasible instances are completely beyond reach. This indicates that PB solvers need to get better not only at Boolean reasoning but even at linear programming.

Finally, we compare CP-based solvers with CDCL and MIP solvers. For those of our benchmarks where the natural CNF encodings admit efficient resolution proofs, we see that the CDCL-based solver Open-WBO is orders of magnitude faster than the CP-based solvers cdcl-cuttingplanes and Sat4j (though it seems very sensitive to the ordering of the input). And the MIP solver Gurobi beats all of these solvers across the board.

These experimental results point to several crucial challenges in the quest for more efficient pseudo-Boolean solvers, and we also believe that a further study of our benchmarks could shed important light on the potential and limitations of current state-of-the-art PB solving.