As it is the main topic of my research I write quite a bit about the Benders Decomposition Method (Benders, 1962). In this post I will try to present a short, yet complete, introduction to this decomposition method.

## How to Benders

Consider the following general LP, in standard form, where $x$ is a real valued variable and $y$ is a variable whose domain is defined by polyhedron $\mathbb{Y}$. We call $y$ a complicating variable because if we were to fix it the resulting LP would be significantly easier. \begin{align} \min && c^T x + f^T y \\ s.t. && Ax + By & \geq b & ~\\ && x \geq 0, y \in \mathbb{Y} \end{align}

The Benders decomposition method partitions the problem in two: a master problem containing the $y$ variables and a sub-problem containing the $x$ variables. With $q(y)$ as the incumbent value for the $x$ part, in other words the problem obtained by projecting out the $x$ variables, we can define a LP using only variable $y$. \begin{align} \min && f^T y + q(y) & ~ & ~ \\ && y \in \mathbb{Y} \end{align}

We then have the sub-problem in terms of $x$. Do note that if the sub-problem is unbounded, then the original problem is unbounded as well. Assuming it is bounded, we can calculate the value of $q(y)$ by solving the following LP. \begin{align} \min && c^T x \\ s.t. && Ax & \geq b - By \\ && x \geq 0 \end{align}

If we consider $\alpha$ the dual variable associated with $(7)$, we can define the dual of the problem above. An important observation is that the solution space of the dual does not depend on $y$. Therefore, if the dual feasible region is empty, the dual problem is infeasible and the sub-problem is unbounded. \begin{align} \max && \alpha^T (b - By) \\ s.t. && A^T \alpha \leq c & ~\\ && \alpha \geq 0 \end{align}

Assuming the solution space is not empty we can enumerate all extreme rays $\rho_i, i \in [1, I]$ and extreme points $\pi_j, j \in [1, J]$ of the feasible region. Given a solution vector $\bar{y}$, we can solve the dual problem by checking if we can find:

1. $\rho_i$ such that: $\rho_i^T (b - B\bar{y}) > 0$, in which case the dual is unbounded;
2. $\pi_i$ maximising: $\pi_j^T (b - B\bar{y})$, in which case both the primal and dual have finite solutions.

Following this idea, we can rewrite the sub-problem in terms of a single variable $q$. This new problem tries to minimise $q$ while respecting these two constraints.

Once we have this definition of the sub-problem, we can rewrite the master problem in terms of $y$ and $q$ only. In Benders, the constraints and are called cuts. \begin{align} \min && f^T y + q \\ s.t. && \rho_i^T (b - By) & \leq 0 & \forall i \in [1, I] \\ && \pi_j^T (b - By) & \leq q & \forall j \in [1, J] \\ && q \in {\mathbb{R}}, y \in {\mathbb{Y}} \end{align}

Hence, since there is an exponential number of extreme points and extreme rays, and because their enumeration is $\mathcal{NP}$-hard, Benders starts without any cut and solves a relaxed master problem which gives a candidate solution $(y^\ast, q^\ast)$. Using this solution, it solves the sub-problem to obtain a value $q(y^\ast)$. If $q^\ast = q(y^\ast)$ then the candidate solution is optimal for the original problem. Otherwise we have two cases:

1. the dual is unbounded, then we select an extreme ray to generate a constraint of type $(13)$, which is called a feasibility cut because it removes infeasible solutions;
2. we have $q(y^\ast) > q^\ast$, then we select and extreme point to add a constraint of type $(14)$, which is called an optimality cut as it gives a lower bound for the incumbent.

The algorithm then repeats the previous steps: it solves the master problem, now with an additional constraint, and checks whether the solution is optimal or if a new cut is needed.

Because there is a finite number of extreme points and extreme rays the algorithm finishes in a finite number of steps.

### Bonus: How to detect faults

Benders is well known to be hard to implement. Although the method is systematic and its application seems straightforward, seldom is its implementation trivial. Detecting issues in the algorithm, especially during long runs with a high number of iterations, can be tricky, making debugging a chore.

However, through my short career I had to debug enough implementations as to find it useful to list a number of properties that should hold during the execution, making at least the debugging process faster.

Do note that even using these methods will not prevent some errors from creeping in, especially if you are using modified versions of Benders.

#### Checking the value of the sub-problem

At each iteration, the objective value of the master problem provides a lower bound on the global solution while the value of the sub-problem gives an upper bound. You have to keep track of the best upper bound as its value can oscillate.

If the value of the sub-problem is lower than the value of the incumbent in the master problem, it means that there is an error in the generated cuts.

#### Keeping track of the last solution

The algorithm is an ever improving process. Because at each iteration we add a cut which removes at least the current solution, it should not be possible to obtain the same solution twice unless we found the optimum.

By saving the last solution, or its value, we can detect a convergence issue early on if we obtain the same solution twice in a row while the termination criterion is not met.

#### Verifying the cuts generated

Most issues in Benders come from generating invalid cuts. One way to verify that the coefficients extracted from the dual information of the sub-problem are correct is to compare them with what their actual value.

The left-hand side of a cut is actually the objective function of the dual in which we reintroduce the master variables. It is therefore simple to verify that the coefficients we use in the cut actually sum to be the objective value of the dual given the current solution.

1. Benders, J. F. (1962). Partitioning procedures for solving mixed-variables programming problems. Numerische Mathematik, 4(1), 238–252.