AMPL Development Tutorial 5/6 – Parallelizing Subproblem Solves in Benders Decomposition#

Description: In the fifth installment of our six-part series, we delve deeper by showing how to evolve our Benders decomposition Python script from a serial execution to one that solves subproblems in parallel. This transition has the potential to significantly reduce solution times, especially when dealing with complex subproblems or a large number of scenarios that generate many subproblems. In our concluding notebook, we will introduce the AMPLS library, a robust resource designed to further enhance Benders decomposition’s efficiency. It achieves this by enabling the export of AMPL model instances as persistent solver representations, thus streamlining the solution process.

Tags: amplpy, ampl, mip, stochastic, facility location, benders, decomposition, parallel solves

Notebook author: Gyorgy Matyasfalvi <gyorgy@ampl.com>

References:

AMPL a Modeling Language for Mathematical Programming – Robert Fourer et al.
Intro to Linear Optimization – Dimitris Bertsimas et al.
SCIP Optimization Suite example projects – Stephen J. Maher (ZIB)

# Install dependencies
%pip install -q amplpy pandas

# Google Colab & Kaggle integration
from amplpy import AMPL, ampl_notebook

ampl = ampl_notebook(
    modules=["open"],  # modules to install
    license_uuid="default",  # license to use
)  # instantiate AMPL object and register magics

# Import all necessary libraries
import os
import warnings
import concurrent.futures

# import queue
import pandas as pd
from typing import Tuple, Dict

Introduction#

To ensure a cohesive understanding and for ease of reference, we will revisit the foundational concepts of stochastic programming and the theory behind Benders decomposition in this notebook. This repetition will help reinforce these critical concepts and provide a seamless transition into the PYTHON script implementing Benders decomposition of our stochastic facility location problem.

Two-stage stochastic programming problem formulation#

Consider a decision-making process occurring over two sequential stages. Initially, a decision vector \(x\) is selected. Following this, new information becomes available, prompting the selection of a second-stage decision vector \(y\). We introduce \(S\) possible scenarios for the newly acquired information, with the actual scenario being revealed only after the selection of \(x\). Each scenario is indexed by \(s\), and its occurrence probability is denoted by \(\alpha^s\), which we assume to be positive. Given the sequential nature of decision-making, the choice of \(y\) can be scenario-dependent, denoted as \(y^s\) to reflect this.

The decision vectors \(x\) and \(y^s\) incur costs represented by vectors \(f\) and \(q\), respectively. The first-stage decisions are constrained by:

\[\begin{split} \begin{array}{crlllllll} & A x & = & {b} \\ & x & \geq & 0. \end{array} \end{split}\]

Furthermore, the combination of first and second-stage decisions must satisfy the scenario-specific constraints:

\[\begin{split} \begin{array}{crcllllll} & {B}^s x + {D}^s y^s & = & {d}^s \\ & y^s & \geq & 0, \end{array} \end{split}\]

for every scenario \(s\). The uncertainty in the model is encapsulated by \(\xi := (q, B, D, d)\), where a constraint is deemed part of the second stage if it involves any nonzero coefficients of variables \({y}\). The matrices \({B}^s\) are referred to as technology matrices, which determine the influence of the decision at the first stage on the second stage. The objective is to determine the choices of \(x\) and \(y^s\) for all \(s=1, \ldots, S\) that minimize the expected total cost:

\[ \begin{array}{cllllllll} & f^T x & + & \alpha_1 {(q^1)^T} y^1 & + & \alpha^2 {(q^2)^T} y^2 & \ldots & + & \alpha^S {(q^S)^T} y^S \end{array} \]

This formulation leads to the following two-stage stochastic optimization problem:

\[\begin{split} \begin{array}{crcrcrccrcl} \min & f^T x & + & \alpha^1 {(q^1)^T} y_1 & + & \alpha^2 {(q^1)^T} y^2 & \ldots & + & \alpha^s {(q^s)^T} y^S & & \\ \textrm{s.t.} & A x & & & & & & & & &= {b} \\ & {B^1} x & + & {D^1} y^1 & & & & & & &= {d^1} \\ & {B^2} x & + & & & {D^2} y^2 & & & & &= {d^2} \\ & \vdots & & & & & \ddots & & & &\vdots \\ & {B^S} x & + & & & & & & {D^S} y^S & &= {d^S} \\ & x \geq 0, & & y^1 \geq 0, & \ldots & & & & y^S \geq 0 & & \end{array} \end{split}\]

we will refer to the above problem as the master problem. Notice that even if the number of possible scenarios \(S\) is moderate, this formulation can be a large linear programming problem. However, a decomposition method can help.

Benders decomposition for two-stage problems#

Given a vector \(\hat{x}\), using the dual simplex method solve the subproblem (for all \(s \in S\)):

\[\begin{split} \begin{array}{crl} \min & (q^s)^T y^s & \\ \textrm{s.t.} & D^s y^s & = d^s - B^s \hat{x} \\ & y^s & \geq 0 \end{array} \end{split}\]
The dual of (5) takes the following form:

\[\begin{split} \begin{array}{crl} \max & (p^s)^T (d^s - B^s \hat{x}) & \\ \textrm{s.t.} & (p^s)^T D^s & <= (q^s)^T \end{array} \end{split}\]

Let

\[\begin{split} \begin{array}{crrl} & P \;\; = & \{ p & | \;\; p^T D \leq q^T \} & \\ \end{array} \end{split}\]

We assume that \(P\) is nonempty and has at least one extreme point (so we can apply duality theory). Let \(p_i\), \(i = 1, \ldots, I\), be the extreme points, and let \(w^j\), \(j = 1, \ldots, J\), be a complete set of extreme rays of \(P\).
Define \(z^s(\hat{x})^*\) as the optimal cost obtained from problem (5) described above. Returning to the optimization of \(x\), we encounter the following problem:

\[\begin{split} \begin{array}{crcrcrccrcl} \min & f^T x & + & \sum_{s=1}^S \alpha^s z^s(x) & \\ \textrm{s.t.} & A x & & & = {b} \\ & x & \geq 0 & & \end{array} \end{split}\]

Under our assumption that the set \(P\) is nonempty, either the dual problem (6) has an optimal solution and \(z^s(\hat{x})\) is finite, or the optimal dual cost is infinite (indicating that the primal (5) is infeasible).
In particular, \(z^s(x) \leq \infty \leftrightarrow (w^j)^T (d^s - B^s x) \leq 0, \; \forall j\) and whenever \(z^s(x)\) is finite the optimum of (6) must be attained at an extreme point of the set \(P\) meaning \(z^s(x)\) is the smallest number \(z^s\) such that \((p^i)^T (d^s - B^s x) \leq z^s, \; \forall i\), leading us to the following master-problem formulation:

\[\begin{split} \begin{array}{crcrcrccrcl} \min & f^T x & + & \sum_{s=1}^S \alpha^s z^s & & \\ \textrm{s.t.} & A x & & & = b & \\ & (p^i)^T (d^s - B^s x) & & & \leq z^s & \forall \; i, s \\ & (w^j)^T (d^s - B^s x) & & & \leq 0 & \forall \; j, s \\ & x & \geq 0 & & & \end{array} \end{split}\]
In our algorithm we will only generate constraints that we find to be violated by the current solution.

The algorithm#

Initialization step: initialize \((\hat{x}, \hat{z}^1, \ldots, \hat{z}^S)\) to zero.
Sub-problem step: using the dual-simplex method solve the sub-problems individually for every \(s \in S\):

\[\begin{split} \begin{array}{crl} \min & (q^s)^T y^s & \\ \textrm{s.t.} & D^s y^s & = d^s - B^s \hat{x} \\ & y^s & \geq 0 \end{array} \end{split}\]

Check optimality of master-problem: If for every \(s\), the corresponding sub-problem is feasible and \((q^s)^T \hat{y}^s \leq \hat{z}^s\) then all constraints are satisfied and we have an optimal solution to the master problem.
Add optimality cut: If the sub-problem corresponding to some \(s\) has an optimal solution such that \((q^s)^T \hat{y}^s > \hat{z}^s\) the following cut is added to the master problem:

\[ (p^s)^T (d^s - B^s x) \leq z^s, \]

where \(p^s\) is an optimal basic feasible solution to the dual of the sub-problem.

Add feasibility cut: If the sub-problem corresponding to some \(s\) is infeasible, its dual has infinite cost and the following cut is added to the master problem:

\[ (w^s)^T (d^s - B^s x) \leq 0, \]

where \(w^s\) is a positive cost extreme ray.

Stochastic facility location a concrete example#

Given:

A set of facilities: \(I\).
A set of customers: \(J\).
Set of scenarios: \(S\) (representing different customer demands).

Task:

Find the minimum cost facilities to open such that the customer demand can be satisfied in all scenarios.

Variables#

\(x_i \in \{0, 1\} \quad \forall i \in I\)
- \(x_i = 1\) if facility \(i\) is opened.
\(y_{ij}^s \geq 0 \quad \forall i \in I, \forall j \in J, \forall s \in S\)
- \(y_{ij}^s\) is the level of demand for customer \(j\) satisfied by facility \(i\) in scenario \(s\).

Parameters:#

\(\alpha^s\): the probability of scenario \(s\).
\(f_i\): the fixed cost for opening facility \(i\),
\(q_{ij}\): the cost of servicing customer \(j\) from facility \(i\),
\(\lambda_j^s\): the demand of customer \(j\) in scenario \(s\),
\(k_i:\) the capacity of facility \(i\).

The extensive form#

The extensive form of our stochastic program can be formulated as follows:

\[\begin{split} \begin{array}{rrlrrll} \min \quad & \sum_{i \in I} f_i x_i & + & \sum_{s \in S} \sum_{i \in I} \sum_{j \in J} \alpha^s q_{ij} y_{ij}^s & & & \\ & & & & & & \\ \textrm{s.t.} \quad & & & \sum_{i \in I} y_{ij}^s & \geq & \lambda_j^s & \forall j \in J, \forall s \in S \\ & & & \sum_{j \in J} y_{ij}^s & \leq & k_i x_i & \forall i \in I, \forall s \in S \\ & \sum_{i \in I} k_i x_i & & & \geq & \max_{s \in S} \sum_{j \in J} \lambda_j^s & \\ & & & & & & \\ & x_i & \in \{0, 1\} & & & & \forall i \in I \\ & y_{ij}^s & \geq 0 & & & & \forall i \in I, \forall j \in J, \forall s \in S \end{array} \end{split}\]

The sub-problem#

Given the above formulation for the extensive form (12) we can express our scenario specific sub-problems as follows

\[\begin{split} \begin{array}{lcrllll} z_s(\hat{x}) & = & \min \quad & \sum_{i \in I} \sum_{j \in J} q_{ij} y_{ij}^s & & & \\ & & & & & & \\ & & \textrm{s.t.} \quad & \sum_{i \in I} y_{ij}^s & \geq & \lambda_j^s & \forall j \in J \\ & & & \sum_{j \in J} y_{ij}^s & \leq & k_i \hat{x_i} & \forall i \in I \\ & & & & & & \\ & & & y_{ij}^s \geq 0 & & & \forall \; i \in I, \forall j \in J \end{array} \end{split}\]

The master-problem#

The master-problem takes the following form

\[\begin{split} \begin{array}{rllll} \min \quad & \sum_{i \in I} f_i x_i + \sum_{s \in S} \alpha^s z^s & & & \\ & & & & \\ \textrm{s.t.} \quad & \sum_{i \in I} k_i x_i & \geq & \max_{s \in S} \sum_{j \in J} \lambda_j^s & \\ & \sum_{j \in J} p_j^s \lambda_j^s + \sum_{i \in I} p_i^s k_i x_i & \leq & z^s & \forall s \in S, \forall p \in P^s \\ & \sum_{j \in J} w_j^s \lambda_j^s + \sum_{i \in I} w_i^s k_i x_i & \leq & 0 & \forall s \in S, \forall w \in P^s \\ & & & & \\ & x_i \in \{0, 1\} & & & \forall i \in I \\ \end{array} \end{split}\]

AMPL implementation#

Converting the mathematical formulation of our decomposition scheme into an AMPL model is straightforward. Similar to the extensive form, AMPL enables the modeling of problems in a way that closely aligns with their conceptual structure. In AMPL, sets and parameters are globally accessible, allowing for seamless integration across different problem components.

AMPL facilitates the assignment of variables, objective functions, and constraints to specific named problems using the syntax problem <problemName>: <list of model entities>. In the model file presented below, we’ve organized the content to clearly indicate which elements pertain to the master problem and which to the subproblems. This organization is primarily for clarity and ease of understanding; the AMPL interpreter treats the entire file as a single problem until it encounters explicit problem statements.

Beyond the entities familiar from the extensive form, we introduce additional parameters like sub_facility_open, which convey the master problem’s solutions to the subproblems. We also incorporate parameters for the dual values from the subproblems, namely customer_price and facility_price. New constraints, often referred to as cuts, are added to the master problem based on the subproblems’ feasibility and optimality relative to the master problem’s objectives.

Furthermore, a new set of variables, sub_variable_cost, is added to the master problem. These variables account for the variable costs identified in each scenario’s subproblem, contributing to the overall total cost calculation.

%%writefile floc_bend.mod

# Global sets
set FACILITIES; # set of facilities
set CUSTOMERS;  # set of customers
set SCENARIOS;  # set of scenarios

# Global parameters
param customer_demand{CUSTOMERS, SCENARIOS} >= 0;   # customer_demand of customer j in scenario s
param facility_capacity{FACILITIES} >= 0;           # facility_capacity of facility_open i


### SUBPROBLEM ###

# Subproblem parameters
param variable_cost{FACILITIES, CUSTOMERS} >= 0;    # variable cost of satisfying customer_demand of customer j from facility_open i

# Bender's parameters
param sub_facility_open{FACILITIES} binary, default 1; # 1 if facility i is open, 0 otherwise
param sub_scenario symbolic in SCENARIOS;

# Subproblem variables
var production{FACILITIES, CUSTOMERS, SCENARIOS} >= 0;  # production from facility i to satisfy customer demand j in scenario s

# Subproblem objective
minimize operating_cost: 
    sum{i in FACILITIES, j in CUSTOMERS} variable_cost[i,j]*production[i,j,sub_scenario]; 

# Subproblem constraints
s.t. satisfying_customer_demand{j in CUSTOMERS}:
    sum{i in FACILITIES} production[i,j,sub_scenario] >= customer_demand[j,sub_scenario];

s.t. facility_capacity_limits{i in FACILITIES}:
    sum{j in CUSTOMERS} production[i,j,sub_scenario] <= facility_capacity[i] * sub_facility_open[i];



### MASTER PROBLEM ### 

# Master parameters
param fixed_cost{FACILITIES} >= 0;               # fixed cost of opening facility_open i
param prob{SCENARIOS} default 1/card(SCENARIOS); # probability of scenario s

# Bender's parameters
param nITER >= 0 integer;
param cut_type {1..nITER, SCENARIOS} symbolic within {"opt", "feas", "none"};
param customer_price{CUSTOMERS, SCENARIOS, 1..nITER} >= 0;  # dual variable for the demand constraint
param facility_price{FACILITIES, SCENARIOS, 1..nITER} <= 0; # dual variable for the capacity constraint

# Master variables
var sub_variable_cost{SCENARIOS} >= 0; # subproblem objective bound
var facility_open{FACILITIES} binary;  # 1 if facility i is open, 0 otherwise

# Master objective
minimize total_cost: 
    sum{i in FACILITIES} fixed_cost[i]*facility_open[i] + sum{s in SCENARIOS} prob[s]*sub_variable_cost[s];

# Master constraints
s.t. sufficient_production_capacity:
    sum{i in FACILITIES} facility_capacity[i]*facility_open[i] >= max{s in SCENARIOS} sum{j in CUSTOMERS} customer_demand[j,s];

s.t. optimality_cut {k in 1..nITER, s in SCENARIOS: cut_type[k,s] == "opt"}:
        sum{j in CUSTOMERS} customer_price[j,s,k]*customer_demand[j,s] + 
        sum{i in FACILITIES} facility_price[i,s,k]*facility_capacity[i]*facility_open[i] <= sub_variable_cost[s]; 

s.t. feasibility_cut {k in 1..nITER, s in SCENARIOS: cut_type[k,s] == "feas"}: 
        sum{j in CUSTOMERS} customer_price[j,s,k]*customer_demand[j,s] + 
        sum{i in FACILITIES} facility_price[i,s,k]*facility_capacity[i]*facility_open[i] <= 0; 

Overwriting floc_bend.mod

Data as Python objects#

In this notebook, we continue to work with the same dataset as in the previous installments. However, the format in which we store our data differs; we now utilize Python lists and Pandas data frames. Initially, we create these Python objects and subsequently employ amplpy methods to seamlessly transfer the data into AMPL.

# Define the data

# Define sets
FACILITIES = ["Baytown_TX", "Beaumont_TX", "Baton_Rouge_LA"]
CUSTOMERS = ["San_Antonio_TX", "Dallas_TX", "Jackson_MS", "Birmingham_AL"]
SCENARIOS = ["Low", "Medium", "High"]

# Define parameters indexed over FACILITIES
fixed_cost_df = pd.DataFrame(
    [400000, 200000, 600000], index=FACILITIES, columns=["fixed_cost"]
)

facility_capacity_df = pd.DataFrame(
    [1550, 650, 1750], index=FACILITIES, columns=["facility_capacity"]
)

# Define parameters indexed over SCENARIOS
prob_df = pd.DataFrame([0.25, 0.5, 0.25], index=SCENARIOS, columns=["prob"])

# Define parameters indexed over CUSTOMERS and SCENARIOS
customer_demand_df = pd.DataFrame(
    [450, 650, 887, 910, 1134, 1456, 379, 416, 673, 91, 113, 207],
    index=pd.MultiIndex.from_product(
        [CUSTOMERS, SCENARIOS], names=["CUSTOMERS", "SCENARIOS"]
    ),
    columns=["customer_demand"],
)

# Define parameters indexed over FACILITIES and CUSTOMERS
variable_cost_df = pd.DataFrame(
    [
        5739.725,
        6539.725,
        8650.40,
        22372.1125,
        6055.05,
        6739.055,
        8050.40,
        21014.225,
        8650.40,
        7539.055,
        4539.72,
        15024.325,
    ],
    index=pd.MultiIndex.from_product(
        [FACILITIES, CUSTOMERS], names=["FACILITIES", "CUSTOMERS"]
    ),
    columns=["variable_cost"],
)

Solving subproblems in parallel#

Leveraging multiple processor cores enables us to solve subproblems in parallel, enhancing efficiency by allowing each subproblem to be addressed independently. To facilitate this, we’ll develop a dedicated worker function specifically tailored for solving these subproblems. Each subproblem will be processed in its own AMPL process, with the solutions integrated back into the master problem to update its parameters.

Noteworthy Changes#

In contrast to our approach in previous notebooks, where a single AMPL process managed both the master problem and all subproblems, our new setup diverges significantly. We will initiate separate AMPL processes for each subproblem, in addition to one for the master problem, resulting in a total of “number of scenarios plus one” AMPL processes. This change precludes the possibility of direct data transfers between subproblems and the master problem using AMPL’s let commands, such as:

let {j in CUSTOMERS} customer_price[j, sub_scenario, nITER] := satisfying_customer_demand[j].dual;
let {i in FACILITIES} facility_price[i, sub_scenario, nITER] := facility_capacity_limits[i].dual;

Instead, data transfer between the subproblems and the master problem will be managed through Python. This requires the master problem’s Python process to extract data from the subproblems and then load it into the master problem’s AMPL process, and vice versa.

To manage these data transformations efficiently, additional Python code is necessary. A practical solution is to develop a Python module comprised of utility and worker functions tailored for these tasks. Given our Jupyter Notebook environment, we will define these functions in a separate code cell for ease of access and organization.

Utility and worker Functions#

To streamline our approach before tackling the core of our task, we will establish a series of utility functions designed to handle data transformations and to reduce redundancy in our code. Alongside these, we’ll introduce a pivotal worker function, solve_sub(), designated to be run by individual threads in our parallel setup.

Worker functions typically have access to global symbols and data. However, in the realm of parallel computing, it’s advisable to directly supply data to the worker function. This practice promotes clarity and ensures that each operation is distinct and self-contained. Adhering to this principle, the solve_sub() function will use a specific AMPL process for each scenario, handling the necessary computations for each subproblem independently.

# Define data loading functions
def load_set_data(list_dict: dict, ampl: AMPL) -> None:
    for key, value in list_dict.items():
        ampl.set[key] = value


def load_param_data(df_dict: dict, ampl: AMPL) -> None:
    # Since we have named the columns of our data frame with the same names as the AMPL parameters, we don't need the keys
    for _, value in df_dict.items():
        ampl.set_data(value)


# Define functions for setting options
def set_benders_options(ampl: AMPL) -> None:
    # Set options
    ampl.option["solver"] = "highs"  # Select the solver
    # Ensure that the solver uses the dual simplex method and returns dual rays that's why presolve is turned off:
    ampl.option["highs_options"] = "pre:solve off alg:simplex 1"
    # Also turn off presolve in AMPL so we get info from the solver:
    ampl.option["presolve"] = 0
    ampl.option["display_eps"] = 0.000001  # Set the display precision
    ampl.option["omit_zero_rows"] = 1  # Omit zero rows in the output


# Define the subproblem initialization function
def init_sub(
    model_file_path: str, list_dict: dict, df_dict: dict, s: str
) -> Tuple[str, AMPL]:
    ampl = AMPL()  # Instantiate an AMPL object
    ampl.read(model_file_path)  # Read the model from file
    # Load sets
    load_set_data(list_dict, ampl)
    # Load parameters
    load_param_data(df_dict, ampl)
    # Set options
    set_benders_options(ampl)
    ampl.param["sub_scenario"] = s
    # Define subproblem entities
    ampl.eval(
        r"""
        problem Sub: production, operating_cost, satisfying_customer_demand, facility_capacity_limits;
        """
    )
    return s, ampl


# Define the subproblem worker function
# def solve_sub(s: str, ampl: AMPL) -> None:
def solve_sub(s: str, ampl: AMPL) -> None:
    # Solve the subproblem
    ampl.param["sub_scenario"] = s
    ampl.solve(problem="Sub")
    # After solving put the scenario and the AMPL object into the queue and make it available to the master problem
    # results_queue.put((s, ampl))


# Define the master initialization function
def init_master(model_file_path: str, list_dict: dict, df_dict: dict) -> AMPL:
    ampl = AMPL()  # Instantiate an AMPL object
    ampl.read(model_file_path)  # Read the model from file
    # Define suffix for storing dual rays
    ampl.eval(r"suffix dunbdd;")
    # Load sets
    load_set_data(list_dict, ampl)
    # Load parameters
    load_param_data(df_dict, ampl)
    # Set options
    set_benders_options(ampl)
    # Define master problem entities
    ampl.eval(
        r"""
        problem Master: facility_open, sub_variable_cost, total_cost, optimality_cut, feasibility_cut, sufficient_production_capacity; 
        problem Sub: production, operating_cost, satisfying_customer_demand, facility_capacity_limits;
        """
    )
    return ampl


# Define add feasibility cut function for the master problem
def add_feas_cut(master: AMPL, sub: AMPL, n_iter: int, s: str) -> None:
    # Extract dual rays for constraints related to customers
    # from subproblem and store them in such a way that we can readily load them into master
    c_rays_df = (
        sub.con["satisfying_customer_demand"]
        .get_values(suffixes=["dunbdd"])
        .to_pandas()
    )
    c_rays_df.rename(
        columns={"satisfying_customer_demand.dunbdd": "customer_price"}, inplace=True
    )
    c_rays_df.index = pd.MultiIndex.from_tuples(
        [(i, s, n_iter) for i in c_rays_df.index]
    )
    # Extract dual rays for constraints related to facilities
    f_rays_df = (
        sub.con["facility_capacity_limits"].get_values(suffixes=["dunbdd"]).to_pandas()
    )
    f_rays_df.rename(
        columns={"facility_capacity_limits.dunbdd": "facility_price"}, inplace=True
    )
    f_rays_df.index = pd.MultiIndex.from_tuples(
        [(i, s, n_iter) for i in f_rays_df.index]
    )
    # Update cut_type, customer_price, and facility_price inside the master
    master.param["cut_type"][(n_iter, s)] = "feas"
    master.set_data(c_rays_df)
    master.set_data(f_rays_df)


# Define add optimality cut function for the master problem
def add_opt_cut(master: AMPL, sub: AMPL, n_iter: int, s: str) -> None:
    # Extract dual rays for constraints related to customers
    # from subproblem and store them in such a way that we can readily load them into master
    c_duals_df = (
        sub.con["satisfying_customer_demand"].get_values(suffixes=["dual"]).to_pandas()
    )
    c_duals_df.rename(
        columns={"satisfying_customer_demand.dual": "customer_price"}, inplace=True
    )
    c_duals_df.index = pd.MultiIndex.from_tuples(
        [(i, s, n_iter) for i in c_duals_df.index]
    )
    # Extract dual rays for constraints related to facilities
    f_duals_df = (
        sub.con["facility_capacity_limits"].get_values(suffixes=["dual"]).to_pandas()
    )
    f_duals_df.rename(
        columns={"facility_capacity_limits.dual": "facility_price"}, inplace=True
    )
    f_duals_df.index = pd.MultiIndex.from_tuples(
        [(i, s, n_iter) for i in f_duals_df.index]
    )
    # Update cut_type, customer_price, and facility_price inside the master
    master.param["cut_type"][(n_iter, s)] = "opt"
    master.set_data(c_duals_df)
    master.set_data(f_duals_df)


# Define update subproblem function
def update_sub(master: AMPL, sub_dict: Dict[str, AMPL]) -> None:
    # Pass master solution to subproblem as parameters
    sub_facility_open_df = master.var["facility_open"].get_values().to_pandas()
    sub_facility_open_df.rename(
        columns={"facility_open.val": "sub_facility_open"}, inplace=True
    )
    for _, ampl in sub_dict.items():
        ampl.set_data(sub_facility_open_df)


# Close subproblem AMPL objects
def close_sub(sub_dict: Dict[str, AMPL]) -> None:
    for _, ampl in sub_dict.items():
        ampl.close()

Parallel Benders#

Key elements of our parallel Benders implementation include:

Data Management: We employ dictionaries for each data type (Python lists, Pandas data frames) to streamline and clarify data loading operations, reducing code redundancy.
Core Availability Check: We assess the number of available processor cores. If the number of scenarios exceeds the available cores, we issue a warning to the user.
AMPL Object Initialization: For each scenario, we initialize an ampl_dict object, each containing an AMPL object configured to run as an independent process.
Subproblem Executor: We establish sub_executor, a dedicated object for managing the execution of subproblems.
While Loop Execution: The iterative process begins similarly to previous implementations, but with the solve operations are conducted outside the for loop in parallel.
Violation Checks and Cuts: Within the for loop, we simply identify violations and append the necessary cuts to the master problem.
Cleanup Operations: Upon exiting the while loop, we undertake cleanup operations to prevent the persistence of unused processes, ensuring system resources are efficiently managed.

This structured approach allows us to leverage parallel computing capabilities, potentially enhancing the efficiency and speed of the Benders decomposition process.

# Define data dictionaries for easier loading
list_dict = {"FACILITIES": FACILITIES, "CUSTOMERS": CUSTOMERS, "SCENARIOS": SCENARIOS}
df_dict = {
    "fixed_cost": fixed_cost_df,
    "variable_cost": variable_cost_df,
    "customer_demand": customer_demand_df,
    "facility_capacity": facility_capacity_df,
    "prob": prob_df,
}

# Check available threads if there aren't enough then warn the user
num_cores = os.cpu_count()
num_cores_needed = len(SCENARIOS)
if num_cores < num_cores_needed:
    warnings.warn(
        f"Number of threads ({num_cores}) is less than the number of scenarios ({num_cores_needed})."
    )

# Instantiate an AMPL object for the master problem, load the model, and set options
file_path = "floc_bend.mod"  # Define the file path
master = init_master(file_path, list_dict, df_dict)
master.eval("option solver;")

# Set up AMPL and Python for Benders decomposition
# Loop logic variables
epsilon = 0.00001
n_iter = 0
master.param["nITER"] = n_iter
violation_ctr = 0

# Create a queue to hold the results of the subproblem
# results_queue = queue.Queue()
# Create AMPL objects for each scenario
sub_ampl_dict = dict(init_sub(file_path, list_dict, df_dict, s) for s in SCENARIOS)
# Create thread handlers for each scenario
sub_executor = concurrent.futures.ThreadPoolExecutor(max_workers=len(SCENARIOS))

while True:
    violation_ctr = 0
    n_iter += 1
    master.param["nITER"] = n_iter
    print(f"\n\nITERATION {n_iter}\n")

    # Solve each subproblem in parallel
    ### BEGIN PARALLEL COMPUTATION ###
    # futures = {sub_executor.submit(solve_sub, s, sub_ampl_dict[s], results_queue): s for s in SCENARIOS}
    futures = {
        sub_executor.submit(solve_sub, s, sub_ampl_dict[s]): s for s in SCENARIOS
    }
    concurrent.futures.wait(futures)
    ### END PARALLEL COMPUTATION ###

    # Check results of each sub problem
    for s, sub in sub_ampl_dict.items():
        # Check for feasibility
        if sub.solve_result == "infeasible":
            print(f"{n_iter}: Feasibility cut added for scenario {s}")
            add_feas_cut(master, sub, n_iter, s)
        # Check for optimality
        elif (
            sub.obj["operating_cost"].value()
            > master.var["sub_variable_cost"][s].value() + epsilon
        ):
            print(f"{n_iter}: Optimality cut added for scenario {s}")
            add_opt_cut(master, sub, n_iter, s)
        else:
            print(f"{n_iter}: No cut needed for scenario {s}")
            violation_ctr += 1
            master.param["cut_type"][(n_iter, s)] = "none"

    # Exit loop if no cuts are needed indicating optimality
    if violation_ctr == len(SCENARIOS):
        break

    # Solve the master problem
    print(f"\nSOLVING THE MASTER PROBLEM\n")
    master.solve(problem="Master")
    update_sub(master, sub_ampl_dict)

# Clean up threads and AMPL objects
sub_executor.shutdown()
close_sub(sub_ampl_dict)

print(
    f"\n\nOPTIMAL SOLUTION FOUND\nExpected cost = {master.obj['total_cost'].value()}\n"
)
master.eval("display facility_open;")
assert master.solve_result == "solved", master.solve_result

option solver highs;


ITERATION 1

HiGHS 1.6.0:   pre:solve = off
  alg:simplex = 1
HiGHS 1.6.0: HiGHS 1.6.0:  pre:solve = off
  pre:solve = off
  alg:simplex = 1
HiGHS 1.6.0: optimal solution; objective 11621793.46
4 simplex iterations
0 barrier iterations
  alg:simplex = 1
HiGHS 1.6.0: optimal solution; objective 21050711.2
6 simplex iterations
0 barrier iterations
HiGHS 1.6.0: optimal solution; objective 14779784.86
5 simplex iterations
0 barrier iterations
1: Optimality cut added for scenario Low
1: Optimality cut added for scenario Medium
1: Optimality cut added for scenario High

SOLVING THE MASTER PROBLEM

HiGHS 1.6.0:   pre:solve = off
  alg:simplex = 1
HiGHS 1.6.0: optimal solution; objective 16688018.6
0 simplex iterations
1 branching nodes


ITERATION 2

HiGHS 1.6.0: HiGHS 1.6.0: HiGHS 1.6.0:   pre:solve = off
  alg:simplex = 1
  pre:solve = off
  pre:solve = off
  alg:simplex = 1
  alg:simplex = 1
HiGHS 1.6.0: optimal solution; objective 11621793.46
0 simplex iterations
0 barrier iterations
HiGHS 1.6.0: optimal solution; objective 14966984.87
1 simplex iterations
0 barrier iterations
HiGHS 1.6.0: optimal solution; objective 21570711.2
0 simplex iterations
0 barrier iterations
2: No cut needed for scenario Low
2: Optimality cut added for scenario Medium
2: No cut needed for scenario High

SOLVING THE MASTER PROBLEM

HiGHS 1.6.0:   pre:solve = off
  alg:simplex = 1
HiGHS 1.6.0: optimal solution; objective 16758018.6
1 simplex iterations
1 branching nodes


ITERATION 3

HiGHS 1.6.0:   pre:solve = off
HiGHS 1.6.0:   alg:simplex = 1
HiGHS 1.6.0:  pre:solve = off
  pre:solve = off
  alg:simplex = 1
  alg:simplex = 1
HiGHS 1.6.0: optimal solution; objective 11621793.46
0 simplex iterations
0 barrier iterations
HiGHS 1.6.0: optimal solution; objective 14779784.87
1 simplex iterations
0 barrier iterations
HiGHS 1.6.0: optimal solution; objective 21050711.2
0 simplex iterations
0 barrier iterations
3: No cut needed for scenario Low
3: No cut needed for scenario Medium
3: No cut needed for scenario High


OPTIMAL SOLUTION FOUND
Expected cost = 16758018.596250001

facility_open [*] :=
Baton_Rouge_LA  1
    Baytown_TX  1
   Beaumont_TX  1
;

Conclusion#

In this section, we’ve illustrated the process of integrating parallel computations into the Benders decomposition algorithm, with only minimal modifications required from our original serial Python implementation. As the complexity of our code increases, we’ve adopted certain practices to enhance the modularity and robustness of our application.

Typically, this stage would involve the development of Python modules and the implementation of comprehensive exception handling to prepare the application for deployment. However, a detailed exploration of these advanced topics falls beyond the scope of this series.

While our current implementation efficiently addresses subproblems, each solver instance still begins its computation from scratch. In the concluding notebook of this series, we will explore the use of the AMPLS library to maintain persistent solver instances, thereby further improving the efficiency of our solution process.