AMPL Development Tutorial 1/6 – Capacitated Facility Location Problem#

1_floc.ipynb Open In Colab Kaggle Gradient Open In SageMaker Studio Lab Hits

Description: This notebook marks the beginning of a six-part series. We start with a straightforward facility location problem, demonstrating how to utilize AMPL and the amplpy module within a Jupyter Notebook to find a solution. As the series progresses, the problem complexity will increase, evolving into a two-stage stochastic programming problem. From the third notebook onwards, our focus will pivot from modeling to algorithm development in AMPL. We’ll explore four distinct approaches to implementing the Benders decomposition: initially through AMPL scripting, then by shifting control flow to the amplpy module. This transition allows us to illustrate how subproblems can be solved in parallel. Lastly, we’ll introduce our AMPLS library, which facilitates exporting AMPL model instances to persistent solver representations. This approach enables an exceptionally efficient Benders decomposition implementation.

Tags: amplpy, ampl, mip, facility location

Notebook author: Gyorgy Matyasfalvi <gyorgy@ampl.com>

References:

  • AMPL a Modeling Language for Mathematical Programming – Robert Fourer et al.

  • SCIP Optimization Suite example projects – Stephen J. Maher (ZIB)

# Install dependencies
%pip install -q amplpy
# Google Colab & Kaggle integration
from amplpy import AMPL, ampl_notebook

ampl = ampl_notebook(
    modules=["open"],  # modules to install
    license_uuid="default",  # license to use
)  # instantiate AMPL object and register magics

Problem description#

Facility location decisions are crucial and often involve significant investment for both public and private sector entities, bearing profound social, economic, and environmental implications. The strategic positioning of facilities, such as warehouses, factories, and service centers, can determine an organization’s operational efficiency, market reach, and overall sustainability.

Given the high stakes of these decisions, engineers and analysts have developed sophisticated models to aid organizations in identifying optimal locations. These models take into account a variety of factors, including but not limited to, transportation costs, proximity to customers and suppliers, labor availability, customer demand, and environmental regulations.

The challenge is compounded when considering the uncertainty inherent in future conditions. Factors such as fluctuating market demands, changes in infrastructure, and unpredictable socio-economic developments require a robust approach to facility location. Therefore, as we will explore in the second notebook of this series, engineers frequently utilize stochastic models and robust optimization techniques. These approaches are designed to accommodate uncertainties, ensuring that the selected locations remain effective across various potential future scenarios.

Mixed integer program#

Below you can find a simple capacitated facility location problem as an explicit mixed integer program.

Given:

  • A set of facilities: \(I\).

  • A set of customers: \(J\).

Task:

  • Find the minimum cost facilities to open such that the customer demand can be satisfied.

Variables#

  • \(x_i \in \{0, 1\} \quad \forall i \in I\)

    • \(x_i = 1\) if facility \(i\) is opened.

  • \(y_{ij} \geq 0 \quad \forall i \in I, \forall j \in J\)

    • \(y_{ij}\) is the level of demand for customer \(j\) satisfied by facility \(i\).

Parameters:#

  • \(f_i\): the fixed cost for opening facility \(i\),

  • \(q_{ij}\): the cost of servicing customer \(j\) from facility \(i\),

  • \(\lambda_j\): the demand of customer \(j\),

  • \(k_i:\) the capacity of facility \(i\).

The explicit form#

The explicit mixed integer program can be formulated as follows:

\[\begin{split} \begin{array}{rll} \min \quad & \sum_{i \in I} f_i x_i + \sum_{i \in I} \sum_{j \in J} q_{ij} y_{ij} & \\ & & \\ \textrm{subject to} \quad & \sum_{i \in I} y_{ij} \geq \lambda_j & \forall j \in J \\ & \sum_{j \in J} y_{ij} \leq k_i x_i & \forall i \in I \\ & \sum_{i \in I} k_i x_i \geq \sum_{j \in J} \lambda_j & \\ & & \\ & x_i \in \{0, 1\} & \forall i \in I \\ & y_{ij} \geq 0 & \forall i \in I, \forall j \in J \end{array} \end{split}\]

AMPL implementation#

Translating the mathematical formulation of our optimization problem into an AMPL model is a direct process. The AMPL code closely mirrors each inequality in the system (1), preserving the structure of the mathematical model.

AMPL’s expressive syntax allows for meaningful names for entities such as variables, parameters, and constraints, enhancing the model’s clarity. For instance, we’ll represent the set \(I\) as FACILITIES, and \(J\) as CUSTOMERS. Variables previously denoted as \(x\) and \(y\) will be named facility_open and production, respectively.

Similarly, we will rename our parameters for greater clarity: \(f_i\) becomes fixed_cost, \(q_ij\) is now variable_cost, \(\lambda_j\) is referred to as customer_demand, and \(k_i\) is labeled as facility_capacity.

Using descriptive names not only enhances the readability of the model but also its maintainability and the ease with which it can be shared and understood by others.

Creating the model file#

Creating an AMPL model file is straightforward: simply open a text editor, type your AMPL model, and then submit it to the ampl interpreter. You can do this either by passing it as a command-line argument or by using it in interactive mode. For more details on these processes, refer to the AMPL book.

In the context of a Jupyter Notebook, like the one we are using, you can create an AMPL model file directly within a code cell. To do this, we’ll employ the %%writefile magic command, which acts as our text editor for generating the model file. The syntax %%writefile <filename> will create a file named <filename> in the notebook’s current working directory. As far as ampl is concerened, the file extension is not crucial, but it’s customary to use the .mod suffix for model files.

As we progress and begin to utilize AMPL’s Python API, we’ll demonstrate how to load these model files into AMPL for further use.

%%writefile floc_det.mod
# Sets
set FACILITIES; # set of facilities
set CUSTOMERS;  # set of customers

# Variables
var facility_open{FACILITIES} binary;        # 1 if facility i is open, 0 otherwise
var production{FACILITIES, CUSTOMERS} >= 0;  # production from facility i to satisfy customer demand

# Parameters
param fixed_cost{FACILITIES} >= 0;                # fixed cost of opening facility_open i
param variable_cost{FACILITIES, CUSTOMERS} >= 0;  # variable cost of satisfying customer_demand of customer j from facility_open i
param customer_demand{CUSTOMERS} >= 0;            # customer_demand of customer j
param facility_capacity{FACILITIES} >= 0;         # facility_capacity of facility_open i

# Objective
minimize TotalCost: 
    sum{i in FACILITIES} fixed_cost[i] * facility_open[i] +                     # Fixed cost of opening facility i 
    sum{i in FACILITIES, j in CUSTOMERS} variable_cost[i,j] * production[i,j];  # Variable cost of satisfying customer demand j from facility i

# Constraints
s.t. satisfying_customer_demand{j in CUSTOMERS}:
    sum{i in FACILITIES} production[i,j] >= customer_demand[j]; # Satisfy customer demand j

s.t. facility_capacity_limits{i in FACILITIES}:
    sum{j in CUSTOMERS} production[i,j] <= facility_capacity[i] * facility_open[i]; # Respect facility capacity limits

s.t. sufficient_production_capacity:
    sum{i in FACILITIES} facility_capacity[i]*facility_open[i] >= sum{j in CUSTOMERS} customer_demand[j]; # Ensure there is enough production capacity to satisfy all customer demand
Writing floc_det.mod

Specifying Data#

Once you have formulated your optimization problem mathematically, as discussed previously, writing the AMPL model becomes relatively straightforward. For detailed guidance on AMPL syntax, the AMPL book is an excellent resource. However, a model without data will result in an error message, not a solution.

AMPL supports a variety of data loading methods. In production environments, data is typically sourced from databases, CSV or Excel files, or existing data frames, such as those in Pandas. AMPL is equipped with features and tools that allow for seamless data importation from any of these sources.

It’s important to note that AMPL also supports loading data from a dedicated data file, usually with a .dat suffix. However, if your data is already in an Excel file, a database, a data frame, or another storage format, there is NO NEED to convert it into a .dat file for AMPL. Using .dat files can be convenient if you don’t have access to real data yet and wish to test your model.

Below, we will demonstrate how to provide data using .dat files. In subsequent notebooks, as we increasingly utilize AMPL’s Python API, we will transition to using Pandas data frames for data input. For a comprehensive overview of loading data from databases, Excel, or CSV files, please refer to our data connectors page.

Various demands#

For our specific problem we will specify 3 data files. Each data file will represent a customer demand scenario: low, medium, or high. Similar as above we will use the jupyter notebook magic %%writefile <filename> to write our data files in the current working directory. Outside of a Jupyter notebook you can use any text editor to do this.

%%writefile floc_low.dat

set FACILITIES  := Baytown_TX Beaumont_TX Baton_Rouge_LA;
set CUSTOMERS   := San_Antonio_TX Dallas_TX Jackson_MS Birmingham_AL;

param fixed_cost := Baytown_TX 400000 Beaumont_TX 200000 Baton_Rouge_LA 600000;

param facility_capacity := Baytown_TX 1550 Beaumont_TX 650 Baton_Rouge_LA 1750;

param variable_cost:        San_Antonio_TX  Dallas_TX    Jackson_MS   Birmingham_AL :=
             Baytown_TX     5739.725        6539.725     8650.40      22372.1125
             Beaumont_TX    6055.05         6739.055     8050.40      21014.225 
             Baton_Rouge_LA 8650.40         7539.055     4539.72      15024.325;

param customer_demand :=
               San_Antonio_TX  650      
               Dallas_TX       1134            
               Jackson_MS      416      
               Birmingham_AL   113;
Writing floc_low.dat
%%writefile floc_med.dat

set FACILITIES  := Baytown_TX Beaumont_TX Baton_Rouge_LA;
set CUSTOMERS   := San_Antonio_TX Dallas_TX Jackson_MS Birmingham_AL;

param fixed_cost := Baytown_TX 400000 Beaumont_TX 200000 Baton_Rouge_LA 600000;

param facility_capacity := Baytown_TX 1550 Beaumont_TX 650 Baton_Rouge_LA 1750;

param variable_cost:        San_Antonio_TX  Dallas_TX    Jackson_MS   Birmingham_AL :=
             Baytown_TX     5739.725        6539.725     8650.40      22372.1125
             Beaumont_TX    6055.05         6739.055     8050.40      21014.225 
             Baton_Rouge_LA 8650.40         7539.055     4539.72      15024.325;

param customer_demand :=
               San_Antonio_TX  650      
               Dallas_TX       1134            
               Jackson_MS      416      
               Birmingham_AL   113;
Writing floc_med.dat
%%writefile floc_high.dat

set FACILITIES  := Baytown_TX Beaumont_TX Baton_Rouge_LA;
set CUSTOMERS   := San_Antonio_TX Dallas_TX Jackson_MS Birmingham_AL;

param fixed_cost := Baytown_TX 400000 Beaumont_TX 200000 Baton_Rouge_LA 600000;

param facility_capacity := Baytown_TX 1550 Beaumont_TX 650 Baton_Rouge_LA 1750;

param variable_cost:        San_Antonio_TX  Dallas_TX    Jackson_MS   Birmingham_AL :=
             Baytown_TX     5739.725        6539.725     8650.40      22372.1125
             Beaumont_TX    6055.05         6739.055     8050.40      21014.225 
             Baton_Rouge_LA 8650.40         7539.055     4539.72      15024.325;

param customer_demand :=
               San_Antonio_TX        887 
               Dallas_TX             1456       
               Jackson_MS            673
               Birmingham_AL         207;
Writing floc_high.dat

Solve for each scenario and check for optimality#

Outside of a Jupyter notebook, there are several approaches to solve our model for each scenario. You might, for example, use a bash script to iterate through all files, invoking ampl on each one. Alternatively, an AMPL script named det.run could be written as follows:

option solver highs;
option omit_zero_rows 1, omit_zero_cols 1, display_precision 0;
model floc_det.mod;

data floc_low.dat;
solve;
print "LOW DEMAND COST:";
display TotalCost;
print "LOW DEMAND SOLUTION:";
display facility_open;

reset data;
data floc_med.dat;
solve;
print "MEDIUM DEMAND COST:";
display TotalCost;
print "MEDIUM DEMAND SOLUTION:";
display facility_open;

reset data;
data floc_high.dat;
solve;
print "HIGH DEMAND COST:";
display TotalCost;
print "HIGH DEMAND SOLUTION:";
display facility_open;

This script can then be executed from the terminal with the command ampl det.run.

Within a Jupyter Notebook, we’ll leverage amplpy for a similar purpose as outlined in the AMPL script above. The Python API will allow us to instantiate an AMPL object and utilize its methods to read model and data files, configure options, issue solve commands, and refresh our data as needed.

It’s important to note that customer_demand is the only parameter that varies across the different scenarios in our data files. This observation suggests that it might not be necessary to create three distinct data files and reset all our data after each solve, as the AMPL script above does. Instead, we can begin with a single data file for the low demand scenario, floc_low.dat. For the medium and high scenarios, we can simply use AMPL’s update data command to modify the customer_demand parameter accordingly.

Solve for low demand#

Relying on AMPL’s Python API.

# This section relies on AMPL's Python API to solve the problem

ampl = AMPL()  # Instantiate an AMPL object
ampl.read("floc_det.mod")  # Read the model from file
ampl.read_data("floc_low.dat")  # Read the data from file
ampl.eval(
    r"""
    print "Number of variables: ",  _nvars;
    print "Number of constraints: ", _ncons;
    option show_stats 1;
    """
)  # Display total number of variables and constraints

ampl.option["solver"] = "highs"  # Select the solver
ampl.option["display_precision"] = 0  # Turn off display rounding
ampl.option["show_stats"] = 1  # Show problem statistics
ampl.solve()  # Attempt to solve

# Check if the problem was solved if not print warning
srn = ampl.get_value("solve_result_num")
if srn != 0:
    print(f"Warning:\tProblem not solved to optimality!\n\t\tsolve_result_num = {srn}")
else:
    print("Success:\tProblem solved to optimality!")
    # Print the solution
    ampl.eval(
        r"""
        option omit_zero_rows 1, omit_zero_cols 1; # Set more display options.
        print;
        display TotalCost;
        display facility_open;
        """
    )
Number of variables:  15
Number of constraints:  8

Presolve eliminates 0 constraints and 1 variable.
Adjusted problem:
14 variables:
	2 binary variables
	12 linear variables
8 constraints, all linear; 28 nonzeros
	8 inequality constraints
1 linear objective; 14 nonzeros.

HiGHS 1.6.0:HiGHS 1.6.0: optimal solution; objective 15966984.87
8 simplex iterations
1 branching nodes
Success:	Problem solved to optimality!

TotalCost = 15966984.865

facility_open [*] :=
Baton_Rouge_LA  1
    Baytown_TX  1
;

Solve for medium demand#

ampl.eval(
    r"""
    update data customer_demand;
    data;
    param customer_demand :=
        San_Antonio_TX  650      
        Dallas_TX       1134            
        Jackson_MS      416      
        Birmingham_AL   113;
    """
)
ampl.solve()  # Attempt to solve
# Check if the problem was solved if not print warning
srn = ampl.get_value("solve_result_num")
if srn != 0:
    print(f"Warning:\tProblem not solved to optimality!\n\t\tsolve_result_num = {srn}")
else:
    print("Success:\tProblem solved to optimality!")
    # Print the solution
    ampl.eval(
        r"""
        print;
        display TotalCost;
        display facility_open;
        """
    )
HiGHS 1.6.0:HiGHS 1.6.0: optimal solution; objective 15966984.87
8 simplex iterations
1 branching nodes
Success:	Problem solved to optimality!

TotalCost = 15966984.865

facility_open [*] :=
Baton_Rouge_LA  1
    Baytown_TX  1
   Beaumont_TX  0
;

Solve for high demand#

ampl.eval(
    """
          update data customer_demand;
          data;
          param customer_demand :=
               San_Antonio_TX        887 
               Dallas_TX             1456       
               Jackson_MS            673
               Birmingham_AL         207; 
          """
)
ampl.solve()  # Attempt to solve
# Check if the problem was solved if not print warning
srn = ampl.get_value("solve_result_num")
if srn != 0:
    print(f"Warning:\tProblem not solved to optimality!\n\t\tsolve_result_num = {srn}")
else:
    print("Success:\tProblem solved to optimality!")
    # Print the solution
    ampl.eval(
        """
              print;
              display TotalCost;
              display facility_open;
              """
    )
Presolve eliminates 1 constraint and 2 variables.
Adjusted problem:
13 variables:
	1 binary variable
	12 linear variables
7 constraints, all linear; 25 nonzeros
	7 inequality constraints
1 linear objective; 13 nonzeros.

HiGHS 1.6.0:HiGHS 1.6.0: optimal solution; objective 22250711.2
6 simplex iterations
1 branching nodes
Success:	Problem solved to optimality!

TotalCost = 22250711.200000003

facility_open [*] :=
Baton_Rouge_LA  1
    Baytown_TX  1
   Beaumont_TX  1
;

Conclusion#

In this section, we’ve covered foundational aspects of modeling in AMPL, including how to import data and utilize AMPL’s Python API. Our exploration revealed that for low to medium demand scenarios, opening just two facilities suffices to meet customer needs. However, the surge in demand under the high-demand scenario necessitates the operation of all three facilities.

Moving forward, we’ll demonstrate a more efficient approach that avoids solving three distinct problems. Instead, we’ll consolidate them into a single problem. This will be accomplished by formulating the extensive form of the stochastic capacitated facility location problem, streamlining the entire process.