Diet and Other Input Models: Minimizing Costs#

diet_case_study.ipynb Open In Colab Kaggle Gradient Open In SageMaker Studio Lab Hits

Description: Diet case study, Chapter 2 from the AMPL book adapted to Python

Tags: amplpy, ampl-lecture

Notebook author: Marcos Dominguez Velad <marcos@ampl.com>

Model author: N/A

# Install dependencies
%pip install -q amplpy numpy pandas
# Google Colab & Kaggle integration
from amplpy import AMPL, ampl_notebook

ampl = ampl_notebook(
    modules=["highs"],  # modules to install
    license_uuid="default",  # license to use
)  # instantiate AMPL object and register magics

This notebook provides the implementation of the production problem described in the book AMPL: A Modeling Language for Mathematical Programming by Robert Fourer, David M. Gay, and Brian W. Kernighan. This implementation is Python-based using amplpy rather than pure Ampl style.

Diet problem#

As an intuitive example of a cost-minimizing model, we use the well-known “diet problem”, which finds a mix of foods that satisfies requirements on the amounts of various vitamins. We will construct a small, explicit linear program, and then show how a general model can be formulated for all linear programs of that kind.

After formulating the diet model, we will discuss a few changes that might make it more realistic. The full power of this model, however, derives from its applicability to many situations that have nothing to do with diets. A general model derived from the diet formulation can be also applied to blending, economics, and scheduling.

Solving a diet problem instance#

Consider the problem of choosing prepared foods to meet certain nutritional requirements. Suppose that precooked dinners of the following kinds are available for the following prices per package:

Name

Price $

BEEF

beef

3.19

CHK

chicken

2.59

FISH

fish

2.29

HAM

ham

2.89

MCH

macaroni & cheese

1.89

MTL

meat loaf

1.99

SPG

spaghetti

1.99

TUR

turkey

2.49

These dinners provide the following percentages, per package, of the minimum daily requirements for vitamins A, C, B1 and B2:

A

C

B1

B2

BEEF

60%

20%

10%

15%

CHK

8

0

20

20

FISH

8

10

15

10

HAM

40

40

35

10

MCH

15

35

15

15

MTL

70

30

15

15

SPG

25

50

25

15

TUR

60

20

15

10

The problem is to find the cheapest combination of packages that will meet a week’s requirements - that is, at least 700% of the daily requirement for each nutrient.

Let us write \(X_{beef}\) for the number of packages of beef dinner to be purchased, \(X_{chk}\) for the number of packages of chicken dinner, and so forth. Then the total cost of the diet will be:

total cost =
      3.19 Xbeef + 2.59 Xchk + 2.29 Xfish + 2.89 Xham +
      1.89 Xmch + 1.99 Xmtl + 1.99 Xspg + 2.49 Xtur

The total percentage of the vitamin A requirement is given by a similar formula, except that X BEEF , X CHK , and so forth are multiplied by the percentage per package instead of the cost per package:

total percentage of vitamin A daily requirement met =
      60 Xbeef + 8 Xchk + 8 Xfish + 40 Xham +
      15 Xmch + 70 Xmtl + 25 Xspg + 60 Xtur

This amount needs to be greater than or equal to 700 percent. There is a similar formula for each of the other vitamins, and each of these also needs to be \(\geq\) 700.

Putting these all together, we have the following linear program:

Minimize
    3.19 Xbeef + 2.59 Xchk + 2.29 Xfish + 2.89 Xham +
    1.89 Xmch + 1.99 Xmtl + 1.99 Xspg + 2.49 Xtur

Subject to
     60 Xbeef + 8 Xchk + 8 Xfish + 40 Xham +
    15 Xmch + 70 Xmtl + 25 Xspg + 60 Xtur >= 700

      20 Xbeef + 0 Xchk + 10 Xfish + 40 Xham +
      35 Xmch + 30 Xmtl + 50 Xspg + 20 Xtur >= 700

      10 Xbeef + 20 Xchk + 15 Xfish + 35 Xham +
      15 Xmch + 15 Xmtl + 25 Xspg + 15 Xtur >= 700

      15 Xbeef + 20 Xchk + 10 Xfish + 10 Xham +
      15 Xmch + 15 Xmtl + 15 Xspg + 10 Xtur >= 700

      Xbeef >= 0, Xchk >= 0, Xfish >= 0, Xham >= 0,
      Xmch >= 0, Xmtl >= 0, Xspg >= 0, Xtur >= 0

At the end we have added the common-sense requirement that no fewer than zero packages of a food can be purchased. And now, we can transcribe the model to an AMPL statement of the explicit diet LP:

%%writefile diet0.mod
# Variables
var Xbeef >= 0; var Xchk >= 0; var Xfish >= 0;
var Xham >= 0; var Xmch >= 0; var Xmtl >= 0;
var Xspg >= 0; var Xtur >= 0;

# Objective function (minimizing)
minimize cost:
	3.19*Xbeef + 2.59*Xchk + 2.29*Xfish + 2.89*Xham +
	1.89*Xmch + 1.99*Xmtl + 1.99*Xspg + 2.49*Xtur;

# Constraints
subject to A:
	60*Xbeef + 8*Xchk + 8*Xfish + 40*Xham +
	15*Xmch + 70*Xmtl + 25*Xspg + 60*Xtur >= 700;
subject to C:
	20*Xbeef + 0*Xchk + 10*Xfish + 40*Xham +
	35*Xmch + 30*Xmtl + 50*Xspg + 20*Xtur >= 700;
subject to B1:
	10*Xbeef + 20*Xchk + 15*Xfish + 35*Xham +
	15*Xmch + 15*Xmtl + 25*Xspg + 15*Xtur >= 700;
subject to B2:
	15*Xbeef + 20*Xchk + 10*Xfish + 10*Xham +
	15*Xmch + 15*Xmtl + 15*Xspg + 10*Xtur >= 700;
Writing diet0.mod

A few AMPL commands then suffice to read the file, send the LP to a solver, and retrieve the results (using HiGHS solver).

ampl.read('diet0.mod')
ampl.solve(solver='highs')
ampl.display('Xbeef,Xchk,Xfish,Xham,Xmch,Xmtl,Xspg,Xtur')
HiGHS 1.10.0: HiGHS 1.10.0: optimal solution; objective 88.2
2 simplex iterations
0 barrier iterations
Xbeef = 0
Xchk = 0
Xfish = 0
Xham = 0
Xmch = 46.6667
Xmtl = 0
Xspg = 0
Xtur = 0
assert ampl.solve_result == "solved", ampl.solve_result

The optimal solution is found quickly, but it is hardly what we might have hoped for. The cost is minimized by a monotonous diet of 46 and 2/3 packages of macaroni and cheese! You can check that this neatly provides \(15\% \times 46\frac{2}{3} = 700\%\) of the requirement for vitamins A, B1 and B2, and a lot more vitamin C than necessary; the cost is only \(\$1.89 × 46\frac{2}{3} = \$88.20.\)

You might guess that a better solution would be generated by requiring the amount of each vitamin to equal 700% exactly. Such a requirement can easily be imposed by changing each >= to = in the AMPL constraints. If you go ahead and solve the changed LP, you will find that the diet does indeed become more varied: approximately 19.5 packages of chicken, 16.3 of macaroni and cheese, and 4.3 of meat loaf. But since equalities are more restrictive than inequalities, the cost goes up to $89.99.

An AMPL model for the diet problem#

Clearly we will have to consider more extensive modifications to our linear program in order to produce a diet that is even remotely acceptable. We will probably want to change the sets of food and nutrients, as well as the nature of the constraints and bounds. As in the production example of the previous chapter, this will be much easier to do if we rely on a general model that can be coupled with a variety of specific data files.

This model deals with two things: nutrients and foods. Thus we begin an AMPL model by declaring sets of each:

set NUTR;
set FOOD;

Next we need to specify the numbers required by the model. Certainly a positive cost should be given for each food:

param cost {FOOD} > 0;

We also specify that for each food there are lower and upper limits on the number of packages in the diet:

param f_min {FOOD} >= 0;
param f_max {j in FOOD} >= f_min[j];

Notice that we need a dummy index j to run over FOOD in the declaration of f_max, in order to say that the maximum for each food must be greater than or equal to the corresponding minimum.

To make this model somewhat more general than our examples so far, we also specify similar lower and upper limits on the amount of each nutrient in the diet:

param n_min {NUTR} >= 0;
param n_max {i in NUTR} >= n_min[i];

Finally, for each combination of a nutrient and a food, we need a number that represents the amount of the nutrient in one package of the food. Such a “product” of two sets is written by listing them both:

param amt {NUTR,FOOD} >= 0;

References to this parameter require two indices. For example, amt[i,j] is the amount of nutrient i in a package of food j.

The decision variables for this model are the numbers of packages to buy of the different foods:

var Buy {j in FOOD} >= f_min[j], <= f_max[j];

The number of packages of some food j to be bought will be called Buy[j]; in any acceptable solution it will have to lie between f_min[j] and f_max[j].

The total cost of buying a food j is the cost per package, cost[j], times the number of packages, Buy[j]. The objective to be minimized is the sum of this product over all foods j:

minimize Total_Cost: sum {j in FOOD} cost[j] * Buy[j];

Similarly, the amount of a nutrient i supplied by a food j is the nutrient per package, amt[i,j], times the number of packages Buy[j]. The total amount of nutrient i supplied is the sum of this product over all foods j:

sum {j in FOOD} amt[i,j] * Buy[j];

To complete the model, we need only specify that each such sum must lie between the appropriate bounds. Our constraint declaration begins

subject to Diet {i in NUTR}:

to say that a constraint named Diet[i] must be imposed for each member i of NUTR. The rest of the declaration gives the algebraic statement of the constraint for nutrient i: the variables must satisfy

n_min[i] <= sum {j in FOOD} amt[i,j] * Buy[j] <= n_max[i]

A “double inequality” like this is interpreted in the obvious way: the value of the sum in the middle must lie between n_min[i] and n_max[i]. We can write all together into a model file diet.mod:

%%writefile diet.mod
set NUTR;
set FOOD;

param cost {FOOD} > 0;
param f_min {FOOD} >= 0;
param f_max {j in FOOD} >= f_min[j];

param n_min {NUTR} >= 0;
param n_max {i in NUTR} >= n_min[i];

param amt {NUTR,FOOD} >= 0;

var Buy {j in FOOD} >= f_min[j], <= f_max[j];

minimize Total_Cost:  sum {j in FOOD} cost[j] * Buy[j];

subject to Diet {i in NUTR}:
   n_min[i] <= sum {j in FOOD} amt[i,j] * Buy[j] <= n_max[i];
Writing diet.mod

We can use the read() function from Amplpy to read a model file by specifying the name (or the path) of the model. This must happen before reading the concrete data for the problem.

# Reset Ampl to delete the previous model. Another Ampl object could be created as well
ampl.reset()
ampl.read('diet.mod')

Data#

By specifying appropriate data, we can solve any of the linear programs that correspond to the above model. In this example we are assuming you are familiar to basic data structures in Python like lists, dictionaries, Pandas DataFrames, and Numpy Matrices.

Let’s begin by using the data from the previous example.

The sets of FOOD and NUTR are defined by Python lists (foods and nutrs). The parameters related to foods are cost, f_min and f_max, which are in fact the columns of the dataframe food_df. There is another column in this dataframe to specify the food, which be the index for each row in the dataframe. Something equivalent happens with the parameters related to the nutrients, all the data is gathered in nutr_df. Notice that the values of f_min and n_min are as given originally, while f_max and n_max are set, for the time being, to large values that won’t affect the optimal solution. The data for amt is given by another dataframe, generated through a numpy matrix to show a more compact representation.

This is not the only way to load data into the Ampl model, and these will be covered in other resources.

# Defining the data in Python data structures
import pandas as pd
import numpy as np

# Indexing sets (as Python lists)
foods = ['BEEF', 'CHK', 'FISH', 'HAM', 'MCH', 'MTL', 'SPG', 'TUR']
nutrs = ['A', 'B1', 'B2', 'C']

# Pandas Dataframe with 4 columns: FOOD (indexing set in the model), cost, f_min, f_max
# Notice that since FOOD is the indexing set in the model, we set it as the index in the dataframe
food_df = pd.DataFrame(
    [
        ("BEEF", 3.19, 0, 100),
        ("CHK", 2.59, 0, 100),
        ("FISH", 2.29, 0, 100),
        ("HAM", 2.89, 0, 100),
        ("MCH", 1.89, 0, 100),
        ("MTL", 1.99, 0, 100),
        ("SPG", 1.99, 0, 100),
        ("TUR", 2.49, 0, 100),
    ],
    columns=["FOOD", "cost", "f_min", "f_max"],
).set_index("FOOD")

# Pandas Dataframe with 3 columns: NUTR (indexing set in the model), n_min, n_max
nutr_df = pd.DataFrame(
    [
        ("A", 700, 10000),
        ("C", 700, 10000),
        ("B1", 700, 10000),
        ("B2", 700, 10000),
    ],
    columns=["NUTR", "n_min", "n_max"],
).set_index("NUTR")

# amt_df is a Pandas Dataframe generated from a numpy matrix, each row corresponds
# to a food, and each column to a nutrient.
amt_df = pd.DataFrame(
    np.array(
        [
            [60, 8, 8, 40, 15, 70, 25, 60],
            [20, 0, 10, 40, 35, 30, 50, 20],
            [10, 20, 15, 35, 15, 15, 25, 15],
            [15, 20, 10, 10, 15, 15, 15, 10],
        ]
    ),
    columns=food_df.index.to_list(),
    index=nutr_df.index.to_list(),
)

# Load the data into Ampl:

ampl.set["FOOD"] = foods
ampl.set["NUTR"] = nutrs

# Load data related to foods:
ampl.set_data(food_df, "FOOD")
# Or alternatively:
#ampl.param['cost'] = df_food['cost']
#ampl.param['f_min'] = df_food['f_min']
#ampl.param['f_max'] = df_food['f_max']

# Load data related to nutrients:
ampl.set_data(nutr_df, "NUTR")
# Or alternatively:
#ampl.param['n_min'] = df_nutr['n_min']
#ampl.param['n_max'] = df_nutr['n_max']

ampl.param["amt"] = amt_df

Once we have read the model and loaded the data in AMPL, it is used to solve the resulting linear program and show the result for the Buy variables.

ampl.solve(solver='highs')
buy = ampl.var['Buy'].to_dict()
print(buy)
HiGHS 1.10.0: HiGHS 1.10.0: optimal solution; objective 88.2
2 simplex iterations
0 barrier iterations
{'BEEF': 0, 'CHK': 0, 'FISH': 0, 'HAM': 0, 'MCH': 46.66666666666666, 'MTL': 0, 'SPG': 0, 'TUR': 0}

Naturally, the result is the same as before. This time we have saved the Buy data into a Python dictionary later displayed.

Now suppose that we want to make the following enhancements. To promote variety, the weekly diet must contain between 2 and 10 packages of each food. The amount of sodium and calories in each package is also given; total sodium must not exceed 40,000 mg, and total calories must be between 16,000 and 24,000. All of these changes can be made through a few modifications to the data. Putting this new data in the Python data structures and run Ampl again:

# New data (diet2)
foods = ['BEEF', 'CHK', 'FISH', 'HAM', 'MCH', 'MTL', 'SPG', 'TUR']
nutrs = ['A', 'B1', 'B2', 'C', 'NA', 'CAL']

food_df = pd.DataFrame(
    [
        ("BEEF", 3.19, 2, 10),
        ("CHK", 2.59, 2, 10),
        ("FISH", 2.29, 2, 10),
        ("HAM", 2.89, 2, 10),
        ("MCH", 1.89, 2, 10),
        ("MTL", 1.99, 2, 10),
        ("SPG", 1.99, 2, 10),
        ("TUR", 2.49, 2, 10),
    ],
    columns=["FOOD", "cost", "f_min", "f_max"],
).set_index("FOOD")

# Create a pandas.DataFrame with data for n_min, n_max
nutr_df = pd.DataFrame(
    [
        ("A", 700, 20000),
        ("C", 700, 20000),
        ("B1", 700, 20000),
        ("B2", 700, 20000),
        ("NA", 0, 40000),
        ("CAL", 16000, 24000),
    ],
    columns=["NUTR", "n_min", "n_max"],
).set_index("NUTR")

amt_df = pd.DataFrame(
    np.array(
        [
            [60, 8, 8, 40, 15, 70, 25, 60],
            [20, 0, 10, 40, 35, 30, 50, 20],
            [10, 20, 15, 35, 15, 15, 25, 15],
            [15, 20, 10, 10, 15, 15, 15, 10],
            [928, 2180, 945, 278, 1182, 896, 1329, 1397],
            [295, 770, 440, 430, 315, 400, 370, 450],
        ]
    ),
    columns=food_df.index.to_list(),
    index=nutr_df.index.to_list(),
)

# Load data into Ampl
ampl.set["FOOD"] = foods
ampl.set["NUTR"] = nutrs
ampl.set_data(food_df, "FOOD")
ampl.set_data(nutr_df, "NUTR")
ampl.param["amt"] = amt_df

# Run Ampl
ampl.solve(solver="highs")
print("Solve status: " + ampl.solve_result)
HiGHS 1.10.0: HiGHS 1.10.0: infeasible problem
4 simplex iterations
0 barrier iterations

suffix dunbdd OUT;
Solve status: infeasible

The message infeasible problem tells us that we have constrained the diet too tightly; there is no way that all of the restrictions can be satisfied. This can be checked by examining the attribute ampl.solve_result.

AMPL lets us examine a variety of values produced by a solver as it attempts to find a solution. We can use marginal (or dual) values to investigate the sensitivity of an optimum solution to changes in the constraints. Here there is no optimum, but the solver does return the last solution that it found while attempting to satisfy the constraints. We can look for the source of the infeasibility by displaying some values associated with this solution:

ampl.display("Diet.lb, Diet.body, Diet.ub")
:   Diet.lb  Diet.body Diet.ub    :=
A       700     700      20000
B1      700     700      20000
B2      700     700      20000
C       700    1633.33   20000
CAL   16000   14700      24000
NA        0   55160      40000
;

For each nutrient, Diet.body is the sum of the terms amt[i,j] * Buy[j] in the constraint Diet[i]. The Diet.lb and Diet.ub values are the “lower bounds” and “upper bounds” on the sum in Diet[i] - in this case, just the values n_min[i] and n_max[i]. We can see that the diet returned by the solver supply the minimum vitamin B2, while the amount of sodium (NA) has overpassed its upper bound.

At this point, there are two obvious choices: we could require less B2 or we could allow more sodium. If we try the latter, and relax the sodium limit to 50,000 mg, a feasible solution becomes possible:

ampl.param["n_max"]["NA"] = 50000;
ampl.solve(solver="highs")
ampl.display("Buy")
print("Solve status: " + ampl.solve_result)
HiGHS 1.10.0: HiGHS 1.10.0: optimal solution; objective 117.8989859
1 simplex iterations
0 barrier iterations
Buy [*] :=
BEEF   5.22693
 CHK   2
FISH   2
 HAM  10
 MCH  10
 MTL  10
 SPG   9.43973
 TUR   2
;

Solve status: solved

This is at least a start toward a palatable diet, although we have to spend $117.89, compared to $88.20 for the original, less restricted case. Clearly it would be easy, now that the model is set up, to try many other possibilities. Section 11.3 of the AMPL book describes ways to quickly change the data and re-solve.

One still disappointing aspect of the solution is the need to buy 5.22693 packages of beef, and 9.43973 of spaghetti. How can we find the best possible solution in terms of whole packages? You might think that we could simply round the optimal values to whole numbers - or integers, as they’re often called in the context of optimization - but it is not so easy to do so in a feasible way. Using AMPL to modify the reported solution , we can observe that rounding up to 6 packages of beef and 10 of spaghetti, for example, will violate the sodium limit:

ampl.var["Buy"]["BEEF"] = 6
ampl.var["Buy"]["SPG"] = 10

ampl.display("Diet.lb, Diet.body, Diet.ub")
:   Diet.lb Diet.body Diet.ub    :=
A       700     2012    20000
B1      700     1060    20000
B2      700      720    20000
C       700     1730    20000
CAL   16000    20240    24000
NA        0    51462    50000
;

You can similarly check that rounding the solution down to 5 of beef and 9 of spaghetti will provide insufficient vitamin B2. Rounding one up and the other down doesn’t work either. With enough experimenting you can find a nearby all-integer solution that does satisfy the constraints, but still you will have no guarantee that it is the least-cost allinteger solution.

AMPL does provide for putting the integrality restriction directly into the declaration of the variables:

var Buy {j in FOOD} integer >= f_min[j], <= f_max[j];

This will only help, however, if you use a solver that can deal with problems whose variables must be integers. The HiGHS solver can handle these so called integer programs. If we add integer to the declaration of variable Buy as above, save the resulting model in the file dieti.mod, and add the higher sodium limit to the data (diet2a), then we can re-solve as follows:

%%writefile dieti.mod
set NUTR;
set FOOD;

param cost {FOOD} > 0;
param f_min {FOOD} >= 0;
param f_max {j in FOOD} >= f_min[j];

param n_min {NUTR} >= 0;
param n_max {i in NUTR} >= n_min[i];

param amt {NUTR,FOOD} >= 0;

var Buy {j in FOOD} integer >= f_min[j], <= f_max[j];

minimize Total_Cost:  sum {j in FOOD} cost[j] * Buy[j];

subject to Diet {i in NUTR}:
   n_min[i] <= sum {j in FOOD} amt[i,j] * Buy[j] <= n_max[i];
Writing dieti.mod
ampl.reset()
ampl.read("dieti.mod")

# New data (diet2a)
foods = ['BEEF', 'CHK', 'FISH', 'HAM', 'MCH', 'MTL', 'SPG', 'TUR']
nutrs = ['A', 'B1', 'B2', 'C', 'NA', 'CAL']

food_df = pd.DataFrame(
    [
        ("BEEF", 3.19, 2, 10),
        ("CHK", 2.59, 2, 10),
        ("FISH", 2.29, 2, 10),
        ("HAM", 2.89, 2, 10),
        ("MCH", 1.89, 2, 10),
        ("MTL", 1.99, 2, 10),
        ("SPG", 1.99, 2, 10),
        ("TUR", 2.49, 2, 10),
    ],
    columns=["FOOD", "cost", "f_min", "f_max"],
).set_index("FOOD")

# Create a pandas.DataFrame with data for n_min, n_max
nutr_df = pd.DataFrame(
    [
        ("A", 700, 20000),
        ("C", 700, 20000),
        ("B1", 700, 20000),
        ("B2", 700, 20000),
        ("NA", 0, 50000),
        ("CAL", 16000, 24000),
    ],
    columns=["NUTR", "n_min", "n_max"],
).set_index("NUTR")

amt_df = pd.DataFrame(
    np.array(
        [
            [60, 8, 8, 40, 15, 70, 25, 60],
            [20, 0, 10, 40, 35, 30, 50, 20],
            [10, 20, 15, 35, 15, 15, 25, 15],
            [15, 20, 10, 10, 15, 15, 15, 10],
            [928, 2180, 945, 278, 1182, 896, 1329, 1397],
            [295, 770, 440, 430, 315, 400, 370, 450],
        ]
    ),
    columns=food_df.index.to_list(),
    index=nutr_df.index.to_list(),
)

# Load data into Ampl
ampl.set["FOOD"] = foods
ampl.set["NUTR"] = nutrs
ampl.set_data(food_df, "FOOD")
ampl.set_data(nutr_df, "NUTR")
ampl.param["amt"] = amt_df

# Run Ampl
ampl.solve(solver="highs")
ampl.display("Buy")
HiGHS 1.10.0: HiGHS 1.10.0: optimal solution; objective 119.3
9 simplex iterations
1 branching nodes
Buy [*] :=
BEEF   9
 CHK   2
FISH   2
 HAM   8
 MCH  10
 MTL  10
 SPG   7
 TUR   2
;

Since integrality is an added constraint, it is no surprise that the best integer solution costs about $1.40 more than the best “continuous” one. But the difference between the diets is unexpected; the amounts of 3 foods change, each by two or more packages. In general, integrality and other “discrete” restrictions make solutions for a model much harder to find.

Generalizations to blending, economics and scheduling#

Your personal experience probably suggests that diet models are not widely used by people to choose their dinners. These models would be much better suited to situations in which packaging and personal preferences don’t play such a prominent role — for example , the blending of animal feed or perhaps food for college dining halls.

The diet model is a convenient, intuitive example of a linear programming formulation that appears in many contexts. Suppose that we rewrite the model in a more general way:

set INPUT;            # inputs
set OUTPUT;           # outputs
param cost {INPUT} > 0;
param in_min {INPUT} >= 0;
param in_max {j in INPUT} >= in_min[j];
param out_min {OUTPUT} >= 0;
param out_max {i in OUTPUT} >= out_min[i];
param io {OUTPUT,INPUT} >= 0;
var X {j in INPUT} >= in_min[j], <= in_max[j];
minimize Total_Cost: sum {j in INPUT} cost[j] * X[j];
subject to Outputs {i in OUTPUT}:
	out_min[i] <= sum {j in INPUT} io[i,j] * X[j] <= out_max[i];

The objects that were called foods and nutrients in the diet model are now referred to more generically as inputs and outputs. For each input j, we must decide to use a quantity X[j] that lies between in_min[j] and in_max[j]; as a result we incur a cost equal to cost[j]·X[j], and we create io[i,j]·X[j] units of each output i. Our goal is to find the least-cost combination of inputs that yields, for each output i, an amount between out_min[i] and out_max[i].

In one common class of applications for this model, the inputs are raw materials to be mixed together. The outputs are qualities of the resulting blend. The raw materials could be the components of an animal feed, but they could equally well be the crude oil derivatives that are blended to make gasoline, or the different kinds of coal that are mixed as input to a coke oven. The qualities can be amounts of something (sodium or calories for animal feed), or more complex measures (vapor pressure or octane rating for gasoline), or even physical properties such as weight and volume.

In another well-known application, the inputs are production activities of some sector of an economy, and the outputs are various products. The in_min and in_max parameters are limits on the levels of the activities, while out_min and out_max are regulated by demands. Thus the goal is to find levels of the activities that meet demand at the lowest cost. This interpretation is related to the concept of an economic equilibrium.

In still another, quite different application, the inputs are work schedules, and the outputs correspond to hours worked on certain days of a month. For a particular work schedule j, io[i,j] is the number of hours that a person following schedule j will work on day i (zero if none), cost[j] is the monthly salary for a person following schedule j, and X[j] is the number of workers assigned that schedule. Under this interpretation, the objective becomes the total cost of the monthly payroll, while the constraints say that for each day i, the total number of workers assigned to work that day must lie between the limits out_min[i] and out_max[i]. The same approach can be used in a variety of other scheduling contexts, where the hours, days or months are replaced by other periods of time.

Although linear programming can be very useful in applications like these, we need to keep in mind the assumptions that underlie the LP model. We have already mentioned the “continuity” assumption whereby X[j] is allowed to take on any value between in_min[j] and in_max[j]. This may be a lot more reasonable for blending than for scheduling.

As another example, in writing the objective as

sum {j in INPUT} cost[j] * X[j]

we are assuming “linearity of costs”, that is, that the cost of an input is proportional to the amount of the input used, and that the total cost is the sum of the inputs’ individual costs.

In writing the constraints as

out_min[i] <= sum {j in INPUT} io[i,j] * X[j] <= out_max[i]

we are also assuming that the yield of an output i from a particular input is proportional to the amount of the input used, and that the total yield of an output i is the sum of the yields from the individual inputs. This “linearity of yield” assumption poses no problem when the inputs are schedules, and the outputs are hours worked. But in the blending example, linearity is a physical assumption about the nature of the raw materials and the qualities, which may or may not hold. In early applications to refineries, for example, it was recognized that the addition of lead as an input had a nonlinear effect on the quality known as octane rating in the resulting blend.

AMPL makes it easy to express discrete or nonlinear models, but any departure from continuity or linearity is likely to make an optimal solution much harder to obtain. At the least, it takes a more powerful solver to optimize the resulting mathematical programs.

Bibliography#

  • George B. Dantzig, “The Diet Problem.” Interfaces 20, 4 (1990) pp. 43-47. An entertaining account of the origins of the diet problem.

  • Robert Fourer, David M. Gay, and Brian W. Kernighan, “AMPL: A Modeling Language for Mathematical Programming (2nd edition).” Cengage Learning (2002).

  • Susan Garner Garille and Saul I. Gass, “Stigler’s Diet Problem Revisited.” Operations Research 49, 1 (2001) pp. 1-13. A review of the diet problem’s origins and its influence over the years on linear programming and on nutritionists.

  • Said S. Hilal and Warren Erikson, “Matching Supplies to Save Lives: Linear Programming the Production of Heart Valves.” Interfaces 11, 6 (1981) pp. 48-56. A less appetizing equivalent of the diet problem, involving the choice of pig heart suppliers.