This writeup first considers general-purpose string manipulation features, including the new string concatenation operator and a variety of new string functions. Next we describe some changes that make the display of strings more natural and flexible. The final two sections describe additional new functions and conventions for converting numbers to strings and strings to numbers.
The following table summarizes the concatenation operator and all of the functions for string handling, in the order in which they are introduced below. Arguments beginning with s refer to strings, with i refer to integers, and with e refer to any numeric or symbolic expressions.
Syntax                Return Type   Return Value
s1 & s2               string        concatenation of s1 and s2
length(s1)            number        number of characters in s1
match(s1,s2)          number        first position where s2 matches a substring in s1 (or 0 if it never appears)
substr(s1,i2)         string        substring of s1 beginning at the position given by i2 and extending to the end of s1
substr(s1,i2,i3)      string        substring of s1 beginning at the position given by i2 and having a length of i3
sub(s1,s2,s3)         string        result of substituting s3 for the first match of s2 in s1
gsub(s1,s2,s3)        string        result of substituting s3 for all matches of s2 in s1
sprintf(s1,e2,. . .)  string        formatted string, the same as would be written by printf s1,e2,. . .
num(s1)               number        decimal number represented by s1
num0(s1)              number        decimal number represented by s1, ignoring extraneous characters
char(i1)              string        string of one character whose Unicode value is i1
ichar(s1)             number        Unicode value of the first character in s1
    ampl: model diet.mod;
    ampl: data diet2.dat;
    ampl: display NUTR, FOOD;
    set NUTR := A B1 B2 C NA CAL;
    set FOOD := BEEF CHK FISH HAM MCH MTL SPG TUR;

    ampl: set NUTR_FOOD := setof {i in NUTR, j in FOOD} i & "_" & j;
    ampl: display NUTR_FOOD;
    set NUTR_FOOD :=
    A_BEEF   B1_BEEF   B2_BEEF   C_BEEF   NA_BEEF   CAL_BEEF
    A_CHK    B1_CHK    B2_CHK    C_CHK    NA_CHK    CAL_CHK
    A_FISH   B1_FISH   B2_FISH   C_FISH   NA_FISH   CAL_FISH
    A_HAM    B1_HAM    B2_HAM    C_HAM    NA_HAM    CAL_HAM
    A_MCH    B1_MCH    B2_MCH    C_MCH    NA_MCH    CAL_MCH
    A_MTL    B1_MTL    B2_MTL    C_MTL    NA_MTL    CAL_MTL
    A_SPG    B1_SPG    B2_SPG    C_SPG    NA_SPG    CAL_SPG
    A_TUR    B1_TUR    B2_TUR    C_TUR    NA_TUR    CAL_TUR;

This is not a set that you would normally want to define, but it might be useful if you have to read data in which strings like "B2_BEEF" appear (see the example below).
The length function takes a string as argument and returns the number of characters in it. The match function takes two string arguments, and returns the first position where the second appears as a substring in the first -- or zero if the second never appears as a substring in the first. For example:
    ampl: display {j in FOOD} (length(j), match(j,"T"));
    :    length(j) match(j, 'T')    :=
    BEEF     4          0
    CHK      3          0
    FISH     4          0
    HAM      3          0
    MCH      3          0
    MTL      3          2
    SPG      3          0
    TUR      3          1
    ;

The substr function takes a string and one or two integers as arguments. It returns a substring of the first argument that begins at the position given by the second argument; it has the length given by the third argument, or extends to the end of the string if no third argument is given. For instance:
    ampl: display {j in FOOD} (substr(j,1,2), substr(j,3));
    :    substr(j, 1, 2) substr(j, 3)    :=
    BEEF       BE             EF
    CHK        CH             K
    FISH       FI             SH
    HAM        HA             M
    MCH        MC             H
    MTL        MT             L
    SPG        SP             G
    TUR        TU             R
    ;

An empty string is returned if the second argument is greater than the length of the first argument, or if the third argument is less than 1.
As an example of the use of several of these functions, suppose that you want to use the model from diet.mod and to supply the nutrition amount data in a table like this:
    param: NUTR_FOOD: amt_nutr :=
       A_BEEF     60
       B1_BEEF    10
       CAL_BEEF  295
       CAL_CHK   770
       ...

Then in addition to the declarations for the parameter amt used in the model,
    set NUTR;
    set FOOD;
    param amt {NUTR,FOOD} >= 0;

you would declare a set and a parameter to hold the data from the "nonstandard" table:
    set NUTR_FOOD;
    param amt_nutr {NUTR_FOOD} >= 0;

To use the model, you need to write an assignment of some kind to get the data from set NUTR_FOOD and parameter amt_nutr into sets NUTR and FOOD and parameter amt. One solution is to extract the sets first, and then convert the parameters:
    set NUTR := setof {ij in NUTR_FOOD} substr(ij,1,match(ij,"_")-1);
    set FOOD := setof {ij in NUTR_FOOD} substr(ij,match(ij,"_")+1);
    param amt {i in NUTR, j in FOOD} := amt_nutr[i & "_" & j];

As an alternative, you can extract the sets and parameters together, by use of an AMPL script such as the following:
    param iNUTR symbolic;
    param jFOOD symbolic;
    param upos > 0;
    let NUTR := {};
    let FOOD := {};
    for {ij in NUTR_FOOD} {
       let upos := match(ij,"_");
       let iNUTR := substr(ij,1,upos-1);
       let jFOOD := substr(ij,upos+1);
       let NUTR := NUTR union {iNUTR};
       let FOOD := FOOD union {jFOOD};
       let amt[iNUTR,jFOOD] := amt_nutr[ij];
    }

Under either alternative, errors such as a missing "_" in a member of NUTR_FOOD are eventually signaled by error messages.
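To see how match and substr cooperate in the conversions above, it can help to split a single compound name by hand. The following is a minimal sketch, not part of the diet model: the parameter names code and upos_demo are hypothetical, chosen only for illustration, and the snippet assumes they are not already declared in your session.

```ampl
# Hypothetical one-off check of the splitting logic; names are illustrative only.
param code symbolic := "B2_BEEF";
param upos_demo := match(code, "_");      # position of the underscore

display upos_demo;
display substr(code, 1, upos_demo - 1);   # part before the underscore
display substr(code, upos_demo + 1);      # part after the underscore
```

By the rules described earlier, the underscore is at position 3, so the two substrings should be B2 and BEEF.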
For completeness, AMPL also provides two functions to make substitutions in a string. Both sub and gsub take three strings as arguments. Both return the string that results when the third argument is substituted for the second in the first; sub substitutes only for the first occurrence of the second argument, while gsub substitutes for all occurrences. For example, to replace each underscore in the membership of set NUTR_FOOD above with two hyphens,
    ampl: let NUTR_FOOD := setof {ij in NUTR_FOOD} sub(ij, '_', '--');
    ampl: display NUTR_FOOD;
    set NUTR_FOOD :=
    A--BEEF   B1--BEEF   B2--BEEF   C--BEEF   NA--BEEF   CAL--BEEF
    A--CHK    B1--CHK    B2--CHK    C--CHK    NA--CHK    CAL--CHK
    A--FISH   B1--FISH   B2--FISH   C--FISH   NA--FISH   CAL--FISH
    A--HAM    B1--HAM    B2--HAM    C--HAM    NA--HAM    CAL--HAM
    A--MCH    B1--MCH    B2--MCH    C--MCH    NA--MCH    CAL--MCH
    A--MTL    B1--MTL    B2--MTL    C--MTL    NA--MTL    CAL--MTL
    A--SPG    B1--SPG    B2--SPG    C--SPG    NA--SPG    CAL--SPG
    A--TUR    B1--TUR    B2--TUR    C--TUR    NA--TUR    CAL--TUR;

If the second argument has no occurrences in the first, then sub or gsub returns the first argument unchanged.
In match, sub and gsub, the second argument is actually taken to represent a "regular expression"; if it contains certain special characters, it is interpreted as a pattern that may match many sub-strings. The pattern "^B[0-9]+_", for example, matches any sub-string consisting of a B followed by one or more digits and then an underscore, and occurring at the beginning of a string. To use this feature, see the separate description of regular expression rules.
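The difference between sub and gsub only shows up when the pattern occurs more than once, which is not the case for the members of NUTR_FOOD. The sketch below uses a made-up string, "A_B_C", to show the contrast, and then tries the anchored pattern quoted above on two made-up strings. It relies only on the behavior described in this writeup; details of the regular expression rules are in the separate description.

```ampl
# sub replaces only the first match; gsub replaces every match.
display sub("A_B_C", "_", "--");
display gsub("A_B_C", "_", "--");

# The pattern from the text is anchored at the start of the string,
# so it should match "B12_BEEF" at position 1 and not match "AB12_BEEF" (0).
display match("B12_BEEF", "^B[0-9]+_");
display match("AB12_BEEF", "^B[0-9]+_");
```

By the rules given above, the first two commands should show A--B_C and A--B--C respectively.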
A parenthesized string expression can stand in for a literal string in each of the following contexts, as the scripts below illustrate:
- filenames that are part of commands, including model, data, and commands
- filenames following < or > or >> to specify redirection of input or output
- values assigned to AMPL options by an option command

Here are two examples. The following script solves diet.mod with a series of different data files dietA.dat, dietB.dat, dietC.dat, . . . and saves the solution to files dietA.out, dietB.out, dietC.out, . . . :

    model diet.mod;
    set CASES := {"A","B","C"};
    for {j in CASES} {
       reset data;
       data ("diet" & j & ".dat");
       solve;
       display Buy >("diet" & j & ".out");
    }

The following script solves the same problem four times, each using a different pairing of the directives primal and dual with the directives primalopt and dualopt:

    model sched.mod;
    data sched.dat;
    option solver cplex;
    set DIR1 := {"primal","dual"};
    set DIR2 := {"primalopt","dualopt"};
    for {i in DIR1, j in DIR2} {
       option cplex_options (i & " " & j);
       solve;
    }

See the next section for examples that generate consecutive numbers as parts of filenames and directives.
Options ending in _precision and _round enable you to control number-to-string conversion in specific contexts. In general terms, their values are interpreted as follows:
- context_precision n
  - n > 0: round to n digits of precision
  - n = 0: show in full precision: the shortest decimal string representation that, when correctly rounded back to the computer's internal representation, yields the original numerical value
- context_round n
  - n > 0: round to n digits after the decimal point
  - n = 0: round to integer
  - n < 0: round to -n digits before the decimal point

The _round variant (if any) takes precedence when it has a numerical value; otherwise the _precision variant is used.

The different possibilities for context are:

Options                                   Context affected
csvdisplay_precision, csvdisplay_round    numbers from the obscure _display and csvdisplay commands
display_precision, display_round          numbers from the display command
expand_precision, expand_round            numbers from the new expand command
MD_precision                              numbers written in debug output produced by the obscure -M and -D command-line switches
objective_precision                       optimal objective from the solve command
output_precision                          numbers written by the solve command to the file that will be read by a solver
print_precision, print_round              numbers from the print command
solution_precision, solution_round        values of variables and dual variables returned by the solve or solution command
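The interplay of the _precision and _round options is easiest to see by displaying one value under several settings. The sketch below does this for the display command. The parameter x is hypothetical, and clearing display_round by assigning it an empty string is an assumption about how to give the option a non-numerical value.

```ampl
# Hypothetical parameter used only to observe the conversions.
param x := 2/3;

option display_precision 3;
display x;                    # rounded to 3 digits of precision

option display_round 2;       # _round now takes precedence over _precision
display x;                    # rounded to 2 digits after the decimal point

option display_round '';      # assumption: an empty value removes the numerical setting
option display_precision 0;
display x;                    # full precision
```

Because the options stay in effect for the rest of the session, restore the settings you normally use once you have finished experimenting.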
The printf command provides more precise control over number-to-string conversions. Its syntax is
    printf indexing_opt format-string, expression-list_opt redirection_opt ;

Here the suffix _opt marks a part that is optional. Members of the expression-list that evaluate to numbers are converted to strings according to instructions encoded into the format-string. The final result of this formatting is a character string that is sent to a file specified by the optional redirection or else by default to standard output.

A guide to format strings is provided on the separate printf rules page (and on pages 328-329 of the AMPL book). Format strings in AMPL have mostly the same interpretation as in the C programming language. The most notable exceptions are AMPL's interpretation of %.g or %.0g to specify full precision (as explained earlier in this section), and the introduction of %q and %Q to produce quoted strings (as described under "Display formats for strings" below).
The new sprintf function also specifies a format-string and an expression-list:
    sprintf ( format-string, expression-list_opt )

This function performs conversions in the same way as printf, except that the resulting formatted string is not sent to an output device, but instead becomes the function's return value.
Numbers are also converted automatically to strings when they appear as arguments to the string concatenation operator &. For example, for a multi-week model you can create a set of generically-named periods WEEK1, WEEK2, and so forth, by declaring:
    param T integer > 1;
    set WEEKS ordered := setof {t in 1..T} "WEEK" & t;

Many useful applications of this feature occur in scripts that (as in our previous examples) process a series of files or set an option in several ways. For example, the following script solves diet.mod with a series of different data files diet1.dat, diet2.dat, diet3.dat, . . . and saves the solution to files diet1.out, diet2.out, diet3.out, . . . :
    model diet.mod;
    param Nfiles integer := 3;
    for {j in 1..Nfiles} {
       reset data;
       data ("diet" & j & ".dat");
       solve;
       display Buy >("diet" & j & ".out");
    }

The following script solves the same problem three times, each using a different value for the solver's branch directive:
    model multmip3.mod;
    data multmip3.dat;
    option solver cplex;
    for {i in -1 .. 1} {
       option cplex_options ("branch " & i);
       solve;
    }

Numeric operands to & are always converted to full precision (or equivalently, to %.0g format) as previously defined. The conversion thus produces the expected results for concatenation of numerical constants and of indices that run over sets of integers or constants, as in our examples. Full precision conversion of computed fractional values may surprise you, however. The following variation on the preceding example would seem to solve the problem for values 0.1, 0.2, 0.3, and 0.4 of directive linesearch_tolerance, saving the listings from solve in files nltrans0.1 through nltrans0.4:
    model nltransd.mod;
    data nltrans.dat;
    for {i in 0.1 .. 0.4 by 0.1} {
       option reset_initial_guesses 1;
       option minos_options ("linesearch_tolerance=" & i);
       solve > ("nltrans" & i);
    }

The actual files created are
    nltrans0.1
    nltrans0.2
    nltrans0.30000000000000004
    nltrans0.4

Because 0.1 cannot be stored exactly in the computer's internal base-2 representation, the third member of the set 0.1 .. 0.4 by 0.1 in the for loop is slightly different from 0.3 in "full" precision. There is no easy way to predict this behavior, but you can prevent it by specifying the conversion explicitly. In our example, you could replace "nltrans" & i with sprintf("nltrans%3.1f",i).
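Before running a script that builds filenames or directives from computed fractions, you can preview the strings it will produce. The following sketch prints, for each value in the loop above, the name produced by & next to the name produced by sprintf. It solves nothing and uses only constructs described in this writeup.

```ampl
# Left column: default full-precision conversion by &.
# Right column: explicit one-decimal conversion by sprintf.
for {i in 0.1 .. 0.4 by 0.1} {
   printf "%s   %s\n", ("nltrans" & i), sprintf("nltrans%3.1f", i);
}
```

Any line where the two columns differ marks a value whose full-precision form is not the short decimal you intended.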
Suppose that the values of

    param costcode {j in FOOD} symbolic;

are "cost codes" that contain the cost per unit in the leading 4 characters: "3.19BE", "2.59CH", and so forth. The expression substr(costcode[j],1,4) * Buy[j] is rejected, because a string appears where a number is expected, as the left operand of *; but
    num(substr(costcode[j],1,4)) * Buy[j]

is correct.
The num function accepts an argument that evaluates to a valid AMPL numerical constant written as a string, such as "-12.34" or " 1.2e+6". Leading and trailing white space -- any combination of spaces, tabs and newlines -- is ignored. The use of num with any other argument is flagged as an error.
The num0 function is more forgiving. It extracts the longest leading substring of its argument that would be acceptable to num, returns the corresponding numerical value, and disregards the rest of the argument. If no leading substring can be converted, a value of zero is returned. Thus num0 accepts " 12.34kg" or "1.2e+4?", for example, returning the numbers 12.34 and 12000. Our sample expression above, with costcode[j] having the form "3.19BE", can be written more concisely as
    num0(costcode[j]) * Buy[j]

If the cost codes are instead of the form "BE3.19", however, then num0 returns 0 for all of them.
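The contrast between num and num0 can be checked directly on the sample strings mentioned above. This is a short sketch using literal strings only; the comments restate what the rules in this section say each call should do.

```ampl
display num("-12.34");        # a valid numerical constant written as a string
display num(" 1.2e+6");       # leading white space is ignored
display num0(" 12.34kg");     # trailing "kg" is disregarded
display num0("1.2e+4?");      # trailing "?" is disregarded
display num0("BE3.19");       # no convertible leading substring, so 0
```

Passing " 12.34kg" or "BE3.19" to num instead of num0 would be flagged as an error, so those calls are left out of the sketch.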
The char function takes a nonnegative integer as an argument, and returns a string of length 1 comprising the corresponding Unicode character. (There is no distinct character data type in AMPL.) Unicode is a superset of ASCII, so char(65) returns "A", and display char(7) sends a "control-G" to the AMPL output stream (where it may provoke a "beep" or other alert sound).
The ichar function is the inverse of char. It takes a string as argument, and returns the integer Unicode value of the string's first character. Thus ichar("A") and ichar("AT&T") both return 65.
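The two functions can be tried together in a few lines. The sketch below repeats the examples from the text and adds one round trip, which by the inverse relationship should give back the original one-character string.

```ampl
display char(65);             # "A", as stated above
display ichar("A");           # 65
display ichar("AT&T");        # 65: only the first character is examined
display char(ichar("Z"));     # round trip through the character's integer value
```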
Strings still do need to be delimited by quotes (' or ") when you write them as input to AMPL, so that they can be parsed correctly. The only exception is in data statements (Chapter 9 of the AMPL book) where for convenience the quotes may be omitted from strings that contain only letters, digits, and underscores and that do not represent numbers.
For the command printf or the function sprintf, the conversion specification %s in the format string requests the unquoted contents of the string. Two new specifiers have been added to produce quoted strings: %q shows quotes if and only if they would be required in a data statement, while %Q shows quotes around all strings.
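The three specifiers can be compared side by side by printing the same string under each. The sample strings in this sketch are made up: one needs no quotes in a data statement, one contains a space, and one represents a number.

```ampl
# Each line prints one string under %s, %q and %Q in turn.
printf "%s   %q   %Q\n", "BEEF", "BEEF", "BEEF";
printf "%s   %q   %Q\n", "B 2", "B 2", "B 2";
printf "%s   %q   %Q\n", "123", "123", "123";
```

Under the rules stated above, %q should leave BEEF unquoted but quote the other two strings, while %Q should quote all three.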