Optimized Portfolio Optimization with AMPL and EIA Data#

Notebook_1_Portfolio_Optimization_Commodities.ipynb Open In Colab Kaggle Gradient Open In SageMaker Studio Lab Hits

Description: Portfolio Optimization across Crude Oil, Gold, Natural Gas, Silver, and the S&P 500.

Tags: finance, portfolio-optimization, mean-variance

Notebook author: Mukeshwaran Baskaran <mukesh96official@gmail.com>

What the Code Does:#

This code is a portfolio optimization tool that helps you decide how to invest in different assets (like Crude Oil, Gold, Natural Gas, Silver, and the S&P 500) to achieve specific financial goals. It uses historical data and advanced math to find the best way to distribute your money across these assets.


Steps in the Code:#

  1. Install Required Tools:

    • The code installs necessary Python libraries like amplpy (for optimization) and yfinance (to get financial data).

  2. Import Libraries:

    • It imports tools for data analysis (pandas, numpy), visualization (matplotlib, seaborn), and fetching data (yfinance, requests).

  3. Enter API Key:

  4. Define Assets and Time Period:

    • The code focuses on 5 assets: Crude Oil, Gold, Natural Gas, Silver, and the S&P 500 (SPY).

    • It sets a time period (from January 1, 2020, to January 1, 2025) to analyze historical data.

  5. Fetch Data:

    • Crude Oil Inventories: It fetches weekly crude oil inventory data from the EIA API.

    • Commodity Prices: It downloads historical price data for the 5 assets using yfinance.

  6. Calculate Statistics:

    • It calculates:

      • Daily Returns: How much the price of each asset changes daily.

      • Covariance Matrix: How the assets move in relation to each other.

      • Mean Returns: The average daily return for each asset.

  7. Initialize AMPL:

    • AMPL is a tool for solving optimization problems. The code sets it up with specific modules and a license.

  8. Define Optimization Functions:

    • The code defines three types of portfolio optimization:

      • Mean-Variance Optimization: Finds the portfolio with the lowest risk for a given level of return.

      • Risk Parity Optimization: Balances risk equally across all assets.

      • Maximum Sharpe Ratio Optimization: Finds the portfolio with the best risk-adjusted return.

  9. Plot Results:

    • It creates pie charts to visualize how much money should be invested in each asset based on the optimization results.

  10. Run Optimizations:

    • It runs each optimization (Mean-Variance, Risk Parity, and Maximum Sharpe Ratio) and displays the results as pie charts.


Outputs:#

  • Mean-Variance Portfolio: A pie chart showing how to distribute investments to minimize risk.

  • Risk Parity Portfolio: A pie chart showing how to balance risk equally across assets.

  • Maximum Sharpe Ratio Portfolio: A pie chart showing how to maximize returns relative to risk.

# Install dependencies
%pip install amplpy yfinance matplotlib seaborn pandas numpy requests -q
# Google Colab & Kaggle integration
from amplpy import AMPL, ampl_notebook

ampl = ampl_notebook(
    modules=["gurobi"],  # modules to install
    license_uuid="default",  # license to use
)  # instantiate AMPL object and register magics
import os
from getpass import getpass

if os.environ.get("EIA_API_KEY", "") == "":
    # Prompt the user to enter their API key securely
    os.environ["EIA_API_KEY"] = getpass("Enter your EIA API key: ")
    print("API Key stored successfully!")
else:
    print("API Key already stored!")
API Key already stored!

Utility functions#

Fetch data from EIA and yfinance#

# Import libraries
import yfinance as yf
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import requests
# Define commodities and fetch data
COMMODITIES = [
    "CL=F",
    "GC=F",
    "NG=F",
    "SI=F",
    "SPY",
]  # Crude Oil, Gold, Natural Gas, Silver, SPY,
START_DATE = "2020-01-01"
END_DATE = "2025-01-01"


# Fetch crude oil inventories from EIA API
def fetch_crude_inventories(api_key):
    """Fetch crude oil inventories data from EIA API."""
    base_url = "https://api.eia.gov/v2/petroleum/stoc/wstk/data/"
    params = {
        "frequency": "weekly",
        "data[0]": "value",
        "facets[series][]": "WTTSTUS1",
        "sort[0][column]": "period",
        "sort[0][direction]": "desc",
        "offset": 0,
        "length": 5000,
        "api_key": api_key,
    }
    response = requests.get(base_url, params=params)
    if response.status_code == 200:
        data = response.json()
        df = pd.DataFrame(data["response"]["data"])
        df["period"] = pd.to_datetime(df["period"])
        df.set_index("period", inplace=True)
        return df[["value"]].rename(columns={"value": "Inventories"})
    else:
        raise Exception(f"Failed to fetch data: {response.status_code}")


# Download historical data for commodities
def fetch_commodities_data(tickers, start_date, end_date):
    """Fetch historical data for given tickers."""
    data = yf.download(tickers, start=start_date, end=end_date)["Close"]
    return data

Calculate statistics#

# Calculate returns and covariance matrix
def calculate_statistics(data):
    """Calculate daily returns and covariance matrix."""
    returns = data.pct_change().dropna()
    cov_matrix = returns.cov().values
    mean_returns = returns.mean().values
    return returns, cov_matrix, mean_returns

Plot charts#

import matplotlib.pyplot as plt
import seaborn as sns
def plot_pie_chart(weights, title):
    """Plot a pie chart of portfolio weights with improved visuals."""

    # Define a dictionary to map tickers to their respective names
    ticker_to_name = {
        "CL=F": "Crude Oil",
        "GC=F": "Gold",
        "NG=F": "Natural Gas",
        "SI=F": "Silver",
        "SPY": "S&P 500",
    }

    # Map the tickers in the portfolio to their actual names
    labels = [ticker_to_name[ticker] for ticker in weights.index]

    # Filter out the tickers with weight less than or equal to 10e-6
    filtered_weights = weights[weights > 10e-6]
    filtered_labels = [
        label
        for label, ticker in zip(labels, weights.index)
        if ticker in filtered_weights.index
    ]

    # Set color palette
    palette = sns.color_palette("Set3", len(filtered_weights))

    # Plot the pie chart
    plt.figure(figsize=(8, 8))
    wedges, texts, autotexts = plt.pie(
        filtered_weights,
        labels=filtered_labels,
        autopct="%1.1f%%",
        startangle=90,
        colors=palette,
        explode=[0.1]
        * len(filtered_weights),  # Slightly "explode" all slices for emphasis
        wedgeprops={"edgecolor": "black", "linewidth": 1, "linestyle": "solid"},
    )

    # Improve text labels
    for text in texts:
        text.set_fontsize(14)
        text.set_fontweight("bold")
    for autotext in autotexts:
        autotext.set_fontsize(12)
        autotext.set_fontweight("bold")

    # Add title
    plt.title(title, fontsize=18, fontweight="bold")

    # Equal aspect ratio ensures that pie is drawn as a circle.
    plt.axis("equal")

    # Show the plot
    plt.show()

Mean-Variance Optimization Implementation#

def mean_variance_optimization(tickers, cov_matrix):
    """Perform mean-variance optimization."""
    ampl = AMPL()
    # Define the AMPL model
    ampl.eval(
        r"""
        set A ordered;  # Set of assets
        param S{A, A};  # Covariance matrix
        var w{A} >= 0 <= 1;  # Portfolio weights

        minimize portfolio_variance:
            sum {i in A, j in A} w[i] * S[i, j] * w[j];  # Objective: minimize variance

        s.t. portfolio_weights:
            sum {i in A} w[i] = 1;  # Constraint: weights sum to 1
        """
    )

    # Set data in AMPL
    ampl.set["A"] = tickers
    ampl.param["S"] = pd.DataFrame(cov_matrix, index=tickers, columns=tickers)

    # Solve the optimization problem
    ampl.solve(solver="gurobi", gurobi_options="outlev=1")

    # Extract and return portfolio weights
    return ampl.var["w"].to_pandas()["w.val"]

Risk Parity Optimization Implementation#

def risk_parity_optimization(tickers, cov_matrix, total_volatility):
    """Perform risk parity optimization."""
    ampl = AMPL()
    # Define the AMPL model
    ampl.eval(
        r"""
        set A ordered;  # Set of assets
        param S{A, A};  # Covariance matrix
        param total_volatility;  # Total portfolio volatility
        var w{A} >= 0 <= 1;  # Portfolio weights

        minimize risk_parity:
            sum {i in A} (w[i] * sum {j in A} S[i, j] * w[j]) / total_volatility;  # Objective: risk parity

        s.t. portfolio_weights:
            sum {i in A} w[i] = 1;  # Constraint: weights sum to 1
        """
    )

    # Set data in AMPL
    ampl.set["A"] = tickers
    ampl.param["S"] = pd.DataFrame(cov_matrix, index=tickers, columns=tickers)
    ampl.param["total_volatility"] = total_volatility

    # Solve the optimization problem
    ampl.solve(solver="gurobi", gurobi_options="outlev=1")

    # Extract and return portfolio weights
    return ampl.var["w"].to_pandas()["w.val"]

Maximum Sharpe Ratio Optimization Implementation#

def max_sharpe_optimization(
    tickers, cov_matrix, mean_returns, risk_free_rate=0.02
):
    """Perform maximum Sharpe ratio optimization."""
    ampl = AMPL()
    # Define the AMPL model
    ampl.eval(
        r"""
        set A ordered;  # Set of assets
        param S{A, A};  # Covariance matrix
        param mu{A};  # Mean returns
        param risk_free_rate;  # Risk-free rate
        var w{A} >= 0 <= 1;  # Portfolio weights

        maximize sharpe_ratio:
            (sum {i in A} mu[i] * w[i] - risk_free_rate) /
            sqrt(sum {i in A, j in A} w[i] * S[i, j] * w[j]);  # Objective: maximize Sharpe ratio

        s.t. portfolio_weights:
            sum {i in A} w[i] = 1;  # Constraint: weights sum to 1
        """
    )

    # Set data in AMPL
    ampl.set["A"] = tickers
    ampl.param["S"] = pd.DataFrame(cov_matrix, index=tickers, columns=tickers)
    ampl.param["mu"] = pd.Series(mean_returns, index=tickers).to_dict()
    ampl.param["risk_free_rate"] = risk_free_rate

    # Solve the optimization problem
    ampl.solve(solver="gurobi", gurobi_options="outlev=1")

    # Extract and return portfolio weights
    return ampl.var["w"].to_pandas()["w.val"]

Mean-Variance Optimization#

crude_inventories = fetch_crude_inventories(os.environ["EIA_API_KEY"])
commodities_data = fetch_commodities_data(COMMODITIES, START_DATE, END_DATE)
returns, cov_matrix, mean_returns = calculate_statistics(commodities_data)

# Mean-Variance Optimization
mean_variance_weights = mean_variance_optimization(COMMODITIES, cov_matrix)
plot_pie_chart(mean_variance_weights, "Mean-Variance Optimization Portfolio")
[*********************100%***********************]  5 of 5 completed
<ipython-input-68-262237bf351b>:4: FutureWarning: The default fill_method='pad' in DataFrame.pct_change is deprecated and will be removed in a future version. Either fill in any non-leading NA values prior to calling pct_change or specify 'fill_method=None' to not fill NA values.
  returns = data.pct_change().dropna()
Gurobi 12.0.1: Set parameter LogToConsole to value 1
  tech:outlev = 1
Set parameter InfUnbdInfo to value 1
Gurobi Optimizer version 12.0.1 build v12.0.1rc0 (linux64 - "Ubuntu 22.04.4 LTS")

CPU model: Intel(R) Xeon(R) CPU @ 2.20GHz, instruction set [SSE2|AVX|AVX2]
Thread count: 1 physical cores, 2 logical processors, using up to 2 threads

Non-default parameters:
InfUnbdInfo  1

Optimize a model with 1 rows, 5 columns and 5 nonzeros
Model fingerprint: 0xf5b53e7c
Model has 15 quadratic objective terms
Coefficient statistics:
  Matrix range     [1e+00, 1e+00]
  Objective range  [0e+00, 0e+00]
  QObjective range [5e-05, 2e-02]
  Bounds range     [1e+00, 1e+00]
  RHS range        [1e+00, 1e+00]
Presolve time: 0.00s
Presolved: 1 rows, 5 columns, 5 nonzeros
Presolved model has 15 quadratic objective terms
Ordering time: 0.00s

Barrier statistics:
 Free vars  : 4
 AA' NZ     : 1.000e+01
 Factor NZ  : 1.500e+01
 Factor Ops : 5.500e+01 (less than 1 second per iteration)
 Threads    : 1

                  Objective                Residual
Iter       Primal          Dual         Primal    Dual     Compl     Time
   0   2.63090112e+03 -2.63090112e+03  2.25e+03 2.70e-07  2.50e+05     0s
   1   5.00114160e+02 -5.01458979e+02  2.43e+02 2.92e-08  3.22e+04     0s
   2   3.94288533e+01 -4.16895661e+01  1.85e+01 2.22e-09  2.78e+03     0s
   3   1.56225933e-02 -2.91974974e+00  1.21e-01 1.44e-11  3.67e+01     0s
   4   4.36523957e-03 -7.22068602e-01  1.21e-07 5.55e-17  4.65e+00     0s
   5   4.26451246e-03 -4.85867704e-03  1.51e-09 5.55e-17  5.84e-02     0s
   6   5.70426476e-04 -2.43281468e-03  1.53e-15 1.39e-16  1.92e-02     0s
   7   2.05375197e-04 -5.59800439e-05  5.55e-17 1.98e-16  1.67e-03     0s
   8   9.12464623e-05  4.89884369e-05  2.78e-17 1.39e-17  2.70e-04     0s
   9   7.49653082e-05  7.01197812e-05  3.47e-18 3.47e-18  3.10e-05     0s
  10   7.30077162e-05  7.23982266e-05  6.52e-16 6.94e-18  3.90e-06     0s
  11   7.27266074e-05  7.26554420e-05  1.29e-15 6.94e-18  4.55e-07     0s
  12   7.26783967e-05  7.26715401e-05  8.60e-16 6.94e-18  4.39e-08     0s
  13   7.26721403e-05  7.26720451e-05  3.09e-15 6.94e-18  6.09e-10     0s

Barrier solved model in 13 iterations and 0.01 seconds (0.00 work units)
Optimal objective 7.26721403e-05

Gurobi 12.0.1: optimal solution; objective 7.26721403e-05
0 simplex iterations
13 barrier iterations
../_images/e9410c54d8154b9f451c605434277dda2704abe76e5304d8cb1ce84af0478987.png

Risk Parity Optimization#

# Risk Parity Optimization
total_volatility = np.sqrt(np.dot(mean_returns, np.dot(cov_matrix, mean_returns)))
risk_parity_weights = risk_parity_optimization(
    COMMODITIES, cov_matrix, total_volatility
)
plot_pie_chart(risk_parity_weights, "Risk Parity Optimization Portfolio")
Gurobi 12.0.1: Set parameter LogToConsole to value 1
  tech:outlev = 1
Set parameter InfUnbdInfo to value 1
Gurobi Optimizer version 12.0.1 build v12.0.1rc0 (linux64 - "Ubuntu 22.04.4 LTS")

CPU model: Intel(R) Xeon(R) CPU @ 2.20GHz, instruction set [SSE2|AVX|AVX2]
Thread count: 1 physical cores, 2 logical processors, using up to 2 threads

Non-default parameters:
InfUnbdInfo  1

Optimize a model with 1 rows, 5 columns and 5 nonzeros
Model fingerprint: 0x1daa3581
Model has 15 quadratic objective terms
Coefficient statistics:
  Matrix range     [1e+00, 1e+00]
  Objective range  [0e+00, 0e+00]
  QObjective range [2e-01, 8e+01]
  Bounds range     [1e+00, 1e+00]
  RHS range        [1e+00, 1e+00]
Presolve time: 0.00s
Presolved: 1 rows, 5 columns, 5 nonzeros
Presolved model has 15 quadratic objective terms
Ordering time: 0.00s

Barrier statistics:
 Free vars  : 4
 AA' NZ     : 1.000e+01
 Factor NZ  : 1.500e+01
 Factor Ops : 5.500e+01 (less than 1 second per iteration)
 Threads    : 1

                  Objective                Residual
Iter       Primal          Dual         Primal    Dual     Compl     Time
   0   2.77624384e+06 -2.77624384e+06  2.54e+03 6.74e-07  2.69e+05     0s
   1   9.34658298e+04 -9.50588354e+04  2.02e+02 5.36e-08  2.37e+04     0s
   2   3.84183441e+03 -6.36299934e+03  7.80e+00 2.07e-09  1.03e+03     0s
   3   1.37974550e+01 -2.68192258e+03  7.80e-06 1.03e-13  3.37e+01     0s
   4   1.33961068e+01 -1.57253068e+01  5.18e-08 5.55e-17  3.64e-01     0s
   5   1.87611945e+00 -1.09723916e+01  5.18e-14 3.08e-15  1.61e-01     0s
   6   7.57110720e-01 -7.87494755e-02  2.78e-16 5.62e-16  1.04e-02     0s
   7   3.71954487e-01  2.48420962e-01  2.22e-16 1.11e-16  1.54e-03     0s
   8   3.14466939e-01  3.00358722e-01  2.78e-17 5.55e-17  1.76e-04     0s
   9   3.07824297e-01  3.05933954e-01  1.39e-16 1.39e-17  2.36e-05     0s
  10   3.06824712e-01  3.06575069e-01  8.88e-16 1.39e-17  3.12e-06     0s
  11   3.06650315e-01  3.06627918e-01  5.02e-15 2.78e-17  2.80e-07     0s
  12   3.06629561e-01  3.06629345e-01  8.83e-15 1.83e-17  2.70e-09     0s
  13   3.06629347e-01  3.06629347e-01  2.18e-14 3.47e-18  2.70e-12     0s

Barrier solved model in 13 iterations and 0.01 seconds (0.00 work units)
Optimal objective 3.06629347e-01

Gurobi 12.0.1: optimal solution; objective 0.3066293474
0 simplex iterations
13 barrier iterations
../_images/8fd10247ae17c2d9b46cde63e7a995c60d0a22fb831cec827d3f132163bf5c6d.png

Max Sharpe Optimization#

# Maximum Sharpe Ratio Optimization
max_sharpe_weights = max_sharpe_optimization(
    COMMODITIES, cov_matrix, mean_returns
)
plot_pie_chart(max_sharpe_weights, "Maximum Sharpe Ratio Portfolio")
Gurobi 12.0.1: Set parameter LogToConsole to value 1
  tech:outlev = 1
Set parameter InfUnbdInfo to value 1
Gurobi Optimizer version 12.0.1 build v12.0.1rc0 (linux64 - "Ubuntu 22.04.4 LTS")

CPU model: Intel(R) Xeon(R) CPU @ 2.20GHz, instruction set [SSE2|AVX|AVX2]
Thread count: 1 physical cores, 2 logical processors, using up to 2 threads

Non-default parameters:
InfUnbdInfo  1

Optimize a model with 1 rows, 9 columns and 5 nonzeros
Model fingerprint: 0x477461ec
Model has 1 general nonlinear constraint (27 nonlinear terms)
Variable types: 9 continuous, 0 integer (0 binary)
Coefficient statistics:
  Matrix range     [1e+00, 1e+00]
  Objective range  [1e+00, 1e+00]
  Bounds range     [8e-05, 1e+00]
  RHS range        [1e+00, 1e+00]
Presolve model has 1 nlconstr
Added 33 variables to disaggregate expressions.
Presolve removed 0 rows and 3 columns
Presolve time: 0.00s
Presolved: 82 rows, 40 columns, 182 nonzeros
Presolved model has 16 bilinear constraint(s)
Presolved model has 1 nonlinear constraint(s)

Solving non-convex MINLP

Variable types: 40 continuous, 0 integer (0 binary)
Found heuristic solution: objective -0.2240361

Root relaxation: interrupted, 20 iterations, 0.00 seconds (0.00 work units)

    Nodes    |    Current Node    |     Objective Bounds      |     Work
 Expl Unexpl |  Obj  Depth IntInf | Incumbent    BestBd   Gap | It/Node Time

     0     0          -    0        -0.22404   -0.22402  0.01%     -    0s

Explored 1 nodes (20 simplex iterations) in 0.03 seconds (0.00 work units)
Thread count was 2 (of 2 available processors)

Solution count 1: -0.224036 

Optimal solution found (tolerance 1.00e-04)
Best objective -2.240361471278e-01, best bound -2.240208721585e-01, gap 0.0068%
Gurobi 12.0.1: optimal solution; objective -0.2240361471
20 simplex iterations
1 branching node
absmipgap=1.5275e-05, relmipgap=6.81808e-05
../_images/e1f67fb6830ff138dd5a4eaa3dfd2114585d841e7ea3c49d4c4f88b262b98d15.png