# LLM cost tracking (AI Foundry and in-house)
Below is a command-line tool to fetch the costs of LLM (Large Language Model) deployments in Azure AI Foundry for a set of subscriptions within your directory.
The tool generates reports that break costs down in total, per subscription, and per resource or resource group.
Subscriptions can be specified in a configuration file or discovered automatically.
In addition to AI Foundry deployments, you can specify other resources whose costs you want to track, such as an AKS cluster, a container registry, or the API Management instances you use to run your own private LLM deployments.
Finally, it can plot daily costs using Pandas and matplotlib.
It simplifies cost tracking and provides immediate insights into the (amortized) expenses associated with any Azure-based LLM services you are using.
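To make the report structure concrete, here is a minimal sketch of how daily cost records could be rolled up into a grand total, per-subscription totals, and per-resource-group totals. The record layout (`subscription`, `resource_group`, `date`, `cost`), the function name `summarize_costs`, and the sample values are assumptions made for illustration; they are not taken from the tool itself.

```python
from collections import defaultdict


def summarize_costs(records):
    """Roll daily cost records up into total, per-subscription, and per-group sums."""
    per_subscription = defaultdict(float)
    per_group = defaultdict(float)
    for rec in records:
        per_subscription[rec["subscription"]] += rec["cost"]
        # Resource groups are only unique within a subscription, so key on both.
        per_group[(rec["subscription"], rec["resource_group"])] += rec["cost"]
    total = sum(per_subscription.values())
    return total, dict(per_subscription), dict(per_group)


# Hypothetical sample data, invented for illustration only.
records = [
    {"subscription": "sub-a", "resource_group": "rg-llm", "date": 20250101, "cost": 1.5},
    {"subscription": "sub-a", "resource_group": "rg-aks", "date": 20250101, "cost": 2.25},
    {"subscription": "sub-b", "resource_group": "rg-llm", "date": 20250102, "cost": 0.75},
]

total, per_subscription, per_group = summarize_costs(records)
print(f"Total: {total:.2f}")
for name, cost in per_subscription.items():
    print(f"  {name}: {cost:.2f}")
```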
## Prerequisites
- Python (3.11 or higher); dependencies: `pip install requests pyyaml pandas matplotlib types-pyyaml types-requests pandas-stubs`
- Azure account with permissions to access subscription cost data
- Azure Cost Management API enabled for your subscriptions
- Azure CLI `az` installed (for token generation)
## Setup Instructions
### Optional: Set Azure Access Token
By default, the script will fetch the access token on its own using the `az` CLI tool.
To generate an Azure Access Token using the Azure CLI:
```bash
az account get-access-token --resource https://management.azure.com
```
Copy the `accessToken` value from the output.
Set the Access Token as an Environment Variable:
- On macOS/Linux:
```bash
export AZURE_ACCESS_TOKEN=<your-access-token>
```
- On Windows (Command Prompt):
```cmd
set AZURE_ACCESS_TOKEN=<your-access-token>
```
- On Windows (PowerShell):
```powershell
$env:AZURE_ACCESS_TOKEN="<your-access-token>"
```
To make this easy, create a script `set_azure_token.sh` with this content:
```shell
#!/bin/bash
# Get the access token using Azure CLI
TOKEN=$(az account get-access-token --resource https://management.azure.com --query accessToken -o tsv)
if [ -z "$TOKEN" ]; then
    echo "Error: Failed to retrieve access token. Make sure you are logged in with 'az login'."
    exit 1
fi
export AZURE_ACCESS_TOKEN="$TOKEN"
```
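Then set the token by running `source set_azure_token.sh`. The script must be sourced rather than executed so that the exported variable is visible in your current shell.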
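A manually exported token eventually expires, and a stale `AZURE_ACCESS_TOKEN` then causes authentication errors. The sketch below checks the expiry before a run. It assumes the token is a JWT carrying an `exp` claim, which is how Azure access tokens are commonly issued but is not something the tool relies on; the helper names `token_expiry` and `is_token_expired` are hypothetical. The signature is not verified, so this is only a convenience check.

```python
import base64
import json
import time
from typing import Optional


def token_expiry(token: str) -> int:
    """Return the 'exp' claim (Unix seconds) of a JWT without verifying it."""
    payload_b64 = token.split(".")[1]
    # JWT segments drop base64 padding; restore it before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    payload = json.loads(base64.urlsafe_b64decode(payload_b64))
    return int(payload["exp"])


def is_token_expired(token: str, now: Optional[float] = None, skew_seconds: int = 60) -> bool:
    """Treat the token as expired slightly early to allow for clock skew."""
    current = time.time() if now is None else now
    return token_expiry(token) - skew_seconds <= current
```

For example, `is_token_expired(os.environ["AZURE_ACCESS_TOKEN"])` (with `import os`) could be called before starting a long report run, re-sourcing `set_azure_token.sh` when it returns `True`.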
### Optional: Configure Subscriptions
Update the `config.yaml` file with your Azure subscription details. The file includes a list of subscriptions to track:
```yaml
subscriptions:
  - id: $UUID...
    name: $NAME...
    resource_groups:
      - $NAME
      - ...
  - id: ...
Add or modify subscription entries as needed for your environment.
To use those subscriptions, pass the command-line flag `--configure`.
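Because a mis-indented or incomplete `config.yaml` is easy to miss, it can help to check the parsed structure before a run. The following is a rough sketch, not part of the tool: `validate_subscriptions` is a hypothetical helper that takes the dictionary returned by `yaml.safe_load` and lists any problems. It assumes that `id` is a UUID (as the `$UUID` placeholder suggests) and that `resource_groups` is an optional list of names.

```python
import uuid


def validate_subscriptions(config):
    """Return a list of human-readable problems found in the parsed config."""
    subs = (config or {}).get("subscriptions")
    if not isinstance(subs, list) or not subs:
        return ["'subscriptions' must be a non-empty list"]
    problems = []
    for i, sub in enumerate(subs):
        if not isinstance(sub, dict):
            problems.append(f"entry {i}: expected a mapping with 'id' and 'name'")
            continue
        try:
            uuid.UUID(str(sub.get("id")))
        except ValueError:
            problems.append(f"entry {i}: 'id' is not a valid UUID")
        # Assumption: resource_groups may be omitted to cover the whole subscription.
        groups = sub.get("resource_groups", [])
        if not isinstance(groups, list) or not all(isinstance(g, str) for g in groups):
            problems.append(f"entry {i}: 'resource_groups' must be a list of names")
    return problems
```

An empty list means the structure looks plausible; it does not confirm that the subscriptions or resource groups actually exist in Azure.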
## Using the tool
```bash
python track_costs.py --help
```
To capture the full cost report in a markdown file:
```bash
python track_costs.py > report.md
```
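The redirect with `>` captures only standard output. The script configures `logging` with `stream=sys.stderr`, so progress and error messages stay on the terminal while the report goes to the file. The sketch below illustrates that separation; the function `write_report` and the table layout are hypothetical and only stand in for whatever the tool actually prints.

```python
import logging
import sys

# Same idea as in the script: log messages go to stderr, not stdout.
logging.basicConfig(
    level=logging.INFO, format="%(levelname)s: %(message)s", stream=sys.stderr
)


def write_report(rows, out=sys.stdout):
    """Write a small Markdown table to `out`; diagnostics go through logging."""
    logging.info("Writing %d rows", len(rows))
    print("| Subscription | Cost ($) |", file=out)
    print("|---|---|", file=out)
    for name, cost in rows:
        print(f"| {name} | {cost:.2f} |", file=out)


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    write_report([("demo-subscription", 12.5)])
```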
# Code
```python
#!/usr/bin/env python3
import os
import sys
import logging
import requests
import argparse
import time
import subprocess
import re
from datetime import datetime, timedelta
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry
logging.basicConfig(
    level=logging.INFO, format="%(levelname)s: %(message)s", stream=sys.stderr
)
AZURE_MANAGEMENT_API = "https://management.azure.com"
AZURE_GRAPH_API = "https://graph.microsoft.com"
API_VERSION_SUBSCRIPTIONS = "2022-12-01"
API_VERSION_COST_MANAGEMENT = "2023-03-01"
API_VERSION_COGNITIVE_SERVICES = "2023-05-01"
RATE_LIMIT_DELAY_SECONDS = 2.0
SUBSCRIPTION_DELAY_SECONDS = 3.0
VALID_PRESET_TIMEFRAMES = ["this-week", "7days", "month-to-date", "last-month", "30days"]
_principal_name_cache: dict[str, str | None] = {}
def create_session_with_retries():
    """Create a requests session with retry logic for rate limiting (429) and server errors."""
    session = requests.Session()
    retry_strategy = Retry(
        total=5,
        backoff_factor=2,
        status_forcelist=[429, 500, 502, 503, 504],
        allowed_methods=["GET", "POST"],
    )
    adapter = HTTPAdapter(max_retries=retry_strategy)
    session.mount("http://", adapter)
    session.mount("https://", adapter)
    return session
def get_azure_token(resource=f"{AZURE_MANAGEMENT_API}/"):
    """Get Azure access token from environment variable or Azure CLI.

    Args:
        resource: The Azure resource URL to get a token for.
            Defaults to Azure Management API.

    Returns:
        Access token string, or None if unable to retrieve.
    """
    env_var = (
        "AZURE_ACCESS_TOKEN"
        if resource == f"{AZURE_MANAGEMENT_API}/"
        else "AZURE_GRAPH_TOKEN"
    )
    token = os.getenv(env_var)
    if token:
        logging.debug(f"Using token from environment variable {env_var}")
        return token
    logging.debug(f"az account get-access-token --resource {resource} --query accessToken -o tsv")
    try:
        result = subprocess.run(
            [
                "az",
                "account",
                "get-access-token",
                "--resource",
                resource,
                "--query",
                "accessToken",
                "-o",
                "tsv",
            ],
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout.strip()
    except subprocess.CalledProcessError as e:
        logging.debug(f"Azure CLI error: {e.stderr if hasattr(e, 'stderr') else str(e)}")
        return None
    except FileNotFoundError:
        logging.error("Azure CLI not found")
        return None
def load_subscriptions(config_file):
    """Load subscriptions from the configuration file."""
    try:
        import yaml

        with open(config_file, "r") as file:
            # safe_load returns None for an empty file, so fall back to an empty dict
            return (yaml.safe_load(file) or {}).get("subscriptions", [])
    except ImportError:
        logging.error("Error: 'pyyaml' module is not installed.")
        logging.info("Install pyyaml using: pip install pyyaml")
        sys.exit(1)
def generate_timeline_plots(daily_data):
    """Generate timeline plots from daily cost data."""
    if not daily_data:
        logging.info("No daily data to plot")
        return
    df = pd.DataFrame(daily_data)
    df["date"] = pd.to_datetime(df["date"].astype(str), format="%Y%m%d")
    pivot_df = df.pivot_table(
        index="date",
        columns="resource_name",
        values="cost",
        aggfunc="sum",
        fill_value=0,
    )
    if pivot_df.empty:
        logging.info("No data available for plotting")
        return

    # Plot 1: daily cost per resource as individual lines
    plt.figure(figsize=(12, 6))
    for resource in pivot_df.columns:
        plt.plot(pivot_df.index, pivot_df[resource], marker="o", label=resource)
    plt.xlabel("Date")
    plt.ylabel("Cost ($)")
    plt.title("Daily Costs by Resource")
    plt.legend(bbox_to_anchor=(1.05, 1), loc="upper left")
    plt.grid(True, alpha=0.3)
    plt.gca().xaxis.set_major_formatter(mdates.DateFormatter("%Y-%m-%d"))
    plt.gcf().autofmt_xdate()
    plt.tight_layout()
    plt.savefig("resource_group_costs.png", dpi=300, bbox_inches="tight")
    plt.close()
    logging.info("Saved resource_group_costs.png")

    # Plot 2: cumulative cost per resource, stacked
    cumulative_df = pivot_df.cumsum()
    plt.figure(figsize=(12, 6))
    plt.stackplot(
        cumulative_df.index,
        *[cumulative_df[col].values for col in cumulative_df.columns],
        labels=list(cumulative_df.columns),
        alpha=0.8,
    )
    plt.xlabel("Date")
    plt.ylabel("Cumulative Cost ($)")
    plt.title("Cumulative Total Costs (Stacked)")
    plt.legend(bbox_to_anchor=(1.05, 1), loc="upper left")
    plt.grid(True, alpha=0.3)
    plt.gca().xaxis.set_major_formatter(mdates.DateFormatter("%Y-%m-%d"))
    plt.gcf().autofmt_xdate()
    plt.tight_layout()
    plt.savefig("total_costs.png", dpi=300, bbox_inches="tight")
    plt.close()
    logging.info("Saved total_costs.png")
def discover_subscriptions(access_token):
    """Discover all Azure subscriptions the current user has access to.

    Args:
        access_token: Azure Management API access token.

    Returns:
        List of dicts with 'id' and 'name' keys for each subscription.
    """
    url = f"{AZURE_MANAGEMENT_API}/subscriptions?api-version={API_VERSION_SUBSCRIPTIONS}"
    headers = {
        "Authorization": f"Bearer {access_token}",
        "Content-Type": "application/json",
    }
    try:
        session = create_session_with_retries()
        response = session.get(url, headers=headers)
        response.raise_for_status()
        data = response.json()
        logging.debug(f"Found {len(data.get('value', []))} subscriptions")
        subscriptions = []
        for sub in data.get("value", []):
            subscriptions.append(
                {"id": sub.get("subscriptionId"), "name": sub.get("displayName")}
            )
        return subscriptions
    except requests.exceptions.RequestException as e:
        logging.error(f"Error discovering subscriptions: {e}")
        if hasattr(e, 'response') and e.response is not None:
            logging.debug(f"Response text: {e.response.text[:500]}")
        return []
def get_date_range(timeframe):
    """Calculate start and end dates for a given timeframe.

    Supports formats:
    - 'this-week': Current week
    - '7days': Last 7 days
    - 'month-to-date': Current month to today
    - 'last-month': Previous complete month
    - '30days': Last 30 days
    - 'mon-2025', 'january-2026': Specific month (abbreviated or full month name)
    """
    today = datetime.now()
    # Try to parse month-year format (e.g., "nov-2025", "november-2025")
month_year_pattern = r'^([a-z]+)-([0-9]{4})