Step-by-Step OSRM Docker Setup on AWS EC2

This guide covers the specific infrastructure and configuration decisions required when running Open Source Routing Machine (OSRM) on AWS EC2 via Docker — from selecting the right instance family through the Multi-Level Dijkstra (MLD) preprocessing pipeline to serving a production-grade routing API. It is a concrete operational variant of the broader OSRM Docker deployment workflow, sitting within the Python Routing Engines & Isochrone Mapping engineering stack. The decisions here — instance type, EBS IOPS, Docker memory flags, and MLD vs. CH algorithm choice — determine whether the server handles production traffic reliably or collapses under peak load.

When to use this approach

Running OSRM on EC2 is the right infrastructure choice when:

Latency requirements are strict. Self-hosted OSRM on a co-located EC2 instance eliminates the round-trip overhead and rate limits of third-party routing APIs — important for logistics dashboards that query thousands of routes per minute.
Network data is proprietary or regulated. If your OSM extract is augmented with confidential speed surveys, freight restriction overlays, or hazmat routing rules, those weights cannot be shared with external API providers.
Matrix query volume is high. The OSRM /table endpoint handles origin-destination matrices in a single HTTP call, but only for datasets loaded into RAM. A dedicated EC2 instance sized to the graph keeps response times under 100 ms for matrices up to 500×500.
Traffic update frequency is moderate. MLD allows you to run osrm-customize with a new speed CSV every few minutes without re-extracting the full graph. If you need sub-minute traffic freshness, consider a managed service; for hourly or daily updates, this pipeline is cost-effective.

This approach suits city- to country-scale extracts. Continental datasets (all of Europe or all of North America) require instances with ≥128 GB RAM and should use a purpose-built cluster or a managed OSRM service.

Implementation

The five-step deployment below is self-contained. Steps 1–2 are infrastructure setup; step 3 is preprocessing; step 4 launches the server; step 5 validates the API and connects it to Python.

Step 1: Provision EC2 and mount high-IOPS EBS storage

OSRM maps the entire routing graph into RAM using memory-mapped files, so instance selection is primarily a memory sizing exercise. Use the r6i or r7i families — their high memory bandwidth suits mmap I/O patterns. Avoid burstable t3/t4g instances: the CPU-intensive extraction stage will exhaust burst credits and stall.

RAM sizing by extract scope:

Extract scope	Typical `.osrm` size	Recommended RAM	Smallest matching `r6i` size
Single city / metro	1–3 GB	8 GB	`r6i.large` (16 GiB)
US state (e.g. California)	4–10 GB	16 GB	`r6i.large` (16 GiB)
Country (e.g. Germany)	10–25 GB	32 GB	`r6i.xlarge` (32 GiB)
Multi-country / region	25–80 GB	64–128 GB	`r6i.2xlarge` (64 GiB) to `r6i.4xlarge` (128 GiB)

Attach a secondary gp3 EBS volume (500 GB minimum for state-level extracts) with ≥3000 IOPS and ≥125 MB/s throughput — preprocessing is heavily sequential-read/write, so throughput matters more than peak IOPS.

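# Identify the new block device. On Nitro-based instances such as r6i/r7i it
# shows up as an NVMe device (e.g. /dev/nvme1n1); older instance types use /dev/xvdf.
lsblk
DEVICE=/dev/nvme1n1   # replace with the device name reported by lsblk

# Format ext4 (this erases the volume), create mount point, mount,
# and hand ownership to the current user so downloads do not need sudo
sudo mkfs.ext4 "$DEVICE"
sudo mkdir -p /mnt/osrm-data
sudo mount "$DEVICE" /mnt/osrm-data
sudo chown "$USER": /mnt/osrm-data

# Persist mount across reboots by UUID, since NVMe device names can change between boots
# (nofail prevents boot hang if device is missing)
echo "UUID=$(sudo blkid -s UUID -o value "$DEVICE") /mnt/osrm-data ext4 defaults,nofail 0 2" | sudo tee -a /etc/fstab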

Configure the EC2 security group to allow TCP 22 (SSH, from your IP only) and TCP 5000 (OSRM API, from your VPC CIDR or load balancer security group). Never expose port 5000 directly to 0.0.0.0/0; use an Nginx reverse proxy or AWS Application Load Balancer for public traffic.
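
The two ingress rules described above can also be written down as data. The sketch below builds them in the shape that boto3's `authorize_security_group_ingress` accepts for its `IpPermissions` argument. The helper name, the CIDR, and the security group IDs are placeholders chosen for illustration, not values from this deployment.

```python
def osrm_ingress_rules(admin_cidr: str, lb_security_group_id: str) -> list[dict]:
    """Build ingress rules: SSH from one CIDR, OSRM API from a load balancer SG."""
    if admin_cidr == "0.0.0.0/0":
        raise ValueError("SSH should be limited to your own IP, not 0.0.0.0/0")
    return [
        {
            "IpProtocol": "tcp",
            "FromPort": 22,
            "ToPort": 22,
            "IpRanges": [{"CidrIp": admin_cidr, "Description": "SSH from admin IP"}],
        },
        {
            "IpProtocol": "tcp",
            "FromPort": 5000,
            "ToPort": 5000,
            # Referencing the load balancer's security group keeps port 5000
            # closed to every other source.
            "UserIdGroupPairs": [
                {"GroupId": lb_security_group_id, "Description": "OSRM API from load balancer"}
            ],
        },
    ]


# Placeholder values for illustration only.
rules = osrm_ingress_rules("203.0.113.10/32", "sg-0123456789abcdef0")
for rule in rules:
    print(rule["FromPort"], rule)

# With boto3 installed and AWS credentials configured, the rules could be applied with:
#   boto3.client("ec2").authorize_security_group_ingress(
#       GroupId="<osrm-instance-sg-id>", IpPermissions=rules)
```

Keeping the rules in a function makes it easy to reject an open SSH range before anything is sent to AWS.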

Step 2: Install Docker Engine from the official repository

Use the official Docker apt repository rather than Ubuntu’s docker.io package, which typically lags upstream releases and may lack current security patches.

# Prerequisites
sudo apt-get update
sudo apt-get install -y ca-certificates curl gnupg
sudo install -m 0755 -d /etc/apt/keyrings

# Add Docker's GPG key and apt source
curl -fsSL https://download.docker.com/linux/ubuntu/gpg \
  | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
sudo chmod a+r /etc/apt/keyrings/docker.gpg

echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] \
  https://download.docker.com/linux/ubuntu \
  $(. /etc/os-release && echo "$VERSION_CODENAME") stable" \
  | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null

# Install Docker
sudo apt-get update
sudo apt-get install -y docker-ce docker-ce-cli containerd.io \
  docker-buildx-plugin docker-compose-plugin

# Add the current user to the docker group. newgrp opens a new shell with the
# group applied, so run it last (or log out and back in instead).
sudo usermod -aG docker "$USER"
newgrp docker

# Sanity check
docker run --rm hello-world

Step 3: Download the OSM PBF extract and run MLD preprocessing

MLD preprocessing has three sequential stages: osrm-extract parses the PBF into a routing graph; osrm-partition divides the graph into a hierarchy of nested cells that the MLD query algorithm uses to limit its search; osrm-customize computes the edge weights for those cells (from the speed profile or a traffic speed CSV). You can re-run osrm-customize alone when speed data changes, without repeating the expensive extract and partition stages.

# Download a Geofabrik extract to the EBS volume
cd /mnt/osrm-data
curl -LO https://download.geofabrik.de/north-america/us/california-latest.osm.pbf

# Stage 1: Extract — parse OSM into OSRM graph format using the car profile
docker run -t \
  -v /mnt/osrm-data:/data \
  ghcr.io/project-osrm/osrm-backend \
  osrm-extract -p /opt/car.lua /data/california-latest.osm.pbf

# Stage 2: Partition — build MLD cell decomposition (CPU-bound, 5–30 min for state scale)
docker run -t \
  -v /mnt/osrm-data:/data \
  ghcr.io/project-osrm/osrm-backend \
  osrm-partition /data/california-latest.osrm

# Stage 3: Customize — write edge weights into MLD structures
docker run -t \
  -v /mnt/osrm-data:/data \
  ghcr.io/project-osrm/osrm-backend \
  osrm-customize /data/california-latest.osrm
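
If the pipeline is driven from Python rather than typed by hand, the three stages can be generated from the PBF file name. This is a minimal sketch that mirrors the shell commands above. The function names are hypothetical, and `--rm` is added so that finished containers are cleaned up.

```python
import subprocess

IMAGE = "ghcr.io/project-osrm/osrm-backend"


def mld_pipeline_commands(
    data_dir: str, pbf_name: str, profile: str = "/opt/car.lua"
) -> list[list[str]]:
    """Return the docker commands for extract, partition and customize, in order."""
    suffix = ".osm.pbf"
    if not pbf_name.endswith(suffix):
        raise ValueError("expected a file name ending in .osm.pbf")
    base = pbf_name[: -len(suffix)]
    docker = ["docker", "run", "--rm", "-t", "-v", f"{data_dir}:/data", IMAGE]
    return [
        docker + ["osrm-extract", "-p", profile, f"/data/{pbf_name}"],
        docker + ["osrm-partition", f"/data/{base}.osrm"],
        docker + ["osrm-customize", f"/data/{base}.osrm"],
    ]


def run_pipeline(commands: list[list[str]]) -> None:
    """Run each stage in order; check=True stops at the first failing stage."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


commands = mld_pipeline_commands("/mnt/osrm-data", "california-latest.osm.pbf")
for cmd in commands:
    print(" ".join(cmd))

# run_pipeline(commands)  # uncomment on a host where Docker is installed
```

Because each stage depends on the files written by the previous one, stopping at the first failure avoids partitioning a half-written graph.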

Profile selection determines which OSM tags are read and how edge costs are computed. The built-in car.lua uses maxspeed=*, highway=* class weights, and oneway=yes directionality. For freight applications, review the profile’s turn penalty settings and consider overriding its per-highway speed tables. Consult the OSRM profiles documentation before extraction, since the profile choice is baked into the .osrm files and cannot be swapped without re-extracting.

Step 4: Launch the routing daemon

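Start osrm-routed as a detached container with a restart policy. The memory limit reserves headroom for the Docker daemon and host OS. Docker's --memory flag takes an absolute size (for example 14g), not a percentage, so choose roughly 90% of the instance's RAM. Setting --memory-swap to the same value as --memory leaves the container with no swap (a value of -1 would instead allow unlimited swap); this matters because swapping out OSRM's routing data collapses query performance.

# 14g is roughly 90% of a 16 GiB instance; adjust both values to your instance size.
# --memory-swap equal to --memory means the container cannot use swap.
docker run -d \
  --name osrm-server \
  --restart unless-stopped \
  --memory=14g \
  --memory-swap=14g \
  -p 5000:5000 \
  -v /mnt/osrm-data:/data \
  ghcr.io/project-osrm/osrm-backend \
  osrm-routed \
    --algorithm mld \
    --max-table-size 1000 \
    /data/california-latest.osrm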
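
Docker's `--memory` flag expects an absolute size rather than a percentage, and a container has no swap when `--memory-swap` equals `--memory`. The helper below is a small sketch (the function name is hypothetical) that turns an instance's RAM and a target fraction into the two flag values.

```python
def docker_memory_flags(instance_ram_gib: float, fraction: float = 0.9) -> list[str]:
    """Return --memory/--memory-swap flags capping the container at a share of RAM."""
    if not 0 < fraction < 1:
        raise ValueError("fraction must be between 0 and 1")
    limit_mib = int(instance_ram_gib * 1024 * fraction)
    # Equal values for both flags leave the container without swap.
    return [f"--memory={limit_mib}m", f"--memory-swap={limit_mib}m"]


print(" ".join(docker_memory_flags(16)))   # flags for a 16 GiB instance
print(" ".join(docker_memory_flags(32, fraction=0.85)))
```

The output can be pasted into the `docker run` command or passed along when the command is assembled programmatically.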

Step 5: Validate the API and integrate with Python

# Test a driving route: LA to San Diego
curl "http://localhost:5000/route/v1/driving/-118.2437,34.0522;-117.1611,32.7157?overview=full&geometries=polyline6"

A successful response is a JSON object containing a routes array with geometry, duration (seconds), and distance (metres). Zero results or a NoRoute code indicates a graph connectivity issue in the extract.

For Python integration, query the endpoint with httpx (which offers both sync and async clients) or requests, decode the encoded polyline with the polyline library, and feed coordinates into geopandas for spatial joins. The example below uses httpx's synchronous API:

# requires: httpx, polyline, geopandas, shapely
import httpx
import polyline
import geopandas as gpd
from shapely.geometry import LineString

OSRM_BASE = "http://localhost:5000"

def get_route(origin: tuple[float, float], dest: tuple[float, float]) -> dict:
    """Return OSRM route dict for (lon, lat) origin and destination."""
    coords = f"{origin[0]},{origin[1]};{dest[0]},{dest[1]}"
    url = f"{OSRM_BASE}/route/v1/driving/{coords}"
    params = {"overview": "full", "geometries": "polyline6", "steps": "false"}
    response = httpx.get(url, params=params, timeout=10.0)
    response.raise_for_status()
    return response.json()

def route_to_geodataframe(route_json: dict) -> gpd.GeoDataFrame:
    """Decode polyline6 geometry and return a GeoDataFrame (EPSG:4326)."""
    encoded = route_json["routes"][0]["geometry"]
    coords = [(lon, lat) for lat, lon in polyline.decode(encoded, precision=6)]
    line = LineString(coords)
    props = {
        "duration_s": route_json["routes"][0]["duration"],
        "distance_m": route_json["routes"][0]["distance"],
    }
    return gpd.GeoDataFrame([props], geometry=[line], crs="EPSG:4326")

# Example usage
route_data = get_route((-118.2437, 34.0522), (-117.1611, 32.7157))
gdf = route_to_geodataframe(route_data)
print(gdf[["duration_s", "distance_m"]])

Key parameters and tuning

Parameter	Flag / Config	Recommended range	Notes
Algorithm	`--algorithm mld`	MLD for production	CH is faster for single routes but cannot accept live traffic updates
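Matrix limit	`--max-table-size`	500–2000	Per-request RAM grows with the number of matrix cells (sources × destinations), not linearly with the limit
Container memory cap	`--memory`	85–92% of instance RAM, given as an absolute value (e.g. `14g`)	Leaves room for OS kernel page cache and Docker daemon
Swap	`--memory-swap`	Set equal to `--memory`	Gives the container no swap (`-1` would allow unlimited swap); swapping routing data causes multi-second query stalls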
EBS throughput	`gp3` baseline	≥125 MB/s, ≥3000 IOPS	Preprocessing reads/writes sequentially; throughput matters more than IOPS burst
Restart policy	`--restart unless-stopped`	Required in production	Survives instance reboot and Docker daemon restart without manual intervention
HTTP threads	`--threads N`	CPU core count	Defaults to hardware concurrency; set explicitly for predictable behaviour

Integration points

The OSRM HTTP API output connects directly into several downstream workflows available on this site:

Distance matrix generation: The /table endpoint returns a full N×M duration or distance matrix in a single request. Feed the response matrix into a NumPy array and pass it to a vehicle routing solver (OR-Tools, python-mip) as the travel-time cost matrix.
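Isochrone boundary computation: OSRM does not natively produce isochrone polygons, but you can approximate them by snapping a grid of candidate points to the network with /nearest, keeping the points whose travel duration from a centre is within the time budget (one /table call with sources=0 returns all of these durations at once, which is cheaper than one /route call per point), and computing the alpha-hull boundary with GeoPandas-based isochrone generation.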
Traffic weight overlays: After the initial MLD preprocessing, speed updates can be applied by re-running osrm-customize with its --segment-speed-file option pointing at a new speed CSV (from OSM node ID, to OSM node ID, speed in km/h) without re-extracting the graph. Restart osrm-routed afterwards so that it loads the updated files. This is the primary mechanism for integrating custom traffic weights into OSRM.
NetworkX comparison: For small subgraph analyses or what-if scenarios, you can extract a bounding-box subset from the same PBF and load it into NetworkX shortest-path algorithms for logistics alongside the OSRM server to cross-validate route costs.

Validation checklist

Run these checks after completing the deployment before routing live traffic through the server:

Route JSON structure: curl "http://localhost:5000/route/v1/driving/-118.2437,34.0522;-117.1611,32.7157?overview=full" (quoted, because the shell would otherwise treat ; as a command separator) returns "code": "Ok" and a routes array with non-zero duration and distance.
Snap accuracy: Query /nearest/v1/driving/<lon>,<lat> for several known road junctions and confirm the snapped node is within 20 m of the expected coordinate.
Table endpoint: curl "http://localhost:5000/table/v1/driving/-118.2437,34.0522;-117.1611,32.7157;-118.4912,34.0195" returns a 3×3 durations matrix with no null cells.
Memory headroom: Run docker stats osrm-server under a moderate query load (50 concurrent routes) and confirm RSS stays below the container memory limit.
Container restart persistence: docker restart osrm-server and re-query — the server should recover in under 10 seconds for city-scale graphs (RAM is re-mapped from EBS on startup).
One-way enforcement: Route from A→B and B→A on a known oneway=yes road segment. The reverse direction should return a detour rather than the direct path.
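
The route and table checks in this list lend themselves to automation. The following is a sketch only: the function names are hypothetical and the two dictionaries are hand-written stand-ins for parsed JSON responses, not captured server output.

```python
def check_route_response(resp: dict) -> list[str]:
    """Return the problems found in a /route response (empty list means pass)."""
    problems = []
    if resp.get("code") != "Ok":
        problems.append(f"unexpected code: {resp.get('code')!r}")
    routes = resp.get("routes") or []
    if not routes:
        problems.append("routes array is empty")
    else:
        for key in ("duration", "distance"):
            if not routes[0].get(key):
                problems.append(f"{key} is missing or zero")
    return problems


def check_table_response(resp: dict, size: int) -> list[str]:
    """Return the problems found in a size x size /table response."""
    problems = []
    if resp.get("code") != "Ok":
        problems.append(f"unexpected code: {resp.get('code')!r}")
    durations = resp.get("durations") or []
    if len(durations) != size or any(len(row) != size for row in durations):
        problems.append(f"durations is not a {size}x{size} matrix")
    if any(cell is None for row in durations for cell in row):
        problems.append("durations contains null cells")
    return problems


# Hand-written sample payloads for illustration.
sample_route = {"code": "Ok", "routes": [{"duration": 7200.5, "distance": 193000.0}]}
sample_table = {
    "code": "Ok",
    "durations": [[0.0, 7200.5, 1500.0], [7150.2, 0.0, None], [1480.9, 8100.3, 0.0]],
}

print(check_route_response(sample_route))
print(check_table_response(sample_table, size=3))
```

In a deployment script, the dictionaries would come from `response.json()` on the two curl-equivalent requests, and a non-empty problem list would fail the rollout.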

Why does osrm-extract run out of memory even on a 16 GB instance?

The extract stage builds an in-memory node/way/relation index before flushing to disk. For country-scale PBF files, peak RAM during extraction can reach 2–3× the final .osrm dataset size. Either upgrade to a 32 GB instance, or pre-filter the PBF with osmium extract --bbox to reduce the input area before running osrm-extract.
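
The 2–3× rule of thumb translates into a quick pre-flight check. This sketch applies exactly that multiplier range and nothing more; the function names are hypothetical, and real peak usage depends on the extract and the profile.

```python
def extract_peak_ram_range_gb(osrm_size_gb: float) -> tuple[float, float]:
    """Estimated peak RAM during osrm-extract: 2x to 3x the final dataset size."""
    return (osrm_size_gb * 2.0, osrm_size_gb * 3.0)


def extraction_fits(osrm_size_gb: float, instance_ram_gb: float) -> bool:
    """True only if the instance covers the upper end of the estimate."""
    _, high = extract_peak_ram_range_gb(osrm_size_gb)
    return instance_ram_gb >= high


low, high = extract_peak_ram_range_gb(10)
print(f"10 GB dataset: plan for {low:.0f}-{high:.0f} GB peak during extraction")
print("fits in 16 GB:", extraction_fits(10, 16))
print("fits in 32 GB:", extraction_fits(10, 32))
```

For a 10 GB dataset the estimate exceeds a 16 GB instance at both ends of the range, which matches the failure described in the question.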

Should I use MLD or Contraction Hierarchies for fleet routing?

Use MLD for production deployments that require dynamic traffic weight updates or large distance-matrix queries. CH produces marginally faster individual route lookups but cannot accept live traffic overlays after preprocessing — any speed update requires a full rebuild. MLD’s partition/customize split lets you swap traffic data without re-extracting the graph.
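
To make the customize-only update concrete, the sketch below writes a segment speed file in the three-column form OSRM reads (from OSM node ID, to OSM node ID, speed in km/h, no header) and builds the matching osrm-customize command with its `--segment-speed-file` option. The node IDs, speeds, file name, and function names are invented for illustration.

```python
import csv
import io

IMAGE = "ghcr.io/project-osrm/osrm-backend"


def write_speed_csv(rows, fh) -> None:
    """Write (from_node, to_node, speed_kmh) rows without a header line."""
    writer = csv.writer(fh, lineterminator="\n")
    for from_node, to_node, speed_kmh in rows:
        if speed_kmh < 0:
            raise ValueError("speed must not be negative")
        writer.writerow([from_node, to_node, speed_kmh])


def customize_command(data_dir: str, osrm_name: str, speed_file: str) -> list[str]:
    """docker command that re-runs only the customize stage with new speeds."""
    return [
        "docker", "run", "--rm", "-t", "-v", f"{data_dir}:/data", IMAGE,
        "osrm-customize", f"/data/{osrm_name}",
        "--segment-speed-file", f"/data/{speed_file}",
    ]


# Hypothetical node IDs and speeds.
updates = [(1001, 1002, 45), (1002, 1003, 30)]
buffer = io.StringIO()
write_speed_csv(updates, buffer)
print(buffer.getvalue(), end="")

cmd = customize_command("/mnt/osrm-data", "california-latest.osrm", "speeds.csv")
print(" ".join(cmd))
```

In practice the rows would be written to a file on the data volume, the command run with `subprocess.run(cmd, check=True)`, and the server restarted (for example with `docker restart osrm-server`) so it picks up the new weights.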

The /table endpoint returns "Max locations exceeded" — how do I fix it?

The --max-table-size flag controls the matrix dimension limit. Restart the container with a higher value (e.g. --max-table-size 5000), but monitor RAM: a 5000×5000 table allocates approximately 200 MB per request. For very large matrices, batch requests in Python by splitting origin and destination lists with itertools.islice and issuing multiple smaller table calls.
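
One way to batch with itertools.islice is sketched below. It splits the origin and destination lists into blocks and builds one /table URL per block pair, using the `sources` and `destinations` query parameters to mark which coordinates play which role. The function names and the tiny coordinate lists are illustrative assumptions.

```python
from itertools import islice

OSRM_BASE = "http://localhost:5000"


def chunked(items, size: int):
    """Yield consecutive lists of at most `size` items."""
    it = iter(items)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def table_batches(origins: list, destinations: list, chunk: int):
    """Yield (origin_offset, dest_offset, url) for every block of the full matrix."""
    for oi, o_chunk in enumerate(chunked(origins, chunk)):
        for di, d_chunk in enumerate(chunked(destinations, chunk)):
            coords = ";".join(f"{lon},{lat}" for lon, lat in o_chunk + d_chunk)
            # Origins occupy the first indices, destinations the remainder.
            src = ";".join(str(i) for i in range(len(o_chunk)))
            dst = ";".join(str(len(o_chunk) + i) for i in range(len(d_chunk)))
            url = (
                f"{OSRM_BASE}/table/v1/driving/{coords}"
                f"?sources={src}&destinations={dst}"
            )
            yield oi * chunk, di * chunk, url


# Illustrative coordinates only.
origins = [(-118.0 - 0.01 * i, 34.0) for i in range(5)]
destinations = [(-117.0 - 0.01 * i, 33.0) for i in range(3)]

batches = list(table_batches(origins, destinations, chunk=2))
for row_offset, col_offset, url in batches:
    print(row_offset, col_offset, url)
```

Each response's `durations` block can then be copied into the full matrix at `(row_offset, col_offset)`. The chunk size should be chosen so that every request stays within the server's `--max-table-size`.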

Related

Deploying OSRM with Docker for Local Routing — the full guide covering Docker Compose patterns, Nginx reverse proxy configuration, and rolling PBF update strategies
Integrating custom traffic weights into OSRM — applying speed CSV overlays via osrm-customize for live traffic without full re-extraction
NetworkX shortest-path algorithms for logistics — in-process Python graph routing for smaller networks and algorithm comparison
Generating isochrones with PySAL and GeoPandas — building reachability polygons on top of OSRM duration output