This guide covers the specific infrastructure and configuration decisions required when running Open Source Routing Machine (OSRM) on AWS EC2 via Docker — from selecting the right instance family through the Multi-Level Dijkstra (MLD) preprocessing pipeline to serving a production-grade routing API. It is a concrete operational variant of the broader OSRM Docker deployment workflow, sitting within the Python Routing Engines & Isochrone Mapping engineering stack. The decisions here — instance type, EBS IOPS, Docker memory flags, and MLD vs. CH algorithm choice — determine whether the server handles production traffic reliably or collapses under peak load.
When to use this approach
Running OSRM on EC2 is the right infrastructure choice when:
- Latency requirements are strict. Self-hosted OSRM on a co-located EC2 instance eliminates the round-trip overhead and rate limits of third-party routing APIs — important for logistics dashboards that query thousands of routes per minute.
- Network data is proprietary or regulated. If your OSM extract is augmented with confidential speed surveys, freight restriction overlays, or hazmat routing rules, those weights cannot be shared with external API providers.
- Matrix query volume is high. The OSRM /table endpoint handles origin-destination matrices in a single HTTP call, but only for datasets loaded into RAM. A dedicated EC2 instance sized to the graph keeps response times under 100 ms for matrices up to 500×500.
- Traffic update frequency is moderate. MLD allows you to run osrm-customize with a new speed CSV every few minutes without re-extracting the full graph. If you need sub-minute traffic freshness, consider a managed service; for hourly or daily updates, this pipeline is cost-effective.
This approach suits city- to country-scale extracts. Continental datasets (Europe, North America full) require ≥128 GB RAM instances and should use a purpose-built cluster or a managed OSRM service.
Implementation
The five-step deployment below is self-contained. Steps 1–2 are infrastructure setup; step 3 downloads the extract and runs the preprocessing pipeline; step 4 launches the server; step 5 validates the API and connects it to Python.
Step 1: Provision EC2 and mount high-IOPS EBS storage
OSRM maps the entire routing graph into RAM using memory-mapped files, so instance selection is primarily a memory sizing exercise. Use the r6i or r7i families — their high memory bandwidth suits mmap I/O patterns. Avoid burstable t3/t4g instances: the CPU-intensive extraction stage will exhaust burst credits and stall.
RAM sizing by extract scope:
| Extract scope | Typical .osrm size | Recommended RAM |
|---|---|---|
| Single city / metro | 1–3 GB | 8 GB (the smallest r6i size, r6i.large, already provides 16 GiB) |
| US state (e.g. California) | 4–10 GB | 16 GB (r6i.large) |
| Country (e.g. Germany) | 10–25 GB | 32 GB (r6i.xlarge) |
| Multi-country / region | 25–80 GB | 64–128 GB (r6i.2xlarge to r6i.4xlarge) |
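For provisioning scripts, the sizing table can be expressed as a small lookup. The sketch below is a hypothetical helper (`recommend_ram_gb` is not part of OSRM or any AWS tooling); its thresholds are the upper bounds of the dataset sizes in the table, and for the largest tier it conservatively returns the top of the 64–128 GB range.

```python
# Hypothetical helper: maps an expected .osrm dataset size to the RAM tier
# from the sizing table. Each entry is (max dataset size in GB, RAM in GB).
RAM_TIERS_GB = [
    (3, 8),     # single city / metro
    (10, 16),   # US state
    (25, 32),   # country
    (80, 128),  # multi-country / region (upper end of the 64-128 GB range)
]


def recommend_ram_gb(osrm_size_gb: float) -> int:
    """Return the recommended instance RAM (GB) for a dataset size (GB)."""
    if osrm_size_gb <= 0:
        raise ValueError("dataset size must be positive")
    for max_size_gb, ram_gb in RAM_TIERS_GB:
        if osrm_size_gb <= max_size_gb:
            return ram_gb
    raise ValueError("dataset is larger than any tier in the sizing table")


print(recommend_ram_gb(6.5))
```

A 6.5 GB state-level dataset falls in the second tier, so the call returns 16.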
Attach a secondary gp3 EBS volume (500 GB minimum for state-level extracts) with ≥3000 IOPS and ≥125 MB/s throughput — preprocessing is heavily sequential-read/write, so throughput matters more than peak IOPS.
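# Identify the new block device. Nitro-based instances such as r6i expose EBS
# volumes as NVMe devices (typically /dev/nvme1n1); older Xen instances use /dev/xvdf.
lsblk
DEVICE=/dev/nvme1n1   # adjust to match the lsblk output
# Format ext4, create mount point, mount, and make it writable for the current user
sudo mkfs.ext4 "$DEVICE"
sudo mkdir -p /mnt/osrm-data
sudo mount "$DEVICE" /mnt/osrm-data
sudo chown "$USER" /mnt/osrm-data
# Persist the mount by UUID, since NVMe device names can change across reboots
# (nofail prevents a boot hang if the device is missing)
echo "UUID=$(sudo blkid -s UUID -o value "$DEVICE") /mnt/osrm-data ext4 defaults,nofail 0 2" | sudo tee -a /etc/fstab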
Configure the EC2 security group to allow TCP 22 (SSH, from your IP only) and TCP 5000 (OSRM API, from your VPC CIDR or load balancer security group). Never expose port 5000 directly to 0.0.0.0/0; use an Nginx reverse proxy or AWS Application Load Balancer for public traffic.
Step 2: Install Docker Engine from the official repository
Use the official Docker apt repository rather than Ubuntu’s docker.io package, which lags upstream releases and may lack current security patches or the overlay2 storage driver.
# Prerequisites
sudo apt-get update
sudo apt-get install -y ca-certificates curl gnupg
sudo install -m 0755 -d /etc/apt/keyrings
# Add Docker's GPG key and apt source
curl -fsSL https://download.docker.com/linux/ubuntu/gpg \
| sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
sudo chmod a+r /etc/apt/keyrings/docker.gpg
echo \
"deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] \
https://download.docker.com/linux/ubuntu \
$(. /etc/os-release && echo "$VERSION_CODENAME") stable" \
| sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
# Install Docker and add the current user to the docker group
sudo apt-get update
sudo apt-get install -y docker-ce docker-ce-cli containerd.io \
docker-buildx-plugin docker-compose-plugin
sudo usermod -aG docker $USER && newgrp docker
# Sanity check
docker run --rm hello-world
Step 3: Download the OSM PBF extract and run MLD preprocessing
MLD preprocessing has three sequential stages: osrm-extract parses the PBF into a routing graph; osrm-partition divides the graph into a hierarchy of nested cells that the MLD query algorithm searches; osrm-customize computes the routing weights for those cells from the profile's speeds or from a traffic speed CSV. You can re-run osrm-customize alone when speed data changes, without repeating the expensive extract and partition stages.
# Download a Geofabrik extract to the EBS volume
cd /mnt/osrm-data
curl -LO https://download.geofabrik.de/north-america/us/california-latest.osm.pbf
# Stage 1: Extract — parse OSM into OSRM graph format using the car profile
docker run -t \
-v /mnt/osrm-data:/data \
ghcr.io/project-osrm/osrm-backend \
osrm-extract -p /opt/car.lua /data/california-latest.osm.pbf
# Stage 2: Partition — build MLD cell decomposition (CPU-bound, 5–30 min for state scale)
docker run -t \
-v /mnt/osrm-data:/data \
ghcr.io/project-osrm/osrm-backend \
osrm-partition /data/california-latest.osrm
# Stage 3: Customize — write edge weights into MLD structures
docker run -t \
-v /mnt/osrm-data:/data \
ghcr.io/project-osrm/osrm-backend \
osrm-customize /data/california-latest.osrm
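For scheduled rebuilds, the three stages can be driven from Python. The sketch below only mirrors the shell commands above; `build_mld_commands` and `run_pipeline` are hypothetical helper names, and it assumes the PBF file name ends in `.osm.pbf` so the `.osrm` base name can be derived from it.

```python
import subprocess

IMAGE = "ghcr.io/project-osrm/osrm-backend"


def build_mld_commands(data_dir: str, pbf_name: str,
                       profile: str = "/opt/car.lua") -> list[list[str]]:
    """Return docker argv lists for extract, partition and customize, in order."""
    # osrm-extract writes <base>.osrm* files next to the input PBF.
    base = pbf_name.removesuffix(".osm.pbf")
    osrm_path = f"/data/{base}.osrm"
    docker = ["docker", "run", "-t", "-v", f"{data_dir}:/data", IMAGE]
    return [
        docker + ["osrm-extract", "-p", profile, f"/data/{pbf_name}"],
        docker + ["osrm-partition", osrm_path],
        docker + ["osrm-customize", osrm_path],
    ]


def run_pipeline(commands: list[list[str]]) -> None:
    """Run each stage in order; check=True stops at the first failing stage."""
    for argv in commands:
        subprocess.run(argv, check=True)


# Print the commands instead of running them, so the sketch is safe to execute.
for argv in build_mld_commands("/mnt/osrm-data", "california-latest.osm.pbf"):
    print(" ".join(argv))
```

Passing the result to `run_pipeline` would execute the stages. Because each stage depends on the files written by the previous one, aborting on the first non-zero exit code avoids partitioning a half-written graph.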
Profile selection determines which OSM tags are read and how edge costs are computed. The built-in car.lua uses maxspeed=*, highway=* class weights, and oneway=yes directionality. For freight applications, review the profile’s turn penalty table and consider overriding vehicle_speed tables — see the OSRM profiles documentation before extraction, since the profile choice is baked into the .osrm files and cannot be swapped without re-extracting.
Step 4: Launch the routing daemon
Start osrm-routed as a detached container with a restart policy. The memory limit reserves headroom for the Docker daemon and host OS. Docker's --memory flag takes an absolute size rather than a percentage, so the example uses 14g, roughly 88% of a 16 GB instance. Setting --memory-swap to the same value as --memory denies the container any swap (a value of -1 would allow unlimited swap), because swapping OSRM’s mmap regions collapses query performance.
docker run -d \
--name osrm-server \
--restart unless-stopped \
--memory="14g" \
--memory-swap="14g" \
-p 5000:5000 \
-v /mnt/osrm-data:/data \
ghcr.io/project-osrm/osrm-backend \
osrm-routed \
--algorithm mld \
--max-table-size 1000 \
/data/california-latest.osrm
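Docker's `--memory` flag expects an absolute size such as `14g`, so a share-of-RAM target such as 90% has to be converted for each instance size. The sketch below shows one way to do that conversion; `docker_memory_flags` is a hypothetical helper, and the instance RAM is passed in by hand rather than detected.

```python
def docker_memory_flags(instance_ram_gib: float, fraction: float = 0.90) -> list[str]:
    """Return --memory/--memory-swap flags capping a container at a share of RAM."""
    if not 0 < fraction < 1:
        raise ValueError("fraction must be between 0 and 1")
    limit_mib = int(instance_ram_gib * 1024 * fraction)
    # Setting --memory-swap equal to --memory leaves the container with no swap.
    return [f"--memory={limit_mib}m", f"--memory-swap={limit_mib}m"]


print(docker_memory_flags(16))
```

For a 16 GiB instance at 90% this yields a 14745 MiB cap, with the swap limit set to the same value.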
Step 5: Validate the API and integrate with Python
# Test a driving route: LA to San Diego
curl "http://localhost:5000/route/v1/driving/-118.2437,34.0522;-117.1611,32.7157?overview=full&geometries=polyline6"
A successful response is a JSON object containing a routes array with geometry, duration (seconds), and distance (metres). Zero results or a NoRoute code indicates a graph connectivity issue in the extract.
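The same check can be scripted so a failed query surfaces as an exception rather than a silent empty result. This is a minimal sketch: `summarize_route` is a hypothetical helper, and the sample payload is hand-written in the shape described above, with made-up values.

```python
def summarize_route(payload: dict) -> tuple[float, float]:
    """Return (duration_s, distance_m) of the first route, or raise ValueError."""
    code = payload.get("code")
    if code != "Ok":
        raise ValueError(f"OSRM returned code {code!r}: {payload.get('message', '')}")
    routes = payload.get("routes") or []
    if not routes:
        raise ValueError("response contains no routes")
    first = routes[0]
    return first["duration"], first["distance"]


# Hand-written sample payload; the numbers are illustrative, not real output.
sample = {
    "code": "Ok",
    "routes": [{"geometry": "...", "duration": 7260.5, "distance": 193400.2}],
}
print(summarize_route(sample))
```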
For Python integration, query the endpoint with httpx (async) or requests (sync), decode the encoded polyline with the polyline library, and feed coordinates into geopandas for spatial joins:
# requires: httpx, polyline, geopandas, shapely
import httpx
import polyline
import geopandas as gpd
from shapely.geometry import LineString

OSRM_BASE = "http://localhost:5000"


def get_route(origin: tuple[float, float], dest: tuple[float, float]) -> dict:
    """Return OSRM route dict for (lon, lat) origin and destination."""
    coords = f"{origin[0]},{origin[1]};{dest[0]},{dest[1]}"
    url = f"{OSRM_BASE}/route/v1/driving/{coords}"
    params = {"overview": "full", "geometries": "polyline6", "steps": "false"}
    response = httpx.get(url, params=params, timeout=10.0)
    response.raise_for_status()
    return response.json()


def route_to_geodataframe(route_json: dict) -> gpd.GeoDataFrame:
    """Decode polyline6 geometry and return a GeoDataFrame (EPSG:4326)."""
    encoded = route_json["routes"][0]["geometry"]
    # polyline.decode yields (lat, lon); shapely expects (x, y) = (lon, lat)
    coords = [(lon, lat) for lat, lon in polyline.decode(encoded, precision=6)]
    line = LineString(coords)
    props = {
        "duration_s": route_json["routes"][0]["duration"],
        "distance_m": route_json["routes"][0]["distance"],
    }
    return gpd.GeoDataFrame([props], geometry=[line], crs="EPSG:4326")


# Example usage
route_data = get_route((-118.2437, 34.0522), (-117.1611, 32.7157))
gdf = route_to_geodataframe(route_data)
print(gdf[["duration_s", "distance_m"]])
Key parameters and tuning
| Parameter | Flag / Config | Recommended range | Notes |
|---|---|---|---|
| Algorithm | --algorithm mld | MLD for production | CH is faster for single routes but needs a full rebuild for every traffic update |
| Matrix limit | --max-table-size | 500–2000 | Per-request RAM grows with sources × destinations, so it rises quadratically for square matrices |
| Container memory cap | --memory | 85–92% of instance RAM, given as an absolute size (e.g. 14g on a 16 GB instance) | Leaves room for OS kernel page cache and Docker daemon |
| Swap | --memory-swap | Same value as --memory | Denies the container swap (-1 would allow unlimited swap); swapping mmap’d routing data causes multi-second query stalls |
| EBS throughput | gp3 baseline | ≥125 MB/s, ≥3000 IOPS | Preprocessing reads/writes sequentially; throughput matters more than IOPS burst |
| Restart policy | --restart unless-stopped | Required in production | Survives instance reboot and Docker daemon restart without manual intervention |
| HTTP threads | --threads N | CPU core count | Defaults to hardware concurrency; set explicitly for predictable behaviour |
Integration points
The OSRM HTTP API output connects directly into several downstream workflows available on this site:
- Distance matrix generation: The /table endpoint returns a full N×M duration or distance matrix in a single request. Feed the response matrix into a NumPy array and pass it to a vehicle routing solver (OR-Tools, python-mip) as the travel-time cost matrix.
- Isochrone boundary computation: OSRM does not natively produce isochrone polygons, but you can approximate them by querying the /nearest endpoint for a grid of candidate points, filtering by /route duration from a centre, and computing the alpha-hull boundary with GeoPandas-based isochrone generation.
- Traffic weight overlays: After the initial MLD preprocessing, speed updates can be applied by re-running osrm-customize with a new speed CSV (from OSM node ID, to OSM node ID, speed in km/h) passed via --segment-speed-file, without re-extracting the graph. This is the primary mechanism for integrating custom traffic weights into OSRM.
- NetworkX comparison: For small subgraph analyses or what-if scenarios, you can extract a bounding-box subset from the same PBF and load it into NetworkX shortest-path algorithms for logistics alongside the OSRM server to cross-validate route costs.
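The traffic overlay step needs a speed file on disk before osrm-customize runs. The sketch below writes one, assuming the layout described in the OSRM traffic documentation: headerless rows of from-node ID, to-node ID and speed in km/h. `write_speed_csv` is a hypothetical helper, the node IDs in the example are made up, and speeds are rounded to whole km/h as a conservative choice.

```python
import csv
import io


def write_speed_csv(updates, out) -> int:
    """Write (from_node, to_node, speed_kmh) rows without a header; return row count."""
    writer = csv.writer(out, lineterminator="\n")
    count = 0
    for from_node, to_node, speed_kmh in updates:
        if speed_kmh < 0:
            raise ValueError(f"negative speed for segment {from_node}->{to_node}")
        writer.writerow([from_node, to_node, round(speed_kmh)])
        count += 1
    return count


# Made-up OSM node IDs, for illustration only.
buffer = io.StringIO()
rows_written = write_speed_csv([(1001, 1002, 48.6), (1002, 1003, 30)], buffer)
print(rows_written)
print(buffer.getvalue(), end="")
```

In production the `out` argument would be an open file on the mounted data volume, so the container can read it when osrm-customize is re-run.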
Validation checklist
Run these checks after completing the deployment before routing live traffic through the server:
- Route JSON structure: curl "http://localhost:5000/route/v1/driving/-118.2437,34.0522;-117.1611,32.7157?overview=full" returns "code": "Ok" and a routes array with non-zero duration and distance. Quote the URL, because an unquoted ; ends the shell command.
- Snap accuracy: Query /nearest/v1/driving/<lon>,<lat> for several known road junctions and confirm the snapped node is within 20 m of the expected coordinate.
- Table endpoint: curl "http://localhost:5000/table/v1/driving/-118.2437,34.0522;-117.1611,32.7157;-118.4912,34.0195" returns a 3×3 durations matrix with no null cells.
- Memory headroom: Run docker stats osrm-server under a moderate query load (50 concurrent routes) and confirm memory usage stays below the container memory limit.
- Container restart persistence: Run docker restart osrm-server and re-query. The server should recover in under 10 seconds for city-scale graphs (RAM is re-mapped from EBS on startup).
- One-way enforcement: Route from A→B and B→A on a known oneway=yes road segment. The reverse direction should return a detour rather than the direct path.
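Two of these checks, snap accuracy and table completeness, can be automated against parsed JSON responses. The sketch below assumes the /nearest response reports the snap distance in metres in each waypoint's `distance` field and that unreachable table cells arrive as JSON null. The helper names and sample payloads are hypothetical.

```python
def nearest_within(payload: dict, max_distance_m: float = 20.0) -> bool:
    """True if the first snapped waypoint lies within max_distance_m of the query."""
    waypoints = payload.get("waypoints") or []
    if payload.get("code") != "Ok" or not waypoints:
        return False
    return waypoints[0]["distance"] <= max_distance_m


def table_is_complete(payload: dict, n: int) -> bool:
    """True if durations is an n x n matrix with no null (None) cells."""
    rows = payload.get("durations") or []
    if payload.get("code") != "Ok" or len(rows) != n:
        return False
    return all(len(row) == n and None not in row for row in rows)


# Hand-written sample payloads; the values are illustrative only.
nearest_sample = {"code": "Ok", "waypoints": [{"distance": 4.2, "location": [-118.2437, 34.0522]}]}
table_sample = {"code": "Ok", "durations": [[0, 7200.1, 1500.3], [7180.4, 0, 8100.0], [1490.9, 8120.7, 0]]}
print(nearest_within(nearest_sample), table_is_complete(table_sample, 3))
```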
Why does osrm-extract run out of memory even on a 16 GB instance?
The extract stage builds an in-memory node/way/relation index before flushing to disk. For country-scale PBF files, peak RAM during extraction can reach 2–3× the final .osrm dataset size. Either upgrade to a 32 GB instance, or pre-filter the PBF with osmium extract --bbox to reduce the input area before running osrm-extract.
Should I use MLD or Contraction Hierarchies for fleet routing?
Use MLD for production deployments that require dynamic traffic weight updates or large distance-matrix queries. CH produces marginally faster individual route lookups but cannot accept live traffic overlays after preprocessing — any speed update requires a full rebuild. MLD’s partition/customize split lets you swap traffic data without re-extracting the graph.
The /table endpoint returns "Max locations exceeded" — how do I fix it?
The --max-table-size flag controls the matrix dimension limit. Restart the container with a higher value (e.g. --max-table-size 5000), but monitor RAM: a 5000×5000 table allocates approximately 200 MB per request. For very large matrices, batch requests in Python by splitting origin and destination lists with itertools.islice and issuing multiple smaller table calls.
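The batching approach can be sketched as follows. `chunked` and `table_batches` are hypothetical helpers. As a conservative assumption, each batch keeps origins plus destinations within the configured limit by using chunks of half that size.

```python
from itertools import islice, product


def chunked(items, size: int):
    """Yield consecutive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def table_batches(n_origins: int, n_destinations: int, max_table_size: int):
    """Yield (origin_indices, destination_indices) pairs covering the full matrix."""
    # Half the limit per side keeps origins + destinations within max_table_size.
    size = max(1, max_table_size // 2)
    origin_chunks = list(chunked(range(n_origins), size))
    destination_chunks = list(chunked(range(n_destinations), size))
    yield from product(origin_chunks, destination_chunks)


for origins, destinations in table_batches(5, 3, 4):
    print(origins, destinations)
```

Each pair maps to one /table call whose coordinate list is the batch's origins followed by its destinations, with the sources and destinations query parameters indexing into that list. The returned sub-matrices are then written back into the full matrix at the yielded indices.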
Related
- Deploying OSRM with Docker for Local Routing: the full guide covering Docker Compose patterns, Nginx reverse proxy configuration, and rolling PBF update strategies
- Integrating custom traffic weights into OSRM: applying speed CSV overlays via osrm-customize for live traffic without full re-extraction
- NetworkX shortest-path algorithms for logistics: in-process Python graph routing for smaller networks and algorithm comparison
- Generating isochrones with PySAL and GeoPandas: building reachability polygons on top of OSRM duration output