Valhalla’s /sources_to_targets endpoint produces an N×M table of travel times, distances, and status codes between arbitrary origin and destination coordinate sets in a single HTTP call — eliminating the sequential route loops that make naive matrix construction prohibitively slow at city scale. This page covers the specific variant used in urban planning contexts: multi-origin, multi-destination accessibility matrices for facility siting, equity analysis, and commute shed modelling. It sits within the broader Valhalla configuration for multi-modal analysis workflow and feeds results into the spatial analysis pipelines described across Python Routing Engines & Isochrone Mapping.
When to Use This Approach
The matrix endpoint is the right tool when the study requires discrete origin-destination pairs rather than continuous coverage polygons. Specific conditions that point to /sources_to_targets over isochrone generation:
- Facility siting under equity constraints. Evaluating 200 candidate clinic locations against 5,000 census tract centroids requires 1,000,000 travel-time lookups. A single chunked matrix job completes this in minutes; sequential /route calls would take hours.
- Gravity model inputs. Hansen-type accessibility indices and cumulative opportunity measures require time or distance between every origin and every destination in a study layer, which is exactly the matrix shape the endpoint returns.
- Transit corridor analysis. Time-dependent matrices with costing: "transit" and GTFS feeds quantify headway-sensitive reachability at specific departure windows, enabling comparison between peak and off-peak commute sheds.
- Fleet dispatch pre-computation. Caching an N×M time matrix for depot-to-customer pairs at graph-build time eliminates live routing overhead during vehicle dispatch decisions.
The approach is less appropriate when you need full turn-by-turn instructions, when origins and destinations overlap heavily (isochrones share computation more efficiently in that case), or when your coordinate sets exceed the practical per-request limits and network round-trip overhead dominates processing time.
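To make the facility-siting arithmetic concrete, the sketch below counts how many HTTP requests a chunked job needs. It assumes the 50-per-dimension chunk size used by the implementation on this page; `chunk_request_count` is a hypothetical helper name, not part of Valhalla.

```python
import math


def chunk_request_count(n_sources: int, n_targets: int, chunk_size: int = 50) -> int:
    """Number of requests needed to cover an n_sources x n_targets matrix.

    Each request carries at most chunk_size sources and chunk_size targets
    (assumed limit; adjust to match the server's configuration).
    """
    source_chunks = math.ceil(n_sources / chunk_size)
    target_chunks = math.ceil(n_targets / chunk_size)
    return source_chunks * target_chunks


# 200 candidate clinics against 5,000 tract centroids:
# 4 source chunks x 100 target chunks.
n_requests = chunk_request_count(200, 5_000)
print(n_requests)  # 400
```

Under these assumptions the 1,000,000-cell job becomes 400 requests of up to 2,500 cells each, compared with one call per pair for a sequential /route loop.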
Matrix vs. Isochrone Decision
The SVG below summarises the decision criteria between matrix requests and isochrone polygon generation for common urban planning tasks:
Implementation
The implementation below is self-contained and focuses on production concerns specific to the matrix variant: chunked batching, vectorised response parsing, and status-code filtering. It avoids repeating the Valhalla environment setup covered in the Valhalla configuration for multi-modal analysis guide.
# Requires: requests>=2.31, pandas>=2.0, numpy>=1.24
from __future__ import annotations

import itertools
from concurrent.futures import ThreadPoolExecutor, as_completed

import numpy as np
import pandas as pd
import requests

VALHALLA_URL = "http://localhost:8002/sources_to_targets"

# Maximum coordinate count per dimension; Valhalla default is 50×50.
CHUNK_SIZE = 50


def _build_payload(
    sources: list[dict],
    targets: list[dict],
    costing: str = "auto",
    costing_options: dict | None = None,
    date_time: dict | None = None,
) -> dict:
    """Construct the sources_to_targets request body."""
    payload: dict = {
        "sources": sources,
        "targets": targets,
        "costing": costing,
    }
    if costing_options:
        # Options are nested under the costing model name they apply to.
        payload["costing_options"] = {costing: costing_options}
    if date_time:
        payload["date_time"] = date_time
    return payload


def _request_chunk(
    sources_chunk: list[dict],
    targets_chunk: list[dict],
    source_offset: int,
    target_offset: int,
    **kwargs,
) -> pd.DataFrame:
    """
    POST one chunk to Valhalla and return a tidy DataFrame fragment.

    Rows with status != 0 are retained but flagged for downstream filtering.
    """
    payload = _build_payload(sources_chunk, targets_chunk, **kwargs)
    response = requests.post(VALHALLA_URL, json=payload, timeout=60)
    response.raise_for_status()
    data = response.json()

    # sources_to_targets[i] is the full list of target cells for source i.
    # sources_to_targets[i][j] is the specific cell for source i → target j.
    cells = data["sources_to_targets"]

    # Build flat index arrays first so every output column is assigned at once.
    n_src = len(sources_chunk)
    n_tgt = len(targets_chunk)
    src_idx = np.repeat(np.arange(n_src), n_tgt)
    tgt_idx = np.tile(np.arange(n_tgt), n_src)
    flat_cells = [cells[i][j] for i, j in zip(src_idx, tgt_idx)]

    # Missing "time"/"distance" values become <NA>. A cell without a "status"
    # key falls back to -1, so it will not pass a later status == 0 filter.
    df = pd.DataFrame({
        "source_idx": src_idx + source_offset,
        "target_idx": tgt_idx + target_offset,
        "time_sec": pd.array([c.get("time") for c in flat_cells], dtype="Float64"),
        "distance_km": pd.array([c.get("distance") for c in flat_cells], dtype="Float64"),
        "status": pd.array([c.get("status", -1) for c in flat_cells], dtype="Int8"),
    })
    return df


def compute_od_matrix(
    sources: list[dict],
    targets: list[dict],
    costing: str = "auto",
    costing_options: dict | None = None,
    date_time: dict | None = None,
    max_workers: int = 4,
) -> pd.DataFrame:
    """
    Compute a full origin-destination travel-time matrix via Valhalla's
    sources_to_targets endpoint, chunked and parallelised.

    Parameters
    ----------
    sources : list of {"lon": float, "lat": float} dicts (origins)
    targets : list of {"lon": float, "lat": float} dicts (destinations)
    costing : Valhalla costing model: "auto", "pedestrian", "bicycle",
        "transit", "truck", etc.
    costing_options : dict of model-specific penalty overrides (optional)
    date_time : {"type": 1, "value": "2026-06-23T08:00"} for transit (optional)
    max_workers : thread-pool size for parallel chunk requests

    Returns
    -------
    DataFrame with columns [source_idx, target_idx, time_sec, distance_km, status,
    time_min], filtered to status == 0.
    """
    # Build (source_offset, target_offset) pairs for every chunk combination.
    src_offsets = range(0, len(sources), CHUNK_SIZE)
    tgt_offsets = range(0, len(targets), CHUNK_SIZE)
    chunks = list(itertools.product(src_offsets, tgt_offsets))

    futures = {}
    results: list[pd.DataFrame] = []

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for s_off, t_off in chunks:
            src_chunk = sources[s_off: s_off + CHUNK_SIZE]
            tgt_chunk = targets[t_off: t_off + CHUNK_SIZE]
            future = pool.submit(
                _request_chunk,
                src_chunk, tgt_chunk, s_off, t_off,
                costing=costing,
                costing_options=costing_options,
                date_time=date_time,
            )
            futures[future] = (s_off, t_off)

        for future in as_completed(futures):
            s_off, t_off = futures[future]
            try:
                results.append(future.result())
            except requests.RequestException as exc:
                raise RuntimeError(
                    f"Matrix chunk (src_offset={s_off}, tgt_offset={t_off}) failed: {exc}"
                ) from exc

    matrix = pd.concat(results, ignore_index=True)

    # Retain only routable pairs; non-zero status signals snapping failure,
    # disconnected graph segment, or out-of-bounds coordinate.
    valid = matrix[matrix["status"] == 0].copy()
    valid["time_min"] = valid["time_sec"] / 60.0
    return valid.sort_values(["source_idx", "target_idx"]).reset_index(drop=True)
Transit Time-Dependent Matrix
For public transit studies, pass costing="transit" and a date_time block. Valhalla aligns routes against the loaded GTFS schedule, producing headway-sensitive travel times rather than free-flow averages:
# Requires: requests>=2.31, pandas>=2.0, numpy>=1.24
# (uses compute_od_matrix from the snippet above)
transit_matrix = compute_od_matrix(
    sources=neighbourhood_centroids,  # list of {"lon": ..., "lat": ...}
    targets=transit_hub_coords,
    costing="transit",
    date_time={"type": 1, "value": "2026-06-23T08:00"},  # depart-at
    max_workers=2,  # lower concurrency for GTFS-heavy queries
)

# Pivot to wide format: rows = origins, columns = transit hubs
pivot = transit_matrix.pivot(
    index="source_idx", columns="target_idx", values="time_min"
)
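Comparing a peak and an off-peak run can be sketched as follows. This is a minimal, standard-library illustration, assuming each run has already been reduced to a dict keyed by (source_idx, target_idx) with time_min values; `travel_time_delta` and the sample numbers are hypothetical.

```python
def travel_time_delta(
    peak: dict[tuple[int, int], float],
    off_peak: dict[tuple[int, int], float],
) -> dict[tuple[int, int], float]:
    """Peak minus off-peak minutes for every pair present in both runs.

    Pairs that were routable in only one run are skipped, because a
    difference cannot be computed for them.
    """
    shared_pairs = peak.keys() & off_peak.keys()
    return {pair: peak[pair] - off_peak[pair] for pair in sorted(shared_pairs)}


# Illustrative values only, not measured results.
peak_times = {(0, 0): 42.0, (0, 1): 55.0, (1, 0): 30.0}
off_peak_times = {(0, 0): 30.0, (0, 1): 55.0, (1, 1): 18.0}

delta = travel_time_delta(peak_times, off_peak_times)
print(delta)  # {(0, 0): 12.0, (0, 1): 0.0}
```

With pandas output, such a dict could be built from each matrix with `dict(zip(zip(df.source_idx, df.target_idx), df.time_min))`.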
Key Parameters and Tuning
| Parameter | Recommended Value | Sensitivity Notes |
|---|---|---|
| costing | "auto" / "pedestrian" / "transit" / "truck" | Determines graph traversal rules, speed defaults, and access restrictions; changing mode invalidates cached matrices |
| use_ferry | 0.0–1.0 (default 0.5) | Values near 0 actively avoid ferry crossings; urban studies without waterways should set 0.0 to eliminate edge-case routing artefacts |
| use_tolls | 0.0 (equity studies) | Set 0.0 for studies measuring equitable access; toll roads skew travel times for low-income origin zones |
| maneuver_penalty (seconds) | 5–30 | Higher values penalise complex intersections; calibrate against GPS traces to match observed urban intersection delay |
| CHUNK_SIZE | 50 | Cap from Valhalla's default service limits; reduce to 25 if requests time out on dense urban graphs with transit costing |
| max_workers | 4 (auto/pedestrian), 2 (transit) | Transit queries carry higher per-request compute cost due to GTFS schedule alignment; over-threading stalls the tile cache |
| date_time.type | 1 (depart at) | Type 2 (arrive by) computes backward searches; useful for catchment studies but doubles graph traversal cost |
| Tileset bounding box | Study area + 10 km buffer | Tight bounding boxes improve cold-start speed but produce status=2 errors for near-boundary coordinates |
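As an illustration of how these parameters reach the request, the sketch below assembles a matrix body with two of the overrides from the table. It mirrors the nesting used by the implementation on this page, where options sit under the name of the costing model; the coordinates and the `build_matrix_body` helper are placeholders chosen for the example.

```python
import json


def build_matrix_body(sources, targets, costing, options):
    """Assemble a sources_to_targets body with per-model costing options."""
    return {
        "sources": sources,
        "targets": targets,
        "costing": costing,
        # Options apply to one costing model, so they are keyed by its name.
        "costing_options": {costing: options},
    }


body = build_matrix_body(
    sources=[{"lon": 13.40, "lat": 52.52}],   # placeholder origin
    targets=[{"lon": 13.45, "lat": 52.50}],   # placeholder destination
    costing="auto",
    options={"use_ferry": 0.0, "maneuver_penalty": 10},
)
print(json.dumps(body, indent=2))
```

Keeping the overrides in one dict per study makes it easier to record which settings produced a cached matrix, since changing them changes the results.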
Coordinate snapping failures (status=2) are the most common field problem in multi-city studies. They occur when coordinates fall outside the loaded tileset or on road classes excluded by the costing model — for example, access=private ways when using "auto" costing. Pre-validate all coordinates with the /locate endpoint and cache the snapped node IDs; this also cuts per-request overhead by 15–30% on repeated queries against the same POI dataset.
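The pre-validation step might look like the sketch below. It rests on two assumptions that should be checked against your Valhalla version: that /locate is served on the same host and port as the matrix endpoint, and that it returns one result per input location whose "edges" entry is empty or null when nothing could be snapped. `fetch_locate` and `split_snappable` are hypothetical helper names, and the response shown is a hand-written stand-in rather than real server output.

```python
import json
import urllib.request

LOCATE_URL = "http://localhost:8002/locate"  # assumed host and port


def fetch_locate(coords, costing="auto"):
    """POST coordinates to /locate and return decoded JSON (needs a live server)."""
    body = json.dumps({"locations": coords, "costing": costing}).encode("utf-8")
    request = urllib.request.Request(
        LOCATE_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)


def split_snappable(coords, locate_results):
    """Separate coordinates that snapped to at least one edge from those that did not.

    Assumes locate_results is ordered like coords, one entry per coordinate.
    """
    snapped, failed = [], []
    for coord, result in zip(coords, locate_results):
        (snapped if result.get("edges") else failed).append(coord)
    return snapped, failed


# Stand-in response for demonstration; a real run would call fetch_locate(coords).
coords = [{"lon": 13.40, "lat": 52.52}, {"lon": 0.0, "lat": 0.0}]
mock_results = [{"edges": [{"way_id": 1}]}, {"edges": None}]

snapped, failed = split_snappable(coords, mock_results)
print(len(snapped), len(failed))  # 1 1
```

Running this before the matrix job lets failed coordinates be corrected or dropped up front instead of surfacing later as missing cells.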
Integration Points
The output DataFrame from compute_od_matrix — with columns source_idx, target_idx, time_min, and distance_km — connects directly to several downstream spatial workflows:
Gravity-model accessibility indices. Join time_min against a destination-opportunity table (jobs, services, transit capacity) and apply a decay function (exp(-β × time_min)) to compute Hansen accessibility scores per origin zone. Merge with census boundary GeoDataFrames via geopandas spatial joins for choropleth mapping.
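The decay-weighted sum described above can be sketched without any dependencies. The example assumes rows of (source_idx, target_idx, time_min) and an opportunity count per target index; `hansen_accessibility`, the β value, and the sample data are illustrative choices, not calibrated parameters.

```python
import math
from collections import defaultdict


def hansen_accessibility(od_rows, opportunities, beta):
    """Sum of opportunities at each target, discounted by exp(-beta * time_min).

    od_rows       : iterable of (source_idx, target_idx, time_min)
    opportunities : dict mapping target_idx to an opportunity count
    beta          : decay parameter per minute (must be calibrated per study)
    """
    scores = defaultdict(float)
    for source_idx, target_idx, time_min in od_rows:
        weight = math.exp(-beta * time_min)
        scores[source_idx] += opportunities.get(target_idx, 0.0) * weight
    return dict(scores)


# Illustrative data: two origins, two destinations with job counts.
rows = [(0, 0, 10.0), (0, 1, 20.0), (1, 0, 0.0)]
jobs = {0: 100.0, 1: 50.0}

scores = hansen_accessibility(rows, jobs, beta=0.1)
print(scores)
```

Origin 0 scores 100·e⁻¹ + 50·e⁻², while origin 1 reaches the first destination at zero travel time and scores the full 100. Pairs missing from the matrix contribute nothing, so low coverage biases scores downward.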
Isochrone overlays. The matrix identifies which destinations fall within a travel-time threshold. Feed the threshold-filtered target subset to Valhalla’s /isochrone endpoint to generate bounding polygons for the reachable set — combining the precision of discrete pair timing with the cartographic clarity of polygon coverage. See generating isochrones with PySAL and GeoPandas for the polygon construction step.
Custom cost function inputs. If the study uses non-standard impedance — combining travel time with fare cost, transfer count, or environmental exposure — the raw time_sec and distance_km columns feed into custom cost functions for routing solvers, where composite weights are assembled before pivoting to a matrix format.
NetworkX graph construction. Wide-format OD matrices can seed a directed NetworkX graph for network-theoretic analysis. Set time_min as the edge weight, compute betweenness centrality to identify critical transfer nodes, and overlay with land-use data to flag equity gaps in the transport network.
Fleet dispatch pre-computation. For same-day delivery operations, pre-compute the depot-to-customer matrix at graph-build time and store in a Redis sorted set keyed by (depot_id, customer_id). The live dispatch solver queries cached times without hitting the routing engine during order assignment.
Validation Checklist
Run these checks against every matrix output before using results in a planning deliverable:
- Coverage ratio. Compute valid_cells / total_cells where total_cells = len(sources) × len(targets). A ratio below 0.90 signals systematic snapping failures or disconnected graph segments; investigate status distributions and re-examine the loaded tileset extent.
- Time distribution sanity. Assert time_min.between(0.5, 180).all() for typical urban studies. Values below 0.5 minutes indicate coordinate pairs that snapped to the same node (check for duplicate POIs). Values above 180 minutes suggest the tileset spans disconnected regions or a costing model mismatch (e.g. "pedestrian" applied to motorway-only zones).
- Symmetry check (reversible modes). For "auto" and "pedestrian" costing on undirected networks, verify that |time_min[A→B] - time_min[B→A]| is under 10% for a random 5% sample. Systematic asymmetry indicates one-way street data issues or incorrect graph orientation from the OSM extract.
- Transit headway sensitivity. For transit matrices, re-run with date_time.value shifted by 30 minutes. The mean travel time across all pairs should shift by no more than the dominant route’s headway. If times are identical regardless of departure, GTFS feeds are not being applied; check the Valhalla tile build log for transit.enabled confirmation.
- Chunk boundary artefacts. After concatenating chunked results, verify that pairs spanning chunk boundaries (source near offset boundary, target near offset boundary) have plausible times consistent with spatially adjacent pairs. Index arithmetic errors in the offset logic can silently mis-assign travel times.
- Outlier route investigation. Flag origin rows where mean(time_min) exceeds the study-area median by more than two standard deviations. These are often coordinates that snapped to low-speed residential ways or pedestrian-only paths; review the snapped node IDs with /locate and consider coordinate adjustments.
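Two of the checks above, coverage ratio and symmetry, can be sketched in plain Python. The example assumes travel times are held in a dict keyed by (source_idx, target_idx), and, for the symmetry check, that sources and targets are the same point set so that index pairs can be reversed. Measuring the 10% tolerance relative to the larger of the two directions is an interpretation chosen here; the helper names are hypothetical.

```python
def coverage_ratio(times, n_sources, n_targets):
    """Share of the full N x M grid that has a routable travel time."""
    return len(times) / (n_sources * n_targets)


def asymmetric_pairs(times, tolerance=0.10):
    """Pairs (a, b), a < b, whose two directions differ by more than tolerance.

    The difference is taken relative to the larger of the two times
    (an assumption; the checklist does not fix the denominator).
    """
    flagged = []
    for (a, b), t_ab in times.items():
        if a < b and (b, a) in times:
            t_ba = times[(b, a)]
            longest = max(t_ab, t_ba)
            if longest > 0 and abs(t_ab - t_ba) / longest > tolerance:
                flagged.append((a, b))
    return sorted(flagged)


# Illustrative 3 x 3 matrix with the diagonal and one pair missing.
times = {
    (0, 1): 10.0, (1, 0): 10.5,   # within tolerance
    (0, 2): 20.0, (2, 0): 30.0,   # one direction is 50% longer
    (1, 2): 15.0,                 # reverse direction not routable
}

print(round(coverage_ratio(times, 3, 3), 2))  # 0.56
print(asymmetric_pairs(times))                # [(0, 2)]
```

For large matrices the symmetry check would normally run on a random sample of pairs, as the checklist suggests, rather than on every cell.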
Related
- Valhalla configuration for multi-modal analysis — costing JSON schema, tile builds, and mode-switching penalties
- Generating isochrones with PySAL and GeoPandas — converting reachable-set coordinates into coverage polygons
- Custom cost functions for routing solvers — composite impedance design combining time, fare, and environmental cost
- Deploying OSRM with Docker for local routing — containerised routing engine setup for development and staging environments