What is the practical upper limit for a single Valhalla matrix request?

Valhalla's default configuration caps a matrix request at 50 sources and 50 targets (2,500 pairs). The limits are set in the service_limits section of valhalla.json, so a self-hosted instance can raise them, but larger studies are usually better served by chunking coordinate sets into 50×50 batches and parallelising the requests with ThreadPoolExecutor. The /locate endpoint can pre-snap coordinates once per POI dataset to eliminate repeated snapping overhead on subsequent matrix calls.

How do I handle unreachable origins or destinations in the matrix?

Check the status field in each cell: 0 = routable, 1 = no route (check graph connectivity or loosen costing), 2 = coordinate out of bounds (expand the loaded tileset or fix the coordinate), 3 = internal routing error (check Valhalla logs). Filter to status == 0 before computing any coverage or gravity-model metric.

Can Valhalla produce time-dependent matrices for public transit?

Yes. Use costing: 'transit' and pass a date_time block with type: 1 (depart at) and a value in ISO 8601 format. Valhalla aligns route timing against loaded GTFS feeds, so the matrix reflects actual service frequency rather than a static average travel time.

Valhalla Cost Matrix Generation for Urban Planners

Valhalla’s /sources_to_targets endpoint produces an N×M table of travel times, distances, and status codes between arbitrary origin and destination coordinate sets in a single HTTP call — eliminating the sequential route loops that make naive matrix construction prohibitively slow at city scale. This page covers the specific variant used in urban planning contexts: multi-origin, multi-destination accessibility matrices for facility siting, equity analysis, and commute shed modelling. It sits within the broader Valhalla configuration for multi-modal analysis workflow and feeds results into the spatial analysis pipelines described across Python Routing Engines & Isochrone Mapping.

When to Use This Approach

The matrix endpoint is the right tool when the study requires discrete origin-destination pairs rather than continuous coverage polygons. Specific conditions that point to /sources_to_targets over isochrone generation:

Facility siting under equity constraints. Evaluating 200 candidate clinic locations against 5,000 census tract centroids requires 1,000,000 travel-time lookups. A single chunked matrix job completes this in minutes; sequential /route calls would take hours.
Gravity model inputs. Hansen-type accessibility indices and cumulative opportunity measures require time or distance between every origin and every destination in a study layer — exactly the matrix shape the endpoint returns.
Transit corridor analysis. Time-dependent matrices with costing: "transit" and GTFS feeds quantify headway-sensitive reachability at specific departure windows, enabling comparison between peak and off-peak commute sheds.
Fleet dispatch pre-computation. Caching an N×M time matrix for depot-to-customer pairs at graph-build time eliminates live routing overhead during vehicle dispatch decisions.
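To put the facility-siting figure above in terms of HTTP requests, the arithmetic can be sketched as follows. The helper name count_chunk_requests is hypothetical, and the 50-per-dimension limit is assumed from the default configuration described on this page.

```python
import math

CHUNK_SIZE = 50  # assumed per-dimension limit (Valhalla default configuration)


def count_chunk_requests(n_sources: int, n_targets: int, chunk_size: int = CHUNK_SIZE) -> int:
    """Number of chunked matrix requests needed to cover every source-target pair."""
    return math.ceil(n_sources / chunk_size) * math.ceil(n_targets / chunk_size)


# Clinic-siting example: 200 candidate sites against 5,000 tract centroids.
n_pairs = 200 * 5_000                            # 1,000,000 travel-time lookups
n_requests = count_chunk_requests(200, 5_000)    # 4 source chunks x 100 target chunks
print(n_pairs, n_requests)
```

Under these assumptions the million-pair study reduces to 400 requests of at most 2,500 pairs each, which is what makes a small thread pool sufficient.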

The approach is less appropriate when you need full turn-by-turn instructions, when origins and destinations overlap heavily (isochrones share computation more efficiently in that case), or when your coordinate sets exceed the practical per-request limits and network round-trip overhead dominates processing time.

Matrix vs. Isochrone Decision

The SVG below summarises the decision criteria between matrix requests and isochrone polygon generation for common urban planning tasks:

Implementation

The implementation below is self-contained and focuses on production concerns specific to the matrix variant: chunked batching, vectorised response parsing, and status-code filtering. It avoids repeating the Valhalla environment setup covered in the Valhalla configuration for multi-modal analysis guide.

# Requires: requests>=2.31, pandas>=2.0, numpy>=1.24
from __future__ import annotations

import itertools
from concurrent.futures import ThreadPoolExecutor, as_completed

import numpy as np
import pandas as pd
import requests

VALHALLA_URL = "http://localhost:8002/sources_to_targets"
# Maximum coordinate count per dimension — Valhalla default is 50×50.
CHUNK_SIZE = 50


def _build_payload(
    sources: list[dict],
    targets: list[dict],
    costing: str = "auto",
    costing_options: dict | None = None,
    date_time: dict | None = None,
) -> dict:
    """Construct the sources_to_targets request body."""
    payload: dict = {
        "sources": sources,
        "targets": targets,
        "costing": costing,
    }
    if costing_options:
        payload["costing_options"] = {costing: costing_options}
    if date_time:
        payload["date_time"] = date_time
    return payload


def _request_chunk(
    sources_chunk: list[dict],
    targets_chunk: list[dict],
    source_offset: int,
    target_offset: int,
    **kwargs,
) -> pd.DataFrame:
    """
    POST one chunk to Valhalla and return a tidy DataFrame fragment.
    Rows with status != 0 are retained but flagged for downstream filtering.
    """
    payload = _build_payload(sources_chunk, targets_chunk, **kwargs)
    response = requests.post(VALHALLA_URL, json=payload, timeout=60)
    response.raise_for_status()
    data = response.json()

    # sources_to_targets[i] is the full list of target cells for source i.
    # sources_to_targets[i][j] is the specific cell for source i → target j.
    cells = data["sources_to_targets"]

    # Build row-major (source, target) index arrays so that each flattened
    # cell lines up with its index pair when the columns are assembled.
    n_src = len(sources_chunk)
    n_tgt = len(targets_chunk)
    src_idx = np.repeat(np.arange(n_src), n_tgt)
    tgt_idx = np.tile(np.arange(n_tgt), n_src)

    flat_cells = [cells[i][j] for i, j in zip(src_idx, tgt_idx)]

    df = pd.DataFrame({
        "source_idx": src_idx + source_offset,
        "target_idx": tgt_idx + target_offset,
        "time_sec": pd.array([c.get("time") for c in flat_cells], dtype="Float64"),
        "distance_km": pd.array([c.get("distance") for c in flat_cells], dtype="Float64"),
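        # Fall back to a time-based flag when a cell carries no "status" key
        # (assumption: unreachable pairs then come back with a null time).
        "status": pd.array(
            [c.get("status", 0 if c.get("time") is not None else 1) for c in flat_cells],
            dtype="Int8",
        ),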
    })
    return df


def compute_od_matrix(
    sources: list[dict],
    targets: list[dict],
    costing: str = "auto",
    costing_options: dict | None = None,
    date_time: dict | None = None,
    max_workers: int = 4,
) -> pd.DataFrame:
    """
    Compute a full origin-destination travel-time matrix via Valhalla's
    sources_to_targets endpoint, chunked and parallelised.

    Parameters
    ----------
    sources         : list of {"lon": float, "lat": float} dicts (origins)
    targets         : list of {"lon": float, "lat": float} dicts (destinations)
    costing         : Valhalla costing model — "auto", "pedestrian", "bicycle",
                      "transit", "truck", etc.
    costing_options : dict of model-specific penalty overrides (optional)
    date_time       : {"type": 1, "value": "2026-06-23T08:00"} for transit (optional)
    max_workers     : thread-pool size for parallel chunk requests

    Returns
    -------
    DataFrame with columns [source_idx, target_idx, time_sec, distance_km, status,
                             time_min], filtered to status == 0.
    """
    # Build (source_offset, target_offset) pairs for every chunk combination.
    src_offsets = range(0, len(sources), CHUNK_SIZE)
    tgt_offsets = range(0, len(targets), CHUNK_SIZE)
    chunks = list(itertools.product(src_offsets, tgt_offsets))

    futures = {}
    results: list[pd.DataFrame] = []

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for s_off, t_off in chunks:
            src_chunk = sources[s_off: s_off + CHUNK_SIZE]
            tgt_chunk = targets[t_off: t_off + CHUNK_SIZE]
            future = pool.submit(
                _request_chunk,
                src_chunk, tgt_chunk, s_off, t_off,
                costing=costing,
                costing_options=costing_options,
                date_time=date_time,
            )
            futures[future] = (s_off, t_off)

        for future in as_completed(futures):
            s_off, t_off = futures[future]
            try:
                results.append(future.result())
            except requests.RequestException as exc:
                raise RuntimeError(
                    f"Matrix chunk (src_offset={s_off}, tgt_offset={t_off}) failed: {exc}"
                ) from exc

    matrix = pd.concat(results, ignore_index=True)

    # Retain only routable pairs; non-zero status signals snapping failure,
    # disconnected graph segment, or out-of-bounds coordinate.
    valid = matrix[matrix["status"] == 0].copy()
    valid["time_min"] = valid["time_sec"] / 60.0
    return valid.sort_values(["source_idx", "target_idx"]).reset_index(drop=True)
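
The sources and targets arguments expect Valhalla-style location dicts. A minimal sketch of a conversion helper is shown below; to_locations is a hypothetical name, not part of Valhalla or of the snippet above, and the range check is only a guard against gross errors.

```python
from __future__ import annotations

from typing import Iterable


def to_locations(coords: Iterable[tuple[float, float]]) -> list[dict]:
    """Convert (lon, lat) tuples into {"lon": ..., "lat": ...} dicts."""
    locations = []
    for i, (lon, lat) in enumerate(coords):
        # Reject values that cannot be valid WGS84 degrees.
        if not (-180.0 <= lon <= 180.0 and -90.0 <= lat <= 90.0):
            raise ValueError(f"Coordinate {i} out of range: lon={lon}, lat={lat}")
        locations.append({"lon": float(lon), "lat": float(lat)})
    return locations


# Illustrative coordinates only (central Berlin), not taken from any study.
origins = to_locations([(13.405, 52.52), (13.377, 52.516)])
# matrix = compute_od_matrix(sources=origins, targets=origins, costing="pedestrian")
```

The range check does not catch swapped lon/lat pairs when both values happen to be valid in either position, so a spot check against a map is still worthwhile.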

Transit Time-Dependent Matrix

For public transit studies, pass costing="transit" and a date_time block. Valhalla aligns routes against the loaded GTFS schedule, producing headway-sensitive travel times rather than free-flow averages:

# Requires: requests>=2.31, pandas>=2.0, numpy>=1.24
# (uses compute_od_matrix from the snippet above)

transit_matrix = compute_od_matrix(
    sources=neighbourhood_centroids,   # list of {"lon": ..., "lat": ...}
    targets=transit_hub_coords,
    costing="transit",
    date_time={"type": 1, "value": "2026-06-23T08:00"},  # depart-at
    max_workers=2,  # lower concurrency for GTFS-heavy queries
)

# Pivot to wide format: rows = origins, columns = transit hubs
pivot = transit_matrix.pivot(
    index="source_idx", columns="target_idx", values="time_min"
)
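
To compare peak and off-peak commute sheds, two long-format matrices produced with different date_time values can be joined pair by pair. The sketch below assumes both frames carry the source_idx, target_idx and time_min columns returned by compute_od_matrix; compare_departures is a hypothetical helper name.

```python
import pandas as pd


def compare_departures(peak: pd.DataFrame, offpeak: pd.DataFrame) -> pd.DataFrame:
    """Join two long-format matrices and compute the per-pair change in minutes."""
    merged = peak.merge(
        offpeak,
        on=["source_idx", "target_idx"],
        how="inner",  # keeps only pairs that are routable in both windows
        suffixes=("_peak", "_offpeak"),
    )
    merged["delta_min"] = merged["time_min_offpeak"] - merged["time_min_peak"]
    return merged[
        ["source_idx", "target_idx", "time_min_peak", "time_min_offpeak", "delta_min"]
    ]
```

The inner join silently drops pairs that are reachable in only one departure window. Counting those separately is useful, because they often mark services that run only at peak times.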

Key Parameters and Tuning

Parameter	Recommended Value	Sensitivity Notes
`costing`	`"auto"` / `"pedestrian"` / `"transit"` / `"truck"`	Determines graph traversal rules, speed defaults, and access restrictions; changing mode invalidates cached matrices
`use_ferry`	`0.0`–`1.0` (default `0.5`)	Values near `0` actively avoid ferry crossings; urban studies without waterways should set `0.0` to eliminate edge-case routing artefacts
`use_tolls`	`0.0` (equity studies)	Set `0.0` for studies measuring equitable access, since toll roads skew travel times for low-income origin zones
`maneuver_penalty` (seconds)	`5`–`30`	Higher values penalise complex intersections; calibrate against GPS traces to match observed urban intersection delay
`CHUNK_SIZE`	`50`	Matches Valhalla's default per-request limit, which is configurable under service_limits on a self-hosted instance; reduce to `25` if requests time out on dense urban graphs with transit costing
`max_workers`	`4` (auto/pedestrian), `2` (transit)	Transit queries carry higher per-request compute cost due to GTFS schedule alignment; over-threading stalls the tile cache
`date_time.type`	`1` (depart at)	Type `2` (arrive by) computes backward searches; useful for catchment studies but doubles graph traversal cost
Tileset bounding box	Study area + 10 km buffer	Tight bounding boxes improve cold-start speed but produce `status=2` errors for near-boundary coordinates

Coordinate snapping failures (status=2) are the most common field problem in multi-city studies. They occur when coordinates fall outside the loaded tileset or on road classes excluded by the costing model — for example, access=private ways when using "auto" costing. Pre-validate all coordinates with the /locate endpoint and cache the snapped node IDs; this also cuts per-request overhead by 15–30% on repeated queries against the same POI dataset.
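
The pre-validation step can be sketched as a pure function over a /locate-style response. The response shape used here is an assumption to verify against your Valhalla version: one result object per input location, in input order, with an "edges" value that is empty or null when nothing could be matched. The name split_snapped is hypothetical.

```python
from __future__ import annotations


def split_snapped(locate_response: list[dict]) -> tuple[list[int], list[int]]:
    """Return (snapped, unsnapped) input indices from a /locate-style response.

    Assumes each result has an "edges" entry that is falsy (None or an empty
    list) when the location could not be matched to the graph.
    """
    snapped: list[int] = []
    unsnapped: list[int] = []
    for i, result in enumerate(locate_response):
        (snapped if result.get("edges") else unsnapped).append(i)
    return snapped, unsnapped


# Hand-written example response, not real Valhalla output.
example = [
    {"input_lon": 13.405, "input_lat": 52.52, "edges": [{"way_id": 1}]},
    {"input_lon": 0.0, "input_lat": 0.0, "edges": None},
]
ok, failed = split_snapped(example)
```

Dropping or correcting the unsnapped indices before building the matrix keeps the failures out of the coverage ratio instead of surfacing as non-routable cells later.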

Integration Points

The output DataFrame from compute_od_matrix — with columns source_idx, target_idx, time_min, and distance_km — connects directly to several downstream spatial workflows:

Gravity-model accessibility indices. Join time_min against a destination-opportunity table (jobs, services, transit capacity) and apply a decay function (exp(-β × time_min)) to compute Hansen accessibility scores per origin zone. Merge with census boundary GeoDataFrames via geopandas spatial joins for choropleth mapping.
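
A minimal sketch of the Hansen calculation, A_i = Σ_j O_j · exp(-β · t_ij), is shown below. It assumes a long-format matrix with source_idx, target_idx and time_min columns, plus an opportunities Series indexed by target_idx. The β value is purely illustrative and would need calibration against observed trip-length data.

```python
import numpy as np
import pandas as pd


def hansen_accessibility(
    matrix: pd.DataFrame, opportunities: pd.Series, beta: float = 0.1
) -> pd.Series:
    """Hansen accessibility per origin, with travel time measured in minutes."""
    # Opportunity count at each destination; targets missing from the table count as 0.
    weights = matrix["target_idx"].map(opportunities).fillna(0.0).astype(float)
    decay = np.exp(-beta * matrix["time_min"].astype(float))
    contribution = weights * decay
    return contribution.groupby(matrix["source_idx"]).sum().rename("accessibility")
```

Because compute_od_matrix drops non-routable pairs, origins with no valid destinations are absent from the result rather than scored as zero. Reindex against the full origin list if zeros are needed for mapping.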

Isochrone overlays. The matrix identifies which destinations fall within a travel-time threshold. Feed the threshold-filtered target subset to Valhalla’s /isochrone endpoint to generate bounding polygons for the reachable set — combining the precision of discrete pair timing with the cartographic clarity of polygon coverage. See generating isochrones with PySAL and GeoPandas for the polygon construction step.
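
The threshold-filtering step might look like the sketch below, which assumes the same long-format columns as above. The helper name reachable_targets is hypothetical, and the call to the /isochrone endpoint is left out.

```python
from __future__ import annotations

import pandas as pd


def reachable_targets(matrix: pd.DataFrame, threshold_min: float) -> dict[int, list[int]]:
    """Map each origin to the sorted target indices within the time threshold."""
    within = matrix[matrix["time_min"] <= threshold_min]
    return {
        int(src): sorted(int(t) for t in group["target_idx"])
        for src, group in within.groupby("source_idx")
    }
```

Origins with no target inside the threshold are omitted from the returned dict, which makes them easy to list as uncovered zones.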

Custom cost function inputs. If the study uses non-standard impedance — combining travel time with fare cost, transfer count, or environmental exposure — the raw time_sec and distance_km columns feed into custom cost functions for routing solvers, where composite weights are assembled before pivoting to a matrix format.

NetworkX graph construction. Wide-format OD matrices can seed a directed NetworkX graph for network-theoretic analysis. Set time_min as the edge weight, compute betweenness centrality to identify critical transfer nodes, and overlay with land-use data to flag equity gaps in the transport network.

Fleet dispatch pre-computation. For same-day delivery operations, pre-compute the depot-to-customer matrix at graph-build time and store in a Redis sorted set keyed by (depot_id, customer_id). The live dispatch solver queries cached times without hitting the routing engine during order assignment.

Validation Checklist

Run these checks against every matrix output before using results in a planning deliverable:

Coverage ratio. Compute valid_cells / total_cells where total_cells = len(sources) × len(targets). A ratio below 0.90 signals systematic snapping failures or disconnected graph segments — investigate status distributions and re-examine the loaded tileset extent.
Time distribution sanity. Assert time_min.between(0.5, 180).all() for typical urban studies. Values below 0.5 minutes indicate coordinate pairs that snapped to the same node (check for duplicate POIs). Values above 180 minutes suggest the tileset spans disconnected regions or a costing model mismatch (e.g. "pedestrian" applied to motorway-only zones).
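Symmetry check (reversible modes). For "pedestrian" costing, and for "auto" costing in areas where one-way restrictions are rare, verify that |time_min[A→B] - time_min[B→A]| is under 10% of the larger value for a random 5% sample. One-way streets and turn restrictions make some asymmetry normal for "auto"; systematic asymmetry across most pairs indicates one-way street data issues or incorrect graph orientation from the OSM extract.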
Transit headway sensitivity. For transit matrices, re-run with date_time.value shifted by 30 minutes. The mean travel time across all pairs should shift by no more than the dominant route’s headway. If times are identical regardless of departure, GTFS feeds are not being applied — check the Valhalla tile build log for transit.enabled confirmation.
Chunk boundary artefacts. After concatenating chunked results, verify that pairs spanning chunk boundaries (source near offset boundary, target near offset boundary) have plausible times consistent with spatially adjacent pairs. Index arithmetic errors in the offset logic can silently mis-assign travel times.
Outlier route investigation. Flag origin rows where mean(time_min) exceeds the study-area median by more than two standard deviations. These are often coordinates that snapped to low-speed residential ways or pedestrian-only paths; review the snapped node IDs with /locate and consider coordinate adjustments.
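
The first two checks can be automated with a short function. This is a sketch under the assumption that valid is the status-filtered frame returned by compute_od_matrix. The name validate_matrix and its default thresholds simply mirror the checklist values above.

```python
import pandas as pd


def validate_matrix(
    valid: pd.DataFrame,
    n_sources: int,
    n_targets: int,
    min_coverage: float = 0.90,
    time_bounds: tuple = (0.5, 180.0),
) -> dict:
    """Coverage ratio and travel-time range checks for a filtered OD matrix."""
    total_cells = n_sources * n_targets
    coverage = len(valid) / total_cells if total_cells else 0.0
    low, high = time_bounds
    # Missing times become NaN and are counted as out of range.
    times = valid["time_min"].astype(float)
    out_of_range = valid[~times.between(low, high)]
    return {
        "coverage": coverage,
        "coverage_ok": coverage >= min_coverage,
        "n_out_of_range": len(out_of_range),
        "times_ok": out_of_range.empty,
    }
```

Returning a dict instead of asserting lets a pipeline log the figures for every run and decide separately whether a failed check should block a deliverable.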