Optimizing node attributes for last-mile routing transforms raw spatial coordinates into weighted, constraint-aware graph vertices that reflect real-world delivery friction. This technique applies on top of the broader work of mapping node attributes for urban delivery zones — where attributes are extracted and spatially joined — and sits within the wider context of OSM graph architecture and network modeling, where raw OpenStreetMap primitives become directed multigraphs. The focus here is the optimization pipeline itself: taking those joined attributes, normalizing them into solver-compatible cost scalars, and serializing the result for a constraint routing engine.
When to Use This Approach
Node attribute optimization is the right technique under specific conditions that distinguish it from simpler edge-weight adjustments.
Vehicle and operational triggers:
- Urban vans and cargo bikes where legal stopping location matters more than road speed. A 30-second loading-zone window at a node may save a 10-minute detour that edge weights cannot capture.
- Fleets operating under time-window SLAs where delivery at 10:05 inside a
time_window=(08:00, 10:00)node must trigger a prohibitive penalty, not a gentle discouragement. - Vehicles subject to Low Emission Zone access rules —
access=noconditions that change by vehicle class and hour require per-node flags rather than static edge filters.
Network scale triggers:
- Dense urban cores where node density exceeds 500 nodes/km², making per-node penalty computation materially cheaper than per-edge recalculation.
- Fleets with heterogeneous vehicle classes sharing the same underlying graph. A single graph with vehicle-class-keyed node attributes is far more maintainable than separate per-class graphs.
Data condition triggers:
- Municipal curb regulation datasets are available as point geometries that snap cleanly to OSM node IDs. If curb data arrives only as polygon zones, configuring edge weights for freight logistics is often better suited.
- Historical dwell times are available per OSM node or POI ID from telematics. When that granularity exists, node-level weighting outperforms edge-averaged speed profiles.
Implementation
The following code is self-contained for the optimization and graph-building phase. It assumes attributes have already been spatially joined to node IDs (covered in mapping node attributes for urban delivery zones). Prefer vectorized pandas operations over row-level loops — the normalization step processes tens of thousands of nodes and loop overhead is measurable.
# Requires: pandas>=1.5, networkx>=3.0, numpy>=1.24
# python -m pip install pandas networkx numpy
import pandas as pd
import numpy as np
import networkx as nx
from datetime import datetime, time as dt_time
HARD_PENALTY = 1_000_000 # seconds — large finite value; avoid math.inf (NaN propagation risk)
CURB_PENALTY_NO_STOP = 300 # 5-minute equivalent for non-loading-zone nodes
DECAY_RECENT = 0.7 # weight for real-time congestion observations
DECAY_HISTORIC = 0.3 # weight for historical baseline
def normalize_node_attributes(
nodes_df: pd.DataFrame,
current_hour: int,
current_minute: int
) -> pd.DataFrame:
"""
Vectorized normalization of node attributes into a single 'node_cost' column (seconds).
nodes_df must contain columns:
- node_id (str/int index)
- traversal_time_sec (float)
- curb_type (str): 'loading_zone' | 'no_stopping' | 'unrestricted' | ...
- tw_start_hour, tw_start_min, tw_end_hour, tw_end_min (int, NaN if no window)
- lez_violation (bool)
- congestion_recent (float, 0–1 index)
- congestion_historic (float, 0–1 index)
"""
df = nodes_df.copy()
now_minutes = current_hour * 60 + current_minute
# --- Curb penalty (vectorized) ---
curb_map = {"loading_zone": 0, "unrestricted": 0}
df["curb_penalty"] = (
df["curb_type"]
.map(curb_map)
.fillna(CURB_PENALTY_NO_STOP) # any other curb type gets the no-stop penalty
)
# --- Time window penalty (vectorized) ---
has_window = df["tw_start_hour"].notna()
tw_start_min = (df["tw_start_hour"] * 60 + df["tw_start_min"]).where(has_window)
tw_end_min = (df["tw_end_hour"] * 60 + df["tw_end_min"]).where(has_window)
# within-window: penalty = 0; outside: HARD_PENALTY
in_window = (tw_start_min <= now_minutes) & (now_minutes <= tw_end_min)
df["tw_penalty"] = np.where(
~has_window,
0,
np.where(in_window, 0, HARD_PENALTY)
)
# --- LEZ penalty ---
df["lez_penalty"] = np.where(df["lez_violation"], HARD_PENALTY, 0)
# --- Congestion decay (weighted blend of recent + historic) ---
df["congestion_penalty"] = (
DECAY_RECENT * df["congestion_recent"] +
DECAY_HISTORIC * df["congestion_historic"]
) * 600 # scale: max congestion = 10 minutes extra traversal time
# --- Total node cost ---
df["node_cost"] = (
df["traversal_time_sec"].fillna(0)
+ df["curb_penalty"]
+ df["tw_penalty"]
+ df["lez_penalty"]
+ df["congestion_penalty"]
)
return df[["node_id", "node_cost", "curb_type", "lez_violation"]]
def build_weighted_graph(normalized_df: pd.DataFrame) -> nx.DiGraph:
"""
Build a NetworkX DiGraph from the normalized node attribute table.
Edges must be added separately from the OSM topology source.
"""
G = nx.DiGraph()
records = normalized_df.to_dict(orient="records")
G.add_nodes_from(
(r["node_id"], {
"weight": r["node_cost"],
"curb": r["curb_type"],
"lez": r["lez_violation"]
})
for r in records
)
return G
Key design decisions embedded in this code:
HARD_PENALTY = 1_000_000rather thanmath.inf. Several constraint-propagation libraries convert node weights to NumPy float64 arrays internally;math.infcan produceNaNwhen added to another infinity, silently invalidating the path cost.- The congestion blend (
0.7 * recent + 0.3 * historic) is applied only at normalization time, not at graph query time. This means static graphs are rebuilt with fresh telemetry on a scheduled cadence rather than recomputed per request. fillna(CURB_PENALTY_NO_STOP)on the curb map means any new or unmappedcurb_typevalue is treated conservatively as a no-stop node. This prevents a data-ingestion error from silently granting free curb access.
Key Parameters and Tuning
| Parameter | Default | Recommended range | Sensitivity notes |
|---|---|---|---|
HARD_PENALTY |
1_000_000 s |
500_000–10_000_000 s |
Must exceed the longest plausible valid path cost in your network. For city-scale graphs, 1 M seconds (~11 days) is safely prohibitive. |
CURB_PENALTY_NO_STOP |
300 s |
120–600 s |
Calibrate against observed re-circulation time in your city. 300 s (5 min) reflects typical urban block-circling time; dense grids may use 120 s. |
DECAY_RECENT |
0.7 |
0.5–0.9 |
Higher values make the solver more reactive to real-time congestion; lower values prioritize historical patterns. Tune against dwell-time overshoot logs. |
DECAY_HISTORIC |
0.3 |
0.1–0.5 |
Complement of DECAY_RECENT. Ensure DECAY_RECENT + DECAY_HISTORIC = 1.0. |
congestion_penalty scale |
600 s |
300–1200 s |
Max congestion index of 1.0 maps to this many extra seconds. Set to your observed worst-case dwell penalty for heavily congested nodes. |
| Time window bucket size | 15 min | 5–30 min | Finer buckets increase accuracy but expand state space. 15-minute slices are the industry standard for VRP constraint propagation. |
Integration Points
The nx.DiGraph produced by build_weighted_graph integrates with downstream engines in three different ways, depending on deployment target.
Google OR-Tools (recommended for VRP with time windows). Extract the weight attribute from each node into a penalty matrix. OR-Tools RoutingModel accepts node penalties via AddDisjunction — nodes with weight >= HARD_PENALTY should be registered as optional (droppable) visits with prohibitive penalty so the solver can skip infeasible stops cleanly rather than crashing.
Valhalla. Node penalties inject via the costing_options JSON field in the route request. Map node_cost values to penalty entries in a custom costing model. Because Valhalla processes these at request time, you can pass the full normalized_df as a lookup table and apply dynamic decay at query time — this avoids the need to rebuild the static graph for congestion changes.
OSRM. As covered in integrating custom traffic weights into OSRM, OSRM does not accept HTTP-level node penalties. Node attributes must be encoded into the Lua profile’s process_node function during the osrm-extract phase. This means any attribute change requires a full graph rebuild — viable for static attributes (LEZ flags, curb types) but impractical for real-time congestion. For dynamic last-mile use cases, OSRM is best combined with a post-processing layer that adds node-level penalties to OSRM’s raw duration output.
NetworkX direct solvers. For smaller networks (under ~50k nodes), NetworkX shortest-path algorithms for logistics can incorporate node weights directly. Use nx.dijkstra_path with a custom weight callable that reads G.nodes[v]['weight'] as part of the traversal cost.
Validation Checklist
Run these checks before promoting an optimized attribute graph to production routing:
-
Hard constraint enforcement. Sample 100 nodes with
tw_penalty = HARD_PENALTY. Confirm no route returned by the solver visits those nodes at the time of the test. Any solver that returns a route through a hard-penalty node has an arithmetic overflow or infinity-propagation bug. -
Curb violation rate. Shadow-route a week of historical GPS traces through the optimized graph. Measure the ratio of stops at nodes with
curb_typeother thanloading_zoneorunrestricted. A well-tunedCURB_PENALTY_NO_STOPshould reduce this rate to under 5% versus the baseline graph. -
LEZ compliance. For fleets with mixed vehicle classes, verify that LEZ-violating nodes (
lez_violation=True) are never visited by non-compliant vehicle types. Cross-reference against route logs and municipal violation records if available. -
Penalty floor check. Confirm
node_costis non-negative for all nodes. Negative weights cause Dijkstra to produce incorrect results. Atraversal_time_secfield with missing data that defaults to a negative sentinel value is the most common source. -
Decay blend sum. Assert
DECAY_RECENT + DECAY_HISTORIC == 1.0in your CI pipeline. Rounding errors in parameter files have caused blends summing to 0.95 or 1.05, biasing all congestion penalties silently. -
Versioned snapshot comparison. After each attribute graph rebuild, compare the distribution of
node_costvalues (mean, p95, max) against the previous version. A sudden shift in p95 cost typically indicates a stale or corrupt data feed — curb regulation updates that zero-out a data source are a common culprit.
Related
- Mapping node attributes for urban delivery zones — spatial join and attribute extraction upstream of this optimization step
- Configuring edge weights for freight logistics — complementary weight design for the edges connecting these nodes
- Handling turn restrictions in routing graphs — turn penalty modeling at intersection nodes using OSM
restrictionrelations - Custom cost functions for routing solvers — generalizing the penalty formulas here into pluggable costing functions for OR-Tools and Valhalla