Calculating Road Curvature with Python Shapely
In high-definition mapping and autonomous vehicle spatial data processing, accurate road curvature computation is foundational for trajectory planning, superelevation modeling, and lane-level attribute extraction. While Shapely provides robust geometric primitives for spatial predicates and topological operations, it lacks native differential geometry operators. Engineers must bridge this gap by extracting coordinate sequences, applying numerical differentiation, and managing the computational overhead inherent in large-scale Lane Geometry Extraction & Road Network Processing pipelines. This reference details production-grade implementations, focusing on vectorized computation, memory-constrained environments, and edge-case resolution.
Vectorized curvature from a discrete LineString, with stability guards and a validation gate:
flowchart TD
A["Shapely LineString(s)"] --> B["Bulk extract coords<br/>get_coordinates() → (N,2)"]
B --> C["Cumulative arc length<br/>np.hypot + np.cumsum"]
C --> D["Derivatives<br/>np.gradient (1st, 2nd) wrt s"]
D --> E["Curvature κ<br/>+ ε guard (1e-9)"]
E --> F{"Validate vs clothoid / arc<br/>(±0.001 m⁻¹)?"}
F -->|"pass"| OUT(["κ → lane attribute schema"])
F -->|"fail"| R["Decimate · Savitzky-Golay smooth"]
classDef io fill:#eef3fa,stroke:#3a56d4,color:#1a2336;
classDef gate fill:#fff4e5,stroke:#f59e0b,color:#7a4a00;
classDef out fill:#e7f7f0,stroke:#0c8f6a,color:#0a4b39;
classDef warn fill:#fdecea,stroke:#e5484d,color:#7a1f23;
class A io
class F gate
class OUT out
class R warn
Road curvature () at any point along a planar curve is mathematically defined as:
When working with discrete Shapely LineString geometries, analytical derivatives are unavailable, necessitating finite difference approximations. Central differences provide second-order accuracy: x'[i] = (x[i+1] - x[i-1]) / (2·Δs) and x''[i] = (x[i+1] - 2·x[i] + x[i-1]) / (Δs²), where Δs represents the arc-length interval between consecutive vertices. For HD map tiles containing millions of vertices, naive Python loops will bottleneck the pipeline. Instead, extract the coordinate array via np.array(line.coords), compute cumulative Euclidean distances using np.hypot and np.cumsum, then apply np.gradient with the arc-length array as the spacing parameter. This approach maintains numerical stability while leveraging SIMD-optimized routines documented in the NumPy Gradient Reference.
Vectorized Single-Geometry Curvature
Directly iterating over shapely.geometry.LineString objects introduces significant interpreter overhead. The recommended pattern involves bulk extraction using shapely.get_coordinates() (Shapely 2.0+), which returns a contiguous (N, 2) array backed by the GEOS C library. Note that Shapely 2.0's backend enforces strict coordinate precision; ensure your input CRS is projected (e.g., EPSG:32633) before differentiation. Degree-based geographic coordinates produce curvature values scaled by roughly 111,320 m/°, introducing catastrophic errors in downstream Road Curvature & Superelevation Mapping workflows.
import numpy as np
from shapely import get_coordinates
from shapely.geometry import LineString
def compute_curvature(line: LineString, eps: float = 1e-9) -> np.ndarray:
"""Vectorized discrete curvature for a projected Shapely LineString.
Args:
line: A Shapely 2.0 LineString in a metric (projected) CRS.
eps: Stability guard for near-zero denominator (straight segments).
Returns:
1-D array of κ values (m⁻¹), same length as the vertex count.
"""
coords = get_coordinates(line) # (N, 2), no Python loop
dx = np.diff(coords[:, 0])
dy = np.diff(coords[:, 1])
ds_seg = np.hypot(dx, dy) # segment lengths
s = np.concatenate([[0.0], np.cumsum(ds_seg)]) # cumulative arc length
# np.gradient uses central differences at interior points,
# one-sided differences at endpoints — consistent with the spacing array.
x_prime = np.gradient(coords[:, 0], s)
y_prime = np.gradient(coords[:, 1], s)
x_dprime = np.gradient(x_prime, s)
y_dprime = np.gradient(y_prime, s)
numerator = np.abs(x_prime * y_dprime - y_prime * x_dprime)
denominator = (x_prime**2 + y_prime**2) ** 1.5
return numerator / (denominator + eps)
Consult the Shapely 2.0 Documentation for coordinate handling best practices.
Chunked Processing for Large Tile Datasets
AV spatial data pipelines routinely process 50–200 GB of GeoJSON/Parquet tiles. Loading all centerlines into memory simultaneously triggers OOM failures on standard 32 GB worker nodes. The correct PyArrow chunked-read pattern uses ParquetFile.iter_batches(), not read_table().to_pandas().iter_batches() (the latter does not exist on a pandas DataFrame):
import pyarrow.parquet as pq
import geopandas as gpd
import numpy as np
def process_curvature_from_parquet(
path: str,
batch_size: int = 50_000,
) -> list[np.ndarray]:
"""Compute curvature for all centerlines in a GeoParquet file, in chunks."""
pf = pq.ParquetFile(path)
results = []
for batch in pf.iter_batches(batch_size=batch_size):
gdf = gpd.GeoDataFrame.from_arrow(batch)
for geom in gdf.geometry:
if geom is not None and not geom.is_empty:
results.append(compute_curvature(geom))
return results
For curvature computation across chunk boundaries, maintain a small overlap buffer (typically 2–3 vertices) to prevent discontinuity artifacts at segment seams. This sliding-window strategy preserves gradient continuity while keeping peak memory footprint predictable.
Stability Guards and Preprocessing
Real-world lane geometries frequently contain collinear vertices, sharp kinks, and self-intersections that destabilize finite difference schemes. Apply a vertex decimation pass using shapely.simplify() with a tolerance of 0.05–0.1 m to remove redundant points while preserving curvature continuity. For near-zero denominators (straight segments), the eps = 1e-9 guard in compute_curvature avoids division-by-zero. Smooth noisy GPS-derived trajectories using a Savitzky-Golay filter before differentiation to suppress high-frequency oscillation that masquerades as false curvature.
Post-computation validation should compare discrete curvature outputs against analytical benchmarks such as clothoid spirals and circular arcs to verify numerical accuracy within ±0.001 m⁻¹. Integrate the curvature array directly into lane attribute schemas alongside heading, cross-slope, and speed limit data. By adhering to vectorized, projection-aware, and memory-safe patterns, engineering teams can scale curvature extraction from prototype scripts to production-grade AV mapping infrastructure that meets ISO 26262 functional safety requirements for spatial data integrity.