Exploring Topographic Data with Transitive Closure on Graphs
2026, Group · DEXA 2026 (Austria)
This project brings database-style recursive query evaluation to plain Python/Pandas, and uses terrain as both a motivating application and a tunable benchmark. It will be presented at DEXA 2026 in Austria. This is joint work with Christoph F. Eick and Carlos Ordonez at the University of Houston.
The idea
Transitive closure — computing every reachable source/destination pair in a graph — is a foundational recursive query that is traditionally confined to database systems. This work shows that database-style fixpoint semantics can be implemented effectively and correctly in Pandas, making it practical for moderate-scale exploratory analysis without standing up a full database. Digital Elevation Models (DEMs) provide an ideal domain: they are large, structured, and produce graphs whose density can be tuned.
From terrain to graph

- A DEM is modeled as a directed graph: each cell is a vertex, and edges connect adjacent cells (8-connectivity). Each edge carries the signed elevation difference and the slope angle between the two cells.
- A slope-passability threshold
Sfilters which transitions are traversable. Crucially,Sis treated as a query-time parameter — so a single physical terrain yields a whole family of graphs of varying density, without rewriting the underlying structure. - The graph construction is validated for structural invariants (antisymmetry, no self-loops, correct per-vertex degree, exact trigonometric slopes) and is O(n), scaling linearly to 1.6 billion cells.

Computing the closure
The closure is computed with a seminaïve fixpoint over the slope-filtered edge set, which avoids redundantly recomputing known paths:
Require: Edge relation E(src, dst)
1: TC ← E
2: Δ ← E
3: while Δ ≠ ∅ do
4: New ← Δ ⋈ E // join the new frontier with base edges
5: New ← New \ TC // keep only newly discovered pairs
6: TC ← TC ∪ New
7: Δ ← New
8: end while
9: return TC
Correctness is checked against a SQLite recursive view used as an oracle: zero row discrepancy at every tested slope threshold. The seminaïve strategy runs 1.5–3.3× faster than naïve evaluation.

What the terrain reveals
Sweeping the slope threshold S exposes a sharp structural phase transition near 20°: below it, the terrain fragments into hundreds of isolated components; above it, a giant mutually-reachable component dominates the map. That makes the terrain graph an excellent, controllable workload generator for recursive-query research.

Once the closure is materialized as a DataFrame, ordinary Pandas operations answer rich topographic questions — turning relational lookups into geographic insight:
- Point-to-point reachability — can you get from a trailhead to a summit within a slope tolerance?
- Drainage basins — reverse-reachability into a valley outlet (useful for hydrological modeling).
- Reachable hubs — the most central, flat gathering point from a location (e.g. a helicopter landing zone in search-and-rescue).
- Bottleneck / choke-point detection — narrow land bridges whose removal would sever a route.
- Shortest path — distance-aware traversal cost across slope-constrained terrain.

Closure experiments run at n = 1,000 and n = 10,000 cells, with the construction step demonstrated to scale linearly well beyond that.