Skip to content

gfql/passes: DRY child-slot walkers across verifier/physical_planner/passes (retroactive audit of #1191) #1196

@lmeyerov

Description

@lmeyerov

Context

Retroactive audit of merged PR #1191 (two-tier pass execution, commit e1b3617) surfaced a DRY miss.

DRY: Four copies of child-slot iteration

The tuple `("input", "left", "right", "subquery")` enumerates the structural child slots of every `LogicalPlan` subclass. It currently appears in four places:

  1. `graphistry/compute/gfql/passes/unnest_apply.py:~40` — `_unnest_tree` iteration
  2. `graphistry/compute/gfql/passes/predicate_pushdown.py:~45` — `_rewrite_tree` iteration
  3. `graphistry/compute/gfql/ir/verifier.py:30` — `_CHILD_SLOTS` constant
  4. `graphistry/compute/gfql/ir/physical_planner.py:190-194` — `PhysicalPlanner._children`

Each uses the same pattern: try each slot on the current plan node, rebuild with `dataclasses.replace(..., **children_updates)`, preserve child identity when unchanged.

Adding a new `LogicalPlan` subclass with a novel child slot (e.g., a second `input` or an alternative arm) requires editing all four sites. The test suite might not catch a miss if the new slot's traversal is exercised in a pass that skips the missed site.

Proposed fix

Introduce a shared helper in `graphistry/compute/gfql/ir/logical_plan.py`:

```python
CHILD_SLOTS: Tuple[str, ...] = ("input", "left", "right", "subquery")

def rewrite_children(plan: LogicalPlan, rewrite: Callable[[LogicalPlan], LogicalPlan]) -> LogicalPlan:
"""Apply rewrite to each child slot and return a new plan, preserving identity when unchanged."""
updates: Dict[str, LogicalPlan] = {}
for slot in CHILD_SLOTS:
child = getattr(plan, slot, None)
if isinstance(child, LogicalPlan):
rewritten = rewrite(child)
if rewritten is not child:
updates[slot] = rewritten
return replace(plan, **updates) if updates else plan
```

Replace the four duplicated sites with calls to this helper. The `CHILD_SLOTS` tuple becomes the single source of truth.

Related gap

No regression test locks the `rewritten is not child` identity-preservation invariant for deep trees (mentioned in `test_pass_manager.py`'s convergence logic — if identity breaks, `changed` signaling breaks). Add one after consolidation.

Priority

p3 — cleanup; does not block any feature. Worth doing before the M4 pass framework grows more passes.

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions