Context
Retroactive audit of merged PR #1188 (predicate pushdown pass rewrite, commit 23fb3a9) surfaced two issues missed in review.
Bug: `\\b` inside rf-string at `predicate_pushdown.py:178`
```python
if re.search(rf"\\b{re.escape(alias)}\\b", segment)
```
Inside an rf-string, `\\b` is a literal backslash followed by `b`, not a word boundary. Compare to the correct form at line 211 of the same file:
```python
if re.search(rf"\b{re.escape(alias)}\b", expression)
```
Effect: `_refs_for_segment` never matches an alias; the conservative fallback at line 183 (`return original_refs`) always fires. Per-conjunct reference narrowing after an AND split is effectively dead code. Tests pass because the fallback is a superset, hiding the bug.
DRY: Duplicate AND splitter
`predicate_pushdown.py:129` defines `_split_top_level_and` — a quote/paren-aware top-level AND splitter. The same package already has `_split_top_level_and_terms` in `cypher/parser.py:477`, which is strictly more robust: handles `[]`, `{}`, backticks, and escape sequences.
This is the exact same class of miss that PR #1193 fixed during Wave 3 review (see issue #1125 follow-up). Two near-duplicate implementations of "split WHERE body on top-level AND" in one package.
Proposed fixes
- Correct the `\\b` regex to `\b`. Add a targeted regression test: a multi-alias OPTIONAL MATCH expression where at least one alias must be detected per-segment (i.e., the fallback-to-`original_refs` path would incorrectly widen safety guards).
- Consolidate the two splitters: promote `_split_top_level_and_terms` to a shared module (e.g., `graphistry/compute/gfql/common/expr_split.py`) and delete `_split_top_level_and` from `predicate_pushdown.py`.
Missed-in-review gap
The earlier review of #1188 used the builtin `/review` tool. The retroactive audit ran the expanded multi-dimension review skill and caught both. Process change: for gfql/cypher PRs, default to the multi-dimension skill.
Priority
- Bug: p2 (silent semantic widening; not load-bearing for correctness under current callers, but could become so)
- DRY: p3 (cleanup, unblocks deletion of the duplicate splitter)
Context
Retroactive audit of merged PR #1188 (predicate pushdown pass rewrite, commit 23fb3a9) surfaced two issues missed in review.
Bug: `\\b` inside rf-string at `predicate_pushdown.py:178`
```python
if re.search(rf"\\b{re.escape(alias)}\\b", segment)
```
Inside an rf-string, `\\b` is a literal backslash followed by `b`, not a word boundary. Compare to the correct form at line 211 of the same file:
```python
if re.search(rf"\b{re.escape(alias)}\b", expression)
```
Effect: `_refs_for_segment` never matches an alias; the conservative fallback at line 183 (`return original_refs`) always fires. Per-conjunct reference narrowing after an AND split is effectively dead code. Tests pass because the fallback is a superset, hiding the bug.
DRY: Duplicate AND splitter
`predicate_pushdown.py:129` defines `_split_top_level_and` — a quote/paren-aware top-level AND splitter. The same package already has `_split_top_level_and_terms` in `cypher/parser.py:477`, which is strictly more robust: handles `[]`, `{}`, backticks, and escape sequences.
This is the exact same class of miss that PR #1193 fixed during Wave 3 review (see issue #1125 follow-up). Two near-duplicate implementations of "split WHERE body on top-level AND" in one package.
Proposed fixes
Missed-in-review gap
The earlier review of #1188 used the builtin `/review` tool. The retroactive audit ran the expanded multi-dimension review skill and caught both. Process change: for gfql/cypher PRs, default to the multi-dimension skill.
Priority