Fix metadata propagation in reproject (#1572, #1573) #1575

Open
brendancol wants to merge 1 commit into main from
deep-sweep-metadata-reproject-2026-05-10

Conversation

@brendancol
Contributor

Summary

Before this change, a (y, x, band)-shaped raster passed to geoid_height_raster produced a 2D output labelled ('x', 'band') with the wrong shape and the wrong coords (because raster.dims[-2:] for a (y, x, band) input is ('x', 'band')). Input attrs like crs, res, transform, _FillValue, long_name were also dropped.
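
The slicing problem can be reproduced with plain tuples, without xarray. This is a minimal sketch that assumes a 4x4 raster with 3 bands (the shape implied by the commit message); the variable names are illustrative only:

```python
# Dims and shape of a hypothetical (y, x, band) raster:
# 4 rows, 4 columns, 3 bands.
dims = ("y", "x", "band")
shape = (4, 4, 3)

# Taking the last two entries assumes the spatial axes are trailing.
# That holds for (y, x) and (band, y, x) layouts, but not for (y, x, band).
trailing_dims = dims[-2:]
trailing_shape = shape[-2:]

print(trailing_dims)   # ('x', 'band')
print(trailing_shape)  # (4, 3)
```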

Without rioxarray installed, reproject treated a raster carrying only attrs['nodatavals'] = (-9999,) as if its nodata were NaN. The -9999 pixels therefore survived resampling unmasked, and the output carried a stale nodatavals=(-9999,) attr alongside a fresh nodata=NaN. xrspatial.resample already handles nodatavals, so the inconsistency was local to reproject.
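
To illustrate why the sentinel pixels slipped through, here is a small pure-Python sketch of a validity mask. The `valid_mask` helper and the sample pixel values are hypothetical and are not part of xrspatial; they only mimic the effect of assuming nodata is NaN versus reading it from `nodatavals`:

```python
import math

# Hypothetical pixel row with one -9999 sentinel, plus rasterio-style attrs.
pixels = [1.0, 2.0, -9999.0, 4.0]
attrs = {"nodatavals": (-9999,)}


def valid_mask(values, nodata):
    """Return True for each pixel that holds data under the given nodata."""
    if math.isnan(nodata):
        # NaN never compares equal to anything, so it needs isnan().
        return [not math.isnan(v) for v in values]
    return [v != nodata for v in values]


# nodatavals ignored, nodata assumed to be NaN: the sentinel counts as data.
mask_assuming_nan = valid_mask(pixels, math.nan)

# Sentinel taken from nodatavals: the -9999 pixel is masked out.
mask_with_sentinel = valid_mask(pixels, float(attrs["nodatavals"][0]))

print(mask_assuming_nan)   # [True, True, True, True]
print(mask_with_sentinel)  # [True, True, False, True]
```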

Test plan

  • pytest xrspatial/tests/test_reproject.py -- 200 passing including 9 new metadata tests
  • Cross-backend probe (numpy, cupy, dask+numpy, dask+cupy) confirms attrs['nodata'] and attrs['nodatavals'] agree on output for all four paths
  • 3D (y, x, band) input to geoid_height_raster returns a 2D (y, x) array with input crs preserved
  • pytest xrspatial/tests/test_resample.py -- 169 passing (no regressions from the shared _detect_nodata change)

Found during deep-sweep / metadata pass on the reproject module.

geoid_height_raster previously dropped all input attrs and used
raster.dims[-2:] as the output dims. The latter produced garbage for
3D (y, x, band) rasters: the output came out shaped (4, 3) with dims
('x', 'band') instead of (4, 4) with dims ('y', 'x'). Use
_find_spatial_dims so the y/x axes are resolved regardless of layout,
and carry input attrs forward so crs / res / transform survive.
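
A rough sketch of resolving the spatial axes by name rather than by position. This is not the actual `_find_spatial_dims` implementation; the function names and the candidate dimension names below are assumptions made for illustration:

```python
# Assumed candidate names; the real helper may recognise a different set.
Y_NAMES = ("y", "lat", "latitude")
X_NAMES = ("x", "lon", "longitude")


def find_spatial_dims_sketch(dims):
    """Return (ydim, xdim) by name, wherever they sit in dims."""
    ydim = next((d for d in dims if d in Y_NAMES), None)
    xdim = next((d for d in dims if d in X_NAMES), None)
    if ydim is None or xdim is None:
        raise ValueError(f"could not find spatial dims in {dims!r}")
    return ydim, xdim


def spatial_shape(dims, shape):
    """Return the (rows, cols) shape of the y/x plane for any layout."""
    ydim, xdim = find_spatial_dims_sketch(dims)
    return shape[dims.index(ydim)], shape[dims.index(xdim)]


print(find_spatial_dims_sketch(("y", "x", "band")))  # ('y', 'x')
print(spatial_shape(("y", "x", "band"), (4, 4, 3)))  # (4, 4)
print(spatial_shape(("band", "y", "x"), (3, 4, 4)))  # (4, 4)
```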

reproject and merge previously ignored attrs['nodatavals'] (rasterio's
plural convention) unless rioxarray happened to be installed and its
accessor picked it up. Without rioxarray, a raster carrying only
nodatavals was treated as if nodata was NaN, so the sentinel pixels
silently survived resampling. _detect_nodata now consults nodatavals
after _FillValue / nodata / missing_value, and the output keeps the
nodatavals tuple consistent with the resolved nodata.
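
The lookup described above might look roughly like the following. This is a hedged sketch, not the code in this PR: `detect_nodata_sketch` is a hypothetical name, the relative order of the three singular keys is assumed from the order they are listed in, and skipping `None` entries in `nodatavals` is also an assumption:

```python
import math


def detect_nodata_sketch(attrs):
    """Resolve a single nodata value from raster attrs, defaulting to NaN."""
    # Singular conventions are consulted first (order assumed).
    for key in ("_FillValue", "nodata", "missing_value"):
        value = attrs.get(key)
        if value is not None:
            return float(value)
    # Fall back to rasterio's plural convention: first usable entry.
    for value in attrs.get("nodatavals") or ():
        if value is not None:
            return float(value)
    return math.nan


print(detect_nodata_sketch({"nodatavals": (-9999,)}))               # -9999.0
print(detect_nodata_sketch({"nodata": 0, "nodatavals": (-9999,)}))  # 0.0
print(detect_nodata_sketch({}))                                     # nan
```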

Tests cover both bugs across 2D and 3D inputs and verify attrs
survive on numpy, dask, cupy, and dask+cupy backends.
github-actions (bot) added the performance label (PR touches performance-sensitive code) on May 11, 2026
brendancol requested a review from Copilot on May 11, 2026 11:45

Labels

performance PR touches performance-sensitive code

Development

Successfully merging this pull request may close these issues.

  • reproject ignores attrs['nodatavals'] when rioxarray is not installed
  • geoid_height_raster drops input attrs and mishandles 3D rasters
