
Allow concatenation of string coordinates of differing width (#6676) #7125

Merged: trexfeathers merged 2 commits into SciTools:main from gaoflow:fix/concat-string-coord-widths on Jun 26, 2026

@gaoflow gaoflow commented Jun 1, 2026

🚀 Pull Request

Description

Closes #6676.

Two cubes whose string coordinates differ only in width (and therefore in dtype, e.g. <U1 vs <U5) could not be concatenated:

import iris.coords
import iris.cube

# Cube with a narrow string aux-coord (dtype <U1).
cube_a = iris.cube.Cube(
    [0, 1], long_name="test",
    dim_coords_and_dims=[(iris.coords.DimCoord([0, 1], long_name="dim"), 0)],
    aux_coords_and_dims=[(iris.coords.AuxCoord(["1", "2"], long_name="example"), 0)],
)
# Cube with a wider string aux-coord (dtype <U5).
cube_b = iris.cube.Cube(
    [10, 12, 13], long_name="test",
    dim_coords_and_dims=[(iris.coords.DimCoord([10, 11, 12], long_name="dim"), 0)],
    aux_coords_and_dims=[(iris.coords.AuxCoord(["1", "123", "12345"], long_name="example"), 0)],
)
iris.cube.CubeList([cube_a, cube_b]).concatenate_cube()

Before this change, the call raised:

ConcatenateError: failed to concatenate into a single cube.
  Auxiliary coordinates metadata differ: example != example

Cause

When building the coordinate signature, _CoordMetaData records each coordinate's points_dtype/bounds_dtype, and __eq__ compares them exactly. For string coordinates, <U1 and <U5 are different dtypes, so the metadata was reported as differing even though the coordinates are otherwise compatible.
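
The dtype mismatch described above can be reproduced with numpy alone. This is a minimal sketch, not Iris code: the variable names are illustrative, and it only shows that an exact dtype comparison distinguishes string widths while the dtype kind does not.

```python
import numpy as np

# dtypes numpy infers for the two example coordinates' points
narrow = np.array(["1", "2"]).dtype            # one-character strings
wide = np.array(["1", "123", "12345"]).dtype   # up to five characters

# An exact dtype comparison treats the two widths as different types...
widths_differ = narrow != wide
# ...even though both are unicode string dtypes of kind "U".
same_kind = narrow.kind == wide.kind == "U"

print(widths_differ, same_kind)
```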

Fix

When comparing coordinate signatures, collapse string dtypes (kind "U"/"S") to their kind, so that differing widths no longer block concatenation. numpy promotes the points to a common width when they are joined, so the result coordinate gets the wider dtype (<U5 above). Genuine dtype-kind differences (e.g. string vs integer) are unaffected and still rejected.
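
A rough sketch of the idea, assuming a hypothetical helper named `comparable_dtype` (it is not part of Iris, and the real change may be structured differently): string dtypes are reduced to their kind before comparison, and numpy itself picks the wider string dtype when the arrays are joined.

```python
import numpy as np


def comparable_dtype(dtype):
    """Hypothetical helper: return a width-insensitive key for string dtypes."""
    dtype = np.dtype(dtype)
    if dtype.kind in ("U", "S"):
        # Collapse every string width to just its kind ("U" or "S").
        return dtype.kind
    # Non-string dtypes keep their full type string, e.g. "<i8".
    return dtype.str


a = np.array(["1", "2"])
b = np.array(["1", "123", "12345"])

# Differing widths now compare as equal...
strings_match = comparable_dtype(a.dtype) == comparable_dtype(b.dtype)
# ...while string vs integer is still a mismatch.
kinds_differ = comparable_dtype(a.dtype) != comparable_dtype(np.dtype("int64"))

# numpy promotes to the wider string dtype when joining the points.
joined = np.concatenate([a, b])
print(strings_match, kinds_differ, joined.dtype)
```

Comparing on the kind rather than casting up front leaves the choice of final width to numpy's own promotion at join time.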

Verification

  • New TestStringAuxCoordWidths:
    • test_different_widths — the two cubes above now concatenate; the result coordinate has dtype <U5 and the expected points. Fails on main, passes here.
    • test_different_dtype_kind_still_rejected — string-vs-integer coordinates still raise ConcatenateError.
  • Full tests/unit/concatenate/, tests/integration/concatenate/ and tests/test_concatenate.py suites pass (283 + 54 tests).
  • ruff check / ruff format clean.

@gaoflow gaoflow force-pushed the fix/concat-string-coord-widths branch from 98e8b19 to 0e95f22 on June 18, 2026 21:40
codecov Bot commented Jun 18, 2026

Codecov Report

❌ Patch coverage is 50.00000% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 90.15%. Comparing base (746f0a6) to head (6591a46).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| lib/iris/_concatenate.py | 50.00% | 1 Missing and 1 partial ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #7125      +/-   ##
==========================================
- Coverage   90.15%   90.15%   -0.01%     
==========================================
  Files          91       91              
  Lines       24985    24989       +4     
  Branches     4685     4687       +2     
==========================================
+ Hits        22526    22528       +2     
- Misses       1682     1683       +1     
- Partials      777      778       +1     

@trexfeathers trexfeathers self-assigned this Jun 25, 2026

@trexfeathers trexfeathers left a comment

Thanks @gaoflow! A few comments for you this time

Comment thread lib/iris/_concatenate.py Outdated
Comment thread lib/iris/_concatenate.py Outdated
Comment thread lib/iris/_concatenate.py Outdated
Comment thread lib/iris/tests/unit/concatenate/test_concatenate.py Outdated
Comment thread lib/iris/tests/unit/concatenate/test_concatenate.py Outdated
gaoflow added 2 commits June 26, 2026 13:34
Concatenation compared coordinate dtypes exactly, so two cubes whose
string coordinate differed only in width (e.g. '<U1' vs '<U5') were
reported as having differing metadata and could not be concatenated.
When comparing the coordinate signatures, collapse string dtypes to
their kind so differing widths no longer block concatenation; numpy
promotes the points to a common width when they are joined. Genuine
dtype-kind differences (e.g. string vs integer) are still rejected.

Fixes SciTools#6676.
@gaoflow gaoflow force-pushed the fix/concat-string-coord-widths branch from 0e95f22 to 6591a46 on June 26, 2026 11:36
gaoflow commented Jun 26, 2026

Updated in 6591a468 and rebased onto current main.

Review follow-up:

  • moved string-width normalization into _CoordMetaData.__new__, so the metadata stores the normalized dtype consistently instead of special-casing __eq__
  • shortened the implementation comment
  • reused the test helper for the integer-coordinate case
  • matched the existing (result,) = concatenate(..., True) style
  • replaced the hand-written latest.rst entry with changelog/7125.bugfix.rst
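
For illustration, normalizing in `__new__` rather than in `__eq__` could look roughly like the sketch below. `CoordMetaDataSketch`, its three fields, and `_normalise_dtype` are hypothetical stand-ins; the real `_CoordMetaData` in `lib/iris/_concatenate.py` has its own fields and may store a different normalized value.

```python
from collections import namedtuple

import numpy as np


def _normalise_dtype(dtype):
    """Hypothetical helper: drop the width from string dtypes."""
    if dtype is None:
        return None
    dtype = np.dtype(dtype)
    # Strings collapse to their kind; other dtypes keep their type string.
    return dtype.kind if dtype.kind in ("U", "S") else dtype.str


class CoordMetaDataSketch(
    namedtuple("CoordMetaDataSketch", ["name", "points_dtype", "bounds_dtype"])
):
    """Illustrative stand-in for a coordinate signature record."""

    __slots__ = ()

    def __new__(cls, name, points_dtype, bounds_dtype=None):
        # Normalize once at construction, so the default tuple equality
        # compares already-normalized dtypes with no custom __eq__.
        return super().__new__(
            cls,
            name,
            _normalise_dtype(points_dtype),
            _normalise_dtype(bounds_dtype),
        )


narrow = CoordMetaDataSketch("example", np.dtype("U1"))
wide = CoordMetaDataSketch("example", np.dtype("U5"))
integer = CoordMetaDataSketch("example", np.dtype("int64"))
print(narrow == wide, narrow == integer)
```

Because the stored value is already normalized, anything else that compares or hashes the record sees the same width-insensitive dtype.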

Verification after the rebase:

uv run --with pytest --with pytest-mock --with requests --with filelock --with-editable . python -m pytest lib/iris/tests/unit/concatenate/test_concatenate.py::TestStringAuxCoordWidths -q
uv run --with pytest --with pytest-mock --with requests --with filelock --with-editable . python -m pytest lib/iris/tests/unit/concatenate/test_concatenate.py -q
uv run --with towncrier towncrier build --draft --version 0.0
ruff check lib/iris/_concatenate.py
ruff format --check lib/iris/_concatenate.py lib/iris/tests/unit/concatenate/test_concatenate.py
git diff --check origin/main...HEAD

The full touched test file passed (43 passed).

@trexfeathers trexfeathers left a comment

Another valuable contribution, thanks @gaoflow!

@trexfeathers trexfeathers merged commit 9c5eb0a into SciTools:main Jun 26, 2026
22 checks passed
Linked issue: Concatenation doesn't support string aux-coords with different widths
