fix: handle non-numpy dtypes in PandasIndex.concat() and join() by C1-BA-B1-F3 · Pull Request #11409 · pydata/xarray

C1-BA-B1-F3 · 2026-06-26T01:36:32Z

Description

This PR fixes an issue where concat fails when mixing string coordinates from different sources (e.g., numpy string dtype and pandas StringDtype).

Problem

When concatenating DataArrays where one has a numpy string dtype coordinate and another has a pandas StringDtype coordinate (introduced by pd.Index in pandas 3.0), np.result_type() fails with:

TypeError: Cannot interpret '' as a data type

Root Cause

In PandasIndex.concat() and PandasIndex.join(), np.result_type() is called with coordinate dtypes without checking if they are valid numpy dtypes first. Pandas extension dtypes (like StringDtype) are not valid numpy dtypes.

Fix

Check if all dtypes are valid numpy dtypes before calling np.result_type()
Fall back to object dtype if any dtype is not a valid numpy dtype
Added regression test for the fix
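
The check described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the helper name `common_dtype` and the stand-in class `FakeExtensionDtype` are hypothetical, and the sketch assumes that "valid numpy dtype" means "is an instance of `np.dtype`".

```python
import numpy as np


class FakeExtensionDtype:
    """Hypothetical stand-in for a pandas extension dtype such as StringDtype."""


def common_dtype(dtypes):
    """Return a common numpy dtype, or object dtype if any input is not a numpy dtype.

    Assumption: a dtype counts as "valid" only when it is an np.dtype instance.
    """
    if all(isinstance(dt, np.dtype) for dt in dtypes):
        # Safe: every argument is something np.result_type() understands.
        return np.result_type(*dtypes)
    # At least one extension dtype is present, so fall back to object.
    return np.dtype(object)


# Two numpy string dtypes promote normally.
numpy_only = common_dtype([np.dtype("<U3"), np.dtype("<U5")])

# Mixing in a non-numpy dtype falls back to object instead of raising.
mixed = common_dtype([np.dtype("<U3"), FakeExtensionDtype()])
```

Falling back to object dtype loses the specific string type but keeps the concatenation from failing, which matches the behavior the list above describes.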

Fixes GH#11317

Tests

Added test_concat_string_dtype_from_pd_index regression test
All existing concat tests pass (140 passed, 2 skipped)
All existing indexes tests pass (75 passed, 2 skipped)

Problem: When a MultiIndex level contains tuple-valued entries (e.g., (1,1)), selecting with a nested tuple key like ((1,1), 2) incorrectly preserved the dimension instead of collapsing it to a scalar result.

Root cause: _is_nested_tuple() was checking for 'tuple' in addition to 'list' and 'slice', which caused it to misidentify tuple-valued keys as nested selection tuples.

Fix: Remove 'tuple' from the isinstance check in _is_nested_tuple() so that only 'list' and 'slice' are treated as indicators of nested selections. Tuple-valued keys in MultiIndex levels are now correctly handled as scalar key values.

Added regression test for selecting with nested tuple keys on MultiIndex with tuple-valued levels.
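
To make the change to the isinstance check concrete, here is a small sketch of the two behaviors. The function names `is_nested_tuple_before` and `is_nested_tuple_after` are hypothetical and the bodies are reconstructed from the commit message only, so they may differ from the actual xarray source.

```python
def is_nested_tuple_before(key):
    """Sketch of the old check: inner tuples also count as nested selections."""
    return isinstance(key, tuple) and any(
        isinstance(value, (tuple, list, slice)) for value in key
    )


def is_nested_tuple_after(key):
    """Sketch of the new check: only inner lists and slices count."""
    return isinstance(key, tuple) and any(
        isinstance(value, (list, slice)) for value in key
    )


# A key whose first element is a tuple-valued level entry.
scalar_key = ((1, 1), 2)
# A key that really is a nested selection (list of labels on the first level).
nested_key = ([1, 2], 2)

old_result = is_nested_tuple_before(scalar_key)  # misclassified as nested
new_result = is_nested_tuple_after(scalar_key)   # treated as a scalar key
still_nested = is_nested_tuple_after(nested_key)  # lists are still detected
```

Under this sketch, only the tuple-valued key changes classification; keys containing lists or slices are treated the same before and after.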

When concatenating indexes with mixed string types (e.g., numpy string dtype and pandas StringDtype), np.result_type() fails because it cannot interpret extension dtypes. This fix checks if all dtypes are valid numpy dtypes before calling np.result_type(), falling back to object dtype if not. Fixes GH#11317


C1-BA-B1-F3 added 2 commits June 26, 2026 09:26

github-actions Bot added the topic-indexing label Jun 26, 2026

[pre-commit.ci] auto fixes from pre-commit.com hooks

c19bf8e

for more information, see https://pre-commit.ci

C1-BA-B1-F3 wants to merge 3 commits into pydata:main from C1-BA-B1-F3:fix-concat-string-dtype
