BUG: IntervalTree raises on 32-bit when n_elements > leaf_size (GH#44075) #66029

Merged
jbrockmendel merged 1 commit into pandas-dev:main from jbrockmendel:bug-44075
Jun 26, 2026

@jbrockmendel

Member

closes #44075

Also closes GH-23440 (same root cause).

Building the IntervalTree engine for an IntervalIndex with more than leaf_size (100) intervals raised TypeError: Cannot cast array data from dtype('int64') to dtype('int32') according to the rule 'safe' on 32-bit platforms, surfacing in the wild via pd.cut. The cause: take passes int64 position arrays to PyArray_Take, which requires intp indices; on 32-bit platforms intp is int32, so the safe cast is rejected. The error only triggered on the recursive (non-leaf) path, which is why more than 100 bins failed while 100 or fewer worked.
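For reference, a minimal sketch of the kind of call that reaches the recursive path, assuming the bins are passed to pd.cut as an IntervalIndex with more than 100 intervals. The breaks and values here are illustrative and not taken from the linked issues. On a 64-bit build this runs the same before and after the fix; the TypeError was specific to 32-bit.

```python
import numpy as np
import pandas as pd

# 152 breaks -> 151 right-closed intervals (0, 1], (1, 2], ..., (150, 151],
# which is more than leaf_size (100), so the tree has to recurse.
breaks = np.arange(0, 152)
bins = pd.IntervalIndex.from_breaks(breaks)

# Illustrative values that fall in the first, a middle, and the last bin.
values = np.array([0.5, 75.5, 150.5])

# With IntervalIndex bins, cut looks the values up through the interval engine.
result = pd.cut(values, bins=bins)
codes = list(result.codes)
```

Each code is the position of the matching interval in bins, and none of them is -1 (the code used for values that fall in no bin).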

The fix casts the positions to intp in take (a no-op on 64-bit, where intp is int64). This also lets us drop the 32-bit skipif and WASM xfail workarounds that were added for GH-23440, so those tests now run on the 32-bit CI.
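A rough NumPy-level illustration of the cast involved, not the Cython code changed in this PR. The helper name take_with_intp is hypothetical and exists only for this sketch.

```python
import numpy as np

# int64 -> int32 can lose information, so NumPy does not consider it "safe".
# On 32-bit platforms intp is int32, which is where this rule was hit.
int64_to_int32_is_safe = np.can_cast(np.int64, np.int32, casting="safe")


def take_with_intp(values, positions):
    """Hypothetical helper: cast positions to intp before taking.

    With copy=False the cast is a no-op when positions are already intp,
    as is the case for int64 positions on a 64-bit platform.
    """
    positions = np.asarray(positions).astype(np.intp, copy=False)
    return values.take(positions)


values = np.array([10.0, 20.0, 30.0, 40.0])
positions = np.array([3, 0, 2], dtype=np.int64)
taken = take_with_intp(values, positions)
```

The explicit astype is a deliberate cast rather than a 'safe' one, which is acceptable here on the assumption that positions into an in-memory array always fit in intp.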

…075)

PyArray_Take requires intp indices; the int64 positions we build failed
the safe int64->int32 cast on 32-bit platforms when the tree recursed
(>100 intervals), surfacing via pd.cut. Cast positions to intp in take.
This lets us drop the 32-bit/WASM skip+xfail workarounds added for
GH#23440.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
jbrockmendel added the Bug, Interval (Interval data type), and cut (cut, qcut) labels Jun 25, 2026
jbrockmendel marked this pull request as ready for review June 26, 2026 16:26
jbrockmendel merged commit 7ae83a1 into pandas-dev:main Jun 26, 2026
53 checks passed
jbrockmendel deleted the bug-44075 branch June 26, 2026 16:26
Labels

Bug, cut (cut, qcut), Interval (Interval data type)

Development

Successfully merging this pull request may close these issues.

BUG: pd.cut() sometimes puts NaNs into bins.
BUG: IntervalTree construction fails for 32bit when n_elements > leaf_size
