
BUG: DataFrame.combine_first loses precision for wide integers (GH#60128) #66011

Draft
jbrockmendel wants to merge 1 commit into pandas-dev:main from jbrockmendel:bug-60128

@jbrockmendel

Member

closes #60128

Reimplements DataFrame.combine_first to align rows positionally and fill column-by-column, taking values directly from the original arrays instead of reindexing self to the row union. The old path routed through align, which introduced NaN into integer columns and promoted them to float64 before the values were combined, losing precision for integers outside float64's exactly-representable range (|n| > 2**53).
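The precision loss described above can be reproduced without pandas. The sketch below is only an illustration, not code from this PR: it uses plain Python floats, which are IEEE 754 doubles like float64, and the helper name `roundtrip_through_float` is hypothetical.

```python
# Python floats are IEEE 754 doubles, the same format as float64.
# Every integer with |n| <= 2**53 is exactly representable; beyond that,
# neighbouring floats are more than 1 apart, so some integers are rounded.
LIMIT = 2**53


def roundtrip_through_float(n: int) -> int:
    """Hypothetical helper: mimic an int64 -> float64 -> int64 round trip."""
    return int(float(n))


exact = LIMIT      # still exactly representable
wide = LIMIT + 1   # not representable; rounds to the nearest double

# Values inside the exact range survive the round trip unchanged.
assert roundtrip_through_float(exact) == exact

# Values outside it silently change: 2**53 + 1 comes back as 2**53.
assert roundtrip_through_float(wide) == LIMIT
assert roundtrip_through_float(wide) != wide
```

Any path that promotes an integer column to float64 before combining is exposed to this rounding, even if the final result contains no missing values.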

Fully-covered columns now keep their dtype and exact values, including with duplicate row or column labels. This is a single code path — no special-casing of the unique vs. non-unique case.
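As a rough mental model of filling a single column without a float intermediate, consider the simplified sketch below. It is not the pandas implementation: it assumes unique row labels, models a column as a plain dict, and the names `fill_column` and `MISSING` are hypothetical.

```python
MISSING = None  # hypothetical stand-in for a missing cell


def fill_column(self_col: dict, other_col: dict) -> dict:
    """Combine one column: prefer self's value, fall back to other's.

    Values are copied as-is from the inputs, so integers never pass
    through a float representation.
    """
    # Row union: self's labels first, then labels that only other has.
    labels = list(self_col) + [k for k in other_col if k not in self_col]
    out = {}
    for label in labels:
        value = self_col.get(label, MISSING)
        if value is MISSING:
            value = other_col.get(label, MISSING)
        out[label] = value
    return out


wide = 2**53 + 1  # outside float64's exactly-representable range
left = {"a": wide}
right = {"a": 0, "b": wide + 2, "c": 7}

result = fill_column(left, right)

# Every row of the union is covered, so the column stays all-integer
# and the wide values are preserved exactly.
assert result == {"a": wide, "b": wide + 2, "c": 7}
assert all(isinstance(v, int) for v in result.values())
```

The sketch deliberately ignores duplicate labels and dtype resolution, both of which the PR description says the real change handles.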

Builds on GH-62814, which fixed the nullable Int64/UInt64 variant of this bug.

…128)

Align rows positionally and fill column-by-column from the original arrays
instead of reindexing self to the row union, which promoted integer columns
through float64 and lost precision for values outside its exactly-representable
range. Fully-covered columns now keep their dtype and exact values, including
with duplicate row or column labels.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@jbrockmendel added the Bug, Dtype Conversions, and combine/combine_first/update labels on Jun 24, 2026

Development

Successfully merging this pull request may close these issues.

BUG: Series.combine_first loss of precision