DOC: document CSV round-trip limitation with MultiIndex columns and unnamed index (GH#56929)#65996

Draft
jbrockmendel wants to merge 1 commit into
pandas-dev:mainfrom
jbrockmendel:bug-56929

Conversation

@jbrockmendel

Member

closes #56929

When a DataFrame with MultiIndex columns has an unnamed index, to_csv omits the row that would hold the index names. If the first data row is then entirely missing, it is written identically to that omitted row, so read_csv cannot tell them apart and consumes the first data row as the column index names, silently dropping that row.

A reader-side fix is not possible: the two rows are byte-identical in pandas' own format, so any heuristic that fixes the unnamed-index case breaks the named-index case (see the stale attempt in GH-57070). The only writer-side fix regresses the default index_col=None read. This PR therefore documents the limitation in the IO user guide, along with the two workarounds: pass a non-empty na_rep, or name the index.
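
For illustration, here is a minimal sketch of the ambiguity and the two workarounds. The frames, the column labels, and the index name `"0"` are made up for this example and are not taken from the issue; it only relies on `to_csv`, `read_csv`, and `rename_axis` as they exist in current pandas.

```python
import io

import numpy as np
import pandas as pd

cols = pd.MultiIndex.from_tuples([("a", "x"), ("a", "y")])

# Unnamed index; the first data row is entirely missing.
unnamed = pd.DataFrame([[np.nan, np.nan], [1.0, 2.0]], columns=cols)

# A different (hypothetical) frame: one data row, and an index named "0".
named = pd.DataFrame([[1.0, 2.0]], index=pd.Index([1], name="0"), columns=cols)

# Both frames serialize to the same text, so a reader cannot tell them apart.
ambiguous = unnamed.to_csv()
same_bytes = ambiguous == named.to_csv()

# Workaround 1: a non-empty na_rep keeps the first data row from looking
# like an index-names row.
with_na_rep = unnamed.to_csv(na_rep="NaN")
roundtrip = pd.read_csv(io.StringIO(with_na_rep), header=[0, 1], index_col=0)

# Workaround 2: naming the index makes to_csv write the index-names row,
# so the all-missing data row is no longer the first row after the header.
with_name = unnamed.rename_axis("idx").to_csv()

print(same_bytes)
print(roundtrip.shape)
print(with_name)
```

The sketch deliberately does not call `read_csv` on the ambiguous text, since what the reader does with it is exactly the behavior this PR documents as a limitation.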

…nnamed index (GH#56929)

When a DataFrame with MultiIndex columns has an unnamed index, to_csv omits
the index-names row. If the first data row is entirely missing, it is written
identically to that omitted row, so read_csv consumes it as the column index
names and silently drops the row. Document this in the IO user guide along
with the na_rep / named-index workarounds.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@jbrockmendel jbrockmendel added Docs IO CSV read_csv, to_csv labels Jun 23, 2026

@DTiming24 DTiming24 left a comment

Helpful warning and the workaround is clear. One small thought: using a more obviously sentinel-ish na_rep example than NaN (for example <NA> or __MISSING__) might avoid confusion with literal data values, but the substance looks solid.
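
As a rough sketch of what a sentinel-style na_rep could look like (the frame and the `SENTINEL` name are hypothetical, not taken from the PR's docs): since `__MISSING__` is not one of the strings `read_csv` treats as missing by default, the reader would need it passed via `na_values`.

```python
import io

import numpy as np
import pandas as pd

SENTINEL = "__MISSING__"  # hypothetical marker, chosen so it cannot be real data

cols = pd.MultiIndex.from_tuples([("a", "x"), ("a", "y")])
df = pd.DataFrame([[np.nan, np.nan], [1.0, 2.0]], columns=cols)

# Write missing values as the sentinel so the first data row is not blank.
text = df.to_csv(na_rep=SENTINEL)
first_data_line = text.splitlines()[2]

# Tell the reader to map the sentinel back to NaN.
restored = pd.read_csv(
    io.StringIO(text), header=[0, 1], index_col=0, na_values=[SENTINEL]
)

print(first_data_line)
print(restored)
```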

Development

Successfully merging this pull request may close these issues.

BUG: KeyError when loading csv with NaNs

2 participants