DOC: document CSV round-trip limitation with MultiIndex columns and unnamed index (GH#56929) #65996
Draft
jbrockmendel wants to merge 1 commit into
…nnamed index (GH#56929)

When a DataFrame with MultiIndex columns has an unnamed index, to_csv omits the index-names row. If the first data row is entirely missing, it is written identically to that omitted row, so read_csv consumes it as the column index names and silently drops the row. Document this in the IO user guide along with the na_rep / named-index workarounds.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
DTiming24
reviewed
Jun 25, 2026
Helpful warning and the workaround is clear. One small thought: using a more obviously sentinel-ish na_rep example than NaN (for example <NA> or __MISSING__) might avoid confusion with literal data values, but the substance looks solid.
closes #56929
When a DataFrame with MultiIndex columns has an unnamed index, to_csv omits the row that would hold the index names. If the first data row is then entirely missing, it is written identically to that omitted row, so read_csv cannot tell them apart and consumes the first data row as the column index names, silently dropping that row.

A reader-side fix is not possible (the two rows are byte-identical in pandas' own format, so any heuristic that fixes the unnamed case breaks the named-index case; see the stale attempt in GH-57070), and the only writer-side fix regresses the default index_col=None read. This documents the limitation in the IO user guide along with the workarounds (non-empty na_rep, or naming the index).
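
For readers who want to try this locally, here is a minimal sketch of the round trip and the two workarounds described above. The frame, the column labels, the index name `idx`, and the `round_trip` helper are all made up for illustration and are not taken from the user guide text; the read side assumes `header=[0, 1]` and `index_col=0`.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical frame: two-level columns, an unnamed default index,
# and a first data row in which every value is missing.
columns = pd.MultiIndex.from_tuples([("a", "x"), ("a", "y")])
df = pd.DataFrame([[np.nan, np.nan], [1.0, 2.0]], columns=columns)


def round_trip(frame, **to_csv_kwargs):
    """Write frame to CSV text and read it back with a two-row header."""
    text = frame.to_csv(**to_csv_kwargs)
    return pd.read_csv(io.StringIO(text), header=[0, 1], index_col=0)


# Default write: no index-names row is emitted, and the all-missing
# first data row has empty value fields, which is the ambiguous case.
csv_default = df.to_csv()
back_default = round_trip(df)
print(csv_default)
print("rows written:", len(df), "rows read back:", len(back_default))

# Workaround 1: a non-empty na_rep, so the missing row is no longer
# written as a run of empty fields.
back_na_rep = round_trip(df, na_rep="NaN")
print("rows read back with na_rep:", len(back_na_rep))

# Workaround 2: name the index, so to_csv writes an explicit
# index-names row ahead of the data rows.
named = df.rename_axis(index="idx")
back_named = round_trip(named)
print("rows read back with named index:", len(back_named))
```

Comparing the printed row counts shows whether the first data row survived the round trip in each case. As suggested in the review comment, a sentinel other than NaN can be passed as na_rep; a sentinel that read_csv does not treat as missing by default would also need to be passed via na_values when reading.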