BUG: DataFrame.to_dict(orient="index", into=...) does not apply into to nested mappings #65778

Description

@parasky98

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd


class MyDict(dict):
    pass


df = pd.DataFrame(
    {
        "A": [1, 2],
        "B": ["x", "y"],
    }
)

out = df.to_dict("index", into=MyDict)

print(out)
print(type(out))
print(type(out[0]))

assert isinstance(out, MyDict)
assert isinstance(out[0], MyDict)  # fails

Issue Description

Running the code above raises an AssertionError on the second assertion:

{0: {'A': 1, 'B': 'x'}, 1: {'A': 2, 'B': 'y'}}
<class '__main__.MyDict'>
<class 'dict'>
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
Cell In[19], line 22
     18 print(type(out))
     19 print(type(out[0]))
     20 
     21 assert isinstance(out, MyDict)
---> 22 assert isinstance(out[0], MyDict)  # fails

AssertionError:

This suggests that DataFrame.to_dict(..., into=...) applies into only to the outer mapping when orient="index".

In the example above, the outer returned object is MyDict, but each nested row mapping is a plain dict.

I checked whether this behavior is intended. The docstrings of both functions

pandas.core.frame.DataFrame.to_dict()
pandas.core.methods.to_dict.to_dict()

describe the result as dict-like but do not say whether nested mappings should also respect into:

    Parameters
    ----------
    orient : str {'dict', 'list', 'series', 'split', 'tight', 'records', 'index'}
        Determines the type of the values of the dictionary.
    ...
    - 'index' : dict like {index -> {column -> value}}

I also inspected the to_dict implementation in pandas.core.methods.to_dict.
I believe this behavior comes from the following three return paths in the elif orient == "index" branch:

  • in two of the paths, a plain dict is constructed directly for each row:
            return into_c(
                (t[0], dict(zip(df.columns, map(maybe_box_native, t[1:]), strict=True)))
                for t in df.itertuples(name=None)
            )
            return into_c(
                (t[0], dict(zip(columns, t[1:], strict=True)))
                for t in df.itertuples(name=None)
            )
  • a dict comprehension is used directly, bypassing into_c for the nested mapping:
            return into_c(
                (
                    t[0],
                    {
                        column: maybe_box_native(v)
                        if i in object_dtype_indices_as_set
                        else v
                        for i, (column, v) in enumerate(
                            zip(columns, t[1:], strict=True)
                        )
                    },
                )
                for t in df.itertuples(name=None)
            )

This could be fixed by using the standardized mapping constructor into_c for the nested row mappings as well. For example, the dict-comprehension branch could be changed to:

            return into_c(
                (
                    t[0],
                    into_c(
                        (
                            column,
                            maybe_box_native(v)
                            if i in object_dtype_indices_as_set
                            else v,
                        )
                        for i, (column, v) in enumerate(
                            zip(columns, t[1:], strict=True)
                        )
                    ),
                )
                for t in df.itertuples(name=None)
            )

With this change, the following code passes:

import pandas as pd
from pandas.core.methods.to_dict import to_dict


class MyDict(dict):
    pass


df = pd.DataFrame(
    {
        "A": [1, 2],
        "B": ["x", "y"],
    }
)

# Call the module-level function directly so the patched code path is used.
out = to_dict(df, "index", into=MyDict)

print(out)
print(type(out))
print(type(out[0]))

assert isinstance(out, MyDict)
assert isinstance(out[0], MyDict)  # success

Output:

{0: {'A': 1, 'B': 'x'}, 1: {'A': 2, 'B': 'y'}}
<class '__main__.MyDict'>
<class '__main__.MyDict'>

I would be happy to open a pull request for this.

Expected Behavior

Both the outer mapping and each nested row mapping should be instances of the type passed as into:

result = df.to_dict(orient="index", into=MyDict)

type(result)
# <class '__main__.MyDict'>

type(result[0])
# <class '__main__.MyDict'>

In other words, the result should be equivalent to:

MyDict(
    {
        0: MyDict({"A": 1, "B": "x"}),
        1: MyDict({"A": 2, "B": "y"}),
    }
)
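
Until the behavior changes, the expected result can be obtained by re-wrapping the nested mappings after the call. This is a minimal sketch of such a workaround: rewrap_rows is a hypothetical helper, not a pandas API, and the current value below is written out by hand to match the output shown in this report rather than produced by pandas.

```python
def rewrap_rows(result, into):
    """Return a new `into` mapping whose values are also `into` mappings.

    Hypothetical workaround helper; not part of pandas.
    """
    return into((key, into(row)) for key, row in result.items())


class MyDict(dict):
    pass


# Shape reported above for df.to_dict(orient="index", into=MyDict):
# MyDict on the outside, plain dict for each row.
current = MyDict({0: {"A": 1, "B": "x"}, 1: {"A": 2, "B": "y"}})

fixed = rewrap_rows(current, MyDict)

assert isinstance(fixed, MyDict)
assert isinstance(fixed[0], MyDict)
```

This costs one extra pass over the rows, so it is a stopgap rather than a substitute for applying into inside to_dict itself.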

Installed Versions

INSTALLED VERSIONS

commit : 72f2fea
python : 3.14.4
python-bits : 64
OS : Windows
OS-release : 11
Version : 10.0.26200
machine : AMD64
processor : Intel64 Family 6 Model 198 Stepping 2, GenuineIntel
byteorder : little
LC_ALL : None
LANG : None
LOCALE : Korean_Korea.utf8

pandas : 3.0.3
numpy : 2.4.3
dateutil : 2.9.0.post0
pip : 26.0.1
Cython : None
sphinx : None
IPython : 9.12.0
adbc-driver-postgresql: None
adbc-driver-sqlite : None
bs4 : 4.14.3
bottleneck : None
fastparquet : 2025.12.0
fsspec : 2026.4.0
html5lib : None
hypothesis : None
gcsfs : None
jinja2 : 3.1.6
lxml.etree : 6.0.2
matplotlib : 3.10.8
numba : 0.65.1
numexpr : None
odfpy : None
openpyxl : 3.1.5
psycopg2 : 2.9.11
pymysql : None
pyarrow : 21.0.0
pyiceberg : None
pyreadstat : None
pytest : None
python-calamine : None
pytz : 2026.1.post1
pyxlsb : None
s3fs : None
scipy : 1.17.1
sqlalchemy : 2.0.49
tables : None
tabulate : None
xarray : None
xlrd : None
xlsxwriter : 3.2.9
zstandard : 0.25.0
qtpy : None
pyqt5 : None

Labels: Bug, IO Data
