BUG: DataFrame.rank() loses ExtensionArray dtype (GH#52829) by maheshmakvana · Pull Request #65120 · pandas-dev/pandas

maheshmakvana · 2026-04-08T11:21:33Z

Problem

DataFrame.rank() silently converts ExtensionArray-backed columns (e.g. ArrowDtype, nullable Int64) to float64, while Series.rank() on the same data returns an ExtensionArray-backed result (uint64[pyarrow] in the example below).

import pandas as pd
import pyarrow as pa

s = pd.Series([1, 2], dtype=pd.ArrowDtype(pa.int32()))
df = s.to_frame(name="a")

s.rank(method="min").dtype        # uint64[pyarrow] (Arrow-backed dtype kept)
df.rank(method="min")["a"].dtype  # float64 before the fix,
                                  # uint64[pyarrow] after the fix

Root cause

In the ranker() helper inside NDFrame.rank(), the DataFrame branch used data.values, which consolidates all blocks into a single numpy array and strips the EA type information.
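The consolidation described above can be observed through public API alone. The snippet below is a minimal sketch, not code from the PR; it uses a nullable Int64 column (an assumption chosen so that pyarrow is not required) and compares per-column access with the 2-D `.values` access.

```python
import numpy as np
import pandas as pd

# Nullable Int64 is an ExtensionArray dtype that needs no optional dependencies.
df = pd.DataFrame({"a": pd.array([3, 1, 2], dtype="Int64")})

column_array = df["a"].array  # per-column access keeps the ExtensionArray
consolidated = df.values      # 2-D access converts to a plain numpy array

print(type(column_array).__name__)  # IntegerArray
print(type(consolidated).__name__)  # ndarray
print(consolidated.shape)           # (3, 1)
```

Once the data has passed through `.values`, nothing downstream can tell that the column was ExtensionArray-backed, which is consistent with the float64 result reported in the Problem section.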

Fix

For axis=0 (default), use _mgr.apply() to process each block independently:

1-D EAs (ArrowExtensionArray, IntegerArray, etc.) call their own _rank(), preserving dtype.
numpy blocks and 2-D EAs (DatetimeArray, TimedeltaArray) are transposed, ranked with algos.rank(axis=0), and transposed back.
axis=1 keeps the old data.values path (cross-column ranking must see all columns at once).
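The transpose, rank, transpose-back step for numpy blocks can be sketched as follows. This is an illustration under stated assumptions, not the PR's code: it assumes a block holds its values with shape (n_columns, n_rows), and `rank_min_axis0` is a hypothetical stand-in for `algos.rank(..., axis=0, method="min")` that only handles NaN-free input.

```python
import numpy as np
import pandas as pd


def rank_min_axis0(values: np.ndarray) -> np.ndarray:
    """Hypothetical helper: "min" ranks down each column of a NaN-free 2-D array."""
    # rank = 1 + number of strictly smaller entries in the same column
    smaller = values[:, None, :] > values[None, :, :]
    return smaller.sum(axis=1) + 1


# Assumed block layout: one row per column of the frame, shape (n_columns, n_rows).
block = np.array([[30, 10, 20],   # column "a"
                  [1, 1, 5]])     # column "b"

# Transpose to (n_rows, n_columns), rank down each column, transpose back.
ranked_block = rank_min_axis0(block.T).T
print(ranked_block.tolist())  # [[3, 1, 2], [1, 1, 3]]

# Cross-check against the public API on an equivalent numpy-backed frame.
df = pd.DataFrame({"a": block[0], "b": block[1]})
print(np.array_equal(df.rank(method="min").to_numpy().T, ranked_block))  # True
```

The ranked block comes back in the same (n_columns, n_rows) layout it started in, so it can replace the original block without reshaping.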

Closes #52829

When calling .rank() on a DataFrame whose columns use ExtensionArray dtypes (e.g. ArrowDtype, nullable Int64), the result dtype was silently converted to float64 because the 2-D code path used data.values, which consolidates all blocks into a single numpy array.

Fix: for axis=0 (the default), use _mgr.apply() to process each block independently. 1-D EAs (ArrowExtensionArray, IntegerArray, etc.) are handled by their own _rank() method, preserving dtype. numpy blocks and 2-D EAs (DatetimeArray, TimedeltaArray) are transposed before calling algos.rank so that the ranking axis is correct, then transposed back. For axis=1 the old data.values consolidation path is kept because cross-column ranking cannot be done block-by-block.

Closes: pandas-dev#52829
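The axis distinction drawn in the commit message can be checked with public API on a plain numpy-backed frame. This is a small sketch with made-up data, not part of the PR: with axis=0 each column is ranked on its own, so a per-column loop reproduces DataFrame.rank(), whereas axis=1 compares values across columns within each row.

```python
import pandas as pd

df = pd.DataFrame({"a": [30, 10, 20], "b": [1, 1, 5]})

# axis=0: ranking decomposes into independent per-column rankings.
per_column = pd.DataFrame({name: df[name].rank(method="min") for name in df.columns})
print(per_column.equals(df.rank(method="min")))  # True

# axis=1: every row is ranked across columns, so all columns are needed at once.
across = df.rank(axis=1, method="min")
print(across["a"].tolist())  # [2.0, 2.0, 2.0]
print(across["b"].tolist())  # [1.0, 1.0, 1.0]
```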

maheshmakvana wants to merge 1 commit into pandas-dev:main from maheshmakvana:fix-52829-dataframe-rank-preserve-ea-dtype
