Description
Pandas version checks
I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
Issue 1

import pyarrow as pa

array = pa.array([1.5, 2.5], type=pa.float64())
array.to_pandas(types_mapper={pa.float64(): pa.int64()}.get)

ArrowInvalid: Float value 1.5 was truncated converting to int64

Issue 2

import pandas as pd
import pyarrow as pa
from decimal import Decimal

df = pd.DataFrame({"a": [Decimal("123.00")]}, dtype="string[pyarrow]")
df.to_parquet("decimal.pq", schema=pa.schema([("a", pa.decimal128(5))]))
result = pd.read_parquet("decimal.pq")
expected = pd.DataFrame({"a": ["123"]}, dtype="string[python]")
pd.testing.assert_frame_equal(result, expected)

AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="a") are different

Attribute "dtype" are different
[left]:  object
[right]: string[python]
Issue Description
Two issues have been observed when using pandas 2.2.3 with pyarrow >= 18.0.0:
Failing test cases: pandas/tests/extension/test_arrow.py::test_from_arrow_respecting_given_dtype_unsafe and pandas/tests/io/test_parquet.py::TestParquetPyArrow::test_roundtrip_decimal
Stricter float-to-int casting causes ArrowInvalid in tests like test_from_arrow_respecting_given_dtype_unsafe.
Decimal roundtrip mismatch: test_roundtrip_decimal fails due to dtype mismatches (object vs. string[python]) when reading back a decimal column written with a specified pyarrow schema.
These issues were not present with pyarrow==17.x.
Expected Behavior
Float-to-int casting should either handle truncation as older pyarrow versions did, or the affected tests should be skipped or adjusted for pyarrow >= 18.
Decimal roundtrips through parquet should preserve the pandas dtype, or the documentation should state clearly that type coercion is expected.
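One way the "skip/adjust" option could look is a version gate on the installed pyarrow. The sketch below assumes the 18.0.0 boundary stated in this report; `pyarrow_major` and `expects_truncation_error` are hypothetical helper names used for illustration and are not part of pandas or pyarrow.

```python
def pyarrow_major(version: str) -> int:
    """Return the major component of a version string such as '19.0.1'."""
    return int(version.split(".")[0])


def expects_truncation_error(version: str) -> bool:
    # Hypothetical helper: per this report, the float-to-int test raises
    # ArrowInvalid with pyarrow >= 18 and passed with pyarrow 17.x.
    return pyarrow_major(version) >= 18


print(expects_truncation_error("19.0.1"))
print(expects_truncation_error("17.0.0"))
```

In a test module, the result of `expects_truncation_error(pa.__version__)` could then drive a conditional xfail or an alternate expected value.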
Installed Versions
python : 3.11.11
pandas : 2.2.3
pyarrow : 19.0.1