Release Modin 0.23.0 · modin-project/modin

Modin 0.23.0

This release upgrades the pandas version to 2.0. It also includes a speed-up for '.corr', new
features, and bug fixes.

Key Features and Updates Since 0.22.0

Stability and Bugfixes
- FIX-#1851: Squash multiple LogicalProject nodes (#6306)
- FIX-#3371: Remove pandas patch level pin (#6211)
- FIX-#4048: support sqlalchemy objects in con parameter for to_sql (#5940)
- FIX-#4485: fix 'clip' with list-like bounds and axis=None (#6344)
- FIX-#4954: defaults to pandas in read_json in case of rows having different columns (#5946)
- FIX-#5077: fix 'Series.rename_axis' signature (#6324)
- FIX-#5461: fix groupby if dataframe has empty partitions (#6307)
- FIX-#6035: Fall back to Pandas, when merging unsupported column types (#6036)
- FIX-#6085: HDK: Implemented support for datetime64 dtypes serialization (#6086)
- FIX-#6208: HDK: Added support for median aggregation (#6209)
- FIX-#6215: Process '.corr(numeric_only=False)' parameter at the qc level (#6242)
- FIX-#6218: Fix read_excel and unpin openpyxl (#6247)
- FIX-#6229: fix Series.equals/DataFrame.equals with NA entries (#6270)
- FIX-#6232: support DataFrame.cov(numeric_only=False) without fallback to pandas (#6262)
- FIX-#6237: Log errors only from deepest modin layer (#6238)
- FIX-#6245: support datetime64 with different resolutions types for HDK (#6255)
- FIX-#6246: fix 'groupby(..., as_index=False).agg(...)' case (#6263)
- FIX-#6258: Fix series to_dict (#6260)
- FIX-#6259: Fix astype("category") causing read-only buffer error (#6267)
- FIX-#6273: fix DataFrame.min/max/mean/median/skew/kurt with axis=None (#6275)
- FIX-#6297: fix experimental numpy.argmax/argmin with Nans in data (#6298)
- FIX-#6309: do not materialize axes for 'rank' operation (#6310)
- FIX-#6313: update MIN_RAY_VERSION var: 1.4.0 -> 1.13.0 (#6314)
- FIX-#6317: fix syntax error in 'push-to-master.yml' (#6318)
- FIX-#6336: pin 'pydantic<2' to fix CI (#6337)
- FIX-#6338: fix TypeError: WorksheetReader.__init__() got an unexpected keyword argument 'rich_text' (#6339)
- FIX-#6341: call _filter_empties only if shapes are different on particular axis (#6333)
- FIX-#6352: Fix the HdkOnNativeDataframePartition._width_cache property computation (#6353)
- FIX-#6354: Skip bad and pre-release versions (#6355)

Performance enhancements
- PERF-#4560: Implement '.corr()' method using MapReduce pattern (#6193)
- PERF-#6319: remove '__make_init_labels_args' explicit calls that materialize axes (#6312)
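
To illustrate what "using MapReduce pattern" can mean for a correlation (PERF-#4560), here is a minimal standard-library sketch. It is not Modin's implementation: the function names (`map_partition`, `reduce_stats`, `pearson_from_stats`) and the two-column row layout are assumptions made for illustration. Each partition is mapped to a small tuple of sums, the tuples are added together, and the Pearson coefficient is computed once from the combined sums. The textbook sums-of-products formula is used for brevity and is not numerically robust for large or badly scaled data.

```python
from math import sqrt


def map_partition(rows):
    """Map step: summarize one partition of (x, y) rows as sufficient statistics."""
    n = len(rows)
    sx = sum(x for x, _ in rows)
    sy = sum(y for _, y in rows)
    sxx = sum(x * x for x, _ in rows)
    syy = sum(y * y for _, y in rows)
    sxy = sum(x * y for x, y in rows)
    return (n, sx, sy, sxx, syy, sxy)


def reduce_stats(partials):
    """Reduce step: add the per-partition statistics element-wise."""
    return tuple(sum(values) for values in zip(*partials))


def pearson_from_stats(stats):
    """Compute the Pearson correlation from the combined statistics."""
    n, sx, sy, sxx, syy, sxy = stats
    cov = n * sxy - sx * sy
    var_x = n * sxx - sx * sx
    var_y = n * syy - sy * sy
    return cov / sqrt(var_x * var_y)


# Hypothetical data split into two row partitions.
partitions = [
    [(1.0, 2.0), (2.0, 4.1)],
    [(3.0, 6.2), (4.0, 7.9), (5.0, 10.1)],
]
r = pearson_from_stats(reduce_stats(map_partition(p) for p in partitions))
```

Because the map step only returns six numbers per partition, no partition has to be moved or concatenated before the final result is computed, which is the general appeal of this pattern.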

Refactor Codebase
- REFACTOR-#0000: Remove OmnisciWorker as unused (#6278)
- REFACTOR-#0000: rename 'exc' -> 'err' (#6252)
- REFACTOR-#6279: HDK DataFrame should not have more than one partition (#6280)
- REFACTOR-#6329: deprecate cloud feature (#6330)

Update testing suite
- TEST-#6282: Reduce copy-pasteness in ci.yml (#6283)
- TEST-#6308: add to_numpy ASV bench (#6305)
- TEST-#6315: increase 'install_timeout' for ASV benchmarks: 600 -> 6000 sec (#6316)

New Features
- FEAT-#5684: Use TreeReduce implementation for 'pivot_table' in certain cases (#6089)
- FEAT-#5759: Implement lazy Arrow execution for the HDK engine (#6251)
- FEAT-#5936: support pandas 2.0.2 (#5995)
- FEAT-#6048: add wait method for Dask/Ray/Unidist wrappers (#6049)
- FEAT-#6191: Implement groupby.rolling API (#6292)
- FEAT-#6253: add 'dtype_backend' parameter support for read_parquet/read_feather (#6264)
- FEAT-#6256: HDK: Add support for DataFrameGroupBy.head/tail() (#6257)
- FEAT-#6284: Do not convert HDK query execution result to arrow. (#6286)
- FEAT-#6296: Add additional pyhdk launch parameters (#6303)
- FEAT-#6322: Give a warning only if the major or minor part of pandas version are different (#6323)
- FEAT-#6325: Add GPU execution option for HDK backend (#6326)
- FEAT-#6327: Bump pyhdk version to 0.7 (#6328)
- FEAT-#6351: Add a simple heuristic for fragment size when running on a GPU (#6346)
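
The version check described in FEAT-#6322 (warn only when the major or minor part of the pandas version differs) can be sketched with the standard library as follows. This is an illustrative sketch, not Modin's code: `major_minor` and `should_warn` are hypothetical names, and the version strings are example inputs.

```python
import re


def major_minor(version):
    """Return (major, minor) parsed from the start of a version string."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def should_warn(installed, expected):
    """True only if the major or minor components differ; patch level is ignored."""
    return major_minor(installed) != major_minor(expected)
```

With this rule, an installed "2.0.3" compared against an expected "2.0.2" would not trigger a warning, while "2.1.0" or "1.5.3" would.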

Contributors

@AndreyPavlenko
@YarShev
@alexbaden
@anmyachev
@dchigarev
@kurapov-peter
@mvashishtha
@vnlitvinov
