Releases: modin-project/modin
Modin 0.23.1
This release contains fixes that improve Modin's performance for both the NumPy and pandas APIs, and it removes the experimental Modin in the Cloud feature. It also upgrades Modin's testing suite, which significantly speeds up CI.
Key Features and Updates Since 0.23.0
- Stability and Bugfixes
- FIX-#0000: don't test experimental xgboost with Ray nightly build (#6424)
- FIX-#0000: fix xgboost tests with ray>2.6.0 (#6425)
- FIX-#1930: Fix one of the cases of heterogeneous data for read_csv (#5507)
- FIX-#4580: Fix access by row label in query and eval (#6488)
- FIX-#5627: Stop checking temp_df.dtype == 'category' (#6360)
- FIX-#5972: compute correct dtype for `Series.str.find/index/rfind/rindex` (#6426)
- FIX-#6219: don't default to pandas for 'copy' on empty DataFrame/Series objects (#6371)
- FIX-#6299: array method always returns array of vanilla numpy (#6300)
- FIX-#6334: improve error message if hdk isn't installed in the environment (#6358)
- FIX-#6347: remove 'modin in the cloud' experimental feature (#6408)
- FIX-#6364: Make reshuffling work with 'BenchmarkMode.put(True)' (#6365)
- FIX-#6367: Enable support for 'groupby.size()' in reshuffling groupby (#6370)
- FIX-#6368: Apply deferred indices before map-reduce groupby (#6369)
- FIX-#6372: precompute dtypes for 'sum' operation (#6421)
- FIX-#6375: don't initialize engines at import time (#6374)
- FIX-#6386: don't make unnecessary 'astype' calls for modin.array.sum op (#6395)
- FIX-#6396: set '__factory' to 'None' in case of any problems during initialization (#6397)
- FIX-#6402: Allow datetime and timedelta types in `diff` (#6403)
- FIX-#6405: Apply `disable_logging` to `__getattr__` (#6406)
- FIX-#6410: add a link to @modin_project twitter (#6411)
- FIX-#6414: fix 'read_feather' with pyarrow<11.0 (#6415)
- FIX-#6427: make code compatible with flake8==6.1.0 (#6428)
- FIX-#6429: exclude pymssql==2.2.8 from environments (#6430)
- FIX-#6436: Support ~ in paths in IO functions correctly (#6448)
- FIX-#6443: Cast boolean columns before sum|mean|median groupby aggregations (#6444)
- FIX-#6456: create fake xgboost module for building docs (#6457)
- FIX-#6459: support fastparquet>=2023.1.0 (#6458)
- FIX-#6483: Default to pandas for array_ufunc (#6486)
- Performance enhancements
- Update testing suite
- New Features
- Uncategorized improvements
- Release version 0.23.1 (#6495)
Contributors
@AndreyPavlenko
@RehanSD
@YarShev
@anmyachev
@dchigarev
@mvashishtha
@vnlitvinov
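FIX-#6436 in the list above concerns `~` in paths passed to IO functions. As a rough illustration of what expanding a leading tilde involves, here is a minimal sketch that uses only the Python standard library. It is not Modin's code; `normalize_path` is a hypothetical helper introduced for this example.

```python
import os


def normalize_path(path):
    """Expand a leading '~' to the current user's home directory.

    Hypothetical helper for illustration only; not part of Modin's API.
    """
    # Non-path inputs (file objects, buffers, None) are returned unchanged.
    if not isinstance(path, (str, os.PathLike)):
        return path
    return os.path.expanduser(os.fspath(path))


expanded = normalize_path("~/data/example.csv")
untouched = normalize_path("/tmp/example.csv")
```

Paths without a leading tilde pass through unmodified, so a helper like this can be applied to every path argument before a reader or writer opens the file.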
Modin 0.23.0
This release upgrades the pandas version to 2.0. It also includes a '.corr' speed-up, new features, and bug fixes.
Key Features and Updates Since 0.22.0
- Stability and Bugfixes
- FIX-#1851: Squash multiple LogicalProject nodes (#6306)
- FIX-#3371: Remove pandas patch level pin (#6211)
- FIX-#4048: support sqlalchemy objects in `con` parameter for `to_sql` (#5940)
- FIX-#4485: fix 'clip' with list-like bounds and axis=None (#6344)
- FIX-#4954: defaults to pandas in `read_json` in case of rows having different columns (#5946)
- FIX-#5077: fix 'Series.rename_axis' signature (#6324)
- FIX-#5461: fix groupby if dataframe has empty partitions (#6307)
- FIX-#6035: Fall back to Pandas, when merging unsupported column types (#6036)
- FIX-#6085: HDK: Implemented support for datetime64 dtypes serialization (#6086)
- FIX-#6208: HDK: Added support for median aggregation (#6209)
- FIX-#6215: Process '.corr(numeric_only=False)' parameter at the qc level (#6242)
- FIX-#6218: Fix `read_excel` and unpin `openpyxl` (#6247)
- FIX-#6229: fix `Series.equals`/`DataFrame.equals` with NA entries (#6270)
- FIX-#6232: support DataFrame.cov(numeric_only=False) without fallback to pandas (#6262)
- FIX-#6237: Log errors only from deepest modin layer (#6238)
- FIX-#6245: support datetime64 with different resolutions types for HDK (#6255)
- FIX-#6246: fix 'groupby(..., as_index=False).agg(...)' case (#6263)
- FIX-#6258: Fix series to_dict (#6260)
- FIX-#6259: Fix astype("category") causing read-only buffer error (#6267)
- FIX-#6273: fix DataFrame.min/max/mean/median/skew/kurt with axis=None (#6275)
- FIX-#6297: fix experimental numpy.argmax/argmin with Nans in data (#6298)
- FIX-#6309: do not materialize axes for 'rank' operation (#6310)
- FIX-#6313: update MIN_RAY_VERSION var: 1.4.0 -> 1.13.0 (#6314)
- FIX-#6317: fix syntax error in 'push-to-master.yml' (#6318)
- FIX-#6336: pin 'pydantic<2' to fix CI (#6337)
- FIX-#6338: fix TypeError: WorksheetReader.__init__() got an unexpected keyword argument 'rich_text' (#6339)
- FIX-#6341: call _filter_empties only if shapes are different on particular axis (#6333)
- FIX-#6352: Fix the HdkOnNativeDataframePartition._width_cache property computation (#6353)
- FIX-#6354: Skip bad and pre-release versions (#6355)
- Performance enhancements
- Refactor Codebase
- Update testing suite
- New Features
- FEAT-#5684: Use TreeReduce implementation for 'pivot_table' in certain cases (#6089)
- FEAT-#5759: Implement lazy Arrow execution for the HDK engine (#6251)
- FEAT-#5936: support pandas 2.0.2 (#5995)
- FEAT-#6048: add `wait` method for Dask/Ray/Unidist wrappers (#6049)
- FEAT-#6191: Implement `groupby.rolling` API (#6292)
- FEAT-#6253: add 'dtype_backend' parameter support for read_parquet/read_feather (#6264)
- FEAT-#6256: HDK: Add support for DataFrameGroupBy.head/tail() (#6257)
- FEAT-#6284: Do not convert HDK query execution result to arrow. (#6286)
- FEAT-#6296: Add additional pyhdk launch parameters (#6303)
- FEAT-#6322: Give a warning only if the major or minor part of pandas version are different (#6323)
- FEAT-#6325: Add GPU execution option for HDK backend (#6326)
- FEAT-#6327: Bump pyhdk version to 0.7 (#6328)
- FEAT-#6351: Add a simple heuristic for fragment size when running on a GPU (#6346)
Contributors
@AndreyPavlenko
@YarShev
@alexbaden
@anmyachev
@dchigarev
@kurapov-peter
@mvashishtha
@vnlitvinov
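FEAT-#6322 in the list above limits the pandas version warning to cases where the major or minor component differs. The comparison behind such a rule can be sketched as follows. This is an illustration under stated assumptions, not Modin's implementation: `major_minor` and `needs_version_warning` are hypothetical names, and the version strings are assumed to be plain dotted numbers.

```python
def major_minor(version):
    """Return the (major, minor) pair of a dotted version string."""
    parts = version.split(".")
    return int(parts[0]), int(parts[1])


def needs_version_warning(installed, expected):
    """Return True only when the major or minor components differ."""
    return major_minor(installed) != major_minor(expected)


# A patch-level difference is ignored; a minor-level difference is not.
same_minor = needs_version_warning("2.0.3", "2.0.2")
different_minor = needs_version_warning("2.1.0", "2.0.2")
```

Comparing tuples of integers rather than raw strings avoids ordering surprises such as "2.10" sorting before "2.9".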
Modin 0.22.3
Patch release whose main purpose is to pin pydantic<2 to resolve Ray issues; it also includes a few bug fixes.
Key Features and Updates Since 0.22.2
- Stability and Bugfixes
- FIX-#5461: fix groupby if dataframe has empty partitions (#6307)
- FIX-#6035: Fall back to Pandas, when merging unsupported column types (#6036)
- FIX-#6297: fix experimental numpy.argmax/argmin with Nans in data (#6298)
- FIX-#6309: do not materialize axes for 'rank' operation (#6310)
- FIX-#6313: update MIN_RAY_VERSION var: 1.4.0 -> 1.13.0 (#6314)
- FIX-#6336: pin 'pydantic<2' to fix CI (#6337)
Contributors
Modin 0.23.0rc0
This release includes support for pandas 2.0, '.corr' speed-up, new features and bug fixes.
Note: this is a release candidate. If everything goes well, we'll release Modin 0.23.0 in two weeks.
Key Features and Updates Since 0.22.0
- Stability and Bugfixes
- FIX-#3371: Remove pandas patch level pin (#6211)
- FIX-#4954: Defaults to pandas in `read_json` in case of rows having different columns (#5946)
- FIX-#6215: Process '.corr(numeric_only=False)' parameter at the qc level (#6242)
- FIX-#6218: Fix `read_excel` and unpin `openpyxl` (#6247)
- FIX-#6232: Support DataFrame.cov(numeric_only=False) without fallback to pandas (#6262)
- FIX-#6237: Log errors only from deepest modin layer (#6238)
- FIX-#6245: Support datetime64 with different resolutions types for HDK (#6255)
- FIX-#6246: Fix 'groupby(..., as_index=False).agg(...)' case (#6263)
- FIX-#6258: Fix series to_dict (#6260)
- FIX-#6259: Fix astype("category") causing read-only buffer error (#6267)
- FIX-#6273: Fix DataFrame.min/max/mean/median/skew/kurt with axis=None (#6275)
- Performance enhancements
- New Features
- FEAT-#5759: Implement lazy Arrow execution for the HDK engine (#6251)
- FEAT-#5936: Support pandas 2.0.2 (#5995)
- FEAT-#6048: Add `wait` method for Dask/Ray/Unidist wrappers (#6049)
- FEAT-#6253: Add 'dtype_backend' parameter support for read_parquet/read_feather (#6264)
- FEAT-#6256: HDK: Add support for DataFrameGroupBy.head/tail() (#6257)
Contributors
@AndreyPavlenko
@YarShev
@anmyachev
@dchigarev
@mvashishtha
@vnlitvinov
Modin 0.22.2
This release includes several bug fixes.
Key Features and Updates Since 0.22.1
- Stability and Bugfixes
Contributors
Modin 0.22.1
This release includes a bug fix.
Key Features and Updates Since 0.22.0
Contributors
Modin 0.22.0
This release includes support for pyhdk=0.6, a few performance enhancements,
new features and bug fixes.
Key Features and Updates Since 0.21.0
- Stability and Bugfixes
- FIX-#6104: Stop selecting same column twice for repr (#6210)
- FIX-#6199: make sure read_html return a list of DataFrames (#6200)
- FIX-#6201: align groupby objects signatures with pandas (#6202)
- FIX-#6212: Fix '.read_feather()' failure if the file contains index metadata (#6213)
- FIX-#6216: make sure 'infer_objects' returns DataFrame (#6217)
- FIX-#5722: Use full axis function when casting to "category" (#6222)
- FIX-#5889: HDK: Combine multiple lazy concat operations into a single one and replace recursion with iteration (#5932)
- Performance enhancements
- Refactor Codebase
- New Features
- Dependencies
Contributors
@mvashishtha
@AndreyPavlenko
@anmyachev
@dchigarev
@jkew
@YarShev
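FIX-#5889 in the list above mentions combining multiple lazy concat operations into one and replacing recursion with iteration. The general idea can be sketched with an explicit stack. The node representation below, a `("concat", [children])` tuple with arbitrary non-tuple leaves, is an assumption made for this example and does not reflect the HDK backend's actual data structures.

```python
def flatten_concat(node):
    """Collect the leaves of a nested concat tree without recursion.

    A node is either a leaf (any non-tuple object) or a tuple
    ("concat", [children]). Hypothetical representation for illustration.
    """
    leaves = []
    stack = [node]
    while stack:
        current = stack.pop()
        if isinstance(current, tuple) and current[0] == "concat":
            # Push children in reverse so they are visited left to right.
            stack.extend(reversed(current[1]))
        else:
            leaves.append(current)
    # All nested concats collapse into a single concat over the leaves.
    return ("concat", leaves)


nested = ("concat", [("concat", ["df1", "df2"]), "df3", ("concat", ["df4"])])
flat = flatten_concat(nested)
```

Because the traversal keeps its own stack, the depth of the nesting is bounded by available memory rather than by the interpreter's recursion limit.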
Modin 0.21.0
This release includes many bug fixes, performance enhancements, and new features.
Key Features and Updates Since 0.20.0
- Stability and Bugfixes
- FIX-#4828: allow `dict_apply_builder` use keyword argument `internal_indices` (#5945)
- FIX-#5091: Handle pd.Grouper objects correctly (#6174)
- FIX-#5203: don't raise `AttributeError: 'list' object has no attribute '_query_compiler'` in `join` op (#5939)
- FIX-#5985: BUG: ArrowPeriodType and ArrowIntervalType are not supported by HDK (#5987)
- FIX-#5988: BUG: Concatenation of frames with strings is not supported by HDK (#5989)
- FIX-#5993: Fix documentation building in CI (#5994)
- FIX-#5997: Run `build-docs` CI job regardless of the files being changed (#5998)
- FIX-#6000: HDK: read_csv(): Do not parse dates, if the parse_dates argument is not specified (#6001)
- FIX-#6022: support lazy import of `modin.pandas` module (#6023)
- FIX-#6037: Simplified filter node expression for ranges (#6038)
- FIX-#6053: align 'Series.str' signatures with pandas (#6054)
- FIX-#6069: Improve the way resample is handled at the API layer (#6179)
- FIX-#6070: Simplify implementation of `shift` (#6168)
- FIX-#6074: cap pyarrow<12 to fix CI (#6075)
- FIX-#6094: pin 'urllib3<2' for pip command in 'test-ray-master' job (#6178)
- FIX-#6095: Implement the to_csv() method in the HDK backend (#6099)
- FIX-#6097: Pass storage_options to the to_csv function of PandasOnRayIO class with fsspec (#6098)
- FIX-#6106: Fix API layer implementation of reindex_like (#6131)
- FIX-#6107: Allow pass through of `tz_convert` and `tz_localize` to QC if possible (#6137)
- FIX-#6109: Don't use join() when indicator is true (#6130)
- FIX-#6110: Generalize logic to test if an index is a MultiIndex (#6135)
- FIX-#6112: Ensure that `truncate` verifies that before <= after (#6134)
- FIX-#6113: Add QC Layer implementation for idxmin/max (#6170)
- FIX-#6114: Fix series groupby list of numpy methods (#6129)
- FIX-#6115: Check for `_to_datetime` attribute in `pd.to_datetime` (#6133)
- FIX-#6117: Add error checking at API level for `diff` (#6167)
- FIX-#6120: HDK read_csv(): Fixed parsing dates with nanosecond precision (#6121)
- FIX-#6146: Fix `pivot` when `values=None` (#6166)
- FIX-#6152: make `numeric_only` default to `True` (#6162)
- FIX-#6154: Ensure GroupBy.__getitem__ preserves key order (#6164)
- FIX-#6155: Fully implement droplevel for axis=0 (#6180)
- FIX-#6175: Fix groupby agg columns for empty column partition (#6176)
- FIX-#6181: Do not ignore `copy` argument in `tz_convert` and `tz_localize` (#6182)
- FIX-#6183: Ensure array resets index and columns for all storage formats (#6185)
- FIX-#6184: Make Series.to_list return proper list (#6188)
- FIX-#6186: Don't use pandas extension types (#6187)
- FIX-#6194: Fix crashes on groupby.{pct_change,diff} (#6195)
- FIX-#6196: Align 'Series.cat' signatures with pandas (#6061)
- FIX-#6204: Use reset_index instead of insert in to_sql (#6205)
- FIX-#6172: Pass storage_options to the to_csv function of PandasOnUnidist class with fsspec (#6173)
- Performance enhancements
- PERF-#5835: Introduce lazy categorical proxy for pandas backend (#6055)
- PERF-#5840: Precompute dtypes cache for binary operations more often (#5949)
- PERF-#5841: Precompute dtypes for boolean setitem (#5952)
- PERF-#5999: Do not set Ray's `runtime_env` for a single-node case (#6028)
- PERF-#6122: Extract Feather's metadata without reading a whole file (#6123)
- Refactor Codebase
- REFACTOR-#5844: remove `inplace` kwarg from query compiler `clip` arguments (#5954)
- REFACTOR-#5951: remove code duplication for `to_pickle_distributed` (#5950)
- REFACTOR-#5992: remove 'apply_license_header.py' as unused (#5990)
- REFACTOR-#6012: move experimental dispatchers under `modin/experimental/...` folder (#6011)
- REFACTOR-#6024: remove code duplication for `to_*` functions (#5953)
- REFACTOR-#6044: remove code duplication for 'get_objects_from_partitions' (#6045)
- REFACTOR-#6046: remove code duplication for 'progress_bar_wrapper' (#6047)
- REFACTOR-#6062: Add query compiler interfaces for expanding methods (#6064)
- REFACTOR-#6063: Add query compiler interfaces for some strings methods. (#6088)
- REFACTOR-#6065: Use between_time in at_time (#6158)
- REFACTOR-#6066: Support rolling.{rank,quantile,sem} (#6084)
- REFACTOR-#6067: Simplify describe() query compiler interface (#6082)
- REFACTOR-#6068: Simplify info() call (#6087)
- REFACTOR-#6071: Push first and last down to query compiler. (#64) (#6125)
- REFACTOR-#6091: Push more of memory_usage down to query compiler. (#6092)
- REFACTOR-#6105: Explicitly pass default value of np.nan to Series.reindex (#6138)
- REFACTOR-#6108: Move implementation of `pd.cut` to QC layer (#6136)
- REFACTOR-#6116: Move `groupby_ohlc` implementation to QC layer (#6132)
- REFACTOR-#6119: #6118: Add query compiler methods for groupby diff, pct_change (#6128)
- REFACTOR-#6151: Get slicer without constructing pandas dataframe. (#6161)
- REFACTOR-#6159: Stop defaulting at API layer for a few more methods (#6160)
- Update testing suite
- TEST-#5956: Verify dtypes equality in tests (#5955)
- TEST-#5980: use `cancel-in-progress` only for PRs (#5917)
- TEST-#5991: add simple tests for `read_orc`, `read_spss`, `json_normalize`, `read_xml`, `read_gbq` (#5983)
- TEST-#6004: add more '# pragma: no cover' for io functions (#6002)
- TEST-#6006: test `modin/test/test_partition_api.py` on unidist and dask (#6003)
- TEST-#6009: use `tmp_path` fixture instead of `ensure_clean_dir` as pandas 2.0.0 does (#6008)
- TEST-#6010: add some more test directories into 'setup.cfg' (#6007)
- TEST-#6020: exclude '_version.py' from coverage (#6019)
- TEST-#6027: Test installing Unidist via pip in a clean environment, as we do for Dask and Ray (#6025)
- TEST-#6030: test the function parameters of `Series.str` accessor for pandas equivalence (#6033)
- TEST-#6031: test the function parameters of 'Series.dt' accessor for pandas equivalence (#6197)
- TEST-#6076: Use 2 cores for experimental groupby on dask (#6077)
- TEST-#6198: add 'pragma: no cover' for unidist and ray utils that used in remote context (#6059)
- TEST-#6260: Increase test_io timeout (#6207)
- Documentation improvements
- DOCS-#5449: Add page for Modin interoperability with select third party libraries (#5517)
- DOCS-#6021: Add a section regarding reshuffling groupby to Modin's documentation (#6051)
- DOCS-#6078: correct default values for MODIN_CPUS and MODIN_NPARTITIONS (#6177)
- DOCS-#6079: Make 'experimental/index.html' accessible through the readthedocs website (#6080)
- New Features
- FEAT-#5816: Implement '.split' method for axis partitions (#5856)
- FEAT-#5867: Introduce groupby implementation via range-partitioning (#5928)
- FEAT-#6014: Stop defaulting to pandas in groupby frontend for fill-like methods (#5996)
- FEAT-#6039: Implement `Series.str` through `CachedAccessor` (#6043)
- FEAT-#6040: implement 'Series.dt' through 'CachedAccessor' (#6056)
- FEAT-#6041: implement 'Series.cat' through 'CachedAccessor' (#6057)
- FEAT-#6144: Stop defaulting at API layer for a bunch of methods (#6145)
- FEAT-#6147: HDK: Arrow-based columns concatenation of frames with trivial index. (#6148)
- FEAT-#6153: Add API layer implementations for some stat methods. (#6156)
Contributors
@AndreyPavlenko
@RehanSD
@YarShev
@anmyachev
@arunjose696
@dchigarev
@devin-petersohn
@helmeleegy
@jkew
@labanyamukhopadhyay
@mdatre
@mvashishtha
@noloerino
@pyrito
@vnlitvinov
@naren-ponder
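FEAT-#5867 in the list above introduces a groupby implementation via range-partitioning. The underlying idea is to route rows into partitions by key range so that every key ends up in exactly one partition, then aggregate each partition independently. The sketch below illustrates that idea in pure Python. It is not Modin's implementation: the function name is hypothetical, and the pivots are taken from the sorted distinct keys here purely to keep the example short.

```python
from bisect import bisect_right
from collections import defaultdict


def range_partition_groupby_sum(rows, num_partitions):
    """Group (key, value) rows by key and sum the values.

    Illustration only: pivots come from the sorted distinct keys,
    which is an assumption made to keep this example small.
    """
    keys = sorted({key for key, _ in rows})
    if not keys:
        return {}
    # Choose up to num_partitions - 1 pivots that split the key range evenly.
    step = max(1, len(keys) // num_partitions)
    pivots = keys[step::step][: num_partitions - 1]

    # Shuffle step: route each row to the partition owning its key range.
    partitions = [[] for _ in range(len(pivots) + 1)]
    for key, value in rows:
        partitions[bisect_right(pivots, key)].append((key, value))

    # Each key lives in exactly one partition, so partitions can be
    # aggregated independently and the partial results simply merged.
    result = {}
    for part in partitions:
        sums = defaultdict(int)
        for key, value in part:
            sums[key] += value
        result.update(sums)
    return result


rows = [("b", 2), ("a", 1), ("c", 5), ("a", 3), ("b", 4)]
totals = range_partition_groupby_sum(rows, num_partitions=2)
```

Because no key spans two partitions, the final merge needs no second reduction pass, which is the property that distinguishes this approach from a map-reduce style groupby.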
Modin 0.20.1
This release includes some fixes.
Key Features and Updates Since 0.20.0
- Stability and Bugfixes
- FIX-#4828: Allow `dict_apply_builder` use keyword argument `internal_indices` (#5945)
- FIX-#5203: Don't raise `AttributeError: 'list' object has no attribute '_query_compiler'` in `join` op (#5939)
- FIX-#5985: BUG: ArrowPeriodType and ArrowIntervalType are not supported by HDK (#5987)
- FIX-#5988: BUG: Concatenation of frames with strings is not supported by HDK (#5989)
- FIX-#5993: Fix documentation building in CI (#5994)
- FIX-#5997: Run `build-docs` CI job regardless of the files being changed (#5998)
- FIX-#6000: HDK: read_csv(): Do not parse dates, if the parse_dates argument is not specified (#6001)
- FIX-#6022: Support lazy import of `modin.pandas` module (#6023)
Contributors
Modin 0.20.0
This release adds parallel implementations for some functions on Dask that were previously implemented for other engines.
It also includes support for pyhdk 0.5, many bug fixes and some performance enhancements.
Key Features and Updates Since 0.19.0
- Stability and Bugfixes
- FIX-#2850: use modin.pandas.Series instead of pandas.Series for `where` func (#5883)
- FIX-#3925: Fixed AssertionError on columns and index drop (#5156)
- FIX-#4227: Calling `FactoryDispatcher.get_factory` also initializes the engine (#4228)
- FIX-#4635: allow pass modin functions to `apply` (#5915)
- FIX-#4924: fix read_excel when header is None (#5919)
- FIX-#5309: series iloc/loc raises IndexingError if a key is too long (#5784)
- FIX-#5373: Fix Series.shift() for named Series (#5823)
- FIX-#5432: don't return None when `astype` used with `copy=False` parameter (#5918)
- FIX-#5454: add missed methods for `SeriesGroupBy`, `DataFrameGroupBy` objects (#5866)
- FIX-#5509: default to pandas for read_parquet if any additional kwargs are passed to the engine (#5911)
- FIX-#5566: Enable test_indexing test on the HDK engine and add to ci (#5567)
- FIX-#5576: Enable test_join_sort test on the HDK engine and add to CI (#5578)
- FIX-#5580: HDK-BUG: 'AVG|SUM' is only valid on integer and floating point (#5583)
- FIX-#5618: don't ignore 'errors' parameter for astype (#5895)
- FIX-#5653: implement `convert_dtypes` as a full-axis operation instead of using map approach (#5885)
- FIX-#5737: BUG: String columns are converted to Categorical, if exported from HDK (#5738)
- FIX-#5767: cast `pathlib.Path` to str for `read_parquet` (#5860)
- FIX-#5770: Enable test_series test on the HDK engine and add to ci (#5771)
- FIX-#5774: Correctly calculate shape of single row (#5775)
- FIX-#5776: fix IndexError when concatenating dict of series along columns (#5804)
- FIX-#5781: Fix sort in descending order for columns with highly dense values (#5783)
- FIX-#5787: Enable test_reduce test on the HDK engine and add to ci (#5788)
- FIX-#5794: Enable test_default test on the HDK engine and add to ci (#5795)
- FIX-#5806: Enable test_io test on the HDK engine and add to ci (#5807)
- FIX-#5810: Enable test_binary test on the HDK engine (#5811)
- FIX-#5819: Fix np.argmax/argmin on 1D arrays (#5820)
- FIX-#5829: fix ndarray assignment via loc (#5847)
- FIX-#5846: add Series.str.removeprefix/removesuffix/fullmatch methods (#5845)
- FIX-#5849: add `Series.dt.day_of_week/day_of_year/isocalendar/asfreq` methods (#5848)
- FIX-#5859: Fix '.sort_values()' when there's only one row partition (#5869)
- FIX-#5862: fix Inline strong start-string without end-string for read_custom_text (#5861)
- FIX-#5870: Enable test_general test on the HDK engine and add to ci (#5871)
- FIX-#5888: Fix to_parquet in s3. (#5912)
- FIX-#5891: BUG: HDK: Query execution fails because the query contains not supported self-join pattern (#5892)
- FIX-#5927: Enable `test_map_metadata` test on the HDK engine and add to ci (#5929)
- FIX-#5934: Enable `test_window` test on the HDK engine and add to ci (#5935)
- FIX-#5941: TEST: The test test_io.py fails on HDK (#5942)
- FIX-#5976: correct use of dtypes cache for `concat` op (#5975)
- FIX-#5977: use `wrapper.materialize` instead of `wait_partitions`; use AWS env vars in `pytest_sessionstart` function (#5981)
- Performance enhancements
- PERF-#5590: Precompute columns and dtypes metadata for '.merge()' (#5594)
- PERF-#5670: create `self._identity` in partitions only for "debug" logging level (#5679)
- PERF-#5674: reduce data transferring in `_launch_tasks` function (#5678)
- PERF-#5675: make index calculation for `read_csv` function lazy; introduce `ModinIndex` (#5677)
- PERF-#5740: allow `read_csv`, `read_fwf`, `read_table`, `read_custom_text` functions be executed fully asynchronous; introduce `ModinDtypes` (#5713)
- PERF-#5777: Filter out empty bins at range-based reshuffling (#5779)
- PERF-#5778: Avoid extra materialization at range-based reshuffling (#5780)
- PERF-#5808: Delay metadata computations for '.sort_values' result (#5828)
- PERF-#5837: Defer index materialization for MapReduce implemented groupby (#5948)
- Refactor Codebase
- REFACTOR-#2863: remove 'other_name' from broadcast_apply (#5882)
- REFACTOR-#5414: Move `partition.get` into base class (#5408)
- REFACTOR-#5417: fix FutureWarning: the `mangle_dupe_cols` keyword is deprecated (#5407)
- REFACTOR-#5683: remove Engine.subscribe(_update_engine) in DataFrame/Series constructors (#5855)
- REFACTOR-#5786: align logging of Dask partitions with other executions (#5785)
- REFACTOR-#5799: Clean up numpy array operations (#5800)
- REFACTOR-#5830: rename experimental dispatchers and parsers (#5864)
- REFACTOR-#5874: move lazy_metadata_decorator into utils.py (#5872)
- REFACTOR-#5875: use default implementations for dt methods from the base query compiler (#5873)
- REFACTOR-#5902: use __make_read for non experimental IO classes (#5898)
- REFACTOR-#5908: remove unused parameters from 'run_exec_plan' (#5907)
- REFACTOR-#5910: remove '_dtypes_for_cols' internal function as unused (#5909)
- REFACTOR-#5922: let `upload-coverage` action fail if there is no `.coverage` file (#5921)
- REFACTOR-#5923: add `pragma: no cover` for functions that used in `apply_full_axis` (#5920)
- Update testing suite
- TEST-#2544: delay `codecov` notifications until all reports have been sent (#5782)
- TEST-#4261: test rolling with axis=1, win_type=, and center=True (#5881)
- TEST-#5477: fix typo: read_stata kwargs -> read_sas kwargs (#5854)
- TEST-#5790: add ASV configs for Dask and Unidist (#5789)
- TEST-#5802: update some actions in CI (#5801)
- TEST-#5826: remove _propagate_index_objs internal function usage from tests (#5813)
- TEST-#5832: Suppress pytest coverage messages in terminal (#5833)
- TEST-#5851: test api of cat/sparse accessors (#5850)
- TEST-#5878: exclude modin/experimental/batch/test/ folder from computing coverage (#5877)
- TEST-#5897: Add more robust tests for numpy API (#5900)
- TEST-#5913: Cancel CI for commits to same branch. (#5914)
- TEST-#5933: Add assert_array_equals utility to numpy tests (#5947)
- TEST-#5943: Rebalance tests between different CI jobs (#5890)
- TEST-#5977: Add AWS mock keys to moto in push-to-master.yml (#5978)
- Documentation improvements
- New Features
- FEAT-#4624: add `to_parquet` parallel implementation for Dask (#5876)
- FEAT-#5497: add several experimental functions for Dask (#5496)
- FEAT-#5880: add `to_sql` parallel implementation for Dask (#5879)
- FEAT-#5901: add `read_fwf` parallel implementation for Dask (#5899)
- FEAT-#5930: Bump pyhdk version to 0.5 (#5931)
Contributors
@MSHADroo
@AndreyPavlenko
@RehanSD
@YarShev
@anmyachev
@dchigarev
@mvashishtha
@noloerino
@pyrito
@vnlitvinov
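Several performance items in the 0.20.0 list above (for example PERF-#5675 and PERF-#5808) defer metadata computation until it is actually needed. The general pattern of a lazily materialized value can be sketched as follows. `LazyValue` is a hypothetical class written for this example; it is not Modin's `ModinIndex` or `ModinDtypes`, whose interfaces are not described in these notes.

```python
class LazyValue:
    """Defer an expensive computation until its result is first needed.

    Hypothetical class for illustration; not Modin's ModinIndex.
    """

    def __init__(self, compute):
        self._compute = compute
        self._value = None
        self._is_materialized = False

    @property
    def is_materialized(self):
        return self._is_materialized

    def get(self):
        # Run the computation at most once and cache the result.
        if not self._is_materialized:
            self._value = self._compute()
            self._is_materialized = True
        return self._value


calls = []


def build_index():
    # Stand-in for an expensive index computation.
    calls.append("computed")
    return list(range(5))


index = LazyValue(build_index)
before = index.is_materialized  # nothing has been computed yet
first = index.get()
second = index.get()  # served from the cache
```

Operations that never inspect the value never pay for it, and repeated access after the first call reuses the cached result.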