Update catboost requirement #254

Merged: 4 commits merged on Mar 28, 2024

Conversation

dependabot[bot]
Contributor

@dependabot dependabot bot commented on behalf of github Mar 26, 2024

Updates the requirements on catboost to permit the latest version.
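
As a rough illustration of what permitting the latest version means here, the sketch below checks a release against an upper bound. It assumes the bound moved from <1.2 to <1.3, as this PR's original title states. The helpers `parse_version` and `is_below` are hypothetical and only understand plain dotted integers; real packaging tools follow PEP 440 and handle many more cases.

```python
def parse_version(text):
    """Turn '1.2.3' into (1, 2, 3). Handles plain dotted integers only."""
    return tuple(int(part) for part in text.split("."))


def is_below(version, upper_bound):
    """Return True if `version` is strictly below `upper_bound`."""
    a, b = parse_version(version), parse_version(upper_bound)
    # Pad the shorter tuple with zeros so that 1.2 and 1.2.0 compare equal.
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a < b


latest = "1.2.3"
print(is_below(latest, "1.2"))  # old bound <1.2 -> False
print(is_below(latest, "1.3"))  # new bound <1.3 -> True
```

Under these assumptions, 1.2.3 is rejected by the old bound and accepted by the new one.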

Release notes

Sourced from catboost's releases.

1.2.3

Python package

  • Support Python 3.12. #2510
  • [Performance]: Fix ineffective loops in Cython. Significant speedups (up to 3x) on dataset construction from data in C-order can be expected.
  • [Performance]: Make features data initialization from C-order numpy.ndarrays with float32 data type multithreaded. Significant speedups of 5x up to 10x (on CPUs with many cores) can be expected. #385, #2542
  • Save training metrics into the model metadata. So best_score_, evals_result_, best_iteration_ model attributes now work after model saving and loading. Can be removed by model metadata manipulation if needed. #1166
  • [Breaking change]. Support a separate boolean target type, now Class predictions for models that have been trained with boolean targets will also be boolean instead of True, False strings as before. Such models will be incompatible with the previous versions of CatBoost appliers. If you want the old behavior convert your target to False, True strings before training. #1954
  • Restrict jupyterlab version for setup to 3.x for now. Fixes #2530
  • utils.read_cd: Support CD files with non-increasing column indices.
  • Make log_cout, log_cerr specification consistent, avoid reset in recursive calls.
  • Late-initialize default values for log_cout, log_cerr. #2195
  • Add missing generated metrics: Cox, PairLogitPairwise, UserPerObjMetric, SurvivalAft.
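
The breaking change above suggests converting boolean targets to True/False strings before training to keep the old behavior. A minimal sketch of that conversion in plain Python follows; the helper name `to_legacy_string_labels` is hypothetical and not part of CatBoost, and the sketch only recognizes built-in `bool` values (NumPy boolean scalars would need separate handling).

```python
def to_legacy_string_labels(labels):
    """Map built-in bool labels to the strings 'True' / 'False'.

    Non-boolean labels are returned unchanged.
    """
    return [str(label) if isinstance(label, bool) else label for label in labels]


labels = [True, False, True]
string_labels = to_legacy_string_labels(labels)
print(string_labels)  # ['True', 'False', 'True']
```

The converted list would then be passed as the target in place of the boolean one.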

New features

  • Support boolean target/labels type during training in Python and Spark (in the latter case only when using fit with Pool arguments) and Class prediction in Python. #1954
  • [Spark]: Support Spark 3.5.x.
  • [C/C++ applier]. Add functions for getting indices of features of different types to C and C++ API. #2568. Thanks to @​nimusp.
  • [C/C++ applier]. Add staged prediction functions to C API. #2584. Thanks to @​Mb-NextTime.
  • [JVM applier]. Add loading CatBoostModel from a byte array to API. #2539
  • [Linux] Support CgroupsV2 when computing default number of threads used in parallel computations. #2519. Thanks to @​elukey.
  • [CLI] Support printing Auxiliary columns by name in evaluation result output. #1659
  • Save training metrics into the model metadata. Can be removed by model metadata manipulation if needed. #1166

Build & testing

  • [Windows]: Use clang-cl compiler and tools from Visual Studio 2022 for the build without CUDA (build with CUDA still uses standard Microsoft toolchain from Visual Studio 2019).
  • [macOS]: Pass os.version to conan host settings to ensure version consistency.
  • [Linux aarch64]: Set -mno-outline-atomics for modern versions of CLang and GCC to avoid unresolved symbols linking errors. #2527
  • Added missing CMakeLists for unit tests for util. #2525

Bugfixes

  • [Performance]: Fix performance regression that could slow down training on GPU by 50% on some datasets that had been introduced in release 1.2. Thanks to @​JeanPaulShapo.
  • [Python-package]: Fix segfault on Pool(data=None). #2522
  • [Python-package]: Fix Python exception in Pool() when pairs_weight is a numpy array. #1913
  • [Python-package]: Fix segfault and other strange errors when specifying custom logger with __call__ method. #2277
  • [Python-package]: Fix returning complex params in hyperparameter search. #1741, #1833
  • [Python-package]: Fix ignored exceptions for missed metrics descriptions on startup. This has not been visible to users but has been making debugging more difficult.
  • [Python-package]: Fix misleading Targets are required for YetiRank loss function. error in Cross validation. #2083
  • [Python-package]: Fix Pool.get_label() returns constant True for boolean labels. #2133
  • [Python-package]: Copying models does not lose best_score_, evals_result_, best_iteration_ attributes values anymore. #1793
  • [Spark]: Fix hangs at the end of the training. #2151
  • Precision metric default value in the absence of positive samples is changed to 0 and a warning is added (similar to the behavior of scikit-learn implementation). #2422
  • Fix ignoring embedding features
  • Try to avoid hash collisions when computing group ids with datasets with a lot of groups (may occur in datasets with around 10^9 samples).
  • Fix Multiclass models export to C++ and Python code. #2549
  • Fix dataset_statistics mode when no Target data is available.
  • Fix Error: can't proceed some features error on GPU. #1024
  • Fix allow_const_label=True for classification. #1933
  • Add checking of approx and target dimensions for SurvivalAft objective/metric.
  • Fix Focal loss derivatives sign. #2563
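
The Precision bugfix above can be pictured with a small stand-alone function. This is only a sketch, not CatBoost's implementation: the function name `precision` and the warning text are made up here, and it assumes the degenerate case means there are no positive predictions, which is when scikit-learn reports precision as ill-defined.

```python
import warnings


def precision(y_true, y_pred):
    """Precision for binary labels, returning 0.0 when it is undefined."""
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    false_pos = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    if true_pos + false_pos == 0:
        # Assumption: warn and fall back to 0.0 instead of dividing by zero.
        warnings.warn("Precision is undefined without positive predictions; returning 0.0")
        return 0.0
    return true_pos / (true_pos + false_pos)


print(precision([1, 0, 1], [1, 1, 0]))  # 0.5
```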

Commits
  • fe0941b Use paths from CMAKE_*_DIR when running in open source to avoid issues on Win...
  • cf282f7 CatBoost release 1.2.3.
  • ec263e7 Update contrib/python/ipywidgets/py3 to 8.1.2
  • 704a5d8 Intermediate changes
  • a13b5ba Add loading CatBoostModel from a byte array to API.. Fix #2539
  • 56a0b44 Add Get*FeaturesIndices to C++ wrapper. #2323, #2568
  • caed72b Add GetEmbeddingFeaturesCount() to C++ wrapper
  • 4490314 Add Spark 3.5 to pyspark_wrapper_generator.
  • 98c3667 Support boolean target type in Spark (where possible).
  • ad980da Manually unroll the loop
  • Additional commits viewable in compare view

Updates the requirements on [catboost](https://github.com/catboost/catboost) to permit the latest version.
- [Release notes](https://github.com/catboost/catboost/releases)
- [Changelog](https://github.com/catboost/catboost/blob/master/RELEASE.md)
- [Commits](catboost/catboost@v1.1...v1.2.3)

---
updated-dependencies:
- dependency-name: catboost
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
@dependabot dependabot bot added the dependencies label (Pull requests that update a dependency file) Mar 26, 2024
@ReinierKoops ReinierKoops self-requested a review March 26, 2024 16:05
@ReinierKoops ReinierKoops changed the title from "Update catboost requirement from <1.2 to <1.3" to "Update catboost requirement" Mar 26, 2024
Collaborator

@adri0 adri0 left a comment

Looks good to me

"catboost>=1.1 ; python_version != '3.8'",
"xgboost>=1.5.0",
"scipy>=1.4.0",
]
Collaborator

@adri0 adri0 Mar 27, 2024

Indeed, this extra dependency option with all packages does not seem very useful. But in the future it could be nice to have one optional dependency for each algorithm. A user who is only interested in using probatus with lightgbm, without having to install the others, sounds like a fair use case.
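
A rough sketch of what one optional dependency per algorithm could look like, written as a setuptools-style `extras_require` mapping. The extra names are hypothetical, the catboost and xgboost pins are copied from the snippet under review, and the unpinned lightgbm entry is an assumption since its requirement is not shown here.

```python
# Hypothetical per-algorithm extras for a setuptools-based project.
extras_require = {
    "catboost": ["catboost>=1.1 ; python_version != '3.8'"],
    "xgboost": ["xgboost>=1.5.0"],
    "lightgbm": ["lightgbm"],  # assumption: no version pin is known from this PR
}

# Convenience extra that pulls in every algorithm.
# The right-hand side is evaluated before "all" is added to the mapping.
extras_require["all"] = sorted(
    {req for reqs in extras_require.values() for req in reqs}
)

print(extras_require["all"])
```

Under these assumptions, a user would install a single algorithm with something like `pip install "probatus[lightgbm]"`, while `all` keeps the current install-everything behavior.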

@ReinierKoops ReinierKoops merged commit 77f303f into main Mar 28, 2024
15 checks passed
@dependabot dependabot bot deleted the dependabot/pip/catboost-lt-1.3 branch March 28, 2024 05:28