Forward-merge branch-23.08 to branch-23.10 #3783

Merged 1 commit on Aug 14, 2023

Commits on Aug 14, 2023

  1. Makes copy of input ddf to work around dropped column names (#3776)

    When multiple graphs are created from the same dask_cudf dataframe, a metadata mismatch occurs if one or more partitions are empty. Specifically, during the second graph creation, which reuses the dask_cudf dataframe already used and modified by the first, the metadata is not preserved for partitions holding empty dataframes. The cause is that the second graph creation reuses a _reference_ to the input dataframe, which was partially modified during the first graph creation.
    
    This PR makes a copy of the input dataframe right after the repartition call to avoid that alteration.
    
    Authors:
       - jnke2016 (jnke@gmail.com)
    
    Approvers:
       - Vibhu Jawa (https://github.com/VibhuJawa)
       - Alex Barghi (https://github.com/alexbarghi-nv)
       - Rick Ratzel (https://github.com/rlratzel)
    jnke2016 authored Aug 14, 2023
    20dca85
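
The aliasing problem described in the commit message can be illustrated with a minimal pure-Python sketch. This is not the cuGraph or dask_cudf code: a plain dict stands in for the dataframe, and the function names `build_graph_aliasing` and `build_graph_copying` and the column names are hypothetical, chosen only to show why reusing a reference to a mutated input breaks a second call and why copying the input first avoids it.

```python
import copy


def build_graph_aliasing(edges):
    """Hypothetical builder that renames columns on the caller's object in place."""
    edges["src"] = edges.pop("source")
    edges["dst"] = edges.pop("destination")
    return list(zip(edges["src"], edges["dst"]))


def build_graph_copying(edges):
    """Same hypothetical builder, but it works on a private copy of the input."""
    edges = copy.deepcopy(edges)  # the caller's object is left untouched
    edges["src"] = edges.pop("source")
    edges["dst"] = edges.pop("destination")
    return list(zip(edges["src"], edges["dst"]))


# With the copy, the same input can be used to build two graphs.
frame = {"source": [0, 1], "destination": [1, 2]}
first = build_graph_copying(frame)
second = build_graph_copying(frame)

# Without the copy, the first call alters the input, so the second call fails.
aliased = {"source": [0, 1], "destination": [1, 2]}
build_graph_aliasing(aliased)
try:
    build_graph_aliasing(aliased)
    second_call_failed = False
except KeyError:
    second_call_failed = True
```

In this stand-in the failure shows up as a `KeyError`, because the original column names are gone. In the PR the symptom is a metadata mismatch on empty partitions, but the underlying cause, a shared reference to a modified input, is the same kind of issue.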