Add an efficient way in the arrow expression handler to check if an expression result needs a schema transform #396

Open
nicklan opened this issue Oct 14, 2024 · 0 comments

Collaborator

nicklan commented Oct 14, 2024

We pass an expected output schema into the expression handler. In the expression handler used by the default engines, we currently first recurse over the current and target schemas to look for any mismatches, and then apply the transform. This causes two full traversals of the data, even if only one small thing has changed.

Instead we could use the initial traversal to mark the highest points in the tree where we know there are no transformations below, and limit the second traversal to the minimal amount of work.
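
To make the idea concrete, here is a minimal, language-agnostic sketch in Python. It is not the kernel's arrow code: `Struct`, `Cast`, `StructPlan`, `build_plan`, and `apply_plan` are all hypothetical names, columns are modeled as plain lists and dicts, and fields are assumed to be matched by position. The first traversal walks only the schemas and returns `None` at the highest node where nothing below needs to change; the second traversal returns such subtrees untouched.

```python
from dataclasses import dataclass

# Hypothetical, simplified schema model: a type is either a primitive
# name ("int", "string", ...) or a Struct of (name, type) pairs.
@dataclass(frozen=True)
class Struct:
    fields: tuple

@dataclass
class Cast:
    target: str  # primitive type the leaf column must be converted to

@dataclass
class StructPlan:
    names: list     # output field names, taken from the target schema
    children: list  # per field: None (reuse as-is), Cast, or StructPlan

# Assumed cast table for this sketch only.
CASTS = {"int": int, "float": float, "string": str}

def build_plan(current, target):
    """First traversal (schemas only). None means: no transform at or below here."""
    if isinstance(current, Struct) and isinstance(target, Struct):
        if len(current.fields) != len(target.fields):
            raise TypeError("field count mismatch")
        children = [build_plan(c[1], t[1])
                    for c, t in zip(current.fields, target.fields)]
        renamed = any(c[0] != t[0]
                      for c, t in zip(current.fields, target.fields))
        if not renamed and all(child is None for child in children):
            return None  # highest point with nothing to do below
        return StructPlan([t[0] for t in target.fields], children)
    if isinstance(current, str) and isinstance(target, str):
        return None if current == target else Cast(target)
    raise TypeError("cannot transform between struct and primitive")

def apply_plan(data, plan):
    """Second traversal: only descends into nodes the plan marked."""
    if plan is None:
        return data  # unchanged subtree is returned without being visited
    if isinstance(plan, Cast):
        return [CASTS[plan.target](value) for value in data]
    return {name: apply_plan(column, child)
            for column, name, child
            in zip(data.values(), plan.names, plan.children)}

current = Struct((("a", "int"),
                  ("b", Struct((("x", "int"), ("y", "string"))))))
target = Struct((("a", "string"),
                 ("b", Struct((("x", "int"), ("y", "string"))))))
data = {"a": [1, 2], "b": {"x": [3, 4], "y": ["p", "q"]}}

plan = build_plan(current, target)
result = apply_plan(data, plan)
```

In this sketch, `build_plan(current, target) is None` doubles as the cheap "does this result need a schema transform at all?" check, and in the example only column `a` is rewritten while the nested struct `b` is passed through as the same object.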

See #331 (comment) for the genesis of this idea and example pseudo-code.
