-
Notifications
You must be signed in to change notification settings - Fork 350
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added Parallelism configuration option to lakectl #8283
base: master
Are you sure you want to change the base?
Added Parallelism configuration option to lakectl #8283
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Concise fix :-)
Confused about the precedence of the 2 settings for parallelism, so asking for a change in case I am right about the required precedence.
cmd/lakectl/cmd/root.go
Outdated
defaultSyncParallelism = 25 | ||
defaultSyncPresign = true | ||
defaultNoProgress = false | ||
defaultParallelismConfig = -1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use 0? If not, we need to document what "-1" means.
pkg/local/sync.go
Outdated
parallelismToUse := s.cfg.Parallelism | ||
if parallelismToUse <= 0 { | ||
parallelismToUse = s.cfg.SyncFlags.Parallelism | ||
} | ||
for i := 0; i < parallelismToUse; i++ { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2 changes, the first is more important:
- AFAIU SyncFlags.Parallelism is more specific than the new setting, so if both are specified we should use it and not the new more global setting.
- It is enough to call the var "parallelism", it will obviously be used.
parallelismToUse := s.cfg.Parallelism | |
if parallelismToUse <= 0 { | |
parallelismToUse = s.cfg.SyncFlags.Parallelism | |
} | |
for i := 0; i < parallelismToUse; i++ { | |
parallelism := s.cfg.SyncFlags.Parallelism | |
if parallelism <= 0 { | |
parallelism = s.cfg.Parallelism | |
} | |
for i := 0; i < parallelism; i++ { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you!
I understand, problem is, the default for s.cfg.SyncFlags.Parallelism
is 25, so unless the user flagged with a 0 or less, the used value will be the default flag and not the configuration.
🎊 PR Preview 665c8e1 has been successfully built and deployed to https://treeverse-lakeFS-preview-pr-8283.surge.sh 🕐 Build time: 0.009s 🤖 By surge-preview |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I understand the flag redundancy here.
You can have both a default param value and override it with a config value if exists it's not mutually exclusive.
defaultParallelismConfig = 25 | ||
defaultSyncParallelism = 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure why both are needed.
We should:
- Use configuration param if provided
- Use modified param value if passed as part of command
- Fallback to default param value
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My thought was, that positive default value should come from one source only. In case both have positive default values how will we determine which one to use? I chose the config val to be positive by default, following Ariel's change request to go from the more general case to specific case.
In any case, this code run according to the priorities:
- no flag no config val - uses 25 as config default
- no flag with config - uses config
- flag with no config - uses the flag,
- flag with config - uses flag
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your priorities seem correct. We have a way to determine whether a flag in Cobra was explicitly provided by the user or not (there's at least one place in the code where we use it AFAIR).
You can use the same method to do that and determine whether we need to use the config value or the param value
Closes #8159
Change Description
Added the option to configure the amount of parallelism for lakectl.
See feature request for more backround.
Currently, the configuration nests in a new section "Options"