Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Use StorageManager in execute_DAG #258

Draft
wants to merge 95 commits into
base: main
Choose a base branch
from
Draft

Conversation

dwhswenson
Copy link
Member

@dwhswenson dwhswenson commented Dec 1, 2023

This is a subset of the work in #234, split off to facilitate review. This PR will merge into #186 (making it easier to see the additions specific to the PR, but meaning that this should be merged before that). It also includes the work in #257 (one of the old tests fails without that).

This PR integrates StorageManager into execute_DAG. Note that this still isn't a complete replacement of the storage system: a follow-up PR will handle serialization of StagingPaths.

TODO:

  • Check that we can exactly reproduce the old behavior with the ReproduceOldBehaviorStorageManager; I think some paths aren't quite right.
  • Add some tests that the directories are as expected in the ExecutionStorageDemoTest subclasses
  • Revisit when/where dag_label is given (also needed in #186)
  • Switch it so the context managers return Context (executor shouldn't need to assemble that)

dwhswenson and others added 30 commits April 26, 2023 13:05
Makes much more sense as _safe_to_delete_holding
right now, we seem to not be getting anything; see if this is
because someone has added a handler on the root logger or because
we can't caplog from a parent logger (may need specific __name__)
Comment on lines +20 to +22
from ..storage.storagemanager import StorageManager
from ..storage.externalresource.filestorage import FileStorage
from ..storage.externalresource.base import ExternalStorage
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we fix the namespace of storage so these are all available at the same level? from ..storage import StorageManager, FileStorage, etc

)
return storage_manager


def execute_DAG(protocoldag: ProtocolDAG, *,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are you keeping this for comparison for the PR? I don't think we need to keep the old behaviour around

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we told Joe that the API was stable enough that he could use it in workflows. So yeah, I spent a lot of time ensuring that there would be a code path that doesn't break the current API.


def new_execute_DAG( # TODO: this is a terrible name
protocoldag: ProtocolDAG,
dag_label: str,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is a DAG label now required?

and why can't the label just be taken off the input DAG?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Breaks current behavior, where the equivalent of the DAG label is given by the user as the -o from quickrun.

Copy link
Member Author

@dwhswenson dwhswenson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commenting on a couple of breaking changes these PRs introduce that people should be aware of.

Comment on lines +18 to +20
with open(ctx.shared / f'unit_{my_id}_shared.txt', 'w') as out:
out.write(f'unit {my_id} existed!\n')
with open(os.path.join(ctx.scratch, f'unit_{my_id}_scratch.txt'), 'w') as out:
with open(ctx.scratch / f'unit_{my_id}_scratch.txt', 'w') as out:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Breaking change: os.path functionality should not be used. Switch to the pathlib-like semantics of using /.

I don't see a way to have this not be breaking, since the PathLike created here needs to be one of ours. Options that involve keeping os.path would be (1) require something other than the system open, or (2) require manual conversion of any user-created path to one of our objects. We do support option 2 for cases where legacy code can't be converted, but the pathlib-like syntax is more readable and allows us to silently support any protocol that already uses that syntax.

Comment on lines +699 to +700
assert len(list(shared.iterdir())) == 0
# assert len(list(shared.iterdir())) == 5
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Extremely minor) breaking change: Previously, this left behind 5 empty shared directories. In the current implementation, empty directories are, by default, automatically deleted, so we have no directories within the shared space. Had these directories contained any files, you would still find all 5.

There's a parameter to keep empty directories, but the effect will also be to keep empty subdirectories, which I didn't think was the desired default behavior. My assumption is that it will be relatively rare to keep shared anyway, because there should be something in settings to determine whether any given file is scratch/shared/permanent, and anything intended for long-term storage will be put in permanent.

Adds a user story test where a type supported by a custom JSON codec is
added to the result dict, and that codec isn't registered as of
serialization object creation.
This is to avoid likely footguns related to using os.path.join. Includes
test of error on os.path.join.
with dag_ctx.running_unit(
dag_label, unit.key, attempt=attempt
) as context:
_logger.info("Starting unit {label}")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missing an f-string thingy here

Comment on lines 35 to 37
scratch: PathLike
shared: PathLike
shared: StagingRegistry
permanent: StagingRegistry
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these are meant to be stagingpaths not registries?

@dwhswenson dwhswenson changed the base branch from shared-object-v2 to main January 22, 2024 16:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants