Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MetaWorkflow Handler and MetaWorkflow Run Handler: Pipeline Automation #11

Open
wants to merge 43 commits into
base: master
Choose a base branch
from

Conversation

vstevensf
Copy link

No description provided.

@vstevensf vstevensf requested a review from drio18 November 4, 2022 18:48
Copy link
Collaborator

@drio18 drio18 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dropping some comments for consideration. Happy to review again whenever you'd like, especially once there are tests available for the dependency sorting/checking. I think I understand the gist of it all and it sounds right, but I didn't go in detail; I'll do so once the tests are in.

magma/topological_sort.py Outdated Show resolved Hide resolved
magma/topological_sort.py Outdated Show resolved Hide resolved
magma/metawfl_handler.py Outdated Show resolved Hide resolved
magma/metawfl_handler.py Outdated Show resolved Hide resolved
magma/topological_sort.py Outdated Show resolved Hide resolved
magma/topological_sort.py Outdated Show resolved Hide resolved
magma/utils.py Outdated Show resolved Hide resolved
test/test_topological_sort.py Outdated Show resolved Hide resolved
test/test_topological_sort.py Outdated Show resolved Hide resolved
test/test_utils_magma.py Outdated Show resolved Hide resolved
vstevensf and others added 16 commits November 9, 2022 18:56
Changes included additions of different directed graph global vars
for testing -- tests for the topological sort on their way.
Also removed function for creating "forward dependencies" --
this version of topological sort works with "backward dependencies".
Addressed some prior comments. Future planned changes include
creating a class for the topological sort function to decrease
the number of command line arguments that are common among
several functions.
Also addressed some prior comments on draft.
Changes included merging some former magma utils fxns
into this class, and removing extraneous utils fxns.
Also includes new test file for this class.
includes addition of custom Exception classes
@vstevensf vstevensf marked this pull request as ready for review May 16, 2023 06:06
Copy link
Collaborator

@drio18 drio18 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leaving a batch of comments, some of which will require further discussion. It looks like there's still a good deal to complete, clean up, and test properly, so we should discuss a plan tomorrow for how best to utilize the remaining time.

Comment on lines +1 to +5
#!/usr/bin/env python3

#################################################################
# Vars
#################################################################
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for the shebang or the headers here and elsewhere. I know Michele does this, but I find it unnecessary clutter.

Comment on lines 6 to 34
TITLE = "title"

# MetaWorkflow Handler attributes
PROJECT = "project"
INSTITUTION = "institution"
UUID = "uuid"
META_WORKFLOWS = "meta_workflows"
ORDERED_META_WORKFLOWS = "ordered_meta_workflows"
META_WORKFLOW = "meta_workflow"
NAME = "name"
DEPENDENCIES = "dependencies"
ITEMS_FOR_CREATION_PROP_TRACE = "items_for_creation_property_trace"
ITEMS_FOR_CREATION_UUID = "items_for_creation_uuid"

# MetaWorkflow Run Handler attributes
STATUS = "status"
FINAL_STATUS = "final_status"
ASSOCIATED_META_WORKFLOW_HANDLER = "meta_workflow_handler"
ASSOCIATED_ITEM = "associated_item"
META_WORKFLOW_RUN = "meta_workflow_run"
META_WORKFLOW_RUNS = "meta_workflow_runs"
ITEMS_FOR_CREATION = "items_for_creation"
ERROR = "error"
# statuses
PENDING = "pending"
RUNNING = "running"
COMPLETED = "completed"
FAILED = "failed"
STOPPED = "stopped"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be good to organize the constants into classes so they can be imported in groups. In general, importing * is not a good practice, and having the constants on classes makes it more clear where they are coming from.

Also, I don't believe all of these constants should be in the magma directory, as some are portal specific, and those should be differentiated. In general, I'd be fine with all this code living in magma_ff for now, but that's something to discuss more with Michele.

Comment on lines 37 to 46
#TODO: the following is here in case dup flag is added in the future
# MWFR_TO_HANDLER_STEP_STATUS_DICT = {
# "pending": "pending",
# "running": "running",
# "completed": "completed",
# "failed": "failed",
# "inactive": "pending",
# "stopped": "stopped",
# "quality metric failed": "failed"
# }
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is still needed regardless of the duplication flag stuff...

Regardless, it is portal-specific and thus shouldn't live in magma, and there's no need to leave the commented out code around if it's no longer needed.

################################################
# ValidatedDictionary TODO: eventually make part of dcicutils?
################################################
class ValidatedDictionary(object):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider renaming this. It's not a dictionary, but rather a class, so ValidatedAttributes would probably be more apt.

In general, I would stay away from this style of code of arbitrarily placing all key, value pairs in a dictionary on the class itself. I know it's done elsewhere in this project, but it complicates tracing the origin of attributes to someone new to the code as well as placing an unknown number of attributes on the class that are then capable of being accidentally overwritten. As they say for python, "explicit is better than implicit." To my eyes, a superior alternative to this is to set an attribute on the class via __init__ such as self.properties, and then placing the input dictionary there and validating and setting other properties or attributes appropriately.

Comment on lines 36 to 37
if retrieved_attr is None:
raise AttributeError("attribute %s cannot have value 'None'." % attribute)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why can't an attribute be None? Would a falsy but non-None value be acceptable?

for running_mwfr_step_name in self.mwfr_handler_obj.running_steps():

# Get run uuid
run_uuid = self.mwfr_handler_obj.get_step_attr(running_mwfr_step_name, uuid)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't believe uuid is defined here.

# and convert property trace(s) to uuid(s)
else:
property_traces = getattr(meta_workflow_step, ITEMS_FOR_CREATION_PROP_TRACE, None)
if not isinstance(property_traces, list):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this has to be a list, no?

Comment on lines +208 to +219
item_uuid = make_embed_request(
self.associated_item_identifier,
item_prop_trace
+ ".uuid", # TODO: are we assuming the user will include ".uuid" or @id as part of prop trace?
self.auth_key,
single_item=True,
)
if not item_uuid:
raise MetaWorkflowRunHandlerCreationError(
f"Invalid property trace '{item_prop_trace}' on item with the following ID: {self.associated_item_identifier}"
)
items_for_creation_uuids.append(item_uuid)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this logic will work here, as you may receive multiple UUIDs, not just one, and you have to parse the result to obtain any UUIDs returned.

Comment on lines +323 to +331
@property
def get_project(self):
"""Retrieves project attribute from the associated item."""
return self.retrieved_associated_item.get(PROJECT)

@property
def get_institution(self):
"""Retrieves institution attribute from the associated item."""
return self.retrieved_associated_item.get(INSTITUTION)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a personal preference, consider using get in a method name for non-attribute/property methods, and using attribute-like names for properties (they live somewhere in-between, but I think of them more like attributes than methods).

magma_ff/run_metawflrun_handler.py Outdated Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants