Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(clean-install): add clean install feature #70

Open
wants to merge 1 commit into
base: develop
Choose a base branch
from

Conversation

petermefrandsen
Copy link

add feature that enables a clean install of documentation by deleting all childs with specified parent

@petermefrandsen
Copy link
Author

petermefrandsen commented Apr 3, 2023

@iamjackg FYI

Draft PR

Still need

  • Test (unit)
  • Cleanup / refactor both api.py & main.py

Subtask:
Potentially also add --soft-clean-install flag, that compares the pages_to_upload with the actual child_descendants, the titles from child_descendants that do not exist in pages_to_upload should be deleted.

@petermefrandsen petermefrandsen force-pushed the feature/clean-flag branch 6 times, most recently from 013a4ad to 1b95794 Compare April 4, 2023 21:09
@petermefrandsen
Copy link
Author

@iamjackg
I have added some code in a state that could be ready for a potential review?
Again, no formal experience with Python, so kinda had to learn by doing.

Anyways, the PR does not currently take in account what would happen if the script broke during a clean install (reverting changes).

I still see the potential for a --soft-clean-install by making small changes to the __main__.py around lines 360-362, by extending it with a couple of lines with logic.

@petermefrandsen petermefrandsen marked this pull request as ready for review April 4, 2023 21:18
@petermefrandsen petermefrandsen force-pushed the feature/clean-flag branch 3 times, most recently from d679248 to 96e909e Compare April 5, 2023 19:52
…mentation by deleting all childs with specified parent
@petermefrandsen
Copy link
Author

@iamjackg I added the "soft" install to the current implementation.

I don't have insight into how I could use the tui.py to write pretty to the command line, as it seems I would have to do a lot of rewriting of that file to make it work without hacks.

@iamjackg iamjackg changed the base branch from master to develop April 9, 2023 17:09
@iamjackg
Copy link
Owner

iamjackg commented Apr 9, 2023

This is excellent work! Unfortunately it looks like getting descendant pages with content/<id>/descendant/page is actually not supported in Confluence Server. 😞 It's only available in Confluence Cloud.

Thank you so much for putting this together: it's an excellent start. I'll have to do some work to implement a version that works on both Cloud and Server, likely by just building the tree myself while recursing through content/<id>/child.

@iamjackg
Copy link
Owner

iamjackg commented Apr 9, 2023

I'll leave some comments anyway to make sure I understand your reasoning behind some of the choices 👍

@@ -308,6 +313,11 @@ def main():
)
sys.exit(1)

if args.parent_title is not None and args.parent_id is not None:
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is actually already ensured by those arguments being part of a mutually exclusive group :)

md2cf/md2cf/__main__.py

Lines 84 to 105 in e842bfd

parent_group = page_group.add_mutually_exclusive_group()
parent_group.add_argument(
"-a",
"--parent-title",
help="title of the parent page under which the new page will be uploaded",
)
parent_group.add_argument(
"-A",
"--parent-id",
help="ID of the parent page under which the new page will be uploaded",
)
parent_group.add_argument(
"--top-level",
action="store_true",
help="upload the page tree starting from the top level (no top level parent)",
)
page_group.add_argument(
"-t",
"--title",
help="a title for the page. Determined from the document if missing",
)

@@ -347,6 +357,10 @@ def main():
)
sys.exit(1)

if args.clean_install:
# This runs even though that there are no files found, i.e. len(pages_to_upload) == 0
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should it? I'm curious if this was more of an explicit choice or a TODO.

It's funny though: I never considered the edge case where nothing would be found at all. I should add a test for that.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was actually an explicit choice from the initial thoughts of having it being a "complete clean slate flag" that removed everything before re-uploading the documentation.

Considering that it was changed to be the diff between AS/IS and TO/BE step, a much less intrusive way of making this work, would be moving this post uploading the files and then making this part of a "clean-up step" which could be presented in the tui.py file as well

def perform_clean_install(confluence, pages_to_upload, args):
page_descendants = get_page_descendants(confluence, args)

pages_to_purge = [page for page in page_descendants if page['title'] not in [p.title for p in pages_to_upload]]
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is actually pretty implicit, so it's not surprising you got tripped up, but the title for a page can change all the way up to this function call later on, unfortunately:

pre_process_page(page, args, postface_markup, preface_markup, space_info)

That would be another reason for the cleanup to potentially be done after uploading everything (and for me to add a comment mentioning the footgun).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree - the functionality should be moved!


pages_to_purge = [page for page in page_descendants if page['title'] not in [p.title for p in pages_to_upload]]

if not args.dry_run:
Copy link
Owner

@iamjackg iamjackg Apr 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick/suggestion: it would be cleaner to only pass the arguments needed by the function. In this case, dry_run for this section and parent_title, parent_id, and space. This is more of a note to myself, because the other functions I wrote in __main__.py definitely break this rule too. I got lazy while refactoring :P

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree 🙂

On a side note_ I also wanted to give main.py some refactoring to get closer to "clean code" described by Uncle Bob, but again, I am not the master of python, so I did not want to do to challenge the feature functionality too much.

If you'd like, I might work on that on another branch, when this feature gets ready for a merge / merged 🙂.

@@ -75,6 +75,9 @@ def _post(self, path, **kwargs):
def _put(self, path, **kwargs):
return self._request("PUT", path, **kwargs)

def _delete(self, path, **kwargs):
return self.api.request("DELETE", urljoin(self.host, path), **kwargs)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I think you copied the wrong function here. This should be

Suggested change
return self.api.request("DELETE", urljoin(self.host, path), **kwargs)
return self._request("DELETE", path, **kwargs)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah.. 🙈

@@ -208,6 +212,13 @@ def update_page(
"labels": [{"name": label, "prefix": "global"} for label in labels]
}

if status is not None:
if ['current', 'trashed', 'deleted', 'historical', 'draft'].count(status) > 0:
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Pythonic way to do this is actually

Suggested change
if ['current', 'trashed', 'deleted', 'historical', 'draft'].count(status) > 0:
if status in ['current', 'trashed', 'deleted', 'historical', 'draft']:

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I live to learn, and today was some python!

Comment on lines +332 to +352
def purge_page(
self,
page=None
):
"""
Delete page in a space

Args:
page (page): the page to be purged

Returns:
The response from the API

"""
if page is not None:
params = {"status": "trashed"}
page_from_get = self.get_page(page_id=page.id)
self.update_page(page=page_from_get, body="", status="trashed")
return self._delete(f"content/{page.id}", params=params)
else:
raise ValueError("Page cannot be None")
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the parameter is not optional, I would just remove the =None and the error handling :)

Suggested change
def purge_page(
self,
page=None
):
"""
Delete page in a space
Args:
page (page): the page to be purged
Returns:
The response from the API
"""
if page is not None:
params = {"status": "trashed"}
page_from_get = self.get_page(page_id=page.id)
self.update_page(page=page_from_get, body="", status="trashed")
return self._delete(f"content/{page.id}", params=params)
else:
raise ValueError("Page cannot be None")
def purge_page(
self,
page
):
"""
Delete page in a space
Args:
page (page): the page to be purged
Returns:
The response from the API
"""
params = {"status": "trashed"}
page_from_get = self.get_page(page_id=page.id)
self.update_page(page=page_from_get, body="", status="trashed")
return self._delete(f"content/{page.id}", params=params)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Less code is better indeed!

@petermefrandsen
Copy link
Author

This is excellent work! Unfortunately it looks like getting descendant pages with content/<id>/descendant/page is actually not supported in Confluence Server. 😞 It's only available in Confluence Cloud.

Thank you so much for putting this together: it's an excellent start. I'll have to do some work to implement a version that works on both Cloud and Server, likely by just building the tree myself while recursing through content/<id>/child.

I had actually hoped I finally found an API endpoint available on both versions. However, it does seem that the server version is getting heavily outdated. Consequently, you are now using deprecated endpoints in the cloud version. This could call for the need to make a flag that can target two different api.py files; one for on-premises and one for cloud 🤔

That being said, I am happy that I could help! 🎉 Would you like me to do some fixes/changes according to your comments, or would you like to have over the branch, or are you going to do a complete rework? 🙂

@iamjackg
Copy link
Owner

iamjackg commented Apr 9, 2023

Yeah, I'm sure we'll reach a point soon where the Cloud version won't work anymore. Unfortunately I don't have access to a Cloud instance: I originally built this for a Server instance, and the fact that it works with Confluence Cloud is more of a happy accident.

I'm mostly curious about #70 (comment), since I can't tell if it's intentionally implemented that way or not.

@petermefrandsen
Copy link
Author

Yeah, I'm sure we'll reach a point soon where the Cloud version won't work anymore. Unfortunately I don't have access to a Cloud instance: I originally built this for a Server instance, and the fact that it works with Confluence Cloud is more of a happy accident.

I'm mostly curious about #70 (comment), since I can't tell if it's intentionally implemented that way or not.

I should have commented on that already, but in short, I agree with moving the "clean-up" to post-upload.
Basically, the placement originated from before the idea of purging everything, which now has been changed to a soft purge.

Regarding Confluent Cloud, I have tested this against a free account, so there is no need to have a credit card out of your pocket 🙂

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants