feat(clean-install): add clean install feature #70

petermefrandsen · 2023-04-03T20:35:09Z

add feature that enables a clean install of documentation by deleting all childs with specified parent

petermefrandsen · 2023-04-03T20:40:28Z

Draft PR

Still need

Test (unit)
Cleanup / refactor both api.py & main.py

Subtask:
Potentially also add --soft-clean-install flag, that compares the pages_to_upload with the actual child_descendants, the titles from child_descendants that do not exist in pages_to_upload should be deleted.

petermefrandsen · 2023-04-04T21:18:16Z

@iamjackg
I have added some code in a state that could be ready for a potential review?
Again, no formal experience with Python, so kinda had to learn by doing.

Anyways, the PR does not currently take in account what would happen if the script broke during a clean install (reverting changes).

I still see the potential for a --soft-clean-install by making small changes to the __main__.py around lines 360-362, by extending it with a couple of lines with logic.

…mentation by deleting all childs with specified parent

petermefrandsen · 2023-04-05T19:56:54Z

@iamjackg I added the "soft" install to the current implementation.

I don't have insight into how I could use the tui.py to write pretty to the command line, as it seems I would have to do a lot of rewriting of that file to make it work without hacks.

iamjackg · 2023-04-09T17:23:03Z

This is excellent work! Unfortunately it looks like getting descendant pages with content/<id>/descendant/page is actually not supported in Confluence Server. 😞 It's only available in Confluence Cloud.

Thank you so much for putting this together: it's an excellent start. I'll have to do some work to implement a version that works on both Cloud and Server, likely by just building the tree myself while recursing through content/<id>/child.

iamjackg · 2023-04-09T17:40:34Z

I'll leave some comments anyway to make sure I understand your reasoning behind some of the choices 👍

iamjackg · 2023-04-09T17:42:08Z

md2cf/__main__.py

@@ -308,6 +313,11 @@ def main():
        )
        sys.exit(1)

+    if args.parent_title is not None and args.parent_id is not None:


This is actually already ensured by those arguments being part of a mutually exclusive group :)

md2cf/md2cf/__main__.py

Lines 84 to 105 in e842bfd

parent_group = page_group.add_mutually_exclusive_group()

parent_group.add_argument(

"-a",

"--parent-title",

help="title of the parent page under which the new page will be uploaded",

)

parent_group.add_argument(

"-A",

"--parent-id",

help="ID of the parent page under which the new page will be uploaded",

)

parent_group.add_argument(

"--top-level",

action="store_true",

help="upload the page tree starting from the top level (no top level parent)",

)

page_group.add_argument(

"-t",

"--title",

help="a title for the page. Determined from the document if missing",

)

iamjackg · 2023-04-09T17:45:13Z

md2cf/__main__.py

@@ -347,6 +357,10 @@ def main():
                )
                sys.exit(1)

+    if args.clean_install:
+        # This runs even though that there are no files found, i.e. len(pages_to_upload) == 0


Should it? I'm curious if this was more of an explicit choice or a TODO.

It's funny though: I never considered the edge case where nothing would be found at all. I should add a test for that.

It was actually an explicit choice from the initial thoughts of having it being a "complete clean slate flag" that removed everything before re-uploading the documentation.

Considering that it was changed to be the diff between AS/IS and TO/BE step, a much less intrusive way of making this work, would be moving this post uploading the files and then making this part of a "clean-up step" which could be presented in the tui.py file as well

iamjackg · 2023-04-09T17:48:17Z

md2cf/__main__.py

+def perform_clean_install(confluence, pages_to_upload, args):
+    page_descendants = get_page_descendants(confluence, args)
+
+    pages_to_purge = [page for page in page_descendants if page['title'] not in [p.title for p in pages_to_upload]]


This is actually pretty implicit, so it's not surprising you got tripped up, but the title for a page can change all the way up to this function call later on, unfortunately:

md2cf/md2cf/__main__.py

Line 389 in e842bfd

pre_process_page(page, args, postface_markup, preface_markup, space_info)

That would be another reason for the cleanup to potentially be done after uploading everything (and for me to add a comment mentioning the footgun).

I agree - the functionality should be moved!

iamjackg · 2023-04-09T17:52:42Z

md2cf/__main__.py

+
+    pages_to_purge = [page for page in page_descendants if page['title'] not in [p.title for p in pages_to_upload]]
+
+    if not args.dry_run:


Nitpick/suggestion: it would be cleaner to only pass the arguments needed by the function. In this case, dry_run for this section and parent_title, parent_id, and space. This is more of a note to myself, because the other functions I wrote in __main__.py definitely break this rule too. I got lazy while refactoring :P

I agree 🙂

On a side note_ I also wanted to give main.py some refactoring to get closer to "clean code" described by Uncle Bob, but again, I am not the master of python, so I did not want to do to challenge the feature functionality too much.

If you'd like, I might work on that on another branch, when this feature gets ready for a merge / merged 🙂.

iamjackg · 2023-04-09T17:54:45Z

md2cf/api.py

@@ -75,6 +75,9 @@ def _post(self, path, **kwargs):
    def _put(self, path, **kwargs):
        return self._request("PUT", path, **kwargs)

+    def _delete(self, path, **kwargs):
+        return self.api.request("DELETE", urljoin(self.host, path), **kwargs)


Ah, I think you copied the wrong function here. This should be

Suggested change

return self.api.request("DELETE", urljoin(self.host, path), **kwargs)

return self._request("DELETE", path, **kwargs)

Yeah.. 🙈

iamjackg · 2023-04-09T17:56:15Z

md2cf/api.py

@@ -208,6 +212,13 @@ def update_page(
                "labels": [{"name": label, "prefix": "global"} for label in labels]
            }

+        if status is not None:
+            if ['current', 'trashed', 'deleted', 'historical', 'draft'].count(status) > 0:


The Pythonic way to do this is actually

Suggested change

if ['current', 'trashed', 'deleted', 'historical', 'draft'].count(status) > 0:

if status in ['current', 'trashed', 'deleted', 'historical', 'draft']:

I live to learn, and today was some python!

iamjackg · 2023-04-09T18:00:10Z

md2cf/api.py

+    def purge_page(
+        self,
+        page=None
+    ):
+        """
+        Delete page in a space
+
+        Args:
+            page (page): the page to be purged
+
+        Returns:
+            The response from the API
+
+        """
+        if page is not None:
+            params = {"status": "trashed"}
+            page_from_get = self.get_page(page_id=page.id)
+            self.update_page(page=page_from_get, body="", status="trashed")
+            return self._delete(f"content/{page.id}", params=params)
+        else:
+            raise ValueError("Page cannot be None")


Since the parameter is not optional, I would just remove the =None and the error handling :)

Suggested change

def purge_page(

self,

page=None

):

"""

Delete page in a space

Args:

page (page): the page to be purged

Returns:

The response from the API

"""

if page is not None:

params = {"status": "trashed"}

page_from_get = self.get_page(page_id=page.id)

self.update_page(page=page_from_get, body="", status="trashed")

return self._delete(f"content/{page.id}", params=params)

else:

raise ValueError("Page cannot be None")

def purge_page(

self,

page

):

"""

Delete page in a space

Args:

page (page): the page to be purged

Returns:

The response from the API

"""

params = {"status": "trashed"}

page_from_get = self.get_page(page_id=page.id)

self.update_page(page=page_from_get, body="", status="trashed")

return self._delete(f"content/{page.id}", params=params)

Less code is better indeed!

petermefrandsen · 2023-04-09T18:44:43Z

This is excellent work! Unfortunately it looks like getting descendant pages with content/<id>/descendant/page is actually not supported in Confluence Server. 😞 It's only available in Confluence Cloud.

Thank you so much for putting this together: it's an excellent start. I'll have to do some work to implement a version that works on both Cloud and Server, likely by just building the tree myself while recursing through content/<id>/child.

I had actually hoped I finally found an API endpoint available on both versions. However, it does seem that the server version is getting heavily outdated. Consequently, you are now using deprecated endpoints in the cloud version. This could call for the need to make a flag that can target two different api.py files; one for on-premises and one for cloud 🤔

That being said, I am happy that I could help! 🎉 Would you like me to do some fixes/changes according to your comments, or would you like to have over the branch, or are you going to do a complete rework? 🙂

iamjackg · 2023-04-09T18:52:03Z

Yeah, I'm sure we'll reach a point soon where the Cloud version won't work anymore. Unfortunately I don't have access to a Cloud instance: I originally built this for a Server instance, and the fact that it works with Confluence Cloud is more of a happy accident.

I'm mostly curious about #70 (comment), since I can't tell if it's intentionally implemented that way or not.

petermefrandsen · 2023-04-09T18:58:38Z

Yeah, I'm sure we'll reach a point soon where the Cloud version won't work anymore. Unfortunately I don't have access to a Cloud instance: I originally built this for a Server instance, and the fact that it works with Confluence Cloud is more of a happy accident.

I'm mostly curious about #70 (comment), since I can't tell if it's intentionally implemented that way or not.

I should have commented on that already, but in short, I agree with moving the "clean-up" to post-upload.
Basically, the placement originated from before the idea of purging everything, which now has been changed to a soft purge.

Regarding Confluent Cloud, I have tested this against a free account, so there is no need to have a credit card out of your pocket 🙂

petermefrandsen force-pushed the feature/clean-flag branch 6 times, most recently from 013a4ad to 1b95794 Compare April 4, 2023 21:09

petermefrandsen marked this pull request as ready for review April 4, 2023 21:18

petermefrandsen force-pushed the feature/clean-flag branch 3 times, most recently from d679248 to 96e909e Compare April 5, 2023 19:52

feat(clean-install): add feature that enables a clean install of docu…

62c714e

…mentation by deleting all childs with specified parent

petermefrandsen force-pushed the feature/clean-flag branch from 96e909e to 62c714e Compare April 5, 2023 19:54

iamjackg changed the base branch from master to develop April 9, 2023 17:09

iamjackg reviewed Apr 9, 2023

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(clean-install): add clean install feature #70

feat(clean-install): add clean install feature #70

petermefrandsen commented Apr 3, 2023

petermefrandsen commented Apr 3, 2023 •

edited

Loading

petermefrandsen commented Apr 4, 2023

petermefrandsen commented Apr 5, 2023

iamjackg commented Apr 9, 2023

iamjackg commented Apr 9, 2023

iamjackg Apr 9, 2023

iamjackg Apr 9, 2023

petermefrandsen Apr 9, 2023

iamjackg Apr 9, 2023

petermefrandsen Apr 9, 2023

iamjackg Apr 9, 2023 •

edited

Loading

petermefrandsen Apr 9, 2023

iamjackg Apr 9, 2023

petermefrandsen Apr 9, 2023

iamjackg Apr 9, 2023

petermefrandsen Apr 9, 2023

iamjackg Apr 9, 2023

petermefrandsen Apr 9, 2023

petermefrandsen commented Apr 9, 2023

iamjackg commented Apr 9, 2023

petermefrandsen commented Apr 9, 2023

	parent_group = page_group.add_mutually_exclusive_group()
	parent_group.add_argument(
	"-a",
	"--parent-title",
	help="title of the parent page under which the new page will be uploaded",
	)
	parent_group.add_argument(
	"-A",
	"--parent-id",
	help="ID of the parent page under which the new page will be uploaded",
	)
	parent_group.add_argument(
	"--top-level",
	action="store_true",
	help="upload the page tree starting from the top level (no top level parent)",
	)

	page_group.add_argument(
	"-t",
	"--title",
	help="a title for the page. Determined from the document if missing",
	)


		pages_to_purge = [page for page in page_descendants if page['title'] not in [p.title for p in pages_to_upload]]

		if not args.dry_run:

	return self.api.request("DELETE", urljoin(self.host, path), **kwargs)
	return self._request("DELETE", path, **kwargs)

	if ['current', 'trashed', 'deleted', 'historical', 'draft'].count(status) > 0:
	if status in ['current', 'trashed', 'deleted', 'historical', 'draft']:

feat(clean-install): add clean install feature #70

Are you sure you want to change the base?

feat(clean-install): add clean install feature #70

Conversation

petermefrandsen commented Apr 3, 2023

petermefrandsen commented Apr 3, 2023 • edited Loading

petermefrandsen commented Apr 4, 2023

petermefrandsen commented Apr 5, 2023

iamjackg commented Apr 9, 2023

iamjackg commented Apr 9, 2023

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

iamjackg Apr 9, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

petermefrandsen commented Apr 9, 2023

iamjackg commented Apr 9, 2023

petermefrandsen commented Apr 9, 2023

petermefrandsen commented Apr 3, 2023 •

edited

Loading

iamjackg Apr 9, 2023 •

edited

Loading