Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

simp_le client running as unprivileged user #18

Closed
wants to merge 9 commits into from

Conversation

gronke
Copy link
Contributor

@gronke gronke commented Sep 26, 2016

fixes #17
closes #1

  • Use simp_le instead of certbot. It's recommended by LetsEncrypt for running without sudo
  • Provide ACME challenge on port 80 with static Python SimpleHTTPServer
  • Configurable expiration threshold
  • Support for CentOS (needs to be verified)
  • Users need to manually accept the LetsEncrypt Terms of Service by copying the SHA256 of the PDF document to letsencrypt_tos_sha256

ToDo

@gronke
Copy link
Contributor Author

gronke commented Sep 27, 2016

@jaywink this is a major change to the role internals, but I think the simp_le client is a better fit for our needs:

  • Certbot has more dependencies that were automatically installed, that are actually not involved in the certification itself.
  • Dropping privileges of this external script is important in productive environments. That is possible because we grant write access for /etc/letsencrypt, serve the ACME challenge on port 80 from a static Python webserver and install all global dependencies from Ansible, not an installer script
  • We can support more operating systems than the original Certbot script (e.g. FreeBSD)

@robbyoconnor this role does not yet implement the simplifications from #15 - I will rebase as soon as it landed in master. It makes the cert.yml file much better to read, so I'd like to wait for your PR.

@robbyoconnor
Copy link
Contributor

robbyoconnor commented Sep 27, 2016

@gronke no problem -- I just added travis CI tests for your nginx role

@gronke
Copy link
Contributor Author

gronke commented Sep 28, 2016

@robbyoconnor this one is now rebased on top of #15 - your changes will be applied too (and GitHub should automatically mark your Pull-Request as merged) when this one is merged.

Successfully tested compatibility of existing certificates created with certbot. The affected certificate permissions get updated, so that the letsencrypt_user becomes the owner. The only thing that happens twice is the LetsEncrypt account that will created (again) depending on your environment (staging, production).

@robbyoconnor
Copy link
Contributor

robbyoconnor commented Sep 28, 2016

👍 though it will not close..

@jaywink
Copy link
Owner

jaywink commented Sep 28, 2016

Well this is indeed interesting, thanks @gronke . While I look into simp_le client, one question immediately creeps into my mind: how do you plan we offer cert renewal without shutting down the web server if we're going to be binding another service to port 80?

Personally, while I understand the "no root" importance, I think even more important would be to enable renewing certs with zero downtime. I cringe whenever I run the role as a bunch of small sites have some seconds of possible requests going to /dev/null.

By keeping Apache/nginx in play we already had some kind of plan to provide this renewal (comments in #5 ) - but I'm not entirely sure how we can do that if we start using the Python server on port 80? Any ideas on that or maybe I'm missing something.

Thanks for your work.

Copy link
Owner

@jaywink jaywink left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments on the changes. Still, I think we need to solve the problem of not having to shut down web servers first at least in theory. Will this PR make it easier or harder?

@@ -52,28 +50,38 @@ Tested with the following:

* `letsencrypt_domain` - Domain the certificate is for.
* `letsencrypt_email` - Your email as certificate owner.
* `letsencrypt_tos_sha256` - Provide the SHA256 hash of the latest Terms-of-Service you agreed to
Copy link
Owner

@jaywink jaywink Sep 28, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since it's in the defaults, no need to mention it here - nice section related to it in the readme which could also have this variable name.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry. The plan was to remove it from the default, so that the user needs to confirm the terms first. I think it's a good idea to use the hash of the document for that. It's important to leave a hint in the Readme and provide proper error messages when the hash is invalid or missing.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd still like this default to be enabled in the defaults. The rationale is that the role should make things as easy as possible to use. Requiring the user to add extra variables to their role that could be provided as defaults IMHO is going backwards. The current role already has --agree-terms etc. The readme can clearly state which ToS the user will be accepting by using the role, and note that the user can change which ToS is being accepted via the variables.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LetsEncrypt TOS handling can be discussed in a separate issue. To keep the previous behavior of --agree-terms being enabled by default, we can just set the default to the hash of the current terms of service document and make a new release whenever this document was updated.

roles:
- role: ansible-letsencrypt
letsencrypt_user: "le-client-user"
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm this isn't mentioned in the readme required but is mentioned as letsencrypt in the defaults?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The intention was to show how to override. But it was more confusing than explaining.

done

letsencrypt_account_file: "{{ '/etc/letsencrypt/accounts/simp_le-' + ('staging' if letsencrypt_staging else 'production') + '.json' }}"
# ACME challenge server
letsencrypt_challenge_url: "https://{{ 'acme-staging' if letsencrypt_staging else 'acme-v01' }}.api.letsencrypt.org/directory"
# certbot custom arguments -- see: https://github.com/kuba/simp_le
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

certbot -> simp_le

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks! done

letsencrypt_challenge_url: "https://{{ 'acme-staging' if letsencrypt_staging else 'acme-v01' }}.api.letsencrypt.org/directory"
# certbot custom arguments -- see: https://github.com/kuba/simp_le
letsencrypt_args: []
letsencrypt_combined_args: []
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is probably here by accident.

# certbot custom arguments -- see: https://github.com/kuba/simp_le
letsencrypt_args: []
letsencrypt_combined_args: []
# default arguments passed to certbot
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

certbot -> simp_le

when: _letsencrypt_apt_packages is defined

- name: Dependencies are installed with RPM
apt:
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should probably be rpm :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.. you catched me pushing untested code 🙈


# user and group configuration

- name: letsencrypt_group exits
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

exists


# python webserver systemd wrapper

- name: webserver systemd service is installed
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I may be horribly wrong, but AFAIK systemd is not supported under 15.04. This role needs to target also the still supported 14.04 LTS, which uses Upstart scripts.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When ansible_service_mgr == "upstart" an upstart job will be deployed instead of a systemd service. Tested on 14.04.


# virtualenv and python dependencies

- name: virtualenv environment exists
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is actually no need to create the virtualenv - it will be created by the next task pip, see docs.

- name: Operating system dependencies
apt: name={{ item }} state=present

# dependencies
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the user needs to copy the vars from the correct file? Might need a readme entry for that.

Btw, we could look up machine facts to see whether it's Ubuntu, Debian or RPM based - though I have no idea how reliable that would be - especially the third.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's exactly how it's done here: tasks/main.yml#L9-L10

@jaywink
Copy link
Owner

jaywink commented Sep 28, 2016

Since this PR needs a bit more discussion and I would like to do a release asap, preferably before introducing major rewrites. Should I merge in #15 before the release or shall we keep it in here?

@gronke
Copy link
Contributor Author

gronke commented Sep 28, 2016

I had a few small changes on top of #15 applied during the rebase that removed unused variables. Please make sure to test the role before doing a release.

The shutdown problem remains the same as with the standalone certbot version. The SimpleHTTPServer is just a replacement to provide the same feature. Since we now export the ACME challenge to a definable directory, it should be easy to configure the webserver accordingly. The question is when the configuration happens. IMO it should be done in the role configuring the webserver - and suddenly we're in the middle of a chicken and egg dilemma. Do we first configure the server without SSL, run LetsEncrypt and then modify the server configuration again? What about the idempotence? I'm sure this discussion would quickly exceed the scope of this PR. Anyway, we do not make the problem more complicated than it was before.

@jaywink
Copy link
Owner

jaywink commented Sep 30, 2016

My main point of concern is that AFAIK there is no way to keep for example Apache running (on port 80) while launching SimpleHTTPServer on port 80. Or am I wrong? Because this would automatically mean that existing sites (other sites too, all vhosts) always need to shut down for some seconds to allow binding the python web server to port 80.

I know this change will not change anything regarding this - we already require downtime. But it might make fixing the downtime issue more difficult. We already have two choices how to not require downtime by dropping --standalone - but how can we do that with the Python web server?

If we accept this change, can we in the future drop the python web server and serve whatever simp_le needs using Apache/nginx? This way we could use ad hoc vhosts and not require shutting down the web servers at all.

Any other ideas how to use simp_le without needing to shut down Apache?

@robbyoconnor
Copy link
Contributor

Does it have to be Port 80?

@gronke
Copy link
Contributor Author

gronke commented Sep 30, 2016

Does it have to be Port 80?

No, there are other types of challenge validation methods defined in the ACME specification. The only mechanism implemented in certbot today is Simple HTTP, which requires to be served from port 80.

@gronke
Copy link
Contributor Author

gronke commented Sep 30, 2016

If we accept this change, can we in the future drop the python web server and serve whatever simp_le needs using Apache/nginx?

Yes. We need to introduce two more conditions to skip starting and stopping the Python server when not running in standalone mode. If that's the case, we expect the currently installed webserver to serve locations in .well-known/acme-challenge from the acme-challenge webroot defined as role variable letsencrypt_webroot_path. It is required to start serving static content from this directory on port 80 via HTTP before running simp_le.

I'd like to prevent us from messing with existing vhost configs, that were probably also maintained by Ansible. We could provide examples in the Readme how to configure popular webservers to serve static assets from the webroot location. With this being setup there is really no downtime - not even a risk of breaking a production server's config by accident with this role.

@gronke
Copy link
Contributor Author

gronke commented Oct 2, 2016

@jaywink all your feedback is now addressed in this branch. Would you please give it a new review iteration?

@jaywink
Copy link
Owner

jaywink commented Oct 4, 2016

@gronke sorry for being silent. Will check these asap and then do a few test runs 👍

Btw, one thing I'm a bit worried about, digging into simp_le. The last commit to master is in April, the last github activity of the author in June, and for example in issues some are already worrying it's not maintained any more. I tried reaching out in the FreeNode #simp_le chat room given as support channel, the author isn't there and the only person who replied didn't know anything.

Do you have any information on whether the project is still active? I know even if it goes out of maintenance someone might fork it, but still, something worth keeping in mind. I'm just worried hooking this role into something that might suddenly stop working (maybe LE changes something upstream) and the bug fixes end up taking a long time to resolve as people start pondering which fork to adopt.

Anyway, will give this branch a test run tomorrow or as soon as possible.

@gronke
Copy link
Contributor Author

gronke commented Oct 4, 2016

@jaywink no worries. Take all the time you need to review this branch!

The main reason for picking simp_le was that it's the simplest client that is recommended by LetsEncrypt. It does less magic than then the official client and probably does not need so frequent updates for that reason. Here we use it for

  • Generating SSL key pairs
  • Registering a LetsEncrypt account
  • Interfacing with LetsEncrypt and generating the challenge response
  • Skipping non-due certificates within an expiration threshold

Each of the above steps is well documented and easy to break out. Generating the key-pair could be for example easily done by calling OpenSSL directly. I'd be up to also remove the dependency on simp_le in the future and doing it all directly in Ansible.

@jaywink
Copy link
Owner

jaywink commented Oct 10, 2016

I've decided to do a one last release with the current master code now - let that be the last certbot release and then will get back to testing this. If simpl_le ends up not being active any more and no one maintains a fork, can always backtrack then, but let's push this soon to master to solve the root problem for now. Will be back in a few days.

@gronke
Copy link
Contributor Author

gronke commented Oct 21, 2016

@jaywink did simp_le work for you? I wonder how I can help getting one more version with certbot released before we switch to simp_le (and maybe go further in the future to replace this with an Ansible python module). Let me know if there's anything I can do to support you.

@jaywink
Copy link
Owner

jaywink commented Oct 23, 2016

@gronke while working on other stuff I've been keeping an eye on simp_le and to be honest, it's looking more like an abandoned project as time goes by. See this issue. It's basically broken on default settings since August and the author hasn't even stepped in to say anything - actually it looks like the author hasn't done anything publicly in GitHub since June. Basing the role on something that is possibly a dead project feels wrong to me.

Also, something that hit me when looking at that hash issue. If we use simp_le, it means as soon as LetsEncrypt introduces a change to their ToS, anybody using this role will suddenly start getting failures - right? At least that is what it looks like from the issue. This was not an issue with the old implementation. Granted, having no-root option is a heavy gain.

I'm afraid I would like to wait with this for the author of simp_le to return or to see a stable fork of the project. I assume you're already using this version so hopefully that wont harm your usage of the role. If simp_le does not activate, maybe some other non-root enabling project could be considered? There seems to be a few.

Sorry to keep you waiting.

@gronke
Copy link
Contributor Author

gronke commented Oct 24, 2016

I've been keeping an eye on simp_le and to be honest, it's looking more like an abandoned project

Yes, probably it is abandoned. But it's simple enough to be integrated into this repository quickly. I do not expect the ACME-challenge process to change dramatically in the future for http based authentication. We use the simp_le client for this features, which I'd anyway would like to re-implement in Ansible:

  • Checking existing certificates for expiration
  • Generate directory structure and html files for http ACME challenge
  • Generate SSL key-pairs and store them in /etc/letsencrypt/archive/{{domain}}
  • Point symbolic links from /etc/letsencrypt/live/{{domain}} to the latest SSL key pair
  • Trigger ACME challenge on LetsEncrypt servers for http://domain:80

Everything beyond that is done in pure Ansible already. Would you feel better forking simp_le before using it as dependency?

Also, something that hit me when looking at that hash issue. If we use simp_le, it means as soon as LetsEncrypt introduces a change to their ToS, anybody using this role will suddenly start getting failures - right?

The ToS hash is also included in certbot. If we do not upgrade the version manually in this repository, the ToS-Hash would also not be updated (automatically).

IMO we should not read the ToS for our users. They should make sure themselves that they agree to LetsEncrypt ToS if they want to use their service. Therefore I would like make providing the hash as role parameter mandatory anyway (in a later pull-request).

maybe some other non-root enabling project could be considered?

Our next goal could be to write a helper that looks up certificate expiration dates and provide that information back to Ansible. In addition to that we would need to talk to the ACME API and writing the challenge response file structure. Using OpenSSL to generate the certificates with a command task is the easiest step. We could use simp_le as Python reference implementation.

@gronke
Copy link
Contributor Author

gronke commented Nov 5, 2016

As discussed with @jaywink offline, I will fork this project with the aim to target generic services providing ACME-challenges (like LetsEncrypt) and an approach entirely based on Ansible tasks instead of third-party helper scripts.

@gronke gronke closed this Nov 5, 2016
@robbyoconnor
Copy link
Contributor

I'm confused...

@gronke
Copy link
Contributor Author

gronke commented Nov 6, 2016

To ensure the stability and future compatibility of this role, we decided to remain using certbot as ACME client here. I'm going to fork this role and spike on a solution that does not use any third party tool at all. The first step towards this goal is the branch using the simp_le client instead of certbot, because it already replaced some of the tasks that certbot still does here - for instance running the webserver to serve the acme challenge response.

@robbyoconnor your changes should be merged independently of this (now closed) pull-request and @jaywink requested some changes over there.

@robbyoconnor
Copy link
Contributor

robbyoconnor commented Nov 6, 2016

@gronke great work! I made changes over there on my PR.

@gronke
Copy link
Contributor Author

gronke commented Nov 6, 2016

@robbyoconnor yep, you were 8 minutes faster 🍻

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Running Certbot as non-root user Use deb instead of wrapper
3 participants