@ wrote... (3 years, 4 months ago)

We currently host some of our customers on our private cloud, so we need to manage SSL certificates for them as well. Traditionally we've used a standard SSL provider, but this was pricey and, more importantly, a lot of work for the support staff: getting certs, updating certs, deploying certs, etc.

For various reasons I wrote a Django app called ClientManager (CM for short) that holds technical info about our customers, such as which machines host them, their support expiry date, and also their SSL certs.

Moving the certs into the database was itself a massive improvement for the support staff. Unfortunately nginx can't read certs from a URL (my life would be much easier if it could), so I wrote a little app called cert-monitor that could write the certs to disk and reload nginx.
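To make "write the certs to disk and reload nginx" a bit more concrete, here's a minimal sketch of that one job. This is not the real cert-monitor: the function names, the `<domain>.crt` / `<domain>.key` file naming and the directory layout are all assumptions I made up for illustration. The only part taken from the real thing is the `nginx -s reload` call.

```python
import os
import subprocess
import tempfile
from pathlib import Path


def write_atomically(path: Path, text: str, mode: int) -> None:
    """Write to a temp file in the same directory, then rename it into place."""
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name + ".")
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(text)
        os.chmod(tmp_name, mode)
        # Rename is atomic on POSIX, so nginx never sees a half-written file.
        os.replace(tmp_name, path)
    except BaseException:
        os.unlink(tmp_name)
        raise


def install_cert(domain, cert_pem, key_pem, cert_dir,
                 reload_cmd=("nginx", "-s", "reload")):
    """Write key and cert for one domain, then ask nginx to reload.

    File naming (<domain>.crt / <domain>.key) is a hypothetical convention.
    """
    cert_dir = Path(cert_dir)
    cert_dir.mkdir(parents=True, exist_ok=True)
    cert_path = cert_dir / f"{domain}.crt"
    key_path = cert_dir / f"{domain}.key"
    write_atomically(key_path, key_pem, 0o600)   # private key: owner only
    write_atomically(cert_path, cert_pem, 0o644)
    subprocess.run(list(reload_cmd), check=True)  # raises if the reload fails
    return cert_path, key_path
```

Writing to a temp file and renaming is a design choice, not something the original app necessarily does. It just avoids nginx reloading while a file is half written.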

But wait, how does cert-monitor get the certs?

Well it turns out that I had previously needed to sync some data in CM to a remote (and unreliable) service, and since I've long been fascinated with RabbitMQ and Celery, that's what I used there. That wasn't the first time I'd attempted to use Celery, but it was the first time I used it successfully.

Since I had a usable RabbitMQ message bus, I registered a callback with Django's post_save signal and passed the cert/key combo off to RabbitMQ. cert-monitor was listening to RabbitMQ, and when it got an updated cert it wrote the cert to disk and called nginx -s reload.
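Roughly, the post_save side could look like the sketch below. It's not CM's actual code. The `publish` callable stands in for whatever sends to RabbitMQ, and the routing key `certs` and the `domain`, `cert` and `key` attribute names are assumptions. The handler is plain Python with the same argument shape a post_save receiver gets, so it runs without Django installed.

```python
import json


def make_cert_saved_handler(publish):
    """Build a receiver shaped like a Django post_save callback.

    `publish(routing_key, body)` is a hypothetical stand-in for the code
    that actually talks to RabbitMQ.
    """
    def on_cert_saved(sender, instance, created=False, **kwargs):
        # Attribute names on `instance` are assumptions, not CM's real model.
        message = {
            "domain": instance.domain,
            "cert": instance.cert,
            "key": instance.key,
        }
        publish("certs", json.dumps(message))

    return on_cert_saved


# Wiring inside the Django project might look like this (not executed here).
# weak=False matters: Django holds receivers by weak reference by default,
# and a closure like this one would otherwise be garbage collected.
#
#   from django.db.models.signals import post_save
#   post_save.connect(make_cert_saved_handler(publish_to_rabbitmq),
#                     sender=Certificate, weak=False)
```

Passing `publish` in rather than importing a RabbitMQ client directly keeps the handler easy to test with a fake.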

That totally “just worked” and life was good.

Life was good for me at least. We were still paying a not insignificant amount of money for certs, and the support staff still had to spend lots of miserable time making certs, copy-pasting and whatnot. This problem was getting worse with more customers, and now certs have a maximum life of one year. They used to be good for up to three.

So to kill two birds with one stone, if I could automate LetsEncrypt then the support staff would only have to make sure the domain name was in CM and then never think about it again. Plus we'd save money.

As an aside, paying for SSL certs has always pissed me off. Here's a bunch of money for a computer to do ⅒ of a second of math. Piss off.

So step one was figuring out how to get certs from LE to CM. It didn't take long to realize I'd need to write a plugin, but no problem, as certbot (the LE client) is written in Python and the docs looked pretty good.

Narrator: they were not

I don't want to slag an open source project that provides real value for free but man… those docs are pretty… not the best. It also turns out that certbot is pretty weird. Anyways…

Through many trials and tribulations I ended up with a plugin I call certbot-cm. Now that I know how to do it, it's fairly easy: only about 60 lines of real code, and just 190 including boilerplate.

So during the authenticate phase the certbot-cm plugin posts the auth challenge token to CM and then deletes it after. During the installation phase the plugin posts the key and cert to CM.

So yeah, a few requests.post() calls later and Bob's your uncle. Combined with cert-monitor, new certs are auto-deployed. Super cool.
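For the curious, here's a sketch of the three HTTP calls the plugin makes: post the challenge, delete it, post the cert. The real plugin uses requests.post(). I swapped in the standard library's urllib here so the sketch runs with no dependencies. The base URL, endpoint paths and JSON field names are all made up, and none of certbot's plugin API is shown.

```python
import json
import urllib.request

# Hypothetical CM endpoint; the real URL layout is not part of this post.
CM_BASE_URL = "https://cm.example.com/api"


def _json_request(method, url, payload=None):
    """Build (but do not send) a request with an optional JSON body."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    headers = {"Content-Type": "application/json"} if data is not None else {}
    return urllib.request.Request(url, data=data, headers=headers, method=method)


def challenge_post(domain, token, validation):
    """Authenticate phase: hand CM the value it must serve for this token."""
    payload = {"domain": domain, "token": token, "validation": validation}
    return _json_request("POST", f"{CM_BASE_URL}/challenges/", payload)


def challenge_delete(token):
    """Cleanup after the authenticate phase: remove the challenge again."""
    return _json_request("DELETE", f"{CM_BASE_URL}/challenges/{token}/")


def cert_post(domain, cert_pem, key_pem):
    """Installation phase: store the new key and cert in CM."""
    payload = {"domain": domain, "cert": cert_pem, "key": key_pem}
    return _json_request("POST", f"{CM_BASE_URL}/certs/", payload)


def send(request, timeout=10):
    """Actually perform the request and return the HTTP status code."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Building the request separately from sending it means you can inspect what would go over the wire without a CM instance to talk to.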

But we can't have support ssh'ing into various boxes and running certbot commands! No, it would be much nicer if we could push a button in CM and all this would happen automatically. The first thing I did was add an issuer field to the Domain record.

Then I wrote another daemon, and by wrote another daemon I mean I copied cert-monitor and changed like 20 lines. So now when CM emits a changed domain, and the issuer is letsencrypt, the new certbot-relay just calls certbot. It actually worked the first time, true story.
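The interesting 20 lines are more or less the decision "is this a LetsEncrypt domain? then run certbot". Here's a sketch of that, not the real daemon. The message shape follows the example in the step list further down, but the accepted issuer spellings, the script path and passing the domain as the script's only argument are assumptions.

```python
import json
import subprocess

# Assumption: the issuer may be spelled either way in the message.
LETSENCRYPT_ISSUERS = {"letsencrypt", "le"}


def handle_domain_message(body, run=subprocess.run, script="./certbot.sh"):
    """React to one 'domain changed' message from the bus.

    Returns True if certbot was started, False if the message was ignored.
    `script` and its single-argument interface are hypothetical.
    """
    message = json.loads(body)
    if message.get("issuer") not in LETSENCRYPT_ISSUERS:
        return False
    # Argument list, no shell: the domain is never interpreted by a shell.
    run([script, message["domain"]], check=True)
    return True
```

Usage would be something like `handle_domain_message('{"domain": "tickets.example.com", "issuer": "le"}')` from inside the RabbitMQ consumer callback.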

So now we're done and support can regain a little sanity.

Speaking of lost sanity, here's a breakdown of all the steps that happen to auto-deploy a LetsEncrypt certificate in this crazy system.

  1. in CM, save a domain object with issuer = letsencrypt
  2. django emits a post_save event
  3. catch event and send message to rabbitmq, {domain: 'tickets.example.com', issuer: 'le'}
  4. rabbitmq emits message
  5. certbot-relay catches the domain message
  6. certbot-relay calls certbot.sh
  7. certbot.sh calls certbot with lots of flags
  8. certbot calls our plugin certbot-cm during auth phase
  9. certbot-cm posts challenge token to CM
  10. LetsEncrypt gets http://tickets.example.com/.well-known/acme-challenge/<the token>
  11. nginx forwards that request to CM
  12. CM returns the challenge token which percolates back to LE
  13. certbot-cm deletes the challenge token from CM
  14. LE gives certbot a sweet new certificate
  15. certbot calls certbot-cm during the install phase
  16. certbot-cm posts key and cert to CM
  17. saving the new cert triggers a different post_save event
  18. post_save is caught and message is sent to rabbitmq, {key: '...', cert: '...'}
  19. rabbitmq emits message
  20. cert-monitor catches message
  21. cert-monitor writes out new key and cert
  22. cert-monitor reloads nginx
  23. nginx reads new key and cert
  24. send message to rocket.chat (like slack)
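Steps 9 through 13 are the part that confused me the most, so here's a sketch of CM's side of the challenge dance. In CM this would be a Django model and view. The in-memory dict, the class name and the method names below are stand-ins I made up. The URL prefix is the one from step 10.

```python
CHALLENGE_PREFIX = "/.well-known/acme-challenge/"


class ChallengeStore:
    """Hypothetical in-memory stand-in for CM's challenge storage."""

    def __init__(self):
        self._validations = {}

    def put(self, token, validation):
        # Step 9: the plugin posts the challenge before LE comes asking.
        self._validations[token] = validation

    def delete(self, token):
        # Step 13: clean up; deleting an unknown token is not an error.
        self._validations.pop(token, None)

    def respond(self, path):
        """Steps 10-12: answer the request nginx forwarded to CM.

        Returns a (status, body) pair.
        """
        if not path.startswith(CHALLENGE_PREFIX):
            return 404, ""
        token = path[len(CHALLENGE_PREFIX):]
        validation = self._validations.get(token)
        if validation is None:
            return 404, ""
        return 200, validation
```

The body served back is whatever the plugin posted for that token. Once the token is deleted, the same URL goes back to being a 404.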

It practically writes itself!


Category: tech, Tags: devops, nginx