acme-lambda-renewal

I’d been looking for a way to renew my Let’s Encrypt TLS/SSL certificates via AWS Lambda (using DNS authentication by updating Route 53) rather than web authentication. This project started since I wanted to separate out my mail server from my web server, and while I suppose I could run Apache (or whatever) on the mail server just to be able to request certificates it seems kind of silly, and this sort of automatic run-a-piece-of-code-occasionally scenario seemed like the perfect chance to use AWS Lambda.

I expected this to be a common & solved problem, but in my searching around the Internet I didn’t really see exactly what I was looking for. There were some solutions out there, but they seemed overly complicated for just “renew my certificates every two months”, and some were out of date (not even updated to the ACME v2 protocol). So I figured I’d need to write at least some code myself.

[As an aside, I’m guessing the reason there didn’t seem to be much that was up to date out there is that people lost interest in the problem once AWS Certificate Manager started giving out certificates for free. But you can only use those on AWS-hosted services like CloudFront and API Gateway, and not directly on EC2 instances. So if you want to host something other than HTTPS that uses TLS (like IMAP and SMTP for a mail server), or even if you want CloudFront to connect to your back-end EC2 server over HTTPS, you need to get your certificate from somewhere else, and Let’s Encrypt is free and designed for automation. So it’s still useful to be able to get Let’s Encrypt certificates from within the AWS platform, at least until AWS decides they can let EC2 instances download private keys of their free Certificate Manager certificates, which I’m guessing they won’t do. Or maybe I’m the only person who ever wanted to do this.]
UPDATE: Starting in October 2020, AWS does allow some use of Certificate Manager certificates inside an EC2 instance, through a PKCS#11 connection to an “enclave” inside your instance. You don’t get access to the private key directly, but you can still encrypt and sign from within an application. It does require the application on your instance to support PKCS#11, which a lot of things don’t yet. Still, if you’re just trying to use a certificate within nginx on your server, you may want to check out that approach rather than the one I outline here of getting certificates through ACME.

We’re using Node.js where I work now, so I wanted to try using that for the Lambda to get more practice writing in it. And I was hoping that there would be some library to handle ACME so I didn’t have to write everything myself. While the complete “Greenlock” package is one of the overly complicated things that didn’t seem to be what I was looking for, the low-level “acme” library it is based on seemed perfect. I just needed to tie it together with updating DNS in Route 53, since while the acme documentation said they had a plugin for integrating with it, all that was there was “not implemented” statements.

Since I needed to write something anyway, I figured that I’d share it here, just in case someone else stumbles upon it and finds it useful. I’m sure that if I wanted to really do this the “cloud-native” way I’d have a CloudFormation or Terraform template or something for you to use, but I’m still mainly using the AWS Console (since I’m just playing around for my personal projects), so I’m just going to share the code and describe the setup needed to get it to work. Since your needs probably won’t exactly match mine anyway, you’ll likely need to tweak some stuff. (I hereby dedicate the code to the public domain, for you to reuse and adapt however you want. It may “require minimal configuration and tweaking”.)

Workflow

Here’s what the Lambda does:

  • Reads from the AWS Systems Manager Parameter Store the configuration, including the ACME account private key and the list of certificates to renew. (See the “Configuration” section later.)
  • Generates a new certificate private key.
  • Contacts the Let’s Encrypt ACME server to request the certificates.
  • Updates DNS in Route 53 to prove you have ownership of the domain.
  • Gets the new certificate from Let’s Encrypt, and publishes both it (“fullchain.pem”) and the private key (“privkey.pem”) to an S3 bucket.
  • Once a certificate is done, it publishes a message to an SNS topic. (You could then have a separate Lambda subscribe to that topic to handle getting the new certificate installed on whatever system needs it. Or at least that’s what I do, with a small Lambda that tells AWS Systems Manager to run an AWS-RunShellScript on the server to have the server download and install the certificate.)
  • Any failures are published to a different SNS topic, which you might have email you so you know if it didn’t work.
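The steps above can be sketched as a small orchestration loop. This is only an illustration of the flow, not the actual code: every function on the `steps` object is a hypothetical placeholder for the real work (Parameter Store, ACME, Route 53, S3, SNS), and the choice to keep going after one certificate fails is an assumption of this sketch.

```javascript
// Hypothetical orchestration sketch. Each entry in `steps` stands in for real
// work and is NOT part of the acme library or the AWS SDK.
async function renewAll(steps) {
  const config = await steps.loadConfig(); // account key + certificate list
  const results = [];
  for (const cert of config.certificates) {
    const names = cert.domains.join(', ');
    try {
      const privateKey = await steps.generateKey(cert.keyType);
      // requestCertificate is assumed to invoke the callback when the ACME
      // server asks for proof of domain ownership via DNS.
      const fullchain = await steps.requestCertificate(
        config, cert, privateKey,
        (challenge) => steps.setDnsChallenge(cert, challenge));
      await steps.store(cert, 'privkey.pem', privateKey);
      await steps.store(cert, 'fullchain.pem', fullchain);
      if (cert.successSnsTopicArn) {
        await steps.notify(cert.successSnsTopicArn, `Renewed ${names}`);
      }
      results.push({ domains: cert.domains, ok: true });
    } catch (err) {
      // Assumption: one failing certificate should not block the others.
      if (cert.failureSnsTopicArn) {
        await steps.notify(cert.failureSnsTopicArn, `Failed ${names}: ${err.message}`);
      }
      results.push({ domains: cert.domains, ok: false, error: err.message });
    }
  }
  return results;
}
```

Passing the steps in as an object keeps the flow testable with stubs, without touching AWS or Let’s Encrypt.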

And while I tend to use the term “renewal” throughout, since that’s what’s happening most of the time, it’s actually the same process as getting a brand new certificate. I suppose I really should have just called the system “certificate requesting” or something, but “renewal” is why you want it to run every couple months in Lambda so I guess that’s what I was thinking when I named it. Naming things is hard.

Code

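The only dependency is the acme library (though that brings in a few dependencies of its own). It also uses the aws-sdk, but that’s already available on Lambda without needing to upload it yourself (at least on the Node.js runtimes that bundle version 2 of the SDK; runtimes from Node.js 18 onward bundle version 3 instead).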

You’ll also have to set the Lambda function’s timeout to something rather lengthy, since it can take a couple minutes sometimes to wait for the Route 53 DNS update to finish. I set mine to the maximum 15 minutes, though usually it only takes a few. (It might take longer if you have more certificates than I do, though.) I haven’t had any problems with the default smallest memory size of 128 MB.
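To illustrate why the timeout matters, here is a minimal sketch of waiting for a Route 53 change to propagate while respecting a deadline. The `getChangeStatus` argument is a hypothetical wrapper that returns the `ChangeInfo.Status` string (“PENDING” or “INSYNC”) from Route 53’s GetChange call; the default timeout and polling interval are assumptions, not values from the actual code.

```javascript
// Poll until a Route 53 change reports INSYNC, or give up before the deadline.
// `getChangeStatus` is a hypothetical async function returning "PENDING" or
// "INSYNC"; the default timings are illustrative assumptions.
async function waitForDnsSync(getChangeStatus,
  { timeoutMs = 10 * 60 * 1000, intervalMs = 15000 } = {}) {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const status = await getChangeStatus();
    if (status === 'INSYNC') return;
    // Stop if sleeping once more would take us past the deadline.
    if (Date.now() + intervalMs > deadline) {
      throw new Error(`DNS change still ${status}; giving up at the ${timeoutMs} ms deadline`);
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

Inside a handler, the timeout could be derived from the Lambda context’s `getRemainingTimeInMillis()` so the function fails with a clear error (and can publish to the failure topic) instead of being cut off by the Lambda timeout.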

Lambda Role IAM Permissions

The IAM Role assigned to the Lambda function needs permissions in order to do things like update your Route 53 entries and publish the updated certificate to S3. Give it the AWS-managed AWSLambdaBasicExecutionRole policy, and then you can attach a separate policy for the resources it needs, replacing the appropriate parts of the Resource ARNs. Here’s what it needs access to:

  • Route 53 for the zone where it can write the DNS challenge records that prove you own the domain.
  • Systems Manager Parameters (starting with /acme-lambda-renewal/ unless you edit the code to get the parameters from somewhere else).
  • SNS Publish for any SNS topics where you want notifications on success or failure.
  • S3 permissions for the bucket to write the updated certificate and private key to.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Route53Global",
            "Effect": "Allow",
            "Action": [
                "route53:GetChange",
                "route53:ListHostedZones"
            ],
            "Resource": "*"
        },
        {
            "Sid": "Route53Zone",
            "Effect": "Allow",
            "Action": [
                "route53:GetHostedZone",
                "route53:ChangeResourceRecordSets",
                "route53:ListResourceRecordSets"
            ],
            "Resource": [
                "arn:aws:route53:::hostedzone/YOUR-ZONE-ID"
            ]
        },
        {
            "Sid": "SSMParameters",
            "Effect": "Allow",
            "Action": [
                "ssm:GetParameters",
                "ssm:GetParameter"
            ],
            "Resource": [
                "arn:aws:ssm:us-east-1:YOUR-ACCOUNT:parameter/acme-lambda-renewal/*"
            ]
        },
        {
            "Sid": "SNSPublish",
            "Effect": "Allow",
            "Action": [
                "sns:Publish"
            ],
            "Resource": [
                "arn:aws:sns:us-east-1:YOUR-ACCOUNT:YOUR-SNS-TOPIC"
            ]
        },
        {
            "Sid": "S3",
            "Effect": "Allow",
            "Action": [
                "s3:PutObject",
                "s3:GetObject"
            ],
            "Resource": [
                "arn:aws:s3:::YOUR-BUCKET/*"
            ]
        }
    ]
}

Configuration

Configuration is done through the AWS Systems Manager Parameter Store. The first thing one needs is a Let’s Encrypt Account Key. I had been using certbot and just grabbed my existing account key by digging through the files it uses to save its configuration, but it might be a better idea to just follow the acme library example code steps for “Create (or import) an Account Keypair” and “Create an ACME Subscriber Account” and then just save those for use. I don’t know if there’s any good reason to reuse an account for domains you are already using versus just making a new one when using a new platform for handling the renewal.

You need to define three parameters in the store:

  • /acme-lambda-renewal/accountId : This is your account id, as a JSON string. It looks a lot like a URL, something like this:
    "https://acme-v02.api.letsencrypt.org/acme/acct/00000000"
    (Since it’s a JSON string, don’t forget to include the surrounding quotation marks.)
  • /acme-lambda-renewal/accountKey : This is the account private key, which is in JSON format with weird cryptic single-letter fields like “e” and “d”. It should probably be stored as the “SecureString” type.
  • /acme-lambda-renewal/certificates : This is where you configure the certificates you want to get. It’s a JSON Array, where each element is a JSON Object with information about one certificate. The parameters of each JSON Object are:
    • "domains" : an array of strings of the domain names to renew.
    • "zones" : an object, where the keys are domain names that you have as a hosted zone in Route 53, and the value is that domain’s Zone ID.
    • "keyType" : Choose “EC” if you want to use a new-hotness Elliptic Curve key, or “RSA” if you want the trusted-standby RSA private key. (Almost everything supports elliptic curves now, but you might need to use RSA depending on whether elliptic curve encryption is supported by whatever is going to connect to your server.)
    • "certStorageBucketName" : The name of the bucket in which to store the new private key and certificate.
    • "certStoragePrefix" : The prefix within the bucket to store the new key and certificate. There should not be a leading slash at the beginning. It is literally just a prefix though, so you probably want a trailing slash at the end. The filenames stored after the prefix are “privkey.pem” and “fullchain.pem”.
    • "successSnsTopicArn" (optional): The ARN of the SNS topic to publish a message to on success.
    • "failureSnsTopicArn" (optional): The ARN of the SNS topic to publish a message to on failure.
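
As an illustration of how the "zones" mapping might be used, the sketch below picks the hosted zone for a domain and builds the TXT record change for the ACME dns-01 challenge. The function names are hypothetical and this is not the actual code; whether the acme library hands you the TXT value ready-made is not covered here, so the digest step is shown explicitly (per RFC 8555, the value is the base64url-encoded SHA-256 of the key authorization). The 60-second TTL is an arbitrary assumption.

```javascript
const { createHash } = require('node:crypto');

// Pick the configured hosted zone whose name is the longest suffix of `domain`.
function findZoneId(domain, zones) {
  const match = Object.keys(zones)
    .filter((zone) => domain === zone || domain.endsWith(`.${zone}`))
    .sort((a, b) => b.length - a.length)[0];
  if (!match) throw new Error(`No hosted zone configured for ${domain}`);
  return zones[match];
}

// dns-01 TXT value: base64url(SHA-256(keyAuthorization)), without padding.
function dnsChallengeValue(keyAuthorization) {
  return createHash('sha256').update(keyAuthorization).digest('base64')
    .replace(/\+/g, '-').replace(/\//g, '_').replace(/=+$/, '');
}

// Build the ChangeBatch for Route 53's ChangeResourceRecordSets call.
function challengeChangeBatch(domain, keyAuthorization) {
  // Wildcard certificates are validated on the base name.
  const name = `_acme-challenge.${domain.replace(/^\*\./, '')}.`;
  return {
    Changes: [{
      Action: 'UPSERT',
      ResourceRecordSet: {
        Name: name,
        Type: 'TXT',
        TTL: 60, // assumption: short TTL so stale challenges expire quickly
        // Route 53 expects TXT values wrapped in double quotes.
        ResourceRecords: [{ Value: `"${dnsChallengeValue(keyAuthorization)}"` }],
      },
    }],
  };
}
```

With the example configuration, `findZoneId("server.example.com", {"example.com": "Z9ZZZ9ZZ9ZZZZZ"})` resolves to the `example.com` zone ID, which is the `HostedZoneId` the change batch would be submitted against.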

Here’s an example of what the certificates configuration might look like:

[
	{
		"domains": ["server.example.com","system.example.com"],
		"zones": {"example.com": "Z9ZZZ9ZZ9ZZZZZ"},
		"keyType": "EC",
		"certStorageBucketName": "mybucket",
		"certStoragePrefix": "example/certs/",
		"successSnsTopicArn": "arn:aws:sns:us-east-1:000000000000:cert-renewal-success",
		"failureSnsTopicArn": "arn:aws:sns:us-east-1:000000000000:cert-renewal-failure"
	},
	{
		"domains": ["server.myname.test"],
		"zones": {"myname.test": "A9AAAA99Z99999"},
		"keyType": "RSA",
		"certStorageBucketName": "mybucket",
		"certStoragePrefix": "myname/certs/",
		"successSnsTopicArn": "arn:aws:sns:us-east-1:000000000000:cert-renewal-success",
		"failureSnsTopicArn": "arn:aws:sns:us-east-1:000000000000:cert-renewal-failure"
	}
]
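
Since Parameter Store values are plain strings, a typo in the JSON only shows up when the Lambda runs. A small check like the following could catch problems early. It is a sketch based only on the field descriptions above, not the validation the actual code performs, and `validateCertificates` is a hypothetical name. Note that strict JSON does not allow a trailing comma after the last element, so `JSON.parse` rejects it.

```javascript
// Check the raw string stored in /acme-lambda-renewal/certificates.
// Returns a list of problems; an empty list means the config looks usable.
function validateCertificates(json) {
  let certs;
  try {
    certs = JSON.parse(json); // rejects trailing commas, comments, etc.
  } catch (err) {
    return [`not valid JSON: ${err.message}`];
  }
  if (!Array.isArray(certs)) return ['top level must be a JSON array'];
  const problems = [];
  certs.forEach((cert, i) => {
    const at = `certificates[${i}]`;
    if (typeof cert !== 'object' || cert === null) {
      problems.push(`${at} must be an object`);
      return;
    }
    if (!Array.isArray(cert.domains) || cert.domains.length === 0) {
      problems.push(`${at}.domains must be a non-empty array`);
    }
    if (typeof cert.zones !== 'object' || cert.zones === null || Array.isArray(cert.zones)) {
      problems.push(`${at}.zones must be an object of domain name to Zone ID`);
    }
    if (!['EC', 'RSA'].includes(cert.keyType)) {
      problems.push(`${at}.keyType must be "EC" or "RSA"`);
    }
    if (typeof cert.certStorageBucketName !== 'string' || cert.certStorageBucketName === '') {
      problems.push(`${at}.certStorageBucketName must be a non-empty string`);
    }
    if (typeof cert.certStoragePrefix !== 'string') {
      problems.push(`${at}.certStoragePrefix must be a string`);
    } else if (cert.certStoragePrefix.startsWith('/')) {
      problems.push(`${at}.certStoragePrefix must not start with a slash`);
    }
  });
  return problems;
}
```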

Scheduling

Note that this isn’t “smart” like certbot, which expects to be run every day and first checks whether the certificates are about to expire. Every time you run this function, it simply renews the certificates. (I was trying to make it as simple as possible.) So just schedule the Lambda using an Amazon EventBridge Rule with a cron expression that runs it every two months (since Let’s Encrypt certificates last for three months, and renewing one month early is a reasonable practice that matches certbot) and you’re all set. A cron expression like “23 7 10 2/2 ? *” will run it at 7:23 AM UTC on the 10th of every even-numbered month, though you should probably pick a different time so not everybody is updating their certificates at once.

If there’s a failure, though, you’ll probably need to rerun it manually to renew your certificates. If one wanted to get fancy, it might be possible to use AWS Step Functions or something to schedule renewals and retry failures automatically and all that.
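
As a quick sanity check on the every-two-months schedule, the sketch below lists the run times that the example cron expression produces and computes the longest gap between consecutive runs. It assumes the usual 90-day Let’s Encrypt certificate lifetime; the function names are made up for illustration.

```javascript
// Run times for cron(23 7 10 2/2 ? *): 07:23 UTC on the 10th of February,
// April, June, August, October and December (JavaScript months are 0-based).
function runDates(year) {
  return [1, 3, 5, 7, 9, 11].map((month) => new Date(Date.UTC(year, month, 10, 7, 23)));
}

// Longest gap in days between consecutive runs, including December -> February.
function longestGapDays(year) {
  const runs = [...runDates(year), runDates(year + 1)[0]];
  const dayMs = 24 * 60 * 60 * 1000;
  let longest = 0;
  for (let i = 1; i < runs.length; i++) {
    longest = Math.max(longest, (runs[i] - runs[i - 1]) / dayMs);
  }
  return longest;
}

console.log(longestGapDays(2025)); // 62
```

The worst case is the 62 days from December 10 to February 10, which leaves about four weeks of slack on a 90-day certificate to notice a failure notification and rerun the function.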

Feedback

Feel free to email me at pete@cooperjr.name with any feedback. I can’t promise I’ll know any answers if you have questions, since this whole thing may only be worth what you paid for it, but I’d be quite amused to know if someone else found this useful.