Friday Fundamentals: Terraform Resource Sleep


Today, while polishing up the latest CloudWarGames.com challenge, I came across a resource that was erroring out because another resource had been created but had not fully propagated, so other services could not access it yet.

I cannot tell you which one, because that might ruin the challenge, but another common place I see this happen is with the Route53 records used for ACM certificate validation. Those records sometimes take a minute to finish propagating, but other resources only see that Terraform has reported them as created, so they try to reference the ACM cert before it has been validated.

Normally, for my internal stuff, I don’t worry about this, because to me the cause is painfully obvious and the fix is just waiting a few seconds and re-running terraform apply.

But since there are possibly hundreds of people who might be spinning up today’s CWG challenge, I decided to try to make sure they wouldn’t see any errors.

How did I accomplish this?

With a simple Terraform time_sleep resource.

Let’s say you are creating an ACM cert and its Route53 validation record. You could add a time_sleep resource that depends_on the Route53 record, so the sleep only starts once Terraform has created the record.

To avoid the API Gateway blowing up because the ACM cert has not had time to validate, I can then add a depends_on to the aws_api_gateway_domain_name so that it waits for the time_sleep to finish.

Ideally, this gives ACM enough time to find the new Route53 validation record and mark the certificate as validated before Terraform tries to create the API Gateway domain name.

resource "aws_acm_certificate" "cloudwargames_com_cert" {
  domain_name               = aws_route53_zone.cloudwargames_com.name
  subject_alternative_names = ["*.${aws_route53_zone.cloudwargames_com.name}"]
  validation_method         = "DNS"

  lifecycle {
    create_before_destroy = true
  }
}

resource "aws_route53_record" "aws_acm_certificate_route53_record" {
  for_each = {
    for dvo in aws_acm_certificate.cloudwargames_com_cert.domain_validation_options : dvo.domain_name => {
      name   = dvo.resource_record_name
      record = dvo.resource_record_value
      type   = dvo.resource_record_type
    }
  }

  allow_overwrite = true
  name            = each.value.name
  records         = [each.value.record]
  ttl             = 60
  type            = each.value.type
  zone_id         = aws_route53_zone.cloudwargames_com.zone_id
}


# Pause after the validation records are created so they have time to propagate.
resource "time_sleep" "wait_30_seconds" {
  depends_on = [aws_route53_record.aws_acm_certificate_route53_record]

  create_duration = "30s"
}

# Do not create the custom domain name until the sleep has finished.
resource "aws_api_gateway_domain_name" "api_gateway_domain_name" {
  depends_on = [time_sleep.wait_30_seconds]

  certificate_arn = aws_acm_certificate.cloudwargames_com_cert.arn
  domain_name     = var.domain_name
}
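
One detail the snippet above takes for granted: time_sleep comes from the hashicorp/time provider, not the AWS provider. Terraform can usually work that out on its own during terraform init, but declaring it explicitly makes the dependency obvious to anyone reading the config. A minimal sketch, with version constraints deliberately left out (pin whichever versions you have actually tested):

```hcl
terraform {
  required_providers {
    # Provides the aws_* resources used in this post.
    aws = {
      source = "hashicorp/aws"
    }

    # Provides time_sleep. It needs no credentials or provider configuration.
    time = {
      source = "hashicorp/time"
    }
  }
}
```

After adding or changing this block, run terraform init again so the time provider gets downloaded.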

It is one of those IaC race conditions that adds just enough friction to be a headache.

Is it absolutely necessary? No.

Might it slow down a deployment? If you would have hit the race condition, then no, because it saves you a failed apply and a re-run. If you would not have, then it’s a few seconds of your life you won’t get back.
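
If the fixed wait bothers you, there is a commonly used alternative for the ACM case specifically: the AWS provider's aws_acm_certificate_validation resource. Rather than sleeping for a set time, it waits until ACM reports that the certificate has been issued. A rough sketch, reusing the resource names from the example above (the name of the validation resource is just illustrative, and this version of aws_api_gateway_domain_name would replace the one shown earlier, not sit alongside it):

```hcl
# Blocks until ACM has validated the certificate using the Route53 records.
resource "aws_acm_certificate_validation" "cloudwargames_com_cert" {
  certificate_arn         = aws_acm_certificate.cloudwargames_com_cert.arn
  validation_record_fqdns = [for record in aws_route53_record.aws_acm_certificate_route53_record : record.fqdn]
}

# Referencing the validation resource's certificate_arn creates an implicit
# dependency, so no explicit depends_on or time_sleep is needed here.
resource "aws_api_gateway_domain_name" "api_gateway_domain_name" {
  certificate_arn = aws_acm_certificate_validation.cloudwargames_com_cert.certificate_arn
  domain_name     = var.domain_name
}
```

This only covers the ACM case. For resources that have no "wait until ready" counterpart, time_sleep is still the simple fallback.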

I mainly add stuff like this to make life easier for the devs I train or provide advisory and oversight for.

Question For You:

What other simple IaC tricks do you use to make things go smoother?

PS: The newest CloudWarGames challenge is live and the next live event is February 28!

Here is a link to it.

For now the challenges are public but in the future you will need to have signed up at CloudWarGames.com to get them.