I promised to get back to Fn::ForEach and convert a VPC template to see how much it would simplify things and whether there were other pitfalls that are not so obvious from the documentation. It took a bit longer than expected, but I finally got it done and managed to learn some new lessons in the process.

My original plan was to refactor some VPC templates using ForEach and see how much it would help in simplifying the code. Because of the time spent debugging template transformations, that is left for another post, but I hope this one is helpful for others running into the same speed bumps.

Now Go Build

This is a simple VPC structure that I am going to build with a template using Fn::ForEach to iterate over availability zones. To make it more flexible, there is a parameter you can use to choose the AZs you want to deploy into. It is an ideal case for a loop, as every AZ is the same; you just choose whether you want 1, 2 or 3 copies (or more, if you modify the AvailabilityZones parameter validation).
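As a rough orientation, the overall shape of such a template can be sketched as below. This is a minimal sketch and not the full working template: the VPC CIDR is a placeholder assumption, and only one looped resource (a private route table per AZ) is shown.

```yaml
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::LanguageExtensions    # required for Fn::ForEach

Parameters:
  AvailabilityZones:
    Type: CommaDelimitedList
    Description: "Deploy to AZs"
    Default: "a,b,c"

Resources:
  VPC:
    Type: AWS::EC2::VPC
    Properties:
      CidrBlock: 10.0.0.0/16          # placeholder CIDR, an assumption

  'Fn::ForEach::AZ':
    - X                               # loop variable
    - !Ref AvailabilityZones          # collection to iterate over
    - 'PrivRouteTable${X}':           # one resource per value of X
        Type: AWS::EC2::RouteTable
        Properties:
          VpcId: !Ref VPC
```

With the default parameter value, the loop is expected to produce three route tables, one for each of a, b and c.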

If you are just interested in what the result looks like, here is the working template. But the interesting part is what it took to get there and which things didn’t work as one might have expected …

Unresolved Resource Dependencies

The first thing I hit was the error Unresolved resource dependencies from blocks where I was trying to reference another resource created within the loop. At first glance the code below, which creates a route table association for a subnet, looks OK, yet it doesn’t work.

      PubSubnetRouting${X}:
        Type: AWS::EC2::SubnetRouteTableAssociation
        Properties:
          RouteTableId: !Ref PubRouteTable
          SubnetId: !Ref PubSubnet${X}  # <-- THIS FAILS

The problem is the SubnetId property referencing a subnet created within the loop. When the template format is checked, the Fn::ForEach loop hasn’t been expanded yet, so there is no PubSubnet${X}. The solution is to add a substitution that passes the template format check and resolves to the real resource ID during stack creation.

      PubSubnetRouting${X}:
        Type: AWS::EC2::SubnetRouteTableAssociation
        Properties:
          RouteTableId: !Ref PubRouteTable
          SubnetId: !Ref               # <-- THIS WORKS
            Fn::Sub: 'PubSubnet${X}'   # <-- THIS WORKS
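To picture why the plain reference cannot be resolved up front, it can help to look at the loop after expansion. The following is only a sketch of roughly what the association should look like once the loop has been expanded for a collection of a and b; the exact processed output may differ.

```yaml
# Sketch: approximate result of expanding the loop for X = a and X = b.
PubSubnetRoutinga:
  Type: AWS::EC2::SubnetRouteTableAssociation
  Properties:
    RouteTableId: !Ref PubRouteTable
    SubnetId: !Ref PubSubneta    # this logical ID exists only after expansion
PubSubnetRoutingb:
  Type: AWS::EC2::SubnetRouteTableAssociation
  Properties:
    RouteTableId: !Ref PubRouteTable
    SubnetId: !Ref PubSubnetb    # this logical ID exists only after expansion
```

The names PubSubneta and PubSubnetb appear only at this stage, which is consistent with a literal reference to PubSubnet${X} failing the earlier check.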

Fn::GetAtt Layout is Incorrect

After fixing the references to resources created in the loop, the next error was a bit similar but not exactly the same. Here is the piece that creates NAT gateways with EIPs.

      NatGw${X}:
        Type: AWS::EC2::NatGateway
        Properties:
          SubnetId: !Ref
            Fn::Sub: 'PubSubnet${X}'
          AllocationId: !GetAtt
            Fn::Sub: NatEip${X}.AllocationId  # <-- THIS FAILS

In this case the solution was found in the CloudFormation documentation, which has exactly the same case of a NAT gateway and an EIP. The solution was to drop the dotted Resource.Attribute notation for GetAtt and use the original form, with separate list items for the resource and the attribute.

      NatGw${X}:
        Type: AWS::EC2::NatGateway
        Properties:
          SubnetId: !Ref
            Fn::Sub: 'PubSubnet${X}'
          AllocationId: !GetAtt
            - !Sub NatEip${X}
            - AllocationId
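For comparison, the same resource can also be written with full function names instead of the short tags. This is a sketch that should be equivalent to the working version above, assuming the same PubSubnet${X} and NatEip${X} resources are created in the loop:

```yaml
'NatGw${X}':
  Type: AWS::EC2::NatGateway
  Properties:
    SubnetId:
      Ref:
        Fn::Sub: 'PubSubnet${X}'
    AllocationId:
      Fn::GetAtt:
        - Fn::Sub: 'NatEip${X}'   # resource logical ID, built per iteration
        - AllocationId            # attribute name as a separate list item
```

The long form is more verbose, but it makes it easier to see that the resource name and the attribute are two separate arguments.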

S3 Endpoint

Sometimes you want to get a list of the resources created in the loop and use it as a property of another resource. Unfortunately this isn’t possible. While using the loop simplified the template and let me avoid most of the code duplication, for the S3 endpoint it was the opposite. If you know all the route tables that need a route to the S3 gateway endpoint, you can do it with a single resource.

  S3vpcEndpoint:
    Type: AWS::EC2::VPCEndpoint
    Properties:
      ServiceName: !Sub 'com.amazonaws.${AWS::Region}.s3'
      VpcEndpointType: Gateway
      VpcId: !Ref VPC
      RouteTableIds:
      - !Ref PubRouteTable
      - !Ref PrivRouteTableA
      - !Ref PrivRouteTableB
      - !Ref PrivRouteTableC

However, now I had to create a dedicated endpoint resource for each route table. This isn’t a big deal unless you are getting close to the maximum template size. Here is how the above translated into the refactored version.

  S3vpcEndpointPublic:
    Type: AWS::EC2::VPCEndpoint
    Properties:
      ServiceName: !Sub 'com.amazonaws.${AWS::Region}.s3'
      VpcEndpointType: Gateway
      VpcId: !Ref VPC
      RouteTableIds:
      - !Ref PubRouteTable

And then looping through all private route tables.

      S3vpcEndpointPriv${X}:
        Type: AWS::EC2::VPCEndpoint
        Properties:
          ServiceName: !Sub 'com.amazonaws.${AWS::Region}.s3'
          VpcEndpointType: Gateway
          VpcId: !Ref VPC
          RouteTableIds:
          - !Ref
            Fn::Sub: PrivRouteTable${X}

You Can’t Update This

The template has a parameter that defines how many and which AZs to deploy to.

  AvailabilityZones:
    Type: CommaDelimitedList
    Description: "Deploy to AZs"
    Default: "a,b,c"
    AllowedValues: [a,b,c]
  'Fn::ForEach::AZ':
    - X
    - !Ref AvailabilityZones

The idea was that you could update the stack later and deploy to additional AZs if needed. Removing AZs wasn’t in the plans, as removing a subnet would first require removing all network interfaces from it, which would be difficult in real life.

Unfortunately this isn’t possible, as changes to the loop collection don’t trigger a stack update. So for now this parameter can only be set when the stack is created, to select which AZs it will deploy to. You can vote for this GitHub issue if you think it deserves more attention. I already did.

Logical Resource IDs Must Be Alphanumeric

In a CloudFormation template, logical resource IDs must be alphanumeric, i.e. only uppercase (A-Z) and lowercase (a-z) letters and digits (0-9) are allowed. I didn’t have this problem with my template, but it is easy to see how it would become an issue if you wanted to iterate over a collection of IP addresses (with dots in them) or email addresses. There are some good ideas for solutions, like exposing a loop index instead of using the variable itself. This and other enhancement ideas can be found in this GitHub issue. You can either vote for it, or read it as guidance on what is not possible (yet).
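As a sketch of how this would go wrong, consider a hypothetical loop over IP addresses. The Peers loop name, the Ip variable and the PeerTopic resources are made up for illustration and are not part of the VPC template.

```yaml
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::LanguageExtensions

Resources:
  'Fn::ForEach::Peers':          # hypothetical loop name
    - Ip                         # hypothetical loop variable
    - ['10.0.0.1', '10.0.0.2']
    - 'PeerTopic${Ip}':          # would become PeerTopic10.0.0.1, which contains dots
        Type: AWS::SNS::Topic
```

Because the dots from the collection values end up in the logical ID, a template like this is expected to be rejected.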

Was It Worth It?

Yes, it was. While this was a rather long list of things that were either non-intuitive or not possible at all, Fn::ForEach made it possible to remove a lot of duplicate code and cumbersome logic that would have been necessary without it.

PS. The bug I ran into when I was testing this for the first time was fixed in mid-December 2023.