Lessons in CloudFormation Fn::ForEach
I promised to get back to Fn::ForEach and convert a VPC template to see how much it would simplify the code and whether there were other things that were not so obvious from the documentation. It took a bit longer than expected, but I finally got it done and managed to learn some new lessons in the process.
My original plan was to refactor some VPC templates using ForEach and to see how much it would help in simplifying the code. Due to the time spent on debugging template transformations this was left for another post, but I hope this will be helpful for others running into one of these speed bumps.
Now Go Build
This is a simple VPC structure I am going to build with a template that uses Fn::ForEach to iterate over availability zones. To make it more flexible, there is a parameter you can use to choose which AZs to deploy into. It is an ideal case for a loop, as every AZ is the same; you just choose whether you want 1, 2 or 3 copies (or more if you modify the AvailabilityZones parameter validation).
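As a rough orientation, the overall shape of such a template can be sketched as below. This is a minimal sketch and not the actual working template: the CIDR ranges, the AzCidr mapping and the loop name PublicSubnets are assumptions made for illustration. It also assumes that the loop variable can be read with !Ref inside the loop body, as the AWS documentation examples do.

```yaml
# Sketch only: CIDRs, the AzCidr mapping and the loop name are assumptions.
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::LanguageExtensions   # Fn::ForEach needs this transform

Parameters:
  AvailabilityZones:
    Type: CommaDelimitedList
    Description: "Deploy to AZs"
    Default: "a,b,c"

Mappings:
  AzCidr:                            # hypothetical per-AZ CIDR blocks
    a:
      Public: 10.0.0.0/24
    b:
      Public: 10.0.1.0/24
    c:
      Public: 10.0.2.0/24

Resources:
  VPC:
    Type: AWS::EC2::VPC
    Properties:
      CidrBlock: 10.0.0.0/16

  'Fn::ForEach::PublicSubnets':
    - X                              # loop variable
    - !Ref AvailabilityZones         # collection to iterate over
    - 'PubSubnet${X}':               # one subnet per AZ letter
        Type: AWS::EC2::Subnet
        Properties:
          VpcId: !Ref VPC
          AvailabilityZone: !Sub '${AWS::Region}${X}'
          CidrBlock: !FindInMap [AzCidr, !Ref X, Public]
```

The loop takes three elements: the variable name, the collection, and the fragment to repeat. Everything keyed on ${X} is repeated once per value in the collection.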
If you are just interested in what the result looks like, here is the working template. But the interesting part is what it took to get there and which things didn’t work as one might have expected …
Unresolved Resource Dependencies
The first thing I hit was the error Unresolved resource dependencies from a block where I was trying to reference another resource created within the loop. At first glance, the code below, which creates a route table association for a subnet, looks OK, and yet it doesn’t work.
PubSubnetRouting${X}:
  Type: AWS::EC2::SubnetRouteTableAssociation
  Properties:
    RouteTableId: !Ref PubRouteTable
    SubnetId: !Ref PubSubnet${X}   # <-- THIS FAILS
The problem is the SubnetId property referencing a subnet created within the loop. When the template format is checked, the Fn::ForEach loop hasn’t yet been unrolled, so there is no PubSubnet${X}.
The solution is to add a substitution that resolves the real resource ID during stack creation and passes the template format check.
PubSubnetRouting${X}:
  Type: AWS::EC2::SubnetRouteTableAssociation
  Properties:
    RouteTableId: !Ref PubRouteTable
    SubnetId: !Ref                 # <-- THIS WORKS
      Fn::Sub: 'PubSubnet${X}'     # <-- THIS WORKS
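As a mental model, this is roughly what the loop body is expected to look like once unrolled, assuming AvailabilityZones is set to "a,b". It is written by hand for illustration and is not output captured from the CloudFormation transform.

```yaml
# Hand-written illustration of the unrolled loop for AvailabilityZones = "a,b".
PubSubnetRoutinga:
  Type: AWS::EC2::SubnetRouteTableAssociation
  Properties:
    RouteTableId: !Ref PubRouteTable
    SubnetId: !Ref PubSubneta      # 'PubSubnet${X}' with X = a
PubSubnetRoutingb:
  Type: AWS::EC2::SubnetRouteTableAssociation
  Properties:
    RouteTableId: !Ref PubRouteTable
    SubnetId: !Ref PubSubnetb      # 'PubSubnet${X}' with X = b
```

The logical IDs PubSubneta and PubSubnetb only exist after unrolling, which is why a plain !Ref to PubSubnet${X} has nothing to point at when the template is first checked.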
Fn::GetAtt Layout is Incorrect
After fixing references to resources created in the loop, the next error was a bit similar but not exactly the same. Here is the piece that creates NAT gateways with EIPs.
NatGw${X}:
  Type: AWS::EC2::NatGateway
  Properties:
    SubnetId: !Ref
      Fn::Sub: 'PubSubnet${X}'
    AllocationId: !GetAtt
      Fn::Sub: NatEip${X}.AllocationId   # <-- THIS FAILS
In this case the solution was found in the CloudFormation documentation, which has exactly the same case of a NAT gateway and an EIP. The solution was not to use the shorthand (dotted) notation for GetAtt but the original list form, with separate parameters for the resource and the attribute.
NatGw${X}:
  Type: AWS::EC2::NatGateway
  Properties:
    SubnetId: !Ref
      Fn::Sub: 'PubSubnet${X}'
    AllocationId: !GetAtt
      - !Sub NatEip${X}
      - AllocationId
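For context, here is a sketch of how the EIP and the NAT gateway could sit together in a single loop, so that each gateway picks up the EIP created in the same iteration. The loop name NatGateways and the EIP properties are assumptions for illustration. The GetAtt list form is the one described above.

```yaml
# Sketch only: the loop name and the EIP definition are assumptions.
'Fn::ForEach::NatGateways':
  - X
  - !Ref AvailabilityZones
  - 'NatEip${X}':                  # one Elastic IP per AZ
      Type: AWS::EC2::EIP
      Properties:
        Domain: vpc
    'NatGw${X}':                   # one NAT gateway per AZ
      Type: AWS::EC2::NatGateway
      Properties:
        SubnetId: !Ref
          Fn::Sub: 'PubSubnet${X}'
        AllocationId: !GetAtt
          - !Sub 'NatEip${X}'      # resource name, built per iteration
          - AllocationId           # attribute name, kept separate
```

Keeping the resource name and the attribute as separate list elements means only the resource name needs substitution, and the attribute stays a plain string.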
S3 Endpoint
Sometimes you want to get a list of resources created in the loop and use it as a parameter for another resource. Unfortunately this isn’t possible. While the loop simplified most of the template and let me avoid most of the code duplication, for the S3 endpoint it was the opposite. If you know all the route tables that need a route to the S3 gateway endpoint, you can do it with a single resource.
S3vpcEndpoint:
  Type: AWS::EC2::VPCEndpoint
  Properties:
    ServiceName: !Sub 'com.amazonaws.${AWS::Region}.s3'
    VpcEndpointType: Gateway
    VpcId: !Ref VPC
    RouteTableIds:
      - !Ref PubRouteTable
      - !Ref PrivRouteTableA
      - !Ref PrivRouteTableB
      - !Ref PrivRouteTableC
Now, however, I had to create a dedicated endpoint resource for each route table. This isn’t a big deal unless you are getting close to the maximum template size. Here is how the above translated into the refactored version.
S3vpcEndpointPublic:
  Type: AWS::EC2::VPCEndpoint
  Properties:
    ServiceName: !Sub 'com.amazonaws.${AWS::Region}.s3'
    VpcEndpointType: Gateway
    VpcId: !Ref VPC
    RouteTableIds:
      - !Ref PubRouteTable
And then looping through all private route tables.
S3vpcEndpointPriv${X}:
  Type: AWS::EC2::VPCEndpoint
  Properties:
    ServiceName: !Sub 'com.amazonaws.${AWS::Region}.s3'
    VpcEndpointType: Gateway
    VpcId: !Ref VPC
    RouteTableIds:
      - !Ref
        Fn::Sub: PrivRouteTable${X}
You Can’t Update This
The template has a parameter that defines how many and which AZs to deploy to.
AvailabilityZones:
  Type: CommaDelimitedList
  Description: "Deploy to AZs"
  Default: "a,b,c"
  AllowedValues: [a, b, c]

'Fn::ForEach::AZ':
  - X
  - !Ref AvailabilityZones
The idea was that you could update the stack and deploy to additional AZs if needed. Removing AZs wasn’t in the plans, as removing a subnet would first require removing all network interfaces from it, which would be difficult in real life.
Unfortunately this isn’t possible, as changes to the loop collection don’t trigger a stack update. So for now this parameter can only be set when the stack is created, to select which AZs it will deploy to. You can vote for this GitHub issue if you think it deserves more attention. I already did.
Logical Resource IDs Must Be Alphanumeric
In a CloudFormation template, logical resource IDs must be alphanumeric, i.e. only uppercase (A-Z) and lowercase (a-z) letters and numbers (0-9) are allowed. I didn’t have this problem with my template, but it is very easy to see how this would become an issue if you wanted to iterate over a collection of IP addresses (with dots in them) or email addresses. There are some good ideas for solutions, like exposing a loop index instead of using the variable itself. This and other enhancement ideas can be found in this GitHub issue. You can either vote for it or read it as guidance on what is not possible (yet).
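One possible way around this, which I have not taken from the issue and offer only as a sketch, is to iterate over alphanumeric keys and look up the awkward values from a mapping. Everything here is hypothetical: the AdminCidrs mapping, the office names, the CIDR ranges and the AdminSecurityGroup resource are placeholders, and the template is assumed to declare Transform: AWS::LanguageExtensions. It also assumes the loop variable can be passed to FindInMap with !Ref.

```yaml
# Hypothetical example: names, CIDRs and AdminSecurityGroup are placeholders.
Mappings:
  AdminCidrs:
    Helsinki:
      Cidr: 192.0.2.0/24
    London:
      Cidr: 198.51.100.0/24

Resources:
  'Fn::ForEach::AdminAccess':
    - Office
    - [Helsinki, London]           # alphanumeric keys are safe in logical IDs
    - 'AllowSsh${Office}':
        Type: AWS::EC2::SecurityGroupIngress
        Properties:
          GroupId: !Ref AdminSecurityGroup
          IpProtocol: tcp
          FromPort: 22
          ToPort: 22
          # The value with dots and slashes never ends up in a logical ID.
          CidrIp: !FindInMap [AdminCidrs, !Ref Office, Cidr]
```

The trade-off is that the values have to be known when the template is written, so this does not help when the collection itself arrives as a parameter.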
Was It Worth It?
Yes it was. While this was a rather long list of things that were either non-intuitive or not possible at all, it was possible to remove a lot of duplicate code and cumbersome logic that would have been necessary without Fn::ForEach.
PS. The bug I ran into when I was testing this for the first time got fixed in mid-December 2023.