
#831 Resource Naming (draft)

  • Author(s): @Chriscbr
  • Submission Date: {2022-08-22}
  • Stage: Draft
  • Stage Date: {YYYY-MM-DD}

How should the Wing SDK assign "physical names" to resources on external cloud providers?

Background

In Wing's programming model, every resource has an ID ("resource ID") that is unique among its siblings. In CDK frameworks like AWS CDK or cdktf, this ID is a required, second parameter of every constructor, while in Wing the resource ID can be customized by the user through a separate syntax.

Resources can be organized into trees by specifying some resources as children of other resources. This implies every resource has a path, which is guaranteed to be unique among all resources in the application. For example, below is the tree representation of an application with two buckets that have the same resource ID ("Images"), but unique paths ("root/live/Images" and "root/backup/Images"):

.
└── root/
    ├── live/
    │   └── Images (cloud.Bucket)
    └── backup/
        └── Images (cloud.Bucket)
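
As a rough illustration of how sibling-unique IDs lead to application-unique paths, here is a minimal TypeScript sketch. The `TreeNode` class is a hypothetical stand-in written for this example; it is not the API of the `constructs` library or of the Wing SDK.

```typescript
// Hypothetical stand-in for a node in the resource tree (not the `constructs` API).
class TreeNode {
  readonly children: TreeNode[] = [];

  constructor(readonly id: string, readonly parent?: TreeNode) {
    if (parent) {
      // IDs only need to be unique among siblings.
      for (const sibling of parent.children) {
        if (sibling.id === id) {
          throw new Error(`duplicate id "${id}" under "${parent.path}"`);
        }
      }
      parent.children.push(this);
    }
  }

  // The path is the chain of IDs from the root down to this node.
  get path(): string {
    return this.parent ? this.parent.path + "/" + this.id : this.id;
  }
}

const root = new TreeNode("root");
const live = new TreeNode("live", root);
const backup = new TreeNode("backup", root);
const liveImages = new TreeNode("Images", live);
const backupImages = new TreeNode("Images", backup);

console.log(liveImages.path);   // root/live/Images
console.log(backupImages.path); // root/backup/Images
```

Because each ID is unique among its siblings, two distinct nodes can never share a full path, even when their own IDs are identical.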

Wing uses cdktf to generate a Terraform definition for every resource that corresponds to an actual cloud resource (typically, these are leaf nodes in the construct tree).

When a resource is declared in Terraform, two kinds of identifiers are typically required:

  1. A "logical ID" for Terraform to track the resource and its state. This is always required. Fortunately, because Wing uses cdktf, the logical ID is generated automatically from the resource ID.
  2. A "physical name" that is used to identify the resource when it is deployed to a cloud provider. Different cloud providers have different requirements for resource names (described below), and some resources even allow the name to be omitted (and a unique name will be automatically generated by the cloud provider).

This RFC's primary concern is how the Wing SDK should choose values for (2), the resource's physical name.

Some examples of naming requirements for resources on different clouds:

  • AWS S3 Bucket allows a full bucket name, a bucket prefix, or neither to be provided. If a bucket name is provided, it must follow several naming requirements, including uniqueness among all AWS accounts. If no name is provided, or only a prefix is provided, then the bucket's name (or the remainder of it) will be automatically generated.
  • AWS Lambda requires a function name, which only needs to be unique within that AWS account + region.
  • Azure CosmosDB requires a name to be provided (as well as the account name and resource group name). Azure resources must be named according to Azure's naming requirements.
  • A GCP cluster requires a name to be provided. GCP resources must be named according to GCP's naming requirements.

Proposal

In order for names to "just work" for Wing users across a multitude of target platforms, the physical resource names generated by the Wing SDK should:

  • satisfy cloud provider naming requirements (some cloud providers restrict names to as few as 20 characters, and to alphanumeric characters)
  • be unique among all resources in the application - two resources of the same type with different paths in the resource tree should always end up with different physical names
  • be deterministic - a resource of a given type with a given path should always end up with the same physical resource name, so that Terraform resources do not get replaced incidentally
  • optionally, be human-readable for debugging purposes

For most resources, the strategy used by the Wing SDK will be to generate a physical resource name based on the path, applying truncations and adding hashes where needed to ensure the determinism / uniqueness properties described above. The proposed algorithm included in the Example section is based on the resource naming algorithm used in cdk8s (source).

Some resources will be treated specially where necessary. For example, some resources like AWS S3 Bucket or GCP Project ID need to be globally unique among all AWS accounts / GCP projects. For these resources, we can use the user-provided resource ID as a prefix (a bucket prefix, in the S3 case) and let Terraform generate the name.
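
For the S3 case, the prefix approach might look roughly like the sketch below. The `bucketPrefix` helper and its sanitization rules are assumptions made for this example, not part of the proposal. The 37-character limit reflects what the Terraform AWS provider documents for the `bucket_prefix` argument of `aws_s3_bucket`, and should be checked against the provider version in use.

```typescript
// Assumption: limit documented for the `bucket_prefix` argument of aws_s3_bucket.
const MAX_BUCKET_PREFIX_LEN = 37;

// Hypothetical helper: turns a resource ID into a prefix that is safe to use
// at the start of an S3 bucket name.
function bucketPrefix(resourceId: string): string {
  const cleaned = resourceId
    .toLowerCase()
    .replace(/[^a-z0-9-]/g, "") // keep lowercase letters, digits, and hyphens
    .replace(/^-+/, "");        // a bucket name must start with a letter or digit
  if (cleaned.length === 0) {
    throw new Error(`resource id "${resourceId}" has no usable characters`);
  }
  // Leave room for the trailing hyphen that separates the prefix from the
  // suffix Terraform appends.
  return cleaned.slice(0, MAX_BUCKET_PREFIX_LEN - 1) + "-";
}

// Arguments that would be passed to the bucket resource; Terraform fills in
// the rest of the name.
const bucketArgs = { bucket_prefix: bucketPrefix("Images") };
console.log(bucketArgs.bucket_prefix); // images-
```

With this approach the final bucket name is not known until deployment, so it trades the readable, path-based name for global uniqueness.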

Example

The user has a cloud function in an application named "PageParser". The resource's full path is:

my-demo-app/Dev/default/ImageExtractor/default/ImageScraper/default/PageParser/default

When compiling to GCP, the physical resource name must be no more than 63 characters and be a valid DNS name, so the generated name could be:

y-demo-app-dev-imageextractor-imagescraper-pageparser-c8ceb89a

When compiling to AWS, the physical resource name must contain only letters, numbers, hyphens, or underscores, so the name generated above would also be valid.

In environments where hyphens and underscores are not allowed, we can strip out these characters and still end up with a tree-unique string.

A few notes:

  • The resource name includes a short hash at the end. This is necessary so that if resource IDs like "Images@" and "Images%" have their special characters removed, their hashes will still be different, ensuring the physical resource names are unique.
  • Wing is based on the constructs programming model, which has a convention that a child named "default" or "resource" has special significance as a default child. We can safely omit this information.
  • When a resource name gets too long, information gets trimmed from the beginning instead of the end, in order to ensure we still generate unique names for resources with very long paths.
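
The notes above can be sketched in TypeScript as follows. This is a minimal illustration under stated assumptions, not the cdk8s or Wing SDK implementation: `physicalName`, `NameOptions`, and `shortHash` are hypothetical names, and the hash is a simple 32-bit FNV-1a stand-in because this RFC does not specify a hash function (so it will not reproduce the `c8ceb89a` suffix shown earlier).

```typescript
const HASH_LEN = 8;
// IDs that mark a default child and carry no useful information.
const DEFAULT_CHILD_IDS = ["default", "resource"];

// Stand-in hash: 32-bit FNV-1a over UTF-16 code units, as 8 hex characters.
function shortHash(input: string): string {
  const prime = 0x01000193;
  let h = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    h = (h ^ input.charCodeAt(i)) >>> 0;
    // Multiply modulo 2^32 in two 16-bit halves to stay in exact integer range.
    h = ((h & 0xffff) * prime + ((((h >>> 16) * prime) & 0xffff) << 16)) >>> 0;
  }
  return ("0000000" + h.toString(16)).slice(-HASH_LEN);
}

interface NameOptions {
  maxLen: number;        // length limit imposed by the cloud provider
  allowHyphens: boolean; // false for providers that accept only alphanumerics
}

function physicalName(path: string, opts: NameOptions): string {
  if (opts.maxLen < HASH_LEN) {
    throw new Error("maxLen is too small to fit the hash");
  }
  const invalid = opts.allowHyphens ? /[^a-z0-9-]/g : /[^a-z0-9]/g;
  const sep = opts.allowHyphens ? "-" : "";

  const parts = path
    .split("/")
    .filter((id) => DEFAULT_CHILD_IDS.indexOf(id) === -1) // omit default children
    .map((id) => id.toLowerCase().replace(invalid, ""))    // drop invalid characters
    .filter((id) => id.length > 0);

  // Hash the original, unsanitized path so "Images@" and "Images%" still differ.
  const hash = shortHash(path);
  const budget = opts.maxLen - HASH_LEN - sep.length;

  let readable = parts.join(sep);
  if (readable.length > budget) {
    // Trim from the beginning, then drop any hyphen left at the front.
    readable = readable.slice(readable.length - budget).replace(/^-+/, "");
  }
  return readable.length > 0 ? readable + sep + hash : hash;
}

const pageParserPath =
  "my-demo-app/Dev/default/ImageExtractor/default/ImageScraper/default/PageParser/default";
// Starts with "my-demo-app-dev-imageextractor-imagescraper-pageparser-", then the hash.
console.log(physicalName(pageParserPath, { maxLen: 63, allowHyphens: true }));
// Same path for a provider that accepts only alphanumeric characters.
console.log(physicalName(pageParserPath, { maxLen: 63, allowHyphens: false }));
```

Hashing the full original path, rather than the sanitized parts, is what keeps the name unique after characters are removed or the readable portion is truncated.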

Note: Since we have flexibility in what characters to include besides the hash, we may be able to customize the logic to be smarter so that e.g. names of intermediate resources get truncated instead of the prefix or the suffix.

FAQ

Q: What if I want to deploy a second copy of my application on the same cloud region / account / cluster / etc.?

A: In this case, you need to change your application code so that the resources have different paths. We recommend accomplishing this by grouping your application's resources into a parent/grouping resource. Then, create this resource in the main() function with a different id. This way, the paths of all sub-resources will be unique. For example:

// pseudocode
resource MyApp {
  new() {
    bucket = Bucket() as "Images"
    uploader = Function(...) as "Uploader"
  }
}

fn main() {
  dev = MyApp() as "dev"
  staging = MyApp() as "staging"
  prod = MyApp() as "prod"
}

Q: Why do resource names include the entire path, rather than just the resource's id in a particular scope?

A: Since the proposed naming scheme does include hashes (generated from the construct's path), it is possible we could generate the hash based on the entire path, and only include the last part of the path in the names. For example, in the PageParser example we could instead generate pageparser-c8ceb89a. However, the extra information might be helpful for advanced users that want to debug resources through cloud providers or error messages generated by cloud providers that include resource names.

Q: How can a user override the physical name if they want complete control of the resource's naming?

A: The exact mechanism is out of the scope of this RFC, but we plan to include an "escape hatch" API for this, such as:

bucket = Bucket();
bucket.overridePhysicalName("use-this-bucket-name-exactly");