
Avoiding mutability pitfalls in constructs-based API design: part 2

· 9 min read
Chris Rybicki

Hey there! 👋 My name's Chris and I'm a software engineer on the Wing Cloud team. Lately I've been helping out building Wing's compiler, and designing APIs for Wing's standard library.

In the first post in this series, I introduced some of the challenges of designing APIs for constructs and frameworks like AWS CDK, CDKTF, and cdk8s when mutation is involved.

To recap, a construct can have public methods that mutate the object's private state. But if this state is replaced or destroyed, then application code becomes more sensitive to the order of statements and method calls. This is usually undesirable when our constructs are modeling declarative information, like infrastructure configuration.

To that end, we proposed two solutions for designing construct methods:

  1. Only add state, never subtract or update
  2. Document destructive APIs

These measures are great for addressing a lot of our concerns with mutation. But as we'll see, mutation has wider effects than just the design of methods.

Sharing construct state through properties

Another key capability of most constructs is that they can expose parts of their state for other constructs to use through properties. We can see how they're used in an example from the AWS CDK framework below, written in Wing:

bring "aws-cdk-lib" as cdk;

let table = new cdk.aws_dynamodb.Table(
  partitionKey: {
    name: "path",
    type: cdk.aws_dynamodb.AttributeType.STRING
  }
) as "hits";

let handler = new cdk.aws_lambda.Function(
  runtime: cdk.aws_lambda.Runtime.NODEJS_14_X,
  handler: "hitcounter.handler",
  code: cdk.aws_lambda.Code.fromAsset("./lambda"),
  environment: {
    HITS_TABLE_NAME: table.tableName
  }
);

The construct named Table has a public property named tableName that stores the table's physical name for identifying it on AWS. The table's property tableName is passed as the HITS_TABLE_NAME environment variable so that the AWS Lambda function can use the table's dynamic name at runtime -- for example, to query the table (not shown).

Any construct state that isn't meant to be a private implementation detail can be made public. But, as we've mentioned before, it's also possible for construct state to change after it was first initialized in the code.

Uh oh - this smells like a recipe for problems.

When properties get stale

Let's understand what causes properties to not play well with mutating methods through an example. I'll start by taking my Flower class from the previous post and adding options to specify the regions in the world where it's natively found. (Note that most of the code snippets here on out are in JavaScript.)

class Flower extends Construct {
  constructor(scope, id, props) {
    super(scope, id);
    this._kind = props.kind;
    this._color = props.color;
    // Default to an empty list so regions can be added later
    this._nativeRegions = props.nativeRegions ?? [];
  }

  addNativeRegion(region) {
    this._nativeRegions.push(region);
  }

  toJson() {
    return {
      id: this.node.path,
      kind: this._kind,
      color: this._color,
      nativeRegions: this._nativeRegions,
    };
  }
}

I've prefixed the instance fields with underscores to indicate that they're not meant to be accessed outside of the class's implementation. (JavaScript technically supports private class members, but it's a somewhat recent addition, so you don't find them in the wild too often.[1])
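As a quick aside, here's roughly what the native alternative looks like. This is only a sketch: the Plant class below is hypothetical and isn't part of the garden example, which keeps using the underscore convention.

```javascript
// Hypothetical class, used only to illustrate native private fields.
class Plant {
  // `#` fields are only accessible from inside the class body.
  #nativeRegions;

  constructor(nativeRegions = []) {
    this.#nativeRegions = [...nativeRegions];
  }

  addNativeRegion(region) {
    this.#nativeRegions.push(region);
  }

  get regionCount() {
    return this.#nativeRegions.length;
  }
}

const plant = new Plant(["Turkey"]);
plant.addNativeRegion("Greece");
console.log(plant.regionCount); // 2
console.log(plant.nativeRegions); // undefined: there is no public property with that name
```

Unlike the underscore convention, the privacy here is enforced by the language rather than by agreement between authors and users.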

Here's how the updated construct is used:

let flower = new Flower(garden, "tulip", {
  kind: "tulip",
  nativeRegions: ["Turkey", "Greece"],
});
flower.addNativeRegion("Romania");

Everything's good so far. If we try synthesizing a garden.json file with the new Flower, it will output the flower's definition in JSON as we expect:

[
  {
    "id": "root/tulip",
    "kind": "tulip",
    "nativeRegions": [
      "Turkey",
      "Greece",
      "Romania"
    ]
  },
  // ... rest of the garden data
]

Now let's say we add the capability for users to get the native regions of a flower. I'll also add a construct for representing a signpost in front of our garden.

class Flower extends Construct {
  get nativeRegions() {
    return [...this._nativeRegions];
  }

  // ... rest of the class unchanged
}

class Signpost extends Construct {
  constructor(scope, id, props) {
    super(scope, id);
    const allRegions = new Set(props.flowers.flatMap((f) => f.nativeRegions));

    this._message = "Welcome to Tulip Trove, home to flowers from: ";
    this._message += [...allRegions].join(", ");
    this._message += ".";
  }

  toJson() {
    return {
      id: this.node.path,
      message: this._message,
    };
  }
}

Inside Signpost, I'm collecting all of the native regions of the flowers passed to the signpost, de-duplicating them, and embedding them into a friendly message.

Finally, I'll write some client code that tries using the signpost with some flowers:

const garden = new Garden(undefined, "root");

// add a flower
const rose = new Flower(garden, "rose", { kind: "rose", color: "red" });
rose.addNativeRegion("Denmark");

// add a signpost
new Signpost(garden, "signpost", { flowers: [rose] });

// add more regions to our first flower
rose.addNativeRegion("Turkey");
rose.addNativeRegion("Greece");

garden.synth();

When I synthesize my garden with node garden.js, I'm expecting the signpost to have a message like "Welcome to Tulip Trove, home to flowers from: Denmark, Turkey, Greece". But when I check garden.json, I find the signpost message only mentions Denmark:

[
  {
    "id": "root/rose",
    "kind": "rose",
    "color": "red",
    "nativeRegions": [
      "Denmark",
      "Turkey",
      "Greece"
    ]
  },
  {
    "id": "root/signpost",
    "message": "Welcome to Tulip Trove, home to flowers from: Denmark."
  }
]

Aw shucks.

The problem, as you may have guessed, is that the state read by Signpost was stale. Since the signpost's message was calculated immediately in its constructor, it wasn't updated when more native regions were later added to the rose.
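The same effect can be reproduced without any constructs at all. Here's a stripped-down sketch (the variable names are illustrative, not taken from the garden code) of what happens between Flower and Signpost:

```javascript
// Illustrative names only; this mirrors what Flower and Signpost do internally.
const regions = ["Denmark"];

// The getter returns a copy, so the reader receives a snapshot...
const snapshot = [...regions];
const message = "home to flowers from: " + snapshot.join(", ");

// ...and later additions never reach the snapshot or anything derived from it.
regions.push("Turkey", "Greece");

console.log(regions.length); // 3
console.log(message); // "home to flowers from: Denmark"
```

Note that dropping the defensive copy wouldn't help: the message string is built once, so it would still only mention Denmark.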

But in some sense, it's not entirely Signpost's fault - how was it supposed to know the field could change? It doesn't seem right to have to look at the implementation of Flower in order to determine whether the data will be calculated later or not. We need a better way.

Laziness is a virtue

The approach we're going to take to solve this problem is to add support for a way of modeling values that aren't available yet, called Lazy values.

Each construct framework has a slightly different way of doing this, but the general idea is that instead of returning some state that could become stale, as we did here in Flower:

class Flower extends Construct {
  get nativeRegions() {
    return [...this._nativeRegions];
  }

  // ... rest of the class unchanged
}

... we will instead return a Lazy value that promises to return the correct value:

class Flower extends Construct {
  get nativeRegions() {
    return new Lazy(() => [...this._nativeRegions]);
  }

  // ... rest of the class unchanged
}

Representing delayed values with lazy values (sometimes called "thunks") is a well-trodden path in the history of computer science, which sees popular use in all kinds of frameworks. React's useEffect hook is a good example of this pattern being used in one of the most popular web frameworks.

If we were using TypeScript for these examples, we would also model this with a different type. Instead of the nativeRegions getter returning Array<string>, it will return Lazy<Array<string>>. This extra Lazy "wrapper" matches up with the fact that to access the value stored inside, we have to write some extra code to unwrap it.

Now let's update Signpost to make it work with the fixed Flower construct:

class Signpost extends Construct {
  constructor(scope, id, props) {
    super(scope, id);

    this._message = new Lazy(() => {
      const allRegions = new Set(props.flowers.flatMap((f) => f.nativeRegions.produce()));

      let message = "Welcome to Tulip Trove, home to flowers from: ";
      message += [...allRegions].join(", ");
      message += ".";
      return message;
    });
  }

  // toJson unchanged
}

Since nativeRegions is a Lazy value, and the message depends on nativeRegions, it's clear that the message also needs to be a Lazy value -- so in the code above, we've wrapped it in new Lazy(() => { ... }).

Besides that, we also have to call produce() on the Lazy value in order to force its value to be computed. In the example above, I've replaced f.nativeRegions with f.nativeRegions.produce().
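To make the idea concrete, here's a minimal sketch of what a Lazy class could look like. It assumes Lazy needs nothing more than a constructor that takes a function and a produce() method that runs it; the real implementation used for this post may well differ.

```javascript
// A hypothetical, minimal Lazy: it stores a function and runs it on demand.
class Lazy {
  constructor(producer) {
    this._producer = producer;
  }

  // Runs the stored function and returns whatever it computes right now.
  produce() {
    return this._producer();
  }
}

const regions = ["Denmark"];
const lazyRegions = new Lazy(() => [...regions]);

// Mutations made after the Lazy was created are still observed,
// because nothing is read until produce() is called.
regions.push("Turkey");
console.log(lazyRegions.produce()); // [ 'Denmark', 'Turkey' ]
```

This version recomputes the value on every produce() call. Caching the result is possible too, but only safe once no further mutations can happen.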

The core implementation of Lazy requires some changes to Garden as well, but they're not too interesting to look at. If you're curious, the code from this post in its entirety is available as a gist here [2] for your perusal.
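For a rough idea of the kind of change involved, here's a sketch that assumes synthesis walks the data returned by each toJson() and swaps every Lazy for its produced value. The resolve helper and the minimal Lazy class are hypothetical stand-ins, not the code from the gist.

```javascript
// Hypothetical stand-in for the Lazy class used in this post.
class Lazy {
  constructor(producer) {
    this._producer = producer;
  }
  produce() {
    return this._producer();
  }
}

// Hypothetical helper: recursively replaces Lazy values with what they produce.
function resolve(value) {
  if (value instanceof Lazy) {
    // A Lazy may produce further Lazy values, so resolve the result too.
    return resolve(value.produce());
  }
  if (Array.isArray(value)) {
    return value.map((item) => resolve(item));
  }
  if (value !== null && typeof value === "object") {
    return Object.fromEntries(
      Object.entries(value).map(([key, inner]) => [key, resolve(inner)])
    );
  }
  return value;
}

const regions = ["Denmark"];
const signpostJson = {
  id: "root/signpost",
  message: new Lazy(() => "Flowers from: " + regions.join(", ")),
};

regions.push("Turkey");

// Only at "synth time" are the Lazy values forced.
const resolved = resolve(signpostJson);
console.log(JSON.stringify(resolved));
// {"id":"root/signpost","message":"Flowers from: Denmark, Turkey"}
```

Because resolution happens in one place at the very end, individual constructs never need to decide when it's safe to read each other's state.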

Ideas for making Lazy less complicated

Lazy values can be pretty powerful -- but one thing holding them back is the ergonomics of using them. In the code above, we saw that in order to create a Lazy value, the code for producing the value had to be wrapped in this clunky new Lazy(() => { ... }) syntax.

But even with that aside, we have also potentially introduced new issues, because of this fact:

Lazy.produce() should only be called inside of other Lazy definitions

If we tried calling f.nativeRegions.produce() directly inside of Signpost's constructor, we'd obtain a list of native regions that could get stale, putting us back at square one. The only way to guarantee we're using Lazy properly is if it's only evaluated at the end of our app, when we call garden.synth().
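Here's a small sketch of the difference, using a hypothetical minimal Lazy class with just a constructor and produce():

```javascript
// Hypothetical stand-in for the Lazy class used in this post.
class Lazy {
  constructor(producer) {
    this._producer = producer;
  }
  produce() {
    return this._producer();
  }
}

const regions = ["Denmark"];
const lazyRegions = new Lazy(() => [...regions]);

// Eager: produce() is called right away, so the result is a stale snapshot.
const eagerMessage = "Regions: " + lazyRegions.produce().join(", ");

// Deferred: produce() is only called inside another Lazy.
const lazyMessage = new Lazy(() => "Regions: " + lazyRegions.produce().join(", "));

regions.push("Turkey", "Greece");

console.log(eagerMessage); // "Regions: Denmark"
console.log(lazyMessage.produce()); // "Regions: Denmark, Turkey, Greece"
```

Both versions look almost identical in the source, which is exactly why the mistake is easy to make.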

In addition, having to call produce() on each Lazy is tedious and it's easy to forget.

But perhaps... there's a better way?

It turns out that what these issues call for (like checking for errors in your code, and automatically generating code for you) are the kinds of problems that compilers are perfect for!

We don't have an RFC available yet, but it's possible that a future version of Wing could have built-in support for safe and easy Lazy usage:

// in Wing...

class Flower {
  // ...

  get nativeRegions(): Lazy<Array<str>> {
    // easier syntax!
    return lazy { this._nativeRegions.copy() };
  }
}

class Signpost {
  new(props) {
    this._message = lazy {
      let allRegions = Set<str>.from(
        // no need to call .produce() manually - it's automatically called
        // since this code is inside a `lazy { ... }` block
        props.flowers.flatMap((f) => f.nativeRegions)
      );

      let var message = "Welcome to Tulip Trove, home to flowers from: ";
      message += allRegions.toArray().join(", ");
      message += ".";
      return message;
    };
  }
}

What do you think? Let us know on our GitHub or Slack if you have any thoughts or feedback about the ideas in this post, or if you have suggestions for new topics!