TL;DR: This guide demonstrates how to replace manual AWS ClickOps with deterministic Infrastructure as Code using AWS CDK and TypeScript. You will learn how to architect environment-agnostic stacks and use zod to strictly validate configuration objects at synthesis time to prevent deployment failures. It also covers designing dynamic, multi-environment VPCs that automatically adjust resources like NAT Gateways to optimize for both development costs and production availability.
⚡ Key Takeaways
- Use zod to strictly validate environment configurations (like CIDR blocks and account IDs) so misconfigurations fail fast during cdk synth rather than mid-deployment.
- Build environment-agnostic infrastructure by passing configuration objects into generic stack classes instead of hardcoding separate stacks (e.g., StagingVpcStack).
- Dynamically load target environments by passing context variables through the CDK CLI using app.node.tryGetContext('env') (e.g., cdk deploy -c env=prod).
- Control VPC costs and high availability deterministically by adjusting resources via config, such as setting natGateways: 0 in dev and 3 in production.
- Protect production account IDs from being hardcoded in repositories by injecting them via AWS Systems Manager (SSM) or CI/CD pipeline environment variables.
Your development velocity is being bottlenecked by ClickOps.
When engineering teams rely on manual AWS console modifications or ad-hoc shell scripts to provision infrastructure, scaling becomes fundamentally broken. A developer requests an S3 bucket or an SQS queue, and operations provisions it manually. Over time, Configuration Drift infects your architecture. Security group rules added in production are missed in staging, and the notorious "it works in staging, but fails in production" syndrome becomes a daily reality.
While Terraform has long been the industry standard for Infrastructure as Code (IaC), the cognitive load of context-switching between TypeScript (in your application layer) and HCL (HashiCorp Configuration Language) slows down full-stack teams. Furthermore, Terraform's rigid state management and domain-specific looping and conditional constructs (like count and for_each) can make dynamic environment provisioning cumbersome compared to using a general-purpose programming language.
The AWS Cloud Development Kit (CDK) solves this by allowing developers to define cloud infrastructure using expressive, familiar programming languages like TypeScript. However, treating infrastructure as code means you must apply software engineering principles—like dependency injection, interface segregation, and modularity—to your cloud resources.
In this guide, we will architect a production-grade, multi-environment AWS CDK deployment. We will build deterministic stacks, manage stateful resources safely, circumvent cross-stack reference locking, and automate deployments via CDK Pipelines.
Bootstrapping a Multi-Environment CDK Architecture
A common anti-pattern in early CDK adoption is hardcoding environment variables or creating entirely separate stack classes for staging and production (e.g., StagingVpcStack and ProdVpcStack). This defeats the primary purpose of reusable IaC.
Instead, your architecture must be environment-agnostic. The environment (dev, staging, prod) should simply be a configuration object passed into your generic stacks.
First, initialize the project:
npx cdk init app --language typescript
npm install zod
We use zod to strictly validate our environment configuration at synthesis time. If a developer forgets to add a crucial environment variable, the CDK app should fail immediately during cdk synth, rather than midway through a CloudFormation deployment.
Define your environment structure in lib/config/environment.ts:
import { z } from 'zod';
export const EnvironmentConfigSchema = z.object({
envName: z.enum(['dev', 'staging', 'prod']),
accountId: z.string().length(12),
region: z.string(),
vpcCidr: z.string().regex(/^([0-9]{1,3}\.){3}[0-9]{1,3}\/([0-9]|[1-2][0-9]|3[0-2])$/, { message: 'Invalid IPv4 CIDR notation' }),
maxAzs: z.number().min(2).max(3),
natGateways: z.number().min(0),
retainStatefulResources: z.boolean(),
});
export type EnvironmentConfig = z.infer<typeof EnvironmentConfigSchema>;
export const environments: Record<string, EnvironmentConfig> = {
dev: {
envName: 'dev',
accountId: '111122223333',
region: 'us-east-1',
vpcCidr: '10.0.0.0/16',
maxAzs: 2,
natGateways: 0, // Cost savings in dev
retainStatefulResources: false,
},
prod: {
envName: 'prod',
accountId: '444455556666',
region: 'us-east-1',
vpcCidr: '10.1.0.0/16',
maxAzs: 3,
natGateways: 3, // Highly available NATs in prod
retainStatefulResources: true,
},
};
Production Note: Never hardcode actual production account IDs in open-source or highly accessible repositories. Use AWS Systems Manager (SSM) or inject them via environment variables in your CI/CD pipeline.
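One lightweight way to follow this advice is to resolve the production account ID from a CI/CD environment variable at synth time. The sketch below is an illustration, not part of the guide's code: the variable name PROD_ACCOUNT_ID and the dev fallback account are assumptions, so substitute whatever your pipeline actually exposes.

```typescript
// Sketch: resolve the production account ID from the pipeline environment
// instead of committing it to source. PROD_ACCOUNT_ID is a hypothetical
// variable name -- use whatever your CI/CD system provides.
export function resolveAccountId(envName: string): string {
  if (envName === 'prod') {
    const fromCi = process.env.PROD_ACCOUNT_ID;
    // Fail fast at synth time, mirroring the zod approach above.
    if (!fromCi || !/^\d{12}$/.test(fromCi)) {
      throw new Error('PROD_ACCOUNT_ID must be a 12-digit AWS account ID');
    }
    return fromCi;
  }
  // Non-production account IDs are less sensitive and can stay in source.
  return '111122223333';
}
```

The dev config can then keep its literal account ID while the prod entry calls resolveAccountId('prod'), keeping the sensitive value out of the repository entirely.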
Next, wire up the entry point in bin/app.ts to contextually load these environments.
#!/usr/bin/env node
import 'source-map-support/register';
import * as cdk from 'aws-cdk-lib';
import { BaseInfrastructureStack } from '../lib/stacks/base-infra-stack';
import { environments, EnvironmentConfigSchema } from '../lib/config/environment';
const app = new cdk.App();
// Retrieve environment context passed via CLI: cdk deploy -c env=prod
const targetEnv = app.node.tryGetContext('env') || 'dev';
const rawConfig = environments[targetEnv];
// Fail fast on misconfiguration
const config = EnvironmentConfigSchema.parse(rawConfig);
new BaseInfrastructureStack(app, `BaseInfra-${config.envName}`, {
env: { account: config.accountId, region: config.region },
config,
});
Designing Deterministic Networking Stacks
When building your Virtual Private Cloud (VPC), deterministic provisioning is critical. By default, the CDK's Vpc construct makes opinionated choices that might not align with enterprise compliance or budgets, such as deploying a NAT Gateway per Availability Zone (AZ).
In our design, we pass the parsed config object to dictate VPC behavior, saving thousands of dollars in non-production environments by minimizing NAT Gateways.
import * as cdk from 'aws-cdk-lib';
import { Construct } from 'constructs';
import * as ec2 from 'aws-cdk-lib/aws-ec2';
import { EnvironmentConfig } from '../config/environment';
interface BaseInfrastructureProps extends cdk.StackProps {
config: EnvironmentConfig;
}
export class BaseInfrastructureStack extends cdk.Stack {
public readonly vpc: ec2.Vpc;
constructor(scope: Construct, id: string, props: BaseInfrastructureProps) {
super(scope, id, props);
this.vpc = new ec2.Vpc(this, 'MainVpc', {
ipAddresses: ec2.IpAddresses.cidr(props.config.vpcCidr),
maxAzs: props.config.maxAzs,
natGateways: props.config.natGateways,
subnetConfiguration: [
{
cidrMask: 24,
name: 'Public',
subnetType: ec2.SubnetType.PUBLIC,
},
{
cidrMask: 24,
name: 'PrivateCompute',
// PRIVATE_WITH_EGRESS subnets require a NAT route; with natGateways: 0
// (the dev config) the Vpc construct rejects them at synth time, so fall
// back to isolated subnets when no NAT Gateways are provisioned.
subnetType: props.config.natGateways > 0
? ec2.SubnetType.PRIVATE_WITH_EGRESS
: ec2.SubnetType.PRIVATE_ISOLATED,
},
{
cidrMask: 28, // Smaller CIDR for isolated data stores
name: 'PrivateIsolated',
subnetType: ec2.SubnetType.PRIVATE_ISOLATED,
}
],
});
// Enforce flow logs in production environments
if (props.config.envName === 'prod') {
this.vpc.addFlowLog('VpcFlowLogs', {
destination: ec2.FlowLogDestination.toCloudWatchLogs(),
trafficType: ec2.FlowLogTrafficType.ALL,
});
}
}
}
Managing Stateful Resources and Preventing Catastrophic Data Loss
Stateless resources like Lambda functions or ECS services can be destroyed and recreated safely. Stateful resources—such as databases, S3 buckets, and KMS keys—require strict lifecycle management.
A common failure mode in IaC is accidentally deleting a database during a stack refactor. When providing our DevOps & Cloud Deployment Services for enterprise clients, we enforce strict RemovalPolicy and DeletionProtection rules based on the environment context.
import * as cdk from 'aws-cdk-lib';
import * as rds from 'aws-cdk-lib/aws-rds';
import * as s3 from 'aws-cdk-lib/aws-s3';
import { Construct } from 'constructs';
import { EnvironmentConfig } from '../config/environment';
interface DatabaseStackProps extends cdk.StackProps {
config: EnvironmentConfig;
vpc: cdk.aws_ec2.IVpc;
}
export class DatabaseStack extends cdk.Stack {
constructor(scope: Construct, id: string, props: DatabaseStackProps) {
super(scope, id, props);
const removalPolicy = props.config.retainStatefulResources
? cdk.RemovalPolicy.RETAIN
: cdk.RemovalPolicy.DESTROY;
// Secure Document Storage
const storageBucket = new s3.Bucket(this, 'AppStorage', {
blockPublicAccess: s3.BlockPublicAccess.BLOCK_ALL,
encryption: s3.BucketEncryption.S3_MANAGED,
enforceSSL: true,
versioned: props.config.retainStatefulResources,
removalPolicy,
autoDeleteObjects: !props.config.retainStatefulResources, // DANGEROUS in prod
});
// Database cluster
const cluster = new rds.DatabaseCluster(this, 'AppDatabase', {
engine: rds.DatabaseClusterEngine.auroraPostgres({
version: rds.AuroraPostgresEngineVersion.VER_15_4,
}),
vpc: props.vpc,
vpcSubnets: { subnetType: cdk.aws_ec2.SubnetType.PRIVATE_ISOLATED },
writer: rds.ClusterInstance.serverlessV2('Writer'),
serverlessV2MinCapacity: props.config.envName === 'prod' ? 2 : 0.5,
serverlessV2MaxCapacity: props.config.envName === 'prod' ? 16 : 2,
removalPolicy,
deletionProtection: props.config.retainStatefulResources,
});
}
}
Warning: Setting autoDeleteObjects: true provisions a custom Lambda function that physically empties the bucket when the stack is destroyed. Never enable this for production environments, as a malicious or accidental stack deletion will irreversibly wipe all of your data.
Resolving Secrets and Decoupling Cross-Stack References
As your CDK application grows, you will inevitably need to pass references between stacks—for example, passing the VPC ID from the BaseInfrastructureStack to the DatabaseStack.
The default way CDK handles passing resources between stacks is by utilizing CloudFormation Exports. While this seems convenient, it introduces a severe architectural flaw: Cross-Stack Dependency Locking.
If Stack A exports a value that Stack B imports, CloudFormation strictly locks the resource in Stack A. You cannot modify, replace, or rename that resource in Stack A without first deploying Stack B to remove the dependency, which often creates an impossible Catch-22 during complex refactoring.
Instead of passing class properties directly and relying on exports, advanced CDK implementations decouple stacks using AWS Systems Manager (SSM) Parameter Store.
Here is how you write to SSM in the infrastructure stack:
import * as ssm from 'aws-cdk-lib/aws-ssm';
// Inside BaseInfrastructureStack constructor
new ssm.StringParameter(this, 'VpcIdParameter', {
parameterName: `/${props.config.envName}/network/vpc-id`,
stringValue: this.vpc.vpcId,
});
And here is how you dynamically resolve it in the dependent stack without creating a hard CloudFormation export:
import * as ec2 from 'aws-cdk-lib/aws-ec2';
import * as ssm from 'aws-cdk-lib/aws-ssm';
// Inside DatabaseStack constructor
const vpcId = ssm.StringParameter.valueFromLookup(
this,
`/${props.config.envName}/network/vpc-id`
);
const vpc = ec2.Vpc.fromLookup(this, 'ImportedVpc', {
vpcId: vpcId,
});
By relying on valueFromLookup, the CDK resolves the VPC ID at synthesis time and bakes it directly into the template of the DatabaseStack. There is no hard CloudFormation export coupling the two stacks, giving you full freedom to refactor your network stack without breaking deployment dependencies. One caveat: on the very first synth, before the lookup is cached in cdk.context.json, valueFromLookup returns a placeholder value, so run a synth against the target account (or guard against the dummy value) before a lookup-dependent stack deploys.
Implementing CDK Pipelines for Multi-Account Deployments
ClickOps ends when CI/CD begins. To enforce infrastructure determinism, developers should not deploy via cdk deploy from their local machines. Instead, deployment must run through an automated pipeline.
The aws-cdk-lib/pipelines module provides a powerful mechanism called Self-Mutating Pipelines. The pipeline automatically updates its own structure if you add new stages or stacks in your CDK code.
Create a pipeline-stack.ts that defines your automated route to production:
import * as cdk from 'aws-cdk-lib';
import { Construct } from 'constructs';
import { CodePipeline, CodePipelineSource, ShellStep, ManualApprovalStep } from 'aws-cdk-lib/pipelines';
import { AppStage } from './app-stage';
export class PipelineStack extends cdk.Stack {
constructor(scope: Construct, id: string, props?: cdk.StackProps) {
super(scope, id, props);
const pipeline = new CodePipeline(this, 'EnterprisePipeline', {
pipelineName: 'InfraDeliveryPipeline',
synth: new ShellStep('Synth', {
input: CodePipelineSource.connection('your-org/your-repo', 'main', {
connectionArn: 'arn:aws:codestar-connections:us-east-1:111122223333:connection/abc-123',
}),
commands: [
'npm ci',
'npx cdk synth'
],
}),
});
// Deploy to Staging Automatically
const stagingStage = new AppStage(this, 'StagingDeployment', {
env: { account: '111122223333', region: 'us-east-1' },
envName: 'staging'
});
pipeline.addStage(stagingStage);
// Deploy to Production with Manual Approval Guardrail
const prodStage = new AppStage(this, 'ProdDeployment', {
env: { account: '444455556666', region: 'us-east-1' },
envName: 'prod'
});
pipeline.addStage(prodStage, {
pre: [
new ManualApprovalStep('PromoteToProduction')
],
});
}
}
The AppStage encapsulates your BaseInfrastructureStack and DatabaseStack logic. When code is merged into the main branch, the pipeline automatically runs cdk synth, deploys the infrastructure to the Staging account, runs integration tests (if configured), and then halts at a manual approval gate before assuming an IAM role in the Production account to deploy the final stage.
The Path Forward
Transitioning to AWS CDK and TypeScript fundamentally alters how engineering teams approach cloud architecture. You stop treating AWS as a separate operational concern and start treating it as an extension of your application's codebase.
By defining rigorous configuration interfaces, securely handling stateful resources with contextual RemovalPolicies, circumventing CloudFormation export limitations using SSM, and locking deployments behind self-mutating CDK pipelines, you eliminate configuration drift entirely.
Your staging environments become perfect replicas of production. Onboarding new engineers shifts from a multi-day documentation exercise to a single npm ci && npx cdk synth.
Need help building this in production?
SoftwareCrafting is a full-stack dev agency — we ship fast, scalable React, Next.js, Node.js, React Native & Flutter apps for global clients.
Get a Free Consultation
Frequently Asked Questions
Why should development teams migrate from Terraform or ClickOps to AWS CDK with TypeScript?
Relying on manual AWS console changes (ClickOps) inevitably leads to configuration drift and broken deployments. While Terraform is an industry standard, AWS CDK eliminates the cognitive load of context-switching between application code and HCL by allowing developers to define infrastructure using expressive, familiar languages like TypeScript.
What is the best practice for structuring multi-environment AWS CDK deployments?
You should avoid the anti-pattern of creating entirely separate stack classes for different environments, such as StagingVpcStack and ProdVpcStack. Instead, build environment-agnostic stacks where the environment details are passed in as a configuration object. If your team needs expert guidance in setting up these reusable IaC patterns, SoftwareCrafting services can design and implement a robust, multi-environment CDK architecture for you.
How can I validate AWS CDK environment variables before deployment?
You can use a schema validation library like zod to strictly validate your environment configuration objects at synthesis time. This "fail fast" approach ensures that if a developer forgets a crucial variable, the app fails immediately during cdk synth rather than breaking midway through a CloudFormation deployment.
How do I target a specific environment when deploying an AWS CDK application?
You can pass environment context directly through the AWS CDK CLI using the -c flag, for example: cdk deploy -c env=prod. Inside your CDK app's entry point (typically bin/app.ts), you can retrieve this value using app.node.tryGetContext('env') to dynamically load the correct environment configuration.
How can I reduce AWS NAT Gateway costs in development environments using CDK?
By default, the CDK Vpc construct deploys a NAT Gateway per Availability Zone, which can quickly inflate cloud bills. You can override this deterministic behavior by passing an environment config object that sets NAT Gateways to 0 in development while keeping them highly available in production. If you need help optimizing your cloud networking and reducing AWS costs, SoftwareCrafting services offers expert DevOps consulting to streamline your infrastructure.
📎 Full Code on GitHub Gist: The complete commands-1.sh from this post is available as a standalone GitHub Gist — copy, fork, or embed it directly.
