May 20, 2026
Three peerings, $10 a month: when VPC Peering beats Transit Gateway
We run a small AWS Organization. One ops/internal account holds a handful of shared internal services. Three workload accounts (dev, staging, prod) each have their own VPC and need to reach those services. Single region, us-east-1, three AZs each. The AWS reference architectures all point at Transit Gateway. I priced it out against our actual traffic and went with VPC Peering instead. Here’s the reasoning and the numbers.
The problem
Four AWS accounts under one Organization:
- ops-internal: shared services VPC, 172.16.0.0/16. Hosts a small set of internal tools: observability, secrets management, and an internal connectivity control plane. Three services in total.
- dev: 172.17.0.0/16
- staging: 172.18.0.0/16
- prod: 172.19.0.0/16
The workload VPCs need to reach the ops VPC. They do not need to reach each other — dev shouldn’t see prod, prod shouldn’t see dev. Hub-and-spoke, ops VPC at the centre.
Constraints:
- Small team. Whatever I pick, I’m the one maintaining it.
- Single region, no expansion plans for at least a year.
- Cost-sensitive. We’re a startup; every recurring line item on the AWS bill needs to justify itself.
- All four accounts already under AWS Organizations, so cross-account IAM is straightforward.
That’s the setup. The interesting decision is the connectivity layer.
The options I considered
| Option | What it is | Sweet spot | Cost model |
|---|---|---|---|
| VPC Peering | Direct 1:1 link between two VPCs | Few VPCs, no transitive routing | $0/hr, data transfer only |
| Transit Gateway | Regional router, hub-and-spoke | Many VPCs, transitive routing, central inspection | $0.05/hr per attachment + $0.02/GB |
| Site-to-Site VPN | IPsec tunnels | Hybrid (on-prem ↔ AWS) | $0.05/hr per connection + data out |
| PrivateLink | NLB-fronted service endpoints | Exposing specific services across accounts | $0.01/hr per endpoint per AZ + $0.01/GB |
| VPC Lattice | Application-layer service mesh | Many services, identity-aware auth | $0.025/hr per service + $0.025/GB |
| Userspace overlay (Tailscale-style) | WireGuard mesh on top of any underlay | App-layer connectivity for participating hosts | Free / self-hosted |
The overlay option is worth a separate note. A WireGuard mesh doesn’t replace VPC peering; it sits on top of whatever underlay you have, and only hosts that join the mesh can use it. For machine-to-machine traffic between EC2 instances that don’t run the agent — Prometheus scrapes, internal API calls — you still need VPC-level connectivity. Ruling it out as the only connectivity layer was easy: too many things would need to be mesh-aware.
Site-to-Site VPN is for hybrid (your data centre to AWS). Using it intra-AWS is overpriced and wrong-shaped — you’d pay for tunnels and customer-gateway operations to solve a problem AWS already solves with peering. I’m including it for completeness.
That leaves four serious contenders: Peering, TGW, PrivateLink, Lattice.
Why I picked VPC Peering
Four VPCs in hub-and-spoke means three peering connections. That’s it. The n*(n-1)/2 scaling warning everyone repeats only bites when every VPC needs to talk to every other VPC. In hub-and-spoke with one hub, it’s n-1.
Tradeoffs I knowingly accepted:
- No transitive routing. If dev ever needs to reach staging, I’d have to add a fourth peering or rethink. Today it doesn’t, and today’s problem is the only problem I’m solving.
- Manual route table entries on both sides. Both VPCs in a peering need explicit routes pointing at the pcx-* ID. Forget one side and you get a silent black hole (more below).
- CIDRs must not overlap. I planned the address space up front, as adjacent /16s under a /14 supernet, so this was a one-time cost.
- No central egress, inspection, or firewall hop. Acceptable for us; we don’t have a SecOps team mandating a transit inspection point.
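The non-overlapping address plan can be derived instead of hand-typed. A minimal sketch, assuming the /14 supernet is 172.16.0.0/14 (the block that contains the four /16s listed earlier); the local and output names are hypothetical:

```hcl
locals {
  # Assumed supernet: covers 172.16.0.0 through 172.19.255.255.
  supernet = "172.16.0.0/14"

  # Order decides which /16 each account receives.
  vpc_names = ["ops-internal", "dev", "staging", "prod"]

  # cidrsubnets() adds 2 bits per entry, carving the /14 into
  # four consecutive, non-overlapping /16 blocks.
  vpc_cidrs = zipmap(local.vpc_names, cidrsubnets(local.supernet, 2, 2, 2, 2))
}

output "vpc_cidrs" {
  value = local.vpc_cidrs
}
```

Because every block comes from one cidrsubnets() call, the ranges cannot overlap by construction, and the resulting map lines up with the account list above (ops-internal on 172.16.0.0/16 through prod on 172.19.0.0/16).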
The thing that tipped it: at four VPCs, TGW’s attachment fee alone runs ~$146/month before any data charge. Peering is $0/month at rest. The “TGW scales better” argument is true, but it’s not free — and we’re not at the scale where the operational simplicity is worth $146/month.
The setup walkthrough
Three Terraform highlights. The pattern repeats for each spoke.
Provider aliases for cross-account. The peering connection lives on the requester (ops) side; the accepter resource lives on the workload side. Both need explicit providers — my first cut had the aws_vpc_peering_connection_accepter running under the requester provider by mistake. Terragrunt happily applied; the accepter never ran:
```hcl
provider "aws" {
  alias  = "ops"
  region = "us-east-1"
  assume_role {
    role_arn = "arn:aws:iam::${var.ops_account_id}:role/TerragruntExec"
  }
}

provider "aws" {
  alias  = "workload"
  region = "us-east-1"
  assume_role {
    role_arn = "arn:aws:iam::${var.workload_account_id}:role/TerragruntExec"
  }
}
```
Requester side (ops account). Don’t set peer_region for same-region peerings — that argument is for inter-region only, and setting it switches AWS into cross-region mode (different pricing, different option-block rules). The auto_accept flag here only matters when both VPCs are in the same account; for cross-account you must declare a separate aws_vpc_peering_connection_accepter resource under the accepter’s provider:
```hcl
resource "aws_vpc_peering_connection" "ops_to_workload" {
  provider      = aws.ops
  vpc_id        = aws_vpc.ops.id
  peer_vpc_id   = var.workload_vpc_id
  peer_owner_id = var.workload_account_id
  auto_accept   = false # cross-account: acceptance happens in the accepter resource
  # no peer_region: same-region peering
  tags = { Name = "ops-to-${var.workload_env}" }
}
```
Accepter side (workload account):
```hcl
resource "aws_vpc_peering_connection_accepter" "workload_from_ops" {
  provider                  = aws.workload # must be the accepter account's provider
  vpc_peering_connection_id = aws_vpc_peering_connection.ops_to_workload.id
  auto_accept               = true
  tags                      = { Name = "from-ops" }
}
```
Route table entries — both sides:
```hcl
# Ops side: send traffic for the workload CIDR over the peering.
resource "aws_route" "ops_to_workload" {
  provider                  = aws.ops
  route_table_id            = aws_route_table.ops_private.id
  destination_cidr_block    = var.workload_vpc_cidr
  vpc_peering_connection_id = aws_vpc_peering_connection.ops_to_workload.id
}

# Workload side: the return path. Omit this and traffic is silently dropped.
resource "aws_route" "workload_to_ops" {
  provider                  = aws.workload
  route_table_id            = var.workload_private_rt_id
  destination_cidr_block    = aws_vpc.ops.cidr_block
  vpc_peering_connection_id = aws_vpc_peering_connection.ops_to_workload.id
}
```
The gotcha that ate 30 minutes of my life. First peering went ACTIVE. Security groups were configured. Traffic black-holed. No error, no ICMP unreachable, just silent packet drops. I’d forgotten the route table entry on the accepter side. The state of the peering and the state of routing are independent; AWS will happily report the peering healthy while none of the actual traffic has anywhere to go.
Second gotcha: DNS. Internal hostnames resolved to public IPs from the workload accounts until I enabled remote DNS resolution. For cross-account peerings the option has to be set from each side’s account: the requester can’t set the accepter-side flag, and vice versa. That means two separate resources, each under the right provider. Both point at the same pcx ID, but both should read it from the accepter resource: options can’t be set until the connection has been accepted, and that reference makes Terraform wait for the acceptance first:
```hcl
resource "aws_vpc_peering_connection_options" "requester" {
  provider = aws.ops
  # Read the ID via the accepter so this waits for the peering to be accepted.
  vpc_peering_connection_id = aws_vpc_peering_connection_accepter.workload_from_ops.id
  requester { allow_remote_vpc_dns_resolution = true }
}

resource "aws_vpc_peering_connection_options" "accepter" {
  provider                  = aws.workload
  vpc_peering_connection_id = aws_vpc_peering_connection_accepter.workload_from_ops.id
  accepter { allow_remote_vpc_dns_resolution = true }
}
```
In hindsight, I should have built a peering-pair module on day one — requester, accepter, both route entries, both DNS option blocks, all behind one set of inputs. I inlined the first because “it’s just one peering,” copy-pasted for the second and third, and now I’m three peerings in with the duplication still sitting there. Refactoring would mean a state-move dance I haven’t prioritised. Classic.
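For a sense of what that module's interface could look like, here is a minimal sketch. Everything in it is an assumption: the modules/peering-pair path, the input names, the aws.dev alias, and the var.dev_* variables are all hypothetical, and configuration_aliases needs Terraform 0.15 or later.

```hcl
# modules/peering-pair/versions.tf (hypothetical path)
terraform {
  required_providers {
    aws = {
      source = "hashicorp/aws"
      # The module expects the caller to pass in two aliased providers.
      configuration_aliases = [aws.ops, aws.workload]
    }
  }
}

# Root module: one call per spoke (dev shown; staging and prod repeat it).
module "peering_dev" {
  source = "./modules/peering-pair"

  providers = {
    aws.ops      = aws.ops
    aws.workload = aws.dev # hypothetical alias for the dev account
  }

  # Hub-side inputs (hypothetical names).
  ops_vpc_id        = aws_vpc.ops.id
  ops_vpc_cidr      = aws_vpc.ops.cidr_block
  ops_private_rt_id = aws_route_table.ops_private.id

  # Spoke-side inputs, mirroring the variables used in the snippets above.
  workload_env           = "dev"
  workload_account_id    = var.dev_account_id
  workload_vpc_id        = var.dev_vpc_id
  workload_vpc_cidr      = "172.17.0.0/16"
  workload_private_rt_id = var.dev_private_rt_id
}
```

The providers map on a module call is static, so a single for_each over all spokes can't hand each instance a different account's provider. One call per spoke is the straightforward shape.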
Cost comparison with real numbers
Scenario: 3 workload VPCs ↔ 1 shared services VPC, us-east-1, ~500 GB/month cross-VPC traffic, 3 AZs. All prices link to AWS pricing pages so you can re-derive when (not if) they change.
VPC Peering. $0/hr for the connection itself. Data transfer is the only line item. AWS bills cross-AZ transfer on both ends: $0.01/GB on the sender’s account (out) plus $0.01/GB on the receiver’s account (in), so $0.02 combined per GB transferred. At ~500 GB/month across all three peerings, that is ~$10/month. Lower if you pin services to AZs, but I’ll keep the worst case. (VPC pricing)
Transit Gateway. 4 attachments × $0.05/hr × 730 hr = $146/month for attachments alone. Plus $0.02/GB processed × 500 GB = $10. ~$156/month before anything else. The same ~$10 cross-AZ data transfer applies to TGW too — the real differential is the $146/month in attachment fees, not the data layer. (TGW pricing)
Site-to-Site VPN. $0.05/hr per connection × 730 hr × 3 connections = $109.50/month, plus data transfer out and the operational burden of customer gateways. ~$110/month and the wrong tool for intra-AWS. (VPN pricing)
PrivateLink. Per-endpoint pricing escalates fast at multiple services. Three services × 3 AZs × $0.01/hr × 730 = $65.70/month per consumer VPC. Three workload VPCs consuming = $197/month for endpoints alone, plus $0.01/GB × 500 GB = $5. ~$202/month. (PrivateLink pricing)
VPC Lattice. $0.025/hr per service × 3 services × 730 = $54.75/month, plus $0.025/GB × 500 GB = $12.50. ~$67/month. (Lattice pricing)
| Option | Monthly cost (this scenario) |
|---|---|
| VPC Peering | ~$10 |
| VPC Lattice | ~$67 |
| Site-to-Site VPN | ~$110 |
| Transit Gateway | ~$156 |
| PrivateLink | ~$202 |
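The arithmetic behind the table fits in a few locals, which makes it easy to re-run when prices change. A rough sketch using the rates quoted in this post (copied by hand, not fetched from AWS); the local and output names are hypothetical:

```hcl
locals {
  hours_per_month = 730
  traffic_gb      = 500 # assumed monthly cross-VPC traffic from the scenario

  monthly_cost_usd = {
    # Cross-AZ transfer only: $0.01 out + $0.01 in per GB.
    vpc_peering = local.traffic_gb * 0.02

    # 4 attachments at $0.05/hr, plus $0.02/GB processed.
    transit_gateway = 4 * 0.05 * local.hours_per_month + local.traffic_gb * 0.02

    # 3 connections at $0.05/hr; data transfer out not included.
    site_to_site_vpn = 3 * 0.05 * local.hours_per_month

    # 3 services x 3 AZs x 3 consumer VPCs at $0.01/hr, plus $0.01/GB.
    privatelink = 3 * 3 * 3 * 0.01 * local.hours_per_month + local.traffic_gb * 0.01

    # 3 services at $0.025/hr, plus $0.025/GB.
    vpc_lattice = 3 * 0.025 * local.hours_per_month + local.traffic_gb * 0.025
  }
}

output "monthly_cost_usd" {
  value = local.monthly_cost_usd
}
```

Applying a configuration that contains only this block prints the map, and the values should match the table up to rounding. Changing traffic_gb shows how little the ranking depends on data volume at this scale.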
Break-even with TGW. Data-transfer per GB is roughly the same either way (~$0.02/GB). The decision swings entirely on the attachment fee vs. the operational pain of n-1 peerings (hub-and-spoke) or n*(n-1)/2 (full mesh). My rule of thumb: above ~5 hub-and-spoke VPCs, or any full-mesh requirement, switch to TGW. Below that, peering wins on cost by an order of magnitude and the operational delta is “two more route table entries.”
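To see how the two sides of that trade-off grow, the connection counts and the TGW attachment fee can be tabulated per VPC count. A small sketch under the same $0.05/hr and 730 hr assumptions; the VPC counts and local names are illustrative only:

```hcl
locals {
  vpc_counts = [4, 6, 10] # total VPCs including the hub; illustrative values

  scaling = {
    for n in local.vpc_counts : tostring(n) => {
      hub_and_spoke_peerings = n - 1           # one peering per spoke
      full_mesh_peerings     = n * (n - 1) / 2 # every VPC to every other VPC
      tgw_attachment_usd     = n * 0.05 * 730  # monthly fee, one attachment per VPC
    }
  }
}

output "scaling" {
  value = local.scaling
}
```

At 4 VPCs this gives 3 hub-and-spoke peerings against 6 for a full mesh, with $146/month in TGW attachment fees. At 10 VPCs it is 9 against 45 peerings, and the attachment fee is $365/month.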
When you should NOT use peering
- More than ~5 VPCs. The n*(n-1)/2 curve makes management painful even in a hub-and-spoke if any spoke needs to reach another spoke.
- Transitive routing. Peerings don’t transit. If A↔B and B↔C, A still can’t reach C without A↔C. TGW solves this natively.
- Frequent CIDR changes. Every change is a coordination event across accounts.
- Central egress or inspection. If compliance requires an inspection VPC or central NAT, you need TGW or a transit VPC pattern.
- Cross-region at scale. Inter-region peering works, but data costs and operational overhead climb fast. TGW peering across regions is the better shape.
- Service-level access control instead of network-level. That’s PrivateLink or Lattice territory.
Takeaways
- Default-to-TGW is a cargo-cult choice for small teams. The AWS reference architectures are written for enterprises with dozens of accounts and dedicated network teams. You are (probably) not them.
- The “peering doesn’t scale” warning is real but mis-stated. It doesn’t scale past ~5 VPCs in a full mesh. At 4 VPCs in hub-and-spoke, it’s three connections — boring, cheap, done.
- Always compute connectivity cost on your actual traffic profile. The break-even point is dominated by attachment hours, not data — and AWS’s calculator defaults will steer you toward TGW even when you’d save $140/month with peering.
- Build the peering-pair as a reusable Terraform module on day one. Don’t inline the first one because “it’s just one.” I did, and I’m still paying for it in copy-paste.
- The silent failure modes are the real cost. Peering plus missing route table entries plus missing DNS option flags will burn an afternoon. Capture those in the module and never think about them again.