Enterprise Cloud Platform - Multi-Account AWS Foundation
Built 6-account AWS foundation with Transit Gateway, production EKS, and Terraform IaC. Reduced provisioning from 3 weeks to 2 days.
Project Overview
A mid-market media company (~250 engineers) needed to consolidate three separate platform teams, each managing their own AWS accounts with duplicated infrastructure. The company required a centralized cloud foundation that would enable teams to ship faster while maintaining security and compliance standards.
Key Stats:
- 6 AWS accounts under unified AWS Organizations
- 93% faster provisioning: 3 weeks → 2 days for new environments
- 50+ engineers actively using the platform daily
- Zero manual configuration: 100% Infrastructure as Code
- Cost visibility: Per-team billing transparency
The Challenge
Technical Debt & Duplication
Three platform teams (advertising, data, content) were independently managing AWS accounts:
- Duplicated infrastructure: Each team rebuilt VPCs, security baselines, CI/CD pipelines
- No cross-account networking: Workloads couldn't communicate across accounts
- Manual configuration: Console-driven changes led to drift and inconsistencies
- 3-week provisioning time: New environments required extensive manual setup
- Security gaps: Inconsistent baselines across accounts
Business Impact
- Engineering teams blocked waiting for infrastructure
- Security team manually auditing each account
- No centralized cost visibility
- Compliance challenges (SOC 2 preparation)
Constraints
- Must support existing workloads: Zero disruption to production services
- Team autonomy: Platform teams needed ownership of their infrastructure
- Budget conscious: Solution must be cost-effective
- 6-month timeline: Hard deadline for SOC 2 audit
The Solution
Architecture
Built a multi-account AWS foundation following best practices from the AWS Well-Architected Framework:
Account Structure:
AWS Organizations
├── Management Account (billing, IAM roles, org policies)
├── Security-Prod (CloudTrail, GuardDuty, Security Hub)
├── Infrastructure-Prod (Transit Gateway, shared DNS, networking)
├── Workloads-Dev (sandbox environment)
├── Workloads-NonProd (testing environment)
└── Workloads-Prod (production workloads)
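A minimal Terraform sketch of how member accounts like these might be declared. The account names come from the structure above; the email addresses, the single `platform` OU, and the resource names are placeholders for illustration, not the project's actual layout.

```hcl
# Hypothetical sketch: member accounts created under an existing AWS Organization.
# Emails and the OU name are assumptions; only the account names come from the text.
locals {
  member_accounts = {
    "Security-Prod"       = "aws+security-prod@example.com"
    "Infrastructure-Prod" = "aws+infrastructure-prod@example.com"
    "Workloads-Dev"       = "aws+workloads-dev@example.com"
    "Workloads-NonProd"   = "aws+workloads-nonprod@example.com"
    "Workloads-Prod"      = "aws+workloads-prod@example.com"
  }
}

# Look up the organization that the management account already owns.
data "aws_organizations_organization" "this" {}

resource "aws_organizations_organizational_unit" "platform" {
  name      = "platform" # assumed OU name
  parent_id = data.aws_organizations_organization.this.roots[0].id
}

# One account per map entry, all placed under the same OU for simplicity.
resource "aws_organizations_account" "member" {
  for_each  = local.member_accounts
  name      = each.key
  email     = each.value
  parent_id = aws_organizations_organizational_unit.platform.id
}
```

Driving the accounts from a map keeps adding a seventh account to a one-line change. A real layout would likely split security, infrastructure, and workload accounts into separate OUs so SCPs can differ per group.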
Networking: Transit Gateway Hub-and-Spoke
- Central Transit Gateway in Infrastructure-Prod account
- All VPCs attach via TGW (no peering mesh complexity)
- Consistent CIDR allocation across accounts
- Public subnets (ALBs, NAT gateways), Private subnets (EKS nodes, apps)
- Cost: ~$0.02/GB cross-account traffic (operational simplicity worth the cost)
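The hub side of this design could look roughly like the following. This is a sketch under assumptions: the variable name, share name, and auto-accept settings are illustrative, and sharing with an organization ARN assumes AWS RAM sharing with AWS Organizations is enabled.

```hcl
# Hypothetical sketch: central Transit Gateway in the Infrastructure-Prod account,
# shared with the rest of the organization through AWS RAM.
variable "organization_arn" {
  type        = string
  description = "ARN of the AWS Organization allowed to attach to the TGW (assumed input)"
}

resource "aws_ec2_transit_gateway" "hub" {
  description                     = "Central hub for all workload VPCs"
  auto_accept_shared_attachments  = "enable" # spokes attach without manual approval
  default_route_table_association = "enable"
  default_route_table_propagation = "enable"
}

resource "aws_ram_resource_share" "tgw" {
  name                      = "transit-gateway" # assumed share name
  allow_external_principals = false             # organization members only
}

resource "aws_ram_resource_association" "tgw" {
  resource_arn       = aws_ec2_transit_gateway.hub.arn
  resource_share_arn = aws_ram_resource_share.tgw.arn
}

resource "aws_ram_principal_association" "org" {
  principal          = var.organization_arn
  resource_share_arn = aws_ram_resource_share.tgw.arn
}
```

Each spoke account would then create an `aws_ec2_transit_gateway_vpc_attachment` that references the shared gateway's ID and its private subnets. That is the step that replaces a peering mesh with a single attachment per VPC.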
Security Baseline (Applied to All Accounts):
- Service Control Policies (SCPs): Deny unauthorized regions, enforce MFA, prevent S3 public access
- IAM Roles: 7 standard roles per account (Admin, Developer, DataEngineer, NetworkAdmin, SecurityAuditor, ReadOnly, GitHubActions)
- IAM Identity Center (SSO): Okta integration, no shared credentials
- CloudTrail: Centralized logging to Security-Prod account (90-day retention, encrypted)
- GuardDuty: Threat detection in all accounts, findings to Slack #aws-infra-alerts
- Security Hub: CIS benchmark scanning, AWS best practices compliance
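As one illustration of the baseline, a region-restricting SCP might be written as below. The allowed-region default, the exempted global services, and the `target_id` variable are assumptions for the sketch, not the policy actually deployed.

```hcl
# Hypothetical sketch: SCP that denies requests outside approved regions.
variable "allowed_regions" {
  type    = list(string)
  default = ["us-east-1"] # assumed; the source only shows us-east-1 for state storage
}

variable "target_id" {
  type        = string
  description = "Organization root, OU, or account ID to attach the SCP to (assumed input)"
}

data "aws_iam_policy_document" "deny_unapproved_regions" {
  statement {
    sid    = "DenyUnapprovedRegions"
    effect = "Deny"

    # Global services are exempted so they keep working from any region (illustrative list).
    not_actions = ["iam:*", "organizations:*", "route53:*", "cloudfront:*", "support:*", "sts:*"]
    resources   = ["*"]

    condition {
      test     = "StringNotEquals"
      variable = "aws:RequestedRegion"
      values   = var.allowed_regions
    }
  }
}

resource "aws_organizations_policy" "deny_unapproved_regions" {
  name    = "deny-unapproved-regions"
  type    = "SERVICE_CONTROL_POLICY"
  content = data.aws_iam_policy_document.deny_unapproved_regions.json
}

resource "aws_organizations_policy_attachment" "this" {
  policy_id = aws_organizations_policy.deny_unapproved_regions.id
  target_id = var.target_id
}
```

Attaching at the OU or root level means new accounts inherit the restriction automatically, which is what makes the baseline consistent without per-account work.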
Implementation Highlights
Phase 1: Foundation (Weeks 1-4)
- Designed multi-account strategy and account structure
- Created AWS Organizations with SCPs
- Deployed Transit Gateway networking hub
- Established security baseline (CloudTrail, GuardDuty, Security Hub)
- Configured IAM Identity Center with Okta SSO
Phase 2: Infrastructure as Code (Weeks 5-12)
- Multi-repo structure for ownership clarity:
  - ecp-ou-structure: AWS Org, IAM roles, SCPs
  - ecp-network: VPCs, Transit Gateway, route tables
  - ecp-security: CloudTrail, GuardDuty, WAF
  - Team-specific repos for workload infrastructure
- Terragrunt for DRY configuration (reduced boilerplate by 90%)
- GitHub Actions CI/CD with OIDC (no long-lived access keys)
- Branch protection and code owner approval enforced via Terraform
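Branch protection managed through Terraform could be sketched with the GitHub provider as follows. The repository name is borrowed from the OIDC example elsewhere in this write-up; the review count and admin enforcement are assumed settings.

```hcl
# Hypothetical sketch: protect main and require code owner review, using the
# integrations/github provider. Specific settings are assumptions.
terraform {
  required_providers {
    github = {
      source = "integrations/github"
    }
  }
}

data "github_repository" "network" {
  full_name = "company/ecp-network"
}

resource "github_branch_protection" "main" {
  repository_id  = data.github_repository.network.node_id
  pattern        = "main"
  enforce_admins = true # admins follow the same rules

  required_pull_request_reviews {
    require_code_owner_reviews      = true
    required_approving_review_count = 1 # assumed minimum
  }
}
```

Keeping these rules in code means a new repository gets the same protections as the existing ones, and any loosening of the rules shows up in a pull request.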
Phase 3: Platform Services (Weeks 13-20)
- Deployed 2 production-ready EKS clusters (nonprod + prod)
- Implemented autoscaling, monitoring (CloudWatch + Prometheus)
- Created reusable Terraform modules for common patterns
- Documented runbooks and operational procedures
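A stripped-down sketch of what an EKS cluster with an autoscaling-capable managed node group might look like. The Kubernetes version matches the one listed under Technologies Used; the variables, node group name, and scaling bounds are assumptions, and the IAM roles are taken as inputs rather than defined here.

```hcl
# Hypothetical sketch: EKS cluster plus one managed node group in private subnets.
variable "cluster_name" { type = string }
variable "cluster_role_arn" { type = string }           # IAM role for the control plane (assumed input)
variable "node_role_arn" { type = string }              # IAM role for worker nodes (assumed input)
variable "private_subnet_ids" { type = list(string) }

resource "aws_eks_cluster" "this" {
  name     = var.cluster_name
  role_arn = var.cluster_role_arn
  version  = "1.28"

  vpc_config {
    subnet_ids = var.private_subnet_ids
  }
}

resource "aws_eks_node_group" "default" {
  cluster_name    = aws_eks_cluster.this.name
  node_group_name = "default"
  node_role_arn   = var.node_role_arn
  subnet_ids      = var.private_subnet_ids

  # Bounds an autoscaler can work within; the numbers are placeholders.
  scaling_config {
    min_size     = 2
    desired_size = 2
    max_size     = 6
  }
}
```

Wrapped in a reusable module, the same definition could serve both the nonprod and prod clusters with different inputs.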
Phase 4: Migration & Training (Weeks 21-24)
- Migrated existing workloads from legacy accounts
- Trained platform teams on new workflows
- Established on-call rotation and incident response
- Conducted security audit and remediation
Technologies Used
- Cloud Platform: AWS (Organizations, Transit Gateway, EKS, VPC, IAM, CloudTrail, GuardDuty, Security Hub)
- Infrastructure as Code: Terraform, Terragrunt
- CI/CD: GitHub Actions with OIDC authentication
- Container Orchestration: EKS (Kubernetes 1.28)
- Monitoring: CloudWatch, Prometheus, Grafana
- Security: AWS IAM Identity Center, Okta SSO, GuardDuty
Results & Impact
Measurable Outcomes
- Provisioning Time: Reduced from 3 weeks to 2 days (93% improvement)
- Security Baseline Deployment: From 2 days to 1 hour (95% improvement)
- Cross-Account Networking: From 1 week to 30 minutes (99% improvement)
- Team Velocity: AdStack team deployed nonprod EKS cluster in 3 days (previously 3 weeks)
- Cost Visibility: Per-team billing enabled targeted cost optimization
- Security Posture: 100% compliance with CIS AWS Foundations Benchmark
- Platform Adoption: 50+ engineers actively using the platform within 2 months
Operational Benefits
- Self-service infrastructure: Platform teams deploy without central bottlenecks
- Automated security: Consistent baselines across all accounts
- Zero manual configuration: 100% Infrastructure as Code
- Real-time threat detection: GuardDuty findings routed to Slack
- Disaster recovery ready: All infrastructure reproducible from code
Client Testimonial
"Glenn architected and delivered our multi-account AWS foundation in 6 months. We went from teams independently rebuilding infrastructure to a shared platform with automated deployments. Provisioning dropped from 3 weeks to 2 days, and our security posture is audit-ready. The Terraform codebase is clean, documented, and our teams actually understand it."
- VP of Engineering
Key Takeaways
What Worked
- Multi-repo structure: Clear ownership, no merge conflicts, and real autonomy for each team
- Terragrunt: Reduced boilerplate by 90%, made infrastructure maintainable
- Transit Gateway: Operational simplicity worth the $0.02/GB cost
- OIDC authentication: No key rotation, better security than long-lived access keys
- Early security baseline: Automated compliance from day one
What I'd Do Differently
- Start with monorepo, then split: Multi-repo from day one added complexity; initial MVP in monorepo would've been faster
- Renovate bot earlier: Coordinating Terraform upgrades across 6 repos is manual and painful
- Cost dashboards on day one: Teams couldn't optimize costs they couldn't see
Lessons Learned
- Ownership drives adoption: Teams embraced infrastructure they controlled
- Automation compounds: Time invested in IaC pays dividends every deployment
- Security can be fast: Automated baselines are faster than manual audits
- Communication is critical: Weekly demos kept stakeholders aligned
Technical Deep Dive
OIDC Authentication (GitHub Actions → AWS)
The traditional approach stores AWS access keys in GitHub Secrets, which is a security risk if they leak. We implemented OIDC trust relationships instead:
    # IAM role trusts GitHub OIDC provider
    # Only workflows running on the main branch of this repo may assume the role
    condition {
      test     = "StringEquals"
      variable = "token.actions.githubusercontent.com:sub"
      values   = ["repo:company/ecp-network:ref:refs/heads/main"]
    }
Benefits:
- Credentials scoped to specific repo + branch
- Temporary credentials (expire automatically)
- No keys to rotate or leak
- Auditable via CloudTrail
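For context, the condition shown above would sit inside a trust policy along these lines. This is a sketch, not the project's actual code: it assumes the GitHub OIDC provider is already registered in the account, reuses the GitHubActions role name from the standard role list, and adds an audience check that the original snippet does not show.

```hcl
# Hypothetical sketch: full trust policy for a role assumed by GitHub Actions via OIDC.
# Assumes the OIDC provider already exists in this account.
data "aws_iam_openid_connect_provider" "github" {
  url = "https://token.actions.githubusercontent.com"
}

data "aws_iam_policy_document" "github_trust" {
  statement {
    effect  = "Allow"
    actions = ["sts:AssumeRoleWithWebIdentity"]

    principals {
      type        = "Federated"
      identifiers = [data.aws_iam_openid_connect_provider.github.arn]
    }

    # Token must be issued for AWS STS (added here as an assumption).
    condition {
      test     = "StringEquals"
      variable = "token.actions.githubusercontent.com:aud"
      values   = ["sts.amazonaws.com"]
    }

    # Token must come from the main branch of one specific repository.
    condition {
      test     = "StringEquals"
      variable = "token.actions.githubusercontent.com:sub"
      values   = ["repo:company/ecp-network:ref:refs/heads/main"]
    }
  }
}

resource "aws_iam_role" "github_actions" {
  name               = "GitHubActions"
  assume_role_policy = data.aws_iam_policy_document.github_trust.json
}
```

Using `StringEquals` on the `sub` claim rather than a wildcard match is what pins the role to a single repo and branch. Loosening it to `StringLike` with `repo:company/*` would let any repository in the organization assume the role.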
Terragrunt DRY Configuration
Eliminated repeated backend/provider configuration across stacks:
    # Root terragrunt.hcl (inherited by all stacks)
    remote_state {
      backend = "s3"
      config = {
        # One state bucket per AWS account, one state key per stack directory
        bucket         = "terraform-state-${get_aws_account_id()}"
        key            = "${path_relative_to_include()}/terraform.tfstate"
        region         = "us-east-1"
        encrypt        = true
        dynamodb_table = "terraform-locks"
      }
    }
Impact: Each stack went from 100+ lines of boilerplate to 10-20 lines of actual configuration.
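A child stack that inherits the root configuration might then look like this. The file path, module source, and inputs are hypothetical; only the include of the root `terragrunt.hcl` follows from the text above.

```hcl
# Hypothetical child stack, e.g. network/vpc/terragrunt.hcl.
# Backend settings are inherited from the root file; nothing is repeated here.
include "root" {
  path = find_in_parent_folders()
}

terraform {
  source = "../../modules/vpc" # assumed module location
}

# Only the values that differ per stack live in the child file.
inputs = {
  cidr_block = "10.1.0.0/16" # placeholder CIDR
  name       = "workloads-dev"
}
```

Because `path_relative_to_include()` in the root file resolves to this stack's directory, each stack gets its own state key without declaring one.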
Need similar multi-account AWS architecture? Schedule a free consultation to discuss your platform challenges.
Technologies: AWS Organizations | Transit Gateway | EKS | Terraform | Terragrunt | GitHub Actions | CloudTrail | GuardDuty | Security Hub
Working on a similar challenge?
Multi-account AWS architecture, container migration, Terraform adoption: this is the work I do as a fractional engagement.