Lessons Learned & Best Practices

Lessons Learned & Best Practices

What Worked Well ✅

1. Multi-Stack Pattern

Success: Significantly reduced deployment time and risk.

Evidence:

  • Service deployments: 2-3 minutes (vs 15-20 minutes monolithic)
  • Bug fixes deployed without affecting other services
  • Teams worked independently without conflicts

Recommendation: Use multi-stack for any microservices architecture.

2. Cross-Stack References

Success: Type-safe dependencies between stacks.

Evidence:

  • CloudFormation validates imports at deployment time
  • No hardcoded ARNs or names in code
  • Easy to track dependencies

Recommendation: Always use exports/imports over hardcoding.

3. Bash Scripts

Success: Simple but effective orchestration.

Evidence:

  • Easy to understand and modify
  • No additional tools required
  • Works in any CI/CD system

Recommendation: Start with bash, migrate to sophisticated tools only if needed.

4. SAM Simplification

Success: Reduced boilerplate significantly.

Evidence:

  • Lambda + API Gateway: 10 lines (vs 50+ lines pure CloudFormation)
  • Automatic IAM role generation
  • Built-in best practices

Recommendation: Use SAM for all serverless applications.

5. Environment Separation

Success: Clean separation via parameters.

Evidence:

  • Same templates for staging and prod
  • Easy to add new environments
  • No code duplication

Recommendation: Always use parameters for environment-specific config.


Challenges Faced ⚠️

1. Learning Curve

Challenge: Initial learning curve for cross-stack references.

Impact: First deployment took 2 days to get right.

Solution:

  • Created documentation with examples
  • Established naming conventions
  • Built reusable templates

Lesson: Invest time in documentation upfront.

2. Debugging CloudFormation Errors

Challenge: CloudFormation error messages can be cryptic.

Impact: Spent hours debugging “Resource failed to stabilize”.

Solution:

  • Check CloudWatch Logs immediately
  • Use aws cloudformation describe-stack-events
  • Enable detailed logging in Lambda

Lesson: Always check CloudWatch Logs first.

3. Stack Deletion Dependencies

Challenge: Cannot delete core stack while services are running.

Impact: Manual cleanup required in specific order.

Solution:

# Created cleanup script
./scripts/cleanup.sh staging

# Deletes in correct order:
# 1. Service stacks
# 2. Core stack
# 3. Deployment bucket

Lesson: Plan deletion strategy from the start.

4. Export Naming

Challenge: Changed export name, broke all importing stacks.

Impact: Had to update all service stacks simultaneously.

Solution:

  • Established naming convention early
  • Documented all exports
  • Added validation in deployment scripts

Lesson: Plan export names carefully - they’re hard to change.

5. Sequential Deployment

Challenge: Sequential deployment slow for many services.

Impact: Full deployment took 15+ minutes.

Solution:

  • Implemented parallel service deployment
  • Reduced to 5 minutes total

Lesson: Parallelize independent operations.


Best Practices 🎯

1. Export Naming Convention

# ✅ Good: Include stack name
Export:
  Name: !Sub '${AWS::StackName}-ResourceName'

# ❌ Bad: Generic name
Export:
  Name: ResourceName

2. Stack Separation

✅ Core Stack:
- DynamoDB Tables
- S3 Buckets
- Cognito User Pools
- VPC/Networking

✅ Service Stacks:
- Lambda Functions
- API Gateway
- SQS Queues
- SNS Topics

3. Template Validation

# Always validate before deploy
aws cloudformation validate-template \
    --template-body file://template.yaml

# Use linting tools
cfn-lint template.yaml

4. Error Handling

# Use set -e in all scripts
set -e
set -o pipefail

# Add cleanup trap
trap cleanup ERR EXIT

5. Documentation

## Stack Exports
- `{StackName}-ArticlesTableName`: DynamoDB table name
- `{StackName}-ArticlesTableArn`: DynamoDB table ARN

## Deployment
```bash
./deploy.sh staging us-east-1

Rollback

./rollback.sh staging

#### 6. Testing Strategy
```bash
# Test in staging first
./deploy.sh staging

# Verify functionality
./test.sh staging

# Deploy to production
./deploy.sh prod

7. Rollback Plan

# Always have rollback ready
git tag v1.2.3
git push --tags

# If issues:
git checkout v1.2.2
./deploy.sh prod

Recommendations for Improvements 🚀

1. CI/CD Integration

Current: Manual deployment via scripts.

Improvement: Automate with GitHub Actions/GitLab CI.

Example:

# .github/workflows/deploy.yml
name: Deploy
on:
  push:
    branches: [main]

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Deploy to staging
        run: ./deploy.sh staging

Benefit: Automatic deployment on merge.

2. Parallel Service Deployment

Current: Sequential service deployment.

Improvement: Deploy services in parallel.

Example:

for service in "${SERVICES[@]}"; do
    ./deploy-service.sh $service $ENV &
done
wait

Benefit: 3x faster deployment.

3. Parameter Store for Secrets

Current: Secrets in parameter files (not committed).

Improvement: Use AWS Systems Manager Parameter Store.

Example:

Parameters:
  DatabasePassword:
    Type: AWS::SSM::Parameter::Value<String>
    Default: /travelguide/staging/db-password

Benefit: Centralized secret management.

4. CloudWatch Dashboards

Current: Manual monitoring via console.

Improvement: Automated dashboards for stack health.

Example:

Dashboard:
  Type: AWS::CloudWatch::Dashboard
  Properties:
    DashboardBody: !Sub |
      {
        "widgets": [
          {
            "type": "metric",
            "properties": {
              "metrics": [
                ["AWS/Lambda", "Errors", {"stat": "Sum"}]
              ]
            }
          }
        ]
      }

Benefit: Proactive monitoring.

5. CloudFormation Guard

Current: Manual policy validation.

Improvement: Automated policy enforcement.

Example:

# rules/security.guard
AWS::S3::Bucket {
  Properties.PublicAccessBlockConfiguration exists
  Properties.BucketEncryption exists
}

Benefit: Prevent security misconfigurations.

6. Auto-Generated Documentation

Current: Manual documentation.

Improvement: Generate docs from templates.

Example:

# Use cfn-diagram
cfn-diagram template.yaml > architecture.png

Benefit: Always up-to-date documentation.


Key Takeaways 📝

  1. Multi-stack pattern is essential for microservices
  2. Cross-stack references provide type-safe dependencies
  3. Bash scripts are sufficient for orchestration
  4. SAM significantly reduces boilerplate
  5. Documentation is crucial for team success
  6. Testing in staging prevents production issues
  7. Rollback plan is mandatory
  8. CI/CD integration should be next step
  9. Parallel deployment speeds up process
  10. Security should be automated with Guard rules

Comparison with Alternatives

AspectCloudFormation/SAMTerraformAWS CDKPulumi
Our Experience✅ PositiveN/AN/AN/A
Learning CurveMediumMediumHighHigh
AWS IntegrationExcellentGoodExcellentGood
State ManagementAutomaticManualAutomaticCloud
Multi-CloudNoYesNoYes
CostFreeFreeFreeFree
Recommendation✅ For AWS-onlyFor multi-cloudFor complex logicFor full language power

Final Thoughts

The multi-stack pattern with CloudFormation/SAM proved to be the right choice for our Travel Guide Application. While there were challenges, the benefits far outweighed the costs:

Wins:

  • ✅ Faster deployments (2-3 min vs 15-20 min)
  • ✅ Reduced blast radius
  • ✅ Independent team workflows
  • ✅ Easy rollbacks
  • ✅ Cost attribution per service

Areas for Improvement:

  • 🔄 CI/CD automation
  • 🔄 Parallel deployments
  • 🔄 Centralized secret management
  • 🔄 Automated monitoring

Would we do it again? Absolutely yes! ✅