Database and Cache Layer

This chapter describes how to configure and manage the database and cache layers in production, covering RDS PostgreSQL, ElastiCache Redis, DynamoDB, and S3.

Architecture Design

Data Layer Architecture

┌────────────────────────────────────────────────────┐
│              Application Layer (EKS)               │
│  ┌────────────┐  ┌────────────┐  ┌─────────────┐   │
│  │User Service│  │Product Svc │  │Order Service│   │
│  └─────┬──────┘  └─────┬──────┘  └─────┬───────┘   │
└────────┼───────────────┼───────────────┼───────────┘
         │               │               │
         ↓               ↓               ↓
┌────────────────────────────────────────────────────┐
│                    Cache Layer                     │
│  ┌──────────────────────────────────────────────┐  │
│  │  ElastiCache Redis Cluster (Multi-AZ)        │  │
│  │  ├─ Primary: us-east-1a                      │  │
│  │  ├─ Replica: us-east-1b                      │  │
│  │  └─ Replica: us-east-1c                      │  │
│  └──────────────────────────────────────────────┘  │
└────────────────────────────────────────────────────┘
         │
         ↓ (Cache Miss)
┌────────────────────────────────────────────────────┐
│              Relational Database Layer             │
│  ┌──────────────────────┐  ┌──────────────────┐    │
│  │ RDS PostgreSQL       │  │ RDS PostgreSQL   │    │
│  │ (User Database)      │  │ (Order Database) │    │
│  │ ├─ Primary (1a)      │  │ ├─ Primary (1b)  │    │
│  │ ├─ Standby (1b)      │  │ ├─ Standby (1c)  │    │
│  │ └─ Read Replica (1c) │  │ └─ Read Replica  │    │
│  └──────────────────────┘  └──────────────────┘    │
└────────────────────────────────────────────────────┘
         │
         ↓ (Analytical Queries)
┌────────────────────────────────────────────────────┐
│                   NoSQL Database                   │
│  ┌──────────────────────────────────────────────┐  │
│  │  DynamoDB (Sessions, Events)                 │  │
│  │  ├─ Global Tables (Multi-Region)             │  │
│  │  └─ Auto Scaling                             │  │
│  └──────────────────────────────────────────────┘  │
└────────────────────────────────────────────────────┘
         │
         ↓ (Object Storage)
┌────────────────────────────────────────────────────┐
│                   Object Storage                   │
│  ┌──────────────────────────────────────────────┐  │
│  │  S3 (Static Assets, Backups, Logs)           │  │
│  │  ├─ Versioning Enabled                       │  │
│  │  ├─ Lifecycle Policies                       │  │
│  │  └─ Cross-Region Replication                 │  │
│  └──────────────────────────────────────────────┘  │
└────────────────────────────────────────────────────┘
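
The read path in the diagram is the cache-aside pattern: a service checks Redis first, falls back to PostgreSQL on a miss, then populates the cache with a TTL. A minimal sketch of that flow, with plain dicts standing in for the Redis and PostgreSQL clients (`fetch_user`, the key layout, and the 5-minute TTL are illustrative assumptions):

```python
import time

# Stand-ins for the real Redis and PostgreSQL clients
cache = {}                       # key -> (value, expires_at)
database = {"user:42": {"id": 42, "name": "Alice"}}

CACHE_TTL_SECONDS = 300          # keep cached entries for 5 minutes

def fetch_user(user_id):
    """Cache-aside read: try the cache, fall back to the database on a miss."""
    key = f"user:{user_id}"
    entry = cache.get(key)
    if entry is not None and entry[1] > time.time():
        return entry[0]                       # cache hit
    value = database.get(key)                 # cache miss: query the DB
    if value is not None:
        cache[key] = (value, time.time() + CACHE_TTL_SECONDS)
    return value

print(fetch_user(42))   # miss: loads from the DB and populates the cache
print(fetch_user(42))   # hit: served from the cache
```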

RDS PostgreSQL Deployment

Subnet Group Configuration

#!/bin/bash
# create-rds-subnet-group.sh

source vpc-config.sh

REGION="us-east-1"
SUBNET_GROUP_NAME="production-rds-subnet-group"

echo "Creating RDS subnet group..."

# Create the subnet group
aws rds create-db-subnet-group \
  --db-subnet-group-name $SUBNET_GROUP_NAME \
  --db-subnet-group-description "Production RDS subnet group" \
  --subnet-ids $PRIVATE_DB_SUBNET_1A $PRIVATE_DB_SUBNET_1B $PRIVATE_DB_SUBNET_1C \
  --tags \
    Key=Environment,Value=production \
    Key=ManagedBy,Value=script \
  --region $REGION

echo "✓ RDS subnet group created"
echo "export RDS_SUBNET_GROUP_NAME=$SUBNET_GROUP_NAME" >> rds-config.sh

Parameter Group Configuration

#!/bin/bash
# create-rds-parameter-group.sh

REGION="us-east-1"
PG_VERSION="15"
PG_NAME="production-postgres15"

echo "Creating RDS parameter group..."

# Create the parameter group
aws rds create-db-parameter-group \
  --db-parameter-group-name $PG_NAME \
  --db-parameter-group-family postgres15 \
  --description "Production PostgreSQL 15 parameters" \
  --region $REGION

# Configure parameters
echo "Configuring parameters..."

# Connections and memory.
# shared_buffers and effective_cache_size are expressed in 8 kB pages while
# DBInstanceClassMemory is in bytes, hence the /32768 divisor (25% and 75% of
# instance memory); work_mem and maintenance_work_mem are in kB.
aws rds modify-db-parameter-group \
  --db-parameter-group-name $PG_NAME \
  --parameters \
    "ParameterName=max_connections,ParameterValue=1000,ApplyMethod=pending-reboot" \
    "ParameterName=shared_buffers,ParameterValue={DBInstanceClassMemory/32768},ApplyMethod=pending-reboot" \
    "ParameterName=effective_cache_size,ParameterValue={DBInstanceClassMemory*3/32768},ApplyMethod=immediate" \
    "ParameterName=maintenance_work_mem,ParameterValue=2097152,ApplyMethod=immediate" \
    "ParameterName=work_mem,ParameterValue=10485,ApplyMethod=immediate" \
  --region $REGION

# Logging configuration
aws rds modify-db-parameter-group \
  --db-parameter-group-name $PG_NAME \
  --parameters \
    "ParameterName=log_min_duration_statement,ParameterValue=1000,ApplyMethod=immediate" \
    "ParameterName=log_connections,ParameterValue=1,ApplyMethod=immediate" \
    "ParameterName=log_disconnections,ParameterValue=1,ApplyMethod=immediate" \
    "ParameterName=log_lock_waits,ParameterValue=1,ApplyMethod=immediate" \
    "ParameterName=log_statement,ParameterValue=ddl,ApplyMethod=immediate" \
  --region $REGION

# Performance tuning
aws rds modify-db-parameter-group \
  --db-parameter-group-name $PG_NAME \
  --parameters \
    "ParameterName=random_page_cost,ParameterValue=1.1,ApplyMethod=immediate" \
    "ParameterName=effective_io_concurrency,ParameterValue=200,ApplyMethod=immediate" \
    "ParameterName=checkpoint_completion_target,ParameterValue=0.9,ApplyMethod=immediate" \
  --region $REGION

echo "✓ Parameter group configured"
echo "export RDS_PARAMETER_GROUP=$PG_NAME" >> rds-config.sh
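
A note on units before moving on: shared_buffers and effective_cache_size are counted in 8 kB pages, work_mem and maintenance_work_mem in kB, and the DBInstanceClassMemory formula variable in bytes. A quick sanity check of what such values work out to, assuming a db.r6g.xlarge with 32 GiB of memory and the common 25%/75% sizing guidance:

```python
GIB = 1024 ** 3
KB = 1024

instance_memory = 32 * GIB          # db.r6g.xlarge has 32 GiB (assumed for illustration)

# shared_buffers / effective_cache_size are set in 8 kB pages:
# bytes // 32768 pages == 25% of memory, bytes * 3 // 32768 == 75%
shared_buffers_bytes = (instance_memory // 32768) * 8 * KB
effective_cache_bytes = (instance_memory * 3 // 32768) * 8 * KB

# work_mem / maintenance_work_mem are set in kB
work_mem_bytes = 10485 * KB                   # ~10 MiB per sort/hash operation
maintenance_work_mem_bytes = 2097152 * KB     # 2 GiB for VACUUM, CREATE INDEX, ...

print(shared_buffers_bytes / GIB)        # 8.0  (25% of 32 GiB)
print(effective_cache_bytes / GIB)       # 24.0 (75% of 32 GiB)
print(maintenance_work_mem_bytes / GIB)  # 2.0
```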

Multi-AZ RDS Instance Creation

#!/bin/bash
# create-rds-instance.sh

source vpc-config.sh
source sg-config.sh
source rds-config.sh

REGION="us-east-1"
DB_INSTANCE_ID="production-postgres-users"
DB_NAME="users_db"
MASTER_USERNAME="postgres"
MASTER_PASSWORD="$(aws secretsmanager get-random-password \
  --password-length 32 \
  --exclude-characters "/@\"'\\" \
  --require-each-included-type \
  --query 'RandomPassword' \
  --output text)"

echo "================================================"
echo "Creating RDS PostgreSQL instance"
echo "Instance ID: $DB_INSTANCE_ID"
echo "================================================"

# Save the password to Secrets Manager
echo ""
echo "1. Saving database credentials to Secrets Manager..."
aws secretsmanager create-secret \
  --name production/database/users \
  --description "Production users database credentials" \
  --secret-string "{
    \"username\": \"$MASTER_USERNAME\",
    \"password\": \"$MASTER_PASSWORD\",
    \"engine\": \"postgres\",
    \"host\": \"pending\",
    \"port\": 5432,
    \"dbname\": \"$DB_NAME\"
  }" \
  --region $REGION

echo "   ✓ Credentials saved"

# Create the RDS instance
echo ""
echo "2. Creating RDS instance (takes roughly 10-15 minutes)..."
aws rds create-db-instance \
  --db-instance-identifier $DB_INSTANCE_ID \
  --db-instance-class db.r6g.xlarge \
  --engine postgres \
  --engine-version 15.4 \
  --master-username $MASTER_USERNAME \
  --master-user-password "$MASTER_PASSWORD" \
  --allocated-storage 100 \
  --storage-type gp3 \
  --iops 3000 \
  --storage-throughput 125 \
  --db-name $DB_NAME \
  --db-subnet-group-name $RDS_SUBNET_GROUP_NAME \
  --vpc-security-group-ids $RDS_SG_ID \
  --db-parameter-group-name $RDS_PARAMETER_GROUP \
  --backup-retention-period 30 \
  --preferred-backup-window "03:00-04:00" \
  --preferred-maintenance-window "mon:04:00-mon:05:00" \
  --multi-az \
  --auto-minor-version-upgrade \
  --no-publicly-accessible \
  --storage-encrypted \
  --kms-key-id alias/aws/rds \
  --enable-cloudwatch-logs-exports '["postgresql","upgrade"]' \
  --enable-performance-insights \
  --performance-insights-retention-period 7 \
  --deletion-protection \
  --copy-tags-to-snapshot \
  --tags \
    Key=Environment,Value=production \
    Key=Service,Value=users \
    Key=ManagedBy,Value=script \
  --region $REGION

echo "   ✓ Creation request submitted"

# Wait for the instance to become available
echo ""
echo "3. Waiting for the instance to become available..."
aws rds wait db-instance-available \
  --db-instance-identifier $DB_INSTANCE_ID \
  --region $REGION

# Get the endpoint
DB_ENDPOINT=$(aws rds describe-db-instances \
  --db-instance-identifier $DB_INSTANCE_ID \
  --query 'DBInstances[0].Endpoint.Address' \
  --output text \
  --region $REGION)

echo "   ✓ Instance is ready"
echo "   Endpoint: $DB_ENDPOINT"

# Update Secrets Manager
echo ""
echo "4. Updating the database endpoint..."
SECRET_STRING=$(aws secretsmanager get-secret-value \
  --secret-id production/database/users \
  --query 'SecretString' \
  --output text)

UPDATED_SECRET=$(echo "$SECRET_STRING" | jq --arg host "$DB_ENDPOINT" '.host = $host')

aws secretsmanager update-secret \
  --secret-id production/database/users \
  --secret-string "$UPDATED_SECRET" \
  --region $REGION

echo "   ✓ Endpoint updated"

echo ""
echo "================================================"
echo "RDS instance created!"
echo "================================================"
echo ""
echo "Instance details:"
echo "  Endpoint: $DB_ENDPOINT"
echo "  Database: $DB_NAME"
echo "  Username: $MASTER_USERNAME"
echo "  Multi-AZ: enabled"
echo "  Backup retention: 30 days"
echo "  Encryption: enabled"
echo "  Performance Insights: enabled"
echo ""
echo "Credentials location: production/database/users"
echo ""
echo "Next steps:"
echo "  1. Create a read replica"
echo "  2. Configure monitoring alarms"
echo "  3. Test connectivity"
echo "================================================"

echo "export RDS_USERS_ENDPOINT=$DB_ENDPOINT" >> rds-config.sh
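
Applications should read the credentials back from Secrets Manager at startup instead of hard-coding them. A sketch of turning the secret JSON stored above into a libpq-style connection string (the payload values here are hypothetical):

```python
import json

# What `aws secretsmanager get-secret-value` returns in SecretString
# (hypothetical example values)
secret_string = json.dumps({
    "username": "postgres",
    "password": "s3cr3t",
    "engine": "postgres",
    "host": "production-postgres-users.xxxxx.us-east-1.rds.amazonaws.com",
    "port": 5432,
    "dbname": "users_db",
})

def build_dsn(secret_json):
    """Build a libpq-style DSN from the Secrets Manager payload."""
    s = json.loads(secret_json)
    return (f"postgresql://{s['username']}:{s['password']}"
            f"@{s['host']}:{s['port']}/{s['dbname']}")

print(build_dsn(secret_string))
```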

Read Replica Creation

#!/bin/bash
# create-read-replica.sh

source rds-config.sh

REGION="us-east-1"
SOURCE_DB="production-postgres-users"
REPLICA_ID="${SOURCE_DB}-replica-1"

echo "Creating read replica..."

aws rds create-db-instance-read-replica \
  --db-instance-identifier $REPLICA_ID \
  --source-db-instance-identifier $SOURCE_DB \
  --db-instance-class db.r6g.large \
  --availability-zone us-east-1c \
  --no-publicly-accessible \
  --auto-minor-version-upgrade \
  --enable-performance-insights \
  --performance-insights-retention-period 7 \
  --tags \
    Key=Environment,Value=production \
    Key=Role,Value=read-replica \
  --region $REGION

echo "   Read replica creation request submitted"

# Wait until available
aws rds wait db-instance-available \
  --db-instance-identifier $REPLICA_ID \
  --region $REGION

# Get the endpoint
REPLICA_ENDPOINT=$(aws rds describe-db-instances \
  --db-instance-identifier $REPLICA_ID \
  --query 'DBInstances[0].Endpoint.Address' \
  --output text \
  --region $REGION)

echo "✓ Read replica created"
echo "   Endpoint: $REPLICA_ENDPOINT"

echo "export RDS_USERS_REPLICA_ENDPOINT=$REPLICA_ENDPOINT" >> rds-config.sh
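
With both endpoints exported, services typically send writes to the primary and reads to the replica, accepting a small amount of replication lag on reads. A minimal routing sketch (the endpoint strings are placeholders):

```python
PRIMARY_ENDPOINT = "production-postgres-users.xxxxx.us-east-1.rds.amazonaws.com"
REPLICA_ENDPOINT = "production-postgres-users-replica-1.xxxxx.us-east-1.rds.amazonaws.com"

def endpoint_for(operation):
    """Route writes to the primary; reads can tolerate replica lag."""
    if operation in ("INSERT", "UPDATE", "DELETE"):
        return PRIMARY_ENDPOINT
    return REPLICA_ENDPOINT

print(endpoint_for("SELECT"))   # goes to the replica
print(endpoint_for("UPDATE"))   # goes to the primary
```

Reads that must observe a just-committed write (read-your-writes) should still go to the primary.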

PgBouncer Connection Pool Deployment

Kubernetes Deployment:

# pgbouncer-deployment.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: pgbouncer-config
  namespace: production
data:
  pgbouncer.ini: |
    [databases]
    users_db = host=production-postgres-users.xxxxx.us-east-1.rds.amazonaws.com port=5432 dbname=users_db
    
    [pgbouncer]
    listen_addr = 0.0.0.0
    listen_port = 5432
    auth_type = md5
    ; userlist.txt with user credentials must be provided alongside this file
    ; (e.g. projected from a Secret); it is not part of this ConfigMap
    auth_file = /etc/pgbouncer/userlist.txt
    pool_mode = transaction
    max_client_conn = 1000
    default_pool_size = 25
    reserve_pool_size = 5
    reserve_pool_timeout = 3
    server_lifetime = 3600
    server_idle_timeout = 600
    log_connections = 1
    log_disconnections = 1
    log_pooler_errors = 1
    stats_period = 60

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: pgbouncer
  namespace: production
spec:
  replicas: 3
  selector:
    matchLabels:
      app: pgbouncer
  template:
    metadata:
      labels:
        app: pgbouncer
    spec:
      containers:
      - name: pgbouncer
        image: edoburu/pgbouncer:1.20.1
        ports:
        - containerPort: 5432
          name: postgres
        env:
        - name: DATABASES_HOST
          value: "production-postgres-users.xxxxx.us-east-1.rds.amazonaws.com"
        - name: DATABASES_PORT
          value: "5432"
        - name: DATABASES_DBNAME
          value: "users_db"
        - name: DATABASES_USER
          valueFrom:
            secretKeyRef:
              name: database-credentials
              key: username
        - name: PGBOUNCER_AUTH_TYPE
          value: "md5"
        resources:
          requests:
            cpu: 200m
            memory: 256Mi
          limits:
            cpu: 500m
            memory: 512Mi
        volumeMounts:
        - name: config
          mountPath: /etc/pgbouncer
      volumes:
      - name: config
        configMap:
          name: pgbouncer-config

---
apiVersion: v1
kind: Service
metadata:
  name: pgbouncer
  namespace: production
spec:
  selector:
    app: pgbouncer
  ports:
  - port: 5432
    targetPort: 5432
  type: ClusterIP
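
With pool_mode = transaction, many client connections share a few server connections: each PgBouncer pod opens at most default_pool_size + reserve_pool_size server connections per database/user pair, so it is worth checking the totals against the max_connections=1000 set in the RDS parameter group:

```python
replicas = 3                 # pgbouncer Deployment replicas
default_pool_size = 25
reserve_pool_size = 5
max_client_conn = 1000       # per pgbouncer instance
rds_max_connections = 1000   # from the RDS parameter group

# Worst case for a single database/user pair across all pods
server_conns = replicas * (default_pool_size + reserve_pool_size)
client_conns = replicas * max_client_conn

print(server_conns)   # 90 server connections at most
print(client_conns)   # up to 3000 client connections funneled through them
assert server_conns < rds_max_connections   # healthy headroom on the database
```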

ElastiCache Redis Deployment

Subnet Group Configuration

#!/bin/bash
# create-redis-subnet-group.sh

source vpc-config.sh

REGION="us-east-1"
SUBNET_GROUP_NAME="production-redis-subnet-group"

echo "Creating Redis subnet group..."

aws elasticache create-cache-subnet-group \
  --cache-subnet-group-name $SUBNET_GROUP_NAME \
  --cache-subnet-group-description "Production Redis subnet group" \
  --subnet-ids $PRIVATE_DB_SUBNET_1A $PRIVATE_DB_SUBNET_1B $PRIVATE_DB_SUBNET_1C \
  --tags \
    Key=Environment,Value=production \
  --region $REGION

echo "✓ Redis subnet group created"
echo "export REDIS_SUBNET_GROUP_NAME=$SUBNET_GROUP_NAME" >> redis-config.sh

Parameter Group Configuration

#!/bin/bash
# create-redis-parameter-group.sh

REGION="us-east-1"
PG_NAME="production-redis7"

echo "Creating Redis parameter group..."

aws elasticache create-cache-parameter-group \
  --cache-parameter-group-name $PG_NAME \
  --cache-parameter-group-family redis7 \
  --description "Production Redis 7 parameters" \
  --region $REGION

# Configure parameters
echo "Configuring parameters..."

aws elasticache modify-cache-parameter-group \
  --cache-parameter-group-name $PG_NAME \
  --parameter-name-values \
    "ParameterName=maxmemory-policy,ParameterValue=allkeys-lru" \
    "ParameterName=timeout,ParameterValue=300" \
    "ParameterName=tcp-keepalive,ParameterValue=300" \
    "ParameterName=maxmemory-samples,ParameterValue=10" \
  --region $REGION

echo "✓ Parameter group configured"
echo "export REDIS_PARAMETER_GROUP=$PG_NAME" >> redis-config.sh
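
maxmemory-policy allkeys-lru makes Redis evict the least recently used key (any key, not just ones with a TTL) once memory is full. The behavior can be modeled with an OrderedDict, using a fixed entry count to stand in for the memory limit:

```python
from collections import OrderedDict

class LRUCache:
    """Tiny model of allkeys-lru: evict the least recently used key on overflow."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)          # mark as recently used
        return self.data[key]

    def set(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)   # evict the LRU entry

c = LRUCache(2)
c.set("a", 1)
c.set("b", 2)
c.get("a")            # touch "a" so "b" becomes the LRU entry
c.set("c", 3)         # evicts "b"
print(list(c.data))   # ['a', 'c']
```

(Real Redis samples candidate keys, maxmemory-samples=10 above, rather than tracking exact order.)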

Cluster Mode Redis Creation

#!/bin/bash
# create-redis-cluster.sh

source vpc-config.sh
source sg-config.sh
source redis-config.sh

REGION="us-east-1"
REPLICATION_GROUP_ID="production-redis-cluster"

echo "================================================"
echo "Creating Redis cluster (cluster mode enabled)"
echo "================================================"

# Generate the auth token first so it can be stored for application use;
# generating it inline would leave it unrecorded anywhere
REDIS_AUTH_TOKEN="$(aws secretsmanager get-random-password \
  --password-length 32 \
  --exclude-characters "/@\"'\\" \
  --require-each-included-type \
  --query 'RandomPassword' \
  --output text)"

echo ""
echo "Storing Redis auth token in Secrets Manager..."
aws secretsmanager create-secret \
  --name production/cache/redis \
  --description "Production Redis auth token" \
  --secret-string "{\"auth_token\": \"$REDIS_AUTH_TOKEN\"}" \
  --region $REGION

# Create the replication group
echo ""
echo "Creating replication group..."
aws elasticache create-replication-group \
  --replication-group-id $REPLICATION_GROUP_ID \
  --replication-group-description "Production Redis Cluster" \
  --engine redis \
  --engine-version 7.0 \
  --cache-node-type cache.r7g.large \
  --cache-parameter-group-name $REDIS_PARAMETER_GROUP \
  --cache-subnet-group-name $REDIS_SUBNET_GROUP_NAME \
  --security-group-ids $REDIS_SG_ID \
  --num-node-groups 3 \
  --replicas-per-node-group 2 \
  --automatic-failover-enabled \
  --multi-az-enabled \
  --at-rest-encryption-enabled \
  --transit-encryption-enabled \
  --auth-token "$REDIS_AUTH_TOKEN" \
  --snapshot-retention-limit 7 \
  --snapshot-window "03:00-05:00" \
  --preferred-maintenance-window "mon:05:00-mon:07:00" \
  --log-delivery-configurations \
    "LogType=slow-log,DestinationType=cloudwatch-logs,DestinationDetails={CloudWatchLogsDetails={LogGroup=/aws/elasticache/redis/production}},LogFormat=json" \
    "LogType=engine-log,DestinationType=cloudwatch-logs,DestinationDetails={CloudWatchLogsDetails={LogGroup=/aws/elasticache/redis/production}},LogFormat=json" \
  --tags \
    Key=Environment,Value=production \
    Key=ManagedBy,Value=script \
  --region $REGION

echo "   ✓ Creation request submitted (takes roughly 15-20 minutes)"

# Wait until available
echo ""
echo "Waiting for the cluster to become available..."
aws elasticache wait replication-group-available \
  --replication-group-id $REPLICATION_GROUP_ID \
  --region $REGION

# Get the configuration endpoint
CONFIG_ENDPOINT=$(aws elasticache describe-replication-groups \
  --replication-group-id $REPLICATION_GROUP_ID \
  --query 'ReplicationGroups[0].ConfigurationEndpoint.Address' \
  --output text \
  --region $REGION)

echo ""
echo "================================================"
echo "Redis cluster created!"
echo "================================================"
echo ""
echo "Configuration endpoint: $CONFIG_ENDPOINT:6379"
echo "Node groups: 3"
echo "Replicas per group: 2"
echo "Total nodes: 9"
echo "Multi-AZ: enabled"
echo "Encryption: in transit + at rest"
echo "Snapshot retention: 7 days"
echo ""
echo "================================================"

echo "export REDIS_CONFIG_ENDPOINT=$CONFIG_ENDPOINT" >> redis-config.sh
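
With --num-node-groups 3, the 16384 hash slots are split across three shards. A key's slot is CRC16(key) mod 16384, and a {hash tag} hashes only the tagged substring, so related keys land on the same shard and multi-key operations keep working. A sketch of the slot calculation:

```python
def crc16(data):
    """CRC16-CCITT (XModem), the variant Redis Cluster uses for key slots."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) if crc & 0x8000 else (crc << 1)
            crc &= 0xFFFF
    return crc

def key_slot(key):
    """Slot for a key: hash only the {tag} substring when a non-empty tag exists."""
    start = key.find("{")
    if start != -1:
        end = key.find("}", start + 1)
        if end != -1 and end != start + 1:
            key = key[start + 1:end]
    return crc16(key.encode()) % 16384

# Hash tags pin related keys to the same shard
print(key_slot("{user1}.name") == key_slot("{user1}.orders"))   # True
```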

DynamoDB Configuration

Table Creation Script

#!/bin/bash
# create-dynamodb-tables.sh

REGION="us-east-1"

echo "================================================"
echo "Creating DynamoDB tables"
echo "================================================"

# 1. Sessions table
echo ""
echo "1. Creating Sessions table..."
# Note: with PAY_PER_REQUEST billing, GSIs must not specify ProvisionedThroughput,
# and attribute-definitions may only list attributes used in a key or index
# (the ExpiresAt TTL attribute is not one of them)
aws dynamodb create-table \
  --table-name production-sessions \
  --attribute-definitions \
    AttributeName=SessionId,AttributeType=S \
    AttributeName=UserId,AttributeType=S \
  --key-schema \
    AttributeName=SessionId,KeyType=HASH \
  --global-secondary-indexes \
    "[{
      \"IndexName\": \"UserIdIndex\",
      \"KeySchema\": [{\"AttributeName\":\"UserId\",\"KeyType\":\"HASH\"}],
      \"Projection\": {\"ProjectionType\":\"ALL\"}
    }]" \
  --billing-mode PAY_PER_REQUEST \
  --stream-specification \
    StreamEnabled=true,StreamViewType=NEW_AND_OLD_IMAGES \
  --sse-specification \
    Enabled=true,SSEType=KMS \
  --tags \
    Key=Environment,Value=production \
    Key=Purpose,Value=sessions \
  --region $REGION

# Enable TTL
aws dynamodb update-time-to-live \
  --table-name production-sessions \
  --time-to-live-specification \
    Enabled=true,AttributeName=ExpiresAt \
  --region $REGION

echo "   ✓ Sessions table created"

# 2. Events table
echo ""
echo "2. Creating Events table..."
aws dynamodb create-table \
  --table-name production-events \
  --attribute-definitions \
    AttributeName=EventId,AttributeType=S \
    AttributeName=Timestamp,AttributeType=N \
    AttributeName=UserId,AttributeType=S \
  --key-schema \
    AttributeName=EventId,KeyType=HASH \
    AttributeName=Timestamp,KeyType=RANGE \
  --global-secondary-indexes \
    "[{
      \"IndexName\": \"UserIdTimestampIndex\",
      \"KeySchema\": [
        {\"AttributeName\":\"UserId\",\"KeyType\":\"HASH\"},
        {\"AttributeName\":\"Timestamp\",\"KeyType\":\"RANGE\"}
      ],
      \"Projection\": {\"ProjectionType\":\"ALL\"}
    }]" \
  --billing-mode PAY_PER_REQUEST \
  --stream-specification \
    StreamEnabled=true,StreamViewType=NEW_AND_OLD_IMAGES \
  --sse-specification \
    Enabled=true,SSEType=KMS \
  --tags \
    Key=Environment,Value=production \
    Key=Purpose,Value=events \
  --region $REGION

echo "   ✓ Events table created"

# 3. Enable Point-in-Time Recovery
echo ""
echo "3. Enabling Point-in-Time Recovery..."
aws dynamodb update-continuous-backups \
  --table-name production-sessions \
  --point-in-time-recovery-specification \
    PointInTimeRecoveryEnabled=true \
  --region $REGION

aws dynamodb update-continuous-backups \
  --table-name production-events \
  --point-in-time-recovery-specification \
    PointInTimeRecoveryEnabled=true \
  --region $REGION

echo "   ✓ PITR enabled"

echo ""
echo "================================================"
echo "DynamoDB tables created!"
echo "================================================"
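
TTL deletes an item once the epoch-seconds value in its ExpiresAt attribute is in the past (deletion is asynchronous, typically within a day or two of expiry). Writing a session item with that attribute might look like this; the 24-hour lifetime and the fields beyond the key schema are assumptions:

```python
import time
import uuid

SESSION_LIFETIME_SECONDS = 24 * 3600   # sessions live for one day (assumed policy)

def make_session_item(user_id):
    """Item for the production-sessions table; ExpiresAt drives TTL deletion."""
    now = int(time.time())
    return {
        "SessionId": str(uuid.uuid4()),               # partition key
        "UserId": user_id,                            # queried via UserIdIndex
        "CreatedAt": now,
        "ExpiresAt": now + SESSION_LIFETIME_SECONDS,  # epoch seconds, read by TTL
    }

item = make_session_item("user-123")
print(item["ExpiresAt"] - item["CreatedAt"])   # 86400
```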

S3 Storage Configuration

Creating S3 Buckets

#!/bin/bash
# create-s3-buckets.sh

REGION="us-east-1"
ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)

echo "================================================"
echo "Creating S3 buckets"
echo "================================================"

# 1. Application assets bucket
echo ""
echo "1. Creating application assets bucket..."
ASSETS_BUCKET="production-app-assets-${ACCOUNT_ID}"

aws s3api create-bucket \
  --bucket $ASSETS_BUCKET \
  --region $REGION

# Enable versioning
aws s3api put-bucket-versioning \
  --bucket $ASSETS_BUCKET \
  --versioning-configuration Status=Enabled

# Enable encryption
aws s3api put-bucket-encryption \
  --bucket $ASSETS_BUCKET \
  --server-side-encryption-configuration '{
    "Rules": [{
      "ApplyServerSideEncryptionByDefault": {
        "SSEAlgorithm": "AES256"
      },
      "BucketKeyEnabled": true
    }]
  }'

# Block public access
aws s3api put-public-access-block \
  --bucket $ASSETS_BUCKET \
  --public-access-block-configuration \
    BlockPublicAcls=true,\
IgnorePublicAcls=true,\
BlockPublicPolicy=true,\
RestrictPublicBuckets=true

# Lifecycle policy
aws s3api put-bucket-lifecycle-configuration \
  --bucket $ASSETS_BUCKET \
  --lifecycle-configuration '{
    "Rules": [
      {
        "Id": "TransitionToIA",
        "Status": "Enabled",
        "Transitions": [{
          "Days": 90,
          "StorageClass": "STANDARD_IA"
        }],
        "NoncurrentVersionTransitions": [{
          "NoncurrentDays": 30,
          "StorageClass": "GLACIER"
        }]
      },
      {
        "Id": "ExpireOldVersions",
        "Status": "Enabled",
        "NoncurrentVersionExpiration": {
          "NoncurrentDays": 365
        }
      }
    ]
  }'

echo "   ✓ Assets bucket created: $ASSETS_BUCKET"

# 2. Backup bucket
echo ""
echo "2. Creating backup bucket..."
BACKUP_BUCKET="production-backups-${ACCOUNT_ID}"

aws s3api create-bucket \
  --bucket $BACKUP_BUCKET \
  --region $REGION

aws s3api put-bucket-versioning \
  --bucket $BACKUP_BUCKET \
  --versioning-configuration Status=Enabled

aws s3api put-bucket-encryption \
  --bucket $BACKUP_BUCKET \
  --server-side-encryption-configuration '{
    "Rules": [{
      "ApplyServerSideEncryptionByDefault": {
        "SSEAlgorithm": "aws:kms",
        "KMSMasterKeyID": "alias/aws/s3"
      },
      "BucketKeyEnabled": true
    }]
  }'

aws s3api put-public-access-block \
  --bucket $BACKUP_BUCKET \
  --public-access-block-configuration \
    BlockPublicAcls=true,\
IgnorePublicAcls=true,\
BlockPublicPolicy=true,\
RestrictPublicBuckets=true

# Backup retention policy
aws s3api put-bucket-lifecycle-configuration \
  --bucket $BACKUP_BUCKET \
  --lifecycle-configuration '{
    "Rules": [
      {
        "Id": "TransitionToGlacier",
        "Status": "Enabled",
        "Transitions": [
          {"Days": 30, "StorageClass": "GLACIER"},
          {"Days": 90, "StorageClass": "DEEP_ARCHIVE"}
        ]
      },
      {
        "Id": "ExpireAfter7Years",
        "Status": "Enabled",
        "Expiration": {
          "Days": 2555
        }
      }
    ]
  }'

echo "   ✓ Backup bucket created: $BACKUP_BUCKET"

# 3. Logs bucket
echo ""
echo "3. Creating logs bucket..."
LOGS_BUCKET="production-logs-${ACCOUNT_ID}"

aws s3api create-bucket \
  --bucket $LOGS_BUCKET \
  --region $REGION

aws s3api put-bucket-encryption \
  --bucket $LOGS_BUCKET \
  --server-side-encryption-configuration '{
    "Rules": [{
      "ApplyServerSideEncryptionByDefault": {
        "SSEAlgorithm": "AES256"
      }
    }]
  }'

# Log retention policy (delete after 90 days)
aws s3api put-bucket-lifecycle-configuration \
  --bucket $LOGS_BUCKET \
  --lifecycle-configuration '{
    "Rules": [{
      "Id": "ExpireOldLogs",
      "Status": "Enabled",
      "Expiration": {
        "Days": 90
      }
    }]
  }'

echo "   ✓ Logs bucket created: $LOGS_BUCKET"

echo ""
echo "================================================"
echo "S3 buckets created!"
echo "================================================"
echo ""
echo "  Assets: $ASSETS_BUCKET"
echo "  Backups: $BACKUP_BUCKET"
echo "  Logs: $LOGS_BUCKET"
echo "================================================"
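
The backup lifecycle above encodes a retention timeline: Glacier after 30 days, Deep Archive after 90, deletion after 2555 days. A quick check of the arithmetic and of the constraint that transitions must occur in increasing day order:

```python
# (days, action) pairs from the backup bucket's lifecycle rules
backup_rules = [
    (30, "GLACIER"),
    (90, "DEEP_ARCHIVE"),
    (2555, "EXPIRE"),
]

# Transitions must be strictly increasing in days, or S3 rejects the config
days = [d for d, _ in backup_rules]
assert days == sorted(days)

print(2555 / 365)   # 7.0 -> seven years of backup retention
```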

Database Backup Strategy

RDS Automated Backups

#!/bin/bash
# configure-rds-backups.sh

DB_INSTANCE_ID="production-postgres-users"
REGION="us-east-1"

echo "Configuring RDS backup policy..."

# Modify the backup configuration
aws rds modify-db-instance \
  --db-instance-identifier $DB_INSTANCE_ID \
  --backup-retention-period 30 \
  --preferred-backup-window "03:00-04:00" \
  --copy-tags-to-snapshot \
  --apply-immediately \
  --region $REGION

echo "✓ Backup configuration updated"
echo "  Retention: 30 days"
echo "  Backup window: 03:00-04:00 UTC"

Redis Snapshot Backups

#!/bin/bash
# create-redis-snapshot.sh

REPLICATION_GROUP_ID="production-redis-cluster"
SNAPSHOT_NAME="manual-snapshot-$(date +%Y%m%d-%H%M%S)"
REGION="us-east-1"

echo "Creating a manual Redis snapshot..."

aws elasticache create-snapshot \
  --replication-group-id $REPLICATION_GROUP_ID \
  --snapshot-name $SNAPSHOT_NAME \
  --region $REGION

echo "✓ Snapshot creation started: $SNAPSHOT_NAME"

Monitoring and Alerts

RDS Monitoring

#!/bin/bash
# create-rds-alarms.sh

DB_INSTANCE_ID="production-postgres-users"
SNS_TOPIC_ARN="arn:aws:sns:us-east-1:123456789012:production-alerts"
REGION="us-east-1"

echo "Creating RDS CloudWatch alarms..."

# CPU utilization
aws cloudwatch put-metric-alarm \
  --alarm-name "${DB_INSTANCE_ID}-high-cpu" \
  --alarm-description "RDS CPU > 80%" \
  --metric-name CPUUtilization \
  --namespace AWS/RDS \
  --statistic Average \
  --period 300 \
  --evaluation-periods 2 \
  --threshold 80 \
  --comparison-operator GreaterThanThreshold \
  --dimensions Name=DBInstanceIdentifier,Value=$DB_INSTANCE_ID \
  --alarm-actions $SNS_TOPIC_ARN \
  --region $REGION

# Free storage space
aws cloudwatch put-metric-alarm \
  --alarm-name "${DB_INSTANCE_ID}-low-storage" \
  --alarm-description "RDS free storage < 10 GB" \
  --metric-name FreeStorageSpace \
  --namespace AWS/RDS \
  --statistic Average \
  --period 300 \
  --evaluation-periods 1 \
  --threshold 10737418240 \
  --comparison-operator LessThanThreshold \
  --dimensions Name=DBInstanceIdentifier,Value=$DB_INSTANCE_ID \
  --alarm-actions $SNS_TOPIC_ARN \
  --region $REGION

# Connection count
aws cloudwatch put-metric-alarm \
  --alarm-name "${DB_INSTANCE_ID}-high-connections" \
  --alarm-description "RDS connections > 800" \
  --metric-name DatabaseConnections \
  --namespace AWS/RDS \
  --statistic Average \
  --period 300 \
  --evaluation-periods 2 \
  --threshold 800 \
  --comparison-operator GreaterThanThreshold \
  --dimensions Name=DBInstanceIdentifier,Value=$DB_INSTANCE_ID \
  --alarm-actions $SNS_TOPIC_ARN \
  --region $REGION

echo "✓ RDS alarms created"
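
The alarm thresholds are derived values worth making explicit: FreeStorageSpace is reported in bytes, so 10 GB here really means 10 GiB (10737418240 bytes), and the connection alarm fires at 80% of the max_connections=1000 configured in the parameter group:

```python
GIB = 1024 ** 3

storage_threshold = 10 * GIB          # bytes; matches the alarm's 10737418240
max_connections = 1000                # from the RDS parameter group
connection_threshold = int(max_connections * 0.8)

print(storage_threshold)      # 10737418240
print(connection_threshold)   # 800
```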

Best Practices Summary

1. RDS Configuration

✓ Enable Multi-AZ deployment
✓ Configure read replicas
✓ Tune performance with parameter groups
✓ Enable automated backups (30-day retention)
✓ Enable Performance Insights
✓ Configure monitoring alarms

2. Redis Configuration

✓ Use cluster mode
✓ Enable Multi-AZ
✓ Configure automatic failover
✓ Enable in-transit encryption
✓ Enable at-rest encryption
✓ Configure snapshot backups

3. DynamoDB Configuration

✓ Use on-demand billing
✓ Enable Point-in-Time Recovery
✓ Configure TTL
✓ Use GSIs to optimize queries
✓ Enable streams
✓ Configure Auto Scaling

4. S3 Configuration

✓ Enable versioning
✓ Configure lifecycle policies
✓ Enable server-side encryption
✓ Block public access
✓ Configure cross-region replication (critical data)
✓ Enable access logging

5. Security Configuration

✓ Keep all databases in private subnets
✓ Control access with security groups
✓ Enable encryption (in transit + at rest)
✓ Manage credentials with Secrets Manager
✓ Rotate passwords regularly
✓ Enable audit logging

Next: continue with the Monitoring and Logging chapter.