Persistent Volumes: PV/PVC in Depth

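PersistentVolume (PV) in Detail

A PV is a piece of storage in the cluster, either provisioned by an administrator or created dynamically through a StorageClass.

Full PV Specification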

apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-nfs-example
  labels:
    type: nfs
    environment: production
  annotations:
    pv.kubernetes.io/provisioned-by: nfs-provisioner
spec:
  # Capacity
  capacity:
    storage: 100Gi

  # Access modes
  accessModes:
    - ReadWriteMany

  # Reclaim policy
  persistentVolumeReclaimPolicy: Retain  # Retain / Delete / Recycle (Recycle is deprecated)

  # StorageClass name
  storageClassName: nfs-storage

  # Mount options
  mountOptions:
    - hard
    - nfsvers=4.1
    - timeo=600
    - retrans=2

  # Node affinity
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: topology.kubernetes.io/zone
          operator: In
          values:
          - us-east-1a
          - us-east-1b

  # Volume mode
  volumeMode: Filesystem  # Filesystem (default) or Block

  # Storage backend configuration (NFS example)
  nfs:
    server: nfs-server.example.com
    path: /exported/path
    readOnly: false

  # Pre-bind to a specific PVC (optional)
  claimRef:
    namespace: default
    name: pvc-example

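Capacity

capacity:
  storage: 100Gi  # storage capacity
  # Storage size is currently the only resource that can be set or requested here.
  # Attributes such as IOPS or throughput are not capacity keys; they are
  # typically configured through StorageClass parameters instead.

Capacity matching rules

  • The PV capacity must be greater than or equal to the capacity the PVC requests.
  • Among the candidates, the smallest PV that satisfies the claim is preferred, to limit waste.
  • If no PV matches the size exactly, the claim may bind to a larger PV, and the PVC then reports the full capacity of that PV.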
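
The capacity rules above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function name pick_pv and the sample PV names are hypothetical, capacities are plain byte counts, and the real controller also checks the StorageClass, access modes, and selectors before it compares sizes.

```python
from typing import Optional

GI = 1024 ** 3  # bytes in one gibibyte


def pick_pv(available: dict[str, int], request_bytes: int) -> Optional[str]:
    """Return the name of the smallest PV whose capacity covers the request."""
    candidates = [(size, name) for name, size in available.items()
                  if size >= request_bytes]
    if not candidates:
        return None  # nothing fits: the claim would stay Pending
    return min(candidates)[1]  # smallest capacity first


# Hypothetical pool of Available PVs (name -> capacity in bytes).
pvs = {"pv-small": 5 * GI, "pv-medium": 15 * GI, "pv-large": 50 * GI}

print(pick_pv(pvs, 10 * GI))   # pv-medium
print(pick_pv(pvs, 100 * GI))  # None
```

A 10Gi request skips the 5Gi volume, takes the 15Gi one, and leaves the 50Gi volume free for a larger claim.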

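Access Modes

accessModes:
  - ReadWriteOnce    # RWO: read-write by a single node
  - ReadOnlyMany     # ROX: read-only by many nodes
  - ReadWriteMany    # RWX: read-write by many nodes
  - ReadWriteOncePod # RWOP: read-write by a single Pod (1.22+)

Access modes in detail

ReadWriteOnce (RWO)

  • The volume can be mounted read-write by a single node.
  • Multiple Pods running on that same node can still access it.
  • The most commonly used mode.
  • Suitable for: databases and other stateful applications.
# MySQL using RWO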
apiVersion: v1
kind: Pod
metadata:
  name: mysql
spec:
  containers:
  - name: mysql
    image: mysql:8.0
    volumeMounts:
    - name: data
      mountPath: /var/lib/mysql
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: mysql-pvc  # RWO mode

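ReadOnlyMany (ROX)

  • The volume can be mounted read-only by many nodes.
  • Every Pod can read from it; none can write.
  • Suitable for: shared configuration and static assets.
# Several web servers reading shared content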
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-servers
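spec:
  replicas: 3
  selector:
    matchLabels:
      app: web-servers  # a Deployment requires a selector; this label is assumed
  template:
    metadata:
      labels:
        app: web-servers
    spec:
      containers:
      - name: nginx
        image: nginx
        volumeMounts:
        - name: static-content
          mountPath: /usr/share/nginx/html
          readOnly: true
      volumes:
      - name: static-content
        persistentVolumeClaim:
          claimName: static-pvc  # ROX mode
          readOnly: true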

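ReadWriteMany (RWX)

  • The volume can be mounted read-write by many nodes.
  • Every Pod can both read and write.
  • Requires a storage backend that supports it (for example NFS, CephFS, GlusterFS).
  • Suitable for: shared file systems and collaborative workloads.
# Multiple Pods sharing a log directory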
apiVersion: apps/v1
kind: Deployment
metadata:
  name: log-processors
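spec:
  replicas: 3
  selector:
    matchLabels:
      app: log-processors  # a Deployment requires a selector; this label is assumed
  template:
    metadata:
      labels:
        app: log-processors
    spec:
      containers:
      - name: processor
        image: <your-processor-image>  # placeholder: the original does not name an image
        volumeMounts:
        - name: logs
          mountPath: /logs
      volumes:
      - name: logs
        persistentVolumeClaim:
          claimName: logs-pvc  # RWX mode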

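ReadWriteOncePod (RWOP), Kubernetes 1.22+

  • The volume can be mounted read-write by only one Pod in the whole cluster.
  • Gives a stronger exclusivity guarantee than RWO, which is enforced per node.
  • Supported only for CSI volumes.
  • Suitable for: applications that need strictly exclusive access.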
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: exclusive-pvc
spec:
  accessModes:
    - ReadWriteOncePod  # exclusive to a single Pod
  resources:
    requests:
      storage: 10Gi
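
To make the difference between the four modes concrete, here is a small Python sketch of who may mount a volume that is already in use. It is a deliberately simplified model: can_mount and its arguments are hypothetical names, and real enforcement is split across the scheduler, the attach/detach controller, the kubelet, and the storage driver.

```python
def can_mount(mode: str, current: list[tuple[str, str]], new_node: str) -> bool:
    """Decide whether one more Pod on `new_node` may use the volume.

    `current` lists the existing consumers as (pod_name, node_name) pairs.
    """
    if mode == "ReadWriteOncePod":
        return not current  # at most one Pod in the whole cluster
    if mode == "ReadWriteOnce":
        # Any number of Pods, as long as they all run on one node.
        return all(node == new_node for _, node in current)
    if mode in ("ReadOnlyMany", "ReadWriteMany"):
        return True  # many nodes may mount (read-only for ROX)
    raise ValueError(f"unknown access mode: {mode}")


in_use = [("mysql-0", "node-1")]
print(can_mount("ReadWriteOnce", in_use, "node-1"))     # True
print(can_mount("ReadWriteOnce", in_use, "node-2"))     # False
print(can_mount("ReadWriteOncePod", in_use, "node-1"))  # False
print(can_mount("ReadWriteMany", in_use, "node-2"))     # True
```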

Reclaim Policy

persistentVolumeReclaimPolicy: Retain  # one of three policies

1. Retain

apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-retain
spec:
  persistentVolumeReclaimPolicy: Retain
  # ...

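# Behavior:
# 1. After the PVC is deleted, the PV moves to Released
# 2. The data stays on the storage backend
# 3. The PV cannot be bound automatically by a new PVC
# 4. An administrator has to reclaim it by hand

# Manual reclaim steps:
# a. Back up the data
# b. Delete the PV: kubectl delete pv pv-retain
# c. Clean up the data on the storage backend
# d. Recreate the PV (if needed)

2. Delete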

apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-delete
spec:
  persistentVolumeReclaimPolicy: Delete
  # ...

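# Behavior:
# 1. After the PVC is deleted, the PV is deleted automatically
# 2. The underlying volume on the storage backend is deleted as well
# 3. The data is lost completely
# 4. This is the default policy for dynamically provisioned volumes

# ⚠️  Warning: the data is deleted permanently

3. Recycle (deprecated)

# Not recommended; use dynamic provisioning instead
persistentVolumeReclaimPolicy: Recycle

# Behavior:
# 1. Runs a basic scrub: rm -rf /volume/*
# 2. The PV becomes Available again
# 3. Not secure: data may be left behind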

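Volume Mode

volumeMode: Filesystem  # or Block

Filesystem

  • The default mode.
  • The volume is formatted with a file system (such as ext4 or xfs) if it does not already have one.
  • It is mounted into the Pod as a directory.
  • Suitable for most applications.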
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-filesystem
spec:
  volumeMode: Filesystem
  capacity:
    storage: 10Gi
  # ...

---
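# Pod usage
spec:
  containers:
  - name: app
    volumeMounts:
    - name: data
      mountPath: /data  # mounted as a directory

Block

  • A raw block device.
  • No file system is created on it.
  • It is exposed to the Pod as a block device.
  • Suitable for databases and performance-sensitive applications that manage the device themselves.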
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-block
spec:
  volumeMode: Block
  capacity:
    storage: 10Gi
  # ...

---
# Pod usage
spec:
  containers:
  - name: database
    volumeDevices:  # note: volumeDevices, not volumeMounts
    - name: data
      devicePath: /dev/xvda  # device path inside the container

Node Affinity

Node affinity restricts which nodes a PV can be used from. It is mainly used for local storage.

apiVersion: v1
kind: PersistentVolume
metadata:
  name: local-pv
spec:
  capacity:
    storage: 100Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: /mnt/disks/ssd1
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - worker-node-1  # usable only from this node

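Topology awareness

nodeAffinity:
  required:
    nodeSelectorTerms:
    - matchExpressions:
      # All expressions within one term must match (logical AND)
      # Availability zone constraint
      - key: topology.kubernetes.io/zone
        operator: In
        values:
        - us-east-1a
      # Instance type constraint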
      - key: node.kubernetes.io/instance-type
        operator: In
        values:
        - m5.large
        - m5.xlarge

PersistentVolumeClaim (PVC) in Detail

A PVC is a user's declarative request for storage.

Full PVC Specification

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
  namespace: default
  labels:
    app: myapp
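  annotations:
    # Normally written by the control plane; replaces the deprecated
    # volume.beta.kubernetes.io/storage-provisioner key
    volume.kubernetes.io/storage-provisioner: ebs.csi.aws.com
spec:
  # Field overview only: a real claim would not combine all of these fields
  # Access modes
  accessModes:
    - ReadWriteOnce

  # StorageClass
  storageClassName: fast-ssd

  # Resource requests
  resources:
    requests:
      storage: 10Gi
    # Optional: resource limits (supported by some storage backends)
    limits:
      storage: 20Gi

  # Selector (optional)
  selector:
    matchLabels:
      release: stable
      environment: prod
    matchExpressions:
    - key: tier
      operator: In
      values:
      - cache
      - database

  # Volume mode
  volumeMode: Filesystem

  # Volume name (binds to one specific PV)
  volumeName: pv-specific

  # Data source (restore from a snapshot or clone a PVC)
  dataSource:
    name: my-snapshot
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io

  # Data source reference (extended form of dataSource)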
  dataSourceRef:
    name: my-snapshot
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io

Resources

resources:
  requests:
    storage: 10Gi  # requested capacity

Capacity units

  • Ki (kibibyte) = 1024 bytes
  • Mi (mebibyte) = 1024 Ki
  • Gi (gibibyte) = 1024 Mi
  • Ti (tebibyte) = 1024 Gi
# Examples
storage: 1Ki    # 1024 bytes
storage: 1Mi    # 1048576 bytes
storage: 1Gi    # 1073741824 bytes
storage: 1Ti    # 1099511627776 bytes
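
The unit table above is plain arithmetic, and a short Python sketch makes it easy to check. The helper to_bytes is a hypothetical name, and it handles only whole numbers with the four binary suffixes listed here; the real Kubernetes quantity format also accepts decimal suffixes and fractions, which this sketch leaves out.

```python
# Multipliers for the binary suffixes listed above.
BINARY_SUFFIXES = {
    "Ki": 1024,
    "Mi": 1024 ** 2,
    "Gi": 1024 ** 3,
    "Ti": 1024 ** 4,
}


def to_bytes(quantity: str) -> int:
    """Convert a quantity such as '10Gi' into a number of bytes."""
    for suffix, factor in BINARY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # no suffix: already a byte count


print(to_bytes("1Ki"))   # 1024
print(to_bytes("1Gi"))   # 1073741824
print(to_bytes("10Gi"))  # 10737418240
```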

Selector

Use labels to select specific PVs:

# PV definition
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-fast
  labels:
    type: ssd
    tier: gold
spec:
  capacity:
    storage: 100Gi
  # ...

---
# PVC that selects it
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-fast
spec:
  accessModes:
    - ReadWriteOnce  # accessModes is required; assumed here to match the PV
  selector:
    matchLabels:
      type: ssd
      tier: gold
  resources:
    requests:
      storage: 50Gi

Advanced selectors

selector:
  matchExpressions:
  - key: type
    operator: In
    values:
    - ssd
    - nvme
  - key: tier
    operator: NotIn
    values:
    - bronze
  - key: zone
    operator: Exists

Data Source

Create a new PVC from a snapshot or from an existing PVC:

Restore from a snapshot

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: restored-pvc
spec:
  dataSource:
    name: my-snapshot
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: fast-ssd

Clone an existing PVC

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: cloned-pvc
spec:
  dataSource:
    name: source-pvc
    kind: PersistentVolumeClaim
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: fast-ssd

PV and PVC Binding

Binding flow in detail

┌────────────────────────────────────────────────────────┐
│ 1. User creates a PVC                                  │
│    kubectl apply -f pvc.yaml                           │
└────────────────┬───────────────────────────────────────┘
                 │
                 ▼
┌────────────────────────────────────────────────────────┐
│ 2. PV controller observes the new PVC                  │
│    - Checks PVC.spec.storageClassName                  │
│    - Looks for a suitable existing PV                  │
└────────────────┬───────────────────────────────────────┘
                 │
        ┌────────┴────────┐
        │                 │
        ▼                 ▼
┌──────────────┐  ┌──────────────────┐
│ Static:      │  │ Dynamic:         │
│ existing PV  │  │ call provisioner │
└──────┬───────┘  └──────┬───────────┘
       │                 │
       └────────┬────────┘
                ▼
┌────────────────────────────────────────────────────────┐
│ 3. Matching criteria are checked                       │
│    ✓ StorageClass matches                              │
│    ✓ AccessModes match                                 │
│    ✓ Capacity is sufficient (PV >= PVC)                │
│    ✓ Label selector matches (if set)                   │
│    ✓ Node affinity is satisfied (if set)               │
└────────────────┬───────────────────────────────────────┘
                 │
                 ▼
┌────────────────────────────────────────────────────────┐
│ 4. PVC and PV are bound                                │
│    - Set PVC.spec.volumeName = PV.name                 │
│    - Set PV.spec.claimRef = PVC reference              │
│    - PVC.status.phase = Bound                          │
│    - PV.status.phase = Bound                           │
└────────────────┬───────────────────────────────────────┘
                 │
                 ▼
┌────────────────────────────────────────────────────────┐
│ 5. Pod can use the PVC                                 │
│    - Kubelet detects that the Pod references the PVC   │
│    - Calls the volume plugin to mount the volume       │
│    - Volume is mounted into the Pod's containers       │
└────────────────────────────────────────────────────────┘

Binding match rules

Rule priority

  1. StorageClass matches (required)
  2. AccessModes match (required)
  3. Capacity is sufficient (PV.capacity >= PVC.requests)
  4. Label selector matches (if the PVC has a selector)
  5. Node affinity is satisfied (if the PV has nodeAffinity)
  6. The smallest PV that meets all conditions is chosen (to limit waste)

Example scenarios

# Scenario 1: exact match
# PV: 10Gi, RWO, StorageClass: standard
# PVC: 10Gi, RWO, StorageClass: standard
# Result: ✅ perfect match

# Scenario 2: oversized PV
# PV: 100Gi, RWO, StorageClass: standard
# PVC: 10Gi, RWO, StorageClass: standard
# Result: ✅ binds, but 90Gi is wasted

# Scenario 3: AccessMode mismatch
# PV: 10Gi, RWO, StorageClass: standard
# PVC: 10Gi, RWX, StorageClass: standard
# Result: ❌ cannot bind

# Scenario 4: StorageClass mismatch
# PV: 10Gi, RWO, StorageClass: fast
# PVC: 10Gi, RWO, StorageClass: slow
# Result: ❌ cannot bind

# Scenario 5: several candidate PVs
# PV1: 15Gi, RWO, StorageClass: standard
# PV2: 50Gi, RWO, StorageClass: standard
# PVC: 10Gi, RWO, StorageClass: standard
# Result: ✅ PV1 is chosen (the closest fit)
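
Scenarios 1 to 4 can be replayed with a small Python predicate. This is a sketch under stated assumptions: the Spec class and can_bind are hypothetical names, only the three mandatory checks (StorageClass, access modes, capacity) are modelled, and selectors and node affinity are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Spec:
    """The fields a PV and a PVC share in the scenarios above."""
    size_gi: int
    modes: frozenset
    storage_class: str


def can_bind(pv: Spec, pvc: Spec) -> bool:
    """Apply the three mandatory matching rules."""
    return (
        pv.storage_class == pvc.storage_class
        and pvc.modes <= pv.modes        # every requested mode is offered
        and pv.size_gi >= pvc.size_gi    # the PV is large enough
    )


RWO, RWX = frozenset({"RWO"}), frozenset({"RWX"})

print(can_bind(Spec(10, RWO, "standard"), Spec(10, RWO, "standard")))   # True
print(can_bind(Spec(100, RWO, "standard"), Spec(10, RWO, "standard")))  # True
print(can_bind(Spec(10, RWO, "standard"), Spec(10, RWX, "standard")))   # False
print(can_bind(Spec(10, RWO, "fast"), Spec(10, RWO, "slow")))           # False
```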

Delayed Binding (WaitForFirstConsumer)

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: late-binding-sc
provisioner: ebs.csi.aws.com  # EBS CSI driver; the in-tree kubernetes.io/aws-ebs plugin is deprecated
volumeBindingMode: WaitForFirstConsumer  # delayed binding
parameters:
  type: gp3

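Immediate vs delayed binding

# Immediate (the default)
┌──────┐     ┌──────┐     ┌──────┐
│ PVC  │ --> │ PV   │ --> │ Pod  │
└──────┘     └──────┘     └──────┘
 created     provisioned  scheduling may fail
             and bound    (the node may not be
             right away   able to reach the volume)

# WaitForFirstConsumer (delayed binding), recommended for topology-constrained storage
┌──────┐     ┌──────┐     ┌───────────┐     ┌──────┐
│ PVC  │ --> │ Pod  │ --> │ Scheduler │ --> │ PV   │
└──────┘     └──────┘     └───────────┘     └──────┘
 created     created      picks a node      provisioned in
             (Pending)                      the right place

Advantages

  • The volume is created in the correct topology location.
  • Cross-zone access problems are avoided.
  • Pods are more likely to schedule successfully.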
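
The topology argument can be illustrated with a toy Python model. Everything here is an assumption made for the sketch: the node and zone names are hypothetical, usable_nodes is not a Kubernetes API, and a zonal volume is assumed to be reachable only from nodes in the same zone.

```python
def usable_nodes(node_zones: dict[str, str], volume_zone: str) -> list[str]:
    """Nodes that can reach a zonal volume (same zone only, by assumption)."""
    return sorted(n for n, zone in node_zones.items() if zone == volume_zone)


node_zones = {"node-a": "us-east-1a", "node-b": "us-east-1b"}

# Immediate: the volume is provisioned before any Pod is scheduled, so it
# may land in a zone where no suitable node exists.
print(usable_nodes(node_zones, "us-east-1c"))  # []

# WaitForFirstConsumer: the scheduler picks the node first, and the volume
# is then provisioned in that node's zone.
chosen_node = "node-b"
print(usable_nodes(node_zones, node_zones[chosen_node]))  # ['node-b']
```

In the first case the Pod has nowhere to run; in the second the chosen node is always among the nodes that can reach the volume.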

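PV Status and Lifecycle

PV phases

# Available: free, not yet bound to a claim
# Bound: bound to a PVC
# Released: the PVC was deleted, but the volume has not been reclaimed yet
# Failed: automatic reclamation failed

Phase transitions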

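    Available
        │
        │ PVC created and matched
        ▼
      Bound
        │
        │ PVC deleted
        ▼
     Released ─────┐
        │          │ manual cleanup, then
        │ reclaim  │ the PV is recreated
        │ fails    │
        ▼          ▼
      Failed    Available

With the Delete policy, a successful reclaim removes the PV object entirely; Failed is reached only when that reclaim does not succeed.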
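
The effect of the reclaim policy on these transitions can be written as a tiny Python function. This is a sketch, not controller code: on_claim_deleted is a hypothetical name, and "Removed" is not a real PV phase but a label used here for a PV object that no longer exists. The deprecated Recycle policy is deliberately not modelled.

```python
def on_claim_deleted(policy: str, reclaim_ok: bool = True) -> str:
    """Return what happens to a Bound PV once its PVC is deleted."""
    if policy == "Retain":
        # The PV and its data stay until an administrator cleans up.
        return "Released"
    if policy == "Delete":
        # On success the PV and the backing volume are both removed.
        return "Removed" if reclaim_ok else "Failed"
    raise ValueError(f"unsupported reclaim policy: {policy}")


print(on_claim_deleted("Retain"))                    # Released
print(on_claim_deleted("Delete"))                    # Removed
print(on_claim_deleted("Delete", reclaim_ok=False))  # Failed
```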

PVC phases

# Pending: waiting to be bound
# Bound: bound to a PV
# Lost: the bound PV no longer exists, but the PVC remains

Checking status

# List PVs and their status
kubectl get pv
NAME        CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM
pv-1        10Gi       RWO            Retain           Available
pv-2        20Gi       RWX            Delete           Bound       default/pvc-2

# List PVCs and their status
kubectl get pvc
NAME     STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS
pvc-1    Pending   -        -          -              standard
pvc-2    Bound     pv-2     20Gi       RWX            standard

# Detailed information
kubectl describe pv pv-2
kubectl describe pvc pvc-2

Usage Examples

Example 1: static provisioning with NFS

# 1. Create the PV
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pv
spec:
  capacity:
    storage: 100Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: 192.168.1.100
    path: /data/nfs
  mountOptions:
    - hard
    - nfsvers=4.1

---
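# 2. Create the PVC
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nfs-pvc
spec:
  # Empty string: bind only to PVs without a class, even if the cluster
  # has a default StorageClass
  storageClassName: ""
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 50Gi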

---
# 3. Pod that uses the PVC
apiVersion: v1
kind: Pod
metadata:
  name: nfs-pod
spec:
  containers:
  - name: app
    image: nginx
    volumeMounts:
    - name: data
      mountPath: /data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: nfs-pvc

Example 2: local storage

# 1. Create the StorageClass
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer

---
# 2. Create the local PV
apiVersion: v1
kind: PersistentVolume
metadata:
  name: local-pv-1
spec:
  capacity:
    storage: 100Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: /mnt/disks/ssd1
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - node-1

---
# 3. Use it from a StatefulSet
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: database
spec:
  serviceName: database
  replicas: 1
  selector:
    matchLabels:
      app: database
  template:
    metadata:
      labels:
        app: database
    spec:
      containers:
      - name: mysql
        image: mysql:8.0
        volumeMounts:
        - name: data
          mountPath: /var/lib/mysql
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: local-storage
      resources:
        requests:
          storage: 50Gi

Example 3: block device mode

# 1. Create a block-mode PVC
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: block-pvc
spec:
  volumeMode: Block  # block device mode
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: fast-ssd

---
# 2. Pod that uses the block device
apiVersion: v1
kind: Pod
metadata:
  name: database-pod
spec:
  containers:
  - name: database
    image: mysql:8.0
    volumeDevices:  # note: volumeDevices, not volumeMounts
    - name: data
      devicePath: /dev/xvda  # device path inside the container
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: block-pvc

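Troubleshooting

PVC stuck in Pending

# 1. Inspect the PVC
kubectl describe pvc <pvc-name>

# Common causes (the event text below is illustrative):
# - No matching PV, or the referenced StorageClass does not exist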
Events:
  Type     Reason              Age   From                         Message
  ----     ------              ----  ----                         -------
  Warning  ProvisioningFailed  2m    persistentvolume-controller  Failed to provision volume: StorageClass "fast" not found

# - Insufficient capacity or quota
Events:
  Warning  ProvisioningFailed  1m    persistentvolume-controller  Failed to provision volume: insufficient quota

# - Unsupported access mode
Events:
  Warning  ProvisioningFailed  1m    persistentvolume-controller  Failed to provision volume: invalid access mode ReadWriteMany

# 2. Check the StorageClass
kubectl get storageclass
kubectl describe storageclass <sc-name>

# 3. Check the available PVs
kubectl get pv

Volume mount failures

# 1. Inspect the Pod events
kubectl describe pod <pod-name>

# Common errors:
# - The volume is already in use (RWO mode)
Events:
  Warning  FailedAttachVolume  1m  attachdetach-controller  
  Multi-Attach error for volume "pvc-xxx" Volume is already exclusively attached to one node

# - The node cannot reach the storage
Events:
  Warning  FailedMount  1m  kubelet  
  Unable to attach or mount volumes: unmounted volumes=[data], unattached volumes=[data]: timed out waiting for the condition

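# 2. Check the kubelet logs on the node
ssh <node>
journalctl -u kubelet -f

# 3. Check the storage backend
# For cloud storage, check the cloud console
# For local storage, check that the path exists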
ls -la /mnt/disks/

Best Practices

1. Use a StorageClass

# ✅ Recommended: use a StorageClass
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-pvc
spec:
  storageClassName: fast-ssd
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi

# ❌ Not recommended: creating and managing PVs by hand

2. Set storage quotas

apiVersion: v1
kind: ResourceQuota
metadata:
  name: storage-quota
  namespace: production
spec:
  hard:
    requests.storage: "500Gi"
    persistentvolumeclaims: "50"

3. Use labels and annotations

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-pvc
  labels:
    app: myapp
    tier: database
    environment: production
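  annotations:
    # Hypothetical example keys; use whatever your own tooling expects.
    # volume.kubernetes.io/selected-node is written by the scheduler and
    # should not be set by hand.
    example.com/owner: storage-team
    example.com/backup-policy: daily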
spec:
  # ...

4. Monitor storage usage

# Prometheus alerting rule
- alert: PVCAlmostFull
  expr: |
    (kubelet_volume_stats_used_bytes / 
     kubelet_volume_stats_capacity_bytes) > 0.85
  for: 5m
  annotations:
    summary: "PVC {{ $labels.persistentvolumeclaim }} usage is above 85%"

5. Backup strategy

# Back up PVCs with Velero
velero backup create pvc-backup \
  --include-namespaces production \
  --include-resources pvc,pv

# Use a volume snapshot
kubectl apply -f volumesnapshot.yaml

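Summary

PV and PVC are the core storage abstractions in Kubernetes:

  • PV: a cluster-scoped storage resource
  • PVC: a user's request for storage
  • Binding: automatic or manual matching
  • Lifecycle: provisioning, use, reclaim
  • Access modes: RWO, ROX, RWX, RWOP
  • Reclaim policies: Retain, Delete (Recycle is deprecated)

Understanding how PVs and PVCs work is the foundation for managing persistent storage in Kubernetes.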